\section*{Supplementary code}

The Python codebase of the proposed LiteTransNet model is stored in a directory named \texttt{lite-trans-net}, which contains the following components:

\vspace{12pt}
\dirtree{%
.1 lite-trans-net.
.2 data.
.3 .....
.2 models.
.3 .....
.2 training.py.
.2 tst.
.3 encoder.py.
.3 decoder.py.
.3 multiHeadAttention.py.
.3 positionwiseFeedForward.py.
.3 transformer.py.
.3 utils.py.
.2 dataset.py.
.2 utils.py.
}
\vspace{3pt}

Below is an overview of the function of each component within the \texttt{lite-trans-net} directory:

\begin{itemize}
    \item \textbf{data}: This directory holds the CSV files of the landslide dataset used in the case study; they provide the data needed to train the LiteTransNet model.
    \item \textbf{models}: This directory stores the saved PyTorch models of the trained Transformer networks.
    \item \textbf{training.py}: This script manages the training process of the LiteTransNet model: it runs the training loop and optimization steps, evaluates the model on the test set, and saves model checkpoints.
    \item \textbf{tst}: This directory contains the key components of the transformer architecture used in LiteTransNet:
    \begin{itemize}
        \item \texttt{encoder.py}: Implements the encoder part of the transformer model, which processes the input data.
        \item \texttt{decoder.py}: Implements the decoder part, which generates the output from the encoded representations.
        \item \texttt{multiHeadAttention.py}: Provides the multi-head attention mechanism, a key component of transformers that allows the model to focus on different parts of the input sequence (an illustrative, self-contained sketch of this mechanism follows the \texttt{tst/multiHeadAttention.py} listing below).
        \item \texttt{positionwiseFeedForward.py}: Defines the position-wise feed-forward network used within the transformer model (also illustrated after that listing).
        \item \texttt{transformer.py}: Assembles the complete transformer model from the encoder, decoder, and other components.
        \item \texttt{utils.py}: A utility script containing helper functions shared by the scripts in the \texttt{tst} directory.
    \end{itemize}
    \item \textbf{dataset.py}: Implements the dataset class, which preprocesses the data and supplies it to the LiteTransNet model during both the training and test stages.
    \item \textbf{utils.py}: Provides general utility functions for data preprocessing.
\end{itemize}

\newpage
The \texttt{training.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import numpy as np
......
\end{minted}

\newpage
The \texttt{dataset.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import numpy as np
....
\end{minted}

\newpage
The \texttt{utils.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import csv
.......
\end{minted}

\newpage
The \texttt{tst/encoder.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import numpy as np
.....
\end{minted}

\newpage
The \texttt{tst/decoder.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import numpy as np
import torch


def forward(self, x: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
    """Propagate the input through the Decoder block.

    Apply the self attention block, add residual and normalize.
    Apply the encoder-decoder attention block, add residual and normalize.
    Apply the feed forward network, add residual and normalize.

    Parameters
    ----------
    x:
        Input tensor with shape (batch_size, K, d_model).
    memory:
        Memory tensor with shape (batch_size, K, d_model) from encoder output.

    Returns
    -------
    x:
        Output tensor with shape (batch_size, K, d_model).
    """
    # Self attention
    residual = x
    x = self._selfAttention(query=x, key=x, value=x, mask="subsequent")
    x = self._dopout(x)
    x = self._layerNorm1(x + residual)

    # Encoder-decoder attention
    residual = x
    x = self._encoderDecoderAttention(query=x, key=memory, value=memory)
    x = self._dopout(x)
    x = self._layerNorm2(x + residual)

    # Feed forward
    residual = x
    x = self._feedForward(x)
    x = self._dopout(x)
    x = self._layerNorm3(x + residual)

    return x
......
\end{minted}

\newpage
The \texttt{tst/multiHeadAttention.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import numpy as np
import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    """Multi Head Attention block from Attention is All You Need.

    Given 3 inputs of shape (batch_size, K, d_model), that will be used
    to compute query, keys and values, we output a self attention
    tensor of shape (batch_size, K, d_model).

    Parameters
    ----------
    d_model:
        Dimension of the input vector.
    q:
        Dimension of all query matrix.
    v:
        Dimension of all value matrix.
    h:
        Number of heads.
    attention_size:
        Number of backward elements to apply attention.
        Deactivated if ``None``. Default is ``None``.
    """

    def __init__(self,
                 d_model: int,
                 q: int,
                 v: int,
                 h: int,
                 attention_size: int = None):
        """Initialize the Multi Head Block."""
        super().__init__()

        self._h = h
        self._attention_size = attention_size

        # Query, keys and value matrices
        self._W_q = nn.Linear(d_model, q*self._h)
        self._W_k = nn.Linear(d_model, q*self._h)
        self._W_v = nn.Linear(d_model, v*self._h)

        # Output linear function
        self._W_o = nn.Linear(self._h*v, d_model)

        # Score placeholder
        self._scores = None
......
\end{minted}
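The forward pass of \texttt{MultiHeadAttention} is elided above. As a reading aid only, the following is a minimal, self-contained sketch of standard multi-head scaled dot-product attention in PyTorch. It reuses the dimension names from the constructor shown above (\texttt{d\_model}, \texttt{q}, \texttt{v}, \texttt{h}), but the class name and the omission of masking, dropout, and the \texttt{attention\_size} window are assumptions made for illustration; it is not the code elided from \texttt{tst/multiHeadAttention.py}.
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
# Illustrative only: generic multi-head scaled dot-product attention.
# NOT the forward pass elided from tst/multiHeadAttention.py.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyMultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, q: int, v: int, h: int):
        super().__init__()
        self._h, self._q, self._v = h, q, v
        self._W_q = nn.Linear(d_model, q * h)
        self._W_k = nn.Linear(d_model, q * h)
        self._W_v = nn.Linear(d_model, v * h)
        self._W_o = nn.Linear(v * h, d_model)

    def forward(self, query: torch.Tensor, key: torch.Tensor,
                value: torch.Tensor) -> torch.Tensor:
        b, Lq, _ = query.shape
        Lk = key.shape[1]
        # Project and split into h heads: (batch, h, length, head_dim).
        Q = self._W_q(query).view(b, Lq, self._h, self._q).transpose(1, 2)
        K = self._W_k(key).view(b, Lk, self._h, self._q).transpose(1, 2)
        V = self._W_v(value).view(b, Lk, self._h, self._v).transpose(1, 2)
        # Scaled dot-product attention: softmax(Q K^T / sqrt(q)) V.
        scores = Q @ K.transpose(-2, -1) / (self._q ** 0.5)
        weights = F.softmax(scores, dim=-1)
        context = weights @ V                               # (b, h, Lq, v)
        # Concatenate the heads and project back to d_model.
        context = context.transpose(1, 2).reshape(b, Lq, self._h * self._v)
        return self._W_o(context)


if __name__ == "__main__":
    x = torch.randn(2, 16, 32)                  # (batch_size, K, d_model)
    attention = ToyMultiHeadAttention(d_model=32, q=8, v=8, h=4)
    print(attention(x, x, x).shape)             # torch.Size([2, 16, 32])
\end{minted}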
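The decoder's self-attention call shown earlier passes \texttt{mask="subsequent"}, i.e.\ a causal mask that prevents a time step from attending to future positions. The helper that builds this mask belongs to \texttt{tst/utils.py} and is not reproduced here; the snippet below is only a generic illustration of what such a mask looks like (the function name and the Boolean convention are assumptions).
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
# Illustrative only: a generic "subsequent" (causal) mask. Position i may
# attend to positions <= i; True marks future positions to be masked out.
# NOT the helper implemented in tst/utils.py.
import torch


def toy_subsequent_mask(K: int) -> torch.Tensor:
    return torch.triu(torch.ones(K, K), diagonal=1).bool()


print(toy_subsequent_mask(4))
# tensor([[False,  True,  True,  True],
#         [False, False,  True,  True],
#         [False, False, False,  True],
#         [False, False, False, False]])
\end{minted}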
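Each decoder sub-layer shown earlier follows the same residual-plus-layer-norm pattern, and the third sub-layer applies a position-wise feed-forward network, i.e.\ the same small two-layer network at every time step. The sketch below illustrates that generic pattern only; the hidden width \texttt{d\_ff} and the ReLU activation are assumptions, and the code is not the implementation in the \texttt{tst/positionwiseFeedForward.py} file listed next.
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
# Illustrative only: generic position-wise feed-forward sub-layer wrapped in
# the residual + layer-norm pattern used by transformer blocks. The ReLU and
# d_ff are assumptions, not taken from tst/positionwiseFeedForward.py.
import torch
import torch.nn as nn


class ToyPositionwiseFeedForward(nn.Module):
    def __init__(self, d_model: int, d_ff: int = 128):
        super().__init__()
        self._linear1 = nn.Linear(d_model, d_ff)
        self._linear2 = nn.Linear(d_ff, d_model)
        self._layerNorm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        # Position-wise: the same weights act on each of the K time steps.
        x = self._linear2(torch.relu(self._linear1(x)))
        # Residual connection followed by layer normalization.
        return self._layerNorm(x + residual)


if __name__ == "__main__":
    x = torch.randn(2, 16, 32)          # (batch_size, K, d_model)
    print(ToyPositionwiseFeedForward(d_model=32)(x).shape)  # torch.Size([2, 16, 32])
\end{minted}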
\newpage
The \texttt{tst/positionwiseFeedForward.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import torch
.....
\end{minted}

\newpage
The \texttt{tst/transformer.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
import torch
.....
\end{minted}

\newpage
The \texttt{tst/utils.py} file:
\begin{minted}[bgcolor=LightGray,breaklines=true,fontsize=\footnotesize]{python}
from typing import Optional, Union

import numpy as np
......
\end{minted}

\end{document}