DeepDynaForecast: Phylogenetic-informed graph deep learning for epidemic transmission dynamic prediction
Fig 2
Our proposed DeepDynaForecast architecture.
A: Pathogen genomic data collected during outbreak molecular surveillance. B: A phylogenetic tree, reconstructed from the genomic data, which is used as input to trace transmission among populations. In this tree, nodes represent individuals, while edges represent transmission or mutation events. The phylogenetic tree is modeled as a bi-directed graph, where the initial node representation vector v_i is randomly generated for each node i, and the edge representation vector e_ij for the edge from node i to node j is initialized from the branch length by a neural network. C-D: Example of the Primal-Dual Graph Long Short-Term Memory (PDGLSTM) learning architecture on a subtree, updating v_i and e_ij at the l-th layer. Two parallel LSTM modules update the node and edge representations in each message-passing iteration. In each iteration, every edge aggregates the representations of its adjacent nodes, and every node aggregates the representations of its adjacent edges; the neural networks ϕ_E and ϕ_N then encode these aggregates into low-dimensional messages. The edge and node messages are fed into their corresponding LSTM modules to update the edge and node representations. E: The system applies N rounds of message-passing iterations sequentially, producing updated node and edge representations. F: Cross-layer Prediction (CLP) module on each leaf node. A series of neural networks {ψ_L} predicts the dynamics of leaf n from the node representations {v_n} at different layers. Their outputs pass through dropout layers and are summed to generate the final prediction. G: Predicted dynamics for the leaves of the phylogenetic tree.
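The data flow in panels B to F can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the toy tree, the dimension DIM, the number of rounds N_ROUNDS, and all weights are hypothetical, a simple tanh update stands in for the two LSTM modules, fixed random linear maps stand in for the neural networks (including ϕ_E, ϕ_N, and the prediction heads), dropout is omitted, and the output is a single illustrative score per leaf.

```python
import math
import random

random.seed(0)
DIM = 4        # hypothetical representation size
N_ROUNDS = 3   # hypothetical number of message-passing rounds

# Toy phylogenetic tree: (parent, child, branch_length); node 0 is the root.
tree_edges = [(0, 1, 0.10), (0, 2, 0.25), (1, 3, 0.05), (1, 4, 0.30)]

# Panel B: bi-directed graph, so every branch yields two directed edges.
edges = {}
for parent, child, length in tree_edges:
    edges[(parent, child)] = length
    edges[(child, parent)] = length
nodes = sorted({n for pair in edges for n in pair})
parents = {p for p, _, _ in tree_edges}
leaves = [n for n in nodes if n not in parents]

# Random initial node vectors v_i.
v = {i: [random.gauss(0.0, 1.0) for _ in range(DIM)] for i in nodes}

# Edge vectors e_ij initialized from the branch length.
# Assumption: a fixed random linear map plus tanh replaces the neural network.
w_init = [random.gauss(0.0, 1.0) for _ in range(DIM)]
e = {ij: [math.tanh(w * length) for w in w_init] for ij, length in edges.items()}


def vec_sum(vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def update(state, message):
    """Stand-in for an LSTM update (assumption): squash state plus message."""
    return [math.tanh(s + m) for s, m in zip(state, message)]


def message_passing_round(v, e):
    """Panels C-D: update edges and nodes in parallel from the previous layer."""
    # Each edge aggregates the representations of its two endpoint nodes.
    new_e = {(i, j): update(e_ij, vec_sum([v[i], v[j]]))
             for (i, j), e_ij in e.items()}
    # Each node aggregates the representations of its incoming edges.
    new_v = {}
    for j in nodes:
        incoming = [e_ij for (_, k), e_ij in e.items() if k == j]
        new_v[j] = update(v[j], vec_sum(incoming))
    return new_v, new_e


# Panel E: apply the rounds sequentially, keeping every layer's node vectors.
layer_reps = []
for _ in range(N_ROUNDS):
    v, e = message_passing_round(v, e)
    layer_reps.append(v)

# Panel F: one prediction head per layer; the per-layer outputs are summed.
# Assumption: each head is a random linear map producing a scalar score.
heads = [[random.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_ROUNDS)]
predictions = {
    leaf: sum(
        sum(w * x for w, x in zip(head, rep[leaf]))
        for head, rep in zip(heads, layer_reps)
    )
    for leaf in leaves
}

for leaf, score in predictions.items():
    print(f"leaf {leaf}: score {score:.3f}")
```

Keeping the node vectors from every round in layer_reps is what lets the prediction step combine shallow and deep views of each leaf, which is the idea behind summing over layers in panel F.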