Fig 1.

An overview of the DeepPL framework for predicting phage lifecycle, including data collection, feature extraction, and model training.

Complete genomes of diverse lysogenic and lytic phages were collected from the National Center for Biotechnology Information (NCBI) database. Non-ATGC letters in the phage nucleotide sequences were randomly replaced with A, T, G, or C. Phage lifecycles were manually confirmed by literature review, and lysogenic genes were identified and extracted from the lysogenic phage genomes. Each phage sequence was split with a 100 bp sliding window, and each window was converted into a 6-mer sequence that served as input for fine-tuning the pre-trained DNABERT model. The model generated a binary classification probability (0–1) for each 6-mer sequence. A probability above threshold 1 (0.9) was taken as a good match between the 100 bp DNA segment and lysogenic genes. The frame-by-frame classification results were then aggregated into one final phage lifecycle prediction using threshold 2 (0.016): an input phage sequence whose aggregated result fell below 0.016 was identified as a lytic phage; otherwise, it was predicted as a lysogenic phage. The figure was created using BioRender. Zhang, Y. (2024) BioRender.com/w89b419.
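The windowing and two-threshold decision described in this caption can be sketched in Python as follows. This is only an illustrative sketch, not the DeepPL implementation: the function names, the window stride, the fixed random seed, and the reading of threshold 2 as a cutoff on the fraction of windows that exceed threshold 1 are all assumptions made here, and the per-window probabilities are taken as given rather than computed by a DNABERT model.

```python
import random


def replace_ambiguous(seq, rng=None):
    """Randomly replace non-ATGC letters with A, T, G, or C.

    The fixed default seed is an assumption, used only for reproducibility.
    """
    rng = rng or random.Random(0)
    return "".join(b if b in "ATGC" else rng.choice("ATGC") for b in seq.upper())


def sliding_windows(seq, size=100, stride=100):
    """Split a sequence into 100 bp windows.

    The caption gives the window length only; the stride is an assumption.
    """
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, stride)]


def to_kmers(segment, k=6):
    """Convert a DNA segment into its overlapping k-mers (k = 6 in the caption)."""
    return [segment[i:i + k] for i in range(len(segment) - k + 1)]


def predict_lifecycle(window_probs, threshold_1=0.9, threshold_2=0.016):
    """Aggregate per-window probabilities into one lifecycle call.

    Assumption: threshold 2 is compared with the fraction of windows whose
    probability exceeds threshold 1.
    """
    if not window_probs:
        raise ValueError("at least one window probability is required")
    hits = sum(p > threshold_1 for p in window_probs)
    fraction = hits / len(window_probs)
    return "lytic" if fraction < threshold_2 else "lysogenic"


if __name__ == "__main__":
    genome = replace_ambiguous("ATGN" * 75)      # toy 300 bp sequence
    windows = sliding_windows(genome)            # three 100 bp windows
    tokens = [to_kmers(w) for w in windows]      # 95 6-mers per window
    # Hypothetical probabilities standing in for the fine-tuned model output.
    probs = [0.95, 0.20, 0.10]
    print(len(windows), len(tokens[0]), predict_lifecycle(probs))
```

Under this reading, a genome in which 1 of 100 windows scores above 0.9 has a fraction of 0.01 and would be called lytic, while 2 of 100 gives 0.02 and would be called lysogenic.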

Fig 2.

The ROC curve comparison between DeepPL and previously published tools for phage lifecycle prediction on the test dataset.

The values shown in the legend are AUC-ROC scores.

Table 1.

The performance comparison between DeepPL and previously published tools for phage lifecycle prediction.

Table 2.

The performance validation of DeepPL on phage lifecycle prediction using in-house verified phages.

Fig 3.

Comparison of DeepPL and PhaTYP performance on phage contigs in a mock phage community generated by different metagenomic sequencing technologies and assemblies.

Table 3.

The phages with known lifecycles used to construct a mock phage community for metagenomic sequencing.
