The Aggregated Gut Viral Catalogue (AVrC): A unified resource for exploring the viral diversity of the human gut
Fig 3
Aggregated Viral Catalogue (AVrC) overview.
A: Schematic overview of the AVrC construction. The AVrC included 9 previously published catalogues and resources [10–18] and more than 7,000 additional infant gut metagenomes (PRJEB70237, PRJNA345144, PRJEB32135, PRJEB6456, PRJNA384716, PRJNA473126, PRJNA290380, PRJEB42363, PRJNA695570, PRJEB32631, PRJNA497734, PRJNA489090). Age and health status metadata associated with the previously published catalogues were extracted and manually curated when possible (excluding the IMG/VR dataset and the KGP). The number of mined samples per age group and health status was then estimated. For each vOTU, the quality of the representative sequence was assessed using CheckV, and potential plasmid contamination was assessed using geNomad. The vOTU size was calculated as the number of sequences grouped into a single cluster by MMseqs2. B: Accumulation curves of the AVrC at the species vOTU level. C: Predicted host phylum distribution for the viral sequences contained in the AVrC. The putative host of each viral sequence was obtained from iPHoP. Sequences without a predicted host are not displayed in the figure.
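The vOTU size defined in panel A can be illustrated with a minimal sketch. It assumes the clustering result is available as a two-column, tab-separated table (cluster representative, cluster member), which is the layout MMseqs2 produces when cluster results are exported as TSV. The function name `votu_sizes` and the sequence identifiers are hypothetical and are not taken from the AVrC pipeline.

```python
import csv
import io
from collections import Counter


def votu_sizes(tsv_lines):
    """Count cluster members per representative.

    Expects rows of the form: representative<TAB>member.
    The representative normally appears as a member of its own cluster,
    so a singleton vOTU has size 1.
    """
    sizes = Counter()
    for row in csv.reader(tsv_lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        sizes[row[0]] += 1
    return sizes


# Toy input (made-up identifiers): one cluster of two sequences, one singleton.
example = io.StringIO("seqA\tseqA\nseqA\tseqB\nseqC\tseqC\n")
sizes = votu_sizes(example)
for representative, size in sizes.items():
    print(representative, size)
```

For a real table, an open file handle can be passed in place of the `io.StringIO` object.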