Table 1.
Details of the C. trachomatis samples that were successfully amplified in this study, along with the preliminary analysis results.
Figure 1.
Flow chart representing the workflow of the entire procedure of target amplification, sequencing and genotype profiling performed in this study.
The dashed box represents the core analysis starting with downloaded reference genomes necessary for primer design and binstrain analysis. CT-ASR is the C. trachomatis ancestral reconstruction sequence.
Figure 2.
Summary of binstrain analysis of selected simulated data.
Each panel illustrates representative data for (top to bottom) a single strain identical (or very similar) to a previously sequenced genome, a mixture of strains, and a single novel recombinant strain. From left to right are the histogram of Major Allele Percentage (MAP) across the sequenced region and a barplot of binstrain β values for the reference genome set. In the MAP plots, minor peaks representing the subpopulation of mixed alleles are indicated by red arrows. The minor β value associated with the introduced recombinant strain is indicated by a green arrow.
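As an illustration of the MAP histogram described above, the following is a minimal Python sketch. The caption does not define how MAP is computed, so this assumes it is the percentage of reads at a position that support the most frequent base. The function names, the per-position count dictionaries, and the 5% bin width are hypothetical choices made for the example and are not taken from binstrain.

```python
from collections import Counter


def major_allele_percentage(base_counts):
    """Percentage of reads supporting the most frequent base at one position.

    base_counts maps a base (e.g. "A") to its read count. This definition of
    MAP is an assumption; the caption does not spell it out.
    Returns None when the position has no coverage.
    """
    total = sum(base_counts.values())
    if total == 0:
        return None
    return 100.0 * max(base_counts.values()) / total


def map_histogram(pileup, bin_width=5):
    """Count positions per MAP bin; keys are the lower edge of each bin.

    pileup is an iterable of per-position base-count dicts (hypothetical
    input format). A MAP of exactly 100 is placed in the top bin.
    """
    hist = Counter()
    for counts in pileup:
        value = major_allele_percentage(counts)
        if value is None:
            continue  # skip uncovered positions
        lower_edge = int(value // bin_width) * bin_width
        hist[min(lower_edge, 100 - bin_width)] += 1
    return dict(hist)
```

Under this assumption, a single-strain sample would place nearly all positions in the top bin, while a mixture would produce additional lower peaks such as those marked by the red arrows.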
Figure 3.
Whole-genome phylogeny of the reference C. trachomatis strains used in this study.
The tree was constructed using a neighbor-joining algorithm based on whole-genome alignment. All branches had 100% support in 100 bootstrap replicates except for the A/HAR-13/B/Jali20 branch. All nodes with bootstrap support <100% are designated with an asterisk. Each leaf and internal branch is labeled with the number of SNPs unique to that branch compared to CT-ASR, the reconstructed ancestral sequence. The leaves are colored by membership of the major C. trachomatis Clades [7]: yellow, blue, green and red for Clades 1–4, respectively.
Figure 4.
Tree calculated in the same manner as Figure 3 but based on the 100 kb region of the C. trachomatis genome selected for targeted amplification. The colors are the same as in Figure 3.
Figure 5.
Diverse C. trachomatis genomes were aligned against the C. trachomatis D/UW3/CX reference sequence using Whole Genome Alignment (WGA) software, and SNPs (hash lines) were identified (a small number of indels were also identified but are omitted from the figure for simplicity). The genome was divided into 100 bp blocks. Blocks containing two or more SNPs are labeled in black and correspond to gray regions in the genome. Primer start and end positions (amplicon regions shown as dashed lines) were designed to avoid variable blocks (the black portions of the barcode). Primers were designed to allow approximately 5-fold overlapping amplicon coverage.
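The block-flagging step described in this caption can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study. The 100 bp block size and the two-SNP threshold come from the caption. The function names, the 1-based coordinate convention, and the rule that a primer site is rejected if it touches any variable block are assumptions made for the example.

```python
from collections import Counter

BLOCK_SIZE = 100   # block length in bp, as stated in the caption
SNP_THRESHOLD = 2  # blocks with at least this many SNPs count as variable


def variable_blocks(snp_positions, genome_length,
                    block_size=BLOCK_SIZE, threshold=SNP_THRESHOLD):
    """Return sorted 0-based indices of blocks holding >= threshold SNPs.

    snp_positions are assumed to be 1-based reference coordinates;
    duplicates and out-of-range positions are ignored.
    """
    counts = Counter(
        (pos - 1) // block_size
        for pos in set(snp_positions)
        if 1 <= pos <= genome_length
    )
    return sorted(block for block, n in counts.items() if n >= threshold)


def primer_site_ok(start, end, blocked, block_size=BLOCK_SIZE):
    """True if the 1-based inclusive interval [start, end] avoids all
    variable blocks (hypothetical acceptance rule for a primer site)."""
    first = (start - 1) // block_size
    last = (end - 1) // block_size
    return all(block not in blocked for block in range(first, last + 1))
```

For example, with SNPs at positions 5, 50, 150, 310, 320 and 399 in a 500 bp toy genome, blocks 0 and 3 are flagged as variable, so a candidate primer at 101 to 120 is accepted while one at 290 to 310 is rejected.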
Table 2.
The number of SNPs recovered through the RainDance targeted capture methodology for each of the purified single-strain C. trachomatis samples used in this study for which a genome sequence is already available.
Figure 6.
Summary of binstrain analysis of selected gDNA and clinical sample data.
Format is the same as Figure 2. Note that for a potential mixed infection, we are currently unable to distinguish between the presence of multiple recombinant and non-recombinant strains; only the proportion of genotype-specific SNPs represented by the β values can be determined.