The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific
The horizontal axis of each panel corresponds to a 100-kb segment of genomic sequence from the indicated reference microbial genome. The vertical axis indicates the sequence identity of an alignment between a GOS sequence and the reference genomic sequence. The identity ranges from 100% (top) to 50% (bottom). Individual GOS sequencing reads were colored to reflect the sample from which they were isolated. Geographically nearby samples have similar colors (see Poster S1 for key). Each organism shows a distinct pattern of recruitment reflecting its origin and relationship to the environmental data collected during the course of this study.
(A) P. ubique HTCC1062 recruits the greatest density of GOS sequences of any genome examined to date. The GOS sequences show geographic stratification into bands, with sequences from temperate water samples off the North American coast having the highest identity (yellow to yellow-green colors). At lower identity, sequences from all the marine environments could be aligned to HTCC1062.
(B) P. marinus MIT9312 recruits a large number of GOS sequences into a single band that zigzags between 85% and 95% identity on average. These sequences are largely derived from warm water samples in the Gulf of Mexico and eastern Pacific (green to greenish-blue reads).
(C) P. marinus MED4 recruits largely the same set of reads as MIT9312 (B) though the sequences that form the zigzag recruit at a substantially lower identity. A small number of sequences from the Sargasso Sea samples (red) are found at high identity.
(D) P. marinus NATL2A recruits far fewer sequences than the genomes in any of the preceding panels. As with MED4, a small number of high-identity sequences were recruited from the Sargasso samples.
(E) P. marinus MIT9313 is a deep-water low-light–adapted strain of Prochlorococcus. GOS sequences were recruited almost exclusively at low identity in vertical stacks that correspond to the locations of conserved genes. On the left side of this panel is a very distinctive pattern of recruitment that corresponds to the highly conserved 16S and 23S rRNA gene operon.
(F) P. marinus CCMP1375, another deep-water low-light–adapted strain, does not recruit GOS sequences at high identity. Only stacks of sequences are seen corresponding to the location of conserved genes.
(G) Synechococcus WH8102 recruits a modest number of high-identity sequences primarily from the Sargasso Sea samples. A large number of moderate identity matches from the Pacific and hypersaline lagoon (GS33) samples are also visible.
(H) Synechococcus CC9605 recruits largely the same sequences as does Synechococcus WH8102, but was isolated from Pacific waters. GOS sequences from some of the Pacific samples recruit at high identity, while sequences from the Sargasso and hypersaline lagoon (bluish-purple) were recruited at moderate identities.
(I) Synechococcus CC9902 is only distantly related to either of the preceding Synechococcus strains. While this strain also recruits largely the same sequences as the WH8102 and CC9605 strains, they recruit at significantly lower identity.
(J–O) Fragment recruitment plots to extreme assemblies seeded with phylogenetically informative sequences. Using this approach, it is possible not only to assemble contigs with strong similarities to known genomes but also to identify contigs from previously uncultured genomes. In each case a 100-kb segment from an extreme assembly is shown, and each panel shows its own distinct pattern of recruitment.
(J) Seeded from a Prochlorococcus marinus-related sequence, this contig recruits a broad swath of GOS sequences that correspond to the GOS sequences that form the zigzag on P. marinus MIT9312 recruitment plots (see [B] or Poster S1 for comparison).
(K–L) Seeded from SAR11 clones, these contigs show significant synteny to the known P. ubique HTCC1062 genome. (K) is strikingly similar to previous recruitment plots to the HTCC1062 genome (see [A] or Poster S1). In contrast, (L) identifies a different strain that recruits high-identity GOS sequences primarily from the Sargasso Sea samples (red).
(M–O) These three panels show recruitment plots to contigs belonging to the uncultured Actinobacter, Roseobacter, and SAR86 lineages.
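The plot layout described above (a 100-kb reference window on the horizontal axis, alignment identity from 50% to 100% on the vertical axis, reads colored by sample) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Alignment` record, the `recruitment_points` function, the coordinate convention, and the demo reads and sample labels other than GS33 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Alignment:
    """One read aligned to a reference (hypothetical record layout)."""
    read_id: str
    sample: str      # sample label, used only to pick a color when plotting
    ref_start: int   # 0-based start on the reference
    ref_end: int     # exclusive end on the reference
    identity: float  # percent identity of the alignment, 0-100


def recruitment_points(
    alignments: Iterable[Alignment],
    window_start: int,
    window_len: int = 100_000,   # 100-kb segment, as in each panel
    min_identity: float = 50.0,  # bottom of the vertical axis
) -> List[Tuple[int, int, float, str]]:
    """Return (x_start, x_end, identity, sample) for reads inside the window.

    x coordinates are relative to the window start; reads that extend past
    the window are clipped to its edges.
    """
    window_end = window_start + window_len
    points = []
    for a in alignments:
        # Skip reads below the identity floor or with an invalid identity.
        if not (min_identity <= a.identity <= 100.0):
            continue
        # Skip reads that do not overlap the window at all.
        if a.ref_end <= window_start or a.ref_start >= window_end:
            continue
        x0 = max(a.ref_start, window_start) - window_start
        x1 = min(a.ref_end, window_end) - window_start
        points.append((x0, x1, a.identity, a.sample))
    return points


# Made-up demo reads; only "GS33" is a sample label taken from the caption.
demo = [
    Alignment("r1", "GS33", 1_000, 1_800, 72.5),
    Alignment("r2", "sample_A", 99_500, 100_400, 98.0),   # clipped at window edge
    Alignment("r3", "GS33", 150_000, 150_900, 90.0),      # outside the window
    Alignment("r4", "sample_B", 5_000, 5_900, 45.0),      # below 50% identity
]
points = recruitment_points(demo, window_start=0)
```

Each returned tuple can be drawn as a short horizontal segment at height `identity`, colored by `sample`; dense bands, zigzags, and vertical stacks such as those described in the panels would then emerge from many such segments.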