The Microbial Community of the Cystic Fibrosis Airway Is Disrupted in Early Life

Background
Molecular techniques have uncovered vast numbers of organisms in the cystic fibrosis (CF) airways, the clinical significance of which is yet to be determined. The aim of this study was to describe and compare the microbial communities of the lower airway of clinically stable children with CF and children without CF.

Methods
Bronchoalveolar lavage (BAL) fluid and paired oropharyngeal swabs from clinically stable children with CF (n = 13) and BAL from children without CF (n = 9) were collected. DNA was isolated, and the 16S rRNA regions were amplified, fragmented, biotinylated and hybridised to a 16S rRNA microarray. Patient medical and demographic information was recorded and standard microbiological culture was performed.

Results
A diverse bacterial community was detected in the lower airways of children with CF and children without CF. The airway microbiomes of clinically stable children with CF and children without CF were significantly different as measured by Shannon's Diversity Indices (p = 0.001; t test) and principal coordinate analysis (p = 0.01; Adonis test). Overall, the CF airway microbial community was more variable and had a less even distribution than the microbial community in the airways of children without CF. We highlighted several bacteria of interest, particularly Prevotella veroralis, CW040 and a Corynebacterium, which differed significantly in abundance between the CF and non-CF lower airways. The culture abundances of both Pseudomonas aeruginosa and Streptococcus pneumoniae were found to be associated with CF airway microbial community structure. The CF upper and lower airways were found to have a broadly similar microbial milieu.

Conclusion
The microbial communities in the lower airways of stable children with CF and children without CF show significant differences in overall diversity. These discrepancies indicate a disruption of the airway microflora occurring early in life in children with CF.

For each OTU, multiple specific 25-mer targets were sought that were prevalent among members of the OTU but dissimilar from sequences outside it. In the first step of probe selection for a particular OTU, each of the sequences in the OTU was separated into overlapping 25-mers, the potential targets. Each potential target was then matched to as many sequences of the OTU as possible. Simple pattern searches were not adequate for matching potential targets to sequences, since partial gene sequences were included in the reference set. The multiple sequence alignment provided by Greengenes was therefore needed to give a discrete measurement of group size at each potential probe site. In ranking the possible targets, those having data for all members of the OTU were preferred over those found in only a fraction of the OTU members.
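The first step above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the PhyloChip design code: the function names (`candidate_targets`, `rank_targets`) are hypothetical, and the sketch uses plain substring matching on unaligned sequences, which the text notes is not adequate when partial gene sequences are present. The alignment-based measurement of group size is omitted.

```python
from collections import Counter


def candidate_targets(otu_sequences, k=25):
    """Count, for each k-mer, how many sequences of the OTU contain it."""
    prevalence = Counter()
    for seq in otu_sequences:
        seq = seq.upper()
        # A set ensures each sequence contributes at most once per k-mer.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        prevalence.update(kmers)
    return prevalence


def rank_targets(otu_sequences, k=25):
    """Return (k-mer, fraction of OTU members containing it), most prevalent first."""
    n = len(otu_sequences)
    counts = candidate_targets(otu_sequences, k)
    # Sort by descending prevalence, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(kmer, count / n) for kmer, count in ranked]
```

Calling `rank_targets(sequences)` on the member sequences of one OTU returns the potential targets with those found in every member listed first, mirroring the stated preference for targets with data for all OTU members.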
In the second step, a subset of the prevalent targets was selected and the probe orientation was flipped to the reverse complement to minimize hybridization to unintended amplicons. Probes presumed to be potentially problematic were 25-mers containing a central 17-mer matching sequences in more than one OTU. Thus, probes that were unique to an OTU solely due to a distinctive base in one of the four flanking bases were avoided. Also, probes with mis-hybridization potential to sequences having a common tree node near the root were favoured over those with a common node near the terminal branch. Probes complementary to target sequences that were […]
Of the features (oligonucleotides) on the array, the majority represent publicly available 16S rRNA genes, as described above. Additional probes are included for quality management, processing controls, image orientation, normalization controls, hierarchical taxonomic identification, or pathogen-specific signature detection, and some target additional regions of the chromosome. Probes complementary to lower confidence 16S sequences were also included to enable broadening of the phylogenetic scope of analysis once those sequences are validated with unambiguous entries in public repositories. The PhyloChip assay design includes pre-analytic controls, processing controls, pre-labelled hybridization controls, and negative controls. Pre-analytic and hybridization controls also aid interpretation of background signal intensity and support normalization of overall fluorescent intensity for sample-to-sample comparisons.
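As an illustration of the second probe-selection step described above, the following Python sketch shows the reverse-complement flip and the central 17-mer screen. It is a simplified sketch, not the original design software: the function names and the `sequences_by_otu` mapping are hypothetical, exact substring matching stands in for the alignment-based comparison, and the tree-based ranking of mis-hybridization potential is not modelled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def central_core(target, core_len=17):
    """Return the central core of a target, e.g. the 17-mer inside a 25-mer."""
    flank = (len(target) - core_len) // 2
    return target[flank:flank + core_len]


def is_potentially_problematic(target, sequences_by_otu, core_len=17):
    """True if the central core of the target matches sequences in more than one OTU."""
    core = central_core(target.upper(), core_len)
    hits = {
        otu
        for otu, seqs in sequences_by_otu.items()
        if any(core in s.upper() for s in seqs)
    }
    return len(hits) > 1
```

A target that passes the screen would then be synthesised as `reverse_complement(target)`.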

Data processing and data reductions:
Pre-processing and Data Reduction
Once OTUs are defined as present or absent as described in the main body of this manuscript, taxa are filtered to those present in at least one of the samples (Filter-1), to taxa present in samples from one category but not detected in any samples of the alternate categories (Filter-3), and to taxa significantly increased in abundance in one category compared to the alternate categories (Filter-5). For Filter-3, the percent prevalence required among the samples in one category is first set to 100% and then iteratively decreased until the set of passing taxa intersects all samples. This ensures that each sample contains a present call for at least one of the passing OTUs. For Filter-5, the parametric Welch test was employed to calculate p-values. Because Filter-5 is an exploratory analysis, false discovery rates are not considered in its p-value calculations. For this study, additional Benjamini-Hochberg and permutation tests were performed to account for false discovery rates and multiple testing.
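The Filter-1 and Filter-3 logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the function names, the dictionary layout (`presence[taxon][sample]` as a boolean present call, `category[sample]` as the group label) and the 5-percentage-point decrement are all hypothetical, since the text does not state the step size.

```python
def filter1(presence):
    """Filter-1: keep taxa with a present call in at least one sample."""
    return {taxon for taxon, calls in presence.items() if any(calls.values())}


def filter3(presence, category, step=5):
    """Filter-3: taxa present in one category and absent from all others.

    The required prevalence starts at 100% and is lowered by `step`
    percentage points (an assumed value) until every sample has a present
    call for at least one passing taxon. Returns (threshold, taxa), or
    (None, empty set) if no threshold achieves full coverage.
    """
    samples = list(category)
    groups = {}
    for sample, cat in category.items():
        groups.setdefault(cat, []).append(sample)

    def passing(threshold_pct):
        kept = set()
        for taxon, calls in presence.items():
            for cat, members in groups.items():
                others = [s for s in samples if category[s] != cat]
                if any(calls.get(s, False) for s in others):
                    continue  # detected in an alternate category
                n_present = sum(calls.get(s, False) for s in members)
                if n_present > 0 and 100 * n_present >= threshold_pct * len(members):
                    kept.add(taxon)
        return kept

    for threshold in range(100, 0, -step):
        kept = passing(threshold)
        if all(any(presence[t].get(s, False) for t in kept) for s in samples):
            return threshold, kept
    return None, set()
```

Integer percentages are used for the threshold so that the comparison does not depend on floating-point rounding.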

Summarization
After the taxa are identified for inclusion in the analysis, the values used for each taxon-sample intersection are populated in two distinct ways. In the first case, the Abundance metrics are used directly (AT). Note that sub-detection abundance values are not discarded. In the second case, Binary metrics are created in which 1 represents presence and 0 represents absence (BT).
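The two summarizations above can be illustrated with a short Python sketch. The function name `build_tables` and the nested-dictionary layout are assumptions made for illustration only; the source does not describe the data structures used.

```python
def build_tables(abundance, present):
    """Build the abundance table (AT) and the binary table (BT).

    abundance[taxon][sample] holds the abundance metric, and
    present[taxon][sample] holds the present/absent call (assumed layout).
    """
    # AT: abundance values are used directly; sub-detection values are kept.
    at = {taxon: dict(values) for taxon, values in abundance.items()}
    # BT: 1 for a present call, 0 for an absent call.
    bt = {
        taxon: {sample: int(bool(present[taxon][sample])) for sample in values}
        for taxon, values in abundance.items()
    }
    return at, bt
```

Note that a taxon called absent in a sample keeps its measured value in AT while receiving 0 in BT.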

Sample-to-Sample Distance Function
All profiles are compared in a pair-wise fashion to determine a dissimilarity score for each pair, which is stored in a distance matrix. The distance functions are chosen so that similar biological samples produce only small dissimilarity scores. The UniFrac distance metric, as described by Lozupone et al. [3], utilizes the phylogenetic distance between OTUs to determine the dissimilarity between communities. For weighted UniFrac, the OTU abundance is additionally considered.
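To make the distinction between the two metrics concrete, the following Python sketch computes unweighted and raw (non-normalized) weighted UniFrac on a small rooted tree. It is an illustrative sketch, not the software used in this study: the function names and the tree representation (`tree[node] = (parent, branch_length)`, with the root absent from the dictionary) are assumptions, and communities are given as dictionaries mapping OTU leaves to abundances.

```python
def branch_totals(tree, community):
    """Sum the abundance of all descendant OTUs for each branch of the tree."""
    totals = {}
    for leaf, abundance in community.items():
        node = leaf
        while node in tree:  # walk from the leaf up to (not including) the root
            totals[node] = totals.get(node, 0.0) + abundance
            node = tree[node][0]
    return totals


def unweighted_unifrac(tree, a, b):
    """Fraction of observed branch length leading to OTUs from only one community."""
    ta = branch_totals(tree, {k: v for k, v in a.items() if v > 0})
    tb = branch_totals(tree, {k: v for k, v in b.items() if v > 0})
    unique = total = 0.0
    for node, (_, length) in tree.items():
        in_a, in_b = node in ta, node in tb
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    return unique / total if total else 0.0


def weighted_unifrac(tree, a, b):
    """Raw weighted UniFrac: branch lengths weighted by the difference in
    relative abundance of descendants. Both communities must be non-empty."""
    ta, tb = branch_totals(tree, a), branch_totals(tree, b)
    sum_a, sum_b = sum(a.values()), sum(b.values())
    return sum(
        length * abs(ta.get(node, 0.0) / sum_a - tb.get(node, 0.0) / sum_b)
        for node, (_, length) in tree.items()
    )
```

Applying either function to every pair of samples fills the distance matrix described above; identical communities give a distance of zero under both metrics.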