Obesity Alters the Microbial Community Profile in Korean Adolescents

Obesity is an increasing public health concern worldwide. According to the latest Organization for Economic Co-operation and Development (OECD) report (2014), the incidence of child obesity in Korea has exceeded the OECD average. To better understand and control this condition, the present study examined the composition of the gut microbial community in normal and obese adolescents. Fecal samples were collected from 67 obese (body mass index [BMI] ≥ 30 kg/m2, or ≥ 99th BMI percentile) and 67 normal (BMI < 25 kg/m2 or < 85th BMI percentile) Korean adolescents aged 13–16 years and subjected to 16S rRNA gene sequencing. Analysis of bacterial composition according to taxonomic rank (genus, family, and phylum) revealed marked differences in the Bacteroides and Prevotella populations between normal and obese samples (p < 0.005) at the genus and family levels; however, there was no difference in the Firmicutes-to-Bacteroidetes (F/B) ratio between samples from normal and obese adolescents at the phylum level (F/B normal = 0.50 ± 0.53; F/B obese = 0.56 ± 0.86; p = 0.384). Statistical analysis revealed a significant association between the compositions of several bacterial taxa and child obesity. Among these, Bacteroides and Prevotella showed the most significant association with BMI (p < 0.0001 and 0.0001, respectively). We also found that the composition of Bacteroides was negatively associated with triglyceride (TG), total cholesterol, and high-sensitivity C-reactive protein (hs-crp) levels (p = 0.0049, 0.0023, and 0.0038, respectively), whereas that of Prevotella was positively associated with TG and hs-crp levels (p = 0.0394 and 0.0150, respectively). We then applied the association rule mining algorithm to generate "rules" that identify associations between the populations of multiple bacterial taxa and obesity; these rules were able to discriminate obese from normal states.
Therefore, the present study describes a systemic approach to identify the association between bacterial populations in the gut and childhood obesity.

Introduction
Obesity is a public health issue worldwide and tops the public health agenda in both industrialized and developing countries. Although the adult obesity rate in Korea is among the lowest in Organization for Economic Co-operation and Development (OECD) countries (OECD report 2014, http://www.oecd.org/els/health-systems/Obesity-Update-2014.pdf), it has increased steadily in recent decades. Notably, the rate of childhood obesity in Korea has exceeded the OECD average: 25% of Korean boys (aged 5 to 17 years) are obese, compared with an OECD average of 23%. The obesity rate for Korean girls (20%) is slightly below the OECD average (21%). Many of these children go on to develop obesity-related conditions in young adulthood, such as diabetes, heart disease, or certain types of cancer [1,2], which places a heavy burden on the healthcare system and society in general.
Therefore, to better understand and control this epidemic, a number of studies have attempted to identify genetic and/or environmental factors associated with obesity [3][4][5][6]. The recently emerged field of metagenomics has allowed researchers to examine the microorganisms that inhabit the human gut, known as the microbiota, as a novel environmental factor associated with obesity [7][8][9][10][11][12]. These studies identified marked changes in the composition of the gut microbiota and its metabolic function in obese subjects. They also suggested that the gut microbiota plays a significant role in harvesting energy from food and storing it within the host [7]. Other studies report that the composition of the gut microbiota differs according to age and geographical location [13][14][15][16]. Taken together, these results imply that the composition of the obesity-associated gut microbiota differs markedly between children and adults and among populations. To date, few studies have examined the association between childhood obesity and the composition of the gut microbiota [17][18][19]. Studies of the role of the Firmicutes-to-Bacteroidetes (F/B) ratio in human obesity have produced contradictory results. Two studies, conducted in Spanish [19] and Belgian [18] children, reported an increased F/B ratio; however, a study by Karlsson et al. [17] found no significant difference in this ratio between obese and lean Swedish preschool children. Recently, a metagenomic analysis of samples from European populations indicated that the number of gut microbial genes, and thus the "richness" of the gut bacterial population, differed between obese and lean groups [20].
Therefore, we performed an in-depth analysis of the gut microbiota in 67 obese and 67 normal Korean adolescents aged 13 to 16 years. We sequenced the 16S rRNA genes obtained from fecal samples to obtain an overall picture of the gut bacterial composition of the two groups according to taxonomic rank. Next, we performed statistical analyses to identify bacterial taxa significantly associated with childhood obesity. We then examined the correlation between microbial composition and BMI or the levels of biochemical markers. Finally, we attempted to generate rules (patterns) that associate multiple bacterial taxa with obesity, thereby allowing discrimination between obese and normal states.

Study participants
The study subjects comprised 67 obese adolescents (BMI ≥ 30 kg/m2 or ≥ 99th BMI percentile) and 67 normal adolescents (BMI < 25 kg/m2 or < 85th BMI percentile). All 134 adolescents, aged 13 to 16 years, were recruited in 2012 from Seoul and Kyunggi province as part of the Korean Children and Adolescent Study (KoCAS), which has monitored this cohort annually since the children entered elementary school in 2005 at 7 years of age. None of the subjects had taken antibiotics in the 4 weeks before the sampling dates and none had significant comorbidities such as acute infection or chronic disease. The study was approved by the [...].

Biochemical analyses
A vacutainer tube was used to collect blood samples from the antecubital vein between 9:00 AM and 11:00 AM after a 12-hour overnight fast. Within 30 minutes, plasma and serum were separated and stored at -80°C prior to further analysis. The levels of triglycerides (TG), total cholesterol (Tchol), high-density lipoprotein-cholesterol (HDLc), hs-crp, and glucose were measured using an autoanalyzer (model 7600II; Hitachi, Tokyo, Japan).

Stool sampling and DNA extraction
Each participant collected a stool sample at home. A fresh stool sample (~30 ml) was placed into a collection container with dry ice and brought to the study center within 12 h. The sample was stored at -70°C in the laboratory prior to DNA extraction. DNA was extracted using a QIAamp DNA Stool Kit (Qiagen, Valencia, CA), according to the manufacturer's protocol. The quality and concentration of the DNA were checked using a Nanodrop 2000 spectrophotometer (Nanodrop Technologies, Wilmington, DE).

Pyrosequencing of 16S rRNA
The 16S rRNA gene fragments were amplified from the extracted DNA. The following barcoded primers were designed to target the hyper-variable regions (V1 to V3) within the 16S rRNA gene: 9F (5'-CCTATCCCCTGTGTGCCTTGGCAGTC-TCAG-AC-AGAGTTTGATCMTGGCTCAG-3' [bacterial 16S rRNA primer, included at 90%]; 5'-CCTATCCCCTGTGTGCCTTGGCAGTC-TCAG-AC-GGGTTCGATTCTGGCTCAG-3' [Bifidobacterium 16S rRNA primer, included at 10%]; the target-region primer is the final hyphen-delimited segment of each sequence) and 541R (5'-CCATCTCATCCCTGCGTGTCTCCGAC-TCAG-X-AC-ATTACCGCGGCTGCTGG-3'; 'X' denotes the unique barcode for each subject) (http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-61493/protocols/). PCR was performed as follows: initial denaturation at 95°C for 5 min, followed by 30 cycles of denaturation at 95°C for 30 sec, primer annealing at 55°C for 30 sec, and extension at 72°C for 30 sec, with a final elongation at 72°C for 5 min. Each sample was subjected to PCR on three separate occasions. The quality of the PCR product was confirmed by running a sample on a 2% agarose gel followed by visualization using the Gel Doc system (BioRad, Hercules, CA, USA). The amplified products were purified with the QIAquick PCR purification kit (Qiagen, Valencia, CA, USA) and pooled in equal amounts. Short DNA fragments (non-target products) were removed using an Ampure beads kit (Agencourt Bioscience, MA, USA). The quality and size of the products were assessed using a Bioanalyzer 2100 (Agilent, Palo Alto, CA, USA) and a DNA 7500 chip. Mixed amplicons were used for emulsion PCR and deposited on Picotiter plates. Sequencing was performed with the GS Junior Sequencing system (Roche, Branford, CT, USA) according to the manufacturer's instructions.

Determination of operational taxonomic units and taxonomic classification
The pre-processed reads from each sample were used to calculate the number of operational taxonomic units (OTUs). The number of OTUs was determined with QIIME software (v.1.8.0) by clustering the sequences from each sample at a 97% sequence identity cut-off [16,25,26]. Taxonomic abundance was counted from the pre-processed reads of each sample with RDP Classifier v1.1 at a confidence threshold of 0.8. Microbial composition was normalized by dividing each taxonomic abundance count by the number of pre-processed reads in that sample.
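The normalization step can be illustrated with a minimal sketch. It assumes that taxon counts are held in a dictionary per sample; the function name, the genus counts, and the read total below are hypothetical and are not taken from the study data.

```python
def relative_abundance(taxon_counts, n_reads):
    """Divide each taxon's read count by the sample's number of pre-processed reads."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return {taxon: count / n_reads for taxon, count in taxon_counts.items()}


# Hypothetical sample: 1,000 pre-processed reads, of which 700 were assigned
# to these three genera at the chosen confidence threshold (made-up numbers).
sample_counts = {"Bacteroides": 450, "Prevotella": 160, "Faecalibacterium": 90}
sample_profile = relative_abundance(sample_counts, n_reads=1000)
```

Because the denominator is the number of pre-processed reads rather than the number of classified reads, the proportions of a sample need not sum to 1 when some reads remain unclassified.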

Association rule mining
The CPAR (Classification based on Predictive Association Rules) algorithm was used to generate a complete rule set [27]. This algorithm is more suited to bioinformatics applications than the traditional support-confidence based measure for market basket data analysis because it uses information metrics to generate rules [28]. The CPAR algorithm was implemented by the LUCS-KDD research group (http://cgi.csc.liv.ac.uk/~frans/KDD/Software/FOIL_PRM_CPAR/foilPrmCpar.html).
The accuracy of the rules generated by CPAR is presented in terms of Laplace accuracy. For a rule r, Laplace accuracy is defined as follows: $\mathrm{Laplace\ accuracy}(r) = \frac{N_g + 1}{N_{total} + m}$, where m is the number of target groups, $N_{total}$ is the total number of examples that satisfy the body of the rule (the total number of As in the rule A → B), and $N_g$ is the number of those examples that belong to the predicted target group g.
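As a worked illustration, the sketch below assumes the usual CPAR form of the Laplace accuracy, (N_g + 1) / (N_total + m). The function name and the example counts are hypothetical and are not taken from Table 5.

```python
def laplace_accuracy(n_g, n_total, m):
    """Laplace accuracy of a rule, assuming the form (N_g + 1) / (N_total + m).

    n_g     -- examples covered by the rule body that belong to the predicted group
    n_total -- all examples covered by the rule body
    m       -- number of target groups (2 for obese vs. normal)
    """
    if not 0 <= n_g <= n_total:
        raise ValueError("n_g must lie between 0 and n_total")
    return (n_g + 1) / (n_total + m)


# Hypothetical rule that covers 18 samples, 17 of them in the predicted group.
example_accuracy = laplace_accuracy(n_g=17, n_total=18, m=2)
```

The added constants shrink the estimate toward 1/m, so a rule that covers only a few samples cannot reach an accuracy of 100% even when every covered sample belongs to the predicted group.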

Statistical analyses
Statistical analysis was performed with IBM SPSS version 17.0 (SPSS Inc., Chicago, IL, USA) and R (version 2.15.3). Variables were examined for normality, and those that were not normally distributed were log-transformed before analysis. To measure the alpha diversity of each sample, the OTUs were analyzed using the Shannon index, $H' = -\sum_{i=1}^{S} p_i \ln(p_i)$ [29]. To measure beta diversity, the difference in organism composition between samples i and j was measured according to the Bray-Curtis distance, $BC_{ij}$. Principal component analysis (PCA) was then performed using the measured beta diversities [30]. The Chi-square test was used to test whether gender was equally distributed between the normal and obese groups. Mean age, height, weight, BMI, and blood profiles were compared using Student's t-test. The Kolmogorov-Smirnov test with Lilliefors significance correction was used to examine the normality of the microbial composition data in the obese and normal groups; accordingly, the non-parametric Mann-Whitney U test was used for the comparison. The association between microbiota composition and BMI or levels of biochemical markers was expressed in terms of the Pearson partial correlation coefficient, controlling for age and gender. FDR multiple test correction was performed in R. An adjusted P-value < 0.05 was considered significant.
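The two diversity measures can be sketched as follows. The Shannon index follows the formula given in the text with the natural logarithm. For the Bray-Curtis distance, the sketch assumes the standard abundance-based definition, the sum of absolute differences divided by the sum of all abundances, which may differ in notation from the form used in the original analysis. The function names are illustrative.

```python
import math


def shannon_index(otu_counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    proportions = [count / total for count in otu_counts if count > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(sample_i, sample_j):
    """Bray-Curtis distance between two abundance vectors listed in the same taxon order."""
    if len(sample_i) != len(sample_j):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(sample_i, sample_j))
    denominator = sum(a + b for a, b in zip(sample_i, sample_j))
    return numerator / denominator if denominator else 0.0
```

Under these definitions, a sample with S equally abundant OTUs has H' = ln(S), and the Bray-Curtis distance ranges from 0 for identical profiles to 1 for profiles that share no taxa.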

Subject characteristics
The general characteristics of the study subjects/samples are shown in Table 1. The 134 participants were classified as obese or normal (67 obese and 67 normal) according to BMI status. Gender and age showed a similar distribution between the two groups; however, the individuals in the obese group were slightly taller than those in the normal group. There were significant differences in BMI, BMI z-score and the level of biochemical markers (TG, Tchol, HDLc, and hs-crp) between the two groups (Table 1). These levels of biochemical markers were significantly lower in the normal group than in the obese group.

Microbial diversity across all samples
After filtering out primer sequences and low-quality and chimeric sequences, we obtained a total of 1,185,358 high-quality sequences (reads) from 134 samples (range, 2,065-42,522 reads per sample) (panel A of Figure A in S1 File). Each sample was covered by an average of 8,846 reads, which is similar to the average number of reads (8,427) reported in a study examining the gut microbiota of 20 Korean individuals [16]. Based on these reads, we calculated the number of OTUs to examine the diversity of the gut microbiota. The mean number of OTUs was 356 ± 140 (range, 105-876) (panel B of Figure A in S1 File). There was no significant difference in the number of OTUs between normal and obese individuals (Mann-Whitney U test, p = 0.072). In addition, we found no obvious difference in alpha diversity (Shannon index) between normal and obese samples (mean Shannon index: normal 6.94 ± 0.49; obese 6.98 ± 0.59) (Figure B in S1 File). The PCA of beta diversity likewise showed no differential pattern between normal and obese samples (Figure C in S1 File).
In the ring charts of average microbial composition for normal and obese individuals, the innermost ring shows the composition at the phylum level, whereas the middle and outermost rings show the composition at the family and genus levels, respectively. Notably, there was a marked difference in the average proportions of Bacteroides and Prevotella between normal and obese samples at the genus level. Bacteroides was the most abundant genus in normal adolescents (45%), whereas its proportion in obese adolescents was 25%. Conversely, Prevotella accounted for 16% in normal adolescents but was the most abundant genus in obese adolescents (35%). This trend persisted at the family level (Bacteroidaceae: 45% in normal vs. 25% in obese adolescents; Prevotellaceae: 16% vs. 36%).
The box plots in Fig 2 clearly show the differential abundance of these two bacterial taxa in obese and normal adolescents; however, this difference was not apparent at the phylum level, since both Bacteroides and Prevotella belong to the phylum Bacteroidetes. At the phylum level, we found no significant differences between the Bacteroidetes, Firmicutes, and Proteobacteria populations in normal and obese adolescents (Fig 3A). At this level, the microbial composition of samples from normal adolescents was similar to that observed in a study examining the gut microbiota of 20 Korean individuals [16]. The authors of that study reported that Bacteroidetes and Firmicutes were the two major microbial taxa, amounting to an average of 94.8% of sequence reads. In the present study, these taxa accounted for 94% (on average) of the sequence reads obtained from normal adolescents. We next examined the F/B ratio, whose association with human obesity is unclear [17][18][19]. We found no significant difference in the F/B ratio between normal and obese adolescents (F/B ratio ± SD: F/B normal = 0.50 ± 0.53; F/B obese = 0.56 ± 0.86; p = 0.384) (Fig 3B).
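A minimal sketch of the F/B summary is given below. It assumes that the ratio is computed per sample from phylum-level proportions and then summarized as mean ± sample SD; whether the original analysis used exactly this procedure is an assumption. The proportions are made up for illustration.

```python
from statistics import mean, stdev


def fb_ratio(firmicutes, bacteroidetes):
    """Firmicutes-to-Bacteroidetes ratio for one sample."""
    if bacteroidetes <= 0:
        raise ValueError("Bacteroidetes proportion must be positive")
    return firmicutes / bacteroidetes


# Hypothetical (Firmicutes, Bacteroidetes) proportions for three samples.
group = [(0.30, 0.60), (0.20, 0.80), (0.45, 0.50)]
ratios = [fb_ratio(f, b) for f, b in group]
group_mean, group_sd = mean(ratios), stdev(ratios)
```

Because a ratio is unbounded above, a few samples with a small Bacteroidetes proportion can inflate the SD beyond the mean, as in the values reported here (0.50 ± 0.53 and 0.56 ± 0.86).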

Comparison of fecal microbial composition
The non-parametric Mann-Whitney U test also identified other bacterial taxa that differed significantly in terms of composition between obese and normal samples. We found significant differences in the Alistipes, Faecalibacterium, and Oscillibacter populations at the genus level (FDR-adjusted p = 0.0125, 0.0420, and 0.0007, respectively) (Table 2), and in the Rikenellaceae, Sutterellaceae, Ruminococcaceae, and Veillonellaceae populations at the family level (FDR-adjusted p = 0.0112, 0.0450, 0.0022, and 0.0450, respectively) (Table 3).
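The text does not state which FDR procedure was applied. The sketch below assumes the Benjamini-Hochberg step-up adjustment, which is one common choice, and is written in plain Python for illustration; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending by p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped
    # by the adjusted values of all larger p-values (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw p-value is multiplied by m / rank, where rank is its position in ascending order, and the result is then capped so that the adjusted values never decrease as the raw values increase.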

Association between BMI and biochemical markers
We next measured the correlation between taxa at the genus level (based on the five taxa showing statistically significant differences in composition between obese and normal adolescents at the genus level) and the BMI z-score or levels of biochemical markers (Table 4). Consistent with previous observations, we found that the Bacteroides and Prevotella populations were significantly associated with the BMI z-score (p < 0.0001 and 0.0001, respectively); the former showed a negative correlation while the latter showed a positive correlation. We also found that Alistipes was negatively correlated with BMI, although the correlation was less significant (p = 0.0360).
TG, Tchol, and hs-crp showed a negative correlation with the proportion of Bacteroides (p = 0.0049, 0.0023, and 0.0038, respectively), while HDLc showed a positive correlation (p = 0.0165). By contrast, TG and hs-crp showed a positive correlation with the proportion of Prevotella (p = 0.0394 and 0.0150, respectively).
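A partial correlation that controls for age and gender can be sketched with the standard recursive formula, which removes one covariate at a time. This is an illustrative plain-Python version and not the SPSS or R routine used in the study; the data values and the 0/1 gender coding are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y after controlling for each sequence in covariates."""
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data for six adolescents (made-up values).
bacteroides = [0.50, 0.42, 0.30, 0.25, 0.45, 0.20]
bmi_z = [-0.2, 0.1, 2.1, 2.5, 0.0, 2.8]
age = [13, 14, 15, 16, 14, 13]
gender = [0, 1, 0, 1, 1, 0]  # assumed coding: 0 = girl, 1 = boy
r_partial = partial_corr(bacteroides, bmi_z, [age, gender])
```

The recursion is equivalent to correlating the residuals of x and y after regressing each on the covariates, which is why a taxon can show a weaker or stronger association with BMI once age and gender are taken into account.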

Association rule mining
A machine learning algorithm called association rule mining was used to identify patterns of association between multiple bacterial taxa in obese and normal adolescents [27]. Rules were generated based on the five taxa showing significantly different compositions in obese and normal adolescents at the genus level (Table 2). The composition of each bacterial taxon was categorized into four groups (1, 2, 3, or 4) according to the quartile values (see the applied category for rule generation in Table 5). Five "obese" rules and six "normal" rules, each with an accuracy ≥ 80%, are presented in Table 5. The most accurate rule is "normal" rule 6, which states that if the proportion of Bacteroides is in the 4th quartile and that of Prevotella in the 1st quartile, the sample is classified as "normal" (accuracy, 95%). Similarly, "normal" rule 7 states that if the proportions of both Bacteroides and Oscillibacter are in the 4th quartile, then the sample is classified as "normal" (accuracy, 89%). "Obese" rule 1 states that if the proportion of Prevotella is in the 4th quartile and those of both Faecalibacterium and Oscillibacter are in the 1st quartile, then the sample is classified as "obese" (accuracy, 86%). Interestingly, a comparison of rule 5 with rule 11 suggests that the proportion of Oscillibacter accounts for the difference between the obese and normal states. In both rules, the proportions of the first two genera (Bacteroides and Faecalibacterium) are in the same quartile; when the proportion of Oscillibacter is in the 1st quartile the rule is an "obese" rule, and when it is in the 4th quartile the rule is a "normal" rule. Indeed, except for rule 4, all obese rules in Table 5 involve a 1st-quartile proportion of Oscillibacter, whereas most normal rules (except for rule 9) involve a 4th-quartile proportion.
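The way such rules are applied can be sketched as follows. The sketch covers only quartile binning and rule matching, not the CPAR rule-generation step itself. The three rules are those quoted in the text; the quartile cut-point method, the first-match order, and all function names are assumptions made for illustration.

```python
from statistics import quantiles


def quartile_bins(values):
    """Assign each value to quartile 1-4 using cut points computed from the cohort."""
    q1, q2, q3 = quantiles(values, n=4)  # default 'exclusive' method (an assumption)

    def bin_of(v):
        if v <= q1:
            return 1
        if v <= q2:
            return 2
        if v <= q3:
            return 3
        return 4

    return [bin_of(v) for v in values]


# Rules quoted in the text: (genus -> required quartile, predicted state).
RULES = [
    ({"Bacteroides": 4, "Prevotella": 1}, "normal"),  # "normal" rule 6
    ({"Bacteroides": 4, "Oscillibacter": 4}, "normal"),  # "normal" rule 7
    ({"Prevotella": 4, "Faecalibacterium": 1, "Oscillibacter": 1}, "obese"),  # "obese" rule 1
]


def classify(sample_bins, rules=RULES):
    """Return the label of the first rule whose conditions all hold, else None."""
    for conditions, label in rules:
        if all(sample_bins.get(genus) == q for genus, q in conditions.items()):
            return label
    return None
```

For example, a sample whose Bacteroides proportion falls in the 4th quartile and whose Prevotella proportion falls in the 1st quartile matches "normal" rule 6, while a sample that matches no rule is left unclassified.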

Discussion
Here, we examined fecal samples from 67 obese and 67 normal Korean adolescents to identify the association between the composition of the gut microbiota and childhood obesity. We used the QIAamp DNA Stool Kit to extract DNA because this kit shows high efficiency when used with different protocols [31,32]. A study that compared mechanical and enzymatic methods of DNA extraction indicated that the mechanical cell disruption results in higher bacterial diversity and improves DNA extraction efficiency [33]. However, the microbiota was highly similar regardless of the extraction method used. We found that the composition of the microbiota in samples from normal adolescents was in agreement with that reported in a study that examined the gut microbiota of 20 Korean individuals at the phylum level [16] and a study that examined the gut enterotypes in Korean monozygotic twins [34]. Even though two of the 20 Korean samples were from children, we can assume that normal Korean adolescents and adults have similar gut microbial compositions, at least at the phylum level. The results of the study of Korean monozygotic twins indicated that the microbiota of healthy Koreans clustered into two enterotypes, which are dominated by either Bacteroides or Prevotella. However, these enterotypes were not significantly correlated with biomarkers such as age, BMI, blood pressure, Tchol, or TG. Only one biomarker, serum uric acid, was different between the two enterotypes [34].
Many studies have attempted to identify an association between the composition of the gut microbiota and obesity [7][8][9][10][11]35]. Some results are consistent whereas others are contradictory. For example, we found no significant difference in the F/B ratio between obese and normal adolescents, which is in line with a study by Karlsson et al. [17], who examined this ratio in obese and lean Swedish preschool children. By contrast, one Spanish study [19] and one Belgian study [18] reported an increased F/B ratio in obese children. Considering that the association between the composition of the gut microbiota and obesity appears to differ according to age and geographical location [13][14][15][16], we surmise that the results of these studies may apply only to specific populations and specific age groups. We assume that the inconsistencies are due to complex relationships between genetic, environmental, technical, and/or clinical factors. Therefore, more integrative approaches will be needed to fully understand this complex association.
Most studies searching for a link between the microbial composition in the gut and obesity focused on only one specific taxonomic rank when comparing normal and obese individuals. Here, we examined bacterial composition at the genus, family, and phylum levels. We found that the proportions of Bacteroides (Bacteroidaceae) and Prevotella (Prevotellaceae) were markedly different in normal and obese adolescents at both the genus and family levels. The obese adolescents examined in the present study were morbidly obese (BMI, 35.4 ± 2.9 kg/m2 or ≥ 99th BMI percentile). Likewise, Zhang et al. [36] reported that Prevotellaceae, a subgroup of Bacteroidetes, were highly enriched in severely obese subjects. Here, we found that Bacteroides was the most prevalent genus in the normal adolescent group, a finding that is inconsistent with that of Agans et al., who found that Ruminococcus was the most prevalent genus in the normal adolescent group [37]. A metagenomic analysis examining the number of gut microbial genes, and thus the "richness" of the gut microbiota, indicated that obese individuals were more likely to possess low gene count (LGC) microbiota [20]. The significant difference in the Faecalibacterium proportion between obese and lean individuals in that study is consistent with the results reported herein. We believe that these adiposity- and age-specific differences in the bacterial populations are more informative, and will increase our understanding of the association between gut microbial composition and obesity. We also found that the size of the Bacteroides population was negatively associated with TG, Tchol, and hs-crp levels (p = 0.0049, 0.0023, and 0.0038, respectively) but positively associated with HDLc (p = 0.0165); however, these results do not agree with those of Bervoets et al., who examined gut microbiota composition in 26 obese and 27 lean children aged 6-16 years [18]. They found no significant association between the Bacteroides population and biochemical markers such as glucose, HDLc, or Tchol; however, they did find a positive association between Lactobacillus and plasma hs-crp levels (p = 0.007). It is not clear whether the difference between the results of their study and our own is due to the smaller sample size of their study or to population differences.
Table 5. Association rules generated by association rule mining.
We hypothesized that the populations of multiple bacterial taxa are associated with obesity; therefore, we attempted to identify patterns using association rule mining. By categorizing the composition of statistically significant microbiota, we devised several rules to determine whether a sample can be classified as "obese" or "normal". We found that when Bacteroides and Faecalibacterium were equally abundant, the abundance of Oscillibacter was the major determinant of obese or normal status. Other studies have also associated the abundance of Oscillibacter with obesity. For example, Tims et al. examined twins that were discordant in terms of BMI status and found that Oscillibacter was more abundant in the leaner twin [38]. Walker et al. examined the effect of a precisely controlled diet in 14 overweight men; they found that the Oscillibacter group increased on both the resistant starch (RS) diet and the reduced-carbohydrate weight loss (WL) diet [39]. Even though the authors were unsure whether Oscillibacter is a starch degrader, they assumed that the increase in the population must be due to the diet itself. Further studies are needed to verify this observation. To investigate what happens when we take into account genera other than the five showing significant differences in terms of population, we examined 28 other genera from both obese and normal samples whose median values for rule generation were greater than 0 (Table A in S1 File). As expected, the rule set was larger than that obtained for five genera; indeed, 12 "obese" rules and 17 "normal" rules, each with an accuracy ≥ 80%, were generated. However, including the rarer genera did not lead to an improvement in overall accuracy. One interesting finding is that the obese rules were often met when the proportion of Sutterella was in the 3rd or 4th quartile.
However, since these rules were generated from a relatively small sample set, they may not be generalizable. We believe that examining a larger sample set in the future will identify more reliable rules. Recent studies examined the correlation between bacterial composition and diet. De Filippo et al. showed that the fecal communities in rural African children were different from those in European children [15]. African children, who consume a diet low in fat and protein and rich in plant-based foods, have a significantly enriched Bacteroidetes population and a depleted Firmicutes population when compared with European children. Other studies also demonstrate that the Prevotella-enriched enterotype is associated with a high carbohydrate diet [18,40], while the Bacteroides-enriched enterotype is correlated with a diet high in protein and animal fat [41]. These results suggest that alterations in diet induce changes in the gut microbiota. Even though we could not evaluate the effects of dietary pattern and gut microbial composition on obesity in the current study, we believe that this should be taken into consideration in future in-depth studies. In addition, the present study was of cross-sectional design. Therefore, additional prospective studies are required to fully determine the causal relationships between gut microbiota, diet, and obesity.
Supporting Information
S1 File. Figures A to C; Table A, Association rules generated by association rule mining using 28 different genera. (PDF)