Transcriptome and Biochemical Analyses Revealed a Detailed Proanthocyanidin Biosynthesis Pathway in Brown Cotton Fiber

Brown cotton fiber is the major raw material for the colored cotton industry. Previous studies have shown that the brown pigments in cotton fiber are proanthocyanidins (PAs). To clarify the details of the PA biosynthesis pathway in brown cotton fiber, gene expression profiles in developing brown and white fibers were compared via digital gene expression profiling and qRT-PCR. Compared to white cotton fiber, all steps from phenylalanine to PA monomers (flavan-3-ols) were significantly up-regulated in brown fiber. Liquid chromatography-mass spectrometry analyses showed that most free flavan-3-ols in brown fiber were in the 2,3-trans form (gallocatechin and catechin), and the main units of polymeric PAs were trihydroxylated on the B ring. Consistent with the monomeric composition, the transcript levels of flavonoid 3′,5′-hydroxylase and leucoanthocyanidin reductase in cotton fiber were much higher than those of their competing enzymes acting on the same substrates (dihydroflavonol 4-reductase and anthocyanidin synthase, respectively). Taken together, our data reveal a detailed PA biosynthesis pathway wholly activated in brown cotton fiber, and demonstrate that flavonoid 3′,5′-hydroxylase and leucoanthocyanidin reductase represent the primary flow of PA biosynthesis in cotton fiber.


Introduction
Naturally colored cotton is an important raw material for ecological textiles. With naturally colored cotton, textile manufacturers can eliminate dyeing during processing and significantly reduce processing costs, environmental pollution and chemical residues in fabrics [1,2]. In addition, naturally colored cotton may have lower flammability and a higher ultraviolet protection value than traditional white cotton [3,4]. Brown cotton fibers of different shades are the most widely used in the modern colored cotton industry. A wealth of information has been obtained about the chemical properties and biosynthesis pathway of brown pigments in cotton fiber, suggesting that these pigments are proanthocyanidins (PAs) [5][6][7][8][9]. However, the exact chemical properties of PA pigments and the details of the PA biosynthesis pathway in brown cotton fiber remain to be elucidated.
PAs, also known as condensed tannins, are widely distributed in plants and serve various functions, such as pigments in seed coats and protectants against herbivores and microbes [10,11]. PAs are also an important factor affecting mouthfeel, contributing bitter flavor and astringency to our daily foods and beverages [12]. In addition, the antioxidant and anti-inflammatory properties of PAs make them potential chemopreventive and chemotherapeutic agents for some human diseases, including cancers [12,13]. Chemically, PAs are oligomers or polymers of polyhydroxy flavan-3-ol units. PA polymers are presumably synthesized by adding flavan-3,4-diol (leucoanthocyanidin) molecules to an initiating flavan-3-ol unit or to the terminal unit of a flavan-3-ol chain (Figure 1) [10,11,14]. Both flavan-3-ols and flavan-3,4-diols are synthesized through the plant flavonoid pathway. The details of this pathway vary with tissue and species and determine the PA composition in different plants [15].
In an attempt to clarify the details of the PA biosynthesis pathway in brown cotton fiber, we performed a digital gene expression (DGE) analysis to compare the gene expression profiles in brown and white fibers. A total of 24 PA synthase genes were found to be significantly up-regulated in brown fiber. Furthermore, we determined the chemical properties of PAs in brown fiber by liquid chromatography-mass spectrometry (LC-MS). The majority of PA initiating units consisted of gallocatechin, and the main extension units in brown fiber were trihydroxylated on the B ring. These results demonstrate a detailed PA pathway involved in the brown pigmentation of cotton fiber, which may be essential for manipulating the biosynthesis of pigments and other flavonoids in cotton fiber.
A recombinant inbred line (RIL) population derived from the cross between the white fiber cotton cultivar Yumian No.1 and the brown fiber line T586 was constructed as described [16]. Total RNAs were extracted from fibers at 14 days post anthesis (DPA) using a modified CTAB method [17]. Equal amounts of total RNA from 10 white fiber RILs and from 10 brown fiber RILs were mixed to form two RNA bulks (WCF and BCF, respectively). DGE analyses of the RNA bulks were performed at BGI-Shenzhen (Shenzhen, Guangdong, China) using the standard procedure [18,19]. Briefly, a biotin-labeled oligo d(T) primer was employed to initiate the synthesis of double-stranded cDNA. After NlaIII restriction, the 3′ cDNA ends were separated by a magnetic method and linked to an adaptor containing an MmeI recognition site. After MmeI digestion and ligation to the second adaptor, the tags were sequenced on an Illumina Genome Analyzer (Illumina, Inc., San Diego, CA) using Solexa technology. The raw sequence data were deposited in the NIH Sequence Read Archive under the accession number SRP033354. Sequenced tags were annotated by aligning them with Gossypium hirsutum unigenes and singleton ESTs (ftp://ftp.ncbi.nih.gov/repository/UniGene/Gossypium-hirsutum/Ghi.seq.uniq.gz, 2010) [20]. Differentially expressed genes (DEGs) in WCF and BCF bulks were identified according to the frequency of corresponding tags by setting a cutoff on a false discovery rate (FDR) < 0.001 and a
Figure 1. The detailed PA biosynthesis pathway in brown cotton fiber. A typical PA monomer is depicted in the upper right (R = H or OH). Solid arrows represent reactions from substrates to products, with the corresponding synthases indicated. The arrow line thickness roughly reflects the expression levels of the corresponding synthases and the putative flow rates in these steps. Expression levels are classified into 4 categories according to transcript abundance detected in the DGE analysis, i.e. low, moderate, high and very high expression (2–20, 20–200, 200–1000 and over 1000 TPM, respectively). Transcript levels (in TPM) of various PA synthases in white/brown fibers are indicated. Dashed arrows indicate the monomeric origins of oligomeric or polymeric PAs. PA synthases are abbreviated as in Table 1.

Quantitative RT-PCR
Quantitative RT-PCR (qRT-PCR) was employed to detect the expression levels of the predominant PA synthase genes in 10 brown fiber RILs and 10 white fiber RILs. The investigated genes and corresponding primers are listed in Table 1. The histone 3 and elongation initiation factor 5 genes from cotton were amplified as RNA standards [23,24]. PCRs were performed on a CFX96™ real-time PCR detection system with SYBR Green supermix (Bio-Rad, CA, USA). The thermocycling parameters were as follows: 95°C for 2 min, followed by 40 cycles of 95°C for 10 s and 57°C for 20 s. A standard melting curve was added to monitor the specificity of the PCR products. The reactions were repeated 3 times, and the data were analyzed using the Bio-Rad CFX Manager 2.0 software provided by the manufacturer.
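The relative quantification was carried out by the Bio-Rad software, and the formula it applied is not stated here. Purely as an illustration, the sketch below assumes the widely used 2^-ΔΔCt approach with 100% amplification efficiency and normalizes against the mean Ct of the two reference genes. The function name and all Ct values are hypothetical and are not data from this study.

```python
from statistics import mean

def fold_change(target_ct_test, ref_cts_test, target_ct_ctrl, ref_cts_ctrl):
    """Relative expression of a target gene (test vs. control) by 2^-ddCt.

    Assumption: 100% PCR efficiency. Averaging the reference Ct values is
    equivalent to taking the geometric mean of the reference quantities.
    """
    d_ct_test = target_ct_test - mean(ref_cts_test)   # normalize test sample
    d_ct_ctrl = target_ct_ctrl - mean(ref_cts_ctrl)   # normalize control sample
    return 2 ** -(d_ct_test - d_ct_ctrl)

# Hypothetical Ct values: target gene in brown (test) and white (control) fiber,
# each with two reference genes (e.g. histone 3 and the initiation factor gene).
brown_vs_white = fold_change(20.0, [18.0, 19.0], 24.0, [18.0, 19.0])
print(brown_vs_white)
```

With identical reference Ct values in both samples, a 4-cycle difference in the target gene corresponds to a 16-fold difference in expression under these assumptions.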

Extraction and Purification of Cotton Fiber PAs
PAs were extracted from developing brown cotton fibers and purified as described [25]. Around 10 g of fibers harvested from bolls at 20 DPA were ground to a fine powder in liquid N2 and extracted twice in 50 ml of 80% aqueous acetone at 4°C for 2 h. The solutions were centrifuged at 10 000 rpm for 5 min. The supernatants were combined, and the acetone was evaporated under vacuum at 35°C. Two-microliter aliquots of the remaining aqueous solution were applied to a packed Supelco Discovery DPA-6S polyamide cartridge (500 mg, Sigma-Aldrich Chemie Inc.). After washing with 5 ml of 30% methanol, PAs were eluted with 2 ml of 90% N,N-dimethylformamide. Finally, the eluates were evaporated to dryness under vacuum, re-suspended in methanol, and subjected to LC-MS analyses.

LC-MS Analysis
To detect monomeric flavan-3-ols in the purified PAs, LC-MS analyses were performed on an LC-MS 2010A system (Shimadzu, Japan) with an Xtimate C18 column (2.1 × 150 mm, 5 μm, Welch Materials, Inc., Shanghai, China). Solvent A was water:formic acid (99:1, v/v), and solvent B was acetonitrile:formic acid (99:1, v/v). Around 5 mg PAs were injected and eluted with a gradient of solvent B in A at a flow rate of 0.2 ml/min. After a 3-min isocratic elution in 90% solvent A and 10% solvent B, the solvent B concentration increased from 10% to 30% (v/v) over 12 min, followed by a 15-min wash with 100% B and a 3-min re-equilibration in 10% B. Positive ions were monitored in selected ion monitoring mode. Monomeric flavan-3-ols were identified according to the typical ions and retention times of authentic standards. Catechin and epicatechin with [290+H]+ ions eluted at 8.64 and 14.12 min, while gallocatechin and epigallocatechin with [306+H]+ ions were detected at 3.92 and 6.62 min, respectively. Flavan-3-ol amounts were calculated by reference to the standard curves of the authentic standards.
To elucidate the chemical properties of the PA extension units, purified PAs were hydrolyzed by the acid-butanol method [26]. PAs were mixed with an equal volume (50 ml) of concentrated HCl:butanol (1:9, v/v) and incubated at 100°C for 1 h. The hydrolysates were evaporated to dryness under vacuum, re-suspended in 100 ml aqueous methanol (15%, v/v), and subjected to LC-MS analysis. The column and solvents were the same as those used to detect monomeric flavan-3-ols, while the gradient profile was as follows: 0 to 3 min, 10% of solvent B; 3 to 20 min, 10% to 40% of B; 20 to 40 min, 100% of B; and 20 to 23 min, 10% of B for re-equilibration. Typical ions and retention times for delphinidin, cyanidin and pelargonidin (Indofine, Hillsborough, NJ, USA) were [304+H]+ at 14.81 min, [288+H]+ at 16.73 min, and [272+H]+ at 18.33 min, respectively. Anthocyanidins were quantified according to the areas of typical peaks by reference to the standard curves of the corresponding standards.
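The gradient used for the monomeric flavan-3-ols can be written down as a simple time program. The sketch below encodes the four stages given above (3 min at 10% B, a 12-min linear ramp to 30% B, 15 min at 100% B, 3 min at 10% B). It assumes that the changes between stages are instantaneous steps, which the text does not specify, and the function name is hypothetical.

```python
def percent_b(t_min):
    """Return the percentage of solvent B at time t_min (minutes).

    Stages (from the method description): 0-3 min isocratic 10% B,
    3-15 min linear ramp 10% -> 30% B, 15-30 min wash at 100% B,
    30-33 min re-equilibration at 10% B.
    Assumption: stage changes are instantaneous steps.
    """
    if t_min < 0 or t_min > 33:
        raise ValueError("time is outside the 33-min program")
    if t_min <= 3:
        return 10.0
    if t_min <= 15:
        return 10.0 + (30.0 - 10.0) * (t_min - 3) / 12  # linear ramp
    if t_min <= 30:
        return 100.0
    return 10.0

# Solvent composition at a few time points of the run
for t in (0, 9, 15, 20, 32):
    print(t, percent_b(t))
```

Under this reading, all four flavan-3-ol standards (retention times 3.92 to 14.12 min) elute before the end of the 10% to 30% ramp.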

Digital Gene Expression (DGE) Analysis of Brown and White Cotton Fibers
Accumulation of flavonoids and pigments in fibers may retard fiber development and reduce the final fiber quality and yield [9,27,28]. To dissect the molecular basis of pigment biosynthesis and its effect on fiber development, gene expression profiles in brown and white cotton fibers (BCF and WCF) were compared by DGE analysis. More than 3 million tags were generated from the developing fibers, including 2 860 036 and 2 913 186 clean tags in the BCF and WCF libraries, respectively. Among these, 35664 (31.26%) distinct tags in BCF and 29048 (30.27%) in WCF were unambiguously mapped to a single G. hirsutum unigene or singleton EST (ftp://ftp.ncbi.nih.gov/repository/UniGene/Gossypium-hirsutum/Ghi.seq.uniq.gz, 2010). Gene expression levels in the BCF and WCF bulks were compared according to the corresponding tag frequencies. A total of 2079 differentially expressed genes (FDR < 0.001) were identified, among which 1165 genes were up-regulated (expression level ratio BCF/WCF > 2) and 914 were down-regulated (BCF/WCF < 0.5) in brown fibers (Figure 2).
Gene ontology (GO) analysis suggested that the cellular components of vesicles (including membrane-bounded vesicle and cytoplasmic vesicle) and membrane were the most significantly over-represented among the differentially expressed genes (DEGs) between white and brown fibers (Table 2). The most significantly over-represented Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways among the DEGs were oxidative phosphorylation and ubiquitin-mediated proteolysis (Table 3). In addition, some DEGs were related to physiological processes involved in the regulation of fiber development, such as reactive oxygen [29] and ethylene [30] (Table 4). These data indicate that the brown fiber gene (Lc1) and/or pigment accumulation in cotton fiber may affect multiple aspects of fiber development other than fiber coloration.
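The selection rule for differentially expressed genes can be sketched as follows: a gene counts as up-regulated when FDR is below 0.001 and the BCF/WCF expression ratio exceeds 2, and as down-regulated when the ratio is below 0.5. The sketch assumes that TPM means tag count per million clean tags in the library; the function names and the example values are hypothetical, and the FDR is taken as already computed.

```python
BCF_CLEAN_TAGS = 2_860_036  # clean tags in the brown fiber library (from the text)
WCF_CLEAN_TAGS = 2_913_186  # clean tags in the white fiber library (from the text)

def tpm(tag_count, library_size):
    """Normalize a raw tag count to tags per million clean tags (assumed definition)."""
    return tag_count / library_size * 1_000_000

def classify(bcf_tpm, wcf_tpm, fdr, fdr_cutoff=0.001):
    """Label a gene as 'up', 'down' or 'not significant' in brown fiber."""
    if bcf_tpm <= 0 or wcf_tpm <= 0:
        raise ValueError("this sketch requires positive expression in both libraries")
    if fdr >= fdr_cutoff:
        return "not significant"
    ratio = bcf_tpm / wcf_tpm
    if ratio > 2:
        return "up"
    if ratio < 0.5:
        return "down"
    return "not significant"

# Hypothetical gene: 2860 tags in BCF, 291 tags in WCF, FDR already computed
label = classify(tpm(2860, BCF_CLEAN_TAGS), tpm(291, WCF_CLEAN_TAGS), fdr=1e-6)
print(label)

# The reported counts are internally consistent: 1165 up + 914 down = 2079 DEGs
total_degs = 1165 + 914
print(total_degs)
```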

Expressions of PA Synthase Genes in White and Brown Fibers
In higher plants, PAs are synthesized through the flavonoid pathway from phenylalanine to flavan-3-ols (Figure 1) [10,11,14,15]. By DGE analysis of brown and white cotton fibers, a total of 34 PA synthase genes were identified (Table 5). Among these genes, 24 were significantly up-regulated in brown fiber, and none was significantly down-regulated. For each step of the pathway (Figure 1), there was at least one encoding gene significantly up-regulated in brown fiber (Table 5).
To verify the results of the DGE analysis, expression levels of the predominant PA synthase genes identified for all steps from phenylalanine to flavan-3-ols (Figure 1, Tables 1 and 5) and of a flavonoid 3′-hydroxylase homologous gene (F3′H, Gorai.008G198200 from G. raimondii) were detected in brown and white fibers via qRT-PCR. Compared to white fiber, all 12 PA synthase genes were up-regulated in brown fiber (16–260 fold, Figure 3A). Furthermore, high-level expression of PA synthase genes co-segregated with brown fiber in individual RILs, as exemplified by the LAR and F3′5′H genes (Figure 3B). These results were consistent with the DGE profiles and confirmed that the brown fiber gene (Lc1) wholly activated the PA biosynthesis pathway in cotton fiber.
DGE analysis also revealed that transcript levels of different flavonoid synthases varied dramatically in cotton fibers (Table 5 and Figure 4). In white fiber, the synthases related to common steps in flavonoid biosynthesis (PAL, C4H, 4CL, CHS, CHI, F3H and F3′5′H) and PA-specific steps (LAR and ANR) showed moderate expression levels (20–200 TPM), while DFR and ANS had very low expression (<2 TPM). In brown fiber, the relative transcription pattern of the different PA synthases was similar to that in white fiber (Figure 4). Although the whole PA pathway was significantly up-regulated in brown fiber, DFR and ANS had moderate (55.26 TPM) and low (7.34 TPM) expression levels, respectively, in comparison to the much higher expression levels of other PA synthases (e.g. 940.2 TPM for F3′5′H and 616.43 TPM for LAR). These data suggest that the relative expression profile of different flavonoid synthase genes in cotton fiber is strictly regulated at the transcriptional level.
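The expression categories used here (below 2 TPM very low, 2 to 20 low, 20 to 200 moderate, 200 to 1000 high, over 1000 very high) amount to a simple binning rule, sketched below. How values that fall exactly on a boundary are assigned is not stated in the text, so the half-open intervals and the function name are assumptions.

```python
def expression_category(tpm_value):
    """Bin a transcript level (TPM) into the categories used in the DGE analysis.

    Assumption: lower bounds are inclusive, and exactly 1000 TPM counts as
    'high' because 'very high' is described as over 1000 TPM.
    """
    if tpm_value < 0:
        raise ValueError("TPM cannot be negative")
    if tpm_value < 2:
        return "very low"
    if tpm_value < 20:
        return "low"
    if tpm_value < 200:
        return "moderate"
    if tpm_value <= 1000:
        return "high"
    return "very high"

# Brown fiber transcript levels quoted in the text
brown_fiber_tpm = {"DFR": 55.26, "ANS": 7.34, "F3'5'H": 940.2, "LAR": 616.43}
categories = {gene: expression_category(v) for gene, v in brown_fiber_tpm.items()}
print(categories)
```

Applied to the quoted values, this reproduces the labels given in the text: moderate for DFR, low for ANS, and high for both F3′5′H and LAR.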

Monomeric Composition of PAs in Brown Cotton Fiber
Transcriptional analysis revealed a PA biosynthesis pathway wholly activated in brown cotton fiber. To further clarify the details of this pathway, we employed LC-MS to determine the monomeric composition of PAs in brown fiber. Four flavan-3-ols (gallocatechin, epigallocatechin, catechin and epicatechin) were identified in the PAs from brown fiber by LC-MS analysis, with mol percentages of 85.4 ± 1.4%, 3.0 ± 0.2%, 10.8 ± 1.2% and 0.8 ± 0.1%, respectively (Figure 5A-C). Among these monomers, the 2,3-trans-flavan-3-ols (gallocatechin and catechin) account for 96.2%, suggesting that most free flavan-3-ols, and hence the initiating units of polymeric PAs, in brown cotton fiber are synthesized via the LAR branch rather than the ANS/ANR branch (Figure 1).
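The 96.2% figure follows directly from the mean mol percentages of the four monomers. The short calculation below reproduces it and also sums the monomers that are trihydroxylated on the B ring (gallocatechin and epigallocatechin). Only the reported mean values are used; the variable names are illustrative.

```python
# Mean mol percentages of free flavan-3-ols in brown fiber (from the text)
free_flavan_3_ols = {
    "gallocatechin": 85.4,
    "epigallocatechin": 3.0,
    "catechin": 10.8,
    "epicatechin": 0.8,
}

TRANS_2_3 = {"gallocatechin", "catechin"}               # products of the LAR branch
B_RING_TRIHYDROXY = {"gallocatechin", "epigallocatechin"}

def share(names):
    """Sum the mol percentages of the named monomers, rounded to one decimal."""
    return round(sum(free_flavan_3_ols[n] for n in names), 1)

trans_share = share(TRANS_2_3)
trihydroxy_share = share(B_RING_TRIHYDROXY)
total = share(free_flavan_3_ols)

print(trans_share, trihydroxy_share, total)
```

The two sums show that the 2,3-trans monomers and the B-ring trihydroxylated monomers each dominate the free flavan-3-ol pool, with gallocatechin contributing most of both.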
The anthocyanidin composition of the acid hydrolysate reflects the extension unit composition of the corresponding PAs [26]. To determine the composition of PA extension units in brown fiber, we detected the anthocyanidins released by acid-butanol reactions. As shown in Figure 5D-G, three kinds of anthocyanidins (pelargonidin, cyanidin and delphinidin) were detected, with mol percentages of 9.4 ± 0.7%, 12.8 ± 1.4% and 77.8 ± 1.8%, respectively. The high percentage of delphinidin in the PA acid hydrolysate indicated that the main PA extension units in brown cotton fiber were flavan-3-ols trihydroxylated on the B ring. Therefore, both the initiating and extension units of PAs in brown cotton fiber consist mainly of flavan-3-ols trihydroxylated on the B ring, indicating that F3′5′H plays a primary role in PA biosynthesis in brown cotton fiber.
[Figure 4 caption fragment: "... Table 1." doi:10.1371/journal.pone.0086344.g004]
Taken together, the transcriptome and biochemical analyses demonstrated a detailed PA biosynthesis pathway wholly up-regulated in brown cotton fiber, in which F3′5′H and LAR represented the primary flow of PA biosynthesis (Figure 1).

Discussion
The exact chemical nature of the pigments is an important clue for the exploitation of naturally colored cottons. Early extraction experiments suggested that the pigments in naturally colored cotton were flavonoids [1]. Hua et al. revealed much higher PAL activity in brown cotton fiber than in white fiber [9]. Expression analyses showed that several flavonoid synthase genes, such as CHI, F3H, DFR, ANS, ANR, C4H, CHS, F3′H and F3′5′H, were significantly up-regulated in brown fiber [6,8].
Recently, Li and coworkers identified 15 flavonoid-related proteins (including PAL, CHS, F3H, DFR and ANR) with high abundance in brown cotton fiber via comparative proteomic analysis of BCF and WCF near-isogenic lines [5]. In addition, the concentrations of PAs and PA precursors in brown fiber were much higher than in white fiber [5,6,8,27]. These studies consistently indicated that the pigments in brown fiber are PAs. In the present study, we aimed to dissect the details of the PA biosynthesis pathway in brown cotton fiber. By DGE and qRT-PCR analyses, we found that all the investigated PA synthases (including PAL, C4H, 4CL, CHS, CHI, F3H, F3′H, F3′5′H, DFR, ANS, ANR and LAR) were significantly up-regulated in brown fiber, suggesting that the brown fiber gene (Lc1) activated the whole PA biosynthesis pathway in cotton fiber. Furthermore, biochemical analyses demonstrated that the main PA units were trihydroxylated on the B ring and that most free flavan-3-ols were in the 2,3-trans form. These results demonstrate that F3′5′H and LAR represent the major flow of PA biosynthesis in brown cotton fiber. By dissecting the details of the PA biosynthesis pathway in brown cotton fiber, our results pave the way for manipulating the biosynthesis of pigments and other flavonoids in cotton fiber through biotechnology.
PAs are the predominant coloring compounds in seed coats and may function as a barrier to fungal infection of embryos [15]. It has long been known that the PAs in cotton seed coats and fuzz consist mainly of catechin and catechin-derived polymers [31]. Since the majority of PAs in brown cotton fiber are gallocatechin and its polymers, brown cotton fiber may have a PA biosynthesis pathway that is distinct from and independent of that in seed coat and fuzz. Additionally, with flavan-3-ols trihydroxylated on the B ring as the main units, brown cotton fiber may represent a novel PA resource compared to Arabidopsis, grapevine and Medicago truncatula, whose PAs consist mainly of epicatechin and/or catechin [15]. Given the simplicity of cotton production, PAs from brown cotton fiber also have potential applications in the food and medicine industries [12,13].
In higher plants, PAs include a large number of oligomers or polymers of flavan-3-ols. In addition to the varying degrees of polymerization, differences in monomeric composition are a key factor contributing to the complexity of PA components. There are two major branching points in the PA pathway that lead to different PA monomers (Figure 1). Firstly, DFR converts dihydrokaempferol to leucopelargonidin, finally leading to PA monomers with a single hydroxyl on the B ring, while F3′5′H catalyzes hydroxylation at C-3′ and/or C-5′ of the B ring, which results in PA monomers di- or trihydroxylated on the B ring [32]. Secondly, LAR directly converts leucoanthocyanidins to 2,3-trans-flavan-3-ols (catechin and gallocatechin), while ANS converts leucoanthocyanidins to anthocyanidins, which are then converted to 2,3-cis-flavan-3-ols (epicatechin and epigallocatechin) by ANR (Figure 1). The high percentage of flavan-3-ols trihydroxylated on the B ring implies that dihydrokaempferol in brown cotton fiber is primarily converted to dihydromyricetin instead of leucopelargonidin, and therefore that F3′5′H activity should be much higher than DFR activity. Likewise, LAR activity might be much higher than ANS activity in brown cotton fiber, because the free flavan-3-ols were mainly in the 2,3-trans form. Consistently, DGE analysis revealed that the expression levels of F3′5′H and LAR were dramatically higher than those of DFR and ANS, respectively. These results implied that the flavonoid profiles in cotton fiber were mainly
Flavonoids may play roles in many aspects of plant growth and development [33,34]. Several studies have suggested that pigment accumulation in cotton fiber may affect fiber quality and yield [9,27,28]. Biochemical analyses indicated that the contents of PAs and flavonoid precursors in brown cotton fiber were much higher than in white fibers [27]. Tan and coworkers showed that down-regulation of F3H and accumulation of the flavonoid naringenin retarded fiber development and reduced the final fiber quality and yield [27]. Our DGE analysis also showed that PA accumulation in developing cotton fiber might significantly affect several cellular components, KEGG pathways and other fiber-related physiological processes. However, the molecular basis of the negative influence of pigment and flavonoid accumulation on fiber quality and yield remains largely unclear. Documenting the expression profiles of flavonoid synthases in white and brown fibers may facilitate the design of transgenic strategies to engineer the flavonoid pathway and to dissect the relationship between flavonoid accumulation and fiber development.

Conclusion
Transcriptome analysis revealed that the whole PA pathway from phenylalanine to flavan-3-ols was activated in cotton fiber by the brown fiber gene. LC-MS analyses demonstrated that most free flavan-3-ols in brown cotton fiber were in the 2,3-trans form, and that the main PA units were flavan-3-ols trihydroxylated on the B ring. The PA monomeric composition was consistent with the expression profiles of PA synthase genes, and suggested that F3′5′H and LAR represent the major flow of the PA biosynthesis pathway in brown cotton fiber.