Genome-Wide Analysis of Differentially Expressed Genes Relevant to Rhizome Formation in Lotus Root (Nelumbo nucifera Gaertn)

Lotus root is a popular wetland vegetable which produces edible rhizome. At the molecular level, the regulation of rhizome formation is very complex, which has not been sufficiently addressed in research. In this study, to identify differentially expressed genes (DEGs) in lotus root, four libraries (L1 library: stolon stage, L2 library: initial swelling stage, L3 library: middle swelling stage, L4: later swelling stage) were constructed from the rhizome development stages. High-throughput tag-sequencing technique was used which is based on Solexa Genome Analyzer Platform. Approximately 5.0 million tags were sequenced, and 4542104, 4474755, 4777919, and 4750348 clean tags including 151282, 137476, 215872, and 166005 distinct tags were obtained after removal of low quality tags from each library respectively. More than 43% distinct tags were unambiguous tags mapping to the reference genes, and 40% were unambiguous tag-mapped genes. From L1, L2, L3, and L4, total 20471, 18785, 23448, and 21778 genes were annotated, after mapping their functions in existing databases. Profiling of gene expression in L1/L2, L2/L3, and L3/L4 libraries were different among most of the selected 20 DEGs. Most of the DEGs in L1/L2 libraries were relevant to fiber development and stress response, while in L2/L3 and L3/L4 libraries, major of the DEGs were involved in metabolism of energy and storage. All up-regulated transcriptional factors in four libraries and 14 important rhizome formation-related genes in four libraries were also identified. In addition, the expression of 9 genes from identified DEGs was performed by qRT-PCR method. In a summary, this study provides a comprehensive understanding of gene expression during the rhizome formation in lotus root.


Introduction
Lotus root (Nelumbo nucifera Gaertn), which originated from India and China, is an aquatic herb vegetable and a member of the family Nelumbonaceae [1]. It is one of the oldest dicot plants in the world with many features of monocot plants, and has been widely cultivated in China, Japan, and other Southeast Asian counties for multiple purposes [2]. The products of lotus root such as fresh, salted and boiled rhizomes, lotus root starch, drinks, teas, and lotus seeds are very popular in the daily diet because of its richness in nutrients including starch, proteins, vitamins, and mineral substances [3,4]. China is already exporting the processed products of lotus root to Japan, Korea, Europe, and the United States as a kind of off-season vegetable. In addition, Nodus nelumbinis rhizomatis, germ, stamens, and lotus root stems are also used as important ingredients in the traditional medicine [5,6,7].
With the unique characteristics, the rhizome of lotus root is formed underground, and it grows in size after sprouting in the shallow water, such as the pools, water gardens, tanks or tubs of the greenhouse, which indicates that the plant has developed mechanisms of surviving in the submerged environment. Then some floating leaves emerge from the nodes of enlarged and elongated rhizomes. Lotus root produces several rhizomes in a single growing season with average length of 10-20 cm each. Actually, similar with other wetland vegetable (corm, tuber, and bulbs), rhizome is a kind of underground stems, and work as storage organs. These are storage units for food that provide the plants with the energy for growth, blooming, and completing their lifecycle.
Development of storage organ (rhizome) can be classified into four stages: induction (stolon stage), initial swelling, middle swelling, and later swelling stage [8]. Stolon tips grow radically in the induction stage. In the second stages, longitudinal growth of stolon stops and its tip swells [9,10]. At the middle swelling and later swelling stages, some important carbohydrates are synthesized in the storage organ. For example, accumulation of starch greatly increases in these stages. Environmental factors affect above four stages through triggering signal molecules or gene regulation.
Development of storage organs have been extensively studied, especially in tuber and corm, and great changes have been found in genetic and morphometric processes [12,13]. Short days (SDP) condition promotes the formation of storage organ, while, long days (LDP) prolongs this process. For example, Masuda et al. (2007) have found that rhizome enlargement was brought about under 8,10 and 12 h photoperiods [14]. The leaf receives photoperiodic signal, and then transports it to the underground stolon tips via the phloem, which promotes the transition of storage organ. CONSTANS, SFT family protein, GIGANTEA, and cycling dof factor are believed to participate in the signal transduction of photoperiodic control, and expression of these genes affects the formation of storage organ [15,16,17,18]. In addition, under SDP condition, StBEL5 represses the gibberellin StGA20 ox1 biosynthesis, which promotes formation of the storage organ [19]. At the same time, the expression of StBEL5 is enhanced by miR172, suggesting that long distance transport of RNA signal also participates in the formation of underground storage organ [20]. In addition, phytochrome B is involved in the response of plants to photoperiodic control, and the formation of storage organ is affected by PHYB in SD. Decreasing the levels of PHYB in transgenic plants promotes the formation of storage organ both in SD and LD, while non-transgenic plants form storage organ only in SD [21]. These results suggest that plants lose the inhibitory effect on formation of storage organ caused by LD when expression of PHYB is down-regulated [15]. It is believed that high content of sucrose is required as a necessary condition during the formation of storage organs [15], and sucrose transporters can trigger the formation of storage organ at the early stages [22]. Therefore, sucrose plays an important role at the initial swelling stages of the formation of storage organ [23].
Hormones including: cytokinin, jasmonic acid (JA), gibberellic acid (GA), abscisic acid (ABA), and ethylene are also involved in the initiation and regulation of growth in these storage organs [24,25,26]. It has been reported that exogenous application of GA can act as an inhibitor from stolon to induction stage. Transgenic potato plants with GA oxidase gene postpone the storage organ formation. Whereas, decreased expression of this gene results in an earlier formation than non-transgenic plants [27]. Cytokinin and jasmonic acid promote the induction and elongation of storage organ [28]. Bhat et al 29. found that exogenous cytokinin is necessary to induce formation of storage organ in ginger due to improvement in photosynthesis. ABA shows high correlation to tuber formation because ABA-deficient potato plants show retarded tuberization [30]. Exogenous application of auxin on the decapitated peas and potatoes inhibits the formation of axillary buds [31]. Ethylene, produced by almost all plants mediates a variety of developmental processes in plants, such as seed germination, lateral bud stimulation, adventitious rooting, overcoming dormancy, and organ senescence and abscission [26,32]. Exogenous ethylene is believed as an inducer for the storage organ and root bulking in carrots [33].
Dynamics of rhizome enlargement is relatively poorly known in the lotus root, because detailed investigation of underground rhizome growth is time and labour consuming. Therefore, lotus can be suitable as a model plant to study rhizome growth [14]. Just like the other storage organs, rhizome is also an important edible product, and the developmental processes of this kind of storage organ are regulated by genes [34]. Expressions of these genes affect rhizome formation. Although much work in other species has partially described the above processes, and many storage organ-related genes have been documented [35], but the expression of genes which affect rhizome formation in lotus root have not been studied in detail. Tag sequencing technique has been established as an efficient approach to study gene expression in different environmental conditions [36,37]. In some vegetable plants, a lot of important genes involved in plant critical metabolisms have been successfully identified by this technique [38,39,40,41]. In this study, DEGs from four developmental stages of lotus root rhizome were sequenced and analyzed with aim to comprehensively understand the processes of rhizome formation and development at molecular level.

Plant Materials
'MeiRen Hong', a wildly cultivated species in China, was planted in the field at the water depth of 20-25 cm in spring with average temperature 30uC/day and 20uC/night during the whole growth season. Three or more stolons were developed and elongated in proper order in each plant. When plants grew up to 4-5 leaves stage (about 90-100 days after plantation), formation of rhizome started at stolon tips. For the analysis of tag-sequencing and gene expression, rhizomes of four developmental stages (stolon, initial swelling, middle swelling and later swelling stage) (Fig. S1) from the plants (three tips from different plants were combined for each stage) were used. To get the materials of different developmental stages, lotus was (rhizome lotus, flower lotus and seed lotus) were cultivated in a field (non-private), located in the South-Eastern China. The permission for sample collection was taken from the Department of Horticulture of YangZhou University, China. No specific permissions were required for the location and the field studies, because the experiments did not involve any endangered or protected species.

Screening DEGs
Rhizome transcriptome from the above four development stages was analyzed. Stolon tips, rhizomes in the initial swelling, middle swelling and later swelling stages were collected and ground, and the RNA was isolated from the ground samples using RNA extraction mini kit (QIAGEN, Germany). DNaseI was added to eliminate DNA contamination. Sequencing of transcripts in the form of special constructs was completed by Beijing Institute of Genomics (BIG).
To screen the DEGs, transcriptome from these four stages was analyzed with the aspirations to track the major changes in metabolism. RNA was isolated from the materials of these four stages. The DEG libraries of four samples were determined in parallel using Illumina gene expression sample preparation kits. Briefly, the total RNA from four stages was used for mRNA capture with magnetic oligo (dT) beads. The first and second strand cDNA were synthesized, and bead-bound cDNA was subsequently digested with NlaIII.
The 39-cDNA fragments attached to the oligo (dT) beads were ligated to the Illumina GEX NlaIII adapter 1, which contained a recognition site for the endonuclease MmeI for cutting 17 bp downstream of the recognition site (CATG) to produce tags with adapter 1. After removing 39 fragment via magnetic beads precipitation, an Illumina GEX adapter 2 was introduced at the site of MmeI cleavage. The resulting adapter-ligated cDNA tags were amplified using PCR-primers that were annealed to the adaptor ends for 15 cycles.
The 85 base fragments were purified and recovered by 6% polyacrylamide Trisborate-EDTA gel. The final quality of the tagged sequences was checked by an Agilent 2100 Bioanalyzer. The four tag libraries constructed underwent Illumina proprietary sequencing chip for cluster generation through in situ amplification and were deep-sequenced using Illumina Genome Analyzer. For the raw data, we filtered adaptor sequences, low quality tags (tags with unknown nucleotides N), empty reads and tags that were too short or too long, and tags with only one copy to get clean tags.
The types of clean tags were represented as the distinct clean tags. Subsequently, we classified the clean tags and distinct clean tags according to their copy number in the library, and showed their percentage in the total clean and distinct tags, and analyzed saturation of the four libraries.
For annotation, all tags were mapped to the reference sequence of NCBI database (http://www.ncbi.nlm.nih.gov/), and no more than 1-bp nucleotide mismatch was allowed. The alignment procedures were conducted essentially by following the protocols described in the online documentation (http://maq. sourceforge.net) and adopting the default parameter values. To monitor mapping events on both strands, both sense and complementary antisense sequences were included in the mapping process. The tags mapped to reference sequences from multiple genes were filtered [38].

Identification of DEGs
The transcriptome of the lotus root from the above four stages was used as reference for the screening and analysis of the DEGs due to unavailability of the existing data. All expressed genes were monitored, and the gene functions were explored by using database annotations like nr, Swiss-Prot, KEGG, and COG with following criteria: for the gene annotations, blastx alignment (evalue ,0.00001) between unigenes and protein databases, such as Swiss-Prot, KEGG, and COG were performed. Best aligning results were used to decide sequence direction and functions of the unigenes.
In case of any conflict in the results from different databases, a priority order of nr, Swiss-Prot, KEGG, and COG was followed when deciding sequence direction of unigenes. When a unigene happened to be unaligned with none of the above databases, ESTscan was used to predict its coding regions as well as to decide its sequence direction. All of the expressed unigenes were classified according to their functions in metabolism processes. For screening the differentially expressed genes, ''FDR#0.001'' and the absolute value of ''log 2 Ratio $1'' were used as a threshold to judge the significance of difference in expression of unigenes. Important genes related to rhizome formation were reposited in NCBI database (TSA: BioProject ID is PRJNA196449; Accession number is GAHV01000000; BioSample is SAMN02028153; Sequence read archive is SRR82669).

Gene Expression Analysis by qRT-PCR
To carry out the study of gene expression in different species, the cultivation conditions of lotus (rhizome lotus, flower lotus and seed lotus) were kept same as described above. Quantitative RT-PCR analysis was performed to quantify the transcriptional level of nine novel genes with rhizome lotus at stolon stage, initial stage, middle swelling stage and later swelling stage to evaluate the results of tag-sequencing. In addition, expression of 18 important genes at stolon stage, initial stage and middle swelling stage (no obvious later swelling stage in flower and seed lotus) relevant to rhizome formation were also studied in rhizome lotus, flower lotus and seed lotus. Total RNA was extracted from stolon tips, rhizomes of initial swelling, middle swelling and later swelling stage respectively, using RNA extraction mini kit (QIAGEN, Germany). DNaseI was used to digest DNA during the RNA extraction process to eliminate DNA contamination. A total of 1-2 mg of RNA was used in cDNA synthesis according to the manufacturer's instructions (Promega, USA). The quantitative RT-PCR reaction was performed with the Mx 3000P machine (STRATAGENE, http://www.stratagene.com). The SYBR Green Master Mix was used to identify mRNA level according to the manufacturer's instructions (Tiangen, China). According to the sequencing results, the primers were designed for the genes that enhanced transcriptional level during rhizome formation which were listed in Table 1. b-Actin was used as internal standard and amplified with the primers, forward: 59-AACCTCCTCCT-CATCGTACT-39, and reverse: 59-GACAGCATCAG CCATGTTCA-39. Amplification was performed in a 20 ml reaction mixture, containing 0.16 mM dNTPs, 0.1 mM forward and reversed primers respectively, 1 mM MgCl 2 , 0.4 U Taq polymerase (Tiangen, China), and 1 ml cDNA. The PCR program consisted of 30 cycles: 94uC for 10 min; 94uC for 1 s; 56-60uC for 30 s; 72uC for 60 s, and the final extension at 72uC for 10 min. Triplicate samples were used for quantitative RT-PCR.

Transcriptome of Rhizome
Four stages of rhizome development: stolon stage, initial swelling, middle swelling and later swelling stages were selected to construct libraries using the Illumina sequencing platform to investigate the genes involved in rhizome formation. A total of 49053 genes which included 39497 (accounted for 80.52%) CATG site genes used as reference genes were obtained after the transcriptome of above four stages were preprocessed.
About 5 million raw tags in each library were obtained with 327019, 312178, 363183, and 385558 raw distinct tags respectively. To get clean tags, all the raw tags were filtered with reference sequences, and 4542104, 4474755, 4777919, and 4750348 clean tags including 151282, 137476, 215872, and 166005 distinct tags in L1, L2, L3, and L4 libraries were obtained respectively. All data of tags for each library is given in Table 2. To testify whether or not the sequenced tags were sufficient to cover the whole transcriptome, the analysis of sequencing saturation was also performed in the four libraries. The number of detected genes increased until the sequencing tags reached 3 million or more, which indicated that the identified expressed tags were enough to reflect the whole transcriptional information in genome of lotus root (Fig. S2).
Functions of gene were annotated by comparison against existing NCBI database.
Among which, more than 41% of all distinct sequences in four libraries showed an above cut-off BLAST result, and about 59% could not match with the known genes. These identified genes were classified into 26 catalogues according to their functions. One catalogue containing the largest number of genes predicted functions, and the smallest one was relevant to nuclear structure (Fig.S3). The copy distribution of total and distinct clean tags in four libraries showed the same tendency with about 4% of distinct clean tags higher than 100 copies and more than 11% tags being in 20-50 counts. The number of distinct clean tags between 2-5 copies (approximately 40%) was high as compared with that of others ( Fig.1).

DEGs in Four Libraries
Gene expression profiles during rhizome formation. Gene expression profiles during rhizome development in four libraries were examined. Total 16111, 14602, 19281, and 17557 transcripts were identified from L1, L2, L3, and L4 libraries respectively. We found that 10544 genes were expressed in all four libraries, and 1051, 541, 2997, and 1125 were especially expressed in L1, L2, L3, and L4 libraries respectively (Fig. 2). Differentially expressed genes were identified from these four different developmental stages during rhizome formation to uncover the changes in metabolism at molecular level. The abundance of transcripts in four libraries was counted by the Table 1. Detailed information about primers used for qRT-PCR variation.

Gene
Forward primer  Reverse primer  T m ( 6C)   Most DEGs in each library. The variation in expression was observed in 1714, 4013, and 2344 genes in L1/L2, L2/L3, and L3/L4 libraries, respectively. Therefore, 20 DEGs with higher levels of changes were selected respectively, to monitor changes in metabolism during rhizome formation. We found that the most DEGs in L1/L2 libraries were Pectinesterase, a ubiquitous cellwall-associated enzyme that facilitates plant cell wall modification and subsequent breakdown. DEGs in L1/L2 libraries could be mainly classified into 3 catalogues which belonged to hormone response (SAUR family protein and Ethylene responsive transcription factor), fiber synthesis (Glycerophosphoryl diester phosphodiesterase, Receptor protein kinase-like protein, G-type lectin S-receptor-like serine, Fiber expressed protein, Calcium-dependent protein kinase, Sucrose synthase) and stress response proteins (Ethylene responsive transcription factor, Calcium-dependent protein kinase, Peroxidase, SAUR family protein). In addition, an extensin and vacuolar-processing enzyme was also found to be involved in the process from stolon stage to initial swelling stage (Table 4). DEGs in L2/L3 libraries could be mainly classified into four catalogues. One catalogue was associated with energy metabolism (NADH dehydrogenase, Malate dehydrogenase [NADP], ATP synthase subunit beta). The second was relevant to synthesis and transport of matter (Starch-branching enzyme I, Sugar transporter, 3-ketoacyl-CoA synthase 4, 2-methyl-6-phytylbenzoquinone methyltranferase, Vacuolar H+-pyrophosphatase), DEGs in third catalogue were involved in hormone metabolism and response (Auxin response factor 2, Ethylene-responsive transcription factor ERF114, ACC oxidase, Xanthine dehydrogenase). The fourth catalogue of DEGs covered was stress response genes (Transcription factor bHLH80, Vacuolar H+-pyrophosphatase). In L3/L4 libraries, DEGs were classified into three major catalogues, such as synthesis of matter (Basic 7S globulin, UDPglycosyltransferase 89B2, Sucrose synthase, UDP-glycosyltransferase, Glutathione S-transferase, Fiber expressed protein, Patatin group), hormone synthesis and response (Gibberellin-regulated protein 6, Gibberellin 20 oxidase, Ethylene-responsive transcription factor ERF025, ERF3 transcription factor) and swelling related proteins (Alpha-expansin 3, Extensin). In addition, we also found ATPase subunit enhanced transcriptional level in later swelling stage. A lot of genes were down regulated in L1/L2, L2/ L3, and L3/L4 libraries, most of which in L1/L2 libraries were involved into synthesis of matter, and others in L2/L3 and L3/L4 libraries were relevant to cell growth and differentiation, translation, RNA process, and modification.
In L2/L3 libraries, expression levels of many important regulators (e.g. transcription factor bHLH80, homeobox protein BEL1 homolog, ethylene-responsive transcription factor ERF114, trihelix transcription factor GT-1, NAC domain protein, transcription factor ICE1, transcription factor MYB59-like, transcription factor SPATULA, NF-X1-type zinc finger protein NFXL1-like, KA-NADI like transcription factor, WRKY transcription factor 44, scarecrow-like transcription factor PAT1-like, BZIP domain class transcription factor, calmodulin-binding transcription factor SR3L, GATA domain class transcription factor, HSF domain class transcription factor, HD domain class transcription factor, transcription factor KAN2, and zinc finger protein CONSTANS-LIKE) were enhanced. In L3/L4, most of the genes were found to be involved in matter metabolism and stress response.
Genes related to rhizome formation. To test whether the transcription patterns in this study had coverage of the welldefined genes, the data sets of this experiment were compared to previous reports. We found 14 identified genes relevant to rhizome formation. The expression of these genes and their biological functions are listed in Table 6. Among which, 8 genes including zinc finger CONSTANS-like protein, GIGANTEA (clock-regulated protein), BEL1-like HD transcription factor, sucrose synthase, ca 2+ /Calmodulin-like protein, lipoxygenase, glucose pyrophosphorylase, granule-bound starch synthase, phytochrome B enhanced their transcriptional levels, and 3 genes (SFT family protein, cycling dof Factor) did not show any significant change in its expression in all the four libraries. Three genes [MADS-box transcription factor, GA 20-oxidase (aside from L3/L4 stages), APETALA1)] directly decreased their expression during the rhizome development.

Expression Analysis of Nine Genes through qRT-PCR
Nine genes including CALM, SUS, GA20ox, ERF, bZIP, WRKY, SAUR, CBF, and GBSS involved in rhizome formation were studied by quantitative RT-PCR method. From the data of results, expression profiling of seven genes in four developmental stages i.e. stolon, initial swelling, middle swelling, and later swelling stage by qRT-PCR showed similar tendency as found in Tagsequencing, which indicates a correspondence of the results from qRT-PCR analysis with the Tag sequencing analysis. Only two genes (CBF and GBSS) were observed with some difference in L2/ L3 libraries in transcriptional level between Tag-sequencing analysis and qRT-PCR data (Fig. 3). In addition, detailed study for the expression of 18 genes in rhizome lotus, flower lotus and seed lotus was also carried out with qRT-PCR method. The results showed that many genes including bZIPs, WRKY, GBSS, CON-STANS, GIGANTEA, Dof, HD, PHYB, Lipoxygenase and GP showed different expression profiles in three phenotypes of lotus. Among which, expression of CONSTANS, GIGANTEA, Dof, HD, PHYB, which involved in photoperiodic signals was higher in flower lotus and seed lotus than that of rhizome. Whereas, GBSS and Lipoxygenase bZIPs, WRKY and GP showed higher transcriptional level in rhizome as compared with that of flower lotus and seed lotus (Fig. S4).

Tags Identified in Four Stages of Lotus Rhizome
Recently, a lot of regulatory mechanisms have been identified by high throughout tag-sequencing technique [38,39,41,42]. Although a large number of genes have not been annotated, tag-mapped genes have been believed to fully cover the whole plant genome [39]. It is an evidence that enlargement of lotus rhizome (Nelumbo nucifera) is strictly regulated by environmental factors, especially photoperiodic response can affect rhizome morphogenesis [14]. Therefore, in this experiment, lotus root was cultivated in an open field with normal environmental factors. In addition, rhizomes at four obvious differential developmental stages including stolon stage, initial stage, middle swelling stage, and later swelling stage were used due to few reports on the rhizome transition. Genes with differential expression during rhizome formation were identified using this technique. Approximately 5.0 million tags were identified per library, 4542104, 4474755, 4777919, and 4750348 clean tags were obtained. Only 20471, 18785, 23448, and 21778 genes from L1, L2, L3, and L4 libraries respectively, could be annotated, due to unavailability of complete genome of lotus root ( Table 2). We found that many genes significantly changed their expression during rhizome development. The higher number of DEGs in L2/L3 libraries (Table 3) suggests that regulation from the initial swelling stage to middle swelling stage is more complex as compared with that of other developmental stages.

Identification of DEGs in Four Libraries
DEGs in four libraries were involved in the transport signal transduction, small molecular biosynthesis, transcription, cell cycle, response to the stimuli, organelle organization, anatomical structure development, cell differentiation, translation, organ development, cellular macromolecular metabolic processes, energy, and cellular component organization (Fig. S3). In L1/L2 libraries, 20 most DEGs are involved in hormone response, fiber synthesis, and response to stresses. In L2/L3 libraries, however, the expression profiles of major DEGs were relevant to the energy metabolism, synthesis and transport of storage, hormone metabolism and response, stress response and energy metabolism. In L3/ L4 libraries, most DEGs were involved in storage synthesis, hormone synthesis and response. All genes which showed enhanced expression in these four libraries were listed in Table  S1 (Table S1).

Adaptation to Submergence Stress
Adaptation to submergence is very important for the survival of lotus root. In this study, we found that several genes which included (ADH (L15773), MYB (L24021), NADH dehydrogenase (L6212), SOD (L3610), Ethylene-responsive transcription factor (L16154), Calcium-dependent protein kinase (L3989), ATPase subunit (L6411), heat shock protein (L4407) related to submergence enhanced their expression during rhizome formation ( Table 1 and Table S1). Some species such as soybean, barley, tomato, and corn are also known to be sensitive to flooding [43,44,45,46], and their productivity is seriously affected due to exposure to anaerobic stress. However, most aquatic plants (including lotus root) exposed to submergence show high resistance [45,47]. According to the results of present study, the regulation of gene expression to submergence adaptation is essential for formation of storage organ, such as ERF, ADH, MYB, SNF like protein gene, SOS2 like protein kinase gene, SNF-related kinase gene, superoxide dismutase gene, G-box binding protein gene, and heat shock protein gene [45]. ADH (alcohol dehydrogenase) plays an important role in low-oxygen tolerance. Expression and activity of ADH are thought to be an indicator of oxygen limitation [48,49,50,51,52,53]. Therefore, ADH enhanced their expression during rhizome development which is essential for lotus root to adapt to submergence condition.
Ethylene not only mediates a range of different biotic and abiotic stress responses 54. but is also involved in storage organ development and submergence adaptation [55]. It is considered to be a critical hormone in ethylene-mediated growth promotion which also helps to confront anoxia [56]. The signal transduction to oxygen shortage during stem development is dependent on ethylene, and elongation of stem in response to ethylene is a widely spread adaptation to anaerobic stress [56]. Ethylene responsive element binding factor (ERF) belongs to AP2 which regulate plant metabolism process. Many ERF genes were found to be responsive to abiotic stresses such as cold, wounding or drought stress, and also induced by some kinds of hormones such as ethylene, JA, and SA [57]. In four developmental stages of lotus root, we also observed enhanced mRNA level of several ERFs during development of storage organ, which indicates that regulations of ethylene also affects the growth and development of lotus root. In Arabidopsis, ERFs have been studied and their functions were also identified in recent years [58]. Three ethylene response-factor (ERF) genes were identified to be located on an sus1 locus, which was considered as a submergence tolerance QTL in rice [59,60,61]. In this study, we found increase in the transcriptional level of many members of ERF family during growth of lotus root, suggesting that ERFs also aid to lotus root survival in anaerobic stress.
It was reported that energy regulation was a measure to alleviate the hurt acquired because of submergence, and at the same time, many critical processes for storage organ formation also need energy consumption [62,63,64,65]. ATP is normally generated from glycolysis in plant cells to sustain plant growth. For generation of ATP, ATP/ADP translocator (L23655) is thought to be a gated pore through which ADP and ATP are exchanged between the mitochondrial matrix and the cytoplasm. ATP/ADP translocator is an inner membrane protein which is observed to enhance the number and morphology of storage organ in transgenic plants. For lotus root, an ATP/ADP translocator was observed to improve the transcriptional level during rhizome development, suggesting that enhanced ATP level is helpful for survival and plant growth [66]. Evidence shows that delivery of NADPH/NADH reduced ferredoxin, and ATP in the specific cellular compartments occurs for energy-consuming reactions under stress condition. Generally, NADPH/NADH, and reduced ferredoxin are not directly transported across bio-membranes, therefore, indirect transport of reducing equivalents is achieved by malate/oxaloacetate shuttles, involving malate dehydrogenase (MDH) and NADP-MDH (Isoenzyme of MDH) for the interconversion of oxaloacetic acid and malate [67,68]. MDH is one of the more active enzymes in peroxisomes, mitochondria, chloroplast, glyoxysomes, and cytoplasm. The activities of the enzymes of malate and NADP-MDH valves are changed when plants are subjected to conditions such as high light, high CO2, nutrition or stress. A lot of evidences show that NADP-MDH is involved in response to environmental factors. When plants are subjected to stress conditions, which require changed activities of the enzymes of malate valves, changed expression levels of MDH isoforms can be observed [69,70,71,72]. The expression of ATP/ADP translocator (L23655), MDH (L8940) and NADP-MDH (L26121) was enhanced in L1/L2, L2/L3, and L3/L4 libraries. Therefore, these genes might be an adaptive response for lotus root in submergence conditions during storage development.

Carbohydrate/storage Metabolism
Starch is considered as an important component in the storage organ (rhizome, corm, tuber, and bulb). Simultaneous swelling of storage organ and accumulation of starch have already been testified [73], suggesting that synthesis of starch shows high coordination with formation of storage organ [74]. In our study, a gene encoding granule-bound starch synthase was found to enhance mRNA level during the rhizome development (Table 4). Granule-bound starch synthase is believed to have a higher amounts of starch synthesized in plant storage organs [75]. It is similar to the report of Hanashiro et al. (2008) [76]. in this study, GBSS enhanced the transcriptional level in L1/L2 and L2/L3 libraries, which indicates that the expression profile of GBSS has high correlation with the development of rhizome.
Patatin, which is identical to glycoprotein, is usually believed as a major storage protein in rhizome, tuber, corm, and bulb. We  found that expression of patatin (L10302) decreased in L1/L2 libraries and L2/L3 libraries, but enhanced in L3/L4 libraries, suggesting that patatin accumulation increases in the later swelling stage. Compared with other storage proteins, patatin is more stable because no degradation products are detected during tuber development. Evidence shows that patatin might be involved in metabolism because acyl hydrolase has been found to be encoded by this gene [77,78]. Another study shows that patatin is involved in pollen development [79,80], which is further testified by Vancanneyt et al. 81. Further study shows that patatin is associated with early events of formation of underground storage organ according to its expression profile [82,83]. In stolon tips of non-induced plants, only a small amount of patatin is found, which rapidly accumulates during storage organ formation [84]. Accumulation of patatin has high correlation with the swelling of storage organ because it is observed to be synthesized only in stolon and storage organ and induced by sucrose [85,86,87]. For lotus root, patatin enhances the expression only in L3/L4 libraries, which obviously showed that patatin was accumulated in later swelling stages of rhizome. High expression pattern might have some relationship with rhizome development, although the data of present result did not show any correlation between patatin and formation of storage organ. Sugar can provide enough materials and energy for plant to complete some important activities during its life cycle [88]. The processes of sugar synthesis, transport, consumption, and storage have been widely studied in the past decades [89]. Evidence shows that soluble carbohydrates, most notably sucrose are considered to be the strong inducers of underground storage organ formation, because increasing concentration of sucrose in medium during cultivation leads to higher numbers of tubers [90]. Sucrose plays its role as an inducing signal molecule, and enhancing the level of sucrose in stolons results in an increased number of initiated storage organs [91]. A gene (L2349) encoding sucrose improved its transcriptional level in L1/L2 libraries ( Table 4), suggesting that accumulation of sucrose is helpful for rhizome formation of lotus root. Further evidence also shows that SNF1 kinase (the sucrose non-fermenting-1) is involved in sugar-signaling pathways to regulate metabolism of carbohydrate or other storage proteins [92]. mRNA level of SNF1of lotus root was enhanced in L1/L2 (log 2 ratio (L2/L1: 1.2) and L2/L3 (log 2 ratio (L2/L1: 6.39) and L3/L4 (log 2 Ratio (L2/L1: 2.08) libraries. From the characteristics of gene expression, SNF1 (L10886) undoubtedly promoted the formation and development of rhizome (Table S1).

Up-regulation of Transcription Factors during Rhizome Formation
Transcription factors regulate gene expression during cell metabolism processes. Transcriptional levels of 14, 20, and 20 regulators were enhanced in L1/L2, L2/L3, and L3/C4 libraries, respectively (Table 5). For these transcription factors, we found that CaM-binding protein (L1140) and AP2 domain class transcription factors (L1140) have been identified to perform critical roles in the formation of underground storage organ. Ca 2+ has been testified to perceive endogenous and exogenous signals as a second messenger before system responses [93]. Calmodulin is a sensor of Ca 2+ , and many cellular processes are modulated after Ca 2+ binds to CaM [94]. Ca 2+ /CaM have already been identified to be involved in the formation of storage organs [95]. Further study shows that transgenic potato plants with constitutive expression of a CaM gene (PCM1) produces more elongated tuber [96]. CaM-binding proteins have been found in many plants and their functions have also been considered for the development of storage organs [93,97].
A potato CaM-binding protein (PCBP) is found to play an important role in signal transduction during tuber formation according to characteristics of its expression [98]. We observed that calmodulin-like protein (L16799) and calmodulin binding protein (L22245) have similar expression profiles in L1/L2, L2/L3, and L3/L4 libraries. From stolon to initial swelling stage of rhizome, expression levels of these two genes improved, and decreased from initial swelling stage to middle swelling stage and middle swelling stage to swelling stage, indicating that Ca 2+ /CaM might be involved in processes from stolon stage to initial swelling stage of rhizome.
In this study, the expression level of ethylene responsive factor was found to be enhanced (Table 5). From the characteristics of expression, we could conclude that this gene play an important role in rhizome formation. Ethylene is not only involved in a range of biotic and abiotic stress responses, especially to help wetland plants to adapt to anaerobic condition, but also mediates swelling of underground storage organs [99]. At the same time, AP2 domain class transcription factor, which is unique in plant kingdom also shows multi-functions in metabolism from response to stresses to regulation of plant development [100]. Overexpression of an AP2 domain class transcription factor, confers potato plant more resistance to abiotic stress, whereas down-regulation leads to a series of other effects such as decreased cell size, plant height, hypocotyl elongation and fertility [101]. Several AP2 domain class transcription factors are found to control flowering time [102]. It is reported that a putative AP2 domain class transcription factor (WRI1), regulates the storage metabolism in Arabidopsis seed. Constitutive expression of this gene leads to improvement of the seed oil content and triacylglycerols [103]. Transgenic rice plant with an AP2 domain class transcription factor increases the expression of waxy gene, which directly results in a change of storage content [104]. In addition, AP2 domain class transcription factors have been found to be induced by abscisic acid (ABA) and sugar, suggesting that this gene might be involved in ABA and sugar signal transduction pathway [105]. Therefore, future studies about role of AP2 domain class transcription factor in rhizome formation should be carried out with molecular engineering techniques.

Hormonal Regulation
Swelling of underground storage organ is affected by various environmental and endogenous factors. Short photoperiods, low temperatures, low nitrogen, and hormone favor the formation of storage organ [90]. In literature, many reports describe the importance of gibberellic acid (GA), JA, ethylene for the formation of storage organ. GA content is enhanced in transgenic plants by overexpression of GA oxidase gene, and elevation of GA content leads to transgenic potato plants that require a longer duration of short-day photoperiods to form storage organ. Whereas, inhibition of this enzyme activity results in earlier formation of these organs as compared with that of non transgenic plants [27]. Decrease in GA content in a dwarf mutant of S. tuberosum ssp. Andigena showed lead to formation of storage organ both in LD and SD condition. However, the storage organ is also inhibited in SD when GA biosynthesis is inhibited [9]. These results suggest that GA is probably involved in the photoperiodic induction to regulate the formation of storage organ. In addition, Xu et al. 106. observed that high content of GA promotes stolon elongation and inhibit storage organ formation. In this study, the expression of GA 20oxidase gene (L18298) involved in GA biosynthesis pathway is high in L1 libraries (stolon stage), whereas it decreased in L2 and L3, but increased in L4 stages (data not shown). Enhancing GA level in stolon probably benefited the elongation of stolon and decreased GA level in L2 and L3 might have benefited the rhizome development. As we all known, in later development stages of lotus root, few new rhizomes will be formed again, and this is the reason why we observed increase in the expression of gibberellin 20 oxidase in the later swelling stage.
JA is widely distributed in plants and exerts a critical function in plant development and growth [107]. Furthermore, JA also plays a significant role in stress response [108]. Several studies have been performed to study the effects of JA on the swell of underground storage organ. Endogenous JA is believed to be a strong inducer of storage organ in dicot potato plants and monocot yam plants [109]. Ravnikar et al (1993) reports that exogenous JA plays an important role in the formation of garlic bulbs [110]. Zeal 111. found that application of exogenous JA not only induces the formation of storage organ, but simultaneously enhances the storage organ number. In addition, tuber formation of Pterostylis sanguinea is also promoted by exogenous JA [112]. In the four stages of rhizome development, we found that many JA-induced genes (MYB, transcription factor bHLH, bZIP, and AP2/ERF domain-containing transcription factor, lipoxygenase) enhanced transcriptional level during rhizome development [113,114,115,116,117]. These JA-induced genes, which promote storage organ formation, have already been testified in the past decades. Therefore, JA and JA responsive genes are helpful for rhizome formation. In addition, some storage organ-related genes were also documented in this study (Table 6), and the relationship between gene expression and storage organ formation has been testified in literature [118,119,120,121,122,123]. Overall, expression profiles of these genes show that formation of rhizome is very complex and regulated by multiple genes.

Conclusions
Using high-throughput tag-sequencing (based on Solexa Genome Analyzer Platform), four expression libraries were constructed and analyzed from different developmental stages for rhizome formation. Based on the results of tag-sequencing, after comparison with the existing databases, overall 20471, 18785, 23448, and 21778 tags were annotated from four developmental stages (stolon stage, initial swelling stage, middle swelling stage, and swelling stage) respectively. In addition, several DEGs relevant to rhizome formation were also identified from these four libraries. Expression of nine genes was studied by qRT-PCR to verify the results of tag-sequencing. The qRT-PCR results revealed that gene expression showed high correlation with the tag sequencing analysis. Table S1 Genes were found to enhance their expression in L1/ L2, L2/L3, and L3/L4 libraries. These genes were listed in descending order according to alteration of expression during rhizome formation. (XLS)