Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer

Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukine-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes originated before ginseng originated. In spite of their belonging to a family, the PgNBS genes have functionally dramatically differentiated and been categorized into numerous functional categories. The expressions of the across tissues, different aged roots and the roots of different genotypes. However, they are coordinating in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution and functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng particularly, and a NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species.

Introduction Ginseng, Panax ginseng A.C. Meyer (2n = 4x = 48), is a traditional Chinese medicinal herb, a perennial of the Araliaceae family and has been cultivated in China for over 2,000 years. Studies have shown that ginseng possesses several important biological functions for human health, such as recovery and promotion of vitality, improvement of immune and metabolism systems, regulation of central nervous system, etc. [1,2]. Ginseng native to the Province of Jilin, China, is often known as Jilin ginseng and estimated to produce 85% and 70% of the ginseng of China and the world, respectively. However, it is being subjected to numerous diseases that threaten the continued production of Jilin ginseng. Therefore, it is imperative to improve its disease resistance to continue and secure ginseng production. Nevertheless, because it is perennial, ginseng breeding using traditional methods is a great challenge. Molecular breeding promises to significantly enhance ginseng genetic improvement and breeding.
Plants have evolved different strategies to protect themselves from pathogens. One of the most important and well-studied mechanisms in which plants defense pathogens is based on disease resistance (R) genes whose products recognize pathogen avirulence (Avr) gene directly or indirectly [3]. This gene-for-gene mechanism may activate signal transduction cascades that turn on complex defense responses against pathogens [4]. Studies have documented that most of the cloned R genes that were shown to confer resistance to pathogens encode proteins containing a nucleotide binding site (NBS) domain and a leucine-rich repeat (LRR) domain [5]. These genes have often been referred as to the NBS genes. The NBS domain of the R genes is a region of approximately 300 amino acids extending from the P-loop to the MHDV motif [6]. This domain is a signaling domain responsible for the binding and hydrolysis of ATP and GTP. The LRR domain is devoted to protein-protein interactions. Under a diversifying selection, the LRR domain evolves different binding specificities, which may play a vital role in defining pathogen recognition specificity [4].
The NBS genes have been classified into two major groups, based on their protein structures at N-termini [7]. One group has a Toll/interleukin-1 receptor (TIR) domain in the N-terminal region and the gene members of this group are usually defined TIR-NBS-LRR or TNL genes. The other group has a coiled-coil (CC) structure in the N-terminal region and the gene members of this group are often defined CC-NBS-LRR or CNL genes. Both groups contain five conserved and strictly ordered motifs in the NBS domain, including P-loop, kinase-2, kinase-3a, GLPL, and MHDL [8][9][10][11]. Phylogenetic analyses positioned the CNL and TNL genes in separated clades of the NBS gene family phylogenic tree [7,9,12].
We previously sequenced and characterized the transcriptomes and expression profiles of the genes expressed in fourteen tissues of four-year-old plants, four different-year-old roots and four-year-old roots of 42 genotypes of Jilin ginseng [19]. From the transcriptomes of the 14 tissues, we assembled 248,992 transcript unigenes because studies have documented that different transcripts alternatively spliced from a gene are translated into different proteins potentially having different functions [20]. These expressed gene sequences and their expression profiles have provided resources necessary for genome-wide analysis of the NBS genes actively expressed in ginseng. In this study, we identified and categorized the NBS genes, analyzed their origin and evolution, constructed their phylogeny and characterized their expressions and networks in 14 tissues, four differently aged roots and four-year-old roots of 42 genotypes. These results have provided a deep insight into the NBS genes and the mechanisms underlying their function and evolution in ginseng and thus will facilitate disease resistance genetic improvement and breeding in ginseng and related species.

Identification of the PgNBS genes expressed in ginseng
The 248,992 transcript unigenes previously generated from the transcriptomes of 14 tissues of four-year-old Jilin Ginseng cv. Damaya plants (S1 Fig) [19] were used for this study. Identification of NBS genes expressed in ginseng was as described for identification of the NBS genes in soybean and L. japonicas [17,21].
First, search was performed for possible homologs to NBS genes in the Jilin ginseng transcript unigene database by TBLASTN with the amino acid sequence of the NB-ARC domain (Pfam: PF00931) as a query. A threshold of 1.0E-04 was determined empirically and used for the search to filter out most of the spurious hits. Second, the candidate NBS genes were further subjected to the BLASTn search at an e-value of 1.0E-05 using their nucleotide sequences as queries. Third, the Blast2GO program [22] was used to filter out the potential spurious hits using an e-value of 1.0E-05. Finally, the transcripts identified as above that had same gene models were considered to be derived from a single NBS gene. The NBS genes identified herein were named PgNBS001 -PgNBS284, with a digital suffix (e.g., -01) for a transcript of a PgNBS gene (S1 Table).
Furthermore, the PgNBS genes were analyzed using the Pfam database (http://pfam.janelia. org/) to verify whether they encode TIR, RPW8, NBS or LRR motifs. Because the Pfam program did not predict the CC structure of NBS genes, the COILSn program (http://www.ch. embnet.org/software/COILS_form.html) was used to detect the CC structures of the PgNBS genes at a threshold of 0.9 [23].

In silico annotation and functional categorization
Because different transcripts of a gene may have different biological functions [20], the PgNBS gene transcripts were subjected to annotation (S2 Table). The putatively encoding proteins of the PgNBS transcripts were searched against the Arabidopsis protein database in the Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org), the Swiss-Prot protein database (http://www.expasy.ch/sprot) and the NCBI non-redundant protein (Nr) database (http:// www.ncbi.nlm.nih.gov) using the BLASTX algorithm at an e-value of 1.0E-10. The PgNBS genes were then functionally categorized using gene ontology (GO) term by Blast2GO [22].

Phylogenetic analysis
To determine the origin and evolution of the PgNBS genes, the PgNBS genes having complete NBS domains were extracted and aligned using the ClustalW program [24]. The PgNBS genes with poor sequence alignments were manually removed. Consequently, 67 PgNBS genes, including 9 CN-, 2 CNL-, 34 N-, 7 NL-, 4 RPW8-N-, 9 TN-and 2 TNL-type NBS genes (S3 Table), were selected to construct the phylogenetic tree of the PgNBS gene family. To estimate the origin of the PgNBS genes, a selection of the NBS-encoding R genes previously cloned and known to confer disease resistance from a group of species of known position in a seed plant phylogenetic tree were included as references (S4 Table) [25]. The conserved sequences from the P-loop to GLPL of the PgNBS proteins were aligned and used to construct the phylogenetic tree of the PgNBS gene family using the neighbor-joining method by MEGA 5 [26]. The confidence of the tree was estimated using 1,000 bootstrap replications, the model of Poisson correction was applied and the missing data were treated by Pairwise Deletion of the gaps.

Protein conserved motif structure analysis
To further characterize the structure diversity of the predicted proteins encoded by the PgNBS genes, the TIR-, CC-and RPW8-NBS genes were selected and the amino acid sequences of their NBS domains and the N termini were subjected to motif analysis. The conserved motifs in these representative genes were then analyzed using the MEME online program (http:// meme.sdsc.edu/meme/website/intro.html) [24].

Expression profile analysis
The expression profiles of the PgNBS transcripts in 14 tissues of four-year-old plants (S5 Table), four different year-old roots (S6 Table) and four-year-old roots of 42 Jilin ginseng farmers' cultivars (S7 Table) were extracted from the expressed profile database of their transcriptomes previously developed [19]. The PgNBS transcripts were then subjected to heatmap construction and co-expression network analysis. The expression heatmaps were constructed using the R programming language and software (http://www.r-project.org/) and the coexpression networks were constructed and visualized using the BioLayout Express 3D software [27].

Identification of PgNBS genes
Four hundred and twelve PgNBS gene transcripts were identified (S1 Table) from the 248,992 transcript unigenes of Jilin ginseng [19]. These PgNBS transcripts were derived from 284 gene models. These 284 PgNBS genes were classified into eight types, based on their N terminal, C terminal and LRR domains (Table 1). Of the 284 PgNBS genes, 11 (3.87%) were identified to have the TIR domains at the N-termini, two of which contain the LRR domain (coded by TNL) and nine do not (coded by TN). Sixteen of the PgNBS genes (5.63%) have the CC motifs at the N-termini, two of which have the LRR domain (coded by CNL) and 14 do not (coded by CN). Five PgNBS genes (1.76%) have the RPW8 domain, which is known as Arabidopsis resistance to powdery mildew 8 (RPW8) [28]. Of these five RPW8 PgNBS genes, one has the LRR domain (coded by RPW8-NL) and four do not (coded by RPW8-N). Thirteen of the PgNBS genes (4.58%) possess the NBS-LRR domain (coded by NL) and 239 (84.15%) only have the NBS domain (coded by N). S2 Table shows the details of the genes classification.

Annotation, functional categorization and pathway mapping of PgNBS genes
To infer the potential biological functions of the PgNBS genes, we first annotated the gene transcripts, functionally categorized them in silico and mapped to pathways. Of the 412 PgNBS transcripts, 172 were annotated, based on sequence similarities to proteins in TAIR database and assigned to GO terms (S2 Table). The remaining 238 PgNBS gene transcripts could not be annotated. One hundred thirty-three, 109 and 55 of the 174 PgNBS gene transcripts were annotated to molecular function (MF), biological process (BP) and cellular compound (CC), respectively ( Fig 1A). Of these gene transcripts, only 31 (17.8%) had functions in all three categories. Moreover, the genes of each category were further categorized into multiple functional subcategories (Level 2), even though a majority of the PgNBS genes categorized into Molecu-lar_Function, Ion Binding, Cell and Response to Stress (Fig 1B). These results indicate that the PgNBS gene family has been dramatically differentiated in function, since its gene members originated. Nevertheless, none of them was mapped to the KEGG metabolic pathways.

NBS domain analysis of the PgNBS putative proteins
Because eight major motifs, P-loop, RNBS-A, Kinase-2, RNBS-B, RNBS-C, GLPL, RNBS-D, and MHDV, have been identified in the NBS region of the known NBS-encoding genes in plants [11], we analyzed the PgNBS genes using these motifs. MEME analysis revealed that seven of the eight conserved motifs (except for the MHDV motif) exist in the TIR-and CCtyped PgNBS genes (Fig 2A and 2B), while only six exist in the RPW8-typed PgNBS genes ( Fig  2C). Four of these conserved motifs, including P-loop, TNBS-B, Kinase-2 and GLPL, exist in all TIR-, CC-and RPW8-typed PgNBS genes (Fig 2).  In the CC-typed PgNBS genes, the RNBS-D motif was specially identified in their NBS domain and two of their other conserved motifs (Fig 2A, Motifs 6 and 7) are dramatically different from the motifs that previously reported. In addition, the proteins of two PgNBS genes (PgNBS027 and PgNBS051-01) were shown to be weakly conserved in both P-loop and GLPL motifs, and the proteins of some PgNBS genes lack one or two motifs in the NBS region. For instance, the protein of PgNBS004 lacks the Kinase-2 motif and that of PgNBS034 lacks the GLPL motif.
In the TIR-typed PgNBS genes, the RNBS-C and TNBS-1 motifs were specially identified in their NBS domain. Moreover, some amino acids were diverged from those of the CC-typed PgNBS genes in some motifs. For example, in the TNBS-B domain a conserved amino acid sequence is TTR for TIR-typed PgNBS genes, while it is TTR or TSR for CC-typed PgNBS genes. In the Kinase-2 motif, a conserved amino acid sequence was DDVD or DDVN for TIRtyped PgNBS genes, while it is DDVW for CC-typed PgNBS genes ( Fig 2B). The proteins of some TIR-typed PgNBS genes were also weakly conserved in the NBS domain. For example, those of PgNBS038-01 and PgNBS040 were weakly conserved in the Kinase-2 motif.
In the RPW8-typed PgNBS genes, except for the four conserved motifs described above, two highly conserved motifs (motif5:

N-terminal region analysis of PgNBS putative proteins
The N-terminal region often ranges from 150 to 250 amino acids from the start of the coding region to the beginning of the P-loop of the NBS domain. We analyzed the N-terminal regions of the putative proteins in different typed PgNBS genes separately. No consistent motifs were identified in the CC-typed PgNBS genes, whereas four and three distinct conserved motifs were identified in the N-terminal regions of TIR-and RPW8-typed PgNBS genes, respectively (Fig 3).

Origin, evolution and phylogeny of PgNBS genes
Studies showed that the NBS domains of the NBS genes are highly conserved and have been used to construct phylogenetic trees for the NBS gene family in several species [15,29]. Therefore, we investigated the origin, evolution and phylogeny of the PgNBS gene family using the amino acid sequences of this domain. Analysis showed that 67 of the PgNBS genes have a complete NBS domain and therefore, was used for this experiment. These 67 PgNBS genes included 9 CN-, 2 CNL-, 34 N-, 7 NL-, 4 RPW8-N-, 9 TN-and 2 TNL-typed PgNBS genes (S3 Table). Moreover, 18 previously cloned R genes were selected from a selection of species that were able to position P. ginseng in the phylogenic tree of seed plants [25] and used as references to estimate the origin and evolution of the PgNBS gene family. These 18 R genes were from Arabidopsis thaliana, Helianthus annuus, Cicer arietinum, Camelina sativa, Oryza sativa and Solanum tuberosum, including 10 TN-, 5 CN-and 3 RPW8-N-typed R genes (S4 Table). The amino acid sequences of the NBS domains were predicted, aligned using the region between the P-loop and GLPL motifs and used to construct the phylogenetic tree of the PgNBS gene family by the Neighbor-Joining method. further classified into two subclades, Ia and Ib. Clade II was classified into two subclades, IIa and IIb, and Subclade IIa was then classified into two clusters, IIa-1 and IIa-2 ( Fig 4A). According to the phylogenetic tree of P. ginseng and the reference species (Fig 4B), Subclade Ia was originated before monocot plants (O. sativa) originated because it includes the NBS genes that were not only from P. ginseng, but also from O. sativa. Subclade Ib originated before the dicot plants split. Subclade IIb was originated most recently, after P. ginseng split from H. annuus. Therefore, the PgNBS gene family is an ancient, but continuously evolving gene family.

Expression characteristics of PgNBS gene transcripts in different tissues
To determine the activity characteristics of the PgNBS genes in ginseng, we first profiled the expressions, determined the GO functional categories and constructed the expression heatmap and co-expression network of the gene transcripts in 14 tissues of four-year-old ginseng plants (S1 Fig). The gene transcripts were analyzed because different transcripts likely have different biological functions [20]. We found that of the 412 PgNBS transcripts, 403 expressed in at least one of these tissues and their expression profiles varied dramatically among tissues (Fig 5A).
Only 49 of the PgNBS gene transcripts expressed in all 14 tissues analyzed, accounting for 12%, while 160 of them specifically expressed in only one tissue, accounting for 39%, suggesting the tissue specificity of their expressions. The remaining 194 PgNBS gene transcripts (47%)  Table). These results indicated that the numbers of PgNBS gene transcripts expressed in each tissue varied greatly, with a coefficient of variation (CV) = 39.2%. However, if the 336 gene transcripts expressed in fruit pedicel that dramatically differed from those expressed in the 13 other tissues were excluded from the analysis, the variation of the expressed transcripts in each tissue ranged from 84 to 160, with an average of 132 and CV = 15.2%.
Comparative analysis showed that the numbers of the PgNBS gene transcripts categorized into each GO functional category (Fig 1B) (Fig 6). The coexpression network analysis of the 403 PgNBS gene transcripts expressed in the 14 tissues resulted in a single network (P 0.05) that consisted of 403 gene nodes, 21,140 co-expressing edges and 17 clusters (Fig 7). The result showed that each of the PgNBS genes co-expressed with 2 to 199 other PgNBS genes, with an average of 105 other PgNBS genes, suggesting their co-originated functions.

Expression characteristics of PgNBS gene transcripts in different aged roots
Next, we profiled the expressions, determined the GO functional categories and constructed the expression heatmap and co-expression network of the gene transcripts in the roots of 5-, 12-, 18-and 25-year-old ginsengs. One hundred fifty-two of the 412 PgNBS gene transcripts were found to express in the different year-old roots (Fig 5B and S6 Table).   Table). Of the 206 PgNBS gene transcripts expressed in the 4-year-old roots of different genotypes, the numbers of the transcripts expressed in each genotype were relatively consistent, varying only from 102 to 140, with a CV = 6.0% and an average of 125. GO analysis showed that the numbers of  Heatmap analysis did not formed distinct expression patterns characterizing each genotype, expect for genotype S29 that has a cluster of eight transcripts that were significantly up-regulated (Fig 9). Network analysis led to a single co-expression network of all 206 PgNBS gene transcripts expressed in the roots of the genotypes (Fig 10A). The network consisted of 206 gene nodes, 2,446 co-expressing edges and 14 clusters (Fig 10B). Each of the transcripts in the network coexpressed with 1-64 other PgNBS gene transcripts and an average of 24 other PgNBS gene transcripts.

Discussion
The PgNBS gene family is a size moderate, diverged and ancient NBSencoding gene family We have genome-wide identified and characterized the NBS-encoding genes expressed in Jilin ginseng. We identified 412 PgNBS gene transcripts from the transcriptomes of Jilin ginseng developed from 14 tissues [19]. These transcripts were derived from 284 gene models (Table 1), with an average of 93 gene models expressed in each tissue (S5 Table). According to a study in maize (X. Qi, M.P. Z., X. Su, J. Qin and H.-B. Zhang, unpublished) that approximately 30.7% of the NBS genes were expressed in a tissue, the number of PgNBS genes in the ginseng genome is estimated to be approximately 303. This number is much higher than that of A. thaliana (177) [6], but it is much lower than those of poplar (402) [16], rice (400-785) [12,14,30], cotton (1,350) and soybean (1,044) [30] that have smaller genomes than ginseng.
The PgNBS gene family has the typical features of the NBS gene family in dicot species, including both CC-and TIR-typed genes ( Table 1). Although its genes maintain six or seven of the eight conserved motifs identified in the NBS domain of the NBS gene proteins in plants [11] and three or four conserved motifs in the N terminal region of the NBS gene proteins (Figs 2 and 3), they could be classified into TNL, CNL., TN, CN, N, NL, RPW8-NL and RPW8-N types (Table 1). However, the ratio of CNL-typed to TNL-typed differed from those of other dicot species. For instance, the ratio of the CNL-typed to TNL-typed for A. thaliana was approximately 1: 2 [6] and that for potato was about 2: 1 [11]. This study revealed that the ratio for Jilin ginseng is nearly 1: 1. Moreover, a majority (84.15%) of the PgNBS genes are characterized with the NBS-N domain. In addition, RPW8-typed genes were found in the PgNBS gene family, in which some new conserved motifs such as Motifs 5, 6 and 8 were found.
The PgNBS gene family originated at least before the separation between dicot and monocot plants (Fig 4). Of the 67 PgNBS genes phylogenetically analyzed, approximately 9% originated before dicot plants split from monocot plants and 34% originated before the Asterids split from the Rosids. Only about 15% of the PgNBS genes originated after P. ginseng originated.
The PgNBS gene family has functionally dramatically differentiated, but expresses coordinately Although the PgNBS genes identified in this study belong to a single gene family, they are functionally categorized into 36 functional subcategories of CC, MF and BP. It is recognized that they are mainly involved in four of the 36 functions, including Molecular_Function, Ion One hundred fifty-two of the 412 NBS gene transcripts were found to express in the roots of these different year-old plants and therefore, used for the heatmap construction. https://doi.org/10.1371/journal.pone.0181596.g008 Functional differentiation and co-expression networks of the NBS-encoding gene family Binding, Cell and Response to Stress (Fig 1). In particular, the genes categorized into Response to Stress and Cell Death likely play a role in plant defense to pathogens. Nevertheless, this study shows that all the genes in the family co-express, regardless of in different tissues or in the roots of different genotypes, and form a single co-expression network (Figs 7 and 10). These results indicate that the genes of the PgNBS gene family, in spite of their functional divergence, still function in a coordinated manner.  .g., S43). Two hundred and six of the 412 NBS gene transcripts were found to express in the roots of these cultivars and therefore, used for the heatmap construction. https://doi.org/10.1371/journal.pone.0181596.g009 Functional differentiation and co-expression networks of the NBS-encoding gene family The genes of the PgNBS gene family express differently in different tissues, at different developmental stages and among different genotypes This study shows that the expressions of most of the genes in the PgNBS gene family vary in multiple fold not only at the expression level, but also in the number of genes in different tissues (S5 Table), at different developmental stages (S6 Table) and among different farmers' cultivars (S7 Table). Moreover, such large expression variations are also common among different PgNBS genes within a tissue, at a different development stage or in a particular genotype. A distinguishing character of the gene expression variation is that nearly 39% of 412 PgNBS gene transcripts identified in this study only express in a tissue or are tissue-specific and 12% express at a developmental stage or are developmental stage-specific (Fig 5). Some of the PgNBS genes may be up-regulated in a tissue, at a developmental stage or in a genotype, but down-regulated in the other tissues, at the other developmental stages or in the other genotypes (Figs 6, 8 and 9). Finally, the numbers of the PgNBS gene transcripts categorized into each functional category also vary dramatically across tissues, developmental stages or genotypes (S2-S4 Figs). These expression variations of the genes, therefore, are likely associated with the variation of their functions.

Conclusions
We have identified 284 PgNBS genes that are expressed in Jilin ginseng and found that the members of the PgNBS gene family, in spite of their same origin, have dramatically differentiated in function and vary in spatial and temporal expression and network. These studies have revealed several new characters of the NBS gene family in plants in general and in Jilin ginseng particularly. These results have provided a deep insight into the origin, evolution, expression and function of the NBS gene family, and also resources and tools for further characterization of the genes that defense to pathogens in ginseng and for enhanced ginseng disease resistance breeding.