Genome-Wide Analysis of the Lysine Biosynthesis Pathway Network during Maize Seed Development

Lysine is one of the most limiting essential amino acids for humans and livestock. The nutritional value of maize (Zea mays L.) is reduced by its poor lysine content. To better understand the lysine biosynthesis pathway in maize seed, we conducted a genome-wide analysis of the genes involved in lysine biosynthesis. We identified lysine biosynthesis pathway genes (LBPGs) and investigated whether a diaminopimelate pathway variant exists in maize. We analyzed two genes encoding the key enzyme dihydrodipicolinate synthase, and determined that they contribute differently to lysine synthesis during maize seed development. A coexpression network of LBPGs was constructed using RNA-sequencing data from 21 developmental stages of B73 maize seed. We found a large set of genes encoding ribosomal proteins, elongation factors and zein proteins that were coexpressed with LBPGs. The coexpressed genes were enriched in cellular metabolism terms and protein related terms. A phylogenetic analysis of the LBPGs from different plant species revealed different relationships. Additionally, six transcription factor (TF) families containing 13 TFs were identified as the Hub TFs of the LBPGs modules. Several expression quantitative trait loci of LBPGs were also identified. Our results should help to elucidate the lysine biosynthesis pathway network in maize seed.


Introduction
Lysine cannot be synthesized in the body and must be obtained from food, and consequently it is one of the most limiting essential amino acids for human and livestock. Lysine is highly limited in cereal grains [1].The poor lysine content of maize (Zea mays L.), one of the most important cereal crops, thus greatly reduces its nutritional value to monogastric animals [2].
Two different lysine biosynthesis pathways occur in nature. One is via α-aminoadipate which exists in fungi and Euglena [3]. The other is the diaminopimelate pathway which exists in bacteria, plants, and archaea [4]. Diaminopimelate pathway belongs to the aspartate family pathway in which aspartate kinase (AK) is the first enzyme that catalyzes aspartate phosphorylation to form aspartate-β-semialdehydea [5]. Dihydrodipicolinate synthase (DHDPS) to identify genetic variants that regulated LBPG expression. Our results should increase our knowledge of the lysine biosynthesis network in maize seed.

LBPG coexpression network construction
The WGCNA program in the R software package [25] was used, along with the step-by-step network construction and module detection method, to construct the coexpression network. The data used for the network were based on RNA-Seq data published by Chen [39] from 21 seed developmental stages of maize cultivar B73. Only genes expressed during at least two stages with the reads per kilobase per million reads (RPKM) values 2 or expressed at only one stage with an RPKM 5 were used for network construction. The weighted gene coexpression network during maize seed development was constructed using the Log 2 +1 normalized RPKM expression values of selected genes. The soft thresholding power β was used to calculate adjacencies. To minimize the effects of noise and false associations, we transformed the adjacencies into a topological overlap matrix (TOM), and calculated the corresponding dissimilarities, dissTOM, as 1 -TOM. To hierarchically cluster the genes, we used dissTOM as a distance measure and set the minimum module size (number of genes) to 100 to detect modules. To quantify the coexpression similarities of entire modules, the eigengenes of modules were calculated and clustered based on their correlation [40]. We choose the correlation of 0.9 to cluster the modules. The modules' eigengenes were used to represent the gene expression pattern within a module [41]. Cytoscape (v 3.0.2) [42] was used to display the coexpression network.

Annotation of coexpressed genes
The online tool WEGO (http://wego.genomics.org.cn) was used for the Gene Ontology (GO) enrichment analysis [43]. GO terms of genes (RefGen_v2 5b) were obtained from the maize annotation file downloaded from Phytozome (Zmays_181_annotation_info.txt, ftp://ftp.jgipsf.org/pub/compgen/phytozome/v9.0/Zmays/annotation/) and significances were determined based on the Pearson chi-square test with P-values < 0.05. For each GO term, at least five genes were mapped. The significant terms were displayed in the WEGO output figure.
Maize transcription factor (TF) information was downloaded from the Plant Transcription Factor Database (PlantTFDB) v 3.0 (http://planttfdb.cbi.pku.edu.cn/), which contains 3,316 TFs corresponding to 2,231 loci classified into 55 families. We used this information to identify the TFs of the coexpressed genes.
Elongation factor 1α genes and ribosomal protein genes were obtained using the keywords 'elongation factor Tu' and 'ribosomal', respectively, to search the maize annotation file.
The annotation of zein genes were obtained from Chen [39].

Identification of enriched cis-motifs
The genes which are among the most highly connected within a module are referred to as hub genes [23]. The top 30 hub genes in LBPGs modules were identified by ranking the intramodular connectivity of each gene. The cis-motifs in the promoter regions of hub genes within the module that contain TFs were analyzed using AME software [44]. The sequence 1-kb upstream transcriptional start site was defined as the promoter region. The promoter sequences of module genes were used as the input sequences and those of all of the maize genes (RefGen_v2 5b) were used as the control. The JASPAR CORE 2014 database was used [45].

Analysis of the lysine contents of proteins encoded by LBPGS coexpressed genes
The lysine content of each protein was defined as the ratio of the number of lysine residues to the total number of amino acids. Protein lysine contents were calculated, and a statistical analysis was performed in R [46].

Identification and phylogenetic analysis of LBPGs
To identify LBPGs in maize, the annotation of the maize lysine biosynthesis pathway on KEGG pathways was used, and a BLASTP algorithm-based search was performed in Phytozome 9.0 using the maize genome [17,50].  Table). MEGA5 [38] was then used to construct a neighbor-joining tree (with 500 bootstrap replicates) for each of the seven lysine biosynthesis enzymes (Fig 1). Except for gene the Medtr1g101400 in the DapB tree and gene AT3G53580 in the DapF tree, each family's genes formed a monophyletic group. The AK tree was divided into two groups, one group contained AK1 and AK2, and the other contained AK3 (Fig 1 AK). AK3 is a homoserine dehydrogenase (HSDH, EC: 1.1.1.3), which also catalyzes the third step in the aspartate pathway as a bifunctional enzyme. In accordance with grass evolutionary relationships [51,52], the DHDPS genes of Z. mays, S. bicolor and S. italica, which are C4 grasses, were separated into two groups, while the DHDPS genes of O. sativa and B. distachyon, which are C3 grasses, were not separated (Fig 1 DHDPS). The LL-DAP genes were divided into two clades. LL-DAP1-AT, LL-DAP2-AT, and At4g33680/AtLL-DAP1-AT, and LL-DAP3-AT and At2g13810/ALD1 belong to the two clades, respectively. The AtLL-DAP-AT identified by Hudson et al. was involved in the Arabidopsis lysine biosynthesis pathway [10]. ALD1 has strong activity when lysine was used as an amino donor, indicating that it is unlikely to have the activity of LL-DAP-AT [53]. These results indicated that LL-DAP1-AT and LL-DAP2-AT may have primary functions in the lysine biosynthesis pathway in maize.

Construction of a LBPGs coexpression network
To obtain a coexpression network of LBPGs during maize seed development, we analyzed publicly available RNA-Seq data from 21 developmental stages of B73 maize seed [39], with RPKM values representing gene expression levels. The resulting dendrogram (S1A Fig), derived from a clustered analysis of the 21 samples using 21,679 genes that were either expressed during at least two stages with RPKM 2 or at only one stage with RPKM 5, was identical to that published by Chen [39]. We used the 21,679 genes for further analyses. Because of its low expression across the 21 seed developmental stages, the LL-DAP2-AT gene was removed prior to the analysis. WGCNA [23] was then applied to construct a weighted gene coexpression network.
To calculate the adjacency of the data, we chose the soft threshold power β = 16, which is the lowest power at which the scale-free topology fit index reaches 0.68 (S1B Fig).
The coexpression network construction yielded 24 modules (S2 Fig). The modules ranged in size from 123 genes in the skyblue module to 4,248 genes in the blue module, with 607 genes not being included in any coexpression module (grey module) (S3 Table). The LBPGs were distributed in eight modules. AK1, DHDPS1 and LysA2 were in the blue module, AK2 and DapB1 were in the cyan module, AK3 was in the purple module, ASD, DHDPS2 and DapF1 were in the darkred module, DapB2 was in the steelblue module, LL-DAP1 was in the brown module, DapF2 was in the pink module, and LysA1 was in the darkturquoise module (Table 1). There were 11 elongation factor 1α genes (S3A Fig and S4 Table) and 313 ribosomal protein genes (S3B Fig and S5 Table) coexpressed with the LBPGs. These results are in accordance with the previous reports that lysine metabolism is correlated with ribosomal proteins and elongation factor 1α genes [54,55]. The LL-DAP3-AT was in the grey module, so we removed it from the following analysis. The gene expression levels in the modules with LBPGs were further analyzed, and the results showed that the genes in the different modules had some similar features during seed development, even though these modules had different expression patterns. The expression levels of genes in the blue and brown modules were higher at the early and late stages, but lower at the middle stage. The expression level of the steelblue module was higher at 6-12 DAP and the late stage. The levels of the cyan, darkred, pink and darkturquois modules were higher only at early stage, while that of the purple module was higher only at the late stage (Fig 2).
To better understand the biological functions of the coexpressed genes, the online tool WEGO [43] was used for the GO enrichment analysis (S4 Fig). We found that several 'Cellular Component' GO categories, such as cell part, organelle part and protein complex, were overrepresented among the coexpressed genes. Several 'Molecular Function' categories, including enzyme activator, ligase, protein binding, structural constituent of ribosome, and RNA polymerase II TF, were enriched in the coexpressed genes. Cell communication and terms related to protein transport, such as localization, establishment of localization, and microtubule-based process, were the subcategories of the 'Biological Process' category that were enriched in the coexpressed genes. This result indicates that LBPGS coexpressed genes may play important roles in cell metabolism, specifically in protein metabolism.
Further, TF binding sites (TFBS) were analyzed using the AME software suite (Fig 3). We found that the binding sites of myb.Ph3 (a Petunia hybrid MYB protein) were enriched in the  brown module, the binding sites of HAT5 (an Arabidopsis homeobox-leucine zipper protein) in the cyan module, the binding sites of MEF2A (a human MADS box transcription enhancer factor) in the pink module, and the binding sites of Hac1 (a yeast bZIP protein) in the purple module. Additionally, the sequences of these binding sites were used to screen the gene promoter regions. The results are shown in S6 Table. Among the 3,769 genes in the brown module, there were 1,065 genes with the myb.Ph3 binding site in their promoter region. There were 703 genes containing HAT5 among the 2,250 genes in the cyan module, and there were 222 genes containing the MEF2A among the 2,139 genes in the pink module. In the purple module, there were 622 genes containing Hac1 among the 2,204 genes. These results were coincident with that the corresponding TFs were the Hub TFs in the modules, indicating that the MYB, homeobox, MADS box and bZIP families may have broad regulatory roles during maize seed development.
Remarkably, the Opaque2 gene (O2) (GRMZM2G015534) encoding one of the bZIP family TFs was found in the blue module and coexpressed with DHDPS1, AK1 and LysA2 (S3 Table). The inactivation of O2 could cause lysine accumulation [56]. We noticed that, except at 12-18 DAP and at 22-26 DAP, the O2 and the DHDPS1 were almost negatively correlated (Pearson's correlation coefficient = -0.83) during maize seed development (S5A Fig). Further, we analyzed the transcriptome data of 15 DAP endosperm of the o2 mutant published recently [57], and found that the expression level of DHDPS1 was also lower than in the wild type (S5B Fig). In addition, Li et al. analyzed the transcriptome data and chromatin immunoprecipitation data of the o2 mutant, finally identified 35 down-regulated differentially expressed genes having O2 binding sites in their promoter regions [57]. We found that 28 of the 35 genes were in the Filtered Gene Set (5b, RefGen_v2), and that 11 of the 28 genes were in the blue module, coexpressed with O2, including b-32 (GRMZM2G063536), CyPPDK1 (GRMZM2G306345) and the zein genes (50-kDa, 22-kDa, 19-kDa and 14-kDa) which were the well-studied downstream genes regulated by Opaque2 (S7 Table).

Functional annotation of DHDPS coexpressed genes
The two DHDPS genes were split into two modules in the coexpression network, and the expression level of DHDPS2 was higher than that of DHDPS1 during maize seed development (Fig 4A). To confirm the expression pattern of DHDPS genes at the maize seed developmental stages, we sampled maize seeds at 2, 6, 8, 10, 16, 20, 24, and 32 DAP for qRT-PCR validation ( Fig 4B). The qRT-PCR result showed similar expression trends to those obtained by RNA-Seq. We further analyzed the expression patterns of genes that coexpressed with these two DHDPS genes (Fig 4C). We found that the DHDPS1 coexpressed genes were highly expressed at early (0-8 DAP) and late (30-38 DAP) stages, whereas DHDPS2 coexpressed genes were highly expressed at 8-10 DAP, which corresponded to the end of the early stage and the beginning of the middle stage of maize seed development.
To further understand the differences between the two DHDPS genes, the DHDPS coexpressed genes were analyzed using WEGO (Fig 5). The result showed that the organelle lumen, RNA polymerase II TF, nutrient reservoir and organophosphate metabolic process were enriched in DHDPS1 coexpressed genes. Twenty-two genes were related to organelle lumen, such as abp1 (GRMZM2G116204) and abp4 (GRMZM2G064371), which are associated with endoplasmic reticulum lumen. Sixteen genes were related to RNA polymerase II TF. Thirty genes were related to nutrient reservoir, including one 14-kDa zein, seventeen 19-kDa zeins, eleven 22-kDa zeins and one 50-kD γ-zein. Twenty-one genes in the organophosphate metabolic process GO category corresponded to the phospholipid biosynthetic process and the GPI anchor biosynthetic process. While, cell part, intracellular organelle, structural constituent of  ribosome and biosynthetic process were enriched in DHDPS2 coexpressed genes. There were 93 ribosomal genes related to the structural constituent of the ribosome term. In addition, 152 genes were involved in translation processes: translational initiation process, e.g., EIF2 (GRMZM2G030646), SUI1 (GRMZM2G113414) and EIF3K (GRMZM2G115182); translational elongation process, e.g., EF1B (GRMZM2G029559); tRNA aminoacylation for protein translation process: lysyl-tRNA (GRMZM2G146589 and GRMZM2G386714). Further, we found that two GroES-like family protein genes (GRMZM2G013652 and GRMZM2G035063), which were chaperones assisting protein folding, were coexpressed with DHDPS2.

Lysine content of proteins encoded by the coexpressed genes
To analyze the differences in the lysine contents among proteins encoded by coexpressed genes, non-coexpressed genes and different modules, we defined the ratio of the number of lysines to the length of proteins as the lysine content. We found that the mean lysine content of proteins encoded by coexpressed genes (5.274%) was significantly higher (P < 2.200 −16 , Mann-Whitney-Wilcoxon single tail test) than that of proteins encoded by all of the gene sets (4.761%) and proteins encoded by the same number of randomly selected genes from all of the gene sets (4.790%) ( Fig 6A). Next, we calculated the mean lysine content of proteins encoded by DHDPS coexpressed genes (Fig 6B). The mean lysine content of proteins encoded by DHDPS2 coexpressed genes (6.775%) was significantly higher (P < 2.200 −16 , Mann-Whitney-Wilcoxon single tail test) than that of proteins encoded by DHDPS1 coexpressed genes (5.129%) and all of the gene sets (4.761%). The mean lysine content of proteins encoded by DHDPS1 coexpressed genes (5.129%) was also significantly higher (P < 2.200 −16 , Mann-Whitney-Wilcoxon single tail Test) than that of proteins encoded by all of the gene sets (4.761%). Because the transcriptional level is not always coupled to the translational level of a given gene, and the total lysine content is also determined by the protein abundance, we analyzed the protein abundance data that was published by Walley et al. [58]. We found that the average abundance of proteins encoded by LBPG coexpressed genes was higher than that of all of the gene sets (S6A Fig). The average abundance of protein encoded by DHDPS2 coexpressed genes was higher than those encoded by DHDPS1 in the select tissues (S6B Fig). Taken together, these results suggest that the free lysine might be preferentially involved in the synthesis of high lysine content proteins encoded by LBPG coexpressed genes with higher lysine contents.

GWAS-based discovery of LBPG eQTLs
For the GWAS analysis, we used the expression data of 28,850 maize genes and more than 1.06 million SNPs, generated from immature kernels (15 DAP) of 368 maize inbred lines [33]. GAPIT [59] was used for the association analysis of the expression levels of the 13 LBPGs. The GWAS analysis uncovered six LBPGs, DHDPS1, DapB1, DapF1, DapF2, LysA1 and LysA2, with 140 significantly associated SNPs (P < 1.8× 10 −6 and a false discovery rate < 0.05). Manhattan and quantile-quantile plots for the six LBPGs resulting from the GWAS analysis are shown in S7 Fig. From the significantly associated SNPs, we identified 16 candidate eQTLs by grouping three or more SNPs when the distance between two consecutive SNPs was less than 5 kb. Because DapF1 had only one significantly associated SNP, this gene was excluded. The most significantly associated SNP in each eQTL region was defined as the lead SNP, and used to represent the eQTL. As defined by Fu et al. [33], the eQTLs with lead SNPs located more than 20-kb from their associated genes were regarded as distant eQTLs, while, the remaining eQTLs were considered to be local eQTLs. For the 16 loci, all of the possible regulators of target genes within the eQTL regions are listed in Table 2. DHDPS1 was predicted to be regulated by one local eQTL and one distant eQTL, DapB1 by two local eQTLs and four distant eQTLs, and DapF2 and LysA1 each by only one local eQTL. LysA2 was predicted to be regulated by six distant eQTLs (Table 2). Remarkably, the local eQTLs of four LBPGs (DHDPS1, DapB1, DapF2, and LysA1) all contained those target genes, indicating that the expression of these LBPGs may be regulated directly by the variation of local sequences. However, LysA2 was predicted only to be regulated by distant eQTLs. Five genes were predicted in the eQTL region, among which, gfa1 (GRMZM2G005849) which encodes a glucosamine-fructose-6-phosphate aminotransferase existed in the LysA2 coexpression network, suggesting that GFA1 may be the main regulator of lysA2 expression.

Discussion
Because lysine is one of the most limiting essential amino acids, the improvement of the lysine content of cereal grains, especially maize, is very important. Many studies have reported that the free lysine content was elevated by inactivating the sensitivity of lysine biosynthesis enzymes (AK and DHDPS) to lysine feedback inhibition [13][14][15]60]. In this study, we used RNA-Seq data from 21 developmental stages of B73 maize seed to investigate the network associated with lysine biosynthesis during maize seed development.
In the coexpression network, the 13 LBPGs were split into eight modules. This result is in accordance with the generally low degree of coexpression for genes encoding enzymes involved in metabolic pathways [61,62]. Further, the eight modules tended to have higher expression levels at the early stage and lower expression levels at the middle stage of maize seed development. One explanation is that at the early stage of maize seed development, more free lysine needs to be synthesized to meet the increasing demands of protein synthesis, while at the middle stage the amount of free lysine transport from the peduncle vascular sap or other tissues to the developing seed may provide enough free lysine for protein synthesis [63,64].
According to the GO enrichment analysis, the main functions of LBPG coexpressed genes were related to cellular protein activities. These associations with cellular proteins indicated that the lysine biosynthesis genes had a close relationship with protein metabolic networks. In plants, transcriptional regulation is very important in many metabolic pathways, as TFs tend to control multiple pathway steps [65]. We identified 643 TFs in the network (S6 Table), many of which were members of TF families regulating seed-related biological processes, such as seed protein metabolism and seed development. For example, the bZIP family regulates lysine metabolism [66,67]; MYB, B3 and DOF families control seed storage protein genes expression [68][69][70][71]; and C3H, AP2, bHLH, HB-other, MIKC and NAC families regulate seed development [72,73]. We also found that some TFs were hub genes within a module, such as the seven MADS-box TFs in the pink module. Further, we analyzed the TFBSs of the modules that contain TFs in the hub genes, and found the corresponding binding sites were enriched in the  (Fig 3). Additionally, we analyzed the regulatory network of LBPGs, and detected 5 local eQTLs and 11 distant eQTLs for 5 LBPGs (Table 2). DHDPS1 was predicted to be regulated by a local eQTL. The two LysAs were regulated by eQTLs, LysA1 was regulated by a local eQTL, and LysA2 was regulated by six distant eQTLs. These results will improve our understanding of lysine biosynthesis regulation. Together, the results indicated that genetic control is involved in the lysine biosynthesis, and that the lysine biosynthesis is regulated at the transcriptional level.
In Escherichia coli, the lack of GroE causes the level of dihydrodipicolinate synthase to be significantly reduced [74]. In our network, we found two GroES-like family protein genes were coexpressed with DHDPS2. DHDPS2 may require chaperones to fold, and its activity is regulated by molecular chaperones in maize, as well as in Escherichia coli.
DHDPS is the core enzyme in the diaminopimelate pathway, and its structure has been solved in several plants recently, such as Arabidopsis [75] and grape [76]. DHDPS activity is influenced by the changes in the kinetic parameters and is feedback-inhibited by lysine. Meanwhile, in Arabidopsis, it had been shown that transcriptional regulation of the DHDPS gene exerts a primary control on lysine synthesis [77]. In this work, we studied the DHDPS genes' coexpression network in maize seed. We identified two genes encoding DHDPS in maize seed. They share 73.45% identity at the nucleotide level. However, in the coexpression network, the two genes were separated into different modules and there were three different points in the coexpression network.
First, the two DHDPS genes had different expression levels. The expression level of DHDPS2 was higher than that of DHDPS1 during seed development (Fig 4A). The same result is also found in the seedlings of Arabidopsis in which the DHDPS2 expression level is higher than that of DHDPS1 [78]. Mutant analyses of Arabidopsis showed that dhdps2 has an increase in threonine, but dhdps1 does not [78]. These indicated that the DHDPS2 controls the flux in lysine biosynthesis. When the activity of DHDPS2 decreased, aspartate semialdehyde, which is the common material for the synthesis of lysine, threonine and methionine, is used to synthesize more threonine [79].
Second, according to the GO enrichment analysis, the DHDPS1 coexpressed genes were enriched in the GO term of nutrient reservoir activity, RNA polymerase II TF and organophosphate metabolic process, while the DHDPS2 coexpressed genes were enriched in cell part, organelle, ribosomal, and biosynthesis process (Fig 5).
Third, the lysine content of proteins encoded by genes coexpressed with DHDPS2 was significantly higher than that of proteins encoded by DHDPS1 coexpressed genes (Fig 6B). These results imply that the DHDPS1 and DHSPS2 genes may contribute differently to lysine synthesis during maize seed development.
Previous studies showed that free lysine accumulated transiently at intermediate stages of tobacco seed development [80]. Free lysine accumulation in Arabidopsis seeds is positive correlated (r = 0.7704) with DHDPS expression levels [6,81]. DHDPS2 and their coexpressed genes were highly expressed in maize seeds at 8-10 DAP (Fig 4C) and in central starchy endosperm at 8 DAP (S8 Fig). Considering that in the physiological process of seed development, the most active stage of endosperm starch and storage protein synthesis is at 12-15 DAP (Kynast, 2012) -as is the spatiotemporal expression (Fig 4C and S8 Fig) and the functions of the coexpressed genes-we think it is very likely that DHDPS2 is responsible for the synthesis of some high lysine content proteins related to translation, such as ribosomal proteins that have a high protein abundance at 8-12 DAP (S6C Fig). These high lysine content proteins are involved in and/ or contribute to the synthesis of the proteins involved in the later stage of seed development.
There is another question about the role of DHDPS1 in maize seed. In Arabidopsis, the two isoforms of DHDPS (DHDPS1 and DHDPS2) contribute differently to lysine biosynthesis [78]. During the seed development stage, we found that the DHDPS1 had a close relationship with O2 (Pearson's correlation coefficient = −0.83). Varisi et al. reported that there was no clear evidence that o1, o2, fl1 and fl2 genes influenced DHDPS activity in developing maize seeds [82]. This discrepancy may be because the expression level of DHDPS2 was higher than that of DHDPS1 in maize seed, and the reduced activity of DHDPS1 had no effect on the total DHDPS activity level. DHDPS1 and DHDPS2 might make different contributions to lysine biosynthesis in maize seed as well as in Arabidopsis. Based on the MaizeGDB gene expression profile, we found that DHDPS1 was expressed highly in thirteenth leaves and in leaves after pollination (http://www.maizegdb.org/gene_center/gene/GRMZM2G556131), while DHDPS2 was expressed highly in seed (http://www.maizegdb.org/gene_center/gene/ GRMZM2G027835). This suggested that DHDPS1 and DHDPS2 may have specific functions in different tissues. DHDPS1 and DHDPS2 may have important roles in leaves and seeds, respectively.
Recently, Walley et al. reported that there was a poor correlation between mRNA levels and protein abundance in maize seed endosperm at 12 DAP (r = 0.414) and embryo at 20 DAP (r = 0.413) [58]. Here, we compared the DHDPS2 coexpressed gene proteome data with the RNA-Seq data of maize seed endosperm at 8, 10 and 12 DAP, and the correlation coefficients were 0.34, 0.39 and 0.43, respectively. The results were consistent with those of Walley et al. [58]. Interestingly, the mRNA levels of LBPGs showed a relatively high correlation (r = 0.85) to protein abundances in maize endosperm at 12 DAP in Walley et al.'s study [58]. Ponnala et al. also reported that, in maize leaves, the photosynthetic genes show the highest correlation when leaves serve as the carbon source [83]. These results indicated that the correlations between mRNA and protein abundance have a wide range depending on the gene function and growth stage.
In summary, we constructed a complex LBPGs coexpression network based on RNA-Seq data from 21 different maize seed developmental stages. After determining that DHDPS genes exhibit different expression levels and expression patterns, we comparatively analyzed DHDPS1 and DHDPS2 coexpressed genes and determined that DHDPS1 and DHDPS2contribute differently to lysine biosynthesis in maize seed. We also analyzed the function of LBPG coexpressed genes and identified some potential regulators of lysine biosynthesis, such as certain TF families, and local and distant eQTLs. Although further biochemical and molecular experiments are needed for verification, our results provide a foundation for future studies on the lysine biosynthesis pathway network in maize seed.

Accession Numbers
The sources of data underlying the findings described in our manuscript can be found in the S9 Table. Supporting Information  8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38, days after pollination) and the color represents the module color. The value of the module eigengenes for each sample is displayed on the y axis.