Identification and characterization of the bZIP transcription factor family and its expression in response to abiotic stresses in sesame

Basic leucine zipper (bZIP) gene family is one of the largest transcription factor families in plants, and members of this family play important roles in multiple biological processes such as light signaling, seed maturation, flower development as well as abiotic and biotic stress responses. Nonetheless, genome-wide comprehensive analysis of the bZIP family is lacking in the important oil crop sesame. In the present study, 63 bZIP genes distributed on 14 linkage groups were identified in sesame, and denominated as SibZIP01-SibZIP63. Besides, all members of SibZIP family were divided into nine groups based on the phylogenetic relationship of Arabidopsis bZIPs, which was further supported by the analysis of their conserved motifs and gene structures. Promoter analysis showed that all SibZIP genes harbor cis-elements related to stress responsiveness in their promoter regions. Expression analyses of SibZIP genes based on transcriptome data showed that these genes have different expression patterns in different tissues. Additionally, we showed that a majority of SibZIPs (85.71%) exhibited significant transcriptional changes in responses to abiotic stresses, including drought, waterlogging, osmotic, salt, and cold, suggesting that SibZIPs may play a cardinal role in the regulation of stress responses in sesame. Together, these results provide new insights into stress-responsive SibZIP genes and pave the way for future studies of SibZIPs-mediated abiotic stress response in sesame.


Introduction
Abiotic stresses severely affect the growth and development of plants. As a sessile organism, plant has evolved complex signaling transduction pathways and various mechanisms of abiotic stress tolerance to survive in a variety of adverse conditions such as drought, salinity, waterlogging, and extreme temperature [1][2][3]. Transcription factors (TFs) play a vital role in abiotic stress responses signaling networks in plants through binding to promoters of specific sets of a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 implicated in branched-chain amino acid catabolism, which is an alternative mitochondrial respiratory pathway that is crucial for plant survival during extended darkness induced carbohydrate deprivation [38].
Sesame (Sesamum indicum L.) is an ancient and prized oil crop, which is grown mainly in tropical and subtropical areas of the world. Sesame has high nutritional quality [39] and is widely used in baked and confectionery products. Although market demand for sesame seed continues to rise, the production of sesame is severely affected by adverse environmental stresses such as drought and waterlogging [40][41][42]. It is well known that bZIP genes play a crucial role in response to various abiotic stresses in crops. However, no genome-wide information is available for bZIP gene family in sesame. In this study, we performed genome-wide identification of bZIP gene family in sesame and comprehensively analyzed their phylogenetic relationships, conserved motifs and gene structure arrangement. Furthermore, the expression patterns of bZIP genes in different tissues and in response to abiotic stresses were also analyzed using publicly available transcriptome data and quantitative real-time RT-PCR (qPCR). Our results provide a perspective for further functional characterization of potentially important bZIPs that are highly involved in abiotic stress responses in sesame.

Identification and phylogenetic analyses of the bZIP gene family in sesame
To identify all genes encoding bZIP transcription factors in sesame (Sesamum indicum L.), all annotated proteins were downloaded from the Sesamum indicum genome database (Sinbase, http://ocri-genomics.org/Sinbase/index.html) [43]. Local Hidden Markov Model (HMM) search was firstly performed based on HMM profile of the bZIP domain (PF00170) using HMMER3.0. BLAST algorithm was also used to identify the predicted sesame bZIPs with all Arabidopsis bZIPs as queries. Then, all resulting candidate protein sequences were further examined by SMART (http://smart.embl-heidelberg.de/) to confirm the integrity of the bZIP domain. Finally, non-redundant and confident genes were gathered and assigned as sesame bZIP genes. Multiple sequence alignments of SibZIPs and Arabidopsis bZIPs were performed with ClustalX (version 1.83) [44]. Subsequently, the results were used to construct unrooted phylogenetic trees based on the neighbor-joining (NJ) method in MEGA (version 5.0) program with the pairwise deletion mode [23]. The bootstrap analysis was conducted with 1000 replicates.

Chromosomal location and duplication analysis
SibZIP genes were located on sesame linkage group according to their position information in the Sinbase database. The duplication pattern of SibZIP genes was analyzed using MCScanX software (http://chibba.pgml.uga.edu/mcscan2/) according to the previous description [45], and genes were classified into various types of duplications including segmental, tandem, proximal and dispersed under a default criterion.

Analyses of gene structure, promoters, conserved motifs, and construction of the interaction network
The exon-intron compositions of the SibZIP genes were analyzed using Gene Structure Display Server (GSDS) (http://gsds.cbi.pku.edu.cn/index.php) by comparing the coding sequences with their corresponding genomic sequences from Sinbase database. The upstream 1 kb genomic DNA sequences of SibZIP genes were submitted to the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) to identify the putative stress-related cis-elements. Conserved motifs of SibZIP proteins were analyzed using MEME (Multiple Em for Motif Elicitation) v4.11.4 (http://meme-suite.org/tools/meme). Interaction network of Arabidopsis bZIPs was constructed using STRING database (https://string-db.org/) [46] with high confidence (score>0.9). Then, the homologs of these interactive proteins in sesame were identified with reciprocal best BLASTP analysis [47].

Expression profiling of SibZIP genes using available transcriptome data
Expression analysis of SibZIP genes in different tissues (root, stem, flower, leaf, capsule and seed) were based on the transcriptome data derived from Sesame Functional Genomics Database (SesameFG, http://www.ncgr.ac.cn/SesameFG) [48]. Expression patterns of SibZIP genes under drought treatment were extracted from the transcriptome data of two sesame genotypes with contrasting drought tolerance (drought-tolerant cultivar ZZM0635 and drought-sensitive cultivar ZZM4782) (SRA accession number: SRR2886790) [49]. Expression patterns of SibZIP genes under waterlogging stress were obtained from transcriptome data of two sesame genotypes with contrasting tolerance to waterlogging (waterlogging-tolerant cultivar Zhongzhi No. 13 and the waterlogging-sensitive cultivar ZZM0563) (BioProject accession number: PRJNA356988) [50]. Heatmaps and hierarchical cluster analyses were constructed by MeV (MultiExperiment Viewer) [51].

Plant materials and treatments
Sesame plants of Zhongzhi No. 13 cultivar were grown hydroponically in a growth chamber (16 h light/8 h dark cycle at 28˚C). Two-week old seedlings were exposed to different abiotic stresses including osmotic, salt, and cold treatments as described in our previous study [52]. The shoots of seedlings with different treatments were harvested at 0, 2, 6 and 12 h after treatment. Shoot samples were frozen immediately in liquid nitrogen, and stored at -80˚C until use.

Quantitative real-time RT-PCR
Total RNA was extracted using the EASY spin Plus kit (Aidlab, China). The first-strand cDNAs were synthesized using the HiScript II 1st Strand cDNA Synthesis kit (Vazyme, China) following the manufacturer's instruction. Quantitative real-time RT-PCR (qPCR) was performed on Roche LightCycler 480 real-time PCR system using the ChamQ SYBR qPCR Master Mix (Vazyme Biotech, China). Additionally, the qPCR assays were performed with three replicates and Histone H3.3 gene (SIN_1004293) was used as endogenous control [53]. The relative expression levels were analyzed according to 2 -ΔΔCT method [54]. The gene-specific primers are listed in S1 Table.

Identification of bZIP transcription factors in sesame
To identify bZIP TFs family members in sesame, a Hidden Markov Model search using the HMM profile of bZIP domains (PF00170), as well as genome-wide BLAST searches using Arabidopsis bZIP sequences as query, were performed to screen protein sequence data from the Sesamum indicum genome database (Sinbase, http://ocri-genomics.org/Sinbase/index.html) [43]. After validating the integrity of the bZIP domain using the SMART database (http:// smart.embl-heidelberg.de/), a total of 63 non-redundant genes were assigned to sesame bZIP genes, and named from SibZIP01 to SibZIP63 based on their physical location in the sesame genome. The 63 predicted SibZIP proteins ranging from 131 (SibZIP58) to 780 (SibZIP11) amino acid residues in length, with an average of 339 amino acids. The gene name, gene locus ID, linkage group location, protein length, and other corresponding information of all SibZIP genes are shown in S2 Table.

Distribution on chromosome and duplication events of the SibZIP genes
Based on the genomic position information obtained from the Sinbase database, we found that 61 SibZIP genes were unevenly distributed among 14 linkage groups (LGs) out of the 16 LGs of the sesame genome (Fig 1). However, other two genes (SibZIP62 and SibZIP63) were distributed on the unassembled genomic scaffolds. The LG02 harbored the highest number of SibZIP (8 genes), followed by LG06 (7 genes). In contrast, only one SibZIP gene was located each on the LG13 and LG16. To understand the evolutionary mechanisms of SibZIP gene family, both tandem and segmental duplication events were further analyzed. Surprisingly, duplication in the case of bZIP gene family in sesame was confined to only segmental duplication because no tandem duplicated SibZIP gene pair was identified. As shown in S1 Fig, a total of 21 pairs of bZIP genes were located on duplicated genomic segments on 12 LGs of the sesame genome, and most of them were localized on two different linkage groups (S3 Table). These results indicate that segmental genome duplication events have contributed predominantly to the expansion of the bZIP gene family in sesame.

Phylogenetic relationship of sesame and Arabidopsis bZIPs
In Arabidopsis, 75 AtbZIP sequences are clustered into 10 groups (A-I, and S) according to the sequence similarity. In order to understand the phylogenetic relationship of the bZIP TF family between sesame and Arabidopsis, an unrooted NJ tree was constructed based on protein sequences of 63 SibZIPs and 75 AtbZIPs. As shown in Fig 2, 138 bZIPs from sesame and Arabidopsis were divided into nine groups (A, B, C, D, F, G, H, I and S) and the groups were named following Arabidopsis classification. However, the group E AtbZIPs including AtbZIP34 and AtbZIP61, were clustered within the group I, which might be due to the high sequence similarity of bZIP domain between these two groups [4]. So, we integrated these two groups and designated as group I in this study. Therefore, the group I is the largest group including 15 members of SibZIP family. The group S contains 13 members. The groups D and A contain ten and nine members, respectively. In contrast, the groups B and H are the smallest clusters, each having two SibZIPs. According to the functional characterization of the bZIP subgroups in Arabidopsis [4], phylogeny-based functional prediction was performed for corresponding subgroups of bZIP in sesame (S4 Table).

Gene structure and conserved motifs analyses of sesame bZIPs
To better understand the structural features of SibZIP genes, we analyzed the exon/intron organization of SibZIPs using Gene Structure Display Server 2.0. The result showed that the number of introns varied from 0 to 11 in SibZIP genes. As shown in Fig 3B, 17 SibZIP genes belonging to the group S and F are intronless. Groups A, B, C, H, and I have 1-6 introns, whereas, the SibZIP genes of the groups D and G contain the largest number of introns (7-11 introns). To obtain insight into the diversification of SibZIP proteins, conserved motifs were predicted using the MEME program. A total of 20 conserved motifs were predicted in SibZIPs, Characterization of bZIP family in sesame and the distribution of each motif was shown in Fig 3C. It can be observed that SibZIPs within the same group displayed similar motif compositions, which further supported the group classification. For instance, all members of the group A share motif 1, 7 and 9; group B possesses motifs 1, 7, 13 and 16; groups C and S contain motif 1 and 6; group G harbors motifs 1, 7, and 16. The details of the sequence logo of each motif were presented in S2 Fig. However, the biological function of most of these motifs is unknown. Motif 1, which is widely present in all the SibZIP TFs, was annotated as bZIP domain. In addition, motif 7 is found in six groups, and motif 6 is present in two groups. In contrast, most of the conserved motifs appeared in specific groups. For example, motif 2 and 3 were annotated as transcription factor TGA like domain, and exclusively present in group D SibZIPs along with motif 5 and 15. Besides, motif 12 is present in four members of group A, and motifs 13 only appeared in group B. These groupspecific motifs may imply diverse functions of the bZIP family in sesame.

Expression profiles of SibZIP genes under drought and waterlogging stresses
Using RNA-seq data previously developed by our group [49,50], expression patterns of SibZIP genes in response to drought and waterlogging stresses were investigated in the root of sesame varieties with contrasting tolerance levels. Heatmap representing fold change of SibZIP genes under drought stresses showed that most of the SibZIP genes have similar expression pattern between drought-tolerant variety (DT, cv.ZZM0635) and drought-sensitive variety (DS, cv. ZZM4782) (Fig 5). For example, ten SibZIPs (SibZIP05, 15, 16, 17, 20, 24, 32, 44, 56 and 62) were up-regulated, whereas 12 SibZIPs (SibZIP03, 04, 13, 22, 23, 26, 30, 35, 36, 41, 47 and 58) were down-regulated, by drought stress in both genotypes. However, some SibZIP genes showed different expression pattern between DS and DT varieties and may be good candidates for drought tolerance improvement in sesame. For instance, SibZIP02 was up-regulated in DT plants throughout the drought stress treatment, but was not significantly affected in the DS variety.

Expression profiles of SibZIP genes exposure to various stresses
To further uncover the possible implication of SibZIP genes in response to other abiotic stresses, 28 SibZIPs (including all members of group A, and one or two members of other groups) were selected to analyze their expression patterns under osmotic, salinity and cold stresses by qPCR. Expression of SibZIP55, one of the group-A SibZIP genes, was not detected in all samples probably due to its extremely low expression level in the tested tissue. A heatmap representation of expression changes in response to the three stresses is shown in Fig 7. Under osmotic stress, SibZIP07, 16 and 44 were significantly up-regulated (fold change>2, P < 0.01) at 2 h, while SibZIP04 and SibZIP19 were specifically induced at 12 h. In contrast, SibZIP35 and 53 were down-regulated (fold change<0.5, P < 0.01) at all of the time points, and Sib-ZIP02, 17,25,29,38,54 were repressed at one or two time points after osmotic stress treatment. Under salt treatment, eight SibZIP genes (including SibZIP07, 08, 17, 19, 21, 33, 44, and 57) were significantly up-regulated (fold change > 2, P < 0.01) at two or three time points, whereas five SibZIP genes (SibZIP10, 16, 31, 35, and 38) were significantly down-regulated (fold change <0.5, P < 0.01) at 6 h and 12 h. Besides, SibZIP01 and SibZIP28 were specifically induced at 2 h, while SibZIP02, 04 and 48 were specifically induced at 12 h. Under cold treatment, almost all of the selected SibZIP genes tend to be down-regulated, and the expression of three SibZIP genes (SibZIP08, 16, and 31) was significantly repressed during the whole treatment period. However, SibZIP35 and 57 were significantly up-regulated (fold change > 2, P < 0.01) at 12 h under cold stress. Taken together, we found that the expression patterns of several SibZIP genes are similar among different abiotic stresses (Fig 7). SibZIP07, 19, 33 and 44 were up-regulated under osmotic and salt treatments. These results suggested that these genes might play a vital role in response to multiple abiotic stresses in sesame.

Stress-related cis-elements in SibZIP promoters
To identify putative stress-responsive cis-elements in the promoter regions of the SibZIP genes, 1kb upstream promoter sequences of the SibZIP genes were investigated using the PlantCARE database [55]. The promoters of all SibZIP members contain one or more stressrelated cis-elements, such as DRE (dehydration-responsive element), MBS (MYB binding site involved in drought-inducibility), HSE (heat shock element), LTR (low temperature-responsive element), TC-rich repeats (defense and stress-responsive element), and ABRE (ABA response element) (S5 Table). Among 63 SibZIP genes, 36, 34, and 34 genes have MBSs, TCrich repeats, and HSEs, respectively. Moreover, 29 SibZIP genes have more than four stressresponsive cis-elements, and 25 SibZIP genes have more than three types of cis-elements. Sib-ZIP33 contains three MBS elements and may be related to drought tolerance. Similarly, Sib-ZIP62 contains three HSEs, suggesting its possible function in thermal response. Furthermore, SibZIP24, SibZIP33, SibZIP56, and SibZIP61 harbor various stress-related cis-elements, indicating these SibZIPs may be involved in response to multiple abiotic stresses.

Analysis of SibZIP proteins interaction network
The possible interaction network of sesame bZIPs based on their homology to Arabidopsis proteins was constructed, to identify the putative function and interaction relationship between SibZIPs and other sesame proteins. By applying STRING database, networks of Arabidopsis Characterization of bZIP family in sesame bZIPs were constructed with high confidence (score > 0.9), involving 24 bZIPs and 10 other interactive proteins, including protein phosphatase 2C, serine/threonine-protein kinase, and E3 ubiquitin ligase (Fig 8; S6 Table). Subsequently, the homologs of these proteins involved in the interaction network were identified from sesame with reciprocal best BLASTP analysis (Fig 8; S7 Table). The largest network contains eight bZIP proteins from the group A, which was associated with ABI1, ABI2, OST1, SnRK2.2, SnRK2.3 and PYL1, participating in the ABA signal pathway. Seven members of the group D interacted with NPR1, and are involved in plant defense responses. Besides, bZIP members from the groups B, C, H, and S are also involved in different interaction networks. These results offer key clues to further investigate the function of SibZIPs based on experimentally validated protein-protein interactions.

Discussion
Sesame is an important oil crop due to its high oil, antioxidant, and protein content. Compared with other important crops, research on sesame has developed slowly, especially regarding the genetic improvement for tolerance to abiotic stresses such as waterlogging and drought [40]. Environmental stresses occurring at the vegetative and/or reproduction stage not only decrease the yield of sesame, but also affect the quality of sesame seed [42,56]. The bZIP gene family, one of the largest transcription factor families in plants, has been reported to be involved in various biological processes, including the regulation of plant growth, development, Characterization of bZIP family in sesame as well as responses to abiotic and biotic stresses [4]. Recently, the genome of sesame has been sequenced [57], and this platform provides an opportunity for comprehensive analysis of sesame bZIP transcription factors from whole genome view. In total, 63 bZIP genes were identified in sesame. The bZIP genes have been identified in several plant genomes, including Arabidopsis [4], rice [5], maize [11], soybean [12], tomato [15], grapevine [16], Brachypodium distachyon [17], and Brassica napus [18]. Compared with these plant species, sesame contains the fewest bZIP genes except grapevine. Gene family expansion in plants through whole genome duplication or tandem duplications has played a major role in the evolution of functional diversity [58]. We identified 21 pairs of segmental duplicated genes, representing 66.7% of the SibZIPs. However, no pair of the SibZIP genes was identified to be arranged in tandem. Thus, we conclude that segmental duplications have played predominantly roles in the expansion of the SibZIP gene family, which is in agreement with previous studies in rice and grapevine [5,16]. It has been reported that sesame has experienced a whole genome duplication event approximately 71 million years ago [57], which provides the opportunity for subsequent functional divergence of duplicated gene pairs. We found that some segmental duplicated SibZIP gene pairs, such as SibZIP04/SibZIP13 and SibZIP36/SibZIP38, have different expression patterns in various organs and/or abiotic stress responses, suggesting these gene pairs have undergone functional divergence during long-term evolution.
Phylogenetic analysis show that the sesame bZIPs could be separated into 9 groups, according to the classification of bZIPs in Arabidopsis [4]. Except that group E bZIPs is clustered into group I in sesame duo to highly similarity of zipper motif of these two groups [4]. The phylogenetic analysis was also supported by the conserved motif and gene structure analyses. Generally, bZIPs within the same group shared similar motif architecture and tend to perform the same biological functions. For example, AtHY5 from Arabidopsis and OsbZIP48 from rice, which belong to the group H bZIPs, play a central role in regulating photomorphogenesis in dicot and monocot system, respectively [59]. Since none of the bZIP transcription factors has been functional characterization in sesame, we performed the phylogeny-based functional prediction of SibZIPs based on the functional characterization of corresponding bZIP subgroups in Arabidopsis (S4 Table). Moreover, the potential interaction networks of SibZIPs were constructed, which may provide key evidence for functional predictions of SibZIPs. In Arabidopsis, ABA receptors (PYR/PYL/RCAR), protein phosphatases (PP2C), protein kinases (SnRK2) and transcriptional regulators (group-A bZIP, AREB/ABFs) were confirmed as crucial components of ABA signaling [27]. Based on the interaction networks of SibZIPs, the group-A bZIPs of sesame (SibZIP17, SibZIP29 and SibZIP55), which may be triggered by SiSnRK (SIN_1003687), probably function in ABA signaling as well as regulation of stress responses and seed development. SibZIP07, SibZIP21 and SibZIP49 were clustered with Arabidopsis TGAs in the group D. They are predicted to interact with SiNPR1 (SIN_1007630) and may be involved in plant defense responses by activating the expression of SA-responsive genes [60]. Two group-F SibZIPs, SibZIP46 and SibZIP54, showing high similarity to AtbZIP19 and Atb-ZIP23 in Arabidopsis, were predicted to regulate the adaptation to zinc deficiency [61].
The bZIP gene family is described as involved in growth and development processes of plants, including flower development and seed maturation [4,19]. For instance, AtbZIP29, which is specifically expressed in proliferative tissues, participates in leaf and root development by regulating the expression of genes involved in cell cycle and cell wall organization [62]. We analyzed the expression profiles of SibZIP genes in six different tissues of sesame, including root, stem, flower, leaf, capsule and seed. Although most of SibZIP genes were broadly expressed in all tested tissues, some SibZIP genes showed significant variation in their expression between different tissues, which is consistent with previous studies in rice, maize and grapevine [5,11,16]. Tissue-specific expressed genes including SibZIP26 and SibZIP55 in seed, Sib-ZIP14 and SibZIP49 in root, may play key roles in specific tissue/organ development. In Arabidopsis, ABI5 is necessary for seed development, germination, and seedling growth [4,63]. The gene SibZIP55, which is specially expressed in seed, is the homolog of the gene ABI5 in Arabidopsis, suggesting its role in seed development in sesame.
Increasing evidences demonstrated that bZIP TFs have an essential function in plant abiotic stress resistance. Group-A bZIPs have been extensively examined and are reported to play significant functions in various abiotic stresses by mediating ABA signaling in Arabidopsis [4,21]. A number of studies also have evaluated the function of group-A bZIP TFs in abiotic stress resistance of crop plants [22,30]. For example, overexpression of exogenous and endogenous AREB/ABF orthologs in cotton substantially increases drought tolerance through stomatal regulation [64]. In this study, the expression level of all SibZIP genes under drought and waterlogging stresses were analyzed based on transcriptome data. Moreover, group-A SibZIPs and other SibZIPs from different groups were further tested under cold, osmotic and salinity stresses by qPCR. The cis-elements and expression pattern analysis of SibZIP genes indicated that SibZIPs are widely involved in responses to abiotic stresses, which is consistent with the results observed in other plants such as soybean, grapevine, and rice [5,12,16]. In total, over 80% SibZIP genes showed significantly transcriptional changes (> 2-fold change) after abiotic stress treatments at least one time point. Most of the group-A bZIPs in sesame were responsive to at least one abiotic stress condition. Notably, the expression of SibZIP17, homolog of Arabidopsis ABF gene, was up-regulated by drought and salinity stresses. In addition, SibZIP44 containing one MBS and one ABRE, was induced by drought and waterlogging stresses. Group-D bZIPs, such as TGA1 and TGA4, not only involved in defense response to pathogens, but also identified as important regulatory factors of the nitrate response in Arabidopsis [65]. Moreover, overexpression of AtTGA4 improved drought resistance and reduced nitrogen starvation in Arabidopsis [66]. The expression of group-D genes SibZIP07 and SibZIP20 were up-regulated by drought stress, consistent with the presence of MBSs and TC-rich repeats in their promoters. These results suggested their possible roles in regulating drought and low nitrogen stresses in sesame. The group-S bZIPs are transcriptionally activated after stress treatment and regulate carbon and nitrogen metabolism [4,24,36]. The up-regulation of four group-S Sib-ZIPs (SibZIP04, 30, 38 and 62) in response to waterlogging stress suggested that they might play important roles in the metabolic reprogramming during waterlogging stress.

Conclusions
In conclusion, we identified 63 bZIP genes from sesame and investigated their phylogenetic classification, conserved protein motif, and gene structure. Transcriptomic analysis revealed the constitutively or tissue-specific expressed SibZIP genes. Expression profiles of SibZIP genes under various abiotic stress treatments and protein interactional network analyses indicated that they are involved in abiotic stress signaling. Meanwhile, some important candidates for improving sesame resistance to multiple stresses were identified. Together, these data provide useful information for further functional characterization of SibZIP genes and extending our knowledge of SibZIPs-mediated abiotic stress response in sesame.