Genome-wide analysis of transcription factors during somatic embryogenesis in banana (Musa spp.) cv. Grand Naine

Transcription factors BABY BOOM (BBM), WUSCHEL (WUS), BSD, LEAFY COTYLEDON (LEC), LEAFY COTYLEDON1-LIKE (L1L), VIVIPAROUS1 (VP1), CUP SHAPED COTYLEDONS (CUC), BOLITA (BOL), and AGAMOUS LIKE (AGL) play a crucial role in somatic embryogenesis. In this study, we identified eighteen genes belonging to these nine transcription factor families in the banana genome database. All genes were analyzed for their structural features and for their subcellular and chromosomal localization. Protein sequence analysis indicated the presence of characteristic conserved domains in these transcription factors. Phylogenetic analysis revealed a close evolutionary relationship among most transcription factors of various monocots. The expression patterns of the eighteen genes were analyzed in embryogenic callus containing somatic embryos (precisely isolated by Laser Capture Microdissection), non-embryogenic callus, and cell suspension cultures of banana cultivar Grand Naine. The application of 2,4-dichlorophenoxyacetic acid (2,4-D) in the callus induction medium enhanced the expression of MaBBM1, MaBBM2, MaWUS2, and MaVP1 in the embryogenic callus, suggesting that 2,4-D acts as an inducer of these genes. The higher expression of MaBBM2 and MaWUS2 in embryogenic cell suspension (ECS) than in non-embryogenic cell suspension (NECS) suggested that these genes may play a crucial role in banana somatic embryogenesis. MaVP1 showed higher expression in both ECS and NECS, whereas MaLEC2 expression was significantly higher in NECS, which suggests that MaLEC2 has a role in the development of non-embryogenic cells. We postulate that MaBBM2 and MaWUS2 can serve as promising molecular markers for embryogenicity in banana.


Introduction
Banana is an important staple food and fruit crop in several developing countries. Being a vegetatively propagated plant, its multiplication index through suckers is very low [1]. Somatic [32] and AGAMOUS-LIKE (AGL) [33] are highly specific to plant lineage, suggesting their importance in plant-specific processes. These TFs are restricted to the plant lineage and are characterized by the presence of conserved domains in their proteins. Understanding the role of these TFs in banana could therefore be of great interest and has the potential to improve SE. For instance, BBM expression patterns in cacao tissue have been reported as a biomarker for embryogenesis [19]. WUS plays a vital role in SE by promoting the vegetative-to-embryogenic transition in arabidopsis [21]. BSD is known to be associated with cell proliferation during SE [24][25][34]. LEC [27] and L1L [28] contain a HAP3 subunit and have been reported to function in embryo development, morphogenesis, and cellular differentiation. The role of AGL has been demonstrated in SE of arabidopsis and radish plants [35][36].
Several other TFs are known to be involved either in organogenesis alone or in both SE and organogenesis. For example, VP1 regulates seed dormancy and is involved in organogenesis and SE [30][37][38]. In contrast, CUC plays a role only in organogenesis [39][40] and is also known to promote adventitious root formation in calli [41]. Another TF, BOL, is known to be involved in the regulation of cell expansion and proliferation [32].
Efforts have been made to characterize the role of different TFs in SE in several plant species [15,20,24,28], but no report is available for banana. In the present study, 18 genes belonging to the nine TF families were identified in the banana genome. Phylogenetic analyses of these genes across land plants, including monocots, dicots, and gymnosperms, were conducted to study their molecular evolution. In addition, the expression patterns of these genes were characterized during the critical steps of in-vitro culture of cv. Grand Naine (AAA).

Identification, chromosome distribution and exon-intron prediction of TFs in banana
Nine TF families (BBM, WUS, BSD, LEC, L1L, VP1, CUC, BOL and AGL) were selected for the in-silico study. Homologs of these TFs were identified in arabidopsis, maize and rice (TAIR, http://www.arabidopsis.org; http://bioinformatics.psb.ugent.be/plaza; TIGR, http://rice.plantbiology.msu.edu) and used as query sequences in BLASTP searches to retrieve sequences from the banana genome (http://banana-genome.cirad.fr/). The putative protein sequences resulting from each BLAST search (E-value 10⁻⁵) in banana were collected and redundant sequences were removed. Gene models were refined and genomic coding sequences (CDS) were retrieved from the banana genome hub database [42,
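The hit-collection step described above (an E-value threshold followed by removal of redundant sequences) can be illustrated with a minimal Python sketch. It assumes that BLASTP results are available in the standard 12-column tabular format and that redundancy means the same banana protein being hit by more than one query; the text specifies neither, and the function name, query names, and hit identifiers below are invented for illustration.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return unique subject IDs with E-value <= cutoff, in first-seen order.

    Assumes BLAST tabular output (12 tab-separated columns), with the
    subject ID in column 2 and the E-value in column 11.
    """
    seen = set()
    unique_ids = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, evalue = row[1], float(row[10])
        if evalue <= cutoff and subject_id not in seen:
            seen.add(subject_id)
            unique_ids.append(subject_id)
    return unique_ids


# Hypothetical hits: all identifiers and scores are invented for illustration.
example = "\n".join([
    "AtBBM\tMa_hit_A\t62.1\t300\t100\t5\t1\t300\t1\t298\t1e-80\t250",
    "OsBBM\tMa_hit_A\t60.0\t290\t110\t4\t1\t290\t3\t292\t3e-75\t240",
    "AtBBM\tMa_hit_B\t30.2\t120\t80\t3\t5\t124\t9\t128\t2e-3\t40",
])
print(filter_hits(example))  # ['Ma_hit_A']
```

In this toy input, the second hit to Ma_hit_A is dropped as redundant and Ma_hit_B is rejected because its E-value exceeds the threshold.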

Phylogenetic analysis, motif identification and sequence analysis
Protein sequences of the selected banana TFs and their homologs from angiosperms (monocots and dicots) and gymnosperms were used for phylogenetic analyses. The phylogenetic tree was constructed using the neighbor-joining method with MEGA 6.0 software [45]. The unrooted tree was generated with 1000 bootstrap replicates to assess its reliability. Multiple sequence alignment was carried out using CLC Genomics Workbench (QIAGEN, Denmark). The amino acid sequences corresponding to the TF families in banana were analyzed for conserved domains using the Conserved Domain Database [46]. Theoretical isoelectric point (pI) and molecular weight (MW) were predicted using the Compute pI/MW tool on the ExPASy server [47,
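To make the idea of bootstrap support concrete, the following Python sketch resamples alignment columns with replacement 1000 times. It is only an illustration and not the MEGA procedure: the neighbor-joining step is replaced by a much simpler check of whether a focal sequence is closer (by proportion of differing sites) to one partner than to an outsider. The sequences and all names are invented.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Build one pseudo-alignment by sampling columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def grouping_support(alignment, focal, partner, outsider, n_reps=1000, seed=0):
    """Fraction of replicates in which focal is closer to partner than to outsider."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_reps):
        rep = bootstrap_replicate(alignment, rng)
        if p_distance(rep[focal], rep[partner]) < p_distance(rep[focal], rep[outsider]):
            wins += 1
    return wins / n_reps


# Toy, invented sequences (not real banana, monocot, or gymnosperm proteins).
toy = {
    "banana_like": "MKTAYIAKQR",
    "monocot_like": "MKTAYIAKQK",
    "gymnosperm_like": "MRSGWLVEQD",
}
support = grouping_support(toy, "banana_like", "monocot_like", "gymnosperm_like")
print(f"support = {support:.3f}")
```

A support value near 1 means that the grouping is recovered in almost every resampled alignment, which is the sense in which bootstrap values indicate the reliability of a clade.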

Callus development
A total of seventy immature male flower buds (the source of explants) of cv. Grand Naine were collected from the experimental field of National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab, India (310 m above sea level; Latitude 30˚47' North; Longitude 76˚41' East). Immature male flowers (explants) of ranks 1-15 adjacent to the floral apex were isolated and inoculated on callus induction medium for callus formation [7,50]. Six hundred explants derived from forty flower buds were inoculated on medium containing 2,4-dichlorophenoxyacetic acid (2,4-D), while four hundred fifty explants prepared from thirty buds were cultured on 2,4-D-free MS basal medium. All cultures were incubated in the dark at 27˚C in plant tissue culture chambers (Percival, USA) for the emergence of somatic embryos. After 12 weeks (W) of incubation, embryogenic calli (with proembryos) and non-embryogenic calli were identified under an upright microscope (Leica Microsystems, Germany), collected, frozen immediately in liquid nitrogen and stored at −80˚C until further use.

ECS establishment
Nearly six-month-old (24W) embryogenic calli containing somatic embryos and non-embryogenic calli were inoculated in suspension medium containing 2,4-D and zeatin [7]. Suspension cultures were kept in the dark at 27˚C with agitation at 90 rotations per minute (Kuhner, Switzerland) and sub-cultured weekly. ECS samples were collected at 0W, 1W, and 24W, while NECS samples were collected at 0W and 1W for further study.

Histological analysis
Samples collected from callus induction medium with and without 2,4-D were used for histological analysis. Samples were preserved in 70% ethanol at 4˚C for one day [51] and then fixed in blocks using freezing solution (Leica Biosystems, Germany). Sections 10-12 μm thick were prepared using a cryomicrotome (Leica Biosystems, Germany) and observed under the microscope (Leica Microsystems, Germany). For suspension culture analysis, cells were placed on a slide along with suspension medium and covered with a cover slip. The embryogenic and non-embryogenic cells were stained with iodine to confirm the presence of starch granules. These cells were observed directly under the upright microscope (Leica Microsystems, Germany).

Laser capture microdissection (LCM) of embryogenic cells
RNase-free conditions (tools, solutions, handling) were maintained throughout the experiment, which was performed as per a previously described protocol [52]. In brief, banana callus (24W) containing embryos was fixed at -23˚C under vacuum using tissue freezing medium (Leica Biosystems, Germany). Tissue blocks were mounted on the holding clamp of the cryomicrotome (Leica Biosystems, Germany) and 10-12 μm thin sections were prepared. These sections were placed on slides, air-dried at room temperature and observed under the LCM microscope (Zeiss, Germany).
For microdissection, embryos were identified and marked using the PALM (Zeiss, Germany) tool. The marked tissues were cut out with a laser beam along the marking, collected in RNase-free tubes and stored at -80˚C for RNA isolation.

RNA isolation and cDNA synthesis
Total RNA was isolated from the different samples using an RNA extraction kit (Sigma-Aldrich, USA). Isolated RNA was treated with a DNase I kit (Ambion, Thermo Scientific, USA) to eliminate DNA contamination. Total RNA was analyzed by agarose gel electrophoresis for size and integrity, and quantified with a NanoQuant (Infinite 200 PRO NanoQuant, Austria). DNA-free RNA was used for first-strand cDNA synthesis with the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, USA) as per the manufacturer's protocol. Oligo dT primer was used for cDNA preparation.

Quantitative real-time PCR (qPCR)
The qPCR study was performed using an ABI 7700 Sequence Detector (Applied Biosystems, USA). The housekeeping gene Actin1 (GenBank Accession No. AF246288) was used to normalize the expression of the selected genes [53][54]. The primers were tested for single-band amplification using conventional end-point PCR, and a melting curve analysis was carried out using qPCR. The total volume of each reaction was 10 μl and consisted of 1X SYBR Green Master mix (Applied Biosystems, USA), 5 pmol of each primer, 0.5 μl cDNA template and sterile distilled H₂O. The PCR conditions were: step (1) 50˚C for 2 min, step (2) 95˚C for 10 min, step (3) (95˚C for 0.15 min, 60˚C for 1 min) x 40 cycles, followed by the thermal dissociation curve. The relative expression level was analyzed using the $2^{-\Delta\Delta Ct}$ method [55], where $\Delta\Delta Ct = (Ct_{\text{target}} - Ct_{\text{actin}})_{\text{time } x} - (Ct_{\text{target}} - Ct_{\text{actin}})_{\text{time } 0}$. Primer details are given in S1 Table. All experiments were performed in biological triplicate, and each experiment consisted of three technical replicates. A t-test was carried out to assess the statistical significance of the data.
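The relative quantification described above can be written out as a short Python sketch. It assumes that technical replicates are averaged before normalization, which the text does not state, and it relies on the usual assumption behind this method that amplification efficiency is close to 100% for both target and reference genes. The Ct values and function names are invented for illustration.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Delta Ct: mean target Ct minus mean reference-gene (actin) Ct."""
    return mean(ct_target) - mean(ct_reference)


def fold_change(sample, calibrator):
    """Relative expression 2^(-ddCt) of a sample against a calibrator.

    Each argument is a (target_cts, reference_cts) pair of replicate lists.
    """
    ddct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2.0 ** (-ddct)


# Invented Ct values for three technical replicates (not data from this study).
time_x = ([22.0, 22.25, 21.75], [18.0, 18.25, 17.75])  # delta Ct = 4.0
time_0 = ([25.0, 25.25, 24.75], [18.0, 17.75, 18.25])  # delta Ct = 7.0

print(fold_change(time_x, time_0))  # ddCt = -3.0, so 2**3 = 8.0
```

Here the target gene reaches threshold three cycles earlier at time x than at time 0 after normalization to actin, which corresponds to an 8-fold increase in relative expression.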

Identification and sequence analysis of SE related TFs in Musa acuminata
A BLAST search was carried out to identify BBM, WUS, BSD, LEC, L1L, VP1, CUC, BOL, and AGL protein sequences in banana using query sequences from arabidopsis (27), rice (8), and maize (13). A total of 18 sequences belonging to nine TF families were retrieved from the banana genome. All deduced protein sequences contain the conserved domains of their respective TF families (S1, S2, S3, S4, S5, S6, S7 and S8 Figs). Protein sequence identity between banana TFs and their homologs in arabidopsis, maize, and rice ranges from 24% to 83%, 25% to 74%, and 45% to 74%, respectively (S2 Table). Various features of the banana sequences, such as locus IDs, chromosomal coordinates, predicted protein MW, pI, subcellular localization and TMH, are summarized in Table 1. Protein length and MW of the TFs ranged from 63 to 678 amino acids and 7.20 to 75.04 kDa, respectively. Most of the TFs (11) showed a pI in the acidic range (4.3-6.83), while others were in the alkaline range (8.6-10.14); only MaLEC1 had a neutral pI. MaBBM1, MaBBM2, and MaL1L1 were localized to the chloroplast, whereas MaBSD3 and MaAGL1 were localized to mitochondria. Subcellular localization of the other TFs was not available. No transmembrane domain was predicted in any TF.

Chromosome localization and exon-intron composition
All TFs could be mapped on banana chromosomes (Fig 2) except one (MaVP1), which was associated with the un_random chromosome. All the mapped TFs were distributed over MaAGL1, MaAGL2, and MaWUS1 were intron-free genes, while the other TFs contained 1 to 11 introns. MaBBM1 contained the highest number of introns (11), while its homolog in arabidopsis has only 8 introns. Except for BOL and CUC, no other TFs were found to have homologs in rice (S3 Table).

Phylogenetic study and conserved motif analysis
TF sequences from gymnosperms, monocots and dicots were included in the phylogenetic tree construction. As expected, all the banana TFs clustered with the monocot clade except MaBSD2, MaCUC2, and MaCUC3: MaBSD2 grouped with gymnosperms, whereas MaCUC2 and MaCUC3 clustered with dicots (Fig 3). The amino acid sequences corresponding to the 9 TF families in banana were analyzed for conserved domains (S1, S2, S3, S4, S5, S6, S7 and S8 Figs). MaBBM was found to have DNA binding sites within two conserved AP2 domains, which are required for transcriptional regulation of developmental processes, whereas MaWUS had a DNA binding site along with specific DNA base contact sites in the conserved homeobox domain, which is known to play a key role in plant development. The MaBSD protein, which contains the signature BSD domain, is a synapse-associated protein and has not been much explored. LEC/L1L contained a conserved HAP3 domain, which plays an important role in signal transduction and light harvesting. MaVP1 contained one acidic amino-terminal region (A1) and three basic regions (B1, B2, B3), which have a role in seed development and auxin transport. MaBOL possessed a BolA region that is involved in callus induction. MaCUC, a member of the NAC gene family, is known to be related to organogenesis. The presence of these domains in the selected TFs confirmed that they belong to their respective families.

Development of somatic embryogenesis
A total of eighteen explants responded on the 2,4-D containing medium, giving a 3% frequency of embryogenic callus formation. On the other side, none of the explants responded for   (Fig 4A). The embryogenic region beneath the somatic embryos contained numerous friable embryogenic cells and was suitable for initiation of ECS. The non-embryogenic callus was yellow, compact and hard, and lacked embryo-like structures on the surface (Fig 4D). Embryogenic as well as non-embryogenic cells were inoculated in the suspension medium. Suspension cultures of independent embryogenic and non-embryogenic lines were kept separately on the shaker. Embryogenic cell aggregates multiplied and formed many lobed structures, from the peripheries of which new aggregates were released. Non-embryogenic cells did not divide and eventually died after 1W. The responsive ECS were sub-cultured weekly and maintained for more than 20 months with a high efficiency of embryogenic response.
Embryogenic cells had dense cytoplasm with small vacuoles and abundant starch granules at their borders (Fig 4B). These cells were mostly isodiametric and spherical in shape. The non-embryogenic cells were highly vacuolated, contained fewer starch granules and were abnormal in shape and size (Fig 4E). These observations are in agreement with previous studies [2,7,56]. The darkly stained cell aggregates also confirmed the presence of dense cytoplasm with abundant starch granules in ECS (Fig 4C). Similar observations were reported in other studies [2,57]. These embryogenic cells were spherical in shape and multiplied at a higher rate. The cells in NECS were irregular in shape, highly vacuolated and contained hardly any starch granules (Fig 4F). Apart from this, the potent embryogenic cells were excised using LCM from a mature embryo and used as 0W ECS for the expression study (Fig 5).

Expression study of TFs in response to 2,4-D treatment in callus
Banana male flowers were inoculated on callus induction medium with or without 2,4-D. After 12W, the callus was harvested and used for the gene expression study (Fig 6). The explant (male flower) was used as a control. The expression of MaBBM1, MaBBM2, and MaVP1 was upregulated on 2,4-D-free medium with respect to the explant. Significantly higher expression of MaWUS2, along with MaBBM1, MaBBM2, and MaVP1, was observed on 2,4-D-supplemented medium. The expression of MaBBM1, MaBBM2, and MaVP1 increased 4.6-, 14.7-, and 66.9-fold, respectively, on 2,4-D-supplemented medium compared with 2,4-D-free medium (Fig 6).

Gene expression analysis of TFs at different developmental stages of ECS and NECS
Differential expression of all 18 genes was observed at various developmental stages of ECS (0W, 1W, and 24W) and NECS (0W and 1W) cultures (Fig 7). The male flower (explant) was used as a control for the expression study. MaBBM1, MaBBM2, MaWUS1, MaLEC2, and MaVP1 were expressed differentially in 0W and 1W NECS. Higher expression of MaBBM (MaBBM1 and MaBBM2) was noticed in both ECS and NECS; however, MaBBM2 showed a higher transcript level than MaBBM1 in both ECS and NECS (Fig 7). The expression of MaWUS2 increased gradually across the stages of ECS, but it was not detected in NECS. In contrast, MaWUS1 was highly expressed at different stages of NECS, but in ECS its expression was detected only at 24W. Of the three paralogs of MaBSD, MaBSD2 was significantly upregulated at the early (0W) and late (24W) stages of ECS, while MaBSD3 showed higher expression at 1W ECS. In NECS, MaBSD2 and MaBSD3 were expressed only at 0W and 1W, respectively.
Higher expression of MaLEC2 was noticed in all stages of ECS and in 0W NECS. On the other hand, MaLEC1 showed higher expression at different stages of ECS but not in NECS. MaLEC2 was highly expressed at 0W NECS. MaVP1 was significantly upregulated at different developmental stages of ECS and NECS. MaL1L1 and MaCUC1 did not show any

Discussion
SE is widely utilized for micropropagation and genetic transformation, but the basic molecular mechanism behind it is not well understood [58][59]. Recently, a proteomic approach was used to identify differentially expressed proteins during SE in banana [2]. Studying the expression of SE-related TFs could be a significant approach for understanding their role in in-vitro developmental biology. In banana, the potency of explants to develop embryogenic callus is very low [59]. Moreover, prolonged culturing on callus induction medium could lead to somaclonal variation [11]. Therefore, it is important to find molecular regulators that can be exploited to enhance the SE potential in banana. In this study, the 18 genes of 9 TF families were characterized in silico and their expression was analyzed to study their role in SE of banana. Homologs of most of these genes in other plant species have already been reported for their role in SE [30-31, 15, 38, 60, 29, 25, 61]. The presence of multiple homologs of TFs in banana may be the result of gene duplication events during evolution, which may have significance for functional divergence [53].
The differential expression patterns of the 18 genes in embryogenic and non-embryogenic tissues and in cell suspension cultures were determined with respect to the explant of cv. Grand Naine. MaBBM2 was highly expressed in all stages of ECS, and its expression in ECS increased 22.95-fold at 24W as compared to 0W. However, lower expression of MaBBM2 was noticed in NECS than in ECS. The role of BBM in the conversion of vegetative tissue to embryogenic culture has been reported in Brassica napus [15]. In arabidopsis, overexpression of GmBBM1 resulted in the emergence of somatic embryos in transgenic lines [62]. Similarly, our expression study suggests that MaBBM2 may play an important role in the conversion of explant to embryogenic callus and ECS.
WUS contains a homeodomain that is involved in the regulation of developmental processes [21,63]. The role of WUS in promoting the vegetative-to-embryogenic transition and in stem cell maintenance has been reported in arabidopsis [21]. Of the two WUS homologs under study, MaWUS2 was found to be highly expressed in the late stage of ECS, which suggests a potential role in ECS maintenance. Recently, BBM and WUS have also been reported to improve transformation efficiency in monocots, i.e., sorghum, sugarcane and rice [64].
BSD domain-containing genes are known to encode basal TFs, which have been reported for their association with cell proliferation during SE [34]. Here, we identified three homologs of BSD (MaBSD1, MaBSD2, and MaBSD3) in banana. The MaBSD homologs fell into two distinct clades in the phylogenetic tree analysis: MaBSD2 clustered with gymnosperms, while MaBSD1 and MaBSD3 were grouped in a clade with monocots. This indicates their divergence during evolution. MaBSD2 was differentially expressed in ECS, whereas MaBSD1 did not show any significant change in expression level. These results suggest that gene duplication during evolution may hamper the functionality of MaBSD homologs.
LEC plays a role in controlling embryogenesis [65]. The present study showed high accumulation of MaLEC2 transcript in 0W NECS, which suggests that MaLEC2 could lead to cell necrosis via non-embryogenic callus in banana. However, in carrot, LEC1 was expressed in embryogenic cells rather than in non-embryogenic cells [66]. L1L closely resembles LEC structurally and is known to be a regulator of embryo development. In our study, the homologs of MaL1L did not show any significant change in expression.
VP1 is an auxin-inducible gene that encodes a TF involved in ABA signaling [67][68]. We noticed that MaVP1 was highly expressed in callus, ECS, and NECS. Expression of VP1 in ECS of arabidopsis has been correlated with its role in embryo development [38]. In Secale cereale, VP1 has been reported to have a negative effect on the development of embryogenic callus [69].
CUC is known to induce adventitious shoots in arabidopsis [70][71][72]. It is utilized as a predictive marker for root and shoot organogenesis [39,73]. Hence, we selected the CUC genes from banana as a negative control for the expression study. As expected, CUC homologs were either not expressed or showed only a basal expression level at the different developmental stages of SE in banana. The expression of MaAGL and MaBOL was not detected. Previous reports showed AGL15 accumulation in tissues derived from double fertilization and its participation in the early stages of zygotic embryo development [33]. The significantly higher expression of MaLEC2 at 0W NECS suggests a role for it in non-embryogenicity in banana.
Based on the differential expression patterns, we anticipate that MaBBM2 and MaWUS2 are promising candidate markers for embryogenicity in banana. In addition, the functional role of MaLEC2 in the development of non-embryogenic callus needs to be confirmed. Therefore, future studies will be directed at functionally assessing the role of MaBBM2, MaWUS2, and MaLEC2 for a better understanding of SE in banana.