Genome-wide analysis of the CCCH zinc finger family identifies tissue specific and stress responsive candidates in chickpea (Cicer arietinum L.)

CCCH zinc finger proteins are characterised by a typical motif consisting of three cysteine residues and one histidine residue. These proteins have been reported to play important roles in the regulation of plant growth, developmental processes and environmental responses. In the present study, a genome-wide analysis of the CCCH zinc finger gene family was carried out in the available chickpea genome. Various bioinformatics tools were employed to predict 58 CCCH zinc finger genes in chickpea (designated CarC3H1-58), which were analysed for their physicochemical properties. Phylogenetic analysis classified the proteins into 12 groups, in which members of a particular group had similar structural organization. Further, the numbers as well as the types of CCCH motifs present in the CarC3H proteins were compared with those from Arabidopsis and Medicago truncatula. Synteny analysis revealed valuable information regarding the evolution of this gene family. Tandem and segmental duplication events were identified and their Ka/Ks values revealed that the CarC3H gene family in chickpea had undergone purifying selection. Digital as well as real time qRT-PCR expression analysis was performed, which helped in the identification of several CarC3H members that were expressed preferentially in specific chickpea tissues as well as during abiotic stresses (desiccation, cold, salinity). Moreover, molecular characterization of an important member, CarC3H45, was carried out. This study provides comprehensive genomic information about the important CCCH zinc finger gene family in chickpea. The identified tissue specific and abiotic stress specific CCCH genes could be potential candidates for further characterization to delineate their functional roles in development and stress.


Introduction
The Zinc finger (Znf) family is one of the largest transcription factor families in eukaryotes [1][2][3][4] and is known to regulate genes at the transcriptional or posttranscriptional level [1,5]. type and similar) downloaded from Pfam database (http://pfam.xfam.org/family/PF00642). The presence of CCCH Znf motifs in these predicted proteins was confirmed by performing a domain search on SMART and Pfam. The sequences thus obtained were aligned and manually checked for redundancies. Consequently, a total of 58 non-redundant, full length CCCH Znf transcription factor genes were obtained (named CarC3H1-58) and used for all further analysis (S1 Table). The characteristics of these genes, including the number of amino acids in each encoded protein, their isoelectric point (pI), molecular weight and number of CCCH Znf motifs, are listed in Table 1. It was observed that the CarC3H genes encoded proteins of variable lengths, the longest being CarC3H33 (1915 amino acids) while the smallest encoded 127 amino acids (CarC3H26). The isoelectric points of these proteins ranged from 4.26 (CarC3H8) to 9.67 (CarC3H10). SMART and Pfam databases were used to calculate the total number of CCCH Znf motifs in the CarC3H proteins and a total of 138 CCCH Znf motifs were identified in this study. The members of chickpea CCCH Znf gene family were found to have 1 to 6 C3H type Znf domains. The majority of the members had either one (20 members) or two (15 members) C3H domains while 11 members contained three C3H domains and 12 members contained more than three C3H Znf domains (Table 1). Comparison with the CCCH Znf genes reported from Arabidopsis and M. truncatula showed that the number of CCCH Znf genes in chickpea was greater than that reported in M. truncatula but lower than in Arabidopsis (Fig 1A). The numbers of CCCH motifs in the three plants were seen to vary accordingly, with Arabidopsis containing the highest number of CCCH motifs, followed by chickpea and M. truncatula (Fig 1A). The MEME program was used to identify all the motifs present in the CarC3H protein sequences. This led to prediction of a total of 10 different motifs including Znf-CCCH (Table 2). Similar to results observed in Arabidopsis and M. truncatula, the most common types of CCCH motifs observed were the C-X8-C-X5-C-X3-H and C-X7-C-X5-C-X3-H types (Fig 1B). A few members (CarC3H7 and CarC3H16) were also seen to contain a CCCH motif with the constitution C-X10-C-X5-C-X3-H. In addition, some members were found to contain unconventional CCCH motifs. For example, CarC3H20 and CarC3H25 contained a CCCH motif with the constitution C-X17-C-X5-C-X3-H while CarC3H30 contained a motif with C-X17-C-X6-C-X3-H as its consensus (S2 Table).
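The spacing notation used for these motifs maps directly onto regular expressions, where X stands for any residue. The following is a minimal sketch of such a scan, not the pipeline used in the study (which relied on SMART, Pfam and MEME). The function name, the pattern table and the toy sequence are illustrative assumptions, and regex hits should be treated only as candidates that still need confirmation by a domain search.

```python
import re

# Spacing variants named in the text; '.' stands for any residue (X).
# Collecting them in one table is an illustrative choice.
CCCH_PATTERNS = {
    "C-X7-C-X5-C-X3-H": r"C.{7}C.{5}C.{3}H",
    "C-X8-C-X5-C-X3-H": r"C.{8}C.{5}C.{3}H",
    "C-X10-C-X5-C-X3-H": r"C.{10}C.{5}C.{3}H",
    "C-X17-C-X5-C-X3-H": r"C.{17}C.{5}C.{3}H",
    "C-X17-C-X6-C-X3-H": r"C.{17}C.{6}C.{3}H",
}


def find_ccch_motifs(protein_seq):
    """Return (pattern name, 0-based start, matched text) for every candidate hit."""
    seq = protein_seq.upper()
    hits = []
    for name, pattern in CCCH_PATTERNS.items():
        # The zero-width lookahead lets overlapping candidates be reported too.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence, made up for illustration (not a CarC3H protein).
toy = "MA" + "C" + "A" * 8 + "C" + "A" * 5 + "C" + "A" * 3 + "H" + "GG"
for name, start, text in find_ccch_motifs(toy):
    print(name, start, text)
```

Counting hits per pattern name across all 58 proteins would give a tally comparable in spirit to Fig 1B, although a pure pattern match can over-call motifs that a profile-based search would reject.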

Phylogenetic classification and structural organisation of CarC3H gene family
The amino acid sequences of the CarC3H genes were aligned and used to generate the phylogenetic tree (Fig 2) that revealed that the chickpea C3H Znf genes were divided into twelve groups designated I-XII (Fig 2). Most of the clusters were supported by high bootstrap values in the phylogenetic analysis which validated the prediction and alignment of the CarC3H genes. To further substantiate the phylogenetic data, the CarC3H genes were aligned to the kabuli chickpea genome [23] and their structural organisation based on the exon-intron arrangement was deduced. It was observed that most of the CCCH genes in a particular group had similar genomic organisation. For example, all the members of group V, VI and VII were intron-less while members of the adjacent group VIII had four introns and five exons each. Members of group IX and XI exhibited similarity in the numbers of introns and exons present in their genes. Despite a number of deviations from this trend, where the numbers of exons and introns amongst members of a group were different, the pattern of genomic organisation was remarkably well conserved.
The phylogenetic classification was further reinforced through prediction of functional domains in the proteins of respective CarC3H genes (Fig 2). The CCCH Znf proteins are known to regulate gene expression through a number of methods including post transcriptional modification of target pre-mRNAs [24,25], transcriptional activation or repression of target genes [1,5] and interaction with different proteins [4]. This regulation is facilitated by the presence of various functional domains in addition to the CCCH motifs. The functional motifs found in members of CarC3H family included motifs like K homolog domains (KH type) and RRM motifs that are involved in RNA processing and Ankyrin, WD repeats and Zf-RING motifs that are involved in protein-protein interactions. In addition, some members were seen to contain domains like HTH-OST, SAM-MT-RNA-M5U and SAP. Schematic  representation of various domains in the CarC3H protein sequences showed that members of a group displayed remarkable similarity in the type of domains present (Fig 2). For example, all members of group III were seen to contain KH type domain while members of group IX and XI were found to possess WD repeats and Zf-RING domains respectively. Phylogenetic analysis of members of CCCH Znf gene families of chickpea along with members from M. truncatula and Arabidopsis was also carried out (Fig 3). It was observed that most members of chickpea C3H gene family were phylogenetically very close to their homologs from Medicago and Arabidopsis. For example, some pairs of CarC3H and AtC3H genes like CarC3H27/AtC3H32, CarC3H1/AtC3H37, CarC3H30/AtC3H21, CarC3H7/AtC3H5 and CarC3H39/AtC3H53 were very closely related. Similarly, a number of CarC3H members showed very close homology with their homologs from M. truncatula. The amino acid sequences of these gene pairs were analysed and it was observed that they shared similar domains in their protein sequences. 
For example, SAP domain was present in the amino acid sequences of both CarC3H54 and MtC3H32 whereas both CarC3H22 and MtC3H29 contained a KH type domain thereby clearly validating the phylogenetic arrangement of the CarC3H members in the tree.

Synteny analysis
Synteny may be defined as the conserved order of genes on chromosomes of related species as a result of descent from a common ancestor. Comparative mapping is a valuable technique to identify similarities and differences between species [26]. Therefore, the CarC3H sequences were mapped onto the genomes of M. truncatula, G. max, P. vulgaris and Arabidopsis. This comparative analysis revealed that 52 members of the CarC3H gene family found homologs in the M. truncatula genome, 48 members found homologs in the G. max genome, 43 CarC3H genes found homologs in the P. vulgaris genome and only 8 CarC3H genes could be mapped onto the A. thaliana genome (Fig 4). These results reaffirmed the common ancestry shared by chickpea, M. truncatula and G. max. Similar inferences have been derived from a number of studies which report that both chickpea and M. truncatula belong to the Galegoid group of legumes and hence share a closer phylogenetic relationship [27].

Chromosomal location and duplication of CarC3H genes
The CarC3H genes were mapped onto the kabuli chickpea genome [23] in order to assign their chromosomal positions (Fig 5). Fifty-two CarC3H genes were located on the 8 chickpea chromosomes whereas 6 members (CarC3H1-CarC3H6) mapped onto the unanchored scaffolds of the kabuli chickpea genome. However, keeping in mind their size and their similarity to known CCCH genes, these have been included in all analyses. The maximum number (9) of CarC3H genes was located on Chromosome 4 while Chromosome 8 contained the fewest (2). Chromosomes 6 and 7 contained 8 CarC3H genes each while Chromosomes 1 and 5 accounted for 7 CarC3H genes each. Chromosomes 2 and 3 contained 6 and 5 CarC3H genes respectively (Fig 5).
Determination of duplication events revealed 10 pairs of paralogous CarC3H genes distributed on various chromosomes (Fig 5). One of these events was found to be a tandem gene duplication (Fig 5). The other nine duplication events involved gene pairs with segmental duplications, which occur at more than one site within the genome and typically share a high level of sequence identity [28]. According to Wagner [29], accumulation of advantageous mutations usually leads to divergence of species and hence is known as positive or diversifying selection, while removal of deleterious mutations leads to survival of a species and hence is called negative or purifying selection. The diversifying or purifying nature of selection can be determined by comparing the synonymous (Ks) and non-synonymous (Ka) rates of base substitution [30,31]. Hence, this method was adopted to generate the Ka/Ks values for the 10 pairs of paralogous genes (Table 3). The results indicated that the CarC3H genes were under purifying selection since the Ka/Ks values for all gene pairs were less than 1.
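The interpretation of a Ka/Ks value follows a simple rule: a ratio below 1 points to purifying selection, a ratio above 1 to positive (diversifying) selection, and a ratio of 1 to neutral evolution. The sketch below only illustrates that rule. The function name, the tolerance used to call a ratio "equal to 1" and the example numbers are assumptions for illustration; they are not values from Table 3, and estimating Ka and Ks themselves requires a codon-based method that is not shown here.

```python
def classify_selection(ka, ks, tol=1e-9):
    """Return (Ka/Ks ratio, selection label) for one pair of duplicated genes.

    ka: non-synonymous substitution rate, ks: synonymous substitution rate.
    tol is an illustrative tolerance for treating the ratio as exactly 1.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label


# Hypothetical Ka and Ks values for a single paralogous pair.
ratio, label = classify_selection(ka=0.12, ks=0.60)
print(f"Ka/Ks = {ratio:.2f} -> {label} selection")
```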

Tissue specific expression of CarC3H genes
The availability of deep transcriptomes of different chickpea tissues (leaf, root, flower bud, young pod) [32], as well as the transcriptome of chickpea seeds at various stages of development [33], allowed analysis of the tissue specific expression of the CarC3H genes. The raw 454 sequence reads from 8 different chickpea tissues (leaf, root, flower bud, young pod, 10DAA seed, 20DAA seed, 30DAA seed and 40DAA seed) were mapped onto the CarC3H genes and RPKM values were calculated based on the number of raw reads mapped for each tissue. The RPKM values obtained were log2 normalized and used to generate a heat map depicting the digital expression pattern of the CarC3H genes across different chickpea tissues. The samples were clustered according to their corresponding expression patterns using hierarchical clustering (Fig 6A). The analysis showed that CarC3H28, 52, 16, 44 and 55 were highly expressed in the flower bud. CarC3H46 was seen to have especially high expression in root tissue, while CarC3H21, 32, 36, 33 and 58 were observed to have comparatively higher expression in root. A number of CarC3H genes were also found to be highly expressed in chickpea seed tissue. For example, CarC3H47 and CarC3H51 were expressed at higher levels in early stages of seed development (10 and 20 DAA) while CarC3H11 and CarC3H45 had greater expression levels in later stages (30 and 40 DAA). In addition, the promoter sequences of some genes contained motifs which correlate with their tissue specific expression. For example, the motif "CARGCW8GAT", which is a binding site for AGL15, was found in the promoter sequences of CarC3H16, 44 and 55 and may contribute to their expression in flower bud and seed. Genes with higher expression in seed tissue contained promoter motifs like "RYREPEAT", "SEF1" and "SEF4", which are known to impart seed specific expression of genes (S3 Table).
On the other hand, a large number of genes contained promoter elements that did not correlate with their tissue specific expressions. Quantitative real time PCR was used to further analyse and validate the expression of CarC3H genes in various tissues of chickpea, including germinating seedling at 24hr, 48hr and 72hr of germination. According to this analysis, CarC3H26, CarC3H11, CarC3H45 and CarC3H51 had higher expression in seed tissue while CarC3H10 and CarC3H58 had higher expression in germinating chickpea seedlings (Fig 6B).
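The digital expression values described above rest on the standard RPKM definition (reads per kilobase of transcript per million mapped reads), followed by a log2 transformation. A minimal sketch is given below. The function names and example numbers are hypothetical, and the pseudocount added before taking the logarithm is an assumption made here so that genes with zero reads stay defined; the text does not state how zero values were handled.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(value, pseudocount=1.0):
    """log2 transform; the pseudocount is an assumption, not from the text."""
    return math.log2(value + pseudocount)


# Hypothetical gene: 500 reads, 1000 bp long, in a library of 2 million mapped reads.
value = rpkm(read_count=500, gene_length_bp=1000, total_mapped_reads=2_000_000)
print(value, log2_rpkm(value))
```

Applying the two functions to every gene in every tissue yields the gene-by-tissue matrix that a heat map and hierarchical clustering would take as input.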

Expression of CarC3H genes under abiotic stress conditions
To study the effect of various abiotic stresses on the expression of CarC3H genes, the raw reads generated by sequencing of chickpea shoot and root tissue subjected to different abiotic stresses (desiccation, salinity and cold) were downloaded from the NCBI SRA database (SRX402843, SRX402845, SRX402844, SRX402846, SRX402842, SRX402841, SRX402840, SRX402839) and mapped onto the CarC3H gene sequences. The RPKM values were calculated based on the number of raw reads that mapped and log2 normalised to generate the heat map (Fig 7A). It was observed that most of the CarC3H genes had high expression in both control and stressed tissues. However, CarC3H11, 16, 45, 58, 55 and 5 were found to have higher expression in chickpea shoots subjected to desiccation and salinity stress (Fig 7A). In the case of roots, CarC3H25, 55, 4 and 5 were found to have higher expression under salinity stress. CarC3H19, 45 and 11 had high expression in shoot during desiccation, salinity and cold stresses while being expressed at very low levels in root. It was also observed that cold stress had no significant effect on the expression of the CarC3H genes.
To further analyse and validate their expression under abiotic stress, some of the CarC3H genes were chosen for quantitative real time PCR. Three-week old chickpea seedlings (cv. ICCV2) were subjected to dehydration, salinity, cold and desiccation stresses and samples were collected at various time points. Quantitative real time PCR analysis showed that dehydration and desiccation stresses had a more pronounced effect on expression of CarC3H genes than salinity and cold (Fig 7B). These results indicated the putative involvement of the members of groups III and V of the CarC3H gene family, which included CarC3H29, 22, 31 (group III) and CarC3H11, 41, 45 and 58 (group V). The genes in these groups showed higher expression under dehydration and desiccation stresses (Fig 7B). The promoters of these genes were found to contain motifs like "ABRELATERD1" and "ACGTATERD1", which may drive their expression in response to dehydration (S3 Table).

Molecular characterization of the CarC3H45 gene
Amplification and copy number determination for CarC3H45 gene. Amongst the CarC3H members differentially expressed across various tissues and stress conditions, an interesting member, CarC3H45 (a group V member), showed significant expression in seed tissue as well as during dehydration and desiccation stress. It was therefore selected for further characterisation. Sequencing of the coding region revealed that the ORF of the CarC3H45 gene was 996 bp long and coded for a protein of 331 amino acids. The ORF was mapped onto the whole genome sequence of chickpea to determine its genomic organisation. The CarC3H45 gene was found to be intronless. Moreover, CarC3H45 was found to contain two CCCH zinc finger motifs (Table 1), similar to the reported tandem zinc finger proteins in Arabidopsis (AtTZF1, AtTZF4, AtTZF5 and AtTZF6), which have been studied for their role in seed development [34,35,36]. Further, the tandem CCCH Znf motif was conserved across the four Arabidopsis proteins as well as CarC3H45, with the consensus being C-x7-8-C-x5-C-x3-H-x16-C-x5-C-x4-C-x3-H (Fig 8A). In addition, a conserved arginine rich region was present upstream of the CCCH Znf motif.
In order to determine the copy number of the CarC3H45 gene in C. arietinum cv. ICCV2, Southern blotting was carried out. A part of the CarC3H45 gene excluding the conserved tandem CCCH coding region was used as a probe for hybridisation. A prominent single band was observed after hybridisation in lanes where the genomic DNA had been digested with BamHI and SacI (neither of which has a restriction site in the gene), whereas genomic DNA digested with EcoRI (which has a restriction site within the gene) produced two bands (Fig 8B). This clearly indicated that CarC3H45 was present as a single copy in the chickpea genome.
Quantitative expression analysis. Expression patterns of CarC3H45 were analysed across tissues, various stages of developing seeds and under ABA and GA treatment. The qRT-PCR analysis showed that CarC3H45 was expressed significantly in the mature seed (40 DAA) (Fig 9A). It was also observed that although the gene had high expression in 40 DAA seed tissue, it had much lower expression in germinating seedlings of chickpea, suggesting its involvement in seed maturation and dormancy. In plants, abscisic acid (ABA) and gibberellic acid (GA) are known to regulate maturation and dormancy. Therefore, mature seeds of chickpea were subjected to ABA and GA treatment and the level of expression of CarC3H45 was determined using qRT-PCR. The results showed that the expression of CarC3H45 increased upon application of ABA and decreased under the influence of GA (Fig 9B).
Promoter analysis and transactivation activity. Motifs present in the promoter regions of genes play crucial roles in determining their regulation. A region of about 1500 bp upstream of the CarC3H45 ORF was amplified using specific primers (S4 Table), cloned and sequenced. Analysis showed the presence of a number of important motifs in the promoter region of CarC3H45, as had been predicted in silico (S3 Table). One of the motifs was the RY repeat element, which is known to be bound by B3 domain containing transcription factors implicated in seed development [37,38]. Previous studies have also reported that ABI3, which is a B3 domain containing protein and one of the master regulators of seed development, can regulate the CCCH-Znf protein SOMNUS [39]. Therefore, yeast one hybrid analysis was performed to investigate the interaction between the CarC3H45 promoter and ABI3. Results showed that elevated aureobasidin levels had little or no effect on the growth of colonies containing pGADT7-Car-ABI3/pAbAi-C3H45Prom while the rest of the samples showed reduced or no growth at high concentrations of the antibiotic. This suggested a putative role of CarABI3 in the regulation of CarC3H45.
Transcription factors display transactivation activity which can be measured by the yeast β-galactosidase assay. Therefore, the yeast β-galactosidase assay was carried out to determine the effect of CarC3H45 on the number of units of β-galactosidase enzyme produced in yeast cells. The full length CDS of CarC3H45 was cloned into the pGBKT7 vector, which contains a GAL4 DNA binding domain, and used to carry out the assay. There was no significant difference in the number of units of β-galactosidase produced between the negative control (pGBKT7) and yeast cells containing CarC3H45-pGBKT7 (Fig 10B). The results suggested that CarC3H45 may not possess an activation domain that could activate gene transcription.

Discussion
Genome wide analysis of gene families provides valuable insights into regulation of biological processes in plants and also serves as a foundation for identifying candidates for further characterisation of important genes. The in silico methods available today have made it possible to predict gene families on a genome wide level. Using these tools genome wide analysis of the CCCH Znf family in chickpea led to the prediction of 58 CCCH Zinc finger genes. Our observations revealed that the genome size does not determine the number of CCCH TFs reported for a species. The number of CCCH TFs identified in chickpea was higher than those predicted in M. truncatula [19] but lower than those reported in Arabidopsis [9], both of which have genome sizes smaller than chickpea. The number of CCCH motifs in each CarC3H gene was seen to range from one to six and even in other plants, as many as six CCCH motifs have been reported to be present in a single CCCH TF [9,18,19,20]. The 58 CarC3H proteins were also seen to exhibit a wide range of isoelectric points (pI). The isoelectric point and charge of a protein are known to be important for its solubility, subcellular localization as well as interaction. There is a correlation between subcellular location and protein pI. For example, proteins in the cytoplasm possess an acidic pI (< 7.4), while those in the nucleus have a more neutral pI (7.4 <pI < 8.1) [40,41]. The diversity in their lengths, motif patterns and isoelectric points (pI) suggested that the chickpea CCCH Zn finger genes are involved in a wide range of biological functions and may be regulating various aspects of chickpea plant development.
Phylogenetic analyses have been of paramount importance in the study of evolution of species [42]. The origin and course of evolution of a gene family can provide useful data about its functions and relative importance in a species [43,44,45]. Members of a clade were seen to display similarities in their structural organisations and functional domains. It was observed that while the numbers of introns and exons might vary amongst members of a clade, they had remarkably similar pattern of intron-exon arrangement. This could be indicative of exon shuffling during the course of evolution [46]. Presence of similar functional domains in the protein sequences of members belonging to a particular group further corroborates this assumption. Comparative phylogeny between CCCH TFs of chickpea, M. truncatula and Arabidopsis showed that the clustering was based on the presence of specific functional domains and not according to their plant species which implied that the CCCH TF family is highly conserved across the plant kingdom. In addition, the comparative phylogenetic analysis also revealed important CarC3H homologs in Arabidopsis. For example, CarC3H1 showed close homology with HUA1 in Arabidopsis, which has been reported to be involved in floral development [16]. Similarly, CarC3H17 (group VI, Fig 2) was found to be closely related to AtSZF1 which has been implicated in regulation of salt stress related genes [17]. CarC3H58 from group V of the phylogenetic tree (Fig 2) was found to be a close homolog of AtTZF1, a key regulator of ABA/ sugar and GA responses in Arabidopsis [15] and CarC3H45 (group V) was found to be closely related to SOMNUS, one of the most extensively studied CCCH Znf proteins in Arabidopsis and implicated in seed development [39]. Overall analysis indicated that there were a number of CarC3H members which could have potentially vital roles in the process of chickpea development.
Synteny analysis with the genomes of other plants revealed that the CarC3H genes showed higher levels of similarity with the M. truncatula genome than with the G. max, P. vulgaris and Arabidopsis genomes. This reiterates the observations put forth after studying the evolution of the chickpea plant. Both chickpea and M. truncatula belong to the galegoid clade of the Papilionoideae subfamily while soybean belongs to the millettioid clade. The two clades separated about 54 million years (Myr) ago [47] and genome analysis of the galegoid species revealed that chickpea diverged from M. truncatula ~10-20 Myr ago. Therefore, M. truncatula is considered to be a closer relative of chickpea.
Various evolutionary forces act to determine the inheritance of advantageous elements to ensure propagation of species. Both positive (diversifying) and negative (purifying) selections play important roles in the evolution of gene families. Positive selection leads to high variability due to fixation of advantageous mutations in genes which in turn is responsible for species divergence. On the other hand, purifying selection is the default process of elimination of the unfit. The analysis of non-synonymous (Ka) and synonymous (Ks) substitutions in duplicated genes is an efficient method to study the evolution of important genes [48]. A Ka/Ks ratio of 1 indicates neutral selection while Ka/Ks > 1 indicates positive (diversifying) selection and Ka/ Ks < 1 indicates negative (purifying) selection [29]. This method was used to determine evolutionary characteristics of the CarC3H gene family and it was observed that the duplication events in CarC3H genes are a result of purifying selection (Ka/Ks < 1) which means that in the course of evolution deleterious mutations have been eliminated to conserve their functions in chickpea. Genes under purifying selection are known to persist over long evolutionary time frames [49,50]. Therefore, it may be assumed that the CarC3H gene family has a very important role in the development of the chickpea plant which has necessitated the conservation and propagation of its members. The presence of specific motifs in the protein sequences largely contribute towards determining their function. In-silico analysis of CarC3H proteins revealed the presence of a number of motifs in addition to the CCCH motif. All the three members of group III were found to contain KH type domain in their protein sequence. KH domains are known to bind RNA or ssDNA, and are found in proteins associated with transcriptional and translational regulation, along with other cellular processes [51]. 
Members of group VI and VII were seen to contain ankyrin repeat motifs, which were first identified in the yeast cell cycle regulator Swi6/Cdc10 and the Drosophila signalling protein Notch [52]. The ankyrin repeat containing domains act as a scaffold for molecular interactions which are important for numerous signalling pathways [53]. Members of group IX and XI in the phylogenetic tree were found to contain WD40 repeats and Zn finger RING motifs respectively. The WD40 repeat motifs and Zn finger RING domains are known to be involved in various important biological processes like signalling, cytoskeletal dynamics, protein trafficking, nuclear export and RNA processing. The WD40 repeats have been especially implicated in histone remodelling [54] and in flower development [55]. A number of CarC3H proteins such as CarC3H19, 34, 23, 12 and 32 contained RNA Recognition Motifs (RRMs), which are involved not only in RNA/DNA recognition but also in protein-protein interaction [56].
Digital expression analysis revealed the pattern of expression of CarC3H genes in different tissues of chickpea. Validation of this data through qRT PCR revealed that CarC3H26 and 51 had higher expression during early stages of chickpea seed development while CarC3H11 and 45 were specifically expressed in later stages. In addition, CarC3H10 and 58 had higher expression in germinating seed tissues. Moreover, several genes were found to contain tissue specific motifs in their promoters, however no reliable correlation could be established between presence of the specific promoter motifs and expression of genes in those tissues. The data further endorsed the assumption that CarC3H genes were involved in a variety of regulatory roles. Detailed analysis of CarC3H45, a member of group V of the phylogenetic tree, showed that it was an intronless gene containing the plant specific unique arginine rich tandem zinc finger (RR-TZF) motif as was evident from the alignment with known AtTZFs [57, 58]. It has been observed that, ABA accumulation in developing seeds is low during the early stages, highest during the middle stage i.e. during accumulation of storage reserves and declines as the seed undergoes maturation drying. A number of studies involving analysis of mutants deficient in ABA responsiveness provide support for the hypothesis that the absence of, or insensitivity to, ABA during seed development results in the production of precociously germinating seeds [59,60]. Expression analysis revealed that CarC3H45 had preferential expression in the later stage of seed development i.e. at 40 DAA while the expression was drastically reduced during seed germination. Application of exogenous ABA and GA to chickpea seeds showed that the levels of expression of CarC3H45 were increased during ABA treatment and reduced during GA treatment. 
Although the above data suggested probable regulation of this gene by ABA, this alone did not indicate a role for CarC3H45 in regulation of dormancy.
In order to endorse the quantitative expression data, the promoter of CarC3H45 was analysed for the presence of specific motifs which may indicate regulation by ABA. The RY repeat motif is known to be the binding site for B3 domain containing proteins such as ABI3 [61]. There exists a direct correlation between ABI3 and ABA since ABA is known to regulate dormancy through various signalling components, including three positive components, ABA-INSENSITIVE3 (ABI3), ABI4, and ABI5 [62]. The RY repeat motif was found to be present in the promoter region of CarC3H45. Yeast one hybrid assay showed that the CarABI3 protein could bind to the promoter of CarC3H45 thereby conclusively establishing the role of this gene in regulation of chickpea seed dormancy. However, due to the absence of transactivation activity, it could potentially be designated as a transcriptional regulator whose probable interaction with other molecules needs to be investigated in order to establish the regulatory role of this important gene in chickpea.

Plant materials, growth conditions and stress treatments
Chickpea plants (Cicer arietinum cv. ICCV2, Kabuli type) were grown in the fields at NIPGR, India and used for tissue collection. Flowers were tagged on days of full anthesis and seeds were collected 10, 20, 30 and 40 days after anthesis (DAA). The seeds were removed from their pods, immediately frozen in liquid nitrogen and stored at -80˚C until further use. Flowers were collected from the field grown plants. Chickpea seedlings used for tissue-specific analysis were grown under control conditions (16hr/8hr light/dark photoperiod, 22±1˚C/20±1˚C day/night temperature, 65% relative humidity) in the growth chambers at the Plant Growth Facility, NIPGR, India. Leaves and roots were collected from 3 week old seedlings. For collection of germinating seedlings, the seeds were surface sterilised using 70% ethanol and imbibed overnight in water. They were spread on blotting paper for germination and tissue was collected at 24 hr, 48 hr and 72 hr intervals.
For various stress treatments, chickpea plants were grown in controlled conditions at the Plant Growth Facility and 3 week old seedlings were subjected to dehydration, desiccation, salinity and cold stress. Seedlings were subjected to dehydration stress by putting them in 1/2 strength MS medium supplemented with 20% PEG4000. The roots of seedlings were placed between blotting sheets to impart desiccation stress, and salinity stress was given by putting the roots of the seedlings in 150 mM NaCl solution. For cold stress, the seedlings in their pots were kept at 4˚C in the cold room for different time periods. Seedlings were subjected to the stress conditions for 0, 3, 6, 12 and 24 hrs and the 0 hr samples were used as control.

Genome wide prediction of CCCH zinc finger proteins in chickpea
The HMM profile for Zinc finger C-X8-C-X5-C-X3-H type and similar proteins was downloaded from the Pfam database (http://pfam.xfam.org/family/PF00642). The hmmsearch function of the HMMER (v.3.1) program was used to search for the defined profile in the predicted proteins in the chickpea genome [23]. The predicted CCCH domain containing protein sequences were isolated and checked using SMART (http://smart.embl-heidelberg.de/smart/batch.pl) and Pfam (http://pfam.xfam.org/search) to confirm the presence of this domain. The sequences thus obtained were aligned using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/) and any redundancy was manually removed.
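The C-X8-C-X5-C-X3-H spacing named above can be illustrated with a plain regular expression scan. This is only a sketch and not the profile HMM search used in the study: it assumes "X" means any residue, it recognises only this single spacing, and the function name and toy sequence are hypothetical.

```python
import re

# Illustrative pattern for the C-X8-C-X5-C-X3-H spacing; "X" is treated as
# any amino acid. A profile HMM (as used in the study) is far more sensitive.
CCCH_PATTERN = re.compile(r"C.{8}C.{5}C.{3}H")


def find_ccch_motifs(protein_seq):
    """Return (0-based start, matched text) for each non-overlapping match."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in CCCH_PATTERN.finditer(seq)]


# Hypothetical toy sequence constructed to contain one motif.
toy = "MSA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KLV"
hits = find_ccch_motifs(toy)
print(hits)
```

A scan like this is mainly useful as a quick sanity check on candidates that an HMM search has already returned.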

Prediction of genomic organisation and structural domains
Genomic organisation of the CCCH zinc finger genes was predicted using Gene Structure Display Server 2.0 (GSDS; http://gsds.cbi.pku.edu.cn/). The GFF3 file containing information about the intron-exon structure of CarC3H genes was used as input. Structural domains in protein sequences were predicted using ScanProsite (http://prosite.expasy.org/scanprosite/), which provided the positions of the different domains in each protein sequence. This information was used to draw a visual representation of the distribution of domains along the amino acid sequences using DOG v. 2.0 (http://dog.biocuckoo.org/links.php). Conserved motifs in the amino acid sequences were predicted using the online tool Multiple Em for Motif Elicitation (MEME, http://meme-suite.org/tools/meme) with the following parameters: Motif discovery mode, Normal mode; Site distribution, Any number of repetitions; Number of motifs, 10; Width of motifs, 6 to 50.

Phylogenetic tree construction
Amino acid sequences of proteins were aligned using ClustalW and the alignment file was used to generate the phylogenetic tree using the Neighbour-joining method in MEGA5.2 (http://www.megasoftware.net/). The parameters used for construction of the phylogenetic tree were as follows: Statistical method, Neighbour-joining; Test of phylogeny, Bootstrap method with 1000 replications; Model/method, Poisson model with uniform rates of distribution; Gaps/missing data treatment, Pairwise deletion.
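To make the distance settings concrete, the sketch below computes a Poisson-corrected distance, d = -ln(1 - p), between two aligned protein sequences, skipping any column where either sequence has a gap (pairwise deletion). It is an illustration under those standard definitions rather than MEGA's own code; the function name and the example strings are hypothetical.

```python
import math


def poisson_distance(seq_a, seq_b, gap_chars="-?"):
    """Poisson-corrected distance between two aligned sequences.

    Columns where either sequence carries a gap or missing-data symbol
    are ignored for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion of this column
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = differences / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)


# Five comparable sites, one difference: p = 0.2
d = poisson_distance("ACDE-F", "ACDQGF")
print(round(d, 4))
```

A matrix of such pairwise distances is the input that a Neighbour-joining algorithm then turns into a tree.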

Chromosomal location, duplication analysis and calculation of Ka/Ks
The stand-alone version of BLAST (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.25/) was used to map the nucleotide sequences onto the kabuli chickpea genome [23] to determine the positions in the genome where the genes were located. The MapChart software (https://www.wageningenur.nl/en/show/Mapchart.htm) was then used to derive the diagrammatic representation of location of genes on the 8 linkage groups of chickpea. Tandem and segmental duplication events were identified based on the information available in Plant Genome

Digital gene expression analysis
The 454 reads for expression analysis in chickpea tissues (leaf, root, flower bud, pod and seed) were retrieved from SRA (Sequence Read Archive) under accession numbers SRX048833, SRX048832, SRX048834, SRX048835 [32] and SRX125162 [33], respectively. Short reads for chickpea root and shoot tissue under three stress conditions (desiccation, salinity and cold) were retrieved from the SRA database under accession number SRP034839 [65]. The reads were mapped onto the predicted gene models in the kabuli chickpea genome [23] using BWA-MEM [66] for 454 reads and BWA [67] for Illumina reads. Mapped reads were extracted using SAMtools [68] and were used for calculating the RPKM (reads per kilobase per million mapped reads) values [69]. The RPKM values for CarC3H genes were utilized for generating the heat maps and k-means clustering using the MeV software [70].
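The RPKM normalisation mentioned above follows the usual definition: reads mapped to a gene, divided by gene length in kilobases and by total mapped reads in millions. The sketch below applies that definition; the function name and the example numbers are hypothetical and are not taken from the study's data.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp to kb) * 1e6 (reads to millions of reads)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp gene model,
# in a library with 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)
```

Because the value scales with both gene length and library size, RPKM values are comparable across genes and across libraries of different depth.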

Quantitative real time PCR
Total RNA was isolated from different tissues using the LiCl precipitation method as described by Pradhan et al. (2014). First strand cDNA was synthesized by reverse transcription from 3 μg of total RNA in 20 μl of reaction volume using the AccuScript High Fidelity 1st Strand cDNA Synthesis Kit (Agilent Technologies, USA) as per manufacturer's instructions. 5X dilutions of all cDNA samples were used for real time PCR analysis. Gene specific primers were designed using PRIMER EXPRESS version 3.0 (Applied Biosystems, USA) with default parameters. Reactions were carried out in a final volume of 10 μl with 200 nM of each primer mixed with SYBR Green PCR master mix (Brilliant III Ultra-Fast SYBR Green QPCR Master Mix, Agilent Technologies, USA) and 1 μl of 5X diluted first strand cDNA, as per manufacturer's instructions. The reaction was carried out in 96-well optical reaction plates (Applied Biosystems, USA), using Applied Biosystems' ViiA7 Real Time PCR system and software (Applied Biosystems, USA). To normalize the variance among samples, Elongation factor 1α and HSP90 were used as endogenous controls. Relative expression values were calculated after normalizing against the reference expression value (leaf in case of tissues and control in case of stressed samples). The values presented are the mean of three biological replicates, each with three technical replicates. The error bars indicate standard deviation. All primers are listed in S4 Table.

Southern blotting
Genomic DNA was isolated as described by Doyle and Doyle [71]. About 10 μg of genomic DNA was digested with BamHI, SacI and EcoRI (NEB, USA). Digested DNA was separated on 0.8% (w/v) agarose gel and transferred onto Hybond-N nylon membrane (Amersham Biosciences, UK) in 20X SSC. Briefly, the gel was run at 20-30 V until migration reached two-thirds of the gel length. The gel was rinsed in autoclaved MQ water. Depurination was carried out by rinsing the gel in 0.125 N HCl for 10-15 min. After washing the gel with MQ water 2-3 times, the gel was rinsed in denaturing solution for 30 min with gentle shaking. Lastly, the gel was gently shaken in neutralization solution for 30 min. Transfer of DNA to the membrane was carried out according to Sambrook's protocol [72]. The membrane was washed in 2X SSC and UV-crosslinked. The blot was kept in a hybridization bottle containing 10 ml of prehybridization buffer (0.1 M sodium phosphate buffer, pH 7.2; 10% SDS; 0.5 M EDTA) and incubated at 60˚C for 4-5 hr. The radioisotope labelled probe was added to the hybridization bottle with the blot and allowed to hybridize at 60˚C overnight. After hybridization, the blots were washed with 2X SSC, 0.1% SDS for 10 min at 60˚C followed by washing with 1X SSC, 0.1% SDS for 10 min at room temperature. The membranes were exposed to storage phosphor screens (Amersham Biosciences, UK) for 30 min to 1 hr. Images were acquired by scanning the membranes with a Typhoon 9210 scanner (Amersham Biosciences, UK).

Yeast one hybrid assay and transactivation assay
For yeast one hybrid assay, the promoter sequence of CarC3H45 (about 1500 bp) was amplified from genomic DNA of chickpea and cloned into pAbAi vector between the SacI and XhoI restriction sites. The ligated product was transformed into competent Y1HGold strain of yeast using Fast yeast transformation kits (G Biosciences), as per manufacturer's instructions. pAbAi+Bait (CarC3H45Prom) (100 ng) was transformed into S. cerevisiae Y1HGold strain. About 100 μl of a 1/10 dilution and a 1/100 dilution of the transformation mixture were spread onto separate plates containing different concentrations of Aureobasidin to check for autoactivation. The pGADT7-CarABI3 prey construct was then transformed into these competent cells, plated on SD/-Ura plates containing a suitable amount of Aureobasidin and grown at 30˚C for 2-3 days. Yeast colony PCR was carried out to confirm the presence of both bait and prey DNA sequences. The Y1HGold competent cells were also transformed with p53-pAbAi/p53-pGADT7 constructs, to be used as positive control, and CarABI3-pGADT7/p53-pAbAi, to be used as negative control. Yeast colony PCR was performed and positive colonies were inoculated in 5 ml of YPDA medium and grown overnight at 30˚C. The overnight cultures were diluted (3X) in 0.9% NaCl solution. Serial dilutions of 5X were prepared successively from each and spotted onto SD/-Ura/AbA plates. The plates were observed after 2-3 days.
To perform transactivation assay, constructs containing the full CDS of CarC3H45 in pGBKT7 vector were transformed into competent yeast cells (strain AH109) using Fast yeast transformation kits (G Biosciences), as per manufacturer's instructions. About 80 μl of the transformed yeast cells were plated on SD/-Trp plates and grown at 30˚C for 2-3 days to obtain colonies. A single positive colony was picked and grown overnight in 5 ml of SD/-Trp liquid medium. The culture was vortexed to disperse clumped yeast cells and 2 ml of this culture was added to 8 ml of YPDA broth. The culture was incubated at 30˚C with shaking (220-240 rpm) till the cells were in mid-log phase (OD600 = 0.5-0.8) and the exact OD was recorded. ONPG was dissolved at a concentration of 4 mg/ml in Z buffer with shaking for 1-2 hr. 1.5 ml of the secondary culture was taken in three replicates, centrifuged at 14,000 rpm for 1 min and the supernatant was removed. 1.5 ml of Z buffer was added to each tube and cells were resuspended by vortexing. The cells were centrifuged as before and supernatants were removed. The pellets were resuspended in 300 μl of Z buffer, thereby giving a final concentration factor of 5. From this, 100 μl was transferred into a fresh 1.5 ml Eppendorf tube and the tubes were placed in liquid nitrogen for 30 sec and then in a 37˚C water bath for 30 sec. This was repeated another three times. To 100 ml of Z buffer, 0.27 ml of β-mercaptoethanol was added and 0.7 ml of this was added to each tube, including a blank tube containing 100 μl of Z buffer. 160 μl of ONPG in Z buffer was added to each tube, the tubes were placed at 30˚C in an incubator and a timer was started. As soon as yellow colour developed, 0.4 ml of 1 M Na2CO3 solution was added to all tubes and elapsed time was recorded in minutes. Tubes were centrifuged at 14,000 rpm and supernatants were collected.
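As a worked illustration of the unit calculation for this assay (β-galactosidase units = 1,000 × OD420 / (t × V × OD600), with V = 0.1 ml × concentration factor), the sketch below evaluates the formula for one tube. The function name and the absorbance and time readings are hypothetical, chosen only to show the arithmetic.

```python
def beta_gal_units(od420, elapsed_min, od600,
                   concentration_factor=5.0, assay_volume_ml=0.1):
    """beta-galactosidase units = 1000 * OD420 / (t * V * OD600).

    V is the assay volume (0.1 ml) multiplied by the concentration
    factor of the resuspended cells (5 in the protocol above).
    """
    if elapsed_min <= 0 or od600 <= 0:
        raise ValueError("elapsed time and OD600 must be positive")
    v = assay_volume_ml * concentration_factor
    return 1000.0 * od420 / (elapsed_min * v * od600)


# Hypothetical readings: OD420 = 0.3 after 30 min, culture OD600 = 0.6
units = beta_gal_units(od420=0.3, elapsed_min=30, od600=0.6)
print(round(units, 2))
```

With three replicate tubes per construct, the same function would be applied to each tube and the results averaged.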
Absorbance of the samples was measured at 420 nm and β-galactosidase units were calculated as follows:
β-galactosidase units = 1,000 × OD420 / (t × V × OD600)
where t = elapsed time (in min) of incubation; V = 0.1 ml × concentration factor (5 in this case); OD600 = A600 of 1 ml of culture.

Supporting information
S1