Transcriptome analysis identified aberrant gene expression in pollen developmental pathways leading to CGMS in cotton (Gossypium hirsutum L.)

Male sterility (induced or natural) is a potential tool for commercial hybrid seed production in different crops. Despite numerous endeavors to understand the physiological, hereditary, and molecular cascade of events governing CMS in cotton, the exact biological process controlling sterility and fertility reconstruction remains obscure. During current study, RNA-Seq using Ion Torrent S5 platform is carried out to identify ‘molecular portraits’ in floral buds among the Cytoplasmic Genic Male Sterility (CGMS) line, its near-isogenic maintainer, and restorer lines. A total of 300, 438 and 455 genes were differentially expressed in CGMS, Maintainer, and Restorer lines respectively. The functional analysis using AgriGo revealed suppression in the pathways involved in biogenesis and metabolism of secondary metabolites which play an important role in pollen and anther maturation. Enrichment analysis showed dearth related to pollen and anther’s development in sterile line, including anomalous expression of genes and transcription factors that have a role in the development of the reproductive organ, abnormal cytoskeleton formation, defects in cell wall formation. The current study found aberrant expression of DYT1, AMS and cytochrome P450 genes involved in tapetum formation, pollen development, pollen exine and anther cuticle formation associated to male sterility as well as fertility restoration of CGMS. In the current study, more numbers of DEGs were found on Chromosome D05 and A05 as compared to other chromosomes. Expression pattern analysis of fourteen randomly selected genes using qRT-PCR showed high concurrence with gene expression profile of RNA-Seq analysis accompanied by a strong correlation of 0.82. The present study provides an important support for future studies in identifying interaction between cyto-nuclear molecular portraits, to accelerate functional genomics and molecular breeding related to cytoplasmic male sterility studies in cotton.

Introduction Cotton, (Gossypium spp.) belonging to the Malvaceae family, that is an important cash crop because of being a natural source of fiber for various industrial use. In addition to this, cotton is a major source of oil for human consumption and cottonseed meal provides important protein nutrients as animal feed [1]. It provides more than 40% of world's raw fiber for industrial use in more than 100 countries of temperate, tropical and subtropical regions, truly bestowed the name "white gold" [2]. Plant breeders of India and China [3] are able to increase yield of cotton from 10% to 20% by successfully acceptable limit of exploit heterosis using selective breeding [4]. Most of these heterosis are obtained through quite laborious, tedious, and costly process of artificial emasculation. This hybrid seed production system is practically difficult in application, as the pollen may not be completely removed. This prompted scientists to use a novel technology called 'male sterility', as a tool for commercial hybrid seed industry, as it escape the need for hand emasculation in hybrid seed production [5].
Within nature, it was observed that certain hermaphroditic angiosperm species frequently lost their capacity to produce viable pollen grains. The complex trait, termed as 'cytoplasmic male sterility' (CMS) is influenced by patterns of mitochondrial genome evolution, as well as intergenomic gene transfer among the organelle and nuclear compartments of plant cells [6,7]. A three-line technique (Male sterile line 'A', maintainer line 'B' and restorer line 'R') is popularly utilized in the CMS-based production of hybrids. Three-line hybrid production based method has been commonly exploited to produce commercial hybrids in crops like corn, turnip, and rapeseed [8]. However, the global gene expression profiling of CMS and it's interaction with restorer genes are still unknown to unveil the mechanisms underlying these processes still remains vague. Development of next-generation sequencing technology provides the biggest breakthrough to the scientific community at genomic, transcriptomic and clinical research areas. Genome sequencing of Gossypium arboretum, G. raimondii and G. hirsutum, set a stage and provide the backbone to dig deep inside the mechanism behind male sterility in cotton [9].
Gossypium hirsutum, with an estimated genome size of 2.25-2.43Gb, is an allotetraploid (AADD, 2n = 4x = 52) upland cotton, which is accounted for more than ninety percent of world's commercial cultivation [10]. The RNA sequencing (RNA-Seq) technology provides unrivalled aspects of messenger RNA profiling with in the cells without any specific prior knowledge [11]. Currently, transcriptome profiling using NGS has been extensively utilized to identify differentially expressed genes (DEGs) in CMS lines of many economically important plant species like, cotton [ [16]. However, there are only a few reports of NGS based transcriptome analysis to understand the cytoplasmic male sterility in G. hirsutum lines.
In previous study, a preliminary sketch of comparative transcriptome profile between fertile and sterile lines of cotton was done using low depth of RNA sequencing to identify DEGs participating in pathways related to sterility in cotton [17]. The current study is focused on better understanding and identification of 'molecular portraits' within DEGs using RNA-Seq by the Ion Torrent S5 technology in floral buds of restorer, sterile and its recurrent parent i.e., the maintainer line from three different cotton lines.

Plant materials
The three lines of cotton (Gossypium hirsutum L.) plants i.e. CGMS (JS178), maintainer (JB178), and restorer (JR178) were cultivated during Aug-Oct, 2017 at the Department of Agricultural Biotechnology, Junagadh Agricultural University (JAU), Gujarat, India. The experimental is carried out in a two greenhouse plots with medium black soils falling in the tropical zone, characterized by fairly hot summer, moderately cold winter and humid monsoon. Geographically the experimental site was situated at 21 0 49.68'N latitude 70 0 44.55 E longitude and at altitude of 97 meters above mean sea level. The CGMS lines using in this study was possessing G. harknessii cytoplasm and developed by JAU, Gujarat. The CGMS hybrid variety of JS178 is popular because it's extra-long staple (28 mm MHL), spun up to 69s with ginning 33% and cultivated in Saurashtra region of Gujarat state. All the lines were cultivated in a greenhouse with control environment at 72% relative humidity with a 21/16˚C (14 h/ 10 h) day/night temperature regime. After sprouting and growing in greenhouse plots, the flower buds were collected during 37 to 42 days at sporogenous cells stage (SS) and microsporocyte stage (MS) stage of 1.5-2.2 mm and 2.2-2.6 mm sized two developmental stages, respectively. A total of thirty plants from each line, were collected in duplicates (fifteen plants from each plot) and combined to reduce between plant variations. The outer protective portion of the bud was removed carefully and the pooled materials were stored in liquid nitrogen to extract RNA [18].

Total RNA extraction, m-RNA isolation and c-DNA library preparation
A total RNA was extracted from flower buds of CGMS and its fertile lines anthers at different stages using the "Hot Borate RNA isolation protocol" [19]. The contaminating genomic DNA was removed by DNAseI (Fermentas, Harover, MD) treatment at 37˚C for 1 h. The concentration of each RNA sample was measured using ND1000 (Thermo Fisher, Wilmington, DE) spectrophotometer. The quality of the RNA was assessed on 1.0% denaturing agarose gel in combination with the RNA 6000 Pico Kit of Bioanalyzer 2100 (Agilent Technologies, Palo Alto, CA). Total RNA with high purity and integrity (> 7.0 RIN) was utilized for mRNA purification after pooling duplicates in equal quantity. Equimolar pooling of RNA for duplicated samples of all three lines was done. The mRNA purification was performed using poly-(T) oligo-attached magnetic of Dynabeads mRNA DIRECT Purification Kit following the manufacturer's instructions. Quality and quantity of mRNA was confirmed using Qubit spectrofluorometric method. Libraries were generated by using RNA Library Prep Kit (Ion total RNA Seq V2), clonal amplification was carried out to generate multiple copies of fragments on the Ion Sphere (IPS) beads using the Ion torrent OT2 machine and RNA sequencing of each IPS was carried out in Ion Torrent S5 sequencer system using 540 chips.

Data analysis to identify genes of differential expression
The data analysis was done as depicted in Fig 1. The mapping was done using the Cotton (Gossypium hirsutum L.) reference genome (http://mascotton.njau.edu.cn/info/1054/1118.htm), using STAR 2.5.1 software [20]. The expression levels of genes in terms of fragments per kilobase of transcript per million fragments mapped (FPKM) using Cufflinks [21] were measured in three lines. DEGs between three lines were identified using the 'Cuffdiff'' module with absolute log2 fold change >2 between different groups and a false discovery rate (FDR) of p <0.05 [22]. Location of the differentially expressed gene on Cotton (Gossypium hirsutum L.) genome is plotted using Phenogram package of R [23].

Gene Ontology (GO), pathway and Singular Enrichment Analysis (SEA)
The transcripts that showed significant differential expression were further analyzed by Singular Enrichment Analysis (SEA) to identify enriched Gene Ontology (GO) term using AgriGO v2.0 analysis tools (http://bioinfo.cau.edu.cn/agriGO/) [24]. GO analysis was performed for functional categorization of identified DEGs. The specific biological functions were predicted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, by mapping the DEGs into the KEGG database. A corrected p �0.01 was set as a threshold to identify GO significant enrichment of gene while q �0.01 was selected as a threshold of pathway significant enrichment [25].

Validation of DEGs using qRT-PCR
Real-time quantitative reverse transcription PCR (RT-PCR) was performed at 37˚C for 15 min followed by 85˚C for 5 s by using the PrimeScript RT Master Mix Kit (Takara, USA). The reactions of 20 μl were prepared using 10 μl SYBR Premix Ex Taq II, 0.8 μl of 10 mM forward and reverse primers each, sterile water and 1 μl cDNA template following which amplification reactions were conducted. The cycling parameters used were; 95˚C for 30 s, 40 cycles of 95˚C for 5 s and 60˚C for 30 s [26]. All reactions were carried out in triplicate, and the data obtained were analyzed using the ABI 7500 Software v2.3, and Livak's -ΔΔ CT method was used for gene expression data analysis [27]. The PCR primer pairs used in qRT-PCR are shown in Table 1.
An overview of the implemented workflow to identifying DEGs, GO, SEA, and KEGG pathway analysis of three different lines of cotton (Gossypium hirsutum L.) is given in (Fig 1).

Transcriptome sequencing and assembly
The present study was resulted to identify the DEGs involved in CMS pathways by comparing the CGMS, maintainer and restorer lines of cotton (Gossypium hirsutum L.). The cDNA libraries for three lines i.e. JS178, JB178 and JR178 during SS and MS stages were prepared separately, and reads were pooled for both stages of each line accounting 53.05, 61.70 and 67.96 million raw reads, respectively. The result of quality trimming as well as mapping against cotton reference genome using the STAR version 2.5.1 is given in Table 2.

SEA, GO and KEGG classification of the transcriptome
AgriGo v2 was utilized to perform SEA to identify GO terms and associated pathways analysis revealed that, the anther development is negatively regulated via stamen, androecium, floral whorl & organ developments at a different level with negative regulation of pathways related to pollen wall assembly and pollen exine formation. The details of each pathway with a level of their involvement is given in (Fig 2). A total of 2,313 predicted transcripts were assigned GO annotations in three main GO categories and 32 subcategories (Fig 3). The Biological Process category of GO is mainly enriched with 'Metabolic process' (229 genes), 'Response to stimulus' (129 genes) and 'Biological regulation' (91 genes) representing 23.01%. 12.56% and 10% of total transcript involved in this category. The 'Cell and Cell part' (359 genes; 52.66%), 'Organelle' (200 genes; 29.5%) and 'Organelle part' (58 genes; 8.5%) had the most number of genes in the cellular component category. The molecular function category of GO was mainly sub-categorized by 'Catalytic activity' (250 genes; representing 38 .9% of transcripts in the category), 'binding' (234 genes; 36.4%) and 'Transcription regulator activity' (60 genes; 10%) (S1 Table).
In addition, the specific biological functions of DEGs and their metabolism networks were analyzed using KEGG pathway analysis, and a total of 2201 transcripts were classified into 110 pathways (S2 Table), among which metabolic pathways, biogenesis of secondary metabolites and signal transduction in plant hormones contained the most transcripts.

DEGs between sterile and fertile buds
A total of 754 non-redundant significantly DEGs between JS178, JB178, and JR178 were identified using Cufflinks. The exact location of the DEGs was identified and plotted on 26 chromosomes of G. hirsutum using Phenogram software (Fig 4A). The total number of DEGs present on each chromosome were presented in (Fig 4B). Location of DEGs on cotton chromosomes showed a great bias towards parenteral genome of G. raimondii genome (D) as compared to its counterpart. Moreover, it was found that chromosome number 5 has the highest number of DEGs followed by 8 and 12.
A total of 300 DEGs are found by comparing JS178 and JB178, which are assumed to have cytoplasmic differences. These DEGs includes, 11 protein-coding genes (S6 Table), possibly targeted to the mitochondria, which were highly down-regulated in JS178 in addition to 3 chloroplast targeted protein-coding genes Gh_A07G0166, Gh_D07G0223 and Gh_A09G1400. The same set of genes are having a different level of expression when the comparison was  made between JS178 and JR178, which are likely to have differences in both nuclear and cytoplasmic genome. The current study found only two common genes, of which Gh_D07G1246 has similar fold change in both compressions and another one i.e. Gh_A08G0763 was more down-regulated in JS178 compare to JR178. This DEGs may be affected by the interaction of the CMS cytoplasm and male sterile nucleus genes so can be a good candidate for more study on CGMS in cotton.
The total number of DEGs between the three lines in terms of various fold change is presented in Fig 5B. Genes with >10 fold differences, accounted for 82.62%, of all non-redundant DEGs. These genes mainly encoded for transcription factor myb domain protein 4, AMS, DYT1, bHLH DNA-binding family protein, WD40 repeat like superfamily protein, VDA-C4_ARATH mitochondrial outer membrane protein porin, and NAD (P)-binding superfamily protein. The results indicated that 21 shared DEGs with the similar expression pattern were identified in both between JS178 vs. JR178, as well as JS178 vs. JB178 (Fig 5C). All these of DEGs present on each chromosome of cotton. A total number of differentially expressed genes on diploid sets of Gossypium genome A and D. The X-axis represents different chromosomes. Y-axis is DEGS numbers on each chromosome.
Transcription factors (TFs) play a vital part in various biological processes by regulating gene transcription by binding to specific DNA sequences in the regulatory regions of multiple target genes [29]. Therefore, even a minor alteration in the expression of TFs normally leads to intense changes in the gene expression level. Our analysis identified 12 up-regulated (like Homeodomain-like superfamily protein, Chaperone DnaJ-domain superfamily protein, acyl-CoA -binding protein 6, multiprotein bridging factor 1A, MADS-box TF family protein and AGAMOUS-like 19) and 63 (including TFs belong to bHLH, NAC, WRKY transcription factor family, Heat shock protein, MYB, F-Box, bZIP transcription factor family protein and GATA-type ZFTF family protein) (S7 Table).
Pathway-based analysis of the deregulated genes were done using KEGG automatics annotation server and classified utilizing the single-directional best hit strategy [30]. In JS178 compared with the fertile lines (JB178 and JR178), most of the up-regulated DEGs functioned as 'ribosome' (S8 Table). In contrast, the large number of down-regulated DEGs functioned as 'metabolic pathways', 'biosynthesis of secondary metabolites, pentose and glucuronate interconversions', carbon metabolism, plant-pathogen interaction, and plant hormone signal transduction (S9 and S10 Tables).
GO analysis showed that, the up-regulated DEGs in the sterile line mainly regulates the metabolism of organonitrogen compounds, nucleobase-containing small molecule metabolic process and non-membrane-bounded organelle. While, down-regulated DEGs mainly functioned in the single-organism processes, anther wall tapetum development process, pollen development, carbohydrate metabolism and cell wall biogenesis or organization. For molecular function, main GO terms of DEGs contained tetrapyrrole binding, oxidoreductase activity, hydrolase activity, acting on ester bonds. For cellular component, five GO terms of DEGs were accessible, which participated in the extracellular region, apoplast and COPII vesicle coat (S11 Table).

DEGs involved in cell wall component
In current study, all the 28 genes down-regulated in sterile line were associated with anther wall tapetum, pollen exine and cell wall formation. These genes may be related to the abortive phenotypes anther and pollen development and consequent resultant sterile lines. Gh_D12G2768, Gh_A11G2384, and Gh_D11G0403 are highly down-regulated by 12.06, 12.91 and 13.34 folds respectively. These are important genes involved in degradation of callose and functional regulation of developmental stages of tapetum. In addition to tapetum related genes, 23 genes related to cell wall components were identified including acyl-CoA synthetase 5, 4 cellulose synthase, 2 CYP450 family polypeptide, 6 expansin (A4 and A8), 4 xyloglucan hydrolase, 2 pectin methylesterase inhibitor superfamily, Pectinacetylesterase family protein, Nucleotide-diphossugar_trans and, myb domain protein 4. All these genes were down-regulated in JS178. These genes contribute to the alteration of intine formation in JS178.

Differential expression of Pentatricopeptide Repeat (PPR) Proteins and carbohydrate metabolism related activity
Genes encoding the site-specific PPR motif are arbitrating through the interactions between RNA-binding substrates and enzymes. It play a critical role in the biogenesis of plant mitochondria [31]. It was observed that genes encoding PPR proteins i.e. Gh_D05G2657, Gh_A01G0058, Gh_D05G0682, Gh_D05G0343 and Gh_D08G0123 were down-regulated in JS178 compared to the restorer line. Besides these genes, 2 PPR genes Gh_D12G0614 and Gh_D12G0539 were highly down-regulated in maintainer when compared to restorer line JR178. These PPR proteins could be candidates for examining the co-operation between mitochondrial and nuclear gene expression [32].
Rearrangements in the mitochondrial genome can modify the expression profile of mitochondrial genes or cause reduction in carbohydrate accumulation [33]. Oxidative stress could lead to premature abortion of tapetal cells [34]. In the present study, 43 genes involved in carbohydrate metabolic and 58 involved in oxidoreductase activity were differentially expressed between sterile and fertile lines.

Concordance check of RNA-Seq profile using qRT-PCR
To evaluate the results of RNA-Seq with concordance of patterns of DEGs, we have investigated 14 DEGs (8 down and 6 up-regulated) by qRT-PCR assays from the same RNA samples that were used for the transcriptome analysis [35]. The corresponding primers use in this study are listed in Table 2. Out of the 14 transcripts subjected to validation in 3 different samples, thirteen transcripts showed complete concordance with the expression profile of RNA--Seq analysis and 1 transcript namely, Gh_D11G3072, showed contrasting expression profile. There was apparently a strong positive correlation (R 2 = 0.82) between the data obtained from RNA-Seq and qRT-PCR data. Dissimilarity between the biological replications was non-significant in real time PCR. Thus, based on these results, we confirmed that RNA-Seq was highly consistent with qRT-PCR. The correlation of RNA-Seq profile and qRT-PCR result for each gene tested is given in Fig 6 (S12 Table).

Discussion
In the current study, a relative transcriptome investigation of anthers of three lines of cotton was conducted to recognize molecular portraits that are differentially communicated in the fertile and sterile cotton plants for encouraging the distinguishing proof of systemic gene expression, regulatory pathways and mechanisms involved with CMS in cotton. The most of the DEGs in JS168 (94.33% vs JB178; and 93.52% vs JR178, Fig 4A) were down-regulated compared to fertile lines. A similar comparative pattern was also detailed in transcriptome study between CMS of Gossypium harknessii, its maintainer and restorer lines [12]. In another investigation to distinguish genes contributing towards faulty pollen wall in male sterility, more number of down-regulated genes were recognized in sterile line [36,37]. Total number of DEGs identified between JB178 and JR178 were less than other comparisons indicating regulatory roles of restorer genes in addition to the role of reproductive rehabilitation [38]. Most of DEGs were located in genome D especially Chr_D05. This result is similar to the study of Wu, Zhang (9) and Zhao [39] where they distinguished 20 hopeful genes including Rf1, for fertility restoration on chromosome 5 in cotton.
The fruitful collaboration of gametophytic and sporophytic genes results in the propagation of flowering plants through the development of viable pollen grains of the anther [40,41]. The link between these two procedures occurs in tapetum. The secretory tissue of the tapetum plays an important role by giving basic essential nutritional components required for the development of pollen and exine formation [42]. In the present investigation, aberrant expression of genes required for anther and pollen cell wall development were seen at various levels. The development of pollen exine mainly depends on the tapetum, which begins depositing lipidic components of tryphine into exine cavities which results in the formation of the sculptured exine [43]. The synergy between tapetal PCD and microspore development is inevitable. Several TFs involved in guiding the development of the tapetum and pollen arrangement have been reported [44]. In this study, six MYB genes (Gh_A05G2120, Gh_D07G2090, Gh_D06G0691, Gh_A09G1458, Gh_D01G1631 and Gh_A07G2342) were observed to be down-regulated in sterile lines. AtMYB103 particularly expressed in tapetum and in the middle layers of anthers assumes to have an essential role in the development of the tapetum and pollen, callose disintegration, exine arrangement in anthers. Reduction in the expression of AtMYB103 resulted in an earlier degeneration of tapetum, and majority of pollen grains lacked cytoplasm and were metamorphopsised [45,46]. Additionally, bHLH which is noted as tapetum-particular transcription factors were reported by many researchers to play critical role in tapetum and pollen developmental process [47][48][49][50], stomata development and hormone signaling [51]. In the current study, six bHLH genes (Gh_A07G0112, Gh_D10G0179, Gh_A11G1614, Gh_A12G0337, Gh_D06G1739 and Gh_D06G1740) were found to be downregulated in sterile lines. One of the most interesting down-regulated bHLH in sterile line (-14.05 fold) Gh_D10G0179, corresponding to AT4G21330 which acts as a DYT1_ARATH transcription factor dysfunctional tapetum. Arabidopsis dysfunctional tapetum 1 (DYT1), was the first identified bHLH transcription factor which specifically modulates the function and development of the tapetum via regulating thousands of anther genes. A total of 276 genes have been reported to be deregulated through the feed-forward regulatory loops between the downstream bHLH transcription factors, other bHLH gene families and DYT1 [47]. The one more promising target among bHLH gene i.e. Gh_A12G0337, was down-regulated in sterile line by more than 13 fold. It encodes for 'Aborted Microspores (AMS)' gene, which is key regulator of pollen wall formation [52]. The AMS encodes many of bHLH family protein. The mutant of AMS shows deterioration of tapetum as well as microspores at anther stage 7 [53].
The current study also traced many distinguishably down-regulated genes in the floral buds of JS178 which were related to pollen cell wall formation, pollen development, and pollen tube growth. These genes include Expansin, Cytochrome P450, and COBRA-like gene. In maize, expansin expedites pollen tube growth through breaking up the maternal cell walls [54]. COBRA-Like genes (COBLs) are mainly responsible for the secondary cell wall biogenesis [55]. The mutations COBl4 in A. thaliana were accounted for to influence the mechanical quality of vascular bundles and have a huge decrease in cellulose content [56]. P450 family are involved in anther pollen development [57]. Li et al 2010 announced that CYP704B2 a family member of Cytochrome P450 is necessary for cuticle and exine formation during plant male gamete formation [58].
Fertility rebuilding for CMS crops is the most essential concern in heterosis utilization and the creation of hybrid seeds [59]. The hereditary studies demonstrated that PPRs tie to a particular regulatory sequence of RNA and develop anterograde regulation like, post-transcriptional splicing, editing, processing, or regulating mRNA stability [60,61]. In this study, 1 up-regulated and 7 down-regulated PPR genes (Gh_D05G2657, Gh_A01G0058, Gh_D05G0682, Gh_D05G0343, Gh_D08G0123, Gh_D12G0614 and Gh_D12G0539) were identified in the sterile line. The hindering impact of CMS proteins on the development of the tapetum and pollen can be prevented by the interaction between members of CMS and Rf family members through various mechanisms [62]. About half of the identified Rf genes encode Pentatricopeptide Repeat (PPR) [63,64]. Emp5 gene is a corn PPR gene that regulates mitochondrial gene expression through altering transcripts of the ribosomal L16 rpl16-458 locus, prompts modification in seed advancement [65]. Chen and Liu (62) reported that PPR genes by hindering the expression of mitochondrial CMS genes could re-establish pollen fertility in many plant species [62]. Therefore, we could hypothesize that the JS178 genes and pathways that regulated male sterility are not only associated with the mitochondria but are also involved in nuclearmitochondria interaction.
The current study also found 5 down-regulated F-box TFs in the JS178 line. Li, Li [66] demonstrated that suppression of F-box protein-encoding gene, OsADF (anther development Fbox), causes degeneration of tapetum and finally a male sterility in rice. WRKY-family (Gh_D07G2328, Gh_A11G0868, Gh_A11G0954, and Gh_A12G2121) and bZIP TFs family protein (Gh_A01G1655, Gh_A09G0901, and Gh_A06G1283) were additional TFs, which were highly down-regulated in JS178. The WRKY-family and bZIP transcription factors have an established role in pollen and anther development pathways [67].
GO enrichment analysis of 55 DEGs located on chromosome D05, showed enriched pathways related to catalytic activity, carbohydrate metabolic process and reproduction possibly suggesting pathways of deprivation of soluble sugar could be a cause male sterility as observed by another group of researchers [39,68].

Conclusions
The reconnaissance of male-sterility associated genes and the clear understanding of the functions of these genes in cotton has high value in it's breeding programmes. In the present study, the examination of the whole transcriptome of CGMS, maintainer and sterile lines of Gossypium hirsutum L. resulted in the identification of 754 non-redundant DEGs consist of 53 up and 701 down-regulated DEGs in the sterile line as compare to fertile lines. The signaling pathway investigation through GO and KEGG enrichment analysis showed carbohydrate metabolism-related genes, cellular activity related genes like cell division, cell wall formation and development, cell membrane, and cytoskeleton as well as transcription factors which have a role in development of reproductive organs such as pollen wall formation, pollen maturation, pollen tube and anther development were far-reaching pathways altering among fertile and sterile lines. These data validated the already uncovered transcript related to sterility pathways in other plant species also exist in cotton. It is notable that in sterile line highly down-regulated genes DYT1 and AMS, which are involved in tapetum function, anther, and pollen development might be appropriate candidate genes for functional analysis of gene pertaining the genetic lattice that controlled the functional development of anther and pollen. The present investigation provide useful data for future investigations to uncover the biology behind the cyto-nucleoplsmic interaction leading sterility and restoration of in cotton.