Characterization of Sex Determination and Sex Differentiation Genes in Latimeria

Genes involved in sex determination and differentiation have been identified in mice, humans, chickens, reptiles, amphibians and teleost fishes. However, little is known of their functional conservation, and it is unclear whether there is a common set of genes shared by all vertebrates. Coelacanths, basal Sarcopterygians and unique “living fossils”, could help establish an inventory of the ancestral genes involved in these important developmental processes and provide insights into their components. In this study 33 genes from the genome of Latimeria chalumnae and from the liver and testis transcriptomes of Latimeria menadoensis, implicated in sex determination and differentiation, were identified and characterized and their expression levels measured. Interesting findings were obtained for GSDF, previously identified only in teleosts and now characterized for the first time in the sarcopterygian lineage; FGF9, which is not found in teleosts; and DMRT1, whose expression in adult gonads has recently been related to maintenance of sexual identity. The gene repertoire and testis-specific gene expression documented in coelacanths demonstrate a greater similarity to modern fishes and point to unexpected changes in the gene regulatory network governing sexual development.


Introduction
Two major processes take place in sexual development: sex determination and sex differentiation. The former process determines whether the bipotential primordium will develop into a testis or an ovary; the latter takes place after sex determination and involves the actual development of testes or ovaries from the undifferentiated gonad [1]. Sex determination is considered as a default pathway or as suppression thereof and initiation of the opposite pathway; in contrast, sex differentiation seems to result from the antagonistic relationship among the genes influencing testis or ovary development [2,3]. Recently it has emerged that sex-specific mechanisms, which are critical to maintaining the male or female identity of the testis and ovary, also operate in adult mammalian gonads [4][5][6]. Other organs besides the gonads may also acquire elaborate male- and female-specific differences. In vertebrates (with the possible exception of birds [7]) such secondary sexual traits are generally believed to be instructed exclusively by the developing testis or ovary through sex steroids, whereas in invertebrates each somatic cell seems to have an inherent sexual identity [8]. Compared with eutherian mammals, sex steroids and the proteins involved in their metabolism and binding play an earlier role in the sex differentiation process of fish, amphibians, reptiles, birds, and marsupials [9][10][11][12][13][14][15][16][17][18][19][20].
In vertebrates sexual development is determined by two main factors: either the genetic makeup of the individual or the environment, through the influence of temperature during development, nutrients, pH, etc. [21][22][23]. It has been demonstrated that in mammals the consecutive processes of sex determination, gonad differentiation and identity maintenance are brought about by a complex network of transcription factor interactions and signalling molecules; a master regulator upstream then directs the network towards male or female [24]. The male-determining gene in most mammals is the Y chromosome SRY gene, which however has only been detected in placental mammals [25]. In chickens (and possibly all birds) the master regulator of sexual development is Dmrt1; its homologues are dmrt1bY (or DMY) in the Japanese ricefish (medaka, Oryzias latipes) [26,27] and DM-W in the frog Xenopus laevis [28]. In several fish species this function is served by gonadal soma-derived factor (GSDF) [29], anti-Müllerian hormone (AMH) [30], anti-Müllerian hormone receptor (AMHR2) [31], or other genes.
In contrast to the variety of upstream sex determinants, genome-wide studies and homology cloning in teleost fishes, amphibians, reptiles and birds have suggested that the downstream components of the network have a conserved function. This has inspired the paradigm that in sex determination during evolution “masters change, slaves remain” [32][33][34]. However, it is unclear how far back in evolutionary history this applies, and in particular when and how the vertebrate sex regulation network evolved and whether the relevant genes represent an ancient, conserved mechanism or were repeatedly and independently recruited to the process.
The unique opportunity to examine high-quality RNA from the Indonesian coelacanth Latimeria menadoensis for transcriptome analysis of testis and liver tissue, and the availability of the whole genome sequence of the African coelacanth Latimeria chalumnae, enabled us to gain insights into a “living fossil” that is held to be among the nearest living relatives of tetrapods.
The coelacanth gene repertoire and expression profiles were much more similar to those of modern fish than to those of tetrapods, although they may also represent an intermediate condition; these data unexpectedly suggest that the major evolutionary changes accompanying the transition to terrestrial life were also involved in gonad development.

Methods
The genome of the African coelacanth L. chalumnae has recently been sequenced (project accession PRJNA56111) [48] and is available in the framework of the whole genome shotgun (WGS) sequencing project at http://www.ncbi.nlm.nih.gov and http://www.ensembl.org. The transcriptome of its Indonesian congener, L. menadoensis, has been described by Pallavicini and colleagues [49] and Canapa and co-workers [50]. Briefly, an adult male specimen of L. menadoensis weighing 27 kg was caught in a shark net near Talise island, Indonesia [51]. Liver and testis were collected immediately after death and preserved in RNAlater (Applied Biosystems, Warrington, UK). Good-quality RNA samples, extracted using Trizol Reagent (Ambion/Life Technologies, Carlsbad, CA) following the manufacturer's instructions (RNA integrity number was 7.0 for testis and 6.6 for liver), were used to generate cDNA libraries for transcriptome sequencing on the Illumina Genome Analyzer II platform (Illumina, San Diego, CA, USA). After filtering high-quality reads, removing reads containing primer/adaptor sequences, and trimming read length, the Illumina 100-bp paired-end reads were assembled on a 4-core server (72 GB RAM). CLC Genomic Workbench 4.5.1 (CLC Bio, Katrinebjerg, Denmark) and Trinity [52] were used for de novo assembly of short reads. Contigs confirmed and improved by both methods were pooled in a high-quality set.
To identify the coelacanth homologues of the genes involved in sexual development, the corresponding Xenopus tropicalis, Gallus gallus, Danio rerio and Homo sapiens sequences were BLASTed on the L. menadoensis transcript dataset. The identity of each retrieved putative transcript was confirmed through NCBI BLAST by homology. BLASTx analyses allowed transcript completeness to be established (coding sequences, CDSs).
The L. menadoensis sequences were then BLASTed against the WGS dataset of L. chalumnae, to identify the genomic scaffolds of the African coelacanth containing them. Species divergence was calculated with PAUP on the matching sequences as p-distance percentage; the Ka/Ks ratio was calculated with KaKs_calculator [53] using the Nei and Gojobori method [54]. The synonymous distance was calculated using MEGA5 [55] by applying the uncorrected modified Nei and Gojobori method [56] to the concatenated CDSs, aligned with ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalw2/; [57]).
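As a rough illustration of the p-distance percentage mentioned above, the following Python sketch counts mismatched positions between two aligned nucleotide sequences and reports their proportion ×100. The function name, the decision to skip columns containing gaps or ambiguous bases, and the toy sequences are assumptions made for illustration only; PAUP's own handling of gaps and ambiguities may differ.

```python
def p_distance_percent(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance (x100) between two aligned nucleotide sequences.

    Columns where either sequence has a gap or a non-ACGT symbol are skipped
    (an assumption of this sketch, not necessarily what PAUP does).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * mismatches / compared


# Hypothetical toy alignment: one mismatch over nine comparable columns.
toy_a = "ATGGCC-TAA"
toy_b = "ATGGCTATAA"
print(p_distance_percent(toy_a, toy_b))
```

Applied to a real transcript/scaffold alignment, a value of 0 would mean that the two coelacanth sequences are identical over the compared columns.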
The predicted transcripts of L. chalumnae were collected from ENSEMBL (http://www.ensembl.org/Latimeria_chalumnae/Info/Index). The GSDF CDS was obtained manually by aligning L. menadoensis transcripts to the L. chalumnae genome; FGF9, not found in the transcriptome and not annotated in ENSEMBL, was obtained manually by BLASTing annotated amino acid sequences of other species to the L. chalumnae WGS. GSDF and FGF9 putative transcripts were confirmed by homology through NCBI BLAST.
L. chalumnae and L. menadoensis transcripts were compared by ClustalW2 alignment; a graphical representation of each sequence pair is reported in Figures S1A and S1B.
Gene ontology (GO) terms involved in sex determination and sex differentiation (GO0007530 and GO0007548, respectively) were selected and L. menadoensis orthologues to D. rerio, X. tropicalis, G. gallus, Canis familiaris, Bos taurus, Sus scrofa, Mus musculus, Rattus norvegicus and H. sapiens counterparts counted.
L. menadoensis liver and testis gene expression levels were calculated using the CLC Genomic Workbench 4.5.1 by mapping paired reads from the transcriptome on the assembled transcripts, and given as Fragments Per Kilobase of exon per Million sequenced fragments (FPKM). The lack of some transcripts in the assembled transcriptome may depend on poor gene expression, hence on the limited number of reads, which prevented assembly of a contig. In such cases ENSEMBL gene predictions were used to determine absence or low expression taking into account the predicted transcripts. The FPKM value is therefore still a function of transcript length rather than gene length. The FPKM value was calculated for DMRT3, FOXL2, aromatase, WNT4, and CYP11B on ENSEMBL transcript predictions as well as on the inferred sequence of L. chalumnae FGF9.
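The FPKM unit used above can be sketched in a few lines of Python. This follows the standard definition (fragments mapped to a transcript, divided by transcript length in kilobases and by the total number of mapped fragments in millions); the function and parameter names are hypothetical, the numbers are invented, and the exact normalisation applied internally by CLC Genomic Workbench may differ.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 100 fragments on a 2,000-bp transcript in a
# library of 10 million mapped fragments.
print(fpkm(100, 2_000, 10_000_000))
```

Because the denominator uses the length of the transcript that the reads were mapped to, the same read count yields a different value when an ENSEMBL-predicted transcript of different length is substituted for an assembled contig.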
Besides genes expected to be involved in sexual development, the expression levels of some house-keeping genes, i.e. phosphoglycerate kinase (PGK), heat shock protein class B (HSPCB), and the ribosomal proteins RPS27, RPL19, RPL11, RPL32, chosen according to Eisenberg and Levanon [58], were also evaluated.
Correct assignment to evolutionarily related gene groups was established by phylogenetic analysis. Sequences of SOXE, FGF9/16/20, and TGF-β groups of other vertebrates were retrieved from the NCBI protein database and ENSEMBL. Multiple alignments were performed with ClustalW2 using default parameters. Phylogenetic trees were obtained using Bayesian Inference (BI) and Maximum Parsimony (MP) methods. BI analysis was performed with MrBayes 3.1.2 [59] by applying the amino acid model of Dayhoff et al. [60] to the SOXE and TGF-β groups and the one by Jones et al. [61] to the FGF9/16/20 group. Parameters were set to 1,000,000 generations, sampling every 100; burn-in was set at 2,500 and stationarity was defined when the average standard deviation of split frequencies reached a value <0.009.
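The BI run settings quoted above imply a simple bookkeeping of sampled trees, sketched below in Python. The sketch assumes that the burn-in figure of 2,500 counts sampled trees rather than generations, and it ignores the initial generation-0 sample; both points are assumptions, and the helper name is hypothetical.

```python
def mcmc_sample_counts(generations: int, sample_every: int, burn_in_samples: int):
    """Return (total sampled trees, trees retained after burn-in)."""
    total = generations // sample_every
    if burn_in_samples > total:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return total, total - burn_in_samples


# Settings reported in the text: 1,000,000 generations, sampling every 100,
# burn-in of 2,500 (assumed here to be counted in samples).
total, retained = mcmc_sample_counts(1_000_000, 100, 2_500)
print(total, retained)
```

Under these assumptions, a quarter of the sampled trees are discarded before posterior probabilities are summarised.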
MP analyses were performed with PAUP [62] by applying heuristic search with tree bisection-reconnection (TBR) branch swapping and random stepwise additions with 100 replications; 1,000 bootstrap replicates were calculated. Only minimal trees were retained. The outgroup, accession numbers, and constant, parsimony informative, and parsimony non-informative sites are reported in the legend to each phylogenetic tree.
Table 1 footnotes. 1 Fragmented contig. 2 Partial CDS. 3 Number of exons from the alignment of L. menadoensis transcripts to the L. chalumnae genome. Where the transcript carries only a partial CDS, the number of exons is partial. 4 Divergence between the two coelacanth sequences calculated as p-distance ×100. 5 The L. chalumnae GSDF gene is split between scaffold JH127632 and contig AFYH01270444. 6 The L. chalumnae SF-1 gene is split between scaffold JH126572 and contig AFYH01271535. doi:10.1371/journal.pone.0056006.t001
Conserved syntenic blocks were inferred from ENSEMBL annotation of putative CYP11B (Figure S2), DMRT1, FGF9, FGF16, and FGF20 flanking regions from some sequenced vertebrate genomes. Gene sizes and distances were calculated on the basis of the annotated coordinates of each element. Scaffolds containing FGF9 and flanking genes (EFHA1 and ZDHHC20) conserved in tetrapods were identified by homology through tBLASTn on L. chalumnae WGS data.
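Deriving gene sizes and intergenic distances from annotated coordinates, as described above, amounts to simple interval arithmetic. The Python sketch below assumes 1-based inclusive start/end coordinates on a single scaffold; the `Gene` type, the function names, and the coordinates are hypothetical and are not taken from the ENSEMBL annotation.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def gene_size_kb(gene: Gene) -> float:
    """Length of the annotated gene span in kilobases."""
    return (gene.end - gene.start + 1) / 1000


def intergenic_distances_kb(genes: List[Gene]) -> List[Tuple[str, str, float]]:
    """Distance in kb between each pair of neighbouring genes, ordered by start."""
    ordered = sorted(genes, key=lambda g: g.start)
    distances = []
    for left, right in zip(ordered, ordered[1:]):
        gap_bp = max(0, right.start - left.end - 1)  # overlapping genes give 0
        distances.append((left.name, right.name, gap_bp / 1000))
    return distances


# Hypothetical coordinates for two neighbouring genes on one scaffold.
genes = [Gene("geneB", 5_001, 8_000), Gene("geneA", 1_001, 3_000)]
print([(g.name, gene_size_kb(g)) for g in genes])
print(intergenic_distances_kb(genes))
```

Sorting by start coordinate first means the input order of the annotation records does not matter.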

Results
GO analyses of 'sex determination' and 'sex differentiation' term annotations of the L. menadoensis transcriptome were conducted and the results compared to selected vertebrate genomes (Tables S1 and S2); 25 contigs were identified as orthologues of a GO0007530 (sex determination) annotation, and 297 contigs were orthologues of the GO0007548 (sex differentiation) annotation.
In this study we examined 33 genes with substantial evidence of involvement in sex determination and differentiation (Supplementary notes). CDSs were retrieved from the L. chalumnae genome and the L. menadoensis testis and liver transcriptomes (Tables 1 and 2) and their expression levels assessed. The putative orthology status of closely related genes was confirmed by tree topologies obtained by phylogenetic analysis. Furthermore, the instances of microsynteny conservation described in other vertebrates for DMRT1 [36] and FGF9/16/20 [63,64] were analysed in the two coelacanths.
To establish whether the sequence information from L. menadoensis and L. chalumnae could be combined, their genetic distance was determined by comparing the transcripts of the former to the genomic sequences of the latter. The distance, calculated over all matching sequences, ranges between 0% and 0.826%, divergences being due mainly to mutations, insertions or deletions in untranslated regions (UTRs). Point mutations affecting the transcript coding region are predominantly synonymous (Tables 1 and 2). The synonymous distance calculated over the whole gene set was 0.0019 (standard error 0.0005). These findings showed that the data of the two species can be pooled and investigated together.
ENSEMBL prediction recovered 23 out of 25 genes in the L. chalumnae genome annotation. The two missing sequences were inferred manually from the genome assembly: one, FGF9, was identified by comparison with orthologous sequences of other species, and the other, GSDF, by aligning an L. menadoensis transcript to WGS contigs of L. chalumnae. Fifteen of the 23 predicted transcripts of L. chalumnae carried a complete CDS whereas 8 were partial. The manually inferred L. chalumnae FGF9 covers the complete CDS, whereas the L. chalumnae GSDF homologue is incomplete (about 75% of the CDS).
The testis and liver transcriptomes of L. menadoensis contain 22 transcripts. Half of the contigs carried a complete CDS, the other half were partial or fragmented. Transcripts of 3 genes, FGF9, CYP11B, and DMRT3, were not found in liver and testis (Table 1). The male sex development sequences of L. menadoensis and L. chalumnae are compared in Figure S1A.
DMRT6 was the most highly expressed transcript among the 25 male sex development genes analysed (37.79 FPKM in testis, no expression in liver) and one of the 2,000 most abundant transcripts among the 61,000 plus contigs measured in testis.
DMRT1, a major gene in male development, plays a key function in fish [65,66], chickens [67,68], and reptiles [69]. Alignment of L. menadoensis transcripts to the L. chalumnae genome (Figure 3A) identified 5 exons which exceeded the ENSEMBL predicted transcript by 1,572 bp at the 3′ end (Figure 3B). The DM domain is encoded in the first annotated exon. The long 3′UTR harbours a 320-bp region containing a low-copy interspersed repeat.
Table 2 footnotes. 1 Fragmented contig. 2 Partial CDS. 3 Number of exons from the alignment of L. menadoensis transcripts to the L. chalumnae genome. Where the transcript carries only a partial CDS, the exon number is partial. 4 Divergence between the two coelacanth sequences calculated as p-distance ×100. 5 The ERα gene in the L. chalumnae genome is split among scaffolds JH129227, JH129408, JH129637, and JH133026. doi:10.1371/journal.pone.0056006.t002
The size of the DMRT1 gene in the L. chalumnae genome is >152 kb (Figure 3A), close to the 127 kb gene of H. sapiens (ENSEMBL annotation) but spanning a much longer range than the 3 kb gene of Crocodylus palustris [70], the 45 kb gene of D. rerio [71], and the 53-58 kb gene of G. gallus ([72], ENSEMBL). Moreover, the lack of a 5′ UTR (Figure 3B), which in other fish is transcribed in the so-called exon 0 [73], both in sequences from the transcriptome and in the ENSEMBL prediction, suggests the existence of another exon (which would further extend the genomic locus).
Brunner and colleagues [36] previously reported that the gene order around the DMRT1 gene, involving two other DM domain genes, DMRT2 and DMRT3, and the gene KANK1 (KIAA0172), was strictly conserved. A similar micro-synteny conservation was also noted in the L. chalumnae genome when the genomic scaffold JH127237 (1,057,921 bp), from position 608,000 to 941,000, was compared to other vertebrate chromosomes (Figure 3C). Interestingly, this region is linked to the Z gonosome in G. gallus (where DMRT1 is pivotal in male development) and to the X5 gonosome in Ornithorhynchus anatinus, whereas in other species of the actinopterygian and sarcopterygian lineages it is located on an autosome. To date it has been impossible to identify sex chromosomes in the Latimeria karyotype [74] or to relate the scaffold containing DMRT1 to a definite chromosome.
DMRT1 is the second most abundantly expressed gene in testis (11.84 FPKM units) and among the 10% most abundantly expressed transcripts (Figure 2A) of those analysed.
SOX9 is a transcription factor activating AMH; together with DMRT1 it inhibits WNT4 and FOXL2. In mammals it is activated by another SOX family protein, SRY, whereas in other vertebrates it is mainly regulated by SF-1 and DMRT1; together with SOX8 and SOX10 it belongs to SOX protein subgroup E. Phylogenetic analysis (Figure 4) of SOX E proteins from several vertebrates yielded a tree topology with 3 major clades corresponding to the 3 genes. In the SOX9 and SOX10 clades Latimeria sequences comprise a sister group of tetrapods, while the relationship of the Latimeria SOX8 was not clearly resolved given its phylogenetic position. SOX9 and SOX10 were more strongly expressed in testis than in liver (Figure 2A; FPKM: 11.60 and 1.38 for SOX9, FPKM: 2.25 and 0.04 for SOX10), whereas SOX8 expression was scanty in L. menadoensis liver (Figure 2B).
In mammals FGF9 has an important function in male development, creating a positive feedback cycle with SOX9 and inhibiting the WNT4 pathway in testis [75]. It has not yet been detected in teleosts and seems to be replaced by FGF20b [63,64] in sexual development. Interestingly, we found an FGF9-like sequence in L. chalumnae. To confirm the orthology relationships of the putative Latimeria FGF9, FGF16, and FGF20, sequence comparisons were performed and the conserved synteny arrangements of the flanking regions investigated (Figure 5). In tetrapods the two blocks harbouring FGF9 or FGF20 are characterized by an EFHA and a ZDHHC gene upstream of the FGF genes. Extensive gene-deserted regions are found downstream of FGF9, 16 and 20. In teleosts (where FGF9 is absent) the other genes forming the microsyntenic cluster are distributed on different chromosomes. In L. chalumnae the FGF9 cluster is split between two scaffolds whose colocalization on the same chromosome cannot as yet be confirmed. However, the proximity of a putative EFHA1 coding fragment upstream of the 5′ end of FGF9 suggests that the Latimeria FGF9 follows the tetrapod pattern.
Phylogenetic analysis of the FGF9/16/20 group (Figure 6) uncovered three major clades corresponding to the 3 genes. The exact position of the L. chalumnae FGF20 sequence is unresolved; like the X. laevis orthologue it is paraphyletic to teleosts and tetrapods. As expected, the coelacanth FGF16 sequence is basal to the tetrapods. However, the position of the Latimeria FGF9, albeit firmly nested within the FGF9 tetrapod clade, does not reflect its phylogenetic position in the taxonomic group.
Unexpectedly, neither FGF9 nor FGF20 expression was found in L. menadoensis testis.
GSDF, a recently described gene that appears to be critically involved in the development of male teleosts [29,39,40,45], has not been found in tetrapods and no sarcopterygian homologue has yet been described. However, BLAST analysis of teleost GSDF in the L. menadoensis transcript database suggested a putative GSDF gene, whose identity was confirmed by BLASTx analysis. Despite low similarity values (29% identity, 49% positive matching with Oncorhynchus mykiss GSDF NP_001118051.1, and 28% identity and 50% positive matching with O. latipes GSDF NP_001171213.1), BI and MP analyses reliably assigned the sequence to the teleost GSDF clade (Figure 7). Besides GSDF the phylogenetic analysis included two other proteins of the TGF-β family, AMH and inhibin-α, selected for their close relationships to GSDF [39]. A multiple amino acid alignment of the conserved TGF-β domain of the 3 genes disclosed that the L. menadoensis GSDF is a sister group of teleost GSDFs, with a posterior probability of 100 in BI analysis and a bootstrap value of 97 in the MP tree (Figure 8). The lack of a glycine, a diagnostic amino acid not found in the GSDF protein [45], in a cysteine knot further confirms the inclusion of the L. menadoensis sequence in the GSDF group, the first homologue to be described in the sarcopterygian lineage.
BLAST analysis of L. menadoensis GSDF on the L. chalumnae genome allowed identification of a genomic counterpart that was found partly on contig AFYH01270444 and partly on scaffold JH127632, with an intervening gap of 171 bp. The L. menadoensis GSDF is strongly expressed in testis but is not expressed in liver (Figure 2A).
ENSEMBL prediction recovered all 8 gene sequences in the L. chalumnae genome. Four transcripts (ERβ, CTNNB1, WNT4, and FOXL2) have a complete CDS; only two codons are missing at the 5′ end of FST; RSPO-1 and aromatase are partial, whereas ERα, subdivided into 4 different scaffolds in the WGS, could be only partially identified. Analysis of the L. menadoensis transcriptome yielded 3 complete CDS sequences (CTNNB1, ERβ, and FST) and 2 fragmented CDSs (RSPO-1 and ERα), whereas 3 transcripts were missing (FOXL2, WNT4, and aromatase).
The female sex development sequences of L. menadoensis and L. chalumnae are compared in Figure S1B. Their expression values in L. menadoensis testis and liver are shown in Figure 9. As expected, WNT4, FOXL2 and aromatase, held to be responsible for female development and pathway maintenance, were not expressed in testis. CTNNB1, FST, and ERβ were strongly expressed in liver (56.08, 27.33, and 12.93 FPKM, respectively); the expression of FST and CTNNB1 was expected, because their expression is ubiquitous [76]. Finally, ERβ liver expression in the L. menadoensis specimen, a male individual, was unexpected.

Discussion
In this study a set of 33 genes held to be critically involved in sexual development were isolated and characterized for the first time in coelacanths. Comparison of the gene sequences of the two Latimeria species confirmed the very slow rate of gene evolution that has recently been documented in HOX genes [77], although the latter genes are known to evolve particularly slowly. The 33 genes examined belong to a range of different families, thus providing valuable information.
Interpretation of our data is of course limited by the fact that they come from a single adult individual. However, given the importance of this living fossil in understanding tetrapod and fish evolution, and the exceptional opportunity provided by the availability of high-quality RNA from a specimen of an endangered species, we nonetheless cautiously draw some conclusions. Ka/Ks analysis indicated that no gene in the set studied here is under positive selection in coelacanths. A totally unexpected finding was the very high DMRT6 expression in testis, which was actually the most abundant male-specific transcript. To date the gene had only been found in amniotes and is not annotated in Xenopus or in any fish genome. This phylogenetic pattern could be explained by its being a newly arisen paralogue of the DMRT family at the base of amniote vertebrates. Detection of a bona fide DMRT6 homologue in Latimeria points to a much earlier origin of the gene and supports a possible origin from the 1R/2R whole genome duplication events that occurred in ancestral vertebrates [78] and a subsequent, repeated loss in the teleost fish and amphibian lineages and even in basal chordates. Since information on DMRT6 expression is quite scanty, the interpretation of some of these data is merely speculative. In mouse embryo it is expressed in the developing brain but not in the gonads [43]. In the human microarray database (https://www.genevestigator.com) it is highly expressed exclusively in ovary and testis, whereas studies of mouse organs have disclosed that only erythroblasts and oocytes show elevated expression. Whatever its original function, it is reasonable to assume that the role of DMRT6 was taken over by other members of the gene family, and that it has ceased to be required in those lineages where it is no longer extant. Its persistence in Latimeria may indicate an important function in male (and possibly female) development which, according to current knowledge, was then at least partially conserved in amniotes. Our findings suggest its being a putative novel gene in the gonad regulatory network.
The high DMRT1 expression found in Latimeria testis and its lack of expression in liver is in line with its expression pattern and important role in testis development and in maintenance of the male gonad identity documented in vertebrates, from fish to mammals [65,66]. In teleost fish adult testis DMRT1 is found in germ cells, in somatic cell types or both [65]. Unfortunately, RNA-Seq transcriptome data provide no information on the cell type expressing DMRT1 in coelacanth testis. In medaka a duplicated version of DMRT1 on the Y chromosome, designated dmrt1bY, is the master male sex-determining gene [26,27]. Its major function appears to be the suppression of germ cell proliferation at the critical sex-determining stage in males [79]. In adult testes it is dramatically downregulated [80], and its high expression suggests that only the autosomal DMRT1 (dmrt1a in medaka) may function in mature testis. As in all the teleosts studied so far, a single DMRT1 copy was found in L. menadoensis, suggesting that in coelacanths it may not serve a major role in primary sex determination, but may do so in testis differentiation and adult testis function.
The TGF-β family member GSDF is an important gene in teleost fish gonad development and displays much higher expression in testis than in ovary [39,40]. A duplicate of GSDF may actually have become the master male sex determination gene in Oryzias luzonensis [29]; there is strong evidence that in O. latipes the master male sex-determining gene dmrt1bY upregulates GSDF and that upregulation correlates with early testis differentiation [45]. No GSDF homologue has yet been identified outside teleosts. Identification in our study of a bona fide GSDF sequence in Latimeria and its high expression in testis (which also points to its functional conservation) suggests that the gene arose already at the base of the fish lineage, but was later lost during tetrapod evolution. GSDF thus appears to be an ancestral male sex-determining gene. In the absence of functional data on GSDF function in fish, it remains unclear whether during tetrapod testis development another TGF-β family member may have taken over the function it exerted in teleosts and coelacanths.
The high expression in the L. menadoensis testis transcriptome of SOX9, SOX10, WT1, AMH, DHH, SF-1 and SRD5A1 and 3 (at least compared to liver), the low expression of AMHR2, and the absence of the female factors FST, RSPO-1, WNT4, FOXL2, aromatase, and oestrogen receptor transcripts are in line with their expression patterns documented in many vertebrate species and with their proposed function in sexual development.
In particular, the AMH/AMH-receptor system is of interest for Latimeria sexual development. In mammals and most likely in all tetrapods AMH induces Müllerian duct regression. Teleosts do not have Müllerian ducts, whereas lungfish and Latimeria possess oviducts that are homologous to those of tetrapods [84]. Despite the absence of Müllerian ducts, the AMH/AMH-receptor system has an important function in the manifestation of gonadal sex in teleosts, because in medaka AMH signalling is crucial in regulating germ cell proliferation in early gonad differentiation [85]. Given that AMH and AMH-receptor are expressed in L. menadoensis adult testis, the AMH signalling system is present and probably active there, as in adult teleosts [86,87,88], whereas in mouse testis the system is downregulated before sexual maturity [89].
Several of the 33 genes tested, all of which are involved in sex determination and differentiation in other organisms, were found to be abundantly expressed in the liver transcriptome. The high CTNNB1 levels were expected, due to the ubiquitous function of this signal transducer of the WNT pathway. High FST expression agrees with its expression in all vertebrates and with the finding that it is required for liver cell growth homeostasis in mice [90]. This non-gonadal function of the gene may be conserved in coelacanths. Similarly, the transcription factor GATA-4, besides a role in gene regulation in testis development [91], is also involved in the control of a number of liver genes, explaining why transcripts of the coelacanth homologue were found in both tissues. In contrast to coelacanths, where 5α-reductase 2 is highly expressed in liver, the 5α-reductase 1 isoform is differentially regulated by androgens and glucocorticoids in rat liver, resulting in high expression in this tissue, while 5α-reductase 2 is preferentially expressed in gonads [92]. This may indicate lineage-specific subfunctionalization of the isozymes during evolution.
The absence of SOX8 expression in Latimeria testis was unexpected. In other vertebrates, including teleost fish, it is readily detected in this organ, and in mammals it has been assigned an important function in the FGF9/SOX9 interaction loop to maintain Sertoli cell identity by acting redundantly to SOX9 [6,93]. Such a back-up function does not seem to be required in Latimeria testis maintenance, or may have been lost in the extant coelacanth lineage. In medaka SOX9 is required for germ cell proliferation and survival, but not for testis determination [94]. Together with the other L. menadoensis findings this may indicate that the sex-determining function was acquired later in the tetrapod lineage, after the split of the teleost and coelacanth lineages.
Intriguing data were found for FGF9 and 20, which together with FGF16 constitute a gene subfamily of paracrine FGFs. The critical role of FGF9 in mammalian testis development is well established and appears to be conserved in all tetrapods. On the other hand, the gene is not found in any teleost genome ([63,64], ENSEMBL), unlike FGF16 and 20 (the latter being duplicated due to the teleost genome duplication). In the amphioxus an FGF gene is basal to the three FGFs in tetrapods [95]. FGF9 could thus be a later duplicate of either FGF16 or 20, and its role in testis development could be interpreted as an innovation arising in tetrapods. However, identification of FGF9 in Latimeria supports an origin during the 1R/2R whole genome duplication events that took place in ancestral chordates and its loss in the lineage leading to teleosts. In the teleost Oreochromis niloticus (Tilapia) FGF20b and FGF16 are both expressed in ovary, whereas only FGF16 is (poorly) expressed in testis [64]. Together with the complete absence of FGF9, FGF20 and FGF16 expression in L. menadoensis liver and testis, this indicates that the function of FGF signalling in testis, in particular the central role of FGF9, was acquired later in tetrapod evolution.
Surprisingly, the ERβ gene was expressed in the liver of the male coelacanth. A previous study of the same individual had disclosed expression of the vitellogenin genes vtgABI, II and III [50]. Vitellogenins are yolk proteins physiologically expressed in the liver of reproductive females upon induction by oestrogens. Thus expression of vitellogenins and oestrogen receptor indicates the presence of oestrogens in this male specimen. They could derive from environmental pollutants, as reported in a number of specimens from polluted waters; however, this individual lived in Bunaken Marine Park in submarine caves at a depth of 100 to 200 m, i.e. in a relatively protected environment. Alternatively, ERβ expression could be the result of a pathological condition, of a hormone imbalance due to ageing, or of a physiological feature of coelacanths.

Conclusions
Analysis of the coelacanth testis transcriptome, reported here for the first time, disclosed important new information on which genes involved in sexual development and testis differentiation in other organisms are present and expressed in this living fossil and on the evolution of this process in vertebrates. Interestingly, some genes that are generally considered critical for testis maintenance in all vertebrates, like SOX8 or a fibroblast growth factor gene from the FGF9/16/20 subfamily, do not play this role in Latimeria. This finding and the high GSDF expression found in the coelacanths make their transcript profile more similar to that of modern fish. In summary, the coelacanth testis transcriptome is expected to contribute further important information to reconstruct the ancestral tetrapod situation and indicates that evolutionary innovations for sexual development occurred already during the transition from water to land.

Figure 3. Conserved micro-synteny and structure of the DMRT1 genomic locus and transcripts. A) Genomic representation of DMRT1 on scaffold JH127237 of L. chalumnae. Grey box corresponds to gene. Small boxes and V signs represent the intron/exon map. B) Transcript representation of DMRT1 in L. menadoensis and L. chalumnae. Boxes: exons; V signs: introns; white box: DM domain; light grey box: 3′UTR; dashed box: putative transposable element contained in the 3′UTR. Dotted boxes represent missing exons in the ENSEMBL transcript prediction. C) Microsyntenic conservation of genomic blocks containing the DMRT1 gene. White pentagons represent DMRT1 genes. The pentagon tip points to the relative gene orientation. Numbers near the pentagons stand for gene size expressed as kb; numbers on lines represent intergene distance expressed as kb. ENSEMBL data: H. sa (Homo sapiens), M. mu (Mus musculus), O. an (Ornithorhynchus anatinus), G. ga (Gallus gallus), A. ca (Anolis carolinensis), L. ch (Latimeria chalumnae), D. re (Danio rerio), T. ru (Takifugu rubripes). L. chalumnae DMRT1 position was clarified using the L. menadoensis transcript, by integrating the L. chalumnae ENSLACT00000015034 coordinates. *In O. anatinus DMRT1 gene size was defined by comparison with other species. **Values obtained in G. gallus from the annotation of NC_006127.3 accession. doi:10.1371/journal.pone.0056006.g003

Figure 5. Analysis of micro-syntenic conservation in FGF9, FGF16 and FGF20 blocks. Micro-syntenic conservation of genomic regions containing the FGF9, FGF20 and FGF16 genes. White pentagons represent FGF genes. The pentagon tip points to the relative gene orientation. The grey mark on the top third of the figure indicates an EFHA1 putative sequence of Latimeria chalumnae. Numbers near pentagons stand for gene size expressed as kb; numbers on lines represent intergene distance expressed as kb. ENSEMBL data: H. sa (Homo sapiens), G. ga (Gallus gallus), A. ca (Anolis carolinensis), X. tr (Xenopus tropicalis), L. ch (Latimeria chalumnae), D. re (Danio rerio), T. ru (Takifugu rubripes). Syntenic blocks for FGF20 in L. chalumnae and X. tropicalis, and FGF16 in A. carolinensis are split between two different scaffolds. The ZDHHC15 genes belonging to the syntenic block of FGF16 in H. sapiens and X. tropicalis lie on the same chromosome or scaffold, but are far removed from the genomic locus of FGF16 and ATRX. *Genes missing in the ENSEMBL prediction. doi:10.1371/journal.pone.0056006.g005

Figure S1 A) Sequence pair comparison of male sex development genes. Sequence pair comparison of male sex-determining/differentiation transcripts from the L. menadoensis transcriptome and L. chalumnae ENSEMBL predictions. Boxes represent CDSs. Lines represent UTRs. Dashed boxes represent a missing part in the CDS. Green lines/boxes represent an inaccurate gene prediction or a mismatch between L. chalumnae and L. menadoensis sequences. Scale dimensions are preserved. B) Sequence pair comparison of female sex development genes. Sequence pair comparison of female sex-determining/differentiation transcripts from the L. menadoensis transcriptome and L. chalumnae ENSEMBL predictions. Boxes represent CDSs. Lines represent UTRs. Dashed boxes represent a missing part in the CDS. Green lines/boxes represent an inaccurate gene prediction or a mismatch between L. chalumnae and L. menadoensis sequences. Scale dimensions are preserved. (PDF)

Figure S2 Micro-syntenic conservation of CYP11B. Micro-syntenic conservation of genomic regions containing CYP11B genes. Black pentagons represent CYP11B genes. The pentagon tip points to the relative gene orientation. ENSEMBL data: H. sa (Homo sapiens), M. mu (Mus musculus), B. ta (Bos taurus), L. ch (Latimeria chalumnae), D. re (Danio rerio), T. ru (Takifugu rubripes). (PDF)

Table S1 Gene Ontology analysis of the “sex determination” term. (PDF)

Table S2 Gene Ontology analysis of the “sex differentiation” term. (PDF)

materials/analysis tools: MF AC MB MAB FB AMF AP MG GDM DMM GS EO MS. Wrote the paper: MF AC MB MAB EO MS.