The fourspine stickleback (Apeltes quadracus) has an XY sex chromosome system with polymorphic inversions on both X and Y chromosomes

  • Zuyao Liu,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Visualization, Writing – original draft

    Affiliation Division of Evolutionary Ecology, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland

  • Amy L. Herbert,

    Roles Data curation, Formal analysis, Investigation, Resources, Software, Visualization, Writing – review & editing

    Affiliations Department of Developmental Biology, Stanford University School of Medicine, Stanford, California, United States of America, Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, United States of America

  • Yingguang Frank Chan,

    Roles Funding acquisition, Methodology, Resources, Supervision, Writing – review & editing

    Affiliations Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany, Groningen Institute for Evolutionary Life Sciences (GELIFES), University of Groningen, Groningen, The Netherlands

  • Marek Kučka,

    Roles Investigation, Methodology

    Affiliation Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany

  • David M. Kingsley,

    Roles Funding acquisition, Resources, Supervision, Writing – review & editing

    Affiliations Department of Developmental Biology, Stanford University School of Medicine, Stanford, California, United States of America, Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California, United States of America

  • Catherine L. Peichel

    Roles Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – original draft

    * catherine.peichel@unibe.ch

    Affiliation Division of Evolutionary Ecology, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland

Abstract

Teleost fish are well-known for possessing a diversity of sex chromosomes and for undergoing frequent turnovers of these sex chromosomes. However, previous studies have mainly focused on variation between species, while comparatively little attention has been given to sex chromosome polymorphisms within species, which may capture early stages of sex chromosome changes. To better understand the evolution of sex chromosomes, we used the fourspine stickleback (Apeltes quadracus) as a model organism. Previous cytogenetic studies suggested that females of this species possessed a ZW heteromorphic sex chromosome system. However, genetic crosses and our whole-genome sequencing of three geographically distinct wild populations revealed that A. quadracus has an XY sex chromosome on chromosome 23. This chromosome has not previously been identified as a sex chromosome in any other stickleback species, indicating a recent sex chromosome turnover. We also identified two genes - rxfp2a and zar1l - as novel candidate sex determination genes. Notably, we observed inversions on both the X and Y chromosomes in different populations, resulting in distinctive patterns of differentiation between the X and Y chromosomes across populations. The new sex chromosome and intraspecies inversion polymorphisms observed in A. quadracus provide an excellent system for future work assessing the relative fitness effects of the inversions, which will enable testing theoretical models about the drivers of sex chromosome evolution and turnover.

Author summary

As compared to mammals and birds, teleost fish exhibit a very high level of diversity in their sex chromosomes, even among closely related species. Thus far, little attention has been paid to variation within species, although it offers a valuable opportunity to advance our understanding of the mechanisms underlying the formation and turnover of sex chromosomes. Through a quantitative trait locus (QTL) cross and sequencing diverse populations, we determined that instead of the previously reported ZW system, A. quadracus has an XY sex determination system on chromosome 23. Within the sex determining region, we identified rxfp2a and zar1l as putative sex determining genes. Notably, we also observed polymorphic inversions present on both the X and Y chromosomes that differ among populations. These observations represent a rare situation in which sex chromosomes remain polymorphic for sex-linked inversions, providing a unique opportunity to study the contribution of inversions to the divergence of sex chromosomes in the early stages of their evolution and the role of inversions in sex-specific selection.

Introduction

Sex determination systems are diverse across species, and genetic sex determination mechanisms associated with the presence of heteromorphic sex chromosomes have independently evolved many times across the tree of life [1]. There are two main types of sex chromosomes. When males are the heterogametic sex, as in mammals, females have two X chromosomes and males have an X chromosome and a Y chromosome. When females are the heterogametic sex, as in birds, females carry a Z and a W chromosome, and males have two Z chromosomes. Although some groups like mammals and birds have very stable sex chromosome systems, in other groups like frogs [2,3], lizards [4], and fishes [5–7], even closely related species have different sex chromosome systems [8].

This diversity of sex chromosome systems is due to sex chromosome turnover, which occurs either when an existing sex determination gene moves to a new chromosome or when a novel sex determination gene arises on a chromosome [9,10]. According to the classical model of sex chromosome evolution, the acquisition of a new sex determination gene on an autosome can lead to the loss of recombination between the newly evolving proto-X and Y (or Z and W) chromosomes, resulting in the accumulation of deleterious mutations on the sex-specific and therefore non-recombining chromosome (Y or W) and the eventual formation of the heteromorphic sex chromosome pair [11,12]. This process can be interrupted by sex chromosome turnover, which resets the cycle and initiates the process again [9,10]. Although the evolutionary forces driving these turnovers are still unknown, sex chromosome turnovers have been hypothesized to occur due to selection for linkage between sexually antagonistic alleles and a sex-determination locus [13], selection to purge deleterious mutations that have accumulated on sex chromosomes [14,15], selection to maintain unbiased sex ratios [16,17], or random genetic drift [16,18,19]. However, testing these hypotheses remains challenging because turnover events probably occur quickly; therefore, identifying sex chromosomes when they are experiencing such a turnover is unlikely [10].

The presence of polymorphic sex chromosome systems within species may provide an opportunity to observe turnovers in real time. Indeed, intraspecies sex chromosome variation has been found in different groups with young sex chromosomes, including frogs [2,20–22], fishes [23–28], and Drosophila [29,30]. Studies of these polymorphic systems have provided some insights into the evolutionary forces driving turnovers. For example, invasion of a new sex chromosome in cichlids is associated with linkage to a trait under sexually antagonistic selection [26]. Population-specific variation in the presence of a sex chromosome in a frog species is consistent with selection to purge deleterious mutations on the sex-specific chromosome [20]. Despite these limited examples, the evolutionary drivers and genetic mechanisms that underlie intraspecies variation in sex chromosomes are still mostly unknown.

A variety of sex chromosome systems have been identified in the species of the stickleback family (Gasterosteidae) that have diverged within the past 27 million years [31], suggesting that there have been recent sex chromosome turnovers. The three species in the genus Gasterosteus possess a conserved heteromorphic XY sex chromosome on chromosome 19, with the anti-Müllerian hormone Y (amhy) gene as the candidate sex determination gene [32,33]. The ancestral Y chromosome has independently fused to different autosomes in G. nipponicus and G. wheatlandi [7,34,35]. In a separate stickleback genus, an independent duplication of amh has also been identified as a candidate sex determination gene on chromosome 20 of Culaea inconstans [31]. Chromosome 12 is involved in an XY sex determination system in some Pungitius species, and a ZW sex determination system on chromosome 7 has been detected in P. sinensis based on genetic mapping [36–39]. Even so, sex chromosomes have not yet been fully identified in other species in Gasterosteidae; therefore, additional work is needed to explore the evolution and turnover of sex chromosomes in this family.

Fourspine sticklebacks (Apeltes quadracus) are of interest, as previous studies suggest they might possess a different sex chromosome than other stickleback species, but the sex chromosome has still not been identified. Initially, cytogenetic analysis of a population from Maine, USA indicated that A. quadracus has a heteromorphic ZW sex chromosome [40]. In the years following this study, conflicting evidence from new cytogenetic analyses was reported. Females from a Massachusetts (MA), USA population were found to have a heteromorphic sex chromosome, while no heteromorphic sex chromosome was identified in females (or males) from a Connecticut (CT), USA population [7,41]. These data suggested that the sex chromosome system might be polymorphic within A. quadracus, making this species an attractive target for further study of the evolution of sex chromosome turnover.

To identify the sex chromosomes in fourspine stickleback, we used several distinct approaches, including taking advantage of a published quantitative trait locus (QTL) cross done in wild populations of A. quadracus from Nova Scotia, Canada [42]. Additionally, we collected wild samples from three additional populations, two of which (MA and CT) were used for the previous cytogenetic studies, and one of which (a third population from Nova Scotia, Canada, NS hereafter) was used to generate a de novo genome assembly (S1 Fig). Then, for each population, we created crosses from a single mother and father per population and used pooled sequencing data (Pool-seq) from the crosses to identify the sex chromosome and sex determination region (SDR) in fourspine stickleback. For each population, we also generated haplotagging linked-reads sequencing data from wild individuals of both sexes. We further used these data to explore the variation on the sex chromosome among populations and identify candidate sex-determination genes.

Results

Genetic mapping of sex in a Nova Scotia (NS) intercross

Utilizing a previously described QTL cross between two A. quadracus populations from NS [42], we scored male vs. female status and mapped sex to chromosome 23 of the existing female assembly [43] (logarithm of odds (LOD) score 93.43, percent variance explained = 69.23%) (Fig 1A). Surprisingly, when we examined the genotypes of fish at the peak marker on chromosome 23, males overwhelmingly appeared to be the heterogametic sex, indicating that A. quadracus has an XY sex determination system (Fig 1B).

Fig 1. Identification of an XY sex chromosome system on chromosome 23.

(A) QTL mapping of sex identifies a strong signal on A. quadracus chromosome 23 in the female genome assembly. The horizontal line shows the threshold for obtaining genome-wide significance with 1,000 permutations of the data and α = 0.05. (B) Bar plots show a highly significant correlation of genotype to phenotype at the top QTL peak marker (P = 7.84e-75; Chi-square test), indicating that males are most likely the heterogametic sex. (C) Genomic distribution of fixation index (Fst) between males and females in wild populations from Connecticut (CT), Nova Scotia (NS), and Massachusetts (MA). The size of the sliding window is 20 kb and the step size is 10 kb. Chromosomes are indicated on the X-axis, and the Fst values are shown on the Y-axis. Purple and yellow regions indicate the different chromosomes.

https://doi.org/10.1371/journal.pgen.1011465.g001

De novo assembly and annotation of a male A. quadracus genome

The existing A. quadracus genome assembly used in the QTL mapping analysis was generated from a NS female [43]. Given the discovery of an XY sex determination system, we also generated a new high-quality assembly of a NS male genome from the same population using high-coverage PacBio HiFi and HiC reads. Raw HiFi read coverage was 116.79x (46.72 Gb in total) and HiC read coverage was 153.78x (63.05 Gb in total). The final assembly is 475.93 Mb, and it contains 1,261 scaffolds, including 24 chromosome-level scaffolds. The N50 length is 18.49 Mb, and the assembly quality assessed by BUSCO was relatively high with 97.4% completeness. The male and female assemblies are highly syntenic (S2 Fig). For chromosome 23, we constructed two haploid assemblies; the X chromosome assembly was identified by its greater synteny with chromosome 23 in the female assembly (S2 Fig).

We constructed a repeat library for A. quadracus using de novo-based approaches (see Materials and Methods). After masking the repetitive regions, the rest of the genome was annotated with evidence from brain RNA-seq data, homologous protein databases, and ab initio annotation, leading to 21,805 genes in the final version of the annotation. All analyses that follow are aligned to the X chromosome (chromosome 23) of this male reference genome; thus, all coordinates provided hereafter refer to the position on the X chromosome.

Genome-wide analysis confirms an XY sex determination system on chromosome 23

To confirm our findings from the QTL mapping, we first utilized Pool-seq data from crosses of each of three populations (CT, NS, and MA in S1 Fig). Because these data are from siblings of relatively small genetic crosses (S1 Table), the number of recombination events limits our ability to define the non-recombining SDR. For example, any SNPs that are in the recombining region in the father may look sex-linked in the offspring if they did not happen to recombine with the X chromosome. Hence, we also conducted linked-reads sequencing of additional wild samples from the same three populations (20 females and 20 males from the CT population, 15 females and 14 males from the NS population, and 13 females and 11 males from the MA population).

The ratio of sequencing depth between males and females shows no discernible differences on any chromosome in either the crosses (S3 Fig) or the wild population linked-read sequences (S4 Fig), suggesting that large structural insertions or deletions are not present. However, if smaller mutations specific to a sex-specific chromosome have accumulated, increased genetic differentiation between the two sexes as well as increased diversity within the heterogametic sex on the sex chromosome would be expected. Consistent with the QTL mapping data, the Weir and Cockerham’s fixation index (Fst) between males and females is elevated (albeit not to the expected value of 0.5 for a fully sex-linked SNP [44]; see below) on chromosome 23 in both the crosses (S5, S6 Figs) and the wild fish (Figs 1C, 2, S7) from all three populations. Furthermore, diversity (Pi) is elevated in males, but not females, on chromosome 23 in the CT and NS crosses (S8, S9 Figs) and the wild fish of all populations (Figs 2, S10). Higher diversity in males is consistent with an analysis of RNA-seq data from the NS cross by SEX-DETector [45], which showed more sex-linked transcripts with evidence of male heterogamety than female heterogamety (S2 Table). Thus, there is evidence that all populations have a shared XY sex determination system on chromosome 23.
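The expected value of 0.5 for a fully sex-linked SNP can be illustrated with a small calculation. The sketch below uses a population-level, Hudson-style Fst formula for a single biallelic SNP rather than the Weir and Cockerham estimator used in the analysis; the function name and the example allele frequencies are assumptions made purely for illustration.

```python
def hudson_fst(p1: float, p2: float) -> float:
    """Hudson-style Fst for one biallelic SNP between two groups.

    p1 and p2 are frequencies of the same allele in each group.
    Sample-size corrections are omitted, so this is a simplification
    and not the Weir and Cockerham estimator.
    """
    numerator = (p1 - p2) ** 2
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        return 0.0  # both groups are fixed for the same allele
    return numerator / denominator


# Fully sex-linked SNP: the Y-specific allele occurs only in males,
# and every male (XY) carries exactly one copy.
fst_fully_sex_linked = hudson_fst(p1=0.5, p2=0.0)

# Hypothetical X-linked variant at frequency q on X chromosomes and absent
# from the Y: females carry it at frequency q, males at q / 2.
q = 0.9  # assumed value, not taken from the data
fst_polymorphic = hudson_fst(p1=q / 2, p2=q)

print(fst_fully_sex_linked, fst_polymorphic)
```

Under these assumptions the fully sex-linked case gives exactly 0.5, while a variant that is not fixed on the X gives a lower value, which is the qualitative pattern described for chromosome 23.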

Fig 2. Patterns of genetic differentiation and diversity on chromosome 23 in wild populations.

(A) Genetic differentiation (Fst) between males and females along chromosome 23 was calculated with linked-reads sequencing data from three wild populations, with the Connecticut (CT) population in coral color, the Massachusetts (MA) population in yellow, and the Nova Scotia (NS) population in light blue. The coral and light blue lines represent the corresponding inversions identified in the CT and NS populations, and the yellow lines represent a shared inversion among all populations. (B) Distribution of genetic diversity (Pi) in males (red dots) and females (cyan dots) on chromosome 23 calculated using linked-reads sequencing data from the three wild populations. The grey region represents the shared sex-determination region (SDR) across the three populations. Note that all sequences are aligned to the X chromosome of the male assembly.

https://doi.org/10.1371/journal.pgen.1011465.g002

Different populations have different X- and Y-linked inversions

Despite the presence of a shared XY sex determination system, the patterns of differentiation and diversity on chromosome 23 differed among the populations. In the CT cross, the differentiation between males and females (S6 Fig) and genetic diversity in males (S9 Fig) are elevated from 7.50 to 12.50 Mb on the X chromosome. Differentiation and genetic diversity show similar patterns across the X chromosome in the wild CT population (Fig 2): differentiation between males and females is elevated from 7.50 to 12.50 Mb, and males have higher diversity than females in this same region. The sharp boundaries of elevated divergence and diversity suggest that there is a rearrangement between the X and the Y chromosome. Indeed, analysis of the linked-reads from the CT wild population reveals that there is an inversion between 7.50 and 12.50 Mb that is polymorphic in both sexes (Fig 2). Specifically, 17 of 20 females are homozygous for the inverted orientation, while 17 of 20 males are heterozygous for the inverted orientation (Tables 1, S3). Hence, we inferred that the inversion is on the X chromosome, which explains its existence in both sexes. The presence of this X-linked chromosome inversion explains the high genetic differentiation between males and females in both the cross (S6 Fig) and population data (Fig 2A). However, Fst does not reach the expected value of 0.5, as the inversion is not fixed in the CT population (Table 1). Further consistent with genetic differentiation resulting from the inversion, females and males heterozygous for the inversion show an elevated SNP density in this region relative to homozygous females or males without the inversion (S11 Fig).
The lack of elevated diversity in females from the CT cross (S9 Fig) suggests that the mother of this cross was homozygous for the X-linked inversion and that the father also had an X chromosome with the inversion such that all daughters were homozygous for the inversion on the X chromosome and all sons are heterozygous for the inversion (S12 Fig).
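The inheritance logic behind this inference can be sketched by enumerating the offspring of a single cross. The code below is only an illustration: the function name, the "inv"/"std" labels, and the assumption of no recombination within the inverted region are ours. It covers the parental genotypes inferred for the CT cross and, for contrast, a hypothetical cross in which only the mother carries one inverted X.

```python
from collections import Counter


def offspring_inversion_counts(mother, father_x, father_y):
    """Count inverted copies carried by daughters and sons of one cross.

    mother: tuple with the arrangement of her two X chromosomes,
            each either "inv" (inverted) or "std" (standard).
    father_x, father_y: arrangement of the father's X and Y chromosomes.
    Returns two Counters (daughters, sons) that map the number of inverted
    copies (0, 1, or 2) to the number of equally likely gamete combinations.
    """
    daughters, sons = Counter(), Counter()
    for maternal_x in mother:
        # Daughters receive the father's X; sons receive his Y.
        daughters[(maternal_x == "inv") + (father_x == "inv")] += 1
        sons[(maternal_x == "inv") + (father_y == "inv")] += 1
    return daughters, sons


# Genotypes inferred for the CT cross: mother homozygous for the X-linked
# inversion, father with an inverted X and a standard Y.
ct_daughters, ct_sons = offspring_inversion_counts(("inv", "inv"), "inv", "std")

# Hypothetical contrast: mother heterozygous, father without the inversion.
het_daughters, het_sons = offspring_inversion_counts(("inv", "std"), "std", "std")

print(ct_daughters, ct_sons)
print(het_daughters, het_sons)
```

In the first cross every daughter carries two inverted copies and every son carries one, so the sexes differ across the whole inverted region. In the second, sons and daughters inherit the inversion equally often, so no differentiation between the sexes is expected.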

Table 1. Genotype frequencies of inversion orientations by sex and population.

https://doi.org/10.1371/journal.pgen.1011465.t001

In the NS cross, genetic differentiation between males and females is elevated between 2.73 and 17.20 Mb (S6 Fig); however, higher genetic diversity is observed only in males between 7.50 and 8.21 Mb (S9 Fig). The results of Fst and genetic diversity using linked reads from the wild NS population further refine the regions of elevated differentiation. There is a moderate elevation in genetic differentiation between 4.72 and 12.75 Mb, with a pronounced peak between 7.50 and 8.21 Mb (Fig 2A). Elevated male diversity in the NS wild population is confined to the 7.50 to 8.21 Mb region (Fig 2B). Analysis of the linked-reads further identifies an inversion between 4.72 and 12.44 Mb that is heterozygous and found at high frequency only in males, indicating it is a Y-specific inversion (Tables 1, S3). The presence of this inversion is further confirmed by comparing the assemblies of the X and Y chromosomes in an NS male (Fig 3). Despite the presence of this large inversion on the Y chromosome, both the cross and wild population data show that male genetic diversity is concentrated within the region between 7.50 and 8.21 Mb, suggesting limited divergence between the X and Y chromosomes in the inverted region (Figs 2B, S6). The region of elevated diversity around 12 Mb (Fig 2B), located near the breakpoint of the Y-specific inversion, may reflect the presence of repetitive elements rather than true differentiation between the X and Y chromosomes, as diversity is increased in both sexes.

Fig 3. Shared and population-specific inversions on the X and Y chromosomes.

(A) Synteny map between X chromosome (ChrX) and Y chromosome (ChrY) from the A. quadracus NS male assembly. This comparison is based on homologous coding region sequences. Colored lines are gene pairs. Red lines represent the larger inversion on the Y chromosome, and blue lines represent the nested inversion that covers the SDR. (B) Synteny map between ChrX and ChrY from the A. quadracus male assembly. This comparison is based on full sequences. Blue dots represent forward alignments, and red dots represent reverse alignments. (C) Model for population-specific inversions on the A. quadracus sex chromosomes. Orange bars represent the sex chromosome pair on chromosome 23. Coral bars show positions of the X-specific inversion in both sexes in the CT population, the light blue bar shows the position of the Y-specific inversion in males in the NS population, and the grey boxes represent the SDR.

https://doi.org/10.1371/journal.pgen.1011465.g003

In the MA cross, genetic differentiation is elevated between 2.73 and 7.50 Mb, as well as from 12.50 to 17.20 Mb, with both sexes displaying high levels of genetic diversity within these regions (S6, S9 Figs). However, in the linked-reads data from the wild MA individuals, Fst is primarily elevated between 7.50 and 8.21 Mb; the lower values of Fst in the MA population relative to the CT and NS populations likely reflect the lower sequencing quality of these samples (Figs 2, S7). This same region exhibits an enrichment of genetic diversity specifically in males in the MA population (Fig 2). Direct assessment of potential rearrangements was challenging due to the insufficient average sequencing depth in the MA wild population, preventing confident genotyping of the inversions. Nevertheless, there is a higher SNP density in females than in males in the region of the X-linked inversion (S11 Fig), consistent with segregation of the X-linked inversion in females of this population. In addition, in the region associated with the X-linked inversion in the CT population, low divergence between males and females, coupled with high genetic diversity in both sexes, was observed in the cross data (S6, S9 Figs). This pattern could be explained if the mother in the cross was heterozygous for the inversion, while the father lacked the inversion. In such a scenario, the inversion would be inherited equally by sons and daughters, resulting in no differentiation between the sexes and similar levels of genetic diversity in both (S12 Fig). These findings suggest that the X-linked inversion may be present at a low frequency within the MA population.

Summarizing the above evidence, we propose a model for the pattern of inversions in different populations (Fig 3C). The CT population has an X-specific inversion, whereas the NS population has a Y-specific inversion covering most of the Y chromosome. The two identified inversions in the CT and NS populations are derived, as they are inverted relative to the X chromosome assembly from the NS population, whose orientation appears to be ancestral by comparison to the genome assemblies of other sticklebacks (G. aculeatus and P. pungitius) and an outgroup species (Aulorhynchus flavidus) [43] (S13 Fig).

Defining the shared sex determination region (SDR)

To define the shared SDR across the three populations, we focused on the linked-read data from the wild populations. A common region with elevated Fst is observed across all three populations, where higher genetic diversity is also evident in males (between 7.20 and 8.21 Mb on the X chromosome, Figs 2, S7). Additionally, the homologous region on the Y chromosome is enriched in male-specific k-mers in all three populations, suggesting that they have a shared region with Y-specific alleles (S14 Fig). Furthermore, split reads and discordantly mapped read pairs suggest that there is a shared Y-specific inversion between 7.50 and 8.21 Mb (S3 Table). Due to the poor mapping quality, the genotype of individuals was not determined for this inversion. However, the synteny map between the X and Y chromosome assemblies in the NS population suggests that this inversion exists and is nested within the larger inversion on the Y chromosome (Fig 3A and 3B). The reconstruction of phylogenies of different regions on the sex chromosome suggests that this shared inversion arose before the emergence of population-specific inversions (S15 Fig). Thus, we identify a shared SDR between 7.50 Mb and 8.21 Mb, which likely contains the primary sex determination gene. As this region is covered by the inversion shared among populations, the SDR might have been formed by an inversion.
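The idea behind the male-specific k-mer analysis can be sketched in a few lines. The criterion used here (k-mers present in every male and absent from every female), the function names, and the toy reads are assumptions for illustration only. A real analysis would work from k-mer counts with abundance thresholds and would treat reverse complements as equivalent, both of which are omitted.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def male_specific_kmers(male_reads, female_reads, k):
    """K-mers found in every male individual and in no female individual.

    male_reads, female_reads: lists with one list of read strings per
    individual. The all-males / no-females rule is an illustrative choice.
    """
    female_pool = set()
    for reads in female_reads:
        for read in reads:
            female_pool |= kmers(read, k)

    shared_by_males = None
    for reads in male_reads:
        individual = set()
        for read in reads:
            individual |= kmers(read, k)
        if shared_by_males is None:
            shared_by_males = individual
        else:
            shared_by_males &= individual

    return (shared_by_males or set()) - female_pool


# Toy reads (hypothetical): two males and one female, k = 4.
males = [["ACGTTGCA"], ["TTACGTTG"]]
females = [["CACGTA"]]
print(sorted(male_specific_kmers(males, females, k=4)))  # ['CGTT', 'GTTG']
```

In this toy case ACGT is shared by both males but also occurs in the female read, so only CGTT and GTTG are retained as male-specific.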

Lack of extensive degeneration on the A. quadracus Y chromosome

Although the sex chromosomes of A. quadracus show no large-scale regions of read depth reduction (S3, S4 Figs), we explored whether there has been degeneration at a fine scale. One method to identify degenerated genes on the sex chromosomes involves comparing read depth of genes between males and females in wild populations using linked-reads data. If the ratio of male to female depth is less than 0.75, the gene is considered to be degenerate, indicating a loss of its content [35]. There are no genes on the X chromosome that are degenerate based on this criterion in either the CT or NS population (the MA population was excluded from these analyses given the low sequence coverage). We also looked for the presence of fixed loss-of-function mutations as evidence for degeneration. There are three loss-of-function mutations on the Y chromosome and two loss-of-function mutations on the X chromosome in the CT population, and three loss-of-function mutations on the Y chromosome and no loss-of-function mutations on the X chromosome in the NS population (S4 Table).
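The depth-ratio criterion can be written as a simple filter. A gene that is intact on both X and Y should show a male-to-female depth ratio near 1 when reads are aligned to the X, whereas a gene lost from the Y should show a ratio near 0.5, because males then carry a single copy; the 0.75 threshold lies midway between the two. The sketch below assumes per-gene depths have already been normalized for overall coverage, and the gene names and values are hypothetical.

```python
def degenerate_genes(male_depth, female_depth, threshold=0.75):
    """Return genes whose male:female read-depth ratio is below threshold.

    male_depth, female_depth: dicts mapping gene name to mean read depth,
    assumed to be normalized for overall sequencing coverage.
    """
    flagged = []
    for gene, m_depth in male_depth.items():
        f_depth = female_depth.get(gene, 0.0)
        if f_depth <= 0:
            continue  # ratio is undefined without female coverage
        if m_depth / f_depth < threshold:
            flagged.append(gene)
    return sorted(flagged)


# Hypothetical normalized depths for two genes on the X chromosome.
male = {"geneA": 30.0, "geneB": 15.0}
female = {"geneA": 30.0, "geneB": 31.0}
print(degenerate_genes(male, female))  # ['geneB']
```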

Finally, we investigated whether there has been degeneration at the level of gene expression using brain RNA-seq data from the NS population cross. Of the 859 genes on chromosome 23 that are expressed in brain, only eight genes are more highly expressed in females than males. Four of these genes are located within the Y inversion region, two of which are within the SDR (S5 Table), suggesting that lower expression in males is due to degeneration on the Y chromosome. However, the remainder of the 481 expressed genes in the Y inversion region and the 44 expressed genes in the SDR show no difference in expression between males and females. Taken together, these data suggest that the A. quadracus Y chromosome is at a very early stage of degeneration and that extensive degeneration has not occurred.

Rxfp2a and zar1l are candidate sex-determination genes in A. quadracus

Following the identification of the shared SDR, we conducted a thorough analysis of the genes within this region to identify potential candidates for primary sex determination. Our search for Y chromosome-specific genes revealed 8 predicted genes, of which four are in the SDR. Only three of the 8 genes had identifiable homologues based on BLAST analysis. Additionally, 34 genes were found to be present within the SDR on both the X and Y chromosomes. However, neither the four Y-specific genes nor the 34 genes shared between the X and Y chromosomes in the SDR show homology to known sex determination genes in other teleost species (S6 Table). There is also no homology between known sex determination genes and the five annotated genes present in the small region with elevated genetic diversity between 12.04 Mb and 12.17 Mb (Fig 2, S6 Table). Thus, we focused our analyses on the genes present on the X and the Y within the SDR; for each gene, we counted the number of synonymous and nonsynonymous changes, separately for both sexes. Because this is an XY system, we focused on genes with changes between the X and Y chromosome. In total, there are 23 candidate sex-determination genes with nonsynonymous changes located on chromosome 23 within the SDR (S6 Table). Among these genes, there are two genes of interest, rxfp2a (7.565 Mb – 7.592 Mb) and zar1l (7.515 Mb – 7.516 Mb), which are related to the development of the reproductive system (see Discussion for details). There are three amino acid changes in the Y chromosome allele of zar1l, and all of them are predicted to be deleterious according to SIFT analysis [46]. The dN/dS ratio is 0.289. For the rxfp2a gene, the dN/dS ratio cannot be calculated as there are no synonymous mutations. There are two nonsynonymous mutations located in the coding region, but they are not predicted to have a significant effect on the function of the gene.
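Why a dN/dS ratio is undefined without synonymous changes can be seen from a minimal version of the calculation. The sketch below divides the number of changes by the number of sites of each class and applies no correction for multiple substitutions; this is a simplification and not necessarily the method used for the values reported here. The site counts in the example are hypothetical.

```python
from typing import Optional


def dn_ds(nonsyn_changes: int, syn_changes: int,
          nonsyn_sites: float, syn_sites: float) -> Optional[float]:
    """Uncorrected dN/dS from counts of changes and sites.

    Returns None when there are no synonymous changes, because dS is
    then zero and the ratio is undefined.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if syn_changes == 0:
        return None
    dn = nonsyn_changes / nonsyn_sites  # nonsynonymous changes per site
    ds = syn_changes / syn_sites        # synonymous changes per site
    return dn / ds


# Two nonsynonymous changes and no synonymous changes: ratio undefined.
# The site counts (750 and 250) are made up for this example.
print(dn_ds(2, 0, nonsyn_sites=750, syn_sites=250))  # None

# Hypothetical gene with both kinds of change.
print(dn_ds(3, 3, nonsyn_sites=750, syn_sites=250))
```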

Discussion

Variation and turnover of sex chromosomes in A. quadracus

Previous evidence from cytogenetic studies suggested that populations of A. quadracus from Maine and Massachusetts have a heteromorphic ZW sex chromosome in females [7,40]. However, no heteromorphic sex chromosome was detected in metaphase spreads of males or females from Connecticut [41]. Using data from four genetic crosses and wild fish from three populations, we determined that A. quadracus has an XY sex determination system on chromosome 23. Our analyses included samples from both the Massachusetts and Connecticut populations used in the previous cytogenetic studies. Thus, the discovery that A. quadracus has an XY sex determination system is surprising, as the morphology of chromosomes in the MA population clearly indicated the presence of a heteromorphic pair in females [7]. One possible explanation for this result is if the MA individuals used for cytogenetics were heterozygous for the X-linked inversion that we identified in the CT population. If the inversion caused a change in chromosome morphology at the cytogenetic level, the chromosome pair may have appeared to be heteromorphic. Although we performed linked-reads sequencing of some of the MA females used for the cytogenetic study [7], we did not obtain high enough sequencing coverage to confidently assess inversion genotypes in these individuals. However, the MA cross data suggest that the X-linked inversion is present in the MA population (S6, S9, S12 Figs). As we do not have samples from the Maine population used in the older cytogenetic study [40], we could not assess whether the X-linked inversion is present in this population. If heteromorphic chromosomes in females are indeed due to heterozygosity for the X-linked inversion, it is not surprising that the CT females were homomorphic in the previous cytogenetic study since these females are mostly fixed for the inversion (Tables 1, S3). 
However, to fully resolve this mystery, more detailed molecular cytogenetic analyses of these different populations are needed, which will be facilitated by our identification of the SDR on chromosome 23 in A. quadracus.

Chromosome 23 has not previously been identified as a sex chromosome in sticklebacks, suggesting that there has been a sex chromosome turnover in A. quadracus. However, it is interesting to note that A. quadracus chromosome 23 is homologous to part of chromosome 7 in both G. aculeatus and P. pungitius [43]. The non-homologous part of chromosome 7 has fused to chromosome 12 in the Pungitius lineage, and there is evidence that chromosome 7 carries a female heterogametic (ZW) sex determination locus in P. sinensis and that chromosome 12 carries a male heterogametic (XY) sex determination locus in P. pungitius [36]. Given that the SDR in these two species is not homologous to that in A. quadracus, it is unlikely that they have the same sex determination gene. However, testing this hypothesis requires identifying the sex-determination gene in all three species. It is clear that A. quadracus has a different sex chromosome and sex determination gene from the Gasterosteus species, in which the master sex determination gene amhy is found on chromosome 19 [32,33,35,47], and from C. inconstans, in which there has been an independent duplication of amhy on chromosome 20 [31]. No duplicated copy of amh has been found in Pungitius species or in A. quadracus [31]. Further supporting a sex chromosome turnover in A. quadracus is the lack of extensive differentiation between the X and the Y or degeneration on the Y chromosome. Similar patterns on sex chromosomes in P. pungitius, P. sinensis, and C. inconstans hint that these turnovers also occurred quite recently [31,36,39], with evidence for a very recent additional turnover within P. pungitius [48]. The sex chromosomes in these species are in contrast to the Y chromosome in the Gasterosteus lineage, which evolved approximately 22 million years ago and has experienced extensive degeneration, albeit at different rates in the three species in this genus [32,33,35].
This variation in turnover among different stickleback lineages provides an opportunity to further investigate the factors that lead to sex chromosome stability in some lineages and turnover in others.

Two novel candidate sex determination genes

We identified two novel candidate sex determination genes in the shared SDR on chromosome 23. Although we identified four Y-specific genes within the SDR, none are homologous to the known sex-determining genes in teleost fish (S6 Table) and are therefore not promising candidates. However, further investigation of their functions in sticklebacks is necessary. The genes zar1l and rxfp2a are the only two genes within the SDR that have Y-specific SNPs across all populations studied and are known to play roles in the development of the reproductive system (S6 Table).

The rxfp2a (relaxin/insulin-like family peptide receptor 2) gene encodes a receptor that plays a crucial role in the development of placental mammals by binding with high affinity to the peptide INSL3 (insulin-like 3). This INSL3/RXFP2 pairing is essential for the proper descent of the testicles during development in mammals [49], and loss of rxfp2a results in cryptorchidism in mice [50–53]. Phylogenetic analysis of 71 mammalian genomes revealed that the rxfp2a gene is lost or non-functional in four afrotherian species that lack testicular descent [54]. In zebrafish, INSL3 regulates spermatogonia stem cell differentiation from mitosis to meiosis [55]. Since rxfp2a is a receptor for INSL3, mutations in this gene have the potential to disrupt the entire INSL3/RXFP2 signaling pathway, ultimately affecting spermatogenesis. In A. quadracus, rxfp2a exhibits Y-specific mutations in both the coding and regulatory regions. Given the conserved role of this gene in testes development and spermatogenesis, rxfp2a is a candidate gene that warrants further investigation.

As a maternal effect gene conserved across vertebrates, zar1 plays an important role in oocyte-embryo transition and impacts female fertility in mice [56]. In Xenopus laevis, the zar1 gene controls the translation of Wee1 and Mos mRNAs in immature oocytes [57]. Additional evidence about zar1 impacting the sex ratio was found in Danio rerio, where a complete male-biased sex ratio was observed in zar1 knock-out mutants [58]. In addition, it was reported to have an effect on a number of known translation factors, such as CPEB, ePAB, and 4E-T [58,59], among which CPEB and ePAB are known for controlling the process of oogenesis [60] and 4E-T is associated with human primary ovarian insufficiency [50]. In our study, we found that there are two zar1 genes in the A. quadracus genome: the ancestral copy is on Chr8, and the duplicated copy (zar1l) is found on both the X and Y copies of chromosome 23. There are many amino acid differences between zar1 and zar1l, including insertions and deletions (S16 Fig), suggesting that the two genes might have evolved different functions. The Y chromosome allele of zar1l has three amino acid changes that are predicted to be deleterious. Considering this gene is quite conserved across species, it is likely that the amino acid changes disrupt the function. Hence, having one functional copy of zar1l could lead to male development, which would be consistent with the zebrafish data. Therefore, we conclude that zar1l is another appropriate candidate gene. However, further experiments, such as gene knockouts and/or SNP editing by CRISPR-Cas9, are necessary to determine whether rxfp2a or zar1l is the master sex determination gene in A. quadracus.

While numerous sex determination genes have been identified and studied in fish, the two genes mentioned above, rxfp2a and zar1l, have not been previously identified as sex determination genes. In contrast, other genes, such as amh, amhr2, dmrt1, and gdf6, have been repeatedly identified as master sex determination genes in various fish species [61,62]. In stickleback species, one key sex determination gene is the independent duplication of the amh gene on the Y chromosome of Gasterosteus species [33] and C. inconstans [31]. However, there is no evidence for an additional copy of the amh gene on Chr23 (this study) or elsewhere in the A. quadracus genome [31], suggesting that A. quadracus has probably undergone a turnover in the sex determination gene.

Interestingly, some of the other genes that we highlight in the SDR on Chr23 (S6 Table) and two genes slightly outside this region have previously been shown to be linked to the SDR in a distant relative, the Atlantic herring (Clupea harengus). The sex determining gene in this group is likely BMPR1BBY [63], and the nearby linked genes include mettl27, cldn4, smyd4, sms, and pgrmc1 [63,64]. Smyd4 has high expression in zebrafish testis and pgrmc1 has a role in zebrafish oocyte maturation, while the gene sms is linked to the X-chromosome in humans [64]. Both mettl27 and cldn4 are part of the cohort of genes deleted in the human disorder Williams-Beuren syndrome, which has been shown to affect genital development, with humans exhibiting phenotypes such as undescended testis, retractile testis, and cryptorchidism [65–67]. Although not thought to be primary sex determining genes in Atlantic herring, both mettl27 and cldn4 have SNPs that differ in males and females and lead to nonsynonymous amino acid changes [64]. This may be a good example of genes with different functions in males and females becoming linked to the SDR, as predicted by models invoking sexually antagonistic selection in sex chromosome evolution [68,69]. Although Atlantic herring and A. quadracus are diverged by approximately 220 million years of evolution [70], the finding of similar sex-linked genes in the two groups highlights the critical role that conserved supporting genes may play in sexual development and sex chromosome evolution.

Polymorphic X- and Y-linked inversions on the sex chromosomes of A. quadracus

We have also identified polymorphic and derived inversions on both the X and Y chromosomes in A. quadracus populations. There was a high frequency of an X-linked inversion in both males and females in the CT population, partially covering the SDR (Figs 2A, S12, Tables 1, S3). Evidence for a similar X-linked inversion was also found with the Pool-seq data from the MA cross, indicating that it might be present at a low frequency in this population (S6, S9 Figs). Although this X-linked inversion does not seem to be present in the NS population, discordantly mapped reads and shared barcodes point to a Y-specific inversion in this population (Fig 3, S3 Table). Both the X- and the Y-linked inversions contain the shared SDR from 7.50 Mb to 8.21 Mb, which coincides with a potential inversion that is shared among populations (Figs 2, 3).

Inversions have been proposed as a mechanism to suppress recombination between X and Y chromosomes [68]. Although it is also formally possible that recombination suppression could have preceded the formation of inversions, several studies have now found evidence for Y-linked inversions associated with suppression of recombination on Y chromosomes [33,71–73]. A number of hypotheses have been proposed to explain the suppression of recombination on sex chromosomes, including sexual antagonism [12,69,74–76], meiotic drive [77], dosage compensation [78], sheltering of recessive deleterious mutations in heterozygotes [79,80], and genetic drift [81–83]. Our finding of a Y-linked inversion in the NS population is consistent with all of these models for suppression of recombination between the X and the Y. The case presented here could present a valuable opportunity to empirically test one or more of these theories. As the Y-linked inversion is polymorphic among populations, it allows us to compare the gene content and gene expression of both the ancestral and inverted Y haplotypes. This might enable us to track the early events in Y chromosome degeneration after recombination suppression, as was done using population-specific neo-Y chromosome haplotypes in Drosophila albomicans [30]. It will also be interesting to compare the fitness of males with and without the inversion, particularly if we can identify populations for which the inversion is polymorphic. Such studies could shed light on the potential targets of selection within the inverted Y haplotype, or the lack thereof.

We also document here a polymorphic X-linked inversion, which is absent in the NS population and almost fixed in the CT population (and of unknown frequency in the MA population). The natural question to ask is whether this inversion is playing some role in the CT (and possibly MA) populations. Although no existing models explicitly address the suppression of recombination on sex chromosomes via X-linked inversions, such inversions have been identified in other species. For instance, a recent study in Silene latifolia identified an X-linked inversion that suppresses recombination between X and Y chromosomes [84]. In D. americana, a fusion between the ancestral X chromosome and an autosome results in a neo-X chromosome that is polymorphic within the species; some of these neo-X chromosomes also harbor an inversion that suppresses recombination with the neo-Y and shows signatures of positive selection [29]. These empirical examples highlight that more theoretical attention should be given to the evolutionary dynamics of X-linked inversions.

In particular, some of the models described above to explain the spread of Y-linked inversions cannot be easily applied to the X. The principal difference is that on Y chromosomes, inversions are specific to the heterogametic sex, and are therefore instantly and permanently heterozygous, which gives rise to their theorized fitness benefits (e.g., the sheltering of recessive deleterious alleles). In contrast, inversions on the X can be present in both sexes and can be homozygous in females. This means that genes within an X-linked inversion cannot be sex-specific, and recessive deleterious alleles can only be sheltered by obligate heterozygosity in males, but not females, which would counter the spread of such inversions [80]. The meiotic drive model [77] is also unlikely to explain the presence of an X-linked inversion, as meiotic drivers often carry fitness costs in homozygous form [85], resulting in the exposure of deleterious alleles in females. However, it is worth noting that the impact of deleterious alleles may depend on the dynamics of the drive; for example, meiotic drivers that sweep rapidly to fixation may not have accumulated deleterious mutations and may therefore not incur fitness costs in homozygous females. The dosage compensation model [78] is also unlikely to explain the initial spread of an X-linked inversion because the inversion in the heterozygous state would lead to the incompatibility of expression modifiers in females. However, once fixed (regardless of why), the X-linked inversion could be maintained as the incompatibility of expression modifiers would prevent the restoration of recombination between the X and the Y. Although inversions do have a higher probability of fixation via genetic drift on X chromosomes than on autosomes, this is only the case when X-linked inversions are hemizygous in males, as in highly degenerate sex chromosomes [81]. As the A. quadracus Y chromosome has not experienced much degeneration, drift alone is not likely to be the explanation for the spread of the X-linked inversion.

Although more work is needed, we may still speculate on the circumstances leading to the spread and possible fixation of such inversions, based on the principles of X inheritance. Firstly, X-linked inversions are likely to be heterozygous more often than autosomal inversions, as homozygosity in males will not exist. Thus, theoretically a locus within the inverted haplotype which confers a fitness advantage to heterozygotes could favor the persistence and spread of an X-linked inversion under some circumstances. Secondly, as the X spends two-thirds of its time in females (as opposed to half of its time as in autosomes), there is scope for selection to favor alleles which benefit females, particularly if they are also deleterious in males (i.e., sexually antagonistic loci) [81]. Direct comparisons between the rate of fixation of X and Y-linked inversions under the sexually antagonistic selection hypothesis have not been done, but X-autosome fusions (which also could suppress recombination between the sex determination locus and a sexually antagonistic allele) can spread under sexually antagonistic selection, albeit more slowly than a Y-autosome fusion [75]. Alternatively, sexually antagonistic selection may lead to balancing selection, maintaining alternative alleles on autosomes or X-chromosomes at intermediate frequencies based on their relative benefits to each sex [86]. This mechanism could potentially explain why the X-linked inversion in A. quadracus is not fixed in the CT (or the MA) populations. Consistent with the prediction of the sexual antagonism hypothesis for linkage between the SDR and genes under sexually antagonistic selection, the inversion on the X chromosome includes the shared SDR. However, additional work is necessary to test this hypothesis, including assessing the frequencies of the X-linked inversion across many A. quadracus populations, and determining whether phenotypes under sexually antagonistic selection are associated with the inversion.

Conclusions

Although variation in sex chromosome systems among closely related species is now well-documented, the mechanisms behind sex chromosome turnover remain unclear. By examining population data from wild-caught samples and genetic crosses, we find evidence of a recent turnover in both the sex determination gene and the sex chromosome in A. quadracus. Furthermore, there are polymorphic inversions on the X and Y chromosomes, with relatively little degeneration on the Y chromosomes. This within-species variation on the A. quadracus sex chromosomes provides an opportunity for further studies to test hypotheses of the evolutionary forces driving sex chromosome evolution and turnover.

Materials and methods

Ethics statement

All experiments involving animals at the University of Bern were approved by the Veterinary Service of the Department of Agriculture and Nature of the Canton of Bern (VTHa# BE4/16, BE17/17 and BE127/17). For the QTL mapping study, wild sticklebacks were collected from Nova Scotia, Canada, as previously described. Stickleback care at Stanford University was approved by the Institutional Care and Use Committee (protocol no. 13834).

Sample collections and genetic crosses

The generation of the A. quadracus QTL cross, genotyping of markers, and linkage map construction were performed as previously described [42]. The sex of the animals in the QTL cross was determined visually by the presence of red spines in reproductive males and absence of red coloration in females. QTL mapping was performed in R version 4.2.2 using the package R/qtl [87] and a binary model. A total of 380 animals and 269 genotypic markers were used [42]. The bar plot showing phenotypes and genotypes at the top peak marker was generated using R version 4.2.2, and the significance of the correlation was assessed using a Chi-square test in R.

Genetic crosses for Pool-seq were made from the following populations, with the wild parents of the crosses collected from: Canal Lake (44.49830, -63.90205) in Nova Scotia (NS), Canada in 2019 by Anne Dalziel; Demarest Lloyd State Park (41.5289936, -70.9833719) in Massachusetts (MA), USA in 2007 by Catherine Peichel; West River Memorial Park (41.314148, -72.956544) in Connecticut (CT), USA in 2009 by Thomas Near (S1 Fig). For each population, a single cross was generated using a single female and a single male. A total of 53 females and 55 males from the CT cross, 18 males and 22 females from the MA cross, and 19 males and 16 females from the NS cross were included in the pooled sequencing analysis (S1 Table). The sex of each F1 offspring was identified by dissection of the gonads, and a fin clip was sampled and preserved in ethanol for DNA extraction and sequencing. For the NS cross, brains were also dissected from 12 males and 12 females from the F1 offspring as well as from the male and female F0 parents used for crossing for further RNA-seq analysis.

Wild populations for whole-genome sequencing of A. quadracus were collected from the following localities: Canal Lake (44.49830, -63.90205) in Nova Scotia (NS), Canada in 2021 by Anne Dalziel; Demarest Lloyd State Park (41.5289936, -70.9833719) in Massachusetts (MA), USA in 2007 by Catherine Peichel; and West River Memorial Park (41.314148, -72.956544) in Connecticut (CT), USA in 2021 by Natalie Steinel and Daniel Bolnick. The sex of each individual was identified by dissection of the gonads, and a fin clip was sampled and preserved in 95% ethanol for DNA extraction and sequencing. Total numbers of individuals sequenced for each population and cross are provided in S1 Table.

Note that a previous cytogenetic study of the same MA population used here suggested it had a ZW sex chromosome [7], while a previous cytogenetic study of the same CT population used here did not identify any heteromorphic sex chromosome pair [41]. Examination of the original samples used for these cytogenetic analyses revealed that the sex and species of all samples were correctly identified.

DNA and RNA extraction and sequencing

For the male genome assembly, DNA from a single laboratory-reared male from a Canal Lake population cross (Nova Scotia, Canada) was used. High molecular weight DNA was extracted from the liver following previously described methods [33] and used to prepare a HiFi SMRTbell library for PacBio HiFi sequencing. The blood of the same individual was used to prepare a Hi-C sequencing library using the Phase Genomics Proximo Hi-C animal kit (Phase Genomics, Seattle, WA). One SMRT cell was sequenced on a PacBio Sequel IIe, two SMRT cells were sequenced on a PacBio Revio, and Hi-C libraries were sequenced for 300 cycles on an Illumina NovaSeq S1 flow cell. All library preparation and sequencing were performed by the University of Bern Next Generation Sequencing Platform.

For Pool-seq of F1 offspring of genetic crosses and DNA-sequencing of F0 parents, DNA was extracted by phenol-chloroform extraction, followed by ethanol precipitation. Sequencing libraries were created by standard Illumina DNA TruSeq kits. For the RNA-sequencing of F0 parents and F1 offspring from the NS cross, total brain RNA from the F0 parents and F1 offspring of the NS cross was extracted using Trizol (Life Technologies, Carlsbad, California, USA) following the manufacturer’s instructions. RNA-seq libraries were prepared with the Illumina mRNA TruSeq kit. All libraries were subject to 150 bp paired-end sequencing on Illumina NovaSeq SP flow cells by the University of Bern Next Generation Sequencing Platform.

DNA of wild-caught samples for whole-genome sequencing was extracted by phenol-chloroform extraction, followed by ethanol precipitation. Multiplexed haplotagging libraries were prepared as described in [88] with the following modifications in WASH buffer volumes, Tn5 stripping, subsampling and exonuclease reaction. Briefly, DNA samples were processed in batches of 96. For each sample, 0.75 ng input DNA at 0.15 ng/µl concentration was mixed with 2.5 µl haplotagging beads resuspended in 20 µl of WASH buffer (20 mM Tris pH8, 50 mM NaCl, 0.1% Triton X-100). We reduced the volume of the tagmentation reaction by using only 5 µl of 5x tagmentation buffer (50 mM TAPS pH 8.5 with NaOH, 25 mM MgCl2, 50% N,N-dimethylformamide) and 15 µl of 0.6% SDS for Tn5 stripping following tagmentation. Next, the samples were pooled with 1/3 bead subsampling. This corresponds to a final input DNA of 0.25 ng per sample.

With only 8 pooled samples on the magnetic stand, the buffer was removed, and 20 µl of 1x Lambda Exonuclease buffer, supplemented with 10 units of Exonuclease I (M0293L, New England BioLabs), was added to each sample. Samples were incubated at 48 °C for 20 minutes and then washed twice for 5 minutes with 150 µl of WASH buffer. The DNA library was then amplified using NEBNext High-Fidelity 2X PCR Master Mix (M0541L, New England BioLabs) in eight 50 µl PCR reactions according to the manufacturer’s instructions, using 3 µl of 10 µM TruSeq-F AATGATACGGCGACCACCGAGATCTACAC and TruSeq-R CAAGCAGAAGACGGCATACGAGAT primers, with the following cycling conditions: 10 min at 72 °C followed by 30 sec at 98 °C and 10 cycles of 98 °C for 15 sec, 65 °C for 30 sec and 72 °C for 60 sec. Libraries were pooled after PCR into a single library pool, size selected using 0.9x volume of Ampure magnetic beads (Beckman Coulter), and Qubit quantified, followed by a second size selection with 0.45x and 0.85x volume of Ampure magnetic beads to remove library fragments longer than 800 bp and shorter than 300 bp, respectively. Pooled libraries were sequenced on a whole S4 lane of a NovaSeq 6000 (Illumina) instrument with a 151 + 13 + 13 + 151 cycle run setting, such that the run produced 13 and 13 nt in the i7 and i5 index reads, respectively. Sequence data were first converted into fastq format using the bcl2fastq program (Illumina) with the --create-fastq-for-index-reads option. Then we performed beadTag demultiplexing to generate the modified fastq files using a custom demult_fastq program, resulting in a fastq file supplemented with the molecular and sample barcode in the header of each read (e.g., BX:Z:A01C02B03D04). This program is available at https://github.com/evolgenomics/haplotagging.
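As an illustration of how the barcode placed in the read header can be recovered downstream, the following is a minimal Python sketch. It assumes the barcode always follows the pattern of the example above (the letters A, C, B and D, each followed by two digits); the function name and the read names in the usage lines are hypothetical and are not part of the demult_fastq program.

```python
import re

# Assumed barcode layout, inferred only from the example BX:Z:A01C02B03D04.
BX_TAG = re.compile(r"BX:Z:(A\d{2}C\d{2}B\d{2}D\d{2})")

def barcode_from_header(header):
    """Return the haplotagging barcode in a fastq header, or None if absent."""
    match = BX_TAG.search(header)
    return match.group(1) if match else None

# Hypothetical read headers, for illustration only.
print(barcode_from_header("@read1 BX:Z:A01C02B03D04"))
print(barcode_from_header("@read2"))
```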

Apeltes quadracus male de novo assembly and annotation

Raw Hi-C reads were trimmed by Trimmomatic (v 0.36) [89] with a sliding window of 4 bp. The first 13 bp of reads were dropped, and windows of the remaining reads with an average quality score below 15 were also dropped. Together with the HiFi reads, two phased assemblies were generated using Hifiasm (0.19.8-r603) with the “Hi-C integration” option and the default parameters [90].

For each haploid assembly, contig scaffolding was conducted separately using Hi-C proximity-guided assembly. Trimmed Hi-C reads were first processed with Chromap (v0.2.6) [91] and then assembled by YaHS [92]. After the first round of Hi-C scaffolding, the assembly was revised manually based on the contact map and then scaffolded again. The final step, gap-closing, was run by TGS-GapCloser (1.2.1) [93]. To identify the X and Y chromosomes, we compared the two haploid assemblies with the previously published female assembly [43] using mummer4 [94] and JCVI [95]; the X chromosome was identified by its higher synteny with chromosome 23 of the female assembly. The final assembly contained a full haploid set of autosomes, a haploid X chromosome and a haploid Y chromosome. Assembly quality was evaluated by BUSCO v4 [96,97]. We generated synteny plots using nucmer from mummer4 [94] to compare the male and female assemblies at the whole-genome level.

Identification of repeat elements and the establishment of a repeat library were conducted by EDTA (2.0.1) [98]. The genome assembly was masked by RepeatMasker (v. 4.1.1) [99]. The RNA-seq data generated from 12 A. quadracus males from the NS cross and RNA-seq data from the NCBI database were used to aid in genome annotation (see S1 Table for details). The raw reads were trimmed by Trimmomatic (v. 0.36) and then mapped against the soft-masked male assembly by Hisat2 (v2.2.1) [100]. Genome annotation was done by EGAPx (v0.2-alpha) with the integration of RNA and protein data of Actinopterygii from NCBI [101]. Lastly, the functional annotation was conducted by eggnog-mapper (v2) [102].

Short read data processing and SNP calling

All DNA sequencing reads were trimmed by Trimmomatic (v 0.36) [89] with a sliding window of 4 bp. The first 13 bp of all reads were dropped, and windows with an average quality score below 15 were also dropped.
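The trimming rule described above can be sketched in a few lines of Python. This is a simplified illustration of the logic (drop the first 13 bases, then cut the read at the first 4 bp window whose mean quality falls below 15), not a reimplementation of Trimmomatic; the function name and the toy read are hypothetical.

```python
def trim_read(seq, quals, headcrop=13, window=4, min_avg_q=15):
    """Drop the first `headcrop` bases, then cut the read at the start of the
    first window whose mean Phred quality is below `min_avg_q`."""
    seq, quals = seq[headcrop:], quals[headcrop:]
    keep = len(seq)  # keep everything unless a failing window is found
    for start in range(0, len(seq) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg_q:
            keep = start
            break
    return seq[:keep], quals[:keep]

# Toy read: 13 leading bases, then 6 high-quality and 4 low-quality bases.
seq = "A" * 13 + "ACGTACGTAC"
quals = [30] * 13 + [30] * 6 + [10] * 4
print(trim_read(seq, quals))
```

Reads shorter than one window after the head crop are returned unchanged by this sketch.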

For Pool-seq reads from genetic crosses, trimmed reads were first mapped to the male assembly without the Y chromosome by BWA (v 0.7.11) and sorted, with duplicates removed by Picard 2.0.1. PoPoolation2 [103] was used to create a sync file containing all the variants for each cross separately.

For linked-read sequencing from wild populations, trimmed reads were first mapped to the male assembly without the Y chromosome by EMA [104], and remaining unmapped reads were further mapped by BWA (v 0.7.11) [105]. Bam files were sorted, and duplicates were removed by Picard 2.0.1 (http://broadinstitute.github.io/picard). SNP calling was done by GATK 4.1.1 [106]. VCFtools 0.1.16 [107] was used to further filter the SNP matrix by removing: (1) individuals with a mean coverage lower than 6; (2) SNPs at which the population mean depth of coverage was less than 4x or greater than 40x; (3) SNPs at which the proportion of missing data was greater than 0.2 in either the CT population or the NS population; and (4) SNPs with a minor allele frequency of less than 0.05. The MA population had poor sequencing quality (likely due to the age of the samples) and was therefore not used for SNP filtering, in order to rescue as much information as possible from this population.
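The four filtering criteria can be restated as two simple predicates. The Python sketch below only illustrates the thresholds given in the text; it is not the VCFtools command line that was run, and the function and argument names are hypothetical.

```python
def keep_individual(mean_coverage, min_coverage=6.0):
    """Criterion 1: retain individuals with mean coverage of at least 6."""
    return mean_coverage >= min_coverage

def keep_snp(mean_depth, missing_ct, missing_ns, maf,
             min_depth=4.0, max_depth=40.0, max_missing=0.2, min_maf=0.05):
    """Criteria 2-4: depth, missingness (CT and NS only) and minor allele frequency."""
    if mean_depth < min_depth or mean_depth > max_depth:
        return False
    if missing_ct > max_missing or missing_ns > max_missing:
        return False
    return maf >= min_maf

# A SNP with adequate depth and MAF but too much missing data in NS is removed.
print(keep_snp(mean_depth=12.0, missing_ct=0.05, missing_ns=0.30, maf=0.10))
```

Missingness in the MA population is deliberately not an argument of `keep_snp`, mirroring the statement that MA was not used for SNP filtering.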

Identification of the sex determination system and sex chromosome in A. quadracus

The sex determination system and sex chromosome were identified in A. quadracus using multiple lines of evidence. Using mosdepth 0.3.3 [108] with a sliding window of 20 kb and a step size of 10 kb, sequencing depth was calculated for both linked-read data from wild populations and Pool-seq data from genetic crosses. For the genetic crosses, PoPoolation1 [109] was used for calculating Pi, and PoPoolation2 [103] was used for calculating Fst between sexes. For the wild populations, Fst between sexes, Fst between populations of the same sex and genetic diversity of each sex within each population were calculated by VCFtools 0.1.16 [107] with a sliding window of the same size. In addition, 40 bp male-specific kmers from each population were identified by KmerGO [110], and then mapped against the Y assembly to explore the SDR.
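The windowing scheme and the male-to-female depth comparison can be sketched as follows. The window size and step are those given above; normalizing each sex by its own median depth is an assumption made for this illustration, since the text states only that raw depths were normalized between the sexes. Function names are hypothetical.

```python
from statistics import median

def sliding_windows(chrom_len, size=20_000, step=10_000):
    """Yield (start, end) coordinates of overlapping windows along a chromosome."""
    start = 0
    while start < chrom_len:
        yield start, min(start + size, chrom_len)
        start += step

def normalized_depth_ratio(male_depth, female_depth):
    """Male-to-female depth ratio per window after scaling each sex by its
    median depth (assumed normalization, for illustration)."""
    m_med, f_med = median(male_depth), median(female_depth)
    return [(m / m_med) / (f / f_med) if f > 0 else float("nan")
            for m, f in zip(male_depth, female_depth)]

print(list(sliding_windows(45_000)))
print(normalized_depth_ratio([30, 30, 15, 30], [20, 20, 20, 20]))
```

Under this normalization, a window present in a single copy in males and two copies in females would be expected to give a ratio near 0.5, and windows with equal copy number in both sexes a ratio near 1.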

To further confirm the sex determination pattern, RNA-seq data from the parents and offspring of the NS cross were fed into reads2snp 2.0 [111] to obtain SNP data, and also fed into Trinity 2.11.2 [112] to obtain a de novo transcriptome assembly. The output SNP calls and assembly were then processed by SEX-DETector [45].

Identification of population-specific inversions

To determine whether there are inversions on the sex chromosome, three methods were used. First, the linked-read sequences were used to identify shared barcodes between all pairs of 10 kb windows on each chromosome. Windows with shared barcodes were divided into two categories: windows that are adjacent, and windows that are 500 kb apart on the same chromosome. Putative inversions were identified based on the number of shared barcodes between non-adjacent window pairs 500 kb apart. Second, inversions on the sex chromosomes were identified by LEVIATHAN V1.0.2 [113]. Third, bam files were screened for split and discordantly mapped read pairs near the inversion breakpoints using IGV 2.14.1 [114]. Finally, genotypes of inversions were determined at the individual level from the divergence and diversity patterns between sexes within the inversion, as well as from the SNP density plot generated by VCFtools 0.1.16 with a minor allele frequency of 0.05. The above analyses were not conducted in the MA population due to the poor sequencing quality.
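The barcode-sharing approach can be sketched in Python as follows. The sketch takes (position, barcode) pairs for reads on one chromosome, groups barcodes into 10 kb windows and counts barcodes shared between distant window pairs. Treating "500 kb apart" as "at least 500 kb apart" is an assumption of this illustration, and the function names and toy barcodes are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def barcodes_by_window(reads, window=10_000):
    """Group read barcodes by the index of the window containing the read."""
    sets = defaultdict(set)
    for pos, barcode in reads:
        sets[pos // window].add(barcode)
    return sets

def shared_barcode_counts(reads, window=10_000, min_gap=500_000):
    """Count barcodes shared by window pairs whose starts are >= min_gap apart
    (assumed interpretation of the distance criterion)."""
    sets = barcodes_by_window(reads, window)
    counts = {}
    for a, b in combinations(sorted(sets), 2):
        if (b - a) * window >= min_gap:
            n = len(sets[a] & sets[b])
            if n:
                counts[(a, b)] = n
    return counts

# Toy data: one molecule (barcode) spans two windows about 600 kb apart.
reads = [(5_000, "A01C02B03D04"), (605_000, "A01C02B03D04"),
         (5_500, "A02C02B03D04"), (15_000, "A02C02B03D04")]
print(shared_barcode_counts(reads))
```

An excess of shared barcodes between two distant windows, relative to the background, is the kind of signal that would mark the pair as a putative inversion breakpoint.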

Further, we used the MCScan in JCVI package [95] to compare synteny among stickleback species and A. flavidus on the gene level in order to determine the ancestry of the population-specific inversions.

Phylogenetic analyses of inversions

The sex chromosome was divided into four regions based on divergence patterns across populations (S15 Fig). To investigate the evolutionary dynamics of each region, we employed a phylogenetic approach. For the CT population, we selected one male and one female, both heterozygous for the X-linked inversion. For the NS population, we selected one male heterozygous for the Y-linked inversion and one female with no inversion. Variants were phased using Hapcut2 v1.3.4 [115], and the full sequences of each haplotype were reconstructed using bcftools v1.2.1 [116]. Phylogenetic trees for each region were then constructed separately using IQ-TREE v2.3.6 [117].

Pattern of molecular evolution within inversions

Molecular evolution on the sex chromosome was analyzed in genes present on both X and Y chromosomes. Single-copy orthologues were identified using Blast (2.14.1) [118] and filtered following reciprocal blast hits. Alignment of gene pairs on the sex chromosome was first conducted by PRANK [119] and then filtered by Gblocks (0.91b) [120] to exclude non-conserved regions. dN, dS and the dN/dS ratio between the X and Y alleles were then calculated by the CodeML module in PAML 4.9 [121].
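The reciprocal filtering step can be sketched as follows, assuming the common reciprocal-best-hit criterion ranked by bit score (the text does not specify how hits were ranked). Inputs are (query, subject, bit score) tuples such as could be parsed from tabular BLAST output; the gene names and function names are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject."""
    best = {}
    for query, subject, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {q: s for q, (s, _) in best.items()}

def reciprocal_best_hits(x_vs_y, y_vs_x):
    """Return X-Y gene pairs that are each other's best hit."""
    fwd, rev = best_hits(x_vs_y), best_hits(y_vs_x)
    return sorted((x, y) for x, y in fwd.items() if rev.get(y) == x)

# Toy example: x2's best hit is y2, but y2's best hit is x1, so only
# the pair (x1, y1) is retained.
x_vs_y = [("x1", "y1", 200), ("x1", "y2", 150), ("x2", "y2", 90)]
y_vs_x = [("y1", "x1", 198), ("y2", "x1", 140), ("y2", "x2", 95)]
print(reciprocal_best_hits(x_vs_y, y_vs_x))
```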

Degeneration of the Y chromosome usually appears in three forms: (1) the loss of genes either due to complete deletion or degeneration such that the Y allele can no longer be aligned to the X allele; (2) accumulation of loss-of-function mutations; and/or (3) the reduction or loss of gene expression on the Y chromosome. To identify the genes that degenerated on the A. quadracus sex chromosomes, we calculated male to female read-depth ratios of each gene for each wild population by mosdepth. Further, loss-of-function mutations were identified by snpEff v5.1 [122] separately for each sex and each population. Fixed loss-of-function mutations were identified with an allele frequency greater than 0.9 in females or 0.45 in males. The above analyses were not conducted on the MA population. Finally, the brain RNA-seq data from 12 males and 12 females from the NS cross were used to identify genes with reduced expression in males. RNA reads were counted by Salmon [123] and differentially expressed genes were identified by DESeq2 [124].
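The allele-frequency cutoffs used to call fixed loss-of-function mutations can be written as a single predicate. The Python sketch below restates only the two thresholds quoted above; the function name is hypothetical. One possible reading of the lower male cutoff, which the text does not state explicitly, is that a mutation fixed on the Y is carried on only one of the two sex chromosomes in each XY male, so its frequency among male alleles approaches 0.5 rather than 1.

```python
def is_fixed_lof(freq_females, freq_males, female_cutoff=0.9, male_cutoff=0.45):
    """Apply the two cutoffs from the text: allele frequency greater than 0.9
    in females or greater than 0.45 in males."""
    return freq_females > female_cutoff or freq_males > male_cutoff

# Hypothetical frequencies: absent in females, on one chromosome in every male.
print(is_fixed_lof(freq_females=0.0, freq_males=0.5))
# Hypothetical low-frequency variant in both sexes.
print(is_fixed_lof(freq_females=0.3, freq_males=0.2))
```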

Identification of potential sex determination genes

Because the fourspine stickleback genome assembly was obtained from a male individual, genes on X and Y chromosomes can be directly compared with each other. Therefore, we examined the functions of genes on the Y chromosome that have no BLAST hit on autosomes or the X chromosome, and genes within the SDR that have one-to-one orthologs between the X and Y chromosomes by blasting them against the NCBI nr database. For gene pairs with nonsynonymous changes between X and Y chromosomes, SIFT [46] was applied to predict the effects of amino acid changes. To compare the zar1l gene with its paralog zar1 on chromosome 8, amino acid sequences were aligned using PRANK [119] and visualized with NCBI MSA Viewer 1.25.3 [125].

In addition, published candidate sex determination genes of teleost fish [62] were blasted against our male assembly to search for homologues that mapped to the sex chromosome pair.

Supporting information

S1 Fig. Fourspine stickleback sampling locations in this study.

Red dots represent sampled populations: Connecticut (CT), USA; Massachusetts (MA), USA; Nova Scotia (NS), Canada. Samples used in the QTL analysis are labeled with brackets. The map was generated using the R package “maps” (GPLv2 license), based on public domain data from “CIA World DataBank II” (https://www.evl.uic.edu/pape/data/WDB/).

https://doi.org/10.1371/journal.pgen.1011465.s001

(TIF)

S2 Fig. Synteny between the male (this study) and the female (published) assemblies.

(A) Alignment between all chromosomes. (B) Alignment between chromosome 23 from the female assembly and the X chromosome from the male assembly. (C) Alignment between chromosome 23 from the female assembly and the Y chromosome from the male assembly. Colored lines show the strands of the alignments.

https://doi.org/10.1371/journal.pgen.1011465.s002

(TIF)

S3 Fig. Male-to-female depth ratio across the genome with pool-seq data from genetic crosses of three populations (CT, NS, and MA).

Raw depth values were normalized to eliminate the difference between the two sexes. The size of the sliding window is 20kb and the step size is 10kb. Chromosomes are indicated on the X-axis, and the normalized depth ratio is shown on the Y-axis. Dark and light blue regions indicate the different chromosomes.

https://doi.org/10.1371/journal.pgen.1011465.s003

(TIF)

S4 Fig. Male-to-female depth ratio across the genome from linked-read data from the three populations (CT, NS, and MA).

Raw depth values were normalized to eliminate the difference between the two sexes. The size of the sliding window is 20kb and the step size is 10kb. Chromosomes are indicated on the X-axis, and the normalized depth ratio is shown on the Y-axis. Dark and light blue regions indicate the different chromosomes.

https://doi.org/10.1371/journal.pgen.1011465.s004

(TIF)

S5 Fig. Genetic differentiation (Fst) between males and females from pool-seq data from three crosses (CT, NS, and MA).

The size of the sliding window is 20kb and the step size is 10kb. Chromosomes are indicated on the X-axis, and the Fst values are shown on the Y-axis. Purple and yellow regions indicate the different chromosomes.

https://doi.org/10.1371/journal.pgen.1011465.s005

(TIF)

S6 Fig. Genetic differentiation (Fst) between males and females on chromosome 23 calculated from pool-seq data from three genetic crosses.

The size of the sliding window is 20kb and the step size is 10kb. The Connecticut (CT) cross is in coral, the Massachusetts (MA) cross is in yellow, and the Nova Scotia (NS) cross is in light blue. The locations of the inversions are also indicated. The SDR is shown in the grey box. Note that all sequences are aligned to the X chromosome of the male assembly.

https://doi.org/10.1371/journal.pgen.1011465.s006

(TIF)

S7 Fig. Genetic differentiation (Fst) between males and females on chromosome 23, calculated from linked-read data, shown separately for each population.

The Connecticut (CT) population is in coral, the Massachusetts (MA) population is in yellow, and the Nova Scotia (NS) population is in light blue. The coral and light blue lines indicate inversions identified in the CT and NS populations, respectively, while the yellow line represents a shared inversion observed across all populations. The SDR is shown in the grey box. Note that all sequences are aligned to the X chromosome of the male assembly.

https://doi.org/10.1371/journal.pgen.1011465.s007

(TIF)

S8 Fig. Genomic distribution of genetic diversity (Pi) within males and females calculated from pool-seq data from three genetic crosses (CT, NS, and MA).

Chromosomes are indicated on the X-axis, and the values of genetic diversity (Pi) are shown on the Y-axis. The size of the sliding window is 20kb and the step size is 10kb. Red dots represent males and cyan dots represent females.

https://doi.org/10.1371/journal.pgen.1011465.s008

(TIF)

S9 Fig. Distribution of genetic diversity (Pi) within males and females on chromosome 23 calculated from pool-seq data from the three genetic crosses (CT, NS, and MA).

The position on the X chromosome is given on the X-axis, and the values of genetic diversity (Pi) are shown on the Y-axis. The size of the sliding window is 20kb and the step size is 10kb. Red dots represent males and cyan dots represent females. The coral and light blue lines indicate inversions identified in the CT and NS populations, respectively, while the yellow line represents a shared inversion observed across all populations. The SDR is shown in the grey box. Note that all sequences are aligned to the X chromosome of the male assembly.

https://doi.org/10.1371/journal.pgen.1011465.s009

(TIF)

S10 Fig. Genomic distribution of genetic diversity (Pi) within males and females calculated from linked-read data from three wild populations (CT, NS, and MA).

The size of the sliding window is 20kb and the step size is 10kb. Chromosomes are indicated on the X-axis, and the values of genetic diversity (Pi) are shown on the Y-axis. Red dots represent males and cyan dots represent females.

https://doi.org/10.1371/journal.pgen.1011465.s010

(TIF)

S11 Fig. SNP density distribution in 10 kb windows for each population, calculated from linked-read data from three wild populations (CT, NS, and MA).

Within each population, individuals were grouped based on the inversion genotypes and the colored lines represent the different inversion genotypes, shown in the legend. In the MA population, inversion genotypes could not be determined, so individuals were grouped by sex. Note that all sequences are aligned to the X chromosome of the male assembly.

https://doi.org/10.1371/journal.pgen.1011465.s011

(TIF)

S12 Fig. Segregation patterns of inversions in genetic crosses inferred from pool-seq data.

Red bars represent the X-specific inversion, and blue bars represent the Y-specific inversion. For the MA cross, the number of individuals with each genotype is assumed to be equal within a sex.

https://doi.org/10.1371/journal.pgen.1011465.s012

(TIF)

S13 Fig. Synteny map of the X chromosome and Y chromosome assemblies generated from the NS population of A. quadracus with homologous chromosomes from two other stickleback species (G. aculeatus, P. pungitius) and an outgroup species (A. flavidus).

Blue and green bars represent genes. Grey lines are syntenic blocks between species.

https://doi.org/10.1371/journal.pgen.1011465.s013

(TIF)

S14 Fig. Distributions of the density of male-specific kmers on the Y chromosome.

Male-specific 40 bp kmers were identified using linked-read sequences from three wild populations from Connecticut (CT), Nova Scotia (NS), and Massachusetts (MA). A sliding window of 20kb was used. Note that all sequences are aligned to the Y chromosome of the male assembly.

https://doi.org/10.1371/journal.pgen.1011465.s014

(TIF)

S15 Fig. Phylogenetic relationship among sex-linked haplotypes of A. quadracus.

(A) Regions with different patterns of genetic differentiation were used for building separate phylogenetic trees. (B) Phylogenetic trees of different genomic regions on the sex chromosome. Haplotypes associated with the X-linked inversion are highlighted in coral (CT population) and haplotypes associated with the Y-linked inversion are highlighted in blue (NS population). Haplotypes corresponding to the SDR on Y chromosomes are indicated with a gray bar.

https://doi.org/10.1371/journal.pgen.1011465.s015

(TIF)

S16 Fig. Amino acid alignment of the ancestral zar1 gene on Chromosome 8 and the zar1l genes on Chromosome X (Chr23) and Chromosome Y.

https://doi.org/10.1371/journal.pgen.1011465.s016

(TIF)

S1 Table. Sample information and accession numbers for sequencing data in this study.

https://doi.org/10.1371/journal.pgen.1011465.s017

(XLSX)

S2 Table. SEX-Detector results.

The first column represents the model used in each run. The second column shows the number of sex-linked transcripts detected.

https://doi.org/10.1371/journal.pgen.1011465.s018

(XLSX)

S3 Table. Identification of inversions in the CT and NS populations.

Evidence comes from LEVIATHAN results, split reads, and read pairs with large insert sizes.

https://doi.org/10.1371/journal.pgen.1011465.s019

(XLSX)

S4 Table. Genes with loss-of-function mutations on X and Y chromosomes.

https://doi.org/10.1371/journal.pgen.1011465.s020

(XLSX)

S5 Table. Genes with higher expression in females than in males on chromosome 23.

All sequences were aligned to the X chromosome of the male reference assembly.

https://doi.org/10.1371/journal.pgen.1011465.s021

(XLSX)

S6 Table. Y-specific genes, genes with one-to-one homologues on the X and Y chromosomes within the sex determination region, and genes around 12Mb.

Genes of interest are labeled with bold font.

https://doi.org/10.1371/journal.pgen.1011465.s022

(XLSX)

Acknowledgments

We thank Daniel Jeffries and Mark Kirkpatrick for comments on the manuscript, Daniel Bolnick, Anne Dalziel, Thomas Near, and Natalie Steinel for collecting wild fish, Melanie Hiltbrunner, Shaugnessy McCann, and Verena Saladin for making crosses, Melanie Hiltbrunner and Nicole Nesvadba for performing extractions, and the University of Bern Next Generation Sequencing Platform for library preparation and sequencing.

References

  1. Bachtrog D, Mank JE, Peichel CL, Kirkpatrick M, Otto SP, Ashman T-L, et al. Sex determination: why so many ways of doing it? PLoS Biol. 2014;12(7):e1001899. pmid:24983465
  2. Jeffries DL, Lavanchy G, Sermier R, Sredl MJ, Miura I, Borzée A, et al. A rapid rate of sex-chromosome turnover and non-random transitions in true frogs. Nat Commun. 2018;9(1):4088. pmid:30291233
  3. Ma W-J, Veltsos P. The diversity and evolution of sex chromosomes in frogs. Genes (Basel). 2021;12(4):483. pmid:33810524
  4. Gamble T, Coryell J, Ezaz T, Lynch J, Scantlebury DP, Zarkower D. Restriction site-associated DNA sequencing (RAD-seq) reveals an extraordinary number of transitions among gecko sex-determining systems. Mol Biol Evol. 2015;32(5):1296–309. pmid:25657328
  5. Darolti I, Wright AE, Sandkam BA, Morris J, Bloch NI, Farré M, et al. Extreme heterogeneity in sex chromosome differentiation and dosage compensation in livebearers. Proc Natl Acad Sci U S A. 2019;116(38):19031–6. pmid:31484763
  6. El Taher A, Ronco F, Matschiner M, Salzburger W, Böhne A. Dynamics of sex chromosome evolution in a rapid radiation of cichlid fishes. Sci Adv. 2021;7(36):eabe8215. pmid:34516923
  7. Ross JA, Urton JR, Boland J, Shapiro MD, Peichel CL. Turnover of sex chromosomes in the stickleback fishes (Gasterosteidae). PLoS Genet. 2009;5(2):e1000391. pmid:19229325
  8. The Tree of Sex Consortium. Tree of Sex: a database of sexual systems. Sci Data. 2014;1:140015. pmid:25977773
  9. Furman BLS, Metzger DCH, Darolti I, Wright AE, Sandkam BA, Almeida P, et al. Sex chromosome evolution: so many exceptions to the rules. Genome Biol Evol. 2020;12(6):750–63. pmid:32315410
  10. Vicoso B. Molecular and evolutionary dynamics of animal sex-chromosome turnover. Nat Ecol Evol. 2019;3(12):1632–41. pmid:31768022
  11. Bachtrog D. Y-chromosome evolution: emerging insights into processes of Y-chromosome degeneration. Nat Rev Genet. 2013;14(2):113–24. pmid:23329112
  12. Charlesworth B. The evolution of sex chromosomes. Science. 1991;251(4997):1030–3. pmid:1998119
  13. van Doorn GS, Kirkpatrick M. Turnover of sex chromosomes induced by sexual conflict. Nature. 2007;449(7164):909–12. pmid:17943130
  14. Blaser O, Grossen C, Neuenschwander S, Perrin N. Sex-chromosome turnovers induced by deleterious mutation load: sex-chromosome turnovers. Evolution. 2013;67(3):635–45. pmid:23461315
  15. Blaser O, Neuenschwander S, Perrin N. Sex-chromosome turnovers: the hot-potato model. Am Nat. 2014;183(1):140–6. pmid:24334743
  16. Bull JJ, Charnov EL. Changes in the heterogametic mechanism of sex determination. Heredity (Edinb). 1977;39(1):1–14. pmid:268319
  17. Werren JH, Beukeboom LW. Sex determination, sex ratios, and genetic conflict. Annu Rev Ecol Syst. 1998;29(1):233–61.
  18. Saunders PA, Neuenschwander S, Perrin N. Sex chromosome turnovers and genetic drift: a simulation study. J Evol Biol. 2018;31(9):1413–9. pmid:29923246
  19. Veller C, Muralidhar P, Constable GWA, Nowak MA. Drift-induced selection between male and female heterogamety. Genetics. 2017;207(2):711–27. pmid:28821587
  20. Evans BJ, Mudd AB, Bredeson JV, Furman BLS, Wasonga DV, Lyons JB, et al. New insights into Xenopus sex chromosome genomics from the Marsabit clawed frog X. borealis. J Evol Biol. 2022;35:1777–90.
  21. Furman BLS, Cauret CMS, Knytl M, Song X-Y, Premachandra T, Ofori-Boateng C, et al. A frog with three sex chromosomes that co-mingle together in nature: Xenopus tropicalis has a degenerate W and a Y that evolved from a Z chromosome. PLoS Genet. 2020;16(11):e1009121. pmid:33166278
  22. Rodrigues N, Vuille Y, Loman J, Perrin N. Sex-chromosome differentiation and “sex races” in the common frog (Rana temporaria). Proc Biol Sci. 2015;282(1806):20142726. pmid:25833852
  23. Feller AF, Ogi V, Seehausen O, Meier JI. Identification of a novel sex determining chromosome in cichlid fishes that acts as XY or ZW in different lineages. Hydrobiologia. 2021;848(16):3727–45. pmid:34720170
  24. Kocher TD, Behrens KA, Conte MA, Aibara M, Mrosso HDJ, Green ECJ, et al. New sex chromosomes in Lake Victoria cichlid fishes (Cichlidae: Haplochromini). Genes (Basel). 2022;13(5):804. pmid:35627189
  25. Schultheis C, Böhne A, Schartl M, Volff JN, Galiana-Arnoux D. Sex determination diversity and sex chromosome evolution in poeciliid fish. Sex Dev. 2009;3(2–3):68–77. pmid:19684452
  26. Roberts RB, Ser JR, Kocher TD. Sexual conflict resolved by invasion of a novel sex determiner in Lake Malawi cichlid fishes. Science. 2009;326(5955):998–1001. pmid:19797625
  27. Lichilín N, Salzburger W, Böhne A. No evidence for sex chromosomes in natural populations of the cichlid fish Astatotilapia burtoni. G3 (Bethesda). 2023;13(3):jkad011. pmid:36649174
  28. Sandkam BA, Almeida P, Darolti I, Furman BLS, van der Bijl W, Morris J, et al. Extreme Y chromosome polymorphism corresponds to five male reproductive morphs of a freshwater fish. Nat Ecol Evol. 2021;5(7):939–48. pmid:33958755
  29. McAllister BF. Sequence differentiation associated with an inversion on the neo-X chromosome of Drosophila americana. Genetics. 2003;165(3):1317–28. pmid:14668385
  30. Wei KH-C, Bachtrog D. Ancestral male recombination in Drosophila albomicans produced geographically restricted neo-Y chromosome haplotypes varying in age and onset of decay. PLoS Genet. 2019;15(11):e1008502. pmid:31738748
  31. Jeffries DL, Mee JA, Peichel CL. Identification of a candidate sex determination gene in Culaea inconstans suggests convergent recruitment of an Amh duplicate in two lineages of stickleback. J Evol Biol. 2022;35(12):1683–95. pmid:35816592
  32. Dagilis AJ, Sardell JM, Josephson MP, Su Y, Kirkpatrick M, Peichel CL. Searching for signatures of sexually antagonistic selection on stickleback sex chromosomes. Philos Trans R Soc Lond B Biol Sci. 2022;377(1856):20210205. pmid:35694749
  33. Peichel CL, McCann SR, Ross JA, Naftaly AFS, Urton JR, Cech JN, et al. Assembly of the threespine stickleback Y chromosome reveals convergent signatures of sex chromosome evolution. Genome Biol. 2020;21(1):177. pmid:32684159
  34. Kitano J, Ross JA, Mori S, Kume M, Jones FC, Chan YF, et al. A role for a neo-sex chromosome in stickleback speciation. Nature. 2009;461(7267):1079–83. pmid:19783981
  35. Sardell JM, Josephson MP, Dalziel AC, Peichel CL, Kirkpatrick M. Heterogeneous histories of recombination suppression on stickleback sex chromosomes. Mol Biol Evol. 2021;38(10):4403–18. pmid:34117766
  36. Natri HM, Merilä J, Shikano T. The evolution of sex determination associated with a chromosomal inversion. Nat Commun. 2019;10(1):145. pmid:30635564
  37. Rastas P, Calboli FCF, Guo B, Shikano T, Merilä J. Construction of ultradense linkage maps with Lep-MAP2: stickleback F2 recombinant crosses as an example. Genome Biol Evol. 2015;8(1):78–93. pmid:26668116
  38. Shapiro MD, Summers BR, Balabhadra S, Aldenhoven JT, Miller AL, Cunningham CB, et al. The genetic architecture of skeletal convergence and sex determination in ninespine sticklebacks. Curr Biol. 2009;19(13):1140–5. pmid:19500990
  39. Dixon G, Kitano J, Kirkpatrick M. The origin of a new sex chromosome by introgression between two stickleback fishes. Mol Biol Evol. 2019;36(1):28–38. pmid:30272243
  40. Chen TR, Reisman HM. A comparative chromosome study of the North American species of sticklebacks (Teleostei: Gasterosteidae). Cytogenetics. 1970;9(5):321–32. pmid:5501390
  41. Urton JR, McCann SR, Peichel CL. Karyotype differentiation between two stickleback species (Gasterosteidae). Cytogenet Genome Res. 2011;135(2):150–9. pmid:21921583
  42. Herbert AL, Lee D, McCoy MJ, Behrens VC, Wucherpfennig JI, Kingsley DM. Genetic mechanisms of axial patterning in Apeltes quadracus. Evol Lett. 2024;8(6):893–901. pmid:39677576
  43. Liu Z, Roesti M, Marques D, Hiltbrunner M, Saladin V, Peichel CL. Chromosomal fusions facilitate adaptation to divergent environments in threespine stickleback. Mol Biol Evol. 2022;39(2):msab358. pmid:34908155
  44. Gammerdinger WJ, Toups MA, Vicoso B. Disagreement in FST estimators: A case study from sex chromosomes. Mol Ecol Resour. 2020;20(6):1517–25. pmid:32543001
  45. Muyle A, Käfer J, Zemp N, Mousset S, Picard F, Marais GA. SEX-DETector: A probabilistic approach to study sex chromosomes in non-model organisms. Genome Biol Evol. 2016;8(8):2530–43. pmid:27492231
  46. Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res. 2001;11(5):863–74. pmid:11337480
  47. Peichel CL, Ross JA, Matson CK, Dickson M, Grimwood J, Schmutz J, et al. The master sex-determination locus in threespine sticklebacks is on a nascent Y chromosome. Curr Biol. 2004;14(16):1416–24. pmid:15324658
  48. Yi X, Wang D, Reid K, Feng X, Löytynoja A, Merilä J. Sex chromosome turnover in hybridizing stickleback lineages. Evol Lett. 2024;8(5):658–68. pmid:39328282
  49. Overbeek PA, Gorlov IP, Sutherland RW, Houston JB, Harrison WR, Boettger-Tong HL, et al. A transgenic insertion causing cryptorchidism in mice. Genesis. 2001;30(1):26–35. pmid:11353515
  50. Kasippillai T, MacArthur DG, Kirby A, Thomas B, Lambalk CB, Daly MJ, et al. Mutations in eIF4ENIF1 are associated with primary ovarian insufficiency. J Clin Endocrinol Metab. 2013;98(9):E1534-9. pmid:23902945
  51. Kumagai J, Hsu SY, Matsumi H, Roh J-S, Fu P, Wade JD, et al. INSL3/Leydig insulin-like peptide activates the LGR8 receptor important in testis descent. J Biol Chem. 2002;277(35):31283–6. pmid:12114498
  52. Timms KL. Rodent models for translational research in endometriosis. Biology of Reproduction. 2012;87(Suppl_1):143–143.
  53. Zimmermann S, Steding G, Emmen JM, Brinkmann AO, Nayernia K, Holstein AF, et al. Targeted disruption of the Insl3 gene causes bilateral cryptorchidism. Mol Endocrinol. 1999;13(5):681–91. pmid:10319319
  54. Sharma V, Lehmann T, Stuckas H, Funke L, Hiller M. Loss of RXFP2 and INSL3 genes in Afrotheria shows that testicular descent is the ancestral condition in placental mammals. PLoS Biol. 2018;16(6):e2005293. pmid:29953435
  55. Assis LHC, Crespo D, Morais RDVS, França LR, Bogerd J, Schulz RW. INSL3 stimulates spermatogonial differentiation in testis of adult zebrafish (Danio rerio). Cell Tissue Res. 2016;363(2):579–88. pmid:26077926
  56. Wu X, Viveiros MM, Eppig JJ, Bai Y, Fitzpatrick SL, Matzuk MM. Zygote arrest 1 (Zar1) is a novel maternal-effect gene critical for the oocyte-to-embryo transition. Nat Genet. 2003;33(2):187–91.
  57. Yamamoto TM, Cook JM, Kotter CV, Khat T, Silva KD, Ferreyros M, et al. Zar1 represses translation in Xenopus oocytes and binds to the TCS in maternal mRNAs with different characteristics than Zar2. Biochim Biophys Acta. 2013;1829(10):1034–46. pmid:23827238
  58. Miao L, Yuan Y, Cheng F, Fang J, Zhou F, Ma W, et al. Translation repression by maternal RNA binding protein Zar1 is essential for early oogenesis in zebrafish. Development. 2017;144(1):128–38. pmid:27913641
  59. Cook J, Charlesworth A. The developmentally important RNA‐binding protein, zygote arrest (Zar), regulates mRNA translation. The FASEB Journal. 2015;29(S1).
  60. Guzeloglu-Kayisli O, Lalioti MD, Aydiner F, Sasson I, Ilbay O, Sakkas D, et al. Embryonic poly(A)-binding protein (EPAB) is required for oocyte maturation and female fertility in mice. Biochem J. 2012;446(1):47–58. pmid:22621333
  61. Pan Q, Kay T, Depincé A, Adolfi M, Schartl M, Guiguen Y, et al. Evolution of master sex determiners: TGF-β signalling pathways at regulatory crossroads. Philos Trans R Soc Lond B Biol Sci. 2021;376(1832):20200091. pmid:34247498
  62. Kitano J, Ansai S, Takehana Y, Yamamoto Y. Diversity and convergence of sex-determination mechanisms in teleost fish. Annu Rev Anim Biosci. 2024;12:233–59. pmid:37863090
  63. Rafati N, Chen J, Herpin A, Pettersson ME, Han F, Feng C, et al. Reconstruction of the birth of a male sex chromosome present in Atlantic herring. Proc Natl Acad Sci U S A. 2020;117(39):24359–68. pmid:32938798
  64. Kongsstovu SÍ, Dahl HA, Gislason H, Homrum E, Jacobsen JA, Flicek P, et al. Identification of male heterogametic sex-determining regions on the Atlantic herring Clupea harengus genome. J Fish Biol. 2020;97(1):190–201. pmid:32293027
  65. Franke Y, Peoples RJ, Francke U. Identification of GTF2IRD1, a putative transcription factor within the Williams-Beuren syndrome deletion at 7q11.23. Cytogenet Cell Genet. 1999;86(3–4):296–304. pmid:10575229
  66. Sammour ZM, Gomes CM, de Bessa J Jr, Pinheiro MS, Kim CAE, Hisano M, et al. Congenital genitourinary abnormalities in children with Williams-Beuren syndrome. J Pediatr Urol. 2014;10(5):804–9. pmid:24582571
  67. Micale L, Fusco C, Augello B, Napolitano LMR, Dermitzakis ET, Meroni G, et al. Williams-Beuren syndrome TRIM50 encodes an E3 ubiquitin ligase. Eur J Hum Genet. 2008;16(9):1038–49. pmid:18398435
  68. Charlesworth D, Charlesworth B, Marais G. Steps in the evolution of heteromorphic sex chromosomes. Heredity (Edinb). 2005;95(2):118–28. pmid:15931241
  69. Rice WR. The accumulation of sexually antagonistic genes as a selective agent promoting the evolution of reduced recombination between primitive sex chromosomes. Evolution. 1987;41(4):911.
  70. Kumar S, Suleski M, Craig JM, Kasprowicz AE, Sanderford M, Li M, et al. TimeTree 5: An expanded resource for species divergence times. Mol Biol Evol. 2022;39(8):msac174. pmid:35932227
  71. Lahn BT, Page DC. Four evolutionary strata on the human X chromosome. Science. 1999;286(5441):964–7. pmid:10542153
  72. Wang J, Na J-K, Yu Q, Gschwend AR, Han J, Zeng F, et al. Sequencing papaya X and Yh chromosomes reveals molecular basis of incipient sex chromosome evolution. Proc Natl Acad Sci U S A. 2012;109(34):13710–5. pmid:22869747
  73. Lemaitre C, Braga MDV, Gautier C, Sagot M-F, Tannier E, Marais GAB. Footprints of inversions at present and past pseudoautosomal boundaries in human sex chromosomes. Genome Biol Evol. 2009;1:56–66. pmid:20333177
  74. Charlesworth B, Charlesworth D. A model for the evolution of dioecy and gynodioecy. Am Nat. 1978;112:975–97.
  75. Charlesworth D, Charlesworth B. Sex differences in fitness and selection for centric fusions between sex-chromosomes and autosomes. Genet Res. 1980;35(2):205–14. pmid:6930353
  76. Fisher R. The evolution of dominance. Biol Rev. 1931:345–68.
  77. Úbeda F, Patten MM, Wild G. On the origin of sex chromosomes from meiotic drive. Proc Biol Sci. 2015;282(1798):20141932. pmid:25392470
  78. Lenormand T, Roze D. Y recombination arrest and degeneration in the absence of sexual dimorphism. Science. 2022;375(6581):663–6. pmid:35143289
  79. Charlesworth B, Wall JD. Inbreeding, heterozygote advantage and the evolution of neo–X and neo–Y sex chromosomes. Proc R Soc Lond B. 1999;266(1414):51–6.
  80. Jay P, Tezenas E, Véber A, Giraud T. Sheltering of deleterious mutations explains the stepwise extension of recombination suppression on sex chromosomes and other supergenes. PLoS Biol. 2022;20(7):e3001698. pmid:35853091
  81. Charlesworth B, Coyne JA, Barton NH. The relative rates of evolution of sex chromosomes and autosomes. Am Nat. 1987;130:113–46.
  82. Ponnikas S, Sigeman H, Abbott JK, Hansson B. Why do sex chromosomes stop recombining? Trends Genet. 2018;34(7):492–503. pmid:29716744
  83. Jeffries DL, Gerchen JF, Scharmann M, Pannell JR. A neutral model for the loss of recombination on sex chromosomes. Philos Trans R Soc Lond B Biol Sci. 2021;376(1832):20200096. pmid:34247504
  84. Yue J, Krasovec M, Kazama Y, Zhang X, Xie W, Zhang S, et al. The origin and evolution of sex chromosomes, revealed by sequencing of the Silene latifolia female genome. Curr Biol. 2023;33(12):2504-2514.e3. pmid:37290443
  85. Jaenike J. Sex chromosome meiotic drive. Annu Rev Ecol Syst. 2001;32(1):25–49.
  86. Connallon T, Clark AG. Balancing selection in species with separate sexes: insights from Fisher’s geometric model. Genetics. 2014;197(3):991–1006. pmid:24812306
  87. Broman KW, Wu H, Sen S, Churchill GA. R/qtl: QTL mapping in experimental crosses. Bioinformatics. 2003;19(7):889–90. pmid:12724300
  88. Meier JI, Salazar PA, Kučka M, Davies RW, Dréau A, Aldás I, et al. Haplotype tagging reveals parallel formation of hybrid races in two butterfly species. Proc Natl Acad Sci U S A. 2021;118(25):e2015005118. pmid:34155138
  89. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. pmid:24695404
  90. Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021;18(2):170–5. pmid:33526886
  91. Zhang H, Song L, Wang X, Cheng H, Wang C, Meyer CA, et al. Fast alignment and preprocessing of chromatin profiles with Chromap. Nat Commun. 2021;12(1):6566. pmid:34772935
  92. Zhou C, McCarthy SA, Durbin R. YaHS: yet another Hi-C scaffolding tool. Bioinformatics. 2023;39(1):btac808. pmid:36525368
  93. Xu M, Guo L, Gu S, Wang O, Zhang R, Peters BA, et al. TGS-GapCloser: A fast and accurate gap closer for large genomes with low coverage of error-prone long reads. Gigascience. 2020;9(9):giaa094. pmid:32893860
  94. Marçais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, Zimin A. MUMmer4: A fast and versatile genome alignment system. PLoS Comput Biol. 2018;14(1):e1005944. pmid:29373581
  95. Tang H, Krishnakumar V, Li J. Jcvi: jcvi utility libraries. Zenodo; 2015. https://doi.org/10.5281/zenodo.31631
  96. Manni M, Berkeley MR, Seppey M, Zdobnov EM. BUSCO: assessing genomic data quality and beyond. Curr Protoc. 2021;1(12):e323. pmid:34936221
  97. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38(10):4647–54. pmid:34320186
  98. Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 2019;20(1):275. pmid:31843001
  99. Smit A, Hubley R, Green P. RepeatMasker Open-4.0. 2015. 2013. Available from: http://www.repeatmasker.org
  100. Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37(8):907–15. pmid:31375807
  101. NCBI. Eukaryotic Genome Annotation Pipeline. 2024. Available from: https://github.com/ncbi/egapx
  102. Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mering C, et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-Mapper. Mol Biol Evol. 2017;34(8):2115–22. pmid:28460117
  103. Kofler R, Pandey RV, Schlötterer C. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics. 2011;27(24):3435–6. pmid:22025480
  104. Shajii A, Numanagić I, Berger B. Latent variable model for aligning barcoded short-reads improves downstream analyses. Bioinformatics; 2017.
  105. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. ArXiv13033997 Q-Bio. 2013 [cited 1 Oct 2020]. Available from: http://arxiv.org/abs/1303.3997
  106. Van der AG, O’Connor B. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra (1st Edition). O’Reilly Media; 2020.
  107. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156–8. pmid:21653522
  108. Pedersen BS, Quinlan AR. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics. 2018;34(5):867–8. pmid:29096012
  109. Kofler R, Orozco-terWengel P, De Maio N, Pandey RV, Nolte V, Futschik A, et al. PoPoolation: a toolbox for population genetic analysis of next generation sequencing data from pooled individuals. PLoS One. 2011;6(1):e15925. pmid:21253599
  110. Wang Y, Chen Q, Deng C, Zheng Y, Sun F. KmerGO: A tool to identify group-specific sequences with k-mers. Front Microbiol. 2020;11:2067. pmid:32983048
  111. Gayral P, Melo-Ferreira J, Glémin S, Bierne N, Carneiro M, Nabholz B, et al. Reference-free population genomics from next-generation transcriptome data and the vertebrate-invertebrate gap. PLoS Genet. 2013;9(4):e1003457. pmid:23593039
  112. Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8(8):1494–512. pmid:23845962
  113. Morisse P, Legeai F, Lemaitre C. LEVIATHAN: efficient discovery of large structural variants by leveraging long-range information from Linked-Reads data. Bioinformatics; 2021.
  114. Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14(2):178–92. pmid:22517427
  115. Edge P, Bafna V, Bansal V. HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies. Genome Res. 2017;27(5):801–12. pmid:27940952
  116. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. pmid:19505943
  117. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37(5):1530–4. pmid:32011700
  118. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. pmid:20003500
  119. Löytynoja A. Phylogeny-aware alignment with PRANK. Mult Seq Alignment Methods. 2014;155–170.
  120. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17(4):540–52. pmid:10742046
  121. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91. pmid:17483113
  122. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6(2):80–92. pmid:22728672
  123. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14(4):417–9. pmid:28263959
  124. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. pmid:25516281
  125. NCBI. NCBI multiple sequence alignment viewer, version 1.25.3 [Internet]. Bethesda (MD): National Center for Biotechnology Information; n.d. [cited 2025 Mar 10]. Available from: https://www.ncbi.nlm.nih.gov/projects/msaviewer/