Uncovering the Ancestry of B Chromosomes in Moenkhausia sanctaefilomenae (Teleostei, Characidae)

B chromosomes constitute a heterogeneous mixture of genomic parasites that are sometimes derived intraspecifically from the standard genome of the host species, but result from interspecific hybridization in other cases. The mode of origin determines the DNA content, with the B chromosomes showing high similarity with the A genome in the first case, but presenting higher similarity with a different species in the second. The characid fish Moenkhausia sanctaefilomenae harbours highly invasive B chromosomes, which are present in all populations analyzed to date in the Paraná and Tietê rivers. To investigate the origin of these B chromosomes, we analyzed two natural populations: one carrying B chromosomes and the other lacking them, using a combination of molecular cytogenetic techniques, nucleotide sequence analysis and high-throughput sequencing (Illumina HiSeq2000). Our results showed that i) B chromosomes have not yet reached the Paranapanema River basin; ii) B chromosomes are mitotically unstable; iii) there are two types of B chromosomes, the most frequent of which is lightly C-banded (similar to euchromatin in A chromosomes) (B1), while the other is darkly C-banded (heterochromatin-like) (B2); iv) the two B types contain the same tandem repeat DNA sequences (18S ribosomal DNA, H3 histone genes, MS3 and MS7 satellite DNA), with a higher content of 18S rDNA in the heterochromatic variant; v) all of these repetitive DNAs are present together only in the paracentromeric region of autosome pair no. 6, suggesting that the B chromosomes are derived from this A chromosome; vi) the two B chromosome variants show MS3 sequences that are highly divergent from each other and from the 0B genome, although the B2-derived sequences exhibit higher similarity with the 0B genome (this suggests an independent origin of the two B variants, with the less frequent B2 type presumably being younger); and vii) the dN/dS ratio for the H3.2 histone gene is approximately 4–6 times higher for B chromosomes than for A chromosome sequences, suggesting that purifying selection is relaxed for the DNA sequences located on the B chromosomes, presumably because they are mostly inactive.

Introduction
B chromosomes are dispensable genomic elements that are present in approximately 15% of eukaryotic species. These chromosomes exhibit a parasitic nature, and their interaction with the host genome determines their population frequency, which is highly dynamic as a consequence of the "arms race" between A and B chromosomes [1-3]. Because B chromosomes do not always occur in pairs, their segregation does not conform to a Mendelian system, which may facilitate transmission rates higher than 0.5 among these chromosomes, resulting in transmission advantages collectively referred to as "drive" [3,4].
The origin of B chromosomes has been investigated in various organisms. Basically, they may arise from the A chromosomes of the current host species (intraspecific origin) or be derived interspecifically through hybridization [3]. An intraspecific origin of B chromosomes has been demonstrated, for instance, in maize [5,6], the migratory locust [7], rye [8], and the fish Astyanax paranae [9]. However, examples of B chromosomes arising through interspecific hybridization have been reported in the plant genus Coix [10], the fish Poecilia formosa [11], and the wasp Nasonia vitripennis [12,13].
Although the sequence composition of B chromosomes is unknown in most cases, several studies using fluorescent in situ hybridization (FISH) and next-generation sequencing (NGS) have allowed better characterization of repetitive DNA sequences and single-copy genes located on B chromosomes [14,15]. Notably, these data provided new insights into the origin of B chromosomes [8,16-18] and have suggested that some of the DNA contained in B chromosomes is potentially functional, as ribosomal DNA within the B chromosome of the grasshopper Eyprepocnemis plorans is able to yield the corresponding phenotype (i.e., a nucleolus) [19].
Recently, after karyotyping three populations from the Tietê River basin and performing chromosome painting using a B-specific probe, Scudeler et al. [25] concluded that the heterochromatic B type had an intraspecific origin, due to sharing DNA sequences with several A chromosomes, and that it arose independently from the euchromatic B chromosome, as a painting probe produced from the former B type did not paint the latter variant. In an attempt to extend these conclusions further, in the present study, we analyzed a natural population carrying the euchromatic and heterochromatic variants by means of a combination of cytogenetic (C-banding, microdissection, chromosome painting, FISH mapping, silver staining), molecular (PCR amplification, cloning, Sanger DNA sequencing and Illumina sequencing), phylogenetic, and bioinformatics techniques, additionally providing the first report for a B-lacking population in this species.

Ethics Statement
Sampling was carried out on private lands, and the owners gave permission to conduct this study. The animals were captured using nets, transported to the laboratory and kept in a fish tank for 2 days, and they were anesthetized before the analyses. The animals were collected in accordance with Brazilian environmental protection legislation (Collection Permission MMA/IBAMA/SISBIO-number 3245) and the procedures for the sampling, maintenance and analysis of the fishes were performed in compliance with the Brazilian College of Animal Experimentation (COBEA) and approved (protocol 504) by the Bioscience Institute/UNESP Ethics Committee on the Use of Animals (CEUA).

Sampling, chromosome banding and DNA extraction
Individuals of M. sanctaefilomenae were sampled in two rivers of the Paraná River system, with 23 specimens (11 females and 12 males) being collected from the Batalha River (BR), belonging to the Tietê River basin (Bauru, SP; 22°24'23.65"S, 49°05'51.38"W), and 16 individuals (11 females and 5 males) from the Novo River (NR), in the Paranapanema River basin (Ocauçu, SP; 22°28'13.32"S, 49°55'26.17"W). These rivers are separated from each other by approximately 170 km as the crow flies and by several hundred kilometres through the rivers connecting them. After analysis, all specimens were deposited in the fish collection of the Laboratório de Biologia e Genética de Peixes (LBP) at UNESP, Botucatu, São Paulo, Brazil, under voucher numbers LBP19830 (Batalha River) and LBP19831 (Novo River). After capture, the animals were taken to the laboratory, transferred to an aerated 40 L glass aquarium (60 x 30 x 30 cm) at 25°C and kept there for at most three days until sacrifice. The animals were not fed during this period.
Before analysis, the animals were sacrificed by overdose of anaesthetic in 1% benzocaine in water. Mitotic chromosomes were obtained from cell suspensions from the anterior kidney according to Foresti et al. [26]. C-banding was carried out according to Sumner [27], and active nucleolar organizer regions (NORs) were revealed according to Howell and Black [28]. The chromosomes were classified as metacentric (m), submetacentric (sm), subtelocentric (st) and acrocentric (a) according to Levan et al. [29]. Genomic DNA (gDNA) was obtained from liver cells using the Promega Wizard Genomic DNA Purification Kit according to the manufacturer's instructions.

Chromosome microdissection
Cell suspensions were dropped onto clean coverslips. The coverslips were then washed in a 1x PBS solution for 1 minute, incubated in a trypsin solution (1% Trypsin, 1x PBS) for 20 seconds and washed again in 1x PBS. Finally, the preparations were stained with 5% Giemsa in PBS for 5 minutes. For the microdissection of B1, we used cell suspensions from individual 69693, which presented only this variant (Table 1). To microdissect the B2 variant, C-banding was performed to allow better identification of the heterochromatic B chromosome in the metaphase spread. In addition, one autosome from a B-carrying individual was also microdissected (pair no. 1, henceforth referred to as A1). It must be noted that all microdissection experiments were carried out using single-copy chromosomes.
Each microdissected chromosome was transferred to a micropipette containing a collection solution (1.5 μg/μl proteinase K, 0.1% SDS, 0.1% Triton X-100, 1 mM EDTA, 10 mM Tris-HCl, pH 8.0, 10 mM NaCl) and placed for 1 h at 60°C in a moist chamber. Then, the pipette tips were broken into a 0.2 ml microtube containing 5 μl of sterile MilliQ water, followed by amplification using the GenomePlex Single Cell Whole Genome Amplification Kit (WGA-Sigma).

Whole-genome sequencing (WGS) and satellite DNA identification
To perform a deeper search for satDNAs in the M. sanctaefilomenae genome, we sequenced gDNA from two individuals collected from the Batalha River (carrying up to 6B) and the Novo River (0B) on the Illumina HiSeq2000 platform, yielding 2x101 bp paired-end reads. After a quality trimming step with Trimmomatic [30] (filtering out reads in which fewer than 90% of the bases had a quality of at least Q20), we sampled 200,000 pairs of reads (100,000 reads from each population) for clustering using RepeatExplorer [31] considering paired-end reads, with clustering and assembly overlap lengths equal to 55 and 40 bp, respectively. For this analysis, we also built a custom database of repeated sequences by running RepeatModeler [32] on the assembled A. mexicanus genome (GenBank accession number APWO00000000.1) as a complement to RepBase for cluster annotation. This custom database resulted in 1,243 sequences, consisting of 589,736 bp and N50 = 613 bp. We subsequently searched for clusters of satDNA families (i.e., unannotated) with sphere or ring shapes and a graph density higher than 0.1. Next, we manually processed the assembled contigs with Geneious Pro v8.04 using the High Sensitivity/Slow option to visualize dotplot graphics to detect tandem repetitions. We then split these sequences into monomers, aligned them and obtained a consensus sequence of the monomeric units.
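The read-level quality criterion described above can be illustrated with a minimal Python sketch. This is not the Trimmomatic implementation used in the study; it assumes Phred+33 quality encoding and assumes that a read is kept when at least 90% of its bases reach Q20. The function names `passes_quality_filter` and `filter_pairs` are hypothetical.

```python
def passes_quality_filter(quality_string, min_q=20, min_fraction=0.90,
                          phred_offset=33):
    """Return True if at least `min_fraction` of bases have quality >= min_q.

    Assumes Phred+33 encoding (an assumption, not stated in the text).
    """
    if not quality_string:
        return False
    good = sum(1 for ch in quality_string if ord(ch) - phred_offset >= min_q)
    return good / len(quality_string) >= min_fraction


def filter_pairs(pairs, **kwargs):
    """Keep read pairs in which both mates pass the filter.

    `pairs` is a list of ((seq1, qual1), (seq2, qual2)) tuples.
    """
    return [
        (r1, r2)
        for r1, r2 in pairs
        if passes_quality_filter(r1[1], **kwargs)
        and passes_quality_filter(r2[1], **kwargs)
    ]
```

Requiring both mates to pass keeps the pairing intact, which matters here because the clustering step uses paired-end information.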

FISH
Prior to the FISH experiments, all probes were labelled with digoxigenin-11-dUTP or biotin-16-dUTP. The painting probes were labelled using the GenomePlex (WGA3 Reamplification Kit-Sigma) following the manufacturer's protocol, and the repetitive DNA probes were labelled via PCR. FISH was performed under high stringency conditions using the method described by Pinkel et al. [36]. The pre-hybridization conditions differed according to the probes used. Thus, slides probed with repetitive sequences were incubated with RNAse (50 μg/ml) for 1 h at 37°C, and the chromosomal DNA was denatured in 70% formamide/2x SSC for 5 min at 70°C. For each slide, 30 μl of hybridization solution (containing 200 ng of each labelled probe, 50% formamide, 2x SSC and 10% dextran sulphate) was denatured for 10 minutes at 95°C, then dropped onto the slides and allowed to hybridize overnight at 37°C in a moist chamber containing 2x SSC. Slides probed with whole-chromosome paints were incubated with 0.005% pepsin/10 mM HCl for 10 min, and the chromosomal DNA was denatured in 70% formamide/2x SSC for 3 min at 70°C. For each slide, 30 μl of hybridization solution (containing 200 ng of labelled probe, 50% formamide, 2x SSC, 10% dextran sulphate and 3 μg of salmon sperm DNA) was denatured for 10 minutes at 85°C and allowed to pre-hybridize for 30 min at 37°C, then dropped onto the slides, followed by sealing with rubber cement and hybridization at 37°C in a moist chamber containing 2x SSC for 36 h. Post-hybridization, all slides were washed in 0.2x SSC/15% formamide for 20 min at 42°C, followed by a second wash in 0.1x SSC for 15 min at 60°C and a final wash at room temperature in 4x SSC, 0.5% Tween for 10 min. Probe detection was carried out with avidin-FITC (Sigma) or anti-digoxigenin-rhodamine (Roche), and the chromosomes were counterstained with DAPI (4',6-diamidino-2-phenylindole, Vector Laboratories) and analyzed using an optical photomicroscope (Olympus BX61).
Images were captured using Image Pro plus 6.0 software (Media Cybernetics). From each individual, a minimum of 10 cells was analyzed to confirm the FISH results and estimate the number of B chromosomes per cell.

DNA amplification, cloning and sequencing
We amplified, cloned and sequenced partial H3 histone genes and both satDNAs from various samples, including a microdissected euchromatic B chromosome (B1), a microdissected heterochromatic B chromosome (B2), 0B gDNA from M. sanctaefilomenae from the Novo River and 0B gDNA from A. fasciatus. In addition, MS3 and MS7 satDNA sequences were obtained from the longest autosome pair (A1). The reactions were performed in 1x PCR buffer, 1.5 mM MgCl2, 200 μM each dNTP, 0.5 U of Taq polymerase (Invitrogen), 0.1 μM each primer and 50 ng of DNA. The cycling program for amplification of these regions consisted of an initial denaturation at 95°C for 5 min, followed by 32 cycles at 95°C for 45 s, 56°C for 30 s and 72°C for 1 min and a final extension at 72°C for 15 min. The PCR products were visualized in 2% agarose gels, and the fragment obtained from each sample was extracted from the gel and cloned into the pGEM-T Easy Vector (Promega, Madison, Wisconsin, USA). DNA sequencing was performed with the BigDye Terminator v3.1 Cycle Sequencing Ready Reaction Kit (Applied Biosystems) following the manufacturer's instructions. Although different strategies were adopted (i.e., testing different primer pairs and adding DMSO to the PCR), we failed to amplify any region of the 45S rDNA sequence from B chromosomes, probably because GC-rich sequences are usually under-amplified in WGA steps [7].

Extraction of MS3 and MS7 from Illumina reads
To obtain a detailed and reliable score of haplotype abundance for the MS3 and MS7 satDNA sequences from genomic libraries, we extracted the monomers directly from the Illumina reads. Because the reads are shorter than the monomer size of both satellites, we joined the paired reads using fastq-join (https://code.google.com/p/ea-utils/wiki/FastqJoin) with a minimum overlap of 6 bp.
For MS7, we aligned the joined reads against a dimer sequence of the satDNA with RepeatMasker software [37] and, using a custom Python script (https://github.com/fjruizruano/ngs-protocols/blob/master/rm_getseq.py), employed the alignment information from the output file (with the extension .out) to extract only the aligned region. Then, we mapped these sequences with Geneious Pro v8.04 against the dimer and extracted the central region corresponding to one monomer by manually deleting those sequences that did not cover an entire monomer. For MS3, after Geneious mapping, we cut the ends mapped out of the reference monomer and added them to the other end to obtain sequences starting and ending at the same positions, which was achieved with another custom Python script (https://github.com/fjruizruano/ngs-protocols/blob/master/sat_cutter.py).
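The end-trimming step described for MS3 amounts to rotating each monomer-length sequence so that it begins at the first position of the reference monomer. The following is a minimal sketch of that idea only, not a reproduction of sat_cutter.py; the function name `rotate_to_reference_phase` and its arguments are hypothetical, and the sketch assumes an indel-free alignment of a sequence exactly one monomer long.

```python
def rotate_to_reference_phase(read_monomer, ref_start):
    """Rotate a monomer-length sequence to start at reference position 0.

    read_monomer: sequence exactly one monomer long, taken from a tandem array.
    ref_start: 0-based position in the reference monomer to which the first
               base of read_monomer aligns (assumes no indels).
    """
    length = len(read_monomer)
    if not 0 <= ref_start < length:
        raise ValueError("ref_start must lie within the monomer")
    # Number of bases that fit before the end of the reference monomer.
    overhang = length - ref_start
    # Bases running past the reference end are moved to the front.
    return read_monomer[overhang:] + read_monomer[:overhang]
```

For instance, with a toy 8 bp monomer `ACGTTGCA`, a read sampled from the tandem array starting at position 3 (`TTGCAACG`) is rotated back to `ACGTTGCA`, so all monomers can be compared position by position.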
Comparisons of synonymous substitutions per synonymous site (dS) and non-synonymous substitutions per non-synonymous site (dN) of H3.2 from each library (B1, B2, A1 and 0B gDNA) were carried out using the nonparametric Kruskal-Wallis test, followed by Dunn's multiple post-hoc test, considering α = 0.05.
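For readers unfamiliar with the Kruskal-Wallis test, the statistic can be sketched in a few lines of plain Python. This is an illustrative implementation with the standard tie correction, not the software used in the study; Dunn's post-hoc test is not covered, and the function names are hypothetical. The P-value helper is valid only for two degrees of freedom (three groups), where the chi-square tail probability reduces to exp(-H/2).

```python
import math


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with tie correction (average ranks)."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; each receives their mean.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction if correction > 0 else float("nan")


def p_value_df2(h):
    """Chi-square upper-tail probability for exactly 2 degrees of freedom."""
    return math.exp(-h / 2)
```

As a consistency check, `p_value_df2(15.4)` gives approximately 0.0005 and `p_value_df2(21.16)` falls below 0.0001.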

Results
Individuals from the two analyzed populations showed a similar standard karyotype, all exhibiting 2n = 50 biarmed chromosomes (6m + 16sm + 28st), with no sex-related chromosomal dimorphism. Additionally, mitotically unstable B chromosomes were observed in the genomes of all studied specimens collected from the BR population, as manifested in the intraindividual variation in B numbers (0-6). However, all individuals from the NR population lacked B chromosomes.
In the BR population, there were two types of B chromosomes observed on the basis of the C-banding pattern and population frequency. The more frequent variant (designated B1) showed a light C-banding pattern (similar to euchromatin in the A chromosomes), whereas the less frequent variant (B2) showed a dark C-banding pattern (similar to heterochromatin in the A chromosomes) (Fig 1). While B1 was found in all 23 individuals analyzed from the BR population (100% prevalence), B2 was present in only four of them (17% prevalence) (Table 1). Both B types were found in males and females.

Similar DNA content in the two B chromosome variants
Whole-chromosome painting (WCP) with the B1 and B2 probes showed similar hybridization signals for the two probes on both B variants, in addition to the pericentromeric regions of approximately one-third of the A chromosomes (Fig 1a-1d). This was also apparent when double WCP with both B-derived probes was performed on metaphase cells from B-lacking NR individuals, with both hybridization signals co-locating in most cases (Fig 1e and 1f).
FISH mapping revealed remarkable differences between the two populations regarding the number of H3 histone, 5S and 18S rDNA clusters, while the U2 snDNA showed exactly the same pattern of chromosomal distribution (Fig 2).
The most extreme difference between the two populations was found for 18S rDNA, which was located only on the short arm of chromosome pair no. 6 in NR, whereas in BR, distal clusters appeared on many chromosomes in a homozygous (13, 14, 15, 17) or heterozygous (2, 12, 20, 22, 24, 25) state, in addition to the cluster on chromosome 6. Sequential FISH and silver impregnation (with the latter indicating active rDNA clusters) showed that the rDNA cluster on chromosome 6 was always active, while most other clusters on other chromosomes (including B chromosomes) in the BR population were inactive (S1 Table).
Among the tandem repeat gene families assayed, both types of B chromosomes carried only 18S rDNA and H3 histone gene sites, but the heterochromatic variant (B2) carried a larger 18S rDNA cluster than the euchromatic one (B1) (Figs 2 and 3).
To improve our knowledge of the DNA content of the B chromosomes, we searched for satellite DNA (satDNA) tandem repeats in two sets of Illumina HiSeq2000 paired-end reads obtained from whole-genome sequencing of M. sanctaefilomenae: one from a B-lacking individual from the NR population and one from a B-carrying individual from the BR population. Because all individuals analyzed from the BR population carried B chromosomes, it was necessary to analyse a B-lacking genome from the NR population. Unfortunately, the two repeat families present on the B chromosomes (18S rDNA and H3 histone genes) showed extensive spreading across A chromosomes in the B-carrying population, thus impeding the detection of changes in DNA repeat coverage between the B-carrying and B-lacking genomes. For this reason, we used the Illumina reads to search for satellite DNAs that might be useful as additional B-specific FISH markers.
Sequence clustering analysis resulted in 11,149 clusters, constituting genomic proportions of 25.6% and 24.4% for +B and 0B individuals, respectively. We designed primer pairs for seven putative satellite DNAs, then PCR amplified these DNAs and selected those showing a ladderlike pattern in agarose gels. Next, we generated DNA probes for FISH and mapped them to A and B chromosomes (data not shown), and we selected the two satDNA families showing a clustered distribution that were present on both A and B chromosomes (Fig 3e and 3f), henceforth referred to as MS3 (CL27) and MS7 (CL96). The MS3 satDNA showed a consensus sequence of 186 bp with a cluster density of 0.16 (S2 Fig) and did not show any similarity to the custom database. Notably, this cluster was 2x more abundant in the BR library (0.236%) than in the NR library (0.117%). The MS7 satDNA exhibited a consensus sequence of 100 bp with a cluster density of 0.55 (S2 Fig) and yielded similarity hits with DNA/TcMar-Tc1 (54% of hits) in the A. mexicanus database. In terms of abundance in different libraries, there was almost no difference observed for this cluster between the two populations analyzed (0.0285% for BR and 0.0295% for NR). FISH analysis corroborated the abundance data and revealed that MS3 was located in the pericentromeric region of 13 chromosome pairs in the BR population, but only 6 pairs in the NR population. Therefore, the higher abundance of MS3 in the B-carrying population was not due to B chromosomes (even though they actually carry this satellite) but to its presence on twice as many A chromosomes. Conversely, MS7 was located in the telomeric regions of 15 pairs in both analyzed populations.
Taken together, these results indicate that both B variants contain essentially the same DNA repeats (H3 histone genes, 18S rDNA and MS3 and MS7 satDNAs). The fact that autosome pair no. 6 is the only A chromosome carrying all of these repeat families strongly points to the possibility that both B chromosomes were derived from the pericentromeric region of this chromosome.

The two B chromosome variants show a similar degree of mitotic instability
Because the number of B chromosomes varied among cells within the same individual, we performed an analysis of the degree of mitotic instability causing this variation. For this purpose, we used a mitotic instability index previously developed in a migratory locust [42] that is based on the assumption that the median number of B chromosomes in the adult represents the number of B chromosomes in the zygote stage. This mitotic instability index (MI) measures the sum of deviations in B numbers in a sample of cells with respect to the median, normalized per B chromosome.
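A minimal sketch of such an index is given below. It assumes the form MI = Σ|n_i − M| / (M · N), where n_i is the number of B chromosomes in cell i, M is the median number of Bs and N is the number of cells scored; this formula is our reading of the verbal description above and should be checked against [42]. The function name is hypothetical.

```python
from statistics import median


def mitotic_instability_index(b_counts):
    """Sum of absolute deviations from the median B number, per B and per cell.

    b_counts: number of B chromosomes scored in each cell of one individual.
    Assumed formula: sum(|n_i - M|) / (M * N).
    """
    m = median(b_counts)
    if m == 0:
        raise ValueError("index is undefined when the median B number is zero")
    return sum(abs(n - m) for n in b_counts) / (m * len(b_counts))
```

For example, five cells with 2, 2, 2, 3 and 1 B chromosomes have a median of 2 and a total deviation of 2, giving MI = 2 / (2 × 5) = 0.2, whereas an individual with the same B number in every cell has MI = 0.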
The fact that both B chromosome types contained H3 histone genes and 18S rDNA helped us to identify the two B types in 657 mitotic metaphase cells subjected to double FISH and subsequent C-banding in 23 individuals from the BR population (mean = 29 cells per individual, SD = 6) to accurately score the number and type of B chromosomes (Fig 4). In each individual, we calculated the mean number of B chromosomes per cell, the median number of Bs and the mitotic instability index (MI). The results revealed that B1 and B2 showed a similar MI, but B1 was almost nine-fold more frequent than B2 (Table 1).
A comparison of the mean number of B chromosomes and MI per individual between the present data and those previously reported for the Tietê River [21,23], by means of the Kruskal-Wallis test, revealed significant differences for both the mean values (H = 21.16, df = 2, N = 46, P < 0.0001) and MI (H = 15.4, df = 2, N = 46, P = 0.0005). Given that the three samples were collected from the same river, but at different times, these results suggest a tendency of MI to increase across years. However, the mean number of B chromosomes per individual varied across samples in an inverted V-shape (Fig 4). This might suggest that the mitotic instability of B chromosomes has increased across the years, while the number of Bs per individual has reached a maximum, in accord with the existence of a tolerance threshold.

DNA sequence analysis suggests different ages for the two B variants
Three of these repetitive sequences (H3 histone, and MS3 and MS7 satDNAs) were PCR amplified from four sources (0B gDNA from NR and microdissected B1, B2 and A1 chromosomes from BR). Notably, we were unable to amplify any fragment of the major ribosomal sites from the microdissected libraries or histone H3 genes from the A1 chromosome. These PCR products were cloned and sequenced in both directions. After discarding primer regions, a total of 318-328 bp, 183-186 bp and 93-106 bp from the H3 histone, MS3 and MS7 sequences, respectively, were obtained from several clones and Illumina reads (Table 2). To minimize the impact of possible PCR/sequencing errors, we discarded singletons from subsequent analyses. All the isolated repetitive sequences were deposited in GenBank (accession numbers as follows: MS3: KU129073-KU129201, KU177253-KU177378; MS7: KU129202-KU130117, KU177379-KU177435; Histone H3: KU184525-KU184571).
The H3 histone gene sequences obtained from the B types showed two distinct isotypes: H3.2 (28 clones, 11 of which showed a 10 bp deletion, and 15 were polymorphic with respect to the Danio rerio H3.2 amino acid sequence; UniProt accession number Q4RF4) and H3.3 (6 clones, all of which were non-defective and identical to the Danio rerio H3.3 amino acid sequence; UniProt accession number Q6PI20). In the case of the 0B gDNA, however, we obtained only the H3.2 isotype (19 clones, 2 of which were defective, showing a 10 bp deletion and 1 polymorphism in relation to the Danio rerio H3.2 amino acid sequence).
Calculation of dN for the H3.2 sequences in each group revealed significant differences between the 0B gDNA sequences and those from B1 and B2 (H = 161.7, df = 2, N = 47, P < 0.0001). Post-hoc comparisons (not shown) failed to show significant differences between the two B chromosome types, but both B1 and B2 showed significantly higher dN values compared with the H3.2 sequences from the 0B genome (Table 3). In contrast, dS showed significant differences between the 0B, B1 and B2 sequences (H = 40.12, df = 2, N = 47, P < 0.0001), and post-hoc comparisons revealed significant differences in all cases (not shown). It was remarkable that dS was almost twice as high for B1 as for B2, suggesting that the former is probably older and has had more time to accumulate synonymous changes. In addition, the dN/dS ratios of the B1 and B2 sequences were higher than that for the 0B gDNA, suggesting that purifying selection is relaxed in the B chromosomes. The minimum spanning tree built with H3.2 haplotypes showed a certain degree of differentiation of the B-derived sequences (Fig 5).
In the case of the MS3 and MS7 satDNAs, we obtained 62 and 57 PCR clones, respectively. In addition, 193 and 916 monomers were extracted from the Illumina reads for the MS3 and MS7 satDNAs, respectively (Table 2). We aligned the Illumina reads for each satDNA and built minimum spanning trees considering haplotype relative abundance, upon which we traced the PCR sequences obtained from the different libraries (Fig 6). Illumina reads are expected to provide accurate estimates of haplotype abundance, without the bias of PCR amplification. In both cases, the satDNAs amplified from A. fasciatus showed the highest divergence (as inspected in the nucleotide alignment), as expected for DNA sequences subjected to concerted evolution. The most abundant haplotypes for MS3 and MS7 in M. sanctaefilomenae, found in the Illumina reads (Fig 6), were present in the 0B and 6B genomes. However, for MS3, this haplotype was found only on the B2 chromosome and not on the B1 chromosome or in the 0B genomic DNA (PCR). The minimum spanning trees showed higher conservatism for MS7 (Fig 6b) than MS3 (Fig 6a). Remarkably, the tree for the MS3 satDNA showed that the most abundant haplotype found in the 0B and 6B Illumina-sequenced genomes was present in the PCR sequences obtained from the B2 chromosome, but not in those coming from the B1 chromosome. This difference would not be expected if one B type was derived from the other. In addition, the absence of the most common haplotype on B1 and its presence on B2 would be consistent with an independent and more recent origin for the latter (conceivably from the same A chromosome).
The tree for the MS7 satDNA was consistent with the former conclusion because all DNA sequences found on the B2 chromosome corresponded to the most abundant haplotype in the 0B and 6B genomes, whereas only 6% of the DNA sequences obtained from the B1 chromosome corresponded to that haplotype, and the remainder belonged to a different haplotype showing one mutational difference.
Since their first description, several authors have proposed the occurrence of three B types in this species, distinguishable by their C-banding patterns [21,24]. Recently, it was reported that some B chromosomes in this species carry 18S rDNA clusters, while others do not [25]. In the present study, C-banding and FISH analyses allowed us to clearly identify only two different B types, which showed differences in terms of C-banding patterns, the abundance of 18S rDNA and population frequency. We also analyzed a novel population collected in the Paranapanema River basin and, for the first time in this species, no B chromosome-bearing individuals were found. In fact, B chromosomes are usually highly dynamic, and the occurrence of populations with and without B chromosomes is common. Previous studies on the grasshopper Myrmeleotettix maculatus have shown a cline of B chromosomes correlated with temperature or rainfall, so that supernumerary elements are absent from populations in climatically stringent environments [43,44]. On the other hand, the geographical distribution of B chromosomes in the grasshopper Eyprepocnemis plorans appears to be shaped by historical non-selective events, such as the occurrence of geographical barriers that limited the spread of B-carrying individuals [45,46]. These two B chromosome systems (in M. maculatus and E. plorans) illustrate two of the main explanations for the existence of populations lacking B chromosomes: i) that B chromosomes have not yet reached these populations, as in E. plorans; and ii) that selective constraints are acting to prevent their occurrence, as in M. maculatus. In the present study, however, the question of whether B chromosomes ever existed and were eliminated from this … [18,21,47-51], while the grasshopper Eyprepocnemis plorans constitutes the most complete known example of B chromosome diversification [52,53].
In this last species, several studies have shown the existence of numerous B variants (more than 40) that probably arose from a common ancestral B chromosome [53,54]. In the B chromosome system of the fish M. sanctaefilomenae, the B1 and B2 variants showed a similar degree of mitotic instability, whereas the frequency of the euchromatic variant (B1) was almost nine-fold higher than that of the heterochromatic variant (B2). Mitotic instability is a frequent mechanism underlying drive [55], but because both variants were similarly unstable, we can infer that it is not responsible for the frequency difference between B1 and B2, although other conceivable drive mechanisms (e.g., meiotic) should be analyzed in future experiments. Nevertheless, our sequence analysis of MS3 and MS7 satellite DNAs suggests that B1 is older than B2, which would be consistent with the higher frequency of the former in the BR population. If B2 is actually younger, we would expect it to increase in frequency during the coming years, most likely at the expense of B1, because the frequency of B chromosomes appears to have reached a maximum in this population. This population therefore provides an opportunity to witness the possible replacement of one B variant by another, similar to what was previously reported in the grasshopper Eyprepocnemis plorans [56].
Sharing of repetitive DNAs between A and B chromosomes is a common feature, as demonstrated in various animals, including fish, grasshoppers and mammals [7,9,49,57,58]. The composition of B chromosomes has been used to identify the probable ancestral chromosome in the host species [7,9,35,49]. Our WCP and FISH mapping results showed that the B1 and B2 chromosomes of M. sanctaefilomenae are composed of the same repetitive DNA sequences, suggesting a common intraspecific origin of these chromosomes. Recently, Scudeler et al. [25] suggested an intraspecific origin of B chromosomes in this species, based on WCP results indicating the presence of DNA sequences shared between A and B chromosomes. Our present results are consistent with this conclusion. However, they also suggested an independent origin for different B variants in this species because these variants were not painted with the only B probe employed. We cannot rule out the possibility that other B variants were present in these authors' samples, but our present analysis revealed that all of the B chromosomes observed in the 23 individuals analyzed in the BR population contained H3 genes and 18S rDNA.
Because autosomal pair no. 6 is the only pair in the A karyotype that exhibits co-located histone, 18S rDNA, MS3 and MS7 sites (i.e., the repetitive DNA sequences contained on B chromosomes), we suggest that this pair might be the B chromosome ancestor. Remarkably, the minimum spanning trees obtained from MS3 and MS7 satDNA sequences also supported the hypothesis of an intraspecific origin, given the high similarity, and even shared haplotypes, between the sequences located on the A and B chromosomes of M. sanctaefilomenae. However, 52% of the MS3 and 100% of the MS7 sequences observed on B2 corresponded to the most frequent haplotype found in the Illumina reads from the B-lacking and B-carrying populations, whereas these figures were 0% and 6%, respectively, for B1. This suggests that the two B chromosomes arose independently from autosome 6, such that the most recent B type (B2) still conserves many satDNA repeats of the commonest haplotype in the A genome, which is found even in distant populations. The observed differences in satellite DNA sequences between the two B variants also suggest that concerted evolution might act separately for each B type. Remarkably, MS3 and MS7 nucleotide diversity is lower in B1, perhaps because its greater age and population frequency have given it more opportunity for sequence homogenization. In the grasshopper Eyprepocnemis plorans, males with two or more B chromosomes form chiasmated B-bivalents during meiosis [59], which helps to explain the observed variation in the amount of distally located 45S ribosomal DNA [60]. Likewise, in M. sanctaefilomenae, the frequent presence of cells with two or more B1 chromosomes would allow the formation of B-bivalents during meiosis and possible unequal crossovers, yielding sequence homogenization. The fact that B1 shows almost twice the dS value for H3.2 histone genes as B2 also supports the conclusion that B1 is older than B2.
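The haplotype-sharing comparison described above can be illustrated with a minimal Python sketch. The sequences are toy strings, not data from this study, and the function and variable names are hypothetical; the sketch only shows how the share of cloned repeats matching the commonest genomic haplotype could be computed.

```python
from collections import Counter


def most_frequent_haplotype(sequences):
    """Return the sequence that occurs most often in a list of repeats."""
    if not sequences:
        raise ValueError("no sequences supplied")
    return Counter(sequences).most_common(1)[0][0]


def haplotype_share(sequences, reference_haplotype):
    """Fraction of sequences identical to the reference haplotype."""
    if not sequences:
        raise ValueError("no sequences supplied")
    matches = sum(seq == reference_haplotype for seq in sequences)
    return matches / len(sequences)


# Toy data (assumed for illustration only; not sequences from the paper).
genome_reads = ["ACGT", "ACGT", "ACGT", "ACTT"]  # stands in for Illumina reads
b2_clones = ["ACGT", "ACGT", "ACTT", "ACGA"]     # stands in for B2 clones
b1_clones = ["ACTT", "ACTT"]                     # stands in for B1 clones

common = most_frequent_haplotype(genome_reads)
b2_share = haplotype_share(b2_clones, common)
b1_share = haplotype_share(b1_clones, common)
```

A higher share on one B variant than on the other, as reported here for B2 relative to B1, is the pattern interpreted in the text as less divergence from the A genome.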
Once they originate, B chromosomes are subjected to nearly the same genetic conditions that affect the molecular evolution of sex chromosomes, particularly regarding the degeneration of the heteromorphic Y or W chromosomes, which includes loss of both functional loci and sequence homology with the regular genome as well as heterochromatin gains [3,61]. In this context, if B2 were derived from B1, it would be unexpected for the younger B chromosome (B2) to be heterochromatic. However, their divergent C-banding responses might not be associated with their relative ages, but with their independent origins and differences in 18S rDNA contents. In this context, the heterochromatic nature of B2 seems to be related to the large amounts of rDNA-associated heterochromatin on this chromosome. B chromosomes containing 18S rRNA genes have been described in different fish species [9,62], but their activity has only been demonstrated for the euchromatic variant (B1) in M. sanctaefilomenae [24]. Several other examples of NOR-bearing B chromosomes have been described in different organisms (reviewed in [61]), and there is no general trend between the C-banding response and NOR activity of the B chromosomes. For example, the variants B1 and B24 in the grasshopper E. plorans are C-band positive, show active NORs and exhibit no correlation between rDNA content and NOR activity [60,63–66], whereas in the plants Allium flavum and Crepis capillaris, the active NOR sites are located outside the constitutive heterochromatin on the B chromosomes [67,68]. It is noteworthy that the main structural difference between the euchromatic (B1) and heterochromatic (B2) variants is the higher content of rDNA in the latter. Similarly, in the A chromosomes of different salmonid species, rDNA loci with smaller FISH signals show faint C-band heterochromatin, while larger clusters are coincident with strongly positive C bands [69,70].
This condition is probably related to the interspersed organization of the rDNA within the repeated DNA sequences of the heterochromatin, although the possibility that the rDNA itself contributes to heterochromatin formation cannot be ruled out [69]. The B chromosomes of M. sanctaefilomenae therefore provide a model to test the relationship between rDNA and heterochromatin contents and their roles in determining the C-banding pattern.
Other multigene families have also been found on the B chromosomes of several species [7,9,71]. Regarding histone genes, only the H3.2 subtype has hitherto been reported on B chromosomes [7,9]. In M. sanctaefilomenae, two non-defective (thus, potentially active) H3 histone types (H3.2 and H3.3) were found on the euchromatic and heterochromatic B chromosome variants. Remarkably, while H3.2 was represented by several haplotypes on both B types, the six H3.3 clones analyzed from the two B chromosome variants showed exactly the same DNA sequence. Interestingly, the coding nature of histone genes allows additional inferences to be made about the DNA sequences contained in the B chromosomes in terms of the dN/dS ratio. In general, higher dN/dS ratios are expected for coding DNA sequences residing on B chromosomes compared with the same sequences on the A chromosomes because selection is assumed to be relaxed in B chromosomes due to their dispensable nature. This situation has been reported for the H3 and H4 histone genes of the grasshopper L. migratoria, where the dN/dS ratios are 2.23 and 1.72 times higher for the B chromosome, respectively [7], and for the H1 histone gene of the fish A. paranae, where the dN/dS ratio is 3 times higher for the B chromosome [9]. Our present results are consistent with these previous observations because they showed 5.77-fold (B1) and 3.78-fold (B2) higher dN/dS ratios for the B chromosomes compared with the 0B genome, indicating that purifying selection is relaxed for the H3.2 genes located on the B chromosomes. This suggests that the H3.2 genes on the B chromosomes are most likely inactive. Conversely, H3.3 genes appear to be conserved on B chromosomes. It is conceivable that the complete identity of the six DNA sequences analyzed could be due to purifying selection, thus implying the possible functionality of the B chromosome copies, but further research will be necessary to address this issue.
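For reference, the comparison of selective constraint discussed above can be written compactly. The symbols ω and F are notation introduced here for illustration only and are not taken from the original analysis; the fold values are those reported in the text for the H3.2 gene.

```latex
\omega = \frac{d_N}{d_S}, \qquad
F_{B_i} = \frac{\omega_{B_i}}{\omega_{0B}}, \qquad
F_{B_1} = 5.77, \quad F_{B_2} = 3.78
```

Under purifying selection ω is below 1 and moves towards 1 as constraint is relaxed, so F greater than 1 indicates weaker constraint on the B-located copies than on the 0B genome copies.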
In conclusion, analysis of the B chromosome contents of several types of repetitive DNA sequences (18S rDNA, H3 histone genes and two satellite DNAs) and comparison with those of the A chromosomes by means of FISH mapping, chromosome painting, and DNA sequencing (by Sanger and Illumina methods) revealed that the Neotropical fish M. sanctaefilomenae harbours two B chromosome variants differing in their C-banding patterns, frequency and abundance of 18S rDNA. Both B variants were presumably derived independently from the same A chromosome (autosome no. 6), but the heterochromatic variant shows signs of being younger than the euchromatic variant. Finally, both B variants showed higher dN/dS ratios for the H3.2 histone gene than the 0B genome, suggesting that purifying selection is relaxed for the B-sequences, as expected if they are mostly inactive.