An S-Locus Independent Pollen Factor Confers Self-Compatibility in ‘Katy’ Apricot

Loss of pollen-S function in Prunus self-compatible cultivars has been mostly associated with deletions or insertions in the S-haplotype-specific F-box (SFB) genes. However, self-compatible pollen-part mutants defective for non-S-locus factors have also been found, for instance, in the apricot (Prunus armeniaca) cv. ‘Canino’. In the present study, we report the genetic and molecular analysis of another self-compatible apricot cultivar, ‘Katy’. The S-genotype of ‘Katy’ was determined as S1S2, and S-RNase PCR-typing of selfing and outcrossing populations from ‘Katy’ showed that pollen gametes bearing either the S1- or the S2-haplotype were able to overcome self-incompatibility (SI) barriers. Sequence analyses showed no SNP or indel affecting the SFB1 and SFB2 alleles from ‘Katy’ and, moreover, no evidence of pollen-S duplication was found. Taken together, the results are compatible with the hypothesis that the loss of function of an S-locus-unlinked factor gametophytically expressed in pollen (M’-locus) leads to SI breakdown in ‘Katy’. A mapping strategy based on segregation distortion loci mapped the M’-locus within an interval of 9.4 cM at the distal end of chr.3 corresponding to ∼1.29 Mb in the peach (Prunus persica) genome. Interestingly, pollen-part mutations (PPMs) causing self-compatibility (SC) in the apricot cvs. ‘Canino’ and ‘Katy’ are located within an overlapping region of ∼273 kb in chr.3. No evidence is yet available to discern whether they affect the same gene, but molecular markers seem to indicate that the two cultivars are genetically unrelated, suggesting that each PPM may have arisen independently. Further research will be necessary to reveal the precise nature of the ‘Katy’ PPM, but fine-mapping already enables SC marker-assisted selection and paves the way for future positional cloning of the underlying gene.


Introduction
Gametophytic self-incompatibility (GSI) is a widespread mechanism in the plant kingdom that prevents inbreeding [1]. In Solanaceae, Plantaginaceae and Rosaceae, GSI is controlled by the S-locus, which contains at least two genes coding for S-RNase and F-box proteins. S-RNases are specifically expressed in the style and their ribonuclease activity is essential for self-pollen rejection [2][3][4]. In turn, the S-locus F-box proteins (SLF or SFB) are the pollen S-determinants [5][6][7]. Evidence accumulated in Petunia and Antirrhinum supports a model in which SLFs are components of an SCF E3 ubiquitin ligase complex that interacts with non-self S-RNases, leading to their ubiquitination and degradation by the 26S proteasome proteolytic pathway [8,9]. Alternatively, the compartmentalization model proposed by Goldraij et al. [10] in Nicotiana explains the resistance to non-self S-RNases by their sequestration in vacuolar compartments of compatible pollen tubes. A hypothetical S-RNase endosome sorting model involving both S-RNase degradation and compartmentalization has been recently proposed [11], but many pieces of the puzzle remain elusive.
Spontaneous and induced self-compatible mutants have been particularly important to support S-RNase and S-locus F-box genes as the S-determinants in Prunus (Rosaceae), since other functional approaches based on transgenic experiments are seriously hindered in this genus. For instance, a Mu-like element insertion upstream of the S6m-RNase in sour cherry (Prunus cerasus) [12] and a similar mutation in the Japanese plum (Prunus salicina) Se-RNase [13] reduce the S-RNase expression level, leading to an insufficient accumulation of S-RNase in the pistil, which breaks the rejection mechanism. Modifications affecting the S-RNase structure and conferring self-compatibility (SC) have also been found in peach (Prunus persica), where the S2m-RNase shows reduced stability as a consequence of the replacement of the cysteine residue by a tyrosine in the C5 domain [14]. Regarding pollen-part mutations (PPMs), self-compatible mutants with non-functional SFB genes have been identified in sweet cherry (Prunus avium) [15][16][17], apricot (Prunus armeniaca) [18], sour cherry [19], Japanese apricot (Prunus mume) [15] and peach [14], supporting their role as the pollen-S determinants in this genus. In most of these cases, the self-compatible phenotype was associated with indels in the SFB coding region causing a frame-shift in translation that produces a non-functional truncated protein [20]. This seems to be a specific feature of the S-RNase-based GSI system operating in Prunus, since in Solanaceae the only pollen-side mutations found to cause SC are due to the S-heteroallelic pollen effect [21]. Therefore, SLF mutations were initially suggested to confer SI or lethality, but recent findings provide an alternative explanation: in the SI system based on non-self recognition by multiple factors, shown to operate in Solanaceae [22] and Pyrus (Rosaceae) [23], the loss of pollen-S function does not lead to SC. In contrast, all loss-of-function mutations found in Prunus SFB cause SC, which may support differences in the self-recognition mechanism where the SFB target would be an S-RNase 'inhibitor' instead of the S-RNase itself [24]. Nevertheless, even considering the discrepancies, major similarities (i.e. S-RNase and SLF/SFB as S-specificity determinants) are still more striking and the model as a whole might be preserved across families [25].
As reported above, SC accessions found in Rosaceae are mostly related to mutations in pistil and pollen S-locus determinants [20]. However, mutations in non-S-locus factors have also been associated with SC in sweet cherry [26], almond (Prunus amygdalus) [27] and diploid strawberries (Fragaria spp.) [28]. Genetic evidence for S-locus-unlinked factors required for GSI, also called modifier genes, was previously accumulated in Solanaceae. For instance, Ai et al. [29] showed that the self-compatible Petunia hybrida cv. 'Strawberry Daddy' (SOSX) accumulates a non-functional S-allele (SO) and a stylar mutation in an additional factor necessary for SI. Later studies in Nicotiana revealed that the so-called 4936 stylar factor is also required for SI [30]. Moreover, mutations in modifier loci affecting the pollen-S function have been suggested to explain SI breakdown in Solanum tuberosum [31] and Petunia axillaris [32]. More intriguing is the behaviour of the PPM found in Solanum chacoense, which predicts an S-locus inhibitor (Sli) gene acting as a single dominant factor that displays sporophytic inhibition of SI [33,34]. More recently, some stylar modifier factors have been identified and successfully cloned in Nicotiana, such as the small asparagine-rich protein HT-B [35], the 120 kDa glycoprotein [36] and the Kunitz-type proteinase inhibitor NaStEP [37], but their role in SI has still not been completely elucidated. Pollen modifier factors have also been identified in the Solanaceae, such as the Petunia pollen-expressed Skp1-like protein PhSSK1, proposed to act as an adaptor in the SCF complex [38]. Interestingly, Matsumoto et al. [39] have identified a similar SFB-interacting Skp1-like protein (PavSSK1) in sweet cherry and suggest that it could also be a functional component of the SCF complex. Nevertheless, the identification of additional GSI modifier factors will be necessary to dissect the underlying mechanism in Prunus completely.
In apricot, the cv. 'Canino' (S2SC Mm) was found to contain two different mutations conferring SC: an insertion in the SFBC gene that produces a truncated SFBC protein and a mutation in a modifier gene (m) unlinked to the S-locus, both independently causing the loss of pollen-S function [18,40]. In this work, we have analyzed the self-compatible apricot cv. 'Katy' using genetic and molecular approaches, and the compiled evidence suggests that the loss of function of an S-locus-unlinked factor (M'-locus) is also involved in pollen-S function breakdown in this case. The possible roles of the mutated modifier gene are discussed in light of current knowledge on GSI in Prunus. In addition, we have paved the way for future positional cloning of the 'Katy' pollen-part modifier gene by fine-mapping the M'-locus to the distal part of apricot chr.3. Macro- and micro-synteny of this region have been studied by comparing it with the M-locus in 'Canino' and by analyzing the ORFs comprised in the peach syntenic region according to the peach genome v1.0 (International Peach Genome Initiative, IPGI; http://www.rosaceae.org/peach/genome).

Results
'Katy' is an Apricot Self-compatible Cultivar with S-genotype S1S2
'Katy' is an apricot variety developed by Zaigers Genetics (Modesto, CA, USA) and reported as self-fruitful [41]. In this study, SC of this cultivar was confirmed by self-pollination in the field (Table 1). To determine the S-genotype of 'Katy', fragments containing the first intron of the S-RNases were PCR-amplified using the SRc-F/SRc-R primers (Figure 1A). These fragments were assigned to the S1- and S2-alleles by comparison with known S-genotypes, following the nomenclature established by Burgos et al. [42]. This S-genotype was confirmed by amplification of the second intron using the primers Pru-C2/Pru-C4R [43], since the fragment sizes obtained coincided with those expected for the S1- and S2-alleles (Figure 1B). In addition, PCR-amplified fragments spanning the first intron were sequenced and compared with GenBank accessions, and were identical to the already identified Prunus armeniaca S-RNases 1 and 2. The alignment of their deduced amino acid sequences (44 aa) showed the presence of the C1 and C2 Prunus S-RNase conserved domains along with the hypervariable region HV1 located between them [44].

SC in 'Katy' is Associated with a PPM Unlinked to the S-locus
To analyze the nature of SC in 'Katy', this cultivar was self-pollinated and reciprocally crossed with 'Goldrich', a self-incompatible cultivar sharing the same S-genotype. S-RNase genotyping of the progenies derived from the 'Katy' (S1S2) self-pollination (Figure 1C) and the 'Goldrich' (S1S2) × 'Katy' (S1S2) outcross (Figure 1D) revealed three different S-genotypes (S1S1:S1S2:S2S2) in both cases (Table 1). In turn, the 'Katy' (S1S2) × 'Goldrich' (S1S2) cross did not produce any seedlings. Thus, 'Katy' pollen is able to grow through the 'Goldrich' pistil, whereas 'Goldrich' pollen is rejected in 'Katy' styles. According to these results, SI breakdown in 'Katy' may be due to a pollen-part mutation, since 'Katy' is completely functional as a female parent. Indirect evidence supporting this hypothesis was also obtained from the S-genotype segregation ratio in 'K×C', because the number of S2-bearing genotypes is lower than that expected for a non-functional pistil-S2 determinant (Table 1). Moreover, 'Katy' pollen carrying either S-allele is able to grow in 'Goldrich' and 'Katy' styles, suggesting that the PPM is unlinked to the S-locus.
To complement these observations, we performed additional crosses with cultivars having different S-genotypes. Figure 1E shows the S-RNase genotyping of the 'Harcot' (S1S4) × 'Katy' (S1S2) population, where S-genotypes fell into four classes (S1S1:S1S2:S1S4:S2S4) (Table 1). Two of these S-genotypes (S1S1 and S1S4) were unexpected, since pollen tubes carrying the S1-haplotype from 'Katy' were expected to be incompatible in 'Harcot' styles. On the other hand, reciprocal crosses with the cv. 'Canino' (S2SC Mm) produced four S-genotype classes (S2SC:S2S2:S1SC:S1S2). Given the two unlinked PPMs associated with SC in 'Canino' (SC and m), these four S-genotypes were expected for the 'K×C' progeny (Table 1). Nevertheless, since pollen tubes carrying the S2-haplotype should be arrested in S2-styles, the S2SC and S2S2 genotypes observed in the 'C×K' progeny were unexpected. The observed S-genotype segregation ratios in 'H×K' and 'C×K' fit those expected under a model where 'Katy' carries a heterozygous PPM affecting pollen-S function that is unlinked to the S-locus (2:2:1:1), with χ² values of 3.68 and 0.74 (P = 0.30 and P = 0.86) (Tables 1 and 2). In contrast, for a heterozygous PPM linked in coupling to the incompatible S-allele, or a homozygous PPM (linked or unlinked to the S-locus), the expected ratios (1:1:1:1) do not fit the observed data, with χ² values of 13.6 and 13.5, respectively (P < 0.004).
All crosses performed were compatible, except the 'Katy' × 'Goldrich' cross, and fruit set ranged approximately from 15% ('K×K') to 34% ('C×K'). Differences in germination rate and seedling fitness were striking. Only 59% of the 'K×K' inbred seeds produced healthy plants, while this percentage increased to 82-96% in the outcrossed seeds.
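The goodness-of-fit reasoning above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis pipeline: the function names are ours, the counts in the example are hypothetical (Table 1 is not reproduced here), and the P-values use the closed-form upper tail of the chi-square distribution with 3 degrees of freedom (four genotype classes).

```python
import math


def chi2_statistic(observed, ratio):
    """Pearson chi-square of observed class counts against an expected ratio."""
    scale = sum(observed) / sum(ratio)
    return sum((o - r * scale) ** 2 / (r * scale) for o, r in zip(observed, ratio))


def chi2_sf_3df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 d.f.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Chi-square values quoted in the text (3 d.f.)
for chi2 in (3.68, 0.74, 13.6, 13.5):
    print(f"chi2 = {chi2:5.2f}  P = {chi2_sf_3df(chi2):.4f}")

# Hypothetical counts for four S-genotype classes (NOT the data of Table 1)
counts = [22, 18, 11, 9]
print(chi2_statistic(counts, (2, 2, 1, 1)))  # unlinked heterozygous PPM model
print(chi2_statistic(counts, (1, 1, 1, 1)))  # linked or homozygous PPM models
```

Run on the quoted statistics, the first two values give probabilities close to the reported P = 0.30 and P = 0.86, and the last two fall below 0.004.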
Molecular Analysis of the Self-compatible cv. 'Katy' (S1S2)
To test whether 'Katy' pollen tubes escape rejection in pistils bearing a matching S-allele as a consequence of SNPs or indels affecting SFB1 and SFB2, genomic DNA fragments containing both alleles were cloned and sequenced. Genomic sequences of the S1- and S2-haplotype regions from the self-incompatible cv. 'Goldrich' (S1S2) were used as references [44]. No changes were found in the nucleotide sequences of the two cloned fragments (approximately 1.3 and 1.9 kb, respectively) containing the complete SFB
PPMs identified in Solanaceae are mostly associated with S-allele duplications caused by polyploidy or induced mutations [45]. To rule out this cause, we first examined the ploidy level of 'Katy' by flow cytometry analysis. The peaks of nuclei isolated from 'Katy' coincided with those detected in the control diploid plant ('Goldrich'), indicating that 'Katy' is diploid (data not shown). A hypothetical duplication of the SFB alleles in 'Katy' was also tested by a real-time PCR-based gene dosage assay, but the relative DNA amounts detected for SFB1 and SFB2 were not significantly different between 'Katy' and the self-incompatible cv. 'Goldrich' (Figure 2).
Gene expression analysis showed that the SFB1 and SFB2 alleles are specifically expressed in pollen in 'Katy' and 'Goldrich' (data not shown). Furthermore, the relative transcript abundance of SFB1 and SFB2 in 'Katy' and 'Goldrich' was quantified by real-time RT-PCR using actin as an endogenous control to normalize transcription values. No significant differences in transcript levels were found for either SFB allele between 'Katy' and the self-incompatible cv. 'Goldrich' (Figure 3), ruling out transcriptional repression of SFBs as the cause of SC.
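As an illustration of how normalization to an endogenous control works, the sketch below uses the widely used 2^-ΔΔCt calculation. The text does not state which quantification method was applied, so this is an assumption; the function name and all Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both target and control.

```python
def relative_expression(ct_target, ct_control, ct_target_ref, ct_control_ref):
    """Relative transcript abundance by the 2^-ddCt calculation.

    ct_target / ct_control: threshold cycles of the target gene (e.g. an SFB
    allele) and of the endogenous control (e.g. actin) in the test sample.
    ct_target_ref / ct_control_ref: the same two values in the reference sample.
    """
    delta_ct_sample = ct_target - ct_control            # normalize test sample
    delta_ct_reference = ct_target_ref - ct_control_ref  # normalize reference
    return 2.0 ** -(delta_ct_sample - delta_ct_reference)


# Hypothetical Ct values: identical normalized levels give a ratio of 1.0
print(relative_expression(24.0, 18.0, 24.0, 18.0))
# One cycle earlier for the target in the test sample means twice the transcript
print(relative_expression(23.0, 18.0, 24.0, 18.0))
```

A ratio near 1 for both SFB alleles would correspond to the "no significant differences" outcome described above.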

S-locus Unlinked PPM Conferring SC in 'Katy' is Located on Linkage Group 3
Overall, genetic and molecular evidence supports a model where 'Katy' is heterozygous for a PPM unlinked to the S-locus that confers SC. The locus containing this PPM in 'Katy' was referred to as the M'-locus to distinguish it from the M-locus previously reported in 'Canino' [40]. Thus, according to the S- and M'-locus genotypes, 'Katy' was designated as S1S2 M'm' (Table 2). Under the proposed genetic model, SSR markers linked to the M'-locus in 'Katy' selfing populations should be highly distorted, since only seedlings derived from 'Katy' pollen gametes carrying the m'-allele (S1m' or S2m') could be obtained (Table 2). Thus, the expected ratio for an SSR marker segregating independently of the M'-locus in the F2 populations is 1:2:1, while that for an absolutely linked SSR is 1:1. On this assumption, genome-wide distributed SSR markers were tested to look for associations with the M'-locus. To this end, 118 SSR markers distributed across the eight Prunus chromosomes (ranging from 9 in LG7 to 34 in LG3) were selected for mapping (Tables S1 and S2). Fifty-five of these SSRs (47%) were found to be polymorphic in 'Katy' and were subsequently tested in the 'K×K05' and 'K×K06' progenies (Table 3). According to the genetic maps constructed for each group, the maximum genetic distance estimated between any pair of markers was ∼52 cM in LG5 (Table 3). In terms of physical distance, determined from the peach genome sequence, the major gap was found in LG1 (∼23 Mb). Considering the estimated sizes for the peach genome (∼290 Mb) and for the Prunus general map (519 cM) [46], the relationship between physical and genetic distances is ∼0.56 Mb/cM on average. Accordingly, the LG1 23 Mb gap should correspond to ∼45 cM. Consequently, in the most unfavourable scenario, the distance between the M'-locus and the nearest marker should be lower than 25 cM and the recombination frequency lower than 0.25. In this hypothetical case, the expected ratio for an SSR linked to the M'-locus would be 1:4:3, and only markers located on LG3 and LG6 fulfill this prediction and show skewed segregations (χ² > 5.99 with P < 0.05 for 2 d.f.) (Table 3).
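The three expected ratios quoted above (1:2:1, 1:1 and 1:4:3) follow from a single expression in the recombination frequency r between the marker and the M'-locus. The sketch below is our own illustration under the stated model: female gametes are unselected, only pollen carrying m' is functional, and there are no other fitness differences. Function and key names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def selfing_marker_frequencies(r):
    """Expected marker genotype frequencies in the selfed progeny of an M'm' plant.

    r is the recombination frequency between the marker and the M'-locus.
    Functional (m') pollen carries the marker allele in coupling with m' with
    probability 1 - r; eggs carry either marker allele with probability 1/2.
    """
    r = Fraction(r)
    half = Fraction(1, 2)
    return {
        "repulsion_homozygote": half * r,
        "heterozygote": half * (1 - r) + half * r,
        "coupling_homozygote": half * (1 - r),
    }


def to_integer_ratio(frequencies):
    """Scale exact frequencies to the smallest integer ratio."""
    values = list(frequencies.values())
    lcm = reduce(lambda a, b: a * b // gcd(a, b), (v.denominator for v in values))
    return tuple(int(v * lcm) for v in values)


for r in (Fraction(1, 2), Fraction(1, 4), Fraction(0)):
    print(r, to_integer_ratio(selfing_marker_frequencies(r)))
```

With r = 1/2 the marker is unlinked (1:2:1), with r = 1/4 the ratio is 1:4:3, and with r = 0 one homozygous class disappears, leaving 1:1 between the heterozygote and the coupling homozygote.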
In agreement with the segregation of the S-genotypes in the analyzed populations, the M'-locus is proposed to be unlinked to the S-locus (Table 1). Therefore, LG3 or a region far from the LG6 distal end, where the S-locus is located, are likely positions for the M'-locus. To discern between these two possibilities, a more detailed segregation distortion locus (SDL) analysis was performed in LG3 (Table 4) and LG6 (Table S1) by including the 'K×K10' population and additional

Table 2. Expected gamete and seedling genotypes formed from the outcross 'Harcot' (S1S4) × 'Katy' (S1S2) and the selfing of 'Katy' (S1S2) considering 'Katy' heterozygous for a pollen-part mutation unlinked to the S-locus (M'm').

On the other hand, the magnitude of the segregation distortion detected in LG6 (χ² = 15.28 with P = 5×10⁻⁴ for PGS6_07) is lower than that found in LG3 (χ² = 31.30 with P = 1.6×10⁻⁷ for PGS3_23). This is due to the lower imbalance between homozygous genotypes found in PGS6_07 (7B against 32A) when compared with PGS3_23 (0B against 37A) (Table 4 and Table S1). It is inferred from the model that pollen gametes carrying SSR alleles linked in repulsion phase with the PPM would not grow in incompatible styles. Therefore, homozygous genotypes for these SSR alleles should not be obtained in the progeny, as observed for the distorted LG3 SSR markers and particularly for PGS3_23 (Table 4). Thus, both arguments support LG3 as the most likely location for the M'-locus, allowing us to discard LG6.
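As a quick check on the probabilities attached to these statistics: for 2 degrees of freedom (three genotype classes) the upper tail of the chi-square distribution has a simple closed form, which reproduces the significance threshold and the two P-values quoted for PGS6_07 and PGS3_23. The worked values below are our own arithmetic, not taken from the tables.

```latex
P\left(\chi^2_{2} > x\right) = e^{-x/2}

x = 5.99:  \quad e^{-2.995} \approx 0.050
x = 15.28: \quad e^{-7.64}  \approx 4.8 \times 10^{-4}
x = 31.30: \quad e^{-15.65} \approx 1.6 \times 10^{-7}
```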
High-density Mapping of the M'-locus on chr.3
To construct a high-density map of the M'-locus region on chr.3, 102 SSRs identified from the peach scaffold_3 sequence by Zuriaga et al. [40] (Table S2) and 18 additional SSRs available from the GDR website [47] were tested in 'Katy'. A higher percentage of these SSRs did not amplify or produced multi-band patterns in 'Katy' (40%) when compared with both 'Goldrich' and 'Canino' (∼30%). However, polymorphism of amplified SSRs was similar between 'Goldrich' and 'Katy' (∼55%) and significantly higher than that found in 'Canino' (23%) (Table S2). Polymorphic SSRs in 'Katy' were tested in 87 trees from the 'K×K' F2 population. Sixteen of them were mapped, forming an LG3 genetic map of 72 cM with an average marker density of 0.22 markers/cM (Table 4). This marker density increased to 0.62 markers/cM in the region flanked by the most distorted markers, PGS3_12 and AMPA119 (Table 4). An additional LG3 map obtained from the outcrossing population 'C×K' was found to be essentially collinear with the 'K×K' map (sharing >80% of markers), except for a single order change between AMPA119 and PGS3_32 (data not shown). The SDL associated with the M'-locus region were confirmed by analyzing 60 additional seedlings derived from the outcrosses 'H×K', 'G×K' and 'C×K' for all sixteen LG3 markers (Table 5). These seedlings were selected by their S-genotypes, so that they could only be derived from fertilization by a 'Katy' pollen gamete carrying the PPM (m') and could therefore be directly assigned to the M'm' genotype (Table 2). Skewed segregations in selfing (F2) and outcrossing populations suggested that the M'-locus is roughly located between PGS3_22 and PGS3_28 (Tables 4 and 5).
To define the M'-locus location more consistently, considering not only distortions but also genotyping data, an additional mapping strategy was performed. As described above, all 'K×K-F2' trees could only be derived from pollen gametes with genotype S1m' or S2m', and therefore have either the M'm' or the m'm' genotype. To discriminate between these two genotypes, the screening of F3 offspring was necessary. Thus, twelve 'K×K-F2' individuals, with recombination breakpoints mapping to the LG3 region between UDAp468 and CPDCT027, were self-pollinated to obtain F3 populations. Six of them (K05-15, K05-21, K06-18, K06-25, K06-34 and K06-37) were finally discarded from the analysis due to the low number of embryos obtained (less than 7 in four cases) or because they were redundantly represented (other F2 individuals with larger F3 populations have identical SSR genotypes in this genomic region). The six F3 populations obtained from the remaining F2 recombinants (K05-12, K05-24, K06-05, K06-06, K06-17 and K06-21) were tested for a subset of 6 SSRs encompassing the M'-locus (PGS3_13/PGS3_32 interval) (Table 6). SSR markers heterozygous in the F2 recombinant (H) were expected to segregate 1:1 in the F3 population when the F2 recombinant had the M'm' genotype and 1:2:1 if it had the m'm' genotype (Table 6). According to the segregation of these markers (A, H or B as per JoinMap 3.0 notation), the M'-locus was proposed to be flanked by the PGS3_22 and EPPCU7190 markers within an interval of 9.4 cM. Graphical ordering of genotype data enabled the positioning of recombination breakpoints to confirm map order (Figure 4A).

Macro- and Microsynteny Analysis of the M'-locus in Apricot
Eleven out of the sixteen SSR markers contained in the 'Katy' LG3 map had been previously mapped in the 'Canino' LG3 [40]. As a whole, these markers were found to be collinear between both maps (8 out of 11), but some order changes regarding PGS3_33, AMPA119 and EPPCU0532 were observed at the distal chromosome end (data not shown). In turn, marker order in the 'Katy' LG3 map was completely collinear with the physical position of the markers in the peach genome (Table 4 and Figure 4). Unfortunately, most of the markers surrounding the M-locus in 'Canino' LG3 were found to be monomorphic in 'Katy' and therefore could not be mapped (Table S3). Genetic differences between 'Katy' and 'Canino' were detected across the whole genome: they share only 38.8% of their SSR alleles and show a Nei's genetic distance of 0.83 (Table S4). Indeed, only a few collinear markers, such as PGS3_12, PGS3_15 and EPPCU7190, were useful to define a syntenic region between both apricot maps containing the M- and M'-loci and corresponding to a physical interval between 17.38 and 19.78 Mb in the peach genome (Figure 4A). The PGS3_22/EPPCU7190 interval comprising the M'-locus in 'Katy' corresponds to ∼1.29 Mb in the peach syntenic genomic region (between positions 18.490 and 19.780 Mb). Meanwhile, in 'Canino' the M-locus was predicted to be flanked by the PGS3_71 and PGS3_96 markers within an interval of 1.8 cM corresponding to ∼364 kb in the peach genome (between positions 18.399 and 18.763 Mb) [40]. Therefore, there is an overlapping interval between these two regions spanning ∼273 kb. To obtain a complementary view of the predicted positions for the M- and M'-loci, the relative frequency of individuals lacking SSR alleles in coupling phase with the PPM (expected to be zero for absolutely linked markers) was represented graphically on peach chr.3 (Figure 4B). To do this, only individuals carrying the 'Canino' m mutated allele from the 'G×C-01' population [40] or the m' allele from 'K×K' and 'Katy' outcrossing populations were computed. This analysis showed frequency values of zero in shorter overlapping intervals: PGS3_23 (18.61 Mb) in 'K×K', PGS3_22/PGS3_28 (18.49-19.14 Mb, ∼650 kb) in 'Katy' out-
The genomic landscape of the ∼1.29 Mb peach region syntenic to the apricot M'-locus contains 223 predicted gene transcripts as annotated by IPGI. Forty-two of these transcripts (located in the overlapping interval) were shared with the 'Canino' M-locus. BLASTP analysis of the ORFs against The Arabidopsis Information Resource (TAIR) database, with an expectation value cut-off < 1e-6, was used by IPGI to predict gene functions based on homology to Arabidopsis. Table S5 includes the results of the BLASTP analysis for the ORFs comprised in the M'-locus region (IPGI) and indicates those Prunus/Arabidopsis gene pairs that are best-reciprocal BLASTP hits identifying putative orthologues. According to the large-scale gene expression analysis performed by Wang et al. [48] in Arabidopsis mature pollen, hydrated pollen and pollen tubes using Affymetrix ATH1 Genome Arrays, up to 53 of these Arabidopsis homologues were found to be pollen-expressed (Table S5).
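The interval sizes and the overlap between the M- and M'-locus regions reported in this section follow directly from the peach genome coordinates. A minimal sketch, with coordinates converted to kb to keep the arithmetic exact (variable and function names are ours):

```python
def interval_overlap(a, b):
    """Return the overlapping (start, end) of two intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


# Peach chr.3 coordinates quoted in the text, in kb
katy_m_prime = (18490, 19780)  # PGS3_22 / EPPCU7190 interval ('Katy')
canino_m = (18399, 18763)      # PGS3_71 / PGS3_96 interval ('Canino')

shared = interval_overlap(katy_m_prime, canino_m)
print("M'-locus length (kb):", katy_m_prime[1] - katy_m_prime[0])
print("M-locus length (kb): ", canino_m[1] - canino_m[0])
print("overlap (kb):        ", shared[1] - shared[0], shared)
```

The same helper can be applied to the narrower zero-frequency intervals once their coordinates are available.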

Loss of Function of an S-locus External Factor is Responsible for SI Breakdown in 'Katy' (S1S2)
In this work, the North-American apricot cv. 'Katy', released by Zaigers Genetics (Modesto, CA, USA) in 1978 [41], was confirmed as self-fruitful and its S-genotype was determined as S1S2 following the nomenclature established by Burgos et al. [42]. However, previous reports assigned to 'Katy' the S-genotypes S8SC [49] and S1S8 [50]. In addition, these two manuscripts referred to 'Katy' as a spontaneous cultivar native to Europe and later introduced to China. Therefore, both the S-genotype and the geographic origin proposed by these authors suggest that the cultivars they analyzed might be different from the cv. 'Katy' we describe here. Wu et al. [50] also suggest that SC in 'Katy' is associated with PPMs that, according to the segregation of S-genotypes, seem to exert a polygenic control. Again, this is not the case in the Zaigers 'Katy', where SC is associated with a single PPM; however, some kinship between the two cultivars cannot be ruled out.
To investigate the genetics of SC, 'Katy' (S1S2) was self-pollinated and reciprocally crossed with the self-incompatible cv. 'Goldrich' (S1S2) [51,52]. 'Katy' pollen tubes bearing either the S1- or the S2-haplotype were able to grow in 'Katy' and 'Goldrich' pistils and to complete fertilization, producing the three S-genotype classes expected for an F2 population (S1S1:S1S2:S2S2). However, no progeny was obtained in the reciprocal cross using 'Katy' as female parent. These results support a PPM unlinked to the S-locus as the cause of SC. Crosses performed with other cvs. such as 'Harcot' (S1S4) and 'Canino' (S2SC) reinforce this conclusion, since seedlings carrying the 'Katy' S1-haplotype (when crossing with 'Harcot') and the S2-haplotype (when crossing with 'Canino') were also obtained. Moreover, segregation ratios in all crosses performed fit a model where 'Katy' is heterozygous for the PPM conferring SC (M'm') (see Table 2).
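The expected S-genotype classes under this model can be enumerated mechanically. The sketch below is a simplified illustration of the genetic model rather than the authors' procedure: it assumes that a pollen gamete is accepted if its S-haplotype is absent from the style or if it carries the m' allele, that the modifier segregates independently of the S-locus, and that eggs and compatible pollen classes contribute equally. The function name and the genotype encoding are hypothetical.

```python
from collections import Counter
from itertools import product


def progeny_classes(female_s, male_s, male_modifier=("M'", "m'")):
    """Count expected S-genotype classes for the cross female x male.

    female_s, male_s: pairs of S-haplotypes, e.g. ("S1", "S4").
    male_modifier: the two modifier alleles of the pollen parent.
    """
    counts = Counter()
    for s_pollen, modifier in product(male_s, male_modifier):
        compatible = s_pollen not in female_s or modifier == "m'"
        if not compatible:
            continue  # self S-haplotype with functional modifier: rejected
        for s_egg in female_s:
            counts[tuple(sorted((s_egg, s_pollen)))] += 1
    return counts


katy = ("S1", "S2")
print(progeny_classes(("S1", "S4"), katy))             # 'Harcot' x 'Katy'
print(progeny_classes(("S2", "SC"), katy))             # 'Canino' x 'Katy'
print(progeny_classes(katy, katy))                     # 'Katy' selfing
print(progeny_classes(katy, katy, ("M'", "M'")))       # 'Katy' x 'Goldrich'
```

Under these assumptions the two outcrosses give four classes in a 2:2:1:1 proportion, selfing gives 1:2:1, and pollen from a parent without the m' allele yields no progeny on 'Katy' styles.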
Interestingly, in the 'K×K' and 'G×K' populations the number of seedlings homozygous for the S1-haplotype (20) is significantly lower than that for the S2-haplotype (43) (see Table 1). Similar deviations were observed by Wünsch and Hormaza [26] when the sweet cherry cv. 'Cristobalina' was self-pollinated. Following their reasoning, several causes might explain these deviations, such as postzygotic selection against homozygous embryos, linkage in coupling between the mutated allele of the modifier factor (m) and the S2-allele, or differences in the competitive capacity of pollen to grow through the style (depending on the S-haplotype). In this particular case, a hypothetical effect of postzygotic selection would explain the reduced number of S1S1 but not the high number of S2S2 genotypes. Regarding the second cause, neither the segregation ratios observed in different populations nor the SDL analysis support a linkage between the M'- and the S-locus. Therefore, a lower growth capacity for pollen gametes bearing the S1-haplotype is regarded as the most acceptable hypothesis to explain this discrepancy.
SC caused by loss of pollen-S function has usually been found to be associated with mutations (mainly indels) of the SFB genes in different Prunus species such as sweet cherry [15][16][17], apricot [18], Japanese apricot [15], peach [14] and sour cherry [19]. However, sequence analysis revealed no mutations or indels affecting either of the two 'Katy' SFB alleles, ruling this out as the cause of SI breakdown. In Solanaceae, self-compatible PPMs may arise from S-allele duplications located in a centric fragment, in a non-S chromosome or linked to the S-locus, leading to the formation of S-heteroallelic pollen [45]. According to the segregations obtained in the crosses performed, S-allele duplications did not seem probable in 'Katy' (all descendants should have had the S1S2 genotype); even so, we ruled out that possibility by showing that SFB gene dosage is equivalent between 'Katy' and the self-incompatible cv. 'Goldrich'. S-allele duplications may also result from polyploidy, but 'Katy' was confirmed as diploid by flow cytometry analysis and by marker segregation and mapping in all crosses. These results rule out competitive interaction resulting from S-heteroallelic pollen as the cause of SC in 'Katy'.
Altogether, it can be hypothesized that the loss of function of an S-locus-unlinked factor gametophytically expressed in pollen causes breakdown of SI in 'Katy'. Moreover, according to the relative abundance of SFB1 and SFB2 transcripts in 'Katy', when compared with the reference cv. 'Goldrich', the hypothetical defective factor in 'Katy' does not seem to affect their expression. These characteristics of the self-compatible mutant 'Katy' resemble those of other self-compatible pollen-part mutants defective for non-S-locus factors already found in Prunus. For instance, gene duplications and modified transcription levels of the S-locus genes were also ruled out as the cause of SC in the Prunus avium cv. 'Cristobalina' [53] and the Prunus armeniaca cv. 'Canino' [18]. According to the classification established by McClure et al. [30], the modifier factor in 'Katy' would belong to the group of modifier genes required for pollen rejection but with no wider role in pollination. Although no direct evidence is available about its possible function, recent findings in Prunus may provide some clues in this respect. For instance, the PavSSK1 and PavCul1 proteins recently identified by Matsumoto et al. [39] in Prunus avium are proposed to form the SCF^SFB E3 ubiquitin ligase complex involved in S-RNase degradation. Therefore, the loss of function of either of them would predictably lead to SC. However, neither of these two genes is located in LG3, where the M'-locus region is found, and so they can be discarded as a possible cause of SC in 'Katy'. On the other hand, Tao and Iezzoni [24] proposed an alternative model for GSI in Prunus where an S-RNase inhibitor would be the target of the SCF^SFB ubiquitination complex instead of the S-RNases. If the modifier factor found in 'Katy' were this hypothetical inhibitor, its loss of function would lead to SI and not to SC, which also rules out this possibility. Further research will therefore be necessary to reveal the SI-related function affected by the PPM in 'Katy'.

PPMs Conferring SC in 'Katy' and 'Canino' Apricots are both Located at the chr. 3 Distal End
To facilitate future identification and cloning, the 'Katy' GSI mutated modifier gene locus (M'-locus) was mapped following a two-steps strategy. First, we hypothesized that those markers linked with the M'-locus should be highly distorted in the populations obtained from crosses where 'Katy' was the pollen parent, since only 'Katy' pollen tubes carrying the m'-allele would be able to grow. In other words, the M'-locus genomic region should correspond to a segregation distortion locus (SDL), a chromosomal region that causes distorted segregation ratios [54] To identify this kind of regions, 'K6K 05 ' and 'K6K 06 ' populations, which all trees carry the PPM, were tested for genome-wide distributed SSRs to detect SDL by examining changes in genotypic frequencies. Attending to segregation of pollen alleles, two SDL were found in LG3 and LG6 but a deeper analysis showed that LG6 markers were partially linked to the Slocus and only moderately distorted. Consequently, LG3 was predicted as the most likely location for the M'-locus. Distortion in LG6 seems more plausibly related to the different capacity of S 1 and S 2 -pollen gametes for growing through the style. Further analyses are in progress to confirm this point.
In a second step, to refine M'-locus mapping, chr.3-specific SSRs were analyzed to estimate their segregation distortion ratios in selfing (F 2) and outcrossing populations obtained using 'Katy' as pollen parent. Additionally, indirect M'-locus genotyping was performed by analyzing linked SSRs in the F 3 offspring of six selected 'K×K' F 2 trees. Recombination breakpoints in five of these trees defined a 9.4 cM interval for the 'Katy' M'-locus that corresponds to ∼1.29 Mb in the peach genome (18.49-19.78 Mb) and overlaps by ∼273 Kb with that established for the M-locus in 'Canino' [40]. A non-S-locus PPM conferring SC to the P. avium cv. 'Cristobalina' was also mapped on LG3 by Cachi and Wünsch [55]. However, it was tentatively predicted to be downstream of the EMPaS02 marker (∼20.0 Mb) and therefore, if confirmed, the position of this locus does not coincide with those of the M- and M'-loci in apricot. Different map locations for the PPMs would support different defective genes being responsible for SC in sweet cherry and apricot, but this point still requires confirmation. In apricot in particular, the SSR markers showing the highest distortion values associated with the PPMs in 'Canino' (PGS3_62) and 'Katy' (PGS3_23) are located very close to each other (18.612 and 18.608 Mb, respectively). Thus, in the light of the similarities found between the apricot cvs. 'Katy' and 'Canino' (i.e. genetics of SC, M- and M'-locus mapping positions, etc.), it is tempting to speculate that both PPMs causing SC might affect the same gene; however, no conclusive evidence is yet available on this point. Only 42 genes are shared between the M- and M'-loci [40] and, if this were the case, the availability of two different PPMs would be very helpful to identify the modifier gene. Interestingly, the two cultivars have different geographic origins (i.e. 'Katy' is a North-American apricot selection [41] and 'Canino' is a local Spanish apricot [18]) and, according to the analysis of genome-wide distributed SSRs, they seem to be genetically unrelated. This prompts us to speculate that the two PPMs (whether or not they are the same) may have arisen independently.
According to the peach syntenic genome region annotated by IPGI, the apricot M'-locus is predicted to contain about 223 gene transcripts. Based on sequence similarity, putative Arabidopsis orthologues were suggested for many of these Prunus genes [56] and, according to Movahedi et al. [57], a consistent tissue-specific expression might be expected for the reported gene pairs. Under this general rule, a high number of genes scattered throughout the M'-locus (up to 53) might be pollen-expressed, fulfilling one of the main requirements for the 'Katy' SI modifier gene. Nevertheless, genes whose orthologues are not pollen-expressed should not be discarded, because inferred orthologues do not always have the same biological function [57]. Gene function annotation might also help to select candidates for the 'Katy' SI modifier gene. Unfortunately, the hypothetical roles suggested for this factor are still merely speculative, which hinders this approach. In view of the limitations of these strategies, and considering the high number of ORFs within the M'-locus, narrowing down the mapping region will be an essential step to identify the SI modifier gene in 'Katy'. In summary, 'Katy' not only provides an additional S-locus unlinked source of SC, a desired trait for apricot breeding programs, but also becomes a very useful tool to dissect the molecular genetics behind pollen-pistil interactions in Prunus.

Plant Material
(Table 1). The 'K×K' population was formed by pooling all the individuals from these three latter F 2 populations. All these trees are maintained at the collection of the Instituto Valenciano de Investigaciones Agrarias (IVIA) in Valencia (Spain). Additionally, 12 independent F 3 seed populations (ranging from N = 2 to N = 77) were obtained after self-pollination of 'K×K 05' and 'K×K 06' trees.
Selfing populations from 'Katy' (F 2 and F 3) were obtained by placing insect-proof bags over several branches (containing 200-250 flower buds) before anthesis to prevent cross-pollination. Outcrossing populations were obtained by pollinating balloon-stage flowers. Fruits were collected about three months later. F 3 seed-derived embryos were dissected from the rest of the seed tissue and stored at −20 °C.

Nucleic Acids Extraction
Two leaf discs from each selection were collected and stored at −80 °C before DNA isolation. Genomic DNA was extracted following the method of Doyle and Doyle [58]. DNA was quantified with a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE) and its integrity was checked by comparison with lambda DNA (Promega, Madison, WI, USA). Embryo DNA was extracted by incubating for 10 min at 95 °C with 20 µl of TPS (100 mM Tris-HCl, pH 9.5; 1 M KCl; 10 mM EDTA) isolation buffer [59]. Total RNA was extracted from mature anthers (containing mature pollen grains) of balloon-stage flowers using the UltraClean Plant RNA Isolation Kit (MoBio, Carlsbad, CA, USA).

PCR-amplification, Cloning and Sequencing of S-RNase Gene Fragments and the Complete S-locus F-box Alleles from 'Katy'
Fragments comprising the S-RNase first intron were PCR-amplified with primers SRc-F [44] and Pru-C2R [43] (Table S6) using 'Katy' genomic DNA as template. Cycling conditions were as follows: an initial denaturing step of 94 °C for 2 min; 30 cycles of 94 °C for 30 s, 55 °C for 60 s and 72 °C for 1 min 30 s; and a final extension of 72 °C for 10 min (GeneAmp® PCR System 9700, Perkin-Elmer, Fremont, CA). PCR products were electrophoresed in 1% (w/v) agarose gel, purified using the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany) and cloned into the pGEM T-Easy vector (Promega, Madison, WI). DNA sequences from four independent clones were determined on an ABI3730 instrument using the Big Dye Terminator v.3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA). Sequences were assembled and edited with the Staden package v1.4 [60] and homology searches were performed with BLASTX [61]. S-RNase fragments comprising the second intron were amplified with primers Pru-C2/Pru-C4R [43] (Table S6) using the PCR conditions described by Sonneveld et al. [62].
Genomic fragments containing the complete coding sequence of SFB 1 and SFB 2 (as well as their 3′/5′ flanking regions) were PCR-amplified with the haplotype-specific primer pairs FBf-Hap1/FBr-Hap1 (this work) and FBf-Hap2/FBr-Hap2 [18], respectively (Table S6), using 'Katy' genomic DNA as template. PCR conditions and methods for isolating, cloning, and sequencing these fragments were the same as those used for the S-RNase fragments.

Genomic PCRs for S-genotyping
S-genotyping of populations and cultivars was performed by PCR-amplification of the S-RNase first intron with the primer pair SRc-F/SRc-R [44] (Table S6)

Ploidy Level Determination
Ploidy level was determined using the Partec CyStain UV precise P reagent kit (Partec PAS, Münster, Germany) for nuclei extraction and staining of nuclear DNA from plant tissues. Approximately 0.5 cm² of leaf tissue was chopped with a sharp razor blade in 400 µl of extraction buffer and filtered through a Partec 50 µm CellTrics disposable filter. Samples were then incubated for 60 seconds in the staining solution and analyzed in the Partec flow cytometer Ploidy Analyzer PA (Partec, Münster, Germany) in the blue fluorescence channel.

Real Time RT-PCR for SFB 1 and SFB 2
cDNA was obtained from total RNA isolated from mature anthers of the cvs. 'Goldrich' and 'Katy' using the SuperScript III First-Strand Synthesis System for RT-PCR (Invitrogen, Carlsbad, CA, USA). Traces of genomic DNA were previously removed from the RNA samples by treatment with DNAse I (Invitrogen, Carlsbad, CA, USA). SFB allele-specific PCR primer pairs were designed in this work to amplify SFB 1 and SFB 2 (RT-SFB1-for/RT-SFB1-rev1 and RT-SFB2-for/RT-SFB2-rev2, respectively) (Table S6). Primer allele-specificity was tested by PCR-amplifying both alleles from genomic DNA and comparing fragment sizes with known S-genotypes in agarose gels after electrophoresis. The actin gene was used as endogenous control, and the specific PCR primers Act3 and Act4, designed from the peach genome sequence (Gabino Ríos, personal comm.), were used for its amplification (Table S6). Specificity of the actin PCR reaction was tested through size estimation of the amplified product by gel electrophoresis. Real-time PCR reactions were performed using an Applied Biosystems StepOnePlus Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) in a final volume of 20 µl, containing 10 µl of SYBR Premix Ex Taq (Takara, Foster City, CA, USA), 0.4 µl of ROX reference dye, 0.375 µM of each primer and 2 µl of cDNA template diluted 1:15 from a total of 20 µl synthesized from 2 µg of total RNA. Cycling conditions were as follows: an initial denaturing step of 95 °C for 30 s; 40 cycles of 95 °C for 5 s, 60 °C for 30 s and 72 °C for 1 min. Relative expression of SFB 1 and SFB 2 in 'Katy' and 'Goldrich' RNA from mature anthers was measured by the standard curve method. Threshold cycle (C T) values were automatically determined by StepOne v. 2.0 software (Applied Biosystems, Foster City, CA, USA). PCR reaction specificity was assessed after amplification by confirming the presence of a single peak in the dissociation curve analysis. Results were the average of three independent biological replicates, each repeated three times.
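As a rough illustration of the standard curve method mentioned above, the following sketch fits a line of C T against log10 of the template quantity for a dilution series, interpolates unknown samples on that line, and normalizes the target to the endogenous control. It is a generic outline rather than the StepOne software workflow; all function names and the dilution-series values are hypothetical.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a Ct value on the fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def relative_expression(target_ct, target_curve, control_ct, control_curve):
    """Target quantity normalized to the endogenous control (e.g. actin)."""
    return (quantity_from_ct(target_ct, *target_curve)
            / quantity_from_ct(control_ct, *control_curve))


# Hypothetical ten-fold dilution series (relative quantities 1, 0.1, 0.01, 0.001).
dilutions = [0.0, -1.0, -2.0, -3.0]
curve = fit_standard_curve(dilutions, [20.0, 23.3, 26.6, 29.9])
print(curve)
print(relative_expression(23.3, curve, 20.0, curve))
```

The same calculation applies whether the template is cDNA (expression) or genomic DNA (gene dosage); only the interpretation of the normalized ratio changes.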

Real-time PCR-based Gene Dosage Assay for SFB 1 and SFB 2
The SFB allele-specific PCR primers used to determine gene dosage of SFB 1 and SFB 2 from genomic DNA of cvs. 'Goldrich' and 'Katy' were also RT-SFB1-for/RT-SFB1-rev1 and RT-SFB2-for/RT-SFB2-rev2. Actin was used as endogenous control and the specific primers used to amplify this gene were Act3/Act4 (see previous sections). Real-time PCR reactions were performed using the same PCR mixtures (except for 2 µl of gDNA as template), cycling conditions and thermocycler previously reported for real-time RT-PCR. Relative DNA quantity corresponding to the SFB 1 and SFB 2 alleles from 'Katy' and 'Goldrich' was measured by the standard curve method. C T values and PCR reaction specificity were also determined as for the real-time RT-PCR. Results were the average of two independent biological replicates, each repeated three times.

SSR Marker Analysis
A total of 118 SSR markers, spread over the 8 Prunus chromosomes, were tested to perform a genome-wide screen for the PPM (Table S7). Those SSRs amplifying in 'Katy', 'Goldrich' and 'Canino' (85) (Table S3) were used to estimate Nei's genetic distance among the three cultivars [64] using GENETIX v.4.05 software [65]. One hundred and two additional SSRs developed by Zuriaga et al. [40] were tested to construct the 'Katy' LG3 map (Table S2). SSR amplifications were performed in a GeneAmp® PCR System 9700 thermal cycler (Perkin-Elmer, Fremont, CA, USA) in a final volume of 20 µl, containing 75 mM Tris-HCl, pH 8.8; 20 mM (NH4)2SO4; 1.5 mM MgCl2; 0.1 mM of each dNTP; 20 ng of genomic DNA and 1 U of Taq polymerase (Invitrogen, Carlsbad, CA). Each polymerase chain reaction was performed by the procedure of Schuelke [66] using three primers: the specific forward primer of each microsatellite with an M13(-21) tail at its 5′ end at 0.4 µM, the sequence-specific reverse primer at 0.8 µM, and the universal fluorescent-labeled M13(-21) primer at 0.4 µM. The following temperature profile was used: 94 °C for 2 min, then 35 cycles of 94 °C for 45 s, 50-60 °C for 1 min, and 72 °C for 1 min and 15 s, finishing with 72 °C for 5 min. Allele lengths were determined using an ABI Prism 3130 Genetic Analyzer with the aid of GeneMapper software, version 4.0 (Applied Biosystems).
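For readers unfamiliar with the distance estimate, the sketch below computes Nei's standard genetic distance, D = −ln(Jxy / √(Jx·Jy)), from SSR genotypes. It assumes the 1972 standard distance; the text does not state which of Nei's estimators was selected in GENETIX, so this choice, the function names, the locus labels and the genotypes are all illustrative assumptions.

```python
import math
from collections import Counter


def allele_frequencies(genotype):
    """Allele frequencies at one locus from a tuple of allele sizes."""
    counts = Counter(genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def nei_standard_distance(sample_x, sample_y):
    """Nei's (1972) standard distance between two samples typed at shared loci.

    Each sample maps a locus name to a tuple of alleles (hypothetical layout).
    """
    jx = jy = jxy = 0.0
    for locus in sample_x.keys() & sample_y.keys():
        px = allele_frequencies(sample_x[locus])
        py = allele_frequencies(sample_y[locus])
        jx += sum(p * p for p in px.values())
        jy += sum(p * p for p in py.values())
        jxy += sum(p * py.get(allele, 0.0) for allele, p in px.items())
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical diploid genotypes (allele sizes in bp) at two made-up SSR loci.
cultivar_a = {"ssr1": (150, 154), "ssr2": (200, 200)}
cultivar_b = {"ssr1": (150, 150), "ssr2": (200, 204)}
print(nei_standard_distance(cultivar_a, cultivar_b))
```

Because both sums run over the same loci, using totals instead of per-locus averages leaves the ratio, and hence D, unchanged.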

M'-locus Fine Mapping
The segregation distortion locus (SDL) associated with the PPM was detected using JoinMap 3.0 software [67] by analyzing χ² values of selected SSRs spread over the Prunus genome in the 'K×K 05' and 'K×K 06' F 2 populations. Genetic maps for each linkage group were roughly estimated using these two populations. The logarithm of odds (LOD) grouping threshold was established at ≥3.0 for LG2, LG4, LG7 and LG8 but <3.0 for the rest. Comparative mapping with other apricot cvs. was used to support grouping of markers in these latter cases.
Linkage maps of 'Katy' chr.3 were constructed using SSR markers segregating in the 'K×K' and 'C×K' populations. Calculations were performed with JoinMap 3.0 software [67], using the Kosambi mapping function [68] to convert recombination units into genetic distances. In the 'C×K' population, LG3 was established following the "two-way pseudo-testcross" model of Grattapaglia and Sederoff [69] under a LOD grouping threshold of 5.0 and a recombination frequency below 0.4. Guided by the single LG3 map obtained for 'Katy' from 'C×K', the LOD score was relaxed to 2.0 in the 'K×K' population so that two groups that remained separate at LOD >5.0 could be merged to construct LG3.
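The Kosambi mapping function referred to above converts a recombination fraction r into a map distance d = ¼ ln[(1 + 2r)/(1 − 2r)] Morgans. A minimal sketch of the conversion and its inverse follows; the function names are illustrative and are not part of JoinMap.

```python
import math


def kosambi_cm(r):
    """Map distance in centimorgans from a recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse conversion: recombination fraction from a distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


for r in (0.01, 0.10, 0.25):
    print(r, kosambi_cm(r))
```

For small r the Kosambi distance is close to 100·r cM; converting the 9.4 cM interval back with `kosambi_r` gives a recombination fraction slightly below 0.094.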
M'-locus genotyping of 'K×K' F 2 individuals was performed indirectly by analyzing, in their F 3 progenies, the segregation ratios of heterozygous SSR markers linked to the PPM (according to the SDL analysis). A χ² test was performed to check whether the observed ratios fit a 1:2:1 ratio, corresponding to the m'm' genotype, or a 1:1 ratio, corresponding to the M'm' genotype.
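The goodness-of-fit test described above can be sketched as follows. The code computes Pearson's χ² against an expected ratio and the corresponding P value using the closed forms available for 1 and 2 degrees of freedom. The example counts are hypothetical, and treating the 1:1 hypothesis as a comparison of the two classes that remain when one homozygous class is missing is an assumption made here for illustration.

```python
import math


def chi_square(observed, ratio):
    """Pearson chi-square of observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * weight / sum(ratio) for weight in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value(chi2, df):
    """Upper-tail chi-square probability (closed forms for df = 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# Hypothetical F3 counts (aa, ab, bb) at a marker heterozygous in the F2 parent.
aa, ab, bb = 20, 22, 0
p_121 = p_value(chi_square([aa, ab, bb], [1, 2, 1]), df=2)  # m'm' expectation
p_11 = p_value(chi_square([aa, ab], [1, 1]), df=1)          # M'm' expectation
print(p_121, p_11)
```

A progeny whose counts are rejected under 1:2:1 but not under 1:1 would point to an M'm' parent, and the reverse pattern to an m'm' parent.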

Supporting Information
Table S1 Identification of segregation distortion SSR loci distributed throughout the 'Katy' LG6 using the F 2 population 'K×K'. χ² and P values estimated for each SSR, considering the expected segregation ratio 1:2:1, are indicated. (DOC)