Evaluation of efficiency of controlled pollination based parentage analysis in a Larix gmelinii var. principis-rupprechtii Mayr. seed orchard

Controlled pollination (CP) is an important tool for breeding programs to improve seed quality, as it rapidly generates desirable genotypes and maximizes genetic gains. However, few studies have evaluated the success rate of CP, especially in Larix gmelinii var. principis-rupprechtii Mayr. seed orchards. In this study, we estimated the rate of correct parentage in 257 CP progeny in an L. gmelinii var. principis-rupprechtii seed orchard from ten candidate parents using 13 microsatellites. The parentage exclusion probabilities of all combined loci in the single parent and parent pair tests were > 0.99, which was sufficient to distinguish the relatedness of the sampled individuals. Comparing the maximum likelihood-based parentage analysis results with breeding records revealed that the percentages of correctly identified maternal and paternal parents were 22.6% and 35.0% at 95% CL, respectively, suggestive of parent mislabeling and pollen contamination in the CP population. We conducted a pedigree reconstruction by identifying the expected parents and assigned maternity, paternity, and parent pairs to 176 (68.5%), 199 (77.4%), and 132 (51.4%) progeny, respectively. This study provides a reference for future selection of elite genotypes for commercial production. To increase the efficiency of CP, molecular markers should be used to correctly identify individuals in seed orchards before conducting CP.


Introduction
Prince Rupprecht's larch (Larix gmelinii var. principis-rupprechtii Mayr.) is one of the most important native conifer species in northern China used for reforestation and to build nonfood forests due to its growth speed, high timber quality, and numerous applications [1]. Natural populations are mainly distributed in Hebei and Shanxi provinces, China. Pinaceae species were first targeted for scientific breeding programs in China in the 1960s [2]. Currently, Prince
In this study, the parentage of 267 individuals, including 257 CP progeny and ten candidate parents, sampled from a first-generation seed orchard was estimated using a set of SSR markers. The aims of this study were: (1) to estimate the rate of correct CP by comparing the parentage assignments with breeding records of the expected parents, and to identify the underlying sources of error in the breeding program in order to propose methods to reduce such errors during the production of genetically improved seeds; and (2) to use the microsatellite data to estimate the genetic parameters for parentage determination and individual identification and conduct a pedigree reconstruction of the CP L. gmelinii var. principis-rupprechtii seed orchard. Pedigree reconstructions can provide a practical tool for guiding the future selection of superior individuals and parents to produce improved progeny.

Plant materials and DNA extraction
All plant material was collected in May 2012 from the national key seed base of L. gmelinii var. principis-rupprechtii in Longtoushan, Hebei Province, China. In total, 267 individuals were selected for this study, including ten parents sampled from the first-generation seed orchard and 257 progeny generated via controlled crossing from seven elite parents used as mothers and five used as fathers (Table 1). Healthy young leaves were collected from each tree, frozen in liquid nitrogen, and stored at -80˚C in an ultra-low-temperature freezer until genomic DNA extraction. Genomic DNA was extracted from leaf tissue using a modified cetyltrimethylammonium bromide method [25]. DNA quality was evaluated using 0.8% agarose gel electrophoresis. DNA concentration was determined using a NanoDrop 2000 spectrophotometer, and samples were diluted to 50 ng/μL for the subsequent microsatellite amplification.

Microsatellite amplification
DNA was genotyped using 13 SSR markers, which included seven SSR markers obtained from the transcriptome sequencing data of Larix gmelinii var. principis-rupprechtii and six expressed sequence tag (EST)-SSR markers developed from ESTs of Pinaceae from the National Center for Biotechnology Information (NCBI) database (Table 2) [24,26]. The microsatellite simplex PCR amplification was conducted in a reaction volume of 25 μL containing 100 ng of genomic DNA, 200 μmol dNTP, 0.25 μmol of each primer pair, and 1 U of Taq DNA polymerase with 1× buffer containing Mg2+. The amplification was performed under the following conditions: initial denaturation for 5 min at 94˚C; 29 cycles of 30 s at 94˚C, 30 s at the optimized annealing temperature, and elongation for 45 s at 72˚C; and a final extension for 5 min at 72˚C. The products were separated on a non-denaturing 8% polyacrylamide gel using the HT-SCZ04 high flux vertical electrophoresis tank and electrophoresed at 130 V for 1.5 h. The fragments were visualized with 0.1% silver nitrate. Allele sizes were determined by visual comparison with a 100-bp DNA ladder.
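As a quick check of the reaction setup, the following minimal sketch works through two figures implied by the protocol: the template volume needed to deliver 100 ng from the 50 ng/μL working stock, and the programmed hold time of the thermal profile. The variable names are illustrative, and instrument ramping time is ignored, so the real run would take longer.

```python
# Worked arithmetic for the PCR setup described in the text.
# Variable names are illustrative; ramping time between steps is ignored.

STOCK_NG_PER_UL = 50.0   # working DNA concentration after dilution
TEMPLATE_NG = 100.0      # genomic DNA per 25 uL reaction

# Volume of diluted DNA needed per reaction
template_ul = TEMPLATE_NG / STOCK_NG_PER_UL

CYCLES = 29
INITIAL_DENATURATION_S = 5 * 60
DENATURE_S, ANNEAL_S, EXTEND_S = 30, 30, 45
FINAL_EXTENSION_S = 5 * 60

# Total programmed hold time across all steps
program_s = (
    INITIAL_DENATURATION_S
    + CYCLES * (DENATURE_S + ANNEAL_S + EXTEND_S)
    + FINAL_EXTENSION_S
)

print(f"template volume: {template_ul} uL")
print(f"programmed hold time: {program_s / 60:.2f} min")
```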

Data analysis
The genetic polymorphism parameters of each SSR locus were calculated with POPGENE ver. 1.32 software [27]. The parameters included the number of alleles (Na) and the observed and expected heterozygosity (Ho, He). The polymorphism information content (PIC) of each primer pair was calculated with PIC_CAL ver. 0.6 software [26]. The parentage analysis, individual identification of all samples, and parentage exclusion probabilities were estimated with Cervus ver. 3.0.7 software [28]. The null allele frequency was estimated with an individual inbreeding model-based estimator [29]. The Hardy-Weinberg equilibrium of each locus was tested to characterize the microsatellite markers. The simulation parameters were: 300,000, 500,000, and 100,000 parent-offspring pairs in the analyses of maternities, paternities, and parent pairs, respectively; 7 candidate maternal parents and 5 candidate paternal parents (i.e., the number of parents according to the breeding records); 99% of candidate parents selected; and 98% of the loci typed. The proportion of loci mistyped was calculated from qi, the null allele frequency at the ith locus among the 267 trees, and L, the number of loci [30]; the error rate over the 13 loci was estimated to be 2.8%. For each parent-offspring pair, the critical likelihood value (LOD score) was obtained from simulations at either an 80% (relaxed) or 95% (strict) confidence level (CL).
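The per-locus diversity statistics named above follow standard definitions, and a minimal sketch can make them concrete. The function below computes Na, Ho, He (as 1 − Σp²) and PIC (the Botstein et al. formula) from diploid genotypes. It is an illustration, not the POPGENE or PIC_CAL implementation; those programs may apply sample-size corrections, so their output can differ slightly. The function name and the example genotypes are hypothetical.

```python
from collections import Counter

def locus_stats(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples; None marks missing data.
    Uses the uncorrected estimator He = 1 - sum(p^2); software packages may
    apply a sample-size correction (assumption: none applied here).
    """
    typed = [g for g in genotypes if g is not None]
    n = len(typed)
    counts = Counter(allele for g in typed for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]

    h_obs = sum(1 for a, b in typed if a != b) / n
    sum_sq = sum(p * p for p in freqs)
    h_exp = 1.0 - sum_sq

    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = 1.0 - sum_sq - cross

    return {"Na": len(counts), "Ho": h_obs, "He": h_exp, "PIC": pic}

# Hypothetical genotypes (allele sizes in bp) for a two-allele locus
example = [(100, 100), (100, 102), (102, 102), (100, 102), None]
stats = locus_stats(example)
print(stats)
```

With two equally frequent alleles, He is 0.5 and PIC is 0.375, which places such a locus in the intermediate polymorphism class (0.25 < PIC < 0.5).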

Microsatellite marker polymorphisms
In total, we genotyped 267 individuals at 13 SSR markers, with data missing for 37 of 3,738 data points (0.8%). The Na of all samples ranged from 2 to 6, with a mean Na of 3.9. The mean Ho and He values were 0.503 and 0.492, respectively. He was higher than Ho for five primer pairs (D15, D63, D77, F17, and DN4), indicating a deficit of heterozygotes at these loci. D42 and D63 had the highest (0.612) and lowest (0.189) PICs, respectively (Table 3). The alleles of two loci (D63 and F97) were over-concentrated (PIC < 0.25), while four primer pairs had high degrees of polymorphism (PIC > 0.5) and the others had intermediate levels of polymorphism (0.25 < PIC < 0.5). High genetic diversity resulted in a high statistical power for the parentage analysis when combining all markers. In addition, the total parentage exclusion probability for the single parent and parent pair tests was > 0.99, indicating that the set of markers used for this analysis had good identification ability and was adequate for determining parentage and recognizing individuals. For all 13 primer pairs, the null allele frequency was < 0.05, which is considered to be a meaningful threshold [31]. One marker (D42) deviated from Hardy-Weinberg equilibrium, which may have resulted from the presence of null alleles. Overall, the genotyped population was consistent with the genotypic proportions expected under Hardy-Weinberg equilibrium.
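To illustrate how moderately informative loci can jointly reach a total exclusion probability above 0.99, the sketch below combines per-locus exclusion probabilities under the usual assumption that loci are independent: the combined value is one minus the product of the per-locus non-exclusion probabilities. The per-locus values are hypothetical placeholders, not the values from Table 3.

```python
from math import prod

def combined_exclusion(per_locus):
    """Combined exclusion probability across independent loci.

    per_locus: iterable of per-locus exclusion probabilities in [0, 1].
    Assumes loci are unlinked and independent.
    """
    return 1.0 - prod(1.0 - p for p in per_locus)

# Hypothetical example: 13 loci, each excluding a random
# non-parent with probability 0.30
hypothetical_loci = [0.30] * 13
total = combined_exclusion(hypothetical_loci)
print(f"combined exclusion probability: {total:.4f}")
```

Even though each hypothetical locus alone excludes fewer than a third of unrelated candidates, thirteen of them together exceed 0.99, which is why the combined value rather than any single-locus PIC determines whether a marker panel is adequate.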

Pedigree reconstruction
To reconfirm the parentage of the 257 individuals with respect to the 10 candidate parents, we used SSR markers to assign maternity, paternity, and the most likely parent pair. Overall, at the 80% CL, maternity was assigned to one of the 10 candidate parents for 176 of the 257 progeny (68.5%), and paternity was assigned to one of them for 199 progeny (77.4%) (Table 5). The LOD scores were slightly lower for the paternity assignment, likely explaining the large proportion of unassigned individuals in this data array. At the 95% CL, paternity (54.5%) was assigned to a greater percentage of progeny than maternity (23.3%) (Table 5).
Higher LOD scores were obtained in the assignment of the most likely parent pairs of the 257 progeny. In total, 51.4% of all progeny (132 of 257) were assigned a parent pair, including 58 (22.6%) at the 95% CL (Table 5). However, 125 of the 257 progeny were unassigned, suggesting clonal ramet mislabeling or pollen contamination. Based on the maternity, paternity, and full parentage assignments, we conducted a pedigree reconstruction of the CP progeny in the seed orchard (S1 Table). When the most likely mother (or father) assigned in the tests was the father (or mother) of the cross combination, we selected the second most likely candidate as the mother or father; if the LOD score of the second most likely parent was below the standard (i.e., the LOD score of the maternity or paternity assignment was < 0.21 or < 0.04, respectively), we determined that no parent from the seed orchard could be identified, possibly due to the mislabeling of parents or pollen contamination from parents other than the ten candidates.
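The fallback rule used in the pedigree reconstruction can be sketched as a small decision function. The LOD thresholds (0.21 for maternity, 0.04 for paternity) come from the text; the function name, the input layout, and the parent labels are hypothetical. The sketch also assumes that a top-ranked candidate that does not conflict with the recorded cross has already passed the Cervus confidence test.

```python
MATERNITY_LOD_MIN = 0.21  # threshold stated in the text
PATERNITY_LOD_MIN = 0.04  # threshold stated in the text

def pick_parent(ranked, other_parent_of_cross, lod_min):
    """Choose a parent from ranked candidates for one progeny.

    ranked: list of (candidate_id, lod_score), most likely first.
    other_parent_of_cross: recorded parent of the opposite sex for this cross.
    Returns a candidate id, or None when no orchard parent can be identified.
    Assumption: a non-conflicting top candidate was already accepted upstream.
    """
    if not ranked:
        return None
    best_id, _ = ranked[0]
    if best_id != other_parent_of_cross:
        return best_id
    # Top candidate is the opposite-sex parent of the cross:
    # fall back to the second most likely candidate.
    if len(ranked) < 2:
        return None
    second_id, second_lod = ranked[1]
    return second_id if second_lod >= lod_min else None

# Hypothetical maternity test for a cross whose recorded father is "P3"
print(pick_parent([("P3", 2.10), ("P5", 0.50)], "P3", MATERNITY_LOD_MIN))
print(pick_parent([("P3", 2.10), ("P5", 0.10)], "P3", MATERNITY_LOD_MIN))
```

In the first call the second-ranked candidate clears the maternity threshold and is accepted; in the second it does not, so the progeny is left without an identified mother from the orchard.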

Discussion
The estimates of probabilities for parentage assignments
Polymorphic microsatellites were a useful and inexpensive tool for assigning parentage in Prince Rupprecht's larch. The Ho and He of the 13 SSR markers were within the same ranges as those of natural populations [24], although the mean He (0.492) was higher than that of other Prince Rupprecht's larch populations [26,2,24,32] (Table 3). The results revealed that the polymorphism of these microsatellites had sufficient statistical power to identify individuals and assign parentage. The exclusion power of genetic markers is one of the most important elements of parentage assignments. The number of markers must change with the number of sampled individuals to achieve a high exclusion probability of the correct parents [33]. The multi-locus combined parentage exclusion probabilities obtained in this study were > 0.99 for the single parent and parent pair tests (Table 3). Similar estimates were reported in a parentage assignment study of a loblolly pine seed orchard; the estimated probabilities were 0.99 for the single parent test and > 0.9999 for the parent pair test [21]. To our knowledge, this is the first report to estimate the parentage exclusion probability using SSR markers in Prince Rupprecht's larch. The high exclusion probabilities obtained in this study suggest that SSR markers are sufficiently powerful to precisely identify the parents of CP progeny in Prince Rupprecht's larch seed orchards. Although SNPs are increasingly popular in genetic studies due to their higher rates of correct genotyping [34], SSR markers will remain an easily accessible, informative, and cheap option for molecular analyses of Prince Rupprecht's larch and similar breeding programs.

Causes of incorrect parentage
The maternity and paternity assignment analyses had error rates of 52.1% and 56.5% at 80% CL (Table 4), respectively, suggesting that CP in this seed orchard had been affected by various sources of error. Some progeny with a maternal parent in disagreement with the breeding records may have been generated from other individuals from the same orchard or other seed gardens. During the breeding process, careful collection of seeds from different clone ramets is important to guarantee the correct mother. When seeds are accidentally mixed with those from different parents and misidentified, maternity errors occur due to the mismatching of ramets [21]. This labeling error has been reported in other studies. For example, in a loblolly pine seed orchard, mislabeling occurred mostly among parents that had undergone mass CP, which influenced the accuracy of maternity in the breeding program [21]. In addition, two forms of mislabeling, homonymous and synonymous mislabeling, were identified in a cacao seed garden [18]. Paternity errors generally occur during pollination due to the introduction of external pollen from neighboring gardens [35]. In this study, paternal contamination was the result of both pollen from other parents within the seed orchard and external sources of pollen (Table 4). Pollen contamination is a serious problem that can decrease the genetic quality of an orchard's crops [36]. In a maritime pine (Pinus pinaster Ait.) polycross seed orchard, the minimum pollen contamination rate was 36%, resulting in expected genetic gain losses of 18-50% [37]. Moreover, a maritime pine clonal seed orchard had an alien pollen contamination rate of 52.4%, determined via nuclear microsatellite analysis [36]. Similarly, a Scots pine (Pinus sylvestris L.) seed orchard had an estimated pollen contamination of 52%, which impacted the expected genetic gain to such a degree that no selection gain was observed [38].

Pedigree reconstruction
Pedigree reconstructions of seed orchards are an effective method of providing genetic information to assign parentage, estimate parental reproductive success, and evaluate the efficacy of management strategies [39,5]. Although we successfully identified paternity and maternity in 77.4% and 68.5% of progeny, respectively, using 13 SSR markers, some progeny could not be assigned to any of the ten parents from this seed orchard. Estimating the maternal or paternal parents of these unassigned progeny would be complex and time consuming, since the candidate parents could be any ramets of any clone from this or other seed orchards. To eliminate this error, breeding programs should verify the parentage of individuals using molecular markers before breeding. Nevertheless, the parentage assignments in this study can be used to select parents of elite individuals for further breeding and provide a method for ascertaining the accuracy of breeding materials to conduct successful genetic studies in the future. Another study conducted full and partial pedigree reconstructions of lodgepole pine (Pinus contorta) and found that the full pedigree reconstruction was superior to the partial pedigree reconstruction, enabling the estimation of both paternal- and maternal-related fertility parameters [39]. In Douglas fir (Pseudotsuga menziesii), pedigree reconstruction was used to identify the maternal and paternal parents of each seed, estimate the clonal reproductive success and selfing rate, and determine the proportion of seeds sired by outside pollen sources [40]. Similarly, Funda et al. compared the reproductive success of three seed orchards [lodgepole pine, Douglas fir, and western larch (Larix occidentalis)] based on pedigree reconstructions [41]. These studies show that pedigree reconstruction is an important component of genetic studies of seed orchards, particularly for parentage assignments and reproductive success estimations.

Conclusions
We used microsatellite markers to evaluate the success rate of controlled crossing and identify the parentage of 257 CP progeny in an L. gmelinii var. principis-rupprechtii seed orchard. The results revealed that maternal and paternal parents were correctly identified for only slightly more than half of all progeny, possibly due to ramet mislabeling and pollen contamination, including pollen from this seed orchard and alien pollen from other orchards. Therefore, we advocate using a set of microsatellite markers with sufficiently high statistical power to appraise and correctly identify all clonal ramets. Before conducting CP, the identity of each ramet used as a maternal or paternal parent should be confirmed to eliminate factors that could affect the success of controlled crossing. Moreover, the pedigree reconstruction of the 257 CP progeny provides a strong basis for recommending elite individuals for the breeding program and guiding parent selection to improve L. gmelinii var. principis-rupprechtii.
Supporting information S1