Single-step genomic prediction of fruit-quality traits using phenotypic records of non-genotyped relatives in citrus

  • Atsushi Imai,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Institute of Fruit Tree and Tea Science, National Agriculture and Food Research Organization, Fujimoto, Tsukuba, Ibaraki, Japan, Graduate School of Life and Environmental Science, University of Tsukuba, Tennodai, Tsukuba, Ibaraki, Japan

  • Takeshi Kuniga,

    Roles Data curation, Investigation, Resources

    Affiliation Western Region Agricultural Research Center, National Agriculture and Food Research Organization, Senyucho, Zentsuji, Kagawa, Japan

  • Terutaka Yoshioka,

    Roles Investigation, Resources

    Affiliation Western Region Agricultural Research Center, National Agriculture and Food Research Organization, Senyucho, Zentsuji, Kagawa, Japan

  • Keisuke Nonaka,

    Roles Investigation, Resources

    Affiliation Institute of Fruit Tree and Tea Science, National Agriculture and Food Research Organization, Okitsunakacho, Shimizu, Shizuoka, Japan

  • Nobuhito Mitani,

    Roles Investigation, Resources

    Affiliation Institute of Fruit Tree and Tea Science, National Agriculture and Food Research Organization, Fujimoto, Tsukuba, Ibaraki, Japan

  • Hiroshi Fukamachi,

    Roles Investigation, Resources

    Affiliation Institute of Fruit Tree and Tea Science, National Agriculture and Food Research Organization, Okitsunakacho, Shimizu, Shizuoka, Japan

  • Naofumi Hiehata,

    Roles Data curation, Investigation, Resources

    Affiliation Nagasaki Agricultural and Forestry Technical Development Center, Nagasaki Prefectural Government, Kaizumachi, Isahaya, Nagasaki, Japan

  • Masashi Yamamoto,

    Roles Investigation, Resources

    Affiliation Faculty of Agriculture, Kagoshima University, Korimoto, Kagoshima, Kagoshima, Japan

  • Takeshi Hayashi

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    hayatk@affrc.go.jp

    Affiliations Graduate School of Life and Environmental Science, University of Tsukuba, Tennodai, Tsukuba, Ibaraki, Japan, Institute of Crop Science, National Agriculture and Food Research Organization, Kannondai, Tsukuba, Ibaraki, Japan

Abstract

The potential of genomic selection (GS) is currently being evaluated for fruit breeding. GS models are usually constructed based on information from both the genotypes and phenotypes of a population. However, information from phenotyped but non-genotyped relatives can also be used to construct GS models, and this additional information can improve their accuracy. In the present study, we evaluated the utility of single-step genomic best linear unbiased prediction (ssGBLUP) in citrus breeding; ssGBLUP is a genomic prediction method that combines the kinship information from genotyped and non-genotyped relatives into a single relationship matrix for a mixed model to apply GS. Fruit weight, sugar content, and acid content of 1,935 citrus individuals, of which 483 had genotype data of 2,354 genome-wide single nucleotide polymorphisms, were evaluated from 2009–2012. The prediction accuracy of ssGBLUP for genotyped individuals was similar to or higher than that of the usual genomic best linear unbiased prediction method using only genotyped individuals, especially for sugar content. Therefore, ssGBLUP could yield higher accuracy in genotyped individuals by adding information from non-genotyped relatives. The prediction accuracy of ssGBLUP for non-genotyped individuals was also slightly higher than that of the conventional best linear unbiased prediction method using pedigree information. This indicates that ssGBLUP can enhance the prediction accuracy of breeding values for non-genotyped individuals using genomic information of genotyped relatives. These results demonstrate the potential of ssGBLUP for fruit breeding, including citrus.

Introduction

Genomic selection (GS) is considered to be a practical tool for accelerating genetic improvement in plant breeding [1,2], and the potential of GS is now being evaluated for use in fruit breeding [3]. Conventional phenotypic selection in fruit breeding has difficulties owing to long juvenile periods and complex inheritance of quantitative traits [4], and GS is expected to be an alternative method to phenotypic selection and work toward solving these problems.

In plant breeding, statistical GS models are generally constructed based on information from both the genotypes and phenotypes of a population [5]. However, phenotypic data from non-genotyped relatives can also be used to construct GS models when full pedigree records are available [6]. This situation is common in fruit breeding because an organized fruit breeding program has a well-defined recording system and continuously accumulates phenotypic records along with pedigree information, such as in [7,8]. Therefore, phenotypic and pedigree information from non-genotyped relatives could be used to improve the accuracy of GS modeling in fruit breeding.

For GS in animal breeding, phenotypic data from non-genotyped relatives are often incorporated to obtain regular breeding values for genotyped individuals using pedigree information and, subsequently, a genomic prediction model is constructed by combining the estimated breeding values and genotypes via multiple steps [9,10]. This procedure is called multiple-step GS, which can be complicated to perform, and can result in lower accuracy, biased outputs, or loss of information [11]. In contrast to multiple-step GS, single-step genomic best linear unbiased prediction (ssGBLUP) has been proposed [11,12], where phenotypic data from both genotyped and non-genotyped individuals are jointly analyzed to predict breeding values of all individuals using a mixed linear model. The relationship matrix of this model is obtained by combining genomic relationship information among genotyped individuals with pedigree information between genotyped and non-genotyped individuals and within non-genotyped individuals [13]. Thus, ssGBLUP can predict the breeding values of both genotyped and non-genotyped individuals simultaneously, with lower bias and increased accuracy compared to multiple-step methods [14,15]. Therefore, ssGBLUP could be a promising tool in fruit breeding.

In the procedure of ssGBLUP, a combined relationship matrix, denoted as the H matrix, is computed from a genomic relationship matrix and a pedigree-based relationship matrix, referred to as the G matrix and A matrix, respectively, to fit the best linear unbiased prediction (BLUP) model [13]. Through the H matrix, A is augmented by G and vice versa, enabling ssGBLUP to improve accuracy in the evaluation of breeding values for both genotyped and non-genotyped relatives. However, although ssGBLUP has several advantages over the multiple-step method as described above, the application of this method for plant breeding has been limited to several species, including rice (Oryza sativa L.) [16,17], wheat (Triticum aestivum L.) [18,19], maize (Zea mays L.) [20], and those of forest trees [21–24], and to the best of our knowledge, no previous studies of ssGBLUP have been reported for fruit breeding. Accordingly, we applied ssGBLUP to a real dataset of fruit-quality traits obtained from an ongoing citrus breeding program. We compared the prediction accuracy of ssGBLUP with that of conventional methods in both genotyped and non-genotyped individuals.

Materials and methods

Plant materials and phenotypic records

An outline of the plant materials tested is shown in Fig 1. A total of 1935 individuals were obtained from the Kuchinotsu Citrus Research Station, National Agriculture and Food Research Organization (NARO, Nagasaki, Japan). We used 106 parental cultivars and 1829 F1 individuals derived from 122 pair-cross families (hereafter, referred to as families). Both the parental cultivars and the F1 individuals were maintained as previously described [25]: briefly, the F1 individuals were each grafted onto one tree of trifoliate orange (Poncirus trifoliata L.) from 2006–2008, which were planted in the breeding fields at a spacing of 0.3 m within and 5 m between rows. Parental cultivars were grafted onto trifoliate orange or satsuma mandarin (Citrus unshiu Marcow.) interstocks in adjacent fields. Crosses were performed solely for producing commercial cultivars, and therefore, no specific mating design was adopted. All trees were maintained in accordance with the standard management protocol in Japan, namely, four applications of fertilizer and 10–20 applications of agrichemicals per year.

Fig 1. Outline of plant materials used in this study (parental cultivars, 106; F1 individuals, 1829; total, 1935).

F1 individuals were derived from crosses between two parental cultivars. Numbers in the boxes indicate number of individuals in each category described below. Gray and white boxes represent with or without single nucleotide polymorphism (SNP) data, respectively; 483 individuals (106 parental cultivars and 377 F1 individuals) have SNP data. Numbers in parentheses represent the number of pair-cross families; thus, e.g., 377 F1 individuals with SNP data were derived from nine pair-cross families. F1 individuals without SNP data were divided into two categories: those derived from pair-cross families that had less than 10 F1 individuals (upper) or more than 10 F1 individuals (lower). Family means of the phenotypic records of the latter category were targeted for cross-validation of non-genotyped individuals.

https://doi.org/10.1371/journal.pone.0221880.g001

Three fruit-quality traits, fruit weight (FW), sugar content (SC), and acid content (AC), were evaluated in each tree of the genotypes used in this study (Table 1). Five colored fruits were sampled for immediate evaluations in December, and FW, SC, and AC were determined annually from 2009–2012. Thus, all 1935 individuals were evaluated one to four times for each trait. These phenotypic records were collected through the selection process of our citrus breeding program at NARO, and are summarized in Table 2.

Table 2. Summary statistics of the phenotypic records evaluated in this study.

https://doi.org/10.1371/journal.pone.0221880.t002

Marker genotypes

All 106 parental cultivars and 377 F1 individuals derived from nine families were genotyped using the genotyping-by-sequencing (GBS) method [26] to obtain genome-wide single nucleotide polymorphism (SNP) data. Accordingly, 483 of 1935 individuals have SNP data (outlined in Fig 1). The obtained SNP data were subsequently subjected to quality control (QC) procedures: briefly, SNP loci with a call rate <0.80 and a minor allele frequency <0.01 were removed. The remaining SNPs were further filtered based on the consistency of Mendelian inheritance, and missing SNP genotypes were imputed by Fimpute v. 2.2 [27]. Following the imputation process, highly correlated SNP loci were eliminated according to Wiggans et al. [28]. The detailed GBS conditions and QC procedure, including the extent of linkage disequilibrium (LD), were described previously by Imai et al. [29].
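The call-rate and minor-allele-frequency thresholds above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the 0/1/2 allele-count coding with `None` for a missing call are assumptions made for illustration, and the sketch assumes that a locus is dropped when it fails either threshold.

```python
def call_rate(genotypes):
    """Fraction of non-missing calls at one SNP (None marks a missing call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF computed from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p)


def filter_snps(snps, min_call_rate=0.80, min_maf=0.01):
    """Return indices of SNPs passing both thresholds.

    `snps` is a list of SNPs, each a list of genotype calls across individuals.
    Assumption: a SNP is removed if it fails either threshold.
    """
    return [
        j for j, snp in enumerate(snps)
        if call_rate(snp) >= min_call_rate
        and minor_allele_frequency(snp) >= min_maf
    ]


# Hypothetical toy data: 4 SNPs scored in 5 individuals.
toy_snps = [
    [0, 1, 2, 1, 1],           # fully called, polymorphic: kept
    [0, 0, 0, 0, 0],           # monomorphic (MAF = 0): removed
    [None, None, 1, 2, 0],     # call rate 0.6: removed
    [2, 2, 1, 0, 1],           # fully called, polymorphic: kept
]
kept = filter_snps(toy_snps)
```

The Mendelian-consistency filter, imputation, and pruning of highly correlated loci described above are separate steps and are not covered by this sketch.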

Prediction models

The following linear mixed model was applied to compare the prediction performance of ssGBLUP with that of genomic best linear unbiased prediction (GBLUP) in genotyped individuals and that of conventional BLUP (ABLUP) in non-genotyped individuals:

$$y = Xb + Zu + e \qquad (1)$$

where y is a vector of phenotypic records of the 1935 individuals observed from 2009–2012, b is a vector of fixed effects including an intercept and year effect, X is a design matrix relating b to y, and Z is an incidence matrix relating u to y. The vector u represents breeding values as described below, and e is a vector of residuals assuming $e \sim N(0, I\sigma_e^2)$, where I is an identity matrix and $\sigma_e^2$ represents residual variance.

In Eq (1), u is assumed to follow a normal distribution with a mean vector of 0 and a covariance matrix $A\sigma_u^2$ in the ABLUP model and $G\sigma_u^2$ in the GBLUP model, where $\sigma_u^2$ is the additive genetic variance, and A and G represent a pedigree-based additive relationship matrix and a realized genomic matrix, respectively. We calculated G from SNP data according to VanRaden’s first method [6]. Using the A and G matrices, the best linear unbiased predictor of u, denoted by $\hat{u}$, was calculated for ABLUP as follows:

$$\hat{u} = (Z'MZ + \lambda A^{-1})^{-1} Z'My \qquad (2)$$

and for GBLUP as follows:

$$\hat{u} = (Z'MZ + \lambda G^{-1})^{-1} Z'My \qquad (3)$$

where λ is given as $\lambda = \sigma_e^2 / \sigma_u^2$, and M is a projection matrix defined as $M = I - X(X'X)^{-1}X'$. The A and G matrices were computed using airemlf90 [30] and preGSf90 software [12,31], respectively, and $\hat{u}$ was calculated using airemlf90 software [30].
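As an illustration of the two building blocks named in this paragraph, the sketch below computes a genomic relationship matrix with VanRaden's first method (centered allele counts W, G = WW' / (2 Σ p_j(1 − p_j))) and solves the BLUP equations (Z'MZ + λK⁻¹)û = Z'My for a generic relationship matrix K. It is a NumPy sketch under stated assumptions, not the airemlf90/preGSf90 implementation: the function names are hypothetical, allele frequencies are estimated from the genotyped sample itself, and λ is supplied rather than estimated by REML.

```python
import numpy as np


def vanraden_g(genotypes):
    """Genomic relationship matrix, VanRaden's first method.

    `genotypes` is an (individuals x SNPs) array of 0/1/2 allele counts.
    Assumption: allele frequencies are taken from this sample.
    """
    geno = np.asarray(genotypes, dtype=float)
    p = geno.mean(axis=0) / 2.0          # allele frequency per SNP
    w = geno - 2.0 * p                   # centered allele counts
    return w @ w.T / (2.0 * np.sum(p * (1.0 - p)))


def blup(y, X, Z, K, lam):
    """Solve (Z'MZ + lam * K^-1) u = Z'My, with M = I - X (X'X)^-1 X'.

    K is a relationship matrix (A or G) and must be invertible;
    lam is the variance ratio sigma_e^2 / sigma_u^2.
    """
    y, X, Z, K = (np.asarray(a, dtype=float) for a in (y, X, Z, K))
    M = np.eye(len(y)) - X @ np.linalg.pinv(X.T @ X) @ X.T
    lhs = Z.T @ M @ Z + lam * np.linalg.inv(K)
    return np.linalg.solve(lhs, Z.T @ M @ y)


# Hypothetical toy example: three unrelated individuals, one record each,
# intercept-only fixed effects, and equal variances (lam = 1).
u_hat = blup(y=[1.0, 2.0, 3.0], X=np.ones((3, 1)), Z=np.eye(3),
             K=np.eye(3), lam=1.0)
```

Note that a G matrix built from centered genotypes is singular, so it cannot be passed to `blup` directly; in practice it is blended with a pedigree matrix or otherwise regularized before inversion.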

In the ssGBLUP model, it was assumed that $u \sim N(0, H\sigma_u^2)$ in Eq (1). This H matrix combines pedigree and genomic relationships, and was defined previously [11] as follows:

$$H = \begin{bmatrix} A_{11} + A_{12}A_{22}^{-1}(G - A_{22})A_{22}^{-1}A_{21} & A_{12}A_{22}^{-1}G \\ GA_{22}^{-1}A_{21} & G \end{bmatrix} \qquad (4)$$

where A11, A12, A21, and A22 are submatrices of A, and the subscripts 1 and 2 represent non-genotyped and genotyped individuals, respectively. Through the H matrix, the prediction accuracy of genotyped individuals can be improved with data from non-genotyped relatives, and the prediction accuracy of non-genotyped individuals can also be improved by G, which accounts for the Mendelian sampling effect of genotyped relatives and can provide more accurate relationships than A. For the H matrix calculation, we scaled G based on A22 so that the mean diagonal and off-diagonal elements of G equal those of A22; appropriate scaling avoids biases in the breeding values of genotyped individuals [14]. The inverse of H has a simple form [12,32], and can be written with tuning parameters α, β, τ, and ω as follows:

$$H^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & \tau(\alpha G + \beta A_{22})^{-1} - \omega A_{22}^{-1} \end{bmatrix} \qquad (5)$$

Fine tuning of α, β, τ, and ω can increase the accuracy and reduce biases of genomic prediction of breeding values [33]. We used fixed values of α = 0.95 and β = 0.05 to enable inversion of the matrix. We assigned the same value to τ and ω (τ = ω); in this context, τ defines a mixing proportion of genomic and pedigree information [12]. If τ > 0 and τ = ω, then the ratio of genomic to pedigree information becomes τ:(1−τ). Adding pedigree information could be beneficial for capturing the polygenic effects that could not be accounted for by genomic information. We tested three values of τ (1.00, 0.75, and 0.50) for evaluation of prediction accuracy. Using the H matrix, the best linear unbiased predictor was calculated as follows:

$$\hat{u} = (Z'MZ + \lambda H^{-1})^{-1} Z'My \qquad (6)$$

where λ and M are defined in the same way as in the ABLUP and GBLUP models. The H matrices were computed using the preGSf90 software [12,31], and $\hat{u}$ was calculated using airemlf90 software [30].
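The construction of the inverse of H can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the preGSf90 implementation: it uses the commonly published form in which the genotyped block of A⁻¹ is incremented by τ(αG + βA22)⁻¹ − ωA22⁻¹, the function names are hypothetical, and the scaling step simply applies a linear transform a + bG so that the mean diagonal and mean off-diagonal elements match those of A22.

```python
import numpy as np


def mean_diag_offdiag(K):
    """Mean of the diagonal and mean of the off-diagonal elements of K."""
    n = K.shape[0]
    diag_sum = np.trace(K)
    return diag_sum / n, (K.sum() - diag_sum) / (n * (n - 1))


def scale_g_to_a22(G, A22):
    """Linear rescaling a + b*G matching the mean diagonal and mean
    off-diagonal of A22 (assumed form of the scaling described above)."""
    G, A22 = np.asarray(G, float), np.asarray(A22, float)
    dG, oG = mean_diag_offdiag(G)
    dA, oA = mean_diag_offdiag(A22)
    b = (dA - oA) / (dG - oG)
    a = dA - b * dG
    return a + b * G


def h_inverse(A, G, genotyped, tau=1.0, omega=1.0, alpha=0.95, beta=0.05):
    """Inverse of H: A^-1 plus a correction on the genotyped block.

    `genotyped` lists the row/column indices of genotyped individuals in A;
    G must be ordered the same way as those indices.
    """
    A, G = np.asarray(A, float), np.asarray(G, float)
    idx = np.asarray(genotyped)
    block = np.ix_(idx, idx)
    A22 = A[block]
    H_inv = np.linalg.inv(A)
    H_inv[block] += (tau * np.linalg.inv(alpha * G + beta * A22)
                     - omega * np.linalg.inv(A22))
    return H_inv


# Hypothetical toy pedigree: individual 0 is non-genotyped, 1 and 2 genotyped.
A_toy = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.25],
                  [0.5, 0.25, 1.0]])
G_toy = np.array([[1.1, 0.3],
                  [0.3, 0.9]])
H_inv_toy = h_inverse(A_toy, scale_g_to_a22(G_toy, A_toy[1:, 1:]), [1, 2])
```

With τ = ω = 0 the correction vanishes and the result reduces to A⁻¹, that is, to the ABLUP model; the same happens when G equals A22 and α + β = 1.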

Heritability estimation

Additive genetic variance ($\sigma_u^2$), residual variance ($\sigma_e^2$), and heritability in each trait were estimated based on the linear mixed model described above. We estimated the heritability by ABLUP and ssGBLUP with different τ values (1.00, 0.75, and 0.50). We did not calculate the heritability by the GBLUP method, because GBLUP can be applied only to the subset of individuals that were both phenotyped and genotyped.

Evaluation of prediction accuracy

The prediction accuracy of ssGBLUP was compared with that of GBLUP in genotyped individuals. Cross-validation (CV) was performed to evaluate these methods, assuming early selection at the juvenile stage. CV was also performed to compare the prediction accuracy of ssGBLUP with that of ABLUP in non-genotyped individuals.

To compare the prediction accuracy in genotyped individuals, genotypic values (i.e., sum of the intercept and breeding values) of individuals from nine genotyped families were calculated based on all phenotypic records of the 1935 individuals by the ABLUP method, and these values were predicted by the ssGBLUP and GBLUP methods. In each CV cycle, each of the nine genotyped families was omitted and the remaining individuals, including the parental cultivars and non-genotyped families (only in ssGBLUP), were used to construct the prediction model to predict the genotypic values of the omitted family. Thus, CV consisted of nine cycles and evaluated the accuracy of seedling selection based on SNP genotypes at the juvenile stage during cross-breeding. The prediction accuracy was evaluated as a correlation coefficient (r) between the targeted genotypic values and the predicted ones.
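The leave-one-family-out scheme described above can be sketched as a generator of training and validation splits. The function name, the dictionary layout, and the identifiers in the toy data are hypothetical; in an actual analysis the held-out individuals would stay in the relationship matrix while their phenotypic records are masked.

```python
def leave_one_family_out(family_of, target_families):
    """Yield (held_out_family, training_ids, validation_ids) per CV cycle.

    `family_of` maps an individual ID to its family label (None for
    individuals, such as parental cultivars, that belong to no target family).
    """
    for fam in target_families:
        validation = [i for i, f in family_of.items() if f == fam]
        training = [i for i, f in family_of.items() if f != fam]
        yield fam, training, validation


# Hypothetical toy data: two families and one parental cultivar.
toy_family_of = {"t1": "F1", "t2": "F1", "t3": "F2", "p1": None}
splits = list(leave_one_family_out(toy_family_of, ["F1", "F2"]))
```

Each cycle fits the model on the training records, predicts the genotypic values of the held-out family, and the correlation between target and predicted values is then computed.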

To compare the prediction accuracy in non-genotyped individuals, phenotypic mean values in each of the 50 non-genotyped families with more than 10 F1 individuals (hereafter, referred to as observed family means) were calculated as the target values of the CV procedures. These values were predicted by the ssGBLUP and ABLUP methods, which calculated the predicted genotypic values in each target family for validation. The phenotypic records used to calculate the family means were adjusted for the year effect, which was estimated from all observations of the 1935 individuals by the ABLUP method. In the ssGBLUP analysis, we adopted the fixed values of τ (ω) with the highest prediction accuracy in genotyped individuals. In each CV cycle, each of the 50 non-genotyped families was omitted and the remaining individuals, including the parental cultivars, genotyped families, and non-genotyped families with less than 10 F1 individuals, were used to construct the prediction model to predict the observed means of the omitted family. In this case, the predicted genotypic values became identical within a family, because their phenotypic records were omitted. The prediction accuracy was evaluated as a weighted correlation coefficient (r) between the target and predicted values. The weights of the correlation coefficient were determined from the numbers of F1 individuals in each family.
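A weighted correlation coefficient of this kind can be computed as sketched below. The text does not give the exact formula, so this sketch assumes the usual weighted Pearson correlation, with family sizes used directly as weights; the function name and the toy values are hypothetical.

```python
import math


def weighted_correlation(x, y, w):
    """Weighted Pearson correlation between x and y with weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: observed and predicted means of three families,
# weighted by the number of F1 individuals in each family.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
family_sizes = [10, 20, 30]
r = weighted_correlation(observed, predicted, family_sizes)
```

With equal weights the function reduces to the ordinary Pearson correlation, so larger families influence the accuracy estimate more only when family sizes differ.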

Results

Heritability estimation

We estimated heritability using ABLUP and ssGBLUP with three τ values (Table 3). Heritability ranged from 0.57 to 0.82 in the three fruit-quality traits, and AC showed the highest estimates of heritability. These estimates were somewhat lower than those from our previous report [25], which reflects the differences in the populations analyzed. In all traits, ABLUP and ssGBLUP offered almost the same heritability estimates. The mixing proportion τ of ssGBLUP also had little effect on heritability estimation; thus, we considered that GBLUP would provide similar estimates of heritability in our case.

Table 3. Heritability estimated by ABLUP and ssGBLUP methods.

https://doi.org/10.1371/journal.pone.0221880.t003

Comparison of prediction accuracy in genotyped individuals

The GBS approach and successive QC procedures provided 2353 SNPs from 483 individuals. Using the SNP data and pedigree information of all individuals, we constructed H matrices and applied them to the ssGBLUP to evaluate the prediction accuracy in genotyped individuals and to compare with those of GBLUP. For H matrix construction, we used the three values of τ (1.00, 0.75, and 0.50), which define the mixing proportion of genomic and pedigree information. Thus, we compared three ssGBLUP models with different τ values and one GBLUP model for three fruit-quality traits.

The CV for each genotyped family showed a similar or higher accuracy in ssGBLUP compared with GBLUP (Table 4; S1–S3 Figs). Our results showed rather lower prediction accuracy for GBLUP than that of a previous study that evaluated the same traits [34]; the reduced accuracy may be caused by the differences in SNPs, plant materials, and the procedures of CV. A considerable improvement in accuracy was attained in SC, and similar accuracy was obtained in FW and AC. The comparisons between the ssGBLUP models with different τ values showed that the highest accuracy was obtained when τ = 1.00 for FW, 0.50 for SC, and 0.75 for AC (Table 4). However, the differences in accuracy among τ values were small; thus, τ had little effect on the accuracy of ssGBLUP.

Table 4. Comparison of prediction accuracy between ssGBLUP and GBLUP methods in genotyped individuals.

https://doi.org/10.1371/journal.pone.0221880.t004

Comparison of prediction accuracy in non-genotyped individuals

The H matrix in ssGBLUP combined pedigree and genomic relationships. Consequently, this method could provide more accurate genetic evaluation, even for non-genotyped relatives, than the ABLUP method, which uses only a pedigree-based additive relationship matrix. We validated the prediction accuracy of the ssGBLUP and ABLUP methods. The observed family means in each of the 50 non-genotyped families were predicted using CV procedures for each trait. Slightly higher correlation coefficients resulted from the ssGBLUP method compared to those of the ABLUP method (Table 5; S4–S6 Figs). The improvement in prediction accuracy achieved by ssGBLUP was higher for SC than it was for FW and AC. Although the prediction accuracy was considerably different for each trait, large discrepancies between the observed and predicted values were commonly detected in several families.

Table 5. Comparison of prediction accuracy between ABLUP and ssGBLUP methods in non-genotyped individuals.

https://doi.org/10.1371/journal.pone.0221880.t005

Discussion

Recently, GS has attracted the attention of those involved in fruit breeding, because it has the potential to capture minor gene effects, and thus provide more accurate selection of complex quantitative traits of economic importance [34–37]. However, to construct reliable models for GS, a sufficiently large training population with both genotyped and phenotyped individuals is required [38,39]. This is one of the main obstacles for the introduction of GS for fruit breeding, because a long juvenile period and large plant size hinder the rapid accumulation of phenotypic data such as fruit-quality traits. In addition, the genotypic data necessary for GS can only be obtained from living individuals, although most individuals evaluated in breeding programs are culled after selection. Thus, obtaining both genotype and phenotype records for GS model construction is more difficult for fruit breeding than it is for animal breeding or other crop breeding.

One possible solution for constructing reliable GS models in fruit breeding would be to use previously accumulated phenotype records, e.g., from the breeding procedure, which can be achieved using the ssGBLUP methodology. Generally, an organized fruit breeding program includes well-defined maintenance protocols for the breeding materials [40], and phenotyping protocols [41–43]. These practices enable the continuous accumulation of phenotypes and other records that are useful for breeding such as those containing pedigree information [7,8,25]. Therefore, ssGBLUP can be introduced into fruit breeding programs with few changes to the existing system for maintenance of breeding materials and phenotypic evaluations.

In the present study, we compared the prediction performance of ssGBLUP with that of GBLUP, assuming selection at the juvenile stage in the genotyped individuals. We also compared the prediction performance of ssGBLUP with that of conventional ABLUP in the non-genotyped individuals. Our results showed that ssGBLUP equaled or outperformed GBLUP and ABLUP in terms of prediction accuracy in all cases, especially for SC. These gains in prediction accuracy were consistent with those from previous reports on different plants, such as rice [16,17] and wheat [18,19], and on domesticated animals including dairy cattle [44,45], beef cattle [46,47], pigs [15], and chickens [48]. With the H matrix, genomic information that can account for Mendelian sampling is incorporated into standard BLUP models. Furthermore, much larger datasets of phenotypic information can be used with the ssGBLUP method than with the GBLUP method. These advantages of using the ssGBLUP method are herein confirmed for citrus.

Although we have demonstrated the potential of ssGBLUP for use in citrus breeding, there remain several problems. For CV of non-genotyped individuals, large discrepancies between observed and predicted family means were detected (S4–S6 Figs). These large discrepancies indicate that predictions from the ssGBLUP method could be inaccurate in some cases, at least for fruit-quality traits in citrus. One possible cause of these large discrepancies may be the influence of non-additive effects, such as dominance or epistasis effects. Under the assumption of an infinitesimal model [49], ssGBLUP assumes additive polygenic effects as the mode of inheritance for target traits. Although the assumption of additive effects captures a large part of dominance and epistasis effects [50,51], the predictions from ssGBLUP may, in some cases, include outliers that are affected by large non-additive effects, even for traits with moderate to high narrow-sense heritability such as those analyzed in our study.

In addition to the problems from non-additive effects, several previous studies have reported factors that influence the accuracy of genomic predictions, including training population size, heritability, genetic architecture of target traits, extent of LD, and marker density [39,52–54]. Among these factors, the extent of LD determines the marker density necessary for genomic predictions, and a number of SNP markers that is insufficient relative to the extent of LD decreases the model’s prediction accuracy due to imperfect associations between quantitative trait loci (QTL) and SNP markers. Our previous study and others have reported relatively high LD in fruit breeding populations [29,34,37]. Thus, a smaller number of SNPs may be sufficient for GS in an advanced fruit breeding population. In addition, for the GBLUP method (and also for ssGBLUP), the effect of increasing the number of SNPs on prediction accuracy appears mainly as a reduction in the sampling error of G, and a larger number of SNPs would provide only small improvements in accuracy if the effects of QTLs are well captured by a small number of SNPs [33]. However, if this is not the case, it may be desirable to capture polygenic effects using an A matrix and tuning the mixing proportion of the A and G matrices [33]. Nevertheless, our study demonstrated that the τ parameters had little effect on prediction accuracy for the three fruit-quality traits tested. This is in contrast to the results of the first report on ssGBLUP in plants [18], which stated the importance of trait-specific weighting parameters (τ parameters in the present study). Owing to the inconsistent results for τ parameters observed in the previous report and the present study, the effect of τ parameters on prediction accuracy should be carefully considered when they are applied to other traits or other species of fruit.

The accuracy of genomic predictions is also affected by the heritability of the target traits [39]; higher prediction accuracy is obtained for a trait with higher heritability. Among the three fruit-quality traits evaluated in this study, AC showed the highest heritability, and showed slightly or considerably higher prediction accuracy compared with the other two traits in both genotyped and non-genotyped individuals (Tables 4 and 5). These results suggest that heritability can be a measure for evaluating prediction accuracy in genomic predictions with ssGBLUP for fruit breeding, although an inconsistent result was observed between FW and SC in non-genotyped individuals. Furthermore, the heritability of target traits is used to estimate the training population size necessary to achieve a predetermined accuracy of genomic predictions [55], and a larger population size is necessary if heritability is low. Although ssGBLUP could achieve larger sample sizes compared with those of GBLUP, a greater number of individuals is desirable for the construction and validation of GS models, especially for low-heritability traits. Our study included only moderate to high heritability traits (0.57 to 0.82); therefore, the prediction accuracy of ssGBLUP for lower-heritability traits should be further evaluated with larger datasets in future studies.

As for the genetic architecture of target traits, ssGBLUP assumes additive polygenic inheritance, in which a large number of QTLs each contribute a small effect to the target trait. However, several studies have reported QTLs with large effects in the three fruit-quality traits evaluated in our study [29,34,56–58]. Therefore, these genetic architectures may decrease the prediction accuracy of ssGBLUP. As a modified ssGBLUP method, a single-step methodology using Bayesian regression, which can assume different marker variances, was recently proposed by Fernando et al. [59]. Their method can treat large QTL effects, which are estimated as marker effects in the prediction model, and thus has the potential to further improve genomic prediction accuracy. Although the studies of Hayes et al. [9] and VanRaden et al. [10] indicated that a suitable number of markers with equal variance is appropriate for most traits, the application of Fernando et al.’s single-step methodology using Bayesian regression may be an alternative choice for GS in fruit breeding.

Although there are still problems to overcome, we have demonstrated the potential of ssGBLUP for fruit breeding using actual data from citrus. We consider that several features of the ssGBLUP methodology, which uses information from both genotyped and non-genotyped relatives in a simple manner, make it suitable for ongoing fruit breeding programs. The advantages of ssGBLUP and other single-step GS approaches can increase in the future with the accumulation of larger phenotypic and genotypic datasets.

Supporting information

S1 Fig. Plots of the estimated genotypic values vs. predicted genotypic values via cross-validation in fruit weight.

Estimated genotypic values were calculated using a numerator relationship matrix (A) including all observations from 1935 individuals. Predicted genotypic values via cross-validation were calculated using a genomic relationships matrix (G, GBLUP) or combined H matrix from G and A (single-step GBLUP) excluding phenotypic records of each target family for cross-validation. (a) GBLUP model (b) ssGBLUP model with τ = 0.50 (c) ssGBLUP model with τ = 0.75 (d) ssGBLUP model with τ = 1.00.

https://doi.org/10.1371/journal.pone.0221880.s001

(PDF)

S2 Fig. Plots of the estimated genotypic values vs. predicted genotypic values via cross-validation in sugar content.

Estimated genotypic values were calculated using a numerator relationship matrix (A) including all observations from 1935 individuals. Predicted genotypic values via cross-validation were calculated using a genomic relationships matrix (G, GBLUP) or combined H matrix from G and A (single-step GBLUP) excluding phenotypic records of each target family for cross-validation. (a) GBLUP model (b) ssGBLUP model with τ = 0.50 (c) ssGBLUP model with τ = 0.75 (d) ssGBLUP model with τ = 1.00.

https://doi.org/10.1371/journal.pone.0221880.s002

(PDF)

S3 Fig. Plots of the estimated genotypic values vs. predicted genotypic values via cross-validation in acid content.

Estimated genotypic values were calculated using a numerator relationship matrix (A) including all observations from 1935 individuals. Predicted genotypic values via cross-validation were calculated using a genomic relationships matrix (G, GBLUP) or combined H matrix from G and A (single-step GBLUP) excluding phenotypic records of each target family for cross-validation. (a) GBLUP model (b) ssGBLUP model with τ = 0.50 (c) ssGBLUP model with τ = 0.75 (d) ssGBLUP model with τ = 1.00.

https://doi.org/10.1371/journal.pone.0221880.s003

(PDF)

S4 Fig. Plots of the observed family means vs. predicted family means via cross-validation in fruit weight.

Observed family means are the means of phenotypic records, and predicted family means are the predicted genotypic values, in each pair-cross family. Phenotypic records used to calculate observed family means were adjusted for year effects. Predicted values via cross-validation were calculated using a pedigree-based BLUP model (ABLUP) or a single-step GBLUP model (ssGBLUP), excluding the phenotypic records of each target family; thus, all individuals within a family received the same predicted value. The mixing proportion τ that gave the highest accuracy in predicting genotypic values was used for the ssGBLUP model. (a) ABLUP model; (b) ssGBLUP model with τ = 1.00.

https://doi.org/10.1371/journal.pone.0221880.s004

(PDF)

S5 Fig. Plots of the observed family means vs. predicted family means via cross-validation in sugar content.

Observed family means are the means of phenotypic records, and predicted family means are the predicted genotypic values, in each pair-cross family. Phenotypic records used to calculate observed family means were adjusted for year effects. Predicted values via cross-validation were calculated using a pedigree-based BLUP model (ABLUP) or a single-step GBLUP model (ssGBLUP), excluding the phenotypic records of each target family; thus, all individuals within a family received the same predicted value. The mixing proportion τ that gave the highest accuracy in predicting genotypic values was used for the ssGBLUP model. (a) ABLUP model; (b) ssGBLUP model with τ = 0.50.

https://doi.org/10.1371/journal.pone.0221880.s005

(PDF)

S6 Fig. Plots of the observed family means vs. predicted family means via cross-validation in acid content.

Observed family means are the means of phenotypic records, and predicted family means are the predicted genotypic values, in each pair-cross family. Phenotypic records used to calculate observed family means were adjusted for year effects. Predicted values via cross-validation were calculated using a pedigree-based BLUP model (ABLUP) or a single-step GBLUP model (ssGBLUP), excluding the phenotypic records of each target family; thus, all individuals within a family received the same predicted value. The mixing proportion τ that gave the highest accuracy in predicting genotypic values was used for the ssGBLUP model. (a) ABLUP model; (b) ssGBLUP model with τ = 0.75.

https://doi.org/10.1371/journal.pone.0221880.s006

(PDF)

Acknowledgments

We acknowledge all the staff members responsible for the agricultural fields at Kuchinotsu Citrus Research Station, NARO, for their careful management of the plant materials used in this study.

References

  1. 1. Heffner EL, Sorrells ME, Jannink J-L. Genomic selection for crop improvement. Crop Sci. 2009;49:1–12.
  2. 2. Jannink J-L, Lorenz AJ, Iwata H. Genomic selection in plant breeding: from theory to practice. Brief Funct Genomics. 2010;9:166–177. pmid:20156985
  3. 3. Iwata H, Minamikawa MF, Kajiya-Kanegae H, Ishimori M, Hayashi T. Genomics-assisted breeding in fruit trees. Breed. Sci. 2016;66:100–15. pmid:27069395
  4. 4. Yamamoto T. Breeding, genetics, and genomics of fruit trees. Breeding Sci. 2002; 66:1–2. pmid:27069386
  5. 5. Desta ZA, Ortiz R. Genomic selection: genome-wide breeding value prediction in plant improvement. Trends Plant Sci. 2014;19:592–601. doi: https://doi.org/https://doi.org/10.1016/j.tplants.2014.05.006. pmid:24970707
  6. 6. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–4423. pmid:18946147
  7. 7. Kouassi AB, Durel CE, Costa F, Tartarini S, van de Weg E, Evans K, et al. Estimation of genetic parameters and prediction of breeding values for apple fruit-quality traits using pedigreed plant material in Europe. Tree Genet Genomes. 2009;5:659–672.
  8. 8. Kumar S, Volz RK, Alspach PA, Bus VGM. Development of a recurrent apple breeding programme in New Zealand: a synthesis of results, and a proposed revised breeding strategy. Euphytica. 2010;173:207–222.
  9. 9. Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME. Invited review: Genomic selection in dairy cattle: Progress and challenges J Dairy Sci. 2009;92:433–443. pmid:19164653
  10. 10. VanRaden PM Van Tassell CP Wiggans GR Sonstegard TS Schnabel RD Taylor JF Schenkel FS. Invited review: Reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009b;92:16–24. pmid:19109259
  11. 11. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–4663. pmid:19700729
  12. 12. Aguilar I, Misztal I, DL Johnson, Legarra A, Tsuruta S, Lawlor TJ. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J Dairy Sci. 2010;93:743–752. pmid:20105546
  13. 13. Henderson CR. Sire evaluation and genetic trends. In: Proceedings of the Animal Breeding and Genetics Symposium: In honor of Dr. Jay L. Lush. Champaign, Illinois: Am Soc Anim Sci; 1973. pp 10–41.
  14. 14. Vitezica ZG, Aguilar I, Misztal I, Legarra A. Bias in genomic predictions for populations under selection. Genet Res. 2011;93:357–366. pmid:21767459
  15. 15. Christensen OF, Madsen P, Nielsen B, Ostersen T, Su G. Single-step methods for genomic evaluation in pigs. Animal. 2012;6:1565–1571. pmid:22717310
  16. 16. Morais Júnior OP, Duarte JB, Breseghello F, Coelho AS, Morais OP, Magalhães Júnior AM. Single-step reaction norm models for genomic prediction in multienvironment recurrent selection trials. Crop Science. 2018;58(2):592–607.
  17. 17. Morais Júnior OP, Breseghello F, Duarte JB, Coelho AS, Borba TC, Aguiar JT, Neves PC, Morais OP. Assessing Prediction Models for Different Traits in a Rice Population Derived from a Recurrent Selection Program. Crop Sci. 2018;58:1–13.
  18. 18. Ashraf B, Edriss V, Akdemir D, Autrique E, Bonnett D, Crossa J, et al. Genomic prediction using phenotypes from pedigree lines with no markers. Crop Sci. 2016;56:957–964.
  19. 19. Pérez-Rodríguez P, Crossa J, Rutkoski J, Poland J, Singh R, et al. Single-Step Genomic and Pedigree Genotype × Environment Interaction Models for Predicting Wheat Lines in International Environments. Plant Genome. 2017;10:2. pmid:28724079
  20. 20. Westhues M, Heuer C, Thaller G, Fernando R, Melchinger AE. Efficient genetic value prediction using incomplete omics data. Theor Appl Genet. 2019:1–12.
  21. 21. Ratcliffe B, El-Dien OG, Cappa EP, Porth I, Klápště J, Chen C, et al. Single-step BLUP with varying genotyping effort in open-pollinated Picea glauca. G3. 2017;7:935–942. pmid:28122953
  22. 22. Cappa EP, El–Kassaby YA, Munoz F, Garcia MN, Villalba PV, Klapště J, Poltri SNM. Improving accuracy of breeding values by incorporating genomic information in spatial‐competition mixed models. Mol Breeding. 2017;37:125.
  23. 23. Cappa EP, El-Kassaby YA, Muñoz F, Garcia MN, Villalba PV, Klápště J, et al. Genomic-based multiple-trait evaluation in Eucalyptus grandis using dominant DArT markers. Plant Sci. 2018;271:27–33. pmid:29650154
  24. 24. Klápště J, Suontama M, Dungey HS, Telfer EJ, Graham NJ, Low CB, et al. Effect of Hidden Relatedness on Single-Step Genetic Evaluation in an Advanced Open-Pollinated Breeding Program. J Hered. 2018;109:802–810. pmid:30285150
  25. 25. Imai A, Kuniga T, Yoshioka T, Nonaka K, Mitani N, Fukamachi H, et al. Evaluation of the best linear unbiased prediction method for breeding values of fruit-quality traits in citrus. Tree Genet Genomes, 2016;12:119.
  26. 26. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE. 2011;6:e19379. pmid:21573248
  27. 27. Sargolzaei M, Chesnais JP, Schenkel FS. A new approach for efficient genotype imputation using information from relatives. BMC Genomics. 2014;15:478. pmid:24935670
  28. 28. Wiggans GR, Sonstegard TS, Vanraden PM, Matukumalli LK, Schnabel RD, Taylor JF, Schenkel FS, Van Tassell CP. Selection of single-nucleotide polymorphisms and quality of genotypes used in genomic evaluation of dairy cattle in the United States and Canada. J Dairy Sci. 2009;92:3431–3436. pmid:19528621
  29. 29. Imai A, Nonaka K, Kuniga T, Yoshioka T, Hayashi T. Genome-wide association mapping of fruit-quality traits using genotyping-by-sequencing approach in citrus landraces, modern cultivars, and breeding lines in Japan. Tree Genet Genomes. 2018;14:24.
  30. 30. Misztal I, Tsuruta S, Strabel T, Auvray B, Druet T, Lee DH, Ducrocq V, Elsen JM, Minvielle F. BLUPF90 and related programs (BGF90). In: Proceedings of the 7th World Congress on Genetics Applied to Livestock Production. Montpellier (France), 2002, pp. 28–07.
  31. 31. Aguilar I, Misztal I, Legarra A, Tsuruta S. Efficient computation of the genomic relationship matrix and other matrices used in single-step evaluation. J Anim Breed Genet. 2011;128:422–428. pmid:22059575
  32. 32. Christensen O, Lund M. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2. pmid:20105297
  33. 33. Misztal I, Aggrey SE, Muir WM. Experiences with a single-step genome evaluation1. Poult Sci. 2013;92:2530–2534. pmid:23960138
  34. 34. Minamikawa MF, Nonaka K, Kaminuma E, Kajiya-Kanegae H, Onogi A, Goto S, et al. Genome-wide association study and genomic prediction in citrus: potential of genomics-assisted breeding for fruit quality traits. Sci Rep. 2017;7(1):4721–4734. pmid:28680114
  35. 35. Kumar S, Garrick DJ, Bink MC, Whitworth C, Chagné D, Volz RK. Novel genomic approaches unravel genetic architecture of complex traits in apple. BMC Genomics. 2013;14:393. pmid:23758946
  36. 36. Iwata H, Hayashi T, Terakami S, Takada N, Sawamura Y, Yamamoto T. Potential assessment of genome-wide association study and genomic selection in Japanese pear Pyrus pyrifolia. Breed Sci. 2013;63:125–140. pmid:23641189
  37. 37. Minamikawa MF, Takada N, Terakami S, Saito T, Onogi A, Kajiya-Kanegae H, et al. Genome-wide association study and genomic prediction using parental and breeding populations of Japanese pear (Pyrus pyrifolia Nakai). Sci Rep. 2018;8:11994. pmid:30097588
  38. 38. Meuwissen TH, Hayes BJ, Goddard ME. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157:1819–1829. pmid:11290733
  39. 39. Goddard ME. Genomic selection: prediction of accuracy and maximisation of long term response. Genetica. 2009;136(2):245–257. pmid:18704696
  40. 40. Mitani N, Matsumoto R, Yoshioka T, Kuniga T. Citrus hybrid seedlings reduce initial time to flower when grafted onto shiikuwasha rootstock. Sci. Hortic. 2008;116:452–455. doi.org/10.1016/j.scienta.2008.03.003.
  41. 41. Evans K, Guan Y, Luby J, Clark M, Schmitz C, Brown S, et al. Large-scale standardized phenotyping of apple in RosBREED. Acta Hortic. 2012;945:233–238.
  42. 42. Mathey MM, Mookerjee S, Gündüz K, Hancock JF, Iezzoni AF, Mahoney LL, et al. Large-scale standardized phenotyping of strawberry in RosBREED. J Am Pom Soc. 2013;67:205–216.
  43. 43. Frett T, Gasic K, Clark J, Byrne D, Gradziel T, Crisosto C. Standardized phenotyping for fruit quality in peach [Prunus persica (L.) Batsch]. J Am Pom Soc. 2012;66:214–219.
  44. 44. Gao H, Christensen OF, Madsen P, Nielsen US, Zhang Y, Lund MS, Su G. Comparison on genomic predictions using three GBLUP methods and two single-step blending methods in the Nordic Holstein population. Genet Sel Evol. 2012;44:8. pmid:22455934
  45. 45. Koivula M, Stranden I, Su G, Mantysaari EA. Different methods to calculate genomic predictions—comparisons of BLUP at the single nucleotide polymorphism level (SNP-BLUP), BLUP at the individual level (G-BLUP), and the one-step approach (H-BLUP). J Dairy Sci. 2012;95:4065–73. pmid:22720963
  46. 46. Onogi A, Ogino A, Komatsu T, Shoji N, Simizu K, Kurogi K, et al. Genomic prediction in Japanese Black cattle: application of a single-step approach to beef cattle. J Anim Sci. 2014;92:1931–1938. pmid:24782393
  47. 47. Lourenco DAL, Tsuruta S, Fragomeni BO, Masuda Y, Aguilar I, Legarra A, et al. Genetic evaluation using single-step genomic BLUP in American Angus. J Anim Sci. 2015;93:2653–2662. pmid:26115253
  48. 48. Chen C, Misztal I, Aguilar I, Tsuruta S, Aggrey S, Wing T, Muir W. Genome-wide marker-assisted selection combining all pedigree phenotypic information with genotypic data in one step: an example using broiler chickens. J Anim Sci. 2011;89:23–28. pmid:20889689
  49. 49. Fisher RA. XV.—The correlation between relatives on the supposition of Mendelian inheritance. T Roy Soc Edin. 1918;52:399–433.
  50. 50. Hill WG, Goddard ME, Visscher PM. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 2008; 4:e1000008. pmid:18454194
  51. 51. Crow JF. On epistasis: why it is unimportant in polygenic directional selection. Philos T Roy Soc B. 2010;365:1241–1244. pmid:20308099
  52. 52. Solberg TR, Sonesson AK, Woolliams JA, Meuwissen TH. Genomic selection using different marker types and densities. J Anim Sci. 2008;86:2447–54. pmid:18407980
  53. 53. Habier D, Tetens J, Seefried FR, Lichtner P, Thaller G. The impact of genetic relationship information on genomic breeding values in German Holstein cattle. Genet Sel Evol. 2010;42:5. pmid:20170500
  54. 54. Daetwyler HD, Pong-Wong R, Villanueva B, Woolliams JA. The impact of genetic architecture on genome-wide evaluation methods. Genetics. 2010;185:1021–1031. pmid:20407128
  55. 55. Daetwyler HD, Villanueva B, Woolliams JA. Accuracy of predicting the genetic risk of disease using a genome-wide approach. PLoS ONE. 2008;3:e3395. pmid:18852893
  56. 56. Imai A, Yoshioka T, Hayashi T. Quantitative trait locus (QTL) analysis of fruit-quality traits for mandarin breeding in Japan. Tree Genet Genomes. 2017;13:79.
  57. 57. Yu Y, Chen C, Gmitter FG. QTL mapping of mandarin (Citrus reticulata) fruit characters using high-throughput SNPmarkers. Tree Genet Genomes. 2016;12:77.
  58. 58. Asins MJ, Raga V, Bernet GP, Carbonell EA. Genetic analysis of reproductive, vegetative and fruit quality traits to improve Citrus varieties. Tree Genet Genomes. 2015;11:117.
  59. 59. Fernando RL, Dekkers JC, Garrick MD. A class of Bayesian methods to combine large numbers of genotyped and non-genotyped animals for whole-genome analyses. Genet Sel Evol. 2014;46:50. pmid:25253441