Quantitative genetic parameters for growth and wood properties in Eucalyptus “urograndis” hybrid using near-infrared phenotyping and genome-wide SNP-based relationships

A thorough understanding of the heritability, genetic correlations and additive and non-additive variance components of tree growth and wood properties is a requisite for effective tree breeding. This knowledge is essential to maximize genetic gain, that is, the amount of increase in trait performance achieved annually through directional selection. Understanding the genetic attributes of traits targeted by breeding is also important to sustain decade-long genetic progress, that is, the progress made by increasing the average genetic value of the offspring as compared to that of the parental generation. In this study, we report quantitative genetic parameters for fifteen growth, wood chemical and physical traits for the world-famous Eucalyptus urograndis hybrid (E. grandis × E. urophylla). These traits directly impact the optimal use of wood for cellulose pulp, paper, and energy production. A population of 1,000 trees sampled in a progeny trial was phenotyped directly or following the development and use of near-infrared spectroscopy calibration models. Trees were genotyped with 33,398 SNPs and 24,001 DArT-seq genome-wide markers and genomic realized relationship matrices (GRM) were used for parameter estimation with an individual-tree additive-dominant mixed model. Wood chemical properties and wood density showed stronger genetic control than growth, cellulose and fiber traits. Additive effects are the main drivers of genetic variation for all traits, but dominance plays an equally or more important role for growth, singularly in this hybrid. GRM´s with >10,000 markers provided stable relationships estimates and more accurate parameters than pedigrees by capturing the full genetic relationships among individuals and disentangling the non-additive from the additive genetic component. Low correlations between growth and wood properties indicate that simultaneous selection for wood traits can be applied with minor effects on genetic gain for growth. Conversely, moderate to strong correlations between wood density and chemical traits exist, likely due to their interdependency on cell wall structure such that responses to selection will be connected for these traits. Our results illustrate the advantage of using genome-wide marker data to inform tree breeding in general and have important consequences for operational breeding of eucalypt urograndis hybrids.


Introduction
The lack of genetic knowledge on wood properties has limited a better exploitation of the available inherent variation for these traits in breeding programs based on advanced generations of urograndis hybrids. While growth traits are easily measured in all trees of a progeny trial at half rotation age, about 3 to 4 years in tropical climates, the assessment of wood properties requires mature trees, such that adequate measurements are done typically closer to, or at rotation age. Moreover, standard methods are typically destructive, entail large samples, include the whole tree for some specific measurements, and are slow, laborious, and expensive. These constraints make wood analyses feasible only for a small number of trees already selected and deployed in final clonal trials of a selection cycle, and frequently target only wood density as a proxy for ultimate trait selection (i.e. for pulp yield or calorific power). However, the possibility of directly measuring chemical and physical traits has now become increasingly relevant especially as planted forests are also seen as the foundation of new industries replacing the use of fossil hydrocarbons for energy and industrial organic chemicals [16].
To reduce time and cost of wood phenotyping, methods that can predict wood traits based on the development of calibration models using near-infrared reflectance spectroscopy (NIRS) have become increasingly common [17][18][19]. These models use the spectra of a sample to predict its compounds or physical attributes. In Eucalyptus, a number of studies have successfully applied NIRS to predict chemical wood properties such as lignin content, syringyl/guaiacyl (S/ G) lignin ratio, cellulose, pulp yield, and wood density in temperate species such as E. globulus, E. nitens and its hybrids [20][21][22][23][24][25]. Only very few studies, however, have been reported for tropical species such as E. grandis [26] and E. urophylla [27][28][29]. Recently, a comprehensive study has developed global NIRS models using wood samples from different tropical, subtropical, and temperate eucalypt species grown in different locations around the world, but unfortunately no samples of urograndis hybrids were included [30]. Some of these studies successfully used NIRS predicted measurements to estimate genetic parameters and correlations for growth and wood quality traits in light of their paramount importance for successful breeding.
Accurate assessment of genetic parameters relies on the estimation of variance and covariance components, which in turn are a function of the genetic relatedness of the individuals sampled. Genetic control of a trait is then estimated by correlating the phenotypic resemblance with the expected proportion of the genome that two relatives share identical by descent. An expected coefficient of relationship of 0.25 is assumed for half-siblings and 0.5 for full-sibs based on the pedigree information and presumed unrelatedness of parents. However, in the mixed mating system of Eucalyptus, selfing, pollen contamination, or even identification errors are common during controlled crosses, notwithstanding potential cryptic relatedness of parents especially in elite breeding populations involving hybrid parents, a common feature in urograndis hybrid-based programs. Inaccurate relationships may lead to incorrectly estimated genetic parameters, which in turn can bias the predicted genetic gains either up or downwardly. The value of using molecular markers for more precise estimation of genetic parameters in forest tree breeding has been documented, and shown initially with a limited number of microsatellites [31][32][33]. A so-called genomic relationship matrix (GRM) built with marker data provides the effectively realized genetic relatedness among individuals, instead of only an expected relatedness when using the average numerator relationship matrix built from error-prone pedigree information [34]. With easier access to dense genome-wide genotyping platforms, this quantitative genomics approach has become increasingly used in forest trees providing not only better estimation of parameters, including untangling nonadditive genetic effects [15,[35][36][37][38][39][40][41], but also serving as the driving framework to improved genomewide association [42][43][44] and genomic selection [45]. In species of Eucalyptus, realized genetic relationships have been estimated using thousands of genome-wide DArT [46,47] and SNPs (Single Nucleotide Polymorphisms) markers [48]. In this study, we report quantitative genetic parameters for fifteen growth and wood quality traits in tropical urograndis hybrids, including chemical and physical properties that largely impact the use of the resource for pulp, paper and energy. Estimates were obtained and compared using both a pedigree-based average numerator relationship matrix A and GRMs built with data from two alternative genome-wide genotyping platforms: DArT-seq, a restriction enzyme based complexity reduction followed by genotyping-by-sequencing and the fixed-content SNP array EuCHIP60K.

Eucalyptus population and growth data
The Eucalyptus population employed in this study belongs to the breeding program of International Paper do Brasil. It involved a progeny trial encompassing almost exclusively E. grandis × E. urophylla F 1 , backcrosses, and F 2 full-sib families. The trial was planted in randomized complete blocks, with five trees per plot comprising 58 full-sib families generated by an incomplete half-diallel mating design, totaling 2,784 trees planted in 2006 in Brotas, São Paulo State, Brazil (22˚17 0 05@S 48˚07 0 38@W). A subset of 1,000 trees out of the 2,784 in the trial was sampled for measuring growth and wood properties and DNA marker genotyping (Fig 1). Sampling was carried out in a stratified manner to keep equivalent representation of the majority of families, with approximately 20 to 25 trees per family and avoiding off-type trees (e.g. branched, diseased, deformed, etc.). The sampled set of 1,000 trees included 45 full-sib families derived from 46 parents considered genetically unrelated based on the fact that they were originally selected in unrelated families. This sample included 610 trees (61%) of 25 F 1 hybrid families of E. grandis and E. urophylla, 366 trees (37%) of 19 backcross and F 2 hybrid families of E. grandis and E. urophylla, and 24 (2%) trees of one E. grandis × E. camaldulensis hybrid family (Table 1). For brevity, the population will be called an urograndis hybrid population. Diameter at breast height (DBH) and tree height (H), were measured in 2011 at age five, and wood volume (VOL) in cubic meters was estimated using a taper factor of 0.45 [49]. Mean annual increment (MAI) in m 3 .ha -1 .year -1 was calculated by multiplying the total tree volume by 1,200 trees per hectare and dividing the result by 5 years of growth.

Wood sampling and sample selection for wet lab analyses
Sampling of wood shavings for all 1,000 trees was performed when trees were 5-year-old. Bark was removed from both sides of the tree, and wood shavings were sampled through the entire stem using a driller (12 mm diameter) at breast high (1.3 m), always in a north-south direction (S1 Fig). Wood shavings were stored in paper envelopes dried at room temperature and ground using a Willey mill. The 40-60 mesh (0.297-0.420 mm) woodmeal portion of the sample was used in all subsequent wet chemistry lab and NIRS analyses (S2 Fig). To select a robust set of samples for wet lab chemical and physical trait measurements and for subsequent NIRS model calibration, spectra of woodmeal for all 1,000 samples were obtained using a NIRSystems 5000 equipment (FOSS, Hillerød, Denmark), reading every second wavelength, from 1,100 to 2,500 nm. Each sample was read 16 times, using the average of each one of 700 wavelengths. In order to reduce the cost and time needed for the wet lab chemical and physical wood analysis, a representative subset of 350 samples was selected constituting a NIRS calibration/validation set (Fig 1). Samples were selected by the Kennard and Stone sampling algorithm [50], based on Euclidean distances of samples spectra aiming at maximizing sample variation and representativeness of the range variation for predicted chemical variation. Growth data and wood sample NIRS spectra were collected from the sampled subset of 1,000 trees across full-sib families. A subset of between 200 and 350 trees selected based on maximizing NIRS spectra distance was phenotyped (wood chemical and physical traits) and data used to develop acceptable NIRS calibration models used to predicted lignin and wood density for the remaining 650 trees. Pedigree, genotypes (SNPs and DArT-seq), growth and wood trait data, either directly measured for 200 trees (cellulose, hemicellulose, microfibril angle, fibers, coarseness) or predicted for 1000 trees (lignin, wood density) were employed for genetic parameter estimation. Block arrows indicate step processes, thin dashed arrows indicate data or sample use. measured as follows: woodmeal sample (40-60 mesh) was extracted with acetone in Soxhlet apparatus for 12 hours. Cell wall carbohydrates, namely cellulose (CEL) and hemicellulose (HC), and lignin traits, acid-soluble lignin (SL) and acid-insoluble lignin (IL), in combination forming the total lignin (TL) were determined as described [51] using the Klason method, with small modifications. Cell wall carbohydrates were quantified with a high-performance liquid chromatography system (HPLC) using a Dionex (DX-600, Sunnyvalle, CA, USA) equipped with a PA1 (Dionex) column, detector with a gold electrode and SpectraAS3500 auto injector (Spectra-Physiscs, Santa Clara, CA, USA). Carbohydrates amounts were quantified relative to monomeric cell wall-associated carbohydrates (glucose, xylose, mannose, galactose, rhamnose and arabinose) as standards [35]. The amounts of Klason lignin and cell wall sugars were obtained in percentages, relative to the initial weight of dry wood sample analyzed. The ratio of lignin monomer subunits (syringil/guayacil) (S/G), was determined from acetoneextracted ground wood and analyzed via gas chromatography (GC) on a Hewlett Packard 5890 series II equipment (Agilent Technologies, Santa Clara, CA, USA), equipped with an autosampler, splitless injector, flame ionizing detector and a 30-m 5% diphenyl-95% dimethyl polysiloxane-coated RTX-5MS capillary column (inner diameter, 0.25mm) [52]. Lignin chemistry traits were measured on the 350 samples, while cellulose and hemicelluloses were ultimately measured on a smaller set of 200 samples due to resource limitations (Fig 1).

Wood physical analyses
Increment wood cores were collected for the 350 selected trees based on NIRS spectra variation. Increment cores (12 mm) were collected at breast height (1.3 m), in a north-south direction (S1 Fig). The northern half of the core was used for wood density (WD) and microfibril angle (MFA) analysis and the southern half for fiber length (FL), fiber width (FW) and coarseness (COA). To measure wood density the northern half of the wood cores were precision cut in 1.67 mm-thick sections, using a custom-built twin-blade pneumatic saw [35], and acetone extracted in Soxhlet apparatus for 12 hours. The wood sections were acclimated to 7% moisture content and then scanned by X-ray densitometry (QTRS-01X; Quintek Measurement Systems Inc., Knoxville, TN, USA), from pith to bark. The measurements across the section were averaged to determine the sample density. With the purpose of establishing a regression for X-ray densitometry, ten samples were randomly selected and had precisely recorded their weight over volume and then scanned in the equipment, to estimate the remaining samples of the phenotyping calibration set. Unlike subtropical and temperate tree species, tropical eucalypts have a continuous growth pattern, and annual growth rings are unclear. Thus, MFA was measured 10 mm adjacent to the bark, instead of choosing a certain year, following procedures described earlier [35]. Briefly, precision-cut samples, used for wood density determination, were also used for MFA in a Bruker D8 Discover (Bruker AXS Inc., Madison, WI, USA) X-ray diffraction instrument equipped with an area detector (GADDS) to collect diffraction patterns, which contain reflection information of the microfibril orientation in the wood sample. The southern part of the increment cores were used for fiber traits analysis (FL, FW, COA). Samples were incubated in Franklin solution (30% hydrogen peroxide and glacial acetic acid; 1:1 ratio) at 70˚C for 48 hours and macerated. Afterward, samples were washed in deionized water until the samples had been neutralized. The samples were filtered and oven-dried at 105˚C. A part of the sample had weight recorded and resuspended in distilled, deionized water and analyzed on a Fiber Quality Analyzer (FQA; Optest Equipment Inc., Hawkesbury, Ontario, Canada). Fiber length and width were taken as the average of all fibers measured and coarseness measured by the dry fiber mass per unit length (mg.100m -1 ). Wood physical traits were ultimately measured on between 337 (MFA) and 350 (WD) samples as some samples were lost in the procedure.

NIRS model calibration
Calibrations of the NIRS models were attempted for all chemical and physical traits based on spectra outputs of the 40-60 mesh (0.297-0.420 mm) woodmeal isolates of wood shavings and their corresponding wet chemical and physical measurements. The calibration population, composed by trees with both spectra and wet-lab data, was randomly divided into two subsets, one for model estimation and one for model validation. The estimation/validation sample sizes were either N = 250/N = 100 for those traits for which 350 samples were measured, or N = 150/N = 50 for those traits for which only 200 samples could be measured (Fig 1). The Unscrambler software (v.9.0; CAMO A/S, Oslo, Norway) was used to estimate the model parameters for each wood trait using Partial Least Squares (PLS) analysis for phenotype prediction, based on each sample spectra. For each trait, different spectral transformations were tested in order to obtain the highest possible accuracies. Root mean squared error of prediction (RMSEP; the difference between the true and estimated compositional value expressed in units of the phenotype), bias (the average value of the difference between predicted and measured values), and the coefficient of determination of prediction (R 2 p ) following transformations, were calculated for the external validation set and used to compare model estimates. The phenotypes of the remaining 650 samples of the population were predicted only for those traits for which NIRS models showed satisfactory performance. In the case of traits for which NIRS models were deemed poor (see below), only data for the directly measured trees were used in quantitative parameter estimation.

Genotypic data
Genomic DNA was extracted from xylem scrapings isolated from the sapwood at breast height (1.3m) using an optimized Sorbitol+CTAB method for high quality DNA from wood samples [53], quantified with a Nanodrop 2000 spectrometer and adjusted to concentrations between 20 and 40ng.uL -1 . DNA samples were genotyped at Geneseek (Lincoln, NE) using an Infinium (Illumina) custom made chip for Eucalyptus (EucHIP60k.br) [54]. DNA samples were also genotyped using DArT-seq, a sequence-based genotyping method developed by Diversity Arrays Technology Pty Ltd (DArT P/L, Canberra, Australia) [46]. From the sampled population of 1,000 trees, 970 were ultimately genotyped with both marker types, co-dominant SNPs and dominant (presence/absence) DArT-seq markers, and used for the subsequent quantitative genetics analyses.

Quantitative genetics analyses
The single-trait analysis was based on the following univariate individual-tree mixed model with additive and dominance genetic effects: Where, y is the vector of phenotypic data, β is the vector of fixed effects (block design effects); p is the vector of random plot effects following p � N ð0; Is 2 p Þ where I is the identity matrix and s 2 p is the plot variance; a is the vector of random additive genetic effects following a � N ð0; As 2 a Þ, where A is the average numerator relationship matrix and s 2 a is the additive genetic variance; d is the vector of random dominance effects following d � N ð0; Ds 2 d Þ where D is the average dominance relationship matrix and s 2 d is the dominance genetic variance; and e is the vector of the random residual effect following e � N ð0; Is 2 e Þ where s 2 e is the residual error variance. X, Z p Z a , and Z d are incidence matrices relating fixed and random effects to measurements in vector y.
In the marker-based approach, the pedigree-based relationship matrices for additive A and dominance D effects of the previous mixed model [1] were substituted by the corresponding marker-based additive (G A ; based either on SNPs or DArT-seq data) and dominance (G D ; based on SNPs data only) relationship matrices. The additive genomic relationship matrix (G A ) from the co-dominant SNPs was calculated using the function 'A.mat', in R (http://www. R-project.org/) package rrBLUP, that uses the formula proposed earlier [55].
where, W is the n × m (n = number of individuals, m = number of SNPs) rescaled genotype matrix following M-P, where M is the genotype matrix containing genotypes coded as 0, 1, and 2 according to the number of alternative alleles, and P is a vector of twice the allelic frequency, p i . The additive genomic relationship matrix (G A ) from the dominant DArT-seq markers was calculated following the formula proposed by [56] where, S is a n × m matrix (n = 970, m = number of DArTseq markers) rescaled genotype matrix following Z-P, where Z is the genotype matrix that specifies the genotypes expressed as 1/0 denoting the presence/absence of the DArT-seq marker, and P is a matrix containing the allelic frequency of the code 1 at locus i, p i . We examined the effect of progressively reducing the number of SNP or DArT-seq markers on the estimate of relatedness coefficients and narrow-sense heritability using an additive-only model, i.e. model in Eq [1] without the dominance component. An additive-only model was used for comparing the two types of markers because DArT-seq are dominant and, as such, do not allow building a bona-fide dominance (G D ) relationship matrix. To that end, subsets of 500 (05K), 1,000 (1K), 3,000 (3K), 5,000 (5K), 10,000 (10K), 20,000 (20K), and 30,000 (30K) randomly selected SNPs, and subsets of 500 (05K), 1,000 (1K), 3,000 (3K), 5,000 (5K), 10,000 (10K), and 20,000 (20K) randomly selected DArT-seq markers were used to build the corresponding genomic additive relationship matrices. Then, a product-moment correlation coefficient was used to evaluate the connection between pairs of marker-based additive relationship matrices using each combination of marker (SNPs and DArT-seq) and the pedigree-based relationship matrix.
The dominance pedigree-based (D) and SNPs-based (G D ) relationship matrices were calculated in R (http://www.R-project.org/) using the package "AGHmatrix [57]. Two different parameterization approaches [58,59] were employed to build the dominance relationship matrices G D , herein named G DVitezica and G DSu , respectively.
Additive genetic correlations between two different traits measured from the same individual were estimated based on the following individual-tree mixed model: where, y i and y j are, respectively, the vectors of individual-tree observations on trait i and j. Matrices X i L X j , Z p i L Z p j and Z a i L Z a j relate observations to elements of ½β 0 i j β 0 j �, plot effects in ½p 0 i j p 0 j �, and additive genetic effects in ½a 0 i j a 0 j �, respectively, and ½e 0 i j e 0 j � is the error vector. The symbols L and ' indicate the direct sum of matrices and transpose operation, respectively. Finally, the expectation and variance-covariance matrix for plot effects in model [2] are respectively equal to: where, s 2 p i , and s 2 p i are the plot variances for the traits i and j, respectively. The expected value and variance-covariance matrix of the additive genetic effects in model [2] are respectively equal to: where, s 2 a ii and s 2 a jj are the additive genetic variances for the traits i and j, respectively, whereas s a ij is the additive covariance between traits i and j. The symbol � indicates the Kronecker products of matrices. The expected value and variance-covariance matrix of e are respectively equal to: The residual variances for the traits i and j are s 2 e i , and s 2 e j , respectively, and s e ij is the residual covariance between the two traits.

Estimation of genetic parameters
Restricted maximum likelihood (REML) [60] was used to estimate variances and covariances for the random effects in the mixed models [1] and [2], and were obtained with the ASREML program [61]. Estimates of pedigree-and marker-based variances for the plot, additive, and dominance effects, and residual errors, i.e.ŝ 2 p ;ŝ 2 a ;ŝ 2 d ; andŝ 2 e , respectively, were re-parameterized to additive genetic correlations (r), and individual trait narrow-and broad-sense heritability (h 2 N and h 2 B , respectively) as follows: r ¼ŝ a ij ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffî Additionally, pairwise trait Pearson's correlations (and its significance expressed in probability levels) were calculated using phenotypes, and breeding values estimated using pedigree (pedigree-based numerator relationship matrix), SNP-based (33,398 SNPs) and DArT-seqbased (24,001 markers) genomic relationship matrices, using the cor() function in R and the function corr.test() in R (http://www.R-project.org/) package "physch". Finally, Spearman correlations were also calculated to compare whether the ranking of predicted breeding values differed whether estimated using pedigrees and the two marker types used.

Phenotypic data and NIRS models
Extensive phenotypic variation was observed for all measured traits in the breeding population sample ( Table 2). Range variation for mean annual increment in growth ranged from 29 to 122 m 3 .ha -1 .year -1 , total lignin from 24 to 32%, cellulose content from 42 to 55%, and S/G ratio from 1.8 to 4.2. Predictions with NIRS data once calibrated using the subset of 350 trees were satisfactory for part of traits. Overall, considerably better predictions were obtained for wood chemical than physical traits (S1 Table). For S/G ratio, for example, the coefficient of determination of prediction following external validation was R p 2 = 0.86, while for hemicellulose it was only R p 2 = 0.36. For WD density, the NIRS model had a poorer prediction (R p 2 = 0.60), but was still considered effective for predicting values and therefore increase sample size for the subsequent quantitative analyses. For all other traits (CEL, HC, MFA, FL, FW and COA) only between 200 and 339 lab-measured samples were used in subsequent quantitative genetic parameters analyses (Fig 1). Trees for which direct wood trait measurements were obtained and used in quantitative parameter estimation, involved between 40 and 46 parents, 39 and 45 families with an average of 7.7±3.7 trees per family, a fully representative sample from the genetics standpoint.

Genomic relationships
A total of 33,398 SNPs with call rate � 0.90 and MAF (minor allele frequency) � 0.01, and 24,001 DArT-seq dominant (presence/absence) variants at a call rate � 0.80 and estimated MAF � 0.02 were used for the genetic analyses providing good genome-wide coverage of the 11 Eucalyptus chromosomes (S2 Table). To capture the realized relationship structure in the population sample, individual pairwise relatedness were estimated using either genome-wide marker data or pedigrees of unrelated (0.00 relatedness), half-sibs (0.25) and full-sibs (0.50). For the 970 trees ultimately genotyped with both marker types, we gathered 469,965 pairwise estimates of pairwise relationships. The value distribution involved 87.2% estimates of unrelated individuals, while half-sibs represented 10.3% and full-sibs 2.5%. Genomic realized relationships estimated using SNPs or DArT-seq markers showed bell shaped distributions with a better approximation centered at zero for SNPs and a left skewed distribution toward negative values for DArTseq markers (Fig 2). The average genomic relationships for unrelated, half-sibs and full-sibs were always lower than the expected values based on pedigree information. They were estimated at -0.026, 0.125 and 0.34 using SNPs, and -0.029, 0.145 and 0.376 using DArTseq markers, and essentially the same average relationship was apparent once >1,000 markers Genomic relationship-based quantitative genetics of growth and wood properties in Eucalyptus "urograndis" were used (S3 Table). A comparison of the pedigree expected and genomic realized relationship matrices is also depicted using heatmaps showing similar patterns, indicating good pedigree control in the production of the full-sib families, although marker data, by capturing the realized genetic relationships, provides considerably more refined estimates of the continuous distribution of true relatedness in the population sample (Fig 2). The median and interquartile ranges of genomic relationships in the box plots spanned consistently lower values than the pedigree expected relationships for half-sibs and full-sibs, irrespective of the type and number of DNA marker used to estimate relationships, while the whiskers revealed a considerable number of relationships outside the expected range (S3 Fig). Pearson correlations between the A matrix and the different GRM (i.e., G) matrices built with variable numbers and types of DNA markers varied from 0.640 to 0.774, showing a very small variation with decreasing numbers of markers (S4 Table). Correlations between GRMs built with variable numbers of SNPs or DArTseq were generally high, above 0.9. A random set of Genomic relationship-based quantitative genetics of growth and wood properties in Eucalyptus "urograndis" 1,000 SNPs or DArT-seq markers recovered essentially the same GRM built with 33,000 SNPs or 24,000 DArT-seq (r = 0.95; S4 Table). Finally, the correlations of GRM built with SNPs and DArT-seq markers, varied from 0.742 (between the smaller marker subsets) to 0.907 (between GDART10.5K and GSNP30K). Slightly higher correlations were observed for the GRMs built with the genome positioned 10,500 DArTseq when compared to using all 24,000 DArTseq markers.

Heritabilities and genetic correlations
Heritability estimates for lignin traits and wood density were higher than for growth traits with all three approaches (i.e., A, G DVitezica and G DSu ), while for the remaining wood traits heritabilities showed variation according to the estimation approach. Growth traits showed higher broad sense (ĥ 2 B ) than narrow sense (ĥ 2 N ) heritabilities with all three estimation approaches (A matrix and the two G D approaches, G DVitezica and G DSu ), revealing a substantial effect of nonadditive genetic control on these traits (Table 3). For all wood traits, on the other hand, narrow and broad-sense heritabilities were essentially the same, except for FL where dominance also seemed relevant. Additive variance is therefore the main driver of genetic variation in chemical and physical traits, in contrast to what was observed for growth. While for lignin traits heritabilities using pedigree or genomic data were relatively similar, for CEL and HC approximately 2-fold larger heritabilities were estimated using genomic data with both approaches. For FW narrow-sense heritability estimates of zero to 0.08 were obtained, while broad-sense varied between 0.08 and 0.33 suggesting an implausible picture of only dominance variance controlling this trait, a result that will require further scrutiny. Estimates of narrow-sense heritabilities obtained from the realized genomic relationships using varying numbers and types of DNA markers were compared to the estimates obtained with pedigree data under an additive-only model (S5 Table). For growth traits and wood density the estimates obtained using genomic data were again smaller than those obtained from pedigree-based estimates, while for wood chemical and fiber traits they varied. Heritabilities estimates using the two types of markers were essentially the same, and increased with increasing numbers of markers, only stabilizing  Genomic relationship-based quantitative genetics of growth and wood properties in Eucalyptus "urograndis" when more than 10,000 markers were employed, indicating that sparser genome coverage below this density does impact the accuracy of parameter estimation. Pairwise Pearson genotypic (from breeding values) and phenotypic correlations were estimated amongst the fifteen traits using a univariate analysis (Table 4). Genotypic correlations estimated with the two types of markers were equivalent, and generally slightly higher when compared to estimates based on pedigrees relationships and higher than phenotypic correlations. However, this varied depending on which pairwise traits were compared. Genotypic correlations, as expected, were high amongst the traits within each category of correlated traits, i.e., growth, lignin, cellulose and fiber. Overall, significant but low correlations, positive or negative, were apparent between growth and wood traits. A slightly higher negative correlation was estimated between growth traits and WD. Higher negative or positive correlations were also seen between the different wood traits, including CEL × TL (-0. 65 Table). Worth mentioning are the consistently high negative additive correlations of CEL with TL and IL, WD with MAI, DBH and S/G, and the high positive correlations of WD with CEL, FL and COA, of MFA with HC, and of FW and DBH, irrespective of whether a SNP, DArT-seq or pedigree matrix was used. Finally, high Spearman correlations (0.71 to 0.99) were observed among all three pairwise comparisons of breeding values estimated using pedigrees, SNPs and DArT-seq markers (S7 Table). However, correlations between breeding values derived from observed phenotypes with those obtained from pedigrees or marker data were slightly lower for growth and wood physical traits, consistent with the observed differences between narrow and broad-sense heritabilities.

Discussion
Despite the extensive worldwide use of Eucalyptus urograndis in tropical regions, knowledge on the range of genetic variation and magnitude of quantitative parameters for key wood traits of this hybrid was rare until two recent studies reporting data for wood density and pulp yield [14,15]. Our study attempts to fill this gap by providing a comprehensive assessment of wood property traits in a typical urograndis hybrid breeding population. Along with data on four growth traits, measurements of 11 wood properties were carried out by wet chemistry and physical analyses on a set of 200 to 350 samples, which in turn were used to develop NIRS models used to estimate lignin traits and wood density for an additional sample of 650 trees. Moreover, in keeping with the current advances to integrate genomic data into operational breeding [45], our study employed high-density genome-wide SNP-based relationships as a substitute of the conventional pedigrees for improved genetic parameter estimation. Accurate estimates of narrow and broad sense heritabilities, genetic and phenotypic correlations were obtained that should be valuable to inform improved breeding decisions for similar genetic material.

NIRS phenotyping and range variation of wood traits
In addition to confirming the outstanding volume growth of urograndis hybrids, a key finding of our report is the significant phenotypic variation observed for wood chemical and physical traits. Trees with excellent growth rate, wood density in the range of 550-600 kg. m -3 and lignin content and S/G ratio, as low as 24% and above 4.0 respectively (Table 2), were observed in a relatively small set of 1,000 individuals sampled in 45 full-sib families from 46 unrelated parents. The combination of high growth rate, with average density and high S/G ratio wood is what pulpwood tree breeders usually target. These properties ultimately result in high mean annual cellulose increment per hectare, because wood with higher S/G tends to be easier to    delignify during chemical pulping [62], consuming less chemical and energy and producing higher pulp yield.
Our study emphasizes that a better use of the existing phenotypic variation in breeding populations depends on efficient ways to collect data for wood property traits for large numbers of samples. Direct wet chemistry methods for thousands of tested trees in a progeny trial are usually not an option, in spite of faster analytical methods for some traits. In most Eucalyptus breeding programs, only wood density and sometimes pulp yield are measured in a subsample of trees of a trial that passed the initial cutoff of volume growth assessment. Additional wood traits are then measured on a very limited set of trees at the final clonal testing stage, precluding a wider sampling of the available variation for wood properties during progeny trials. Near-infrared reflectance spectroscopy (NIRS) applied to wood properties in Eucalyptus is changing this scenario [22,24,30,63,64], allowing the prediction of wood traits by non-destructive sampling of very large numbers of samples. In our study, a good calibration model was obtained for S/G ratio (R p 2 = 0.86) and reasonable models for the other lignin traits with R p 2 � 0.70 (S1 Table). Very good prediction models for S/G ratio have recurrently been reported for Eucalyptus with R p 2 up to 0.97, a particularly valuable result to breeding for cellulose, as S/G ratio and Kraft pulp yield have been shown to be strongly positively correlated both in temperate and tropical Eucalyptus species [21,30,[64][65][66]. A NIRS calibration below the threshold for a ''good" model [17] was obtained for WD (R p 2 = 0.60), possibly impacted by the fact that NIRS spectra were collected on woodmeal samples and not on solid samples. Model for WD was, however, deemed adequate for the purpose of this study including the estimation of quantitative parameters and tree ranking, as both these data applications involve relative traits values.
Better calibrations for WD in eucalypts were developed when NIRS readings were obtained on more elaborate samples of radial wood surfaces [67], although good models could also be developed with woodmeal material when larger sample sizes were used [66]. For cellulose and physical fiber traits, our NIRS models were not as strong as those reported in previous studies [24,30]. As a result, only the actual 200 to 339 directly wet-lab measured trees were used for subsequent quantitative genetics analyses. These sample sizes and their source in terms of the number of families and parents involved nevertheless constituted a representative sample of the population for quantitative parameter estimation. Recently, global NIRS calibrations for wood chemistry in Eucalyptus were reported using the same NIRS instrument as the one we employed in our study [30]. Such a development opens opportunities for a follow-up study by simply inputting the spectra data collected in our work into those models, to potentially improve our estimates and include individual carbohydrate traits not yet contemplated here.

Genetic control of growth and wood traits
Wood properties, particularly wood density and chemical traits, showed a moderate to stronger genetic control than growth traits (Table 3), in agreement with previous reports [14,15] and a number of studies in different Eucalyptus species [22,28,63]. On the other hand, FL, MFA and COA showed a moderate to low genetic control, consistent with studies in E. globulus and E. urophylla [27,63], while FW showed a poor heritability. The use of additive-dominant models and the inclusion of genome-based data in our work further highlights the significant role of non-additive sources of variance in the control of growth in this Eucalyptus hybrid, confirming earlier studies that showed either a balance or a higher weight of dominant relative to additive effects [7,15,38]. Furthermore, with the exception of height growth, the ratio of non-additive to additive variance became higher when a genomic relationship matrix was used (Table 3). Interestingly, the key role of dominance variation in growth seems to be a peculiarity of this hybrid, as dominance variance was largely insignificant for growth in all other eucalypts where this was investigated, including E. urohylla × E. tereticornis hybrids [29], E. globulus [20], E. nitens [41] and E. globulus × E. nitens hybrids [68]. These results help elucidate the long-lasting success in achieving significant volume gains by capturing the non-additive portion of the genetic variation into clonally propagated selected eucalypt urograndis trees [69]. For all wood properties, the dominance variance was either unimportant for chemical traits and WD (s 2 D =s 2 A � 0), or only slightly relevant for FL and COA with a ratio around 0.2 to 0.6 when a genomic matrix was used. Equivalent results were recently reported for growth, wood density and pulp yield when using a genomic relationship matrix in an additive-dominant model in a similar urograndis hybrid population [15].
It is important to point out that in estimating genetic parameters, we used an individualtree mixed model (i.e., progeny model) with additive and dominance genetic effects instead of a parent mixed model with male and female effects and their interaction. The individual-tree mixed model used here assumes that the alleles controlling the traits would be common in the parental lines studied, that the variance genetic components in the segregating eucalypt hybrid population used are similar, and that epistasis is negligible. Identifying which genetic model provides the best description of a forest tree hybrid is a challenge [70], as the assumptions of the infinitesimal model may not be fully appropriate [71]. Studies in eucalypt hybrids have estimated genetic variance components using either an individual-tree mixed model [72,73], a parent mixed model (e.g., [11,29,74], or both [38]. When comparing parental and progeny mixed models involving pedigree-and marker-based information in a urograndis hybrid population with 13 female and 9 male parents, Bouvet et al. (2016) showed that progeny models with genome-wide information improve the variance estimates and the prediction of genetic values, and that a larger numbers of parents could improve model performance even more. Our study included a relatively large number parents (N = 46) and the progeny additive-dominance model using marker-based data reduced the overestimation of the additive variance observed when using pedigrees and consequently the narrow-sense heritabilities, mainly for DBH, volume and MAI (S3 Table). Particularly for such traits where the non-additive component is important, the pedigree-based analysis cannot capture the full genetic relationship among the individuals, failing in disentangling the non-additive genetic component from the additive one [37]. Genome-based relationships therefore not only provide considerably more realistic estimates of narrow-sense heritability, but also more accurate genetic variance decomposition into additive and non-additive factors. Of relevance is the fact that the use of a GRM allows such variance decomposition even if breeding is carried out by incomplete pedigrees [41], an easy, economical alternative that now becomes an appealing option for advancing eucalypt populations in light of the availability of low-cost genotyping.

Genetic correlations
Reported additive correlations between growth and wood traits have varied significantly across different Eucalyptus species, germplasm sources, and sites such that it is difficult to find general patterns for the relationships among these traits. Correlations from SNP-based relationships indicate that selecting for increased growth in urograndis will result in very modest negative genetic responses in wood density (-0.26), cellulose content (-0.12) and fiber length (-0.18), and positive for soluble and insoluble lignin and S/G ratio (0.15) ( Table 4). These findings are consistent with earlier reports showing low negative correlations of growth with WD and pulp yield [14]. Although statistically significant, likely due to the large sample size used, these low correlations indicate that simultaneous selection in any desired direction for these wood properties can be applied with minor effects on the genetic gain for higher growth rate. Reports on correlations between growth and wood density have been somewhat conflicting, most of them showing negative correlations in E. globulus [20,63] and E. urophylla [27] consistent with our study, while others detected either positive correlation in E. urophylla × E. tereticornis hybrids [29], or no correlation in E. globulus [22,75]. Taken together, these results suggest that growth and wood density in eucalypts might be in fact largely independent, with their correlation impacted by species, germplasm source and environment. We also found low genotypic correlations between volume growth and chemical and fiber traits (positive with lignin, S/G and hemicellulose; negative with cellulose and fiber dimensions), again indicating opportunities for flexible concurrent selection for high growth and wood properties. Results from other studies either agree or contrast ours, confirming that no general patterns seem to exist when trying to establish any standard correlation between growth and wood traits in eucalypts [23,28]. On the other hand, correlations between wood chemical traits and density have been more consistent across eucalypt studies, likely due to the evident physical content interdependency of cellulose and lignin in cell wall structure and the impact of cellulose content on wood density. Our results agree with a general pattern of moderate to strong negative correlations between cellulose and lignin contents, density and S/G ratio, and positive between density with cellulose content and fiber traits [22,27,29,63].

Implications for Eucalyptus urograndis breeding
Urograndis hybrids have been and will continue to be the mainstay of the vast majority of eucalypt forest based industrial operations in the tropics, with an increasingly outstanding role in the world supply of sustainable forest-based products. By studying a typical urograndis breeding population, our report confirms some known patterns of variation for growth traits and brings novel quantitative genetics data for an array of wood properties. While increased volume growth is a common objective to all breeding programs, the relative importance and direction in which wood traits are selected varies. For example, a specific wood density window of approximately 500-550 kg.m 3 is the target for trees for maximal efficiency in cellulose production, but higher density is aimed for energy (specifically, calorific value). While low lignin is usually desired to improve pulp yield in cellulose production, high lignin content to enhance the calorific power of wood is pursued by industries that use eucalypt for charcoal or biomass. Our analysis showed that ample genetic variation is available for all traits in a typical urograndis hybrid breeding population. However, while additive variance represents essentially the exclusive source of genetic variation for all wood property traits, dominance variance plays at least an equally important role as the additive variance when growth traits are considered. Additionally, our results indicate a general pattern of low genetic correlations between growth and wood traits, indicating good potential for simultaneous genetic improvement of individual trees for higher volume growth and wood properties in any desired direction. Wood chemical traits, however, do show significant genetic correlations, and given their important influence on final production traits, they call for unique breeding tactics to manage the connected responses to selection.
Our results have useful consequences for the current and future prospects of breeding urograndis hybrids. Reciprocal recurrent selection (RRS) between E. grandis and E. urophylla was tentatively adopted by a number of industrial breeding programs in the 1980's and 1990's, based on models derived from hybrid breeding of annual crops. It proved, however, to be time-consuming, costly and inefficient. Recently, simple recurrent selection (SRS) in synthetic hybrid population has been adopted in urograndis programs due to its simplicity, higher speed and lower cost, with genetic gain per year projected to far exceed that of RRS [69,76,77], although little empirical data is yet available to back its expected performance. The mostly additive control of wood properties described in our study would warrant using SRS, but the relevance of dominance would question its full efficiency for growth. However, as pointed out earlier [69], while non-additive effects would apparently be disregarded when choosing SRS, they would be efficiently captured nonetheless during selection of elite individual trees followed by their deployment by large-scale clonal propagation.
Finally, our study adds to the recent stream of promising reports showing the positive impact of using genomic realized relationships to estimate heritability, genetic correlations and breeding values in essentially all mainstream forest trees [45]. By effectively tracking the Mendelian sampling term, the use of genomic data gets closer to the true relatedness among individuals within and across families, allowing better genetic variance decomposition when compared to using the expected documented pedigrees. Both SNP and dominantly inherited DArT-seq markers efficiently recovered relationships; however, co-dominant SNPs were superior in providing more accurate estimates of heritability and allowing the estimation of dominance variance. Of note is the fact that the DArT-seq genotyping method currently also provides co-dominant SNP data. Although this study was carried out on fully pedigreed families, genomic relationships from genome-wide SNPs would allow advancing breeding populations by partial pedigrees followed by sib-ship reconstruction with the same or better precision of fully pedigreed populations with the added advantage of close control of inbreeding and improved management of genetic diversity. Costs of SNP genotyping have fallen drastically in recent years making this a viable option for eucalypts where controlled crosses are still relatively cumbersome, expensive, and prone to errors. Such flexibility in operational breeding practice and the prospects of accelerating breeding cycles by genomic selection for multiple traits at ultra-early ages, convincingly point to a new era of genomically-informed eucalypt tree improvement. , and Unrelated (c). From let to right estimates obtained using G matrices constructed from subsets of randomly selected of 500 (05K), 1,000 (1K), 3,000 (3K), 5,000 (5K), 10,000 (10K), 20,000 (20K), 30,000 (30K) and all 33K SNP markers, and 500 (05K), 1,000 (1K), 3,000 (3K), 5,000 (5K), 10,000 (10K), 20,000 (20K) and all 24K DArT-seq markers and using only the 10,501 (10.5K) DArT-seq markers mapped to the eleven Eucalyptus chromosome scaffolds. (PDF) S1 Table. Fit statistics of NIR calibration models for wood chemical and physical traits. (PDF) S2 Table. Distribution of SNPs and DArT-seq markers along the 11 assembled chromosome scaffolds of the Eucalyptus grandis reference genome. (PDF) S3 Table. Mean and standard error (SE) of the pairwise estimated relatedness for individuals with expected relationships (Full-sib, Half-sib and Unrelated) from the additive relationship matrix from the pedigree (A) and genomic relationship matrices (G). Matrix G was constructed from all available SNPs (~33K) and DArT-seq (~24K) markers and from subsets of randomly selected of 500 (05K), 1,000 (1K), 3,000 (3K), 5,000 (5K), 10,000 (10K), 20,000 (20K), 30,000 (30K) SNP markers, and 500 (05K), 1,000 (1K), 3,000 (3K), 5,000 (5K), 10,000 (10K), 20,000 (20K) DArT-seq markers. Matrix G was also calculated using only the 10,501 (10.5K) DArT-seq markers mapped to the eleven chromosome scaffolds. (PDF) S4 Table. Pearson correlations between all the non-diagonal pairwise elements (full-and half-sib relatedness and unrelated) of the additive relationship matrix from the pedigree (A) and genomic relationship matrices (G). Abbreviations used for the number and types of marker were described in the caption of S3 Table. (PDF) S5 Table. Narrow-sense heritabilities (ĥ 2 N ) and their approximate standard error (SE) for each growth, chemical and physical wood trait. See text for traits´abbreviation. Abbreviations used for the types and number of marker were described in the caption of S3 Table. (PDF) S6 Table. Estimated additive genetic and phenotypic correlations (and approximate standard errors) between different growth, chemical and physical wood traits from pairwise bivariate analysis of the Eucalyptus grandis × E. urophylla breeding population. In each cell from top to bottom: genotypic correlation based on SNP-based realized relationship matrix; genotypic correlation based on DArTs-based realized relationship matrix; genotypic correlation based on pedigree-based relationship matrix; phenotypic correlations. NOTE: a Correlation and their approximate standard errors were not estimated due to convergence problems. (PDF) S7 Table. Spearman correlations amongst breeding values (a) derived from observed phenotype (y), with breeding values predicted from pedigree (A), SNP (~33K) and DArT (2 4K) markers. (PDF)