Genome Regions Associated with Functional Performance of Soybean Stem Fibers in Polypropylene Thermoplastic Composites

Plant fibers can be used to produce composite materials for automobile parts, thus reducing plastic used in their manufacture, overall vehicle weight and fuel consumption when they replace mineral fillers and glass fibers. Soybean stem residues are potentially significant sources of inexpensive, renewable and biodegradable natural fibers, but are not currently used for biocomposite production because the functional properties of their fibers in composites are unknown. The current study was initiated to investigate the effects of plant genotype on the performance characteristics of soybean stem fibers when incorporated into a polypropylene (PP) matrix using a selective phenotyping approach. Fibers from 50 lines of a recombinant inbred line population (169 RILs) grown in different environments were incorporated into PP at 20% (wt/wt) by extrusion. Test samples were injection molded and characterized for their mechanical properties. The performance of stem fibers in the composites was significantly affected by genotype and environment. Fibers from different genotypes had significantly different chemical compositions, thus composites prepared with these fibers displayed different physical properties. This study demonstrates that thermoplastic composites with soybean stem-derived fibers have mechanical properties that are equivalent to or better than those of wheat straw fiber composites currently being used for manufacturing interior automotive parts. The addition of soybean stem residues improved flexural, tensile and impact properties of the composites. Furthermore, by linkage and in silico mapping we identified genomic regions to which quantitative trait loci (QTL) for compositional and functional properties of soybean stem fibers in thermoplastic composites, as well as genes for cell wall synthesis, were co-localized. These results may lead to the development of high value uses for soybean stem residue.


Introduction
Composite materials are produced from two or more components, which have different physical and chemical properties. In biocomposites, one or more phases have a biological origin [1]. monolignol units (p-coumaryl alcohol, coniferyl alcohol and sinapyl alcohol) with different levels of methoxylation. The ratios of the monolignols in lignin vary among plant species, tissues and cell wall layers [15]. Cellulose and lignin contribute to the mechanical properties of plant stems. Hemicellulose usually acts as filler between cellulose and lignin and mechanically contributes minimally to the strength and stiffness of fibers [16]. The lignin content plays a large role in the fiber structure, morphology and flexibility. In general, higher lignin contents are associated with finer and more flexible fibers [17].
The composition of plant cell walls is genetically controlled and varies among species, cultivars, tissues, developmental stages and environments [18,19]. It is also highly dependent on the technique(s) used to make the measurements, and variation exists between different evaluations. Cell wall components are often determined with a well-established detergent analysis procedure used to measure the components and digestibility of fibers in forages and animal feeds [20]. It is performed in a sequential manner, by employing a series of extractions, and measures the content of neutral detergent fiber (NDF, measures cellulose, hemicellulose and lignin), acid detergent fiber (ADF, measures cellulose and lignin) and acid detergent lignin (ADL, lignin). It also provides estimates of hemicellulose (NDF-ADF) and cellulose (ADF-ADL) contents.
Identification of quantitative trait loci (QTL) associated with phenotypic variation for cell wall components and genes underlying these QTL would result in a better understanding of the genetic bases of these traits, while the development of molecular markers for the QTL would simplify and accelerate breeding for these traits. A number of fiber QTL studies have been conducted and numerous QTL have been mapped for cell wall components in various plant species. For example, QTL that explained significant variation for NDF (cellulose, hemicellulose and lignin), ADF (cellulose and lignin) and ADL (lignin) contents in maize stems have been identified on all ten chromosomes in different mapping populations {recombinant inbred lines (RIL) B73 x B52, n = 200; [21]; F₃ B73 x De811, n = 150; [22]; RIL (F₆) B73 x De811, n = 200; [23]}. In Arabidopsis, QTL for stem fiber length and lignin content were identified on chromosomes 2 and 5 in a Col-4 x Ler-0 RIL population (n = 98) and annotated genes within the QTL intervals were investigated [24]. Numerous QTL for seed cell wall polysaccharides have been identified in a soybean Minsoy x Archer RIL population {n = 108; [25]}; however, no QTL information is available for soybean stem cell wall components. All these QTL for cell wall components were identified in populations of different types and sizes, usually designed to segregate specifically for the trait(s) of interest. However, in a mapping population not all segregants are equally informative, and selective mapping was proposed as an alternative approach that is especially valid for traits that are expensive or difficult to phenotype [26,27]. In selective mapping, the focus can be on lines at the extreme (high and low) ends of the trait distribution, which have a tendency to contain more positive and negative alleles. By selecting the most informative recombinants from the mapping (random) population, selective mapping can be as effective as whole population mapping.
In general, the detection of major QTL is not affected by selective mapping. However, selective mapping may have reduced power to detect minor QTL (<10%), since the selected population contains only a portion of variance explained by the whole population and may result in a reduced linkage map [28,29].
The clustering of QTL associated with fiber traits was reported for a number of plant species, including cotton [30,31] and maize [21,22], and has been attributed to linkage or pleiotropy. It was proposed that the underlying genetic basis for this observation is clustering of developmentally-related genes in plant genomes {reviewed in Nützmann and Osbourn [32]}. In particular, clusters of cotton fiber quality QTL may represent groups of coordinately regulated genes and/or groups of small gene families that have undergone proximal duplication followed by sub- or neo-functionalization [33]. However, the biosynthetic pathways for cellulose, hemicellulose and lignin are different and complex and involve the synthesis of numerous diverse products [34,35]. It was estimated that over 2,000 genes are involved in cell wall biosynthesis and modification in Arabidopsis stems [36].
Cellulose and hemicellulose are closely associated in cell walls by enzymatic modifications and chemical crosslinking [42]. Hemicellulose synthesis occurs in the Golgi bodies [43]. Enzymes for hemicellulose biosynthesis are encoded by a cellulose synthase-like (Csl) gene superfamily classified into eight families (CslA to CslH). In Arabidopsis, 29 Csl genes are classified into six families (no CslF and CslH families); in rice, 36 Csl genes have been identified in six families (no CslB and CslG families); and 33 Csl genes identified in maize belong to five families {no CslB, CslG and CslH [38]}. The current version of the soybean genome contains over 60 sequences annotated as Csl genes {G. max Wm82.a2.v1, Phytozome v9.1, accessed 13 Mar 2015; [41]}.
Biosynthesis of secondary cell walls is complex and requires coordinated expression of secondary wall structural genes and targeted secretion, deposition and assembly of wall components {cellulose, hemicellulose and lignin; [47-49]}. A transcription factor network, composed of secondary cell wall NAC domain and R2R3 MYBs, regulates secondary cell wall biosynthesis in Arabidopsis. Several NAC domain transcription factors (SND1, NST1, NST2, VND6 and VND7) act as master switches and can directly activate the expression of secondary cell wall specific biosynthetic genes and downstream transcription factors (such as SND2, SND3, KNAT, AtMYB46, AtMYB52, AtMYB54, AtMYB58, AtMYB63, AtMYB85 and AtMYB103) that also directly regulate secondary cell wall biosynthetic genes.
Biomass from crop production is a large source of natural fibers. In North America several thousand soybean varieties are grown on over 32 million hectares (http://www.agcensus.usda.gov/Publications/2012/Full_Report/Volume_1,_Chapter_1_US/usv1.pdf), spanning 13 maturity zones [50]. This production leaves more than 15 million tons of residue in the fields annually (http://www.ofa.on.ca/uploads/userfiles/files/biomass_crop_residues_availability_for_bioprocessing_final_oct_2_2012.pdf). Approximately one third of the residues could be safely removed from the fields and used in various industrial applications. However, the relative contributions of the genetic backgrounds and environments the plants are grown in to the compositional and functional properties of soybean stem fibers are unknown. The current study characterized the chemical compositions of mature soybean stems grown in different environments, measured the genetic and environmental effects on the performance properties of stem fibers after incorporation into a low-cost polypropylene (PP) thermoplastic matrix and identified chromosomal locations conditioning these unique stem fiber traits in soybean. The work identified novel QTL for this traditional food and feed crop that could lead to the development of high value uses for the stem residue.

Plant Material
A selective phenotyping approach was used in this study, with a set of 50 RILs representing approximately 30% of an existing, well characterized RG10 x OX948 mapping population of 169 RILs [51,52]. A height per unit of lodging (H/L) trait was derived from the plant height and lodging measurements made previously on these lines [51] and used to select two groups of lines with contrasting stem characteristics from both ends of the trait distribution (Fig A in

Chemical Analysis of Fibers
A three-step detergent fiber analysis [20] was used to characterize ground dry soybean stems (a single 0.5 g sample from each replication/location/year) for neutral detergent fiber [NDF, isolates cell wall (hemicellulose, cellulose and lignin)], acid detergent fiber (ADF, estimates cellulose and lignin) and acid detergent lignin (ADL, isolates lignin). Sequential analysis using a filter bag method was performed with the Ankom 200 fiber analyzer (Ankom Technology, Macedon, NY) according to the manufacturer's instructions (http://www.ankom.com/analyticalprocedures.aspx). Hemicellulose and cellulose contents were calculated from NDF, ADF and ADL values (hemicellulose = NDF-ADF, cellulose = ADF-ADL) and expressed as percentages on a dry weight basis.
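The difference calculations described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (hemicellulose = NDF-ADF, cellulose = ADF-ADL); the function name and the input values in the usage note are hypothetical and are not measurements from this study.

```python
def detergent_fiber_fractions(ndf, adf, adl):
    """Estimate cell wall fractions (% dry weight) from sequential detergent analysis.

    ndf, adf, adl: neutral detergent fiber, acid detergent fiber and
    acid detergent lignin, all expressed in % on a dry weight basis.
    Returns a tuple (hemicellulose, cellulose, lignin).
    """
    # In a sequential extraction each residue is a subset of the previous one.
    if not (ndf >= adf >= adl >= 0):
        raise ValueError("expected NDF >= ADF >= ADL >= 0")
    hemicellulose = ndf - adf  # fraction removed by the acid detergent step
    cellulose = adf - adl      # fraction removed by the lignin step
    lignin = adl               # residue remaining after the final step
    return hemicellulose, cellulose, lignin
```

For example, with hypothetical inputs NDF = 70, ADF = 50 and ADL = 12, the function returns 20% hemicellulose, 38% cellulose and 12% lignin.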
The content of free phenolics in ground soybean stems was determined with a 50% Folin-Ciocalteu's phenol reagent (Sigma Chemicals Company, St. Louis, USA) using a microwave-based protocol [53], with absorbance read at 725 nm on the SpectraMax Plus384 absorbance microplate reader using SOFTmax PRO 4.0 controller software (Molecular Devices Corporation, Sunnyvale, CA, USA). Gallic acid (Acros Chemical Company, NJ, USA) in 50% ethanol was used as a standard and the quantities of free phenolics were expressed as the average of measurements on three (10 mg) subsamples (from each replication/location/year) in μg mg⁻¹ phenolics on a dry weight basis.

Thermogravimetric Analysis
To optimize the processing conditions, the thermal stability of the fibers was determined by thermal gravimetric analysis (TGA), prior to compounding in the PP matrix. The fiber onset degradation temperature (°C) was measured by heating samples (one per pooled replication from each location/year) from 35°C to 700°C at a heating rate of 10°C min⁻¹ in a nitrogen environment (flow rate of 50 ml min⁻¹) with the TGA Q500 instrument (TA Instruments, New Castle, DE, USA). The weight loss as a function of temperature was measured with the Universal Analysis 2000 instrument (TA Instruments, New Castle, DE, USA). TGA thermographs were used to measure weight losses for fibers from the different RILs. Onset degradation temperatures were recorded as the temperatures (°C) at which the samples showed 1% weight loss.
The materials were homogenized with a melt blend process. Soybean stem residue (20 wt-%), polypropylene (77.5 wt-%), coupling agent (2 wt-%) and antioxidants (0.25 wt-% each of Irganox 1010 and Irgafos 168) were hand mixed to get a uniform mixture and extruded using a conical twin-screw micro-extruder (Haake MiniLab, Thermo Electron Corporation, Waltham, MA, USA) with optimized processing conditions (190°C, 40 rpm). The extruded composites (208 formulations) were hand cut into pellets and molded to produce 15 test bars using an injection molding RR/TSMP machine (Ray-Ran, Warwickshire, UK) with the barrel temperature at 190°C, mold tool temperature at 50°C and a 15 sec hold time at 100 psi pressure. The test specimens were annealed in an air circulating oven GC 5890A (Hewlett Packard, Ramsey, MN, USA) at 150°C for 10 min, with a heating rate of 10°C min⁻¹, and cooled down to room temperature. Ten test bars (63.5±0.2 x 12.5±0.2 x 3.10±0.2 mm) produced with stem fibers of 50

Trait Data Analysis
Analysis of variance was performed using the Proc Mixed procedure in SAS (Statistical Analysis System) v.9.2 software [57]. Data were analyzed separately for each location and year, combined across locations for each year, and combined over two years. Genotypes and locations were considered as fixed effects and all other effects were considered to be random. The homogeneity of error variances was tested before pooling data for combined analyses using a residual analysis in SAS. The relationships among agronomic traits, fiber chemical traits and the physical traits of the composites were analyzed by correlation (Spearman) in SAS and principal components analysis (PCA) using STATISTICA v.9 software [58]. Heritability in standard units [59] for fiber traits and composite mechanical traits was estimated by determining correlations between the values for 2009 RILs (F₉) and 2008 RILs (F₈).
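As an illustration of the heritability estimate described above, the following Python sketch computes a correlation between line values measured in two years. The text does not state which correlation coefficient was used for this step, so the Pearson coefficient is an assumption here; the function name is hypothetical and no data from the study are reproduced.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between paired values, e.g. trait values of the
    same RILs in two years (assumed estimator of heritability in standard units)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)
```

Called with the per-line trait means from one year as `x` and from the following year as `y`, the function returns a value between -1 and 1; a value of 0.33 would correspond to a heritability of 33% on this scale.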

Mapping and Marker Development
Over 100 genes involved in cell wall biosynthesis and modifications were selected from databases (NCBI, DFCI) and microarray literature and more than 200 gene-specific PCR primers were designed (Table A in S1 File). Genomic DNA was isolated from the young leaves (100 mg) of growth room-grown plants with the DNeasy plant mini kit (Qiagen Inc.-Canada, Mississauga, ON, Canada) according to the manufacturer's protocols. Cell wall gene-specific primers were screened with parental (RG10 and OX948) genomic DNA. PCRs were performed in 20 μl volumes containing 1x PCR buffer (supplied with enzyme), 3 mM MgCl₂ (supplied with enzyme), 0.1 mM each of dNTPs (Invitrogen, Life Technologies, Inc., Burlington, ON, Canada), 1.6 U Taq DNA polymerase (Invitrogen), 5 μM each of the forward and reverse primer and 24 ng of soybean genomic DNA, with a PTC-100 Programmable Thermal Controller (MJ Research, Inc., Watertown, MA, USA). The amplification program consisted of an initial 2 min denaturation step at 94°C, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 55-60°C for 45 s and extension at 72°C for 1 min, with a final extension at 72°C for 10 min. The PCR products were separated by electrophoresis on 1% w/v agarose gel containing ethidium bromide in a 1x TBE buffer at 100 V for 2 h and visualized under ultraviolet light. Polymorphic primers were used to screen 169 RILs from the RG10 x OX948 population. Monomorphic PCR products were purified and used as a template for cycle sequencing (CEQ™ 8000 genetic analysis system; Beckman Coulter Inc., Fullerton, CA, USA). Sequences were compared to existing soybean sequences by BLAST searches at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) to ensure that the target genes had been cloned. Newly produced single nucleotide polymorphism (SNP) markers were used to screen the complete RG10 x OX948 population.
Fiber genes were isolated and gene-specific PCR-based markers for several key enzymes in cellulose, hemicellulose and lignin biosynthetic pathways were developed. Mapmaker/Exp 3.0b [60] was used to add the newly developed fiber gene-based markers to the previously created RG10 x OX948 linkage map, which contained 120 markers [simple sequence repeat (SSR), random amplified polymorphic DNA (RAPD) and gene (omega-3 fatty acid desaturase and seed lipoxygenase)-based sequence-tagged sites (STS) and cleaved amplified polymorphic sequences (CAPS)] on 26 linkage groups (18 chromosomes) and covered 1,247.5 cM [51]. A minimum LOD score of 3.0 and a maximum distance between two markers of 50.0 cM were used to assign new fiber gene-based loci to linkage groups. Recombination frequencies were converted to cM distances using Kosambi's mapping function [61]. QTL were identified with composite interval mapping (CIM) using Windows QTL Cartographer version 2.5 [62] with the following settings: map function Kosambi, a walk speed of 2 cM, five control markers, model 6 (standard), forward and backward regression (method 3) and probabilities of 0.05. Genome-wide scans were performed for each trait and QTL. A 1,000-permutation test at the 0.05 significance level for CIM was used to determine LOD thresholds for each trait (QTL group 1). Because of the novelty of some of the mapping traits, QTL at LOD threshold values of 2.5 (program's default; QTL group 2) and 2.0 {LOD threshold used in Reinprecht et al [51]; QTL group 3} were also considered as putative QTL. The map positions of these QTL were detected using the option for automatic QTL location (using the program's default parameters). Several QTL not automatically detected were also marked as putative when they exceeded threshold LOD scores (QTL group 4). Additive effects at each significant QTL and the percentages of phenotypic variation (R²) explained by QTL for each trait were acquired directly from the CIM output.
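The conversion of recombination frequencies to map distances mentioned above follows Kosambi's mapping function, d = 25 ln[(1 + 2r)/(1 - 2r)] with d in cM and r the recombination frequency. A minimal Python sketch of the function and its inverse is given below; in the study the conversion was carried out by the mapping software, and the function names here are hypothetical.

```python
import math


def kosambi_cm(r):
    """Convert a recombination frequency r (0 <= r < 0.5) to map distance in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must satisfy 0 <= r < 0.5")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse of the Kosambi function: map distance in cM to recombination frequency."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    # r = 0.5 * tanh(2d) with d in Morgans, i.e. tanh(d_cm / 50)
    return 0.5 * math.tanh(d_cm / 50.0)
```

A recombination frequency of 0.10 converts to roughly 10.1 cM, and the two functions invert each other, so `kosambi_r(kosambi_cm(r))` returns `r` up to floating-point error.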
A soybean in silico map was generated by a two-step process. Initially, all gene sequences used to design fiber gene-based primers were BLASTed against the soybean Williams 82 (Wm82) genome (G. max Wm82.a2.v1) in Phytozome v9.1 {available at: www.phytozome.net; [41]}. Flanking markers of the newly identified fiber compositional and composite performance QTL intervals were used to position these QTL on the soybean physical map. Subsequently, genomic regions containing these QTL were scanned for additional candidate genes that might be involved in cell wall biosynthesis and modification. The maps were drawn with the MapChart 2.2 software [63]. For each chromosome, the genetic and in silico maps were aligned and connected by common SSR markers.

Quantitative Traits Variability
In this work we used 50 RILs, approximately 30% of the existing soybean RG10 x OX948 mapping population (n = 169), which was created from parents with different agronomic and seed characteristics and for which extensive molecular genetic information is available [51,52]. The selection of RILs was based on the height per unit of lodging (H/L), a trait derived from the plant height and lodging measurements [51]. Cell wall composition was associated with stem strength and standability (lodging) in wheat [64,65]. In pea, lodging was negatively correlated with lignin and cellulose contents in stems [66].
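The selection of lines from both ends of the H/L distribution can be illustrated with a short Python sketch. It assumes that H/L is simply plant height divided by the lodging score and that equal numbers of lines are taken from each tail; neither detail is specified in the text, and the function names and line labels in the usage note are hypothetical.

```python
def height_per_lodging(height_cm, lodging_score):
    """Derived selection trait: plant height per unit of lodging (assumed ratio)."""
    if lodging_score <= 0:
        raise ValueError("lodging score must be positive")
    return height_cm / lodging_score


def select_extremes(hl_by_line, n_per_tail):
    """Return (low_group, high_group): the n lines with the lowest and the
    n lines with the highest H/L values from a dict {line_name: H/L}."""
    if n_per_tail < 1 or 2 * n_per_tail > len(hl_by_line):
        raise ValueError("n_per_tail must be >= 1 and at most half the population")
    # Rank line names by their H/L value, lowest first
    ranked = sorted(hl_by_line, key=hl_by_line.get)
    return ranked[:n_per_tail], ranked[-n_per_tail:]
```

For a hypothetical population `{"A": 10, "B": 50, "C": 30, "D": 20, "E": 40}` and `n_per_tail=2`, the low group is `["A", "D"]` and the high group is `["E", "B"]`.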
Significant genotype by environment (GxE) interactions were detected for most of the analyzed traits (Table B in S1 File). Therefore, QTL analysis was performed separately for each environment. Frequency distributions for all traits (raw data) are shown in Fig C-a to C-c in S1 File.
Agronomic traits. RILs were different for all agronomic traits (days to maturity, plant height, lodging and the derived height per unit of lodging trait) evaluated in three locations over two years (Fig D-a in S1 File). Maturity varied from 109 days (Harrow, 2008) to 146 days (Ridgetown, 2009). Variability for plant height was higher in the second year (2009), when it ranged from 58.5 cm (Ridgetown) to 115 cm (Harrow), and was likely associated with greater moisture availability and higher temperatures during the second growing season compared to the first (Fig B in S1 File). However, variability for plant height was lower in this data set than in the values for this trait evaluated for the whole population in a different set of environments [51].
Chemical composition of soybean stem fibers. The fibers obtained from mature stems of the parental genotypes (RG10 and OX948) and 50 RILs grown in three locations over two seasons differed in color after cutter milling and sieving through a 2.0 mm pore-size screen, varying from light (Fig E-a in S1 File) to deep brown (Fig E-b in S1 File). The stem fibers also had significantly different chemical compositions. Transgressive segregants were detected for all fiber compositional traits among RILs grown in six environments (Table 1; Fig D-b in S1 File). Johnson et al [67] reported slightly higher values of these cell wall components [cellulose 526 g kg⁻¹, hemicellulose 289 g kg⁻¹, and lignin 168 g kg⁻¹ (acid-insoluble) + 5 g kg⁻¹ plant material (acid-soluble)] for soybean line NK S14-M7. Reddy and Yang [68] reported wide ranges of cellulose (44-83%, ADF method) and lignin (5-14%, Klason lignin) contents in soybean straw. The low performance of plant fibers in composites is associated, in part, with the degradation of fiber components caused by the high temperatures necessary to melt resins (~200°C) [69]. The stem fibers of the RILs used in this study showed significant variation (17.3°C) in their onset degradation temperatures. They ranged from 188.2°C (Woodstock, 2009) to 205.5°C (Woodstock, 2008), but were generally lower than those of wheat fibers (208.8°C) and pure PP (350.0°C) (data not shown).
Mechanical properties of soybean stem fiber/polypropylene (SS/PP) composites. In the current study, a total of 208 soybean straw formulations were molded into test specimens with PP and analyzed for their mechanical properties. The results showed that both genotype and environment had significant effects on the performance (mechanical) properties of the composites (Table B in S1 File). Fibers from different RILs, when incorporated into the PP matrix, were significantly different for their flexural, tensile and impact properties. Furthermore, transgressive segregants were identified for all the traits, which indicates that these traits could be improved through breeding (Table 2; Fig D-c in S1 File). To the best of our knowledge, this is the first demonstration that the genetic background of the source of plant fibers influences the performance characteristics of the composite materials manufactured from them.
In general, the addition of soybean stem fibers to the PP matrix resulted in composite materials that had improved tensile and flexural properties relative to pure PP. The values for flexural strength, flexural modulus and tensile modulus were 17%, 33% and 15% higher, respectively, compared to pure PP (Table C in  These results indicate that soybean stem residues, when incorporated into a PP matrix, behave as reinforcing materials. These data support previously reported results for composites made with soybean stem flour [70,71]. The addition of plant fibers to a polymer matrix increases the tensile and flexural properties of composites. The onset degradation temperature of soybean stem fibers incorporated into composites was 225°C, which was considerably lower than that of pure PP (350°C). However, pure PP degrades completely at 425°C while soybean fiber composites degrade at 462°C in a nitrogen gas environment, suggesting that the incorporation of soybean fibers retards the process of degradation at higher temperatures (data not shown). The superior characteristics of the soybean/PP matrices, compared to the wheat fiber/PP matrix, are significant because a wheat straw fiber-filled PP (WS/PP) composite is currently being used in automotive parts production (http://corporate.ford.com/news-center/press-releases-detail/pr-ford-teams-up-to-develop-wheat-31391).

Associations Between Fiber Composition and Composite Performance
Significant negative correlations were found between height per unit of lodging (derived selection trait) and plant height and lodging measurements. This trait was also correlated with some fiber and composite traits in some environments (Table 3). The relationships among fiber compositional traits were complex, as indicated by inconsistent low to moderate correlations in samples from different locations/years (S1 Table). PCA was used to further examine potential relationships (average of four environments: Harrow and Woodstock in 2008 and 2009) among fiber and three main composite traits (tensile strength, flexural modulus and impact strength). Four factors explained 68% of the total fiber and composite traits variation among the RILs (data not shown). The first two factors, shown in a biplot, explained 42% of the variation in the RILs (Fig 1). With the exception of lignin composition, which was grouped with composite mechanical properties (tensile strength, flexural modulus and impact strength), fiber composition and composite traits were separated into two groups. In particular, onset degradation temperature, hemicellulose and cellulose content grouped together with no significant effects on the composite traits. In addition, the free phenolic content of the fibers did not belong to any group and was negatively associated with hemicellulose content, cellulose content and onset degradation temperature (fiber traits). This is in agreement with the negative correlation values observed between free phenolics and these traits in some environments (S1 Table). The significant associations of some fiber compositional traits with the mechanical properties of their composites indicate that the chemical compositions of soybean stem fibers affect their performance in SS/PP composites. This was expected because the major components of plant cell walls (cellulose, hemicellulose and lignin) were present in soybean stem fibers in widely varying quantities among the different RILs.
In addition, they have very different chemical structures [72,73]. Cellulose is a simple homopolymer, composed of linear D-anhydroglucopyranose units joined together by β-1,4-glycosidic linkages, which has the same structure in all plants but is present in different quantities. Hemicellulose is a more complex heteropolymer, consisting of pentoses, hexoses and sugar acids, that differs among plants. It is branched, has molecular sizes that are 10 to 100 times smaller than cellulose and is soluble in alkali and strong acids. Lignin is a complex hydrocarbon polymer with both aliphatic and aromatic components that differs not only among plant species but also among cell types. It is soluble in alkali and readily oxidized with phenol [73]. Lignin functions as a reinforcing agent in plant cells. It crosslinks various cell polysaccharides and provides mechanical strength to the cell wall. This variation in chemical structure would be expected to have different effects on composite mechanical properties. Cellulose has previously been associated with improvement of tensile strength and modulus in cellulose fiber-filled polypropylene composites [74]. Because of its complex chemical structure, hemicellulose has not been isolated from fibers and tested in composites. Toriz et al [75] reported lower tensile, flexural and unnotched impact strength in unmodified lignin-filled polypropylene composites compared to pure PP.
Heritability (in standard units) for fiber compositional and composite mechanical traits was low to moderate (Tables 1 and 2). Cellulose and free phenolics had moderate heritabilities (33% and 37%, respectively). Heritabilities for lignin and hemicellulose were low (13% and 1%, respectively). Lorenz et al [76] reported 0.79 and 0.24 broad-sense heritability for lignin content in two maize populations but to the best of our knowledge this has not been determined previously for soybean stem fibers and SS/PP composites.

Selection of the Best Performing RILs
To identify RILs that can produce fibers for thermoplastic composites suitable for automotive parts manufacture, we developed a performance index based on three mechanical traits of the composites, namely: tensile strength (resistance of a material to breaking under tension), flexural modulus (resistance of a material to deformation under stress) and impact strength (ability of a material to withstand shock/force). The index = Σ (percent difference for tensile strength from pure PP, percent difference for flexural modulus from pure PP and percent difference for impact strength from pure PP). These parameters were chosen because they show the strength of a material against the most common forces [tension (tensile), compression (flexural) and hammering (impact)], which can act together on a piece of composite material or its end product. The index values for the RILs ranged from 115 to 152 and the majority had higher values than the best parent (RG10 = 132). The line RO139 had the highest index value of 152.8 (Fig 2A). A principal component analysis (PCA) of the index data indicated that most of the variation in the mechanical performance index was explained by three factors, and the first two factors, shown in a biplot, explained 85% of the variation (Fig 2B). The plot shows that the index was highly aligned with impact strength and flexural modulus, but the index and the other two parameters were nearly independent of the tensile strength parameter.
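The structure of the performance index can be sketched in Python as a sum of three percent differences relative to pure PP. The exact form of "percent difference" is not given in the text, so the definition used here, 100 x (composite - PP)/PP, is an assumption, and index values computed this way are not necessarily on the same scale as those reported in Fig 2A. The function names, dictionary keys and example values are hypothetical.

```python
TRAITS = ("tensile_strength", "flexural_modulus", "impact_strength")


def percent_difference(value, reference):
    """Assumed definition: relative change from the reference, in percent."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return 100.0 * (value - reference) / reference


def performance_index(composite, pure_pp):
    """Sum of the percent differences from pure PP over the three traits.

    composite, pure_pp: dicts mapping each name in TRAITS to a measured value
    (units must match between the two dicts for each trait).
    """
    return sum(percent_difference(composite[t], pure_pp[t]) for t in TRAITS)
```

With hypothetical inputs in which the composite exceeds pure PP by 10% for tensile strength, 25% for flexural modulus and 20% for impact strength, the index under this definition is 55.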
A PCA projection of genotype index values (Fig 2C; Table C in S1 File) identified a number of SS/PP composites (prepared with fibers from different RILs) that had superior index values, suggesting that they would be suitable for manufacturing various parts where impact strength and flexural modulus were important attributes. This analysis would also suggest that line RO139 is the best candidate, since it had the furthest distance from the origin in the double positive quadrant. A radar diagram indicated that composites from this line outperformed pure PP for all three mechanical traits (Fig 3). When compared to WS/PP composites, which already have application in the auto industry, SS/PP composites produced from RO139 were better for impact strength and had similar values for tensile strength. If the application required a composite material with more tensile strength, then lines from the positive-negative quadrant could be selected, such as RO7 (Fig 2C).

QTL Identification
The availability of linkage map data for the soybean RG10 x OX948 population [51] that included the RILs used in the current study provided a unique opportunity to partition the genetic effects on fiber composition and composite physical traits among specific chromosomal locations through a QTL analysis. In addition, we were also able to locate the mapped SSRs and cell wall gene-specific markers in the soybean genome [46] to identify potential candidate genes associated with the QTL (Table D in S1 File). Furthermore, we defined and identified a novel type of QTL, which we termed fiber composite performance QTL: soybean genomic regions associated with the functional properties of soybean stem fibers (flexural, impact and tensile) in thermoplastic composites. Because of the significant GxE effects (Table B in S1 File), QTL analyses were performed separately for each environment. Based on CIM analysis {the LOD [logarithm (base 10) of odds] score thresholds, determined by 1,000 permutations, ranged from 2.2 to 4.7}, at least one QTL was identified for all 15 traits in at least one of six environments (Fig 4). We also considered as putative QTL those identified at LOD threshold 2. In total, 247 QTL were identified for four fiber traits, seven mechanical traits and four agronomic traits, including a height/lodging (H/L) selection trait, on 27 linkage groups that corresponded to 19 chromosomes (soybean reference genetic map-Gm composite_2003) and a linkage group X (which had only two UBC RAPD markers not mapped on the soybean 2003_composite map) (Fig F in S1 File). The inclusion of QTL identified at a relatively low threshold (LOD 2) could be justified by the novelty of this type of QTL. Since this was the first study of this nature, the focus of this work was to identify any genomic region potentially associated with fiber compositional and composite performance traits in soybean.
Also, this would allow direct comparison with the QTL identified previously for some agronomic traits using the whole mapping population {n = 169, [51]}. The number of QTL identified per chromosome ranged from a single fiber composite performance QTL (UTS8H) on chromosome Gm04 (C1) to 28 QTL (agronomic, fiber compositional and fiber composite performance) on chromosome Gm10 (O). The highest number of QTL was identified for days to maturity (25 QTL on 12 chromosomes) and the lowest number was identified for flexural modulus (eight QTL on eight chromosomes). In total, 96 and 99 QTL were identified for all 15 traits in Harrow and Woodstock, respectively, across both years. Ridgetown was excluded from the QTL study of composite mechanical properties, which resulted in the smallest number of QTL being identified in this location. Fifty two QTL were mapped in Ridgetown for nine agronomic and fiber traits; no QTL for hemicellulose was detected in this location in the second year of the study. Genomic regions associated with the analyzed traits accounted for 9% to 44% of total phenotypic variability for the specific trait. The current work identified 82 QTL for the four agronomic traits using a subset (50 RILs) of the RG10 x OX948 population (n = 169) in three locations over two years. Twenty five QTL were detected for days to maturity on 12 chromosomes and explained up to 36% of the phenotypic variability for the trait. Twenty three QTL were identified for plant height on seven chromosomes. These QTL explained up to 44% of total phenotypic variability for the trait. Sixteen QTL for lodging were mapped on nine chromosomes and explained 11% to 44% of the variability for the trait. Using the set of 50 RILs, this study confirmed several maturity QTL (on chromosomes Gm02, Gm08, Gm10, Gm11 and Gm13), plant height QTL (on chromosomes Gm03, Gm06, Gm10 and Gm18) and lodging QTL (on chromosomes Gm11 and Gm19) identified previously with the whole population {n = 169, [51]}. 
Eighteen QTL for the plant height to lodging ratio (H/L) selection trait were mapped on nine chromosomes and linkage group X and explained up to 32% of the phenotypic variability for the trait. Eleven QTL for this trait were identified in soybean previously {SoyBase, available at www://soybase.org; accessed 14 Mar 2015; [77]} and were mapped on four chromosomes (Gm07, Gm13, Gm18 and Gm19) in four independent mapping populations [78,79]. Although some H/L QTL identified in the current study mapped to the same regions as QTL identified earlier (e.g., QTL on chromosomes Gm07, Gm13, Gm18 and Gm19), the majority of the H/L QTL mapped to new genomic regions on six different chromosomes (Gm01, Gm02, Gm10, Gm11, Gm14 and Gm17) and linkage group X (Fig 4; Table 4).
Fiber compositional QTL. Seventy three QTL were identified for fiber compositional traits in six environments. They mapped to 18 chromosomes and explained significant portions of the phenotypic variation for individual fiber traits. Twenty two QTL for cellulose (C) . The 1,000 permutation test at 0.05 significance level for CIM was used to determine LOD thresholds for each trait. Because of the novelty of some of the mapping traits, QTL at LOD threshold values 2.0 were also considered as putative QTL. The map positions of these QTL were detected using the option for automatic QTL location (using program's default parameters). Several QTL not automatically detected (nad) were also marked as putative when they exceeded threshold LOD scores.
Fiber composite performance QTL. Ninety two QTL were detected for the composite mechanical traits in four environments. The QTL were distributed across the entire soybean genome and explained a significant portion of the phenotypic variability for individual composite mechanical traits. Fifteen QTL for flexural strength (FS) were mapped on 11 chromosomes and explained 10% to 40% of the phenotypic variability for the trait. Eight QTL for flexural modulus (FM) were mapped on eight chromosomes and explained 11% to 31% of the phenotypic variability for the trait. Fifteen QTL for ultimate tensile strength (UTS) were mapped on ten chromosomes and explained 12% to 36% of the phenotypic variability for the trait. Fifteen QTL for tensile strength (TS) were mapped on seven chromosomes and linkage group X. These QTL explained 11% to 37% of the phenotypic variability for the trait. Fourteen QTL for tensile modulus (TM) were mapped on 11 chromosomes and explained 16% to 31% of the variability for the trait. Eleven QTL for impact strength (IS) were mapped on nine chromosomes and linkage group X and explained 12% to 40% of the phenotypic variability for the trait. Fourteen QTL for mechanical index (IND) were mapped on nine chromosomes and explained 13% to 40% of the phenotypic variability for the trait. Some QTL for flexural strength and modulus, tensile strength and modulus and impact strength were clustered on several chromosomes (Fig 4; Table 5). No QTL for composite mechanical traits have been mapped in plants previously.
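The sketch below illustrates how a LOD score and a permutation-based LOD threshold of the kind used in this section can be computed. It is a deliberately simplified stand-in: it uses single-marker regression rather than the composite interval mapping performed in Windows QTL Cartographer, and the function names, genotype coding (0/1 for the two parental alleles) and simulated data are all assumptions made for the example.

```python
import math
import random

def lod_single_marker(genotypes, phenotypes):
    """LOD for one marker under single-marker regression:
    (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variation to explain
    if rss_marker == 0.0:
        return math.inf  # marker classes explain all of the variation
    return max(0.0, (n / 2.0) * math.log10(rss_null / rss_marker))

def permutation_threshold(marker_matrix, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold: the (1 - alpha) quantile of the maximum LOD
    observed across markers when phenotypes are shuffled among lines."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(lod_single_marker(m, shuffled) for m in marker_matrix))
    max_lods.sort()
    index = max(0, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]

# Simulated data (hypothetical): 50 lines, 20 markers, one marker with a real effect.
rng = random.Random(1)
n_lines, n_markers = 50, 20
markers = [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(n_markers)]
phenotype = [1.5 * markers[0][i] + rng.gauss(0.0, 1.0) for i in range(n_lines)]

threshold = permutation_threshold(markers, phenotype)
lods = [lod_single_marker(m, phenotype) for m in markers]
print(f"threshold = {threshold:.2f}, LOD at marker 0 = {lods[0]:.2f}")
```

Because each permutation records the maximum LOD over all markers, the resulting threshold controls the genome-wide rather than the per-marker false positive rate, which is why thresholds differ between traits and environments.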

QTL Consistency Across Environments
Significant GxE interactions were identified for the majority of the traits analyzed in this study (Table B in S1 File). Therefore, trait values could not be averaged over environments and QTL analysis was performed for each environment separately. The variability in the expression of these QTL was consistent with the low correlations observed among the fiber compositional and composite mechanical performance properties in different locations and years (S1 Table). The enzymes and regulatory proteins involved in the biosynthesis and modification of cellulose, hemicellulose and lignin are encoded by hundreds of genes, which often belong to large gene families (http://cellwall.genomics.purdue.edu/families/index.html), whose expression is affected by environmental conditions [10,80,81]. To be useful for breeding, the consistency of these QTL needs to be tested in different environments and/or genetic backgrounds.
In general, the major effect QTL that were identified in the current study for four agronomic traits were stable over locations and/or years (Fig 4; Tables 4 and 5). Moreover, using a set of 50 RILs, this work confirmed the positions and effects of several QTL for days to maturity, plant height and lodging that were identified previously with the complete RG10 x OX948 population (n = 169), which was evaluated under a different set of environments [51]. These QTL, which in some cases are associated with genes, have the greatest potential for application in plant breeding.
A number of fiber compositional QTL were consistently identified in several environments, including: lignin QTL (L8W and L8R) on chromosome Gm06, free phenolics QTL (FPH9R and FPH9H) on chromosome Gm08, hemicellulose QTL (HC8W and HC8R) on chromosome Gm10, and cellulose QTL (C8H and C8W) on chromosome Gm11. These QTL may be important in breeding for cell wall compositional traits. Because of the complexities involved in assessing the newly defined fiber composite performance QTL, they were assayed with fibers obtained from the two most divergent locations (Harrow and Woodstock) over two years. These QTL were sensitive to environmental conditions and most of them were location- or year-specific. These QTL might be easier to detect if the number of environments were increased [82]. Also, averaging trait values over environments would likely reduce environment-specific effects [79]. The inconsistency of these QTL over environments could be the major limitation to their application in MAS for a specific trait. Also, because of some large gaps in the linkage map and incomplete genome coverage [51], it is possible that some QTL were not identified. In addition, by using a subset of 50 RILs from the mapping population (n = 169), only major effect QTL could potentially be detected. Increasing the number of markers and/or the population size might identify some of the missing QTL.
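A simple way to screen for QTL that recur across environments is to group them by trait and chromosome, as sketched below. The sketch assumes that QTL names follow the pattern trait code + year digit + location letter (for example, L8W read as lignin, 2008, Woodstock); this convention is inferred from the names used in the text and is not stated explicitly. The function names are hypothetical, and sharing a chromosome is only a coarse proxy for sharing a map position.

```python
import re
from collections import defaultdict

# Assumed naming convention: trait code, optional leading zero, year digit, location letter.
QTL_NAME = re.compile(r"^([A-Z]+)0?([89])([HRW])$")
LOCATIONS = {"H": "Harrow", "R": "Ridgetown", "W": "Woodstock"}

def parse_qtl(name):
    """Split a QTL name such as 'FPH9R' into (trait, year, location)."""
    match = QTL_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized QTL name: {name!r}")
    trait, year_digit, site = match.groups()
    return trait, 2000 + int(year_digit), LOCATIONS[site]

def consistent_qtl(qtl_by_chromosome, min_environments=2):
    """Return (trait, chromosome) pairs detected in at least `min_environments`
    distinct year/location combinations."""
    environments = defaultdict(set)
    for chromosome, names in qtl_by_chromosome.items():
        for name in names:
            trait, year, site = parse_qtl(name)
            environments[(trait, chromosome)].add((year, site))
    return {
        key: sorted(envs)
        for key, envs in environments.items()
        if len(envs) >= min_environments
    }

# QTL names and chromosomes as listed in the text.
example = {
    "Gm04": ["UTS8H"],
    "Gm06": ["L8W", "L8R"],
    "Gm08": ["FPH9R", "FPH9H"],
    "Gm10": ["HC8W", "HC8R"],
    "Gm11": ["C8H", "C8W"],
}
for (trait, chromosome), envs in consistent_qtl(example).items():
    print(trait, chromosome, envs)
```

In practice the grouping key would also include the map interval, so that two QTL for the same trait at opposite ends of a chromosome are not counted as the same locus.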

Co-localization of Fiber Composite Performance QTL, Fiber Compositional QTL and Cell Wall Biosynthesis Genes
The level of polymorphism in mapping populations depends on the diversity of the genetic background of parental genotypes used to develop the population and type of the markers used in the study. We found relatively low levels of polymorphism for random (RAPD and SSR) markers in the RG10 x OX948 mapping population previously [51]. Similarly, only a few cell wall biosynthetic (regulatory and structural) gene-based markers (Table A in S1 File) were polymorphic between parents (RG10 and OX948) in this study.
Nevertheless, several fiber traits were associated with specific cell wall biosynthetic genes mapped in the RG10 x OX948 population. For example, impact strength QTL IS8W on chromosome Gm12 (Fig 4; Fig G in S1 File) co-localized with fiber QTL HC9H (hemicellulose) and both were associated with the gene coding for glycine-rich protein (GRP). This gene-specific marker explained 13% of the phenotypic variability for impact strength in Woodstock in 2008. The superfamily of plant glycine-rich proteins (GRPs) is characterized by the presence of semirepetitive glycine-rich motifs and diversity in structure, expression and localization. GRPs may act as scaffolds for the deposition of cell wall constituents, during protoxylem development and in cell wall fortification by connecting lignin rings [83,84]. In addition, it was shown that Arabidopsis AtGRP9 interacts with AtCAD5 (which catalyzes the last step in monolignol biosynthesis [85], the reduction of cinnamaldehydes into cinnamyl alcohols), suggesting the involvement of GRPs in lignin biosynthesis/deposition [86].
Similarly, the flexural modulus QTL FM8W on chromosome Gm20 co-localized with six composite performance QTL [two index QTL (IND8W and IND9H), ultimate tensile strength QTL (UTS8W), flexural strength QTL (FS8W) and two tensile modulus QTL (TM8W and TM9W)] and two fiber composition QTL [free phenolics QTL (FPH8W) and cellulose QTL (C8R)]. These QTL were associated with the cinnamyl alcohol dehydrogenase (CAD) gene, which explained 10% of the phenotypic variability for the flexural modulus in Woodstock in 2008. CAD catalyses the conversion of cinnamyl aldehydes, with a NADPH cofactor, to cinnamyl alcohols, which are precursors to lignin [85]. Therefore, markers developed for these genes could assist breeding efforts for soybean lines with higher stem lignin content with potential application in composite manufacturing. However, because fiber composition and fiber performance in composite materials traits are quantitative in nature, encoded by a number of genes with small effects, more work is required in this area.
An example of overlapping fiber composite performance QTL occurred on chromosome Gm06 (Fig 4; Fig G in S1 File) between markers UBC463-600 and Satt357 (27 cM) where a QTL for flexural modulus FM8W overlapped with several other composite performance QTL, including: two ultimate tensile strength QTL (UTS9H and UTS8W), a flexural strength QTL (FS8W), a flexural modulus QTL (FM8W) and a tensile strength QTL (TS8W). In addition, it co-localized with two fiber composition QTL [C8R (cellulose) and FPH8R (free phenolics)] and a lodging QTL (LG9H). Because of low polymorphism in the RG10 x OX948 population, we were not able to map any of the candidate cell wall synthesis genes directly in this region. However, the in silico map (Fig G in S1 File) contains numerous potential cell wall candidate genes in this region, including: glycosyl hydrolase 5 (Cell.GlyH5), cell wall protein (CWall), xyloglucan endo-transglycosylase (XET), cytochrome P450 (P450), peroxidase (Pox), cellulose synthase (CesA) and a number of transcription factors.
Similarly, two QTL for tensile strength (TS8W and TS9H) on chromosome Gm06 overlapped with a QTL for impact strength (IS8H), a QTL for tensile modulus (TM8W) and two QTL for ultimate tensile strength (UTS9H and UTS8W), and these co-localized with a number of fiber composition QTL for lignin (L8W and L8R), cellulose (C9H) and free phenolics (FPH9H) and three plant height QTL (PH9W, PH8W and PH8R). The in silico map for this region contains cell wall-related genes, such as genes coding for expansin (EXP), Myb305-like transcription factor (Myb305), glycosyl-phosphatidyl inositol anchored protein [COBRA(TC216062)], flavonoid 3'-hydroxylase (F3'H), peroxidase (Pox), ABC transporter (ABCtr) or cellulose synthase (CesA). A gene coding for a member of a COBRA-like protein family (COBL4) was mapped on chromosome Gm18 (Fig 4; Fig G in S1 File). Maize brittle stalk2 (bk2) encodes a COBRA-like protein (similar to phytochelatin synthetase), which is involved in secondary cell wall biosynthesis [87]. It is expressed in early organ development, but it is also required for tissue flexibility at maturity. Mutants make smaller plants with reduced levels of cellulose and cell wall sugars. In general, cell wall biosynthetic (structural and regulatory) genes are encoded by gene families. In addition, extensive duplication exists in the soybean genome [88].
Several genomic regions contained clusters of linked and/or pleiotropic loci that affected a number of traits (Fig 4; Tables 4 and 5). For example, the SND2-Sat_120 marker interval on chromosome Gm13 (F) was associated with QTL for days to maturity (DM9R), height per unit of lodging (HL08H, HL08R, HL08W), ultimate tensile strength (UTS9H) and tensile strength (TS9H). Similarly, the marker interval UBC122-1500-Satt070-Satt534 on chromosome Gm14 (B2) was associated with QTL for several traits, including days to maturity (DM9R), lodging (LG8H, LG8R, LG8W, LG9W), height per unit of lodging (HL8H, HL8R, HL8W), cellulose (C9R), flexural modulus (FM9H), ultimate tensile strength (UTS8H), tensile strength (TS8H, TS9W) and tensile modulus (TM8H). Pleiotropy is usually associated with major gene effects [89]. A common genetic basis might explain some correlations between agronomic, fiber compositional and composite traits. Alternatively, clustering of genes for different traits might be the basis for overlapping QTL. Mansur et al. [90] and Orf et al. [79] observed clustering of QTL with strong effects on maturity, plant height and lodging. However, the marker density in the current map was not sufficient to determine whether the regions significant for more than one trait were the result of pleiotropy or gene linkage. Work to add microarray-based single feature polymorphism (SFP) markers to the map is currently underway. A more saturated map would help to resolve this ambiguity.
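Co-localization as used in this section amounts to finding QTL (or mapped genes) whose intervals overlap on the same chromosome. A minimal sketch of that test is given below; the class, function names, chromosome labels and positions are all hypothetical placeholders and do not correspond to any interval in Fig 4 or Fig G in S1 File.

```python
from itertools import combinations
from typing import NamedTuple

class Interval(NamedTuple):
    name: str
    chromosome: str
    start_cm: float
    end_cm: float

def overlaps(a, b):
    """True when two intervals lie on the same chromosome and share any position."""
    return (
        a.chromosome == b.chromosome
        and a.start_cm <= b.end_cm
        and b.start_cm <= a.end_cm
    )

def colocalized_pairs(intervals):
    """List every pair of interval names that overlap on the same chromosome."""
    return [(a.name, b.name) for a, b in combinations(intervals, 2) if overlaps(a, b)]

# Hypothetical map positions in cM, for illustration only.
example = [
    Interval("qtl_1", "chrA", 10.0, 30.0),
    Interval("qtl_2", "chrA", 25.0, 40.0),
    Interval("qtl_3", "chrA", 45.0, 60.0),
    Interval("gene_1", "chrB", 12.0, 12.0),  # a mapped gene treated as a point
    Interval("qtl_4", "chrB", 5.0, 20.0),
]
print(colocalized_pairs(example))  # [('qtl_1', 'qtl_2'), ('gene_1', 'qtl_4')]
```

An overlap found this way cannot by itself distinguish pleiotropy from tight linkage; as noted above, that requires a denser marker map.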

Conclusion
The use of plant fibers in automotive parts is limited by their variability and poor performance when incorporated into composites. Our work identified connections between the structure and chemistry of soybean stem fibers and their performance in thermoplastic composites. The results demonstrate that the performance of soybean stem fibers in composites was significantly affected by both genotype and environment. This study provides an understanding of the cell wall compositional traits that are important for the use of soybean stem fibers in composites. In particular, the lignin content of stem fibers (from some environments) was positively correlated with certain composite mechanical traits. For example, higher lignin contents in the fibers used to produce composites will enhance their mechanical properties such as flexural modulus, tensile strength and impact strength and determine their use in specific applications. We also developed SS/PP composites from a number of RILs that performed better than pure PP. Above all, most of the SS/PP composites had flexural modulus values higher than pure PP. The superior mechanical performance characteristics of SS/PP composites indicated that the soybean stem fibers have structural (reinforcing) roles in the composites and are not simply fillers. For the production of automotive interior parts, composite materials need to be strong and flexible to maintain durability. Moreover, the finding that the SS/PP composites were in some cases better than the WS/PP composites that are already used in the production of automotive interior parts suggests that the SS/PP materials could also be commercialized. With the additional advantage that the SS/PP composites are relatively easy to manufacture and mold, compared to glass fiber composites, the current results make these materials potentially useful in injection-moulded or directly-formed automotive parts. 
Numerous QTL for fiber compositional and composite mechanical traits were detected across the entire soybean genome that explained significant portions of phenotypic variation for a specific trait. Co-segregation analysis with fiber compositional QTL as well as in silico mapping resulted in the identification of cell wall biosynthesis genes that co-localize with fiber composite performance QTL. Furthermore, several gene-based markers that were developed might allow rapid introgression of genes related to good fiber quality into elite germplasm. However, more work in this area, including QTL confirmation, fine mapping and functional analysis of candidate genes underlying QTL, is needed.
Supporting Information
S1 File. Appendix. Distribution of height per unit of lodging (H/L) selection trait in 50 recombinant inbred lines (RILs) from the RG10 x OX948 cross. Lines were selected from an existing, well characterized population of n = 169 RILs (Fig A). Mean temperature (°C) and total precipitation (mm) for soybean growing seasons in Harrow, Ridgetown and Woodstock (ON, Canada) in 2008 and 2009. a) 2008; b) 2009 (Fig B). Frequency distribution of traits (raw data) analyzed in parental genotypes (RG10 and OX948) and 50 selected RG10 x OX948 recombinant inbred lines (RILs). a) agronomic traits; b) fiber compositional traits; c) composite mechanical properties (Fig C). Distribution of 15 traits in parental genotypes (RG10 and OX948) and 50 selected RG10 x OX948 recombinant inbred lines (RILs). a) agronomic traits; b) fiber compositional traits; c) composite mechanical properties (Fig D). Ground dry stem fibers from selected soybean lines. a) light brown color; b) deep brown color (Fig E). Distribution of quantitative trait loci (QTL) LOD scores. a) QTL for agronomic and fiber compositional traits in six environments; b) QTL for fiber mechanical performance in four environments. QTL were detected using the Composite Interval Mapping (CIM) with Windows QTL Cartographer v.2.5_009. The settings used: map function Kosambi, a walk speed of 2 cM, five control markers, model 6 (standard), forward and backward regression (method 3), and probabilities of 0.05. The 1,000 permutation test at 0.05 significance level for CIM was used to determine LOD thresholds for each trait (Fig F). Comparison of the soybean RG10 x OX948 stem fiber-based composite QTL map (right) with the G. max Wm82.a2.v1 sequence map (left). 
Linkage map: QTL were detected using the Composite Interval Mapping with Windows QTL Cartographer v.2.5_009 [The settings used: map function Kosambi, a walk speed of 2 cM, five control markers, model 6 (standard), forward and backward regression (method 3), and probabilities of 0.05]. Sequence (in silico) map: initial mapping was done by BLASTing cell wall gene sequences against the soybean genome (G. max Wm82.a2.v1) in Phytozome 9.1; additional sequences were then added to newly identified QTL regions [by scanning (200 kb walk) the soybean genome for genes potentially involved in cell wall biosynthesis/modification in Phytozome 9.1 and/or using G. max Wm82.a2.v1 annotation and feature coordinate files from SoyBase]. Maps were linked by common SSR markers. Mapped fiber genes are indicated in bold (Fig G). Cell wall-related gene-based PCR primers (Table A). Analysis of variance (Fisher test values) for agronomic, fiber compositional and composite mechanical traits (Table B). Comparison of mechanical performance of stem fibers in soybean/polypropylene (SS/PP) composites with pure polypropylene (PP) and wheat straw/polypropylene (WS/PP) composites (Table C). Soybean (Glycine max Wm82.a2.v1) sequence map (partial, Phytozome v9.1) (Table D). (PDF)
S1 Table. Correlations between agronomic, fiber compositional and composite performance traits in 50 selected RG10 x OX948 recombinant inbred lines (RILs) in different environments. (XLSX)