Generation and Characterization of the Western Regional Research Center Brachypodium T-DNA Insertional Mutant Collection

The model grass Brachypodium distachyon (Brachypodium) is an excellent system for studying the basic biology underlying traits relevant to the use of grasses as food, forage and energy crops. To add to the growing collection of Brachypodium resources available to plant scientists, we further optimized our Agrobacterium tumefaciens-mediated high-efficiency transformation method and generated 8,491 Brachypodium T-DNA lines. We used inverse PCR to sequence the DNA flanking the insertion sites in the mutants. Using these flanking sequence tags (FSTs) we were able to assign 7,389 FSTs from 4,402 T-DNA mutants to 5,285 specific insertion sites (ISs) in the Brachypodium genome. More than 29% of the assigned ISs are supported by multiple FSTs. T-DNA insertions span the entire genome with an average of 19.3 insertions/Mb. The distribution of T-DNA insertions is non-uniform with a larger number of insertions at the distal ends compared to the centromeric regions of the chromosomes. Insertions are correlated with genic regions, but are biased toward UTRs and non-coding regions within 1 kb of genes over exons and intron regions. More than 1,300 unique genes have been tagged in this population. Information about the Western Regional Research Center Brachypodium insertional mutant population is available on a searchable website (http://brachypodium.pw.usda.gov) designed to provide researchers with a means to order T-DNA lines with mutations in genes of interest.


Introduction
Brachypodium distachyon (Brachypodium) is an annual grass native to the Mediterranean and Middle East and is a member of the Pooideae subfamily [1]. This group also contains cereal and forage grasses including economically important species with complex genomes such as Triticum aestivum (bread wheat), Hordeum vulgare (barley), and Avena sativa (oats). In 2001, Brachypodium was proposed as a model system for the study of the Triticeae [2], and in 2005, following a feasibility study from the United States Departments of Energy on the use of plant biomass for the generation of energy and other products, Brachypodium was recognized as an attractive model for the improvement of proposed bioenergy feedstocks such as switchgrass and Miscanthus [3]. Many aspects of grass biology, such as cell wall composition and architecture [4,5], development, and grain properties, are distinct from dicots. In these cases, a grass such as Brachypodium that possesses all of the biological, physical and genomic attributes required for an experimental system [2,6,7] represents a more relevant model than the dicot model Arabidopsis. Brachypodium's compact 272 Mbp diploid genome is similar to rice and sorghum in gene content and gene family structure [8], and the small size, rapid generation time, and simple growth requirements of Brachypodium enable high-throughput studies that are not feasible using larger, more demanding species [2,9-11]. Additionally, Brachypodium is self-fertile and rarely outcrosses [12], which facilitates breeding homozygous lines for applications that require the maintenance of large numbers of independent genotypes, such as mapping experiments, mutant analysis, and studies of natural diversity [13-17].
While comparative analyses of sequence information can provide educated guesses about gene function, establishing a true link between genes and their biological function requires detailed functional characterization. Sequence-indexed insertional mutants are a particularly valuable tool in this context because mutations in a given gene can be identified by simply searching a database. Thus, the large, sequence-indexed T-DNA and transposon-tagged mutant collections available for Arabidopsis and rice are an invaluable resource for such reverse genetic studies. In addition to providing loss-of-function mutants when a T-DNA or transposon lands in a gene, the vectors can be designed to track promoter function and overexpress nearby genes. Gene trapping vectors containing promoterless reporter genes with splice donor and acceptor sites can be used to infer the expression pattern of disrupted genes and to identify promoters with tissue-specific expression patterns [41,42]. Activation tagging vectors contain transcriptional enhancers that cause nearby genes to be overexpressed while maintaining normal expression patterns. Such activation tagged mutants are particularly useful for investigating essential genes in which disruption is lethal, genes with redundant functions where a knockout in one family member fails to produce a phenotype, and the regulation of complex processes in which the activation of a global control gene is required to observe a phenotype [43-45].
A high-efficiency transformation method is a prerequisite for the creation of large T-DNA collections. Fortunately, Brachypodium is highly amenable to tissue culture and transformation. After conditions were established for the generation of embryogenic callus and production and regeneration of fertile Brachypodium plants [46], Brachypodium was successfully transformed using a biolistic method [2,47]. Due to complex insertion patterns that often involve many copies of the inserted DNA and local rearrangements, biolistic transformation is not ideal for the generation of mutant populations. Agrobacterium-mediated transformation of Brachypodium was first reported in 2006 [27]. Subsequently, two groups optimized high-efficiency Agrobacterium-mediated transformation methods [18,19]. With efficient transformation methods in hand, the generation of Brachypodium T-DNA collections was initiated. In 2010, the BrachyTAG collection (http://www.brachytag.org) [48] reported 741 T-DNA lines with flanking sequence tags (FSTs) anchoring insertion sites within the Brachypodium genome. These lines contain insertions between 1,500 bp upstream and 500 bp downstream of the coding sequence for 364 Brachypodium genes. Based on these data, the authors estimated that at least 51,976 lines would be necessary to obtain a T-DNA insertion in any gene with a 95% probability and that the actual number of lines required to meet this goal is over 100,000. As a means to approach this distant target, we developed a high-throughput pipeline for the production and sequencing of Brachypodium T-DNA mutants. Here we present the optimization of this pipeline including a comparison of T-DNA and transposon tagging strategies, further optimization of our transformation method, and a comparison of several T-DNA vectors. We used these optimized methods to generate 8,491 Brachypodium T-DNA mutants and identify 5,285 unique insertion loci for 4,402 of these lines.
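Saturation estimates of this kind are often based on a simple model in which insertions land uniformly at random. The sketch below illustrates that general model and is not necessarily the calculation used by the BrachyTAG authors: the number of insertions N needed to hit a target of size x in a genome of size G with probability P is N = ln(1 - P) / ln(1 - x/G). Only the 272 Mbp genome size comes from the text; the 2 kb target size, the function name, and the one-insertion-per-line default are illustrative assumptions.

```python
import math

def lines_needed(prob, target_bp, genome_bp, insertions_per_line=1.0):
    """Lines required so that a target of target_bp is hit with probability prob,
    assuming insertions land uniformly at random (a simplification)."""
    if not 0 < prob < 1:
        raise ValueError("prob must be strictly between 0 and 1")
    miss_one = 1.0 - target_bp / genome_bp        # chance one insertion misses the target
    insertions = math.log(1.0 - prob) / math.log(miss_one)
    return math.ceil(insertions / insertions_per_line)

GENOME_BP = 272_000_000   # genome size given in the text
TARGET_BP = 2_000         # hypothetical target size, for illustration only

n95 = lines_needed(0.95, TARGET_BP, GENOME_BP)
n95_two_inserts = lines_needed(0.95, TARGET_BP, GENOME_BP, insertions_per_line=2.0)
print(n95, n95_two_inserts)
```

Because real T-DNA insertions are not uniformly distributed, a model like this tends to understate the number of lines needed for genes in poorly targeted regions.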

Plant lines and growth conditions
Brachypodium inbred lines Bd21 and Bd21-3 were compared in initial transformation studies, and Bd21-3 was selected to generate the bulk of the T-DNA mutant population [18]. Plants were grown in a soil mix of 1 part sandy loam, 2 parts sand, 3 parts peat moss, and 3 parts medium grade (#3) vermiculite. A time release fertilizer containing micronutrients (Osmocote Plus 15-9-12, Scotts Co., Marysville, OH) was added at the time of planting. Plants were grown in both greenhouses and growth chambers. Growth chamber conditions were a 20 hr light : 4 hr dark photoperiod, cool-white fluorescent lighting at a level of 150 µE m⁻² s⁻¹, and temperatures of 24°C during the day and 18°C at night. Greenhouse conditions were no shading, 24°C in the day and 18°C at night with the day length extended to 16 hours by supplemental lighting. T1 seeds were harvested from senesced T0 plants after they were completely dried (typical yield ranged between 50 and 150 T1 seeds per plant). When additional seeds were required for particular lines, 6-12 T1 seeds were sown and the harvested seeds were collected in bulk. If large quantities of seeds are required, plants can be grown under short day conditions and then vernalized or moved to long day conditions. In this case, over 1,000 seeds can be obtained from an individual plant.

T-DNA constructs
Several constructs were used in this study. The previously described construct pOL001 [27] was used as a benchmark for the evaluation of the new constructs and for the initial production of T-DNA lines. pOL001 contains a HptII gene under control of the CaMV 35S promoter for selection of transgenic callus and a GUS reporter gene under control of the maize ubiquitin promoter (Fig. 1A).
Ac-Ds and En-I(Spm) transposon systems were tested using the constructs Ac-DsATag-Bar_gosGFP [49] (Fig. 2A) and pdSpm-R [50]. These constructs are single vector transposon tagging constructs and contain both the transposase and the corresponding mobile element. We built seven additional constructs (the pJJ vectors) starting from pCAMBIA vector backbones (http://www.cambia.org/) (Fig. 1A). The first constructs (pJJ, pJJB) were used to compare the transformation efficiency of hygromycin and BASTA selections. The pJJ2LB and pJJB2LB constructs were used to determine if the presence of two left border sequences improved the efficiency of FST generation by decreasing the transfer of vector DNA beyond the left border [51]. The final vectors were modified from pJJ2LB to create derivatives designed for gene trapping (pJJ2LBP and pJJ2LBP2) and activation tagging (pJJ2LBA). The construction of these vectors is described below, and primers used in their construction are listed in Table S1. Sequencing was performed on the T-DNA regions of all constructs to ensure that no mutations were introduced during PCR and to confirm proper orientation after ligation of digestion fragments.
The pUbi-BASK vector (courtesy of Jim Thomson) was the source of the maize ubiquitin promoter with the ubiquitin 5′ intron used in the constructs below. To generate pUbi-BASK, the internal EcoRI restriction site within the maize ubiquitin promoter of the pAHC20 vector [52] was removed by site-directed mutagenesis. The BAR coding sequence was excised from pAHC20 with BamHI and KpnI and a BamHI/AscI/SpeI/KpnI (BASK) multiple cloning site was inserted in its place.
pJJB. This vector has a phosphinothricin acetyl transferase (BAR) selection gene under the control of the maize ubiquitin promoter. The BAR gene sequence was amplified from pMNRTT224nb (courtesy of Venkatesan Sundaresan) using the BARBamHIF and BARKpnIR primers. The PCR product was digested with BamHI and KpnI and inserted into the corresponding sites of the pUbi-BASK vector directly after the maize ubiquitin promoter with the ubiquitin 5′ intron and preceding the nos terminator. The Ubi-BAR-nos fragment was then inserted into the HindIII and EcoRI sites of the pCAMBIA0305.2 vector (http://www.cambia.org/). The resulting construct also has a GUSPlus reporter gene under control of the CaMV 35S promoter and a kanamycin resistance gene for selection in bacteria.
pJJB2LB. To generate this vector the pCAMBIA0305.2 left border (LB) was replaced with the double left border sequence (2LB) from pL3 [53]. This 741 base pair region consists of a nopaline LB and its flanking regions followed by an octopine LB and its flanking regions. The 2LB sequence was amplified with primers designed to introduce a SacII site adjacent to the octopine border and an AseI site adjacent to the nopaline border (pL3-2LB-F1 and pL3-2LB-R1). The 2LB PCR fragment was introduced between the AseI and SacII sites of pCAMBIA0305.2. Then, the Ubi-BAR-nos sequence (as for pJJB) was introduced between the HindIII and EcoRI sites.
pJJ. This vector has the hygromycin phosphotransferase selection gene HptII under control of the maize ubiquitin promoter. The maize ubiquitin promoter sequence with the ubiquitin 5′ intron was amplified from the pUbi-BASK vector using the primers UbiHEF and UbiOLR, and HptII was amplified from pGHyg [18] using the primers HygOLF and HygEcoRIR. The PCR products were digested with BamHI, and the resulting fragments were ligated and used as a template for PCR with the UbiHEF and HygEcoRIR primers. This PCR product was digested with EcoRV and EcoRI and introduced into the SmaI and EcoRI sites of pCAMBIA0380. This construct lacks a nos terminator following the HptII sequence.
pJJ2LB. To build this construct, the Ubi-Bar-nos cassette was removed from pJJB2LB by digestion with EcoRI and HindIII and replaced with the Ubi-Hyg fragment from pJJ. This fragment is also lacking the nos terminator following the HptII gene.
pJJ2LBA. This construct contains a CaMV 4x35S activation tagging cassette and hygromycin selection under control of the maize ubiquitin promoter. The CaMV 4x35S enhancer fragment was derived from Ac-DsATag-Bar_gosGFP [49], in which it was flanked by two EcoRI restriction sites and contained one internal EcoRI site. Ac-DsATag-Bar_gosGFP was digested with HindIII to release a fragment containing the CaMV 4x35S region and the Ubi-BAR-nos cassette, the ends of the fragment were blunted using mung bean nuclease, and the fragment was then inserted into pCR4-TOPO. The ~1.7 kb CaMV 4x35S enhancer fragment was released by partial digestion with EcoRI and introduced into the EcoRI site of pJJ2LB. Due to the repetitive nature of the CaMV 4x35S region, we were only able to sequence through three copies of the enhancer sequence after cloning into pJJ2LBA. However, enzymatic digestion confirmed the insert was of the expected size for four enhancer copies. This construct does not have a nos terminator following the HptII gene, but still transforms efficiently.
pJJ2LBP and pJJ2LBP2. These constructs were modified from pJJ2LB. Both contain hygromycin selection driven by the maize ubiquitin promoter and two gene trap reporters, a promoterless GUS gene (I2-GUS-Tn) near the 2LB and a promoterless GFP gene (Tn-sGFP-I3) near the RB. The gene trap cassettes were derived from the pGA2717 vector and contain rice tubulin intron sequences with splice donor and acceptor sites before the reporter gene sequences [41]. To make pJJ2LBP, the constitutive GUSPlus reporter and the RB were first removed from pJJ2LB by digestion with SphI. Primers (RB-f2 and RB-r2) were designed to amplify the pCAMBIA0305.2 right border and to introduce SpeI, AvrII and AseI restriction sites just before the RB and an SpeI site just following the RB sequence. This PCR fragment was introduced into the SpeI sites of pJJ2LB lacking the GUSPlus and original RB to create the intermediate pJJ2LB-RB2. Next, the I2-GUS-Tn cassette was released from pGA2717 by digestion with HpaI and KpnI, the ends were blunted, and the fragment was cloned into the TOPO Blunt vector (Life Technologies, Grand Island, NY). The I2-GUS-Tn cassette was released from TOPO Blunt with EcoRI and ligated into pJJ2LB-RB2. Clones with the insert oriented with the I2 intron adjacent to the 2LB sequence were selected for the intermediate pJJ2LB-GUS-RB2. Next, the I3-sGFP-Tn cassette was amplified from pGA2717 using the GFP-f2 and GFP-r2 primers to introduce an AscI site before the I3 intron and an AvrII site after the nos terminator (Tn). This fragment was inserted into the pJJ2LB-GUS-RB2 intermediate at the AscI and AvrII sites near the new RB to create the final pJJ2LBP vector. This construct also lacks the nos terminator following the HptII gene. pJJ2LBP2 was made by the addition of a nos terminator to pJJ2LBP following the HptII gene. The nos terminator was amplified from pCAMBIA0305.2 with primers designed to add XmaI sites at the ends of the amplified DNA (nos-f and nos-r). The PCR fragment was introduced into the XmaI site between the HptII gene and the I2-GUS-Tn cassette of pJJ2LBP. The nos terminator was then checked for proper orientation.

Agrobacterium-mediated Transformation
The Agrobacterium-mediated transformation protocol used to generate the T-DNA insertion lines in this collection was optimized from the protocol described in Vogel and Hill 2007 [18]. Embryos (~0.3-0.7 mm) were dissected from immature seeds and transferred to callus initiation media (CIM, per L: 4.43 g Linsmaier & Skoog basal medium (Phytotechnology, Shawnee Mission, KS #L689), 30 g sucrose, 1 ml 0.6 mg/ml CuSO4, pH 5.8; for plates, 2 g phytagel (Sigma #P-8169) was added; after autoclaving, 0.5 ml of 5 mg/ml 2,4-D stock solution was added). Following 3-4 weeks incubation in the dark at 28°C, embryogenic callus was subcultured onto fresh CIM plates. A second subculture was performed after two more weeks. The calluses from the second subculture were grown for one week before being used for transformation. On the day of transformation, calluses were bathed for 5 minutes in a suspension of Agrobacterium strain AGL1 containing the desired vector for transformation [54] (OD600 = 0.6) prepared in liquid CIM containing 200 µM 2,4-D and 0.1% Synperonic PE/F68 (Sigma #81112, formerly Pluronic F68). After removing as much of the Agrobacterium suspension as possible, the calluses were transferred to petri dishes containing a piece of sterile filter paper for co-cultivation for 3 days in the dark at 22°C. Note that co-cultivation under desiccating conditions is critical to the success of the transformation protocol. Next, the callus pieces were moved to CIM plates containing 150 mg/L timentin and the appropriate selective agent to kill untransformed plant tissue (either 40 mg/L hygromycin B (Phytotechnology H397) or 60 mg/L DL-Phosphinothricin (Phytotechnology P679)) and incubated in the dark at 28°C for 1 week. Healthy sectors of hygromycin resistant transgenic callus were subcultured to fresh CIM plates one time for an additional two weeks of selection. BASTA selected callus was subjected to two additional rounds of subculture for an additional four weeks of selection.
Note that it is not necessary to obtain a callus with only healthy transgenic tissue because even small pieces of healthy callus surrounded by dead and dying callus will produce plantlets efficiently. Between 3 weeks (hygromycin selection) and 5 weeks (BASTA selection) after co-cultivation, calluses were transferred to regeneration media (per L: 4.43 g Linsmaier & Skoog (LS) basal medium, 30 g maltose, 2 g phytagel, pH 5.8; after autoclaving, 1.0 ml of sterile 0.2 mg/ml kinetin stock solution was added) containing 150 mg/L timentin and the appropriate selective agent. Plates were incubated in the light (cool-white fluorescent lighting at a level of 65 µE m⁻² s⁻¹ with a 16 hr light : 8 hr dark cycle) at 28°C. Callus pieces began to turn green and shoots appeared within 2-4 weeks. Individual plantlets were moved to tissue culture boxes (we used sundae cups made for food service applications from Solo Corporation, Lake Forest, IL Cat. # SOL-TS5 (cups) and SOL-DL-100 (dome lids)) containing MS sucrose medium (per L: 4.42 g Murashige & Skoog (MS) basal medium with vitamins (Phytotechnology M519), 30 g sucrose, and 2 g phytagel, pH 5.7) and incubated in the light (cool-white fluorescent lighting at a level of 65 µE m⁻² s⁻¹ with a 16 hr light : 8 hr dark cycle) at 28°C. After plantlets had formed roots and were approximately 2-5 cm tall, they were transplanted to soil and placed in a growth chamber for flowering (20 hr light, 4 hr dark, 24°C during the day and 18°C at night, cool-white fluorescent lighting at a level of 150 µE m⁻² s⁻¹). In this case, vernalization was not required to induce rapid flowering. Alternatively, to promote rapid flowering in plantlets moved directly to greenhouse conditions (no shading, 24°C in the day and 18°C at night with supplemental lighting to extend daylength to 16 h), plants were vernalized in tissue culture boxes or in soil under light (continuous cool-white fluorescent lighting, 4 µE m⁻² s⁻¹) at 4°C for 2-4 weeks depending on the season.
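The media recipes above give supplements as a volume of stock solution per liter of medium. As a minimal sketch of the implied arithmetic, the final concentration is stock concentration times volume added, divided by the final volume. The stock concentrations and volumes are taken from the recipes in the text; the function and dictionary names are illustrative.

```python
def final_conc_mg_per_l(stock_mg_per_ml, volume_ml, final_volume_l=1.0):
    """Final concentration (mg/L) after adding volume_ml of stock to final_volume_l of medium."""
    return stock_mg_per_ml * volume_ml / final_volume_l

# (stock concentration in mg/ml, volume added in ml per liter), as listed in the recipes
supplements = {
    "CuSO4 in CIM": (0.6, 1.0),
    "2,4-D in CIM": (5.0, 0.5),
    "kinetin in regeneration medium": (0.2, 1.0),
}

final = {name: final_conc_mg_per_l(stock, vol) for name, (stock, vol) in supplements.items()}
for name, conc in final.items():
    print(f"{name}: {conc:g} mg/L")
```

The same function can be reused when preparing a batch other than 1 L by passing a different final_volume_l and scaling the stock volume accordingly.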

PCR of empty donor sites in Ac/Ds lines
PCR was performed on DNA extracted from leaves of Ac-DsATag-Bar_gosGFP T-DNA lines. Primers HygBamHIF (5′-ttggatccatgaaaaagcctgaactcacc-3′) and HygKpnIR (5′-ttggtaccctatttctttgccctcgg-3′) were used to amplify the HptII gene. Locations of the primers R13pMOg22 (5′-ggaaacgacaatctgatctctagg-3′) and Ac-promRev (5′-ctcagtggttatggatgggagttg-3′) that were used to identify the empty donor sites are shown as small black arrows in Fig. 2A.
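The names of the two HptII primers suggest that each carries a restriction site near its 5′ end. As a small, hedged sanity check, the sketch below looks for the standard BamHI (GGATCC) and KpnI (GGTACC) recognition sequences in the primer sequences given in the text and reports primer length and GC content. The sequences are copied from the text with the line-break hyphen in HygKpnIR removed; the helper names are illustrative.

```python
# Standard recognition sequences; both are palindromic, so strand does not matter here.
SITES = {"BamHI": "GGATCC", "KpnI": "GGTACC"}

# Primer sequences from the text (5' to 3') and the site each name implies.
PRIMERS = {
    "HygBamHIF": ("ttggatccatgaaaaagcctgaactcacc", "BamHI"),
    "HygKpnIR": ("ttggtaccctatttctttgccctcgg", "KpnI"),
}

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def describe(name):
    """Return length, GC fraction, and whether the expected site is present."""
    seq, enzyme = PRIMERS[name]
    return {
        "length": len(seq),
        "gc": round(gc_fraction(seq), 2),
        "has_site": SITES[enzyme] in seq.upper(),
    }

report = {name: describe(name) for name in PRIMERS}
for name, info in report.items():
    print(name, info)
```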

DNA Extraction
DNA extraction was based on the method described by Shiaman Chao and Daryl Somers (http://maswheat.ucdavis.edu). To prepare tissue, two young leaves (approximately 3 inches) from T0 plants were sampled into the wells of a 96-well plate (E&K Scientific, EK-22280). Glass or chrome steel beads (http://www.biospec.com) were added to the wells (either one 6.3 mm and two to three 3.5 mm glass beads or two 3.2 mm chrome steel beads per well), and the plates were lyophilized overnight. Directly after lyophilization, the plates were covered with sealing mats (E&K Scientific, EK-80080), and tissue was ground in a Retsch MM301 Ball Mill at 30 cycles/sec for 1 min. The grinding was repeated 5 times, for a total of 5 min. Before opening plates, the ground tissue was centrifuged at 4,000 rpm for 20 min at 4°C. To isolate DNA, 800 µl of extraction buffer (0.1 M Tris-HCl pH 7.5, 0.05 M EDTA pH 8.0, 1.25% SDS) preheated to 65°C was added to each well. Plates were sealed with a new sealing mat and shaken thoroughly. Chrome steel beads were removed prior to adding extraction buffer, but glass beads were left in the wells. To prevent contamination from neighboring wells, a flat weight was placed on top of the sealing mat and secured with several rubber bands. Plates were incubated at 65°C for 0.5-1 hour, with mixing every 5-10 min. Plates were transferred to ice for 15 min before centrifuging at 4,000 rpm for 5 min at 4°C. Next, 400 µl cold 6 M ammonium acetate was added to the wells, and the plates were inverted several times and placed on ice for 15 min. Plates were then centrifuged at 4,000 rpm for 15 min at 4°C to collect the precipitated proteins and plant tissue, and 900 µl of the supernatant was transferred into another plate containing 540 µl of isopropanol in each well. Plates were mixed thoroughly and placed on ice or at -20°C for 30 min to precipitate DNA. Plates were then centrifuged at 4,000 rpm for 30 min at 4°C, the supernatant was decanted, and the plate was inverted on a paper towel for a few seconds. Pellets were washed with 1 ml of 70% ethanol and centrifuged at 4,000 rpm for 20 min at 4°C. After removing the supernatant, the pellets were dried in the hood overnight and resuspended in 125 µl TE buffer (10 mM Tris-Cl pH 7.5, 1 mM EDTA pH 8.0). To dissolve the DNA, the samples were placed at 4°C overnight and then transferred to a 96-well microtiter plate. Estimated yield is 10-20 ng/µl.
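The estimated yield can be related to the downstream digestion step with simple arithmetic. The sketch below assumes the yield is expressed per microliter and uses the 125 µl resuspension volume from this section and the 7.5 µl of DNA used per restriction digest in the inverse PCR protocol; the variable and function names are illustrative.

```python
RESUSPENSION_UL = 125.0          # TE volume used to resuspend each pellet
YIELD_NG_PER_UL = (10.0, 20.0)   # estimated yield range (low, high)
DIGEST_INPUT_UL = 7.5            # DNA volume per restriction digest

def mass_ng(conc_ng_per_ul, volume_ul):
    """DNA mass (ng) contained in a given volume."""
    return conc_ng_per_ul * volume_ul

total_ng = tuple(mass_ng(c, RESUSPENSION_UL) for c in YIELD_NG_PER_UL)
digest_ng = tuple(mass_ng(c, DIGEST_INPUT_UL) for c in YIELD_NG_PER_UL)
max_digests = int(RESUSPENSION_UL // DIGEST_INPUT_UL)  # ignores pipetting losses

print(f"total DNA per sample: {total_ng[0]:.0f}-{total_ng[1]:.0f} ng")
print(f"DNA per digest: {digest_ng[0]:.0f}-{digest_ng[1]:.0f} ng")
print(f"digests per sample (upper bound): {max_digests}")
```

Under these assumptions, the per-digest input works out to the 75-150 ng range stated for the restriction digestions.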

Inverse PCR
Restriction enzyme digestions were performed in 15 µl reactions containing 75-150 ng (7.5 µl) of DNA, 0.2 µl enzyme, and 1× NEB #1 buffer. Samples were incubated at 37°C for 3 hours, then at 65°C or 80°C (depending on the enzyme) for 20 min. The enzymes used to digest individual T-DNA constructs at the right or left borders are listed in Table 1. Digestion products were purified by precipitation with 95% ethanol and 3 M sodium acetate and resuspended in 10 µl of TE buffer (pH 8.0). Ligations were performed by adding 0.125 µl T4 DNA ligase, 1.0 µl of 10× ligation buffer, and 3.875 µl H2O to the digested DNA and incubating for 16 hr at 16°C. PCR reactions (10 µl) were prepared for the ligation products as follows: 2 µl 5× Go Taq buffer (NEB), 1 µl dNTPs (2.5 mM), 0.2 µl Primer1 (10 µM), 0.2 µl Primer2 (10 µM), 0.5 µl DMSO (100%), 0.05 µl Go Taq, 2 µl DNA (ligation), and 4.05 µl H2O. The PCR program was 95°C for 5 min and 94°C for 30 sec, followed by 35 cycles (94°C for 30 sec, 60°C for 30 sec, 72°C for 1 min 45 sec), ending with 72°C for 10 min. Some primer pairs needed 2-3 additional cycles to improve amplification. If the PCR did not work well, another round of PCR was performed using nested primers and 0.5 µl of the first round PCR product as a template.
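For plate-scale work, the per-reaction PCR recipe above is usually prepared as a master mix. The following sketch, with volumes in microliters taken from the recipe in the text, checks that the components sum to the stated 10 µl and scales everything except the template to a 96-well plate. The 10% pipetting overage and the function name are assumptions for illustration and are not part of the published protocol.

```python
# Per-reaction volumes in microliters, excluding the template (ligation product).
PCR_MIX_UL = {
    "5x buffer": 2.0,
    "dNTPs (2.5 mM)": 1.0,
    "Primer1": 0.2,
    "Primer2": 0.2,
    "DMSO": 0.5,
    "polymerase": 0.05,
    "water": 4.05,
}
TEMPLATE_UL = 2.0

def master_mix(per_reaction, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n_reactions plus a fractional overage (assumed 10%)."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction.items()}

total_ul = sum(PCR_MIX_UL.values()) + TEMPLATE_UL
plate_mix = master_mix(PCR_MIX_UL, 96)
mix_per_well = sum(PCR_MIX_UL.values())   # volume of master mix to dispense per well

print(f"reaction volume: {total_ul:.2f} ul; dispense {mix_per_well:.2f} ul mix + {TEMPLATE_UL} ul template")
for name, vol in plate_mix.items():
    print(f"{name}: {vol} ul")
```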

FST Sequencing
Unconsumed dNTPs and primers were removed from PCR products using ExoSAP-IT (Affymetrix) per the manufacturer's instructions. Sequencing was performed in either 96-well (10 µl reactions) or 384-well (5 µl reactions) plates with the BigDye Terminator v3.1 sequencing kit (Applied Biosystems) using the following program: 98°C for 5 min, followed by 39 cycles (96°C for 10 sec, 50°C for 10 sec, 60°C for 4 min), and holding at 4°C. The sequencing products were precipitated with 95% ethanol and 3 M sodium acetate, washed with 70% ethanol, and dried in a hood (hours to overnight). The pellets were dissolved in 8.5 µl (for a 96-well plate) or 5 µl (for a 384-well plate) of Hi-Di Formamide (Applied Biosystems), and the plates were stored at -20°C until use. Before loading the sequencer, the samples were denatured at 96°C for 3 min. Primer locations and sequences are listed in Table 2.

GUS staining of JJ2LBP and JJ2LBP2 lines
Stem, leaf and floral tissue samples were collected from young Brachypodium plants into microcentrifuge tubes. GUS staining solution (0.1 M sodium phosphate pH 7.0, 0.5 mM potassium ferrocyanide, 0.5 mM potassium ferricyanide, 0.5% v/v Triton X-100, 0.15% w/v X-Gluc) was added directly to the tubes, and samples were vacuum infiltrated for 5 min before being placed at 37°C in the dark overnight. The GUS staining solution was removed with a pipette and 95% EtOH was added to remove any chlorophyll that might mask the blue staining and to fix the tissue.

Optimization of transformation
We performed a series of experiments to improve transformation efficiency, defined as the number of transgenic plantlets regenerated per number of callus pieces co-cultivated with Agrobacterium, and to decrease the labor required per transgenic plant produced. The following modifications were found to improve transformation efficiency. The production of high quality embryogenic callus was increased by the addition of copper sulfate at a final concentration of 0.6 mg/L to the callus initiation media [19]. Callus pieces were moved to media containing the selective agent (hygromycin or Basta) directly after the 3 day co-cultivation step, rather than after a 7 day recovery on media lacking the selective agent. This modification permitted us to eliminate one of the subculture steps after co-cultivation, decreasing labor and supplies and accelerating the recovery of transgenic plants. After rooting, the regenerated plantlets were placed at 4°C for two to three weeks before moving them into soil to cue early flowering in the greenhouse and bypass the need for growing the plants under long days in a growth chamber. Brachypodium accessions Bd21-3 and Bd21 were compared as the source of embryogenic callus. Transformation efficiency was similar for the two lines when a microscope was used to identify embryogenic callus during subculture. However, Bd21-3 was selected for the production of our insertional mutant population because it forms a more strongly yellow colored callus with organized structures, which eased selection of the correct callus and increased transformation efficiency when subculture was performed without the aid of a microscope.
Optimization of the transformation vector was responsible for the greatest improvement in transformation efficiency. We conducted experiments to compare the recovery of transgenic plants using hygromycin or Basta selection under the control of three different promoters (Table 3). The pOL001 vector was previously used to produce transgenic Brachypodium with transformation efficiency up to 41% [18]. This construct contains hygromycin selection under the control of the CaMV 35S promoter and was used as a benchmark in all transformation optimization experiments. Three additional hygromycin selection vectors and three vectors designed with Basta selection were compared with pOL001. The pGA2717 rice transformation vector [41] contains hygromycin selection driven by the rice tubulin promoter. Two additional vectors, pJJ and pJJ2LB, were constructed by placing hygromycin selection under control of the maize ubiquitin promoter and inserting it into pCAMBIA0305.2 (http://www.cambia.org/). The pJJ vector utilizes the single left border (LB) from pCAMBIA0305.2, but in the pJJ2LB vector, the pCAMBIA0305.2 LB was replaced with a double left border from the pL3 vector [53]. The pJJB and pJJB2LB vectors were constructed similarly, but have the maize ubiquitin promoter driving Basta selection. The final vector, pSMAb801, has the CaMV 35S promoter directing expression of the Basta selection gene [55].
Transformation efficiency was evaluated for these seven vectors in multiple experiments (Table 3). Our results show that transgenic plants can be recovered using either hygromycin or Basta as the selective agent; however, vectors containing hygromycin selection, with the exception of vector pGA2717, were much more efficient (averages 22.9 to 55.8%) than those employing Basta selection (averages 2.2 to 8.3%). The hygromycin-selected pGA2717 vector yielded the lowest transformation efficiency observed, 0.7%, possibly due to the function of the rice tubulin promoter. However, since we did not sequence the vector, we cannot rule out the possibility that there was something wrong with the vector. Additional experiments compared transformation efficiency for pOL001 and pJJ2LB. Hygromycin selection is driven by different promoters in these two vectors (CaMV 35S in pOL001 and maize ubiquitin in pJJ2LB). In six side-by-side transformation experiments, 1,037 pieces of callus were co-cultivated with Agrobacterium carrying pOL001 and 991 pieces were co-cultivated with Agrobacterium carrying pJJ2LB. Transformation efficiency was significantly higher for pJJ2LB containing the maize ubiquitin promoter (48%) than for pOL001 containing the CaMV 35S promoter (32.2%). For both of these constructs, more than 95% of the plants tested were positive for GUS reporter gene expression and greater than 90% of the regenerated plantlets survived to set T1 seed (data not shown). Using our optimized method and vectors, transformation efficiencies averaged 42% during the production of the mutant population, and efficiencies of 50-75% were achieved for individual experiments.
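Transformation efficiency as defined above is a simple proportion, and a comparison such as pOL001 versus pJJ2LB can be examined with a two-proportion z-test. The text does not state which statistical test was used, so the sketch below is one conventional choice rather than the authors' method. The callus counts (1,037 and 991) are from the text; the plantlet counts (334 and 476) are not reported and are back-calculated here from the stated efficiencies purely for illustration.

```python
import math

def efficiency(plantlets, callus_pieces):
    """Transgenic plantlets regenerated per callus piece co-cultivated."""
    return plantlets / callus_pieces

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Callus counts from the text; plantlet counts are ASSUMED (back-calculated from 32.2% and 48%).
POL001 = {"plantlets": 334, "callus": 1037}
PJJ2LB = {"plantlets": 476, "callus": 991}

eff_pol001 = efficiency(POL001["plantlets"], POL001["callus"])
eff_pjj2lb = efficiency(PJJ2LB["plantlets"], PJJ2LB["callus"])
z, p_value = two_proportion_z(POL001["plantlets"], POL001["callus"],
                              PJJ2LB["plantlets"], PJJ2LB["callus"])
print(f"pOL001: {eff_pol001:.1%}  pJJ2LB: {eff_pjj2lb:.1%}  z = {z:.2f}  p = {p_value:.2g}")
```

Pooling callus pieces across the six experiments ignores experiment-to-experiment variation, so a per-experiment analysis would be more appropriate if the individual counts were available.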

Generation of T-DNA lines
The bulk (82.6%) of the T-DNA population was generated from three constructs: pJJ2LB, pJJ2LBA and pJJ2LBP2 (Fig. 1). All three constructs were designed with two left borders to try to limit the transfer of vector DNA beyond the left border [51]. Constructs pJJ2LB and pJJ2LBP2 can only affect gene function by insertion into coding or regulatory regions. In addition, pJJ2LBP2 can function as a gene trap to identify adjacent promoters because it contains promoterless GUS and GFP genes with multiple splice acceptors at the left and right borders, respectively [41]. In addition to disrupting gene function by insertion, pJJ2LBA can act as an activation tag that causes the overexpression of nearby genes because it contains four copies of the CaMV 35S enhancer sequence adjacent to the LB [43,56]. Overall, 8,491 fertile lines were produced that comprise the WRRC Brachypodium T-DNA insertional mutant population described in this report (Fig. 1B and Table S2). Since this is an ongoing project, additional lines continue to be produced and readers are directed to the project website (http://brachypodium.pw.usda.gov/) for the most up-to-date totals.

Evaluation of transposon tagging in Brachypodium
To determine if transposon tagging could be used to generate insertional mutants in Brachypodium efficiently, we tested two transposon systems (Fig. 2) that have been used previously for large-scale mutagenesis in rice and Arabidopsis, Ac/Ds and EnSpm [57,58]. Agrobacterium-mediated transformation was used to deliver T-DNAs containing the transposon constructs. Both T-DNAs contain hygromycin selection for plant transformation, a GFP reporter gene, and a transposase. In addition, each construct contains a mobile element harboring four copies of the CaMV 35S enhancer sequence for activation tagging. The average transformation efficiency of Ac-DsATag-Bar_gosGFP (Fig. 1A) [49] over six transformations was 27.2%, but most plants either died before setting seed or produced non-viable seed (Fig. 2B). Only 5.9% (7 of 119) of the Ac-DsATag-Bar_gosGFP T0 transgenic plants survived to set T1 seed. PCR was performed on genomic DNA extracted from eight Ac-DsATag-Bar_gosGFP T0 plants (Fig. 2C). All samples tested positive for the presence of the HptII gene, indicating that the T-DNA was present in the plants. Transposition of the Ds element leaves behind an empty donor site that can be detected when PCR is performed with primers located outside of the Ds region (Fig. 2A). If the Ds element remains in place, the distance between the primers is too large to amplify a fragment.
Five of the eight plants tested for an empty donor site yielded a band with a size indicating that the Ds element had moved (Fig. 2C), and none of these lines set seed. In the case of line 171-3, the larger PCR fragment suggests that the Ds excision was incomplete. Line 175, one of the lines in which the Ds element did not move, is also a line that produced T1 seed. We conclude that the Ac/Ds transposon functions in Brachypodium but is lethal, possibly because it is too active. Similarly, transformations with a construct containing dSpm (pdSpm-R) [50] yielded 22 plantlets, all but one of which died while still very small (Fig. 2D). However, all callus pieces transformed with pdSpm-R displayed GFP fluorescence, indicating that they were transformed with the construct (Fig. 2E). Although there is potential for optimization of transposon systems for the production of mutants in Brachypodium, we chose to focus on T-DNA tagging to generate insertional mutants.

Expression of β-glucuronidase (GUS) from the pJJ2LBP and pJJ2LBP2 gene trap vectors
To assess the function of the gene trap vectors, 235 lines containing pJJ2LBP and 500 lines containing pJJ2LBP2 were assayed for GUS activity as an indicator that the promoterless GUS reporter gene was being expressed by a Brachypodium promoter. Since the T1 generation examined was segregating for the transgenes, we first used PCR to identify transgenic T1 plants by amplifying the HptII gene contained on the T-DNA (data not shown). Leaf, stem, and floral samples for each transgenic plant were placed in a GUS staining solution and incubated overnight. Upon visual examination of cleared tissue, 53 (7.2%) of the lines showed GUS expression in at least one tissue type (Fig. 3A, B, C). Seven lines showed blue staining in all three tissues sampled (Fig. 3B), 14 lines in vegetative tissues only (leaf and stem), and 22 lines in floral tissue only (Fig. 3C). The remaining 10 positive lines had GUS expression in flowers and either stem or leaf tissue. Seeds for 95 lines were germinated on MS media containing hygromycin and tested for GUS expression in roots, but blue staining was not detected in the roots (data not shown). Using the vector from which we derived our gene trap cassettes, pGA2717, Ryu et al. observed GUS staining in 4.8% of 3,140 rice lines [41], a value similar to what we observed in Brachypodium. Another study of 8,200 rice lines using a similar vector, pTAG8, reported GUS staining in vegetative tissue for 11% of the lines tested and in reproductive tissue for 22% of the lines tested [42]. These results demonstrate that the GUS reporter derived from the pGA2717 rice gene trap vector functions in Brachypodium and suggest there is room for optimization of gene trap vectors for Brachypodium.

... genomic digestion site for the same enzyme [57,58]. To determine the best means of recovering FSTs, we performed sequencing reactions from both the LB and the RB of the T-DNA. Multiple enzymes and sequencing primers were also tested.
Primers oriented out of the T-DNA (designated T primers) yield sequences directly adjacent to the T-DNA and are directed into the flanking genomic region. Primers oriented into the T-DNA (designated RE primers) result in sequences starting at the genomic restriction enzyme site and are directed toward the T-DNA (Fig. 4).

Phenotypes observed in T-DNA insertional mutant lines
Initial studies tested the efficiency of recovering FSTs after digestion with the enzyme HpyCH4IV. This enzyme was chosen because it has recognition sites near both of the T-DNA borders, and therefore one digestion and ligation reaction could be used for four sequencing reactions. Sequencing from the LB using T primers (LB-T) returned FSTs for 232 of 567 lines tested (40.9%), and reactions from the LB using RE primers (LB-RE) produced FSTs for only 189 of the same 567 lines (33.3%). Similar results were observed when sequencing from the RB. Reactions using T primers (RB-T) returned FSTs for 178 of 378 lines tested (47.1%), and reactions using RE primers (RB-RE) produced FSTs for 172 of 474 lines tested (36.3%). These results indicate that at both borders, sequencing directly from the T-DNA into the genomic sequence using T primers was more efficient at generating FSTs than sequencing from the genomic restriction site back toward the T-DNA using RE primers. In all four sets of reactions described above (LB-T, LB-RE, RB-T, and RB-RE), the majority of the sequences recovered contained vector sequences (51-61%), and 7-15% of the sequences failed to match any known sequence or were not readable due to low quality scores. A comparison of early sequencing reactions from the pJJ vector derivatives showed that the addition of a second LB did not improve the efficiency of FST recovery (data not shown).
In an effort to increase the efficiency of recovering FSTs, IPCR was conducted using two enzymes adjacent to the RB of the pJJ vectors. The two enzymes, HpyCH4IV and HpaII, were used in RB-T reactions for 852 T-DNA lines. HpyCH4IV digestion resulted in FSTs for 350 lines (41.1%) and HpaII digestion returned FSTs for 344 lines (40.4%). In combination, FSTs for 433 lines (50.8%) were obtained from the IPCR reactions after digestion with the two enzymes. The two enzyme approach increased the efficiency of obtaining FSTs from the RB approximately 25% over the single enzyme approach. When LB-T reactions were included for the HpyCH4IV digestion, the number of lines with FSTs reached 531 (62.3%), further increasing the efficiency by more than 20%.
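The recovery percentages and relative gains quoted above can be re-derived from the reported line counts. The following is a minimal sketch in Python; the function names are illustrative and not part of the original analysis, and only the counts (852, 350, 344, 433, 531) come from the text.

```python
# Counts reported in the text for IPCR on 852 T-DNA lines.
TOTAL_LINES = 852


def recovery_rate(lines_with_fst, total=TOTAL_LINES):
    """Percentage of tested lines that yielded at least one FST."""
    return 100.0 * lines_with_fst / total


def relative_gain(new_count, old_count):
    """Relative increase, in percent, of new_count over old_count."""
    return 100.0 * (new_count - old_count) / old_count


hpych4iv_rb = 350   # RB-T reactions, HpyCH4IV only
hpaii_rb = 344      # RB-T reactions, HpaII only
two_enzymes = 433   # RB-T reactions, either enzyme
with_lb = 531       # two enzymes at the RB plus LB-T reactions (HpyCH4IV)

for label, count in [
    ("HpyCH4IV (RB-T)", hpych4iv_rb),
    ("HpaII (RB-T)", hpaii_rb),
    ("two enzymes (RB-T)", two_enzymes),
    ("two enzymes + LB-T", with_lb),
]:
    print(f"{label}: {recovery_rate(count):.1f}% of lines")

print(f"gain from second enzyme: {relative_gain(two_enzymes, hpych4iv_rb):.1f}%")
print(f"gain from adding LB-T:   {relative_gain(with_lb, two_enzymes):.1f}%")
```

Computed this way, the second enzyme adds roughly 24% and the LB-T reactions roughly 23% in relative terms, which is consistent with the "approximately 25%" and "more than 20%" figures given above.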
In a separate set of tests, FST recovery was evaluated when IPCR reactions were performed after digestion with three different enzymes. Using BfaI in LB reactions, we obtained FSTs for 38.6% of the lines tested. In HpyCH4IV reactions from the LB and RB, we recovered FSTs for 55.3% of the lines. Together, these two enzymes yielded FSTs for 66.4% of the lines tested. Addition of IPCR reactions from the RB using a third enzyme, TaqI, only increased the total to 67.4% of the lines tested. As a result of these experiments, we decided to use the two-border, two-enzyme approach to obtain FSTs for our T-DNA lines.

Identifying flanking sequence tags and assigning insertion sites
Data from 17,637 inverse PCR sequencing reactions for 7,145 T-DNA lines were compared to the Brachypodium distachyon genome assembly v1.0 using BLASTn (Table 4). To maximize the detection of T-DNA insertion loci, we assigned an e-value cutoff of $10^{-3}$. Using this criterion, 7,389 sequences (41.9%) matched the Brachypodium genome. These sequences represent 4,402 (61.6%) of the lines analyzed. The remaining 10,248 sequences that failed to match the Brachypodium genome were vector sequences, sequences with no identified matches, or poor quality sequences. The top scoring BLAST hit for each FST was used for subsequent analysis. The only exceptions were FSTs that exactly matched more than one location in the genome (discussed below). The average FST length was 195 bases and the median length 142 bases. We defined the base of the FST closest to the T-DNA to be the insertion site (IS). However, since we often did not sequence across the junction, the actual IS may differ slightly. Multiple sequencing reactions were performed for many of the T-DNA lines using T and RE primers that anneal near the RB or LB. Since independent reactions may return different sequences for a single line, a particular line can have FSTs assigned to more than one location in the genome. This is not surprising because the average number of T-DNA insertions per line is ~2 [18]. When a single line has multiple FSTs, they are distinguished by the addition of a numerical suffix to the name of the T-DNA line. For example, four sequencing reactions were performed for the T-DNA line JJ3, yielding four FSTs. The FSTs were designated as JJ3.0, JJ3.1, JJ3.2, and JJ3.3. The JJ3.0 FST is located on chromosome 1, and the other three FSTs are assigned to chromosome 2. When a T-DNA line has more than one FST located on the same chromosome within a 1 kb range, the FSTs were treated as a single IS.
For example, JJ3.1 and JJ3.2 are both located on Bd2 separated by only 284 bases, and therefore are counted as one T-DNA IS. Using this classification, the 7,389 FSTs in this collection represent 5,285 distinct ISs. Of these sites, 1,538 (29.1%) ISs are supported by more than one FST (in 1,501 lines) (Tables 5 and 6). In the majority of the T-DNA lines with assigned FSTs, one (82.4%) or two (15.4%) ISs were identified (Table 5), and fewer than 3.0% of the lines were assigned three or more ISs. In cases where the T-DNA has inserted into a repetitive sequence, the BLAST search returned multiple hits with equal scores and prevented assignment of an unambiguous IS. We observed this for 0.6% of the lines (28 lines) with FSTs. There are two primary locations in the genome where these insertions mapped. One line (JJ4195) returned a sequence that was assigned to 29 loci near the centromere of chromosome Bd4 in a region that contains Brachypodium centromeric retroelements, and 27 lines gave sequences that were assigned to 649 loci in the first 215 kb of the short arm of Bd5 in a region encoding 26S ribosomal RNA genes. To simplify our analyses we only used one IS for each FST. For the majority of the T-DNA lines, the single FST assigned from the top scoring BLAST hit for each sequencing reaction is displayed on the USDA-ARS-WRRC T-DNA website (described in a later section). However, for the insertions in repetitive regions, each of the potential genomic locations for the equal scoring BLAST hits is displayed.
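The rule for collapsing FSTs into insertion sites can be sketched in a few lines of Python. This is an illustration only, not the pipeline used for the collection: the function name, the tuple layout, and the example coordinates are hypothetical, and "within a 1 kb range" is interpreted here as within 1,000 bases of the first FST in a group.

```python
from collections import defaultdict

MAX_GAP = 1000  # bases; FSTs of one line within this range count as one IS


def assign_insertion_sites(fsts, max_gap=MAX_GAP):
    """Group FSTs into insertion sites (ISs).

    fsts: iterable of (fst_name, line, chromosome, position) tuples.
    Returns a list of lists of FST names, one inner list per IS.
    """
    # FSTs can only be merged if they share both line and chromosome.
    by_line_chrom = defaultdict(list)
    for name, line, chrom, pos in fsts:
        by_line_chrom[(line, chrom)].append((pos, name))

    sites = []
    for key in sorted(by_line_chrom):
        hits = sorted(by_line_chrom[key])
        start, current = hits[0][0], [hits[0][1]]
        for pos, name in hits[1:]:
            if pos - start <= max_gap:
                current.append(name)      # same IS as the group's first FST
            else:
                sites.append(current)     # close the group, start a new IS
                start, current = pos, [name]
        sites.append(current)
    return sites


# Hypothetical line "X1" with made-up coordinates that mirror the pattern
# described for JJ3: one FST on Bd1, two on Bd2 that lie 284 bases apart,
# and a third Bd2 FST placed far away for the sake of the example.
example = [
    ("X1.0", "X1", "Bd1", 1_500_000),
    ("X1.1", "X1", "Bd2", 2_000_000),
    ("X1.2", "X1", "Bd2", 2_000_284),
    ("X1.3", "X1", "Bd2", 9_000_000),
]
sites = assign_insertion_sites(example)
print(sites)
```

In this example the four FSTs collapse into three ISs, and the site containing two FSTs is one that would count as "supported by more than one FST".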

Distribution of ISs within the Brachypodium genome
The distribution of ISs across the five Brachypodium chromosomes was analyzed by plotting the number of insertions within 500 kb windows moving across the length of each chromosome starting from the beginning of the short arm to the end of the long arm (Fig. 5). Insertions span the entire length of all five chromosomes. For chromosomes 1, 2, 3, and 5, the number of insertion sites detected per chromosome is generally proportional to the chromosome length and the distribution ranges from 19.2 to 20.9 ISs/Mb with an average of 20.2 ISs/Mb (Table 7). Fewer ISs were detected on chromosome 4 relative to its size (15.9 ISs/Mb) than on the other chromosomes, which brings the average over the entire genome to 19.3 ISs/Mb. Overall, the number of genes/Mb is directly proportional to the length of the chromosome. Thus, the lower number of ISs/Mb on chromosome 4 may be partly attributable to the lower number of genes/Mb on this chromosome (Table 7). A non-uniform distribution of ISs similar to that reported for T-DNA insertions in rice [57][58][59], Arabidopsis [60], and the BrachyTAG [48] collection is also observed in the WRRC population. In general, the distal ends of the chromosomes have a greater density of insertions, and fewer insertions are detected near the centromeric regions (Fig. 5). The Brachypodium v1.0 annotation [8] was used to plot the number of genes present in the same 500 kb windows used to visualize the distribution of ISs. Higher numbers of ISs correlate well with the regions of higher gene density (Fig. 5).
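A windowed count of this kind can be sketched as follows. The sketch assumes consecutive, non-overlapping 500 kb windows and 0-based coordinates, which the text does not specify; the function names and the example chromosome are hypothetical.

```python
WINDOW = 500_000  # bases, the window size used in the text


def window_counts(positions, chrom_length, window=WINDOW):
    """Count insertion sites in consecutive, non-overlapping windows.

    positions: 0-based IS coordinates on one chromosome (assumption).
    Returns one count per window, from the start of the short arm onward.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < chrom_length:
            raise ValueError(f"position {pos} lies outside the chromosome")
        counts[pos // window] += 1
    return counts


def density_per_mb(n_sites, chrom_length):
    """Insertion sites per megabase for a chromosome of the given length."""
    return n_sites / (chrom_length / 1_000_000)


# Hypothetical 2.2 Mb chromosome with five made-up insertion sites.
toy_positions = [10_000, 499_999, 500_000, 1_900_000, 2_150_000]
toy_length = 2_200_000
print(window_counts(toy_positions, toy_length))
print(f"{density_per_mb(len(toy_positions), toy_length):.2f} ISs/Mb")
```

The same function applied to gene start coordinates would give the per-window gene counts needed to compare IS density with gene density.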

Distributions of FSTs in genic and intergenic regions
To analyze the T-DNA distribution between genic and intergenic regions (Fig. 6 and Table 8), we compared the assigned IS loci to the v1.0 annotation reported by the International Brachypodium Initiative [8]. The report identified 25,532 protein coding genes in the Brachypodium genome with an average gene length of 3,336 bases, including exons, introns, and UTRs. This represents 31.3% of the 272 Mb genome. In our population, 28.4% of the ISs (1,499) reside within genes, a value slightly lower than the percentage of the genome assigned to genes. To more precisely describe the ISs we further categorized ...

Online resources for accessing the USDA-ARS Western Regional Research Center (WRRC) T-DNA collection
The goal of this project was to add to the growing collection of genomic resources available for Brachypodium by creating a large collection of sequence-indexed T-DNA lines. Mutant lines are available to interested researchers through a link from the Brachypodium resource page of the Genomics and Gene Discovery Research Unit at the USDA-ARS, WRRC (http://brachypodium.pw.usda.gov/). The WRRC T-DNA website (http://brachypodium.pw.usda.gov/TDNA/) includes a GBrowse window for visualization of FSTs in the context of adjacent genomic features and a window for BLAST searches against the regions adjacent to T-DNA insertion sites. In addition, an Excel table listing FST details (Table S2) and a FASTA file containing the genomic regions flanking T-DNA ISs (File S1) are available for download. Instructions for ordering lines are provided. In addition, the WRRC T-DNA collection can be viewed as a track in the http://www.brachypodium.org/ GBrowse window.

Table 7. Chromosome distribution of FSTs, ISs, and genes.

Discussion
By modifying our transformation protocol we were able to significantly increase transformation efficiency and reduce the length of time necessary to generate transgenic plants. Specifically, we eliminated the recovery step where callus was placed on callus inducing media without hygromycin for a week after cocultivation. This reduced the transformation time by one week and eliminated the labor and materials required for one transfer. In studies evaluating different transformation vectors, we found hygromycin selection to be more efficient than BASTA for the production of transgenic plants. After co-cultivation, callus selected using BASTA required two subculture steps prior to regeneration, whereas higher transformation efficiencies were achieved from hygromycin-selected callus transferred to regeneration media after only one subculture. Our vector comparisons also demonstrated that the promoter driving the selectable marker greatly affected transformation efficiency with the maize ubiquitin promoter demonstrating the highest efficiency among the promoters tested. Using our optimized protocol we achieved an average transformation efficiency of 42%. Significantly, this high efficiency was achieved in a production setting where calluses were transferred and discarded on a set timetable to minimize labor and space required per transgenic line produced. These improvements provide considerable time and cost savings when transformations are conducted on a large scale.
Using our optimized transformation method, we generated 8,491 fertile T-DNA lines making the WRRC collection the largest collection of T-DNA lines in any grass with the exception of rice. To increase the utility of this collection, we used inverse PCR to sequence the DNA flanking the insertion sites. Our initial experiments focused on optimizing the IPCR method. We found 20-26% higher FST recovery when we performed sequencing reactions directly from the T-DNA into the genomic sequence compared to reactions from the genomic restriction site back toward the T-DNA. Adding sequencing reactions generated from a second enzyme digestion near the RB increased recovery of FSTs by approximately 25%, and we increased FST recovery by 20% when we included reactions from the LB. However, adding a third enzyme at the RB only produced a marginal increase in FST identification, and vectors with two LBs did not improve FST recovery.
In total, we identified 5,285 specific insertion sites in the Brachypodium genome. We successfully identified T-DNA insertions in 62% of the lines that we sequenced and found an average of 1.2 insertion sites per line. These sites represent 1,499 insertions in genes and another 2,203 insertions within 1 kb of a gene. This latter class of insertions may alter gene expression if they lie within regulatory regions or if the T-DNA is an activation tagging construct. The WRRC collection comprises mutants with insertions in or near (within 1 kb) 8.8% (2,245) of the annotated Brachypodium genes and represents a significant addition to existing collections of available Brachypodium insertional mutants. However, the number of lines is far from saturation. The formula $P = 1 - (1 - x/g)^{n}$ [61] can be used to calculate the probability (P) of finding an insertion in a particular gene, where g is the genome size in kb, x is the average gene length in kb, and n is the number of T-DNA insertions. Applying this formula to Brachypodium, assuming our current 62% efficiency of retrieving a useful FST from a line and an average of 1.2 T-DNA insertions detected per line, we calculate that a collection of 76,000 lines would have a 50% probability of containing an insertion in any average 3.3 kb Brachypodium gene (Fig. 7). To reach P = 0.95, a collection on the order of 329,000 lines would be necessary. Due to redundancy of hits, the first T-DNA insertions will hit the highest diversity of genes. Thus, even the smaller P = 0.5 collection would have great utility. These are large, yet not unrealistic, numbers. This estimation assumes random integration, but because T-DNA insertions are preferentially identified near genes, fewer lines should be needed to reach saturation. Furthermore, the utility of the existing collections can be increased through improvements in detection of insertions missed in the first sequencing attempts (Fig. 7).
Current efforts are focused on identifying insertion sites in the 38% of the lines of the WRRC collection lacking FSTs and increasing the average number of T-DNA insertions detected to approach the expected number of 2 per line (estimated at 9,000 additional insertion sites).
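The saturation estimate discussed above can be reproduced approximately by solving the formula from [61] for n and converting insertions to lines. The sketch below is in Python and rests on one assumption that the text does not spell out: the number of lines is taken as n divided by the product of the FST recovery rate (0.62) and the insertion sites detected per line (1.2). The function names are illustrative.

```python
import math

GENOME_KB = 272_000    # 272 Mb genome, from the text
GENE_KB = 3.336        # average gene length of 3,336 bases, from the text
FST_SUCCESS = 0.62     # fraction of lines yielding a useful FST
SITES_PER_LINE = 1.2   # insertion sites detected per line with FSTs


def insertions_needed(p, x=GENE_KB, g=GENOME_KB):
    """Solve P = 1 - (1 - x/g)**n for n, the number of random insertions."""
    return math.log(1.0 - p) / math.log(1.0 - x / g)


def lines_needed(p, success=FST_SUCCESS, per_line=SITES_PER_LINE):
    """Lines required, assuming each line yields success * per_line sites."""
    return insertions_needed(p) / (success * per_line)


for p in (0.5, 0.95):
    print(f"P = {p}: about {insertions_needed(p):,.0f} insertions, "
          f"about {lines_needed(p):,.0f} lines")
```

Under these assumptions the sketch gives values close to the 76,000 and 329,000 lines quoted above; small differences would be expected from rounding of the gene length, recovery rate, and insertions per line.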
The insertion sites are accessible through the project website (http://brachypodium.pw.usda.gov/), where they can be searched by BLAST, viewed in GBrowse, or downloaded as a table. Lines are freely available to anyone in the scientific community. The WRRC T-DNA collection represents a significant and growing resource for plant science research.