Fig 1.
Core Chado database schema relied upon by Breedbase.
Modifications that allow genotyping data to be stored non-relationally within the Chado relational schema are highlighted in red [17].
Table 1.
Descriptions of the JSON objects stored in the nd_protocolprop table.
Table 2.
Description of the JSON object stored in the genotypeprop table.
Fig 2.
The search wizard is the primary means of querying Breedbase and supports downloading phenotypic and genotypic records in several formats.
The search consists of four query categories, (1) to (4), which can filter across every kind of data object in the database. In this example, traits were selected first (1), and ‘grain moisture’, ‘grain yield’, and ‘plant height’ were chosen. Next, accessions were selected (2), and of the 1,404 accessions that met the selected trait criteria, 8 were chosen. Then, trials were selected (3), and of the 5 field trials that met the selected trait and accession criteria, 4 were chosen. Finally, locations were selected (4), and the two locations that met the previous criteria were chosen. A genotyping protocol can be selected as a filter in (1) to (4); when none is explicitly selected, a default genotyping protocol is used. Clicking “Related Genotype Data” opens a dialog for filtering genotype data for the selected accessions by chromosome, start position, and end position prior to downloading in VCF or Dosage Matrix format (5). Additionally, a marker set can be selected to filter the downloaded genotypes further. By checking the “Compute from Parents” checkbox for (5), (6), or (7), genotypes can be computed from the parents in the pedigrees of the selected accessions, provided the parents were genotyped. The genomic relationship matrix (GRM) can be downloaded (6) for the selected accessions after filtering for minor allele frequency (MAF) and missing data. Three formats are available for downloading the GRM: a tab-separated matrix format (.tsv), a three-column format (.tsv), and a heatmap figure (.pdf). A GWAS can be computed by selecting accessions and traits in (1) to (4), and the results can be downloaded (7) as Manhattan and QQ plot figures (.pdf) or as a tabular file of the p-values (.tsv). Clicking “Related Trial Phenotypes” opens a dialog for filtering phenotypes by minimum and maximum values prior to downloading phenotypic data in CSV or Excel format (8).
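The GRM download in (6) filters markers by MAF and missing data before the matrix is built and offers a three-column export. The Python sketch below illustrates that sequence on a toy dosage matrix. It is not Breedbase's implementation. The caption does not state which estimator is used, so the sketch assumes VanRaden's first method with missing calls imputed to the marker mean. The function names, the threshold values, and the choice to emit only the upper triangle in the three-column output are likewise hypothetical.

```python
from typing import List, Optional, Tuple

# Dosage = copies of the alternate allele (0, 1, or 2); None marks a missing call.
Dosage = Optional[int]


def filter_markers(dosages: List[List[Dosage]], min_maf: float,
                   max_missing: float) -> List[int]:
    """Return indices of marker columns passing the MAF and missing-data thresholds."""
    n_samples = len(dosages)
    n_markers = len(dosages[0]) if dosages else 0
    kept = []
    for j in range(n_markers):
        calls = [row[j] for row in dosages if row[j] is not None]
        missing_rate = 1 - len(calls) / n_samples
        if not calls or missing_rate > max_missing:
            continue
        alt_freq = sum(calls) / (2 * len(calls))
        if min(alt_freq, 1 - alt_freq) >= min_maf:
            kept.append(j)
    return kept


def compute_grm(dosages: List[List[Dosage]], kept: List[int]) -> List[List[float]]:
    """GRM as Z Z' / (2 * sum(p * (1 - p))) (assumed estimator: VanRaden method 1)."""
    freqs = []
    for j in kept:
        calls = [row[j] for row in dosages if row[j] is not None]
        freqs.append(sum(calls) / (2 * len(calls)))
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("no polymorphic markers left after filtering")
    # Center each dosage by 2p; a missing call becomes 0, i.e. mean imputation.
    centered = [
        [0.0 if row[j] is None else row[j] - 2 * p for j, p in zip(kept, freqs)]
        for row in dosages
    ]
    n = len(centered)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


def to_three_column(names: List[str],
                    grm: List[List[float]]) -> List[Tuple[str, str, float]]:
    """Flatten the upper triangle (diagonal included) into (row, column, value) records."""
    n = len(names)
    return [(names[i], names[k], grm[i][k]) for i in range(n) for k in range(i, n)]


# Toy data: three hypothetical accessions scored at four markers.
names = ["A", "B", "C"]
dosages = [
    [0, 2, 1, 0],
    [2, 0, 1, 0],
    [1, 1, None, 0],
]
kept = filter_markers(dosages, min_maf=0.05, max_missing=0.2)
grm = compute_grm(dosages, kept)
rows = to_three_column(names, grm)
```

In this toy example the third marker is dropped for its missing-data rate and the fourth for being monomorphic, so only the first two markers contribute to the matrix. Writing each record in `rows` as a tab-separated line would give a three-column file, while writing `grm` row by row would give the matrix form.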
Table 3.
Results for non-cached query performance.
Table 4.
Results for repeated query performance from the file cache.