Fig 1.

Schematic overview of the study design.

(a) Soybean plants harboring Rps1k were inoculated with a Race 1 P. sojae isolate, a Race 25 isolate, or sterile media. Inoculated hypocotyls were used for RNA-seq. Capture-seq was subsequently performed to validate the RNA-seq data. (b) Overrepresented TF families were identified from the RNA-seq analysis. DAP-seq data were generated or obtained for the families most represented by total abundance and by percentage of genome-wide proportion. (c) DL models were trained using DAP-seq binding site data. The capacity of some models to generalize across a given TF family was tested intra- and interspecifically. For several TF families of interest, soybean- or Arabidopsis-based DNNs were trained and used to predict TFBS. (d) DNN predictions were overlapped with FIMO motif scans, and the high-confidence targets were used to construct a GRN.
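The overlap step in panel (d) can be pictured as a set intersection over TF-target pairs. The following is only a minimal sketch under that assumption; the function name and all gene and TF identifiers are hypothetical placeholders, and the study's actual overlap criteria (for example, coordinate-based matching of binding sites) may differ.

```python
def high_confidence_targets(dnn_hits, fimo_hits):
    """Return the (TF, target gene) pairs reported by both methods."""
    return set(dnn_hits) & set(fimo_hits)


# Hypothetical predictions; these are not results from the study.
dnn_hits = {("WRKY30", "geneA"), ("WRKY30", "geneB"), ("RAV", "geneC")}
fimo_hits = {("WRKY30", "geneA"), ("RAV", "geneC"), ("RAV", "geneD")}

# Each surviving pair would become one TF -> target edge in the GRN.
edges = sorted(high_confidence_targets(dnn_hits, fimo_hits))
print(edges)
```

Requiring agreement between the two methods trades sensitivity for confidence: a pair reported by only one of them is dropped.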


Fig 2.

Pathogenicity testing and transcriptome analysis.

(a) Disease development in Race 25- (top) and Race 1-treated (bottom) hypocotyls at seven days post-infection. (b) Venn diagram of DEGs between different treatments. (c) TF representation among DEGs from RNA-seq. WRKY was the most represented TF family by total abundance and RAV by the percentage of genome-wide proportion. (d) K-means clustering of DEGs. DEGs were assigned to nine co-expression clusters. Of these, seven displayed increased expression (log2FC [FC] > 0) in infected vs Mock treatments, while two displayed decreased expression (FC < 0). (e) Functional enrichment and TF representation for gene co-expression clusters. (left panel) Top five GO categories by adjusted p-value (≤0.05; data available in S7 Data). (middle panel) Top five KEGG terms by adjusted p-value (≤0.05). (right panel) Top three TF families by abundance for each cluster.
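The two rankings in panel (c) can disagree because a small family can be almost entirely differentially expressed. The sketch below assumes that "percentage of genome-wide proportion" means the share of a family's genome-wide members found among the DEGs; that reading, the function name, and every count are illustrative assumptions, not values from the study.

```python
def family_representation(deg_counts, genome_counts):
    """Return (family, DEG count, percent of genome-wide family) tuples."""
    rows = []
    for family, n_deg in deg_counts.items():
        pct = 100.0 * n_deg / genome_counts[family]
        rows.append((family, n_deg, pct))
    return rows


# Made-up counts chosen only to show how the two rankings can differ.
deg_counts = {"WRKY": 60, "MYB": 50, "RAV": 4}
genome_counts = {"WRKY": 180, "MYB": 400, "RAV": 5}

rows = family_representation(deg_counts, genome_counts)
top_by_abundance = max(rows, key=lambda r: r[1])[0]   # largest DEG count
top_by_percentage = max(rows, key=lambda r: r[2])[0]  # largest share of family
print(top_by_abundance, top_by_percentage)
```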


Fig 3.

DAP-seq identification of GmWRKY30 and GmRAV TFBS.

(a) Distribution of DAP-seq peaks across genomic features. (b) Distance of peaks from the TSS.


Fig 4.

Schematic illustration of CRNN architecture and TFBS prediction.


Fig 5.

Generalization testing for soybean and Arabidopsis CRNNs and schematic illustration of TFBS prediction for defense-related TF families.

(a) The GmWRKY30 CRNN was used to predict TFBS interspecifically with AtWRKY30 DAP-seq data (left barplot) and intraspecifically with GmWRKY2 AmpDAP-seq data (right barplot). (b) AtWRKY, AtMYB, and AtNAC CRNNs were trained with available DAP-seq data and used to predict binding sites for other members of their respective families. (c) The Arabidopsis-based models, along with the GmWRKY30 and GmRAV models, were used to predict TFBS in our DEG set. These predictions were overlaid with FIMO scans to elucidate TF-target gene interactions.


Table 1.

Summary of GmWRKY and GmRAV CRNN models.

TPR: True Positive Rate; TNR: True Negative Rate; FPR: False Positive Rate; FNR: False Negative Rate.
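For readers less familiar with these abbreviations, the four rates follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the tables.

```python
def confusion_rates(tp, fp, tn, fn):
    """Compute the four rates from true/false positive/negative counts."""
    return {
        "TPR": tp / (tp + fn),  # bound sites correctly predicted as bound
        "TNR": tn / (tn + fp),  # unbound sites correctly predicted as unbound
        "FPR": fp / (fp + tn),  # unbound sites wrongly predicted as bound
        "FNR": fn / (fn + tp),  # bound sites wrongly predicted as unbound
    }


# Hypothetical counts for illustration only.
rates = confusion_rates(tp=90, fp=20, tn=80, fn=10)
print(rates)
```

By construction TPR + FNR = 1 and TNR + FPR = 1, so each pair carries the same information.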


Table 2.

Summary of Arabidopsis data-based CRNN models used for generalization tests.

TPR: True Positive Rate; TNR: True Negative Rate; FPR: False Positive Rate; FNR: False Negative Rate.


Table 3.

Summary of Arabidopsis data-based CRNN models used for cross-species predictions.

TPR: True Positive Rate; TNR: True Negative Rate; FPR: False Positive Rate; FNR: False Negative Rate.


Table 4.

Arabidopsis-to-soybean cross-species prediction results.

TPR: True Positive Rate; TNR: True Negative Rate; FPR: False Positive Rate; FNR: False Negative Rate.


Fig 6.

GRN inference at 24 hpi.

(a) (left) log2FC (FC) of DEGs across interaction types, (middle) WRKY and RAV binding site representation in the DEG set derived from DAP-seq, and (right) binding site representation for each TF family in the DEG set derived from CRNN + FIMO prediction. The bar plot shows the total number of target genes for each family, as well as the number of TF-encoding target genes (blue). (b) Hairball of the global GRN. Nodes and edges represent TFs and target genes, respectively. Node size corresponds to outdegree. (c) Scatterplot of the top co-occurring TF pairs by cosine association score identified with TF-COMB. The datapoint color reflects the total number of shared targets for a given TF pair. (d) Prioritization of nodes. Nodes that were statistically enriched by Simple Enrichment Analysis and were represented in the transcriptome analysis (n = 118) were prioritized by outdegree, cumulative indegree, cumulative cosine, and mean |log2FC| (Mean |FC|). Blue polygons represent the upper quartile for each parameter. Thirteen genes/14 TFs were in the upper quartile for all four parameters. (e) Hairball of the hub nodes. Node size corresponds to outdegree.
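The prioritization in panel (d) amounts to keeping nodes that reach the upper quartile on every parameter at once. The sketch below is one way to express that filter; the node names, metric keys, values, the inclusive quartile method, and the use of "greater than or equal to" as the cutoff are all assumptions made for illustration, not the study's procedure.

```python
from statistics import quantiles

METRICS = ("outdegree", "cum_indegree", "cum_cosine", "mean_abs_fc")


def hub_nodes(nodes):
    """Return nodes at or above the third quartile for every metric."""
    cutoffs = {}
    for m in METRICS:
        values = [attrs[m] for attrs in nodes.values()]
        # quantiles(..., n=4) returns [Q1, Q2, Q3]; index 2 is the third quartile.
        cutoffs[m] = quantiles(values, n=4, method="inclusive")[2]
    return sorted(
        name for name, attrs in nodes.items()
        if all(attrs[m] >= cutoffs[m] for m in METRICS)
    )


# Hypothetical nodes: (outdegree, cum_indegree, cum_cosine, mean_abs_fc).
raw = {
    "n1": (80, 40, 8.0, 4.0), "n2": (70, 35, 7.0, 3.5),
    "n3": (60, 5, 6.0, 0.5),  "n4": (10, 30, 1.0, 3.0),
    "n5": (9, 4, 0.9, 0.4),   "n6": (8, 3, 0.8, 0.3),
    "n7": (7, 2, 0.7, 0.2),   "n8": (6, 1, 0.6, 0.1),
}
nodes = {name: dict(zip(METRICS, vals)) for name, vals in raw.items()}

hubs = hub_nodes(nodes)
print(hubs)
```

Requiring all four criteria simultaneously is deliberately strict: a node with very high outdegree but weak expression change (like the hypothetical `n3`) is excluded.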
