Fig 1.
Overview graph of proteomics pipeline.
The LC-MS/MS proteomics workflow is illustrated with ovals representing the key steps and arrows connecting them.
Fig 2.
Venn diagram showing the distribution of unique and common non-redundant proteins among the four biological replicates.
The numbers in the overlapping regions give the proteins that were expressed in at least three biological replicates with 2 unique peptides. Proteins within the red outline are those present in all biological replicates (Pool 1 to 4). The identities of the expressed proteins are detailed in S4 Table in S1 File.
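The replicate-overlap filter described in this caption can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `proteins_in_at_least` and the protein identifiers are hypothetical placeholders, and the 2-unique-peptide criterion is assumed to have been applied before the sets are built.

```python
from collections import Counter


def proteins_in_at_least(replicates, min_replicates=3):
    """Return the proteins seen in at least `min_replicates` replicate sets.

    `replicates` is a list of sets of protein identifiers, one set per pool.
    """
    # Count, for each protein, how many replicate sets contain it.
    counts = Counter(protein for rep in replicates for protein in set(rep))
    return {protein for protein, n in counts.items() if n >= min_replicates}


# Placeholder identifiers for four pools (not real data from the study).
pools = [
    {"P1", "P2", "P3"},
    {"P1", "P2"},
    {"P1", "P3"},
    {"P1", "P2", "P4"},
]

in_three_or_more = proteins_in_at_least(pools, min_replicates=3)
in_all_four = set.intersection(*pools)  # analogous to the red-outlined region
```

Setting `min_replicates` to the number of pools gives the same result as the plain set intersection.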
Fig 3.
Distribution of proteins identified from Brassica rapa R-o-18 seed extract according to GO cellular component annotations.
The proteins identified from Brassica rapa R-o-18 seed extract using LC-MS/MS were categorized based on Gene Ontology (GO) annotation as described in Material and Methods. Ten ‘cellular component’ categories were assigned to 219 proteins with 14 unknown proteins remaining uncategorized.
Fig 4.
Distribution of proteins identified from Brassica rapa R-o-18 seed extract according to GO molecular function annotations.
The proteins identified from B. rapa R-o-18 seed extract using LC-MS/MS were categorized based on Gene Ontology (GO) annotation as described in Material and Methods. Seventeen ‘molecular function’ categories were assigned to 226 proteins with 7 unknown proteins remaining uncategorized.
Fig 5.
Gene Ontology (GO) term enrichment analysis of identified Brassica rapa seed proteins by DAVID.
Gene Ontology (GO) term enrichment analysis was carried out using the hypergeometric method with Benjamini false discovery rate (FDR) correction. MF, CC and BP represent molecular functions, cellular components and biological processes, respectively. Rich factor is the ratio of the number of identified proteins annotated in the given GO term pathway to the number of all proteins from the database annotated in the pathway.
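The quantities named in this caption can be illustrated with a short sketch. It is not DAVID's own implementation: the function names are hypothetical, the one-sided hypergeometric tail is assumed as the enrichment test, and the Benjamini-Hochberg step-up procedure is assumed as the FDR correction.

```python
from math import comb


def rich_factor(identified_in_term, background_in_term):
    """Identified proteins annotated to a term / all database proteins annotated to it."""
    return identified_in_term / background_in_term


def hypergeom_pvalue(k, n, K, N):
    """One-sided P(X >= k): n identified proteins drawn from N background
    proteins, of which K carry the GO term and k of the identified ones do."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For example, a term with 5 identified proteins out of 20 annotated in the database has a rich factor of 0.25, independent of its p-value.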
Fig 6.
Multiple sequence alignment of the 7S globulin-like vicilin protein sequences identified in Brassica rapa.
7S globulin-like vicilin protein sequences from B. rapa R-o-18 seed proteome were aligned with sequences from Arabidopsis thaliana, Cannabis sativa, cashew (Anacardium occidentale), narrow-leaved blue lupine (Lupinus angustifolius), white lupine (Lupinus albus), sesame (Sesamum indicum), pistachio (Pistacia vera), European hazel (Corylus avellana) and peanut (Arachis hypogaea) available in public domain databases. Ep = epitopes. The orange boxes delineate known Ara h 1 epitopes [50], dark blue boxes outline known Ana o 1 epitopes [52], grey boxes delineate known olive vicilin T-cell epitopes abbreviated as OVTE and yellow boxes bound the known olive vicilin B-cell epitopes abbreviated as OVBE [51]. All the epitopes are listed in Table 1 together with the corresponding sequences identified in the vicilin-like proteins of B. rapa. pid = percent sequence identity. Blue boxes show the location of the conserved proline and glycine residues.
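As an illustration of the pid value, percent sequence identity between two rows of an alignment can be computed as sketched below. The denominator convention varies between tools; this sketch assumes that columns with a gap in either sequence are skipped, which may differ from the convention used to produce the figure. The function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are excluded (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

With the toy rows `"ACD-F"` and `"ACE-F"`, four columns are compared and three match, giving 75.0.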
Fig 7.
Three-dimensional structure of the vicilin-like seed storage proteins identified in Brassica rapa.
Structural modelling was carried out using the SWISS-MODEL program and was based on the available crystal structures of vicilin (S6 Table in S1 File). The green shading indicates the location of β-sheets in the N-terminal region and the blue shading indicates α-helices at the C-terminus.
Fig 8.
Phylogenetic tree of vicilin-like proteins.
The phylogenetic tree was constructed using the maximum likelihood method in the NGPhylogeny.fr web resource (https://ngphylogeny.fr/workflows/oneclick/, 30/05/2020). The tree was constructed using the five 7S globulin-like vicilin protein sequences identified in Brassica rapa R-o-18, as well as 7S globulin-like vicilin protein sequences reported in different plant species including Arabidopsis thaliana (At3g22640), Cannabis sativa (Cannabis_vicilin), cashew (Anacardium occidentale) (ANA_O-1 and ANA_O_2), narrow-leaved blue lupine (Lupinus angustifolius) (CONB1_LUP), white lupine (Lupinus albus) (CONB2_LUP), sesame (Sesamum indicum) (SESIN), Korean pine (Pinus koraiensis) (Pin_k_2_0101), pistachio (Pistacia vera) (PISVE), European hazel (Corylus avellana) (CORAV) and peanut (Arachis hypogaea) (ARAHY_Ara_h_1), all available in public domain databases. The full-length sequences of the identified vicilins are presented in S5 Table in S1 File, while the other sequences are presented in S1 Table in S1 File. The scale above the tree represents the in-built distance matrix employed in the NGPhylogeny.fr web resource and indicates the distance based on the sequence similarity of their features.
Table 1.
Epitope mapping of Brassica rapa vicilin sequences with the twenty-three reported epitopes of the well-studied peanut vicilin allergen Ara h 1.
Fig 9.
Phylogenetic analysis of identified Brassica rapa oleosin and oil body associated proteins.
The linear phylogram was built using the PRESTO (Phylogenetic tReE viSualisaTiOn) server (https://ngphylogeny.fr/data/displaytree//, 29/09/2020) and shows the relationship to the Arabidopsis oleosin genes (S7 Table in S1 File). The branch length scale at the top of the figure indicates the distance between the sequences based on the sequence similarity of their features. The identified oleosin proteins Bra019493, Bra000167, A0A078GHK4, M4E9X1, A0A397KW15, A0A397ZZI8, M4DBK6, M4EI43, Bra035756, Bra032113 and oil-body associated proteins Bra036039 and Bra036039 are designated as OLE01, OLE02, OLE03, OLE04, OLE05, OLE06, OLE07, OLE08, OLE09, OLE10, OBAP 01 and OBAP 02, respectively.