Fig 1.
Ectoderm microdissections and experimental strategy.
(A) Xenopus laevis neurula-stage embryos were dissected at Nieuwkoop and Faber stage (St.) 12.5 and St. 14 according to the patterns indicated. Live embryos are shown in dorsal view. The superficial ectoderm was separated from the underlying mesoderm germ layer. The tissues collected corresponded to the developing neural plate (NP, blue), medial and lateral neural border (NB, red-pink-beige), anterior neural border and preplacodal ectoderm (ANB/PPE or ANF/PPE, violet), and nonneural ectoderm (NNE, green, yellow-brown). NP, NB (at St. 12.5 and St. 14), and NNE (at St. 14 only) were further subdivided into 2 or 3 pieces: anterior (a, darker color) and posterior (p, lighter color) fragments, as indicated. For St. 12.5 NB only, a more lateral fragment was cut (l, beige), (lateral refers to medial-lateral coordinates on the dorsal view or corresponds to more ventral if using dorsal-ventral whole embryo coordinates). Each sample is associated to 1 color throughout this study (color code reference in S1 Table). (B) This dissection pattern (dotted lines) matches with reference genes expression patterns in the developing neurula ectoderm as indicated for pax3 (dorsal-lateral NP and NB), sox2 (NP and St. 14 PPE), and snail2 (premigratory neural crest within the NB). The dissection line is indicated outside the embryo whenever the dissection encompasses tissues located more ventrally than seen in dorsal view (e.g., for PPE and NNE). Similarly, the blastopore (white circle) can be indicated shifted outside the embryo if located more ventrally. (C) Whole embryos at each stage of neurulation were used for studying the time course of gene expression in neurulas. Drawing after living embryos. (D) The main steps of sample treatments and bioinformatics analysis are indicated. After total RNA purification from each individual dissected sample or whole embryo, quality was checked using quantitative PCR. Suitable samples (see text) were processed for deep transcriptome sequencing. Details of analysis workflow are indicated in Results and Materials and methods.
Fig 2.
Unsupervised analysis of variance between samples using principal component analysis (PCA) retrieves biologically meaningful components, matching embryo polarity axes, neural border (NB) formation, and developmental stage.
(A, B) At Nieuwkoop and Faber stage (St.) 12.5 (A, circles) and St. 14 (B, triangles), the first 2 PCA components segregated ectodermal samples according to their position along the dorsal-ventral (D-V) and anterior-posterior (A-P) embryonic axes, respectively. Percent of total gene expression variance captured by each component is indicated between parentheses. Sample color code is identical to Fig 1A. Lighter background colors are Voronoi diagram areas. These areas are drawn using the barycenter of samples belonging to the same dissected region. Each area indicates the regions of the PCA plane, where all points are closer to the barycenter of this dissected region than to any other barycenter. Arrows indicate anterior neural border samples falling slightly outside their intended dissected region (see text for details). (C, D) At St. 12.5 (C) and St. 14 (D), t-distributed stochastic neighbor embedding (tSNE) groups samples according to dissected regions. (E, F) At St. 12.5 (E) and St. 14 (F), neural border samples are located higher than most others along the third PCA component. This indicates that component 3 correlates with the acquisition of the NB characteristics during neurulation. (G, H) When combining St. 12.5 (circles) and St. 14 (triangles) samples together, the first component remains correlated to the D-V axis. The second component sorts out most samples according to their stage, irrespective of their position along the D-V axis (triangles separated from circles, (G). The first and third PCA components segregate most samples according to their position along the D-V and A-P axes, irrespective of their stage (circles and triangles of same color together, [H], Voronoi diagram). The posterior neural border (NBp), which is known to resemble the posterior neural plate (NPp) at St. 12.5, grouped with the NPp, while the anterior neural border (NBa)–St. 12.5, NBa–St. 14, and the lateral neural border (NBl)–St. 12.5 belong to the same Voronoi cell. As noticed above, NBa–St. 14 is more heterogeneous (red triangles).
Fig 3.
Nonnegative matrix factorization (NMF) efficiently deconvolutes differential expression profiles.
The NMF algorithm defines a set of ectoderm tissues (NMF-tissues) at Nieuwkoop and Faber stage (St.) 12.5 (A) and St. 14 (B). Columns in the mixing matrix indicate how these tissues contribute to the initial dissected samples. Red indicates high contribution; blue indicates low/no contribution. Interesting intermediate contributions are outlined by white dotted lines; stars (*) refer to comments in text. From these contributions, dissected samples (left part of the pattern drawn on embryo) were matched with 1 of the NMF-tissues (right part of the pattern). At St. 12.5, striped pattern indicated the mixed contribution of posterior neural plate and neural border (NMF-NP(p), NMF-NB(p)) to NBp samples. Abbreviations and colors are the same as in Fig 1 and S1 Table.
Fig 4.
Nonnegative matrix factorization (NMF) predicts accurate expression patterns for neural border (NB) genes.
In order to represent gene expression levels on the map of the developing ectoderm, we used 2 approaches: averaging the expression values for each of 7 domains or using NMF deconvolution predictions in 5 domains. We found those 2 approaches complementary. Average patterns directly describe raw data but are sensitive to shifts in dissection. NMF deconvolution predicted expression levels in each NMF-tissue. In the case of the anterior neural border (NBa), the NMF pattern is closer to in situ hybridization patterns than average patterns, e.g., for known genes expressed in various ectoderm areas, sox2, snail2, pax3, and foxi1e (also see Fig 8, S4 and S7 Figs for more examples). Moreover, these patterns predict expression for genes of unknown pattern and with low expression level, such as sp8. We present the average pattern and NMF-predicted pattern (NMF pattern); each is normalized on a percent scale, 100% being maximal expression of the gene of interest. This implies that genes with low expression will be represented in similar shades as genes expressed at high level, which allows better visualization but could be misleading if the gene is very weakly expressed in all ectoderm (e.g., sp8 at Nieuwkoop and Faber stage [St.] 12.5). To complete pattern prediction with expression level information, we thus also include the average expression levels per dissected sample for each gene (reads per kilobase per million [RPKM] shown for St. 14 here). Robustly expressed genes (>100 RPKM, e.g., sox2, snail2, and pax3 at St. 14) are predicted as well as genes with very low expression (e.g., snail2 at St. 12.5 or sp8 at St. 14). In addition, the color of the dot indicates high ectoderm specificity (green) to low ectoderm enrichment (orange). See text and Materials and methods for details, see S11 Table for numerical data.
Fig 5.
Identification of gene ontology (GO) biological functions enriched in weighted gene co-expression network analysis (WGCNA) groups.
(A) Distribution of Xenopus laevis transcripts in WGCNA groups. The color-filled bars on the histogram indicate the number of unique Entrez geneID records per group. Functions of some developmentally relevant groups are highlighted: the Wnt responding genes in blue (Groups #23 and #41 in blue), neural crest (#51, purple), forebrain (#38, orange), mRNA splicing (#80, pink), and cilium morphogenesis in green (#22, green). Refer to S6 and S6B Table for the full list of functions associated with each group. (B) Comparison between groups generated by WGCNA (blue) and those created by random assignment of transcripts (red). About 45% of the WGCNA groups associate to at least 1 biological function, which is significantly more than the average obtained for the random dataset groups (4%; the 5 random datasets of 141 groups). The distribution of adjusted p-values and sensitivities (1-β) of overrepresented terms show that, overall, WGCNA groups are more likely to contain a relevant biological function than groups assigned randomly. (C) Specificity versus sensitivity plot, in which each of the data points represents a single GO function enriched in a group. Black circles represent all 80 biological functions enriched in the 5 random datasets (total of 705 groups). The Wnt responding genes (Groups #23, blue), neural crest (#51, purple), forebrain (#38, orange) are highlighted. For further description of the function, refer to S6B Table; see S11 Table for numerical data.
Fig 6.
Distribution of Wnt-responding genes and genes involved in neural, neural crest (NC), and placodal development across weighted gene co-expression network analysis (WGCNA) synexpression groups.
The histogram shows how many transcripts were found in each synexpression group and if this represented a significant enrichment compared to uniform distribution in all groups; see S11 Table for numerical data. (A) The Wnt-responsive transcripts identified by Kjolby and Harland [49] were mainly distributed in 4 co-expression groups: 47% of the genes positively regulated by Wnt signaling from the input list were found in groups #23, #31, and #41, while 62% of genes negatively regulated by Wnt were found in group #92. Group #23 contained posterior genes (e.g., cdx1/2/4, hes5/9, ngfr, kremen2); group #31, Wnt-signaling components (xarp, tcf7); group #41, meso-endoderm specifiers (e.g., msgn1, myf5, t, vegt); and group #92, genes up-regulated in absence of Wnt signaling. (B) Placodal and NC genes were enriched in groups #38 (43% of the input list) and #51 (70% of the input list), respectively. Contrarily to NC markers, the neural border genes, many of which are more broadly expressed, were distributed across many different co-expression groups; however, pax3 and myc were part of group #51.
Fig 7.
Genes with predicted expression and functional association to the neural crest (NC).
(A) Validation of the expression pattern of sh3pxd2a at Nieuwkoop and Faber stages (St.) 12.5 and St. 14. Upon prediction of NC expression by nonnegative matrix factorization (NMF), we cloned a probe for novel gene sh3pxd2a and analyzed its in vivo pattern: while expressed at very low levels (B, 25 reads per kilobase per million [RPKM] in the anterior neural plate border [NBa] St. 14), it was readily detected, at low level, in the prospective NC in vivo. (B) Average RPKM expression per tissue dissection highlighting sh3pxd2a ectodermal enrichment in the NBa. Expression level for each homeologous gene copy is indicated. (C) Expression pattern predictions for 2 novel genes grouped in G-51, a group associated with NC development: cacng, cdon. Egr4 was recently shown expressed in cranial NC [35]. Mean expression of both homeologous copies is indicated for each dissected region. (D) Temporal expression during neurulation (mean expression levels in whole embryos) for each homeologous copy of cacng4, cdon, and egr4. See S11 Table for numerical data.
Fig 8.
A few Xenopus laevis homeologous gene pairs display spatial subfunctionalization.
We have studied how homeologous genes are expressed in the whole dataset (using weighted gene co-expression network analysis [WGCNA]) or in the dissected regions. We found that most pairs with differential expression exhibit asymmetrical decrease of 1 copy (e.g., egr4, sh3pxd2a, snail2, Fig 7, S11 Fig). We found 5 pairs with spatial differential expression, suggesting spatial subfunctionalizations: at Nieuwkoop and Faber stage (St.) 12.5, alas1, ccdc68, gmnc-like, and hs6st1 and at St. 14, p2ry1. Average and nonnegative matrix factorization (NMF) expression patterns are shown at the relevant stage.
Fig 9.
Gene co-expression network of Xenopus neurula visualized biologically meaningful relationships between weighted gene co-expression network analysis (WGCNA) groups and subsets of genes.
We obtained a gene co-expression network using a large input gene list, including Wnt-related transcripts, neural, neural crest (NC), placode, and mesoderm genes as input nodes (large points, S10 Table). The p-value cutoff <1E-09 generated 2,001 nodes (small points indicate closely correlated genes; the color of the points refers to a given WGCNA group) and 4,945 edges. Circles are drawn to groups of genes with close spatial/temporal expression patterns. The group was named based on known members of the group. Groups are linked to other groups with positive (thin line) or negative (thick line) correlation. As an example, NC genes (groups #51 and #132) and placodes (group #38, in green) display negative correlation of expression with posterior genes (groups #23, #20, thick lines). The network also indicates interactions between ventral genes such as ventx genes (group #138), positively correlated with bmp4 and bmp7.2 (thin lines). When running the EctoMap application, the network is interactive and indicates each gene name upon selection.
Fig 10.
Genes linked to human melanoma are tightly associated with Wnt-associated genes and neural crest (NC) genes during development.
(A) Several (26) genes linked to melanoma super-enhancers defined by Kauffman et al. (2016) were found among the posterior/Wnt genes group (#23) and the NC group (#51). (B) Cumulative frequency of genetic alterations found in these 26 genes in various cancer types: NC-derived cancers are outlined in purple. High mutation frequency (>45%) is found in melanomas. The nature of the genetic alteration is plotted along y-axis. We used 70 studies describing at least 100 tumors of each kind (TCGA). (C) These 26 genes were frequently mutated in cutaneous and uveal melanoma: this diagram displays genetic alterations found for each gene above 5% frequency, in 191 out of 287 cutaneous melanomas and in 24 out of 80 uveal melanomas. See S11 Table for numerical data. Panels (B) and (C) are based on data obtained by the TCGA Research Network: http://cancergenome.nih.gov and visualized by cBioPortal.
Fig 11.
EctoMap: An application to explore gene synexpression during neurulation.
(A) Workflow for the EctoMap application, indicating the input options and the files needed to produce the output plots. (B) Example of outputs (see also S1 Web Archive).