Identifying a developmental transition in honey bees using gene expression data

doi:10.1371/journal.pcbi.1010704

Fig 1.

The method successfully identifies the transition state in simulated data.

As interaction strength α increases in a simple dynamical model of gene regulation (see Eq (19) in the Methods), gene expression levels projected along the first principal component transition from a unimodal to a bimodal distribution (three insets show simulated data as blue histograms and the best-fit distributions as red curves). A statistical comparison between a Gaussian distribution and the distribution shape expected near a continuous transition reliably identifies the transition state once the bimodality is sufficiently pronounced (red and orange points). Increasing the number of data samples allows identifying the transition state when bimodality is less pronounced (compare red and orange points). Error bars indicate estimated standard deviation of the mean given 100 simulations.

Fig 2.

In honey bees, both the variance of gene expression and the strength of evidence for bistability grow over developmental time.

A: The standard deviation of gene expression along the principal component increases during development. Error bars show standard errors (N_samples = 16). B: The Bayesian Information Criterion measure ΔBIC quantifies the strength of evidence in favor of the bistable transition distribution as compared to a unimodal Gaussian, with positive values favoring bistability (blue circles; see Methods Eq (16)). We interpret ΔBIC values larger than 6 (horizontal dashed line) as strong evidence for bistability. We also compute ΔBIC that compares a Gaussian mixture (n = 2) with the unimodal Gaussian, which identifies weaker evidence for bimodality (orange Xs; see section 3).
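The Gaussian-mixture comparison in panel B can be illustrated with a minimal sketch. It assumes the common definition BIC = k ln(n) − 2 ln(L) and takes ΔBIC as the BIC of the unimodal Gaussian minus the BIC of the two-component mixture, so that positive values favor bimodality. The exact normalization in Methods Eq (16) may differ, and the function names and the simple EM fit below are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def gaussian_log_likelihood(x):
    """Maximized log-likelihood of a single Gaussian (2 free parameters)."""
    x = np.asarray(x, dtype=float)
    var = np.var(x)  # maximum-likelihood variance
    return -0.5 * len(x) * (np.log(2 * np.pi * var) + 1.0)

def mixture_log_likelihood(x, n_iter=200):
    """Log-likelihood of a two-component Gaussian mixture fit by EM (5 free parameters)."""
    x = np.asarray(x, dtype=float)
    mu = np.percentile(x, [25, 75]).astype(float)  # initial component means
    var = np.full(2, np.var(x))                    # initial component variances
    weight = np.array([0.5, 0.5])                  # initial mixing weights
    var_floor = 1e-3 * np.var(x)                   # guards against variance collapse

    def component_densities():
        return (weight * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))

    for _ in range(n_iter):
        dens = component_densities()
        resp = dens / dens.sum(axis=1, keepdims=True)  # E step: responsibilities
        nk = resp.sum(axis=0)                          # M step: update parameters
        weight = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = np.maximum((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, var_floor)
    return np.log(component_densities().sum(axis=1)).sum()

def delta_bic(x):
    """BIC(Gaussian) - BIC(mixture); positive values favor the mixture (assumed convention)."""
    n = len(x)
    bic_gaussian = 2 * np.log(n) - 2 * gaussian_log_likelihood(x)
    bic_mixture = 5 * np.log(n) - 2 * mixture_log_likelihood(x)
    return bic_gaussian - bic_mixture

# Synthetic one-dimensional data standing in for projected expression values.
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(-3, 1, 100), rng.normal(3, 1, 100)])
unimodal = rng.normal(0, 1, 200)
print("bimodal ΔBIC:", delta_bic(bimodal))
print("unimodal ΔBIC:", delta_bic(unimodal))
```

Under this convention, a value above the threshold of 6 used in the figure would count as strong evidence against the unimodal Gaussian.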

Fig 3.

Bistability along the first principal component.

Gene expression data from 16 bees at age 10 days and 15 days projected along the first principal component (colored circles; log-transformed data). Colors correspond to distance along this dimension, with orange chosen to represent the low vg state indicative of foragers. The fitted Landau distribution is shown in blue.
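The projection described in this caption can be sketched as follows. This is a minimal illustration that assumes a samples-by-genes matrix of strictly positive expression values and a base-10 logarithm; the caption says only that the data are log-transformed, so the base, the function name, and the synthetic input are assumptions.

```python
import numpy as np

def project_on_first_pc(expression):
    """Project log-transformed expression (samples x genes) onto its first principal component."""
    log_expr = np.log10(expression)              # assumed base; requires positive values
    centered = log_expr - log_expr.mean(axis=0)  # center each gene across samples
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = vt[0]
    return centered @ pc1, pc1

# Synthetic stand-in for 16 bees and 91 genes (not real data).
rng = np.random.default_rng(1)
expression = np.exp(rng.normal(size=(16, 91)))
scores, pc1 = project_on_first_pc(expression)
print(scores.shape, pc1.shape)
```

The sign of a principal component is arbitrary, so which end of the axis corresponds to the forager-like state has to be set by a biological marker such as vg, as the caption does.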

Fig 4.

Individual genes associated with the bistability.

A handful of individual genes or gene products have a large proportion s of their variance along the bistable dimension. Here we highlight all genes with s > 2/3. Orange and purple colors correspond to bees on two sides of the bistability, with orange chosen to represent the low vg state that is indicative of foragers. In the column “up/down reg.”, we indicate whether the gene is up- or down-regulated in the orange state as compared to the purple state. Expression data for these genes are shown with the same colors for individual bees as in Fig 3, along with the marginalized fit distribution in blue (log-transformed data, with scale bar corresponding to an expression ratio of 10).

Table 1.

Genes whose variance in expression is most aligned with the transition dimension.

Here we list the top 20 genes, at age 10 and 15 days, ordered according to s, the fraction of the gene’s variance that lies along the bistable dimension (see Methods Eq (17)).
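One plausible way to compute such a ranking is sketched below. It assumes that s for a gene is the variance of that gene's component along a unit vector (here the first principal component) divided by the gene's total variance. The precise definition is given in Methods Eq (17) and may differ, so the function name and formula should be read as illustrative assumptions.

```python
import numpy as np

def variance_fraction_along(data, direction):
    """Per-gene fraction of variance lying along `direction` (data: samples x genes)."""
    centered = data - data.mean(axis=0)
    unit = direction / np.linalg.norm(direction)
    scores = centered @ unit               # sample coordinates along the direction
    along = np.var(scores) * unit ** 2     # per-gene variance of the rank-one part
    total = np.var(centered, axis=0)       # per-gene total variance
    return along / total

# Synthetic example: gene 0 follows a two-state latent variable, the rest are noise.
rng = np.random.default_rng(2)
state = rng.choice([-1.0, 1.0], size=200)
data = rng.normal(size=(200, 5))
data[:, 0] = 3.0 * state + 0.1 * rng.normal(size=200)

centered = data - data.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
s = variance_fraction_along(data, vt[0])

# Genes ordered by decreasing s, analogous to listing the top entries in a table.
ranking = np.argsort(s)[::-1]
print(ranking, s[ranking])
```

When the direction is an eigenvector of the covariance matrix, this ratio lies between 0 and 1 for every gene, which makes a cutoff such as s > 2/3 easy to apply.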

Fig 5.

Fitting to the Landau probability density function performs better than a mixture of two Gaussians.

In tests with simulated data, the Landau method (A) identifies the transition state sooner (at smaller μ, when the bistable states are closer to one another; shown here fitting 100 samples), and (B) fits the transition distribution much more closely (shown here fitting 50,000 samples at μ = 0.0158; simulated data in blue histogram compared to best fits of Landau distribution and Gaussian mixture distribution shown as solid curves). Error bars in (A) indicate estimated standard deviation of the mean given 100 simulations.

Fig 6.

Existing “transition index” measure also suggests a transition when focused on particular genes.

The transition index defined in Ref. [17] requires selecting a set of genes that are known a priori to be involved in a transition. Applying this measure to our data and restricting to genes identified by our method, the largest values occur at ages 10 and 15 days (red circles). This corroborates our method, which does not require selecting specific genes, yet locates a transition at a similar time. In contrast, when including data from all 91 genes (pink squares), the signal in the transition index is washed out.

Fig 7.

Applying the Landau method to hepatocellular carcinoma data demonstrates its relationship with methods that rely on the increase of correlated fluctuations.

(A) Combining control samples from healthy livers with samples from various stages of liver cancer (data from Ref. [32]), we find increasingly strong evidence of bimodality starting at the “early HCC” stage. The SNE approach, designed to find evidence of transitions before they happen, identified critical fluctuations at the “very early HCC” stage [18]. (B) Along the bistable dimension, increasingly separate clusters of gene expression are visible. The ground-truth group membership of each sample is indicated by color: purple for control samples and orange for disease samples.

Fig 8.

Comparing two measures for the relevance of individual genes to the bistability.

The magnitude of the correlation coefficient |c| between individual genes and the bistable dimension is closely related to the proportion of variance s along that dimension.

Fig 9.

The method is robust to the number of included genes.

As in Fig 1, we plot the proportion of simulated datasets in which bistability is identified as a function of interaction strength α, here for (A) N_genes = 10 and (B) N_genes = 1000. Each point is a mean over 50 simulations with N_samples = 16, with error bars indicating standard deviation of the mean.
