Multivariate analysis of variegated expression in Neurons: A strategy for unbiased localization of gene function to candidate brain regions in larval zebrafish

doi:10.1371/journal.pone.0281609

Fig 1.

Overview of experimental workflow.

A) Before MAVEN can be performed, a variegated, labeled expression construct for the gene of interest that has variable effects on a phenotype of interest is required. B) The first step of MAVEN is to collect two groups of larvae, each expressing the variegated transgene, but displaying different behavioral phenotypes. In this example, some mutant larvae fail to express the transgene in the relevant cells, and therefore display the typical mutant phenotype (red), while other mutant larvae express the transgene in relevant cells and therefore display a WT-like, rescued behavioral phenotype (blue). C) After larvae are collected in each phenotype group, their tails are trimmed distinctively so they can be identified after immunohistochemical staining in a single tube to avoid batch artifacts. Antibodies for the rescue construct (in this example, GFP) and an anatomical reference stain are applied. D) Next, the brains are aligned to a 3D brain atlas using the reference stain and the rescue construct signal in identified brain regions is quantified. We give an example with our own data in Fig 3. We provide MATLAB code for this stage of the protocol. E) Multivariate analysis is performed to identify specific brain regions in which GFP signal levels correlate with larval phenotype. These are candidate regions which may mediate the function of the gene of interest. We show example results of our analysis in Fig 4, and provide R code for this stage. F) Validation is performed by identifying more specific Gal4 drivers for candidate regions. If the phenotype of interest can be rescued by driving expression solely in these regions, then a region that mediates the gene’s effects on phenotype has been successfully identified. We show how we validated our results in Fig 5. G) If a specific region fails to validate, we offer methods to identify alternative candidate regions in Fig 6.
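The quantification step in Panel D can be illustrated with a minimal sketch. It assumes each voxel of a registered brain carries one intensity value and at most one region label. That is a simplification, since atlas regions can overlap, and this is not the MATLAB code the authors provide. The function name `mean_signal_by_region` and the input layout are hypothetical.

```python
from collections import defaultdict


def mean_signal_by_region(intensities, region_labels):
    """Average the rescue-construct signal over the voxels of each region.

    intensities   -- voxel intensities from one registered brain (hypothetical layout)
    region_labels -- one region name per voxel, or None for unlabeled voxels
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, region in zip(intensities, region_labels):
        if region is None:
            continue  # voxel lies outside every annotated region
        totals[region] += value
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}
```

Running this once per larva would give one row of region-level signal values per brain, which is the kind of table the multivariate step in Panel E needs.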


Fig 2.

An example larval collection strategy based on rescue of a loss-of-function mutant phenotype.

A) Schematic of behavioral bias for the escape versus reorientation decision of various larvae. At low stimulus intensities, WT larvae perform predominantly reorientations (pink). As stimulus intensity increases, WT larvae shift their preference towards escapes (light blue). CaSR mutant larvae, by contrast, are highly reorientation-biased across all stimulus intensities, although they do occasionally perform escapes. B) Example of a loss-of-function rescue strategy for collecting groups of larvae with divergent phenotypes. When responding to a strong acoustic stimulus, CaSR homozygous mutant larvae (red squares) are reorientation-biased relative to sibling controls (blue circles). Overexpressing CaSR in a variegated fashion in neurons in CaSR mutants (empty red squares) sometimes, but not always, rescues the mutant phenotype. We collected larvae in rescued (above light blue line) and non-rescued (below pink line) groups. Some data from Panel B also appear in Shoenhard, Jain, and Granato [19].
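The collection rule in Panel B amounts to a two-threshold sort on each larva's escape bias. The sketch below assumes the bias is expressed as the fraction of responses that were escapes. The caption does not give numeric cutoffs, so the default values and the name `sort_larvae` are placeholders rather than the authors' settings.

```python
def sort_larvae(escape_fraction, rescued_cutoff=0.8, non_rescued_cutoff=0.2):
    """Split larvae into rescued, non-rescued, and excluded groups.

    escape_fraction -- dict mapping larva ID to the fraction of responses
                       that were escapes (hypothetical input format)
    Cutoff defaults are illustrative assumptions, not values from the paper.
    """
    groups = {"rescued": [], "non_rescued": [], "excluded": []}
    for larva_id, fraction in escape_fraction.items():
        if fraction > rescued_cutoff:
            groups["rescued"].append(larva_id)       # above the upper line
        elif fraction < non_rescued_cutoff:
            groups["non_rescued"].append(larva_id)   # below the lower line
        else:
            groups["excluded"].append(larva_id)      # ambiguous phenotype
    return groups
```

Leaving a gap between the two cutoffs discards intermediate larvae, so the two collected groups differ clearly in phenotype.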


Fig 3.

Patterns of variegation of rescue construct expression in neurons of d5 larvae.

A) CaSR-EGFP (green) and total ERK anatomy reference (magenta) signal in single slices in the dorsal portion of the brain of an αtubulin:Gal4; UAS:CaSR-EGFP larva. Left and right images are from the same brain: raw image is on the left and brain atlas-registered image is on the right. Note the variegation in CaSR-EGFP expression, with left-right asymmetries highlighted using white arrowheads. Scale bars are 100 µm. Due to the deformations that occur during brain atlas registration, the raw image and the registered image do not depict exactly the same anatomical regions, and therefore the exact same cells cannot be identified from image to image. B) Quantification of CaSR-EGFP signal in the brains of n = 50 αtubulin:Gal4>UAS:CaSR-EGFP; CaSR^+/+ larvae sorted by behavioral phenotype. Signal in each brain region is represented as a gradient ascending through the colors black, green, yellow, and white. C) Pearson correlations of a set of example brain regions calculated on the full dataset of n = 150 larvae. Width and color of ellipses are proportional to the degree of correlation. Regions are ordered by hierarchical clustering using the Ward method. Red = negatively correlated (no brain regions in this group were negatively correlated), white = no correlation, blue = positively correlated.
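The pairwise correlations in Panel C can be sketched in plain Python as follows. It assumes a table of per-larva signal values for each region. The names `pearson` and `correlation_matrix` are illustrative, and the Ward-method ordering used in the figure is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(signal_by_region):
    """Correlate every pair of regions across larvae.

    signal_by_region -- dict mapping region name to a list of signal values,
                        one per larva, in the same larva order for all regions
    """
    regions = list(signal_by_region)
    return {
        r1: {r2: pearson(signal_by_region[r1], signal_by_region[r2])
             for r2 in regions}
        for r1 in regions
    }
```

Because variegation tends to affect neighboring regions together, a matrix like this is mostly positive, which matches the note that no regions in the example set were negatively correlated.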


Fig 4.

Identification of the Dorsal Cluster—Rhombomere 6 as a candidate site for CaSR-dependent decision-making.

A) Normalized fluorescence intensity signal in the DCR6 of larvae of various CaSR genotypes that either displayed the typical phenotype (red) or were escape-shifted (phenotype caused by CaSR overexpression) (blue) in response to a low-intensity, primarily reorientation-evoking stimulus. Presence of the overexpression phenotype significantly (p < 0.0001) predicts DCR6 signal by two-way ANOVA. B) Normalized fluorescence intensity signal in the Dorsal Cluster–Rhombomere 6 of CaSR mutants whose behavioral phenotype was unrescued (red) vs. rescued (blue), i.e. performed predominantly escapes in response to a strong, escape-evoking stimulus. Rescue significantly predicts DCR6 signal by Mann-Whitney U test. C) Normalized fluorescence intensity signal in the locus coeruleus (LC) of larvae of various CaSR genotypes that either displayed the typical phenotype (red) or were escape-shifted (overexpression phenotype) (blue) in response to a low-intensity stimulus. Presence of the overexpression phenotype does not significantly predict (p = 0.0738) LC signal by two-way ANOVA. D) Normalized fluorescence intensity signal in the locus coeruleus of CaSR mutants whose behavioral phenotype was unrescued (red) vs. rescued (blue), i.e. performed predominantly escapes in response to a strong, escape-evoking stimulus. Rescue does not significantly predict LC signal by Mann-Whitney U test (p = 0.0649). E) Accuracy of logistic regression models used to predict phenotype of CaSR^+/+ (blue), CaSR^p190/+ (green), or CaSR^p190/p190 (red) larvae using the average CaSR-EGFP signal across the whole brain, the signal only in the DCR6, or both. CaSR^+/+ and CaSR^p190/p190 phenotypes could be more accurately predicted with the DCR6 than with the average signal in the whole brain, and using both variables in the model did not improve accuracy compared to the model that used only the DCR6. CaSR^p190/+ phenotypes were not accurately predicted by any of the three models.
Data in panels A and B also appear in Shoenhard, Jain, and Granato [19].
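The model comparison in Panel E can be illustrated with a small logistic regression written from scratch. This is a sketch under assumptions, not the R code the authors provide: it fits by plain gradient descent, reports training accuracy only, and the learning rate, epoch count, and function names are arbitrary choices.

```python
import math


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights and intercept by batch gradient descent.

    X -- list of feature rows (e.g. [dcr6_signal] or [whole_brain, dcr6_signal])
    y -- list of 0/1 phenotype labels (1 = shifted or rescued phenotype)
    """
    n, n_features = len(X), len(X[0])
    weights, intercept = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(X, y):
            z = intercept + sum(w * v for w, v in zip(weights, row))
            error = 1.0 / (1.0 + math.exp(-z)) - label  # predicted prob - truth
            for j, v in enumerate(row):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        intercept -= lr * grad_b / n
    return weights, intercept


def accuracy(X, y, weights, intercept):
    """Fraction of larvae whose phenotype is predicted correctly."""
    correct = 0
    for row, label in zip(X, y):
        z = intercept + sum(w * v for w, v in zip(weights, row))
        correct += int((1 if z > 0 else 0) == label)
    return correct / len(y)
```

To mirror the panel, one would fit three models per genotype (whole-brain average only, DCR6 only, and both) and compare their accuracies. A real analysis would also hold out test data rather than score the training set.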


Fig 5.

Example validation of a single Gal4-defined brain region as a mediator of a genetic behavioral phenotype.

A) Expression patterns of the y293 (purple), y234 (green), and y341 (magenta) Gal4s relative to the DCR6 in Slice 122 of ZBrain 2.0 (https://zebrafishatlas.zib.de/lm) [8, 26]. Scale bars are 100 µm. Below, enlarged images of the DCR6. The y293 Gal4 labels apparent cell bodies of a population medial to the DCR6 (black arrowheads) that project through the DCR6 and terminate in a region lateral to the DCR6 (orange arrowheads). However, cell bodies within the DCR6 are labeled only sparsely. By contrast, the y234 and y341 Gal4s both label cell bodies within the DCR6 and immediately surrounding it, y234 to a slightly greater extent than y341. B) Maximum projections from ZBrain slices 12–64 (ventral portion of the brain) in the y293 (blue), y234 (green), and y341 (magenta) Gal4s. All three Gal4s drive expression in the trigeminal ganglion (TG), while only y293 and y234 drive expression in the vagal ganglion (VG). Scale bar = 100 µm. C) CaSR overexpression escape-shifted phenotype is exhibited by GFP+ y234; UAS:CaSR-EGFP; CaSR sibling (CaSR^+/+ and CaSR^p190/+) larvae responding to a low-intensity acoustic stimulus [19]. D) Partial rescue of CaSR loss-of-function phenotype in sorted GFP+ y234; UAS:CaSR-EGFP; CaSR^p190/p190 larvae responding to a strong acoustic stimulus. CaSR sibling (CaSR^+/+ and CaSR^p190/+) larvae in blue; CaSR^p190/p190 mutant larvae in red. Data from Panel D and some data from Panel C also appear in Shoenhard, Jain, and Granato [19].


Fig 6.

Methods for identifying alternative candidate regions.

A) Anatomical map of regions with the highest correlations (Pearson R²) with the DCR6. The DCR6 is indicated in red; correlating regions are indicated in shades of blue, with brighter color indicating greater correlation. The majority of regions that correlate highly with the DCR6 are nearby within the dorsal rhombencephalon. Some regions were excluded due to anatomical overlap with regions shown or because they physically contain the entire DCR6; see Supplemental Information for full results. Abbreviations, in descending order of correlation strength: DCR6 – Dorsal Cluster Rhombomere 6 (“Rhombencephalon - QRFP Neuron Cluster Sparse”); VGlut2 Cl2 – “Rhombencephalon - VGlut2 Cluster 2”; Gad1b Cl 14 – “Rhombencephalon - Gad1b Cluster 14”; Cerebellum – “Rhombencephalon - Cerebellum”; Rh 1 – “Rhombencephalon - Rhombomere 1”; HcrtR St 2 – “Rhombencephalon - 6.7 DHCrtR Gal4 Stripe 2”; s1181t – “Rhombencephalon - s1181t Cluster”; L Habenula VGlut2 Cl – “Diencephalon - Left habenula VGlut2 Cluster”. B) Training and testing set accuracy of 100 LASSO regression models created by bootstrapping data from n = 50 CaSR WT larvae expressing UAS:CaSR-EGFP under control of αtubulin:Gal4. Most models fall into one of two categories: accurate (greater than 50% accuracy on both training and testing sets, sea green, n = 35) or inaccurate (less than 50% accuracy on both training and testing sets, purple, n = 55). Datapoints are jittered to avoid overplotting. We hypothesize that the inaccurate models are the result of declining statistical power associated with halving sample size for bootstrapping. For all further analysis, only accurate models were used. C) Regions that received positive coefficients in more than one bootstrapped, accurate LASSO model. The most common region employed in accurate models was the DCR6 (red). Two other regions that are among the top 10 most correlated with the DCR6 (see Fig 6A and S1 File) also appeared (blue). 
The other regions (gray) may explain some of the variability in phenotype that was not explained by the DCR6. Abbreviations: DCR6 – “Dorsal Cluster Rhombomere 6” (ZBrain: “Rhombencephalon - QRFP Neuronal Cluster Sparse”); Mes RAF7 – “Mesencephalon - Retinal Arborization Field 7 (AF7)”; Tel OB – “Telencephalon - Olfactory Bulb”; Rhomb Gad1b Clust14 – “Rhombencephalon - Gad1b Cluster 14”; Rhomb Vmat2 Clust2 – “Rhombencephalon - Vmat2 Cluster 2”; Tel Migrated Area 4 – “Telencephalon - Telencephalic Migrated Area 4 (M4)”; Rhomb Vglut2 Clust2 – “Rhombencephalon - Vglut2 cluster 2”; Rhomb Hcrtr Clust5 – “Rhombencephalon - 6.7FDhcrtR-Gal4 Cluster 5”; Tel Vmat2 Clust – “Telencephalon - Vmat2 Cluster”; Tel Pallium – “Telencephalon - Pallium”; Rhomb Neuropil Reg4 – “Rhombencephalon - Neuropil Region 4”; Tel Vglut2 Rind – “Telencephalon - Vglut2 rind”; Rhomb Glyt2 Clust13 – “Rhombencephalon - Glyt2 Cluster 13”; Mes Torus Long – “Mesencephalon - Torus Longitudinalis”. D) Regions that received negative coefficients in more than one bootstrapped, accurate LASSO model. The most commonly employed region was lateral line neuromast O1, which is the region least correlated with the DCR6. Regions among the 10 least correlated with the DCR6 are shown in green; other regions are shown in gray. Abbreviations: Gang LLN O1 – “Ganglia - Lateral Line Neuromast O1”; Gang PLLG – “Ganglia - Posterior Lateral Line Ganglia”; Gang Vagal – “Ganglia - Vagal Ganglia”; Gang Facial Glossopharyng – “Ganglia - Facial glossopharyngeal ganglion”; Rhomb Isl1 Clust3 – “Rhombencephalon - Isl1 Cluster 3”; Gang LLN D1 – “Ganglia - Lateral Line Neuromast D1”; Di Rost Hypothal – “Diencephalon - Rostral Hypothalamus”.
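The filtering and tallying behind Panels B to D can be sketched as below. It assumes each bootstrapped LASSO model has already been fitted and summarized as a dict with training accuracy, testing accuracy, and its nonzero coefficients. That record layout and the name `tally_regions` are hypothetical, and the model fitting itself is not shown.

```python
from collections import Counter


def tally_regions(models, accuracy_cutoff=0.5):
    """Count how often each region gets a positive or negative coefficient.

    models -- list of dicts with keys 'train_acc', 'test_acc', and 'coefs'
              (region name -> coefficient); hypothetical record layout
    Only models above the cutoff on BOTH training and testing sets are kept,
    and only regions appearing in more than one kept model are reported.
    """
    accurate = [m for m in models
                if m["train_acc"] > accuracy_cutoff
                and m["test_acc"] > accuracy_cutoff]
    positive, negative = Counter(), Counter()
    for model in accurate:
        for region, coef in model["coefs"].items():
            if coef > 0:
                positive[region] += 1
            elif coef < 0:
                negative[region] += 1
    recurring_positive = {r: c for r, c in positive.items() if c > 1}
    recurring_negative = {r: c for r, c in negative.items() if c > 1}
    return recurring_positive, recurring_negative
```

Requiring a region to recur across several accurate models reduces the chance that a candidate reflects one unlucky resample.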
