Fig 1.
Vocal behavior is modulated by the familiarity of specific callers.
(a) Example call interaction between a bonded pair of zebra finches in a colony. (b) Experimental setup depicting a male zebra finch presented with playbacks of calls from familiar and unfamiliar conspecifics in a pseudo-randomized order. (c) Top panels: Call responses (colored lines) to call playbacks (grey intervals) presented once per second (exemplified by data from the initial two days). Bottom panels: Call response distributions across inter-playback intervals (calculated over four days/sessions). (d) Peak response probabilities (maximum response probability in the response time distribution) to different playbacks (n = 9, birds = 7, days = 4). Response probability (familiar)=0.117, response probability (unfamiliar)=0.090, Wilcoxon signed-rank test, p = 0.03. (e, f) Response latencies (from playback onset to response onset) and response latency variability (standard deviation of the latencies of all responses). Response latency (familiar)=306ms, response latency (unfamiliar)=354ms, Wilcoxon signed-rank test, p = 0.01. Latency variability (familiar)=246ms, latency variability (unfamiliar)=264ms, Wilcoxon signed-rank test, p = 0.02. Blue solid dots represent mean values across multiple unfamiliar playbacks, and error bars represent the standard error of the mean (sem). (g) Average classification accuracy for playback familiarity based on the behavioral features described in d, e and f (model = random forest, iterations = 1000, test size = 0.5). Left: Confusion matrix. Middle: Distribution of accuracies across runs (79.71 ± 11.32%). Right: Kernel density estimate distribution derived from shuffled data. The solid gray line indicates the 95% confidence interval of the shuffled distribution (77.78%), while the black solid line represents the mean accuracy of the observed data (79.71%, p = 0.016). Chance level = 49.2%. * denotes p < 0.05.
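The behavioral features in panels d to f can be illustrated with a minimal sketch. It assumes response latencies are given in milliseconds relative to playback onset. The function name, the 50 ms bin width, and the use of the sample standard deviation are illustrative assumptions, not details stated in the caption.

```python
from statistics import mean, stdev


def behavioral_features(latencies_ms, n_playbacks, bin_ms=50, window_ms=1000):
    """Summarize call responses to one playback type.

    latencies_ms: response latencies (playback onset to response onset), in ms.
    n_playbacks:  number of playbacks presented.
    bin_ms:       hypothetical histogram bin width (not given in the source).
    window_ms:    inter-playback interval (playbacks presented once per second).
    """
    n_bins = window_ms // bin_ms
    counts = [0] * n_bins
    for lat in latencies_ms:
        if 0 <= lat < window_ms:
            counts[int(lat // bin_ms)] += 1

    # Response probability per bin = responses in that bin / playbacks presented.
    probabilities = [c / n_playbacks for c in counts]

    return {
        # Maximum of the response time distribution (panel d).
        "peak_response_probability": max(probabilities),
        # Mean latency over all responses (panel e).
        "mean_latency_ms": mean(latencies_ms),
        # Standard deviation of latencies (panel f); sample std is an assumption.
        "latency_variability_ms": stdev(latencies_ms),
    }


# Made-up latencies for illustration only.
example = behavioral_features([250, 260, 270, 400, 620], n_playbacks=20)
```

Computing these three values once per playback type and per bird would give the paired samples that a Wilcoxon signed-rank test compares.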
Fig 2.
Interneuron activity is differentially modulated by caller familiarity.
(a) Example cells recorded during familiar and unfamiliar call playbacks. Top: Spike-sorted waveforms with average (black) and spectrogram of call playback. Middle: Spike dot raster plot. Bottom: peri-stimulus time histogram (PSTH). (b) Normalized firing rate for interneurons that changed activity beyond two standard deviations from baseline (169/210 interneurons, recordings = 9, birds = 8). Neurons are ordered by their peak firing time during familiar call playbacks. The same neuron order is maintained for unfamiliar call playback, showing corresponding activity patterns across conditions. White dashed lines depict call onsets and offsets. (c) Average normalized firing rate (z-score) across significantly responsive neurons. Shaded area represents the 95% confidence intervals (1.96*standard error). (d) Left: Response trajectories in PC space for neurons shown in b (variance explained by first 2 PCs = 55.34%). Lines connecting dots represent 10ms. Right: Euclidean distance between conditions in PC space across time. Red dotted lines represent ±2 standard deviations from mean baseline values. (e) Average classification accuracy for call playback familiarity based on the firing rate of neurons shown in b (time window = 0 to 400ms from playback onset, model = support vector machine, iterations = 1000, test size = 0.1). Left: Confusion matrix. Top Right: Distribution of accuracies across runs (61.1 ± 7.78%). Bottom Right: Kernel density estimate distribution derived from shuffled data. The solid gray line indicates the 95% confidence interval of the shuffled distribution (56.78%), while the black solid line represents the mean accuracy of the observed data (61.1%, permutation test, p = 0.005). Chance level = 50%. (f) Distribution of different features extracted from the neural activity. Conditions were compared using a linear mixed model (lmm, see S8 Fig). Firing rate (familiar)=1.46 ± 2.32 (z score), firing rate (unfamiliar)=1.14 ± 1.65 (z score), lmm, p = 0.011.
Max firing rate (familiar)=8.4 ± 6.9 (z score), max firing rate (unfamiliar)=7.63 ± 6.38 (z score), lmm, p = 0.051. Time of max firing rate (familiar)=94.85 ± 103.51ms, time of max firing rate (unfamiliar)=88.52 ± 96.22ms, lmm, p = 0.491. Response duration (familiar)=108.87 ± 112.98ms, response duration (unfamiliar)=88.87 ± 95.62ms, lmm, p = 0.001. The black line depicts the identity line (slope = 1). The red line represents the fitted regression line for significant comparisons, and the shaded region shows the 95% confidence interval for the regression estimate. *** denotes p < 0.001, * denotes p < 0.05, and ns denotes not significant.
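The normalization and the per-neuron features in panels b and f can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes a binned firing-rate trace whose first bins precede playback onset, a hypothetical 10 ms bin width, and a response duration defined as the total time spent beyond the 2 standard deviation threshold (the caption does not define it).

```python
from statistics import mean, stdev


def zscore_to_baseline(rates, n_baseline_bins):
    """Z-score a binned firing-rate trace using its pre-playback bins."""
    baseline = rates[:n_baseline_bins]
    mu, sd = mean(baseline), stdev(baseline)
    # Assumes a non-constant baseline (sd > 0).
    return [(r - mu) / sd for r in rates]


def response_summary(z, n_baseline_bins, bin_ms=10, threshold=2.0):
    """Extract simple response features from a z-scored trace."""
    post = z[n_baseline_bins:]  # bins from playback onset onward
    beyond = [abs(v) > threshold for v in post]
    peak_idx = max(range(len(post)), key=lambda i: post[i])
    return {
        # Activity changed beyond 2 SD from baseline in at least one bin.
        "responsive": any(beyond),
        "max_z": post[peak_idx],
        # Start time of the peak bin, relative to playback onset.
        "time_of_max_ms": peak_idx * bin_ms,
        # Assumed definition: total time beyond threshold.
        "response_duration_ms": sum(beyond) * bin_ms,
    }


# Made-up trace: 6 baseline bins followed by 4 post-onset bins.
rates = [4, 6, 4, 6, 5, 5, 12, 15, 9, 5]
z = zscore_to_baseline(rates, n_baseline_bins=6)
summary = response_summary(z, n_baseline_bins=6)
```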
Fig 3.
Projection neuron activity differs based on caller familiarity but encodes less information about the social context (compared to HVC interneuron activity).
(a) Example cells recorded during familiar and unfamiliar call playbacks. Top: Spike-sorted waveforms with average (black) and spectrogram of call playback used. Middle: Spike dot raster plot. Bottom: peri-stimulus time histogram (PSTH). (b) Normalized firing rate for projection neurons that changed activity beyond two standard deviations from baseline (400/555 projection neurons, recordings = 9, birds = 8). Neurons are ordered by their peak firing time during familiar call playbacks. The same neuron order is maintained for unfamiliar call playback, showing corresponding activity patterns across conditions. White dashed lines depict call onsets and offsets. (c) Average normalized firing rate (z-score) across significantly responsive neurons. Shaded area represents the 95% confidence intervals (1.96*standard error). (d) Left: Response trajectories in PC space for neurons shown in b (variance explained by first 2 PCs = 56.96%). Lines connecting dots represent 10ms. Right: Euclidean distance between conditions in PC space across time. Red dotted lines represent ±2 standard deviations from mean baseline values. (e) Average classification accuracy for call playback familiarity based on the firing rate of neurons shown in b (time window = 0 to 400ms from playback onset, model = support vector machine, iterations = 1000, test size = 0.1). Left: Confusion matrix. Top Right: Distribution of accuracies across runs (53.85 ± 5.07%). Bottom Right: Kernel density estimate distribution derived from shuffled data. The solid gray line indicates the 95% confidence interval of the shuffled distribution (54.13%), while the black solid line represents the mean accuracy of the observed data (53.85%, permutation test, p = 0.069). Chance level = 49.9%. (f) Distribution of different features extracted from the neural activity. Conditions were compared using a linear mixed model (lmm, see S8 Fig). Firing rate (familiar)=1.54 ± 3.94 (z score), firing rate (unfamiliar)=1.24 ± 2.46 (z score), lmm, p = 0.059.
Max firing rate (familiar)=12.20 ± 19.62 (z score), max firing rate (unfamiliar)=11.51 ± 17.79 (z score), lmm, p = 0.44. Time of max firing rate (familiar)=113.12 ± 104.68ms, time of max firing rate (unfamiliar)=104.45 ± 97.98ms, lmm, p = 0.11. Response duration (familiar)=88.22 ± 95.33ms, response duration (unfamiliar)=73.32 ± 82.97ms, lmm, p < 0.001. The black line depicts the identity line (slope = 1). The red line represents the fitted regression line for significant comparisons, and the shaded region shows the 95% confidence interval for the regression estimate. *** denotes p < 0.001, and ns denotes not significant.
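The comparison of observed decoding accuracy against a shuffled-label distribution (panel e) can be sketched as below. The caption does not give the exact procedure, so two choices here are assumptions: the gray-line threshold is read as the 95th percentile of the shuffled accuracies (nearest-rank method), and the p-value is the one-sided fraction of shuffled accuracies at or above the observed mean, with a +1 correction.

```python
import math


def permutation_summary(observed_accuracy, shuffled_accuracies, percentile=95):
    """Compare an observed accuracy with a shuffled-label null distribution.

    Returns (p_value, threshold), where threshold is the empirical
    `percentile`-th percentile of the null (nearest-rank method).
    """
    null = sorted(shuffled_accuracies)
    n = len(null)

    # One-sided p-value: how often shuffled data does at least as well.
    n_at_least = sum(1 for acc in null if acc >= observed_accuracy)
    p_value = (n_at_least + 1) / (n + 1)

    # Nearest-rank percentile of the null distribution.
    rank = math.ceil(percentile * n / 100)
    threshold = null[rank - 1]
    return p_value, threshold


# Made-up null distribution: accuracies of 40% to 59% in 1% steps.
null_accuracies = list(range(40, 60))
p_value, threshold = permutation_summary(58.5, null_accuracies)
```

Under this reading, an observed mean accuracy below the threshold, as reported for projection neurons (53.85% versus 54.13%), corresponds to a non-significant result at the 0.05 level.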
Fig 4.
Variations in call responses correlate with interneuron activity changes across different playbacks.
(a) Correlation between normalized features used to describe vocal responses (behavior) and neural activity (n = 4 sessions, across 3 birds). Behavioral data were normalized by bird, while neural recordings were normalized for each neuron individually. Neural parameters plotted represent the mean values obtained for all recorded interneurons (n = 69). Normality was assessed with the Kolmogorov-Smirnov test, and Pearson correlation coefficients were then calculated (shown inside cells for each correlation, see S11 Fig). Bonferroni correction was used to control for multiple comparisons. *** denotes corrected p < 0.001.
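As a minimal sketch of the two statistics named in this caption, the Pearson correlation coefficient and a Bonferroni adjustment can be written as follows. The function names and all values are made up for illustration. The Kolmogorov-Smirnov normality check and the computation of raw p-values are omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical normalized feature values, one per session.
behavior = [1, 2, 3, 4]
neural = [2, 4, 6, 9]
r = pearson_r(behavior, neural)

# Hypothetical raw p-values from three feature pairs.
adjusted = bonferroni([0.001, 0.02, 0.5])
```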