Conceived and designed the experiments: JEM SAW. Performed the experiments: JEM. Analyzed the data: JEM ATH. Wrote the paper: JEM ATH SAW.
The authors have declared that no competing interests exist.
Trial-by-trial variability during motor learning is a feature encoded by the basal ganglia of both humans and songbirds, and is important for reinforcement of optimal motor patterns, including those that produce speech and birdsong. Given the many parallels between these behaviors, songbirds provide a useful model to investigate neural mechanisms underlying vocal learning. In juvenile and adult male zebra finches, endogenous levels of FoxP2, a molecule critical for language, decrease two hours after morning song onset within area X, part of the basal ganglia-forebrain pathway dedicated to song. In juveniles, experimental ‘knockdown’ of area X FoxP2 results in abnormally variable song in adulthood. These findings motivated our hypothesis that low FoxP2 levels increase vocal variability, enabling vocal motor exploration in normal birds.
After two hours in either singing or non-singing conditions (previously shown to produce differential area X FoxP2 levels), phonological and sequential features of the subsequent songs were compared across conditions in the same bird. In line with our prediction, analysis of songs sung by 75 day (75d) birds revealed that syllable structure was more variable and sequence stereotypy was reduced following two hours of continuous practice compared to these features following two hours of non-singing. Similar trends in song were observed in these birds at 65d, despite higher overall within-condition variability at this age.
Together with previous work, these findings point to the importance of behaviorally-driven acute periods during song learning that allow for both refinement and reinforcement of motor patterns. Future work will test the hypothesis that vocal practice not only influences the expression of molecular networks, but that these networks in turn influence subsequent variability in these skills.
Birdsong and speech share key features
As zebra finches undergo sensorimotor learning, age-dependent increases in the stability of syllable structure are observed between 45–90d
Two nuclei of the AFP, basal ganglia sub-region area X and pallial lateral nucleus of the anterior nidopallium (LMAN), are required for song learning, including processes important for vocal variability underlying motor exploration. Lesions of area X at the onset of sensorimotor learning interfere with song improvement
Among these molecules, the gene that encodes the FOXP2 transcription factor has been a focus of birdsong research because of its direct link to human speech and language
At 75d, songs were more variable following the condition of vocal practice than following non-singing. Similar consistent trends were observed in these same birds at 65d, but the more variable nature of the developing song precluded detection of significant conditional differences at this age. Accordingly, comparison of songs obtained at 65d and 75d reveals greater stability in these measures at 75d. In a separate group of adult birds, song was even more stereotyped. These findings, together with previous studies, provide insight into the age- and behaviorally-dependent shaping of song variability and motivate additional exploration of behavior–gene interactions.
All animal use was in accordance with NIH guidelines for experiments involving vertebrate animals and approved by the University of California at Los Angeles Chancellor's Institutional Animal Care & Use Committee. Juvenile ∼62 days of age (62d) or adult (125–160d) male zebra finches were moved from our breeding colony to individual sound attenuation chambers (Acoustic Systems; Austin, TX) under a 12∶12 hour light/dark cycle. Birds were left undisturbed for 2–3 days prior to the behavioral experiments to enable acclimation to the new environment.
Sounds were recorded using Shure SM57 microphones and digitized using a PreSonus Firepod (44.1 kHz sampling rate, 24 bit depth). Recordings were acquired and analyzed using Sound Analysis Pro 2.091 software with pre-set parameters for capturing zebra finch song (SAP;
UD song was recorded and analyzed from the same bird at two stages late in sensorimotor learning (
Our prior work showed that 2 hours of UD singing lowers levels of FoxP2 mRNA and protein within area X of male zebra finches
Non-singing then undirected singing (NS-UD): For the first 2 hours following lights-on, the door to the sound chamber was propped open and birds were monitored by an investigator stationed nearby, who distracted them if they attempted to sing. If distraction was ineffective and a bird sang >10 motifs during the experiment, it was excluded from the study. After 2 hours, the chamber door was closed and the bird left undisturbed. Songs sung immediately after this 2 hour timepoint (see below) were used for behavioral comparisons. The first subsequent UD motif usually occurred shortly after door closure (for example, 75d range: 1–25 min, mean = 7.5 min, n = 11).
Undirected singing throughout (UD-UD): UD song was continuously recorded from the time of lights-on and throughout the morning. The time of the first motif was usually shortly after lights-on (75d range: 1–14 min, mean = 5 min, n = 11). Two hours thereafter, song was immediately collected for behavioral comparisons.
At all ages, half of the birds in the group were in the NS-UD condition on Day 1 and the UD-UD condition on Day 2. This order was reversed for the other half. Counterbalancing was done to ensure that any changes in song structure were due to a conditional or age effect and not due to the chronological order of recordings. Ten birds were successfully recorded at 65d and 11 birds at 75d with 9 of these birds recorded at both ages. One bird was successfully recorded at 65d but not at 75d because of repeated singing during the non-singing period. Two birds recorded at 75d were not recorded at 65d due to technical problems. Methods for juvenile song analysis are detailed in the next section, followed by adult song analysis.
In this study, we present two separate methods for analyzing the same behavioral data obtained within the first 30 minutes following an initial 2 hours of either non-singing (NS-UD), or undirected singing (UD-UD). One method relied on investigator-defined segmentation of motif structure, while the second was independent of such judgments. In the first method, referred to as ‘motif-based’, we quantified phonological and sequence variability within the context of the motif, considered to be the basic analytical unit of song encoded by specific patterns of neuronal firing
In the second method, phonological variability was assessed using 30 one second song clips (‘clip-based’), while sequence variability was assessed using the first 300 syllables (‘string-based’), similar to the method used by Haesler et al.
In addition to comparing entire motifs, motifs were examined to select 2–3 individual syllables for more detailed comparisons. Local similarity and accuracy scores for 25 examples of each of these syllables were computed using symmetric pair-wise comparisons in SAP
Mean syllable scores were obtained in two different ways: means were either computed for each bird (i.e. the mean of 1–3 syllables) and then collapsed across birds, or a single mean was derived from all syllables independent of bird. The rationale for selecting either method was as follows. A bootstrap 1-way ANOVA determined that the bird, not the syllable, was the independent factor when assessing similarity and accuracy (p<0.05 for between-bird differences). Thus, for syllable similarity, accuracy, and identity (similarity×accuracy/100) scores, the mean values and statistical tests reported in the tables are based on average scores per bird (n = 10 at 65d and n = 11 at 75d). The case was different for the CV scores, for which the syllable, not the bird, was the independent factor. Given the lack of a significant conditional effect for the syllable similarity and accuracy scores at 65d (see
Wav files representing 25 renditions of each of the selected syllables at 75d in both conditions were manually segmented using amplitude and entropy thresholds in the sound analysis window of the similarity features tab of SAP. These features are: duration, amplitude, frequency modulation (FM), pitch, Wiener entropy, mean frequency, and pitch goodness. The coefficient of variation (CV, standard deviation/mean of 25 renditions) is reported (n = 30 syllables). The mean, standard deviation (SD), standard error of the mean (SE), and CV from 25 renditions were calculated for each syllable and then averaged across all 30 syllables, since the above-mentioned 1-way bootstrap ANOVA indicated that the syllable was the independent variable for this analysis (p>0.05).
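The CV computation itself is straightforward. A minimal sketch in Python (the original analysis used SAP and MATLAB; NumPy, the example pitch values, and the use of the sample standard deviation, ddof = 1, are assumptions here):

```python
import numpy as np

def coefficient_of_variation(values):
    """CV = sample standard deviation / mean, across renditions of one syllable."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()

# Hypothetical pitch measurements (Hz) for several renditions of a single syllable
pitch = [640.0, 655.0, 648.0, 660.0, 642.0]
cv = coefficient_of_variation(pitch)
```

The same function would be applied per acoustic feature, with the resulting CVs then averaged across the 30 syllables.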
The first 20 motifs sung after 2 hours of non-singing (NS-UD) or UD singing (UD-UD) were selected by visual inspection of spectrograms in Audacity (version 1.3;
Songs were divided into 30 one second clips (adjusting for syllable boundaries as needed, never more than ±0.1 sec) and analyzed in SAP to quantify phonological variability. We used the 75d NS-UD data from each bird to set syllable segmentation parameters in SAP, since the motif analysis indicated that singing in this condition was the least acoustically and sequentially variable. These parameters were then held constant for analysis of that bird's other three datasets (75d UD-UD, 65d NS-UD, 65d UD-UD) to account for possible subtle changes in syllable formation. The choice of 30 samples was based on empirical discovery of the minimum number of one second samples needed to provide a stable average score, as follows. We gradually increased the number of samples compared from each condition (starting with 10 samples and incrementing by 5 each time) until the mean similarity and accuracy scores no longer changed. As in the motif analysis, each one second clip was compared to all other samples collected in that condition ((30×30)−30 self-tests = 870 scores for one second clips), from which the average was taken to produce scores of self-similarity and self-accuracy for a given age and condition.
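The averaging step above (all pairwise scores minus the self-tests) can be sketched as follows; the pairwise scores themselves come from SAP, so the matrix here is a stand-in, not the published scoring algorithm:

```python
import numpy as np

def mean_self_similarity(scores):
    """Average an n x n matrix of pairwise comparison scores,
    excluding the n self-tests on the diagonal."""
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    off_diag = scores[~np.eye(n, dtype=bool)]  # n*n - n values; 870 for n = 30
    return off_diag.mean()
```

For 30 clips this yields exactly the (30×30)−30 = 870 scores described in the text.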
Quantification of sequence variability was performed by first estimating the transition probability distributions of individual syllables, then calculating the scaled entropy of the distribution for each syllable, and finally averaging across syllables to obtain a score of motif entropy
The transition probability is a ratio representing the number of times that each leading syllable transitions to some following syllable (including to itself) over many renditions, divided by the total number of occurrences of the leading syllable. Thus, the transition probability of syllable A to B is defined as
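The verbal definitions above can be sketched in Python (the original analysis used MATLAB; scaling each syllable's transition entropy by log2 of the number of unique syllables is an assumption, since the exact normalization is not reproduced here):

```python
import math
from collections import Counter

def transition_probabilities(syllables):
    """P(a -> b): count of a followed by b, divided by total occurrences
    of a as a leading syllable."""
    pair_counts = Counter(zip(syllables[:-1], syllables[1:]))
    lead_counts = Counter(syllables[:-1])
    return {(a, b): c / lead_counts[a] for (a, b), c in pair_counts.items()}

def motif_entropy(syllables):
    """Mean of per-syllable transition entropies, each scaled to [0, 1]."""
    probs = transition_probabilities(syllables)
    n_unique = len(set(syllables))
    if n_unique < 2:
        return 0.0
    scores = []
    for a in {s for (s, _) in probs}:
        p_a = [p for (s, _), p in probs.items() if s == a]
        h = -sum(p * math.log2(p) for p in p_a)
        scores.append(h / math.log2(n_unique))  # assumed normalization
    return sum(scores) / len(scores)
```

A perfectly stereotyped sequence such as ABCABCABC... yields a motif entropy of 0, while variable sequencing pushes the score toward 1.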
Since motif entropy is an average of normalized syllable entropy scores, the entropy of each syllable contributes equally to the motif score. This may be problematic if a certain syllable only appears rarely but has a very high or very low entropy, since this skews the motif entropy towards the entropy of the rarely occurring syllable, especially if the bird has only a few unique syllables to begin with. We addressed this potential skewing in the string-based analysis, described below.
During the 300 syllable ‘string-based’ analysis, we set syllable segmentation parameters in SAP based on the bird's 75d NS-UD data, and then allowed SAP to automatically define syllables in all of that bird's subsequent datasets. Setting these parameters once using the most precise version of the song available allowed for a more objective definition in the other data sets. For some birds, syllables that were frequent at 65d were rare at 75d because they had merged into a single new syllable over the ten day interval. We also observed single syllables in the 65d data that were split into separate syllables at 75d. Most of the time, the 65d syllables did not disappear altogether but simply appeared much less frequently in the 75d data. Usually, these infrequently occurring syllables had entropy scores approaching 1 or 0, likely due to the small sample size, and these scores strongly drove the motif entropy score.
One way to deal with this skewing would have been to exclude infrequently occurring syllables from the analysis, but it is not clear what an appropriate occurrence threshold for excluding infrequent syllables would be. Because of that uncertainty, and more importantly, in order to capture the full complexity of the song, we chose not to do this. Instead, we created a weighted entropy score for the string-based analysis. The weighted entropy was obtained by calculating the normalized syllable entropy as described above, then weighting syllable entropy by the ratio of how frequently that syllable was sung, relative to the most frequent syllable. Thus,
Individual syllables were defined by the investigator while viewing song spectrograms in Audacity. Each unique syllable was designated a letter name; letter names were then translated into numbers for analysis in MATLAB, where A = 1, B = 2, C = 3, etc. Plain text files were created with each line representing a motif as a string of numbers. Each bird had one 20 line text file per dataset, representing the first 20 motifs sung immediately following 2 hours of non-singing (NS-UD) or undirected singing (UD-UD) at each age, if available. MATLAB functions (see
In this analysis, we calculated syllable transition probabilities from the first 300 SAP defined syllables that were sung immediately following 2 hours. This usually corresponded to slightly more than one minute of continuous song. Individual syllables were segmented automatically in SAP using the pre-set parameters
As in juveniles, two hours of morning UD singing decreases area X FoxP2 levels in adults relative to levels following two hours of non-singing
In traditional statistical methods, a test statistic such as Student's t or Fisher's F is compared to a mathematically derived continuous probability distribution of that statistic under the null hypothesis. In resampling statistics (also known as ‘bootstrapping’, the term used henceforth), the appropriate statistic is defined by the experimenter and then compared to a distribution of that statistic under the null hypothesis, generated by randomly permuting the data. In this way, bootstrap tests avoid the theoretical assumptions about the form of the data that traditional parametric statistics require for validity. We performed bootstrap tests using custom MATLAB functions (written by ATH; see
Finally, the number of Ms in the null distribution outside the critical values (actual M and its reflection across the mean of the null distribution) divided by 10,000, was the likelihood that we could have observed such a difference if there were no real conditional effect. This likelihood is the p-value. In contrast to a traditional t-test that has a critical t value indicating statistical significance at some alpha level, the critical M values change depending on the null distribution generated in each test, although alpha (0.05) does not. The same basic procedure, with a different test statistic, was used in the bootstrap 1-way ANOVAs to assess the independent variable, bird or syllable, in the syllable analysis as described above. The test statistic in this case was the ratio of between-group over within-group variability, computed not as sums of squares as in a traditional F-test, but as sums of absolute values of the distances from the grand/group means.
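The procedure described in the two paragraphs above can be sketched in Python (the authors' MATLAB functions are not reproduced here; treating the test statistic M as the mean paired difference, and building the null distribution by randomly flipping the sign of each pair's difference, are assumptions about the implementation):

```python
import numpy as np

def paired_bootstrap_test(x, y, n_perm=10_000, seed=0):
    """Two-tailed paired resampling test, a sketch of the described procedure.
    Returns the observed statistic M and its p-value."""
    rng = np.random.default_rng(seed)
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    m_actual = d.mean()
    # null distribution: randomly permute the sign of each pair's difference
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    null = (signs * d).mean(axis=1)
    # count null Ms at least as far from the null mean as the observed M
    # (i.e. outside the observed M and its reflection across the null mean)
    centre = null.mean()
    p = np.mean(np.abs(null - centre) >= abs(m_actual - centre))
    return m_actual, p
```

As in the text, the critical M values depend on the null distribution generated in each test, while alpha (0.05) stays fixed.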
Since we could not confirm a normal distribution for many of our datasets, thus not satisfying a major assumption of traditional t-tests, we performed non-parametric 2-tailed paired bootstrap tests on all data (see
| Condition Comparison | 75d NS-UD | SE | 75d UD-UD | SE | p-value, 75d NS-UD vs. UD-UD | 65d NS-UD | SE | 65d UD-UD | SE | p-value, 65d NS-UD vs. UD-UD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Motif Similarity | 85.74 | 1.25 | 82.69 | 1.52 |  | 83.76 | 1.36 | 83.53 | 1.18 | 0.878 |
| Motif Accuracy | 83.10 | 0.71 | 81.75 | 0.60 |  | 79.92 | 0.70 | 79.97 | 0.78 | 0.932 |
| Motif Entropy | 0.29 | 0.03 | 0.35 | 0.05 |  | 0.42 | 0.06 | 0.44 | 0.07 | 0.603 |
| Motif Stereotypy | 70.69 | 3.34 | 64.95 | 4.59 |  | 57.78 | 5.84 | 55.56 | 6.88 | 0.596 |
| Clip Similarity | 82.45 | 1.17 | 80.25 | 1.24 | 0.064 | 84.51 | 1.04 | 82.44 | 1.44 |  |
| Clip Accuracy | 80.20 | 0.42 | 79.39 | 0.39 |  | 78.16 | 0.71 | 77.89 | 0.78 | 0.689 |
| String Entropy | 0.28 | 0.03 | 0.34 | 0.04 |  | 0.34 | 0.04 | 0.38 | 0.04 | 0.183 |
| String Stereotypy | 72.45 | 3.49 | 66.10 | 4.22 |  | 65.97 | 3.96 | 62.23 | 3.97 | 0.174 |
| Syllable Similarity | 96.95 | 0.38 | 95.88 | 0.62 |  | 95.31 | 0.81 | 94.84 | 0.71 | 0.408 |
| Syllable Accuracy | 92.79 | 0.37 | 91.92 | 0.47 |  | 91.71 | 0.63 | 91.38 | 0.57 | 0.543 |
| Syllable Identity | 89.97 | 0.70 | 88.16 | 1.00 |  | 87.46 | 1.32 | 86.70 | 1.16 | 0.467 |
Mean scores with standard error (SE) and exact p-values for 2-tailed paired bootstrap tests (significant p-values in bold face type) are shown for phonological and sequence comparisons between conditions generated at 75d (n = 11, left columns) and 65d (n = 10, right columns). Results are first reported for motif-, clip-, and string-based analyses, followed by syllable scores. For the latter, the investigator selected 25 consecutive renditions of the same syllable, computed an average of ∼3 syllables per bird and obtained the mean from 10 birds at 65d and 11 birds at 75d.
| CV, Individual Syllables (n = 30) | NS-UD | SE | UD-UD | SE | p-value |
| --- | --- | --- | --- | --- | --- |
| Pitch* | 0.074 | 0.007 | 0.084 | 0.008 |  |
| Pitch Goodness* | 0.101 | 0.006 | 0.123 | 0.100 |  |
| Wiener Entropy* | 0.085 | 0.005 | 0.098 | 0.008 |  |
| Syllable Amplitude* | 0.034 | 0.003 | 0.043 | 0.003 |  |
| Syllable Duration | 0.064 | 0.007 | 0.062 | 0.008 | 0.638 |
| Frequency Modulation (FM) | 0.118 | 0.013 | 0.125 | 0.011 | 0.379 |
| Mean Frequency | 0.064 | 0.006 | 0.065 | 0.006 | 0.984 |
The coefficient of variation (CV, standard deviation/mean) with SE is reported for all features obtained from 25 syllable renditions per bird in the NS-UD (left columns) or the UD-UD (right columns) condition. Asterisks and bold face type denote significance by 2-tailed paired bootstrap test.
| Comparison | Measure | Motif analysis, Power | Clip analysis, Power |
| --- | --- | --- | --- |
| 75d- NS vs. UD | similarity | 76.5% | 44.5% |
|  | accuracy | 96.7% | 63.2% |
|  | stereotypy | 52.0% | 76.3% |
|  | weighted stereotypy | 52.4% | 51.7% |
| 65d- NS vs. UD | similarity | 7.0% | 62.2% |
|  | accuracy | 4.7% | 8.2% |
|  | stereotypy | 11.2% | 26.4% |
|  | weighted stereotypy | 10.5% | 8.8% |
| NS- 65d vs. 75d | similarity | 11.4% | 49.4% |
|  | accuracy | 78.3% | 65.5% |
|  | stereotypy | 72.8% | 65.7% |
|  | weighted stereotypy | 56.5% | 32.4% |
| UD- 65d vs. 75d | similarity | 15.2% | 18.5% |
|  | accuracy | 63.6% | 49.5% |
|  | stereotypy | 42.4% | 42.7% |
|  | weighted stereotypy | 27.2% | 7.3% |
Results for the power analysis are shown for the 65d and 75d data. Higher power is seen in the ability to detect conditional differences between NS-UD and UD-UD in the 75d data relative to the 65d data, which exhibit high variability within each condition. A comparison of the 65d vs. 75d data within a condition reveals higher power to detect age differences in the NS-UD condition.
| Age Comparison: 75d vs. 65d | 75d NS-UD | SE | 65d NS-UD | SE | p-value, 75d vs. 65d | 75d UD-UD | SE | 65d UD-UD | SE | p-value, 75d vs. 65d |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Motif Similarity | 85.15 | 1.46 | 84.05 | 1.49 | 0.514 | 82.17 | 1.82 | 83.69 | 1.37 | 0.334 |
| Motif Accuracy | 82.66 | 0.69 | 80.11 | 0.75 |  | 81.46 | 0.66 | 79.89 | 0.87 |  |
| Motif Entropy | 0.27 | 0.03 | 0.44 | 0.06 |  | 0.34 | 0.05 | 0.45 | 0.08 | 0.101 |
| Motif Stereotypy | 72.65 | 3.32 | 56.30 | 6.31 |  | 66.32 | 5.46 | 54.84 | 7.64 | 0.099 |
| Clip Similarity | 82.28 | 1.24 | 84.73 | 1.14 | 0.052 | 80.53 | 1.48 | 82.40 | 1.61 | 0.298 |
| Clip Accuracy | 80.04 | 0.51 | 78.17 | 0.79 |  | 79.44 | 0.48 | 77.81 | 0.87 | 0.052 |
| String Entropy | 0.27 | 0.04 | 0.35 | 0.04 |  | 0.33 | 0.05 | 0.39 | 0.04 | 0.073 |
| String Stereotypy | 73.35 | 4.13 | 64.87 | 4.26 |  | 67.21 | 5.10 | 60.78 | 4.13 | 0.065 |
| Syllable Similarity | 96.78 | 0.47 | 95.24 | 0.90 | 0.106 | 95.55 | 0.77 | 94.71 | 0.78 | 0.321 |
| Syllable Accuracy | 92.57 | 0.48 | 91.61 | 0.70 | 0.177 | 91.74 | 0.62 | 91.17 | 0.59 | 0.249 |
| Syllable Identity | 89.96 | 0.77 | 88.06 | 1.33 | 0.135 | 88.19 | 1.10 | 86.94 | 0.95 | 0.293 |
Within each condition, mean scores with SE and exact p-values for 2-tailed paired bootstrap tests (significant p-values in bold face type) are shown for phonological and sequence comparisons between ages for 9 birds. Age comparisons for the NS-UD condition are on the left half of the table, while those for the UD-UD condition are shown on the right. Results (top to bottom) are for motif-, clip-, and string-based analyses, followed by syllable scores.
| Motif analysis | NS-UD | SE | UD-UD | SE | p-value, NS-UD vs. UD-UD | Power |
| --- | --- | --- | --- | --- | --- | --- |
| similarity | 92.10 | 1.13 | 93.27 | 1.59 | 0.136 | 33.0% |
| accuracy | 86.04 | 1.76 | 86.69 | 1.57 | 0.534 | 18.0% |
| entropy | 0.11 | 0.02 | 0.11 | 0.02 | 0.950 | 5.6% |
| stereotypy | 0.89 | 0.02 | 0.89 | 0.02 | 0.951 | 4.7% |
Within each condition, mean scores with SE and exact p-values for 2-tailed paired bootstrap tests are shown for phonological and sequence comparisons in adult data between NS-UD and UD-UD conditions. All measures showed low power reflecting a lack of conditional differences.
Figures and tables were created in Microsoft Excel, JMP (Cary, NC), and Origin (Northampton, MA). To enable comparison of traditional parametric and nonparametric statistical approaches with nonparametric bootstrap statistics, we report the results of all tests in
Songs that were sung by males immediately following a 2 hour period of UD singing (UD-UD) were compared with those sung following 2 hours of non-singing (NS-UD) at two ages in late sensorimotor learning (65d and 75d). In line with our prediction, phonological and sequential measures of song variability were higher in the UD-UD condition at 75d. At 65d, similar trends were evident but detection of significant conditional differences was overshadowed by higher within-condition variability. Song features in both conditions increased in stability from 65d to 75d, reflecting age-specific coarse- and fine-tuning of these processes. In contrast, a separate group of adult birds showed highly stereotyped song in both conditions.
At 75d, with continuous vocal practice (UD-UD), phonological features of the notes within a given syllable often appeared altered relative to one another (see
We next examined the mean coefficient of variation (CV) in these 30 syllables. As predicted, the CV was higher in the UD-UD condition for individual syllable features (pitch, p<0.05; pitch goodness, p<0.005; Wiener entropy, p<0.05; syllable amplitude, p<0.005;
Akin to the individual syllable analysis at 75d, the motif- and clip-based metrics also revealed greater variability in the UD-UD condition. These analyses compared a series of song syllables, including syllables used for the analysis of individual acoustic features detailed above. As described, motif similarity and accuracy scores are derived from calculations of individual acoustic features in SAP, including pitch, FM, Wiener entropy, and pitch goodness. Birds at 75d in the UD-UD condition, representing uninterrupted vocal practice, had lower mean similarity and accuracy scores, indicative of higher phonological variability, than in the NS-UD condition (n = 11, similarity, p<0.001; accuracy, p<0.001;
Paired data scores for the NS-UD (filled circles) and UD-UD (open circles) conditions for each bird are represented by connected lines.
We utilized similar entropy-based methods as in Haesler et al
At 75d, greater sequence variability was observed in the UD-UD condition. An example of these conditional differences is shown in
Conditional differences in variability were also measured in these same birds at 65d. Similar to the 75d data, syllable analyses as well as the motif- and clip/string-based analyses showed a trend toward lower similarity, accuracy and stereotypy scores (8/11 measures) and higher entropy scores (2/11 measures) in the UD-UD condition (
The detection of statistically significant differences between conditions at 65d was likely precluded by the high within-condition variability of the data at this age (
Given the greater overall variability at 65d, we hypothesized that songs sung at 75d would exhibit higher phonology scores (less variability) than at 65d, reflecting progression of song development. Unexpectedly, phonological features of syllables did not differ significantly across the two ages in either condition although scores trended lower in the 65d data (n = 9, p>0.05,
Syllable sequencing became more stereotyped with age in the NS-UD condition using standard motif- and string-based analyses (stereotypy, p<0.01) but not in the UD-UD condition (p>0.05;
The power to detect conditional differences in the 65d data was hindered by the high overall variability. In a separate group of adult birds, mature song exhibited low variability within each condition with higher similarity, accuracy, and stereotypy scores compared to juvenile birds. Even when the most stable songs sung by juvenile birds (i.e. at 75d under the NS-UD condition) were compared with those of adults, adult songs were more stable for similarity and stereotypy scores (p<0.01) with a trend for accuracy (p = 0.08). No differences were observed between the NS-UD and UD-UD conditions in these adults (p>0.05,
Undirected song has been likened to vocal practice
Here, we asked whether vocal practice under behaviorally-driven conditions would increase vocal variability at two ages during sensorimotor learning. In line with our prediction, we observed greater variability of song syllables in the UD-UD condition relative to the NS-UD condition at 75d, with similar trends at 65d. These results were observed in both the motif- and clip-based analyses, providing a high level of confidence in these findings and suggesting that 75d affords a developmental ‘sweet-spot’ for detection of vocal variability with practice.
We could not measure both behavior and FoxP2 levels simultaneously in the same bird, and did not specifically manipulate FoxP2 levels. However, our conditions for sampling song correspond to times of behavioral regulation of FoxP2 mRNA and protein
At 75d, syllable phonological scores were lower in the UD-UD condition and had a broader distribution than scores from the NS-UD condition, indicating more vocal variability (
The variability discussed here is on a time-scale of hours rather than the minute-to-minute changes that also occur
Continuous vocal practice in the morning also increased sequential variability at 75d as revealed by both the motif- and string-based analyses. For a given syllable in the UD-UD condition, it was more difficult to predict what the next syllable would be and, in some cases, there were more possible transitions than in the NS-UD condition (e.g.,
Detection of significant conditional differences was limited to data obtained at 75d because of the high within-condition variability in the 65d data. Data collected at 65d did show trends for greater variability in the UD-UD condition, with significant differences in similarity revealed by the clip-based analysis. The improved detection of conditional differences using the one second clips at 65d was surprising, given that this analysis appeared to provide a stricter test of conditional differences for the 75d data, reflected in higher p-values relative to the motif-based analysis. Hence, we did not expect the clip-based analysis at 65d to be more sensitive to any conditional difference in phonology. This unexpected finding suggests that at 65d there is a gross level of phonological variability that can only be detected in SAP by disregarding motif structure. Further, these comparisons suggest that fine-grained phonological tuning, reflected by accuracy scores, occurred less at 65d than at 75d, while coarser tuning, reflected in similarity scores, occurred at roughly the same level at both ages. We expand on these interpretations in
Power analysis revealed low power for detection of conditional differences at 65d even when prospectively increasing the number of birds per condition, likely due to high within-condition variability obscuring any between-condition effect. The adult data, like the 65d data, also showed low power for detection of conditional differences, and increasing the number of adult birds per condition does not substantially increase the power. Unlike in the 65d data, however, in adults it was the low between-condition difference that diminished the power. We note that the lack of robust conditional differences at 65d and in mature song rules out the possibility that increased variability at 75d in the UD-UD condition reflects singing fatigue. Moreover, birds sang similar amounts immediately after 2 hours in both conditions, further arguing against a fatigue effect. While we did not observe conditional effects on song variability in adult birds, other labs have documented rapid effects of social context on the variability in fundamental frequency (FF)
The lack of a conditional effect in adults is surprising, given the equivalent amount of singing-induced FoxP2 down-regulation in juveniles (Teramitsu et al., companion article) and adults
A comparison of the 65d versus 75d data in the same group of birds revealed age-related increases in song stability, as expected for juveniles undergoing sensorimotor learning. For within-condition comparisons, both the motif- and clip-based analyses showed higher accuracy scores at 75d compared to 65d. This is consistent with the detection of conditional differences in the 75d data. Unexpectedly, at the syllable level, phonological scores were not significantly higher at 75d compared to 65d. The developmental increase in accuracy scores observed at the motif/clip-based level, but not at the syllable level, may reflect a more comprehensive coarse tuning of all syllables versus fine-tuning of select syllables over the ten day period. Using the standard measure, sequence stereotypy also increased from 65d to 75d in the NS-UD condition in both the motif- and string-based analyses. In contrast, no developmental improvement in sequencing was observed using the frequency-weighted measure (
The maturational increase in stability observed here is consistent with other studies (c.f.
Moving forward, one should not consider single genes in isolation, but in the context of other genes and gene networks in humans
Methods and results.
(0.03 MB DOC)
Motif-based scores and test statistics for all statistical methods. Means are reported along with exact p-values from Student's paired t-test (parametric) and Wilcoxon signed-rank and bootstrap statistics (nonparametric) for 2-tailed tests. Significant p-values are highlighted in bold face type.
(0.09 MB DOC)
Unweighted clip- and string-based scores and test statistics. Means are reported with exact p-values from Student's paired t-test (parametric), and Wilcoxon signed-rank and bootstrap statistics (nonparametric) for 2-tailed tests. Significant p-values are highlighted in bold face type.
(0.08 MB DOC)
Frequency-weighted clip- and string-based scores. Means are reported with exact p-values from Student's paired t-test (parametric) and Wilcoxon signed-rank and bootstrap statistics (nonparametric) for 2-tailed tests. Significant p-values are highlighted in bold face type.
(0.09 MB DOC)
Subset of phonological features that did not differ between conditions at 75d. Box plots show the mean scores (middle of the box), standard error (top and bottom of the box), and upper and lower 95% confidence intervals (whiskers). Data scores for the NS-UD (filled circles) and UD-UD (open circles) conditions for ∼3 syllables from each bird (30 syllables total) are represented by individual points. Mean CV scores were obtained from 25 renditions of the same syllable. No differences in CV (p>0.05) were observed for syllable mean frequency, duration, or frequency modulation (FM).
(0.62 MB TIF)
Syllable scores did not differ between conditions at 65d. A) Paired data shows similarity scores for the NS-UD (filled circles) and UD-UD (open circles) conditions for each bird at 65d. Individual points represent a mean syllable score from a single bird. Although the mean values in the UD-UD condition were lower than NS-UD means, the differences were not significant (2-tailed paired bootstrap, p>0.05). B–D) Histograms show the distribution of phonological scores for all 25 syllables from 10 birds (2-tailed paired bootstrap, p>0.05). For both conditions, scores were broadly distributed, reflecting greater overall variability in song at 65d relative to 75d.
(1.06 MB TIF)
No conditional differences were observed in motif and sequence variability at 65d. A) Motif similarity and accuracy scores for 65d were similar between the NS-UD and UD-UD conditions (2-tailed paired bootstrap, p>0.05). B–C) Entropy scores for the string- and motif-based analysis were similar between the two conditions (2-tailed bootstrap, p>0.05). D) Histogram depicts the percent change in the string-based scores, showing bi-directional distribution.
(0.85 MB TIF)
The authors gratefully acknowledge Dr. Michael Brainard for advice on syllable pitch calculations, Dr. Masakazu Konishi for sharing resources, Dr. Alan Garfinkel for feedback on the statistics and Dr. Cara Hampton for critical comments on the manuscript.