Reader Comments

Addendum to Varnum et al., 2021

Posted by mvarnum on 03 May 2022 at 19:07 GMT


In our original publication [1], we operationalized novel music production using three indicators and a composite derived from those three indicators. One of those original indicators, the Total Hot 100 songs operationalization of novel music production, was based on the subset of songs for which we were able to successfully scrape lyrics. This indicator was compiled by the lead author. Post-publication, a researcher who was not one of the original authors contacted us and alerted us that a fuller count of songs entering the Hot 100 charts showed different temporal trends than our original indicator. As a result, we reanalyzed our data using this more complete count of songs entering the Hot 100 charts, derived from the billboard.py API, as one of our operationalizations of music production. This led to several differences in results. Generally speaking, the key findings of the original paper hold when analyses are re-run using the 2 original individual indicators of novel music production based on Wikipedia and Discogs, but not when using the new Hot 100 indicator. For full data, code, and results see https://osf.io/w7umf/ .
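As a rough illustration of what a count of songs entering the chart per year involves, the sketch below tallies each song in the year of its first chart appearance. It assumes weekly chart listings are already available as (ISO date, list of (title, artist)) pairs, for example as retrieved with billboard.py; the function name, the data layout, the matching rule (case-insensitive title and artist), and the sample data are all hypothetical and are not taken from our analysis code, which is available at the OSF link above.

```python
from collections import Counter


def songs_entering_per_year(weekly_charts):
    """Count songs by the year of their first chart appearance.

    weekly_charts: iterable of (iso_date, entries) pairs, where iso_date is a
    'YYYY-MM-DD' string and entries is a list of (title, artist) tuples.
    This layout is an assumption made for illustration only.
    """
    seen = set()
    counts = Counter()
    # ISO dates sort chronologically when sorted as strings.
    for iso_date, entries in sorted(weekly_charts, key=lambda wc: wc[0]):
        year = int(iso_date[:4])
        for title, artist in entries:
            key = (title.lower(), artist.lower())
            if key not in seen:
                # First time this song appears on any chart: count it once.
                seen.add(key)
                counts[year] += 1
    return dict(counts)


# Hypothetical toy charts (not real Billboard data).
toy_charts = [
    ("2000-01-08", [("Song C", "Artist Z"), ("Song D", "Artist W")]),
    ("1999-12-25", [("Song A", "Artist X"), ("Song B", "Artist Y")]),
    ("2000-01-01", [("Song A", "Artist X"), ("Song C", "Artist Z")]),
]
print(songs_entering_per_year(toy_charts))
```

A song that stays on the chart across a year boundary is counted only in the year it first appeared, which is one of several defensible counting rules.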

Correlations with time, lyrical compressibility, and other indices of music production

The two original indicators of music production used in these re-analyses were positively correlated with average annual lyrical compressibility: Wikipedia songs, Kendall’s tau = .680, p < .001; Discogs songs, Kendall’s tau = .720, p < .001.

The new, fuller indicator of songs entering the Billboard Hot 100 chart per year was negatively correlated with year, Kendall’s tau = -.530, p < .001, whereas the prior operationalization was positively correlated with year. The new Hot 100 indicator was also negatively correlated with average annual lyrical compressibility, Kendall’s tau = -.413, p < .001, whereas the prior operationalization was positively correlated with average annual lyrical compressibility.

The new Hot 100 indicator of music production was negatively correlated with the other two original indicators of music production: Wikipedia songs, Kendall’s tau = -.463, p < .001; Discogs songs, Kendall’s tau = -.533, p < .001. Given the lack of uniform positive correlations between the indicators, in the analyses that follow we use individual indicators rather than a composite of these three variables.
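For readers less familiar with the statistic, the correlations above are rank correlations between yearly series. A minimal pure-Python sketch of Kendall's tau (the tau-b variant, which adjusts for ties) is given below; the function name and the toy series are hypothetical, the sketch does not compute p-values, and it is not the code used for the reported results.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b via an O(n^2) comparison of all pairs of observations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical toy series: a year index and a yearly song count.
toy_years = [1, 2, 3, 4, 5]
toy_counts = [10, 12, 11, 15, 18]
print(kendall_tau_b(toy_years, toy_counts))
```

In the toy series, 9 of the 10 pairs of years are ordered the same way on both variables and 1 pair is reversed, so tau = (9 - 1) / 10 = .8.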

Robustness of key results with individual music production indicators

Control variables

The positive relationship between the Wikipedia songs indicator of music production and average annual lyrical compressibility remained significant when separately controlling for 9 of the 12 original control variables, .318 < partial Kendall’s tau’s < .677, p’s < .002. The relationship was not significant when controlling for population size, GDP per capita, or ethnic heterogeneity, partial Kendall’s tau’s < .145, p’s > .108 (see https://osf.io/w7umf/ for full results).

The positive relationship between the Discogs indicator of music production and average annual lyrical compressibility remained significant when separately controlling for 9 of the 12 original control variables, .372 < partial Kendall’s tau’s < .730, p’s < .001. The relationship was marginally significant when controlling for ethnic heterogeneity, partial Kendall’s tau = .171, p = .058. The relationship was not significant when controlling for population size or GDP per capita, partial Kendall’s tau’s < .088, p’s > .340 (see https://osf.io/w7umf/ for full results).

The negative relationship between the new Hot 100 indicator of music production and average annual lyrical compressibility remained significant when separately controlling for 7 of the 12 original control variables, -.504 < partial Kendall’s tau’s < -.179, p’s < .05. The indicator was positively correlated with average annual lyrical compressibility when controlling for conservatism, partial Kendall’s tau = .347, p = .017, and it was not significantly correlated with average annual lyrical compressibility when controlling for population size, GDP per capita, residential mobility, or ethnic heterogeneity, |partial Kendall’s tau’s| < .128, p’s > .191 (see https://osf.io/w7umf/ for full results).
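To make the logic of controlling for one variable at a time concrete, the sketch below uses the textbook first-order partial rank correlation, in which the tau between x and y is adjusted by each variable's tau with a control z. We present this as an assumption about the general technique, not as a description of the exact routine behind the numbers above (see the OSF link for the actual code); the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def _sign(v):
    return (v > 0) - (v < 0)


def kendall_tau_b(x, y):
    """Kendall's tau-b over all pairs; undefined if x or y is constant."""
    pairs = list(combinations(zip(x, y), 2))
    # Sum of +1 for concordant pairs, -1 for discordant pairs, 0 for ties.
    s = sum(_sign(a[0] - b[0]) * _sign(a[1] - b[1]) for a, b in pairs)
    untied_x = sum(a[0] != b[0] for a, b in pairs)
    untied_y = sum(a[1] != b[1] for a, b in pairs)
    return s / sqrt(untied_x * untied_y)


def partial_kendall_tau(x, y, z):
    """First-order partial tau of x and y, controlling for z."""
    t_xy = kendall_tau_b(x, y)
    t_xz = kendall_tau_b(x, z)
    t_yz = kendall_tau_b(y, z)
    return (t_xy - t_xz * t_yz) / sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))


# Hypothetical toy data: x = music production, y = compressibility,
# z = a single control variable.
toy_x = [1, 2, 3, 4]
toy_y = [1, 3, 2, 4]
toy_z = [2, 1, 4, 3]
print(partial_kendall_tau(toy_x, toy_y, toy_z))
```

When the control variable is strongly related to both series, as a steadily trending variable such as population size would be, the numerator shrinks, which is one way a raw association can lose significance after adjustment.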

Autocorrelation/Linear Trends

Using Tiokhin-Hruschka corrected thresholds to account for autocorrelation, the two original indicators of music production remained significantly positively correlated with average annual lyrical compressibility: Wikipedia songs, r = .871, corrected p < .001; Discogs songs, r = .889, corrected p < .001. The new Hot 100 indicator was significantly negatively correlated with average annual lyrical compressibility, r = -.634, corrected p < .05, but not at more conservative thresholds.

After residualizing out year in order to detrend the data, the Wikipedia-based music production indicator was marginally significantly correlated with average annual lyrical compressibility, Kendall’s tau = .154, p = .083. Neither the new Hot 100 indicator, Kendall’s tau = .113, p = .207, nor the Discogs indicator, Kendall’s tau = -.001, p = .912, was significantly correlated with average annual lyrical compressibility after this detrending procedure was performed.
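Residualizing out year can be sketched as follows: regress a yearly series on year by ordinary least squares and keep the residuals, which are then correlated in place of the raw values. This is a minimal illustration under the assumption of a simple linear trend; the function name and toy values are hypothetical, and the analysis code at the OSF link is the authoritative version.

```python
def residualize_on_year(values, years):
    """Return residuals from an OLS regression of values on years."""
    if len(values) != len(years):
        raise ValueError("values and years must have the same length")
    n = len(values)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    # Least-squares slope and intercept of the linear trend.
    sxx = sum((t - mean_year) ** 2 for t in years)
    sxy = sum((t - mean_year) * (v - mean_value) for t, v in zip(years, values))
    slope = sxy / sxx
    intercept = mean_value - slope * mean_year
    # What remains after subtracting the fitted linear trend.
    return [v - (intercept + slope * t) for t, v in zip(years, values)]


# Hypothetical toy series with an upward trend plus some wobble.
toy_years = [1960, 1961, 1962, 1963, 1964]
toy_values = [3.0, 5.5, 6.5, 9.5, 10.5]
print(residualize_on_year(toy_values, toy_years))
```

Because two series that both trend over time will correlate even if they are otherwise unrelated, correlations computed on such residuals are typically much smaller than those computed on the raw series.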

Separate auto.arima analyses with individual music production indicators as exogenous predictors found that none of the indicators were significant, z’s < .700, p’s > .480 (see https://osf.io/w7umf/ for full results).

Song-level results

We re-ran 3 separate multi-level models including year, compressibility, an individual music production indicator, and the interaction between compressibility and that music production indicator as predictors of songs’ success. For each model, year, compressibility, and the music production indicator were mean-centered prior to running the model. Here we focus on whether the effect observed with the original novel music production composite (that a song’s position on the charts was more strongly linked to its compressibility in years when more novel music was produced) was also observed when using individual indicators of novel music production.
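The predictor set-up described above can be sketched in a few lines: each predictor is centered on its mean, and the interaction term is the product of the centered compressibility and centered production values. The sketch assumes grand-mean centering across song-level rows and uses hypothetical names and toy values; it builds the predictors only and does not fit the multi-level model itself.

```python
def mean_center(values):
    """Subtract the mean so the centered variable averages zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def build_predictors(years, compressibility, production):
    """Build one row of centered predictors per song.

    production holds the year-level music production indicator repeated for
    every song released in that year (an assumed data layout).
    """
    year_c = mean_center(years)
    comp_c = mean_center(compressibility)
    prod_c = mean_center(production)
    return [
        {
            "year_c": y,
            "compressibility_c": c,
            "production_c": p,
            # Interaction term: product of the two centered predictors.
            "comp_x_prod": c * p,
        }
        for y, c, p in zip(year_c, comp_c, prod_c)
    ]


# Hypothetical toy data: four songs across two years.
toy_rows = build_predictors(
    years=[2000, 2000, 2001, 2001],
    compressibility=[0.5, 0.7, 0.6, 0.8],
    production=[100, 100, 120, 120],
)
for row in toy_rows:
    print(row)
```

Centering before forming the product means the compressibility coefficient can be read as the effect of compressibility at an average level of music production, rather than at a production value of zero.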

For the model with the Wikipedia-based indicator of novel music production, lyrical compressibility was more strongly associated with song success in years when a greater number of songs were produced, Compressibility x Wikipedia Songs interaction, B = -.003, SE = .001, t(df = 14649.946) = 3.55, p < .001. This was also true for the model with the Discogs-based indicator, Compressibility x Discogs Songs interaction, B = -.000, SE = .000, t(df = 14590.710) = 3.001, p < .001. However, this interaction was not observed for the model with the new Hot 100-based indicator of novel music production, Compressibility x New Hot 100 Songs interaction, B = .005, SE = .005, t(df = 14568.977) = .826, p = .409. See https://osf.io/w7umf/ for full results of the multi-level models.

Conclusion

These new analyses suggest that overall there is still evidence supporting a positive association between the number of novel songs produced in a given year and the average compressibility of songs entering the Billboard Hot 100 charts. Namely, this linkage is observed for 2 out of 3 individual indicators of music production, and we observe evidence that this association is largely robust to other ecological, socio-ecological, and cultural control variables and somewhat robust to attempts to account for temporal autocorrelation and linear trends in the time series.

For 2 of 3 indicators of music production, multi-level models using song-level data replicated the originally observed interaction between amount of novel music production and lyrical compressibility, such that in years when more novel music was produced, compressibility was a stronger predictor of a given song’s success. These multi-level models included year as a covariate, suggesting that these effects hold when accounting for the potentially confounding effects of linear trends in the time series.

Further, other key findings from the original publication remain unaffected, namely 1) an increase in average lyrical compressibility over time, 2) a positive association between the lyrical compressibility of a given song and its success, and 3) a strengthening of the association between lyrical compressibility and a song’s success over time.

We attempt no strong interpretation of why the new Hot 100 indicator of novel music production displays different relationships with average annual lyrical compressibility than the other indicators. However, we note that it may be a narrower index of novel music production than those based on Wikipedia and Discogs. The latter two tend to capture a far larger number of songs, and by definition the number of songs entering the Hot 100 in a given year taps more into how many new songs achieve commercial success in that year than into how many new songs are produced overall. Indeed, the negative relationships observed between the broader indicators of novel music production and the new Hot 100 indicator are consistent with this interpretation. They are also potentially of theoretical interest, as they may suggest that when more new songs are produced, a smaller total number of songs achieve commercial success. Thus, somewhat counterintuitively, it may be the case that when more cultural products are in competition, a smaller total number will achieve success. As a result, having more novel products to choose from may not only increase the success of simpler products, but may also reduce the number of products that spread broadly in the population. The broader implications of these findings should be explored more systematically in future research.

Acknowledgments
We thank Myoung Nam for bringing this issue to our attention.

References
1. Varnum MEW, Krems JA, Morris C, Wormley A, Grossmann I (2021) Why are song lyrics becoming simpler? a time series analysis of lyrical complexity in six decades of American popular music. PLOS ONE 16(1): e0244576. https://doi.org/10.1371/j...

No competing interests declared.