## Abstract

Classical music has shaped our life and culture throughout its long history, and it has therefore attracted extensive attention from researchers seeking the laws behind it. Drawing on statistical physics, here we take a different approach to classical music: we analyze the cumulative distribution functions (CDFs) and autocorrelation functions of pitch fluctuations in compositions. We analyze 1,876 compositions by five representative classical composers spanning 164 years: Bach, Mozart, Beethoven, Mendelssohn, and Chopin. We report that the biggest pitch fluctuations of a composer gradually increase from the time of Bach to that of Mendelssohn/Chopin. In particular, for the compositions of a given composer, the positive and negative tails of the CDF of pitch fluctuations not only follow power laws (the scale-free property) but are also symmetric (namely, the probability of a treble following a bass and that of a bass following a treble are basically the same for each composer). The power-law exponent decreases as time elapses. Further, we calculate the autocorrelation function of the pitch fluctuations, which shows a power-law decay for each composer. The power-law exponents vary among the composers, indicating different levels of long-range correlation of notes. This work not only suggests a way to understand and develop music from the viewpoint of statistical physics, but also enriches the realm of traditional statistical physics by analyzing music.

**Citation: **Liu L, Wei J, Zhang H, Xin J, Huang J (2013) A Statistical Physics View of Pitch Fluctuations in the Classical Music from Bach to Chopin: Evidence for Scaling. PLoS ONE 8(3): e58710. https://doi.org/10.1371/journal.pone.0058710

**Editor: **Derek Abbott, University of Adelaide, Australia

**Received: **November 8, 2012; **Accepted: **February 8, 2013; **Published: ** March 27, 2013

**Copyright: ** © 2013 Liu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Funding: **The authors acknowledge the financial support by the National Natural Science Foundation of China under Grant Nos. 11075035 and 11222544, by the Program for New Century Excellent Talents in University, by Fok Ying Tung Education Foundation under Grant No. 131008, by Shanghai Rising-Star Program (No. 12QA1400200), by CNKBRSF under Grant No. 2011CB922004, and by National Fund for Talent Training in Basic Science (No. J1103204). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Because music has accompanied human beings for thousands of years, abundant scientific research has been done to understand its fascinating power. For example, one research group used positron emission tomography to study the neural mechanisms underlying intensely pleasant emotional responses to music [1]. Voss (1989) discovered self-affine fractals in noise and music [2]. Tzanetakis and Cook analyzed the timbral texture, rhythmic content and pitch content of audio signals in an attempt to classify musical genres [3]. Clearly, these discoveries are still far from enough for a full understanding of the laws behind music.

In this work, we attempt to understand music from a statistical physics point of view. Traditional statistical physics mainly concerns natural systems, whose structural units are usually molecules or atoms. Those units are not adaptive to the environment because they have no mental faculties. Since the 1990s, methods originating from traditional statistical physics have gradually been applied to intelligent and adaptive human systems. For example, Mantegna and Stanley discovered a scaling behaviour of the probability distribution of a particular economic index in 1995 [4]. The competing and collaborating activities in a complex adaptive system were also studied to investigate risk-return relationships [5] and resource allocations [6] in human society. Besides, methods of statistical physics were applied to study the birth (death) rate of words, providing insight into language evolution [7].

In the light of such directions, here we try extending some of these methods to the field of music, especially the study of notes. In fact, a number of related works have been done before. Manaris et al. (2005) applied Zipf's law to music and studied the distribution of various parameters in music [8]. Liu et al. (2010) constructed networks with nodes and edges corresponding to musical notes and found similar properties in all networks from classical music to Chinese pop music [9]. The research group of Levitin (2012) studied the rhythm of classical music. They computed the power spectrum of the rhythm by the multitaper method and found a 1/f power law in the rhythm spectra, which can classify different musicians according to their predictability [10].

Classical music is an important branch of music that originated in Europe around the 11th century. The central norms and standards of western classical music were codified from 1550 to 1900, a span also known as the common practice period [11]. It contains three periods: the Baroque era, the Classical era and the Romantic era, when a number of outstanding musicians and masterpieces were born [12]. Therefore, for our purpose, we focus on the compositions and musicians of this common practice period in the present work.

A composition of classical music is essentially a time series of notes. The time series of pitch fluctuations of notes in a composition corresponds to types of melodies, which can distinguish various musical genres and composers. Accordingly, in this work, we mainly calculate the cumulative distribution function (CDF) and the autocorrelation function of pitch fluctuations.

## Methods

We analyze 1,876 compositions by five classical composers spanning 164 years [11], [12]. The five composers, J. S. Bach, W. A. Mozart, L. van Beethoven, F. Mendelssohn, and F. F. Chopin, are representative figures of three eras in chronological order, namely the Baroque (1600–1750), the Classical (1730–1820) and the Romantic (1815–1910) [11], [13], [14], [15], [16]. Information on the composers and the exact number of compositions selected for each are listed in Table 1.

All pieces of music in our work were downloaded from the Kern Humdrum music database [17] as MIDI files, which contain accurate and easily read musical information. A note in a music score can be named in scientific pitch notation, with a letter name and a number identifying the pitch's octave [18]. Each scientific pitch name corresponds to a certain frequency. Details can be found in Table 2, where the left column (i.e., C, D, E, F, G, A, B) is the note's letter name and the first row (namely, 0, 1, …, 9) is the pitch's octave. To proceed, we regard the sequential notes or pitches (representing frequencies) of a composition as a time series.
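The mapping from a scientific pitch name to a frequency can be sketched as follows. This is a minimal illustration, not the paper's Table 2: it assumes 12-tone equal temperament with A4 = 440 Hz and the common MIDI numbering in which C4 is note 60. The function name `pitch_to_frequency` and the constant `SEMITONES` are hypothetical.

```python
import re

# Semitone offsets of the natural note letters above C within one octave.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_to_frequency(name: str, a4_hz: float = 440.0) -> float:
    """Convert scientific pitch notation (e.g. 'A4', 'C#5', 'Bb3') to Hz.

    Assumption: 12-tone equal temperament tuned so that A4 = a4_hz.
    """
    match = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", name)
    if match is None:
        raise ValueError(f"not a scientific pitch name: {name!r}")
    letter, accidental, octave = match.groups()
    semitone = SEMITONES[letter] + {"": 0, "#": 1, "b": -1}[accidental]
    # MIDI convention assumed here: C-1 is note 0, so A4 is note 69.
    midi_number = 12 * (int(octave) + 1) + semitone
    return a4_hz * 2.0 ** ((midi_number - 69) / 12)


print(pitch_to_frequency("A4"))  # 440.0
```

Passing a different `a4_hz` is one simple way to explore how a lower reference tuning would shift every frequency by the same ratio.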

Let us denote the pitch at time $t$ as $f(t)$ ($t = 1, 2, 3, \ldots, N$), where $N$ is the length in notes of the concatenated parts of the composition. Then we introduce the pitch fluctuation, $Z(t)$, to describe the pitch change between two adjacent notes, which is defined as

$$Z(t) = f(t+1) - f(t). \tag{1}$$

We focus on two adjacent notes for two reasons. Firstly, if we instead considered the pitch change between two notes at $t$ and $t + \Delta t$ with $\Delta t > 1$, then, according to Table 2, it can easily be conjectured that the pitch change, $f(t + \Delta t) - f(t)$, could not be statistically distinguished well from Bach to Chopin, especially when $\Delta t$ is large enough. Secondly, in terms of music appreciation, two adjacent notes can be much more impressive to an audience than two separated notes with $\Delta t > 1$. However, it is worth noting that most compositions are composed of several tracks, as shown in Fig. 1. Thus, for our fluctuation calculations, we turn them into one track by appending the tracks one after another. The difference between the ending note of one track and the beginning note of the next track was removed from the calculations throughout this work.
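The concatenation rule can be sketched in a few lines. The sketch assumes each track is already available as a list of note frequencies in Hz, writes the pitch series as f(t) and the fluctuation as Z(t) = f(t+1) − f(t), and uses a hypothetical function name; MIDI parsing is left out.

```python
from typing import List, Sequence


def pitch_fluctuations(tracks: Sequence[Sequence[float]]) -> List[float]:
    """Return Z(t) = f(t+1) - f(t) for every pair of adjacent notes.

    Differences are taken within each track only, so the jump from the
    last note of one track to the first note of the next never appears.
    """
    fluctuations: List[float] = []
    for track in tracks:
        for current, following in zip(track, track[1:]):
            fluctuations.append(following - current)
    return fluctuations


# Two toy tracks; the 440.0 -> 220.0 jump between tracks is not counted.
tracks = [[440.0, 494.0, 440.0], [220.0, 262.0]]
print(pitch_fluctuations(tracks))  # [54.0, -54.0, 42.0]
```

Taking the differences per track and then concatenating gives the same series as concatenating first and deleting the boundary differences afterwards.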

## Results

### (1) Statistical analysis of pitches and pitch fluctuations

First, let us take a glimpse at the pitch data of the five composers by calculating the mean pitch for each, shown in Fig. 2. The horizontal axis lists the composers in chronological order of their years of birth. The mean pitch differs among the five composers: Bach has the smallest value, 343.66 Hz, while the values of the other four composers are all above 400 Hz. The low value for Bach is probably due to the different standards for assigning frequencies in his period, when tunings were usually lower [19].

Figure 2. The mean value of pitches for the five composers: 343.658 Hz (Bach), 435.448 Hz (Mozart), 416.332 Hz (Beethoven), 406.961 Hz (Mendelssohn), and 414.037 Hz (Chopin).

Next, let us move on to the statistical analysis of the pitch fluctuations, $Z(t)$, i.e., the pitch changes between adjacent notes. We calculated the mean value and the standard deviation of the pitch fluctuations as well as their kurtosis and skewness. All the results are shown in Table 3. The mean values of the pitch changes are all around zero for the five composers. The kurtosis of Bach is the smallest, 8.230, while that of Mendelssohn is the largest, 95.953. As for the skewness, Mendelssohn has a value of 1.618, while the values for the other composers are much smaller.
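The four summary statistics can be computed from the central moments of the fluctuation series. The sketch below is one possible convention and not necessarily the one behind Table 3: it assumes population (biased) moments and reports non-excess kurtosis, for which a Gaussian gives 3. The function name is hypothetical.

```python
import math
from typing import Dict, Sequence


def summary_statistics(values: Sequence[float]) -> Dict[str, float]:
    """Mean, standard deviation, skewness and kurtosis of a series.

    Assumptions: population moments (divide by n) and non-excess
    kurtosis m4 / m2**2.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("series is constant")
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


print(summary_statistics([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

Under this convention a kurtosis far above 3 signals tails much heavier than Gaussian, which is what motivates looking at the tails of the distribution directly.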

After the statistical analysis of pitches and pitch changes, we are now in a position to investigate the CDFs.

### (2) CDF of pitch fluctuations

The CDF (cumulative distribution function), $P(x)$, of a discrete variable $X$ describes the probability of $X$ being found larger than or equal to a number $x$ [20], [21]. It is also called the complementary cumulative distribution function or tail distribution. $P(x)$ is defined for every number $x$ as

$$P(x) = \Pr(X \ge x) = \sum_{x_i \ge x} p(x_i). \tag{2}$$

Every CDF of this kind is monotonically decreasing. If we define $P(x)$ for any positive real number $x$ (with the tail normalized over positive values), then $P(x)$ has two properties:

$$\lim_{x \to 0^{+}} P(x) = 1, \qquad \lim_{x \to +\infty} P(x) = 0. \tag{3}$$

To comply with our notation, here $X$ represents the pitch fluctuation $Z$. The positive tail and the negative tail of the CDF can therefore be calculated separately and compared [22].
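An empirical version of the two tails can be sketched as below. The sketch assumes that each tail is normalized by the number of fluctuations on its own side (so each curve starts at 1) and that zero fluctuations are dropped; the paper does not spell out these details, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def tail_cdf(magnitudes: Sequence[float]) -> List[Tuple[float, float]]:
    """Empirical P(X >= x), evaluated at each distinct positive value x."""
    data = sorted(m for m in magnitudes if m > 0)
    n = len(data)
    points: List[Tuple[float, float]] = []
    for i, x in enumerate(data):
        # At the first occurrence of x, exactly n - i values are >= x.
        if i == 0 or x != data[i - 1]:
            points.append((x, (n - i) / n))
    return points


def split_tails(fluctuations: Sequence[float]):
    """Return (positive_tail, negative_tail) as lists of (x, P) points.

    The negative tail is computed on the magnitudes |Z| of the negative
    fluctuations so that both tails share the same positive x axis.
    """
    positive = tail_cdf([z for z in fluctuations if z > 0])
    negative = tail_cdf([-z for z in fluctuations if z < 0])
    return positive, negative


positive, negative = split_tails([1, 2, 2, 4, -1, -3, 0])
print(positive)  # [(1, 1.0), (2, 0.75), (4, 0.25)]
print(negative)  # [(1, 1.0), (3, 0.5)]
```

Plotting both point lists on shared log-log axes is what makes a symmetry check between rising and falling pitch changes possible.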

The CDF of pitch fluctuations is first calculated for each composition and then grouped by composer, as shown in Fig. 3. Clearly, from the time of Bach to that of Mendelssohn/Chopin, the biggest pitch fluctuation of a composer gradually increases. The robustness of this time-evolution result is supported by the fact that the biggest pitch fluctuations of Mendelssohn and Chopin (born in 1809 and 1810, respectively) are very close to each other. In particular, both the positive and negative tails of the CDFs form a straight line in the log-log plot for every composer, indicating that the tails decay very slowly, unlike what would be expected from a purely random process. We then applied a power-law fit to both tails of the CDFs. The fitting formula is

$$P(x) \sim C x^{-\alpha}, \tag{4}$$

where $C$ is a constant and $\alpha$ is the power-law exponent. The corresponding fitting parameters are shown in Table 4. Each tail of the CDF satisfies a power law, with an exponent that differs among composers. Another finding is that, for the same composer, the positive and negative tails are almost symmetric, except for Beethoven, for whom $\alpha$ is 6.2 for the positive tail and 5.5 for the negative tail.
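A power law of the form P(x) ~ C x^(−α) is a straight line of slope −α on log-log axes, so one simple way to estimate the exponent is an ordinary least-squares line through the (log x, log P) points of the tail. The sketch below assumes that procedure purely for illustration; the text does not state which fitting method was used, and maximum-likelihood approaches such as those discussed in [21] are generally considered more reliable. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(xs: Sequence[float], ps: Sequence[float]) -> Tuple[float, float]:
    """Fit P(x) = C * x**(-alpha) by least squares on log-log axes.

    Returns (alpha, C). All xs and ps must be strictly positive.
    """
    if len(xs) != len(ps) or len(xs) < 2:
        raise ValueError("need two or more matching points")
    log_x = [math.log(x) for x in xs]
    log_p = [math.log(p) for p in ps]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_p = sum(log_p) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxp = sum((lx - mean_x) * (lp - mean_p) for lx, lp in zip(log_x, log_p))
    slope = sxp / sxx
    intercept = mean_p - slope * mean_x
    # log P = log C - alpha * log x, so alpha is minus the slope.
    return -slope, math.exp(intercept)


# Synthetic tail with a known exponent, to check that the fit recovers it.
xs = [10.0, 20.0, 40.0, 80.0]
ps = [3.0 * x ** -5.5 for x in xs]
alpha, c = fit_power_law(xs, ps)
print(round(alpha, 6), round(c, 6))
```

In practice only the part of the tail that looks linear on the log-log plot would be passed in, since the power law is expected to hold only there.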

Figure 3. Each tail of the CDFs has a part that follows a power-law (or scale-free) distribution, as indicated by the straight lines.

Next we examine the time evolution of this scaling property (the exponent $\alpha$), as shown in Fig. 4. The power-law exponent of both the positive and negative tails decreases gradually, and roughly linearly, with time. Because $\alpha$ represents the degree of attenuation of the CDF tails, the smaller the exponent is, the more slowly the tail decays. This reflects that large-scale changes happened more often in the melody. The decrease of the tail exponent ($\alpha$) reveals an evolution of classical music in which the melody has larger ups and downs from Bach to Mendelssohn/Chopin.

Figure 4. The power-law exponent $\alpha$ decreases from Bach to Mendelssohn/Chopin. [Note the horizontal coordinates corresponding to the five symbols in either (a) or (b) denote the birth years of the five composers from Bach to Chopin, respectively.] The lines are just a guide to the eye.

### (3) Autocorrelation function of pitch fluctuations

In statistical physics, the autocorrelation function of a time series describes the correlation of the series with itself as a function of the time difference [23]. For a discrete time series, $X_t$, the autocorrelation function, $R(\tau)$, for a time difference, $\tau$, is defined as

$$R(\tau) = \frac{E\left[(X_t - \mu)(X_{t+\tau} - \mu)\right]}{\sigma^2}, \tag{5}$$

where $\mu$ is the mean value of $X_t$, $\sigma^2$ the variance, and $E$ the expected value operator. The autocorrelation function takes values in the range [−1, 1], with −1 indicating perfect anti-correlation and 1 perfect correlation [24]. Here we use $X_t$ to denote the absolute value of the pitch fluctuations, $|Z(t)|$.

Unlike the CDF calculation above, here we first calculate the autocorrelation function of each composition and then average the autocorrelation values over the compositions of each composer. We selected only the compositions with more than 250 notes, to avoid the unusually large autocorrelation values that arise in short series.
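The per-composition calculation and the per-composer average can be sketched as follows. Several details are assumptions rather than the paper's stated procedure: the sample mean and variance of the whole series stand in for the expectation values, the lagged products are averaged over the available pairs, and the 250-note threshold is applied to the length of the fluctuation series for simplicity. Function and parameter names are hypothetical.

```python
from typing import List, Sequence


def autocorrelation(series: Sequence[float], max_lag: int) -> List[float]:
    """Estimate R(tau) for tau = 1..max_lag from a single series."""
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    if variance == 0:
        raise ValueError("series is constant")
    result: List[float] = []
    for lag in range(1, max_lag + 1):
        # Average of lagged products over the n - lag available pairs.
        covariance = sum(
            (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
        ) / (n - lag)
        result.append(covariance / variance)
    return result


def composer_autocorrelation(
    compositions: Sequence[Sequence[float]],
    max_lag: int = 50,
    min_length: int = 250,
) -> List[float]:
    """Average R(tau) of |Z| over all sufficiently long compositions.

    Each element of `compositions` is one composition's fluctuation series.
    """
    curves = [
        autocorrelation([abs(z) for z in fluctuations], max_lag)
        for fluctuations in compositions
        if len(fluctuations) > min_length
    ]
    if not curves:
        raise ValueError("no composition is long enough")
    return [sum(curve[i] for curve in curves) / len(curves) for i in range(max_lag)]
```

For example, `composer_autocorrelation(list_of_fluctuation_series)` returns 50 values, one per lag from 1 to 50 notes, ready to be plotted on log-log axes.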

The autocorrelation function of the absolute values of the pitch fluctuations is shown in Fig. 5. The values of the autocorrelation function are positive for every composer, which indicates a positive correlation of $|Z(t)|$. The autocorrelation functions of all five composers form a straight line in the log-log plot (namely, power-law behavior), indicating a slow decay of the autocorrelation. We then applied a power-law fit to the autocorrelation function. The fitting formula is

$$R(\tau) \sim A \tau^{-\beta}, \tag{6}$$

where $A$ is a constant and $\beta$ is the power-law exponent. The results of the power-law fitting are shown in Table 5. The exponent $\beta$ varies from composer to composer, as shown in Fig. 6. This means that the decay rate of the autocorrelation function differs among composers; that is, they have different levels of long-range correlation of pitch fluctuations. For example, Mendelssohn and Chopin, although of the same era, take the two extreme values of $\beta$.

Figure 5. The horizontal coordinate indicates the time lag, $\tau$, from 1 note to 50 notes, while the vertical coordinate indicates the value of $R(\tau)$. It is worth noting that $R(\tau)$ is always positive. In this log-log plot, each of the five panels shows a straight line, suggesting a long-range correlation of notes for each of the five composers.

Figure 6. The five composers have different $\beta$'s. Chopin has the smallest value while Mendelssohn has the largest, although they were of the same era.

## Conclusions

In conclusion, we have revealed that the biggest pitch change (between two adjacent notes) of a composer gradually increases from Bach to Mendelssohn/Chopin. The positive and negative tails of the CDF (cumulative distribution function) for the compositions of a composer not only follow power laws (i.e., a scale-free distribution) but are also symmetric (namely, the probability of a treble following a bass and that of a bass following a treble are basically the same for each composer), and the power-law exponent decreases as time elapses. Furthermore, we have calculated the autocorrelation function of the pitch fluctuations, which shows a general power-law decay for each composer. The power-law exponents vary among the composers, indicating different levels of long-range correlation of pitch fluctuations.

Compared with previous work on analyzing music, we focus on pitch fluctuations and study the time evolution and development of classical music. All of our statistical results are based on MIDI files, and we chose only these five composers because of the limitations of the database. However, in the preparation of the MIDI files, different temperaments, tunings and transpositions in the music were neglected. Works played on different instruments may correspond to different notes and even form different styles. Thus the statistical results remain to be improved in these aspects. Further, although we study the overall statistical properties of each composer, each composer adopted various styles over his career, so we offer only a rough comparison of style between composers.

This work may be of value not only for suggesting a way to understand and develop music from a statistical physics point of view, but also for enriching the realm of traditional statistical physics by including music.

## Author Contributions

Conceived and designed the experiments: JPH. Performed the experiments: LL JRW HSZ JHX. Analyzed the data: LL JRW HSZ JHX JPH. Contributed reagents/materials/analysis tools: LL JRW HSZ JHX. Wrote the paper: LL JRW HSZ JHX JPH.

## References

- 1. Blood A, Zatorre R (2001) Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proceedings of the National Academy of Sciences 98: 11818–11823.
- 2. Voss R (1989) Random fractals: Self-affinity in noise, music, mountains, and clouds. Physica D: Nonlinear Phenomena 38: 362–371.
- 3. Tzanetakis G, Cook P (2002) Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing 10: 293–302.
- 4. Mantegna R, Stanley H (1995) Scaling behaviour in the dynamics of an economic index. Nature 376: 46–49.
- 5. Song K, An K, Yang G, Huang J (2012) Risk-return relationship in a complex adaptive system. PLoS ONE 7: e33588.
- 6. Zhao L, Yang G, Wang W, Chen Y, Huang J, et al. (2011) Herd behavior in a complex adaptive system. Proceedings of the National Academy of Sciences 108: 15058–15063.
- 7. Petersen A, Tenenbaum J, Havlin S, Stanley H (2012) Statistical laws governing fluctuations in word use from word birth to word death. Scientific Reports 2.
- 8. Manaris B, Romero J, Machado P, Krehbiel D, Hirzel T, et al. (2005) Zipf's law, music classification, and aesthetics. Computer Music Journal 29: 55–69.
- 9. Liu X, Tse C, Small M (2010) Complex network structure of musical compositions: Algorithmic generation of appealing music. Physica A: Statistical Mechanics and its Applications 389: 126–132.
- 10. Levitin D, Chordia P, Menon V (2012) Musical rhythm spectra from Bach to Joplin obey a 1/f power law. Proceedings of the National Academy of Sciences 109: 3716–3720.
- 11. Kennedy M, Bourne J (2006) The Oxford Dictionary of Music. Oxford University Press, USA. 1008 p.
- 12. Johnson J (2002) Who needs classical music?: cultural choice and musical value. Oxford University Press, USA.
- 13. Perreault J, Fitch D (2004) The thematic catalogue of the musical works of Johann Pachelbel. Lanham, Md.: Scarecrow Press.
- 14. King A (1973) Some aspects of recent Mozart research. In: Proceedings of the Royal Musical Association. Taylor & Francis, volume 100, pp. 1–18.
- 15. ClassicalNet website. Available: http://www.classical.net/music/composer/works/chopin/index.php. Accessed 2012 March 3.
- 16. Taruskin R (2009) The Oxford History of Western Music: Music in the Nineteenth Century, volume 3. OUP USA.
- 17. Kern website. Available: http://kern.humdrum.net/. Accessed 2011 Oct 10.
- 18. Young R (1939) Terminology for logarithmic frequency units. The Journal of the Acoustical Society of America 11: 134–139.
- 19. Cavanagh L (2009) A brief history of the establishment of international standard pitch a = 440 Hz. WAM: Webzine about Audio and Music 4.
- 20. Kokoska S, Zwillinger D (2000) CRC Standard Probability and Statistics Tables and Formulae. CRC.
- 21. Clauset A, Shalizi C, Newman M (2009) Power-law distributions in empirical data. SIAM Review 51: 661–703.
- 22. Zhou W, Xu H, Cai Z, Wei J, Zhu X, et al. (2009) Peculiar statistical properties of Chinese stock indices in bull and bear market phases. Physica A: Statistical Mechanics and its Applications 388: 891–899.
- 23. Box G, Jenkins G, Reinsel G (2011) Time series analysis: forecasting and control, volume 734. Wiley.
- 24. Bendat J, Piersol A (2011) Random data: analysis and measurement procedures, volume 729. Wiley.