Standardized cross-cultural databases of the arts are critical to a balanced scientific understanding of the performing arts, and their role in other domains of human society. This paper introduces the Global Jukebox as a resource for comparative and cross-cultural study of the performing arts and culture. The Global Jukebox adds an extensive and detailed global database of the performing arts that enlarges our understanding of human cultural diversity. Initially prototyped by Alan Lomax in the 1980s, its core is the Cantometrics dataset, encompassing standardized codings on 37 aspects of musical style for 5,776 traditional songs from 1,026 societies. The Cantometrics dataset has been cleaned and checked for reliability and accuracy, and includes a full coding guide with audio training examples (https://theglobaljukebox.org/?songsofearth). Also being released are seven additional datasets coding and describing instrumentation, conversation, popular music, vowel and consonant placement, breath management, social factors, and societies. For the first time, all digitized Global Jukebox data are being made available in open-access, downloadable format (https://github.com/theglobaljukebox), linked with streaming audio recordings (theglobaljukebox.org) to the maximum extent allowed while respecting copyright and the wishes of culture-bearers. The data are cross-indexed with the Database of Peoples, Languages, and Cultures (D-PLACE) to allow researchers to test hypotheses about worldwide coevolution of aesthetic patterns and traditions. As an example, we analyze the global relationship between song style and societal complexity, showing that they are robustly related, in contrast to previous critiques claiming that these proposed relationships were an artifact of autocorrelation (though causal mechanisms remain unresolved).
Citation: Wood ALC, Kirby KR, Ember CR, Silbert S, Passmore S, Daikoku H, et al. (2022) The Global Jukebox: A public database of performing arts and culture. PLoS ONE 17(11): e0275469. https://doi.org/10.1371/journal.pone.0275469
Editor: Steven R. Livingstone, University of Ontario Institute of Technology, CANADA
Received: May 26, 2022; Accepted: September 14, 2022; Published: November 2, 2022
Copyright: © 2022 Wood et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All coded data are available at https://github.com/theglobaljukebox. Source code for data conversion and analysis is available at https://zenodo.org/record/6537663#.YnszmllS_BK. Audio files are available for streaming at http://theglobaljukebox.org, with some restrictions as explained in the text. The datasets are archived with ZENODO, and the DOI provided by ZENODO should be used when citing particular releases of Global Jukebox datasets, which are available within the respective GitHub repositories. For details regarding third-party streaming and downloading of audio, see Section 1.6 (“Availability of Audio Recordings”).
Funding: The Global Jukebox has been developed with support from the National Endowment for the Arts, the National Endowment for the Humanities, the Concordia Foundation, the Rock Foundation, and Odyssey Productions. PES, HD, and SP are supported by funding from the Yamaha corporation, a Grant-in-Aid from the Japan Society for the Promotion of Science (#19KK0064), and by grants from Keio University (Keio Global Research Institute and Keio Gijuku Academic Development Fund). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
During the 20th century, anthropologists began organizing data on cross-cultural diversity in ways that could be systematically compared on a global scale. The Ethnographic Atlas coded data on social structure, kinship, religion, and economy; the Human Relations Area Files (HRAF) compiled and subject-indexed detailed ethnographic texts; Ethnologue and Glottolog cataloged linguistic diversity. These resources allow scientists to quantitatively test cross-cultural hypotheses using global data. The 21st century has seen a resurgence of interest in such global databases, stimulating new research and debate on the nature of cross-cultural diversity and cultural evolution [5–12]. Alan Lomax and Conrad Arensberg’s Expressive Style Research Project at Columbia University, which also began in the mid-20th century, complemented resources like the Ethnographic Atlas for the domain of the performing arts by integrating its cross-cultural classification design into the research design of “Cantometrics” (“canto” = song, “metrics” = measure). In the 1980s their studies of song, dance, instrumentation, conversation, popular music, vowel and consonant placement, breath management, social factors, and societies were brought together in an interactive multimedia platform, the “Global Jukebox” [13–18]. While several subsequent studies have used methods and samples modeled after Cantometrics to analyze hundreds of traditional music recordings from around the world [19–23], the full Global Jukebox sample of over 5,000 coded performances was never made publicly available until now.
The Global Jukebox is an interactive online resource for exploring music and other performing arts cross-culturally. On it, the immediacy of field recordings representing the full range of the world’s music can be experienced with reference to ethnographic, historical, environmental, linguistic, and geographic contexts, with song lyrics and testimonials by first-hand observers, musicians, and culture members. The Jukebox thus bridges the sciences and the humanities. It encompasses thousands of examples of singing, dancing, speaking, instrumentation, and other performing arts from over 1,000 societies, transformed into an online form that can be used for research, education, and cultural activism.
This article announces the long-anticipated publication of the raw coded data in downloadable form of Cantometrics and six additional studies and two supporting datasets (see Table 1), on https://theglobaljukebox.org. By releasing these data to the public, we hope to enrich scientific data on the expressive arts, to support cultural diversity, and to facilitate the practice of cultural equity in homes, classrooms, and research organizations.
Scientific cross-cultural comparison of music has been conducted since the late 19th century, when the invention of the phonograph made relatively objective comparison of sound possible for the first time, birthing the field of comparative musicology, which later became known as “ethnomusicology” [25–30]. But until Alan Lomax’s research on expressive culture, such comparisons were generally limited to relatively small samples of recordings and emphasized those aspects of melody, pitch and rhythmic structure that are privileged in Western staff notation. In intellectual partnership with the anthropologist Conrad Arensberg, Lomax worked with multidisciplinary teams from the 1960s through the mid-1990s to collect and analyze thousands of examples of recorded expressive traditions from all world regions with a new, radical approach that emphasized performance style and social interaction [14–18] (see 1.3 and S1.2 in S1 File for details).
Lomax and Arensberg treated the performing arts empirically, as expressive behavior, and searched for the recurring features of performance that affiliate and differentiate preferred styles of singing, dancing, and speaking in a culture or region. They theorized that culture is a web of interaction that is formalized and codified in the expressive arts. They proposed that performance styles are influenced by human fundamentals like subsistence, the organization of work, social structure, environment, and cultural history. They amassed large global samples of recorded performances and developed systems for coding them (Table 1). Each example is classified and coded by aesthetic, organizational, qualifying, and formal features that can be compared cross-culturally, making it possible to explore relationships between music, dance, speech, social life and the environment. A cross-cultural approach modeled after the Ethnographic Atlas was used to frame, sample, and analyze the resulting data; multivariate factor analysis and correlation tests were used to seek relationships and patterns. For example, they found large regions sharing similar performance styles, which were argued to match patterns of ancient human settlement, subsistence, and migration (Fig 1; see summary of more results in S1.2 in S1 File).
Lomax and Arensberg’s novel methods, execution, and conclusions drew considerable criticism when they were published and subsequently (e.g., [31–36]; cf. [14,17,18] for review and discussion). One major impediment to overcoming criticism was that the raw Cantometric data and sample were never made public for others to examine and reanalyze. The public release of the Global Jukebox now largely solves this problem.
The Global Jukebox and its data
In the 1980s, the project’s multiple datasets were incorporated into a single multimedia, relational database that Lomax called the Global Jukebox. Despite interest and financial support from Apple and other institutions, the project was curtailed by technological limitations and the intellectual climate of the eighties and nineties. In 2010, ALCW, an anthropologist, GD’A, a world leader in architectural digital design who had worked with Lomax on the Global Jukebox prototype, and a team of ethnomusicologists, designers and programmers, stepped in to realize the project.
A preliminary beta version of a new Global Jukebox was released in 2017 as an online resource for exploring music and other performing arts cross-culturally. While the release generated much excitement among the public and media such as the New York Times and Rolling Stone [37–41], the digitized codings had not yet undergone quality control and were not yet downloadable for research or accompanied by any formal peer-reviewed documentation of the kind now provided in this article. Between 2017 and 2021, the raw data and metadata of Cantometrics were processed for full publication: descriptive data on each song and society were revised and expanded with song details, lyrics, genres and instruments, and nearly two thousand missing and truncated audio files were added. Songs were assigned to narrower and more accurate societal groupings with new coordinates. An updated comprehensive dataset of societies was constructed within a neutral geographic classification with linguistic, ethnohistorical, climatological and terrain referents. Societies were matched to the Ethnographic Atlas, the Standard Cross-Cultural Sample, the Binford Hunter-Gatherer dataset, and Jorgensen’s Western North American Indian dataset so that the database could be shared with D-PLACE (see S1.4.3 in S1 File for details). JM applied automated procedures for identifying potential coding errors to the Cantometrics data, which were then manually checked and corrected as necessary (see S1.8 in S1 File). Coding reliability was rigorously tested on a random sample (see S1.7 in S1 File). Finally, once we had completed and documented this multi-year quality-control process, we made the full Global Jukebox data and metadata downloadable via GitHub and Zenodo (https://github.com/theglobaljukebox) in May 2021 in tandem with uploading a preprint of the current manuscript to PsyArXiv (https://psyarxiv.com/4z97j).
Our article focuses primarily on the data of Cantometrics, and concludes with an important piece of reanalysis (see section 5). We announce the long-anticipated publication of the raw coded Cantometrics data in downloadable form, and of six additional studies plus two supporting datasets (see Table 1), representing over 1,000 societies. This public release will make it possible for scientists to test cross-cultural hypotheses using global data, and will stimulate new research and debate on cross-cultural diversity, the nature of aesthetic preferences, and the role of expressive arts in human evolution [6–12,45]. It also supports the practice of cultural equity in scientific approaches to music research.
1. The data
1.1. Datasets of the Global Jukebox
Foundational to the Global Jukebox are its datasets. Seven cross-cultural studies on distinct but related aspects of the performing arts, and two supporting datasets of societies are currently available (see Table 1; full descriptions in S1.3.1 in S1 File). An additional five performance datasets, plus four derivative studies, and one supporting taxonomy of subsistence are in progress, and are not included in the current release (see S11 Table in S1 File). This article focuses on the largest, best known, and most comprehensive dataset, Cantometrics, which uses 37 features (cf. Table 2) to code musical style for 5,776 audio recordings of traditional songs from 1,026 societies.
We also introduce several datasets representing more detailed analyses of performance style using many of the audio recordings coded by Cantometrics, as well as additional recordings: Minutage uses 36 additional variables to code breathing and phrasing patterns for 687 songs from 118 societies, while Phonotactics uses 51 variables to code vowel and consonant use for 338 songs from 47 societies. In addition, we include the Instruments dataset, which uses 14 different variables to code structural and functional aspects of 1,780 instruments from 152 societies, Ensembles with 776 cases from 153 societies, the Urban Strain (a popular song study) with 378 songs and dances from North America, and the Parlametrics dataset, which uses 52 variables to code aspects of spoken conversation for 188 audio recordings from 158 societies (Table 1). Unlike Cantometrics, these datasets have not yet been thoroughly cleaned and checked, but are being released together for maximum accessibility and transparency. Other datasets in the process of being digitized and prepared for release in a future publication are Choreometrics, Personnel and Orchestra, Song Texts, Vocal Qualities, Popular Songs (an extension of the Urban Strain set not yet coded), and several derivative studies on song style and social structure (see S1.3.3 in S1 File for details).
Finally, we publish data on social structure (“Social Factors” and “Societies”) built from an adapted version of the Ethnographic Atlas by Lomax, Arensberg, Barbara Ayres and their colleagues, and used in Lomax et al.’s previous analyses of relationships between music and culture. Their inclusion here will allow replication and extension of Lomax et al.’s original analyses.
1.2. Comparison with other cross cultural datasets
One of the major advantages of the Global Jukebox is its size and global scope compared to other published cross-cultural performing arts datasets. In particular, the Cantometrics dataset of 5,776 coded songs from 1,026 societies is more than an order of magnitude larger than the largest previously available global datasets that used Cantometrics or similar coding schemes to classify musical style, such as the 304 recordings from the Garland Encyclopedia of World Music analyzed in  or the 118 recordings from the Natural History of Song Discography (NHS-D) analyzed in . Beyond raw recording numbers, Cantometrics is also more balanced in many ways than NHS-D or Garland: Cantometrics includes more songs per society (median: 4 for Cantometrics vs. 1 for NHS-D and Garland), has better representation of small-scale societies than Garland, and better representation of large-scale societies than NHS-D. Of course, each dataset has its own sampling criteria and analysis goals, each with its own strengths and weaknesses: for instance, like the NHS-D, Cantometrics’ principal focus is vocal music, but it includes several variables for the analysis of orchestras, and its data are linked with a dataset classifying personnel and orchestra (to be published soon), and two others classifying instruments and ensembles. Garland includes both instrumental and vocal music. At the same time, the NHS-D only sampled examples of four song genres (dance songs, lullabies, love songs, and healing songs), while Cantometrics and Garland include numerous traditional genres that are identified by their emic names as well as broader genre categories. Unlike Cantometrics and Garland, NHS-D includes transcriptions in Western staff notation in addition to Cantometric-like codings. Clearly, Cantometrics may be more suitable for some purposes and not others, while all of these datasets may serve as complementary sources of evidence to investigate questions about cross-cultural musical diversity.
For a fuller description of key differences between these three global datasets and other regional datasets, cf. . (See also section 2.1 for more details on the Cantometrics sampling methodology).
1.3. Coded performance variables: Selection and reliability
Cantometric variables were developed through field observation, intensive listening, and experimentation. Lomax and his collaborators looked for performance qualities with significant worldwide variation and with a role in shaping performance traditions. Variables requiring fine distinctions to be made were discarded. Lomax and Victor Grauer aimed to transcend Western staff notation’s overly narrow focus on pitch and rhythm by including variables that capture aspects of cohesiveness and differentiation within performances, social and musical organization, synchrony, relationships within and between instruments and voices, text, ornamentation, and vocal qualities. Together with melodic, rhythmic and structural features, these factors aimed to capture a broad matrix of aesthetic and social codes or conventions that are fairly consistent at the cultural level.
Following a similar methodology, specialist teams developed parameters for coding dance, speech and additional aspects of music, producing multiple datasets (see S3-S9 Tables in S1 File for variable descriptions for other published datasets). The inclusion of speech in this project, alongside forms more typically recognized as art, like song and dance, reflects Lomax’s intentions to study the cross-cultural aesthetic patterns of expressive culture in its various forms rather than on the basis of conventional definitions of “art.” Full details for most coding systems have been republished or published for the first time in [46,47]. The Global Jukebox website contains a fully digitized training course (“Songs of Earth: Aesthetic and Social Codes in Music”), with recordings and coding guides to allow researchers to interpret existing Cantometrics codings and to become trained Cantometric coders capable of adding new codings. New introductions, coding guides, and results for the other datasets are available in .
Many limitations of Cantometrics, Choreometrics, and related schemes have been previously critiqued (cf. [14,17,18] for review and discussion). For example, some argue they are too subjective, and thus have low reliability, while others argue they are not subjective enough in that they do not account for the ways in which similar sound structures can have different meanings in different cultures. While we cannot provide data to evaluate all criticisms here, we did conduct an analysis of inter-rater reliability by having two trained coders (ALCW and PES) independently code a random sample of 30 songs from the Global Jukebox using Cantometrics and compare their codings against each other’s, against codings of students in Japan recently trained in Cantometrics, and against the codings in the Global Jukebox. These analyses found inter-rater reliability to be at acceptable levels on average, but with substantial variation in reliability across variables (see sections 4, S1.7, S9 Fig, and S12 Table in S1 File for details).
1.4. Performance data sources
Performance data are based on analysis of audiovisual sources: recorded examples of song and speech and filmed examples of dance and movement. As a documentarian himself, Alan Lomax had great faith in the recorded (and filmed) medium and what it could communicate. He already had a substantial library of world music on records and field tapes recorded himself or sent to him by colleagues, but he and Grauer spent a year acquiring more recordings to fill in the gaps of their first sample of over 2,000 recordings. Geopolitical tensions during the middle of the Cold War limited the accessibility of some regions, and it was only possible later, in stages, to obtain material from some parts of Eastern Europe, the former Soviet Union, rural China, India, and the Pacific.
1.5. Selection of audio examples
Lomax primarily sampled folk and Indigenous songs for Cantometrics, although small samples of traditional art music and jazz are also included (e.g., Hindustani/Carnatic traditions of South Asia; Central European classical singing; various examples of Chinese opera, Javanese Gamelan, modern jazz, etc.). In his own research practice, Lomax sought out the oldest and most typical songs and performance styles because he believed they would shed light on how singing voices the touchstones of emotion and personality development in a culture. He held extensive consultations with the singers and culture holders he recorded, which in many instances amounted to mini- or full autobiographical accounts (e.g., “Mr. Jelly Roll”, a full-length curated autobiography, and interviews with Bessie Jones; see  for a book-length review). Representativeness within a tradition, performance quality, and aesthetic criteria were important in selecting songs for Cantometrics. Lomax made every effort to include women’s and children’s songs, although these were not always readily available at the time. Final selections were based on subjective but informed decisions by Lomax and experts who recorded the songs. Lomax also drew upon his own extensive fieldwork, and from oral histories, recording liner notes, and the accounts of ethnographers and collectors.
An assumption of Cantometrics, based on Lomax’s field experience, early experiments, and listening sessions with Victor Grauer (co-inventor of Cantometrics with Lomax), was that the same features of song style would appear throughout most examples of singing in a society. Extensive piloting during the development of the Cantometric song sample led Lomax and Grauer to conclude that approximately ten songs were sufficient to capture the key elements of a song style in most folk and Indigenous societies. In cases where it was not possible to code ten songs for a given society, Lomax’s above-mentioned emphasis on representativeness of a tradition in the selection of song(s) to code justifies the inclusion of these societies’ data in cross-cultural comparisons and hypothesis testing. This assumption has been critiqued and it could be systematically retested, but it is debatable whether a better option was available given the limitations of available recordings and resources to devote to manually listening to and coding these recordings (cf. discussion in ). The broader criticism of insufficient song samples for some societies and regions, such as Polynesia, is valid and will be addressed in future updates to the data.
A related critique of the original Cantometrics dataset asserted that its song sample did not sufficiently cover intra-societal diversity, and that societal designations were too broad. These issues have been addressed in this current release by splitting some of the original societies into more specific subgroups, and assigning more precise ethnolinguistic and geographic designations to societies (see 2; S1.4 in S1 File). A consequence of this decision to classify societies at finer grain is that the database now appears to have a larger number of societies and smaller number of songs per society (currently 1,026 societies with a median of 4 songs per society, rather than the “minimum of ten songs per society” from 233 societies originally described by Lomax [14,16]). However, the current finer-grained classification allows researchers to model higher-order relationships between societies (e.g., linguistic, geographic) while also preserving potential differences between closely related societies. A recent study by Daikoku et al. shows the potential to use Cantometrics data to investigate questions of intra-societal musical diversity.
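The effect of finer-grained classification on songs-per-society statistics can be illustrated with a minimal sketch. The society labels and the subgroup-to-broad-group mapping below are hypothetical toy values, not entries from the Cantometrics data, and the function name is our own.

```python
from collections import Counter
from statistics import median

def songs_per_society(society_ids):
    """Count coded songs per society and return (counts, median count)."""
    counts = Counter(society_ids)
    return counts, median(counts.values())

# Hypothetical toy data: one society label per coded song.
fine_grained = ["a1", "a1", "a1", "a2", "b1", "b1"]
# Hypothetical mapping from finer subgroups back to broader designations.
broad_of = {"a1": "A", "a2": "A", "b1": "B"}
lumped = [broad_of[s] for s in fine_grained]

split_counts, split_median = songs_per_society(fine_grained)
lumped_counts, lumped_median = songs_per_society(lumped)

print(len(split_counts), split_median)    # societies and median under the split scheme
print(len(lumped_counts), lumped_median)  # societies and median under the lumped scheme
```

With the same six songs, splitting yields more societies and a lower median number of songs per society than lumping, which is the pattern described above.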
1.6. Availability of audio recordings
Alan Lomax had research and publishing agreements with artists, collectors, filmmakers, repositories, and publishers; the Association for Cultural Equity (ACE) has obtained or requested streaming rights on the Global Jukebox website. But distributing these recordings presents challenges in terms of copyright and respecting the wishes of culture bearers. Our lawyers advise us to ask all remaining rights holders for permission to allow their songs to be downloaded for use in scientific research and publications. This is a lengthy and expensive process, and there is no way of circumventing it. The large institutions that now house many collections of field recordings are increasingly reluctant to grant this kind of access to their holdings. We have excluded 456 Indigenous American and Australian songs from streaming pending permission from their communities. The audio files for 191 songs are currently missing, but their coded data are available.
However, we can make approximately 2,000 of the songs available for download for scientific research and publication, and we have developed a process for facilitating the use of audio samples for research. We have prepared two agreements, which will be available on our website: (1) between researchers and the Global Jukebox, with terms for using audio; (2) between third-party rights holders and researchers, authorizing researchers to use the rights holders’ recordings. APIs of audio examples will be linked with the song data on GitHub and Zenodo. To access them, an authorization token from the Global Jukebox will be required. Initially, approval will be handled case by case, but it can be automated in the future if there is a sufficient volume of requests. Audio will be restricted to those samples ACE controls, with more added as rights holders opt in, or agree to the new terms through our ongoing efforts. Researchers who would like to use examples in scientific publication are asked to contact the rights holders or repositories for permission to use those songs; this will not be difficult with those samples ACE controls. Researchers can also link readers to a streaming audio playlist of selections on the Global Jukebox.
1.7. Performance metadata
Metadata for each coded song performance includes a unique coding identification number; source society; audio reference numbers and information; song and performer information; setting and context if known; collector, publishing and archival data and year recorded; comments; source tags; and additional descriptive information. Fig 2 is a screenshot of metadata and codings for one example song. Metadata for Cantometrics, Phonotactics, Minutage, Parlametrics, and Instruments and Ensembles are summarized in S10 Table in S1 File.
2. Sampling societies and links to other datasets
2.1. Sampling societies
The Jukebox is designed to be an experience of the expressive arts as cultural phenomena. The datasets (songs, instruments, conversation, etc.) are related through their source societies. We use the term “societies” as an alternative to the term “cultures” or other terms for defining cultural groups. The Global Jukebox follows the Ethnographic Atlas in adopting ethno-linguistically defined societies as a key unit of analysis. The Jukebox differs from the Ethnographic Atlas in that in most cases a given society is not represented by a single set of coded data, but instead contains multiple examples of different performances. Performances (songs, dances, etc.) thus represent a finer unit of analysis beneath the larger unit of societies, and make it possible to account for local/regional historical, sociological and aesthetic influences, as well as for musical diversity within societies [52,53]. Societies can also be related to one another at higher levels of organization (e.g., people, Köppen climate/terrain, language family, geographic area or region, etc.; see S1.4.2 in S1 File). For visualization on the maps and linguistic trees of D-PLACE, songs and other performance data are condensed into a single “modal profile” representing the most common traits across all examples for a given society (cf.  for examples of how D-PLACE can be used to explore patterns in culture and their relationship to population history (via language) and geography).
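As an illustration of the “modal profile” idea, the following minimal sketch takes the most common code per variable across a society’s songs. It is not the procedure used by the Global Jukebox or D-PLACE: the input layout, the identifiers (`soc_A`, `line_1`, etc.), the handling of missing values, and the tie-breaking rule (smallest code wins) are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def modal_profile(codings):
    """Condense per-song codings into one modal profile per society.

    codings: iterable of (society_id, {variable: code}) pairs, one per song.
    Returns {society_id: {variable: most common code}}.
    """
    by_society = defaultdict(lambda: defaultdict(Counter))
    for society_id, song_codes in codings:
        for variable, code in song_codes.items():
            if code is not None:  # assumption: missing codings are ignored
                by_society[society_id][variable][code] += 1
    profiles = {}
    for society_id, variables in by_society.items():
        profiles[society_id] = {
            # Assumed tie-break: highest count first, then smallest code.
            variable: min(counts, key=lambda c: (-counts[c], c))
            for variable, counts in variables.items()
        }
    return profiles

# Hypothetical toy codings for two societies.
songs = [
    ("soc_A", {"line_1": 4, "line_4": 1}),
    ("soc_A", {"line_1": 4, "line_4": 2}),
    ("soc_A", {"line_1": 7, "line_4": 2}),
    ("soc_B", {"line_1": 1, "line_4": None}),
]
profiles = modal_profile(songs)
print(profiles)
```

A mode discards within-society variation, so analyses of intra-societal diversity would work from the per-song codings rather than from such a profile.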
A total of 1,275 Indigenous and folk societies across thirteen world regions are represented in the Global Jukebox, plus 472 popular song cultures. This accounts for societies that were sampled in the primary studies listed in Table 1, with the exception of Instruments and Ensembles. Where possible, Cantometrics was sampled from the more than 1,200 societies for which cultural data had already been coded in Murdock’s Ethnographic Atlas to facilitate comparison of song style and social structure [14,17,18]. Of the total number of Global Jukebox societies, 1,234 are linked with coded data (including Choreometrics data coding dance style, which will be published in the future), and of that number, 1,026 are included in the Cantometrics set. In addition, 508 Cantometrics societies have been coded for a series of social variables, describing aspects of social and community organization, most of which are taken from (or slightly modified from) Murdock’s Ethnographic Atlas variables. This dataset is released here as the ‘Social Factors’ dataset.
The cases included in the “Ensembles” and “Instruments” datasets depart from the other datasets in that they only partially conform to our definition of society. Because these studies use bibliographic sources rather than specific audio recordings, the societal designations made by the original investigators were often necessarily much broader than those in the other datasets (they ‘lump’ many societies that are ‘split’ in other datasets). Future research with scholars who specialize in musical instruments will be necessary to match the Instruments/Ensembles data with our more detailed societal data.
2.2. Links to other cross-cultural datasets
One goal of this release of Lomax’s datasets is to enable researchers to map and reanalyze the kinds of relationships between the arts and society originally explored by Lomax (cf. Fig 4 for an example comparing distributions of song style and social complexity). Of the 1,026 societies sampled in the Cantometrics study, approximately half have been coded for other cultural features in one or more of the ethnographic datasets currently available through D-PLACE. Specifically, 469 Cantometrics societies are also coded in the Ethnographic Atlas; 66 in Binford’s Forager Dataset; 122 in the Standard Cross-Cultural Sample; and 33 in Jorgensen’s Western North American Indian Dataset. Over the past several years, efforts have been made to identify and provide links to the primary sources used to code cultural data in these historical cross-cultural datasets (see D-PLACE [d-place.org]), making it easier for researchers to return to primary sources when a coded cultural variable of interest is not available. eHRAF World Cultures [ehrafworldcultures.yale.edu] greatly facilitates this process by providing finely subject-indexed, fully digitized and searchable primary ethnographic documents. Currently, 268 Cantometrics societies are covered by eHRAF.
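The kind of cross-indexing described here amounts to joining society-level records through a crosswalk of identifiers. The sketch below shows that join in its simplest form; the identifier formats (`gj_1`, `ea_10`), the crosswalk structure, and the coded values are hypothetical and do not reflect the actual column names or IDs used in the Global Jukebox or D-PLACE files.

```python
def link_societies(cantometrics_ids, crosswalk, external_codes):
    """Attach an external dataset's coded value to each Cantometrics society.

    crosswalk: {cantometrics society id: external dataset society id}
    external_codes: {external dataset society id: coded value}
    Returns (linked, unlinked): a dict of matched values and a list of
    societies with no crosswalk entry or no coded value.
    """
    linked, unlinked = {}, []
    for society_id in cantometrics_ids:
        external_id = crosswalk.get(society_id)
        if external_id is not None and external_id in external_codes:
            linked[society_id] = external_codes[external_id]
        else:
            unlinked.append(society_id)
    return linked, unlinked

# Hypothetical toy identifiers and values.
cantometrics_ids = ["gj_1", "gj_2", "gj_3"]
crosswalk = {"gj_1": "ea_10", "gj_2": "ea_20"}
external_codes = {"ea_10": "high", "ea_30": "low"}

linked, unlinked = link_societies(cantometrics_ids, crosswalk, external_codes)
print(linked, unlinked)
```

Keeping the unlinked societies explicit makes it easy to report coverage, in the same spirit as the counts of matched societies given above.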
3. Data curation, cleaning, and validation
Each of the 7 Global Jukebox datasets and the Social Factors supporting dataset listed in Table 1 has its own coding scheme and criteria. Fully cleaning each dataset takes a great deal of effort and resources, and we currently cannot fully clean all datasets for publication. We therefore focused our cleaning efforts on the largest and most influential dataset, the 5,776 Cantometric codings of songs. We decided to simultaneously publish the other 6 partially cleaned datasets and the Social Factors supporting dataset listed in Table 1 in order to make the materials available as widely as possible, but we emphasize that only the Cantometric dataset has been fully cleaned and validated. S1.6-S1.8 in S1 File provide a detailed description of the data curation, digitization, cleaning, and validation process.
4. Coding reliability
Overall, our analyses suggest that both coding reliability (mean κ = 0.54; S8 Fig and S12 Table in S1 File) and accuracy (approximately 0.4–1% rate of unambiguous coding/data entry errors; S9 Fig in S1 File) are at acceptable levels on average. However, there was also substantial variation in reliability across variables. Some variables showed near-perfect consensus: for example, Line 4, which differentiates solo vs. different types of group singing (κ = 0.94 / 89% agreement), and Line 7, which captures similar information for instrumental accompaniment (κ = 0.92 / 86% agreement). Other variables had very low reliability, effectively at chance levels (e.g., nasality [Line 33], vocal width [Line 32]). In general, aspects of vocal timbre (e.g., nasality, vocal width, rasp) tended to have lower reliability, but even within categories different variables could show strikingly different reliability. For example, Embellishment (Line 23) and Glissando (Line 29) are both intended to capture aspects of melodic ornamentation, but the former has much higher reliability (κ = 0.70 vs. 0.19, respectively).
We caution that lower reliability does not necessarily mean a variable is “worse”. For example, some variables with higher reliability may reflect more obvious acoustic features that could potentially be automated (e.g., presence vs. absence of instruments), while some variables with lower reliability could nevertheless reflect more subtle but meaningful features that require human annotation (e.g., melodic form). Finally, these ratings only reflect the codings of a few raters whose musical backgrounds are not representative of the world. We encourage users to use the reliability data as simply one of many pieces of evidence to guide decisions about variables for use in cross-cultural analysis, development of automated signal processing algorithms, or other uses.
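To make the relationship between raw agreement and κ concrete, the following is a minimal sketch of unweighted Cohen's kappa for two raters. It is an illustration only: the reliability analysis in S1 File may use a different statistic (for example a weighted or multi-rater variant), and the ratings below are invented rather than taken from the Cantometrics reliability data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters coding the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical codings of ten songs (invented for illustration).
rater_a = ["solo", "solo", "group", "group", "solo",
           "group", "group", "solo", "group", "group"]
rater_b = ["solo", "solo", "group", "group", "solo",
           "group", "solo", "solo", "group", "group"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this toy case the raters agree on 9 of 10 songs (90%), but because half of that agreement would be expected by chance, κ is lower than the raw agreement rate. This is why κ and percentage agreement are reported side by side above.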
5. Is song style correlated with social complexity? An example of hypothesis testing using the Global Jukebox
5.1. Hypothesis testing with the Global Jukebox datasets
Lomax and colleagues published numerous analyses of Cantometrics and other Global Jukebox data. These results were reviewed by Wood [17,18] and Savage, and are summarized in S1.2 in S1 File. Unfortunately, the original analyses cannot be precisely reproduced, as it is not possible to reconstruct the specific datasets and procedures that were used for previous analyses performed during the mid-20th century before data/code-sharing capabilities were widely available. For reanalyses of Cantometric data and analyses of similar data, see [19,21,22,52,53,55–61].
Lomax’s original analyses consisted of bivariate correlations between all 37 musical variables and the dozens of social variables coded in the Ethnographic Atlas (EA). Lomax summarized five of these key proposed correlations between song style variables and social variables as follows:
Song style tends to grow more articulated, ornamented, heavily orchestrated, and exclusive as societies grow bigger, more productive, more urbanized, and more stratified. Specifically, (1) the level of text repetition decreases directly as productivity increases, (2) the level of precision of enunciation increases as states grow in size, (3) the prominence of small intervals and embellishments indicates the level of stratification, (4) orchestral complexity symbolizes state power, and (5) melodic form and complexity reflect the size and subsistence base of a community. 
Understandably, readers may be skeptical about any attempt to establish predictive or causal relationships between society and music. Whether such correlations reflect causal relationships, autocorrelation due to shared ancestry, or something else, has been greatly debated [14–18,31–36,61]. Here we attempt to see whether the basic correlations are a) reproducible, and b) artifacts of geographic or linguistic autocorrelation.
A given society may contain stylistically similar or diverse sets of songs with similar or differing codings. Such codings can be condensed into a single “modal profile” in order to analyze relationships at the level of a society, as was the method in the original analysis, or clustered and analyzed separately as individual songs [21–23,55–60,63]. An example modal profile is mapped in Fig 3 for the Cantometric variable “Embellishment”. To maximize usability, we have provided the data in a form treating songs as the unit of analysis (see cldf/songs.cv in https://zenodo.org/record/4898406) and using modal profiles to treat societies as the unit of analysis (output/converted_modal_profiles.csv at https://zenodo.org/record/6537663#.YnszmllS_BK).
Embellishment is a technique in which rapid, ephemeral notes ornament the main melodic line, but are distinct from it. The distribution of highly embellished singing (in red tones) outlines the “Silk Road” region of Eurasian cultural exchange which includes the Mediterranean, North Africa, the Arabian Peninsula, the Middle East, Western Asia, South Asia, Southeast Asia, and East Asia. This highly embellished singing is differentiated from Eastern and Central Europe, sub-Saharan Africa, Oceania, and the Americas where singing is mostly unembellished (blue tones).
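As a rough illustration of how song-level codings can be condensed into a society-level modal profile, the sketch below takes the most frequent code per society and variable. The record layout, society identifiers, and tie-breaking rule are assumptions made for this example; the released modal profile file may have been produced with a different procedure.

```python
from collections import Counter, defaultdict

def modal_profiles(song_codings):
    """Collapse song-level codings into one modal code per society and variable.

    song_codings: iterable of (society_id, variable_id, code) tuples.
    Ties are broken by taking the smallest code, an arbitrary choice
    made for this sketch only.
    """
    counts = defaultdict(Counter)
    for society, variable, code in song_codings:
        counts[(society, variable)][code] += 1

    profiles = defaultdict(dict)
    for (society, variable), counter in counts.items():
        top = max(counter.values())
        profiles[society][variable] = min(
            code for code, k in counter.items() if k == top
        )
    return dict(profiles)

# Hypothetical song codings (society IDs and codes are invented).
songs = [
    ("society_A", "CV23", 1),
    ("society_A", "CV23", 1),
    ("society_A", "CV23", 3),
    ("society_A", "CV10", 2),
    ("society_B", "CV23", 3),
    ("society_B", "CV23", 2),
    ("society_B", "CV23", 3),
]

profiles = modal_profiles(songs)
print(profiles)
```

Note that a modal profile discards within-society variation: a society whose songs are split between two codes is represented by only one of them, which is one reason the song-level data are released alongside the profiles.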
Below, we use the society-level modal profiles to test 5 bivariate relationships between musical style and patterns of social structure, using cultural data from the Ethnographic Atlas and linguistic and spatial data from D-PLACE. These statistical tests revisit 5 key hypotheses proposed and tested by Alan Lomax using modal profiles, and reanalyze one of the primary conclusions of Lomax’s original Cantometric analyses: that global song style is correlated with social complexity [15,16].
A common critique of early Cantometrics analyses was that they did not control for common patterns of cultural autocorrelation [14,31]. Specifically, critics argued that the statistical evidence for the correlations arose because societies were historically related, or frequently interacted with each other, rather than because of a functional relationship between music and social structure [64,65]. Here, we re-test the proposed correlations, as well as the aggregate complexity relationship, while controlling for historical relationships (using language), and spatial relationships (using geography).
We use three types of models in this analysis. First, a simple linear model replicates the methods used in the original Lomax tests and acts as a baseline. Second, vertical relationships are tested using a phylogenetic linear regression, implemented in phylolm. Finally, spatial relationships are tested within generalized mixed models with non-Gaussian random effects and exponential spatial correlation, using spaMM. Phylogenetic relationships are determined from the Glottolog taxonomy with a Grafen branch length transformation, as performed in . All Cantometrics and Ethnographic Atlas data were also standardized; see details in S1.9.1 in S1 File. For detailed definitions of the social and musical variables, see Table 2; S9 and S18-S28 Tables in S1 File.
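The baseline model and the idea of exponential spatial correlation can be sketched in a few lines. This is not the analysis code: the phylogenetic and spatial models were fitted with the R packages named above, whereas the snippet below only shows (i) z-score standardization followed by an ordinary least-squares slope and (ii) a generic exponential decay of correlation with great-circle distance. The data values, the `range_km` parameter, and the exact parameterization of the decay are assumptions for illustration.

```python
import math

def standardize(values):
    """Return z-scores (mean 0, sample standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))

def exponential_correlation(distance_km, range_km):
    """Generic exponential decay: 1 at distance 0, approaching 0 far away."""
    return math.exp(-distance_km / range_km)

# Hypothetical society-level values (invented for illustration).
social = [1, 2, 3, 4, 5]
music = [2, 1, 4, 3, 5]

# With both variables standardized, the slope equals Pearson's r.
beta = ols_slope(standardize(social), standardize(music))

# Implied correlation between two societies a quarter of the equator apart.
distance = haversine_km(0.0, 0.0, 0.0, 90.0)
corr = exponential_correlation(distance, range_km=1000.0)
print(f"beta = {beta:.2f}, distance = {distance:.0f} km, corr = {corr:.5f}")
```

Standardizing both variables puts the slopes of different models on a common scale, and the exponential form encodes the assumption that nearby societies resemble each other more than distant ones.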
The five correlations are between:
- Musical organization of the Orchestra (CV7) ~ Jurisdictional hierarchy beyond the local community (EA033);
- Text repetition (CV10) ~ Subsistence (an aggregate variable described in the SM);
- Embellishment (CV23) ~ Class (EA066) + Caste (EA068) + Slavery (EA070);
- Melodic interval size (CV21) ~ Community size (EA031); and
- Enunciation (CV37) ~ Jurisdictional hierarchy beyond the local community (EA033).
Within each hypothesis, AIC model comparison allows us to determine whether the distribution of data is best explained by linguistic or geographic relatedness (or neither). Secondly, we test the general hypothesis of a relationship between average song style and social complexity by performing two principal component analyses (PCA) on the set of musical variables and the set of social variables used in the preceding tests. Using the first principal component from each variable set, we test the relationship between musical style and social complexity under the same three conditions as the bivariate models. More details of the analyses are available in section S1.9 in S1 File of the supplementary material.
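The logic of the AIC comparison can be sketched as follows, using the standard definition AIC = 2k − 2 ln L and the conventional rule of thumb that models within 2 AIC units of the best one are not clearly distinguishable. The log-likelihoods and parameter counts are invented for illustration and are not results from this paper.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood

def compare_by_aic(models, threshold=2.0):
    """Rank models by AIC.

    models: dict mapping model name -> (log_likelihood, n_params).
    Returns (best_name, deltas, close), where deltas maps each model to its
    AIC difference from the best model and close lists the other models
    lying within `threshold` AIC units of the best.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores, key=scores.get)
    deltas = {name: score - scores[best] for name, score in scores.items()}
    close = [name for name, d in deltas.items()
             if name != best and d <= threshold]
    return best, deltas, close

# Hypothetical fits of one hypothesis under the three model types.
models = {
    "linear": (-210.0, 3),
    "phylogenetic": (-205.5, 4),
    "spatial": (-201.0, 5),
}

best, deltas, close = compare_by_aic(models)
print(best, deltas, close)
```

When `close` is non-empty, the comparison cannot separate the best model from its nearest competitors, which corresponds to the situation where linguistic and geographic explanations cannot be statistically distinguished.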
5.3. Results of reanalysis
Across the five models, we find that all hypothesized correlations hold in at least one of the three model variations. Specifically: musical organization increases with more jurisdictional hierarchy; text repetition declines with more productive subsistence technologies; embellishment increases with stratification; melodic intervals decline with larger community size; and enunciation becomes more precise with higher levels of jurisdictional hierarchy. However, we also find that models that control for geographic relatedness improve model fit by more than 2 AIC units in the Musical Organization of the Orchestra, Text Repetition, Embellishment, and Melodic Interval Size models. We cannot statistically determine whether vertical or horizontal processes best explain variation in Enunciation (S17 Table in S1 File). These results suggest that the way we make music is guided by the musical traditions of our ancestors and our neighbors, and is also related to societal structure.
Using the eigenvalue criterion and visual inspection of scree plots, we determined that the musical and social variable sets are each best explained by a single dimension. The first principal component of musical variables explains 45% of variation, and the first principal component of social variables explains 58% of variation (proportions comparable to those found in previous analyses using similar [though not identical] variable sets [9,52]; for loadings see S15 Table in S1 File). We find a significant positive relationship between the two principal components, regardless of linguistic or geographic controls (Fig 4). However, AIC comparison revealed that a model controlling for geography best explained the data (β = 0.60, p < 0.001, n = 147; S13 Fig, S17 Table in S1 File).
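The notion of a first principal component and the share of variation it explains can be illustrated with a small pure-Python sketch: compute the correlation matrix of the standardized variables, extract its largest eigenvalue by power iteration, and divide by the number of variables. The data are invented, and reading the "eigenvalue criterion" as the common rule of retaining components with eigenvalue above 1 is an assumption here, not something stated in the text.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length variable columns."""
    def zscores(col):
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        return [(v - mean) / sd for v in col]
    zs = [zscores(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in zs]
            for zi in zs]

def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def first_component(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector via power iteration.

    The non-uniform start vector reduces (but does not eliminate) the risk of
    starting orthogonal to the leading eigenvector.
    """
    vec = [1.0 + i for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec

# Hypothetical codings of three variables across six societies.
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 4, 3, 6, 5],
    [1, 3, 2, 5, 4, 6],
]

corr = correlation_matrix(columns)
eigenvalue, loadings = first_component(corr)
# For a correlation matrix the eigenvalues sum to the number of variables.
proportion = eigenvalue / len(columns)
print(f"PC1 eigenvalue = {eigenvalue:.2f}, explains {proportion:.0%}")
```

Each society's PC1 score would then be the dot product of its standardized values with `loadings`; those scores are what a regression of musical PC1 on social PC1 takes as input.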
a) A map of the global distribution of the first principal component (PC1; interpreted as “musical differentiation”) for five musical variables from the Global Jukebox’s Cantometrics dataset: Musical organization of the Orchestra (CV7), Text repetition (CV10), Embellishment (CV23), Melodic interval size (CV21), and Enunciation (CV37). b) A map of the global distribution of PC1 (interpreted as “societal complexity”) for six social variables from D-PLACE’s Ethnographic Atlas dataset: Jurisdictional hierarchy beyond the local community (EA033), Subsistence (an aggregate variable described in the SM), Class (EA066), Caste (EA068), Slavery (EA070), and Community size (EA031). c) The correlation between musical PC1 and social PC1 was significant when controlling for possible autocorrelation using linguistic relatedness and geographic proximity (see Supplementary Material S1.9 in S1 File for modeling details and analyses of bivariate correlations between the musical and social variables).
Lomax’s correlations between song style and social complexity have been disputed [14–18,31–36,61], but our bivariate and PCA reanalyses provide evidence suggesting that his original correlations are a) reproducible, and b) not clear artifacts of geographic or linguistic autocorrelation. We do not believe, nor is there any evidence that Lomax believed, that social factors directly produce effects upon music. Here we do not attempt to apply formal causal analysis or test any proposed causal mechanisms (cf. ). However, the probability that certain social and musical traits and configurations will consistently co-occur raises questions about what drives aesthetic preferences. For example, are aesthetic preferences embodied in vocal representations of emotions and/or physical states that may arise under certain persistent conditions?
Different evolutionary theories propose several possible mechanistic pathways that could be tested, such as patterns of interaction, roles of social bonding, signaling, and psychosocial effects. Our database can generate new hypotheses with preliminary evidence for causal and other kinds of relationships; these could then be tested by modeling archival information and data from the field. It can take ethnomusicological work beyond the descriptive by identifying and explaining such causal relationships, and it can introduce ethnologically grounded and cross-cultural data, methods and perspectives to cognitive and neuroscientific research on music. We hope that future work with these open datasets will engender fresh insights into cultural evolution and the role of art and human expressivity in society.
6. Ethics, rights and consent
The Global Jukebox was created to provide a platform for the world’s music and for the data to speak for themselves, so that users can appreciate a diverse range of performative approaches, not by Western standards, but by listening to global samples and identifying their many approaches to performance [46,72]. This approach to cross-cultural performance analysis was in part based upon Lomax’s extensive fieldwork with singers and musicians, where he recognized distinctive aesthetic and social values in their performances. Guiding his efforts was the conviction that every society’s music must be appreciated and researched on its own terms, according to its own aesthetic standards, given equitable airplay, and play a significant part in education [cf. 73–75].
The Association for Cultural Equity (ACE) is committed to obtaining permission to stream the media examples that have been studied and analyzed for the Global Jukebox. As did Lomax, ACE seeks out the estates of artists recorded by Lomax, and their descendants and estates receive fees and royalties from licensing and sales. Repatriation of Lomax’s recordings to their communities of origin, in partnership with those communities, is ongoing and has reached over 50 communities, descendants of artists, and national libraries. North American and Australian Indigenous audio samples will be streamed on the Jukebox only with the agreement of each community.
To improve ethical practices, ACE convenes with cultural advocates from diverse communities [76,77]. ACE’s online resources are among the very few, if not the only, ones that stream media examples in their entirety (except those on Spotify, in which case users must sign up for Spotify to hear more than 30 seconds), and have no sign-in or membership requirements for using them. To further improve access to Lomax’s recordings and research, ACE engages with community arts leaders, artists and other culture bearers to connect their constituencies to the Global Jukebox and our online archive in meaningful ways. They are invited to contribute Journeys and Exhibits, correct metadata, interpret the songs, suggest new songs and codings, and add their documentation to the songs. We plan to work with culture-bearers to expand and improve the Global Jukebox sample and data. ACE also supports emerging leaders from endangered cultures to document, describe, steward and reappraise their expressive traditions. Additional information regarding the ethical, cultural, and scientific considerations specific to inclusivity in global research is included in the Supporting Information (S1 Checklist).
The full publication of Global Jukebox data represents the culmination of sixty years of research by one of the world’s most influential scholars of music [51,75]. We are making these data publicly available in order to encourage their use, improvement, and expansion through diverse intercultural and interdisciplinary collaborations. We also hope to encourage further scientific research into music, dance, speech and other arts as primary rather than ancillary factors in human history and evolution, as well as to deepen our understanding of cross-cultural diversity at a time when it is more important than ever before.
How to cite the Global Jukebox
Research that uses data from the Global Jukebox should cite both the original source(s) of the data and this paper (e.g., research using data from the Cantometrics dataset: “Lomax (1968, 1980); Wood et al. (2022)”). The reference list should include the date the data were accessed and the URL for the Global Jukebox (http://theglobaljukebox.org), in addition to the full reference for Lomax (1968). Additionally, each dataset (Table 1) is versioned and stored on Zenodo. Users can cite the specific dataset and version used by visiting https://zenodo.org/record/4898406.
S1 Checklist. PLOS’s questionnaire on inclusivity in global research.
S1 File. Supplementary material.
This document includes all additional information and supplementary tables and figures referenced in the manuscript. This includes material regarding original data analysis and results, details of the Global Jukebox datasets, information on societies, data history, data normalization, coding reliability, data validation, current analyses, and the concept of cultural equity.
We thank the thousands of musicians, ethnomusicologists, record labels, and funding agencies whose decades of work resulted in the audio recordings that provide the foundation of the Global Jukebox (see Lomax 1968:xv-xvii for a full list of Acknowledgments, and see meta-data at http://theglobaljukebox.org for detailed credits for each song; see Fig 2 for an example). Alan Lomax, with Michael del Rio, imagined and realized the essence of the Global Jukebox. It was based on Lomax’s years of fieldwork and experimentation, and thirty years of research on performance style with Conrad M. Arensberg and specialist collaborators. George P. Murdock, whose foundational work in cross-cultural anthropology led to the Ethnographic Atlas, and Norman Berkowitz, a cutting-edge programmer and statistician, each played a leading role in operationalizing theories and secondary hypotheses. Richard Smith brought the website out of the dustbin of obsolete programming languages and obsolete hard drives. Gideon D’Arcangelo helped to bring the Jukebox to life as it is now being developed. Other contributors include Jeff Feddersen (Original Design); Ray Cha (Wireframe); Alona Weiss (Additional Design); Kiki Smith-Archiapatti (Design and Content); Martin Szinger (Developer); Steve Rosenthal and the Magic Shop (Audio Digital Transfers, Restoration); Forrestine Paulay (Choreometrics); Karen Kohn Bradley, Frederick Curry, Meriam Lobel, Sinclair O’Gaga, Onye Ozuzu, Miriam Philips, Susan Wiesner (Dance Analysis and Social Justice); Patricia Campbell (Curriculum); Todd Harvey, Jorge Arevalo Mateus, Bruno Nettl, Anthony Seeger, Michael Tenzer, Philip Yampolsky (Ethnomusicology Advisors); Karen Claman, Kathleen Rivera (Researchers); Miriam Elhajli (Latin American folk sample); Jesse Rifkin, Don Fleming (Popular Song); Victor Grauer (Cantometrics Advisor); Sergio Bonanzinga, Judith Cohen, Lamont Pearly III, Mark Slobin (Journeys); Herb Sturz.
The Global Jukebox is a project of the Association for Cultural Equity (culturalequity.org), a 501(c)(3) non-profit charitable organization and custodian of the Alan Lomax Archive. Founded by Alan Lomax in 1983, ACE’s mission is to stimulate cultural equity through fostering research, dissemination, and sustainability of the world’s traditional expressive practices. It endeavors to reconnect people and communities with their creative heritage through open access and mutual engagement. Lomax’s original recordings and papers were deposited with the American Folklife Center of The Library of Congress; ACE retains digital copies which it uses in repatriation, publication, and collaborative initiatives with source communities.
- 1. Murdock G. P. (1967). Ethnographic atlas: A summary. Ethnology, 6(2), 109–236.
- 2. HRAF. eHRAF World Cultures. https://ehrafworldcultures.yale.edu/ehrafe/.
- 3. Wycliffe Bible Translators. (1951). Ethnologue: Languages of the world. SIL International.
- 4. Hammarström H. & Forkel R. & Haspelmath M. & Bank S. (2021). Glottolog 4.4. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://doi.org/10.5281/zenodo.4761960. (Available online at http://glottolog.org, Accessed on 2021-08-18.)
- 5. Kirby K. R., Gray R. D., Greenhill S. J., Jordan F. M., Gomes-Ng S., Bibiko H.-J., et al. (2016). D-PLACE: A Global Database of Cultural, Linguistic and Environmental Diversity. Plos One, 11(7), e0158391. pmid:27391016
- 6. Currie T. E., & Mace R. (2009). Political complexity predicts the spread of ethnolinguistic groups. Proceedings of the National Academy of Sciences of the United States of America, 106(18), 7339–7344. pmid:19380740
- 7. Atkinson Q. D. (2011). Phonemic diversity supports a serial founder effect model of language expansion from Africa. Science, 332, 346–349. pmid:21493858
- 8. Botero C. A., Gardner B., Kirby K. R., Bulbulia J., Gavin M. C., & Gray R. D. (2014). The ecology of religious beliefs. Proceedings of the National Academy of Sciences of the United States of America, 111(47), 16784–16789. pmid:25385605
- 9. Turchin P., Currie T. E., Whitehouse H., François P., Feeney K., Mullins D., et al. (2018). Quantitative historical analysis uncovers a single dimension of complexity that structures global variation in human social organization. Proceedings of the National Academy of Sciences of the United States of America, 115(2), E144–E151. pmid:29269395
- 10. Turchin P., Francois P., Hoyer D., Nugent S., Savage P. E., Brandl E., et al. (2022). Explaining the rise of moralizing religions: A test of competing hypotheses using the Seshat Databank. Religion, Brain & Behavior. https://doi.org/10.1080/2153599X.2022.2065345.
- 11. Blasi D. E., Moran S., Moisik S. R., Widmer P., Dediu D., & Bickel B. (2019). Human sound systems are shaped by post-Neolithic changes in bite configuration. Science, 363(6432). pmid:30872490
- 12. Slingerland E., Atkinson Q. D., Ember C. R., Sheehan O., Muthukrishna M., Bulbulia J., et al. (2020). Coding culture: Challenges and recommendations for comparative cultural databases. Evolutionary Human Sciences, 2, e29. https://doi.org/10.1017/ehs.2020.30.
- 13. Cohen R. D. (Ed.) (2003). Alan Lomax: Selected writings, 1934–1997. Routledge.
- 14. Savage P. E. (2018). Alan Lomax’s Cantometrics Project: A comprehensive review. Music & Science, 1, 1–19. https://doi.org/10.1177/2059204318786084.
- 15. Lomax A. (1980). Factors of musical style. In Diamond S. (Ed.), Theory & practice: Essays presented to Gene Weltfish (pp. 29–58). Mouton.
- 16. Lomax A. (1968). Folk Song Style and Culture. Washington, DC: American Association for the Advancement of Science.
- 17. Wood A. L. C. (2018a). “Like a cry from the heart”: An insider’s view of the genesis of Alan Lomax’s ideas and the legacy of his research, Part I. Ethnomusicology, 62(2), 230–264.
- 18. Wood A. L. C. (2018b). “Like a cry from the heart”: An insider’s view on the genesis of Alan Lomax’s ideas and the legacy of his research: Part II. Ethnomusicology, 62(3), 403–438.
- 19. Savage P. E., Brown S., Sakai E., & Currie T. E. (2015). Statistical universals reveal the structures and functions of human music. Proceedings of the National Academy of Sciences of the United States of America, 112(29), 8987–8992. pmid:26124105
- 20. Mehr S. A., Singh M., Knox D., Ketter D. M., Pickens-Jones D., Atwood S., et al. (2019). Universality and diversity in human song. Science, 366, eaax0868. pmid:31753969
- 21. Matsumae H., Ranacher P., Savage P. E., Blasi D. E., Currie T. E., Koganebuchi K., et al. (2021). Exploring correlations in genetic and cultural variation across language families in Northeast Asia. Science Advances. pmid:34407936
- 22. Savage P. E., Matsumae H., Oota H., Stoneking M., Currie T. E., Tajima A., et al. (2015). How “circumpolar” is Ainu music? Musical and genetic perspectives on the history of the Japanese archipelago. Ethnomusicology Forum, 24(3), 443–467. https://doi.org/10.1080/17411912.2015.1084236.
- 23. Brown S., Savage P. E., Ko A. M.-S., Stoneking M., Ko Y.-C., Loo J.-H., et al. (2014). Correlations in the population structure of music, genes and language. Proceedings of the Royal Society B: Biological Sciences, 281(1774), 20132072. pmid:24225453
- 24. Forkel R., List J.-M., Greenhill S. J., Rzymski C., Bank S., Cysouw M., … Gray R. D., et al. (2018). Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics. Scientific Data, 5(1), 1–10. https://doi.org/10.1038/sdata.2018.205.
- 25. Stumpf C. (1911). The Origins of Music [translated 2012] (D. Trippett (Trans.)). Oxford University Press.
- 26. Hornbostel E. M. von. (1975). Hornbostel Opera Omnia (K. P. Wachmann, D. Christensen, & H.-P. Reinecke (Eds.)). Marinus Nijhoff.
- 27. Sachs C. (1962). The Wellsprings of Music (J. Kunst (Ed.)). M. Nijhoff.
- 28. Nettl B., & Bohlman P. V. (Eds.). (1991). Comparative Musicology and Anthropology of Music: Essays on the History of Ethnomusicology. University of Chicago Press.
- 29. Schneider A. (2006). Comparative and Systematic Musicology in Relation to Ethnomusicology: A Historical and Methodological Survey. Ethnomusicology, 50(2). http://www.jstor.org/stable/10.2307/20174451.
- 30. Savage P. E. (2022). Comparative musicology: The science of the world’s music. PsyArXiv. https://doi.org/10.31234/osf.io/b36fm.
- 31. Erickson E. E. (1976). Tradition and evolution in song style: A reanalysis of Cantometric data. Cross-Cultural Research, 11(4), 277–308. https://doi.org/10.1177/106939717601100403.
- 32. Nettl B. (1970). Review of A. Lomax, Folk song style and culture. American Anthropologist, 72(2), 438–441.
- 33. Feld S. (1984). Sound Structure as Social Structure. Ethnomusicology 28(3): 383–409.
- 34. Dubinskas F. (1983). A Musical Joseph’s Coat: Patchwork Patterns and Social Significance in World Musics. Reviews in Anthropology 10(3): 27–42.
- 35. Driver H. (1970). Review of Folk Song Style and Culture, by Alan Lomax. Ethnomusicology 14(1): 57–62.
- 36. Locke D. (1981). Review of Cantometrics: An Approach to the Anthropology of Music, by Alan Lomax. Ethnomusicology 25(3): 527–29.
- 37. Russonello G. (2017, July 11). The unfinished work of Alan Lomax’s Global Jukebox. New York Times. https://www.nytimes.com/2017/07/11/arts/music/alan-lomax-global-jukebox-digital-archive.html.
- 38. Chow A. R. (2017, April 18). Alan Lomax recordings are digitized in a new online collection. New York Times. https://www.nytimes.com/2017/04/18/arts/music/alan-lomax-recordings-the-global-jukebox-digitized.html.
- 39. Zollo P. (2020, April 6). Global Jukebox offers free music from around the world. American Songwriter. https://americansongwriter.com/global-jukebox-free-world-music/.
- 40. Reed R. (2017, April 19). Hear music from 1,000 cultures on massive Alan Lomax recordings site. Rolling Stone. https://www.rollingstone.com/music/music-news/hear-music-from-1000-cultures-on-massive-alan-lomax-recordings-site-109086/.
- 41. Grant C. (2015, January 29). Sounding the Global Jukebox: We owe Alan Lomax a debt of thanks. The Conversation. https://theconversation.com/sounding-the-global-jukebox-we-owe-alan-lomax-a-debt-of-thanks-36206.
- 42. Murdock GP & White DR. (1969). Standard Cross-Cultural Sample. Ethnology 9:329–369.
- 43. Binford L. (2001). Constructing Frames of Reference: An Analytical Method for Archaeological Theory Building Using Hunter-gatherer and Environmental Data Sets. University of California Press.
- 44. Jorgensen JG. (1980). Western Indians: Comparative Environments, Languages, and Cultures of 172 Western American Indian Tribes. San Francisco: W.H. Freeman and Company.
- 45. McDermott J. H., Schultz A. F., Undurraga E. A., & Godoy R. A. (2016). Indifference to dissonance in native Amazonians reveals cultural variation in music perception. Nature, 535, 547–550. pmid:27409816
- 46. Lomax A., & Grauer V. (1968). The Cantometric coding book. In A. Lomax (Ed.), Folk Song Style and Culture (pp. 34–74). American Association for the Advancement of Science.
- 47. Wood A. L. C. (2021). Songs of Earth: Aesthetic and Social Codes in Music. ACE/University Press of Mississippi, Oxford, MS.
- 48. Savage P. E. (2022). An overview of cross-cultural music corpus studies. In Shanahan D., Burgoyne A., & Quinn I. (Eds.), Oxford Handbook of Music and Corpus Studies (C34.S1–C34.N2). Oxford University Press. http://doi.org//10.1093/oxfordhb/9780190945442.013.34.
- 49. Lomax A. (1950). Mister Jelly Roll: the fortunes of Jelly Roll Morton, New Orleans Creole and "inventor of jazz." New York: Duell, Sloan and Pearce.
- 50. Jones B. (1961, October 3–31). Interviews by Alan Lomax. Bessie Jones 1961–1962, Association for Cultural Equity, Lomax Digital Archive. https://archive.culturalequity.org/node/765.
- 51. Szwed J. (2010). Alan Lomax: The Man Who Recorded the World. Viking.
- 52. Daikoku H., Wood A. L. C., & Savage P. E. (2020). Musical diversity in India: A preliminary computational study using Cantometrics. Keio SFC Journal, 20(2), 34–61.
- 53. Rzeszutek T., Savage P. E., & Brown S. (2012). The structure of cross-cultural musical diversity. Proceedings of the Royal Society B: Biological Sciences, 279(1733), 1606–1612. pmid:22072606
- 54. Peel M.C., Finlayson B.L. & McMahon T.A. (2007). Updated world map of the Köppen-Geiger climate classification. Hydrol. Earth Syst. Sci., 11, 1633–1644.
- 55. Busby G. (2006). Finding the blues: An investigation into the origins and evolution of African-American music [M.Sc. thesis: University of London]. http://users.ox.ac.uk/~some2456/docs/Busby_GBJ_finding_the_blues_2006.pdf.
- 56. Leroi A. M., & Swire J. (2006). The recovery of the past. The World of Music, 48(3), 43–54.
- 57. Callaway E. (2007). Music is in our genes. Nature News, December. https://doi.org/10.1038/news.2007.359.
- 58. Du Toit D. (2011). Cultural variation of music [thesis, Department of Psychology] University of Auckland.
- 59. Grauer V. A. (2006). Echoes of our forgotten ancestors. The World of Music, 48(2), 5–58.
- 60. Flory M. (2017). Cultures Clustered by Song Style. In Symposium on the Global Jukebox. 62nd Annual Meeting of the Society for Ethnomusicology, Denver. Accessible at www.theglobaljukebox.org.
- 61. Passmore S., Wood A. L. C., Barbieri C., Shilton D., Daikoku H., Atkinson Q. D., & Savage P. E., et al. (2022). Global relationships between musical, linguistic, and genetic diversity. PsyArXiv preprint: https://doi.org/10.31234/osf.io/mdrsn.
- 62. Lomax A. (1989). Cantometrics. In International encyclopedia of communications (1st ed., pp. 230–233). New York, NY: Oxford University Press.
- 63. Savage P. E., & Brown S. (2014). Mapping music: Cluster analysis of song-type frequencies within and between cultures. Ethnomusicology, 58(1), 133–155. https://doi.org/10.5406/ethnomusicology.58.1.0133.
- 64. Levinson S. C., & Gray R. D. (2012). Tools from evolutionary biology shed new light on the diversification of languages. Trends in Cognitive Sciences, 16(3), 167–173. pmid:22336727
- 65. Gray R. D., & Watts J. (2017). Cultural macroevolution matters. Proceedings of the National Academy of Sciences of the United States of America, 114(30), 7846–7852. pmid:28739960
- 66. Ho L. S. T. and Ane C. (2014) A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Systematic Biology, 63(3):397–408. pmid:24500037
- 67. Rousset François & Ferdy Jean-Baptiste. (2014). Testing environmental and genetic effects in the presence of spatial autocorrelation. Ecography 37(8): 781–790. URL http://dx.doi.org/10.1111/ecog.00566.
- 68. Roberts S. G., Winters J., & Chen K. (2015). Future Tense and Economic Decisions: Controlling for Cultural Evolution. PLOS ONE, 10(7), e0132145. pmid:26186527
- 69. Arensberg C. M. (1972). Culture as behavior: Structure, and emergence. Annual Review of Anthropology 1:1–27.
- 70. Savage P. E., Loui P., Tarr B., Schachner A., Glowacki L., Mithen S., & Fitch W. T., et al. (2021). Music as a coevolved system for social bonding. Behavioral and Brain Sciences, 1–22. https://doi.org/10.1017/S0140525X20000333.
- 71. Mehr S. A., Krasnow M. M., Bryant G. A., & Hagen E. H. (2021). Origins of music in credible signaling. Behavioral and Brain Sciences, 23–39. https://doi.org/10.1017/S0140525X20000345.
- 72. Lomax A. (1976). Cantometrics: An approach to the anthropology of music. Berkeley, CA: University of California Extension Media Center.
- 73. Lomax A. (1977). Appeal for Cultural Equity. Journal of Communication, 27(2), 125–138.
- 74. Pettan S., & Titon J. T. (Eds.). (2015). The Oxford Handbook of Applied Ethnomusicology. Oxford: Oxford University Press.
- 75. Pareles J. (2002). Alan Lomax, who raised the voice of folk music in the U.S., dies at 87. The New York Times, A1,A11. http://www.nytimes.com/2002/07/20/arts/alan-lomax-who-raised-voice-of-folk-music-in-us-dies-at-87.html.
- 76. Wood A.L.C. (2020). Story of an archive. Etnografie Sonore, 11(2).
- 77. Wood A.L.C. (2022). Cultural Resources: For Whom? Proceedings of the 15th International Conference on Interdisciplinary Social Sciences, 2020. Athens, Greece, Edited by V. Chryssanthopoulos. University of Athens.