Abstract
A data flow is presented for visualising the evolution of elementary structures of polyphonic music from early Baroque to late Romantic, using quasi-phylogenies based on fingerprint diagrams and barcode sequence data of 2-tuples of consecutive vertical pitch class sets (pcs). The present methodological study, which sees itself as a proof of concept for a data-driven approach, uses examples of music from the Baroque, the Viennese School and the Romantic era to show that such quasi-phylogenies can be generated from multi-track MIDI (v. 1) files that largely correspond to the eras and the chronology of compositions and composers. The method presented is considered to have the potential to support the analysis of a wide range of musicological questions. In the context of collaborative work on quasi-phylogenies of polyphonic music, a public data archive could be established that provides multi-track MIDI files with contextual data.
Citation: Rambold G, Neubacher D, Schießl S (2023) Fingerprints, barcode sequences and quasi-phylogenies–Tools for analysing polyphonic music. PLoS ONE 18(3): e0280478. https://doi.org/10.1371/journal.pone.0280478
Editor: Constantine Dovrolis, Georgia Institute of Technology, UNITED STATES
Received: July 8, 2022; Accepted: January 3, 2023; Published: March 2, 2023
Copyright: © 2023 Rambold et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Software and used files are available from the GitHub repository (https://github.com/Rameau-Project/Rameau).
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Systematic branches of science such as biology, linguistics and musicology need instruments to describe and compare their research objects according to uniform criteria. In the systematic sciences, these criteria depend on the accessibility and recordability of the characteristics, which in turn depend on the methodology. In biological systematics, it was initially phenotypic, i.e. morpho-anatomical, features that were used, for example, to distinguish species. Later, characteristics at the cellular and chemical level were added. Finally, it was genotypic characteristics (DNA sequence data) that led to the further development of classifications. DNA barcoding genes, as the name suggests, are used to identify individual organisms and play an important role in characterising and comparing organisms or groups of organisms [1]. These genes or gene segments occur in all representatives of the groups of organisms under consideration and thus enable comparison.
Genes or gene segments used as species barcoding genes or species barcodes are also used to reconstruct the evolution of species or taxa. For this purpose, the sequence data are arranged in a matrix and the dissimilarities of the individual sequences are visualised as phylograms using various algorithms and parameters [2]. The resulting topologies provide insight into the evolution of certain traits within a particular phylogenetic group [3] and can be used as a basis for the classification of research objects with the taxonomic levels of species, genus, family, order, etc., as in systematic biology [4]. Ultimately, entire object or class hierarchies can be built up like evolutionary classification systems.
Barcoding genes fulfil the following criteria: 1) They are intrinsic properties that cannot be changed or only insignificantly changed by environmental impacts, as is the case with the genome (persistence, consistency). 2) They are measurable properties that are intrinsic to the object and for which repeated measurements lead to the same result (repeatability, reproducibility). 3) The underlying properties occur in all objects to be compared (universality). 4) It is possible to represent the characteristic values as a sequence of code elements whose elements can be positioned, e.g. in a so-called alignment, in such a way that corresponding, i.e. homologous, elements lie in the same horizontal positions, so that a data matrix is created that can subsequently be processed for visualisation and analysis (processability, comparability).
Among the systematic sciences, linguistics and musicology are predestined for a similar approach, especially since both have as their object of research the information encoded in a limited number of different letters or notes, respectively. In polyphonic (including homophonic) music (representable by 12 half tones), quite analogously both a quasi ‘phenotypic’ and a ‘genotypic’ structural level can be distinguished in the respective works. Phenotypical characteristics of music compositional works are all those that can be influenced by instrumentation and interpretation, i.e. key, rhythm (incl. tempo) as well as expression (incl. articulation and timbre). Genotypic in a certain sense, on the other hand, are elementary tonal structures that cannot be easily influenced without affecting the compositional identity, such as melody and harmony (without key), and which can be described on the basis of the set theory of music [5]. In a data flow, sets of tones or pitches coded in multi-track MIDI v. 1 format (https://www.midi.org/specifications) can be transformed into each other and assigned to pitch class sets (pcs) by elementary transformation procedures such as transposition, permutation and mirroring (Fig 1A and 1B). In other words, each defined pcs is represented by all pitch sets that can be transformed into each other by the above-mentioned transformation procedures. Thus there is a limited number of pcs, which are named and numbered according to the classification system of Forte (1973) [5].
A. Data flow from a MIDI input file to a pcs listing (all in UTF-8 encoding format). [MID: MIDI; the input files must follow the multi-track MIDI standard (v. 1); RDL: Reduced Data Listing files provide all basic information required for analyses based on VPS files, supplying all information concerning tone = pitch (as MIDI note numbers), duration of tone for every track = voice; VPS: Vertical Pitch Set files include successive vertical pitch sets; PCS: Pitch Class Set files include the successive vertical pitch sets (as ‘Forte name/numbers’)]. B. Data flow schema from a multi-track MIDI input file to a MAP fingerprint file and to SEQ barcode sequence data (to be copied or exported from the cmd window under local MS Windows).
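The reduction of a vertical pitch set to a pitch class set by transposition, permutation and mirroring, as described above, can be sketched as follows. This is a minimal illustration and not the RAMEAU implementation: the function name `prime_form` is hypothetical, and the tie-breaking rule used here is the common "most tightly packed" (Rahn-style) rule, which is assumed and is known to differ from Forte's own rule for a small number of set classes. Mapping the resulting prime form to a Forte name/number would additionally require a lookup table, which is not included.

```python
def prime_form(midi_notes):
    """Reduce MIDI note numbers to a transposition-, permutation- and
    inversion-invariant representative (a prime form).

    Assumption: ties are resolved by the Rahn-style packing rule.
    """
    # Octave equivalence and permutation: keep distinct pitch classes only.
    pcs = sorted({n % 12 for n in midi_notes})
    if not pcs:
        return ()

    def rotations_from_zero(pc_list):
        # Every rotation of the ordered set, transposed to start on 0.
        forms = []
        for i in range(len(pc_list)):
            rot = pc_list[i:] + pc_list[:i]
            forms.append(tuple((p - rot[0]) % 12 for p in rot))
        return forms

    # Mirroring (inversion): p -> -p (mod 12).
    inverted = sorted({(-p) % 12 for p in pcs})
    candidates = rotations_from_zero(pcs) + rotations_from_zero(inverted)

    # Smallest outer span first, then the second-to-last element, and so on.
    return min(candidates, key=lambda form: tuple(reversed(form)))


if __name__ == "__main__":
    print(prime_form([60, 64, 67]))      # C major triad
    print(prime_form([57, 60, 64]))      # A minor triad
    print(prime_form([67, 71, 74, 77]))  # dominant seventh chord on G
```

Under this sketch a major and a minor triad reduce to the same representative, which reflects the statement that pcs are independent of key and of mirroring.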
Vertical pitch sets (corresponding to chords but independent of key and permutation) can be assigned to a pcs at any point of a pitch change in polyphonic compositional works. According to chordal links, the consecutive pcs (pcs_n → pcs_n+1) can be represented and plotted as ordered pairs, i.e. 2-tuples, in a grid diagram, where the name/number of the initial pcs (pcs_n) is the x-value (abscissa) and that of the subsequent pcs (pcs_n+1) the y-value (ordinate). The target pcs becomes the initial pcs in the subsequent step. The procedure for creating these diagrams is shown in Fig 2.
[The consecutive pcs (corresponding to chordal links) are coded and plotted as ‘Forte name/numbers’ (i.e. pcs 1–1, 2–1 to 2–6, 3–1 to 3–12 etc.): the name/number of the initial pcs_n is the x-value (abscissa) and the name/number of the target pcs_n+1 is the y-value (ordinate)].
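The construction of the grid diagram from a sequence of pcs can be sketched as follows. This is an illustrative sketch under stated assumptions, not the RAMEAU code: the function names are hypothetical, Forte names are written with an ASCII hyphen (e.g. `3-11`), and the axis is restricted to cardinalities 1 to 5 with the class counts given in the text (1, 6, 12, 29 and 38).

```python
# Number of set classes per cardinality 1..5, as listed in the text
# (pcs 1-1, 2-1 to 2-6, 3-1 to 3-12, 4-1 to 4-29, 5-1 to 5-38).
CLASSES_PER_CARDINALITY = {1: 1, 2: 6, 3: 12, 4: 29, 5: 38}


def forte_axis():
    """Ordered axis labels shared by the x- and y-axis of the grid."""
    return [
        f"{card}-{i}"
        for card, count in CLASSES_PER_CARDINALITY.items()
        for i in range(1, count + 1)
    ]


def consecutive_pairs(pcs_sequence):
    """Ordered 2-tuples (initial pcs, target pcs); each target becomes
    the initial pcs of the next pair."""
    return list(zip(pcs_sequence, pcs_sequence[1:]))


def fingerprint_grid(pcs_sequence):
    """Binary presence/absence grid: row = target pcs (y), column = initial pcs (x)."""
    axis = forte_axis()
    index = {name: i for i, name in enumerate(axis)}
    grid = [[0] * len(axis) for _ in axis]
    for initial, target in consecutive_pairs(pcs_sequence):
        grid[index[target]][index[initial]] = 1
    return grid


if __name__ == "__main__":
    example = ["3-11", "4-27", "3-11"]  # made-up sequence for illustration
    grid = fingerprint_grid(example)
    print(len(grid), "x", len(grid[0]), "cells,", sum(map(sum, grid)), "filled")
```

Repeated pairs fill the same cell only once, so the grid records which links occur, not how often.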
After plotting all pcs_n-pcs_n+1 2-tuples according to their respective Forte names/numbers, characteristic patterns result for the respective compositional works or parts thereof. The resulting diagrams can be understood as signatures or fingerprints on the genotypic level in the sense described above. They provide information about the type and number of different fundamental pairs of pcs and show the occurrence of connections between diatonic pitch classes (i.e. extentionally diatonic and intentionally chromatic connections) on the one hand (cells coloured blue in Fig 3), and of extentionally chromatic pitch class connections on the other (red), as well as of the connections between both types of consecutive pcs 2-tuples (green). Abundances of occurring pcs are provided as well (Fig 3, bottom). The resulting diagrams appear to be typical for certain composition styles and composers [6,7]. Since the more extensive of these two studies was published in German and not in a scientific journal, the approach has received little attention and has not yet been taken up and developed further. Comparing the two examples from two eras given in Fig 3A and 3B and considering the total of fingerprint grid diagrams generated in this study (S2, S3A–S3J, S4A–S4AF and S5 Figs), it is clear that the patterns follow certain basic structures. A purely random selection of vertical pitch sets would result in a diagram with a uniform distribution of grid points (data not shown).
[A: D. Buxtehude 1675−76; B: A. Bruckner, 1878; based on vertical pitch set links (tuples of consecutive vertical pitch class sets) coded as pitch class sets (pcs). Consecutive ordered pairs of pcs both being extentionally diatonic (incl. intentionally chromatic) (blue); both pcs being extentionally chromatic (i.e. non-diatonic) (red); a pcs being extentionally diatonic (incl. intentionally chromatic), followed or preceded by a pcs being extentionally chromatic (i.e. non-diatonic) (green). Bar charts of relative abundances of occurring pcs from pcs 1–1 to pcs 5–38 at the bottom.]
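The three-colour coding of the grid cells can be illustrated with a short sketch. The exact definition of ‘extentionally diatonic’ follows [6] and is not spelled out here, so the sketch makes an explicit assumption: a pitch class set counts as diatonic if it is contained in some transposition of the major scale, and as chromatic otherwise. The further distinction of ‘intentionally chromatic’ sets is not modelled. Function names are hypothetical.

```python
# Assumption: "diatonic" means "subset of some transposition of the major scale".
MAJOR_SCALE = frozenset({0, 2, 4, 5, 7, 9, 11})


def is_diatonic(pitch_classes):
    """True if the set fits into at least one of the 12 major-scale transpositions."""
    pcs = {p % 12 for p in pitch_classes}
    return any(
        pcs <= {(p + t) % 12 for p in MAJOR_SCALE}
        for t in range(12)
    )


def tuple_colour(initial, target):
    """Colour of a 2-tuple of consecutive pitch class sets, following the
    caption: blue = both diatonic, red = both chromatic, green = mixed."""
    first, second = is_diatonic(initial), is_diatonic(target)
    if first and second:
        return "blue"
    if not first and not second:
        return "red"
    return "green"


if __name__ == "__main__":
    c_major = {0, 4, 7}
    d_minor = {2, 5, 9}
    augmented = {0, 4, 8}
    cluster = {0, 1, 2}
    print(tuple_colour(c_major, d_minor))    # both fit a major scale
    print(tuple_colour(c_major, augmented))  # mixed pair
    print(tuple_colour(cluster, augmented))  # neither fits a major scale
```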
In the present study, we have further developed the approaches of previous works [5–7] to enable the construction of ‘quasi-phylogenetic’ trees (in analogy to ‘phylogenetic’ trees in evolutionary biology) for direct comparison of music compositional works based on alignments of barcode sequence data of 2-tuples of consecutive vertical pitch class sets (pcs). The aim of the present study was to create quasi-phylogenetic trees in which a) the compositions and/or composers of at least one era form a quasi-monophyletic group and b) a chronology of the compositions can, to some extent, be read within the quasi-monophyla with good statistical support values.
Methodology
The music notation software Sibelius [8] was used to digitise the music compositional works, and MIDI files in version 1 were exported for subsequent analysis (S1 Fig). From these MIDI files, the open source analysis software and processual pipeline RAMEAU v. 1.1, released under the GNU General Public License v. 3.0, was used to create a series of intermediate files containing the listings of raw data (*.rdl), of vertical pitch sets (*.vps), of pitch class sets (*.pcs) (S6 Fig), and to create fingerprint grid diagrams (*.map.ps) of the individual music compositional works or parts thereof, respectively.
To generate quasi-phylogenetic trees from the fingerprint data, the binary values (absent-present) of the cells (= positions 1−n, i.e. ‘1 to n’) of the (two-dimensional) grid must be brought into a single (one-dimensional) line (y = 1, x = 1−n → y = 2, x = 1−n → y = 3, x = 1−n → … y = n, x = 1−n), which now represents a binary string and the ‘barcode sequence’ of the selected music compositional work (or parts thereof). A sequence of grid points may be coded with ‘1’ for the filled cells in the diagram and with ‘0’ for empty cells. A sequence of grid cell points from a grid diagram alignment based on pcs cardinalities 1 to 5 (pcs 1–1, pcs 2–1 to pcs 2–6, pcs 3–1 to pcs 3–12, pcs 4–1 to pcs 4–29, and pcs 5–1 to pcs 5–38) therefore has a length of 86 × 86 = 7,396 positions or digits. Alignments based on pcs cardinalities 1 to 6 (i.e. including pcs 1–1 to pcs 6–50) even comprise (86 + 50) × (86 + 50) = 18,496 positions as in the present study (S7A Fig). However, the corresponding grid diagrams of the given examples only consider cardinalities up to 5. As in biological phylogenetics, quasi-phylogenetic trees can now be calculated from such alignments using various methods.
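The row-by-row linearisation of the grid and the resulting sequence lengths can be checked with a few lines. This is a minimal sketch with hypothetical function names; it assumes the grid is given as a list of rows, with the first row corresponding to y = 1, and it uses the set-class counts per cardinality stated in the text.

```python
# Set classes per cardinality as given in the text (cardinality 6: pcs 6-1 to 6-50).
CLASS_COUNTS = {1: 1, 2: 6, 3: 12, 4: 29, 5: 38, 6: 50}


def barcode_from_grid(grid):
    """Flatten a 2-D presence/absence grid row by row
    (y = 1, x = 1..n, then y = 2, x = 1..n, ...) into a binary string."""
    return "".join("1" if cell else "0" for row in grid for cell in row)


def barcode_length(max_cardinality):
    """Number of positions of a barcode covering cardinalities 1..max_cardinality."""
    n = sum(count for card, count in CLASS_COUNTS.items() if card <= max_cardinality)
    return n * n


if __name__ == "__main__":
    print(barcode_from_grid([[0, 1], [1, 0]]))  # toy 2 x 2 grid
    print(barcode_length(5))                    # 86 x 86
    print(barcode_length(6))                    # 136 x 136
```

Because every work is mapped onto the same fixed axis, all barcodes have identical length and position k always refers to the same pcs 2-tuple, which is what makes the sequences directly alignable.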
As a use case, a total of 28 (20) compositions by 14 (10) composers from the Baroque era, 10 compositions by 4 composers from the Viennese School, and 32 compositions by 16 composers from the Romantic era were selected for creating grid diagrams (numbers of works considered in the quasi-phylogenies in brackets). The names of the compositions and composers as well as the years of composition are listed in Table 1A–1C. For reasons of comparability and practicability, the first 20 bars (without the upbeat), mostly of the respective first movement of these compositions, were included. The resulting fingerprints are provided as S2, S3A−S3J, S4A–S4AF and S5 Figs.
The individual barcode sequences of identical length, derived from the procedural RAMEAU v. 1.1 pipeline, were manually aligned into a matrix (alignment) (S7A Fig). In addition, the two sequences per composer were combined into pairs before aligning (S7B Fig).
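Assembling the two alignments can be sketched as follows. The sketch assumes that the *.fas barcode sequence files are FASTA-formatted text, which is inferred from the file extension only; the function names, the composer/work keys and the toy sequences are hypothetical. In the study the alignment was done manually, so this merely illustrates the data layout.

```python
def concatenate_per_composer(barcodes):
    """Combine the barcode sequences of each composer by concatenation.

    barcodes: dict mapping (composer, work) -> binary string.
    Works are joined in sorted key order so the result is reproducible.
    """
    combined = {}
    for (composer, _work), seq in sorted(barcodes.items()):
        combined[composer] = combined.get(composer, "") + seq
    return combined


def to_fasta(sequences, width=80):
    """Serialise name -> sequence as FASTA text; all sequences must have
    identical length so that they form a valid alignment."""
    if len({len(seq) for seq in sequences.values()}) > 1:
        raise ValueError("sequences in an alignment must have identical length")
    lines = []
    for name, seq in sequences.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy barcodes of length 4; real ones have 18,496 positions each.
    works = {
        ("Bruckner", "work_a"): "0101",
        ("Bruckner", "work_b"): "1100",
        ("Wagner", "work_a"): "0000",
        ("Wagner", "work_b"): "1111",
    }
    print(to_fasta(concatenate_per_composer(works)), end="")
```

Concatenating two sequences of 18,496 positions yields the 36,992 positions reported for the combined sequences.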
In order to gain an idea of which method and which parametrisation is most suitable for the analysis of such alignments/matrices, the three combined data sets were analysed under the average linkage method UPGMA (Unweighted Pair Group Method with Arithmetic mean) [9], neighbor-joining [10], as well as the maximum likelihood [11] method. Neither UPGMA with DendroUPGMA (Distance coefficient: ‘Euclidean distance’, Bootstrap replicates: ‘100’) [12] nor neighbor-joining using MAFFT (Method: ‘All of gap-free sites’; Substitution model: ‘Raw difference’; Bootstrap: ‘on’, ‘1000’) [13] provided quasi-phylogenetic topologies that matched well with the two objectives of the present study with sufficient bootstrap support (data not shown).
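The distance-based methods mentioned above operate on pairwise dissimilarities between the barcode sequences. As a minimal sketch, the ‘raw difference’ between two binary barcodes can be taken as the proportion of positions at which they differ. This is an assumption about what the setting computes, made for illustration only; the sketch does not reproduce DendroUPGMA or MAFFT, and the function names are hypothetical.

```python
def raw_difference(seq_a, seq_b):
    """Proportion of aligned positions at which two barcodes differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("barcodes must have identical length")
    if not seq_a:
        raise ValueError("barcodes must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def distance_matrix(sequences):
    """Pairwise raw differences as a dict keyed by (name_a, name_b)."""
    names = list(sequences)
    return {
        (a, b): raw_difference(sequences[a], sequences[b])
        for a in names
        for b in names
    }


if __name__ == "__main__":
    # Toy barcodes; labels are placeholders, not data from the study.
    toy = {"work_1": "0101", "work_2": "0111", "work_3": "1010"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, dist)
```

A tree-building method such as UPGMA or neighbor-joining would then cluster the works on the basis of such a matrix.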
The presented quasi-phylogenies (Fig 4A and 4B) were created from S7A and S7B Fig by maximum likelihood using IQ-TREE [14] (sequence type: ‘binary’; substitution model: ‘auto-detect’; ‘ultrafast bootstrap approximation’ [15]; number of bootstrap alignments: ‘1000’; SH-aLRT branch test [16]: ‘yes’, repeats: ‘1000’). The quasi-phylogenies were visualised in the software TreeGraph [17]. The resulting topologies were rooted automatically and the clade tips sorted chronologically (‘move subtree up/down’) within the constraints of the clade topologies.
A. Sequences from individual works. B. Combined sequences per composer. [The alignments of the individual works are based on pcs cardinalities 1 to 6 (i.e. including pcs 1–1 to pcs 6–50 with 18,496 variables in individual barcode sequences and 36,992 variables in the combined sequences). The years indicate the dates of composition, and are based on the (final) year of the composition. In the case of compositions by Scarlatti and Couperin, no dates of composition could be found, so the year of the respective composer’s midlife has been chosen as a substitute. Bootstrap support values originate from ultrafast bootstrap approximation (value 1) and from SH-aLRT branch test (value 2)].
Results
It can be shown that polyphonic compositional works of music have intrinsic structures that can be used for the structural analysis of musical works within or across particular musical eras. The method presented here derives the data directly from the musical structure itself as encoded in MIDI-formatted files, and is free of subjective influences, so that a generally usable framework for systematic comparative analyses can be built on it. This study sees itself merely as a proof of concept for a data-driven approach to the analysis of structural evolution using compositions from the Baroque to the Romantic eras (Table 1). Examples from other eras, genres or styles such as impressionism, dodecaphony, modernism, rock and jazz are not included here, but could also have been used as examples. It can be shown that under maximum likelihood the structural evolution of compositions can be visualised with sufficient to very good support values. It can be seen in Fig 4A that in only a few cases do the two investigated compositions per composer form quasi-monophyletic or quasi-paraphyletic pairs, as is the case, for example, with the examined works by A. Bruckner, S. W. Rachmaninov, N. A. Rimsky-Korsakov, D. Scarlatti, G. P. Telemann, and R. Wagner. In contrast, the two corresponding works of composers such as J. Brahms, A. Dvořák, G. Mahler, and R. Strauss show great quasi-phylogenetic distances, most likely due to the time intervals between the respective works. In order to assign the composers to one of the selected eras based on a more robust tree topology, the two sequences per composer were combined by concatenation, resulting in sequences with a length of 36,992 variables (S7B Fig). Fig 4B shows clearly and with quite good support that all composers occur in the clade of their respective era, being a quasi-paraphyletic clade for the Baroque era on the one hand and, on the other, a quasi-monophyletic clade for the Romantic era including the Viennese School.
The quasi-paraphyletic Baroque era clade in this analysis shows no remarkable structures. At least, there are no ‘lost’ composers from the Romantic era clade in it. In the Romantic era clade all selected composers of that period are covered. Interestingly, works of the Viennese School do not form a quasi-monophylum but are scattered within the Romantic era clade. Within the latter clade it is more or less apparent that pcs diversity within each era largely increases over time, i.e. early period works have comparatively low structural diversity, while late period works have relatively high structural diversity with a higher proportion of extentional chromatics, but with exceptions in both directions.
Discussion
This methodological study merely aims to provide evidence that data-driven systematic comparisons between compositional musical works are possible by transforming data that directly form the intrinsic ‘genotypic’ structures as they exist in MIDI-formatted files. Quasi-phylogenies generated from barcode sequence matrices in this way provide insight into the quasi-phylogenetic positions between different compositional works or composers, which cannot be achieved solely based on contextual metadata. The situation is rather similar to that in evolutionary biology. There, traditional taxonomic classification allows assignment of a taxon to a particular genus or family, etc., whereas the phylogenetic representation provides information on the relative position of the individual taxa among each other. Of course, the topology, i.e. the relative positions of the contained units and clades of quasi-phylogenies in the present study is relative and dependent on the choice of compositional works (units) or parts thereof to be included, as well as on the choice of analysis method and parametrisation. However, as in evolutionary biology, the stabilisation of relationships between units and tree clades can be expected to provide an increasingly reliable basis for interpretation as more barcode sequence data are included for analysis. To ‘understand’ the topologies obtained, it is recommended to compare them with the fingerprint grid diagrams and the pcs frequency profiles included in the diagrams (S2–S5 Figs). The green and red coded fingerprint diagram grid cells, representing tuples with extentionally chromatic pcs, may be of particular interest as key factors for the development of evolutionary models for tree generation.
A global correlation between quasi-phylogenetic status and chronological assignment, as achieved in the present study, is reflected to some extent in the two trees for compositions and composers. It is rather obvious, however, that the number of barcode sequences considered in relation to compositions and composers is still too small to confirm this correlation in more detail. Nevertheless, it is regarded as an encouraging result that all composers appear in the clade of their respective era (Fig 4B), even though the musical works of one and the same composer do not always appear in the same subclade (Fig 4A). The latter is not necessarily surprising; the opposite would even contradict findings of traditional musicological analysis, since most composers have developed their style over time according to their personal development and the current zeitgeist. Overall it could be demonstrated that the pcs-tuple-based structural diversity largely developed from the early Baroque to the late Romantic era.
A detailed interpretation of the tree topologies presented here can only be made with caution, since a barcode sequence of 20 bars in length per musical work and two concatenated barcode sequences per composer are definitely not sufficient to make completely reliable assumptions. Nevertheless, we assume that certain tendencies can already be outlined. In the Baroque era, the specific problem is that the inclusion or exclusion of works with ornamentation (mordents, turns, appoggiaturas etc.) can make differences in the structure of the fingerprints and consequently in the positioning of the respective barcode sequences in the quasi-phylogenetic tree. This is one of the reasons why some of the musical works and composers for which fingerprint grid diagrams are shown were excluded from the quasi-phylogenetic analysis.
In the Romantic era, the two barcode sequences of a composer such as R. Wagner (1845 versus 1870) show that the structural features even within a composer’s work do not necessarily increase with time (Fig 4A). In contrast, the works of J. Brahms and R. Strauss each appear in different major subclades, indicating their comparatively great compositional flexibility. The quasi-phylogenetic gap between the compositions of R. Wagner and those of A. Bruckner is remarkable, the latter in sister relationship to J. Brahms (Fig 4B). An extreme is G. Mahler, who in 1888 provided a composition with structural features close to those in the Baroque era clade and in 1902 came into quasi-phylogenetic proximity to P. I. Tchaikovsky (Fig 4A). It is not entirely surprising that compositions by composers of the Viennese School cluster together with those by composers of the Romantic period. The position of W. A. Mozart is remarkable but can be explained by the selection of compositions from his late work. Experimental analysis with his earlier works (e.g., K. 216 and K. 575) showed that these even reach into the quasi-paraphyletic sister subclades of the Baroque (data not shown).
When analysing the quasi-phylograms, the question arises as to what exactly these clades represent. Since the data basis of the quasi-phylogenetic trees is barcode sequences consisting of pcs 2-tuples, the characteristic features of the various clades represent specific combinations of pcs 2-tuples. These, in turn, are determined by the ‘cardinality’ of the pcs in the sense of Forte [5] (not to be confused with the number of voices) and/or the abundance of pcs belonging to extentionally diatonic and extentionally chromatic chords or vertical pitch sets in the sense of Rambold [6]. It is apparent that the compositions positioned near the root tend to have pcs of a lower cardinality and/or a lower proportion of extentionally chromatic vertical pitch sets. In the late Romantic era, the situation is reversed, with composers whose musical oeuvres tend to have a higher pcs cardinality and/or a higher proportion of extentionally chromatic pitch sets.
It remains to be seen whether a higher number of compositions or elements considered in the alignments really leads to better robustness of the quasi-phylogenetic tree topologies. However, in well-planned music analysis projects, i.e. those based on a specific concept, it may happen that only a limited number of compositional works or parts thereof, need to be examined anyway. From the observations we have made in this study (quasi-phylogeny of compositions versus that of composers with combined sequences), it seems fairly certain that merging barcode sequences from different compositions or from movements of a given piece of music will lead to more meaningful and better-supported quasi-phylogenetic topologies. Moreover, focusing on the works of one composer might show how widely the composer’s oeuvre is scattered across the quasi-phylogeny of music.
Before the method of analysis presented here can be widely applied, it is likely that a collaborative process will take place in the research community over time to reach agreement on an optimal approach to object selection criteria. This concerns, among other aspects, the number of variables per barcode sequence and the minimum number of sequences to be included in an analysis. In composition-based analysis, there may be a convention that each movement of a given musical work should be represented by at least one barcode sequence, analogous to multi-gene approaches in evolutionary biology. Of course, it would also be conceivable to create and combine sequences of whole movements analogous to full genome sequencing. A particular challenge may be the question of ornamentation as in Baroque music. In this case, it is possible to remove such ornamental structures before the MIDI files are created for subsequent analysis. Furthermore, the inclusion of specific evolutionary models [18] could provide further improvement of the results. Their development could be a task for future music informatics. In addition, the extent to which branch support and branch lengths need to be taken into account will need to be discussed [19], and the same applies to bootstrap replicate numbers [20]. Different approaches could be developed to analyse the similarity of the compositional oeuvres, either on the basis of combined barcode sequences from different compositional works (by concatenation as in the sequence alignment for Fig 4B, or by creating alignments of consensus sequences) or based on individual compositions or parts thereof. It would also need to be agreed which method/algorithm and parameterisation should be chosen.
We assume that in future studies of this kind, in which perhaps hundreds of music compositional works will be examined, more or less large deviations of the quasi-phylogenetic tree topologies presented here will emerge, but that sooner or later they will stabilise through saturation and form a kind of framework for musicological work. Certainly, a musicology based solely on such analyses would be rather poor in substance. But musicological analyses that do not take these intrinsic structures into account are arguably incomplete. It should be emphasised again that in this approach, the basic analytical framework (pcs-tuple-based grid diagrams and quasi-phylogenetic trees derived from them) is entirely based on the intrinsic structures of the targeted compositional works, and not only on accompanying contextual data (metadata) whose underlying classifications themselves already represent a kind of pre-interpretation of the features. However, by mapping such (meta-)data onto the obtained quasi-phylogenetic tree topologies, new relationships and correlations can be identified for further analysis. It should also be emphasised that the proposed approach (or other types of visualisations such as ordination diagrams based on the presented type of datasets) can be used not only for comparing works of different composers within a particular era or genre, but of course also for investigating developments within the oeuvre of a single composer.
To this end, a public repository would be useful to make data resources freely available for musicological analysis. Such infrastructure already exists in the natural sciences, e.g. BOLD (https://www.boldsystems.org/), GenBank (https://www.ncbi.nlm.nih.gov/genbank/) or DDBJ (https://www.ddbj.nig.ac.jp/index-e.html) for gene sequence data and UniProtKB (https://www.uniprot.org/) for protein sequence data. In order to collaboratively realise a growing number of quasi-phylogenetic trees of musical compositions, it is recommended to set up freely accessible archives in which MIDI v. 1 files are published and archived according to FAIR guiding principles [21] together with a mandatory minimum set of contextual data (metadata) for analysis, e.g. by means of RAMEAU v. 1.1 (online service pending) or comparable software (e.g. R scripts pending setup; https://www.r-project.org/).
Conclusion
With the present study, we were able to show that pcs 2-tuple-based barcode sequences represent elementary ‘genotypic’ structures, which can be used to characterise polyphonic music on the basis of fingerprints and to classify it by quasi-phylogenetic trees. By applying the maximum likelihood method, we were able to more or less unambiguously achieve the two goals of the study that a) the compositions and/or composers of at least one era formed a quasi-monophyletic group and that b) within one quasi-monophylum it was possible to read to some extent a chronology of the compositions with sufficient statistical support.
Based on experience with evolutionary analyses and visualisations in other scientific fields, we anticipate that the use of fingerprints, barcode sequences and quasi-phylogenies provides options for comparative computational and machine learning-based musicology, particularly as a structural backbone to address different aspects of music theory against the background of data from traditional and other approaches in computational musicology.
Supporting information
S1 Fig. Multi-track MIDI input files of the examined music compositional works.
https://doi.org/10.1371/journal.pone.0280478.s001
(ZIP)
S2 Fig. Figs A−AB. Fingerprint grid diagrams of pcs n- pcs n+1 2-tuples and relative pcs abundances of the examined music compositional works from the Baroque era.
https://doi.org/10.1371/journal.pone.0280478.s002
(ZIP)
S3 Fig. Figs A−J. Fingerprint grid diagrams of pcs n- pcs n+1 2-tuples and relative pcs abundances of the examined music compositional works from the Viennese School.
https://doi.org/10.1371/journal.pone.0280478.s003
(ZIP)
S4 Fig. Figs A−AF. Fingerprint grid diagrams of pcs n- pcs n+1 2-tuples and relative pcs abundances of the examined music compositional works from the Romantic era.
https://doi.org/10.1371/journal.pone.0280478.s004
(ZIP)
S5 Fig. Fingerprint grid diagrams of the examined music compositional works from the Baroque to the Romantic era (pdf).
https://doi.org/10.1371/journal.pone.0280478.s005
(DOCX)
S6 Fig. RAMEAU v.1 intermediate files of the examined music compositional works, including the Vertical Pitch Set files (*.vps), Rich Data Listing files (*.rdl) and Pitch Class Set Listing files (*.pcs) extracted from them.
https://doi.org/10.1371/journal.pone.0280478.s006
(ZIP)
S7 Fig. Alignments of A. Individual (‘compositions’) and B. Combined (‘composers’) Barcode Sequence files (*.fas) of the examined music compositional works from the Baroque, Viennese School and the Romantic era.
https://doi.org/10.1371/journal.pone.0280478.s007
(ZIP)
Acknowledgments
We thank Claire Arthur (Georgia Institute of Technology, Atlanta, USA) and an anonymous reviewer for constructive criticism and suggestions on an earlier version of the manuscript. We also thank Kazutaka Katoh (RNA team, CBRC, AIST, Tokyo, Japan) and Derek Peršoh (University of Bayreuth, Germany) for fruitful discussions on details of phylogram generation, and Dagmar Triebel (SNSB IT Center, München, Germany) for advice on open source licensing issues.
References
- 1. Ratnasingham S, Hebert PDN. BOLD: The Barcode of Life Data System (www.barcodinglife.org). Mol Ecol Notes 2007;7, 355–364. pmid:18784790.
- 2. Swofford DL, Olsen GJ. Phylogeny reconstruction. In: Hillis D. M., Moritz G.: Molecular Systematics. Sinauer Associates, Sunderland, Mass. 1990, 411–501.
- 3. Peršoh D, Beck A, Rambold G. The distribution of ascus types and photobiontal selection in Lecanoromycetes (Ascomycota) against the background of a revised SSU nrDNA phylogeny, Mycological Progress 2004;3(2), 103–121.
- 4. de Queiroz K, Gauthier J. Toward a phylogenetic system of biological nomenclature. Trends Ecol Evol 1994;9(1), 27–31. pmid:21236760.
- 5. Forte A. The structure of atonal music. New Haven and London: Yale University Press; 1973. ISBN 0-300-01610-7 (cloth) ISBN 0-300-02120-8.
- 6. Rambold A. Elementare Strukturen und Funktionen des musikalischen Satzes. Eine tonmengenbezogene, intervallorientierte Funktionstheorie und ihre Anwendung in der Analyse. 409 pp. ARION, Baden-Baden; 2012.
- 7. Rambold A, Rambold G. RAMEAU Music analysis system. 34 pp. ARION, München; 1994. [Software, source code and resources: https://github.com/Rameau-Project/Rameau].
- 8. Rudolph TE, Leonard VA. Sibelius: a comprehensive guide to Sibelius music notation software. Milwaukee, WI: Hal Leonard Corp. 2007.
- 9. Sokal RR, Michener CD. A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin 1958;38, 1409–1438.
- 10. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 1987;4(4), 406–425. pmid:3447015
- 11. Huelsenbeck JP. Phylogeny Estimation Using Likelihood-Based Methods. In: Balding D, Moltke I, Marioni J. (eds), Handbook of Statistical Genomics. 4th ed. 2019. Wiley. https://doi.org/10.1002/9781119487845.ch6
- 12. Garcia-Vallvé S, Puigbo P. DendroUPGMA: A dendrogram construction utility. 2016. http://genomes.urv.cat/UPGMA/UPGMAboot_v12.cgi.
- 13. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 2013;30, 772−780. pmid:23329690.
- 14. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 2020;37, 1530–1534. pmid:32011700
- 15. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: Improving the ultrafast bootstrap approximation. Molecular Biology and Evolution. 2018;35(2), 518–522. pmid:29077904
- 16. Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic Biology 2010;59(3), 307–321. pmid:20525638
- 17. Stöver BC, Müller KF. TreeGraph 2: Combining and visualizing evidence from different phylogenetic analyses. BMC Bioinformatics 2010;11, 7. http://treegraph.bioinfweb.info/. pmid:20051126
- 18. Pinelis I. Evolutionary models of phylogenetic trees. Proc Biol Sci 2003;270 (1522): 1425–1431. pmid:12965036.
- 19. Wiens JJ, Kuczynski CA, Smith SA, Mulcahy DG, Sites JW Jr, Townsend TM, et al. Branch Lengths, Support, and Congruence: Testing the Phylogenomic Approach with 20 Nuclear Loci in Snakes. Systematic Biology 2008;57(3), 420–431. pmid:18570036
- 20. Pattengale NP, Alipour M, Bininda-Emonds ORP et al. How many bootstrap replicates are necessary? J Comput Biol 2010;17(3), 337–354. pmid:20377449
- 21. Wilkinson MD, Dumontier M, Aalbersberg IJ et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data 2016;3, 160018. pmid:26978244.