Assessing Phylogenetic Relationships among Galliformes: A Multigene Phylogeny with Expanded Taxon Sampling in Phasianidae

Galliform birds (relatives of the chicken and turkey) have attracted substantial attention due to their importance to society and value as model systems. This makes understanding the evolutionary history of Galliformes, especially the species-rich family Phasianidae, particularly interesting and important for comparative studies in this group. Previous studies have differed in their conclusions regarding galliform phylogeny. Some of these studies have suggested that specific clades within this order underwent rapid radiations, potentially leading to the observed difficulty in resolving their phylogenetic relationships. Here we presented analyses of six nuclear intron sequences and two mitochondrial regions, an amount of sequence data larger than many previous studies, and expanded taxon sampling by collecting data from 88 galliform species and four anseriform outgroups. Our results corroborated recent studies describing relationships among the major families, and provided further evidence that the traditional division of the largest family, the Phasianidae into two major groups (“pheasants” and “partridges”) is not valid. Within the Phasianidae, relationships among many genera have varied among studies and there has been little consensus for the placement of many taxa. Using this large dataset, with substantial sampling within the Phasianidae, we obtained strong bootstrap support to confirm some previously hypothesized relationships and we were able to exclude others. In addition, we added the first nuclear sequence data for the partridge and quail genera Ammoperdix, Caloperdix, Excalfactoria, and Margaroperdix, placing these taxa in the galliform tree of life with confidence. Despite the novel insights obtained by combining increased sampling of taxa and loci, our results suggest that additional data collection will be necessary to solve the remaining uncertainties.


Introduction
The avian order Galliformes (landfowl) contains about 290 species [1], many of which (e.g., chicken, turkey and peacock) are closely associated with human society. Some of these birds are also important model systems in areas as diverse as development [2], disease transmission [3], and sexual selection [4]. Thus, a well-resolved galliform phylogeny is necessary to address a wide range of questions such as the geographic origin of certain lineages [5], the possible transmission and evolution of avian pathogens [3], and the evolution of sexual traits [4,6-9]. Finally, approximately 25% of galliform species worldwide are threatened (critically endangered, endangered, or vulnerable) based upon the IUCN red list [10]. This includes 11 galliform species found in China, of which ten were included in our phylogeny. There is substantial interest in using phylogenetic information to inform conservation priorities [11], and previous efforts to understand the use of phylogenetic information to establish conservation priorities have used galliforms as a model system [12], making a well-resolved galliform phylogeny even more critical.
Various types of data have been used to elucidate relationships within Galliformes, with researchers in the early to mid-1900s using traits such as tail molt patterns [22] or anatomical traits [23]. Although those studies do not use cladistic methods, their classifications are used to develop ''traditional'' ideas regarding relationships within Galliformes [20,24] (Figure 1A). Later studies include a variety of biochemical and molecular methods, ranging from immunological comparisons of proteins and allozyme analyses [25-27] and protein sequence data [26,28] to DNA-DNA hybridization and restriction site analyses [18,29]. Sibley and Ahlquist [18], by combining the results of DNA-DNA hybridization analyses with other lines of evidence available at that time, hypothesize that the turkeys and grouse nest within the Phasianidae (Figure 1B). More recent analyses use sequence data [17] (both mitochondrial and nuclear), cladistic analyses of morphological and behavioral traits [16] (hereafter M/B traits), transposable element (TE) insertions [30,31], and combinations of various subsets of these data [5] to build galliform phylogenies. Although there are varying levels of disagreement among those more recent studies, it is possible to use supertree methods to summarize their results and visualize the best-supported clades in them [7,32] (e.g., Figure 1C). These topologies can be viewed as informal priors that can be used to evaluate novel data, although it is clear that there are substantial differences among these studies. Thus, in spite of intensive study, a well-resolved and well-supported phylogeny representing a large diversity of gamebirds remains to be realized.
The remaining controversies regarding the branching order of Galliformes, especially within the Phasianidae, limit our ability to examine their evolution from many different perspectives. This includes studies as wide ranging as comparative morphology [8,17], behavioral and ecological comparisons for these taxa [6,7], and assessments of conservation priorities [11]. Thus, additional studies focused on this group are warranted. Here we focus on extending the mitochondrial DNA (Mt) and nuclear intron datasets to improve taxon sampling, particularly within the Phasianidae, while limiting the amount of missing data. Using this data matrix we 1) placed our results in the context of a historical review of previous estimates of galliform phylogeny; 2) added taxa for which the molecular data were limited or absent; 3) attempted to resolve groups that were controversial in previous studies; and 4) discussed the classification of Galliformes based on our best estimate of their phylogeny.

Ethics Statement
The majority of samples for this study were part of existing tissue collections and thus were not collected for this study. For the remaining samples, permission to collect the samples was granted by the Management Bureau of local national reserves (Dongzhai National Nature Reserve in Henan Province and Shennongjia National Nature Reserve in Hubei Province) or the managers of Beijing Breeding Center for Endangered Animals. Tissue samples were collected from birds that had died naturally in the field, and blood samples were collected gently from the brachial veins of captured birds that were released afterwards. All procedures involving the handling of birds were approved by the Institutional Animal Care and Use Committee of Beijing Normal University.

Data Collection
We sampled 92 species, four of which were outgroups (from Anseriformes, the sister order of Galliformes) (Table S1). While data for some species were taken from GenBank, most data (~80%) have not been published previously.
DNA was extracted from blood or muscle tissue using the TIANamp Genomic DNA Kit (TIANGEN Biotech, China) or the PUREGENE® DNA Purification Kit (Gentra Systems). We amplified two Mt genes (ND2 and 12S) and six nuclear intron sequences (CLTC, CLTCL1, EEF2, FGB, SERPINB14, and RHO) using a combination of published [33,34] and unpublished primers (Table S2). Loci were initially amplified using a single annealing temperature optimized for each locus, but later samples were amplified using a single touchdown PCR protocol that successfully amplified all loci (the annealing temperature was lowered from 58°C to 48°C in steps of 0.5°C per cycle, followed by 20 additional cycles at 48°C). All PCR products were examined for size on 1% agarose gels and purified using a PCR purification kit (QIAGEN) or by PEG:NaCl (20%:2.5 M) precipitation. Samples were sequenced in both directions, using either ABI BigDye® Terminator v.3.1 on an ABI Prism™ 3100-Avant genetic analyzer (PE Applied Biosystems) or by the Beijing Genomic Institute. Sequences were assembled using either Sequencher™ 4.1 (Gene Codes Corp.) or MEGA 4.0 [35]. The novel sequences collected in this study have been deposited in GenBank (Table S3).
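The touchdown protocol described above can be written out as an explicit per-cycle schedule. The following Python sketch is illustrative only: the function name and defaults are hypothetical, and it assumes that the touchdown phase includes one cycle at each temperature from 58°C down to and including 48°C before the 20 additional cycles, which the text does not state explicitly.

```python
def touchdown_schedule(start=58.0, end=48.0, step=0.5, extra_cycles=20):
    """Return the annealing temperature (degrees C) for every PCR cycle.

    Assumption: the touchdown phase runs one cycle at each temperature
    from `start` down to and including `end`.
    """
    n_steps = round((start - end) / step)
    # Touchdown phase: lower the annealing temperature by `step` each cycle.
    temps = [start - i * step for i in range(n_steps + 1)]
    # Plateau phase: additional cycles at the final annealing temperature.
    temps.extend([end] * extra_cycles)
    return temps


schedule = touchdown_schedule()
print(len(schedule), schedule[:3], schedule[-1])
```

Under that assumption the schedule contains 21 touchdown cycles followed by 20 plateau cycles; a protocol that counts the first 48°C cycle as part of the plateau would have one cycle fewer.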

Phylogenetic Analyses
Sequence alignment. Different alignment methods can have a major impact on phylogenetic estimation [36,37]. To determine whether alignment influenced our conclusions, we aligned each locus individually using three automated alignment programs: Mafft v.6.717 [38], Tcoffee [39], and Muscle [40]. We then combined alignments generated by the same program into three concatenated data matrices (only Mt genes, only nuclear introns, and all sequence data) using combine-0.9 (written by ELB). To assess the impact of the different sequence alignment programs, we conducted unpartitioned and partitioned maximum likelihood (ML) analyses in RAxML 7.2.6 [41] using the GTR+Γ model and ten randomized starting trees (all partitioned analyses were partitioned either by locus or by defining each gene type as a partition: Mt genes vs. nuclear introns). Then we generated a neighbor-joining (NJ) tree of the Robinson-Foulds (RF) distances [42] among these trees to visualize their differences. Although the alignments given by the three programs differed in the positions and lengths of gaps, all combined data matrices of the same data type produced trees that were very similar to each other (RF distances among trees that resulted from analyses using different alignments were similar to the RF distances among trees that resulted from different analyses using the same alignment, Figure S2), suggesting that sequence alignment had little impact upon tree reconstruction. Thus, we conducted the remaining analyses using the alignment generated by Mafft v.6.717.
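The RF distance used above counts the bipartitions (splits) that occur in one tree but not in the other. The Python sketch below shows the idea on toy trees. It is not the software used in this study. The function names and the minimal Newick parser are assumptions made for illustration, and the taxa A to E are placeholders rather than galliform data.

```python
def leaf_sets(newick):
    """Return (all_leaves, clades); clades has one leaf-name set per internal node."""
    s = "".join(newick.split()).rstrip(";")
    clades = []

    def read_token(pos):
        # Read a label and optional ":length" up to the next structural character.
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos], pos

    def parse(pos):
        if s[pos] == "(":
            leaves = set()
            pos += 1  # skip "("
            while True:
                child, pos = parse(pos)
                leaves |= child
                if s[pos] == ",":
                    pos += 1
                else:
                    break
            pos += 1  # skip ")"
            _, pos = read_token(pos)  # ignore internal label / branch length
            clades.append(frozenset(leaves))
            return leaves, pos
        token, pos = read_token(pos)
        return {token.split(":")[0]}, pos

    all_leaves, _ = parse(0)
    return frozenset(all_leaves), clades


def splits(newick):
    """Nontrivial bipartitions, each stored as the side lacking a reference leaf."""
    all_leaves, clades = leaf_sets(newick)
    ref = min(all_leaves)
    result = set()
    for clade in clades:
        side = all_leaves - clade if ref in clade else clade
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


def rf_distance(tree1, tree2):
    """Robinson-Foulds distance: number of splits found in exactly one tree."""
    return len(splits(tree1) ^ splits(tree2))


t1 = "((A,B),(C,D),E);"
t2 = "((A,C),(B,D),E);"
t3 = "((A,B),(C,E),D);"
print(rf_distance(t1, t1), rf_distance(t1, t2), rf_distance(t1, t3))
```

A matrix of such pairwise distances among all ML trees is the kind of input from which an NJ tree of trees can be built. The sketch assumes both trees contain the same set of taxa.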
Data analyses. Phylogenetic analyses were conducted using the ML criterion and Bayesian Markov Chain Monte Carlo (MCMC) inference. The best-fitting models for each locus and for the concatenated dataset were established in Modeltest 3.7 [43] using the second-order variant of the Akaike Information Criterion (AICc). For the concatenated dataset, we conducted ML analysis in PAUP*4.0b10 [44] using the best-fitting model (GTR+Γ+I) in addition to the analyses conducted using the GTR+Γ model in RAxML (as described above in the material on sequence alignment). RAxML and PAUP* identified the same optimal tree topology; since RAxML is more computationally efficient than PAUP*, all other ML analyses were conducted in RAxML using the GTR+Γ model and 10 randomized starting trees. We compared the fit of the GTR+Γ model with and without partitioning by using the AICc based on the likelihood scores, using the numbers of free parameters reported by RAxML and treating the number of variable characters as the sample size. We also used RAxML with GTR+Γ to conduct ML bootstrap analyses with 500 replicates [45] with and without partitioning.
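For reference, the second-order AIC takes the standard form AICc = -2 lnL + 2k + 2k(k + 1)/(n - k - 1), where lnL is the log-likelihood, k the number of free parameters, and n the sample size (here, the number of variable characters). A minimal Python sketch follows. The function name is hypothetical and the numbers in the example are invented for illustration, not values from this study.

```python
def aicc(log_likelihood, k, n):
    """Second-order Akaike Information Criterion.

    log_likelihood: maximized log-likelihood (lnL) of the model
    k: number of free parameters
    n: sample size (the text above uses the number of variable characters)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    # Small-sample correction term; it vanishes as n grows relative to k.
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


# Hypothetical values for illustration only.
print(round(aicc(-1000.0, k=10, n=500), 4))
```

The model with the lower AICc is preferred, so a partitioned model is favored only if its gain in likelihood outweighs the penalty for its additional parameters.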
MrBayes 3.1 [46] was used to conduct Bayesian MCMC analyses. Each locus was assigned the best model based on the AICc (or the closest available model implemented in MrBayes). We conducted two runs simultaneously, each with one cold chain and three heated chains, and ran the analyses for 10 million generations, sampling every 100 generations. The standard deviation of split frequencies between the two runs was below 0.01, which indicated convergence upon a specific topology. The potential scale reduction factor (PSRF [47]) was also used as a diagnostic to examine convergence. Convergence is suggested if the PSRF approaches 1, as it did for our analyses. Moreover, the effective sample size (ESS) values of various parameters were all greater than 200 based upon inspection using Tracer v1.4 [48], which suggested sufficient sampling in the MCMC runs. Finally, our analyses appeared to converge based upon analyses using AWTY [49] (Figure S3). We deleted the first 25% of trees as burn-in and used the rest to generate the consensus tree.
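To make the PSRF diagnostic concrete, the Python sketch below computes the basic Gelman-Rubin statistic for one parameter from several chains of equal length. It is a textbook form given under stated assumptions. The function name is hypothetical, the toy chains are invented, and the exact computation inside MrBayes may differ in detail.

```python
from math import sqrt
from statistics import mean, variance


def psrf(chains):
    """Basic Gelman-Rubin potential scale reduction factor for one parameter.

    chains: list of post-burn-in sample lists, one per independent run,
    all of the same length.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero")
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return sqrt(pooled / within)


# Toy chains for illustration only.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagreeing = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
print(psrf(agreeing), psrf(disagreeing))
```

Chains that sample the same distribution give values near 1, whereas chains stuck in different regions give values well above 1.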
Gene tree-species tree analyses. We also estimated the species tree from individual gene trees, in addition to using concatenation. We used both NJst [50] and STAR [51] on the Species Tree Webserver [52]. To accommodate uncertainty in the gene tree estimates, we performed 500 bootstrap replicates on each locus (combining the two mitochondrial regions as a single locus) using the GTR+Γ model in RAxML. STAR requires rooted gene trees with a single outgroup sequence. Therefore, we used a single anseriform (the Southern screamer Chauna torquata) as the outgroup. Since one locus, SERPINB14, lacked outgroup sequences, this locus was not included in the STAR analysis. We conducted two NJst analyses, one using all loci and a second excluding SERPINB14 (to allow a direct comparison to the results from STAR).
Testing the robustness of phylogenetic inference. We used several methods to test the robustness of our estimate of the tree topology. First, the Bayesian posterior probability and the ML bootstrap values were used to assess the support of each clade. Second, gene jackknifing was used to determine whether there was a single gene that had a large impact on the tree topology by excluding one locus at a time and using the remaining data for analyses in RAxML. Moreover, similar analyses were conducted by excluding either Mt genes (leaving only nuclear introns) or nuclear introns (leaving only Mt genes) to test the contribution of different type of DNA markers to the overall ML tree topology.
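The leave-one-out design of the gene jackknife, together with the marker-type subsets, can be enumerated as in the following Python sketch. The locus names are those listed in the Data Collection section. The function and variable names are hypothetical, and the sketch only builds the locus lists; it does not run RAxML.

```python
LOCI = ["ND2", "12S", "CLTC", "CLTCL1", "EEF2", "FGB", "SERPINB14", "RHO"]
MT_LOCI = {"ND2", "12S"}  # the two mitochondrial regions


def jackknife_sets(loci):
    """Yield (excluded_locus, remaining_loci) for every leave-one-out subset."""
    for excluded in loci:
        yield excluded, [locus for locus in loci if locus != excluded]


# Marker-type subsets analyzed alongside the jackknife replicates.
nuclear_only = [locus for locus in LOCI if locus not in MT_LOCI]
mt_only = [locus for locus in LOCI if locus in MT_LOCI]

for excluded, remaining in jackknife_sets(LOCI):
    print(f"without {excluded}: {', '.join(remaining)}")
```

Each remaining-locus list defines one concatenated matrix to be analyzed separately, giving eight jackknife datasets plus the nuclear-only and Mt-only datasets.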
We also explored whether substitutional saturation or base compositional heterogeneity might have affected our conclusions. We used the Iss metric calculated by DAMBE [53] to test for saturation (incorporating the estimated proportion of invariant sites) and the χ² test implemented in PAUP* to examine the homogeneity of base frequencies across taxa for each locus. Finally, we conducted RY coding (changing purines [A and G] to R and pyrimidines [C and T] to Y) because many previous analyses indicate that this strategy can reduce the influence of substitutional saturation and base compositional heterogeneity. This was accomplished by using combine-perl (written by ELB) and then reestimating the phylogeny for both the unpartitioned and the partitioned analyses in RAxML as described above.
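RY coding as defined above is a simple character recoding. The Python sketch below is not the combine-perl script used in this study. The function name is hypothetical, and leaving gaps, missing data, and other ambiguity codes unchanged is an assumption of this example.

```python
# Purines (A, G) become R; pyrimidines (C, T) become Y.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def ry_code(sequence):
    """Recode an aligned nucleotide sequence into purine/pyrimidine states.

    Assumption: gaps, missing data, and other ambiguity codes are
    passed through unchanged.
    """
    return sequence.upper().translate(RY_MAP)


print(ry_code("ACGT-acgtN?"))
```

After recoding, only transversions remain visible as character-state changes, which is why the approach reduces the effect of saturation at transitions but also discards some phylogenetic signal.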

Patterns of Sequence Evolution for the Combined Dataset
The concatenated data matrix aligned by Mafft was 7147 bp in length and it had 59.6% variable sites and 49.4% informative sites. There were 2119 bp of Mt DNA sequence with 55% variable sites and 47.8% informative sites, while the remaining 5028 sites were from nuclear intron sequences that had 61.6% variable sites and 50% informative sites. Thus, the nuclear intron sequences had a greater percentage of variable sites than Mt genes. The consistency index (CI) and the retention index (RI) of the nuclear introns were higher than those of the Mt genes (Table 1). The shape parameter (α) of the Γ distribution describing among-site rate variation, which was included in the best-fit models for all partitions except SERPINB14, was lower for the two Mt genes than for the nuclear introns, indicating that the Mt regions had greater among-site rate heterogeneity (Table 1). The results were consistent with the hypothesis that the nuclear introns exhibit less complex patterns of molecular evolution relative to Mt genes [15,54].
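The percentages of variable and informative sites reported above follow the usual definitions: a site is variable if at least two states are present, and parsimony-informative if at least two states each occur in at least two taxa. The Python sketch below applies those definitions to a toy alignment. The function name and the sequences are invented for illustration, and ignoring gaps and ambiguous characters is an assumption of this example.

```python
from collections import Counter


def site_stats(alignment):
    """Return (percent variable, percent parsimony-informative) sites.

    alignment: list of aligned sequences of equal length.
    Assumption: only A, C, G and T count as states; gaps and
    ambiguity codes are ignored.
    """
    length = len(alignment[0])
    variable = informative = 0
    for i in range(length):
        counts = Counter(seq[i].upper() for seq in alignment)
        states = {base: n for base, n in counts.items() if base in "ACGT"}
        if len(states) >= 2:
            variable += 1
            # Informative: at least two states shared by two or more taxa.
            if sum(1 for n in states.values() if n >= 2) >= 2:
                informative += 1
    return 100.0 * variable / length, 100.0 * informative / length


# Toy alignment for illustration only.
toy = ["ACGTA", "ACGTA", "ATGAA", "ATGAC"]
print(site_stats(toy))
```

In the toy alignment the last column is variable but not informative, because the state C occurs in a single sequence.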

Higher Level Relationships among Galliform Birds
The topologies of the partitioned ML trees, whether using two partitions (Mt and nuclear) or eight partitions, were identical, although there were differences between the unpartitioned and the partitioned ML trees (Figure S4 and Treefile S1). The partitioned model (i.e., partitioned by locus) fit the data better than the unpartitioned model based upon the AICc (185769.1 vs. 190842.9). Moreover, the Bayesian consensus tree had almost the same tree topology as that given by the partitioned ML analyses (Figures 2 and 3, Figure S4 and Treefile S1). These analyses confirmed there are five major clades within Galliformes that correspond to five of the traditional families (Figure 2). The first divergence within Galliformes was between Megapodiidae and all other taxa, and the remaining families branched successively after that in the order Cracidae, Numididae, Odontophoridae, and Phasianidae. The traditional Meleagrididae and Tetraonidae (hereafter referred to as turkey and grouse) nested within the Phasianidae and were sister to each other.
Within Phasianidae, the most species-rich family of Galliformes, relationships among genera were robustly recovered in the Bayesian analysis (Figure 2). In general, there were three major clades within Phasianidae. The earliest divergence was between Arborophilinae sensu Crowe et al. [5], a clade that contained the Hill partridge (Arborophila spp.), Crested Partridge (Rollulus rouloul), and Ferruginous Partridge (Caloperdix oculea), and the other phasianids. The second clade included most of the pheasants (e.g., Phasianus, Lophura, and related genera), the typical Grey partridge and relatives (Perdix spp.), the turkey and the grouse. This group has been designated the ''erectile clade'' [17] and this study extended those findings by including the Blood Pheasant (Ithaginis cruentus) in the erectile clade. In addition to the strongly supported Arborophilinae and erectile clade, there was a third clade that only received marginal support. This third clade comprised junglefowl (Gallus spp.), peacock-pheasants (Polyplectron spp.), peafowl (Pavo spp.) and their allies within the traditional pheasants, as well as many partridges, Old World quail, and francolins (i.e., Alectoris, Ammoperdix, Coturnix, Excalfactoria, Margaroperdix, Tetraogallus, and Francolinus). Although there were well-supported groups within this last clade, relationships among the groups did not always receive strong support (Figure 3) and those poorly supported clades were united by short branches, consistent with the hypothesis that there was a rapid radiation.

The Performance of Gene Tree-Species Tree Methods
Analyses of concatenated sequences can be an inconsistent estimator of the species tree [55], so we also used gene trees to obtain an estimate of the species tree using NJst (Figure 4) and STAR (Supporting information Treefile S1). These trees were very similar and differed only at poorly supported nodes (e.g., typically fewer than 50% of input species trees had the node in question). In terms of the basal structure of the galliforms, the NJst tree placed the peafowl clade (i.e., Pavo, Afropavo, and Argusianus) sister to all phasianids except the Arborophilinae (albeit with less than 50% support). In contrast, analyses of concatenated data (e.g., Figures 2 and 3) and STAR (Supporting information Treefile S1) placed the peafowl clade into a larger clade comprising junglefowl, peacock-pheasants, Old World quail, francolins, and many partridges. However, this difference appeared to reflect the exclusion of SERPINB14 from the STAR analysis; the NJst tree estimated without SERPINB14 (Supporting information Treefile S1) placed the peafowl clade in the same position as the concatenated analyses (Figures 2 and 3) and the STAR analysis. This suggested that SERPINB14 might have a strong signal that differs from the other loci included in our analysis, possibly reflecting a distinct gene tree.
There were several other differences within the major phasianid clades when comparing the gene tree-species tree analyses with the concatenated analyses. However, as with the position of the peafowl clade above, all of these relationships were poorly supported in the NJst tree and were not consistently present in the other analyses (e.g., Figure 3). For example, within the erectile clade, the position of the Cheer Pheasant (Catreus wallichii) differed between Figures 3 and 4. Additionally, within the third clade in the Phasianidae (which included the clades with the junglefowl, peacock-pheasants, and most partridges and francolins), relationships among the different groups varied among all of the analyses (e.g., compare Figures 2, 3, and 4). However, while there were some differences, all of the consistent and well-supported nodes in the concatenated analyses were also present in the species trees estimated from gene trees, suggesting that our conclusions were robust to the type of analysis.

Tests of Tree Robustness and Potential Artifacts
Partitioned ML bootstrap values, Bayesian posterior probabilities, and species-tree (e.g., NJst) bootstrap values all provided strong support for many phylogenetic relationships (Figures 3 and 4). Gene jackknifing, in which one locus was excluded at a time, also resulted in tree topologies very similar to those given by the concatenated dataset (i.e., the differences among them were similar to the differences between unpartitioned and partitioned ML analyses, and these differences primarily occurred in the third phasianid clade, see Figure S4 and Treefile S1), indicating that no single gene had a strong impact on our phylogeny.
It is noteworthy that ML trees built by the Mt genes alone showed a number of differences from those obtained with the concatenated dataset (Figure 3, Figure S4 and Treefile S1) due to the reduction or even the loss of support of certain clades. For example, the third major clade within Phasianidae identified in the concatenated dataset was not found in analyses of the Mt genes. Since there were only two Mt genes included in our study, their power to resolve difficult relationships within Galliformes might be limited. Moreover, the CI and RI were greater for the nuclear than the Mt data, suggesting that the Mt data exhibited more homoplasy than the nuclear introns (Table 1). For these reasons we have more confidence in the analyses that included nuclear data than the Mt only dataset.
None of the eight partitions exhibited substitutional saturation based upon the Iss metric [53] (Table 2). ND2 did exhibit base composition heterogeneity, but excluding ND2 from the analyses had little impact on the tree topology (Figure S4 and Treefile S1). Moreover, the topologies of the partitioned and unpartitioned ML trees obtained after RY coding were the same for higher-level relationships as those obtained using all four nucleotides, though some lower-level relationships among genera within the Phasianidae were different (Supporting information Treefile S1). Since there were more very short branches on the RY trees, this suggested that RY coding might lead to a loss of informative sites and thus have reduced power to resolve phylogenetic relationships.

Discussion
There have been substantial changes over time regarding relationships among members of the Galliformes (Figure 1, Figure S1). In fact, despite the similarities evident in parts of previous large-scale trees (compare Figure 1C and 1D), a clear consensus tree regarding relationships within this order has not emerged and even the most recent studies exhibit incongruence (Figure S1). Some of this incongruence may reflect the types of data used in each analysis. For example, differences between trees based upon cladistic analyses of M/B traits [16] and those based upon molecular data could reflect the limited number of traits included in M/B data matrices. Moreover, M/B traits have the potential to be scored incorrectly [56]. Patterns of Mt sequence evolution are very complex and some analyses of Mt sequences have recovered phylogenies that are likely to reflect bias [57]. The Mt genome represents a single genetic locus so the Mt gene tree may differ from the species tree due to lineage sorting or introgression [58]. Nuclear introns show a relatively slower rate of evolution and less homoplasy than Mt sequences [15,54,59], but indels (insertions and deletions) in nuclear non-coding data can make alignment difficult and influence downstream evolutionary analyses [60]. We examined the influence of sequence alignment here by analyzing alternative alignments, finding that alignment had a limited influence on the tree topology (Figure S2), in agreement with other recent studies using avian introns [61,62]. Other nuclear regions, including coding exons and untranslated regions (UTRs), typically accumulate substitutions more slowly, limiting their power to resolve rapid evolutionary radiations [63,64]. Finally, rare genomic changes such as TE insertions accumulate very slowly and therefore have little homoplasy, but their very low rate of accumulation also limits their power [65]. Moreover, they are subject to lineage sorting, like all loci, and they do not appear to be completely homoplasy-free [66,67]. Finally, large-scale ''supermatrix'' analyses that combine multiple gene regions (and M/B characters in the case of Crowe et al. [5]) have substantial missing data and the impact of this missing data upon analyses is unclear [68,69]. Analyses using all of these markers, either independently or in various combinations, have been applied to galliform phylogeny (Figure S1) and this has clearly resulted in substantial progress toward resolving specific relationships within the order. However, a number of questions remain despite this progress; we discuss the ways that our data address these remaining questions below.

The Deepest Branching Clades within Galliformes
Galliform phylogeny has been an active area of research for almost a century, and our understanding of the evolution of this group continues to improve with new techniques and methods. In the pre-molecular era of classification, Wetmore [70] splits the Galliformes into two large groups: the superfamily Cracoidea that includes Megapodiidae and Cracidae, and the superfamily Phasianoidea that includes Tetraonidae, Phasianidae (which he defines as including New and Old World quail, pheasants, and partridges), Numididae, and Meleagrididae. A similar division, suggesting a sister-group relationship between Megapodiidae and Cracidae, is also found in studies that use appendicular musculature [71,72] and DNA-DNA hybridization [18] (Figure 1B). However, with the explosion of DNA sequence data there is now increasing evidence indicating that the megapodes and cracids do not form a clade. In fact, the megapodes are sister to a clade comprising cracids and the remainder of the Phasianoidea, which is further corroborated by analyses of many different types of data including extensive M/B characters [16], mitochondrial (Mt) sequences [73], combined M/B and molecular datasets [5], nuclear sequence data [13], and retrotransposon insertions [31] (Figure S1). All of our analyses revealed strong support for the hypothesis that the deepest divergence within extant Galliformes was between the megapodes and all other galliform species, with the next divergence corresponding to that between cracids and Phasianoidea.

The Position of New World Quail and Guineafowls
The position of the New World quail has varied substantially among studies (Figure S1). Traditionally, the New World quail are considered part of the Phasianidae [70], although placement within the Phasianidae varies and includes grouping with the grouse and turkeys [74] or the Old World quail [16,75,76]. On the other hand, the New World quail have also been placed in their own family, Odontophoridae. Using DNA-DNA hybridization, Sibley and Ahlquist [18] place the New World quail sister to a clade comprising Numididae, Phasianidae, Meleagrididae, and Tetraonidae. This relatively deep-branching placement of New World quail is also supported by some analyses of Mt genes [14,73,77], though there appears to be conflict among different Mt genes [59]. The majority of nuclear introns, UTRs and Chicken Repeat 1 retroposon insertions have supported an Odontophoridae-Phasianidae-Meleagrididae-Tetraonidae clade, with Numididae sister to this group [5,17,31,59,64,78]. This pattern was strongly supported both by concatenated (Figure 3) and species tree (Figure 4) analyses in our study, which suggested that the New World quail should be an independent family sister to the Phasianidae. However, when using the Mt genes alone, the ML bootstrap support was reduced to around 72% (Figure 3). This was consistent with the conflicting signal observed by Cox et al. [59] and might reflect the greater homoplasy exhibited by mitochondrial sequences.

The Phylogenetic Position of Turkeys and Grouse
The turkeys and grouse have also been treated as independent families in some taxonomies [70], implying that they are distantly related to other phasianids [20]. This probably reflects the perception that these taxa have a number of special characteristics. For example, grouse are distinguished by morphological adaptations to cold environments such as feathered nostrils and tarsi. Their toes can also grow feathers or small scales in winter as an adaptation for walking on snow and burrowing into it for shelter [70]. In contrast, both turkey species are bare headed and males have a snood (a distinctive fleshy wattle or protuberance that hangs from the top of their beak). However, more recent studies have suggested a derived position for turkeys and grouse within the Phasianidae [4,30,79] (Figure S1A-D, F-M), even placing them sister to each other [8,14,80] (Figure 1D, Figure S1B, D, G-L). In our study, the turkey and grouse formed a sister group nested inside the Phasianidae. Although the support for the sister relationship varied across analyses, and sometimes was very low (less than 50%, Figures 3 and 4), the turkey-grouse clade nested with strong support within a larger clade that included taxa such as Perdix, Tragopan, and the typical pheasants (i.e., the erectile clade of Kimball and Braun [17], see discussion below). This provided strong support for placing turkeys and grouse within the Phasianidae rather than treating them as independent families. Since our sampling was greatest within the Phasianidae, the following discussion focuses on the remaining relationships within this family.

Phylogenetic Relationships within Phasianidae
The Phasianidae has received extensive study by ornithologists, in part due to the inclusion of important avian model systems in this family, such as the chicken, quail, and turkey (see above). Since the Phasianidae includes ~61% of galliform species, understanding relationships within the Phasianidae is critical to clarify evolutionary patterns across the Galliformes. However, previous studies have typically obtained low bootstrap support for the branching order of some relationships. While failure to obtain robust support has been suggested to be due to a rapid radiation [14,17,81,82], it may also be that limited marker and taxon sampling have contributed to the difficulties in resolving branching patterns and thus the different conclusions among studies.
''Pheasants'' and ''partridges''. The Phasianidae have often been split into the tribes Phasianini (pheasants) and Perdicini (partridges and allies) [18-21]. These tribes have typically been assumed to be closely related but reciprocally monophyletic, although there has been some debate regarding the exact sets of taxa they comprise. According to this classification, pheasants are relatively large with most species exhibiting extreme sexual dichromatism and specialized ornamental traits [4]. In contrast, partridges (including Old World quail) are often monochromatic or exhibit limited sexual dimorphism and are primarily dull colored. As these tribes have been typically defined (see above), the partridges exhibit none or far fewer of the extreme or highly specialized ornamental traits found in the pheasants [14]. However, monophyly of the pheasants and partridges has not been supported in most recent studies. For example, there has been strong support for the sister relationship between the junglefowl (typically considered a pheasant) and the bamboo partridges (Bambusicola spp.) using various markers, such as mitochondrial sequences [14,81,83], nuclear genes [15], combined Mt and nuclear sequences [4,17,84,85], combined morphological and molecular data [5] and retrotransposable elements [30], similar to what we found (Figures 2 and 4). We also identified several other examples that reject monophyly of the ''pheasants'' and ''partridges'' as traditionally defined (discussed below).
Clade 1: Arborophilinae. The first clade in the Phasianidae was the earliest diverging group and included three traditional partridge genera: Arborophila spp., Rollulus rouloul and Caloperdix oculea. This clade is also supported by other studies [4,5] (Figure 1C), though this is the first study to include nuclear sequence data from Caloperdix and to strongly support placement of this partridge in this clade.
Clade 2: "Erectile Clade". The second major clade within Phasianidae corresponded to the "erectile clade" of Kimball and Braun [17]. In this study, we used more species but the circumscription of this group remained similar to that reported by Kimball and Braun [17] in that it contained species belonging to the traditional "Gallopheasants and allies", "Tragopan and allies" [20], the Perdix partridges, as well as turkeys and grouse.
The Gallopheasant group [20] was monophyletic and it comprised six genera: Catreus, Chrysolophus, Crossoptilon, Lophura, Phasianus and Syrmaticus. Support for many of the relationships among these taxa was low (Figures 3 and 4), though the initial divergence between Syrmaticus and the remaining Gallopheasants was strongly supported by all of our analyses and by a number of previous analyses (Figure 1D, Figure S1A, F, K-L, N). The sister group of the gallopheasants was Perdix spp. (a partridge) with 100% support (hereafter we refer to this group as the Perdix-Gallopheasant clade). The phylogenetic position of Perdix has varied among studies, being placed with Francolinus and other traditional partridges using morphology [16,20], or sister to turkey and grouse in some studies based on molecular data [5,73,79,81], though typically with limited support. Our analyses rejected those placements, although we noted that our analyses were in agreement with some molecular studies (Figure 1D, Figure S1J-L, N). Our strong support for the Perdix-Gallopheasant clade further refuted the traditional division of pheasants and partridges.
Johnsgard [20] defines the "Tragopan and allies" as including Tragopan spp., Lophophorus spp., Ithaginis cruentus, and Pucrasia macrolopha. However, recent studies have not found that these taxa form a monophyletic group, in agreement with our results (Figures 2 and 4). Our data united Tragopan and Lophophorus, as have many previous studies (Figure S1), but placed Pucrasia and Ithaginis in separate positions. More specifically, our study generally placed Pucrasia sister to the turkey and grouse with limited support (Figures 3 and 4), though the Mt genes and the species tree analyses that lacked SERPINB14 united Pucrasia with Perdix-Gallopheasants (Supporting information Treefile S1). The results of other studies are mostly consistent with our Mt results, but with varying degrees of support. Thus, the position of Pucrasia cannot be established with confidence at this time. The position of Ithaginis cruentus has not been extensively studied. Previous studies have placed it at the base of the major "partridges" including the New World quail [16], or sister to Gallopheasants [5,32], but always with marginal support. In contrast, our data strongly supported (with 100% support in all analyses, Figures 3 and 4) placing Ithaginis sister to the other members of the second major (or "erectile") clade [78]. Thus, Ithaginis was not only distinct from the "Tragopan and allies" group of Johnsgard [20], but also distinct from the other lineages within the erectile clade.
Clade 3: "Chickens and allies". In contrast to the Arborophilinae and the erectile clade, which were each supported at 100% in all of our analyses, the third clade within the Phasianidae received mixed support across analyses, with analyses of the Mt genes (Figure 3) and the NJst (Figure 4) failing to support the clade. Within this group of taxa were four clades, each of which formed a robust monophyletic group. However, relationships among these four clades received only marginal support, even with the concatenated dataset, and so remain to be determined.
These four small clades included the Gallus-Bambusicola-Francolinus lineage mentioned above. Sister to this was a lineage containing Alectoris, Tetraogallus, Coturnix, Ammoperdix, Margaroperdix, and part of the genus Francolinus as traditionally circumscribed. It is clear that Francolinus is not a monophyletic genus, as has been found in other studies [5,78,86] (also see Table S1). Moreover, Coturnix was paraphyletic based on all of our analyses since C. coturnix was sister to Margaroperdix rather than C. chinensis (suggesting that the alternative species name of Excalfactoria chinensis is more appropriate). Of the remaining two clades, one included a single genus (Polyplectron) and the other corresponded to the peafowl clade (Pavo, Afropavo, and Argusianus). Collectively, the peafowl clade and the peacock-pheasants have been suggested to form a monophyletic clade, the Pavoninae [5], which is identical to the peafowl and allies of Johnsgard [20]. Although a monophyletic Pavoninae has been found in some studies [6,84], most studies have either not found resolution among the putative Pavoninae [14,16,30] or have not supported a monophyletic Pavoninae, instead placing Polyplectron distant from the peafowl clade (Figure 1D, Figure S1A, G, K-M). However, since none of these studies found robust support for relationships among these taxa, the exact relationship between Polyplectron, Argusianus and Pavo/Afropavo still needs further exploration.
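The reasoning behind the statement that Coturnix is not monophyletic can be illustrated with a minimal sketch. It assumes a rooted toy tree written as nested tuples that contains only the three taxa named above; the function names and the placement of C. chinensis as sister to the C. coturnix + Margaroperdix pair are illustrative assumptions, not the topology or the software used in our analyses.

```python
# Toy check of monophyly on a rooted tree written as nested tuples.
# All names below are hypothetical and for illustration only.

def leaves(tree):
    """Return the set of tip labels found under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the tip set of every node in the tree (tips included)."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, taxa):
    """True if some node contains exactly the taxa in `taxa` and no others."""
    target = frozenset(taxa)
    return any(clade == target for clade in clades(tree))


# Assumed toy topology: C. coturnix is sister to Margaroperdix,
# with C. chinensis outside that pair.
toy_tree = (("Coturnix_coturnix", "Margaroperdix"), "Coturnix_chinensis")

coturnix = {"Coturnix_coturnix", "Coturnix_chinensis"}
sister_pair = {"Coturnix_coturnix", "Margaroperdix"}

print(is_monophyletic(toy_tree, coturnix))     # False
print(is_monophyletic(toy_tree, sister_pair))  # True
```

On this toy tree, no node contains the two Coturnix species without also containing Margaroperdix, which is the pattern that motivates treating C. chinensis as Excalfactoria chinensis.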

Conclusions
We have generated estimates of the large-scale structure of Galliformes phylogeny, using 88 ingroup species. Although many parts of this phylogeny were robust to analytical method and appeared to faithfully reflect evolutionary history, several uncertainties remained. Taken as a whole, we strongly corroborated the hypothesis that Galliformes can be split into five major families: Megapodiidae, Cracidae, Numididae, Odontophoridae, and Phasianidae. The earliest divergence was between Megapodiidae and other Galliformes, followed by the divergences of Cracidae, Numididae, and finally the sister group of Odontophoridae and Phasianidae. Moreover, the hypothesis that turkey and grouse, each of which has sometimes been considered to form an independent family, are instead part of the Phasianidae was also strongly corroborated. There were multiple examples that refuted the possibility that the traditional classifications of "pheasants" or "partridges" represent monophyletic groups. Thus, we extended the suggestion [14] that these terms should be used to refer to suites of similar morphological and behavioral traits because those groups are not monophyletic and therefore do not imply shared evolutionary history. While our study provided strong support for many relationships, including some that have been contentious, several uncertain nodes remained that will require additional study before there will be a well-resolved phylogeny of Galliformes.

Treefile S1. The file of trees based on different analyses.