Fig 1.
Consensus classification of species delimitation results.
A, Flowchart describing the process of generating a consensus of delimitation results (among different methods and loci). B, C, Pipeline for classifying nominal species as either containing or not containing hidden genetic diversity in each consensus analysis (all agree and majority rules, respectively).
Fig 2.
Geographic spread of salamander data.
Map shows the geographic distribution of salamander occurrences pulled from phylogatR [38] and used in these analyses. Pie charts show the total number of cytb and COI sequences used (left) and the number of species represented by those cytb and COI sequences (right). Basemap created with world map data from the public-domain Natural Earth project (http://www.naturalearthdata.com). Salamander figures in black were obtained from Phylopic [74] and are in the public domain.
Fig 3.
A, Graphs show the results of ABGD, ASAP, and GMYC species delimitation analyses of the genes cytb and COI for each nominal species. Numbers represent the predicted genetic lineages from each analysis. Results highlighted in red indicate that no hidden genetic lineages were predicted (i.e., number of genetic lineages = 1). Results highlighted in green indicate that hidden genetic lineages were predicted (i.e., number of genetic lineages > 1). Grey highlighting indicates that the specific analysis was not performed due to a lack of data. B, Pie charts display the number of nominal species classified as either containing or not containing hidden diversity in each consensus analysis (i.e., all agree and majority rules).
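The two consensus rules named in this caption can be illustrated with a minimal Python sketch. It assumes that each nominal species has up to six lineage counts (three methods by two genes), that analyses not performed are recorded as `None`, and that disagreements under all agree and ties under majority rules are left unclassified. The function name `consensus`, the rule labels, and the handling of ties are illustrative assumptions, not details taken from the paper.

```python
from typing import Optional, Sequence


def consensus(lineage_counts: Sequence[Optional[int]], rule: str) -> Optional[str]:
    """Classify one nominal species from its per-analysis lineage counts.

    A count > 1 is treated as a vote for hidden diversity and a count of 1
    as a vote against it. None marks an analysis that was not performed.
    Returning None for ties or disagreements is an assumption of this sketch.
    """
    votes = [count > 1 for count in lineage_counts if count is not None]
    if not votes:
        return None  # no analysis could be run for this species

    hidden = sum(votes)
    not_hidden = len(votes) - hidden

    if rule == "all_agree":
        if hidden == len(votes):
            return "hidden"
        if not_hidden == len(votes):
            return "not hidden"
        return None  # analyses disagree
    if rule == "majority_rules":
        if hidden > not_hidden:
            return "hidden"
        if not_hidden > hidden:
            return "not hidden"
        return None  # tie
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical counts: ABGD, ASAP, GMYC for cytb, then the same for COI.
example = [3, 2, None, 1, 2, 4]
majority = consensus(example, "majority_rules")
strict = consensus(example, "all_agree")
```

In this hypothetical example, four of the five performed analyses predict more than one lineage, so the majority rules consensus is "hidden" while the all agree consensus is left unclassified.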
Table 1.
Results of majority rules consensus random forest models.
Model metrics for each random forest predictive model generated using the majority rules consensus classifications are shown.
Table 2.
Results of all agree consensus random forest models.
Model metrics for each random forest predictive model generated using the all agree consensus classifications are shown.
Fig 4.
Variable importance for random forest classification models generated using the majority rules consensus.
Variables ranked among the top ten most important variables (based on MDA and Gini) from the classification model generated at different correlation cutoffs are included. Blue highlighting indicates the best consensus model (majority rules–correlation cutoff 0.90).
Fig 5.
Comparison of hidden vs not hidden trait values for the top five most important predictors of the best consensus model (majority rules–correlation cutoff 0.90).
A, Columns 1–2 of the table identify the specific model and predictors (i.e., traits). Columns 3–4 show the median trait values for each group (i.e., hidden vs not hidden). Columns 5–6 show the results of Kruskal-Wallis significance tests, which test whether the difference in median trait values between the groups is statistically significant. B, Corresponding boxplots of the median trait values for the top five most important predictors show a significant difference in the range of values between hidden and not hidden genetic lineages.
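For readers unfamiliar with the Kruskal-Wallis test mentioned in this caption, the following is a minimal pure-Python sketch of the two-group case. It assumes average ranks for tied values, omits the tie correction to the H statistic, and uses the chi-square approximation with one degree of freedom for the p-value. The function names and the trait values are hypothetical and do not come from the study; in practice a statistics package would normally be used.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (H, p) for two groups, without tie correction (an assumption)."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sum_a = sum(ranks[: len(a)])
    rank_sum_b = sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (rank_sum_a**2 / len(a) + rank_sum_b**2 / len(b)) - 3 * (n + 1)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(max(h, 0.0) / 2))
    return h, p


# Hypothetical trait values for the "hidden" and "not hidden" groups.
h_stat, p_value = kruskal_wallis_two_groups([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With only two groups the test has one degree of freedom, which is why the p-value reduces to a single `math.erfc` call here.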
Table 3.
Summary of results of mammal random forest classification models presented in Parsons et al., 2022 [35].
Model metrics for each random forest classification model generated using data from the class Mammalia are shown.