Species richness of Eurasian Zephyrus hairstreaks (Lepidoptera: Lycaenidae: Theclini) with implications on historical biogeography: An NDM/VNDM approach

Aim A database based on distributional records of Eurasian Zephyrus hairstreaks (Lepidoptera: Lycaenidae: Theclini) was compiled to analyse their areas of endemism (AoEs), species richness and distribution patterns, to explore their locations of past glacial refugia and dispersal routes. Methods Over 2000 Zephyrus hairstreaks occurrences are analysed using the NDM/VNDM algorithm, for the recognition of AoEs. Species richness was calculated by using the option ‘Number of different classes’ to count the different classes of a variable presented in each 3.0°×3.0° grid cell, and GIS software was used to visualize distribution patterns of endemic species. Results Centres of species richness of Zephyrus hairstreaks are situated in the eastern Qinghai-Tibet Plateau (EQTP), Hengduan Mountain Region (HDMR) and the Qinling Mountain Region (QLMR). Latitudinal gradients in species richness show normal distribution with the peak between 25° N and 35° N in the temperate zone, gradually decreasing towards the poles. Moreover, most parts of central and southern China, especially the area of QLMR-EQTP-HDMR, were identified as AoEs that may have played a significant role as refugia during Quaternary global cooling. There are four major distributional patterns of Zephyrus hairstreaks in Eurasia: Sino-Japanese, Sino-Himalayan, high-mountain and a combined distribution covering all three patterns. Conclusions Zephyrus hairstreaks probably originated at least 23–24 Myr ago in E. Asia between 25° N to 35° N in the temperate zone. Cenozoic orogenies caused rapid speciation of this tribe and extrusion of the Indochina block resulted in vicariance between the Sino-Japanese and the Sino-Himalayan patterns. The four distribution patterns provided two possible dispersal directions: Sino-Japanese dispersal and Sino-Himalayan dispersal.

However, their biogeographical relations are still poorly understood. By detecting the AoEs from the data, we can obtain new insights into the origins, diffusion and distribution of this group.
East Asia, located on the west side of the Pacific Ocean, is characterized by complex geological structures due to geographical events throughout history, such as the Indo-Asian tectonic collision which caused the formation of the Himalaya, uplifted the Qinghai-Tibetan plateau, and caused the rotation of the Shan-Malay Plate, and the glacial and interglacial periods during the Quaternary. The areas, affected by the collision and global cooling, are also biodiversity hotspots and are the focus of current research [23].
Our study examines the AoEs, centres of species richness of Zephyrus hairstreaks and their distribution patterns in Eurasia. By integrating the information from analyses based on our database, we aim to explain where the candidates of refugia of this tribe are and how the dispersal to the current distribution patterns can be reconstructed. This garnering information is useful for conservation applications.

Species distribution data
We compiled a database including 2126 records of 197 known species belonging to 53 genera of Theclini. We included 1564 records of 189 Eurasian endemic species assigned to 47 genera to analyse species richness. Moreover, 1517 records of 142 endemic species from 42 genera (excluding species from only a single location) were used to identify areas of endemism (AoEs). Geographical distribution data and taxonomic information for each species were compiled and reviewed from data in literature published prior to 2016 . Classification and nomenclature follow the literature of "The Zephyrus Hairstreaks of the World" by Koiwaya [22]. A list of species records and complete geographical information regarding the specimen collection sites and references can be found in the supporting information (Table in S1 Table).

Description of the study area
An area covering latitudes -10˚S to 80˚N and longitudes from 40˚E to 180˚E, including most parts of the countries of Eurasia, was subdivided into quadrats measuring 2˚× 2˚and 3˚× 3O perative Geographical Units (OGU) without consideration of physiographical features and the uniformity of quadrats in the margin of the survey area that are only partly land [52]. This region covers most parts of the known distributional area of Zephyrus Hairstreaks in Eurasia.

Analysis
Species richness is the most direct method to evaluate species diversity. A total of 1564 distributional records were coded as a shapefile that was then run in DIVA-GIS 7.5, using 3˚× 3g rid cells. DIVA-GIS 7.5 is a free computer program for mapping and analyzing geographical distributions and biodiversity data [53][54][55]. Species richness was analyzed using the option "Number of different classes" to count the different classes of a variable present in each grid cell. A list of conditioned equal interval values was taken from a species richness map of the study area. Before analysis, a species accumulation curve at genus level has been tested, using our database, and the curve tends to go up and then it tends to slow down. It reflects a relatively adequate sampling. NDM/VNDM 3, two sister programs written by Goloboff [56], implement the methods described in Szumik et al., and Szumik & Goloboff [57,58] to demarcate AoEs. These programs, based on an optimality criterion, take into account the spatial component of endemism for identifying AoEs. Parameters were selected in accordance with Prado et al. [59]. Scores above 2.0 were saved for genus-level analysis and above 3.0 for species-level analysis, and set the upper minimum score above 0.3 (except for Hayashikeia florianii (0.289)). The consensus AoEs were computed using a cut-off of 50% similarity at different taxonomic levels and the flexible consensus was selected [56,57]. Endemicity analysis (EA) was selected to identify areas of endemism and executed by NDM, because EA has been recognized as a best method that supported areas of endemism [60]. Analytical procedures are filed in http://dx.doi.org/10. 17504/protocols.io.[dx.doi.org/10.17504/protocols.io.jr2cm8e].
AoEs were constructed from 2.0˚×2.0˚and 3.0˚×3.0˚cell sizes at each taxonomic level (genus and species). The two sizes of cells were optimized in consideration of the advantages and disadvantages of each particular size [59]. Different taxonomic levels used in the analysis of endemicity can increase the reliability [61,62].

Latitudinal gradients in species richness
The value of species richness of Zephyrus hairstreaks in each grid cell is variable in Eurasia. Although for certain grid cells, where the species have not been observed, this may either represent real absences or sampling errors, the database meets the analysis criterion and reflects reasonable real world results. Latitudinal gradients in species richness show a normal distribution (Kolmogorov-Smirnov Z = 1.281, P = 0.075 > 0.05) with one clear peak between 25˚N and 35˚N in the temperate zone (Fig 2), and a gradual decrease towards the poles. The prevalent consensus regarding species richness is that diversity generally peaks in the tropics near the equator and declines towards the poles.

Areas of endemism
The NDM/VNDM 3 analytic approach identified fourteen different AoEs (Figs 3 and 4) at two taxonomic levels (genus and species) and two different Operative Geographical Units (2.0˚×2.0˚and 3.0˚×3.0˚). At the genus level, there were three consensus endemic areas (Fig  3a-3c) (Table in S2 Table), with both of these two regions being associated with the eastern   In the 2.0˚grid size, there were four consensus areas (Fig 3d-3g): QLMR-EQTP-HDMR (score range 3.04-4.27), QLMR-EQTP (score range 5.71-6.46), EQTP (score range 7.64-8.14) and HDMR (score range 3.67-3.92), all of which were derived from an optimality criterion to demarcate AoEs. There were seven consensus areas (Figs 3h and 4i-4m)  Among the three core AoEs, EQTP and HDMR at the same time have high species richness. In the 3.0˚grid size, SHEMY was also identified as a core consensus AoE, supported by five species; Chrysozephyrus letha (0.714), Chrysozephyrus uedai (0.633), Neozephyrus suroia (0.455), Shirozuozephyrus jakamensis (0.800) and Shirozuozephyrus khasia (0.700). Additionally, the consensus area of QLMR-EQTP-HDMR (Fig 5) was obtained from the analysis results using two grid sizes (2.0˚and 3.0˚) with its highest support scores (up to 17.32973-17.57973) made up of 28 species (in the 3.0˚grid size) (Table in S2 Table) based on NDM/VNDM 3. QLMR-EQTP (Fig 5) was strongly supported as an AoE for the reason that, at the generic level, it was supported by three genera with different scores in the 2.0˚and 3.0g rid sizes; Neogonerilia (0.600/0.800), Saigusaozephyrus (0.680/0.605) and Shaanxiana (0.850/ 0.857). Furthermore, at the species level, there were twelve species supporting an endemic area in the 2.0˚grid size (Table in S1 Table) For instance, both EQTP and HDMR are supported as independent AoEs in the 2.0 grid size, and moreover, EQTP-HDMR (Fig 5) has a lower support rate (only one time) and values (3.974-4.224) compared with those of QLMR-EQTP (three times and their values as follows: 2.130-2.380, 2.262-2.512 and 5.714-6.464). QLMR-EQTP-HDMR is treated as a large and important region for species richness and AoEs of Zephyrus hairstreaks and among these areas QLMR has a particularly closer connection with EQTP (because QLMR lacks independence and was three times recognised together with EQTP) more than other AoEs.

Areas of endemism and species richness
Prado et al. [59] stated that using smaller grid cells appears to be more restrictive and rigorous in identifying AoEs but it may lead to ignoring some important areas, so we identified these regions using different grid sizes and different taxonomic levels. The results revealed the most regions in the mountains of central and southern China, especially the area of QLMR-EQTP-HDMR (Fig 5) as AoEs. These regions played a significant role as centres of survival and speciation during the Quaternary global cooling [4], and these AoEs (Fig 5) appear consistent with those obtained from other biological assemblages in plants, birds and aphids [63,64]. The highest species richness value within the area of EQTP reached 74. The Hengduan Mountain Region (HDMR) also recorded 60 species of Zephyrus hairstreaks. The two areas are not only core AoEs but also centres of species richness. In addition, the species compositions within the two regions appear markedly different (Table in S2 Table). EHMLY (Fig 5) is also a core AoE, independent from HDMR and EQTP, although it lacks high species richness. Eliot [16] deduced the greatest abundance and variety of Lycaenidae in the SE Asian Subregion of the Oriental Region, and he suggested that the SE Asian Subregion extends from SE China to Ceylon and as far east as Weber's Line. Its characteristic species are lowland or submontane in habit and appear to be centred in Sundaland. However, centres of species richness of Zephyrus hairstreaks and core AoEs are located in the Sino-Himalayan and Sino-Japanese subregions.

General distribution and diffusion of Zephyrus hairstreaks
Almost all of the Zephyrus hairstreaks are distributed in the Northern Hemisphere and a major part of them are concentrated in Asia, except for three species in North America and two species in Europe. Our data used in the analyses showed that the northern boundary of distribution of Zephyrus hairstreaks is located along the Sea of Okhotsk in the Russian Far East in Asia, with the southern boundary located in Java in Indonesia, the western boundary is in the eastern region of Afghanistan, and the eastern boundary is in northern Japan and its surroundings, In addition, parts of northern and northwestern China are covered by large areas of desert and inaccessible areas, so there are no distributional records of Zephyrus hairstreaks in our database, along with areas of over-urbanization and areas incompletely investigated, all of which are represented as gaps (Fig 1). Nevertheless, our database maximizes the use of published available information and investigations of our team and fulfills the criteria of quality of biodiversity databases discussed by Hortal et al., [65]. At last, there are four major distribution patterns of Zephyrus hairstreaks in Eurasia (Fig 6): 1) Sino-Japanese distribution: the species belonging to this distribution pattern are characteristic with extensions from EQTP and HDMR northward to Manchuria-Pacific (Northeast China, the Korean Peninsula, the Russian Far East and Japan), southward to South China (WLMR, TMS and NLM), and even to the west of Europe (e.g. Neozephyrus, Shirozua, Iozephyrus, Thecla and Howarthia). 2) Sino-Himalayan distribution: the species originated on the EQTP and HDMR, from where it dispersed to northwestern India along the Himalayan chain and eastward to east China, Indochina, Sundaland, Taiwan island, Japan and the west of Wallacea (e.g. Euaspa, Shirozuozephyrus, Chaetoprocta and Austrozephyrus). 3) High mountain distribution: the species are distributed only in the areas of HDMR, EQTP and QLMR (e.g. Kameiozephyrus, Saigusaozephyrus and Uedaozephyrus). 4) Combined distribution: the species exhibit a combined distribution pattern covering all three other patterns (Chrysozephyrus).
Evidence suggests that the Indo-Asian collision occurred in the early Cenozoic Era, 50-55 Myr ago or even earlier [66], and it almost simultaneously caused the uplift of the Himalaya-Tibetan plateau region and the Eurasian plate, which also underwent glacial/interglacial cycles during the Quaternary [4]. Lepidoptera, as the second largest order of Insecta, diverged prior to the Cretaceous/Paleogene (K/Pg) event (65 Mya) and the extant families of Lycaenidae quite likely evolved in the Cenozoic Era [67]. In addition, the extrusion of the Indochina block [68][69][70] which happened during the period of 23 to 24 Mya [71] was proposed to explain the formation of the boundary line called the "Tanaka-Kaiyong line" which extends from 28˚N, 98˚E southward to approximately 18˚45 0 or 19˚N, 108˚E [13,14,72,73]. Of particular note, this biogeographical line, to some extent, exists between the core AoEs of EQTP and HDMR. This fact indicates that Zephyrus hairstreaks may have originated at least 23-24 Mya, and the Cenozoic orogeny as a result of the Indo-Asian collision likely caused rapid speciation in two AoEs, EQTP and HDMR. The significant differences between EQTP and HDMR (identified as independent AoEs respectively with lower correlation values) probably can be attributed to vicariance events which happened in these areas, for instance, the aforementioned extrusion of the Indochina block.

Conclusion
Areas of endemism and centres of species richness of Zephyrus hairstreaks are located in central and southern China, the Himalaya as well as northern Indo-China which is in good agreement with Quaternary vegetation reconstructions [4,74] that had been supposed as the existence of refugia in China. Furthermore, four distribution patterns inferred two dispersal routes of endemism and diversity, which strongly suggests that Zephyrus hairstreaks probably originated in E. Asia between 25˚N to 35˚N in the temperate zone (based on our database). Dispersal may have played a leading role in the closer relationship of species between QLMR and EQTP, while vicariance events are the most likely cause of the significant differences between EQTP and HDMR. In addition, the Cenozoic orogeny (the uplift of the Himalaya-Tibetan plateau region and the extrusion of the Indochina block during the Indo-Asian collision) [66,71], probably accelerated speciation, especially in the core areas endemism.
Supporting information S1