Genetic Signals of Demographic Expansion in Downy Woodpecker (Picoides pubescens) after the Last North American Glacial Maximum

The glacial cycles of the Pleistocene have been recognized as important, large-scale historical processes that strongly influenced the demographic patterns and genetic structure of many species. Here we present evidence of a postglacial expansion for the Downy Woodpecker (Picoides pubescens), a common member of the forest bird communities in North America with a continental distribution. DNA sequences from the mitochondrial tRNA-Lys, and ATPase 6 and 8 genes, and microsatellite data from seven variable loci were combined with a species distribution model (SDM) to infer possible historical scenarios for this species after the last glacial maximum. Analyses of Downy Woodpeckers from 23 geographic areas suggested little differentiation, shallow genealogical relationships, and limited population structure across the species’ range. Microsatellites, which have higher resolution and are able to detect recent differences, revealed two geographic groups where populations along the eastern edge of the Rocky Mountains (Montana, Utah, Colorado, and southern Alberta) were genetically isolated from the rest of the sampled populations. Mitochondrial DNA, an important marker to detect historical patterns, recovered only one group. However, populations in Idaho and southeast BC contained high haplotype diversity and, in general were characterized by the absence of the most common mtDNA haplotype. The SDM suggested several areas in the southern US as containing suitable Downy Woodpecker habitat during the LGM. The lack of considerable geographic structure and the starburst haplotype network, combined with several population genetic tests, suggest a scenario of demographic expansion during the last part of Pleistocene and early Holocene.


Introduction
Continental North America has been a climatically and biologically dynamic place during the last two million years, especially during the last part of the Quaternary ice ages, 21,000 years ago when the polar ice sheets were at their maximum extent [1]. Two immense ice sheets (Cordilleran and Laurentide) covered most of Canada, and the northern part of the US at the LGM (last glacial maximum). Some areas north of the glacial maximum may have remained ice-free such as the Queen Charlotte Islands (Haida Gwaii), Alaska (Beringia) and parts of Newfoundland [2]. Most plants and animals probably survived in areas south of the ice sheets; quite different from present-day North America, where ,20,000 species of vascular plants, more than 800 species of birds, and more than 640 species of mammals dominate a diverse and complex landscape [3]. The changing conditions following the melting of the ice sheets make this continent a natural laboratory to study patterns of colonization, geographic variation, adaptive radiation and extinction [4][5][6][7][8][9][10].
North America has been one of the main areas for studies on post-glacial expansion, and as such is an area where the impacts of the last glacial maximum (LGM) on organisms from a geological and biological perspective are best understood [7]. In North America the ice sheets were present as far as south as 40 o N with the Pacific Northwest, Northeast coast, Southwest US and Beringia serving as the major refugia for a diverse array of organisms. Some examples of species that may have used these refugial areas include the Chestnut-backed Chickadee (Poecile rufescens), Steller's (Cyanocitta stelleri) and Mexican Jays (Aphelocoma ultramarina) [11][12][13]; Northern Flying Squirrel (Glaucomys sabrinus), Red-backed Voles (Clethrionomys gapperi), American Marten (Martes americana); Spotted Salamander (Ambystoma maculatum) and Western Diamondback Rattlesnake (Crotalux atrox) [14,15 for a review on NW North America]. In contrast to Europe mountain ranges in North America run north-south and patterns of postglacial colonization from southern refugia differ between the two continents [1].
Despite the knowledge gained from multiple studies during the last two decades, the understanding of how different lineages and species responded in a spatio-temporal context to the retreat of the ice sheets following the LGM is still in its infancy [16]. How long did it take for organisms to colonize ice-free areas after the glaciers retreated? How much genetic structure remained after the colonization events? Even simple questions like these are not easy to answer for many taxa due to the analytical and methodological problems of inferring past events from current information, and the absence fossil data [17][18][19][20].
Studies on phylogeography and population genetics of Nearctic birds have uncovered an array of different trends [21,22]. Multiple patterns are found in specialists and generalists, widespread and locally distributed species, and migrants and residents. For example, in some species, low intraspecific genetic diversity and absence of population genetic structure is one of the patterns found [21,23,24,25] yet some of these species (Junco hyemalis and J. phaeonotus) have high morphological and ecological differentiation [25]. Other species have high genetic differentiation and geographic structure [26,27]. Additionally it has been suggested a general east-west split in both genetic and plumage data reflects independent evolutionary histories and restricted gene flow [28]. The general patterns and the idiosyncratic nature of the genetic variation within each species challenge a simplistic view of the demographic and evolutionary history of the continental taxa [15,21].
Patterns of genetic variation can be used to make predictions about the demographic and evolutionary history of an organism within the context of the LGM. One extreme are species with genetically structured populations displaying a gene genealogy with highly differentiated groups and collections of unique mitochondrial haplotypes or nuclear alleles restricted to specific areas, in addition to high nucleotide and haplotype diversities [12,13]. This pattern is typical of colonization out of multiple refugia, by a large number of individuals and low levels of gene flow between populations. At the other extreme are species with reduced genetic structure and weakly differentiated populations. The expectations are shallow gene genealogies with one common haplotype and numerous, less common haplotypes that differ by a few mutational steps from the main one [29]. Populations also exhibit low levels of nucleotide diversity and no or very few private alleles. In most cases, these characteristics are typical of species that have undergone recent, rapid population expansions by a reduced number of individuals, usually from a single refugium [12,24,30]. Patterns can be species-specific as a consequence of different demographic or evolutionary histories [31].
Here we focused on the Downy Woodpecker (Picoides pubescens, Picidae) a small, sexually dimorphic woodpecker inhabiting a variety of habitats including coniferous, deciduous and mixed forest and tall shrubbery. It is a year-round resident ubiquitously distributed in North America from Alaska to Florida ( Fig. 1) [32]. Some local movements between and within seasons have been reported, but it is not clear how often they occur and if they result in gene flow [33]. The Downy Woodpecker is represented by at least seven subspecies (P. p. medianus, pubescens, glacialis, nelsoni, leucurus, gairdnelli and turati) differing mostly in characteristics such as body mass and plumage coloration. Larger birds inhabit northern latitudes and higher elevations, and smaller birds live further south and at lower elevations [32,34]. In addition, birds in the east are whiter than the ones in the west (e.g. more white spots on the wings). The five western subspecies are difficult to differentiate even when using several morphological and plumage traits [35]. Other North American species, such as the Hairy Woodpecker (Picoides villosus), the Brown Creeper (Certhia americana) and the Dark-eyed Junco (Junco hyemalis) present some similarities, for instance, there are several subspecies, and geographic variation in terms of body size and natural history traits [25,28,36].
We studied geographic patterns of genetic variation for the Downy Woodpecker (Picoides pubescens) using a series of population genetic statistics derived from mitochondrial DNA sequences and microsatellite data and created for the first time a species distribution model (SDM) at 21 kya. Preliminary evidence suggests that the Downy Woodpeckers were different than today's in terms of population size and distribution during the late Pleistocene [21]. Ball & Avise [23] who used mtDNA RFLP data from 51 individuals mainly from the eastern US, found no genetic structure in the Downy Woodpecker. However, no study has attempted to examine a post-glacial expansion hypothesis using more rapidlyevolving molecular markers and a more complete geographic sampling for the Downy Woodpecker. Based on plumage and body size diversity, and reported ''sedentary'' behaviour [32], one might predict that in different areas of North America the Downy Woodpeckers have remained isolated. Previous genetic data [23] suggest a starburst pattern and little genetic differentiation, but had limited samples from glaciated or western regions. We predict that current Downy Woodpecker populations in previously glaciated areas are characterized by genetic signatures typical of species that have undergone rapid population expansion and small effective population size. Rapidly evolving markers such as microsatellites are better able to detect recent reductions in gene flow [37]. Given the morphological variation seen in the Downy Woodpecker, we predict nuclear loci will show reduced gene flow throughout parts of the range.

Genetic Diversity
Fifty nine variable sites were found in an 850 base pair fragment of the mitochondrial genome for 263 individuals. No insertions or deletions were detected. Global nucleotide diversity (p = 0.00097) was low while haplotype (H d = 0.5839) diversity was moderate. Population p ranged from 0.0002-0.0031 and H d from 0.182 to 1.000 (Table 1). Among the 55 haplotypes, none differed by more than six nucleotide substitutions, and all the populations except Revelstoke, BC (BCR) shared a general widespread haplotype (Hap1). In general, Alaska, Washington, Newfoundland and southern Alberta were the least diverse regions in terms of H d and p, and Missouri, Idaho, Revelstoke (BC), and Louisiana exhibited the highest diversity for the same two genetic indices (Tables 1). MICRO-CHECKER showed no evidence of scoring errors due to allele dropout or stutter for any of the loci. The only evidence for null alleles arising from an excess of homozygotes was at DlU5. Both Illinois-Wisconsin (DMD9, P = 0.0217) and Vancouver Island (DMD118, P = 0.0360) deviated from Hardy-Weinberg equilibrium, but neither was significant after correction for multiple tests. No evidence of linkage disequilibrium was found among loci as indicated by Fisher's exact test (P values ranged from 0.121 to 1.000).
The seven variable microsatellite markers had moderate levels of genetic variation among loci and across populations. The number of alleles per locus (and average number) ranged from 4 (2.760.159) in DMD9 to 17 (9.460.608) in DMD118. On average, Alaska had the lowest number of alleles (3.360.71), while Nova Scotia -New Brunswick had the highest (6.161.08) ( Table 1). Microsatellite genetic diversity, measured indirectly using H e , was relatively uniform across the range with no association with latitude (r 2 = 0.1325, P = 0.1703) or longitude (r 2 = 0.0422, P = 0.4557).

Population Structure
Mitochondrial-based pairwise F ST comparisons across 21 sampling locations revealed minor differences among most pairs of populations ( Table 2). Most of the significant differences were  (3/20) were also significant. The most common haplotype, Hap1, was absent from Revelstoke and found in a single bird in Idaho (n = 6) (Fig. 4). Plotting the genetic similarities (F ST/ (1-F ST )) as a function of the geographic distance among populations revealed no significant relationship (r 2 = 0.0091, P = 0.220).
The Bayesian clustering analysis estimated K = 2 as the most probable number of Downy Woodpecker subgroups using both DK method (Fig. 2) and Bayes factor (0.999 for K = 2). Based on assignment probabilities (coefficient of ancestry -Q .0.6) for each individual, one group includes Utah, Montana, Colorado and southern Alberta, and the second group includes most of the other populations. The exceptions to these two groups were Alaska, Michigan and central Alberta in which most individuals were equally assigned to both groups ( Figs. 1 and 3).

Phylogeographic and Demographic History
Statistical parsimony analysis of the 55 mitochondrial haplotypes produced a starburst pattern, a common, central haplotype different from most of the other haplotypes by one or two nucleotide differences, suggestive of a demographic expansion.  Table 1 for locality names). The hypothesis of constant population size was rejected by Fu's F s (28.3204, P,0.0001). Likewise, Tajima's D rejected a scenario of selective neutrality and population equilibrium (22.65, P,0.0001). Fu and Li's D* (26.54, P,0.02) and F* (25.77, P,0.02) were both negative and significant, supporting the results of Fu's F s and Tajima's D. Nonetheless, a unimodal mismatch distribution signals that Downy Woodpecker populations have undergone periods of population growth (SSD = 0.00034, P = 0.75) (Fig. 5). This was also confirmed by the raggedness index, which failed to reject the null hypothesis of population expansion (r = 0.068, P = 0.36).
The absence of any geographic pattern in the haplotype distribution in the TCS network suggests limited phylogeographic structure of breeding populations of the Downy Woodpecker in North America (Fig. 4). A single widespread haplotype (Hap1) was present in all the populations except Revelstoke, BC. Hap1 was the most frequent haplotype found in 169 individuals (64%), followed by Hap8 (9 ind. -3.4%), Hap4 (6 ind. -2.3%) and Hap44 (5 ind. -2%). The remaining 51 haplotypes were present in one to four individuals ( Fig. 4 and Table 3).

Ecological Niche Modeling
The resulting models for the current and LGM 21 kya Downy Woodpecker distribution in North America were constructed based on 16,271 presence records used for testing and 5,423 for training. The predicted models for the current distribution and at 21 kya were statistically robust using guidelines suggested by [38]. The omission plot (not shown) displayed a good match between the predicted omission by the program algorithm and the training data (25% of locality records). Both the training data (5,423 records) and the test data (16,271 records) predicted similar models, as the area under the curve test -AUC = 0.942 (values can range from 0 to 1, and values less than 0.5 are not better than a random model). The predicted model for the current distribution suggests a strong correspondence with other sources of information related to the current species' distribution ( Fig. 6a). The predicted distribution during the LGM (21 kya) suggests two main areas (Fig. 6b). The largest area corresponds to the southeastern US from North Carolina west to Texas, with a proportion of the suitable habitat outside the current coastal line (sea level dropped up to 120 m at 21 kya [39]). The second main area corresponds to the Rockies with higher distribution probabilities in the east: Arizona, Utah, and Colorado, and west: California and Oregon. At the LGM these areas were covered with open boreal woodlands, semi-arid temperate woodlands or scrub, forest steppe, subalpine parkland, temperate steppe parkland and taiga [39]. Based on current habitat preferences [32], it seems that all of the above could have been potential places for the Downy Woodpecker to persist.

Congruencies and Contrasts between mtDNA and Microsatellites
This study examined on rangewide patterns of genetic variation in the Downy Woodpecker in the context of postglacial expansion and colonization of North America and expands on previously published work [23]. Both mtDNA and microsatellite data showed limited genetic structure across sampled locations, with most geographic groups characterized by a wide range of haplotype and nucleotide diversities, and similar expected heterozygosity. The presence of a common, widespread haplotype and a unimodal mismatch distribution suggest a recent common history for most continental North American populations. The exceptions to this are Downy Woodpeckers in Revelstoke (BC) and Idaho, as discussed below. Microsatellite and mtDNA data suggest Alaska has the lowest genetic diversity (Table 1) compared with the other sampled populations and support the idea of a recent founder event or a bottleneck. In comparison, Washington and southern Alberta have low nucleotide and haplotype diversities in contrast to high nuclear variation. Both Washington and southern Alberta sampling sites are adjacent to Revelstoke and Idaho (Fig. 1) suggesting that geographic barriers such as the Rocky Mountains or different colonization routes could explain the current patterns of genetic variation in nearby areas. While sample sizes from the Revelstoke and Idaho populations were too small to obtain accurate allele frequency estimates, the mtDNA haplotypes from these two populations show uncharacteristically high haplotype variation, unique haplotypes and low frequency of Hap1.
Patterns in genetic diversity indices for mtDNA and microsatellites have been shown to be similar, but can also vary in the same species [8]. This may reflect not only the intrinsic mutational processes, but also the effective population size associated with nuclear and mitochondrial genomes or selection [8,40]. In    addition, it may reflect more contemporary processes such as restricted gene flow between populations or male-biased dispersal, a pattern that can be detected by the microsatellites but not mtDNA [41].

Population Structure
F ST based differences suggest partially incongruent patterns between mtDNA and microsatellite data, which is expected under equilibrium due to different effective population sizes and other factors, such as higher mutation rates in microsatellites than mtDNA [8,42]. For instance, southern Alberta, Montana, Utah and Colorado were differentiated from other populations with the microsatellite analyses (Fig. 3), but, with the exception of Montana, not using mtDNA ( Table 2). Most of the genetic variation (.90% in AMOVAs, not shown) is contained within populations and IBD analyses for both datasets suggest gene flow may be restricted by physical distance though support is weak (mtDNA: r 2 = 0.0091, P = 0.220, microsatellites: r 2 = 0.1103, P = 0.070). Therefore, over a large geographic scale, the effects of genetic drift may be stronger than gene flow explaining why patterns of significant F ST values do not necessarily correspond to geographically clustered populations. A similar pattern albeit at a different geographic scale is found in the Greater Prairie-Chicken (Tympanuchus cupido) [43].   Three sets of populations: Alaska, populations east of the Rocky Mountains and populations in the Intermountain West (Revelstoke BC and Idaho) show evidence of restricted gene flow. The patterns for the first two sets of populations are only evident with microsatellite markers. In the case of Alaska allele frequency differences may be the result of a founder event as evident from the low mtDNA variation (H d = 0.18); however, tests for recent bottlenecks were not significant (P = 0.62). In contrast, gene flow in the eastern Rockies (MT, CO and UT) may be restricted by a number of factors including habitat and landscape barriers [see 44,45]. Restricted gene flow may also be evident in the Intermountain West with nuclear loci, however, additional samples will need to be collected to test this. Nevertheless, Intermountain West populations harbour uncharacteristically high mtDNA variation that we address below.
As gene flow is not significantly impeded by geographic distance in the Downy Woodpecker (as suggested by genetic analyses and band recovery data), it is possible that populations are not reproductively isolated, and individuals are moving between distant geographic areas. However, we cannot claim there is or has been continuous gene flow since some populations (e.g. Illinois-Wisconsin or Missouri) have higher haplotype (H d ) diversities and allelic richness (AR) than others (Table 1) and restricted gene flow accessed by F ST can also indicates potential contact zones. Browning [33] found that of the 113 band recoveries of Downy Woodpeckers, 94% were hatch year (HY) and after hatch year (AHY) females moving between areas between different seasons (on average 218 km). Three of the recoveries were 583 km (winter to summer), 591 km (fall to winter) and 1080 km (summer to winter) away from the original banding locations. This potential scenario of birds (.90% females) moving seasonally, even at low frequencies, could in theory genetically homogenize the populations [46,47]. If females' movements are higher in some geographic areas, this would have had a strong influence on the genetic architecture of different populations for both microsatellites and mtDNA. Although not fully understood, it seems that Downy Woodpeckers (mostly females) disperse during the breeding season searching for food and empty territories [48]. It is not known if there is differential dispersal between the east and west due to bird density, resources or physical barriers [32].

Demographic History
One of the most relevant findings of this study is related with the genetic distinctiveness of the BC Revelstoke and Idaho populations. Despite the small sample size and the lack of microsatellite data, this highlights the importance of southern BC and Idaho as a potential refugium for animals and plants [11,49,50,51]. Although our data (genetics and SDM) do not allow us to conclusively state a refugia in the northern Rocky mountain area (such as Clearwater), it is congruent with a widespread pattern where populations southern of the ice sheets are genetically more diverse (see also Missouri and Louisiana populations, Table 1) than their northern counterparts [28,52]. It is also intriguing that southern Alberta, geographically very close to Idaho and BC Revelstoke, had one of the lowest genetic diversity measures at both mtDNA mitochondrial and nuclear loci. Consequently, one possible explanation for the low genetic diversity in the eastern slope of the Rocky Mountains (and the forested areas in southern Alberta plains) and other areas such as Washington and Newfoundland is that they were recently colonized either from the southern populations or refugial areas in the northern Rockies [51,52].
Our estimates of time since expansion (using a standard mutation rate of 2%/Ma) suggest that current Downy Woodpecker populations expanded from 178,764 (BC Revelstoke) years to 6,294 years (Alaska). Several populations such as Washington and southern Alberta expanded around the 21 kya period whilst other (California, Missouri and Louisiana) predates the LGM boundary. Although these estimates can be a reference to understand the demographic history, the putative times should not be taken as conclusive as they are highly dependent on the mutation rates.
In much the same way, the genetic patterns depicted here are suggestive of at least three explanations for the Pleistocene demographic history of this species. One is the rapid population expansion typified by starburst as seen in the mtDNA network (Fig. 4) and supported with the mismatch analysis (Fig. 5). Second, a new advantageous allele might have spread throughout the population and, as the mitochondrial genes are linked, a selective sweep at one mitochondrial locus may have reduced mtDNA genetic variation at ''neutral'' loci [53,54]. We cannot rule out background selection based on Fu and Li's D* and F*, however, selection may not have acted on ATPase genes themselves. As the mitochondrial genome is composed of a number of coding loci, selection at any locus would reduce variation at the ATPase loci. The third possibility is a population bottleneck. Reduction in habitat and resources during an interglacial or glacial might have decreased the genetic variation [47]. Higher nuclear variation does not rule out any of the three possibilities.
Several independent lines of evidence support a postglacial expansion by the Downy Woodpecker during the last 21 kya. First, a star-like haplotype network characterized by a predominant haplotype and accompanied by a collection of low frequency peripheral haplotypes differing by one or two mutational steps from the main haplotype [12,23,29,55]. Second, is the absence of divergent haplotypes between geographic groups (even with reduced structure) as shown in the current study [24,56]. Third, a unimodal mismatch distribution not different from a model of sudden expansion, which suggests an historical increase in effective population size [31]. More support comes from the neutrality test, where Fu's F s and Tajima's D rejected the null hypothesis of population stability. Finally, the species distribution model for the Downy Woodpecker at 21 kya suggests two distinct refugial areas, one in the southeastern US and the second in the west from Arizona to Oregon. Assuming that this distribution model represents a realistic scenario, it strongly favors a limited number of southern refugia. The predicted LGM habitat does not support any additional areas north of the ice sheets, such as Beringia, for Downy Woodpeckers. From a genetic point of view, both mtDNA and microsatellite data, Alaska has the lowest genetic diversity of all the populations (and lacks unique genetic variants), suggesting that it was recently colonized from other populations. Finally, mtDNA data do not support surviving lineages originating from multiple refugia; however the nuclear data do support the presence of at least two genetically distinct groups. It is possible that a Our results show limited phylogeographic structure in the Downy Woodpecker supporting a previous mtDNA RFLP study [23]. Unlike Ball & Avise [23] we found limited, but significant structure along the eastern Rocky Mountains in Montana, Colorado, Utah and southern Alberta. Another novel finding of this study is that Alaska (mtDNA and microsatellite data) has the lowest genetic diversity of all the populations (and lacks unique genetic variants), suggesting that it was recently colonized from other populations. The genetic affinities of non-sampled populations in the Midwest is unknown, however, based on the current genetic patterns it is unlikely that the inclusion of samples from this area will change the general patterns depicted here.

Sampling and DNA Extraction
Individuals were sampled through: fieldwork (blood samples), toe-pads from museum birds, or tissue loans from different institutions. A total of 262 birds were used for mtDNA and 257 for microsatellite analysis (see Table 1) with differences in sample sizes explained below. All animals captured in the field for sampling were properly treated and released in the field. No birds died during the procedures. Blood samples were collected following animal welfare protocol (#0624) approved by the University of Lethbridge animal welfare committee using guidelines set by the Canadian Council on Animal Care (CCAC). Most samples were collected during late spring-summer and include individuals from across the majority of the species' range ( Fig. 1) and all seven subspecies recognized by the American Ornithologists' Union (AOU) [32].
Total genomic DNA was extracted from muscle, blood, feather shaft and toe-pads from bird skins using a modified Chelex extraction [40]. Ten microlitres of blood-ethanol mix, or the equivalent in muscle or toe-pad, was incubated at 50uC for 30 min to allow the ethanol to evaporate. A 300 mL solution of DNA extraction buffer (0.1 M Tris-HCl pH 8.0, 0.05 M EDTA, 0.2 NaCl and 1% SDS) containing 5% w/v Chelex, 500 mg of proteinase K and 250 mg RNase was added, and the sample was incubated at 50uC overnight on a rotating wheel. The sample was vortexed, centrifuged at 10,000 rpm for 60 s, and 200-250 ml supernatant of transferred to 300 ml 5% w/v Chelex in low TE buffer (10 mM Tris-HCl pH 8.0, 0.1 mM EDTA).

Mitochondrial DNA Amplification and Sequencing
Partial tRNA-Lys, ATPase 6 and 8 genes were sequenced for 263 individuals (GenBank accession numbers: JX094511-JX094773). We were able to amplify the targeted regions for 262 individuals; however, some museum specimens could not be amplified (e.g. Vancouver Island or Ontario, Table 1 , and approximately 100 ng of purified PCR product. The program consisted of one cycle of 120 s at 96uC; 25 cycles of 30 s at 96uC, 15 s at 50uC and 240 s at 60uC. PCR products were cleaned using a NaC 2 H 3 O 2 precipitation. Samples were left to incubate at 4uC for 20 min and centrifuged at 13,000 rpm for 20 minutes, followed by two sequential rinses to remove residual salts: first with 100 ml cold 70% ethanol, spun at 13,000 rpm for 5 min, and second 35 ml cold 70% ethanol centrifuged at 13,000 rpm for 5 min. Ethanol was removed after each step. Samples were left to evaporate in the dark at room temperature for 60 min, re-suspended in 10 ml of hi-di formamide, and sequenced using an ABI3130 sequencer.

Microsatellite Amplification and Scoring
Only populations with ten or more individuals were genotyped ( Table 1). Samples were screened at 12 published microsatellite loci [58,59,60]. Five of the 12 were monomorphic (DlU1, DlU4, Ptri5, DMC112, and DMD7) and were dropped. A total of two hundred and fifty seven individuals were genotyped at seven variable microsatellite loci: DlU5 [58], DMA2, DMD9, DMC111, DMC115, DMD118 [59], and Ptri3 [60]. All 'forward' primers were modified with the addition of a M13 sequence on the 59 end to allow for the incorporation of a fluorescently labelled M13 primer during PCR [13]. The forward primer of DMC115 (59-TGTCAGAGATGGTTCATGGGTGCACT-39) [59] and Ptri3 (59-GCAAAAGCCAGTTCCTGTGCATGG-39) [60] were modified to improve DNA amplification. All loci were amplified using a two-step procedure that consisted of: one cycle of 120 s at 94uC, 45 s at annealing temperature 1 (T A1 ), and 60 s at 72uC; seven cycles of 60 s at 94uC, 30 s at T A2 , 45 s at 72uC. Annealing temperatures (T A1 and T A2 , respectively) were 45-48uC for DMA2, DMD9, DMD118; 48-50uC for DMC111; 50-52uC for: DlU5, DMC115 and Ptri3. DNA amplification was carried out in 10 ml reactions under the following conditions: ,200 ng DNA, 100 mM dNTP, 2.5 mM (for DMC111, DMD118, DMA2, DlU5, Ptri3) or 1.5 mM (DMD9, DMC115) MgCl 2 , 5 pmol of each F and R primer (and M13 primer), 1X PCR buffer, and 0.5 U of Taq polymerase. Amplification products were loaded on a Li-COR 4300 analyzer. DNA fragments were sized using Li-COR size standards and internal controls (three samples run on every load). Both authors checked microsatellite genotypes for every locus independently. We also developed a check plate containing multiple samples from each gel to confirm the original scores and check for shifts in allele sizes during scoring.

Mitochondrial DNA Genetic Analyses
Sequences were aligned manually using MEGA version 4 [61]. Two measurements of DNA polymorphism, nucleotide (p) and haplotype (H d ) diversity indices [62], were calculated for each population and all populations combined using DnaSP version 4.50.1 [63]. To understand how the genetic variation was partitioned among and within populations, Wright's fixation index F ST was calculated in Arlequin version 3.1 [64]. We tested for a positive association between genetic (F ST /(1-F ST )) and geographic distances with those expected under a stepping-stone model of population structure [65] using isolation by distance (IBD) analysis. A Mantel test was used to test significance with 500,000 permutations in GenAlEx version 6 [66].
Phylogeographic and demographic history of the Downy Woodpecker were studied using two complementary approaches. First, genealogical relationships among haplotypes were visualized using the software TCS version 1.21 [67]. This program produces a parsimony-based minimum-spanning network to display the number of possible connections (nucleotide differences) between haplotypes at a 95% confidence level. Second, we used classical population genetic statistics. Fu's F s is very powerful at detecting population expansions (increases in effective population size) by estimating departures from neutrality [68,69]. Large, significant negative F s values reject the null hypothesis of population stasis. F s were calculated in DnaSP version 4.50.1 and significance was evaluated with 10,000 coalescent simulations. Similarly, Tajima's D was used to test against selective neutrality and population equilibrium [70] where negative and significant values suggest either population expansion or purifying selection. Tajima's D was calculated in Arlequin version 3.1 [64]. Additionally, Fu and Li's D* and F* tests were calculated to determine if background selection explained the patterns of genetic diversity in this species. If Fu and Li's statistics are significant and Fu's F s and Tajima's D are not, then background selection is more likely creating the observed pattern [68].
We used mismatch distributions to help visualize signatures of demographic expansion for all the samples combined, and to test the null hypothesis of population growth [30,31]. The simulated distribution of pairwise nucleotide differences (under the population growth model) was compared with the observed distribution using the sum of square deviations (SSD). Likewise, the raggedness index (r) was calculated to test if the frequency of pairwise nucleotide differences followed a smooth unimodal curve expected from a growing population by comparing the expected and observed (r). Small values of r suggest a good fit to the model. Both SSD and r were calculated in Arlequin version 3.1 [64].
We estimated the putative time of population expansion most of the populations from the statistic tau (t). t is equal to 2ut, where u equals 2 mk, m is the mutation rate and k is the length of the sequence and t the time of expansion [31]. Tau (t) values were calculated in Arlequin version 3.1 [64]. Population expansion times were estimated assuming a constant molecular clock and rates from 0.5, 1.6 and 2% (71 Pereira & Wajntal 2008) using the only tool http://www.uni-graz.at/zoowww/mismatchcalc/mmc1. php developed by [72].

Microsatellite Genetic Analyses
MICRO-CHECKER [73] was used to detect errors in scoring and the presence of null alleles. Allele frequencies, allelic richness, observed (Ho) and expected heterozygosity (He) were calculated for every population and each locus using the package GenAlEx version 6 [66]. Linkage disequilibrium and Hardy-Weinberg equilibrium were assessed (using Fisher's exact probability test) in the web version of GENEPOP version 4.0.10 [74]. The Markov chain parameter was set to 10,000 dememorizations, 1000 batches, and 10,000 iterations per batch for both analyses.
Genetic differentiation among sampled populations was evaluated using F ST [75] implemented in GENETIX version 4.05 [76]. A total of 10,000 permutations was used to determine the significance of the F ST . For microsatellite analysis of population structure, F ST is preferred over R ST when the number of loci is less than 20 [77] as is the case with this study.
The Bayesian clustering program STRUCTURE version 2.3.3 [78,79] was used to determine the number of possible clusters in our sample (K). K was set to a value from 1 to 16 during the initial exploratory runs (500,000 generations with a burn in of 100,000), and was then reduced to K = 1-5 for the final analyses based on the number of clusters detected in the initial runs. The final analysis was run for 1,000,000 generations with a burn in of 500,000, assuming population admixture (same alpha for the degree of admixture), correlated allele frequencies, and including a priori population information. Ten replicates were performed at each value of K for the initial and final runs. Average ln P(X|K) was plotted against K to visualize the number of possible clusters as suggested by the authors. Although STRUCTURE is widely used, the biological meaning of the number of groups (K) can be problematic to infer when there are few available loci or the populations are not strongly differentiated [79]. Likewise, because it is difficult to decide which K captures the major structure in the data due to similar ln P(X|K) values, two complementary methods were employed. One is the Bayes rule, where posterior probabilities for each K are compared in order to select the more meaningful K [78]. The second is the DK method [79], for which we used the on-line tool STRUCTURE Harvester (http://taylor0. biology.ucla.edu/struct_harvest/). DK eliminates the possibility of overestimating the potential number of populations or groups due to the fluctuating nature of probability ln P(X|K) as K increases. One limitation with the DK method is it cannot detect when K = 1 is the true number of clusters. Finally, isolation by distance (IBD) analysis was carried out for microsatellites in order to test the relationship between genetic and geographic distances.

Ecological Niche Modeling (ENM)
To gain a better understanding of the possible areas where the Downy Woodpecker survived during the LGM, a distribution model was constructed using the maximum entropy method implemented in Maxent version 3.3.3e [38]. Maxent estimates maximum entropy distributions using climatic variables to predict non-negative probabilities on a target distribution using presence records [80]. The analysis is based on 19 bioclimatic variables from the WorldClim dataset, publically available and extensively used in ENM [14,81]. For all analyses, we used the default convergence threshold and maximum number of iterations (500), using 25% of the localities for model training. Maxent produces a continuous probability value (0 to 1 using the logistic default output) as an indicator of the presence/suitability for the species based in the principle of maximum entropy [38]. Distribution modeling for the LGM was done using paleoclimatic data drawn from the Community Climate System Model [82].
Sets of 16,271 localities were used to generate current and historical distribution maps. All the records were downloaded from the Global Biodiversity Information Facility (GBIF) data portal (http://data.gbif.org), and included the localities used in the genetic analyses. Duplicate records were removed as part of the default setting in Maxent to avoid pseudo-replication. To test if the modeled distribution of the Downy Woodpecker during the LGM corresponds to a realistic model, we used two basic techniques implemented by the software. The first is the omission plot. If the data using for training deviates from the predicted omission line then the model is statistically weak. The second if the area under the curve (AUC), which test if the model predicted with the training data (25% of the total locality records), is similar to the testing data, and if both are not better than a random model. Values range from 0 to 1, and values under 0.5 are not better than a random model [38].