Multiplex APLP System for High-Resolution Haplogrouping of Extremely Degraded East-Asian Mitochondrial DNAs

Mitochondrial DNA (mtDNA) serves as a powerful tool for exploring matrilineal phylogeographic ancestry, as well as for analyzing highly degraded samples, because of its polymorphic nature and high copy numbers per cell. The recent advent of complete mitochondrial genome sequencing has led to improved techniques for phylogenetic analyses based on mtDNA, and many multiplex genotyping methods have been developed for the hierarchical analysis of phylogenetically important mutations. However, few high-resolution multiplex genotyping systems for analyzing East-Asian mtDNA can be applied to extremely degraded samples. Here, we present a multiplex system for analyzing mitochondrial single nucleotide polymorphisms (mtSNPs), which relies on a novel amplified product-length polymorphism (APLP) method that uses inosine-flapped primers and is specifically designed for the detailed haplogrouping of extremely degraded East-Asian mtDNAs. We used fourteen 6-plex polymerase chain reactions (PCRs) and subsequent electrophoresis to examine 81 haplogroup-defining SNPs and 3 insertion/deletion sites, and we were able to securely assign the studied mtDNAs to relevant haplogroups. Our system requires only 1 × 10⁻¹³ g (100 fg) of crude DNA to obtain a full profile. Owing to its small amplicon size (<110 bp), this new APLP system was successfully applied to extremely degraded samples for which direct sequencing of hypervariable segments using mini-primer sets was unsuccessful, and proved to be more robust than conventional APLP analysis. Thus, our new APLP system is effective for retrieving reliable data from extremely degraded East-Asian mtDNAs.


Introduction
Mitochondrial DNA (mtDNA) is a powerful tool for exploring matrilineal phylogeographic ancestry, as well as for analyzing highly degraded samples, because of its polymorphic nature and high copy numbers per cell. The recent advent of complete mitochondrial genome sequencing has led to improved techniques for phylogenetic analyses based on mtDNA, and many multiplex genotyping methods have been developed for the hierarchical analysis of phylogenetically important mutations [1][2][3][4][5][6][7].
However, few multiplex genotyping systems for analyzing East-Asian mtDNA lineages can be applied to extremely degraded samples [2,5,6,7]. Even in these studies, haplogroup D, which exhibits the highest frequency and incidence of variations in many East-Asian populations, is not sufficiently classified. For example, Coutinho et al. [6] divided haplogroup D into 8 sub-haplogroups (the highest number among the above-mentioned studies). However, with the exception of sub-haplogroups D4b1 and D4e, these sub-haplogroups are exclusively observed in Native Americans; moreover, many sub-haplogroups of haplogroup D that are phylogenetically important in East-Asian populations are missing (e.g., haplogroup D4a). Therefore, there is a need to establish higher resolution multiplex systems for the hierarchical analysis of phylogenetically important mutations in East-Asian populations.
Among the methods for analysis of single nucleotide polymorphisms in mtDNA (mtSNPs), amplified product-length polymorphism (APLP) [8,9] is considered one of the simplest and most robust. To detect mtSNPs, APLP employs two allele-specific primers, one of which has a few non-complementary bases in the 5'-terminus. The detection consists of assessing the difference in the length of the amplicons, which are obtained by polymerase chain reaction (PCR) and subsequent electrophoresis. We previously showed the effectiveness of APLP-based multiplex mtSNP analyses [9] for highly degraded samples when we successfully clarified the genealogy of individuals, and the relationship between populations excavated from different archaeological sites [10][11][12][13][14][15][16].
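The length-based allele call that APLP relies on can be sketched in a few lines of Python. This is only an illustration of the principle: the amplicon length, the flap length, the size tolerance, and the site name below are hypothetical values, not primer designs from this study.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AplpAssay:
    """One SNP assay: two allele-specific primers, one carrying a 5' flap.

    All numbers used with this class are hypothetical illustration values.
    """
    site: str
    core_amplicon_bp: int   # amplicon length from the primer without a flap
    flap_bp: int            # non-complementary bases added at the 5' end
    unflapped_allele: str   # allele detected by the primer without a flap
    flapped_allele: str     # allele detected by the flapped primer

    def expected_bands(self) -> Dict[int, str]:
        """Map each expected band size (bp) to the allele it indicates."""
        return {
            self.core_amplicon_bp: self.unflapped_allele,
            self.core_amplicon_bp + self.flap_bp: self.flapped_allele,
        }

    def call_allele(self, observed_bp: float,
                    tolerance_bp: float = 1.0) -> Optional[str]:
        """Return the allele whose expected band matches the observed size."""
        for size, allele in self.expected_bands().items():
            if abs(observed_bp - size) <= tolerance_bp:
                return allele
        return None  # no band of an expected size, so no call is made


# Hypothetical assay: 60-bp core amplicon, 3-base flap on the derived primer.
assay = AplpAssay(site="hypothetical_site", core_amplicon_bp=60, flap_bp=3,
                  unflapped_allele="ancestral", flapped_allele="derived")
print(assay.expected_bands())
print(assay.call_allele(63.2))
```

In a multiplex, each site would get its own non-overlapping pair of band sizes so that all sites can be read from a single gel lane.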
However, with respect to the successful analysis of extremely degraded samples, the conventional mitochondrial APLP (mtAPLP) system [9] has at least four drawbacks. First, conventional mtAPLP systems examine 35 haplogroup-diagnostic mtSNPs and a 9-bp repeat variation in the non-coding cytochrome oxidase II/tRNA Lys intergenic region. This number of polymorphic sites is too small for classifying mtDNAs to sub-haplogroup level without using the sequence data of the hypervariable segments (HVS). Second, in each set of a conventional mtAPLP system, the mtSNPs are not selected in accordance with the phylogenetic order. For instance, the macro-haplogroup examined in set A is N despite the fact that seven out of nine haplogroups examined in this set stem from macro-haplogroup M: haplogroup D, its branches (D4, D4a, D4b, D4g, and D4e), and haplogroup M12. Third, the competitiveness of some primers is low. For example, haplogroup F mtDNA always shows an extra 66-bp band on gel. Fourth, the amplicon size is considered inappropriate. In practice, amplicons longer than 120 bp frequently disappear when analyzing extremely degraded samples. To overcome such limitations, a more accurate, detailed, and sensitive mtDNA haplogrouping system is required.
Here, we present a novel multiplex inosine-flapped APLP system that is specifically designed for haplogrouping extremely degraded East-Asian mtDNAs.

DNA samples
To obtain modern-day DNA samples, intraoral epithelial cells were collected from eight healthy Japanese adults. Before cells were collected, volunteers were informed in writing that their DNA would be anonymized and that it would be used only for mtDNA haplogrouping. Written consent was then obtained from each volunteer to use his or her DNA in the study. Both the consent procedure and the written forms were approved by the ethics committee of the Faculty of Medicine of the University of Yamanashi.
To establish the current APLP system, in addition to the samples from the volunteers, we also used ancient and modern-day DNA samples for which the mtDNA haplogroups had been securely determined in previous studies [9][10][11][12][13][14][15][16][17]. DNA samples provided by the University of Malaya, Tokai University School of Medicine, and Yamagata University had all been anonymized before arriving at our research facility at the University of Yamanashi. We obtained permission from each of the respective universities to conduct this study using these DNA samples.
The intraoral epithelial cells were collected using a forensic swab (Sarstedt Inc., Nümbrecht, Germany). DNA extraction was performed using a MonoFas 1 Intraoral epithelial cells genome DNA extraction kit VIII (GL Science Inc., Tokyo, Japan), and the manufacturer's protocol was followed. The quantity and purity of the DNA were evaluated by optical density (OD) 260 and OD 260/280 measurements, obtained using a spectrophotometer (NanoDrop 1000; Thermo Fisher Scientific Inc., Waltham, MA, USA).
To determine the mtDNA haplogroups of these DNA samples, segments of mtDNA that cover part of the tRNA Pro gene, HVS 1 (nucleotide positions (np) 15999-16366, relative to the revised Cambridge reference sequence (rCRS) [18]), and HVS 2 (np 128-256) were analyzed as described previously [12]. Moreover, to confirm our ability to identify mtDNA haplogroups from modern-day samples, we also analyzed haplogroup-diagnostic mtSNPs and a 9-bp repeat variation in the non-coding cytochrome oxidase II/tRNA Lys intergenic region by using the conventional mtAPLP system [9]. Nucleotide changes observed in the eight modern-day samples are shown in Fig 1. Thereafter, we assigned each modern-day mtDNA under study to the relevant haplogroup by using Phylotree, the updated comprehensive phylogenetic tree of global human mitochondrial DNA variation (www.phylotree.org; mtDNA tree Build 17) [19]. Phylotree is built on the Reconstructed Sapiens Reference Sequence (RSRS) [20] to avoid inconsistencies, misinterpretations, and errors in medical, forensic, and population genetic studies. However, the description of nucleotide changes in the conventional mtAPLP system [9] is based on rCRS. Therefore, we used an rCRS-oriented version of mtDNA tree Build 17 [19] as a classification tree for the modern-day samples.
Furthermore, in order to evaluate the effectiveness of our system for the analysis of extremely degraded samples, we also tested ancient DNA samples extracted from one early Kofun (approximately 1,600 years old) and 11 Middle Jomon (approximately 4,000 years old) skeletons excavated from the Kusakari shell midden site, Chiba, Japan. DNA was extracted from the teeth of these skeletons according to the method described by Adachi et al. [14].
Phylogenetically important mutations were examined using fourteen 6-plex PCR sets (Table 1 and Fig 3; Table 2 and Fig 4). We recently developed novel APLP primers with a short inosine extension added to the 5'-terminus. This modification improves the competitiveness of allele-specific primers to the template DNA, resulting in enhanced reliability of the analysis of SNPs [28]. In the present study, we designed the primers based on this inosine-flapped APLP method. Moreover, to maximize the robustness of the PCR, we used amplicons with a length of <110 bp, which is shorter than the amplicon length used in the conventional system (<151 bp) [9]; we also examined fewer polymorphic sites in each set (6 sites, compared to 9 in the conventional system) [9].
First of all, reactions using primers of multiplexes M-I and N-I were performed for all samples, because most East-Asian mtDNAs stem from macro-haplogroups M and N, and these multiplexes can identify major branches of macro-haplogroups M and N that are widely observed in East-Asian populations.
Following SNP typing using multiplexes M-I and N-I, each mtDNA under study was classified on the basis of the criteria shown in Fig 2, by using the mtDNA haplogroup nomenclature from Phylotree [19]. Thereafter, each mtDNA underwent subsequent haplogrouping based on the result of multiplexes M-I and N-I. For example, if an mtDNA was assigned to haplogroup B, further haplogrouping of this mtDNA was performed using multiplex N-V. If the SNPs observed in a sample did not represent a haplogroup motif, i.e., they were apparently incongruent with Phylotree, the data were discarded because, as we reported previously [29], such incongruence often stems from contamination of the sample.
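The two-step logic described above (a first-round call, a congruence check against the tree, and routing to a follow-up multiplex) can be sketched as follows. This is a minimal illustration under stated assumptions, not the classification scheme of Fig 2: the parent map is a heavily simplified toy tree with intermediate Phylotree nodes omitted, and in the routing table only the entry B to N-V is stated explicitly in the text. The other routing entries are inferred from the multiplexes applied to the D4j, F2, and ancient samples, and the branch labels used as keys are assumptions.

```python
from typing import Dict, List, Optional

# Toy child -> parent map (simplified; Phylotree is the authority).
PARENT: Dict[str, Optional[str]] = {
    "M": None, "N": None,                 # treated as roots in this sketch
    "D": "M", "D4": "D", "D4j": "D4",
    "M7": "M", "M7a": "M7", "M7a1": "M7a",
    "N9": "N", "N9b": "N9",
    "R": "N", "B": "R", "F": "R", "F2": "F",
}

# First-round haplogroup -> follow-up multiplex.
FOLLOW_UP: Dict[str, str] = {
    "B": "N-V",      # stated in the text
    "D": "M-III",    # inferred (assumption)
    "F": "N-VI",     # inferred (assumption)
    "M7a": "M-V",    # inferred (assumption)
    "N9b": "N-III",  # inferred (assumption)
}


def lineage(hg: str) -> List[str]:
    """Return the root-to-tip path ending at haplogroup hg."""
    path: List[str] = []
    node: Optional[str] = hg
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def classify(calls: Dict[str, bool]) -> Optional[str]:
    """Return the most derived haplogroup, or None if the calls are incongruent.

    calls maps each typed haplogroup-defining marker to True (derived allele)
    or False (ancestral allele). The calls are congruent when every typed
    marker on the path to the tip is derived and every other one is ancestral.
    """
    derived = [hg for hg, is_derived in calls.items() if is_derived]
    if not derived:
        return None
    tip = max(derived, key=lambda hg: len(lineage(hg)))
    on_path = set(lineage(tip))
    congruent = all(is_derived == (hg in on_path)
                    for hg, is_derived in calls.items())
    return tip if congruent else None


def next_multiplex(hg: str) -> Optional[str]:
    """Walk from hg towards the root and return the first routed multiplex."""
    for node in reversed(lineage(hg)):
        if node in FOLLOW_UP:
            return FOLLOW_UP[node]
    return None


first_round = {"M": True, "D": True, "M7": False, "N": False}
hg = classify(first_round)
print(hg, next_multiplex(hg))

# Derived alleles on two unrelated branches: treated as possible contamination.
mixed = {"M": True, "D": True, "N": True, "B": True}
print(classify(mixed))
```

Returning `None` for incongruent calls mirrors the decision to discard such data rather than force an assignment.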

PCR conditions and detection of PCR products
The composition of the reaction mixture and the PCR conditions were the same for all multiplexes; only the primers differed. Each reaction was performed in a total volume of 10 μl, containing a 1 μl aliquot of the sample DNA solution, the optimum concentration of each primer (Tables 1 and 2), and the reagents of the QIAGEN Multiplex PCR Kit (QIAGEN, Hilden, Germany).
The amplification reaction was conducted in a TaKaRa PCR Thermal Cycler FAST (TaKaRa, Shiga, Japan). The PCR program was as follows: incubation at 95°C for 15 minutes; 5 cycles of 94°C for 30 seconds and 64°C for 5 minutes (ramp speed > 2.5°C/sec); 33 cycles of 94°C for 30 seconds and 64°C for 90 seconds (ramp speed > 2.5°C/sec); and a final extension at 72°C for 3 minutes.
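For readers who want to script or sanity-check this cycling program, it can be written down as data and its total hold time summed. The sketch below encodes only the temperatures, durations, and cycle counts given above; the class and function names are illustrative, and ramp time between steps is not included.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: Tuple[Step, ...]


PROGRAM: Tuple[Stage, ...] = (
    Stage(1, (Step(95, 15 * 60),)),                # initial incubation
    Stage(5, (Step(94, 30), Step(64, 5 * 60))),    # 5 cycles, long 64°C step
    Stage(33, (Step(94, 30), Step(64, 90))),       # 33 cycles, short 64°C step
    Stage(1, (Step(72, 3 * 60),)),                 # final extension
)


def total_hold_seconds(program: Tuple[Stage, ...]) -> int:
    """Sum the programmed hold times, excluding ramping between steps."""
    return sum(stage.cycles * sum(step.seconds for step in stage.steps)
               for stage in program)


def amplification_cycles(program: Tuple[Stage, ...]) -> int:
    """Count cycles in the multi-step (denaturation + 64°C) stages."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)


seconds = total_hold_seconds(PROGRAM)
print(seconds, seconds / 60)          # 6690 s, i.e. 111.5 minutes
print(amplification_cycles(PROGRAM))  # 38 cycles in total
```

The program therefore holds for just under two hours before ramp time is added, with 38 amplification cycles in total.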
A 2 μl aliquot of the PCR product was separated by electrophoresis in a precast native polyacrylamide gel (10% T, 5% C) containing 1 × TBE buffer with running buffer (1 × TBE) (TEFCO, Tokyo, Japan) using an electrophoretic apparatus STC-808 (TEFCO). Electrophoresis was run at a constant voltage of 150 V for approximately 98 minutes. PCR bands were visualized fluorographically after staining with SYBR Green (Bio-Rad Laboratories, Hercules, CA, USA).

Testing the sensitivity of the new APLP system
To evaluate the sensitivity of our new APLP system, various amounts of crude DNA (1.0 × 10⁻⁹ to 0.1 × 10⁻¹⁵ g), which included genomic DNA and mtDNA with known haplogroups (D4j and F2), were examined using multiplexes M-I and M-III for D4j mtDNA, and multiplexes N-I and N-VI for F2 mtDNA (Table 2 and Fig 4). The results of the experiments were confirmed by three independent assays. To detect the possibility of contamination, negative PCR controls were also analyzed.

Application to highly degraded samples
To validate the effectiveness of our new APLP system for highly degraded samples, we analyzed one early Kofun (approximately 1,600 years old) and 11 Middle Jomon (approximately 4,000 years old) skeletons. First, the ancient DNAs were examined by using multiplexes M-I and N-I. Thereafter, the samples underwent subsequent haplogrouping using multiplexes M-V and N-III. The results of the experiments were confirmed by three independent assays. Before performing the analysis using our new APLP system, we checked the quality of the ancient DNA samples by using a conventional APLP system [9]; we also attempted direct sequencing of HVS 1 (np 15999-16366) using our mini-primer sets [29], for which the amplicon length is shorter than 139 bp. These preliminary analyses revealed that 3 out of 12 samples could be assigned to relevant haplogroups using the conventional APLP system: sample B192 was assigned to haplogroup N9b, sample B516C to haplogroup M7a, and sample B516D to haplogroup M7a. Only one sample (B192), which is ascribed to the early Kofun period, could be analyzed by direct sequencing (mutations identified at np 15999-16366: A16183C-T16189C-C16223T, relative to rCRS [18]). Along with the ancient samples, negative extraction and negative PCR controls were also analyzed.

Hierarchical analysis of mtSNPs
As shown in Figs 3 and 4, our new APLP system correctly identified the genealogy of the mtDNAs for which the haplogroups had been determined in advance.
Fig 3. [...] Table 2. Yellow, light blue, light green, red, green, purple, orange, and blue frames indicate multiplexes M-I, II, III, IV, V, VI, VII, and VIII, respectively. This color coding corresponds to that given in Fig 2. LM indicates the 10-bp ladder marker. doi:10.1371/journal.pone.0158463.g003

Sensitivity of the new APLP system
Although the copy number of mtDNA exhibits some variation among individuals, the detection limit was 1.0 × 10⁻¹³ g of crude DNA for multiplexes M-I, M-III, and N-I, and 1.0 × 10⁻¹⁴ g for multiplex N-VI. Consequently, our new APLP system correctly identified haplogroups D4j and F2 from 1.0 × 10⁻¹³ g (100 fg) of crude DNA template, which corresponds to fewer than 10 copies of mtDNA (Fig 5). In the analysis of these samples, negative PCR controls were negative throughout the experiment.

Robustness of the new APLP system with respect to highly degraded mtDNA
By using our new APLP system, 10 out of 12 mtDNAs from the ancient skeletons were successfully assigned to relevant haplogroups (Fig 6). In the analysis of the ancient skeletons, negative extraction and negative PCR controls were negative throughout the experiment (data not shown).

Discussion
The results showed that, following a hierarchical analysis of 81 haplogroup-defining mtSNPs and 3 insertion/deletion sites performed using fourteen 6-plex multiplexes and subsequent electrophoresis, our new APLP system correctly identified the genealogy of the mtDNAs with known haplogroups. Previously, 15 to 36 mtSNPs and insertion/deletion polymorphisms were examined in East-Asian mtDNA by using conventional means such as SNaPshot minisequencing assays [2,6,7] or APLP [5,9]. However, the number of mtSNPs examined in these previous studies is too small for detailed haplogrouping of East-Asian mtDNAs. In particular, haplogroup D, which is the predominant haplogroup in many East-Asian populations, was not sufficiently classified in the previously reported assays. By using our new APLP system, many major mtDNA lineages including haplogroup D can be securely classified to the sub-haplogroup level. Moreover, using our new APLP system, hierarchical examination of many mtSNPs can help identify contamination or misinterpretation of the results on the basis of congruence with Phylotree, the updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Therefore, as we reported previously [29], our new APLP system can improve the reliability of sequencing and SNP analysis of mtDNA.
Fig 4. [...] Table 2. Yellow, light blue, light green, red, green, and purple frames indicate multiplexes N-I, II, III, IV, V, and VI, respectively. The color coding corresponds to that given in Fig 2. LM indicates the 10-bp ladder marker. doi:10.1371/journal.pone.0158463.g004
The recent advent of high-throughput sequencing (HTS) technology based on so-called Next Generation Sequencers has allowed analyses of the complete mitochondrial and chromosomal genome sequences even in very degraded samples like those from archaeological skeletons [30,31]. However, HTS is very costly, and thus it is difficult for most laboratories to perform such analyses routinely. Therefore, to maximize the success rate of HTS, it is important to evaluate the quality and quantity of DNA in the samples before subjecting them to HTS. Our new APLP system correctly identified the haplogroup of mtDNAs in only 100 fg (1.0 × 10⁻¹³ g) of crude DNA. This sensitivity is over 10 times higher than that reported in previous studies, where quantities of crude DNA on the order of at least picograms (1.0 × 10⁻¹² g) were required for accurate genotyping [3,5,6,7]. This extremely high sensitivity may be ascribable to the reduced number of SNP sites analyzed in each multiplex in our system compared to that analyzed in other systems. Our new APLP system is thus expected to serve as a time- and cost-efficient tool to evaluate the quality and quantity of DNA in samples before HTS analysis.
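The scale of these template amounts can be made concrete with a rough back-of-the-envelope calculation. The sketch below is not necessarily how the copy-number figure reported in the Results was derived. It assumes that crude DNA mass is dominated by nuclear DNA, uses the commonly quoted approximation of about 6.6 pg of nuclear DNA per diploid human cell, and takes the number of mtDNA copies per cell as a free, hypothetical parameter.

```python
# Approximate nuclear DNA content of one diploid human cell, in femtograms.
# About 6.6 pg is a commonly quoted approximation, not a value from this study.
DIPLOID_GENOME_FG = 6600.0


def cell_equivalents(template_fg: float) -> float:
    """Number of cell equivalents in a crude DNA template of the given mass."""
    return template_fg / DIPLOID_GENOME_FG


def estimated_mtdna_copies(template_fg: float, copies_per_cell: float) -> float:
    """Rough mtDNA copy estimate; copies_per_cell is an assumed parameter."""
    return cell_equivalents(template_fg) * copies_per_cell


def fold_improvement(previous_limit_fg: float, new_limit_fg: float) -> float:
    """How many times less template the new detection limit requires."""
    return previous_limit_fg / new_limit_fg


# 1 pg (1000 fg) in earlier assays versus 100 fg here.
print(fold_improvement(1000.0, 100.0))

# Hypothetical copy numbers per cell, for illustration only.
for copies_per_cell in (100, 500):
    print(copies_per_cell, round(estimated_mtdna_copies(100.0, copies_per_cell), 1))
```

Under these assumptions, 100 fg is only about 0.015 cell equivalents, so the estimated number of mtDNA templates stays in the single digits for the per-cell copy numbers tried here.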
In the present study, we show how an inosine-flapped APLP system can be efficiently applied for the hierarchical multiplex analysis of mtSNPs. Adding a short inosine extension to the 5'-terminus of APLP primers improves the competitiveness of allele-specific primers to the template DNA, resulting in enhanced reliability of the SNP analysis [28]. Furthermore, the thermodynamics of the primers with inosine flaps have been proven to be less influenced by the sequence of PCR templates than the thermodynamics of the primers with 5' flaps containing ordinary bases [28]. These features of inosine-flapped primers are likely to have contributed to the high sensitivity observed for our new APLP system.
The robustness of our APLP system was verified by the analysis of 12 archaeological skeletons; only 3 such samples could be successfully assigned to relevant haplogroups using the conventional APLP system, whereas a total of 10 samples were successfully assigned using our new APLP system. Our inosine-flapped APLP primers generated shorter amplicons (<110 bp) than those generated by conventional APLP primers (<151 bp), and we believe that this shorter amplicon length is the source of the higher success rate observed for our new APLP system.
Fig 6. [...] Using the conventional APLP system, 3 out of 12 samples could be assigned to relevant haplogroups (B192 to N9b, B516C to M7a, and B516D to M7a). Arrows indicate subsequent haplogrouping flows based on the results obtained using multiplexes M-I and N-I. Yellow frames identify results obtained using multiplexes M-I and N-I, while results obtained using multiplexes M-V and N-III are framed in green and light green, respectively (color coding corresponds to that given in Fig 2).
The haplogroups observed in the samples excavated from the Kusakari shell midden site were N9b and M7a. These haplogroups are also observed in the Jomon people unearthed from Hokkaido, the northern island of Japan. Notably, haplogroup N9b is the predominant haplogroup in these Hokkaido Jomon people (64.8%, 35 out of 54 individuals) [13]. The fact that these haplogroups are also observed in the Kusakari Jomon people, who were excavated from Honshu, the main island of Japan, indicates that these haplogroups are strong candidates for the so-called "Jomon genotype", as suggested by previous studies [9,12,13]. Moreover, the fact that haplogroup N9b is observed in the Kofun sample (B192), excavated from the same site, may hint at genetic continuity at this site extending from the Jomon era to the Kofun era.
In addition, at the sub-haplogroup level, one Kusakari Jomon sample (B516C) was assigned to M7a1, which was not observed in the Hokkaido Jomon people [13]. Intriguingly, this sub-haplogroup is the predominant one found in modern-day Japanese and Korean M7a mtDNAs [22,32,33]. It has its highest frequency (44 out of 156 individuals) in Okinawa islanders living in the southernmost islands of Japan [32]. However, haplogroup M7a is rare in Southeast Asian populations, whereas the frequencies of its sister haplogroups (e.g., M7b and M7c) are relatively high in these populations [24,27,34]. We have previously hypothesized that haplogroup M7a may have diversified from its ancestral M7 haplogroup in the southern part of the Japanese archipelago [13]. The finding that haplogroup M7a1 is observed in Honshu Jomon people but is absent in Hokkaido Jomon individuals gives some support to this hypothesis.
Unfortunately, we could not compare the robustness of our new system with that of SNaPshot analysis or HTS, mainly because of the limited residual volume of the samples. However, our system generates short amplicons similar to those reported in the study of Coutinho et al. [6], which focuses on the ancient DNA analysis of skeletons excavated from South America. Therefore, our system is expected to be as effective as that of Coutinho et al. [6] for the analysis of fragmented mtDNA.
As described earlier, in the case of extremely degraded samples like archaeological skeletons, it is often very difficult to obtain reliable mtDNA sequences. Despite such difficulties, it is worth trying to obtain as much mtDNA data as possible from those samples, because such data is important for phylogeographic analysis, and, in some cases, personal identification. Therefore, our new APLP system is expected to be very useful in analyzing extremely difficult forensic samples, as well as for molecular anthropological studies of ancient populations.