The polymorphism of Hydra microsatellite sequences provides strain-specific signatures

Hydra are freshwater polyps widely studied for their amazing regenerative capacity, adult stem cell populations, low senescence and value as ecotoxicological marker. Many wild-type strains of H. vulgaris have been collected worldwide and maintained effectively under laboratory conditions by asexual reproduction, while stable transgenic lines have been continuously produced since 2006. Efforts are now needed to ensure the genetic characterization of all these strains, which despite similar morphologies, show significant variability in their response to gene expression silencing procedures, pharmacological treatments or environmental conditions. Here, we established a rapid and reliable procedure at the single polyp level to produce via PCR amplification of three distinct microsatellite sequences molecular signatures that distinguish between Hydra strains and species. The TG-rich region of an uncharacterized gene (ms-c25145) helps to distinguish between Eurasian H. vulgaris-Pallas strains (Hm-105, Basel1, Basel2 and reg-16), between Eurasian and North American H. vulgaris strains (H. carnea, AEP), and between the H. vulgaris and H. oligactis species. The AT-rich microsatellite sequences located in the AIP gene (Aryl Hydrocarbon Receptor Interaction Protein, ms-AIP) also differ between Eurasian and North American H. vulgaris strains. Finally, the AT-rich microsatellite located in the Myb-Like cyclin D-binding transcription factor1 gene (ms-DMTF1) gene helps to distinguish certain transgenic AEP lines. This study shows that the analysis of microsatellite sequences, which is capable of tracing genomic variations between closely related lineages of Hydra, provides a sensitive and robust tool for characterizing the Hydra strains.


Introduction
Since the initial discovery of Hydra regeneration by Abraham Trembley in 1744 [1], the freshwater Hydra polyp is used as a fruitful model system not only in cell and developmental biology but also for aging, neurobiology, immunology, evolutionary biology and ecotoxicology studies [2][3][4][5][6][7][8]. Hydra, which belongs to Cnidaria, the sister phylum of bilaterians (Fig 1A), is closely related to jellyfish although displaying a life cycle restricted to the polyp stage (Fig 1B).

PLOS ONE
PLOS ONE | https://doi.org/10.1371/journal.pone.0230547 September 28, 2020 1 / 20 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 Over the past 100 years, numerous strains were captured all over the world to explore the variability of the Hydra genus and the genetic basis of developmental mechanisms [9][10][11]. The analysis of morphological and cellular criteria identified in Hydra strains collected worldwide established four distinct groups named H. oligactis (stalked Hydra), H. vulgaris (common Hydra), H. viridissima (symbiotic green Hydra) and H. braueri (gracile Hydra) [11] (Fig 1B). The main cellular criterion was provided by the morphology of nematocysts (the venom capsules located inside the mature stinging cells named nematocytes or cnidocytes) that varies between the Hydra groups [12]. More recently, a series of mitochondrial and nuclear molecular markers were used for a series of phylogenetic analyses [13][14][15][16], which confirmed the relevance of these four groups but also revealed that each group may actually contain several closely related strains that have been described as different species, e.g. H. carnea and H. vulgaris within the H. vulgaris group (Fig 1A). Indeed, in this group, two sub-groups were identified, the H. vulgaris-Pallas sub-group that includes animals collected in Europe and Asia, typically the Hv_Basel and H. magnipapillata 105 (Hm-105) strains respectively, while the second sub-group named H. vulgaris-4 by Schwentner and Bosch [16] includes animals collected in North America as H. carnea from which the AEP strain was derived (Fig 1A). In the absence of consensus names for the latter, we have decided to name it "H.vulgaris-NA", following their North American (NA) origin.
The importance of identifying the various Hydra strains/species relies on the fact that they can exhibit (i) different developmental behaviors, especially the morphogenetic variants that show distinct budding rate or size features in homeostatic context [20][21][22][23], (ii) lower regeneration potential such as the reg-16 strain that was obtained through inbreeding of the Japanese H. magnipapillata strain [24], (iii) abnormal apical patterning such as multiheaded strains [25,26], (iv) specific cellular properties such as the nf-1 strain that contains neither interstitial stem cells nor interstitial derivatives [27] or the sf-1 thermo-sensitive strain that loses its cycling interstitial cells upon transient heat-shock exposure [28]. Importantly, strains that do not show obvious differences at the morphological or cellular levels actually exhibit variable responses to gene silencing upon RNA interference [29], to drug treatment [30][31][32] or to environmental stresses [32]. In addition, experimental evidence indicate that strain-specific signals regulate the proliferation of interstitial cells [33].
During the past ten years, efforts were made to obtain the H. vulgaris genome [34], reference transcriptomes and proteomes [35][36][37], quantitative RNA-seq in homeostatic and regenerative conditions [38][39][40][41], on flow-cytometry sorted stem cell populations [38,[40][41][42] and single-cell transcriptomes [43]. Two strains of H. oligactis, one named "cold-sensitive" (Ho_CS) that undergoes aging and another named "cold-resistant" (Ho_CR) that does not, were used for transcriptomic and proteomic analysis [32]. Genomic sequences were also made available for the H. oligactis and H. viridissima species [41]. The current molecular barcoding in Hydra is precise and efficient but time-consuming and relatively costly as based on DNA extractions, PCRs amplification followed by DNA sequencing, and therefore not well-adapted to large-scale characterization of individual polyps.
Microsatellites consist in tandem repeats of short nucleotide motifs of variable length, e.g. (TA)n, (CA/TG)n, (CG)n, (CAG)n, where n represent the number of repetitions [44]. These microsatellites are distributed at different locations in the genome, and the number of repeats within a given microsatellite may differ between animals of the same species or population. As a result, microsatellites are widely used for DNA profiling in population genetics studies, but also in criminal investigations, paternity testing, or identification of individuals in the event of a mass disaster [45,46]. In these studies, individuals with the same number of repeats at a given genomic location are considered to be closely related, while each additional repeat reflects a divergent step. The combined analysis of different microsatellites makes it possible to construct a genotypic fingerprint specific to each individual, which provides accurate information for tracing evolutionary events such as population bottleneck, migrations, expansions, etc. . .
The objective of this work was to establish a rapid, inexpensive and reliable method to characterize animals of the H. vulgaris strains used in the laboratory. To this end, we established a method that relies on PCR amplification of microsatellite sequences on a single polyp without DNA extraction or sequencing. We show that the analysis of microsatellite polymorphism in animals from either various wild-type strains or transgenic lines provides specific signatures that reliably distinguish strains of the H. vulgaris group. This barcoding method, now routinely applied in our laboratory, is efficient and well suited for large-scale studies.

Hydra strain collection
The wild-type strains used in this study were a kind gift from colleagues, Basel1 and AEP1 from B. Hobmayer (University of Innsbruck), Basel2, Hm-105 and Ho_CR from T. Holstein (University of Heidelberg), AEP2 from R. Steele (University of California), Ho_CS from H. Shimizu (National Institute of Genetics, Mishima) and Nicolet from a Geneva pond. The AEP transgenic lines that constitutively express GFP in their epithelial cells, either gastrodermal (endo-GFP) or epidermal (ecto-GFP), were produced by the Bosch Lab (University of Kiel) [19,47] and kindly provided to us. The AEP1 transgenic lines expressing the HyWnt3-2149::GFP construct (here named Wnt3::GFP) either in epidermal or gastrodermal epithelial cells were produced in-house with the HyWnt3-2149::GFP-HyAct:dsRed reporter construct kindly given by T. Holstein [48,49]. We also produced in the AEP2 strain the Q82-203 and Q82-293 lines by injecting early embryos with the HyActin:Q82-eGFP construct (QS, unpublished) following the original procedure [19]. All cultures were fed three times a week with freshly hatched Artemia and washed with Hydra Medium (HM) [24].

One-step preparation of macerate extracts
Live polyps were washed three times five minutes in distilled water. Then, single polyps were dissociated into 50 μL distilled water by energetically pipetting them up and down until there is no tissue left, and immediately transferred on ice. Alternatively, reg-16 polyps previously fixed in paraformaldehyde (PFA 4%) and stored in methanol for several months, were stepwise rehydrated and dissociated as indicated above. Cell density of each macerate was estimated by measuring the OD 600 using a NanoDrop One (Thermo Sientific). The DNA content and DNA purity were roughly estimated by measuring the absorbance of each sample at 230, 260 and 280 nm. To implement an efficient one-step PCR procedure, we selected three AEP2 polyps showing a regular size (about 4-6 mm long without the tentacles).

PCR amplification from macerate extracts
To test the efficiency of PCR amplification on macerate extracts, we used primers of the βactin gene ( of Top Taq Polymerase, Qiagen). Subsequently we used 5 μL out of 50 μL macerate extract to amplify the mitochondrial cytochrome C oxidase I (COI) gene, the mitochondrial 16S ribosomal DNA (16S) and the microsatellite regions (ms) in each strain ( Table 1, S1 Table). Similar PCR conditions were used for all amplifications. Briefly, after an initial denaturation step at 94˚C for two minutes, samples were submitted to 30 cycles of (i) denaturation at 94˚C for 15 seconds, (ii) annealing at 52˚C for 30 seconds and (iii) a 30-60 seconds elongation step at 72˚C. The process was terminated by a final extension at 72˚C for 15 minutes. 10 μL PCR products were run on a 2.5% agarose gel at 120 V for two to three hours in the case of microsatellites, stained with ethidium bromide and revealed under UV-light.

Cloning and sequencing
For sequence validation, the PCR products were cloned using the pGEMT kit (Promega): 3 μl PCR products were ligated to 50 ng pGEMT vector in the presence of 3 units T4 ligase overnight at 18˚C (final volume 10 μL). Plasmidic DNA was integrated into competent DH5α E. coli and colonies were screened thanks to alpha-complementation. After overnight culture, plasmidic DNA was extracted using the CTAB procedure and sequenced using standard T7 primer at Microsynth (Basel, Switzerland). The number of colonies we sequenced and their origin (single or several animals) is indicated for each microsatellite sequence in S2 Table. Phylogenetic analyses The COI and 16S genes were selected for phylogenetic analyses. Corresponding DNA sequences were amplified by direct PCR amplification method as described above and sequenced (S1 Table). The obtained sequenced were aligned with the dataset previously produced by Martinez et al. [15] using the ClustalW function of BioEdit v7.2.6.1, and Maximum Likelihood phylogenetic trees were constructed with the PhyML 3.0 software (http://www. atgc-montpellier.fr/phyml/) applying the GTR substitution model [50]. The robustness of the nodes was tested by 1000 bootstraps.

One-step genomic amplification after quick mechanical tissue maceration
To bypass genomic and mitochondrial DNA extractions that are time-consuming and expensive when massively performed, we established a rapid animal dissociation in water that provides genomic DNA of sufficient quantity and quality for PCR reaction. We obtained an efficient PCR amplification of β-actin (193 bp) from macerate extracts, indicating that the application of a mechanical force to dissociate the tissues combined to the initial denaturation step of the PCR reaction suffice to release high quality genomic DNA and amplify sequences of interest (Fig 2A). Despite slight variations in band intensity, certainly reflecting the amount of starting material, the amplification remained highly efficient whatever the polyp size, the 260/ 280 and 260/230 OD value ratios and the template volume used here (yellow dots Fig 2B). For all subsequent experiments, we used one tenth of macerate extract as template for COI, 16S and microsatellite amplifications (Fig 2B). We also obtained efficient PCR amplification from macerate extracts prepared from animals fixed months or years earlier in paraformaldhehyde (PFA) and stored in methanol at -20˚C, especially for mitochondrial DNA amplification. This procedure thus allows us to gain genetic information from fresh as well as old samples.

Phylogenetic assignation of Hydra strains to the different groups and species
Next, we confirmed the assignation of each strain we acquired to one of the four  (Fig 3), which is not the case in the 16S analysis where the H. vulgaris group actually includes the H. brauerei and H. oligactis groups that thus do not appear monophyletic (S1 Fig).   (Fig 3). We found that the genomic 16S sequences of the two Hv_Basel strains are identical, while the mitochondrial COI sequences are different with 9 mismatches out of 657 bp (sequences obtained twice independently). Consequently, animals of these two cultures can be considered as representatives of two different strains, which we have named Basel1 and Basel2. In contrast, the COI and 16S sequences of AEP1 and AEP2 are identical, suggesting that they could represent a single strain.

Identification of three microsatellite regions in the Hydra vulgaris genome
We then analyzed some microsatellite sequences to test the conclusions obtained in the phylogenetic analyses and to establish a method for easy identification of strains belonging to the H. vulgaris group. To identify H. vulgaris genomic regions that contain microsatellites, we blasted two different tandem repeat motifs (TA) 15 and (CA) 15 against AEP transcriptomes available at the HydrATLAS web portal. We found three transcripts expressed by AEP polyps that encode repeats, the first one c25145_g1_i04 contains TG-repeats in its first intron (Fig 4, S2 Fig), the second c8134_g1_i1 encodes the Aryl-hydrocarbon receptor-Interacting Protein (AIP) and contains AT-repeats in its 5' untranslated region (UTR) (Fig 5, S5 Fig), and the third one (c21737_g1_i4) encodes the cyclin-D-binding Myb-like Transcription Factor 1 (DMTF1) and contains AT-repeats located in the 3'UTR (Fig 6, S6 Fig).
Next, we validated these sequences onto genomic and transcriptomic databases publicly available for Hm-105 on National Human Genome Research Institute (NHGRI) and Compagen. These three microsatellite regions were selected as they were retrieved from most databases and contained a variable number of microsatellite repeats between H. vulgaris-Pallas (Hm-105) and H. vulgaris-NA (AEP). We named these microsatellite regions ms-c25145, ms-AIP and ms-DMTF1 respectively; access to the corresponding transcriptomic and genomic sequences are given in S2 Table. The ms-c25145 polymorphism helps to discriminate between Hydra species and H. vulgaris strains The TG-rich ms-c25145 could be detected within two different Hm-105 genomic regions (Sc4wPfr_1246, Sc4wPfr_396 scaffolds) and the direct PCR approach efficiently amplified the ms-c25145 genomic sequences in seven strains (Hm-105, Basel1, Basel2, AEP1, AEP2, Ho_CS,  Ho_CR), but remained inefficient in the reg-16 strain (H. vulgaris group), possibly as a consequence of a lower quality of the genomic DNA that was produced from animals stored in methanol after PFA fixation. Even though the primers were designed for the H. vulgaris group, we succeeded to amplify this region in samples from Ho_CS and Ho_CR (H. oligactis group), whereas no amplification was observed from gDNA freshly prepared from Nicolet animals (H. viridissima group), possibly due to mismatches into primer regions (Fig 4A, S3 Fig).
The patterns obtained for ms-c25145 are quite different between Hm-105 (four bands), Basel1 (three bands) and Basel2 (single band), indicating that these strains can indeed be considered as distinct, in agreement with the results of the COI phylogeny (Fig 3). Concerning the AEP1 and AEP2 strains, the ms-c25145 patterns appear quite similar, with a main band about 216 bp long, and a smear of larger and less intense bands (Fig 4A, yellow arrows). This pattern is quite distinct from the sharp bands observed in Basel2. An intense band of similar size than in AEPs (218 bp) is observed for Ho_CS and Ho_CR as well as some weaker and longer amplicon (Fig 4A, orange arrows). In summary, ms-c25145 appears as an informative marker to distinguish Hm-105, Basel1 and Basel2 strains from each other, and from strains representative of the H. carnea, H. oligactis and H. viridissima species.
To confirm these results, we cloned the PCR products and randomly sequenced some colonies from at least two animals of each strain (for sequencing details see S2 Table), and we found the sequence size fully consistent with the observed size of the bands on the gels (Fig 4B  and 4C). Indeed, the lowest Basel1 PCR product is slightly shorter (205 bp) than the unique Basel2 PCR product (207 bp), whereas the two other Basel1 PCR products are 217 and 220 bp long. For Hm-105, we retrieved sequences for three PCR products out of the four observed on the gel, two corresponding to the shortest bands (192 and 203 bp) and one to the upper one (223-225 bp). In AEP samples, we retrieved multiples sequences with nucleotide polymorphism (AA instead of CA repeat) that correspond to the most abundant PCR product, ranging from 212 to 218 pb. Finally, sequencing results confirmed that the main PCR product observed in H. oligactis strains correspond to the 218 bp band, also found in AEPs. The sequencing data provided robust results regarding the number of TG-repeats of each sequence, i.e. 6, 13 and 17 in Hm-105, 7 and 13 in Basel1, 8 in Basel2, and 14 in the H. carnea and H. oligactis sequences.
We also analyzed the location of this ms-cv25145 microsatellite sequence within the ms-cv25145 gene: It appears intronic, located after the first exon, about 245 bp downstream to the 5' end (S2 Fig). In all Hydra strains where this microsatellite was detected, we actually also retrieved at least one isoform that does not contain the intronic region indicating that the unprocessed and the mature c25145 transcripts are rather stable. The c25145 gene encodes a putative evolutionarily-conserved protein with an unknown function as deduced from the alignment of the Hydra c25145 deduced protein product with related bilaterian sequences (S4 Fig). We found similarities in the N-terminal moiety (~100 first amino acids) with hypothetical proteins expressed by the sea cucumber Apostichopus japonicus [51], the arthropods Folsomia candida and Sipha flava (aphid), the mollusc Crassostrea gigas, the teleost fish Myripristis murdjan, Sinocyclocheilus rhinocerous or Danio rerio. Within this domain, a signature can be identified, formed of 37 residues, from which 32 are present in the Hydra protein (S4 Fig).

The ms-AIP polymorphism helps to identify H. vulgaris-Pallas and H. vulgaris-NA strains
The second microsatellite region (ms-AIP) is an AT-rich region located in the 5'UTR region of the gene encoding the Aryl-hydrocarbon (AH) receptor-Interacting Protein (S5 Fig). The polymorphism of ms-AIP is more restricted than that of ms-c25145, as we were unable to  (Hm-105, Basel1, Basel2 strains), H. vulgaris-NA (AEP1, AEP2 strains) and H. oligactis (Ho_CR, Ho_CS strains). Yellow arrows point to a smear detected in both AEP1 and AEP2, orange arrows point to a faint second band detected in both Ho_CS and Ho_CR. (B) Sequence alignment of the ms-c25145 region. The salmon-pink color box indicates the central TG-rich microsatellite region embedded within highly conserved regions (grey boxes). Primer sequences used for amplification are indicated with black arrows. Numbers in brackets after the strain name indicate the number of independent positive sequencings, numbers at the 3' end indicate the size of the PCR product and the number of TGrepeats (bold). Red writings indicate transcriptomic (t) or genomic (g) sequences available on the HydrATLAS (HA) server [32,40,41], the NHGRI Hydra web portal for the Hydra 2.0 genome (g2.0) [34] and Juliano transcriptomes (Jul) [38] or the Compagen (Co) server [37,42] (see S2 Table). (C) Graphical representation of the different ms-c25145 amplicons as deduced from sequencing data. Each dot corresponds to a distinct amplicon confirmed by one or several sequencings as indicated by the number of sequenced colonies (see S2 Table). Green, red and yellow color dots correspond to expected sizes, lighter color dots refer to sequences with errors (PCR or sequencing), the grey dot indicates missing data.  (Basel1, Basel2, Hm-105, reg-16) and H. vulgaris-NA (AEP1, AEP2). White arrows point to a faint band amplify these genomic sequences from the H. oligactis and H. viridissima strains (Fig 5A). Nevertheless, ms-AIP is useful to discriminate between the strains within the H. vulgaris group, i.e . Hm-105, reg-16, Basel1, Basel2, AEP1, AEP2. Two PCR products were obtained after genomic amplification from Hm-105 and AEP2 whereas a single PCR product was amplified from the other strains, with a specific size for each strain (Fig 5A).

Analysis of the polymorphism of the TG-rich microsatellite c25145 sequence (ms-c25145). (A) Amplification of the ms-c25145 genomic sequences from 7/9 tested strains that represent H. vulgaris-Pallas
The sequencing results mainly matched with the patterns detected by electrophoresis (Fig  5B and 5C), proving that distinct band sizes reflected stable strain-specific variations in both the length of the A-rich region and the number of AT-repeats. Indeed, two distinct batches of sequences were obtained for Hm-105 (199-200 and 229-234 pb; 13 and 29-33 AT-repeats respectively). The slight differences observed in the amplicon size among a given animal possibly resulted from polymerase slippage during the PCR process or from an altered sequencing process, as often observed in AT-rich regions (Fig 5B). In addition, the ms-AIP sequences obtained from Basel1, Basel2 and reg-16 are consistent with the 198, 206 and 199 pb long bands observed on the gels, corresponding to 11, 13 and 11 AT-repeats respectively. In contrast to ms-c25145, the analysis of the ms-AIP sequences helps distinguish between AEP1 and AEP2, since AEP2 shows two bands, 204 and 212 bp long corresponding to 16 and 22 ATrepeats, while only the lowest band is present in AEP1 (Fig 5A, orange arrow). As a consequence, we consider AEP1 and AEP2 as two distinct strains even though their COI and 16S sequences are identical (Fig 2). Since we were able to identify different patterns in the AEP1 and AEP2 strains, we also looked at the ms-AIP polymorphism in AEP transgenic lines (Fig  5D). The Q82-293 and ecto-GFP lines show the two-bands pattern found in AEP2 while the Wnt3::GFP, endo-GFP and Q82-203 lines show the same single-band pattern than AEP1. In summary, the analysis of the ms-AIP patterns are informative to identify and characterize strains of the H. vulgaris 1 species. In addition, in contrast to ms-c25145, ms-AIP provides a useful marker for the AEP strains and AEP transgenic lines.

The ms-DMTF1 microsatellite helps to discriminate between the AEP lines
The third microsatellite sequence (ms-DMTF1) is also AT-rich but located in the 3' UTR of the cyclin-D-binding Myb-Like transcription factor 1 gene (S6 Fig). The ms-DMTF1 primers were designed for H. vulgaris-NA strains and are thus only suitable for strains that belong to the H. vulgaris group (Fig 6A). Accordingly, they are useful to discriminate between animals of this H. vulgaris group. The analysis of the ms-DMTF1 polymorphism does not show variability between AEP1 and AEP2 but remains useful to distinguish the endo-GFP transgenic animals from all other AEPs (Fig 6). In fact, all the AEP strains and lines we tested here but the transgenic line endo-GFP, provide a two-band pattern, the lowest band being similar in size with the single one found in the endo-GFP (Fig 6B). By shot-gun sequencing of the PCRs products from different AEP animals, we found that the sequence of the upper band is 118 bp long (Fig  6C and 6D). In complement, the sequencing data obtained in the endo-GFP animals identified a PCR product that corresponds to 104 bp. Interestingly, available transcriptomes confirm the existence of both sequences (Fig 6C and S2 Table).
observed only in Hm-105 polyps, yellow arrows indicate a size difference between Basel1 and Basel2, and the orange arrows show a second band detected in AEP2 but not in AEP1. (B) Alignment of the ms-AIP sequences. The color boxes indicate the AT-rich central region (salmon-pink) and an A-rich motif (green) embedded within highly conserved regions (grey). Primer sequences used for amplification are indicated with black arrows. Numbers at the 3' end indicate the PCR product size and the number of AT-repeats (bold). Red writings indicate transcriptomic (t) or genomic (g) sequences available on HydrATLAS (HA) [32,40,41], NHGRI web portal for the Hydra 2.0 genome (g2.0) [34] and Juliano transcriptomes (Jul) [38], or Compagen (Co) server [37,42] (see S2 Table). (C) Graphical representation of the ms-AIP amplicons as deduced from sequencing data. Dot legend as in Fig 4. (D) Amplification of ms-AIP in five transgenic lines ecto-GFP and endo-GFP produced in uncharacterized AEP [42], AEP1_Wnt3 [49], AEP2_203 and AEP2_293 (QS, unpublished). https://doi.org/10.1371/journal.pone.0230547.g005

Comparative analysis of the information brought by microsatellite barcoding
To establish the respective barcode values of the ms-c25145, ms-AIP and ms-DMTF1 microsatellites (Fig 7), we compared the results obtained in the 36 strain/species pairs tested for each microsatellite. From the analysis of these three microsatellites we deduced four levels of information, (1) informative when the patterns are distinct between the two strains/species, (2) partially informative when microsatellite amplification is observed in one strain/species but not in the other, (3) or when the patterns obtained are identical between the two strains/species, (4) non-informative when amplification is not observed in either strain/species. Among these three microsatellites, ms-c25145 is the most informative as the only one amplified in three distinct groups (H. vulgaris-Pallas, H. vulgaris-NA, H. oligactis), providing a positive discrimination in 29 pairs (80.6%), either based on specific patterns as observed in 15 pairs (41.7%) or on an amplification restricted to a single strain/species in 14 pairs (38.9%). The ms-AIP is amplified in H. vulgaris-Pallas and H. vulgaris-NA, providing a positive discrimination in 30 pairs, based on specific patterns in only 12 pairs (40%) and on an amplification restricted to a single strain/species in 18 pairs (60%). Finally, ms-DMTF1 is only amplified in the AEP1 and AEP2 strains, providing a similar pattern in eight pairs, but a distinct one in some transgenic strains. We concluded that the approach presented here fulfilled our initial objective since it allowed us to properly characterize all strains of the H. vulgaris group used in  Factor 1 gene (ms-DMTF1). (A, B) Amplification of the ms-DMTF1 genomic sequence is restricted to the AEP strains, either unmodified (AEP1, AEP2) or transgenic (Q82-293, ecto-GFP, Wnt3::GFP, endo-GFP) lines. (C) Alignment of the ms-DMTF1 sequences. The color boxes indicate the AT-rich central region (salmon-pink) embedded within highly conserved regions (grey). Primer sequences used for amplification are indicated with black arrows. Numbers in brackets after the strain name indicate the number of independent positive sequencings, numbers at the 3' end indicate the size of the PCR product and the number of ATrepeats (bold). Red writings indicate transcriptomic (t) or genomic (g) sequences available on HydrATLAS (HA) server [32,40,41], NHGRI web portal for the Hydra 2.0 genome (g2.0) [34] and Juliano transcriptomes (Jul) [38], or Compagen (Co) server [37,42] (see S2 Table). (D) Graphical representation of the size of the ms-DMTF1 amplicons as deduced from sequencing data. Red color dots correspond to expected sizes, the grey dot indicates missing data.
https://doi.org/10.1371/journal.pone.0230547.g006 our laboratory, i.e. strains Hm-105, Basel1, Basel2 and reg-16 of the species H. vulgaris-Pallas as well as strains AEP1, APE2 of the species H. vulgaris-NA. By contrast, the phylogenetic approaches based on COI and 16S sequences to discriminate between strains had failed as the COI and 16S sequences were identical between some strains.

Analysis of speciation events in H. vulgaris based on the microsatellite signatures
Although the region surrounding the microsatellites sequences is quite conserved between all strains, we observed systematic differences between H. vulgaris-Pallas and H. vulgaris-NA strains in the organization of the amplified regions such as the TAGTCAAAGTAGTACA deletion in the upstream non-conserved region of ms-c25145 in H. vulgaris-Pallas strains (Fig 4B), or the size difference in the A-rich region in ms-AIP (Fig 5B). The conserved deletions in one of the two subgroups and the differences in the microsatellite motifs might suggest that the genetic flux between H. vulgaris-NA (AEP) and H. vulgaris 1 strains (Hm-101, Basel1, Basel2, reg-16) no longer exists. This result is compatible with the hypothesis that H. vulgaris-Pallas and H. vulgaris-NA can be considered as two cryptic species [52]. This hypothesis requires further confirmation such as the amplification of the ms-c25145, ms-AIP and additional microsatellite sequences from representative animals of the 14 hypothetical species reported by Schwentner and Bosch [16]. The acquisition of a genome for each sub-group would help to perform meta-analyses and analysis of single-nucleotide polymorphism to characterize the H. vulgaris species as recently done for the Ophioderma sea stars [53].

The direct dissociation of soft tissues provides quality templates for genomic PCR amplification
Genomic extractions for multiple samples as well as for population genetics studies can be rapid but costly when commercial kits are used, or time-consuming and risky when reagents

PLOS ONE
that are rather toxic to humans and/or the environment are used (e.g. guanidium thiocyanate, β-mercaptoethanol). For these reasons, a procedure using a simple buffer containing proteinase K has previously been established for efficient DNA extraction from individual Hydra polyps [15]. Here, we have simplified this procedure to completely bypass the genomic extraction step and to use directly as PCR substrate dissociated Hydra tissues that we call "macerate extracts". The rapid, inexpensive and highly reproducible single-step protocol is based on the mechanical dissociation of the tissues, which reliably allows the PCR amplification of mitochondrial and nuclear DNA. This procedure is now commonly used in our laboratory, not only to amplify microsatellite sequences and detect in Hydra cultures suspected contamination between strains, but also to amplify genomic sequences of genes of interest for directly sequencing or insertion into plasmid vectors. We also successfully applied this procedure to PFA-fixed Hydra tissues as reported above, as well as poriferan larvae (e.g. Oscarella lobularis, not shown). Nevertheless, if this direct DNA amplification from fixed animals provides a genomic DNA from similar quality that that obtained from fresh animals, it is quantitatively less efficient. In summary, this protocol can be effectively applied to soft tissues from any developing or adult organisms, especially when small amounts of tissue are available.

Systematize characterization of Hydra strains to improve data reproducibility
The microsatellite barcoding approach reported here offers a series of important advantages in that it is (i) sensitive, detecting a 2 bp shift in amplicon size, (ii) simple, requiring no chemicals or materials other than those used in ordinary PCR as in conventional barcoding approaches, (iii) fast, with data being acquired in less than a day, (iv) robust as it provides reproducible results, with 100% specific PCR amplification when primer sequences are evolutionarily-conserved. The immediate use of macerate extracts could be a possible limitation of this procedure. Indeed, we did not test the quality of these macerate extracts after their storage in a frozen state, assuming that nucleic acid degradation would occur. Nevertheless, we were able to amplify genomic DNA obtained after mechanical dissociation from fixed animal samples, implying that fresh material is not an absolute requirement.
In the context of life sciences where reproducibility can be a challenge [54,55], the development of tools to properly characterize the animals we work with appears to be a cornerstone towards more effective research. Indeed, Hydra laboratories use a wide variety of strains that are known to respond differently to chemical treatments or show variable sensitivity to gene expression silencing by RNAi. This procedure opens up the possibility of conducting blind clonal culture experiments, where the sensitivity of different strains to toxic substances, environmental stresses such as temperature changes can be compared. Indeed, as the microsatellite barcode procedure can be easily replicated on batches of unique polyps, it represents a major asset for discriminating among phenotypically similar polyps those that are genetically different, and vice versa. For novel unknown strains, it might be necessary to first identify additional microsatellite regions.

Possible mechanisms explaining the strain-specific variations observed in Hydra microsatellite sequences
Karyotyping on Hm-105 revealed that Hydra are diploid animals (2n = 30) [56]. It is therefore not surprising to observe either a single band or more frequently the same band completed by a second band, reflecting the homozygous versus heterozygous status of a given animal respectively. On the other hand, we interpret the differences in band size observed in animals of different strains as different alleles. Nevertheless, we have clearly observed and sequenced more than two different bands in the same polyp (see ms-c25145 in Hm-105 and Basel1). As mentioned above, the ms-c25145 primers we have designed can amplify two different regions of the Hm-105 genome (Sc4wPen_1246, Sc4wPen_396), which explains why four bands can be observed in this strain (twice two alleles). The most parsimonious scenario would be that these two regions result from a recent single gene duplication that occurred in the common ancestor of the Hm-105 and Basel1 strains, without affecting the other strains tested here where only one copy is detected.
The microsatellite barcoding might also reveal some genetic mosaicism, as suspected from the four-band and three-band patterns observed for ms-c25145. Genetic mosaicism is defined as genetic variations acquired post-zygotically in cells of an individual developed from a single zygote, a phenomena frequently observed in plants and clonal animals as well as in humans [57,58]. In clonal animals as cnidarians, the segregation of germ cells does not occur during early embryonic development and mutations affecting somatic cells as well as germ cells can accumulate over the multiple divisions of multipotent stem cells. In Hydra, beside the interstitial stem cell population that can transiently provide germ cells, the two epithelial stem cell populations also continuously cycle over the lifetime of the animal, potentially accumulating somatic mutations independently. This mechanism provides the opportunity for additional genetic variations within the same animal as observed in leaf cells [59].

Conclusion
With this study, we implemented a powerful barcoding approach based on microsatellite polymorphism for strains belonging to the H. vulgaris group. The use of this approach should enhance the reproducibility of experiments conducted in different laboratories by allowing the correct identification of each strain, including the AEP transgenic lines, in order to conduct unbiased experiments on well-characterized polyps. Data obtained on six wild-type strains belonging to the main Hydra species used in experimental biology, namely H. vulgaris-Pallas and H. vulgaris-NA, tend to confirm that the H. vulgaris group actually covers a set of cryptic species rather than a single one. We believe that microsatellite polymorphism analysis can help discover speciation events, thus representing a complementary approach to phylogenetic analyses aimed at identifying Hydra species.
Supporting information S1