Kinship and Social Behavior of Lowland Tapirs (Tapirus terrestris) in a Central Amazon Landscape

We tested the hypothesis that tapirs tolerate individuals from adjacent and overlapping home ranges if they are related. We obtained genetic data from fecal samples collected in the Balbina reservoir landscape, central Amazon. Samples were genotyped at 14 microsatellite loci, of which five produced high quality informative genotypes. Based on an analysis of 32 individuals, we inferred a single panmictic population with high levels of heterozygosity. Kinship analysis identified 10 pairs of full siblings or parent-offspring, 10 pairs of half siblings and 25 unrelated pairs. In 10 cases, the related individuals were situated on opposite margins of the reservoir, suggesting that tapirs are capable of crossing the main river, even after damming. The polygamous model was the most likely mating system for Tapirus terrestris. Moran's I index of allele sharing between pairs of individuals geographically close (<3 km) was similar to that observed between individual pairs at larger distances (>3 km). Confirming this result, the related individuals were not geographically closer than unrelated ones (W = 188.5; p = 0.339). Thus, we found no evidence of a preference for being close to relatives and observed a tendency for dispersal. The small importance of relatedness in determining spatial distribution of individuals is unusual in mammals, but not unheard of. Finally, non-invasive sampling allowed efficient access to the genetic data, despite the warm and humid climate of the Amazon, which accelerates DNA degradation.


Introduction
The ability of individuals to change their behavior based on the recognition of kin is an important characteristic in the evolution of mammalian social systems [1,2]. For instance, individuals can form philopatric social groups based on kinship or disperse from the natal home range. The way individuals behave with respect to related individuals will affect how genetic diversity is distributed in space [3]. Building an understanding of how different species behave towards kin is crucial to our investigation of the evolutionary causes of mammalian social behavior [4].
A common social outcome mediated by kin recognition in mammals is the formation of philopatric social groups, in which a number of closely related individuals remain together at or near their natal site and display cooperative behavior [5]. Surprisingly, and so far little explored, philopatric social groups are found not only in gregarious species but also in solitary ones [6][7][8]. In solitary species, while there is a significant overlap between home ranges, individuals perform their daily activities alone [6]. This helps explain why such behavior has attracted little attention, as direct interactions occur infrequently, thus making it difficult to carry out observational studies. Observational studies are often further complicated by the structure of the environment and the species' activity pattern, which can further decrease the probability of observing interactions in situ.
To overcome the issues with direct observation, indirect approaches based on home range overlap have been used as a measure of sociality in solitary species [9]. Nevertheless, this still requires capturing, radio-collaring, and monitoring several individuals over a suitable time scale. However, large mammals are not easy to capture. An alternative approach is provided by analysis of genetic data obtained from non-invasively collected samples. Such samples can be used to identify individuals and infer the degree of relatedness among them [10]. By examining the spatial distribution of relatedness we can obtain valuable insights into animal behavior, which do not require capture and manipulation of animals [11][12][13].
Here, we present genetic data on the lowland tapir (Tapirus terrestris), a large solitary mammal, in order to test the hypothesis that lowland tapirs exhibit social behavior based on relatedness. The little we know about social behavior in this species comes from observation of overlap in home ranges, radio-collar data, and anecdotal accounts. Substantial home range overlap between individual tapirs has been found in a number of studies encompassing a range of different biomes [14][15][16]. Medici [16] did not find significant differences in percentage home range overlap across the three possible gender pairs: 43.2% for male-male, 33.4% for female-female and 34.9% for male-female range overlap. The extent of home range overlap suggests that T. terrestris may display some sort of social behavior (as defined in Waser & Jones [6]).
In addition to home range overlap, indications that tapirs display territorial behavior are provided by movement data and exclusion behavior. Tobler [15] found that individuals regularly walked along the borders of their ranges, possibly monitoring a territory. With respect to exclusion behavior, resident T. bairdii individuals were observed attacking newly translocated individuals, suggesting territorial defense [17]. Tapirs may also use latrines as a way of marking territory boundaries, a common behavior among mammals [18,19] (but see Ralls [20] and Rostain et al. [21] for alternative explanations for latrine use). The evidence of territorial behavior and home range overlap suggests that tapirs can recognize different individuals, reinforcing the possibility that this species exhibits social behavior.
Mating systems are also often intimately associated with social behavior and may influence the degree of territoriality in a species [6]. Currently, we lack data on the tapir's mating system. Observations reported by C. R. Foerster in studies of T. bairdii indicate that tapirs are likely facultative polygynous [16]. Overlap in territory among related females is expected under polygyny [6]. Hence, polygyny can lead to increased spatial autocorrelation in genes at small spatial scales relative to broader spatial scales [22].
The apparent capacity to change behavior based on individual recognition led us to suspect that kin recognition may influence patterns of interactions in T. terrestris. We thus hypothesized that tapirs tolerate individuals from adjacent and overlapping home ranges if they are related. Based on this hypothesis we expect to find pairs of related individuals geographically closer than pairs of unrelated individuals. This hypothesis was proposed by Medici [16], based on both personal communications with C. R. Foerster, and the study of Tobler [15], in which a male and female, likely sibs, were observed sharing their parent's home range. According to C. R. Foerster [16], tapirs form family units, in which there are extensive home range overlaps between related individuals and non-related individuals are excluded. Our study is the first, to our knowledge, to test this hypothesis using genetic information.
Thus, the objectives of the present study were: (1) to analyze the spatial distribution of related individuals of T. terrestris; and (2) to determine the species' mating system.

Study area
The Balbina hydroelectric dam was flooded in 1987 and is located approximately 150 km north of the city of Manaus (Amazonas state, Brazil). Due to the flat topography, the reservoir has extended over 2360 km², creating over 3500 islands [23]. To offset the environmental impact caused by the dam the Brazilian government created the Uatumã Biological Reserve in 1990 (0°50' to 1°55' S, 58°50' to 60°10' W). The Uatumã reserve is predominantly composed of continuous forest and its buffer zone includes the lake and island formations. The area is dominated by dense tropical rainforest with an average tree height of 30 meters [23].
Preliminary surveys conducted in the islands of the Uatumã Biological Reserve suggested a high density of tapirs (M. Benchimol personal communication). This was in stark contrast to our previous experience in continuous forest in the Jaú National Park (AGS), where tapir densities were low and dung samples were rare. The observation of greater densities of tapirs in seemingly disturbed habitat is not uncommon [24]. Also, the distribution across islands facilitates fieldwork, as a larger area can be covered by boat than what would be feasible on foot in the jungle. Furthermore, the footprints in the margin of the islands were important indicators of the presence of tapirs and, therefore, provided ample opportunities to collect dung.
The ability of tapirs to swim, the relatively small distance between adjacent islands (see Figure 1), the high tapir density and the logistical efficiency suggested that it would be more feasible to study tapir behavior on the islands than in areas of contiguous forest. Therefore we selected groups of islands (islands in close proximity to one another) to be sampled intensely in order to capture all tapirs, and ascertain that we have described all possible local relatedness connections. To cover a greater portion of the reservoir and have a representative sampling, we surveyed several groups of adjacent islands in different areas of the reservoir (Figure 1). We used Landsat TM5 satellite images from 2008 analyzed with ArcGIS 9.3 [25] to assist in the selection of target islands. A total of 48 islands were visited over a period of 55 days predominantly in the dry seasons of 2009 and 2010. Sampling was carried out under permit no. 21320-1 issued by the Instituto Chico Mendes de Conservação da Biodiversidade (ICMBio/MMA).

Sample collection
A small aluminum boat with an outboard engine was used to circle the islands in search for recent signs of tapir activity and feces in the water. The number and age of footprints on the margin were determinants in the choice of landing sites for each island. We found a positive relationship between footprint density and the possibility of encountering latrines. Each landing lasted one hour, with three researchers actively looking for latrines or feces.
We only collected feces with a short exposure time (1 to 5 days). Disposable scalpels were used to scrape off approximately 1 ml of the fecal pellet surface. The sample was then placed in a Falcon tube (15 ml) containing 5 ml of Longmire buffer [26] or 10 ml of absolute ethanol. Sample and preservation buffer were thoroughly mixed by inversion to ensure a uniform mixture. The samples were maintained at ambient temperature throughout the fieldwork (maximum of 15 days) and preserved at −80°C in the laboratory.
DNA was isolated using the QIAamp DNA Stool Mini Kit (QIAGEN), following the manufacturer's protocol with the following modifications: (1) the initial amount of the sample was increased to 500 µl and (2) the final elution volume was decreased to 40 µl, as recommended by other authors [27,28].
We used microsatellite markers previously developed for T. terrestris. Fourteen microsatellite loci were tested on blood samples from two captive tapirs held at the Mantenedor da Fauna Cariuá facility (Cadastro Técnico Federal-National Registry Number: 671958). Blood was drawn during scheduled veterinary care procedures by a registered vet (Laerzio Chiesorin Neto, CRMV 0284/AM) following standard procedures approved by the Brazilian Regional Veterinary Council, and the IUCN/SSC Tapir Specialist Group's Veterinary Committee (http://www.tapirs.org). The Animal Ethics Committee at INPA does not require prior approval to conduct sampling if it is deemed 'prophylactic or for other veterinary care'. All care was taken to ensure that no animals suffered during the development of this study.
Non-invasive samples are generally characterized by high rates of genotyping and amplification errors [34][35][36]. To ensure we only used high quality genotypes in analyses we used the multiple-tube approach [37]. In the multiple-tube approach, the genotype at a locus is determined by consistent observation of alleles across multiple PCRs. The exact number of PCRs largely depends on the amount and quality of DNA that can be obtained from the fecal samples, and the resources available to repeat PCR reactions. Taberlet et al. [37] suggest that an initial three positive PCRs be performed, and if ambiguity persists, another four PCRs should be carried out.
We modified the Taberlet et al. [37] approach in order to fit with our budget, sample, and laboratory constraints. We included in the final dataset only genotypes that were observed at least three times in a maximum of seven PCRs per marker per sample. Three positive PCRs with consistent genotypes was the minimum required by Taberlet et al. [37] to achieve 99% confidence in the observed genotype when the genotype is heterozygous. We also propose that the mislabeling of heterozygous individuals as homozygous was not a significant error in our dataset because we: (1) did not detect departures from Hardy-Weinberg proportions and null alleles; (2) re-captured genotypes in close geographic proximity; and (3) observed and expected heterozygosities did not differ significantly from other studies (see Results section).
We found that DNA degraded rapidly after isolation, thus, to obtain three consistent amplifications, an average of three extractions were needed per sample in order to obtain sufficient DNA for all the PCRs. Extractions were not pooled, but instead were performed as needed following PCR amplification failures. Genotypes from samples that did not amplify in five consecutive PCRs per primer were discarded.
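The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as it was actually run in the laboratory: the function name `consensus_genotype`, the representation of a PCR result as a sorted tuple of allele sizes (or `None` for a failed PCR), and the reading of "observed at least three times" as three identical genotype calls are all assumptions made here.

```python
from collections import Counter

MIN_CONSISTENT = 3          # identical genotype calls required (from the text)
MAX_PCRS = 7                # maximum PCRs per marker per sample (from the text)
MAX_CONSECUTIVE_FAILS = 5   # discard after this many failed PCRs in a row (from the text)


def consensus_genotype(pcr_calls):
    """Return the accepted genotype for one sample at one locus, or None.

    pcr_calls: PCR results in the order they were run. Each entry is a sorted
    tuple of allele sizes, e.g. (152, 156), or None for a failed amplification
    (hypothetical data layout).
    """
    fails_in_a_row = 0
    counts = Counter()
    for call in pcr_calls[:MAX_PCRS]:
        if call is None:
            fails_in_a_row += 1
            if fails_in_a_row >= MAX_CONSECUTIVE_FAILS:
                return None  # sample/locus discarded
            continue
        fails_in_a_row = 0
        counts[call] += 1
        if counts[call] >= MIN_CONSISTENT:
            return call  # genotype seen three times: accept
    return None  # no genotype reached three consistent observations
```

A locus that yields two competing genotypes twice each within the PCR limit is left unresolved (`None`) rather than forced to a call, which mirrors the conservative intent of the multiple-tube approach.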
The DNA extraction, PCR preparation and PCR handling prior to genotyping were all performed in different laboratories in order to avoid contamination. Sample preparation for genotyping was the only step carried out in a laboratory in which samples from other animals were present. PCR mixtures were prepared in a PCR hood sterilized with ultraviolet light. Positive and negative controls were included as part of each PCR batch and negative controls were included in the DNA extraction step. If the negative control was positive in the PCR, the control was genotyped and, if successful, the samples with similar alleles to the negative control were discarded.

Data analysis
The genotypes were analyzed for the presence of null alleles, allelic dropout and stuttering using the program MICRO-CHECKER v2.2 [38]. To measure the statistical power of the primer set for individual identification we calculated the probability of identity (P(ID)), which is the probability that two non-related individuals have the same genotype in a population [39]. The P(ID)unbiased and the P(ID)sib were calculated using GIMLET v1.3.3 [40], correcting for the number of sampled individuals and the possibility of sampling related individuals [41]. ARLEQUIN v3.5 [42] was used to estimate heterozygosity and test for deviations from Hardy-Weinberg proportions and linkage equilibrium. Bonferroni correction [43] was applied to adjust statistical significance across multiple tests.
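To make the probability of identity concrete, the following Python sketch computes the widely used theoretical expressions for a single locus from its allele frequencies and multiplies them across loci. It is not a reimplementation of GIMLET: the sample-size-corrected estimator (P(ID)unbiased) is deliberately left out, the function names are hypothetical, and multiplying across loci assumes the loci are independent.

```python
from itertools import combinations
from math import prod


def pid_locus(freqs):
    """Theoretical P(ID) at one locus: chance that two unrelated individuals
    drawn from a Hardy-Weinberg population share the same genotype."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def pid_sib_locus(freqs):
    """Theoretical P(ID) between full siblings at one locus."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def multilocus(per_locus_values):
    """Combine per-locus probabilities, assuming independent loci."""
    return prod(per_locus_values)
```

For example, `multilocus(pid_sib_locus(f) for f in panel)` gives the sibling probability of identity for a panel, where `panel` is a list of per-locus allele frequency lists (illustrative input, not the study data).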
To test our hypothesis it was necessary to first establish the number of distinct genetic units sampled in the Balbina reservoir. The number of units was estimated using STRUCTURE v2.2 [44]. We allowed for the possibility of admixture and considered the allelic frequencies correlated between genetic units. One to four populations (K) were tested and a priori we considered all K to be equally likely. We did not use location information to inform the prior (i.e., uninformative prior). For each K, we ran 10 chains each with 10⁶ iterations, with the first 10⁵ iterations discarded as burn-in. The most likely K was inferred by maximizing the log-likelihood of the data given K. Convergence of the MCMC was assessed by visual inspection of chains within STRUCTURE, and by comparing results across multiple runs of STRUCTURE. We were satisfied that convergence was achieved when we did not observe any trends in the chains, and that results across chains were largely comparable. It is hard to assess with complete confidence that convergence has been attained, but it is usually easy to determine that convergence has not been reached [45]. In our approach, we reduce the uncertainty about the effect of the starting values on the final outcome, and we minimize the risk that inferences are being drawn on MCMC that have not yet reached stationarity [46].
We used ARLEQUIN to estimate gene diversity, and estimate F_ST and F_IS values [47] based on the analysis of molecular variance (AMOVA) between individuals on the eastern (n = 18) and western (n = 12) banks of the reservoir. We chose this grouping because we believe that the lake is the greatest potential barrier to dispersal in the reservoir landscape. Individuals that did not clearly belong to either margin were excluded from this analysis (n = 2).
We also estimated the effective population size (N_e) using the program MIGRATE-N v2.1 [48]. We used Bayesian inference and maximum likelihood to estimate the parameter θ, which was subsequently converted into coalescent effective population size using the formula θ = 4N_eμ. Since there is no estimate of microsatellite mutation rate (μ) for any of the species of tapirs, we considered a range of mutation rates from 1 × 10⁻⁴ to 5 × 10⁻⁴. These mutation rates encompass estimates used in mammalian studies [49][50][51].
To infer θ using maximum likelihood we ran 10 short chains, sampled each chain 5 × 10⁴ times and recorded 500 genealogies. We then ran three long chains, sampled each chain 1 × 10⁶ times, and discarded the first 1 × 10⁵ samples as burn-in. In the Bayesian inference analysis, we ran one long chain, which was sampled 5 × 10⁶ times and recorded every 100th genealogy. Searches were replicated 10 times. Search of genealogy space was improved via adaptive swapping among chains.
Estimates of relatedness between pairs of individuals can be highly variable, and different relatedness estimators will have distinct behaviors for any given dataset and particular relatedness category [52,53]. In order to investigate the properties of different estimators given the observed allele frequencies in our dataset, we used COANCESTRY v1.0 [54] to simulate 100 pairs of individuals in each of the four major relatedness categories (PO: parent-offspring; FS: full-sibs; HS: half-sibs; and UN: unrelated). Because of the difficulty in separating PO and FS pairs, we grouped these two categories into a single first-order (FO) relationship category. We then used COLONY v2.0 [55], KINGROUP v2.08 [56] and IDENTIX v1.1 [57] to classify the simulated pairs into relatedness categories. We used the results from our simulated pairs as a training set in order to set expectations about classifying pairs of samples in our dataset.
In COLONY, due to the lack of data on sex and age of the individuals, the same individuals were set as possible candidates for siblings, bulls and cows. We used a prior probability of 0.5 that at least one true cow or bull was present in the dataset and accepted only relationships with a greater than 50% probability of belonging to a relatedness class. COLONY uses the information about the mating systems to perform the classification of pairs into relatedness classes. We estimated the likelihood of the data given three different breeding systems: (1) monogamy; (2) polygyny or polyandry; and (3) polygamy. We used Bayes factors [58] to identify which mating model had highest posterior support given the available data. Bayes factor values less than −2.0 (log10 scale) were considered an indication of a significantly better fit of the more complex model to the observed data [58].
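The decision rule for comparing mating-system models can be illustrated with a short Python sketch. Several things here are assumptions rather than details given in the text: that the Bayes factor is expressed as the simpler model over the more complex one (so that negative values favor the complex model), that the inputs are natural-log (marginal) likelihoods, and the function names themselves. No COLONY output format is reproduced.

```python
import math


def log10_bayes_factor(lnl_simple, lnl_complex):
    """log10 Bayes factor of the simpler model relative to the more complex one.

    Inputs are natural-log likelihoods (assumed to stand in for marginal
    likelihoods); negative results favor the more complex model.
    """
    return (lnl_simple - lnl_complex) / math.log(10)


def favors_complex(lnl_simple, lnl_complex, threshold=-2.0):
    """Apply the cut-off stated in the text: log10 BF below -2.0."""
    return log10_bayes_factor(lnl_simple, lnl_complex) < threshold
```

Under this convention, a difference of about 4.6 natural-log units between two models corresponds to the −2.0 threshold on the log10 scale.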
The relatedness index (r) of Lynch and Ritland (r_LR99 [59]) and Queller and Goodnight (r_QG89 [60]) were estimated with KINGROUP and IDENTIX. We used IDENTIX to estimate the 95% confidence interval for each pairwise r by bootstrapping. We used KINGROUP to test relationship hypotheses against more than one null hypothesis using likelihood ratio tests. In other words, we asked what is the likelihood odds ratio of a pair being PO given that FS, HS and UN are null hypotheses.
The results of the analyses for the simulated pairs were checked in R [61]. We calculated the proportion of unrelated pairs being classified as related (type I error), the number of related pairs being classified as unrelated (type II error), the proportion of first order relationships pairs (PO and FS) being misclassified as something else (misFO) and the proportion of UN and HS pairs being misclassified as first order relatives (misHS/UN). To assist the classification of some relationships in the tapir dataset, for each relatedness category we calculated the mean number of loci for which at least one allele was shared between a pair, and the mean number of alleles shared between individuals in a pair (see Table S1 for more details).
The classification of the simulated pairs suggested that COLONY performed poorly with our dataset (see Results section). Thus, we combined the results of KINGROUP, IDENTIX, and allele sharing patterns in order to produce a final classification for each pair in our dataset. Based on the simulation results, we took the following conservative approach to classify individual pairs into relatedness categories:
(1) Based on the observed confidence interval surrounding an individual pair's r_LR99 and r_QG89, we classified pairs as:
(2) We accepted the likelihood ratio test with the lowest p-value among the following hypotheses comparisons:
• Parent-offspring vs full-sibs and unrelated;
• Full-sibs vs half-sibs and unrelated;
• Half-sibs vs cousins and unrelated;
• Cousins vs unrelated, and;
• Unrelated vs parent-offspring, full sibs, half sibs and cousins.
(3) We classified pairs as FO if they shared ≥7 alleles at ≥0.8 of the loci.
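Criterion (3) lends itself to a small worked example. The Python sketch below counts shared alleles between two multilocus genotypes and applies the two thresholds. How a "shared allele" is counted is an assumption made here (the per-locus multiset intersection, giving 0, 1 or 2), as are the function names and the convention of using `None` for a locus that did not amplify.

```python
from collections import Counter


def allele_sharing(geno_a, geno_b):
    """Return (total shared alleles, proportion of compared loci sharing >= 1 allele).

    geno_a, geno_b: lists of per-locus allele pairs, e.g. [(152, 156), ...],
    with None for a missing locus (hypothetical data layout).
    """
    shared_alleles = 0
    loci_with_sharing = 0
    loci_compared = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue  # skip loci missing in either individual
        loci_compared += 1
        n_shared = sum((Counter(a) & Counter(b)).values())  # 0, 1 or 2
        shared_alleles += n_shared
        if n_shared > 0:
            loci_with_sharing += 1
    proportion = loci_with_sharing / loci_compared if loci_compared else 0.0
    return shared_alleles, proportion


def is_first_order(geno_a, geno_b, min_alleles=7, min_loci_prop=0.8):
    """Apply the allele-sharing criterion: >= 7 alleles at >= 0.8 of the loci."""
    shared, proportion = allele_sharing(geno_a, geno_b)
    return shared >= min_alleles and proportion >= min_loci_prop
```

With a five-locus panel, the 0.8 threshold corresponds to sharing at least one allele at four of the five loci.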
The results from our simulations found a large overlap among confidence intervals between half-sib (HS) pairs and first-order relatives (FO; parent-offspring or full-sibs). Thus, we created a HSFO category, which groups individuals that are likely related but we are unsure to what degree. As there were more FO relatives in the interval 0.25 ≤ r < 0.5 than in the interval 0.125 ≤ r < 0.25, we decided to create two subgroups of HSFO. This allowed us to create a finer gradient for classification of relatedness than a single HSFO category ranging over 0.125 ≤ r < 0.5 would allow.
The relationships in the tapir dataset that could be assigned to a class were used to test for associations with geographic distances. Pairwise geographic distances were estimated as the Euclidean distance between individuals, and were based on geographic coordinates recorded for each sample (collected with a Garmin GPSMAP 60CSx). We tested if the geographic distances of related individuals (classified as first-order or half sibs) were smaller than the distance between unrelated ones using a Mann-Whitney U test. We also performed a Mantel test to test for an association between observed r values and geographic distances. Both tests were carried out in R [61]. For the Mantel test, we used the ncf package [62].
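The distance comparison can be sketched as follows. The analysis itself was run in R; this Python version is only an illustration that computes Euclidean distances and the Mann-Whitney U statistic with midranks for ties (the quantity R labels W), without a p-value. The function names are hypothetical, and coordinates are assumed to be already projected to planar units such as kilometers.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two (x, y) points in projected units."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def mann_whitney_u(x, y):
    """U statistic for sample x: rank sum of x minus n1 * (n1 + 1) / 2.

    Tied values receive the average (mid) rank.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i, n = 0, len(pooled)
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        n_from_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += midrank * n_from_x
        i = j
    n1 = len(x)
    return rank_sum_x - n1 * (n1 + 1) / 2.0
```

Here `x` would hold the pairwise distances of related pairs and `y` those of unrelated pairs; small values of U indicate that related pairs tend to be closer together.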
Finally, Moran's I, an index of spatial autocorrelation [63], was estimated based on allele sharing at two scales: individual and landscape. Based on the mean home range described for T. terrestris [14][15][16], we assumed that samples distanced less than three km were deposited by individuals that likely have overlapping home ranges; we called this the individual scale. The landscape scale consisted of pair comparisons between samples separated by more than three km. We estimated 95% confidence intervals around estimates of Moran's I by bootstrapping individuals across both scales; this analysis was performed in SPAGeDi v1.3 [64].
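As an illustration of the index, the sketch below computes the generic form of Moran's I for one numeric variable per individual, using binary weights for pairs closer than a distance threshold. It is a simplification and not the estimator SPAGeDi applies to allele data across loci; the function name, the single-variable input and the planar coordinates in kilometers are assumptions made for the example.

```python
import math


def morans_i(values, coords, max_dist_km=3.0):
    """Generic Moran's I with weight 1 for pairs closer than max_dist_km.

    values: one numeric value per individual (e.g. an allele indicator).
    coords: (x, y) positions in km, same order as values.
    Returns None when no pair falls in the distance class or values are constant.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    num = 0.0
    w_total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if math.dist(coords[i], coords[j]) < max_dist_km:
                num += dev[i] * dev[j]
                w_total += 1.0
    if w_total == 0 or denom == 0:
        return None
    return (n / w_total) * (num / denom)
```

The landscape-scale counterpart would use the complementary weights (pairs separated by more than the threshold) in the same formula.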

Results
Eleven of the 14 microsatellite markers amplified in the blood samples. One of these markers was monomorphic (Tter18) and five were sensitive to the low quality of DNA from fecal samples (TtGT070, Tte01, Tter13, Tter14, Tter9), resulting in either non-amplification or difficult-to-interpret electropherograms. The remaining five loci were used in kinship and population analyses, and formed our genotyping panel. No null alleles, allelic dropout, genotyping errors, linkage disequilibrium or deviations from Hardy-Weinberg proportions were detected in any of the five loci of the genotyping panel.
In spite of being able to genotype only five loci, these were sufficiently informative to discriminate individuals: we estimated a P(ID)unbiased of 2.25 × 10⁻⁶ and a P(ID)sib of 9.67 × 10⁻³, which are considered sufficiently stringent for conservation purposes (less than 0.01 [41]). Mean observed heterozygosity was 0.7721 and allelic diversity was 6.6 alleles/locus (Table 1).
Approximately 1000 fecal samples were found, but only 63 were considered sufficiently fresh to sample. Among the samples collected for laboratory analysis, 10 amplified across three loci, two amplified across four loci and 24 amplified across all five loci. Only 20 genotypes were unique across the five loci, and four genotypes were repeated once across samples. This suggests that feces for each of four individuals were collected twice. The samples with replicate genotypes were collected within a two-day interval and were separated by 630, 505, 400 and 150 m.
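Identifying repeated genotypes amounts to grouping samples by their multilocus genotype. The Python sketch below shows one way to do this; the function name and the data layout (a dictionary from sample identifier to a tuple of per-locus allele pairs) are assumptions, and it performs exact matching only, with no allowance for genotyping error or missing loci.

```python
from collections import defaultdict


def group_by_genotype(samples):
    """Group sample IDs that carry identical multilocus genotypes.

    samples: dict mapping sample ID -> tuple of per-locus allele pairs.
    Allele order within a locus is ignored. Returns a list of ID lists;
    groups with more than one ID are candidate recaptures.
    """
    groups = defaultdict(list)
    for sample_id, genotype in samples.items():
        key = tuple(tuple(sorted(locus)) for locus in genotype)
        groups[key].append(sample_id)
    return list(groups.values())
```

Each group is then treated as one individual, and the field coordinates and dates of the samples within a group can be compared to judge whether a recapture is plausible.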
Samples collected in water were useful for genetic analysis: 58% (14) of the samples that amplified at five loci were collected in water. Rapid degradation of the extracted DNA was observed for all samples even when kept at −80°C, with amplification failing approximately 15 days after extraction.
We used the 32 unique genotypes (with a minimum of three loci) for population analysis. In STRUCTURE, the maximum marginal log-likelihood of the data given K (logL(D|K)) was found for K = 1. This suggests the presence of a single genetic unit in the study area, which is corroborated by the AMOVA results between opposite margins of the reservoir. Most of the genetic variance was contained within each margin rather than between margins. This resulted in low F_ST (0.008) and F_IS (−0.011) values (both p > 0.05). The gene diversity index for the population of the Balbina reservoir was 0.6634 ± 0.4207.
As shown in Table 2, errors surrounding estimates of r, the relatedness index, in the programs KINGROUP and IDENTIX were similar, and the type I errors were relatively high (0.32). The full-pedigree likelihood method implemented in COLONY was extremely conservative, with a great number of first-order pairs being misclassified (misFO = 0.94). The small sample size, marker number and the lack of information about the individuals (sex and age) probably affected the classifications made by COLONY. Also, COLONY does not identify UN pairs, thus we considered all unclassified pairs as UN; this has likely inflated our COLONY estimates of misFO and type II error.
KINGROUP's pedigree hypothesis test was also conservative, but it had a smaller misFO error (0.38) and smaller errors overall than those observed with COLONY. Moreover, classification based on the confidence interval estimates from IDENTIX was the most reliable method: all errors were below 0.1 (Table 2). The main reason for these smaller errors was the classification as "inconclusive" for confidence intervals that included values between 0 and 0.125. Furthermore, allele sharing patterns proved to be a good approach to classify FO pairs: 87% of simulated pairs that shared ≥7 alleles at ≥0.8 of the loci were FO pairs.
Due to potential problems associated with missing data in kinship analyses, we restricted our analysis to samples that amplified for four (n = 2) or five (n = 20) loci. The polygamy model had the most support from the data, with the other two models obtaining Bayes factor values less than −20.0 (Table 3). Based on the above-described criteria to classify relationships, we found 10 first order relationships (parent-offspring or full-sib pairs; FO), 10 half-sib relationships, 25 unrelated pairs, and 186 inconclusive pairs.
In five first order relationships and five half-sib pairs, the individuals were located on opposite sides of the reservoir. The distance between FO individuals ranged from 0.1 to 29.

Discussion
In this study, we present non-invasive genetic data on T. terrestris sampled from the islands formed by the Balbina hydroelectric reservoir in central Amazon. Our objective was to test the hypothesis: individuals that overlap in their home ranges are more likely to be related than individuals that do not overlap in their home ranges. Below we interpret our results in terms of what we were able to achieve logistically, and what our results mean for tapir biology and mammal social behavior.
Finding fecal samples suitable for genetic analysis in the Amazon rainforest is hindered by the dense forest and by the region's climate. The dense canopy results in a relatively dark understory, while leaf litter acts as camouflage, making it difficult to spot dung samples. Meanwhile, the warm, humid climate accelerates DNA degradation in feces [65,66]. The local environmental conditions notwithstanding, once samples were found that were considered sufficiently fresh for analyses, storage time became an important factor influencing amplification success. Several samples were collected while optimization of laboratory protocols was still underway, which resulted in longer storage time and lower genotyping success rate. Thus, we recommend that protocols be established and optimized prior to initiating fieldwork [34].
Despite these operational difficulties, we were able to obtain a reasonable number of samples for a large Neotropical mammal. In 55 field days we obtained reliable information on at least 20 individuals, as identified by genotype profiles. In comparison, studies based on animal capture, such as those of Tobler [15] and Medici [16], caught seven individuals over a six month period in the Peruvian Amazon and 35 individuals over the span of approximately nine years in the Atlantic Forest, respectively. It is also apparent that non-invasive samples can be used for recapture studies in the Amazon biome; samples with identical genotypes were collected within a short span of time and at close distances, which increases our confidence in a true recapture. Thus, the use of non-invasive sampling allows relatively rapid access to important biological information about elusive species [67], and provides encouragement for future research on elusive tropical species.
A large proportion of the samples were collected in water rather than on land. Contrary to general expectation, our results demonstrate that dung samples found in water bodies in tropical terrestrial ecosystems can yield high-quality genetic data. The lack of strong water currents in the Balbina reservoir and sampling during the dry season allowed the feces to remain intact for a greater period of time. This opens up the possibility of sample collection in study areas that encompass rivers without a strong current or in lakes. It should be noted that a sample in water that was carried by the wind/current could be differentiated from defecation at the collection sites by the appearance, quantity and grouping pattern of the pellets.
Table 1. Note: motif type (Motif), allele size variation (Size), number of samples (N), allele richness (A), observed heterozygosity (Ho), expected heterozygosity (He), probability of identity with sample size correction (P(ID)unbiased) and probability of identity between sibs (P(ID)sib). doi:10.1371/journal.pone.0092507.t001
Our results indicate a single genetic unit in the landscape of the Balbina reservoir. While this result is important in regards to validating the assumptions of the relatedness analyses, it is also an interesting result in terms of landscape genetics. It suggests that the Uatumã river does not act as a barrier to gene flow in T. terrestris. However, the question of whether the increased width of the Uatumã river will have an effect is not likely to be answered any time soon. The time elapsed since the damming of the river (24 years) has not been sufficient relative to the species generation length to generate large effects on the spatial distribution of genetic variation (e.g. [68]): the life expectancy of tapirs in captivity is 30 years [69] with a generation time of approximately 11 years [16]. Thus, the low F_IS and F_ST values more likely reflect values of gene flow and genetic diversity that existed prior to the flooding of the dam. Hence, our results may be used as a benchmark in future studies aimed at assessing potential disturbances caused by the building of the dam. However, we argue that our data carries evidence that demonstrates that the width of the lake does not pose a complete barrier to tapir movement in this landscape. Assuming a typical life table for mammals [70], with high mortality rates among juveniles and adults at an advanced age, the proportion of individuals in the population as old or older than the dam is probably less than 5%. Moreover, the life expectancy of mammals in the wild is generally lower than that of those kept in captivity.
In this context, we feel comfortable in concluding that some of the 10 related pairs of individuals located on opposite margins of the reservoir include individuals born after the flooding. This corroborates the idea that barriers to gene flow (natural or artificial) in lowland tapirs occur at larger spatial scales that straddle more salient barriers, such as the Amazon River [71]. It is not possible, however, to say whether tapirs are able to swim across the full extent of the reservoir, or if islands along the old riverbed (Figure 1) are used as stepping-stones.
Bayesian (6.5) and maximum likelihood (7.1) estimates of θ (mutation-scaled effective population size) were similar, as is expected when using non-informative priors. Depending on the assumed mutation rate, the effective population size may vary from 3,250 to 17,750. The ratio of effective population size to census population size is generally thought to be around 1:10. Thus the number of individuals in the Balbina reservoir region is large, ranging between 32,500 and 177,500 depending on the assumed mutation rate. If one considers that the census population for tapirs in an Atlantic Forest fragment of 360 km² is ~300 individuals [16], we would need an area ~3 times larger than the REBIO Uatumã to harbor 30 thousand tapirs. This is not unreasonable given the continuity of the habitat in the region. Therefore, the estimated values are plausible if we consider that the geographic area occupied by the Balbina population is likely to be much larger than the sampled area.
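The arithmetic behind these population-size figures can be retraced with a short sketch. It assumes the usual diploid definition of the mutation-scaled effective population size, θ = 4·Ne·μ, applied to the two estimates reported above (6.5 and 7.1). The mutation rates `MU_LOW` and `MU_HIGH` are assumptions: they are not stated in this section and were chosen here only because they reproduce the quoted range. All names in the code are hypothetical.

```python
# Sketch only. MU_LOW and MU_HIGH are ASSUMED mutation rates (per locus per
# generation); they are not given in the text and are used here because they
# reproduce the effective-size range quoted in the paragraph above.
THETA_BAYES = 6.5          # Bayesian estimate of theta (from the text)
THETA_ML = 7.1             # maximum likelihood estimate of theta (from the text)
MU_LOW = 1e-4              # assumption
MU_HIGH = 5e-4             # assumption
CENSUS_PER_EFFECTIVE = 10  # the 1:10 Ne-to-census ratio cited in the text


def effective_size(theta, mu):
    """Solve theta = 4 * Ne * mu for Ne (diploid, autosomal markers)."""
    return theta / (4.0 * mu)


# Smallest Ne: lower theta with the higher mutation rate, and vice versa.
ne_min = round(effective_size(THETA_BAYES, MU_HIGH))
ne_max = round(effective_size(THETA_ML, MU_LOW))

census_min = ne_min * CENSUS_PER_EFFECTIVE
census_max = ne_max * CENSUS_PER_EFFECTIVE

# Area needed for 30,000 tapirs at the Atlantic Forest density
# (about 300 individuals in 360 km^2, as cited in the text).
area_needed_km2 = 30_000 * 360 / 300

print(ne_min, ne_max)          # 3250 17750
print(census_min, census_max)  # 32500 177500
print(area_needed_km2)         # 36000.0
```

Under these assumptions, the lower census bound corresponds to roughly 36,000 km² of habitat at the Atlantic Forest density, which is the comparison the paragraph draws with the REBIO Uatumã.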
The geographic extent of the population that includes Balbina is likely to be very large, considering that De Thoisy et al. [71,72] demonstrated minimal differentiation and concomitantly high gene flow for T. terrestris over an area at least 100 times larger than that sampled in this study. As a further comparison, Drummond et al. [73] and Spong et al. [51] estimated even larger effective population sizes than the present study for the Beringian bison and the Tanzanian leopard, respectively. Moreover, the estimated mean observed heterozygosity and allelic diversity in the present study are among the highest reported for large mammals [31]. De Thoisy et al. [72] found similar values, and Gonçalves da Silva et al. [74] found slightly lower values for tapirs in captivity in Argentina (Table 4). For T. bairdii [31], the reported observed heterozygosity and allelic diversity values were considerably lower (Table 4), as expected for endangered populations [75].
The mating system analysis we carried out in COLONY suggests that lowland tapirs are polygamous (Table 3). However, C. R. Foerster, after 10 years of study, suggested polygyny for Baird's tapirs (T. bairdii) [16]. With either mating strategy there is generally a high degree of home range overlap among adults, as found for T. terrestris [14-16]. However, the observation of home range overlap between one female and two males and between one male and two females [15,16], together with camera-trap observations of females being accompanied by different males (E. P. Medici, personal communication), suggests a polygamous system (i.e., both males and females are promiscuous [76]). Therefore, the evidence found in the present study, together with ecological observations, supports the hypothesis of a polygamous mating system for T. terrestris.
Table 3. Maximum log-likelihood values for mating system models suggested for T. terrestris, with associated log Bayes factor and posterior probabilities.
In general, polygamous ungulates that display some kind of territoriality are largely folivores observed in open-habitat areas, such as grasslands [77]. While our result would appear to contradict this observation, we do not believe it is entirely inconsistent with it. Instead, we propose that, if lowland tapirs are indeed promiscuous, the observation of a behavior typical of grassland habitat is a case of Krumbiegel's rule, which states that behavioral patterns evolved in one type of habitat will persist long after that habitat changes [78]. We know that tapirs in Asia evolved largely in open grasslands, and are now one of the few remaining taxa of a large megafauna that have not gone extinct with the rise of tropical jungles [79]. It is possible that a similar scenario occurred in South and Central America [80].
Regarding the relatedness analysis, we classified pairs into a specific relatedness category based on the estimated r-values, confidence intervals surrounding each r estimate, pedigree hypothesis tests, mean number of shared alleles, and mean number of loci that share at least one allele (Table S1). As can be seen from our simulated data, the combined results increased our confidence in our classification, while accounting for the uncertainty resulting from the number of successfully assayed markers. Although we are confident that our classification is reliable, it is important to note that our sample sizes for the purpose of statistical analyses were small, as is the case for many studies of large mammals. Nevertheless, we consider the results informative and valuable, being the first data obtained via non-invasive sampling to identify individuals of an elusive mammal in the Amazon.
Where at least one of the sexes is philopatric, Moran's I is expected to be higher at the local/social scale than at larger, landscape scales (e.g. [81]). Our data show no difference between Moran's I at the individual scale and at the landscape scale. Pairs of related individuals did not occur geographically closer than pairs of unrelated individuals. We thus have no evidence to support the hypothesis that recognition between related individuals leads to greater tolerance among tapirs, nor that tapirs prefer to be close to relatives or show philopatric behavior. Therefore, our data do not corroborate the formation of family units in T. terrestris.
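The comparison of distances between related and unrelated pairs is reported as a rank-based W statistic. As an illustration of how such a statistic is obtained, the sketch below computes a Wilcoxon rank-sum (Mann-Whitney) W with midranks for ties. It assumes the common convention in which W is the rank sum of the first group minus its minimum possible value; the distances are invented toy values, not data from this study, and the function name is hypothetical.

```python
def rank_sum_w(group_a, group_b):
    """Wilcoxon rank-sum (Mann-Whitney) W for group_a.

    W = (sum of pooled ranks of group_a) - n_a * (n_a + 1) / 2,
    with tied values sharing the mean of the ranks they occupy.
    """
    pooled = sorted(list(group_a) + list(group_b))
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied values get their mean.
        midrank[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    n_a = len(group_a)
    return sum(midrank[v] for v in group_a) - n_a * (n_a + 1) / 2.0


# Invented pairwise distances (km), for illustration only.
related_km = [1.2, 4.0, 7.5, 10.1]
unrelated_km = [0.8, 4.0, 6.3, 9.9, 12.4]

w = rank_sum_w(related_km, unrelated_km)
print(w)  # 9.5
```

With 4 and 5 toy observations, W can range from 0 to 20, and a value near 10 is what one would expect if the two groups did not differ in distance. A significance test would additionally require the null distribution of W, which this sketch does not compute.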
The fact that kinship does not seem to influence the spatial pattern of individuals is unusual in mammals [82]. We are aware of only one other example of this in mammals, the raccoon (Procyon lotor) [83]. Interestingly, raccoons and tapirs seem to have a lot in common. Much like tapirs, raccoons are described as a largely solitary, widespread species that occupies various types of habitats at varying densities. Similarly to our study, Hirsh et al. [83] found no pattern in spatial proximity between related and unrelated individuals. Instead, they found that other factors, such as the availability of winter dens and the concentration of food resources, played a much more significant role in driving associations between individuals. Thus, as in raccoons, recognition between individuals may occur independently of kinship, and other factors, such as environmental ones, may influence the formation of social groups. Barongi [69] and Foerster & Vaughan [84] attributed tapir home range overlap to the fruiting season, in which the greater availability of food resources promotes group formation. It is also possible that the formation of the Balbina reservoir has disrupted territories and family units, and that, due to the long-lived nature of the species, these characteristics have not yet returned to equilibrium.
The presence of unrelated pairs at the individual scale, coupled with the absence of correlation between relatedness indices and geographic distances, suggests a high variance in tapir movement, which may represent dispersal events. Note that dispersal distance is defined as the distance between natal and breeding sites [6]. We cannot, however, distinguish between natal dispersal and breeding dispersal, as we have no data on the ages of the animals studied. Adult individuals have been seen leaving their habitual home ranges by as much as 10 km to visit mineral licks [15]. It is therefore possible that adults make similar excursions in search of mates.
Nevertheless, it is generally observed in mammals that individuals disperse from their natal site at the onset of sexual maturity, or soon after, to establish their own home ranges [5]. Foerster & Vaughan [84] observed the birth of four tapirs that dispersed from their natal area after three to four years. During the period of residence of the juveniles, their parents maintained an exclusive area without other adults. In this case, the establishment of territories would be associated with a period of parental care, but it is not clear whether tapirs display territorial behavior. Thus, the observed high variance in distances between related individuals could reflect different stages of dispersal (e.g., before and after natal dispersal), as well as breeding dispersal or movement behavior associated with the search for resources.
The present study offers novel information on the behavioral ecology of T. terrestris and on the use of non-invasive sampling for individual discrimination in tropical forests. Based on the present findings, we suggest a polygamous mating system and dispersal from the natal home range for T. terrestris. Apparently, tolerance between individuals is not influenced by kinship, as the proportion of related pairs at the individual scale was not different from the proportion observed at the landscape scale. This is unusual in mammals, but has been described elsewhere. With respect to the methods, non-invasive sampling allowed rapid access to genetic data from an elusive species, even in the Amazon biome with its warm and humid forests. Therefore, the methods applied here should work for other medium-sized and large mammals in similar environments. However, researchers adopting this approach should be rigorous not only in the laboratory procedures, but also in testing kinship category assignments and selecting the most appropriate analytical methods for their data.

Supporting Information
Table S1. Information used in the classification of relationships. Note: individuals in the pair (Ind1, Ind2); mean number of loci for which at least one allele was shared between a pair (Mean share); mean number of alleles shared between individuals in a pair (Allele count); relatedness index (r) of Lynch and Ritland (rLR99) and Queller and Goodnight (rQG89); 95% confidence interval for each pairwise r of Lynch and Ritland (CI_LR99) and Queller and Goodnight (CI_QG89); pedigree hypothesis tests with the primary hypotheses being: parent-offspring (PO), full sibs (FS), half sibs (HS), cousins (C) and unrelated (UN); final classification (Conclusion) of a pair as inconclusive (IN) or into a relationship class (FO-First Order Relatives, HS-Half Sibs or U-Unrelated); geographic distance between the individuals in meters (Distance); and additional information used to assist the classification (Additional information). In the pedigree hypothesis tests, tests marked "*" were significant at the 0.05 level, "**" at the 0.01 level and "***" at the 0.001 level. Distance values marked with the "†" symbol indicate that the individuals were located on opposite margins of the reservoir. The probabilities mentioned in the additional information were based on the errors measured from the results of the simulation. (XLS)