Source-Sink Estimates of Genetic Introgression Show Influence of Hatchery Strays on Wild Chum Salmon Populations in Prince William Sound, Alaska

  • James R. Jasper, 
  • Christopher Habicht, 
  • Steve Moffitt, 
  • Rich Brenner, 
  • Jennifer Marsh, 
  • Bert Lewis, 
  • Elisabeth Creelman Fox, 
  • Zac Grauvogel, 
  • Serena D. Rogers Olive, 
  • W. Stewart Grant

Abstract


The extent to which stray, hatchery-reared salmon affect wild populations is much debated. Although experiments show that artificial breeding and culture influence the genetics of hatchery salmon, little is known about the interaction between hatchery and wild salmon in a natural setting. Here, we estimated historical and contemporary genetic population structures of chum salmon (Oncorhynchus keta) in Prince William Sound (PWS), Alaska, with 135 single nucleotide polymorphism (SNP) markers. Historical population structure was inferred from the analysis of DNA from fish scales, which had been archived since the late 1960s for several populations in PWS. Parallel analyses with microsatellites and a test based on Hardy-Weinberg proportions showed that about 50% of the fish-scale DNA was cross-contaminated with DNA from other fish. These samples were removed from the analysis. We used a novel application of the classical source-sink model to compare SNP allele frequencies in these archived fish scales (1964–1982) with frequencies in contemporary samples (2008–2010) and found a temporal shift toward hatchery allele frequencies in some wild populations. Other populations showed markedly less introgression, despite moderate amounts of hatchery straying. The extent of introgression may reflect similarities in spawning time and life-history traits between hatchery and wild fish, or the degree to which hybrids return to a natal spawning area. The source-sink model is a powerful means of detecting low levels of introgression over several generations.

Introduction


Interactions between hatchery-reared and wild Pacific salmon can be a source of genetic change within and among wild populations [1]. Even when the initial hatchery brood stock is drawn from nearby wild populations, hatchery culture can change the genetic makeup of the hatchery population, especially in segregated hatchery populations, in which brood stocks are selected from fish returning to the hatchery [2], [3]. Some hatchery fish inevitably stray into wild populations, and the degree of influence these fish have on wild populations is related to the intensity of stock enhancements [4], the amount of genetic divergence between hatchery and wild populations [4], [5], and the extent of genetic introgression of hatchery genotypes into wild populations [6]. A key variable moderating the effects of hatchery-reared strays on wild populations appears to be the degree of life-history divergence between the hatchery and wild populations [4], [6], [7].

The effects of stray hatchery fish on wild populations can be measured in several ways. Fin clips, physical tags, and thermally marked otoliths allow for the direct detection of hatchery fish in natural spawning areas [8]. While stray hatchery fish may influence wild populations ecologically [9], they may not necessarily mate with wild fish. Even matings between hatchery and wild fish may not lead to introgressive hybridization, because hybrid offspring may be less fit than pure wild fish and may not successfully compete for mates or survive to spawn [3], [10]. Long-term monitoring of life-history traits can sometimes demonstrate the effects of hatchery strays on wild populations. For example, run timing in a wild population of coho salmon (Oncorhynchus kisutch) shifted to earlier times as a result of the genetic influence of strays from a hatchery in which eggs were taken from the early portion of the run [11].

One approach to estimating genetic introgression in salmonids relies on individual-based methods that attempt to identify hatchery-wild hybrid and backcross fish [6], [12]–[14]. These methods use genotypes in a contemporary sample and Bayesian probabilities to estimate hybridization from the fit to Hardy-Weinberg proportions and from linkage disequilibria among genotypes. Here, we use an alternative approach by comparing DNA in archived fish scales with DNA in contemporary samples to detect possible allele-frequency shifts in wild populations influenced by hatchery strays. Historical samples have previously been used to estimate the genetic structure of historical populations [15], [16] and genetic shifts from hatchery supplementation [10], [15], [17]. We developed a novel application of the classical source-sink model to track allele-frequency changes in wild populations due to introgression from stray hatchery-reared fish.

Our study focuses on the genetic influence of hatchery strays on chum salmon (Oncorhynchus keta) populations in Prince William Sound (PWS), Alaska (Figure 1). Over 200 streams in PWS support wild spawning aggregations, with an estimated ten-year (2001–2010) annual average of 273,100 fish migrating to index spawning areas. The total number of wild spawners in PWS is unknown, but is thought to be much larger than abundances in these index streams. The Prince William Sound Aquaculture Corporation (PWSAC) released about 139.5 million chum salmon fry in 2011 [18], and the total return to PWS of hatchery and naturally spawned fish averaged about 4.2 million fish annually over the last ten years [19]. Since 1976, chum salmon have been released in PWS at several sites, including Armin F. Koernig (AFKH), Main Bay (MBH), Wally Noerenberg (WNH), Cannery Creek (CCH), and Solomon Gulch (SGH) hatcheries, and Port Chalmers on Montague Island (Table 1). Although these five hatcheries produced chum salmon between 1985 and 1994, chum salmon culture was subsequently suspended at all of them except WNH. Hence, WNH has been the chief source of hatchery releases for about two decades. Wells River was the largest source of hatchery brood stock for this hatchery, but a small number of fish came from Beartrap Creek until 1986, when the number returning to WNH was large enough to meet production goals.

Figure 1. Locations of chum salmon hatcheries and release sites and sampled natural-spawning sites in Prince William Sound, Alaska.

Table 1. Chum salmon production in Prince William Sound, Alaska hatcheries.

Our study has three components. First, we develop a novel method for quality control of single-nucleotide polymorphism (SNP) genotypes. This QC was critical because the analysis relied on DNA extracted from archived scales originally sampled decades ago to age fish. The handling of fish during sampling promotes cross-contamination between scales, which can affect genotyping. Accurate genotypes are especially important for detecting low levels of introgression. Second, we compare the historical genetic population structure estimated from archived scale samples to contemporary population structure. If introgression from hatchery populations into wild populations is occurring, wild populations are expected to gradually resemble hatchery populations. Third, we develop a novel application of the classic source-sink model and use SNP frequencies in historical and contemporary samples to quantify introgression rates of hatchery chum salmon in four naturally spawning populations.

Materials and Methods

Ethics Statement

All work was conducted in accordance with animal welfare guidelines stipulated in field collection permit CF-2009-0019, issued by the State of Alaska Department of Fish and Game (ADF&G), Juneau, AK. This study did not involve endangered or protected species and the sampling locations were not privately-owned or protected in any way.

Archive Sample Selection and Field Collection

We sampled chum salmon from WNH and from four streams in PWS (Beartrap Creek, Constantine Creek, Siwash Creek, and Wells River; Table 2, Figure 1), for which historical scale samples were also available. These populations differ in spawning time and span a range of distances from hatchery release sites. We chose these populations, in part, because at least 200 archived scales were available from years before hatchery operations began. In 2008, 2009, and 2010, the axillary processes of 500 fish at each location were collected, placed into individual vials with 95% ethanol, and stored at room temperature. Otoliths were also dissected from these fish to search for thermal marks indicating hatchery culture. Fish with otolith markings indicating hatchery origin were removed from the contemporary field collections before analysis. A target sample size of 200 wild fish without thermally marked otoliths was selected from the 500 fish from each stream and used for genetic analysis (Table 2). In addition, axillary processes were collected from 200 freshly killed brood stock at the WNH in 2008 and 2009. Samples from wild populations were pooled into 8 collections based on geographic location (Siwash Creek, Wells River, Beartrap Creek, and Constantine Creek) and time of collection (historical 1964–1982, and contemporary 2008–2010). Samples from the WNH, collected 2008–2009, were pooled into a single collection for analysis. After quality control (see below), sample sizes for the historical DNA samples ranged from 50–70 and for contemporary samples from 193–585.

Table 2. Collection information for samples of chum salmon from Prince William Sound, Alaska, used to examine hatchery-wild genetic introgression.

Laboratory Analysis

Genomic DNA was extracted using a DNeasy® 96 Tissue Kit by QIAGEN® (Valencia, CA). Each fish was screened for 185 nuclear DNA and 3 mitochondrial DNA SNP markers (Table S1) [20]–[25]. SNP assay reactions were conducted in two BioMark 96.96 Dynamic Arrays (Fluidigm). Reaction cocktails (7.2 nL) consisted of 1×TaqMan Universal Buffer (Applied Biosystems), 1.5 U AmpliTaq Gold DNA Polymerase (Applied Biosystems), 9 mM of each polymerase chain reaction (PCR) primer, 2 mM of each probe, 1×DNA Assay Loading Buffer (Fluidigm), 12.5×ROX (Invitrogen), and 0.01% Tween-20. PCR amplification was performed with a BioMark IFC Cycler with an initial denaturation period of 10 min at 95°C followed by 50 cycles of 92°C for 15 s, and one step at 60°C for 1 min. Dynamic Arrays were read with a BioMark Real-Time PCR System after amplification and scored visually with BioMark Genotyping Analysis software (Fluidigm).

Quality Control

Several steps were taken to ensure genotype accuracy. First, 8% of the samples in each collection were reanalyzed for each marker to ensure reproducibility and to identify possible laboratory handling errors. We assumed that any inconsistency was due equally to genotyping errors in the initial analysis and to errors during quality control. Second, we excluded samples with poor quality DNA, in which individuals lacked scores for more than 20% of the SNP markers [25]. This reduced the chance that the remaining loci were also mis-genotyped, as poor quality DNA provides less replicable genotypes across loci than high quality DNA. Third, we eliminated SNP markers that were genotyped for fewer than 55 fish in an archived scale sample. Fourth, we tested genotypic frequencies for each locus in each collection for fit to Hardy-Weinberg expectations (HWE) with Fisher’s exact test [26], as implemented in GENEPOP 4.0.10 [27]. Critical values (α = 0.01) were adjusted for multiple tests within collections and multiple tests across markers within a collection [28]. A locus was removed from the analyses if it deviated from HWE in a majority of the collections. Fifth, we removed markers that were invariant in all collections. Sixth, to ensure that the analyses were based on independent markers, we tested all pairs of markers for linkage disequilibrium within each collection using GENEPOP with 100 batches of 5,000 iterations. We assumed that pairs of loci were linked if they exhibited significant (P<0.05) linkage disequilibrium in more than half of the collections, or if the markers were known to be linked. When we found linkage between a pair of loci, the locus with the lower heterozygosity, or the locus with the larger percentage of missing data in the historical samples was discarded.
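Two of the screening steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study tested Hardy-Weinberg fit with Fisher's exact test in GENEPOP, whereas the sketch swaps in a plain chi-square goodness-of-fit test (1 degree of freedom) because it needs nothing beyond the standard library. The function names and the genotype coding (0, 1, 2, or None for a failed call) are hypothetical.

```python
import math


def call_rate_ok(genotypes, max_missing=0.20):
    """Return True if an individual is scored at enough markers.

    genotypes: list of genotype codes (0, 1, 2), with None for a failed call.
    Individuals lacking scores for more than `max_missing` of markers fail.
    """
    missing = sum(g is None for g in genotypes)
    return missing / len(genotypes) <= max_missing


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test of fit to Hardy-Weinberg proportions at one locus.

    NOTE: a simple stand-in for the exact test used in the study.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variate with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A locus would then be flagged in a collection when `p_value` falls below the adjusted critical value, and dropped only if it is flagged in a majority of collections, as described above.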

Contaminated Samples

We anticipated that DNA extracted from the archived scales might be contaminated with DNA from other scales, because the archived scales had been used for aging and had not been collected with protocols specifically for DNA analysis. DNA is located chiefly on the surface of a scale in epithelial tissue and mucus. We devised two methods to detect fish-to-fish contamination. First, we analyzed the archived fish DNA for 7 microsatellites [29], Oke4, Oke11 [30], Oki1L, Oki1U [31], Ots2.1L, Ots2.1U [32], Ots103 [33] (Table S2). Contaminated samples were expected to display more than two alleles for at least one locus. Fish with genotypes showing more than two alleles were excluded from further analysis.
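The microsatellite screen amounts to a simple rule: a diploid fish can carry at most two alleles per locus, so a third allele implies DNA from a second fish. A minimal sketch of that rule follows. The sample IDs, allele sizes, and the function name `flag_contaminated` are hypothetical and serve only to illustrate the logic.

```python
def flag_contaminated(sample_alleles):
    """Return IDs of samples showing more than two alleles at any locus.

    sample_alleles: dict mapping sample ID -> dict mapping locus name ->
    list of allele sizes detected at that locus.
    """
    flagged = []
    for sample_id, loci in sample_alleles.items():
        if any(len(set(alleles)) > 2 for alleles in loci.values()):
            flagged.append(sample_id)
    return flagged


# Hypothetical allele calls for two archived scales.
scales = {
    "scale_01": {"Oke4": [210, 214], "Ots103": [180, 180]},
    "scale_02": {"Oke4": [210, 214, 218], "Ots103": [176, 180]},
}
print(flag_contaminated(scales))
```

The rule has no false positives from diploid biology, but it misses contamination when the two fish happen to share alleles at every locus screened, which is one reason a second, model-based test is useful.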

Second, we used model selection based on genotypic frequency expectations. This method makes three key assumptions: 1) loci are in HWE and linkage equilibrium, 2) contaminated individuals are contaminated by one, and only one, other fish from the same population, and 3) if a fish is contaminated at one locus, it is contaminated for all other loci. Genotypes were subscripted by collection $i$, individual $k$, and locus $l$ and were represented by the unit vector

$$\mathbf{x}_{i,k,l} = (x_{i,k,l,1},\ x_{i,k,l,2},\ x_{i,k,l,3}).$$

This data vector takes three possible states corresponding to the genotypes AA, Aa, and aa. We modeled $\mathbf{x}_{i,k,l}$ as

$$\mathbf{x}_{i,k,l} \sim \mathrm{Multinomial}(1,\ \mathbf{p}_{i,k,l}),$$

where $\mathbf{p}_{i,k,l}$ is a vector of genotype score probabilities with components

$$p_{i,k,l,j} = (1 - z_{i,k})\,u_{i,l,j} + z_{i,k}\,c_{i,l,j}.$$

Here, $u_{i,l,j}$ and $c_{i,l,j}$ are the uncontaminated and contaminated genotypic frequencies, respectively, of genotype index $j$ at locus $l$, and are simple functions of the allele frequencies, $q_{i,l}$ (Table 3), while $z_{i,k}$ is a Bernoulli random variable equal to one if individual $k$ is contaminated, and to zero if the individual is uncontaminated. We placed a Rannala-Mountain [34] prior on the allele frequencies and a Bernoulli prior on $z_{i,k}$, with probability of success equal to 0.5. The variable $z_{i,k}$ was sampled for each individual from its posterior distribution to determine which of the two models fit individual $k$ the best. We ran two Markov Chain Monte Carlo (MCMC) chains in OpenBUGS (Table S3) for 100,000 iterations, discarding the initial 50,000 iterations from each chain as burn-in. To initialize the two chains, individuals were set as contaminated in one chain and uncontaminated in the second chain, and convergence between the chains was assessed by examining trace plots of the two chains. Since the prior for $z_{i,k}$ gives equal weight to each model, its posterior mean can be viewed as the posterior probability that scale extraction $k$ is contaminated, given only two possible models. Individuals with posterior probabilities greater than 75% corresponded to Bayes factors greater than 3 [35], so were excluded.
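The two-model comparison can be illustrated with a short Python sketch. It simplifies the analysis described above in two ways that should be read as assumptions. First, allele frequencies are treated as known rather than sampled under a Rannala-Mountain prior. Second, the contaminated genotype frequencies are taken to be q^4 for AA, (1-q)^4 for aa, and the remainder for Aa, which follows if a mixture of two fish scores as a homozygote only when both fish share that homozygous genotype. The article's Table 3 is not reproduced here, so that form is a guess at its content. Function names are hypothetical, and allele frequencies must lie strictly between 0 and 1.

```python
import math


def genotype_probs(q, contaminated):
    """Probabilities of scoring (AA, Aa, aa) at a locus with allele-A frequency q.

    ASSUMPTION: a two-fish mixture scores homozygous only if both fish
    carry the same homozygous genotype; otherwise it scores heterozygous.
    """
    power = 4 if contaminated else 2
    p_AA = q ** power
    p_aa = (1 - q) ** power
    return (p_AA, 1 - p_AA - p_aa, p_aa)


def posterior_contaminated(genotypes, allele_freqs, prior=0.5):
    """Posterior probability that one extraction is contaminated.

    genotypes: list of genotype indices (0 = AA, 1 = Aa, 2 = aa, None = missing).
    allele_freqs: allele-A frequency per locus, treated as known here.
    """
    log_odds = math.log(prior / (1 - prior))
    for g, q in zip(genotypes, allele_freqs):
        if g is None:
            continue  # missing calls carry no information
        log_odds += math.log(genotype_probs(q, True)[g])
        log_odds -= math.log(genotype_probs(q, False)[g])
    # Numerically stable logistic transform.
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)


def bayes_factor(posterior, prior=0.5):
    """Bayes factor implied by a posterior model probability."""
    return (posterior / (1 - posterior)) / (prior / (1 - prior))
```

With equal prior weight on the two models, the prior odds are 1, so a posterior probability of 0.75 gives a Bayes factor of 0.75 / 0.25 = 3, which is the exclusion threshold stated above. An excess of heterozygous scores across loci pushes the posterior toward contamination.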

Table 3. Competing models to detect DNA contamination among individuals (k) within a collection (i) across loci (l).

A total of 115 fish across samples could not be genotyped at 20% or more loci and were removed from further analysis (Table 2). Thirty-two SNP loci were removed because of low amplification success in historical scale samples. Locus Oke_U1016-154 departed from HWE in the historical collections and was removed from subsequent analysis. Additionally, the three mitochondrial DNA (Oke_Cr30, Oke_Cr386, Oke_ND3-69) and two nuclear (Oke_U1010-154, and Oke_U202) loci were removed, because they were invariant in our samples. Fourteen pairs and one triplet of loci showed significant linkage disequilibrium in more than 50% of the collections (Table S4). One exception was the pair Oke_psmd9188 and Oke_psmd957, which showed linkage disequilibrium in 4 of the 9 collections. However, this pair is known to be linked. Hence, these 16 loci were discarded. A list of SNP heterozygosities appears in Table S1. After these analyses, genotypes for 135 SNP loci were used to detect genetic introgression of the WNH stock into wild populations. We removed 339 fish from the archived scale collections, because microsatellite genotypes indicated that DNA extractions were contaminated. Finally, we removed 94 contemporary and historical individuals with Bayes factors indicating contamination (Table 2).

Population Analysis

The four contemporary samples were large enough (N = 298 to 585) to subdivide and test for differences among the three collection years, 2008–2010. Fisher’s exact test for 12 comparisons detected a significant difference (P = 0.017) between Wells River 2008 and 2009, but not between collection years at the other sites. The historical samples were too small to test for temporal differences. Otolith markings indicated that 18 fish from contemporary field collections were hatchery-origin strays and were removed from the analysis.

For the SNP data, we estimated observed (Ho) and expected (He) heterozygosities averaged over loci with ARLEQUIN [36]. Differentiation between populations was estimated in four ways. First, ARLEQUIN was used to estimate components of variability among samples, including FST (differentiation among spawning sites), FSC (differences between temporal samples within sites), and FIS (mean variability among individuals within collections). Second, pairwise exact tests [37], [38] were made between all samples with GENEPOP with 5000 burn-in steps, and 1000 batches of 1000 Markov Chain steps per batch. Third, pairwise FST values were calculated with the R package ‘hierfstat’ [39]. The R package ‘ape’ [40] was used to produce 1000 bootstrap neighbor-joining (NJ) trees from pairwise FST values and a consensus tree. Fourth, we used STRUCTURE 2.3.3 [12] to estimate population structure from individual assignments. Genotypic data in the collections were pooled and tested with the ‘admixture’ model using sample location and date (historical or contemporary) with 50,000 MCMC steps, following a burn-in of 10,000 steps. Genotypic data were tested with K = 1–9 populations.
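For readers unfamiliar with these summary statistics, the sketch below shows the basic quantities for biallelic SNPs. It is not the estimator implemented in ARLEQUIN or hierfstat: it uses Nei's simple ratio FST = (HT - HS) / HT with equal population weights and no sample-size correction, and the function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Mean expected heterozygosity, 2q(1 - q), over biallelic loci."""
    return sum(2 * q * (1 - q) for q in freqs) / len(freqs)


def fst_nei(freqs_by_pop):
    """Uncorrected F_ST = (H_T - H_S) / H_T across populations.

    freqs_by_pop: list of per-population lists of allele frequencies,
    one entry per locus, in the same locus order for every population.
    """
    n_pops = len(freqs_by_pop)
    n_loci = len(freqs_by_pop[0])
    h_s = 0.0  # mean within-population heterozygosity, summed over loci
    h_t = 0.0  # heterozygosity of the pooled population, summed over loci
    for locus in range(n_loci):
        qs = [pop[locus] for pop in freqs_by_pop]
        q_bar = sum(qs) / n_pops
        h_s += sum(2 * q * (1 - q) for q in qs) / n_pops
        h_t += 2 * q_bar * (1 - q_bar)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0
```

Identical allele frequencies in all populations give a value of zero, and populations fixed for alternative alleles give a value of one. Values near 0.01 to 0.02, like those reported in this study, therefore correspond to small but measurable frequency differences.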

Source-sink Model of Genetic Introgression

We used the source-sink model [41] to develop a way of estimating genetic introgression (Figure 2). In our model, we expected allele frequencies in wild sink populations to shift as hatchery strays from the source population bred with wild fish. Allele frequencies at a locus in a sink population, $n$ generations after the onset of introgression, are

$$q_{n,i,l} = q_l + (1 - m_i)^n\,(q_{0,i,l} - q_l).$$

Figure 2. Diagram of a model of genetic introgression based on the classic source-sink model of migration.

Explanation of variables: $q_l$ is the allele frequency at a locus in a source population and is assumed to be unchanging over $n$ generations of introgression. $q_{n,i,l}$ is the allele frequency at locus $l$ in a wild sink population $i$ after $n$ generations.

Here, $m_i$ is the rate of introgression into population $i$. The same equation with the same rate of introgression applies to all loci individually. In this model, $q_{n,i,l}$ is not treated as a free parameter, but as a function of the free parameters $q_{0,i,l}$, $q_l$, and $m_i$. Without introgression $m_i = 0$, so that after $n$ generations

$$(1 - m_i)^n = 1$$

and

$$q_{n,i,l} - q_l = q_{0,i,l} - q_l.$$

With introgression $0 < m_i \le 1$, so that

$$(1 - m_i)^n < 1$$

and

$$|q_{n,i,l} - q_l| < |q_{0,i,l} - q_l|.$$

Therefore, the slope of the plot of $q_{n,i,l} - q_l$ against $q_{0,i,l} - q_l$ across loci indicates the effect of introgression. In the absence of introgression, these points are expected to fall along the replacement line y = x, so the slope of the regression is 1.0. With introgression, these points are expected to fall on a line with a slope of less than 1.0. The model was run in OpenBUGS (Table S5). We placed Rannala-Mountain priors [34] on $q_{0,i,l}$ and $q_l$ and a flat normal prior, with a mean of zero and a variance of one-thousand, on the introgression parameter for each population $i$. We then ran two chains with disparate starting values for 100,000 iterations, discarding the first 50,000 iterations as burn-in. The posterior mean and 95% credible intervals were estimated for each $m_i$.
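The logic of the slope argument can be checked numerically. The sketch below assumes the standard source-sink recursion, in which the difference between sink and source allele frequencies shrinks by a factor of (1 - m) each generation. It then recovers m from a least-squares slope through the origin. This point estimate stands in for, and is not, the Bayesian model fitted in OpenBUGS. The allele frequencies and function names are hypothetical.

```python
def sink_frequency(q0, q_source, m, n):
    """Sink allele frequency after n generations of introgression at rate m."""
    return q_source + (1 - m) ** n * (q0 - q_source)


def estimate_m(q0_sink, qn_sink, q_source, n):
    """Per-generation introgression rate from the slope across loci.

    Regresses (q_n - q_source) on (q_0 - q_source) through the origin;
    under the model the slope equals (1 - m)^n, so m = 1 - slope^(1/n).
    """
    x = [q0 - qs for q0, qs in zip(q0_sink, q_source)]
    y = [qn - qs for qn, qs in zip(qn_sink, q_source)]
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    slope = min(max(slope, 0.0), 1.0)  # keep the estimate of m inside [0, 1]
    return 1 - slope ** (1 / n)


# Hypothetical allele frequencies at four loci.
q_source = [0.30, 0.55, 0.80, 0.10]   # hatchery (source)
q0_sink = [0.45, 0.40, 0.70, 0.25]    # wild population before introgression
qn_sink = [sink_frequency(q0, qs, m=0.10, n=6)
           for q0, qs in zip(q0_sink, q_source)]

m_hat = estimate_m(q0_sink, qn_sink, q_source, n=6)
print(round(m_hat, 3))
```

With real data, sampling error in all three sets of frequencies adds scatter around the line, which is why a full probability model with priors on the allele frequencies is preferable to this simple regression.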


Results

Population Analysis

Overall, 1.45% (FST = 0.0145, P<0.00001) of the total variability was due to differences among the four spawning sites, and 0.15% (FSC = 0.0015, P<0.0001) was due to temporal differences between samples at the same site. Fisher’s pairwise exact tests for genetic differentiation echoed these results, showing significant differences (P<0.001) between historical and contemporary collections from the same sites (Table 4). The remaining 98.40% of the variability was due to genotypic differences among individuals within samples (FIS). The amount of differentiation among the four historical samples was slightly larger (FST = 0.0161, Fisher’s exact test P<0.001) than among the four contemporary samples (FST = 0.0158, Fisher’s exact test P<0.001).

Table 4. Estimates of genetic diversity and divergence (FST) between historical (H) and contemporary (C) samples of chum salmon from Prince William Sound, Alaska.

A consensus NJ tree of FST showed that the historical and contemporary collections from Wells River were most similar to the WNH stock. Each of the population pairs, except Wells River, had high bootstrap support in the tree (Figure 3). In three of the pairs of historical and contemporary samples, Ho and He were marginally smaller in contemporary samples than in historical samples (Table 4).

Figure 3. Neighbor-joining tree of FST between chum salmon samples from Prince William Sound, Alaska.

Numbers in the tree represent bootstrap support for a node.

STRUCTURE indicated that the 9 collections (both historical and contemporary) best fit a four-population model (Figure 4). Generally, the results showed genetic differentiation among the four populations that we sampled. Most individuals were assigned back to their population with probabilities of 85–95%. No differences in the probabilities of assignment appeared between archived and contemporary genotypes from a particular location. However, small genetic components from other populations appeared in each population. A small genetic signal (red) from Constantine Creek appeared in WNH and the three other populations. A small signal (green) from Beartrap Creek appeared in WNH and Wells River, but was absent in Siwash and Constantine creeks. A small signal (blue) from Wells River appeared in fish from Constantine and Siwash creeks, but a large Wells River signal appeared in WNH, reflecting the origins of WNH fish from Wells River. Virtually no genetic signal (yellow) of Siwash Creek fish appeared in WNH and the other populations. It is uncertain whether these extrinsic components are due to gene flow and hybridization, or to the similarity of some SNP genotypes among populations.

Figure 4. STRUCTURE analysis of genotypes at 135 nuclear SNPs in chum salmon from Prince William Sound, Alaska.

Individual assignments for contemporary (C) and historical (H) collections with K = 4.

Estimates of Genetic Introgression

In our source-sink model, evidence for introgression appeared as a convergence with time between allele frequencies in a wild population and allele frequencies in the hatchery. This convergence produced a positive deviation from the expected one-to-one relationship between the difference between source and sink allele frequencies before hatchery production and that difference about six generations later. Three sample pairs for Siwash Creek, Wells River, and Beartrap Creek showed a shift in allele frequencies, with the strongest shift appearing in Wells River (Figure 5a,b,c). Less introgression was detected in Constantine Creek (Figure 5d). Bayesian estimates of the per-generation introgression rate ($m_i$, where n = 6 generations) from the source-sink equation indicated that $m_i$ was significantly larger than zero in each of the four populations (Table 5, Figure 6). Wells River showed the largest rate of introgression (m = 0.257, 95% PD: 0.209–0.328), and Siwash and Beartrap creeks showed intermediate levels of introgression (m = 0.066, 0.052–0.081 and 0.060, 0.046–0.074, respectively). Constantine Creek showed the lowest, but still significant, level of introgression (m = 0.011, 0.004–0.017).
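To put the per-generation rates in perspective, the short calculation below converts each reported point estimate into the fraction of the original wild-hatchery allele-frequency difference expected to remain after six generations. It assumes the standard source-sink recursion with a constant rate m, so the retained fraction is (1 - m)^6. The helper name `retained_fraction` is hypothetical, and the credible intervals are ignored.

```python
# Point estimates of m reported in the text (per generation).
reported_m = {
    "Wells River": 0.257,
    "Siwash Creek": 0.066,
    "Beartrap Creek": 0.060,
    "Constantine Creek": 0.011,
}
N_GENERATIONS = 6


def retained_fraction(m, n):
    """Fraction of the initial sink-source difference left after n generations."""
    return (1 - m) ** n


for site, m in reported_m.items():
    kept = retained_fraction(m, N_GENERATIONS)
    print(f"{site}: retained {kept:.3f}, closed {1 - kept:.3f}")
```

Under these assumptions, roughly 17% of the initial difference would remain for Wells River, against about 94% for Constantine Creek. This matches the ordering of the slopes described for Figure 5.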

Figure 5. Plots of $q_{n,i,l} - q_l$ versus $q_{0,i,l} - q_l$ for 135 SNP loci in chum salmon in Prince William Sound, Alaska.

Dashed line represents observed curve and solid line represents expected curve without introgression. (a) Siwash Creek, (b) Wells River, (c) Beartrap Creek, (d) Constantine Creek.

Figure 6. Approach to equilibrium of per-generation introgression coefficients, m, in natural chum salmon spawning areas in Prince William Sound, Alaska.

Table 5. Source-sink model estimates of genetic introgression of hatchery genes into wild populations of chum salmon from Prince William Sound, Alaska.

Discussion


Our analysis of archived scales and contemporary collections of chum salmon in Prince William Sound demonstrates genetic introgression from hatchery strays into the four natural spawning populations examined in this study. The use of the source-sink model appears to have provided more power for detecting introgression than the use of individual assignments with STRUCTURE, which has commonly been used in other studies to detect introgression. Both STRUCTURE and NEWHYBRIDS [13] attempt to identify F1 and F2 hybrids and backcrosses in a population, but may not detect the effects of introgression over several generations. Additionally, our application of the source-sink model to analyze DNA in archived scales and contemporary samples allowed us not only to detect allele-frequency shifts, but also to estimate introgression rates over 6–7 generations.

We expended considerable effort on quality control to increase the accuracy of genotypes so that our analyses had the power to detect small hatchery influences on wild populations. First, we implemented methods to minimize the potential for genotyping bias in these procedures. For example, because some loci may be inherently more scorable than other loci in samples with poor quality DNA, we excluded such samples to reduce the inclusion of suspect genotypes. Poor quality DNA can yield valid genotypes for some loci, but not others.

Second, we devised two methods to screen for DNA contamination between scale samples, including the supplementary use of microsatellites, and model selection based on Hardy-Weinberg Expectations (HWE) for SNP loci. The estimation of genotype and allele frequencies critically depends on excluding contaminated samples. Inferences of introgression, in turn, depend on accurate estimates of these frequencies. Therefore, we excluded any archival or contemporary samples that appeared to be contaminated with DNA from other fish. About 1.2% (0–2%), on average, of the DNA extractions from the five contemporary samples tested positive for contamination with the HWE-based test, but not with microsatellites (Table 2). These putative contaminations appear to be false positives. On the other hand, 55.2% (49–61%), on average, of the extractions from the four archived scale samples tested positive for contamination, much higher than the false-positive rate for contemporary samples. These stringent measures of quality control provide considerable confidence in the subsequent analyses.

Shifts in Genetic Diversity and Population Structure

Persistent straying from a single-source population over several decades can potentially erode genetic diversity among populations. Previous studies of genetic population structure in chum salmon in PWS showed a considerable amount of genetic diversity among populations in the 1990s [42], [43]. These studies detected several statistically significant partitions between populations on the eastern side of PWS, and a major partition between eastern and western PWS populations. Our sampling was not geographically extensive enough to revisit the east-west partition; nevertheless, the STRUCTURE analysis showed strong geographic structure among populations around PWS. A small allele-frequency shift among contemporary samples (FST = 0.0158), relative to the historical samples (FST = 0.0161) may indicate convergence among these wild populations because of a common source of migrants. A similar temporal shift was detected among populations of Atlantic salmon (Salmo salar) in France; divergence at 17 microsatellite loci among historical samples was larger (mean FST = 0.080, range 0.057–0.096) than among contemporary samples (mean FST = 0.04, range 0.013–0.071) [44]. An allele-frequency shift and drop in heterozygosity in some PWS chum salmon populations are predictions of the Ryman-Laikre effect of hatchery strays on wild populations [45]. However, we cannot exclude the possibility that a decadal increase in the amount of gene flow between wild populations in PWS in response to an environmental regime shift [46] may also have led to greater similarity among wild populations.

Source-sink Model of Genetic Introgression

Our genetic analysis also indicates that the most recent contributions of brood stock to the WNH from Wells River swamped the genetic signatures of previous brood stock from other localities. Hence, we used genotypes in the WNH sample as the ‘source’ in our source-sink estimations of introgression. Even though the absolute amount of differentiation was small (FST = 0.001 to 0.010), the exact tests of differentiation were significant for each hatchery-sample comparison because of the power provided by a large number of markers. Hence, the individual assignments made with STRUCTURE did not indicate a substantial amount of recent hybridization between hatchery brood stock and wild populations, even though our source-sink model showed significant allele-frequency shifts after six generations of hatchery supplementation.

Genetic Imprints of Hatchery Strays on Wild Populations

The results for the four populations that we sampled illustrate several bio-complexities of hatchery influences on wild populations. The straying of hatchery fish is well documented by the presence of fish with thermally marked otoliths in streams where wild fish spawn. Brenner et al. [8] sampled chum salmon from 2004 to 2010 in the four streams used in our study and found that 0.0 to 62.6% of the fish were of hatchery origin (Table 6). Beartrap Creek had the lowest percentage of stray hatchery fish, with a four-year average of 0.2%, whereas Siwash Creek had a five-year average of 25.1%. Wells River, Beartrap, and Constantine creeks support the three largest chum salmon spawning populations in PWS, with average escapements that are an order of magnitude larger than those of Siwash Creek, which ranks 24th in size. The abundances and locations of populations relative to hatcheries provide a background for understanding the biological implications of our genetic introgression estimates.

Table 6. Percentage stray hatchery-reared chum salmon found in naturally spawning populations of chum salmon in Prince William Sound, Alaska from 2004 to 2010.

Constantine Creek.

The sample from Constantine Creek had the second smallest proportion of stray hatchery chum salmon (0.5%, 2005–2010) and the lowest level of genetic introgression (Figure 6). This is the third largest chum salmon producer in PWS and is located farthest from any hatchery chum salmon release site of the streams considered here. Notably, the peak run timing of Constantine Creek is later than that for fish returning to WNH by about 16 days (ADF&G unpublished data). Thus, the low introgression rate may be due to a combination of a low proportion of stray hatchery fish in the population, a large geographic distance from hatchery release sites, a difference from hatcheries in run timing, and a large population size that resists introgression.

Siwash Creek.

A high level of genetic introgression was expected in this small population (mean 3000 adults), because it is located close to the WNH and because it receives a large number of hatchery strays [8]. However, the source-sink model detected only a small amount of introgression. The low incidence of introgression may variously be due to a mismatch between the run timings of hatchery and local wild fish, behavioral differences between hatchery and wild fish that reduce successful inter-breeding, low survival of hybrid offspring, and poor homing of hybrids to Siwash Creek. Perhaps the most important factor is the contrast in run timing between WNH and Siwash Creek fish, which reduces the chances of mating between hatchery and natural fish. Aerial surveys in PWS indicate that the median time of return to Siwash Creek lags the return times of WNH Wells-River fish by about 31 days (ADF&G unpublished data). The earlier run timing of hatchery fish leads to a greater number of strays early in the season [8] when local wild fish are not spawning.

Unexpectedly low levels of introgression, despite persistent straying, have been found in other salmonids. Hendry et al. [47] showed that populations of sockeye salmon (Oncorhynchus nerka) could exchange large numbers of migrants each generation, yet remain genetically distinctive because of reduced reproductive success. In addition to straying intensity, introgression may be influenced by local population size [17] and by the survival and reproductive success of hatchery fish in the wild [3], [5], [48]. The use of hatchery brood stock with divergent run timing may, in fact, reduce hybridization between hatchery strays and wild fish in this system [49].

Beartrap Creek.

This is the second largest chum salmon run in PWS (ADF&G unpublished data). Wild populations in this creek showed intermediate levels of genetic introgression from WNH fish. A small proportion of fish from Beartrap Creek was used as brood stock at the WNH until 1986, but a genetic signature of this source is absent in WNH fish. If a genetic legacy from the early use of Beartrap Creek fish as brood stock were present in contemporary WNH brood stock, straying from WNH into this creek would tend to steepen the source-sink curve and lessen a signal of introgression (Figure 5c). Even though Beartrap Creek is located a considerable distance from WNH, fish entering PWS may be attracted to Beartrap Creek, because it is a large drainage system similar to Wells River, which was the source of the WNH brood stock. Native fish spawning in Beartrap Creek also have early run timing, which is similar to that for fish returning to the WNH and Wells River.

Wells River.

Wells River chum salmon showed the highest level of genetic introgression from hatchery salmon. This river supports the largest chum salmon population in PWS with an estimated mean run-size of 23,100 fish in the past few years (Table 6). This population shows relatively high levels of introgression despite small numbers of hatchery strays. Interacting mechanisms may explain the results for the various populations. Wells River chum salmon were the primary fish used to develop the current WNH brood stock. Hence, these hatchery fish may be more successful in spawning with Wells River fish, because of common recent ancestry. Indeed, our analysis of neutral genetic markers showed a close genetic relationship between WNH fish and the historical and contemporary samples. The two populations are also phenotypically similar, with mean egg-take at WNH occurring on July 13 (2000–2011; unpublished WNH annual reports) and with the midpoint of chum salmon escapement into Wells River on July 22 (ADF&G unpublished data). The genetic similarity measured by neutral markers and similarity in run timing may indicate that other adaptive characteristics are also similar, conferring reproductive compatibility. A constant low level of successful spawning between WNH and Wells River fish could result in the high level of introgression detected in this study.

An alternative, but not exclusive, explanation for these results is that low abundances of wild fish in a stream lead to high levels of introgression for the same intensity of straying from the hatchery. Wells River chum salmon populations were depressed during 1991–1994 and 1997–1998, with an average estimated size of only 7200 spawners. These population sizes were roughly 30% of the average size of 23,089 fish (1979–2010). A proportionately larger amount of interbreeding with hatchery fish may have occurred during these years. A single successful introgression event could spread through subsequent generations, because of overlapping generations of spawning fish combined with the incorporation of hatchery genotypes into the population. As for Siwash Creek, published percentages of hatchery strays in Wells River were calculated from simple averages unweighted for overall escapement and may therefore not reflect the proportion of hatchery fish in this population. For example, although hatchery strays at Wells River averaged 7.4% over the entire spawning season (Table 6), individual samples taken during the run contained as much as 23% hatchery chum salmon (ADF&G unpublished data).
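The difference between a simple average of stray percentages and an average weighted by escapement can be illustrated with a short sketch. The survey counts and escapement values below are hypothetical numbers chosen only for illustration; they are not taken from Table 6 or from ADF&G data, and the function names are our own.

```python
def unweighted_mean_pct(surveys):
    """Simple average of the per-survey hatchery percentages."""
    pcts = [100.0 * s["hatchery"] / s["sampled"] for s in surveys]
    return sum(pcts) / len(pcts)


def escapement_weighted_pct(surveys):
    """Average hatchery percentage, weighting each survey by the number of
    fish estimated to be in the stream when the sample was taken."""
    total_fish = sum(s["escapement"] for s in surveys)
    hatchery_fish = sum(
        s["escapement"] * s["hatchery"] / s["sampled"] for s in surveys
    )
    return 100.0 * hatchery_fish / total_fish


# Hypothetical surveys (assumed values): strays are most common early in
# the run, when few wild fish are present in the stream.
surveys = [
    {"sampled": 100, "hatchery": 23, "escapement": 1000},   # early run
    {"sampled": 100, "hatchery": 5, "escapement": 12000},   # peak of run
    {"sampled": 100, "hatchery": 1, "escapement": 10000},   # late run
]

simple = unweighted_mean_pct(surveys)
weighted = escapement_weighted_pct(surveys)
print(f"unweighted: {simple:.1f}%  escapement-weighted: {weighted:.1f}%")
```

With these made-up numbers the simple average is about 9.7%, whereas the escapement-weighted value is about 4.0%, which shows how an unweighted average can misstate the proportion of hatchery fish in the spawning population when straying and escapement peak at different times.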


The results of our study yield three important insights into detecting genetic introgression of hatchery-reared chum salmon into wild populations. First, DNA extractions from archived scales collected without using protocols for genetic sampling showed high levels of contamination with DNA from other fish. About half of the samples we attempted to use showed signs of contamination. Questionable genotypes from contaminated DNA extractions would have compromised the accuracy and power of our analyses. Rigorous quality control is an essential step in the genetic analysis of archived fish scales.

Second, the use of the source-sink model to search for temporal allele-frequency shifts appears to be a powerful method for documenting small amounts of genetic introgression over several generations. Unlike other models used for estimating genetic introgression, the source-sink model in a Bayesian context yields a credible interval around the introgression rate. Assignments by the program STRUCTURE, which attempts to resolve population structure from the fit to Hardy-Weinberg expectations and linkage disequilibrium, may not provide enough statistical power to detect small levels of introgression beyond one or two generations. Random mating is expected to restore Hardy-Weinberg proportions in a single generation, and pseudo-linkages are expected to decay in a few generations. Small probabilities of assignment to the wrong population in our analysis likely represent type I error because of the close similarity among populations. These errors were of the same order of magnitude as the probabilities of assignment to the WNH brood stock or to Wells River, the origin of the brood stock. Hence, these individual assignments may not provide evidence of introgression.
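The argument about Hardy-Weinberg proportions and linkage disequilibrium rests on two textbook results that can be checked numerically. The sketch below assumes a single biallelic locus, discrete generations, random mating, and no selection or drift; the allele frequencies, stray fraction, and recombination rate are illustrative values, not estimates from this study.

```python
def hw(p):
    """Hardy-Weinberg genotype frequencies (AA, Aa, aa) for allele frequency p."""
    return (p * p, 2 * p * (1 - p), (1 - p) ** 2)


def mixed_genotype_freqs(p_wild, p_hatch, m):
    """Genotype frequencies in a stream holding a fraction m of hatchery
    strays and 1 - m wild fish, each group itself in HW proportions."""
    wild, hatch = hw(p_wild), hw(p_hatch)
    return tuple((1 - m) * w + m * h for w, h in zip(wild, hatch))


def allele_freq(genos):
    """Frequency of allele A from (AA, Aa, aa) genotype frequencies."""
    aa, ab, _ = genos
    return aa + 0.5 * ab


def random_mating(genos):
    """Offspring genotype frequencies after one generation of random mating."""
    return hw(allele_freq(genos))


def ld_after(d0, r, generations):
    """Linkage disequilibrium after t generations: D_t = D_0 * (1 - r)**t."""
    return d0 * (1 - r) ** generations


# Illustrative (assumed) values: a mixed sample of wild fish and strays.
parents = mixed_genotype_freqs(p_wild=0.2, p_hatch=0.8, m=0.5)
p = allele_freq(parents)
offspring = random_mating(parents)

print("heterozygotes in mixed parents:", round(parents[1], 4))
print("HW expectation 2pq:            ", round(2 * p * (1 - p), 4))
print("heterozygotes in offspring:    ", round(offspring[1], 4))
print("LD after 3 generations, r=0.5: ", ld_after(0.1, 0.5, 3))
```

In the mixed parental sample, heterozygotes are deficient relative to the Hardy-Weinberg expectation, but the deficit disappears in the offspring, and disequilibrium between unlinked loci halves each generation. Signals of this kind therefore fade quickly, whereas a shift in allele frequencies persists.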

Third, the results of our study yield insights into the extent to which hatchery strays have influenced wild populations of chum salmon in PWS after more than 30 years of large-scale hatchery production. These results show that some populations are more susceptible to genetic introgression by hatchery strays than other populations. Both proximity to a hatchery and the intensity of straying were less important, under some circumstances, than similarity in spawning time. Mismatches in other life-history traits may also be important in retarding genetic introgression into wild populations. Nevertheless, our results show a general convergence of allele frequencies in wild populations toward hatchery allele frequencies. While this convergence demonstrates introgression at neutral genes, the fundamental concern is over the effect of introgression on adaptive variation. Introgression from hatchery strays applies equal pressure on neutral and adaptive genes in wild populations, but it is uncertain to what extent genes underlying adaptation are resilient to introgression [50]. Future research is needed to understand the extent to which wild populations are adapted to a particular spawning site and the extent to which the introgression of hatchery genes interrupts this adaptation.
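A convergence of wild allele frequencies toward hatchery frequencies can be summarized by a single mixing fraction. The sketch below is not the Bayesian source-sink model implemented in OpenBUGS (Table S5). It is a simplified least-squares analogue which assumes that, at each locus, the contemporary wild frequency is a linear mix of the historical wild frequency and the hatchery frequency, and which ignores genetic drift and sampling error. The function name and all allele frequencies are hypothetical.

```python
def mixing_fraction(p_hist, p_contemp, p_source):
    """Least-squares estimate of m in
        p_contemp = (1 - m) * p_hist + m * p_source
    fitted across loci (regression through the origin of the observed
    shift on the historical wild-to-hatchery gap)."""
    num = 0.0
    den = 0.0
    for ph, pc, ps in zip(p_hist, p_contemp, p_source):
        gap = ps - ph            # distance from historical wild to hatchery
        num += (pc - ph) * gap   # observed shift projected onto that gap
        den += gap * gap
    if den == 0.0:
        raise ValueError("historical and source frequencies are identical")
    return num / den


# Hypothetical allele frequencies at three loci (assumed values).
historical = [0.10, 0.50, 0.80]
hatchery = [0.30, 0.40, 0.60]

# Build contemporary frequencies from a known mixing fraction of 0.2,
# then check that the estimator recovers it.
true_m = 0.2
contemporary = [(1 - true_m) * h + true_m * s
                for h, s in zip(historical, hatchery)]

estimate = mixing_fraction(historical, contemporary, hatchery)
print(f"estimated mixing fraction: {estimate:.3f}")
```

A regression of this kind gives only a point estimate. One advantage of the Bayesian formulation used in this study is that it also provides an interval around the introgression rate.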

Supporting Information

Table S1.

Expected (HE) and observed (HO) heterozygosity for 188 single nucleotide polymorphisms in populations of chum salmon in Prince William Sound, Alaska.


Table S2.

Microsatellites used to test for contamination of chum salmon samples.


Table S3.

OpenBUGS code to test for contamination between samples.


Table S4.

Pairs of single nucleotide polymorphisms (SNPs) exhibiting significant linkage disequilibrium in at least 4 of 9 collections in Prince William Sound chum salmon.


Table S5.

OpenBUGS code to implement source-sink model of introgression.



Acknowledgments

We thank R. Brewer, B. Behrends, J. Canfield, K. Froning, J. O’Rourke and M. O’Rourke for sample collections. The International Program for Salmon Ecological Research, University of Washington, provided unpublished scale extraction methods and SNP assays. M. Grant, R. Josephson, W. Templin, S. Turner, E. Volk, and two anonymous reviewers provided insightful comments on the manuscript. This is contribution PP-274 of Alaska Department of Fish and Game, Commercial Fisheries Division.

Author Contributions

Conceived and designed the experiments: JRJ SM RB CH. Performed the experiments: ECF SDRO ZG. Analyzed the data: JRJ JM WSG. Wrote the paper: JRJ WSG JM BL CH RB.


References

  1. Ford MJ (2002) Selection in captivity during supportive breeding may reduce fitness in the wild. Conservation Biology 16: 815–825.
  2. Kostow KE (2004) Differences in juvenile phenotypes and survival between hatchery stocks and a natural population provide evidence for modified selection due to captive breeding. Canadian Journal of Fisheries and Aquatic Sciences 61: 577–589.
  3. Araki H, Ardren WR, Olsen E, Cooper B, Blouin MS (2007a) Reproductive success of captive-bred steelhead trout in the wild: evaluation of three hatchery programs in the Hood River. Conservation Biology 21: 181–190.
  4. Marie AD, Bernatchez L, Garant D (2010) Loss of genetic integrity correlates with stocking intensity in brook charr (Salvelinus fontinalis). Molecular Ecology 19: 2025–2037.
  5. Hansen MM, Ruzzante DE, Nielsen EE, Mensberg K-LD (2000) Microsatellite and mitochondrial DNA polymorphism reveals life-history dependent interbreeding between hatchery and wild brown trout (Salmo trutta L.). Molecular Ecology 9: 583–594.
  6. Hansen MM, Mensberg K-LD (2009) Admixture analysis of stocked brown trout populations using mapped microsatellite DNA markers: indigenous trout persist in introgressed populations. Biology Letters 5: 656–659.
  7. Gow JL, Tamkee P, Heggenes J, Wilson GA, Taylor EB (2011) Little impact of hatchery supplementation that uses native broodstock on the genetic structure and diversity of steelhead trout revealed by a large scale spatio-temporal microsatellite survey. Evolutionary Applications 4: 763–782.
  8. Brenner RE, Moffitt SD, Grant WS (2012) Straying of hatchery salmon in Prince William Sound, Alaska. Environmental Biology of Fishes 94: 179–195.
  9. Rand PS, Berejikian BA, Pearsons TN, Noakes DLG (2012) Ecological interactions between wild and hatchery salmonids: an introduction to the special issue. Environmental Biology of Fishes 94: 1–6.
  10. Hansen MM (2002) Estimating the long-term effects of stocking domesticated trout into wild brown trout (Salmo trutta) populations: an approach using microsatellite DNA analysis of historical and contemporary samples. Molecular Ecology 11: 1003–1015.
  11. Ford MJ, Fuss H, Boelts B, LaHood E, Hard J, et al. (2006) Changes in run timing and natural smolt production in a naturally spawning coho salmon (Oncorhynchus kisutch) population after 60 years of intensive hatchery supplementation. Canadian Journal of Fisheries and Aquatic Sciences 63: 2343–2355.
  12. Pritchard JK, Stephens M, Donnelly P (2000) Inference on population structure using multilocus genotype data. Genetics 155: 945–959.
  13. Anderson EC, Thompson EA (2002) A model-based method for identifying species hybrids using multilocus genetic data. Genetics 160: 1217–1229.
  14. Marie AD, Bernatchez L, Garant D (2011) Empirical assessment of software efficiency and accuracy to detect introgression under variable stock scenarios in brook charr (Salvelinus fontinalis). Conservation Genetics 12: 1215–1227.
  15. Hansen MM, Bekkevold D, Jensen LF, Mensberg K-LD, Nielsen EE (2006) Genetic restoration of a stocked brown trout Salmo trutta population using microsatellite DNA analysis of historical and contemporary samples. Journal of Applied Ecology 43: 669–679.
  16. Iwamoto EM, Myers JM, Gustafson RG (2012) Resurrecting an extinct salmon evolutionary significant unit: archived scales, historical DNA and implications for restoration. Molecular Ecology 21: 1567–1582.
  17. Hansen MM, Fraser DJ, Meier K, Mensberg K-LD (2009) Sixty years of anthropogenic pressure: a spatio-temporal genetic analysis of brown trout populations subject to stocking and population declines. Molecular Ecology 18: 2549–2562.
  18. Vercessi L (2012) Alaska salmon fisheries enhancement program 2011 annual report. Alaska Department of Fish and Game, Fishery Management Report No. 12-04, Anchorage.
  19. Botz J, Hollowell G, Sheridan T, Brenner R, Moffitt S (2012) 2010 Prince William Sound area finfish management report. Alaska Department of Fish and Game, Fishery Management Report No. 12-06, Anchorage, AK.
  20. Smith CT, Elfstrom CM, Seeb JE, Seeb LW (2005a) Use of sequence data from rainbow trout and Atlantic salmon for SNP detection in Pacific salmon. Molecular Ecology 14: 4193–4203.
  21. Smith CT, Baker J, Park L, Seeb LW, Elfstrom CM, et al. (2005b) Characterization of 13 single nucleotide polymorphism markers for chum salmon. Molecular Ecology Notes 5: 259–262.
  22. Elfstrom CM, Smith CT, Seeb LW (2007) Thirty-eight single nucleotide polymorphism markers for high-throughput genotyping of chum salmon. Molecular Ecology Notes 7: 1211–1215.
  23. Dann TH, Habicht C, Jasper JR, Hoyt HA, Barclay AW, et al. (2009) Genetic stock composition of the commercial harvest of sockeye salmon in Bristol Bay, Alaska, 2006–2008. Alaska Department of Fish and Game, Fishery Manuscript Series No. 09-06, Anchorage.
  24. Seeb JE, Pascal CE, Grau ED, Seeb LW, Templin WD, et al. (2011) Transcriptome sequencing and high-resolution melt analysis advance single nucleotide polymorphism discovery in duplicated salmonids. Molecular Ecology Resources 11: 335–348.
  25. Petrou EL, Hauser L, Waples RS, Seeb JE, Templin WD, et al. (in revision) Secondary contact and changes in coastal hydrology influence the nonequilibrium population structure of a salmonid (Oncorhynchus keta). Molecular Ecology Resources.
  26. Guo SW, Thompson EA (1992) Performing the exact test for Hardy-Weinberg proportion for multiple alleles. Biometrics 48: 361–372.
  27. Rousset F (2008) genepop’007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources 8: 103–106.
  28. Rice WR (1989) Analyzing tables of statistical tests. Evolution 43: 223–225.
  29. Smith MJ, Pascal CE, Grauvogel Z, Habicht C, Seeb JE, et al. (2011) Multiplex preamplification PCR and microsatellite validation enables accurate single nucleotide polymorphism genotyping of historical fish scales. Molecular Ecology Resources 11: 268–277.
  30. Buchholz W, Miller SJ, Spearman WJ (2001) Isolation and characterization of chum salmon microsatellite loci and use across species. Animal Genetics 32: 160–167.
  31. Smith CT, Koop BF, Nelson RJ (1998) Isolation and characterization of coho salmon (Oncorhynchus kisutch) microsatellites and their use in other salmonids. Molecular Ecology 7: 1614–1616.
  32. Banks MA, Blouin MS, Baldwin BA, Rashbrook VK, Fitzgerald HA, et al. (1999) Isolation and inheritance of novel microsatellites in chinook salmon (Oncorhynchus tshawytscha). Journal of Heredity 90: 281–288.
  33. Small MP, Beacham TD, Withler RE, Nelson RJ (1998) Discriminating coho salmon (Oncorhynchus kisutch) populations within the Fraser River, British Columbia, using microsatellite DNA markers. Molecular Ecology 7: 141–155.
  34. Rannala B, Mountain JL (1997) Detecting immigration by using multilocus genotypes. Proceedings of the National Academy of Sciences, USA 94: 9197–9201.
  35. Kass RE, Raftery AE (1995) Bayes factors. Journal of the American Statistical Association 90: 773–795.
  36. Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1: 47–50.
  37. Goudet J (1995) FSTAT (Version 1.2): A Computer Program to Calculate F-Statistics. Journal of Heredity 86: 485–486.
  38. Raymond M, Rousset F (1995) An exact test for population differentiation. Evolution 49: 1280–1283.
  39. Goudet J (2006) hierfstat: Estimation and tests of hierarchical F-statistics. R package version 0.04-4.
  40. Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289–290.
  41. Pulliam HR (1988) Sources, sinks, and population regulation. American Naturalist 132: 652–661.
  42. Seeb LW, Crane PA, Debevec EM (1997a) Genetic analysis of chum salmon harvested in the South Unimak and Shumagin Islands June fisheries, 1993–1996. Alaska Department of Fish and Game Regional Information Report 5J97-17. On the web at:
  43. Seeb LW, Crane PA, Debevec EM (1997b) Supplementary appendices: Genetic analysis of chum salmon harvested in the South Unimak and Shumagin Islands June fisheries, 1993–1996. Alaska Department of Fish and Game Regional Information Report 5J97-18, 27 p.
  44. Perrier C, Baglinière J-L, Evanno G (2013) Understanding admixture patterns in supplemented populations: a case study combining molecular analyses and temporally explicit simulations in Atlantic salmon. Evolutionary Applications 6: 218–230.
  45. Ryman N, Laikre L (1991) Effects of supportive breeding on the genetically effective population size. Conservation Biology 5: 325–329.
  46. Mantua NJ, Hare SR (2002) The Pacific Decadal Oscillation. Journal of Oceanography 58: 35–44.
  47. Hendry AP, Wenburg JK, Bentzen P, Volk EC, Quinn TP (2000) Rapid evolution of reproductive isolation in the wild: evidence from introduced salmon. Science 290: 516–518.
  48. Araki H, Cooper B, Blouin MS (2007) Genetic effects of captive breeding cause a rapid, cumulative fitness decline in the wild. Science 318: 100–103.
  49. Seamons TR, Hauser L, Naish KA, Quinn TP (2012) Can interbreeding of wild and artificially propagated animals be prevented by using broodstock selected for a divergent life history? Evolutionary Applications 5: 705–719.
  50. Fraser DJ, Weir LK, Bernatchez L, Hansen MM, Taylor EB (2011) Extent and scale of local adaptation in salmonid fishes: review and meta-analysis. Heredity 106: 404–420.