Abstract
The biochemical complexity and evolutionary diversity of snake venom composition reflect adaptation to the diversity of prey in their diets. However, the genetic mechanisms underlying the evolutionary diversity of venoms are not well understood. Here, we explored the potential extent of and genetic basis for venom protein variation in the widely distributed Western Diamondback rattlesnake (Crotalus atrox). As in many rattlesnake venoms, metalloproteinases (SVMPs) are the major component of C. atrox venom, with three proteins belonging to three distinct major structural SVMP classes, MDC4, MAD3a, and MPO1, constituting the most abundant SVMPs. We found that while most venom proteins, including MDC4 and MAD3a, vary little among individuals, the MPO1 protein is completely absent from some animals, most commonly those from the western part of the species’ geographic range. This distribution correlates with the previous finding of two distinct lineages within C. atrox and indicates that different ecological factors have shaped venom composition across the species’ range. We further show that the loss of MPO1 expression is not due to transcriptional down-regulation, but to independent inactivating mutations at the locus, including whole gene deletion. The recurrent inactivation of a major toxin gene within a C. atrox population may reflect relaxed selection on the maintenance of MPO1 function, but we also raise the possibility that the loss of venom components may be favored if there is a cost to producing a less effective toxin in protein-rich venoms.
Citation: Dowell NL, Cahill E, Carroll SB (2025) Loss of a major venom toxin gene in a Western Diamondback rattlesnake population. PLoS One 20(7): e0319316. https://doi.org/10.1371/journal.pone.0319316
Editor: Karen de Morais-Zani, Instituto Butantan, BRAZIL
Received: February 6, 2025; Accepted: June 4, 2025; Published: July 3, 2025
Copyright: © 2025 Dowell et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data underlying this study are available in the NCBI Sequence Read Archive under BioProject PRJNA1142048. Accession numbers for specific data (raw reads) for each specimen are available in tables on FigShare (https://doi.org/10.6084/m9.figshare.27244716 and https://doi.org/10.6084/m9.figshare.27244740). Data tables associated with peptide counts (https://doi.org/10.6084/m9.figshare.26499178), processed RNA (https://doi.org/10.6084/m9.figshare.26503990), processed DNA reads (atx021: https://doi.org/10.6084/m9.figshare.27154119, atx022: https://doi.org/10.6084/m9.figshare.27154110, atx235: https://doi.org/10.6084/m9.figshare.27154098, atx236: https://doi.org/10.6084/m9.figshare.27154095, atx237: https://doi.org/10.6084/m9.figshare.27154083, atx238: https://doi.org/10.6084/m9.figshare.27154080, atx239: https://doi.org/10.6084/m9.figshare.27154068, atx240: https://doi.org/10.6084/m9.figshare.27154059) and assembled contigs (atx022: https://doi.org/10.6084/m9.Figshare.27146163, atx235: https://doi.org/10.6084/m9.Figshare.27146163, atx236: https://doi.org/10.6084/m9.figshare.27154005, atx237: https://doi.org/10.6084/m9.figshare.27154023, atx238: https://doi.org/10.6084/m9.figshare.27154026, atx239: https://doi.org/10.6084/m9.figshare.27154032, atx240: https://doi.org/10.6084/m9.Figshare.27154035) are available on FigShare.
Funding: Andrew and Mary Balo and Nicholas and Susan Simon Endowed Chair at the University of Maryland (SBC) https://giving.umd.edu/giving/fund.php?name=andrew-and-mary-balo-endowed-professorship-in-cell-biology-and-molecular-genetics. Howard Hughes Medical Institute (SBC) hhmi.org. Funders played no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Key ecological traits often evolve through co-evolutionary interactions among organisms (e.g., predator-prey, plant-herbivore, host-parasite) [1–3]. Such interactions can drive trait diversification and may ultimately facilitate the exploitation of new environments and resources. A major priority in evolutionary biology is to identify the genes underlying key traits and the molecular basis of functional variation within them.
Animal venoms are key ecological traits that are most often employed for subduing prey, and venom composition coevolves within the framework of predator-prey interactions [4–11]. Snake venoms consist of protein mixtures that often vary in composition and sometimes mode of action even between closely related species [12–14]. One general question concerning the molecular evolution of venom composition is the extent to which differences between species are due to differences in gene content or gene regulation. In rattlesnakes, one of the more studied groups, there is evidence for gene content differences underlying venom divergence, including both the expansion of certain toxin gene families [15] and the loss of gene family members in certain lineages [16,17].
Of course, such interspecific differences must initially arise through variation within species, and there is growing evidence from field studies of dynamic variation associated with both geography and prey [18–23]. For example, in the widely distributed rattlesnake Crotalus viridis viridis, variation in venom composition occurs along the north-south axis of its range and is associated with differences in the availability of different types of prey [24]. Similarly, the discovery of ground squirrels resistant to the effects of sympatric C. oreganus oreganus rattlesnake venom indicates an ongoing evolutionary arms race in which both venom toxicity and prey resistance evolve adaptively in response to each other [6]. In C. adamanteus, both adult and juvenile snakes from discrete geographic populations express venoms with variable composition [25,26]. While the specific genetic bases of venom variation within these species are not yet known, it has been shown in other rattlesnakes that polymorphisms in venom type (neurotoxic v. hemorrhagic) within several species are due to differences in gene content [17,22]. The discovery of gene loss underlying venom diversity and variation suggests that venom components may be dispensable under some conditions and that the evolution of venom diversity does not occur solely through the addition of new toxins.
Here, to further investigate the extent and potential genetic basis of variation in snake venoms, we have explored variation in C. atrox because this species occupies a wide expanse and range of habitats in North America and possesses the largest known venom toxin gene family with at least 30 members of the snake venom metalloproteinase family [15]. Venom MPs are the most abundant component of C. atrox venom (by weight) and are primarily responsible for its hemorrhagic activity [27]. Three structural classes (P-I, P-II and P-III) of venom MPs are expressed by distinct genes that encode different combinations of the class-defining domains (e.g., metalloproteinase only (MPO); metalloproteinase and disintegrin (MAD); and metalloproteinase, disintegrin and cysteine-rich (MDC) for classes P-I, P-II, and P-III, respectively) [15,28]. All three classes of MP can damage micro vessels and contribute to hemorrhage but the extent of damage and mode of action varies between the MP classes [29].
We find that most venom proteins vary little in expression among individuals including the abundantly expressed metalloproteinases (MPs) MDC4 and MAD3a. However, MPO1, the only P-I class MP in the species, is abundant in some specimens but absent in others due to a variety of gene-inactivating mutations (gene deletions and fusions). Furthermore, the loss of MPO1 expression occurs predominantly in animals found in the western part of the species range, indicating that there are different requirements for venom composition and MPO1 function across the species range. We suggest that the loss of venom components within a population may be due to relaxation of selection and neutral drift but could be favored if there is a cost to producing a less effective toxin.
Materials and methods
Animals and biological materials
The authors collected no specimens for this study. All animals used in this work were housed at and in the care of the Viper Resource Center of the National Natural Toxins Research Center, Texas A&M University-Kingsville. Locality data for individual snakes was provided by the Center.
Venom (27 animals), blood (8 animals), and venom glands (7 animals) were obtained from individual C. atrox specimens according to study protocols reviewed and approved by the Texas A&M Kingsville Institutional Animal Care and Use Committee in compliance with all applicable federal regulations governing the protection of research animals (Viper Resource Center at Texas A&M University-Kingsville, IACUC #2018-11-09-A3 and #2021-11-29/1474).
Mass spectrometry of total venom
Sample preparation and mass spectrometry of individual C. atrox venoms (9 animals) were performed as described previously [30] at the Mass Spectrometry Facility at the University of Wisconsin-Madison. In brief, lyophilized venoms were rehydrated and digested with trypsin solution and ProteaseMAX surfactant for three hours at 42°C. Peptides from digestion were solid-phase extracted (C18 pipette tips) and eluted off the C18 SPE column. Chromatography of peptides was performed using a capillary emitter column (PepMap C18, 3 µm, 100 Å, 150 x 0.075 mm, Thermo Fisher Scientific) and mass spectral analysis used NanoLC-MS/MS (Agilent 1100 nanoflow system) connected to a hybrid linear ion trap-orbitrap mass spectrometer (LTQ-Orbitrap Elite, Thermo Fisher Scientific) equipped with an EASY-Spray electrospray source.
Raw MS/MS data were converted to the Mascot generic format (mgf; MSConvert) and used in Mascot (Matrix Science) searches against a C. atrox venom proteome. Initial protein identification and spectral quantification relied on Scaffold software. Peptide counts were exported from Scaffold and used for downstream analysis. The presentation of between-specimen measurements of peptide counts for any specific protein is a relative comparison of venom protein abundance. The high sequence similarity between venom metalloproteinases (MPs) prompted us to use counts of exclusive unique peptides (or spectra) to assess venom metalloproteinase abundance. Additionally, percent peptide coverage of the metalloproteinase domain was used to distinguish between expression of a complete or partial protein. The detection of any peptides from a specimen that is inferred to have deleted that gene could be due to technical reasons, for example, low-level instrument contamination between specimens or misassignment of a particular peptide to a specific protein due to conserved sequences. Using exclusive unique peptides and coverage can minimize, but does not eliminate, the challenges of accurately determining which proteins in large gene families are present in or absent from venom.
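The two measures described above, exclusive unique peptides and percent coverage of the metalloproteinase domain, can be illustrated with a minimal Python sketch. This is not the Scaffold implementation or the code used in this study; the function names, the input layout (peptide-to-protein sets and 1-based inclusive peptide coordinates) and the example peptide strings are assumptions made for illustration only.

```python
def exclusive_unique_peptides(peptide_to_proteins):
    """Group peptides that match exactly one protein by that protein.

    peptide_to_proteins: dict mapping a peptide sequence to the set of
    protein identifiers it matches (hypothetical input layout).
    """
    exclusive = {}
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) == 1:  # shared peptides are ignored
            protein = next(iter(proteins))
            exclusive.setdefault(protein, []).append(peptide)
    return exclusive


def domain_coverage(peptides, domain_start, domain_end):
    """Percent of domain residues covered by at least one peptide.

    peptides: iterable of (start, end) positions on the protein,
    1-based and inclusive. The domain bounds use the same convention.
    """
    # Clip each peptide to the domain and drop peptides outside it.
    clipped = sorted(
        (max(start, domain_start), min(end, domain_end))
        for start, end in peptides
        if end >= domain_start and start <= domain_end
    )
    covered = 0
    run_start = run_end = None
    for start, end in clipped:
        if run_end is None or start > run_end + 1:
            # Close the previous merged run and open a new one.
            if run_end is not None:
                covered += run_end - run_start + 1
            run_start, run_end = start, end
        else:
            # Overlapping or adjacent peptide: extend the current run.
            run_end = max(run_end, end)
    if run_end is not None:
        covered += run_end - run_start + 1
    return 100.0 * covered / (domain_end - domain_start + 1)
```

Merging overlapping peptides before summing avoids counting a residue twice, so a protein detected by many redundant peptides in one segment is not mistaken for one with full domain coverage.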
Venom gland RNA sequencing library construction and data processing
Isolation of RNA from venom glands was performed as described previously [17]. Illumina sequencing libraries were generated using the TruSeq Stranded mRNA kit. The sequencing libraries were created using size selected RNA (300–800 bp). The venom gland libraries were sequenced in a single HiSeq2500 lane for 2 x 150 cycles producing paired reads 150 nucleotides in length.
Iso-seq libraries were generated using the SMRTbell Express Template Prep Kit 2.0 for the Sequel II system v8.0 with Sequel II Chemistry 2.0. Reads were processed using the Pacific Biosciences recommended processing pipeline (https://isoseq.how). The data processing included five steps: consensus generation, demultiplexing, refinement, clustering and collapsing of mapped reads (https://isoseq.how/clustering/schematic-workflow.html). First, raw reads were used to generate circular consensus sequence (CCS) reads. Next, primers were removed from the CCS reads. Then the reads were refined through the removal of polyA tails and concatemers. These full-length non-concatemer (FLNC) reads were clustered using a hierarchical alignment strategy. Three criteria were used for clustering two (or more) sequences: i) the 5’ overhang is less than 100 base pairs (bp), ii) the 3’ overhang is less than 30 bp and iii) any gaps are less than 10 bp (https://isoseq.how/clustering/faq.html). In general, clustering removes reads containing small (< 10 bp) indels or originating from sequencing of a partially (< 100 bp from the 5’ end) degraded RNA molecule. The clustered read set still contains putative isoforms (e.g., the splicing out of most exons would create a gap > 10 bp, so the read is treated as unique) and possible noise from sequencing of degraded molecules missing > 100 bp from the 5’ end. All clustered reads were then mapped (Minimap2) to the C. atrox reference genome [15,16,31]. Finally, the genome mapping of clustered reads permitted the identification of sets of gene-associated read clusters (e.g., for any single gene we often observed the genomic alignment of a highly similar set of reads for a full-length molecule (cluster 1) and the reads for a putative spliced isoform (cluster 2)). Collapsing of many reads into a single consensus sequence occurred only within clustered read sets.
The criteria for clustering and collapsing reads did not appear to significantly reduce the breadth of isoform diversity based on inspection of genomic alignments of clustered and collapsed reads across the metalloproteinase gene complex.
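The three clustering criteria can be written as a single predicate. The sketch below is illustrative only and is not the Pacific Biosciences implementation; the `PairwiseAlignment` record and the `can_cluster` function are hypothetical names, and the thresholds are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass
class PairwiseAlignment:
    """Summary of an alignment between two FLNC reads (hypothetical record)."""
    five_prime_overhang: int   # unaligned bases at the 5' end, in bp
    three_prime_overhang: int  # unaligned bases at the 3' end, in bp
    max_gap: int               # longest internal gap, in bp


def can_cluster(aln, max_5p=100, max_3p=30, max_gap=10):
    """Return True when two reads satisfy all three clustering criteria."""
    return (
        aln.five_prime_overhang < max_5p
        and aln.three_prime_overhang < max_3p
        and aln.max_gap < max_gap
    )
```

Under these thresholds, a read missing 60 bp from its 5’ end would be merged with its full-length counterpart, whereas a read with a skipped exon (an internal gap well above 10 bp) would remain in a separate cluster as a putative isoform.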
The MP paralogs are similar at the nucleotide level, raising the possibility that reads from different MP genes could erroneously cluster together. However, inspection of genomic mapping of clustered reads across the MP gene complex (~1.2 mega base pairs and 30 annotated MP genes) revealed alignment of distinct reads at many MP genes, suggesting that gene-specific read clusters were preserved during data processing even among closely related paralogs. In general, we observe a diverse set of putative MP isoforms from several classes (exon skipping, intron retention, truncation) (S13 Fig of MDC4). These putative MP isoforms contained multiple full-length transcripts that were highly similar to each other, and it is difficult to determine if these nearly identical transcripts represent biological differences (e.g., different untranslated region (UTR) lengths resulting from different transcriptional start sites (TSSs)) or artifacts of sequencing (e.g., degraded RNAs). Therefore, in order to accurately assess relative expression differences of full-length MPO1 between specimens we used the reference genome-derived MP transcripts.
Analysis of venom gland gene expression using short and long read data.
To quantify relative gene expression differences, we built a meta-transcriptome consisting of reference genome MP transcripts and non-MP venom gland isoseq reads. This approach of combining annotated MP reference transcripts with unannotated non-MP isoseq reads attempts to strike a balance between accurate assessment of MP expression variation while accepting the reduced accuracy of any detected non-MP gene expression differences. Reads from high throughput short read sequencing of venom gland RNA were pseudo-aligned to the meta-transcriptome (Kallisto version 0.48.0) [32]. Differential gene expression analysis was done using edgeR [33] and log2-transformed counts and log fold-change were visualized using R [34].
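For readers unfamiliar with the units reported for transcript abundance, the following Python sketch shows how transcripts per million (TPM) are derived from estimated counts and effective transcript lengths, and how they are log2-transformed for plotting. It does not reproduce Kallisto's estimation or the edgeR analysis used here; the function names and the pseudocount of 1 are assumptions chosen for illustration.

```python
import math


def tpm(est_counts, eff_lengths):
    """Convert per-transcript estimated counts to transcripts per million.

    est_counts and eff_lengths are parallel sequences; effective lengths
    must be positive and at least one count must be non-zero.
    """
    # Reads per base of effective length for each transcript.
    rates = [count / length for count, length in zip(est_counts, eff_lengths)]
    total = sum(rates)
    # Scale so that the values for one sample sum to one million.
    return [rate / total * 1e6 for rate in rates]


def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount keeps zeros finite."""
    return [math.log2(value + pseudocount) for value in values]
```

Because TPM values are normalized within each sample, they support the relative between-specimen comparisons described here but are not absolute measures of transcript copy number.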
Distinction between detection of low and zero expression of full-length transcripts.
After mapping short reads to the non-MP isoseq reads and MP transcripts [35] and calculating coverage across the transcripts [36], we found in one region of the MPO1 transcript encoded by exons three and four zero coverage in all low-expressing specimens but detectable (low) coverage in all high-expressing specimens. This finding indicates a difficult to sequence region or high sequence identity between exons three and four of MP genes that in turn leads to a reduced number of unambiguously mapped reads (S7A, B Fig). Therefore, this region has limited utility in detecting relative expression differences because any specimen with zero coverage across this type of region (AZ3, S7A Fig) may actually have low-level expression of a full-length transcript. In contrast specimens (AZ1, AZ2, TX4, S7A Fig) contain multiple zero coverage regions and are therefore unlikely to express a full-length transcript because those regions of the transcript are reproducibly detected in high expressing specimens. In summary, our analysis of MPO1 expression suggests that specimens with relatively low expression and incomplete read coverage across a transcript are unlikely to produce a full-length transcript.
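The reasoning above, in which zero coverage inside a hard-to-map region is uninformative while zero coverage elsewhere argues against a full-length transcript, can be sketched as follows. This is an illustration rather than the analysis code used in this study; the function names and the per-base depth list with 0-based, half-open coordinates are assumptions.

```python
def zero_coverage_regions(depth):
    """Return (start, end) runs of zero depth, 0-based and half-open."""
    regions = []
    start = None
    for position, value in enumerate(depth):
        if value == 0 and start is None:
            start = position                   # a zero-coverage run begins
        elif value != 0 and start is not None:
            regions.append((start, position))  # the run ends before this base
            start = None
    if start is not None:                      # run extends to the transcript end
        regions.append((start, len(depth)))
    return regions


def fraction_covered(depth):
    """Fraction of transcript positions with at least one mapped read."""
    if not depth:
        return 0.0
    return sum(1 for value in depth if value > 0) / len(depth)


def informative_zero_regions(depth, ambiguous):
    """Zero-coverage runs not fully contained in an ambiguous interval.

    ambiguous: (start, end), 0-based and half-open, marking a region where
    reads map poorly in every specimen (for example, the segment encoded
    by exons three and four of MPO1).
    """
    amb_start, amb_end = ambiguous
    return [
        (start, end)
        for start, end in zero_coverage_regions(depth)
        if start < amb_start or end > amb_end
    ]
```

Under this sketch, a specimen whose only zero-coverage run lies inside the ambiguous interval returns an empty list and cannot be distinguished from a low-level expresser, whereas a specimen with additional runs elsewhere is flagged as unlikely to express a full-length transcript.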
The detection of reads in specimens that are inferred to have deleted MPO1 could be due to technical or biological reasons. For example, rare index hopping on multiplexed sequencing runs could result in a very low level of reads from one specimen (MPO1+) being incorrectly assigned to another specimen (MPO1-). Alternatively, incomplete deletions during recombination events could remove most but not all exons or shuffle some exons to non-reference locations. If the products of such recombination events (partial genes) are near (or retain) active cis-regulatory elements, then RNA may be detected even in the absence of a full-length gene.
Detection of full-length MPO transcripts using long reads or assembled short reads.
To further assess the level of support for inferring gene loss from transcriptome data we also examined the isoseq reads from four specimens that aligned to the reference MPO1 gene. We clearly detect MPO1 transcripts in TX1 and TX3 (S9 Fig) but are unable to detect full length transcripts in AZ1 and TX4 (S9 Fig). We did not have isoseq data for every specimen and it is possible that low isoseq read depth could miss a low expressed transcript so we took our high throughput short read data and assembled venom gland transcriptomes for each specimen. We find specimens with a full-length MPO1 transcript and relatively high MPO1 expression (NM1, TX3, TX1) also have partial transcripts that tile across the genomic region and link all coding exons (S8A Fig). In contrast, for specimens with low MPO1 expression (AZ1, AZ2, AZ3, TX4) we do not find full-length assembled transcripts that tile across all MPO1 exons (S8B Fig).
We extended our analysis of genome-aligned isoseq reads and assembled transcripts to other MPs (MDC8c, MAD3b and MDC4) and found that specimens with relatively low expression lacked full-length transcripts and had assembled transcripts that only partially tiled across the genomic loci. For example, at MDC8c we observe in the two specimens with the lowest relative expression (Fig 3, AZ2 and AZ3) only a small set of partial transcripts that did not fully tile across all exons (S10A Fig), whereas all other specimens had a full collection of tiling partial transcripts aligning to all exons and/or putative full-length isoforms (S10A, B Fig). For MAD3b, the three specimens with the lowest relative expression also had a small set of partial transcripts that did not fully tile across all exons (S13A Fig; AZ1, AZ2, AZ3) and for one of these specimens (AZ1) we also did not detect a putative full-length isoform (S13B Fig). However, the absence of a full tiling set of partial transcripts does not perfectly correlate with absence of expression. For example, in the case of MAD3b two specimens lacked a full tiling set of partial transcripts but did have putative full-length transcripts (S11A, B Fig; TX3 and TX4) and, importantly, these two specimens expressed relatively high levels of MAD3b (Fig 3a). For MDC4, we found in all specimens partial assembled transcripts linking all exons and a collection of putative full-length isoforms (S12A, B Fig). In conclusion, studies seeking to detect full-length transcripts can interrogate transcripts assembled from short reads, but sole reliance on short reads can be misleading; a combination of short reads, assembled transcripts and isoseq reads can distinguish between low expression and the absence of full-length transcripts.
Development of polyclonal antibodies with class-specific binding to venom metalloproteinases
To develop antibodies highly specific for individual MP proteins we first aligned the amino acid sequences of the MDC4 and MPO1 metalloproteinase domains and identified prospective peptide immunogens on the basis of amino acid hydrophobicity, inspection of class III and I metalloproteinase crystal structures for exposed regions [37,38], and by searching for unique sequences in protein alignments. Candidate peptides (MDC4: 253-SNEDKITVKPEAGYT-267, MPO1: 358-RPGLTPGRSYEFSDDS-373) were selected, synthesized and coupled to a carrier before immunization of two rabbits (Pacific Immunology). To characterize the sensitivity and specificity of the antibodies, total C. atrox venom (0.5 micrograms per lane) was fractionated by SDS-PAGE and transferred to a PVDF membrane. After transfer the membrane was cut into strips corresponding to gel lanes and each strip was probed with pre- or post-immune sera at a range of dilutions. Sera from production bleed three was used for affinity purification of polyclonal antibodies using the immobilized peptide antigen.
To determine if the polyclonal-MDC4 antibody signal in western blots is specific for MDC4, we performed a competition experiment. We synthesized the homologous antigen peptide sequences from the 14 class III MP paralogs, and incubated the peptides (10-, 30-, or 100-fold molar excess) with MDC4 antibody (0.125 µg/ml) overnight with shaking at 4°C. The peptide-antibody or antibody-only solutions were then used to probe western blot membrane strips, each corresponding to a single gel lane loaded with one µg of total C. atrox venom. The MDC paralog peptides did not reduce the band intensity, suggesting that our polyclonal antibody is specific for MDC4.
Targeted sequencing of the Metalloproteinase genomic region
A hybridization-capture approach was used to enrich for genomic DNA encoding the metalloproteinase gene complex from individual specimens. Using the C. atrox MP reference sequence (~1.2 mega base pairs (Mbp) with 30 annotated MP genes) as a template, deoxyribonucleic acid (DNA) baits (100 nucleotides long) were designed and synthesized (Arbor Biosciences) at an average density of one bait every 125 nucleotides. However, an initial pilot experiment to evaluate hybridization efficiency and sequencing coverage across the MP complex (using genomic DNA from the same specimen used to generate the reference genome) identified genomic regions with low sequencing coverage. It was unclear whether those regions were difficult to sequence or were missing from the sequencing library because the absence of a bait resulted in poor enrichment. Therefore, additional baits were designed across the low-coverage regions, and both the original and second batch of baits were used to enrich genomic DNA from all specimens. The Pacific Biosciences Multiplex Genomic DNA Target Capture Using SeqCap-EZ Libraries protocol was followed for generating the final sequencing libraries. Kapa HiFi polymerase was used for all amplification steps. The average length of input genomic DNA fragments for the sequencing libraries was seven kilobase pairs (kb) (range of two to ten kb). The mean DNA fragment length of the initial barcoded library was five kb and the mean fragment length of the final enriched DNA library was four kb. The libraries were pooled and sequenced with the PacBio Sequel II.
We assessed the efficacy of target enrichment of isolated genomic sequence across the complete MP region by processing and analyzing long reads from the specimen that was used previously to generate a bacterial artificial chromosome (BAC) library and a reference genome assembly [15,16]. Visual inspection (Integrative Genomics Viewer (IGV), version 2.8.0) of reads aligned [39] (NGMLR version 0.2.7) to the reference assembly showed high coverage across the MP region but with high levels of variation. In this study, we focused on the MPO1 and MDC4 genes and found complete coverage across this region for the reference specimen (TX1; S18B Fig), thus allowing us to conclude that the absence of reads in this region for a particular specimen may reflect the absence of this genomic sequence in that individual.
Assembly of targeted sequencing reads
Pacific Biosciences circular consensus sequence (CCS) reads were trimmed and corrected using CANU (version 1.9) [40–42]. The corrected reads were assembled using the Flye genome assembly program (version 2.8-b1674) [43]. The assembled contigs were aligned to the reference genome using LAST (lastal version 847) [35].
Results
Bimodal expression of several venom toxins in C. atrox
C. atrox venom is composed of metalloproteinases (MPs), PLA2s, serine proteinases, lectins, vasoactive peptides, L-amino acid oxidase and several other proteins consistently found in Crotalus species but at relatively low expression levels [12,44]. In order to capture the potential intraspecific variation in these components, we initially surveyed venoms from nine adult C. atrox specimens from across its range (Fig 1). To detect variation in protein composition at the level of resolution of single toxins, and to be able to differentiate among closely related toxin sequences, we performed mass spectrometry on whole venoms and used exclusive-unique peptide (EUP) counts of known venom proteins to estimate their relative abundance.
The geographic distribution of C. atrox species (red shading) and specimen locations (black shading of counties) are shown. The country, state and county outlines are reproduced with the sf [45] and ggplot2 [46] R packages using data available in the public domain [47]. The species distribution plot is made with sf [45] and ggplot2 [46] using data from [48] under a CC BY license with permission from IUCN Red List, original copyright (2007).
Among the MPs, MDC4, MDC2, MDC7 and MAD3a exhibited limited variation in abundance (small spreads in EUP counts) between specimens. However, three MPs (MPO1, MAD3b and MDC8c) are relatively abundantly expressed in some individuals but are markedly reduced or lack expression in others (Fig 2a). In contrast, most members of other venom protein families (e.g., phospholipases, serine proteinases) exhibit limited variation in abundance between individuals, except galactose-lectin (Fig 2a, b and S1 Fig).
(a) Average peptide counts for individual venom proteins from nine specimens (boxed legend in top right). Proteins are ordered along the x-axis from high to low abundance. Box plots show the upper and lower counts that span the interquartile range (IQR: 25th to 75th percentiles) and the median count (black line inside box). The extreme (1.5*IQR) lines extend from the top or bottom of the box and counts beyond the extremes are noted with small black dots. Most venom protein counts are similar between all specimens (narrow boxes), whereas MPO1, MAD3b, MDC8c and LECG counts vary widely between specimens (wide boxes). Venom proteins expressed at lower levels than SVSP2 (Snake venom serine proteinase – 2) are shown in Supplemental information S1A Fig. (b) Average peptide counts for all known C. atrox metalloproteinases (MP). MPs are ordered along the x-axis according to the order of genes in the reference genome. The reference genome MP gene order is shown schematically below the box plots with genes represented as arrows (class III MPs in green, class II MPs in orange, class I MP in red and fusion MPs in green and orange). Expressed MPs have a line linking the protein identifier and gene arrow above the gene complex and undetected MPs have a line linking the respective protein identifier below the gene complex. The only unresolved gap in the reference sequence is also shown between MDC8a and MDC6e. MPO1, MAD3b and MDC8c have the broadest variation in expression between specimens. MDC2, MDC4 and MDC7 are the most abundant MPs and have limited variation in expression between specimens.
Protein abbreviations: MDC, Metalloproteinase, Disintegrin, Cysteine-rich domains, class III zinc metalloproteinase; MAD, Metalloproteinase and Disintegrin domains, class II zinc metalloproteinase; MPO, Metalloproteinase domain only, class I zinc metalloproteinase; LAAO, L-amino acid oxidase; LECG, C-type lectin galactose, PLA2-gB1, Phospholipase A2, group IIG, B1; PLA2-gK, Phospholipase A2, group IIG, K; PLA2-gA1, Phospholipase A2, group IIG, A1.
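The box plot elements described in the legend can be computed from a list of peptide counts as in the Python sketch below. It assumes the common convention in which whiskers extend to the most extreme observation within 1.5*IQR of the box and quartiles are obtained by linear interpolation; the function name and the example counts are hypothetical and do not come from the data in this study.

```python
from statistics import quantiles


def box_plot_summary(counts):
    """Quartiles, whisker ends and outliers for one protein's counts."""
    # 'inclusive' treats the data as the full population and interpolates
    # linearly between observations.
    q1, median, q3 = quantiles(counts, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    inside = [c for c in counts if low_limit <= c <= high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the most extreme observations inside the limits.
        "whiskers": (min(inside), max(inside)),
        # Observations beyond the limits are drawn as individual dots.
        "outliers": sorted(c for c in counts if c < low_limit or c > high_limit),
    }
```

A wide box (large IQR) for a protein indicates that the nine specimens differ substantially in its peptide counts, which is the pattern highlighted for MPO1, MAD3b, MDC8c and LECG.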
To further ascertain the state of MP expression in certain individuals (low or absent), we determined that it was necessary to consider both the coverage and abundance of EUPs obtained by mass spectrometry. We aligned EUPs to hypothetical translations of MP reference amino acid sequences and calculated metalloproteinase domain coverage (S2-S5 Figs). This analysis reveals that peptides mapping within the metalloproteinase domain are detected in all specimens; however, the coverage for different segments of the protein (and abundance) varies significantly between specimens. For example, for MPO1, the coverage of peptides spanning the metalloproteinase domain is less than half (39–49%) of the domain length in five specimens, whereas in four specimens most (> 90%) of the domain is covered (S2A and B Fig). To determine whether low MP levels in a specimen reflect full coverage with low counts or partial coverage with low counts, we plotted the abundance (counts) of a subset of non-overlapping peptides that span the MP domain. This analysis reveals that the abundances of the few peptides found in the low-coverage specimens are extremely low relative to the abundances of the same peptides identified in the high-coverage specimens (S2C Fig). We also compared the mean counts for all MPO1 peptides detected in each specimen with MPO1 metalloproteinase domain coverage and found that the specimens with low coverage also had very low mean counts (S3B, D Fig). The positive correlation between coverage and mean peptide counts is also found for the MDC8c (S3A–D Fig) and MAD3b (S4A–D Fig) proteins. Interestingly, we also note that the low abundance and coverage of MPO1, MAD3b and MDC8c are shared by three specimens (a fourth specimen has similar coverage and abundance only for MPO1 and MDC8c).
These observations are in striking contrast to, for example, those for the MDC4 protein, where the domain coverage and individual and mean peptide counts are very similar between specimens (S5A–D Fig). The detection of just a few peptides for MPO1, MDC8c and MAD3b that only partially span the key protein domain raised the possibility that functional proteins are absent from these specimens’ venoms. This variation in toxin expression led us to investigate the potential genetic mechanism(s) that could be responsible.
Lack of toxin protein expression is correlated with lack of toxin gene mRNA expression
Variation in toxin protein expression could be due to gene regulation (mRNA expression) or gene content. We sought to address whether regulation of venom gene transcription varies using two complementary sequencing platforms to qualitatively and quantitatively assess venom gland gene expression profiles of seven specimens. We used single molecule sequencing of full-length cDNAs from individual C. atrox venom gland RNA samples to generate representative venom gland transcriptomes that permit the qualitative analysis of both canonical and novel venom gland expressed transcript isoforms. Additionally, we used short read sequencing and mapping to MP reference transcripts to test if the observed variation for MPO1, MDC8c and MAD3b (Fig 2) is consistent with mRNA transcript abundance (see Methods section “Detection of full-length MPO transcripts using long reads or assembled short reads” for additional details related to sequencing platforms).
We found that variation in MP mRNA expression broadly agrees with variability of venom protein abundance (Fig 3a; genes as boxes and colored points represent individual specimens). For example, MPO1, MDC8c and MAD3b transcripts are abundantly expressed in individuals with relatively high venom peptide counts and coverage but reduced in the individuals that have low peptide counts and coverage (Fig 3a). Specifically, specimens TX1, TX3 and NM1 express MPO1 near the level of the abundantly expressed MDC4 gene, but four other specimens (AZ1, AZ2, AZ3, TX4) exhibit much reduced expression (Fig 3a). We observed similar bimodal patterns of MDC8c and MAD3b mRNA expression (stretched boxplots in Fig 3a). The bimodal pattern of expression is not characteristic of all MPs, since abundantly expressed class III MPs such as MDC2 and MDC7 are expressed at similar levels between specimens (narrow boxplots in Fig 3b). Furthermore, even close gene paralogs can have drastically different expression patterns (compare boxplots of MAD3a and MAD3b, Fig 3b), suggesting that MP gene expression may also be diverging.
Gene specific box plots of log2 transformed transcript counts (transcripts per million, TPM) for the main metalloproteinases (a) and all known metalloproteinases (b) presented in Fig 1a-b for seven individual specimens (boxed legend; colored points). MPO1, MDC8c and MAD3b have bimodal expression patterns (wide boxes) consistent with the protein expression. The expression pattern of most venom RNAs is consistent with the corresponding protein. The reference genome MP gene order is shown schematically below the box plots with genes represented as arrows as described in Fig 2.
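For readers unfamiliar with the units in this figure, the following is a minimal sketch of how read counts are conventionally converted to transcripts per million (TPM) and then log2 transformed. The gene names, counts, lengths and the pseudocount of 1 are assumptions for illustration; the source does not state which pseudocount, if any, was used.

```python
import math


def tpm(counts, lengths):
    """Convert raw read counts to transcripts per million (TPM).

    counts: dict mapping gene -> mapped read count.
    lengths: dict mapping gene -> transcript length (same keys as counts).
    """
    # Length-normalize first, then scale so that all values sum to one million.
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def log2_tpm(tpm_values, pseudocount=1.0):
    """log2 transform; the pseudocount (an assumption here) avoids log2(0)."""
    return {gene: math.log2(v + pseudocount) for gene, v in tpm_values.items()}


# Hypothetical counts and lengths (not from the study).
example_counts = {"geneA": 1000, "geneB": 2000}
example_lengths = {"geneA": 1000, "geneB": 2000}

tpm_values = tpm(example_counts, example_lengths)
log_values = log2_tpm(tpm_values)
```

In this toy case geneB has twice the reads of geneA but is also twice as long, so both receive the same TPM. This is why length-normalized values, rather than raw counts, are compared across genes.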
To further investigate the nature of low MP transcript expression, we compared read counts (non-length normalized) of MPO1, MDC8c and MAD3b to those of the abundantly expressed MDC4. For MPO1, we found three specimens (TX1, TX3, NM1) with abundant mapped reads but four specimens with far fewer mapped reads (AZ1, AZ2, AZ3, TX4) (S6A, B Fig). The specimens with fewer mapped reads also exhibited incomplete read coverage across the length of the reference transcript (S6C Fig; AZ2 with ~50%, AZ1 with ~75%, AZ3 and TX4 with ~85% coverage) with at least one zero coverage region (S7A Fig, dashed boxes span zero coverage regions). In contrast, specimens with high relative MPO1 RNA expression (TX1, TX3, NM1) have read coverage several orders of magnitude greater across the transcript (S7B Fig, compare the range of y-axis between A and B). This analysis suggests that specimens with relatively low MPO1 expression likely do not make any full-length transcript (see Methods section “Distinction between detection of low and zero expression of full-length transcripts” for additional details regarding transcript detection for low expressed genes and read coverage along transcripts and S8, S9 Figs).
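The two coverage summaries used in this paragraph, percent of the transcript covered and the location of zero coverage regions, can be sketched as follows. The sketch assumes a per-base read depth vector for one reference transcript; the function names and the toy depth values are hypothetical.

```python
def zero_coverage_regions(depth):
    """Return (start, end) index pairs (0-based, inclusive) for runs of zero depth."""
    regions = []
    run_start = None
    for pos, d in enumerate(depth):
        if d == 0 and run_start is None:
            run_start = pos  # a zero-depth run begins
        elif d != 0 and run_start is not None:
            regions.append((run_start, pos - 1))  # the run ended at the previous base
            run_start = None
    if run_start is not None:
        regions.append((run_start, len(depth) - 1))  # run extends to the transcript end
    return regions


def percent_covered(depth):
    """Percent of positions with at least one mapped read."""
    return 100.0 * sum(1 for d in depth if d > 0) / len(depth)


# Toy per-base depth along an 8-base transcript (illustration only).
toy_depth = [5, 3, 0, 0, 2, 1, 0, 4]
gaps = zero_coverage_regions(toy_depth)
pct = percent_covered(toy_depth)
```

A transcript can show reads over most of its length and still contain a gap, which is the situation that argues against the presence of any full-length transcript.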
Similarly, for MDC8c, the two specimens with the lowest relative expression (Fig 3, AZ2 and AZ3) had a small set of partial transcripts that did not fully tile across all exons (S10A Fig), whereas all other specimens yielded a full array of tiling partial transcripts aligning to all exons and/or putative full-length isoforms (S10A, B Fig; see Methods section “Detection of full-length MPO transcripts using long reads or assembled short reads” for details regarding comparisons of long read isoforms and short read assembled transcripts). For MAD3b, the three specimens with the lowest relative expression also had a small set of partial transcripts that did not fully tile across all exons (S11A Fig; AZ3, AZ2, AZ1) and for one of these specimens (AZ1) we also did not detect a putative full-length isoform (S11B Fig). These results contrast with MP genes that have limited expression variability between specimens, for example MDC4, which shows a complete array of tiling partial transcripts, full-length isoforms and candidate variant isoforms (S12, S13 Figs). We suggest that, as for MPO1, the specimens with relatively low MDC8c or MAD3b expression may not make any full-length RNA transcript.
We were particularly surprised to find MPO1 abundantly expressed in some individuals and apparently absent in others. MPO1 is the only P-I class MP in our SVMP complex assembly, whereas MAD3b and MDC8c both have at least one very closely related paralog and numerous more distantly related paralogs in C. atrox [15]. One alternative possibility is that specimens lacking MPO1 may have different MPO alleles or paralogs that we failed to detect. This led us to determine if our approach of mapping isoseq reads to a reference genome could detect non-reference alleles or MPO paralogs. We do in fact detect an MPO variant that is similar to Atrolysin-C, the only divergent C. atrox MPO protein/gene sequence present in databases (UniProt accession Q90392.1; “MPO-C” in this study), but not found in our reference SVMP gene complex. Visualization of aligned isoseq reads at MPO1 revealed that specimen TX3 expresses two distinct sets of MPO transcripts (S9 Fig; TX3). Alignments of hypothetical translations for several isoforms (PB.5517.400, PB.5517.432, PB.5517.437) revealed that they are identical to the reference MPO1 sequence, while two isoforms (PB.5517.428 and PB.5517.421) encode amino acid substitutions identical to those found in Atrolysin-C in part of the sequence (amino acids 219–285) but differ from both Atrolysin-C and the reference MPO1 sequence at residues 291–385 (S9 Fig). These findings suggest that our approach can detect divergent MPO sequences and that the lack of full-length MPO transcripts encoding intact MPO proteins in some individuals reflects a general lack of MPO expression.
Design and characterization of specific antibodies to MPO1 and MDC4
To explore MPO1 expression and distribution in venom in more specimens, and more easily than via RNA analysis of venom glands, we endeavored to make polyclonal antibodies specific for MPO1 (and MDC4, a low-variation MP). We inspected alignments of MP protein sequences to identify candidate peptide immunogens for the generation of specific polyclonal antibodies. One MDC4 peptide and two MPO1 peptides were synthesized and used to immunize rabbits. Reactivity to total venom in a Western blot assay was measured using crude sera, and the sera with the highest activity were used as sources for the affinity purification of peptide-specific polyclonal antibodies. Titration of the affinity-purified polyclonal antibodies shows strong signal with minimal background (Fig 4a, b). The MPO1 antibody is highly specific to class I MPs because we observe a single band at approximately 25 kilodaltons (kDa), where the only known class I MP (MPO1) is predicted to migrate (Fig 4b). Because several class III MPs are expressed in C. atrox venom (particularly MDC2, MDC7 and MDC8c) and those proteins are expected to migrate near MDC4, we needed to exclude the possibility that the signal from the anti-MDC4 antibody could be attributed to binding class III MPs other than MDC4. We performed a competition experiment by pre-incubating peptides from known class III MP paralogs with the anti-MDC4 antibody and examining their effect on the Western blot signal (Fig 4c). We find that while the MDC4 peptide competes for antibody binding and reduces the signal in a Western blot, the (paralogous) peptides from other class III MPs fail to reduce the signal (Fig 4d).
The activities of the respective (a: MDC4, b: MPO1) affinity-purified antibodies (AFP ab) were titrated using total venom from the reference specimen (TX1) across a broad concentration range (1 to 0.125 micrograms per milliliter (µg/ml) using 2-fold steps). The sensitivity of both polyclonal antibodies is shown by the detection of respective proteins at the expected sizes (class III MP ~48 kDa, class I MP ~24 kDa) at increasing dilutions of antibody. The MP class specificity of each antibody is evident by the lack of bands at non-target sizes.
To test the within-class specificity of the polyclonal-MDC4 antibody, peptides corresponding to the orthologous sequences from MDC paralogs (c) were incubated with polyclonal MDC4 antibody (peptide molar excess of 10, 30 or 100-fold). The peptide-antibody mixture was then used to probe membrane strips containing transferred total venom. Polyclonal-MDC4 antibody mixed with MDC4 peptide has a titratable signal reduction, whereas antibody mixed with peptide from MDC4 paralogs shows limited reduction of signal (d, compare signal with increasing competitor (three lanes below triangles) to lane without competitor (-)), suggesting the activity of polyclonal-MDC4 antibody has greater specificity for MDC4 relative to other MDC proteins that may be present in venom.
MPO1 expression is low or undetectable in venoms from most C. atrox individuals in its western range
Our finding of variation in MPO1 abundance contrasts with a previous report of limited variation in C. atrox venom composition [49]. The difference in our results may be explained by the resolution of the methods employed: protein family-level resolution for SDS-PAGE versus single-protein resolution when combining mass spectrometry with gene annotations. In addition, it is possible that our survey captured venom variation linked to some variable, for example ecology or geography, that was not represented or apparent in prior surveys (e.g., all of the Rex and Mackessy (2019) specimens were from Arizona).
Indeed, we note that three of the four specimens that exhibited very low or no MPO1 protein or RNA expression (AZ1, AZ2, AZ3) originate from the western part of the species’ range, while those exhibiting abundant MPO1 expression (TX1, TX3, NM1) originate from the eastern part of the range. Interestingly, Schield et al. (2015) previously identified a genomic signature of population structure in C. atrox. Specimens from each side (east or west) of the Continental Divide (CD) were shown to have a distinct set of genetic markers. The observations of both venom variation and population structure in C. atrox raised the possibility that distinct venom compositions might occur in different populations.
To further investigate whether MPO1 expression correlates with geography, we examined 14 additional venom samples from across C. atrox’s range (along with the original seven samples) for the presence of MPO1 (and MDC4) using specific antibodies in a Western blot assay.
While C. atrox venom samples from across the range show similar levels of MDC4 expression (Fig 5a-c), MPO1 is undetected (or relatively weak: AZ1, AZ5, AZ8) in most (7/8, 88%) specimens found west of the CD (Fig 5a). In contrast, for most eastern specimens, MPO1 is clearly detected (15/19, 79%) with four exceptions (Fig 5b and c; undetected: NM7, TX4, TX6; relatively weak: TX5). We find statistical support for the association between geography and weak or undetectable MPO1 expression (χ²(1, N = 27) = 7.73, p = 0.0054).
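The reported statistic can be reproduced from the counts given above (7 of 8 western and 4 of 19 eastern specimens with weak or undetectable MPO1). The sketch below assumes a 2×2 chi-squared test with Yates' continuity correction; the source does not state which variant was used, but the corrected statistic matches the reported value, whereas the uncorrected Pearson statistic for the same table is about 10.3. The function name is hypothetical and only the Python standard library is used.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-squared test with Yates' continuity correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    # Yates' correction subtracts n/2 from |ad - bc|, floored at zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    statistic = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: west, east of the Continental Divide.
# Columns: weak/undetected MPO1, clearly detected MPO1.
chi2, p_value = chi2_yates_2x2(7, 1, 4, 15)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.4f}")
```

With only 27 specimens and one cell containing a single observation, an exact test would be a reasonable cross-check, although the source reports only the chi-squared result.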
For each specimen (identifiers above the gel images) one microgram of total venom was separated by SDS-PAGE and transferred to a membrane. Additional controls on each gel are purified MDC4 (lane 1), purified MPO1 (lane 2) and total venom (lane 3) from the reference specimen (TX1; found east of the Continental Divide). The top portion of each membrane was probed with anti-MDC4 and the bottom portion probed with anti-MPO1. MPO1 is undetected or extremely low (AZ1, AZ5, AZ8) in most (7/8, 88%) specimens west of the Continental Divide (a, Arizona specimens). MPO1 is detected in most (15/19, 79%) eastern specimens (b, New Mexico; c, Texas) with several exceptions (NM7, TX4, TX5, TX6). In contrast, MDC4 is detected in all specimens with limited variability in signal. There is statistical support for an association between geographic origin (d; east or west of CD) and low or undetected MPO1 expression (7/8, west and 4/19, east; Chi-squared = 7.73, p-value = 0.005). The county, state and Continental Divide boundaries are made with sf [45] and ggplot2 [46] using data available in the public domain [47,50].
In our screening of many C. atrox venoms with the MPO1 polyclonal antibody, we also observed that in some specimens MPO1 appears as a doublet with a faster-migrating (lower molecular weight) band also detected. To test whether these bands are also the result of the polyclonal MPO1 antibody binding to the MPO1 epitope, we performed a competition experiment by incubating the polyclonal antibody with a molar excess of peptide and then probing membranes using the pre-incubated antibody. We find that both signals for the doublet and the lower molecular weight band are eliminated with increasing amounts of MPO1 peptide (S14 Fig), suggesting that our polyclonal antibody is detecting different forms of MPO1 in these samples.
The discovery of this correlation between MPO1 expression and geographical location prompted us to revisit our transcriptome data to perform differential gene expression (DGE) analysis between the east/west specimens as well as additional, independently collected specimens for which published venom gland transcriptomes are available [51]. The DGE analysis found that venom proteins with bimodal expression of protein levels (MPO1, MAD3b, MDC8c) also have a bimodal expression of transcript levels; however, differential gene expression is associated with geographical origin only for MPO1 (S15, S16 Figs).
We again considered the possibility that different paralogs (or alleles) of MPO1 are present but undetected by our methods interrogating venom RNA. To test this possibility, we included the divergent MPO-C sequence in our DGE analysis and found evidence for two specimens (NM1 and TX3) expressing MPO-C (S15 Fig, right side with pink shading). Importantly, inclusion of MPO-C in our DGE analysis did not change the MPO expression state or the geographic association of specimens with low/undetectable expression of the MPO1 reference gene. This association between MPO1 expression and geography raised the question of the potential genetic mechanism(s) underlying the absence of MPO1 expression in individual C. atrox animals.
Whole gene deletion and gene fusion within MPO1 abolish MPO1 venom expression
To identify the genetic basis for loss of MPO1 mRNA and protein expression, we used targeted genomic sequencing, coverage analysis and annotation of assembled contigs from the MPO1 genomic region of multiple individuals with both expression phenotypes (see Methods and S17 Fig). We were able to identify mutations of various types that account for the low or undetectable MPO1 expression phenotypes.
To validate our experimental approach, we first assessed coverage of the MDC4 genomic region. We observe genomic sequencing coverage across the entire MDC4 locus in all specimens which is consistent with an intact MDC4 locus driving full-length mRNA expression (Fig 6a, filled boxes and S18A Fig, right half of all coverage plots). To assess the ability of our approach to recover the MPO1 genomic region, we applied the method to the genomic DNA of the reference specimen (TX1) originally used to assemble the MP complex and found sequence coverage across the entire MPO1 gene (Fig 6a; TX1 and S18A, B Fig). Similarly, for one specimen with high MPO1 expression (TX3), we found sequence coverage across the MPO1 gene and subsequent annotation of assembled contigs yielded an intact MPO1 gene that when hypothetically translated yields a full-length MPO1 protein (Fig 6a and S18A Fig).
Schematic summary of aligned read coverage and annotation at the MPO1 and MDC4 loci. Four specimens (AZ1, AZ2, AZ4, TX4) have zero or low coverage at the MPO1 gene but the adjacent MDC4 gene has similar coverage relative to the other specimens. Specimen AZ3 appears to have a broken MPO1 locus resulting from a fusion event with MAD5a. A full-length intact MPO1 gene is detected in specimen TX3. Annotation of assembled contigs of one specimen (NM1) also revealed an allele containing a nonsense mutation in exon ten of the MPO1 metalloproteinase domain that is predicted to result in a truncated protein.
In contrast, for four specimens (AZ1, AZ2, AZ4, TX4) with low or undetectable MPO1 protein (Fig 5a, c) genomic coverage of the MPO1 locus is not detected (Fig 6a and S18A Fig) and no assembled contigs are identified (S20A Fig). We interpret these animals as bearing homozygous (AZ2, AZ4, TX4) or heterozygous (AZ1) deletion mutations for the MPO1 locus. We infer that the absence of MPO1 is an evolutionary loss, rather than a gain in eastern specimens, because previous work showed that the origin of MPO1 preceded the origin of the Crotalus clade [15].
Annotation of specimen AZ3 revealed a chimeric MPO1 gene with four exons (7–10) that perfectly match the amino acid sequence of MAD5a (S20B Fig). Exon 11 matches both MPO1 and MAD5a exon 11 with 63% similarity, but the following two exons match MPO1 exons 12 and 14 with 98 and 100% amino acid similarity (S20B Fig). This suggests a recombination event between MPO1 and MAD5a yielded a chimeric gene and the rearrangement of genomic sequence disrupted the gene encoding full-length MPO1.
Finally, we note that in specimen NM1, in which we detect expression of MPO1 protein (Fig 5b), we obtained full sequence coverage across the locus for what appear to be two divergent sequences corresponding to MPO1 and MPO-C (S19 Fig). Genome assembly consistently resolved a single sequence in which we found indels in exon ten that change the translation frame to a stop codon and, if expressed, would yield a truncated protein with only a partial metalloproteinase domain (S19A, B and S20C–F Figs). Furthermore, we found that exons 11 and 12 contained the exact amino acid substitutions found in MPO-C (S20B Fig). This result reveals that our targeted genome sequencing methods can also detect MPO sequences that vary substantially from the reference. We infer that this specimen with the MPO-C sequence is likely heterozygous for the mutation because our anti-MPO1 antibody detects a band co-migrating with full-length MPO1 protein (Fig 5b) and the variation in aligned genomic reads (S19 Fig).
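The effect of a frame-shifting indel described above can be illustrated with a toy example. The sketch below scans a coding sequence codon by codon and reports the first in-frame stop codon. The sequences are invented for illustration and are not MPO1 or MPO-C sequence; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy reference: ATG GCA TTG ACC GGA TAA (stop only at the final codon).
reference = "ATGGCATTGACCGGATAA"

# Deleting one base from the second codon shifts the frame: ATG GCT TGA ...
mutant = reference[:5] + reference[6:]

ref_stop = first_stop_codon(reference)
mut_stop = first_stop_codon(mutant)
```

In the toy mutant the stop codon appears at the third codon instead of the sixth, so translation would end early. This is the sense in which the exon ten indels are predicted to truncate the protein within the metalloproteinase domain.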
In summary, we find in all specimens with low/undetectable MPO1 expression evidence for MPO1 inactivation via whole gene deletion or gene fusion. Gene inactivation by two distinct processes suggests the possibility that at least two independent loss of function events occurred in the gene encoding this toxin that is abundantly expressed when it is present.
Discussion
We have shown that most individual venom toxins in the widely distributed rattlesnake C. atrox vary little in abundance between animals but that several toxins, including three metalloproteinases, have a bimodal pattern of variation: they are expressed in some specimens but absent or barely detected in others. The variation in MPO1 expression is particularly surprising for three reasons. Unlike MAD3b and MDC8c, which have closely related paralogs expressed in C. atrox venom, MPO1 is the only toxin of its type (a P-I class metalloproteinase) in C. atrox. It is also one of the most abundant individual venom components (when present). Finally, the absence of MPO1 expression occurs largely in animals found in the western region of the species’ range. Furthermore, we found that loss of MPO1 expression was not a consequence of transcriptional down-regulation but that independent gene inactivating mutations have occurred (whole gene deletion and a gene fusion) that eliminate the toxin from the venom arsenal. The repeated inactivation of MPO1 within a C. atrox subpopulation raises the questions of what ecological factors and evolutionary processes would cause a major toxin to be abandoned in part of the species’ range.
Variation in prey may underlie variation in C. atrox MPO1 expression
The principal function of snake venoms is to subdue and kill prey. Venoms often contain dozens of different proteins that act singly or in concert to disrupt one or more critical physiological processes such as hemostasis or neuromuscular activity. One main action of C. atrox venom, like that of many other rattlesnakes and members of the family Crotalidae, is to disrupt hemostasis by destroying vascular integrity and interfering with blood coagulation. The major families of toxins responsible for these activities are the metalloproteinases and serine proteases, which constitute up to 50% and 20% of the venom by weight, respectively, in C. atrox venom [44]. MPO1 is one of the most highly expressed MPs among the thirty venom MPs encoded in the C. atrox genome [15]. It is also the only P-I class metalloproteinase present in the venom. Because this class of enzymes has been shown to have distinct activities on extracellular matrix and fibrinogen substrates, and a more diffuse distribution in vivo relative to P-II and P-III class enzymes [29], it seems likely that MPO1 contributes a distinct spectrum of activities to C. atrox venom. The absence of MPO1 expression in most animals in the western part of the species’ range and the abundance of the toxin in most animals in the eastern part of the range, coupled with evidence for ongoing widespread gene flow throughout the species’ entire distribution [52], may reflect that there are different requirements for venom function across C. atrox’s range. One prime candidate to explain this geographic variation in venom composition would be differences in the availability of susceptible prey.
A growing body of evidence suggests that snake venom composition co-evolves with the diversity, availability, and susceptibility of prey [9,53,54]. Among pit vipers, for example, a large-scale survey of venom composition and dietary breadth among many North American species (including Crotalus) found a positive correlation between greater venom complexity and the phylogenetic breadth of stomach contents [51]. A recent study on the widely distributed C. viridis rattlesnake identified differences in venom composition along the north-south axis of the species’ geographical range that correlate with varying susceptibilities of different prey types [24]. In this case, an MP-rich and low myotoxin-expressing venom in snakes from the more southern region was associated with a more diverse diet consisting of mammals, reptiles, and birds, while snakes in more northern regions exhibited a myotoxin-rich, low-MP venom and preyed principally on mammals. This inversion in myotoxin abundance correlates with a potent lethal effect of the toxin on small mammals but a lack of effect on lizards [24]. The abundance of lizard prey declines dramatically in more northern latitudes, so venom composition in this case appears to be tailored to the availability of susceptible prey, not prey preference.
C. atrox is one of the largest and most widely distributed rattlesnakes, with one of the broadest habitat ranges, occurring across much of the southwestern United States and northern Mexico (Fig 1). Numerous dietary studies have been conducted that reveal the species to be a generalist with a wide range of mammal and bird species observed in stomach contents, as well as lizards [55,56]. A comparative survey of specimens from across its entire geographic range did not reveal any significant differences in the classes of prey consumed (i.e., mammals and lizards) [57], so there is no general prey type that might explain an east-west difference in venom composition.
However, C. atrox in different regions may prey on different species of mammals. For example, while kangaroo rats (Dipodomys sp.) are preyed upon throughout the species range [57], Texas animals have been reported to prey principally on woodrats (Neotoma sp.) and pocket mice (Perognathus sp.) [55]. And lagomorph consumption appears to be greater in western (Arizona) versus eastern (Oklahoma and Texas) specimens [55,58,59]. Regional differences in prey availability may influence venom composition, particularly if there are significant differences in prey susceptibility. Several mammalian species have been shown to be markedly resistant to C. atrox venom, such as the Mexican ground squirrel (LD50 is 13 times that of laboratory mice; [60]), the hispid cotton rat (LD50 is about 40 times that of laboratory mice; [61]), and the gray woodrat (LD50 is over 250 times that of laboratory mice; [61]). Moreover, resistance in some of these species has been shown to correlate with serum antihemorrhagic activity which is directed against venom metalloproteinases [61]. Thus, one possible factor underlying the loss of MPO1 expression may be the emergence of resistance to MPO1 and/or other MP activities in important prey.
In the absence of direct evidence for prey susceptibility shaping MPO1 distribution, however, we must remain open to other ecological or environmental factors that could vary across the species range and affect venom composition. For instance, polymorphism in C. scutulatus venom content was not linked to dietary differences, but to temperature and climatic variation which may influence venom composition more indirectly through, for example, foraging behavior or exposure to predators [22].
Evolutionary loss of toxins: neutral and selective scenarios
In the ongoing co-evolutionary arms races between venomous predators and prey, the potential evolutionary advantages gained by the addition of new venom components or new toxin activities are obvious. Numerous investigators and studies have asserted that the composition of snake and other animal venoms is under strong selection, including balancing selection [62–64], but the evolutionary mechanisms governing the flip side of venom evolution, the loss of venom components, are not so clear. Assuming that the geographic distribution of the MPO1 toxin reflects different requirements for MPO1 and venom function across the range, two distinct evolutionary processes could explain the observed recurrent inactivation of the MPO1 gene and absence of the toxin. First, if the source of selection that maintains MPO1 function in the eastern part of the range is absent in the western range, the relaxation of selection to maintain MPO1 function in the western population would allow for the recurrent inactivation of the MPO1 gene and null alleles to persist under neutral evolution and genetic drift.
Alternatively, it is also possible that selection could favor the elimination of the toxin from venom. It is becoming better appreciated that there can be fitness costs to the production of traits, such that the loss of a trait may be favored when the source of selection for the maintenance of a trait is removed (reviewed in [65]). The production of venom incurs a metabolic cost, with one study reporting an 11% increase in C. atrox resting metabolic rates during periods of venom replenishment, which is 10-fold higher than the predicted rate for production of an identical mass of mixed body growth [66]. In the sea anemone as well, venom production entails a trade-off with growth and reproductive rates [11] and is repressed under stress [67]. Research on other secreted body fluids (mammalian milk, seminal fluids) also suggests that the production of these biologically important mixtures requires significant energy investment [68,69]. In the example examined here, it is plausible that individuals homozygous for MPO1 null alleles that completely eliminate an ineffective and metabolically expensive toxin may possess a fitness advantage over those individuals expressing an ineffective toxin.
It is increasingly apparent that toxin gene loss occurs frequently in venomous snakes [16,17,70,71] and other venomous animals [64]. Evolution via gene loss has been proposed as a potentially important albeit less appreciated mechanism in adaptation to changing environments or novel ecological niches [72,73]. While gene loss can often be ascribed as a consequence, not a cause, of evolutionary adaptation or regressive evolution, it is also the case that testing for selection on loss-of-function/deletion alleles poses distinct experimental and analytical challenges [74]. Nevertheless, selection on loss of function alleles has been reported for a small subset of loci in a recent large-scale population genomic study of cavefish [75] as well as being inferred to contribute to adaptation in herbivores [76], cetaceans [77], fruit bats [78], nematodes [79], vampire bats [80], and mole rats [81]. While beyond the scope of this study, as further examples of toxin loss are documented, it will be valuable and necessary to obtain large-scale population genomic data to identify the potential evolutionary forces operating on such loss-of-function alleles.
Supporting information
S1 Fig. Average exclusive unique peptide counts for venom proteins.
Average exclusive unique peptide counts for non-metalloproteinase venom proteins. Additional venom proteins are detected in the venom of most specimens at relatively low levels and vary little between specimens (Pla2g2g-C1, Phospholipase A2, group IIG, C1; SVSPC1, Snake venom serine proteinase-C1; SVSPH, Snake venom serine proteinase-H, TXVE, Snake venom vascular endothelial growth factor toxin; QPCT, Glutaminyl-peptide cyclotransferase; PRDX4, Peroxiredoxin-4; BKIP, Bradykinin inhibitor peptide; BNP, C-type natriuretic peptide).
https://doi.org/10.1371/journal.pone.0319316.s001
(PDF)
S2–S5 Figs. Individual peptide locations, coverage and counts for single proteins.
These figures show the individual peptide locations (A), percentage of protein coverage (B), individual peptide counts (C) and average peptide counts of a protein (D) for MPO1 (S2 Fig), MDC8c (S3 Fig), MAD3b (S4 Fig) and MDC4 (S5 Fig). For S2–S5 Figs (A), the amino acid locations (x-axis) of exclusive unique peptides (barbell shaped line segments) are mapped to a linear representation of the respective venom protein. Each row shows peptides from a single specimen with the specimen identifier on the right side of the plot and aligned with the horizontal bar plot (B) showing observed coverage across the metalloproteinase domain. Observed coverage is the percentage of the metalloproteinase domain covered by unique peptides after removal of conserved (non-unique between paralogs) or not detected sequences. A few sequence segments of the metalloproteinases are highly conserved among MPs, so peptides from those regions cannot be unambiguously assigned to single proteins and have been removed from this analysis (blank spaces shared by all specimens). The counts (x-axis) of individual non-overlapping peptides (y-axis) spanning the protein are shown for each specimen (colored dots) (C). This analysis shows that when a protein has high coverage composed of many peptides, the associated counts of those individual peptides are often uniform and consistent with the mean count for the total protein (D).
https://doi.org/10.1371/journal.pone.0319316.s002
(ZIP)
S6 Fig. Short read counts and coverage of mapped reads to four reference transcripts (MAD3b, MDC4, MDC8c, MPO1).
(A) Box plots showing between specimen variation in counts of mapped reads to reference transcripts. These counts are not length-normalized so longer transcripts can have higher counts and comparisons across the different classes of transcripts are not indicative of expression differences. (B) Zoomed view of the MPO1 counts highlighting the variation between specimens. (C) Horizontal bar plots showing the percent of the reference transcript covered (x-axis) by mapped reads from each specimen.
https://doi.org/10.1371/journal.pone.0319316.s003
(PDF)
S7 Fig. Read coverage across the MPO1 reference transcript.
Abundant MPO1 expression correlates with complete read coverage across the MPO1 transcript. Specimens with low total read coverage (A) have gaps in coverage (dashed boxes) across the MPO1 transcript whereas specimens with high MPO1 expression (B) have complete coverage across the transcript.
https://doi.org/10.1371/journal.pone.0319316.s004
(PDF)
S8 Fig. Differences in completeness of assembled transcripts correlate with MPO1 expression level.
(A) Specimens with relatively high MPO1 expression have assembled transcripts that tile across the complete gene and link exons. (B) Specimens with relatively low MPO1 expression lack assembled transcripts that tile across the complete gene and link exons. Additionally, specimen NM1 has a linking transcript that connects exon 8 to exon 11 and lacks alignments to exons 9 and 10 thus suggesting the mutated exons 9 and 10 may be removed through splicing.
https://doi.org/10.1371/journal.pone.0319316.s005
(PDF)
S9 Fig. Detection of full-length and novel MPO1 isoforms with single molecule sequencing.
For two specimens (TX1 and TX3) a full-length MPO1 isoform is detected. For specimen TX3, novel isoforms consisting of spliced exons, novel exons and/or retained introns are also shown. For two specimens (AZ1 and TX4) with low MPO1 protein and mRNA levels, no full-length MPO1 isoforms are detected. Hypothetical translations of four isoforms (PB.5517.400, PB.5517.428, PB.5517.432, PB.5517.437) and alignment to the reference MPO1 sequence identified three sequences (PB.5517.400, PB.5517.432, PB.5517.437) identical to the reference sequence and one (PB.5517.428) with 18 shared amino acids (219–285) with MPO-C (atrolysin-C) but differences in the 3’ sequence. This shows that our approach is able to detect the expression of non-reference MPO genes.
https://doi.org/10.1371/journal.pone.0319316.s006
(PDF)
S10 Fig. Assembled transcripts aligning to the MDC8c gene and linking multiple exons.
Consistency between assembled transcripts linking all exons and single molecule sequencing of full-length MDC8c isoforms. (A) MDC8c assembled transcripts that tile across the complete gene and link exons with the notable exceptions of AZ2 and AZ3. (B) Full-length MDC8c isoforms identified using single molecule sequencing.
https://doi.org/10.1371/journal.pone.0319316.s007
(PDF)
S11 Fig. Assembled transcripts aligning to the MAD3b gene and linking multiple exons.
Single molecule sequencing recovers full-length MAD3b isoforms in specimens that lack assembled transcripts tiling across the gene. (A) MAD3b assembled transcripts that tile across the complete gene and link exons with the exceptions of AZ1, AZ2, AZ3, TX3, TX4. (B) Full-length MAD3b isoforms identified using single molecule sequencing with the exception of AZ1.
https://doi.org/10.1371/journal.pone.0319316.s008
(PDF)
S12 Fig. Evidence for expression of full-length MDC4 in all specimens.
(A) MDC4 assembled transcripts from all specimens tile across the complete gene and link all exons. (B) Full-length MDC4 isoforms identified using single molecule sequencing.
https://doi.org/10.1371/journal.pone.0319316.s009
(PDF)
S13 Fig. Isoforms of MDC4 are detected but often expression is limited to a subset of individual specimens.
(A) Isoform sequencing using Pacific Biosciences technology (IsoSeq) from four specimens (TX1, TX3, TX4, AZ1), followed by quantification of isoform abundance using short reads from all specimens, identified isoforms from several general classes (grey boxes on left showing full-length, splicing, truncation) that align to the genomic region of MDC4 (blue rectangles or arrows show regions of isoform sequence that align; green rectangles in top row show location of annotated MDC4 exons). (B) Flipped box plots with the isoform-level counts (Transcripts Per Million) for seven specimens. Expression is limited or not detected in most specimens for most isoforms. For the three putative full-length isoforms (PB.1752.158_atx240 (TX4), PB.5518.684_atx239 (TX3), PB-3656.442_atx238 (AZ1)), hypothetical translation of only one (PB.1752.158_atx240 (TX4)) yields a full-length protein; the other two isoforms contain nucleotide substitutions that yield a hypothetical truncated protein.
https://doi.org/10.1371/journal.pone.0319316.s010
(PDF)
S14 Fig. MPO1 reactive doublet is competed away with MPO1 peptide.
Signal from the “MPO1 doublet” and lower bands is reduced with excess MPO1 peptide. Initial characterization of the polyclonal MPO1 antibody and screening of individual C. atrox venoms revealed a doublet at ~25 kDa and a lower molecular weight band. These bands may represent non-specific binding activity of the antibody, recognition of post-translationally modified MPO1, an MPO1 allele (doublet) and/or proteolytic processing of MPO1 (lower band). To distinguish between these possibilities, we synthesized the MPO1 peptide antigen and mixed it with the polyclonal MPO1 antibody (10, 30, 100 molar excess peptide). These MPO1 antibody-peptide mixtures were used to probe membrane strips from individual venoms (TX1, TX2, TX9, NM1, NM6). (E) The band intensities of the doublet and lower molecular weight band decreased with increasing peptide, supporting the possibility that the polyclonal antibody is detecting non-canonical MPO1 products (relatively fast and slow migrating polypeptides) that may represent post-translationally modified or proteolytically processed forms of MPO1.
https://doi.org/10.1371/journal.pone.0319316.s011
(PDF)
S15 Fig. Differential gene expression between C. atrox specimens from east and west of the Continental Divide.
MPO1 expression correlates with geographic origin, with high MPO1 expression in specimens found east of the Continental Divide (CD) and low expression in specimens found west of the CD. This motivated us to revisit our venom gland transcriptomes and perform differential gene expression analysis on east/west venom gland transcriptomes. We identified several venom metalloproteinase genes, including MPO1 (pink shading), that are differentially expressed (A, B; blue fill of ball-and-stick, P < 0.001) between the east and west groups. The x-axis shows direction of fold changes (log-transformed; logFC) as a ball-and-stick pointing upwards (higher expression in west vs east) or downwards (lower expression in the west vs east). In addition to MPO1, we also identified four additional MP genes (MAD3b, MAD4b, MDC6b and MDC6e) that are differentially expressed (B). We included MPO-C (pink shading, Atrolysin-C, a class I MP variant not present in our reference genome) in our analysis and found this gene to be expressed in NM1 and TX3 (A, B). The detection of MPO-C only in high-expressing MPO1 specimens suggests that our approach can detect non-reference versions of MPO1 and that the identification of low-expressing specimens may not be the result of those specimens expressing a variant MPO.
https://doi.org/10.1371/journal.pone.0319316.s012
(PDF)
S16 Fig. Independent support for the differential expression of MPO1 in C. atrox populations.
Reads from published C. atrox venom gland transcriptomes (Holding et al., 2021, PNAS) that were annotated as east or west were used with the same gene set used for the analysis in S15 Fig to determine whether any venom metalloproteinase genes are differentially expressed between the east/west groups. The variance in expression levels for all known venom metalloproteinase genes (A) for ten specimens (7, east; 3, west) generally follows the pattern shown in S15A Fig. The same DGE analysis using the published venom gland data reveals similarly decreased MPO1 expression (A, pink shading; P < 0.01). High expression of MPO-C (pink shading) was also detected in two eastern specimens. As in S15 Fig, ball-and-stick plots below the respective box plots show the log fold change in expression for specific genes between the east/west groups. Filled blue circles highlight differentially expressed genes between the east and west groups (B; P < 0.01).
https://doi.org/10.1371/journal.pone.0319316.s013
(PDF)
S17 Fig. Targeted sequencing nucleic acid probe density and locations (black bars) at the MPO1, MDC4, MAD3a regions.
Across the entire MP gene complex probes were at an average density of one probe every 125 nucleotides but there are some larger gaps due to low sequence complexity.
https://doi.org/10.1371/journal.pone.0319316.s014
(PDF)
S18 Fig. Low read coverage suggests the absence of MPO1 in most western specimens.
(A) Aligned read coverage at the MPO1 and MDC4 loci for four western (AZ1–4) and eastern (TX1–3 and NM1) specimens. Four specimens (AZ1, AZ2, AZ4, TX4) have zero or low coverage at the MPO1 gene but the adjacent MDC4 gene has similar coverage relative to the other specimens. Prior work generated a reference genome using specimen TX1, and targeted genomic sequencing of TX1 yielded high coverage across the total MPO1-MDC4 region (upward arrows indicate coverage exceeds the y-axis). (B) High genomic read coverage at the MPO1-MDC4 region for two specimens (TX1 and AZ2) with very high coverage.
https://doi.org/10.1371/journal.pone.0319316.s015
(PDF)
S19 Fig. Targeted genomic sequencing of NM1 yields high coverage across the MPO1 gene and reveals putative nucleotide mutations.
(A) Alignment of NM1 reads to the reference genome shows high coverage across the MPO1 gene with high levels of nucleotide variation. Below the coverage histogram is a pileup of the individual aligned reads (horizontal grey strips) with each nucleotide that is identical to the reference shown in grey while those that differ are shown as a colored tick mark. Zooming in on exons 10 (B), 11 (C) and 14 (D) shows the nucleotide substitutions in the coverage histogram as colored bars with the height proportional to the number of times a substitution is observed. The specific nucleotide is noted in the aligned read. Indels are shown as downward spikes in the coverage histogram and white gaps with a central horizontal bar in the aligned reads. The presence of reads that are identical to the reference sequence (grey strips) and reads that carry all of the nucleotide substitutions is evidence for the presence of two alleles at this region. In contrast, NM1 exon 14 has nucleotide variation but no indels are detected (there are no downward spikes in coverage, and the white regions result from showing the space between the end and beginning of reads), and a hypothetical translation of exon 14 is 100% identical to the reference MPO1 sequence.
https://doi.org/10.1371/journal.pone.0319316.s016
(PDF)
S20 Fig. Alignment and annotation of MPO1 and MDC4 contigs supports inference of genomic deletions at the MPO1 region.
(A) Contigs that align to the MPO1 region were annotated to identify exons. For specimen AZ3, contig 37 aligns to the 5’ region of MPO1 and contains intact exons 1–6, which encode the signal peptide and pro-domain. Contig 5 aligns to MDC4 and extends towards MPO1 but aligns poorly (dashed region). (B) Annotation of this sequence (AZ3, contig 5) revealed metalloproteinase exons 7–12, plus 14. Amino acid percent similarity of exons 7–10 suggests these are MAD5a exons, but exons 12 and 14 are nearly identical (98 and 100%, respectively) to MPO1; thus this may be a chimeric gene fusion created during a recombination event. In the reference assembly MAD5a does not contain an exon 14 and is not adjacent to MDC4. Percent identity of protein sequences identified from the annotation of contig 53 in TX3 supports the identification of MPO1; however, a nucleotide insertion in exon 11 results in a hypothetical protein sequence that differs from the reference MPO1. With respect to NM1, annotation of exon 10 (contig 5) reveals indels that are predicted to result in a translation stop, while exons 11 and 12 are 100% identical (see call outs + and #) to MPO-C (Atrolysin-C), suggesting the identification of a mutated allele of MPO on the assembled contig. (C–F) Alignments of NM1 exon 10 nucleotides (C and E) or amino acids from a hypothetical translation (D and F) show the disruption of coding frame (C, D) that is partially corrected when indels are accounted for (E, F). Compare C and E alignments at positions 13, 17 and 26. Hypothetical translation of exon 10 without accounting for indels yields a sequence highly divergent from the reference MPO1 and contains a stop codon (D, position 23). (F) Accounting for the indels in a hypothetical translation shows the amino acid sequence similarity between NM1 and TX1 exon 10 sequences at the 3’ segment.
https://doi.org/10.1371/journal.pone.0319316.s017
(PDF)
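The effect described for NM1 exon 10 in S20 Fig (C–F), in which an indel shifts the reading frame and a hypothetical translation then runs into a stop codon, can be sketched with a toy sequence. The sequence below is invented for illustration and is not MPO1 or any sequence from this study; the function name is likewise hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate in frame 1; '*' marks a stop; a trailing partial codon is ignored."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Invented 15-nt reference fragment (NOT a real venom gene sequence).
reference = "ATGGCATTGACCGGT"

# Single-nucleotide deletion at 0-based position 4 shifts the frame.
mutant = reference[:4] + reference[5:]

ref_protein = translate(reference)   # "MALTG"
mut_protein = translate(mutant)      # "MD*P": premature stop after two residues
```

Every residue downstream of the deletion changes, and a stop codon appears in the shifted frame, which is the pattern one would expect when a hypothetical translation does not account for indels.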
Acknowledgments
The authors thank Elda E. Sanchez and Mark Hockmuller at the National Natural Toxins Research Center, Texas A&M University-Kingsville for samples; Sara Goodwin from the Cold Spring Harbor Sequencing core for technical consultations; Grzegorz Sabat at the Mass Spectrometry Core facility at the University of Wisconsin-Madison for technical support; Jory van Thiel, Matt Giorgianni, and Rabindra Thakur for comments on the manuscript; and members of the Carroll lab for critical discussions related to this work.
References
- 1. Rausher MD. Co-evolution and plant resistance to natural enemies. Nature. 2001;411(6839):857–64. pmid:11459070
- 2. Carmona D, Fitzpatrick CR, Johnson MTJ. Fifty years of co-evolution and beyond: Integrating co-evolution from molecules to species. Mol Ecol. 2015;24(21):5315–29.
- 3. Ebert D, Fields PD. Host-parasite co-evolution and its genomic signature. Nat Rev Genet. 2020;21(12):754–68. pmid:32860017
- 4. Casewell NR, Wüster W, Vonk FJ, Harrison RA, Fry BG. Complex cocktails: the evolutionary novelty of venoms. Trends Ecol Evol. 2013;28(4):219–29. pmid:23219381
- 5. Phuong MA, Mahardika GN, Alfaro ME. Dietary breadth is positively correlated with venom complexity in cone snails. BMC Genomics. 2016;17(1).
- 6. Holding ML, Biardi JE, Gibbs HL. Coevolution of venom function and venom resistance in a rattlesnake predator and its squirrel prey. Proc R Soc B Biol Sci. 2016;283(1829).
- 7. Smiley-Walters SA, Farrell TM, Gibbs HL. Evaluating local adaptation of a complex phenotype: reciprocal tests of pigmy rattlesnake venoms on treefrog prey. Oecologia. 2017;184(4):739–48. pmid:28516321
- 8. Holding ML, Margres MJ, Rokyta DR, Gibbs HL. Local prey community composition and genetic distance predict venom divergence among populations of the northern Pacific rattlesnake (Crotalus oreganus). J Evol Biol. 2018;31(10):1513–28. pmid:29959877
- 9. Davies EL, Arbuckle K. Coevolution of snake venom toxic activities and diet: evidence that ecological generalism favours toxicological diversity. Toxins. 2019;11(12).
- 10. Koch TL, Robinson SD, Salcedo PF, Chase K, Biggs J, Fedosov AE. Prey Shifts Drive Venom Evolution in Cone Snails. Molecular Biology and Evolution. 2024;41(8).
- 11. Surm JM, Birch S, Macrander J, Jaimes-Becerra A, Fridrich A, Aharoni R. Venom trade-off shapes interspecific interactions, physiology, and reproduction. Sci Adv. 2024;10(11).
- 12. Mackessy SP. Evolutionary trends in venom composition in the western rattlesnakes (Crotalus viridis sensu lato): toxicity vs. tenderizers. Toxicon. 2010;55(8):1463–74. pmid:20227433
- 13. Jackson T, Koludarov I, Ali S, Dobson J, Zdenek C, Dashevsky D. Rapid radiations and the race to redundancy: an investigation of the evolution of Australian elapid snake venoms. Toxins. 2016;8(11):309.
- 14. Casewell NR, Jackson TNW, Laustsen AH, Sunagar K. Causes and Consequences of Snake Venom Variation. Trends Pharmacol Sci. 2020;41(8):570–81. pmid:32564899
- 15. Giorgianni MW, Dowell NL, Griffin S, Kassner VA, Selegue JE, Carroll SB. The origin and diversification of a novel protein family in venomous snakes. Proc Natl Acad Sci U S A. 2020;117(20):10911–20. pmid:32366667
- 16. Dowell NL, Giorgianni MW, Kassner VA, Selegue JE, Sanchez EE, Carroll SB. The Deep Origin and Recent Loss of Venom Toxin Genes in Rattlesnakes. Curr Biol. 2016;26(18):2434–45. pmid:27641771
- 17. Dowell NL, Giorgianni MW, Griffin S, Kassner VA, Selegue JE, Sanchez EE. Extremely Divergent Haplotypes in Two Toxin Gene Complexes Encode Alternative Venom Types within Rattlesnake Species. Curr Biol. 2018;28(7):1016-1026.e4.
- 18. Glenn JL, Straight RC, Wolt TB. Regional variation in the presence of canebrake toxin in Crotalus horridus venom. Comp Biochem Physiol Pharmacol Toxicol Endocrinol. 1994;107(3):337–46. pmid:8061939
- 19. Boldrini-França J, Corrêa-Netto C, Silva MMS, Rodrigues RS, De La Torre P, Pérez A, et al. Snake venomics and antivenomics of Crotalus durissus subspecies from Brazil: assessment of geographic variation and its implication on snakebite management. J Proteomics. 2010;73(9):1758–76. pmid:20542151
- 20. Massey DJ, Calvete JJ, Sánchez EE, Sanz L, Richards K, Curtis R, et al. Venom variability and envenoming severity outcomes of the Crotalus scutulatus scutulatus (Mojave rattlesnake) from Southern Arizona. J Proteomics. 2012;75(9):2576–87. pmid:22446891
- 21. Sunagar K, Undheim EAB, Scheib H, Gren ECK, Cochran C, Person CE, et al. Intraspecific venom variation in the medically significant Southern Pacific Rattlesnake (Crotalus oreganus helleri): biodiscovery, clinical and evolutionary implications. J Proteomics. 2014;99:68–83. pmid:24463169
- 22. Zancolli G, Calvete JJ, Cardwell MD, Greene HW, Hayes WK, Hegarty MJ. When one phenotype is not enough: Divergent evolutionary trajectories govern venom variation in a widespread rattlesnake species. Proc R Soc B Biol Sci. 2019;286(1898).
- 23. Robinson KE, Holding ML, Whitford MD, Saviola AJ, Yates JR, Clark RW. Phenotypic and functional variation in venom and venom resistance of two sympatric rattlesnakes and their prey. J Evol Biol. 2021;34(9).
- 24. Smith CF, Nikolakis ZL, Ivey K, Perry BW, Schield DR, Balchan NR. Snakes on a plain: biotic and abiotic factors determine venom compositional variation in a wide-ranging generalist rattlesnake. BMC Biol. 2023;21(1).
- 25. Margres MJ, Wray KP, Seavy M, McGivern JJ, Sanader D, Rokyta DR. Phenotypic integration in the feeding system of the eastern diamondback rattlesnake (Crotalus adamanteus). Mol Ecol. 2015;24(13):3405–20. pmid:25988233
- 26. Margres MJ, Walls R, Suntravat M, Lucena S, Sánchez EE, Rokyta DR. Functional characterizations of venom phenotypes in the eastern diamondback rattlesnake (Crotalus adamanteus) and evidence for expression-driven divergence in toxic activities among populations. Toxicon. 2016;119:28–38.
- 27. Gutiérrez JM, Escalante T, Rucavado A, Herrera C. Hemorrhage caused by snake venom metalloproteinases: A journey of discovery and understanding. Toxins. 2016;8(4).
- 28. Fox JW, Serrano SMT. Insights into and speculations about snake venom metalloproteinase (SVMP) synthesis, folding and disulfide bond formation and their contribution to venom complexity. FEBS J. 2008;275(12):3016–30. pmid:18479462
- 29. Herrera C, Escalante T, Voisin MB, Rucavado A, Morazán D, Macêdo JKA. Tissue localization and extracellular matrix degradation by PI, PII and PIII snake venom metalloproteinases: clues on the mechanisms of venom-induced hemorrhage. PLoS Negl Trop Dis. 2015;9(4).
- 30. Ukken FP, Dowell NL, Hajra M, Carroll SB. A novel broad spectrum venom metalloproteinase autoinhibitor in the rattlesnake Crotalus atrox evolved via a shift in paralog function. Proc Natl Acad Sci. 2022;119(51).
- 31. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094–100.
- 32. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. 2016;34(5):525–7. pmid:27043002
- 33. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40(10):4288–97. pmid:22287627
- 34. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016.
- 35. Frith MC, Kawaguchi R. Split-alignment of genomes finds orthologies more accurately. Genome Biol. 2015;16(1):106. pmid:25994148
- 36. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10(2).
- 37. Zhang D, Botos I, Gomis-Rüth FX, Doll R, Blood C, Njoroge FG. Structural interaction of natural and synthetic inhibitors with the venom metalloproteinase, atrolysin C (form d). Proc Natl Acad Sci. 1994;91(18):8447–51.
- 38. Igarashi T, Araki S, Mori H, Takeda S. Crystal structures of catrocollastatin/VAP2B reveal a dynamic, modular architecture of ADAM/adamalysin/reprolysin family proteins. FEBS Lett. 2007;581(13):2416–22. pmid:17485084
- 39. Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15(6):461–8. pmid:29713083
- 40. Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature Methods. 2013;10(6):563–9.
- 41. Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, et al. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol. 2012;30(7):693–700. pmid:22750884
- 42. Berlin K, Koren S, Chin C-S, Drake JP, Landolin JM, Phillippy AM. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol. 2015;33(6):623–30. pmid:26006009
- 43. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019;37(5):540–6.
- 44. Calvete JJ, Fasoli E, Sanz L, Boschetti E, Righetti PG. Exploring the venom proteome of the western diamondback rattlesnake, Crotalus atrox, via snake venomics and combinatorial peptide ligand library approaches. J Proteome Res. 2009;8(6):3055–67.
- 45. Pebesma E. Simple feature for R: standardized support for spatial vector data. R J. 2018;10(1):439–46.
- 46. Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York; 2016.
- 47. Commission for Environmental Cooperation, Statistics Canada, United States Census Bureau, Instituto Nacional de Estadística y Geografía. North American Atlas - Political Boundaries. Ottawa, Ontario, Canada: Government of Canada; Statistics Canada, United States Census Bureau, Instituto Nacional de Estadística y Geografía (INEGI), Commission for Environmental Cooperation; 2022. https://www.cec.org/north-american-environmental-atlas/political-boundaries-2021/
- 48. NatureServe, IUCN (International Union for Conservation of Nature). Crotalus atrox (spatial data). The IUCN Red List of Threatened Species. 2007 [cited 2017 June 23]. https://www.iucnredlist.org
- 49. Rex CJ, Mackessy SP. Venom composition of adult Western Diamondback Rattlesnakes (Crotalus atrox) maintained under controlled diet and environmental conditions shows only minor changes. Toxicon. 2019;164:51–60.
- 50. National Atlas of the United States. Continental Divide of the United States. 2002 [cited 2023]. http://purl.stanford.edu/pw312bv3382
- 51. Holding ML, Strickland JL, Rautsaw RM, Hofmann EP, Mason AJ, Hogan MP. Phylogenetically diverse diets favor more complex venoms in North American pitvipers. Proceedings of the National Academy of Sciences. 2021;118(17).
- 52. Schield DR, Card DC, Adams RH, Jezkova T, Reyes-Velasco J, Proctor FN. Incipient speciation with biased gene flow between two lineages of the Western Diamondback Rattlesnake (Crotalus atrox). Mol Phylogenet Evol. 2015;83:213–23.
- 53. Lyons K, Dugon MM, Healy K. Diet breadth mediates the prey specificity of venom potency in snakes. Toxins. 2020;12(2).
- 54. Schaeffer R, Pascolutti VJ, Jackson TNW, Arbuckle K. Diversity Begets Diversity When Diet Drives Snake Venom Evolution, but Evenness Rather Than Richness Is What Counts. Toxins. 2023;15(4).
- 55. Beavers RA. Food habits of the western diamondback rattlesnake, Crotalus atrox, in Texas (Viperidae). The Southwestern Naturalist. 1976;20:503–15.
- 56. Reynolds RP, Scott NJ. Use of a mammalian resource by a Chihuahuan snake community. Woodcock ecology and management: papers from the seventh Woodcock Symposium held at the Pennsylvania State University, University Park, Pennsylvania, 28-30 October 1980. 1981. 99–117.
- 57. Spencer CL. Geographic variation in the morphology, diet and reproduction of a widespread pit viper, the Western Diamondback rattlesnake (Crotalus atrox). University of Texas at Arlington; 2003.
- 58. Loughran CL, Nowak EM, Schofer J, Sullivan KO, Sullivan BK. Lagomorphs as prey of western diamond-backed rattlesnakes (Crotalus atrox) in Arizona. Southwest Nat. 2013;58(4):502–5.
- 59. Pisani GR, Stephenson BR. Food habits in Oklahoma Crotalus atrox in fall and early spring. Trans Kans Acad Sci. 1991;94(3/4):137.
- 60. Martinez RR, Pérez JC, Sánchez EE, Campos R. The antihemorrhagic factor of the Mexican ground squirrel, (Spermophilus mexicanus). Toxicon. 1999;37(6):949–54. pmid:10340834
- 61. Perez JC, Pichyangkul S, Garcia VE. The resistance of three species of warm-blooded animals to Western diamondback rattlesnake (Crotalus atrox) venom. Toxicon. 1979;17(6):601–7. pmid:524385
- 62. Casewell NR, Wagstaff SC, Harrison RA, Renjifo C, Wüster W. Domain loss facilitates accelerated evolution and neofunctionalization of duplicate snake venom metalloproteinase toxin genes. Mol Biol Evol. 2011;1–38.
- 63. Schield DR, Perry BW, Adams RH, Holding ML, Nikolakis ZL, Gopalan SS. The roles of balancing selection and recombination in the evolution of rattlesnake venom. Nat Ecol Evol. 2022;6(9):1367–80.
- 64. Smith EG, Surm JM, Macrander J, Simhi A, Amir G, Sachkova MY. Micro and macroevolution of sea anemone venom phenotype. Nat Commun. 2023;14(1).
- 65. Lahti DC, Johnson NA, Ajie BC, Otto SP, Hendry AP, Blumstein DT, et al. Relaxed selection in the wild. Trends Ecol Evol. 2009;24(9):487–96. pmid:19500875
- 66. McCue MD. Cost of producing venom in three North American pitviper species. Copeia. 2006;2006(4):818–25.
- 67. Sachkova MY, Macrander J, Surm JM, Aharoni R, Menard-Harvey SS, Klock A, et al. Some like it hot: population-specific adaptations in venom production to abiotic stressors in a widely distributed cnidarian. BMC Biol. 2020;18(1):121. pmid:32907568
- 68. Gittleman JL, Thompson SD. Energy Allocation in Mammalian Reproduction. Am Zool. 1988;28(3):863–75.
- 69. Friesen CR, Powers DR, Copenhaver PE, Mason RT. Size dependence in non-sperm ejaculate production is reflected in daily energy expenditure and resting metabolic rate. J Exp Biol. 2015;218(Pt 9):1410–8. pmid:25954044
- 70. Li M, Fry BG, Kini RM. Eggs-only diet: its implications for the toxin profile changes and ecology of the marbled sea snake (Aipysurus eydouxii). J Mol Evol. 2005;60(1):81–9. pmid:15696370
- 71. Margres MJ, Rautsaw RM, Strickland JL, Mason AJ, Schramer TD, Hofmann EP. The tiger rattlesnake genome reveals a complex genotype underlying a simple venom phenotype. Proc Natl Acad Sci. 2021;118(4).
- 72. Olson MV. When less is more: gene loss as an engine of evolutionary change. Am J Hum Genet. 1999;64(1):18–23. pmid:9915938
- 73. Albalat R, Cañestro C. Evolution by gene loss. Nat Rev Genet. 2016;17(7):379–91. pmid:27087500
- 74. Monroe JG, McKay JK, Weigel D, Flood PJ. The population genomics of adaptive loss of function. Heredity (Edinb). 2021;126(3):383–95. pmid:33574599
- 75. Roback EY, Ferrufino E, Moran RL, Shennard D, Mulliniks C, Gallop J, et al. Population Genomics of Premature Termination Codons in Cavefish With Substantial Trait Loss. Mol Biol Evol. 2025;42(2):msaf012. pmid:39833658
- 76. Hecker N, Sharma V, Hiller M. Convergent gene losses illuminate metabolic and physiological changes in herbivores and carnivores. Proc Natl Acad Sci U S A. 2019;116(8):3036–41. pmid:30718421
- 77. Huelsmann M, Hecker N, Springer MS, Gatesy J, Sharma V, Hiller M. Genes lost during the transition from land to water in cetaceans highlight genomic changes associated with aquatic adaptations. Sci Adv. 2019;5(9).
- 78. Sharma V, Hecker N, Roscito JG, Foerster L, Langer BE, Hiller M. A genomics approach reveals insights into the importance of gene losses for mammalian adaptations. Nature Communications. 2018;9(1).
- 79. Yin D, Haag ES. Evolution of sex ratio through gene loss. Proc Natl Acad Sci U S A. 2019;116(26):12919–24. pmid:31189601
- 80. Blumer M, Brown T, Freitas MB, Destro AL, Oliveira JA, Morales AE. Gene losses in the common vampire bat illuminate molecular adaptations to blood feeding. Sci Adv. 2022;8(12).
- 81. Zheng Z, Hua R, Xu G, Yang H, Shi P. Gene losses may contribute to subterranean adaptations in naked mole-rat and blind mole-rat. BMC Biol. 2022;20(1).