Antimicrobial Functions of Lactoferrin Promote Genetic Conflicts in Ancient Primates and Modern Humans

Lactoferrin is a multifunctional mammalian immunity protein that limits microbial growth through sequestration of nutrient iron. Additionally, lactoferrin possesses cationic protein domains that directly bind and inhibit diverse microbes. The implications of these dual functions for lactoferrin evolution and genetic conflicts with microbes remain unclear. Here we show that lactoferrin has been subject to recurrent episodes of positive selection during primate divergence, predominantly at antimicrobial peptide surfaces, consistent with long-term antagonism by bacteria. An abundant lactoferrin polymorphism in human populations and Neanderthals also exhibits signatures of positive selection across primates, linking ancient host-microbe conflicts to modern human genetic variation. Rapidly evolving sites in lactoferrin further correspond to molecular interfaces with opportunistic bacterial pathogens causing meningitis, pneumonia, and sepsis. Because microbes actively target lactoferrin to acquire iron, we propose that the emergence of antimicrobial activity provided a pivotal mechanism of adaptation sparking evolutionary conflicts via acquisition of new protein functions.


Author Summary
Immunity genes can evolve rapidly in response to antagonism by microbial pathogens, but how the emergence of new protein functions impacts such evolutionary conflicts remains unclear. Here we have traced the evolutionary history of the lactoferrin gene in primates, which in addition to an ancient iron-binding function, acquired antimicrobial peptide activity in mammals. We show that, in contrast to the related gene transferrin, lactoferrin has rapidly evolved at protein domains that mediate iron-independent antimicrobial functions. We also pinpoint signatures of natural selection acting on lactoferrin in human populations, suggesting that lactoferrin genetic diversity has impacted the evolutionary success of both ancient primates and humans. Our work demonstrates how the emergence of new host immune protein functions can drastically alter evolutionary and molecular interactions with microbes.

Introduction
Genetic conflicts between microbes and their hosts are an important source of evolutionary innovation [1]. Selective forces imposed by these antagonistic interactions can give rise to dramatic bouts of adaptive gene evolution through positive selection. J.B.S. Haldane originally speculated on the importance of infectious disease as an "evolutionary agent" over 60 years ago [2], and the Red Queen hypothesis later posited that predators and their prey (or pathogens and their hosts) must constantly adapt in order to sustain comparative fitness [3,4]. More recent studies have demonstrated how evolutionary conflicts progress at the single gene or even single nucleotide level, as molecular interfaces between host and microbial proteins can strongly impact virulence and immunity [5][6][7]. Host-pathogen interactions thus provide fertile ground for studying rapid gene evolution and acquisition of novel molecular traits [8].
Lactoferrin presents a compelling model for investigating adaptation from an ancestral "housekeeping" function to a specialized immunity factor. Lactoferrin arose from a duplication of the transferrin gene in the ancestor of eutherian mammals roughly 160 million years ago [9]. A fundamental and shared feature of these proteins is the presence of two evolutionarily and structurally homologous iron binding domains, the N and C lobes, each of which chelates a single iron ion with high affinity. Iron binding by these proteins can effectively starve microbes of this crucial metal, a protective effect termed nutritional immunity [10,11]. Microbes in turn actively scavenge iron from these and other host proteins in order to meet their nutrient requirements [12,13]. The importance of iron in human infectious disease is highlighted by genetic disorders of iron overload, such as hereditary hemochromatosis, which render affected individuals highly susceptible to bacterial and fungal infections [14,15]. In addition to its role in nutritional immunity, lactoferrin has acquired new immune functions independent of iron binding following its emergence in mammals. Lactoferrin is expressed in a variety of tissues and fluids including breast milk, colostrum, saliva, tears, and mucus, as well as the secondary granules of neutrophils, and possesses broad antimicrobial activity [16]. Portions of the lactoferrin N lobe are highly cationic, facilitating interaction with and disruption of microbial membranes. Two regions of the lactoferrin N lobe in particular, lactoferricin and lactoferrampin, can be liberated from the lactoferrin polypeptide by proteolytic cleavage and exhibit potent antimicrobial activity against bacteria, fungi, and viruses [17,18]. Lactoferrin, as well as lactoferricin alone, can directly bind the lipid A component of lipopolysaccharide (LPS) as well as lipoteichoic acid, contributing to interactions with surfaces of Gram-negative and Gram-positive bacteria [19,20].
Lactoferrin thus poses a unique challenge for microbes: while its ability to bind iron makes it an attractive target for "iron piracy," lactoferrin surface receptors could render cells more susceptible to associated antimicrobial activity. Despite a growing appreciation for lactoferrin's immune properties, the evolutionary implications of these unique functions remain unclear. In the present study we decipher recent signatures of natural selection acting on lactoferrin in primates as well as modern humans to understand the evolutionary consequences of a newly acquired antimicrobial activity from a distinct ancestral function.

Positive selection has shaped the lactoferrin N lobe in primates
To assess the evolutionary history of lactoferrin in primates, we assembled gene orthologs from publicly available databases and cloned lactoferrin complementary DNA (cDNA) prepared from primary cell lines. In total, we compared 15 lactoferrin orthologs from hominoids, Old World, and New World monkeys, representing roughly 40 million years of primate divergence (Fig 1A and S1 Fig). We then used maximum likelihood-based phylogenetic approaches (performed with the PAML and HyPhy software packages) to calculate nonsynonymous to synonymous substitution rate ratios (dN/dS) across this gene phylogeny [21][22][23]. For our study we included the N-terminal 19 amino acid positions of the full-length lactoferrin protein, which are removed during processing of the mature polypeptide in humans. Our analysis indicated that lactoferrin has evolved under episodic positive selection in the primate lineage, consistent with a history of evolutionary conflict with microbes (Fig 1A and S1-S7 Tables). These findings are also in line with previous genome-wide scans for positive selection in primates which identified the lactoferrin gene (LTF) among other candidate loci [24]. We next determined signatures of selection across individual codons in lactoferrin. In total, 17 sites displayed strong evidence of positive selection (posterior probability >0.95 from Naïve Empirical Bayes and Bayes Empirical Bayes analyses in PAML), with 13 of the 17 sites found in the N lobe (Fig 1B and 1C and S1 Fig and S2, S4, S5 and S6 Tables). This observation was notably dissimilar from a parallel analysis of primate serum transferrin, where sites under positive selection were restricted to the C lobe (Fig 1B and 1C and S3 Table). These results are further consistent with our previous work indicating that rapid evolution in primate transferrin is likely due to antagonism by the bacterial iron acquisition receptor TbpA, which exclusively binds the transferrin C lobe [25][26][27][28].
Thus, while lactoferrin and transferrin both exhibit signatures of positive selection in primates, patterns of selection across the two proteins are highly discordant.

Evolution and diversity of lactoferrin in modern humans
Evidence of episodic positive selection in primate lactoferrin led us to more closely investigate variation of this gene across human populations. Data from the 1000 Genomes Project revealed six nonsynonymous polymorphisms at greater than 1% allele frequency in humans (S8 Table).
Of the 17 sites we identified as rapidly evolving across primate species, amino acid position 47 overlapped with a high frequency arginine (R) to lysine (K) substitution in the N lobe of lactoferrin in humans (Fig 2A and S8 and S9 Tables). This position is markedly polymorphic between populations; while individuals of African ancestry carry the K47 allele at about 1% frequency, this variant is found in non-African populations at roughly 30-65% allele frequency, with the highest frequencies observed among Europeans (Fig 2B and S9 Table). The presence of R47 in related great apes combined with its high frequency in African populations suggests that R47 is in fact the ancestral allele in humans. Data from the Neanderthal genome browser (http://neandertal.ensemblgenomes.org) further revealed lysine to be the consensus residue at position 47 in recently sequenced Neanderthals. The presence of the lactoferrin K47 allele in Neanderthal and non-African human populations and its near absence in Africans suggests one of several intriguing genetic models for the history of this variant, including long-term allelic diversity in hominins, convergent evolution, or introgression from Neanderthals into modern humans.
Given the shared variation at position 47 between primate species and among human populations, we sought to determine whether lactoferrin exhibits signatures of positive selection in modern humans. Calculation of pairwise FST between a subset of human populations identified an elevated signal of differentiation between European (CEU) and African (YRI) populations [29], consistent with observed differences in allele frequencies between these groups (S2 Fig). We next applied measures of haplotype homozygosity to assess the possibility that the K47 haplotype has been subject to natural selection in humans. Linkage around R47 alleles breaks down rapidly within a few kilobases, while the K47 variant possesses an extended haplotype (homozygosity of 0.5 at 21,913 bases), consistent with the possibility of an adaptive sweep in this genomic region (Fig 2C). A selective sweep is also consistent with bifurcation plots around position 47, where the K47 haplotypes possess increased homogeneity relative to R47 haplotypes (Fig 2D). We observed a slight elevation of the genome-wide corrected integrated haplotype score (iHS) for the K47 allele (-1.40136) and a depletion of observed heterozygosity (S2, S3 and S4 Figs). We also examined the patterns of cross population extended haplotype homozygosity (XP-EHH). Consistent with the FST and EHH results, the XP-EHH score was elevated at the K47 position when CEU individuals were compared against YRI (1.1; p-value: 0.129) or CHB (3.1; p-value: 0.003) (S5 Fig). While XP-EHH between CEU and YRI was moderate, surrounding SNPs less than 3 kilobases away had values as high as 2.89 (rs189460549; p-value: 0.01). Genome-wide, the K47 XP-EHH signal is moderate compared to other loci. Next we compared the joint distribution of the p-values from dN/dS analyses [24] with the empirical p-values from the CEU-CHB XP-EHH analyses (S6 Fig).
The previous genome-wide rank for lactoferrin, from dN/dS analyses, was 226 before considering the joint distribution and 156 after. The top 20 genes with the greatest change in rank (dN/dS p-value < 0.01) include BLK, DSG1, FAS, SLC15A1, GLMN, SULT1C3, WIPF1, and LTF. This meta-analysis highlights candidate genes that have undergone species-level as well as population-level selection in primates and humans, respectively. By integrating molecular phylogenetic analyses and population genetics approaches, we pinpointed signatures of positive selection associated with an abundant human lactoferrin polymorphism.
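To make the haplotype homozygosity measure concrete, the following is a minimal sketch of how EHH can be computed from phased haplotypes that share a core allele. It is not the rehh or GPAT++ implementation used in this study; the function name `ehh` and the toy haplotypes are illustrative assumptions, and real analyses compute the decay in both directions from the core site across thousands of chromosomes.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, end):
    """Extended haplotype homozygosity over markers 0..end (inclusive).

    Marker 0 is the core site. `haplotypes` holds equal-length strings of
    phased alleles that all carry the same core allele. EHH is the
    probability that two randomly chosen haplotypes are identical across
    the whole interval.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    # Group haplotypes that are identical from the core out to `end`.
    classes = Counter(h[: end + 1] for h in haplotypes)
    identical_pairs = sum(comb(count, 2) for count in classes.values())
    return identical_pairs / comb(n, 2)


# Toy, made-up haplotypes (not data from this study).
toy_haplotypes = ["0101", "0101", "0111", "0100"]
ehh_decay = [ehh(toy_haplotypes, k) for k in range(4)]
```

In this toy example EHH starts at 1 at the core site and decays as markers farther away split the haplotypes into distinct classes. An extended haplotype such as the one described for K47 corresponds to a slow decay of this curve with distance.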

Rapid evolution of lactoferrin-derived antimicrobial peptides
Signatures of positive selection in the lactoferrin N lobe among diverse primates, including position 47 in humans, led us to more closely investigate evolutionary pressures that have influenced variation in this region. After gene duplication from ancestral transferrin, lactoferrin gained potent antimicrobial activities independent of iron binding through cationic domains capable of disrupting microbial membranes. Two portions of the lactoferrin N lobe in particular, termed lactoferricin (amino acids 20-67 in full-length protein; 1-48 in mature protein) and lactoferrampin (amino acids 288-304 in full-length protein; 269-285 in mature protein), have been implicated in these antimicrobial functions [18,30].
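Because positions in this study are reported in full-length numbering while much of the peptide literature uses mature-protein numbering, a small conversion helper can prevent off-by-19 errors. The sketch below uses only the offsets and peptide boundaries stated in the text; the function and constant names are illustrative assumptions.

```python
# Number of N-terminal residues removed during processing of the mature
# polypeptide, as stated in the text.
PROCESSED_N_TERMINAL_RESIDUES = 19

# Peptide boundaries in full-length numbering, as given in the text.
PEPTIDES_FULL_LENGTH = {
    "lactoferricin": (20, 67),
    "lactoferrampin": (288, 304),
}


def full_to_mature(position):
    """Convert full-length numbering to mature-protein numbering.

    Returns None for positions inside the processed N-terminal segment.
    """
    if position < 1:
        raise ValueError("positions are 1-based")
    if position <= PROCESSED_N_TERMINAL_RESIDUES:
        return None
    return position - PROCESSED_N_TERMINAL_RESIDUES


def peptide_at(position):
    """Return the antimicrobial peptide containing a full-length position, if any."""
    for name, (start, end) in PEPTIDES_FULL_LENGTH.items():
        if start <= position <= end:
            return name
    return None
```

For example, `peptide_at(47)` returns `"lactoferricin"`, matching the placement of position 47 within the lactoferricin region in full-length numbering.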
Phylogenetic analysis revealed that several sites corresponding to lactoferricin and lactoferrampin display signatures of positive selection (Fig 3A and 3B). Notably, positive selection in lactoferricin localized to sites harboring cationic (lysine, arginine) or polar uncharged residues (asparagine), which could mediate membrane disruption and regulate antimicrobial activity. Position 47, which exhibits signatures of selection in humans as well as other primates, also lies within the lactoferricin peptide region. In contrast, hydrophobic tryptophan residues proposed to mediate insertion into microbial membranes are completely conserved among primates, as are cysteine residues that participate in intramolecular disulfide bond formation (Fig 3A). We also observed rapid evolution of the position immediately C-terminal to the pepsin cleavage site in lactoferrampin (Fig 3A), suggesting that the precise cleavage site in this peptide may be variable among species. Notably, the proteases responsible for lactoferrin processing in mucosal secretions and neutrophils remain elusive; identification of such factors will assist in revealing the consequences of genetic variation proximal to cleavage sites. Expanding our phylogenetic analysis to other mammalian taxa, we found that lactoferrin also exhibits signatures of positive selection in rodents and carnivores (S7 Fig and S10 Table). While the specific positions that contribute most strongly to these signatures could not be resolved with high confidence, N-terminal regions corresponding to lactoferricin in primates are absent in several rodent and carnivore transcripts, suggesting that this activity may have been lost or modified in divergent mammals. These observations are further consistent with previous work which identified signatures of positive selection in lactoferrin antimicrobial peptide domains across diverse mammals [31].
Together these results demonstrate that lactoferrin-derived cationic peptides of the N lobe are rapidly evolving at sites critical for antimicrobial action.
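One simple way to see why substitutions at cationic sites could matter is to tally the net charge of a peptide. The sketch below is a crude approximation that counts lysine and arginine as +1 and aspartate and glutamate as -1, ignoring histidine, the termini, and pH. The peptide sequences are made up for illustration and are not lactoferricin sequences.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def net_charge(peptide):
    """Approximate net charge from counts of charged side chains only."""
    peptide = peptide.upper()
    positive = sum(aa in POSITIVE for aa in peptide)
    negative = sum(aa in NEGATIVE for aa in peptide)
    return positive - negative


# Hypothetical peptides: the second swaps one lysine for asparagine.
toy_peptide = "KRWKWD"
toy_variant = "NRWKWD"
charge_change = net_charge(toy_variant) - net_charge(toy_peptide)
```

Under this approximation an arginine-to-lysine change, such as the one at position 47, leaves the net charge unchanged, so any functional effect of that substitution would have to arise from other side-chain properties.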

Distinct microbial interfaces are subject to positive selection in lactoferrin
While rapid evolution of the lactoferrin N lobe may reflect selection for improved targeting of microbial surfaces, it could also represent adaptations that prevent binding by inhibitors encoded by bacteria. For example, pneumococcal surface protein A (PspA) is a crucial virulence determinant of Streptococcus pneumoniae, and several studies have demonstrated that PspA specifically binds and inhibits antimicrobial portions of the lactoferrin N lobe [32]. Consistent with an important evolutionary impact for this interaction, numerous sites under positive selection in the lactoferrin N lobe lie proximal to the PspA binding interface [33], including those corresponding to the lactoferricin peptide (Fig 3C). These data suggest that adaptive substitutions in lactoferrin could negate PspA binding, leading to enhanced immunity against S. pneumoniae or related pathogens.
Many strains of pathogenic Neisseria, which cause the sexually transmitted disease gonorrhea as well as acute meningitis, encode lactoferrin binding proteins (LbpA and LbpB) which mediate iron acquisition from lactoferrin [34,35]. Of four sites identified under positive selection in the lactoferrin C lobe, at least two appear proximal to the proposed Neisseria LbpA binding interface based on recent molecular modeling studies (S8 Fig) [36]. One of these, position 589, also aligns to a region under strong positive selection in transferrin (position 576 in humans) which directly contacts the related bacterial receptor TbpA (Fig 1B) [28]. These findings suggest that, similarly to transferrin, antagonism by bacterial Lbp proteins may have promoted natural selection in the lactoferrin C lobe. Signatures of selection at distinct lactoferrin-pathogen interfaces thus highlight the diverse conflicts that have arisen during the evolution of this unique immunity factor.

Discussion
Together our results suggest that the emergence of novel antimicrobial activity in the N lobe of lactoferrin strongly influenced host-microbe interactions in primates, including modern humans (Fig 4). High disparity in sites under positive selection between the N and C lobes of lactoferrin and transferrin indicates that distinct selective pressures influenced these proteins during primate evolution. We previously demonstrated that primate transferrin has been engaged in recurrent evolutionary conflicts with the bacterial receptor, TbpA [25]. This receptor is an important virulence factor in several Gram-negative opportunistic pathogens including Neisseria gonorrhoeae, Neisseria meningitidis, Haemophilus influenzae, as well as related animal pathogens [26,37,38,39]. Notably, TbpA binds and extracts iron exclusively from the C lobe of transferrin, and signatures of positive selection in transferrin are almost entirely restricted to the TbpA binding interface (Fig 1) [25]. The fact that transferrin family proteins are recurrently targeted by microbes for iron acquisition may have provided the selective advantage for antimicrobial functions that arose in the lactoferrin N lobe.
Our results suggest at least two non-mutually exclusive scenarios for evolutionary conflicts involving the lactoferrin N lobe. Positive selection in this region could reflect adaptation of lactoferrin for enhanced targeting of variable pathogen surfaces. Lactoferricin is capable of binding bacterial LPS, which itself is heavily modified in many human-associated bacteria to mediate immune evasion and could provoke counter-adaptations at this interface. Conversely, variation in the lactoferrin N lobe could negate interactions with bacterial inhibitory proteins such as PspA encoded by S. pneumoniae. Lactoferrin binding activity has also been identified in several other important bacterial pathogens including Treponema pallidum [40], Staphylococcus aureus [41], and Shigella flexneri [42], raising the possibility of multiple independent evolutionary conflicts playing out at the lactoferrin N lobe. Iron-loaded lactoferrin could further be viewed as a "Trojan horse," where microbes that target it as a nutrient iron source may be more susceptible to antimicrobial peptides. Consistent with this hypothesis, recent work has suggested that Neisseria-encoded LbpB recognizes the lactoferrin N lobe, in contrast to its homolog TbpB which selectively interacts with the iron-loaded C lobe of transferrin [35,43,44]. LbpB binding to the lactoferrin N lobe could thus provide a counter-adaptation with dual benefits by neutralizing lactoferrin antimicrobial activity through negatively charged protein surfaces while simultaneously promoting iron acquisition by its co-receptor, LbpA [43]. These observations point to adaptations involving de novo protein functions on both sides of this molecular interface.
It is important to note that many "pathogenic" bacteria that routinely encounter lactoferrin in the respiratory mucosa are generally commensals that rarely cause disease. For example, H. influenzae colonizes a huge proportion of the human population but typically only causes disease in young children who lack a robust immune response. In addition, the dual functions of lactoferrin likely have pleiotropic effects on complex microbial communities in the host mucosa, with inhibition of some members creating new niches for others. Thus, the evolutionary forces acting on lactoferrin and the consequences for positive selection are likely more nuanced than a two-dimensional host-pathogen arms race. Future studies aimed at understanding the functional impact of lactoferrin variation will assist in understanding such complex biological effects.
Our results raise the possibility that the lactoferrin K47 variant introgressed into humans from Neanderthals at some point after the out-of-Africa expansion [45]. An alternative explanation could be convergent evolution of lactoferrin in distinct lineages of early hominins for enhanced immune function. Recent reports indicate that the human lactoferrin K47 variant, within the N lobe lactoferricin peptide, may have a protective effect against dental cavities associated with pathogenic bacteria [46]. Moreover, saliva isolated from patients homozygous for the K47 variant possesses enhanced antibacterial activity against oral Streptococci relative to saliva from homozygous R47 individuals [47]. Future analysis of lactoferrin sequence in archaic humans could provide additional insight on the history and functional properties of this variant. Together these studies provide a direct link between variation in the lactoferrin N lobe and protection against disease-causing bacteria, consistent with adaptive evolution of lactoferrin in humans and other primates.
Notably, the lactoferrin gene, LTF, is located only ~60 kilobases away from CCR5, a chemokine receptor which is also an entry receptor for HIV [48][49][50][51][52]. A 32-base pair deletion in CCR5 (CCR5-Δ32) confers resistance to HIV infection, and is present at a high frequency in northern Europeans while absent from African populations [53]. Although early evidence suggested that CCR5-Δ32 might itself be subject to positive selection in humans, more recent studies have concluded that these signatures are more consistent with neutral evolution [54]. It is intriguing that, like CCR5-Δ32, the lactoferrin K47 variant exhibits increased allele frequency in European populations relative to Africans. However, the presence of the K47 variant at high frequencies in Asian and American populations points to a much earlier origin for this variant than CCR5-Δ32. Moreover, EHH and bifurcation analyses indicate that the haplotypes associated with the lactoferrin K47 variant do not encompass CCR5, suggesting that variation at the CCR5 locus is unlikely to contribute to signatures of selection in LTF (Fig 2B and 2C and S9 Table). The proximity of the LTF and CCR5 genes combined with their high degree of polymorphism and shared roles in immunity suggest the potential for genetic interactions relating to host defense. Future studies could reveal functional or epidemiological links between these two factors in human immunity.
In summary, we have discovered that lactoferrin constitutes a crucial node of host-microbe evolutionary conflict based on signatures of natural selection across primates, including humans. Our findings suggest an intriguing mechanism for molecular arms race dynamics where adaptations and counter-adaptations rapidly emerge at the level of new protein functions in addition to recurrent amino acid substitutions at a single protein interface (Fig 4). Our evolutionary analyses highlight how the process of gene duplication and subfunctionalization can drastically alter the progression of host-microbe genetic conflicts.

cDNA cloning and sequencing
RNA (50 ng) from each primate cell line was prepared (RNeasy kit; Qiagen) and used as template for RT-PCR (SuperScript III; Invitrogen). Primers used to amplify lactoferrin cDNA were as follows: GTGGCAGAGCCTTCGTTTGCC (LF-forward; oMFB256) and GACAGCAGGGAATTGTGAGCAGATG (LF-rev; oMFB313). PCR products were TA-cloned into pCR2.1 (Invitrogen) and directly sequenced from at least three individual clones. Gene sequences have been deposited in GenBank (KT006751-KT006756).
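When checking primers such as these against a reference sequence, it is often useful to compute the reverse complement and GC content. The following is a minimal sketch using only the Python standard library. The helper names are illustrative assumptions, and whitespace is stripped from each sequence in case it was wrapped across lines.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace and normalize to upper case."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as listed in the text.
LF_FORWARD = "GTGGCAGAGCCTTCGTTTGCC"      # oMFB256
LF_REVERSE = "GACAGCAGGGAATTGTGAGCAGATG"  # oMFB313

# The reverse primer anneals to the opposite strand, so its reverse
# complement is what appears in the coding-strand sequence.
lf_reverse_on_coding_strand = reverse_complement(LF_REVERSE)
```

Searching a candidate ortholog for `LF_FORWARD` and `lf_reverse_on_coding_strand` gives a quick check that an amplicon spans the expected region.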

Phylogenetic analyses and structural observations
DNA multiple sequence alignments were performed using MUSCLE and indels were manually trimmed based on amino-acid comparisons. A generally accepted primate species phylogeny [55] (Fig 1A) was used for evolutionary analysis. A gene tree generated from the alignment of lactoferrin corresponded to this species phylogeny (PhyML; http://atgc.lirmm.fr/phyml/). Maximum-likelihood analysis of the lactoferrin and transferrin data sets was performed with codeml of the PAML software package [21]. A free-ratio model allowing dN/dS (omega) variation along branches of the phylogeny was employed to calculate dN/dS values between lineages. Two-ratio tests were performed using likelihood models that compared null models, in which all branches were fixed at dN/dS = 1 or assigned an average dN/dS value from the whole tree, against models in which dN/dS values varied according to branch.
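Comparisons between nested likelihood models of this kind are conventionally evaluated with a likelihood ratio test, in which twice the difference in log-likelihood is compared to a chi-square distribution. The sketch below illustrates that calculation for one or two degrees of freedom using closed-form expressions. The function names are assumptions and the log-likelihood values in the example are made up; they are not taken from the codeml output of this study.

```python
from math import erfc, exp, sqrt


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return erfc(sqrt(x / 2.0))
    if df == 2:
        return exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def lrt_pvalue(lnl_null, lnl_alt, df):
    """P-value for a nested model comparison with `df` extra free parameters."""
    return chi2_sf(lrt_statistic(lnl_null, lnl_alt), df)


# Hypothetical log-likelihoods for a null and an alternative model.
example_p = lrt_pvalue(lnl_null=-100.0, lnl_alt=-97.0, df=2)
```

The degrees of freedom equal the number of additional free parameters in the alternative model, so a test that frees a single branch-specific dN/dS value would use `df=1`.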

Human population genetics analysis
For variant-based analyses we used genotype calls from the 1000 Genomes Project (release: 20130502, shapeit2 phased). Weir and Cockerham's FST estimator [29] was used for the population comparisons, implemented in GPAT++. EHH and the bifurcation diagrams were calculated using the R package rehh [56]. Genome-wide iHS scans were performed using GPAT++ and XP-EHH plots were generated from previously published datasets [57,58].
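For intuition about how allele frequency differences translate into population differentiation, the sketch below computes a simplified two-population FST from allele frequencies as (HT - HS) / HT with equal population weights. This is not the Weir and Cockerham estimator used in the study, which additionally corrects for sample size. The function names are assumptions, and the example frequencies are round illustrative values based on the approximate K47 frequencies described in the text.

```python
def heterozygosity(p):
    """Expected heterozygosity for a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def simple_fst(p1, p2):
    """Two-population FST from allele frequencies, equal population weights."""
    p_bar = (p1 + p2) / 2.0
    h_total = heterozygosity(p_bar)
    if h_total == 0.0:
        return 0.0  # site is monomorphic across both populations
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


# Illustrative frequencies only: about 1% in one population, 65% in another.
example_fst = simple_fst(0.01, 0.65)
```

Identical frequencies give a value of 0 and fixed differences give a value of 1, so a contrast as large as the one used here yields a value toward the upper part of that range.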