Genetic Susceptibility to Non-Necrotizing Erysipelas/Cellulitis

Background: Bacterial non-necrotizing erysipelas and cellulitis are often recurring, diffusely spreading infections of the skin and subcutaneous tissues caused most commonly by streptococci. Host genetic factors influence infection susceptibility, but no extensive studies on the genetic determinants of human erysipelas exist.
Methods: We performed genome-wide linkage analysis with the 10,000-variant Human Mapping Array (HMA10K) on 52 Finnish families with multiple erysipelas cases, followed by microsatellite fine mapping of suggestive linkage peaks. A scan with the HMA250K array was subsequently performed with a subset of cases and controls.
Results: Significant linkage was found at 9q34 (nonparametric multipoint linkage score (NPLall) 3.84, p = 0.026), which is syntenic to a quantitative trait locus for susceptibility to group A streptococcal infections on chromosome 2 in mouse. Sequencing of candidate genes in the 9q34 region did not conclusively associate any of them with erysipelas/cellulitis susceptibility. Suggestive linkage (NPLall>3.0) was found at three loci: 3q22-24, 21q22, and 22q13. A subsequent denser genome scan with the HMA250K array supported the 3q22 locus, in which several SNPs in the promoter of AGTR1 (Angiotensin II receptor type I) were suggestively associated with erysipelas/cellulitis susceptibility.
Conclusions: Specific host genetic factors may cause erysipelas/cellulitis susceptibility in humans.


Introduction
Bacterial non-necrotizing erysipelas and cellulitis are often recurring, diffuse, and spreading infections of the skin and subcutaneous tissues, which manifest with local erythema, pain, and warmth, usually accompanied by fever, leukocytosis, lymphangitis, and lymphadenitis [1]. Group A (Streptococcus pyogenes) and group G (typically Streptococcus dysgalactiae subsp. equisimilis) β-hemolytic streptococci are the predominant causative agents of cellulitis/erysipelas, but infections may also be caused by group B and C streptococci and Staphylococcus aureus [1,2]. Risk factors for erysipelas/cellulitis include impaired lymphatic drainage, venous insufficiency, skin eruptions and trauma, and obesity [3][4][5]. Erysipelas/cellulitis causes significant morbidity, and recurrence is common, especially with tibial involvement, history of malignancy, dermatitis, or prior surgery of the affected limb [3][4][5][6]. The course of a clinical infection is the outcome of the host genome, the pathogens' virulence, and the environment. Interindividual variation between hosts can cause infections to range from asymptomatic to fatal; for example, the same group A streptococcal (GAS) strain can be carried asymptomatically, cause uncomplicated pharyngitis, or cause potentially fatal bacteremia such as streptococcal toxic shock syndrome or necrotizing fasciitis [7]. Twin, sibling, and adoption studies have recognized genetic factors as important determinants of susceptibility to infectious diseases, but recurrent infections and clustering in families still involve unknown genetic and immunological factors [8].
Mendelian and polygenic host genetic factors are known to influence susceptibility to infection by bacteria, parasites, and viruses including Mycobacterium leprae, Mycobacterium tuberculosis, Streptococcus pneumoniae, Neisseria meningitidis, Schistosoma mansoni, Leishmania donovani, Epstein-Barr virus, and human papilloma virus [9,10]. Conversely, resistance is known for Plasmodium vivax, human immunodeficiency virus-1, meningococcal disease, and norovirus [9,10]. Genome-wide scans have identified susceptibility loci for leishmaniasis (22q12, 2q23-31), leprosy (10p13, 6q25), and tuberculosis (15q, Xq) [11][12][13][14]. Host genetics is a significant factor in determining susceptibility to severe GAS sepsis in mouse and severe invasive GAS infection in humans [7,15]. Mouse susceptibility loci for GAS infection have been mapped to chromosome 17, including the mouse MHC region (syntenic to human 6p21); chromosome 7, which is also linked with susceptibility to Streptococcus pneumoniae infection in mice (syntenic to human 19q13.1-13.3); and chromosome 2, including genes of the interleukin 1 alpha and prostaglandin E synthetase pathways (syntenic to human 2q14 and 9q33-34) [16][17][18]. Specific Human Leukocyte Antigen class II (HLAII) haplotypes protect from severe systemic disease caused by GAS, whereas other haplotypes increase the risk of severe disease [19,20]. HLAII molecules are receptors for microbial superantigens and their allelic variations can regulate cytokine responses. The intensity of an individual's inflammatory cytokine response correlates directly with the severity of infection: a higher cytokine response leads more often to severe systemic disease than lower cytokine levels [19]. The HLA/MHC region in humans has also been associated with susceptibility to many other infectious diseases, e.g., HIV/AIDS, hepatitis, leprosy, tuberculosis, malaria, leishmaniasis, and schistosomiasis [21].
We have used erysipelas/cellulitis (hereafter referred to as erysipelas) as a marker infection to identify families in which two or more members have suffered from erysipelas, suggesting a possibly increased susceptibility to streptococcal infections. To identify putative susceptibility loci, we performed a whole-genome genetic linkage scan and identified suggestive loci on chromosomes 9q34, 3q22-24, 21q22, and 22q13.

Ethics Statement
This study was approved by the Ethical Review Board of Pirkanmaa Hospital District, Tampere, Finland. Written informed consent was obtained from all study participants. All clinical investigations have been conducted according to the principles expressed in the Declaration of Helsinki.

Patients and Families
We recruited individuals with recurrent erysipelas infections for which preventive monthly intramuscular benzathine penicillin injections are reimbursed in Finland. We contacted all 960 individuals reimbursed for benzathine penicillin through the National Health Insurance Institution in the year 2000. Of these, 50% (483) gave consent to participate and 25% had a first-degree relative with a history of erysipelas. We then collected blood samples from 204 recurrent erysipelas patients and 124 relatives from 52 pedigrees with two or more family members suffering from erysipelas. The diagnosis of erysipelas was verified from hospital records for all patients except for six who self-reported to have had erysipelas but no hospital records were available for verification.
An acute erysipelas cohort of 90 patients and 90 population controls matched for age and sex was also recruited. An infectious disease specialist recruited the patients from Tampere University Hospital and Hatanpää City Hospital, Tampere, Finland, when they were hospitalized for erysipelas. The cohort is described in detail elsewhere [5].

Genomic Screen for Non-parametric Linkage
Samples from twenty affected individuals from the six most representative families (Figure 1) were genotyped using the Affymetrix GeneChip Human Mapping 10K Array v 1.0 (Affymetrix, Santa Clara, CA, USA). A total of 11,145 autosomal single nucleotide polymorphisms (SNPs) were used for analysis, with 82-962 SNPs per chromosome. The median physical distance between SNPs was 210 kb (genetic distance ranging from 0.24 to 1.12 cM), and the average heterozygosity was 0.37. Physical coordinates were mapped against the GRCh37.2 human genome assembly and the deCODE genetic map was used for genetic locations [22]. MERLIN (Multipoint Engine for Rapid Likelihood Inference) software [23] was used for multipoint nonparametric linkage (NPL) analysis. Allele frequencies were estimated from data on the 20 affected individuals, and SNPs with unlikely genotypes were removed prior to analysis. The genome-wide significance of NPLall scores was estimated by simulating data 100 times with MERLIN and extracting the highest NPLall score from each simulation. The minimum NPLall score for suggestive linkage was 2.1 (occurring once at random in a genome scan) and the threshold for significant linkage was 4.77 (occurring with a 5% probability in a genome scan). Non-parametric linkage analysis was repeated using Caucasian allele frequency estimates obtained from Affymetrix.
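The simulation-based significance estimate described above can be illustrated with a short sketch. It assumes only that the highest NPLall score from each simulated genome scan has already been extracted into a list; the function names and the simple counting estimator are illustrative assumptions, not part of MERLIN or of the analysis pipeline used in this study.

```python
import math


def empirical_threshold(max_scores, alpha=0.05):
    """Score exceeded by at most a fraction `alpha` of the simulated maxima.

    `max_scores` holds the highest NPLall score from each simulated scan.
    """
    ordered = sorted(max_scores)
    n = len(ordered)
    # Position of the (1 - alpha) empirical quantile in the sorted maxima.
    k = math.ceil((1 - alpha) * n) - 1
    return ordered[min(max(k, 0), n - 1)]


def empirical_p_value(observed, max_scores):
    """Fraction of simulated scans whose maximum reaches the observed score."""
    exceed = sum(1 for score in max_scores if score >= observed)
    return exceed / len(max_scores)
```

With only 100 replicates the 5% threshold rests on about five simulated scans, so it is coarse; a larger number of replicates gives a more stable estimate.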

Verification of Linkage Peaks
The NPL results were verified with 31 microsatellites surrounding the suggested linkage peaks at 3q22-24, 9q34, 21q22, and 22q13 at approximately 2 cM (0.85-2.2 Mb) intervals (Table 1). We genotyped 91 individuals (54 affected, 31 non-affected, and six who were defined as unconfirmed because their erysipelas diagnosis could not be verified) from 19 families using PCR followed by capillary electrophoresis. Further fine mapping of the 9q34 linkage peak (131527468-135831155 bp) was performed with 22 microsatellites from 130457260 to 136035489 bp (Table 2). PCR assays were performed in 5 µl volumes containing 20 ng of DNA with standard reagent concentrations and temperature profiles. Fluorescently labeled PCR products were run on an ABI 377 sequencer. Allele calling was performed using Genotyper 2.0 (Applied Biosystems).
MERLIN was used for multipoint NPL analysis as described above. Allele frequencies were estimated from all individuals, non-affected individuals were assigned affection status unknown, and Mendelian-inconsistent genotypes were removed prior to analysis. Linkage analysis was done using two configurations: in configuration 0, the unconfirmed affected individuals were analyzed as unknown, and in configuration 2, they were analyzed as affected. The genome-wide significance of NPLall scores for configuration 0 was estimated by simulating data 1000 times with MERLIN and extracting the highest NPLall score from each simulation. The minimum NPLall score for significant linkage was 2.49 (p = 0.05). For fine mapping of 9q34, linkage analysis was done using four configurations: configurations 0 and 2 were defined as above, and configurations 0_186 and 2_186 were identical to configurations 0 and 2, respectively, except that allele 186 was called for marker D9S65.

Follow-up Genomic Screening with Higher-density Array
We selected 15 affected patients and 15 unaffected controls for additional genomic screening with the Affymetrix GeneChip Human Mapping 250KSty Array to search for possible allele or haplotype associations assuming a strong genetic effect. Twelve patients were from families 1, 2, 4, 5, 8, 9, 12, 14, 22, 32, 37, and 38, and their genetically independent family members served as controls. Three patients and three controls were from the acute cohort. Genotypes were called with BRLMM using Affymetrix default parameters. Analysis focused on the defined linkage peaks: 3q22-24 (D3S1306-D3S1299), 9q34 (D9S290-D9S1863), 21q22 (D21S1898-D21S1920), and 22q13 (D22S1159-D22S1141) (Table 1). To evaluate potential differences in haplotype frequencies between cases and controls, shared heterozygosity among cases was checked, allelic association was analyzed using Haploview, and haplotype association was analyzed with both Haploview and Haplotype Pattern Mining [24,25].

Candidate Gene Analyses
Altogether, five candidate genes in the 9q34 linkage peak were chosen based on their biological relevance in immunity or infections (Table 2). After PCR with flanking intronic primers, all exons were sequenced in index individuals from the six families (11, 12, 13, 28, 40, and 46) showing the most significant linkage (Table S1). Similarly, exons and exon-intron boundaries of AGTR1 (Angiotensin II receptor, type 1) were sequenced in six probands from the families (7, 13, 37, 40, 43, and 46) showing linkage to the 3q22 area. All PCR reactions were performed in 5 µl volumes containing 20 ng of DNA, with standard reagent concentrations and temperature profiles. Sequencing was performed using dye-terminator chemistry and automated sequencers (ABI, Columbia, Maryland, United States). Primer sequences are available on request.

Initial Non-parametric Genome-wide Linkage Results
We found seven suggestive linkage peaks on chromosomes 9q34, 3q22-24, 21q22, 22q13, 3p24, 10q25, and 11q24, in descending order of genomic significance, on the Affymetrix HMA10K Array (Table 3, Figure 2). The strongest linkage was on chromosome 9q34 (rs578802-rs708616), with an NPLall score of 3.84 and a suggestive genome-wide p-value of 0.24. Results were identical when the analysis was repeated with Caucasian allele frequency estimates from Affymetrix, except that the chromosome 3 peak marker moved from 3p24 (rs1994987) to 3p22 (rs2167176), with an NPLall score of 2.64 and a genome-wide p-value of 0.94. In general, different families contributed most strongly to the most significant peaks: 9q34 (Family 12, NPLall 3.2; Family 13, NPLall 4.8); 3q22-q24 (Family 13, NPLall .5; Family 37, NPLall 2.0); 21q22 (Family 12, NPLall 3.7; Family 21, NPLall 2.5); and 22q13 (Family 12, NPLall 2.0; Family 31, NPLall 1.7).

Verification of Linkage Peaks
Genotyping 91 affected and non-affected individuals from 19 families with 31 microsatellite markers surrounding the four most suggestive linkage peaks on 3q22-24, 9q34, 21q22, and 22q13 revealed one significant linkage peak for 9q34, with the highest NPLall of 2.77 and p = 0.026 for D9S159 (the minimum NPLall score for significant linkage was 2.49) (Table 1). Only suggestive linkage was seen for 3q22-24, 21q22, and 22q13 (Table 1). The highest NPLall score was at 9q34 in both configurations. Families 12 and 13 again contributed most strongly to the linkage peak on chromosome 9q34, with the highest NPLall scores of 3.64 at D9S179 and 5.02 at D9S313, respectively, and four other families showed suggestive linkage (NPLall scores of 1.34-1.41) (Table S1).
Table note: The most significant locus is highlighted in bold. Physical coordinates were mapped against the GRCh37.2 human genome assembly. The deCODE genetic map was used for genetic locations [22], and for markers absent from the deCODE map, genetic coordinates were estimated by linear interpolation using the markers' physical coordinates. cM = centiMorgan.
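The linear interpolation of genetic coordinates mentioned for markers absent from the deCODE map can be sketched as follows. This is a minimal illustration that assumes the reference map is available as a list of (physical position in bp, genetic position in cM) pairs sorted by physical position; the function name and the coordinates in the usage example are hypothetical and are not taken from the deCODE map.

```python
from bisect import bisect_left


def interpolate_cm(position_bp, reference):
    """Estimate a genetic position (cM) from a physical position (bp).

    `reference` is a list of (bp, cM) tuples sorted by bp, standing in for
    the flanking markers of a genetic map (hypothetical input format).
    """
    bps = [bp for bp, _ in reference]
    i = bisect_left(bps, position_bp)
    # Exact hit on a reference marker: use its mapped position directly.
    if i < len(bps) and bps[i] == position_bp:
        return reference[i][1]
    # No flanking marker on one side: refuse to extrapolate.
    if i == 0 or i == len(bps):
        raise ValueError("position lies outside the reference map")
    (bp0, cm0), (bp1, cm1) = reference[i - 1], reference[i]
    fraction = (position_bp - bp0) / (bp1 - bp0)
    return cm0 + fraction * (cm1 - cm0)


if __name__ == "__main__":
    # Hypothetical flanking markers, for illustration only.
    example_map = [(1000, 10.0), (2000, 12.0), (4000, 13.0)]
    print(interpolate_cm(1500, example_map))
```

The sketch raises an error rather than extrapolating beyond the outermost reference markers, since recombination rates near chromosome ends can differ markedly from those of the nearest mapped interval.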

Altogether, 59 annotated protein-coding genes are located within the chromosome 9q34 linkage peak (D9S290 to D9S1199) (Table 2). The five functionally most interesting genes were sequenced in the index individuals from the six families showing the most significant linkage to 9q34 (Table S1, Table 2). PRRX2 (Paired related homeobox 2) is expressed in proliferating fetal fibroblasts and the developing dermal layer, with lower expression in adult skin. An increase in expression of this gene during fetal but not adult wound healing suggests a role in controlling mammalian dermal regeneration and prevention of scar formation [26]. The LAMC3 (Laminin, gamma-3) gene belongs to the family of laminins, which are extracellular matrix glycoproteins and the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes, including intracellular invasion by several bacterial pathogens such as GAS strains [27]. FIBCD1 (Fibrinogen C domain-containing 1) is a transmembrane endocytic receptor that binds acetylated structures via a highly conserved fibrinogen-related domain (FReD). Ficolins also have FReDs and they play an important role in innate immunity [28]. FIBCD1 binds chitin and has been suggested to control the exposure of the intestine to chitin and its fragments, which is important in the immune defense against parasites and fungi and in the modulation of the immune response [29]. In addition, fibrinogen is a plasma protein that streptococci adhere to in order to avoid host defense. ABL1 (c-abl oncogene 1, nonreceptor tyrosine kinase) is a proto-oncogene that encodes a cytoplasmic and nuclear protein tyrosine kinase implicated in the processes of cell differentiation, cell division, cell adhesion, and stress response. ABL tyrosine kinases are related to the cell penetration of Shigellae, and their signaling is required for T-cell development and mature T-cell function [30,31].
Sequencing revealed no specific genetic variations that would implicate any of these genes in erysipelas susceptibility.
PTGES (prostaglandin E synthase) is induced by the proinflammatory cytokine interleukin 1 beta (IL1B) and synthesizes prostaglandin E2 (PGE2), a key regulator of inflammation that modulates the regulation and activity of T cells and the development and activity of B cells, and enhances the production of cytokines and antibodies [32]. PGE2 also modulates the severity of infection caused by GAS [33]. Upon contact with GAS, skin keratinocytes exert a strong proinflammatory response, resulting in the increased expression of several cytokines and the rapid release of PGE2 [34]. PTGES is associated with inflammatory diseases, fever, and pain associated with inflammation, and the deletion of Ptges leads to an impaired febrile response in mice [35]. We sequenced the introns and 10 kb upstream of the transcription start site of PTGES as well as the coding region, but found no specific variants, mutations, or indels implicating it directly in erysipelas susceptibility.

Follow-up Genotyping with Higher-density Array
We screened 15 affected patients and 15 unaffected control individuals with the Affymetrix GeneChip Human Mapping 250KSty Array and focused analysis on the previously identified regions on chromosomes 3q22 (D3S1306 to D3S1299), 9q34 (D9S290 to D9S1863), 21q22 (D21S1898 to D21S1920), and 22q13 (D22S1159 to D22S1141). The 3q22 locus was the most significant, with several SNPs in the promoter region of the Angiotensin II receptor type 1 gene (AGTR1), between SNPs rs9862062 (148359724 bp) and rs4681157 (148412408 bp), showing nominal association (Table 4).
AGTR1 exons and exon-intron boundaries were sequenced in six probands from the families showing the strongest linkage to the 3q22 region. Twelve known SNPs were identified, including rs5186 (also known as 1166 A/C) in the 3′UTR. The A allele of rs5186 has been associated with increased serum levels of high-sensitivity C-reactive protein and inflammation, and the CC genotype is putatively correlated with hypertension [36,37] (Table S2). Out of six probands, five were homozygous AA, one was heterozygous AC, and none had the CC genotype, thus supporting a potential role in inflammation for the A allele. However, no statistically significant difference in allele frequencies was detected for rs5186 between cases and controls in the acute cohort. No other variants that might explain linkage to this region were found in AGTR1. We chose two AGTR1 promoter area SNPs (rs9862062 and rs718424) that showed association with erysipelas in Haploview analysis, and genotyped them in the family material and in the acute erysipelas cohort by direct sequencing. The reference G allele of rs9862062 was suggestively associated with erysipelas in the combined family (probands and marry-ins) and acute erysipelas cohort (Fisher's exact test, two-tailed p-value 0.006), and the reference T allele of rs718424 showed suggestive association with a p-value of 0.017.
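For readers unfamiliar with the two-tailed Fisher's exact test used for the allelic comparisons above, the following sketch shows how such a p-value can be computed from a 2x2 table of allele counts (rows: cases and controls; columns: the two alleles). It is a generic illustration using the common convention of summing the probabilities of all tables that are no more likely than the observed one; the function name is an assumption, and the counts in the usage example are hypothetical and are not the allele counts of this study.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    # Sum over all tables that are as likely or less likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical allele counts, for illustration only.
    print(fisher_exact_two_tailed(3, 1, 1, 3))
```

An exact test is the natural choice here because the case and control groups are small, so the large-sample approximation behind a chi-square test may not hold.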

Discussion
Individual response to potentially fatal pathogens is modulated by both environmental and host genetic factors [8,9]. Streptococcal infections can vary from localized pharyngitis or erysipelas to potentially fatal necrotizing fasciitis and sepsis. We have used erysipelas/cellulitis, a localized infection of the skin and underlying subcutaneous tissues, to identify 52 families with a possibly increased susceptibility to streptococcal infections. This is, to our knowledge, the largest systematically collected clinical material on familial segregation of recurrent erysipelas. We performed a linkage scan on the six most informative families segregating erysipelas and found evidence for suggestive linkage in seven chromosomal regions, with a maximum NPL score of 3.84 at 9q34. Further fine mapping of the four most significant regions in all of the collected families revealed significant linkage to the chromosome 9q34 region, which is syntenic to mouse chromosome 2 (22 to 34 Mb), where a quantitative trait locus (QTL) for GAS susceptibility in mice has been identified [18]. In mouse, 37 candidate genes involved in immune response, cell signalling, cellular assembly and organization, and lipid metabolism were studied for quantitative expression levels pre- and post-infection in strains resistant and susceptible to severe GAS infection. Genes associated with early immune response that were upregulated in susceptible strains and downregulated in resistant strains included Il1a, Il1rn (both located on 2q14 in humans), Ptges (located in the 9q34 linkage peak identified here), and Ptges2 (located proximal to the 9q34 linkage peak) (Table 2). Increased production of prostaglandins has also been associated with Gram-positive infections including Streptococcus suis, group B streptococcal, and GAS skin infections [34][38][39][40].
However, sequencing of PTGES and four other chosen candidate genes in the 9q34 linkage region did not reveal significant genetic variations implicating any of these genes in erysipelas susceptibility. It is possible that quantitative expression level analysis of candidate genes could have revealed variation associated with erysipelas [18], but expression analysis for the candidate genes was not performed in this study. The genes for sequencing were chosen based on their known function, and thus we could have missed genes with as yet unknown roles in immunity and infection. An inherent property of genetic linkage is the relatively broad genomic area that is implicated. In this case, the 9q34 region contained 59 genes, which were impractical to sequence in full in this study. Our rationale for choosing target genes was therefore necessarily based on known functional information and biological plausibility, and we acknowledge that this approach has its limitations. More candidate genes will need to be considered as data accumulate. Susceptibility to infection is a complex trait in which multiple genes in an immunological pathway or multiple intertwining pathways play a role in disease outcome [18].
Higher-density analysis with the Affymetrix HMA250K Array revealed nominal association of several SNPs in the promoter region of AGTR1 on 3q22 with erysipelas. AGTR1 is a G-protein-coupled receptor that mediates the major cardiovascular effects of angiotensin II, a potent vasopressor hormone involved in the development of hypertension, atherosclerosis, and insulin resistance. Angiotensin II is the end product of the renin-angiotensin system (RAS), where renin stimulates the production of angiotensin I from angiotensinogen, which is then converted to angiotensin II by angiotensin converting enzyme (ACE). The activation of the RAS correlates with organ injury and mortality in clinical sepsis, possibly by contributing to the enhanced microvascular tone [41]. Angiotensin II also exerts proinflammatory effects on leukocytes, endothelial cells, and vascular smooth muscle cells, and by acting through AGTR1, it increases the expression of cytokines, chemokines, growth factors, and adhesion molecules [42].
Polymorphisms in both ACE and other angiotensinogen genes have been associated with susceptibility to inflammatory diseases such as SLE and psoriasis with frequent tonsillitis [43,44]. The Angiotensin II pathway also plays a potential role in septic shock [45]. Among selected SNPs in ACE, AGTR1, and Angiotensin receptor associated protein (AGTRAP), the GG genotype of AGTRAP rs11121816 showed significant association with increased 28-day mortality in septic shock. AGTRAP interacts specifically with the C-terminal tail of AGTR1 and negatively regulates the receptor, leading to a functional desensitization to angiotensin II. Pharmacological blockade of the RAS is used to treat hypertension, diabetic nephropathy, and congestive heart failure, but angiotensin II receptor blockers (ARBs) and ACE inhibitors also suppress proinflammatory cytokines and reduce oxidative stress. Prior usage of ARBs has been shown to reduce mortality in patients hospitalized for sepsis [46]. Polymorphisms in AGTR1, and especially the C allele of rs5186 (+1166A>C), have been associated with hypertension, and the A allele of rs5186 has been associated with higher serum levels of high-sensitivity C-reactive protein (CRP) and inflammation [36,37]. Out of our six probands, five were homozygous AA, one was heterozygous AC, and none had the CC genotype. In the presence of AA or AC genotypes, microRNA-155 (miR-155) represses expression of the AGTR1 protein [47]. MiR-155-mediated translational repression can be regulated by, e.g., TGFB1, and miR-155 expression is significantly increased with the AA or AC genotypes as compared to the CC genotype. MiR-155 is critically involved in the control of specific differentiation processes in the immune response. It functions specifically in regulating T helper cell differentiation and the germinal center reaction to produce an optimal T cell-dependent antibody response, mediated at least partly by regulating cytokine production [48].
Furthermore, the loss of miR-155 leads to an overall attenuation of immune responses in mouse [49]. High CRP levels and leukocyte counts (i.e., a more severe inflammatory response) in erysipelas are associated with recurrence of erysipelas [5]. Our finding of predominance of the A allele in our six probands is consistent with these earlier observations. Interestingly, AGTR1 and PTGES are involved in the same pathway, as AGTR1 induces the production of COX, which converts arachidonic acid into Prostaglandin H2, which in turn is converted by PTGES into Prostaglandin E2.
We found evidence for host genetic factors influencing susceptibility to bacterial non-necrotizing erysipelas/cellulitis, but did not find a common susceptibility factor in all families. We did not find linkage or association with the HLA region previously linked with GAS infection severity in humans [19,20]. It is likely that, as the inflammatory pathways are very complex and the defense against infections is under strong selection, different families have individual genetic susceptibilities.
Table 4 note: The haplotype that was significantly associated with erysipelas in Haploview is marked with bold letters in the "Associated allele" column. Significant p-values in Haploview or Haplotype Pattern Mining (HPM) for individual SNPs are also highlighted in bold. SNPs that belong to the associated haplotype, have a significant p-value in both Haploview and HPM, and showed shared heterozygosity among cases are marked with an asterisk. doi:10.1371/journal.pone.0056225.t004
Genetic heterogeneity makes it difficult to find significant correlations, which is a common pitfall of studies on host genetic factors predisposing to infections. Much larger patient and control groups will be needed to verify these preliminary results. However, our linkage peak and the region of strongest association coincide with genes and pathways suggested to play important roles in susceptibility to streptococcal infections. The identification of the susceptibility genes would help to better understand the course of infections and ultimately reduce morbidity.
Figure S1. NPL plots for the fine mapping of the chromosome 9q34 linkage peak with 22 microsatellite markers. The NPL plots for the four configurations were essentially identical. MERLIN was used for multipoint NPL analyses using four configurations. (A) In configuration 0, unconfirmed affected individuals were analyzed as unknown, and (B) in configuration 2, they were analyzed as affected. In configurations (C) 0_186 and (D) 2_186, analysis was identical to configurations 0 and 2, respectively, except that allele 186 was called for marker D9S65.