Kawasaki disease (KD) is the most common acquired pediatric heart disease. We analyzed Whole Genome Sequences (WGS) from a 6-member African American family in which KD affected two of four children. We sought rare, potentially causative genotypes by sequentially applying the following WGS filters: sequence quality scores, inheritance model (recessive homozygous and compound heterozygous), predicted deleteriousness, allele frequency, genes in KD-associated pathways or with significant associations in published KD genome-wide association studies (GWAS), and with differential expression in KD blood transcriptomes. Biologically plausible genotypes were identified in twelve variants in six genes in the two affected children. The affected siblings were compound heterozygous for the rare variants p.Leu194Pro and p.Arg247Lys in Toll-like receptor 6 (TLR6), which affect TLR6 signaling. The affected children were also homozygous for three common, linked (r² = 1) intronic single nucleotide variants (SNVs) in TLR6 (rs56245262, rs56083757 and rs7669329) that have previously shown association with KD in cohorts of European descent. Using transcriptome data from pre-treatment whole blood of KD subjects (n = 146), expression quantitative trait loci (eQTL) analyses were performed. Subjects homozygous for the intronic risk allele (A allele of TLR6 rs56245262) had differential expression of Interleukin-6 (IL-6) as a function of genotype (p = 0.0007) and a higher erythrocyte sedimentation rate at diagnosis. TLR6 plays an important role in pathogen-associated molecular pattern recognition, and sequence variations may affect binding affinities that in turn influence KD susceptibility. This integrative genomic approach illustrates how the analysis of WGS in multiplex families with a complex genetic disease allows examination of both the common disease–common variant and common disease–rare variant hypotheses.
Citation: Kim J, Shimizu C, Kingsmore SF, Veeraraghavan N, Levy E, Ribeiro dos Santos AM, et al. (2017) Whole genome sequencing of an African American family highlights toll like receptor 6 variants in Kawasaki disease susceptibility. PLoS ONE 12(2): e0170977. https://doi.org/10.1371/journal.pone.0170977
Editor: Ludmila Prokunina-Olsson, National Cancer Institute, UNITED STATES
Received: November 29, 2016; Accepted: January 13, 2017; Published: February 2, 2017
Copyright: © 2017 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The IRB at UCSD has reviewed the consent forms signed by the family for the collection and analysis of their DNA. It has been determined that the consents cover the sharing of data from the DNA analysis for the purpose of future research by qualified investigators. However, the consent form does not cover the depositing of genetic data in a national database. Data access may be granted by contacting the Principal Investigator, Dr. Jane C. Burns by email at email@example.com.
Funding: National Institute of Health grant U54HL108460 to LOM; Gordon and Marilyn Macklin Foundation's private grant to JCB and AHT; Rady Children’s Institute for Genomic Medicine, San Diego. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Although Dr. Flatley was CEO of Illumina at the time that this research was initiated, Illumina did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support for the whole genome sequencing. Dr. Flatley’s role was as a contributor to the interpretation of data and review of the manuscript.
Competing interests: Whole genome sequencing was donated by Jay Flatley who was the CEO of Illumina at that time. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Although the ability to generate whole genome sequences (WGS) from individual subjects has existed for over a decade, the use of such methods to discover novel disease-causing variants in multiplex families affected by complex genetic disease has been limited. However, robust methods have been developed to identify rare, disease-causative variants in WGS of families with monogenic diseases [2, 3]. Furthermore, such methods have proven useful in identifying individual patients with common diseases that are caused by rare, highly penetrant variants. Individuals with rare Mendelian forms of common complex diseases are often distinguished by extreme phenotypes (such as very early onset, or disease refractory to usual treatments) and by multiple affected members in single families. Here we sought to test this hypothesis in a multiplex family with Kawasaki disease (KD).
Susceptibility to KD, the most common cause of acquired heart disease in children, is postulated to result from a complex set of genetic variants of which only a limited number have been validated to date. This self-limited illness of unknown etiology presents with the sudden onset of fever and mucocutaneous signs and is associated with coronary artery vasculitis. Inflammation in the arterial wall can compromise its structural integrity, leading to aneurysm formation in 25% of untreated children. The major sequelae of aneurysms include thrombosis, scarring with stenosis, myocardial ischemia, infarction, and death [8–11]. KD is over-represented among children of Asian descent. In Japan, the country of highest incidence (306/100,000 children <5 years; one in every 60 male and 75 female children affected, respectively), there are more than 14,000 new cases each year and rates continue to rise. In the United States, system dynamic models suggest that by 2030, one in every 1600 adults will have suffered from KD. Data from limited patient series suggest that African American children are disproportionately affected by KD [14–16]. Despite their apparent increased susceptibility, children of African American descent have been excluded from previous KD genetic analyses. As with other complex disorders, elucidation of the genetic determinants of KD has hitherto relied on candidate gene and genome-wide association studies (GWAS) using matched population controls and family linkage studies [17–27]. Thus far, many of the associated genetic variants have been located in introns with no associated molecular function identified. Only the C allele of SNV rs28493229 in the inositol 1,4,5-trisphosphate 3-kinase C (ITPKC) gene on chromosome 19q13.2 has been shown to affect gene transcription and impact intracellular calcium signaling and inflammasome activation [22, 28].
Here, we examined the common disease–rare variant hypothesis in KD in an African American family with two affected and two unaffected siblings and their unaffected, biologic parents. We report rare, likely pathogenic genotypes in biologically plausible genes that co-segregated with disease in whole genome sequences.
The genomes of all six family members were sequenced with paired, short reads to an aligned mean read depth of 33.2-fold. Unique nucleotide variants (8,018,553) were identified in the family, of which 7,592,729 were of high quality (Fig 1). We created three filter pipelines for further analysis. First, we applied the following filters: recessive homozygous only in the affected children, potentially deleterious (located in an exon, promoter region, splice site, or 3’UTR), and rare (allele frequency <1% or not available in 1,000 Genomes database). To the resulting 34 variants in 31 genes, we applied two additional filters: gene found in KD pathway (defined by Ingenuity Pathway Analysis) and differentially expressed (p<0.05) in our KD transcriptome database. This identified a CAG repeat variant (CAG10 homozygosity in the two affected siblings) in myocyte enhancer factor 2A (MEF2A). The reference allele for this variant is CAG11 and the variant was predicted to be deleterious. The deletion was confirmed by resequencing all six family members (S1 Table).
The diagram illustrates a discovery and validation workflow. Discovery analysis of WGS starts from called variants in an African American family of six members with two KD-affected children and is followed by knowledge-based filtering derived from published GWAS and disease pathogenesis, and by gene-level validation using differential transcript abundance. Confidence filter: a call quality of at least 20 and a read depth of at least 10. Differentially expressed: ≥ 1.2-fold change between acute and convalescent whole blood transcripts. Abbrv.: AF: allele frequency; NA: not available; GWAS: genome-wide association study; KD: Kawasaki disease; TLR6: Toll-like receptor 6; MEF2A: myocyte enhancer factor 2A; ARRDC4: arrestin domain containing 4; SLK: STE20 like kinase; TACSTD2: tumor-associated calcium signal transducer 2.
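As an illustration of the first two steps in Fig 1, the Confidence filter (call quality at least 20, read depth at least 10) and the recessive homozygous filter (homozygous in both affected children and in none of the four unaffected relatives) can be sketched as follows. This is a minimal sketch and not the Ingenuity Variant Analysis or in-house MySQL implementation used in the study; the sample labels, the `Call` record, and the decision to require the confidence thresholds in all six samples are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical sample labels for the six family members.
AFFECTED = ("affected1", "affected2")
UNAFFECTED = ("sib1", "sib2", "father", "mother")


@dataclass
class Call:
    genotype: str   # "0/0", "0/1" or "1/1" (alternate-allele count 0, 1 or 2)
    quality: float  # call quality
    depth: int      # read depth


def passes_confidence(call, min_quality=20, min_depth=10):
    """Confidence thresholds as stated in the Fig 1 legend."""
    return call.quality >= min_quality and call.depth >= min_depth


def is_recessive_homozygous(calls):
    """calls maps each sample label to its Call for one variant.

    Assumption: every sample must pass the confidence thresholds.
    """
    if not all(passes_confidence(c) for c in calls.values()):
        return False
    affected_hom = all(calls[s].genotype == "1/1" for s in AFFECTED)
    unaffected_hom = any(calls[s].genotype == "1/1" for s in UNAFFECTED)
    return affected_hom and not unaffected_hom
```

A variant for which both parents and the unaffected siblings are heterozygous or reference-homozygous while both affected children are homozygous would pass; a single low-depth call would reject the variant under the assumption above.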
An alternative set of filters was applied to the 7,592,729 high quality variants: compound heterozygous in two affected siblings only, allele frequency <3% in 1000 Genomes database, found in KD pathway, and differentially expressed in the KD transcriptome database. This resulted in only two rare variants (p.Leu194Pro and p.Arg247Lys) in a single gene (TLR6) that were compound heterozygous only in the affected siblings (Table 1). The MEF2A and TLR6 rare and potentially pathogenic variants that were found in KD pathways were grouped together in Tier 1 (Fig 1 and Table 1). Both TLR6 and MEF2A were differentially expressed in acute and convalescent whole blood RNA samples, with the highest expression levels in the acute phase of KD (Fig 2a and 2b).
P values are uncorrected. Transcript levels from microarray database as previously described, not corrected for cell number. Abbrv.: TLR6: Toll-like receptor 6; MEF2A: myocyte enhancer factor 2A; ARRDC4: arrestin domain containing 4; SLK: STE20 like kinase; TACSTD2: tumor-associated calcium signal transducer 2.
To explore whether common, deleterious variants were also preferentially associated with KD, we analyzed 302 recessive homozygous and deleterious variants in 180 genes. Of these, nine variants in four genes were significantly associated with KD in an imputed European descent KD GWAS dataset (nominal p < 0.05) and were differentially expressed in whole blood during acute versus convalescent KD [19, 29]. We used our European descent GWAS because no published association studies of African American KD subjects were available. These nine common variants were grouped in Tier 2 (Fig 1). Interestingly, TLR6 was one of the four Tier 2 genes, along with arrestin domain containing 4 (ARRDC4), STE20 like kinase (SLK), and tumor-associated calcium signal transducer 2 (TACSTD2). The Tier 2 TLR6 variants were located either in the promoter or 3’UTR and had population allele frequencies greater than 3% in the 1,000 Genomes database (risk allele frequency 4.2–63.3%) (Table 1).
eQTL analysis of common TLR6 variants in KD
Among the six genes in the two tiers, TLR6 had multiple variants in both tiers, suggesting that TLR6 may contribute both to common, sporadic KD and to uncommon, familial KD. To explore whether there were any additional TLR6 variants for which only the affected children were homozygous, we re-analyzed the original unfiltered TLR6 WGS from all six family members. We found 34 additional variants for which only the affected children were homozygous (2 synonymous exonic SNVs and 32 intronic SNVs) (S2 Table). All were common variants with allele frequencies of 0.32–0.75 in African datasets in the 1,000 Genomes database. Next, we looked for association of these 34 variants with KD susceptibility in the imputed European descent GWAS dataset. There were three intronic SNVs (rs56245262, rs56083757 and rs7669329) associated with KD susceptibility with p = 6.9 × 10⁻⁶ (S2 Table and Fig 3). These three intronic SNVs and the seven SNVs in TLR6 from Tiers 1 and 2 are shown in Table 2. The two SNVs in the 3’UTR (rs6822503 and rs12645200) were in linkage disequilibrium (LD; r² = 0.97 in Africans, r² = 1 in European descendants) in the imputed European descent GWAS dataset, as were the three intronic SNVs (rs56245262, rs56083757 and rs7669329), with r² = 1 in both African and European descent cohorts. We chose rs6822503 and rs56245262 (earlier chromosome position) as representative SNVs for additional analysis.
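For readers unfamiliar with the LD statistic quoted above, the squared correlation between two biallelic SNVs can be computed from phased haplotypes as r² = D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA pB. The following is a minimal sketch under the assumption that phased haplotypes are available and alleles are coded 0/1; the function name and input layout are hypothetical and do not reflect the software used to obtain the values reported here.

```python
def ld_r2(haplotypes):
    """Squared correlation (r^2) between two biallelic SNVs.

    haplotypes: list of (allele_at_snv1, allele_at_snv2) tuples,
    one tuple per phased chromosome, alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at SNV 1
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at SNV 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNV is monomorphic")
    return d * d / denom
```

Two SNVs whose alleles always travel together on the same haplotype give r² = 1, which is why a single representative SNV can stand in for a fully linked group in downstream analyses.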
Association results using the imputed GWAS database were plotted against chromosome location and gene structure of TLR6. Red dots show the SNVs for which only the affected children were homozygous recessive. SNVs above the blue line were associated with a P-value <0.05. TLR6 is encoded on the negative strand so the gene structure is shown 3’ to 5’.
To understand the possible effects of common variants in TLR6 in KD pathogenesis, we performed an eQTL analysis for the two representative SNVs (rs56245262 and rs6822503) and one SNV (rs6837101) in the promoter region of TLR6. Since TLR2/6 activates the transcription factors, NFKB and AP1, we focused the eQTL analysis using only the 415 genes targeted by these transcription factors. For the intronic variant (rs56245262), only one gene (IL6) among 415 NFKB and/or AP1 targets showed a significant difference in acute whole blood transcript levels as a function of genotype (nominal p< 0.001). IL6 transcript levels were lower in subjects homozygous for the risk allele (p = 0.0007 vs. non-risk allele homozygotes, and p = 0.007 vs. heterozygotes) (Fig 4). For variants in the 3’UTR (rs6822503), no gene showed significantly different transcript levels as a function of genotype with a p< 0.001. No genes were regulated as a function of the genotype of the promoter SNV (rs6837101).
TLR6 rs56245262 risk allele (A) is shown in red. Transcriptome data from (a) acute, pre-treatment samples (n = 146: T/T n = 54, T/A n = 69, A/A n = 23) and (b) paired acute, pre-treatment and convalescent samples (n = 131: T/T n = 53, T/A n = 58, A/A n = 20). P-values were calculated using the Mann-Whitney test for (a) and the Wilcoxon matched-pairs test for (b).
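The genotype-group comparison in panel (a) rests on the Mann-Whitney test, which compares transcript levels between two genotype groups without assuming a normal distribution. The sketch below shows the idea in plain Python; it uses the normal approximation without tie or continuity correction, which is an assumption made for brevity and may differ from the exact procedure of the statistics package used in the study. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test (normal approximation).

    x, y: expression values for two genotype groups.
    Returns (U statistic for x, approximate two-sided p-value).
    Assumption: no tie or continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # U counts the pairs in which the x value exceeds the y value (ties count 0.5).
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u, p
```

For an eQTL screen, the function would be called once per probe, for example with the values of risk-allele homozygotes as `x` and those of non-risk homozygotes as `y`.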
Analysis of patient characteristics as a function of genotype
Since the TLR6 intronic SNVs were associated with differential expression of IL6, we reasoned that patient clinical characteristics related to inflammation might also vary as a function of genotype. We used our published dataset of 161 subjects with detailed demographic and clinical information and 7,602,343 imputed genotypes (from the Illumina HumanOmni1-Quad® chip) to analyze differences in clinical parameters as a function of genotype for the intronic variant (rs56245262) (Table 3). Homozygosity for the risk allele was more common among self-declared Asians (10 of 24 (44%) A/A genotype). This result was consistent with the observation that the A allele frequency is higher in Asian populations (A allele frequencies in 1000 Genomes database: Asian 0.62, A/A 0.42, Hispanic 0.37, European descent 0.23). Of interest, the pre-treatment erythrocyte sedimentation rate (ESR) was higher in subjects homozygous for the risk allele (A/A) (median ESR 80 mm/h for A/A vs. 59 mm/h for T/T, p = 0.01).
Analysis of WGS of an African American family with two affected and two unaffected siblings and their unaffected, biologic parents highlighted genetic variation in TLR6 in KD susceptibility. These TLR6 variants included both compound heterozygosity for two rare, likely deleterious SNVs and homozygosity for common KD risk SNVs. Subsequently, using an acute KD whole blood transcriptome data set, eQTL analysis of the common SNVs suggested decreased transcript levels of IL6 and higher ESR at diagnosis in individuals homozygous for the risk allele. This integrative genomic approach illustrates how WGS in families with multiple members affected with a complex genetic disease can yield insights into both the common disease–common variant and common disease–rare variant hypotheses.
TLR6-NFKB signaling pathway
TLRs recognize pathogen-associated molecular patterns (PAMPs), leading to activation of the transcription factors NFKB and AP1 and to transcription of genes that control inflammation. TLR6 forms a heterodimer with TLR2 that recognizes peptidoglycan, diacyl lipoproteins, and zymosan derived from Gram-positive bacteria, mycoplasmas, and fungi, respectively. TLR6 is constitutively expressed in humans on myeloid dendritic cells, monocytes, B cells, coronary artery endothelial cells (EC), and coronary artery vascular smooth muscle cells [32, 33]. In contrast, TLR2 is widely expressed on immune cells but only expressed on human EC and vascular smooth muscle cells following induction by pro-inflammatory cytokines. Population differences in downstream cytokine production have been observed following TLR stimulation. In a study of African children, higher levels of TNFα were produced following in vitro stimulation with a specific TLR2/6 ligand in a whole blood assay when compared to children of European descent.
We found 41 TLR6 variants located in the promoter, intron 1, exon 2, and the 3’UTR associated with KD susceptibility. The three common intronic SNVs (rs56245262, rs56083757 and rs7669329) influenced the transcription of IL6 and were associated with KD susceptibility both in the European descent GWAS and the African American family. IL6, a key cytokine controlled by the transcription factors NFKB and AP1 among others, is reported to be high in the serum of acute KD patients. Our transcriptome data show that IL6 transcripts are low in risk allele carriers during the acute disease (Fig 4). The variants were located in a 1.5kb region in the middle of the single 28kb intron, 11kb from the splice donor site and 17kb from the splice acceptor site. Five transcripts of TLR6 have been predicted to result from alternative splicing (https://www.ncbi.nlm.nih.gov/nuccore?LinkName=gene_nuccore_refseqrna&from_uid=10333) and it is possible that the intronic SNVs could influence splicing efficiency of these splice variants (S1 Fig). The eQTL database, GTEx (http://www.gtexportal.org/home/), showed that there were tissue-specific effects of these intronic SNVs with the risk allele decreasing TLR6 transcripts in transformed fibroblasts but increasing TLR6 transcripts in a transformed B-cell line (S2 Fig). No data were available for other tissues of potential interest in KD including endothelial cells, vascular smooth muscle cells, and cardiomyocytes.
Two of the four TLR6 3’UTR SNVs (rs12645200 and rs6822503) were in LD. We failed to find any microRNAs that were predicted to bind these four loci according to the miRdSNV database (http://mirdSNV.ccr.buffalo.edu/index.php) and miRNASNV V2 (http://bioinfo.life.hust.edu.cn/miRNASNV2/index.php) [37, 38]. For one of the 3’UTR variants, rs6822503, the risk allele was in weak LD (r² = 0.58, D’ = 0.94) with the C allele of the exonic variant c.G1083C (rs3821985) in Africans, and the two affected siblings were homozygous for the C allele at this locus (S1 Table). Shey et al. stimulated whole blood from 70 healthy Africans with the diacylated lipopeptides, FSL-1 and PAM2 (TLR6/2 ligands), and found reduced IL6 levels in cells from subjects homozygous for the C allele of rs3821985 compared to G homozygotes. This finding suggests that our KD-affected siblings might produce lower levels of IL6 upon TLR2/6 stimulation.
The two affected children were compound heterozygotes for the two SNVs in TLR6 exon 2 (p.Leu194Pro and p.Arg247Lys), having inherited the former from the mother and the latter from the father. These SNVs change amino acids located on the extracellular surface of TLR6 in a region predicted to be involved with ligand binding, which could influence ligand-binding affinity (Fig 5). The 3D protein structure of the two non-synonymous mutations was modeled computationally (SNPs3D: http://www.snps3d.org/modules.php?name=SnpAnalysis&locus_ac=10333). p.Leu194Pro is predicted to be deleterious due to loss of an intramolecular hydrogen bond, whereas p.Arg247Lys is classified as non-deleterious but lies on the protein ectodomain. The functional impact of these TLR6 variants was tested using an NF-κB luciferase reporter assay in human embryonic kidney 293T cells expressing the TLR6 variants and stimulated with TLR agonists. p.Arg247Lys showed a 15.6% decrease in ability to respond to the TLR2/6 agonist, PAM2CSK4. Cells transfected with p.Leu194Pro constructs showed a more marked decrease in NF-κB activation (25.4%). Thus, compound heterozygosity for these SNVs in the affected children is expected to reduce NF-κB activation.
Positions of the exonic SNVs rs5743809 and rs3522046 on the TLR6 protein diagrammed using the NCBI Molecular Modeling Database (MMDB). The replaced amino acids (p.Leu194Pro and p.Arg247Lys) are shown in green. Variants are located in the extracellular domain in the predicted ligand-binding region and may alter hydrogen bonding.
Role of NFKB and IL6 in the modulation of immune responses
A self-limited inflammatory disease like KD must activate potent anti-inflammatory pathways that ultimately lead to the resolution of inflammation. TLR2/6 stimulation initiates inflammation but also stimulates the regulatory compartment of the immune response. In mice, stimulation of TLR2/TLR6 expressed on dendritic cells leads to their differentiation into tolerogenic dendritic cells secreting IL-10 and promotes T cell differentiation into a regulatory (Treg) phenotype . Work by Franco et al. has highlighted the importance of IL-10 secretion by natural regulatory T cells and tolerogenic myeloid dendritic cells in the resolution of inflammation in KD patients [42, 43]. Downstream of TLR2/6, both NFKB and IL6 have complex immune functions [44, 45]. In the carrageenan-induced rat pleurisy model, blocking NFKB led to protracted inflammation with reduced leukocyte apoptosis and decreased release of the anti-inflammatory molecule, TGFβ1, thus highlighting the important role of NFKB in the resolution of inflammation .
IL6 functions by binding the IL6 receptor (R) and gp130, a transmembrane signal transduction protein. The IL6R is expressed only on hepatocytes and a subset of inflammatory cells including macrophages, neutrophils, and naïve T cells. However, gp130 is ubiquitously expressed. During acute inflammation, neutrophils infiltrate tissues and undergo apoptosis with shedding of IL6-sIL6R. This complex binds gp130 on endothelial cells leading to signal transduction that results in monocyte/macrophage recruitment and removal of apoptotic neutrophils. Thus, reduced levels of IL6 might be expected to allow persistence of neutrophils in the arterial wall and in the circulation. Although IL6 levels in the serum are high in acute KD, this may represent synthesis of IL6 by hepatocytes as part of the acute phase response rather than synthesis by circulating immune cells, as eQTL are often tissue-specific. This could have important implications for persistence of the acute inflammatory state in KD (Fig 6). Of interest, a pilot study of tocilizumab (monoclonal antibody against human IL6R) plus IVIG for treatment of acute KD was terminated for safety concerns when the first three patients enrolled developed coronary artery aneurysms (CAA) (Prof. Emeritus Shumpei Yokota, Yokohama City University, Japan, personal communication).
The TLR6 variants associated with KD are predicted to yield lower levels of NFKB upon stimulation, thus resulting in lower levels of IL6. During vascular inflammation, neutrophil apoptosis is associated with shedding of the soluble IL6R-IL6 complex, which binds to gp130 on endothelial cells. This stimulates a signaling pathway that switches the adhesion molecule and chemokine profile to one that favors attraction of monocytes to the vessel wall with subsequent downregulation of vascular inflammation. In KD, lower levels of IL6 may lead to persistence of neutrophil recruitment to the vessel wall and prolonged inflammation.
Tier 1 genes
TLR6 and the transcription factor, MEF2A, were the two Tier 1 genes. They were found in KD pathways and were differentially expressed in the KD transcriptome database with significantly higher levels in the acute phase. MEF2A plays a critical role in transcriptional activation of IL-2 during T cell activation. Our associated SNV, rs373652230, was a (CAG) deletion at amino acid 420 in a poly-glutamine tract, and multiple studies have reported (CAG)n variants associated with coronary artery disease, although the mechanism underlying this association has not been elucidated. Although the Ingenuity Variant Analysis identified the CAG10 variant as rare (allele frequency <1%), studies in European descent, Asian, and Turkish populations found allele frequencies from 12.7–21.9%. We were unable to find data regarding the allele frequency of the (CAG) repeats in African American populations. Of interest, the intronic TLR6 SNV rs56083757 and the promoter SNV rs6837101 were also predicted to influence the DNA binding of MEF2A (HaploReg: http://www.broadinstitute.org/mammals/haploreg/detail_v4.1.php?query=&id=rs56083757). No other functional significance could be assigned to the other SNVs.
The nature of the PAMP that might stimulate TLR6/TLR2 in KD is unknown, although a hypothesis has linked KD to inhalation of Candida antigens carried on aerosols arising from agricultural areas in China. Of speculative interest, the glucans from the Candida cell wall are potent ligands for TLR2/6. In addition, C57BL/6 mice are homozygous for the p.Leu194Pro variant in TLR6 and two murine models of coronary artery vasculitis in this species use intraperitoneal injection of either Candida albicans or Lactobacillus casei cell wall extracts, both of which are potent ligands for TLR2/6 [49, 50].
Tier 2 genes
Four genes harboring nine variants met the criteria of homozygous recessive, predicted deleterious, significantly associated in the imputed GWAS, and differentially expressed in acute versus convalescent KD (Fig 2). The TLR6 promoter and 3’UTR SNVs were discussed above.
The Tier 2 gene, TACSTD2, is involved in calcium signaling and may be linked in this way to KD pathogenesis. Validated calcium signaling genes linked to KD pathogenesis currently include ITPKC, ORAI1, and SLC8A1 [22, 30, 51]. TACSTD2 encodes Trop2, a membrane-spanning protein that transduces an intracellular calcium signal. The protein is over-expressed in many epithelial cancers and is thought to be involved in metastasis and malignant cell invasion. Its transcription is regulated by a number of transcription factors including NFKB, and its expression was higher in acute than in convalescent KD.
The serine/threonine Ste-like protein kinase, SLK, belongs to the family of germinal center kinases, is ubiquitously expressed, and is involved with stress-induced apoptosis, cytoskeletal remodeling, and cell motility. ARRDC4 is a member of the α-arrestin family, which, as a class, is involved with fine-tuning cellular responses to cell surface signals. Both SLK and ARRDC4 were transcriptionally upregulated in acute KD but the mechanism through which these genes may participate in KD susceptibility is unclear.
Strengths and limitations
This is the first analysis of WGS in an African American family and provides a database that can be mined for future studies of genetic structure in this population. Many different filters can be applied for subsequent analyses that may uncover additional variants that influence susceptibility to KD. Neither affected child developed coronary artery aneurysms, so only variants affecting susceptibility can be analyzed from this dataset. We recognize that using a whole blood transcriptomic database from a mixed ethnic population could miss transcripts that are only differentially expressed in relevant cardiovascular tissues and among African Americans. This analysis underscores the need to focus genetic and genomic studies on minority populations such as African Americans who are disproportionately affected by KD compared to children of European descent.
The first analysis of WGS from an African American family with two siblings affected with KD revealed genetic variation in TLR6 that may be linked to the pro-inflammatory state during acute KD. Previous GWAS and linkage association studies had not identified this gene as influencing susceptibility to KD. Another variant, in TACSTD2, has an intriguing link to calcium signaling that will need to be pursued in future studies. The analytic approach presented here is a novel method for finding potentially relevant variants in WGS in families with individuals affected by complex genetic diseases.
Materials and methods
Members of the African American family selected for whole genome sequencing included two affected sons, an older unaffected son and daughter, and unrelated/unaffected father and mother, all of whom provided written informed consent for study participation. The study was approved by the Institutional Review Board at the University of California San Diego. The two affected subjects were both diagnosed by one of the co-authors (JCB). Neither developed coronary artery abnormalities by echocardiogram and both responded to a single dose of IVIG with defervescence and resolution of inflammation.
Whole genome sequencing
Whole blood samples or Scope® mouthwash rinses were obtained from family members and 10 μg of genomic DNA was extracted and submitted to Illumina Clinical Services Laboratory in San Diego, CA, USA for sequencing using HiSeq 2000. DNA was fragmented and attached to the surface of glass microscope slides. Fluorescently labeled nucleotides were used to sequence the fragments. Laser excitation of the nucleotides was followed by digital imaging. The sequence fragments were assessed for quality scores.
Sequence processing and variant calling
Raw images were processed by Illumina’s CASAVA version 1.8 pipeline to generate six sample FASTQ files for downstream analysis. The FASTQ files were aligned to a human reference genome (hg19), PCR duplicates were marked, and variants were called using Edico Genome’s DRAGEN pipeline with default parameters.
VCF files were uploaded to Ingenuity Variant Analysis™ (QIAGEN Redwood City, CA) for tertiary analysis. Variant prioritization was performed by sequentially applying seven filters: confidence, genetics (recessive homozygous and compound heterozygous), predicted deleterious, rare, found in KD pathway, significant in published KD GWAS, and differentially expressed in the KD transcriptome database (Fig 1). The Confidence filter retained variants with a call quality of at least 20 and a read depth of at least 10, and eliminated the top 5% of the most exonically variable genes or 100-bp regions in 1000 Genomes to remove false positive variants such as those in mucin and olfactory receptor genes. The annotated variants were imported into an in-house MySQL database to perform genetic analysis of two inheritance types, recessive homozygous and compound heterozygous. The Homozygous filter required homozygous variants to be present exclusively in the two KD-affected children and not in any of the four unaffected individuals. The Compound heterozygous filter was constructed based on the following rules:
- A variant is in a heterozygous state in both KD-affected children.
- A variant must not occur in a homozygous state in any of the four unaffected individuals, i.e. two siblings and two parents.
- A variant that is in a heterozygous state in an affected child must be heterozygous in exactly one of the parents but not both.
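The three rules above can be sketched in Python as a per-variant check followed by a gene-level grouping. This is an illustrative sketch only, not the in-house MySQL implementation; the sample labels and the input layout are hypothetical. The gene-level step, which keeps a gene only when at least one qualifying variant comes from each parent, is an assumption added here to capture the usual meaning of compound heterozygosity and is not one of the listed rules.

```python
from collections import defaultdict

# Hypothetical sample labels for the six family members.
AFFECTED = ("affected1", "affected2")
UNAFFECTED_SIBS = ("sib1", "sib2")
PARENTS = ("father", "mother")


def transmitting_parent(genotypes):
    """Apply the three listed rules to one variant.

    genotypes maps sample label -> "0/0", "0/1" or "1/1".
    Returns the single heterozygous parent, or None if any rule fails.
    """
    # Rule 1: heterozygous in both affected children.
    if any(genotypes[s] != "0/1" for s in AFFECTED):
        return None
    # Rule 2: not homozygous in any of the four unaffected individuals.
    if any(genotypes[s] == "1/1" for s in UNAFFECTED_SIBS + PARENTS):
        return None
    # Rule 3: heterozygous in exactly one parent.
    carriers = [p for p in PARENTS if genotypes[p] == "0/1"]
    return carriers[0] if len(carriers) == 1 else None


def compound_het_genes(variants):
    """variants: iterable of (gene, variant_id, genotypes).

    Assumption: a gene qualifies only if it carries at least one passing
    variant inherited from the father and one inherited from the mother.
    """
    by_gene = defaultdict(lambda: defaultdict(list))
    for gene, variant_id, genotypes in variants:
        parent = transmitting_parent(genotypes)
        if parent is not None:
            by_gene[gene][parent].append(variant_id)
    return {
        gene: dict(by_parent)
        for gene, by_parent in by_gene.items()
        if all(by_parent.get(p) for p in PARENTS)
    }
```

A stricter version could additionally require that no unaffected sibling carries both variants; that condition is not among the rules listed above and is therefore left out of the sketch.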
The Predicted deleterious filter followed the guideline classification of the American College of Medical Genetics (ACMG) and loss-of-function in terms of frameshift, in-frame in/del, start/stop codon change, missense, or splice site loss, all implemented in IVA. The Rare filter retained variants whose allele frequency was less than 1% (for recessive homozygosity) or 3% (for compound heterozygosity) in any of three public datasets: 1000 Genomes, Exome Aggregation Consortium (ExAC), and NHLBI ESP Exomes. The Pathway filter generated by the Ingenuity program retained genes implicated in pathways relevant to KD pathogenesis. These were defined using the Ingenuity system with the key terms “susceptibility to KD”, “calcium signaling pathway, coronary artery aneurysm”, and “coronary artery abnormalities”. The GWAS filter retained variants that were significantly (nominal P-value < 0.05) associated in 405 KD subjects versus 6,252 normal controls in our previously published KD GWAS dataset genotyped on the Illumina HumMap 550 SNV array. Since GWAS targets common SNVs with an allele frequency > 5%, the GWAS filter was applied to variants surviving the Deleterious filter in Fig 1. Quality control was performed with missing rate, minor allele frequency (MAF) and Hardy-Weinberg Equilibrium (HWE) cutoff values. The original dataset was expanded through imputation using SHAPEIT2 for phasing and IMPUTE2 for imputation with the most recent versions of 1000 Genomes version 3 and HapMap3 CEU panels (hg18) as reference data. The LiftOver tool from the UCSC Genome Browser (Kent et al. Genome Res 2002 PMID:12045153) was used to convert genome coordinates from assembly hg18 to hg19. The new imputation with updated reference panels increased the total number of imputed SNVs to 7,602,343 from 4,545,265. The Transcriptome filter set gene-level prioritization by retaining differentially expressed genes between 131 paired acute and convalescent whole blood RNA samples (PAXgene).
Transcript levels were measured using the Illumina HumanRef-12 V4 BeadChip with 47,000 probes and quality control and analysis were as described. The cutoff for differential expression was an adjusted P-value < 0.05.
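The allele-frequency thresholds of the Rare filter can be illustrated with a short sketch. It is not the IVA implementation: the function name, the dataset keys, and the `mode` switch are hypothetical. The default `mode="any"` follows the literal wording (below the cutoff in any of the datasets), while `mode="all"` is a stricter reading added only for comparison. Treating a variant that is absent from a dataset as having frequency 0 is also an assumption.

```python
from typing import Dict, Optional

# Cutoffs taken from the text: 1% for recessive homozygous, 3% for compound heterozygous.
THRESHOLDS = {"recessive_homozygous": 0.01, "compound_heterozygous": 0.03}


def is_rare(freqs: Dict[str, Optional[float]], model: str, mode: str = "any") -> bool:
    """Decide whether a variant passes the rarity cutoff for the given inheritance model.

    freqs maps a dataset name to an allele frequency, or None when the
    variant is not reported there (assumed here to mean frequency 0).
    """
    cutoff = THRESHOLDS[model]
    below = [(f if f is not None else 0.0) < cutoff for f in freqs.values()]
    if mode == "any":
        return any(below)
    if mode == "all":
        return all(below)
    raise ValueError("mode must be 'any' or 'all'")


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    example = {"1000G": 0.02, "ExAC": 0.005, "ESP": None}
    print(is_rare(example, "recessive_homozygous"))
    print(is_rare(example, "recessive_homozygous", mode="all"))
    print(is_rare(example, "compound_heterozygous", mode="all"))
```

The two modes can disagree for variants whose frequency differs between populations, so the choice of reading matters most for ancestry-specific alleles.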
The six family members were re-sequenced for the MEF2A (CAG)n variant.
Primers were designed using Primer3: forward primer (caagtccgaaccgatttcac) and reverse primer (gccaagcacaattggagaat), with a product size of 247 bp. Fifty nanograms of DNA from each sample were amplified using high-fidelity Taq polymerase (Platinum® Taq DNA Polymerase High Fidelity, Life Technologies) for 35 cycles following the manufacturer’s instructions. PCR products were resolved on 2% agarose gels, excised, and purified (QIAquick Gel Extraction Kit, Qiagen). Purified products were sequenced using the forward and reverse primers (Eaton Bioscience Inc., San Diego).
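As a quick sanity check on the two primer sequences quoted above, the sketch below computes their length, GC fraction, and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). This is only an illustration: the helper names are hypothetical, and the Wallace rule is a coarse approximation for short oligonucleotides, whereas Primer3 relies on more detailed thermodynamic models.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees Celsius: 2 per A/T plus 4 per G/C."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, preserving letter case."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


# Primer sequences as given in the text.
PRIMERS = {
    "forward": "caagtccgaaccgatttcac",
    "reverse": "gccaagcacaattggagaat",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

`reverse_complement` is included because the reverse primer anneals to the opposite strand, so locating it in a reference sequence requires searching for its reverse complement.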
Detailed methods were described previously. A total of 673 probes from 415 genes from the database of literature-curated human TF-target interactions for NFKB and AP1 were used for eQTL analysis.
S1 Fig. TLR6 gene structure, predicted splice variants, and location of SNVs.
S2 Fig. TLR6 expression as a function of genotype at the TLR6 intronic SNV (rs56083757) from the Genotype-Tissue Expression database (GTEx: http://www.gtexportal.org/home/).
We thank the family who donated their whole genome sequences. We also thank Francesca Marrassi, PhD, for helpful discussion of the TLR6 exonic variants and the International Kawasaki Disease Genetics consortium for the original GWAS dataset. This work was funded by the NIH Roadmap for Medical Research, Grant U54HL108460, a grant from the Gordon and Marilyn Macklin Foundation, and by Rady Children’s Institute for Genomic Medicine, San Diego.
- Conceptualization: JK CS JCB.
- Data curation: JK CS HY LTH.
- Formal analysis: JK CS SFK NV OH.
- Funding acquisition: LOM JCB.
- Investigation: JK CS EL LTH AHT JCB.
- Methodology: JK CS SFK JCB.
- Project administration: LOM JCB.
- Resources: JK CS SFK NV AHT JF.
- Software: JK NV.
- Supervision: SFK LOM JCB.
- Validation: CS EL AMR.
- Visualization: JK CS.
- Writing – original draft: JK CS JCB.
- Writing – review & editing: JK CS SFK MLH OH LOM JCB.
- 1. Baranzini SE, Mudge J, van Velkinburgh JC, Khankhanian P, Khrebtukova I, Miller NA, et al. Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis. Nature. 2010;464(7293):1351–6. pmid:20428171
- 2. Soden SE, Saunders CJ, Willig LK, Farrow EG, Smith LD, Petrikin JE, et al. Effectiveness of exome and genome sequencing guided by acuity of illness for diagnosis of neurodevelopmental disorders. Sci Transl Med. 2014;6(265):265ra168. pmid:25473036
- 3. Willig LK, Petrikin JE, Smith LD, Saunders CJ, Thiffault I, Miller NA, et al. Whole-genome sequencing for identification of Mendelian disorders in critically ill infants: a retrospective analysis of diagnostic and clinical findings. Lancet Respir Med. 2015;3(5):377–87. pmid:25937001
- 4. Stittrich AB, Ashworth J, Shi M, Robinson M, Mauldin D, Brunkow ME, et al. Genomic architecture of inflammatory bowel disease in five families with multiple affected individuals. Hum Genome Var. 2016;3:15060. pmid:27081563
- 5. Dinwiddie DL, Bracken JM, Bass JA, Christenson K, Soden SE, Saunders CJ, et al. Molecular diagnosis of infantile onset inflammatory bowel disease by exome sequencing. Genomics. 2013;102(5–6):442–7. pmid:24001973
- 6. Onouchi Y. Genetics of Kawasaki disease: what we know and don't know. Circ J. 2012;76(7):1581–6. pmid:22789975
- 7. Suzuki A, Kamiya T, Kuwahara N, Ono Y, Kohata T, Takahashi O, et al. Coronary arterial lesions of Kawasaki disease: cardiac catheterization findings of 1100 cases. Pediatr Cardiol. 1986;7(1):3–9. pmid:3774580
- 8. Daniels LB, Tjajadi MS, Walford HH, Jimenez-Fernandez S, Trofimenko V, Fick DB Jr., et al. Prevalence of Kawasaki disease in young adults with suspected myocardial ischemia. Circulation. 2012;125(20):2447–53. Epub 2012/05/19. pmid:22595319
- 9. Kato H, Sugimura T, Akagi T, Sato N, Hashino K, Maeno Y, et al. Long-term consequences of Kawasaki disease. A 10- to 21-year follow-up study of 594 patients. Circulation. 1996;94(6):1379–85. pmid:8822996
- 10. Nakamura Y, Aso E, Yashiro M, Tsuboi S, Kojo T, Aoyama Y, et al. Mortality among Japanese with a history of Kawasaki disease: results at the end of 2009. J Epidemiol. 2013;23(6):429–34. Epub 2013/09/18. pmid:24042393
- 11. Gordon JB, Daniels LB, Kahn AM, Jimenez-Fernandez S, Vejar M, Numano F, et al. The Spectrum of Cardiovascular Lesions Requiring Intervention in Adults After Kawasaki Disease. Jacc. 2016;9(7):687–96. pmid:27056307
- 12. Nakamura Y. Lessones from Epidemiologic Studies of Kawasaki Disease in Japan. Circulation. 2015;131(Supplement 2):12.
- 13. Huang SK, Lin MT, Chen HC, Huang SC, Wu MH. Epidemiology of Kawasaki disease: prevalence from national database and future trends projection by system dynamics modeling. J Pediatr. 2013;163(1):126–31 e1. Epub 2013/01/15. pmid:23312687
- 14. Gibbons RV, Parashar UD, Holman RC, Belay ED, Maddox RA, Powell KE, et al. An evaluation of hospitalizations for Kawasaki syndrome in Georgia. Arch Pediatr Adolesc Med. 2002;156(5):492–6. pmid:11980556
- 15. Abuhammour WM, Hasan RA, Eljamal A, Asmar B. Kawasaki disease hospitalizations in a predominantly African-American population. Clin Pediatr (Phila). 2005;44(8):721–5.
- 16. Holman RC, Curns AT, Belay ED, Steiner CA, Schonberger LB. Kawasaki syndrome hospitalizations in the United States, 1997 and 2000. Pediatrics. 2003;112(3 Pt 1):495–501. Epub 2003/09/02. pmid:12949272
- 17. Chang CJ, Kuo HC, Chang JS, Lee JK, Tsai FJ, Khor CC, et al. Replication and meta-analysis of GWAS identified susceptibility loci in Kawasaki disease confirm the importance of B lymphoid tyrosine kinase (BLK) in disease susceptibility. PLoS One. 2013;8(8):e72037. pmid:24023612
- 18. Onouchi Y, Ozaki K, Burns JC, Shimizu C, Terai M, Hamada H, et al. A genome-wide association study identifies three new risk loci for Kawasaki disease. Nat Genet. 2012;44(5):517–21. Epub 2012/03/27. pmid:22446962
- 19. Khor CC, Davila S, Breunis WB, Lee YC, Shimizu C, Wright VJ, et al. Genome-wide association study identifies FCGR2A as a susceptibility locus for Kawasaki disease. Nat Genet. 2011;43(12):1241–6. Epub 2011/11/15. pmid:22081228
- 20. Burgner D, Davila S, Breunis WB, Ng SB, Li Y, Bonnard C, et al. A genome-wide association study identifies novel and functionally related susceptibility Loci for Kawasaki disease. PLoS Genet. 2009;5(1):e1000319. Epub 2009/01/10. pmid:19132087
- 21. Khor CC, Davila S, Shimizu C, Sheng S, Matsubara T, Suzuki Y, et al. Genome-wide linkage and association mapping identify susceptibility alleles in ABCC4 for Kawasaki disease. J Med Genet. 2011;48(7):467–72. Epub 2011/05/17. pmid:21571869
- 22. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, et al. ITPKC functional polymorphism associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nature genetics. 2008;40(1):35–42. pmid:18084290
- 23. Yan Y, Ma Y, Liu Y, Hu H, Shen Y, Zhang S, et al. Combined analysis of genome-wide-linked susceptibility loci to Kawasaki disease in Han Chinese. Human genetics. 2013;132(6):669–80. Epub 2013/03/05. pmid:23456091
- 24. Lee YC, Kuo HC, Chang JS, Chang LY, Huang LM, Chen MR, et al. Two new susceptibility loci for Kawasaki disease identified through genome-wide association analysis. Nature genetics. 2012;44(5):522–5. Epub 2012/03/27. pmid:22446961
- 25. Kim JJ, Hong YM, Sohn S, Jang GY, Ha KS, Yun SW, et al. A genome-wide association analysis reveals 1p31 and 2p13.3 as susceptibility loci for Kawasaki disease. Hum Genet. 2011;129(5):487–95. Epub 2011/01/12. pmid:21221998
- 26. Tsai FJ, Lee YC, Chang JS, Huang LM, Huang FY, Chiu NC, et al. Identification of novel susceptibility Loci for kawasaki disease in a Han chinese population by a genome-wide association study. PLoS One. 2011;6(2):e16853. pmid:21326860
- 27. Shrestha S, Wiener HW, Aissani B, Shendre A, Tang J, Portman MA. Imputation of class I and II HLA loci using high-density SNPs from ImmunoChip and their associations with Kawasaki disease in family-based study. Int J Immunogenet. 2015;42(3):140–6. pmid:25809546
- 28. Alphonse MP, Duong TT, Shumitzu C, Hoang TL, McCrindle BW, Franco A, et al. Inositol-Triphosphate 3-Kinase C Mediates Inflammasome Activation and Treatment Response in Kawasaki Disease. J Immunol. 2016.
- 29. Hoang LT, Shimizu C, Ling L, Naim AN, Khor CC, Tremoulet AH, et al. Global gene expression profiling identifies new therapeutic targets in acute Kawasaki disease. Genome Med. 2014;6(11):541. pmid:25614765
- 30. Shimizu C, Eleftherohorinou H, Wright VJ, Kim J, Alphonse MP, Perry JC, et al. Genetic variation in the SLC8A1 calcium signaling pathway is associated with susceptibility to Kawasaki disease and coronary artery abnormalities. Circulation Cardovascular Genetics. (In Press).
- 31. Akira S, Takeda K. Toll-like receptor signalling. Nat Rev Immunol. 2004;4(7):499–511. pmid:15229469
- 32. Satta N, Kruithof EK, Reber G, de Moerloose P. Induction of TLR2 expression by inflammatory stimuli is required for endothelial cell responses to lipopeptides. Mol Immunol. 2008;46(1):145–57. pmid:18722665
- 33. Yang X, Murthy V, Schultz K, Tatro JB, Fitzgerald KA, Beasley D. Toll-like receptor 3 signaling evokes a proinflammatory and proliferative phenotype in human vascular smooth muscle cells. Am J Physiol Heart Circ Physiol. 2006;291(5):H2334–43. pmid:16782847
- 34. Labuda LA, de Jong SE, Meurs L, Amoah AS, Mbow M, Ateba-Ngoa U, et al. Differences in innate cytokine responses between European and African children. PLoS One. 2014;9(4):e95241. pmid:24743542
- 35. Furukawa S, Matsubara T, Yone K, Hirano Y, Okumura K, Yabuta K. Kawasaki disease differs from anaphylactoid purpura and measles with regard to tumour necrosis factor-alpha and interleukin 6 in serum. Eur J Pediatr. 1992;151(1):44–7. pmid:1728545
- 36. Bruno AE, Li L, Kalabus JL, Pan Y, Yu A, Hu Z. miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3'UTRs of human genes. BMC Genomics. 2012;13:44. pmid:22276777
- 37. Gong J, Liu C, Liu W, Wu Y, Ma Z, Chen H, et al. An update of miRNASNP database for better SNP selection by GWAS data, miRNA expression and online tools. Database (Oxford). 2015;2015:bav029.
- 38. Gong J, Tong Y, Zhang HM, Wang K, Hu T, Shan G, et al. Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis. Hum Mutat. 2012;33(1):254–63. pmid:22045659
- 39. Shey MS, Randhawa AK, Bowmaker M, Smith E, Scriba TJ, de Kock M, et al. Single nucleotide polymorphisms in toll-like receptor 6 are associated with altered lipopeptide- and mycobacteria-induced interleukin-6 secretion. Genes Immun. 2010;11(7):561–72. pmid:20445564
- 40. Ben-Ali M, Corre B, Manry J, Barreiro LB, Quach H, Boniotto M, et al. Functional characterization of naturally occurring genetic variants in the human TLR1-2-6 gene family. Hum Mutat. 2011;32(6):643–52. pmid:21618349
- 41. Depaolo RW, Tang F, Kim I, Han M, Levin N, Ciletti N, et al. Toll-like receptor 6 drives differentiation of tolerogenic dendritic cells and contributes to LcrV-mediated plague pathogenesis. Cell Host Microbe. 2008;4(4):350–61. pmid:18854239
- 42. Franco A, Touma R, Song Y, Shimizu C, Tremoulet A, Kanegaye J, et al. Specificity of regulatory T cells that modulate vascular inflammation. Autoimmunity. 2014;47:95–104. pmid:24490882
- 43. Burns JC, Franco A. The immunomodulatory effects of intravenous immunoglobulin therapy in Kawasaki disease. Expert Rev Clin Immunol. 2015;11(7):819–25. pmid:26099344
- 44. Lawrence T, Gilroy DW, Colville-Nash PR, Willoughby DA. Possible new role for NF-kappaB in the resolution of inflammation. Nat Med. 2001;7(12):1291–7. pmid:11726968
- 45. Scheller J, Chalaris A, Schmidt-Arras D, Rose-John S. The pro- and anti-inflammatory properties of the cytokine interleukin-6. Biochim Biophys Acta. 2011;1813(5):878–88. pmid:21296109
- 46. Pan F, Ye Z, Cheng L, Liu JO. Myocyte enhancer factor 2 mediates calcium-dependent transcription of the interleukin-2 gene in T lymphocytes: a calcium signaling module that is distinct from but collaborates with the nuclear factor of activated T cells (NFAT). J Biol Chem. 2004;279(15):14477–80. pmid:14722108
- 47. Liu Y, Niu W, Wu Z, Su X, Chen Q, Lu L, et al. Variants in exon 11 of MEF2A gene and coronary artery disease: evidence from a case-control study, systematic review, and meta-analysis. PLoS One. 2012;7(2):e31406. pmid:22363637
- 48. Rodo X, Curcoll R, Robinson M, Ballester J, Burns JC, Cayan DR, et al. Tropospheric winds from northeastern China carry the etiologic agent of Kawasaki disease from its source to Japan. Proc Natl Acad Sci U S A. 2014;111(22):7952–7. pmid:24843117
- 49. Lehman TJ, Warren R, Gietl D, Mahnovski V, Prescott M. Variable expression of Lactobacillus casei cell wall-induced coronary arteritis: an animal model of Kawasaki's disease in selected inbred mouse strains. Clin Immunol Immunopathol. 1988;48(1):108–18. pmid:3133145
- 50. Takahashi K, Oharaseki T, Wakayama M, Yokouchi Y, Naoe S, Murata H. Histopathological features of murine systemic vasculitis caused by Candida albicans extract—an animal model of Kawasaki disease. Inflamm Res. 2004;53(2):72–7. pmid:15021972
- 51. Onouchi Y, Fukazawa R, Yamamura K, Suzuki H, Kakimoto N, Suenaga T, et al. Variations in ORAI1 Gene Associated with Kawasaki Disease. PLoS ONE. 2016;11(1):e0145486. pmid:26789410
- 52. Shvartsur A, Bonavida B. Trop2 and its overexpression in cancers: regulation and clinical/therapeutic implications. Genes Cancer. 2015;6(3–4):84–105. pmid:26000093
- 53. Cybulsky AV, Guillemette J, Papillon J. Ste20-like kinase, SLK, activates the heat shock factor 1—Hsp70 pathway. Biochim Biophys Acta. 2016;1863(9):2147–55. pmid:27216364
- 54. Shea FF, Rowell JL, Li Y, Chang TH, Alvarez CE. Mammalian alpha arrestins link activated seven transmembrane receptors to Nedd4 family e3 ubiquitin ligases and interact with beta arrestins. PLoS ONE. 2012;7(12):e50557. pmid:23236378
- 55. Miller NA, Farrow EG, Gibson M, Willig LK, Twist G, Yoo B, et al. A 26-hour system of highly sensitive whole genome sequencing for emergency management of genetic diseases. Genome Med. 2015;7:100. pmid:26419432
- 56. Kamphans T, Sabri P, Zhu N, Heinrich V, Mundlos S, Robinson PN, et al. Filtering for compound heterozygous sequence variants in non-consanguineous pedigrees. PLoS One. 2013;8(8):e70151. pmid:23940540
- 57. Han H, Shim H, Shin D, Shim JE, Ko Y, Shin J, et al. TRRUST: a reference database of human transcriptional regulatory interactions. Sci Rep. 2015;5:11432. pmid:26066708