Exome Sequencing Analysis Identifies Compound Heterozygous Mutation in ABCA4 in a Chinese Family with Stargardt Disease

Stargardt disease is the most common cause of juvenile macular dystrophy. Five subjects from a two-generation Chinese family with Stargardt disease are reported in this study. All family members underwent complete ophthalmologic examinations. Affected members developed the disease during childhood, with progressively impaired central vision and bilateral atrophic macular lesions of the retinal pigment epithelium (RPE) with a “beaten-bronze” appearance. Peripheral venous blood was obtained from all patients and their family members for genetic analysis. Exome sequencing was performed on two patients (II1 and II2). A total of 50709 variations shared by the two patients were subjected to several filtering steps against existing variation databases. Identified variations were verified in all family members by PCR and Sanger sequencing. Compound heterozygous variants p.Y808X and p.G607R of the ATP-binding cassette, sub-family A (ABC1), member 4 (ABCA4) gene, which encodes the ABCA4 protein, a member of the ATP-binding cassette (ABC) transport superfamily, were identified as the causative mutations for Stargardt disease in this family. Our findings add one novel ABCA4 mutation to those known in Chinese patients with Stargardt disease.


Introduction
Stargardt disease (STGD), also known as juvenile macular degeneration, was first reported by Karl Stargardt in 1909. It is one of the most common hereditary retinal dystrophies, with an estimated prevalence of at least 1:10,000 [1,2,3]. It presents with a progressive and significant loss of central vision in the first or second decade of life. However, fundus examination is frequently normal early in the course of disease, even when patients already complain of vision loss. At this stage, the clinical diagnosis of Stargardt disease may be missed. Later on, typical fundus manifestations arise, including pigment mottling, frank macular atrophy, a bull's eye maculopathy and fundus flecks in the macular and perimacular regions [4]. However, it should be noted that Stargardt disease presents with highly variable phenotypes. Histologically, Stargardt disease is associated with significant loss of photoreceptor cells and a massive deposition of lipofuscin-like material in the retinal pigment epithelium, which has also been observed in aging human eyes [6,7]. STGD is predominantly inherited as an autosomal recessive trait, although an autosomal dominant form has also been described [5]. Both sexes are equally affected. The STGD gene was mapped to the short arm of chromosome 1 [8] in a narrow genetic interval, subsequently assigned to band p22.1 [9], and is now known as ATP-binding cassette, sub-family A (ABC1), member 4 (ABCA4) [10,11]. The gene for an autosomal dominant disorder with a similar phenotype has been reported on chromosome 6 [12]. Autosomal dominant families linked to this locus are at least 50 times less common than families that are consistent with autosomal recessive inheritance [13,14]. Recessively inherited Stargardt disease is likely to be monogenic. Rare cases of STGD or "Stargardt-like" disease phenotypes have been reported with mutations in PRPH2 [15,16], VMD2 [17], ELOVL4 [5,18,19] and PROM1 [20]. These genes, as well as ABCA4, are also associated with clinically distinct phenotypes including retinitis pigmentosa, cone/rod dystrophy and pattern dystrophy.
Known candidate genes for Stargardt disease such as ABCA4 contain many exons, and hundreds of mutations have been identified in them. The cost and time required for mutation screening of all coding exons by Sanger sequencing would equal or exceed that of high-throughput next-generation sequencing (NGS) analysis. In this study, we applied next-generation sequencing technology to identify the disease-causing gene in this family as part of a large cohort study of retinal diseases. Next-generation sequencing, in particular whole-exome sequencing (WES), can now be performed rapidly and at low cost, allowing analysis of the coding regions (exome) of the human genome in single individuals or small families, including patients in whom a clear genotype-phenotype correlation is absent and patients with clinically and genetically heterogeneous conditions [21,22,23].
In the present study, disease-associated mutations were identified by WES of the two affected siblings followed by validation in the family affected by Stargardt disease. Our results identified two compound heterozygous disease-segregating mutations, c.C2424G, p.Y808X and c.G1819A, p.G607R, in the ABCA4 gene. To exclude the possibility that these mutations were polymorphisms, DNA samples of 1000 unaffected individuals were also screened for these mutations.

Subjects and Clinical Assessment
Study approval was obtained from the Institutional Review Boards of Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital and Henan Eye Hospital and Henan Eye Institute, People's Hospital of Zhengzhou University. Written informed consent was obtained in accordance with the Declaration of Helsinki for all subjects enrolled in this study. For minors, written consent was obtained from the father. Five family members were evaluated by a retina specialist and underwent complete ophthalmological assessment that included visual acuity measurement, fundus photography, fundus fluorescein angiography (FFA), multifocal electroretinogram (mfERG), optical-coherence tomography (OCT) and computerized visual field testing. In the 1000 normal matched controls, all individuals underwent an eye examination and no signs of eye diseases were observed. Venous blood samples were obtained from all subjects in EDTA Vacutainers.

DNA Extraction
All genomic DNA was extracted from peripheral blood using a blood DNA extraction kit according to the protocol provided by the manufacturer (TianGen, Beijing, China). DNA samples were stored at −20°C until used. DNA integrity was evaluated by 1% agarose gel electrophoresis.

Exome Sequencing
Exome sequencing was employed in this study to identify the disease-associated genes. It was performed on DNA samples of the two patients (family members II1 and II2) by Axeq Technology Inc., Seoul, Korea. Each sequenced sample was prepared according to the Illumina protocols. Briefly, one microgram of genomic DNA was fragmented by nebulization, the fragmented DNA was repaired, an 'A' was ligated to the 3' end, Illumina adapters were then ligated to the fragments, and the sample was size selected aiming for a 350-400 base pair product. The size-selected product was PCR amplified, and the final product was validated using the Agilent Bioanalyzer. Streptavidin beads were used to capture probes containing the targeted regions of interest; non-specific binding was then washed out. The sequencing libraries were then enriched for the desired target using the Illumina Exome Enrichment protocol, and validation of the enriched library for quality control was performed by Axeq Technology. For clustering and sequencing, the Illumina TruSeq Exome Capture System (62 Mb) was used to collect the protein-coding regions of human genomic DNA. It covered 20794 genes and 201121 exons in the Consensus Coding Sequence (CCDS) database; approximately 97.2% of CCDS exons or 96.4% of RefSeq exons were captured (http://www.illumina.com/applications/sequencing/targeted_resequencing.ilmn). Each captured library was then loaded onto the Illumina HiSeq2000 sequencer, and high-throughput sequencing was performed for each captured library to ensure that each sample met the desired average sequencing depth. Raw image files were processed by Illumina base calling Software 1.7 with default parameters, and the sequences of each individual were generated as 90-bp paired-end reads.

Reads, Mapping, and Variant Detection
The high-quality sequencing reads were aligned to the human reference genome (NCBI build 37.1/hg19) with SOAPaligner (soap2.21). Based on the SOAP alignment results, SOAPsnp v1.05 was used to assemble the consensus sequence and call genotypes in target regions. Data were provided as lists of sequence variants (SNPs and short Indels). For SNP quality control, we filtered the SOAPsnp results as follows: (i) base quality is greater than 20; (ii) depth is between 4 and 200; (iii) estimated copy number is equal to or less than 2; (iv) the distance between two SNPs is greater than 4. Small Indel detection was performed using the UnifiedGenotyper tool from GATK (version v1.0.4705) after all the high-quality reads were aligned to the human reference genome using BWA (version 0.5.9-r16). SNP and Indel detection were performed only on the targeted exome regions and flanking regions within 200 bp.
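The four SNP quality-control criteria listed above can be illustrated with a minimal sketch. This is not the SOAPsnp implementation: the `SnpCall` record, its field names and the `filter_snps` function are hypothetical, the depth bounds are assumed to be inclusive, and the spacing rule is assumed to discard both members of a pair of SNPs lying 4 or fewer positions apart after the per-site filters have been applied.

```python
from dataclasses import dataclass


@dataclass
class SnpCall:
    """Hypothetical record for one SNP call (field names are illustrative)."""
    chrom: str
    pos: int
    base_quality: int
    depth: int
    copy_number: float


def passes_site_filters(call):
    """Criteria (i)-(iii): quality > 20, depth 4..200 (assumed inclusive), copy number <= 2."""
    return (
        call.base_quality > 20
        and 4 <= call.depth <= 200
        and call.copy_number <= 2
    )


def filter_snps(calls, min_gap=4):
    """Apply the per-site filters, then criterion (iv): neighbours must be > min_gap apart."""
    kept = sorted(
        (c for c in calls if passes_site_filters(c)),
        key=lambda c: (c.chrom, c.pos),
    )
    result = []
    for i, call in enumerate(kept):
        prev_close = (
            i > 0
            and kept[i - 1].chrom == call.chrom
            and call.pos - kept[i - 1].pos <= min_gap
        )
        next_close = (
            i + 1 < len(kept)
            and kept[i + 1].chrom == call.chrom
            and kept[i + 1].pos - call.pos <= min_gap
        )
        # Assumption: a SNP is dropped if any retained neighbour is too close.
        if not (prev_close or next_close):
            result.append(call)
    return result
```

Whether the spacing rule should be evaluated before or after the per-site filters is not stated in the text; the order used here is one plausible reading.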

Variants Validation
After filtering against multiple databases, Sanger sequencing was used to determine whether any of the remaining variants cosegregated with the disease phenotype in this family. Primers flanking the candidate loci were designed based on genomic sequences of the Human Genome (hg19/build37.1) and synthesized by Invitrogen, Shanghai, China: ABCA4-exon13-F, tgagttccgagtcaccctgt; ABCA4-exon13-R, gtcagagctccatgctctcc; ABCA4-exon16-F, ctctacctcgagggcatctg; ABCA4-exon16-R, ggctggggatctgaagaact. Genotyping for c.C2424G and c.G1819A in the family members was then confirmed by direct polymerase chain reaction (PCR) and analyzed on an ABI 3730XL Genetic Analyzer. Sequencing data were compared with the Human Genome database.

Clinical Presentation of Family 2084
A two-generation family (family 2084) from Henan Province of China was recruited in this study (Figure 1). Ophthalmic examinations identified two affected individuals as Stargardt disease patients among the five examined family members. Affected members of this family exhibited similar clinical features. The two affected siblings presented with early-onset, markedly decreased visual acuity in both eyes (OD: 20/400, OS: 20/400) and increasing difficulty adapting to the dark (Table 1). Fundus examination showed some pigment mottling and yellow-white flecks in both maculae and normal caliber of the retinal vessels, but no pigmented bone spicules in the retinal periphery (Figure 2A). The fluorescein angiogram displayed hyperfluorescent flecks, which extended to the midperipheral retina, and fluorescence blocks formed by the pigment mottling in the macula (Figure 2B). The multifocal electroretinogram showed a severely depressed central waveform and significant loss of paracentral/peripheral retinal response (Figure 2C). The macular OCT showed hyper-reflective deposits within the RPE layer and at the level of the outer segments of the photoreceptors, thinning of the retinal outer layers and enhanced choroidal reflectivity (Figure 2D).
After variant calling, we identified 5843 functional SNPs and 334 functional Indels that were shared by these two patients (Table 2). We focused only on the functional SNPs/Indels, including non-synonymous variants (NS), splice acceptor and donor site mutations (SS), and frameshift coding-region insertions or deletions (Indels), which were more likely to be pathogenic than others, especially those in homozygous or multiple heterozygous status. We then compared these variants in the two affected members with dbSNP135, the 1000 Genomes Project, the HapMap project, the YH database and our in-house database, which was generated by our laboratory from 1600 whole-exome sequencing datasets (Table 2). The in-house data include whole-exome variant data from people without any eye disease; we could therefore exclude variants that occur at high frequency in such people.
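The database filtering step can be sketched as a set operation: keep the variants shared by both patients, then remove anything present in the reference collections. This is a simplified illustration, not the pipeline used in the study. The function name, the `(chrom, pos, ref, alt)` tuple representation and the treatment of each database as a plain set of known variants are assumptions; in particular, the frequency-based exclusion described for the in-house data is reduced here to simple presence or absence.

```python
def shared_candidate_variants(patient_a, patient_b, reference_sets):
    """Return variants carried by both patients and absent from every reference set.

    Each variant is a hashable key such as (chrom, pos, ref, alt).
    `reference_sets` is an iterable of collections of known variants
    (stand-ins for dbSNP, 1000 Genomes, HapMap, YH and in-house data).
    """
    shared = set(patient_a) & set(patient_b)
    known = set().union(*reference_sets)
    return shared - known
```

A call such as `shared_candidate_variants(calls_II1, calls_II2, [dbsnp, in_house])`, with hypothetical variable names, would return the subset of shared variants that remain as candidates.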
Under the autosomal recessive model, no variants satisfying a recessive homozygous inheritance model were identified. This led us to investigate the possibility of recessive compound heterozygous inheritance. Using this model, the filtered data were narrowed to 25 heterozygous variants. We then compared these variants with reported retinal disease genes (https://sph.uth.edu/Retnet/). In both patients we found two mutations in the gene ABCA4 (NM_000350.2), c.C2424G (p.Y808X) and c.G1819A (p.G607R), satisfying a recessive compound heterozygous inheritance model (Table 3). When we checked the Human Gene Mutation Database (http://www.hgmd.org/), we found that mutation p.G607R was reported by Andrea Rivera in 2000 [24]. Sanger sequencing confirmed these two mutations in the two affected siblings and demonstrated that their parents were unaffected carriers of the Y808X (father) and G607R (mother) mutations, showing complete co-segregation of the mutations with the disease phenotype (Figure 3). The two mutations described above were absent in 1000 ethnicity-matched control samples screened by direct sequencing. These data, together with the clinical presentation of the two affected siblings, demonstrated that the p.G607R and p.Y808X variants in the gene ABCA4 were responsible for Stargardt disease in this family.
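The compound heterozygous model and the segregation check can be expressed compactly. The sketch below is illustrative only: the function name, the dictionary layout, the sample labels for the parents, and the encoding of genotypes as alternate-allele counts (0, 1 or 2) are all assumptions. It also combines in one step what the study did in two (exome filtering in the siblings, then Sanger genotyping of the parents). A gene qualifies when every affected sibling is heterozygous for two variants, one carried only by the father and the other only by the mother.

```python
from collections import defaultdict
from itertools import combinations


def compound_het_genes(variants, affected, father, mother):
    """Find genes with a pair of variants fitting a compound heterozygous model.

    `variants` is a list of dicts with keys "name", "gene" and "genotypes",
    where "genotypes" maps a sample label to its alternate-allele count.
    """
    by_gene = defaultdict(list)
    for variant in variants:
        genotypes = variant["genotypes"]
        # Keep variants that are heterozygous in every affected individual.
        if all(genotypes[sample] == 1 for sample in affected):
            by_gene[variant["gene"]].append(variant)

    hits = {}
    for gene, gene_variants in by_gene.items():
        for a, b in combinations(gene_variants, 2):
            ga, gb = a["genotypes"], b["genotypes"]
            a_paternal = ga[father] == 1 and ga[mother] == 0
            a_maternal = ga[father] == 0 and ga[mother] == 1
            b_paternal = gb[father] == 1 and gb[mother] == 0
            b_maternal = gb[father] == 0 and gb[mother] == 1
            # The two alleles must come from different parents (in trans).
            if (a_paternal and b_maternal) or (a_maternal and b_paternal):
                hits.setdefault(gene, []).append((a["name"], b["name"]))
    return hits
```

Requiring that each variant be absent from the other parent is what distinguishes two alleles in trans from two variants inherited together on one parental chromosome.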

Mutation Detection and Analysis
SIFT was used to predict how the identified amino acid substitutions would affect protein function. The previously reported mutation p.G607R of ABCA4 was predicted to be damaging, and the novel mutation c.C2424G, p.Y808X in the affected family members introduced a stop codon, which removed 1465 amino acids from the ABCA4 protein (2273 amino acids), according to GenBank accession number NM_000350.2. Therefore, this novel mutation is likely a null allele.
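The relationship between the cDNA positions and the affected codons can be checked with simple arithmetic. The sketch below is illustrative and its function names are hypothetical. The reference codon for Tyr808 is taken as TAC, which is an inference rather than a value read from NM_000350.2: tyrosine is encoded only by TAT and TAC, and the substituted base is a C.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_of(cdna_pos):
    """Map a 1-based coding-sequence position to (codon number, position within codon)."""
    codon_number = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon_number, offset


def mutate_codon(codon, offset, new_base):
    """Return the codon with the base at 1-based `offset` replaced by `new_base`."""
    return codon[: offset - 1] + new_base + codon[offset:]


# c.C2424G falls on the third base of codon 808; c.G1819A on the first base of codon 607.
print(codon_of(2424))
print(codon_of(1819))

# Inferred reference codon TAC (Tyr) becomes TAG, a stop codon.
print(mutate_codon("TAC", 3, "G") in STOP_CODONS)

# Residues downstream of position 808 in a 2273-residue protein.
print(2273 - 808)
```

For p.G607R the same reasoning applies to the first codon position: a G to A change turns a GGA or GGG glycine codon into AGA or AGG, both of which encode arginine in the standard genetic code.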
PolyPhen2 was used to explore sequence homology and the physical properties of the corresponding affected amino acids. Mutation p.G607R, located within exon 13, replaces a hydrophilic glycine with an arginine at position 607, which may be a damaging replacement, with a score of 0.99 (sensitivity: 0.72, specificity: 0.97; Figure 4A). Mutation Y808X, located within exon 16, is a nonsense mutation, and the mRNA with a premature stop codon is likely to be degraded by nonsense-mediated mRNA decay, thus leading to a decrease in ABCA4 expression. As shown in Figure 4A, both amino acid changes affect highly conserved residues.
We then used TMHMM2.0 to predict the ABCA4 protein structure. Our result showed that the protein was organized in two structurally related tandem-arranged halves, with each half containing transmembrane domains (TMD) followed by nucleotide binding domains (NBD). Our result also indicated that G607R occurs in the ECD1 domain, while Y808X occurs in the disc lumen region between TMD1 (Figure 4B).

Discussion
This study identifies novel compound heterozygous mutations in the ABCA4 gene as a cause of Stargardt disease. STGD accounts for approximately 7% of all retinal dystrophies. It is one of the most common genetic forms of juvenile or early adult onset macular degeneration. This condition affects the central retina (macula) with a variable phenotype and a variable age of onset and severity.
The ABCA4 gene, located at chromosome 1p22.1 and comprising 50 exons, encodes a large glycoprotein of 2,273 amino acids organized into two structurally related tandem-arranged halves, with each half containing a transmembrane domain (TMD) followed by a nucleotide binding domain (NBD) [25,26,27]. Both the N and C halves are predicted to have a single membrane-spanning segment followed by a large exocytoplasmic (extracellular/lumen) domain (ECD), five membrane-spanning segments and a nucleotide-binding domain (NBD) [27,28]. A highly conserved VFVNFA motif near the C-terminus has been shown to play an essential role in the folding of ABCA4 into a functional protein [29].
The ABCA4 protein is localized in cone and rod photoreceptor outer segments [30]. The normal function of ABCA4 is to facilitate the transport of all-trans-retinal from the outer segment disk to the outer segment cytoplasm in the form of a mono-substituted phospholipid known as N-retinylidene-phosphatidylethanolamine (N-ret-PE) [31,32]. When ABCA4 is defective, N-ret-PE irreversibly forms a toxic, insoluble, bisretinoid known as A2PE which is deposited in retinal pigment epithelium (RPE) cells during the process of disc shedding and phagocytosis, eventually leading to cell death and macular degeneration [33].
Compared to wild-type mice, Abca4 knockout mice show significant light-dependent changes in lipids and an accumulation of lipofuscin deposits in the RPE cells. Biochemical analysis of the lipofuscin deposits from Abca4 knockout mice shows elevated levels of several fluorescent diretinoid compounds, including A2E, a diretinal pyridinium compound known to be a major component of lipofuscin, all-trans retinal dimer, and related diretinal compounds [34,35,36,37,38,39]. In addition, Abca4 knockout mice, like individuals with Stargardt disease, show a delay in dark adaptation consistent with the delayed removal of all-trans retinal from outer segments following photobleaching [31].
Since the first report of mutations in the ABCA4 gene by Allikmets in 1997, over 800 disease-causing mutations have been identified in ABCA4-associated phenotypes, and more than half of these have been detected only once [40]. The ABCA4 mutation spectrum includes missense, nonsense, splice-site, frameshift, small deletion and insertion mutations, although approximately 80% of reported changes are missense mutations. It is now generally believed that mutations in ABCA4 result in a spectrum of related retinal dystrophies, including STGD, bull's eye maculopathy [41,42], retinitis pigmentosa [43,44,45,46,47], cone rod dystrophy and age-related macular degeneration [48]. Numerous genetic studies on STGD patients have revealed that the disease-associated ABCA4 alleles are extraordinarily heterogeneous. It has been estimated that only half of STGD cases have two known or putative disease-causing ABCA4 mutations on separate alleles, nearly one third of cases have a single mutation and the remainder have no definite or probable disease-causing ABCA4 mutations [49].
In our study, compound heterozygous mutations p.Y808X and p.G607R of the ABCA4 gene were identified. The previously reported ABCA4 p.G607R mutation [24,50] in exon 13 is a single nucleotide polymorphism, rs61749412, which is predicted to be probably damaging to protein function (PolyPhen2 score close to 1.0). Through the analysis of membrane topology by TMHMM2.0, we found that the p.G607R mutation was in the ABCA4 ECD1 region, which is involved in stacking interactions with the adenine ring of ATP [51]. The novel stop-gain p.Y808X mutation in exon 16 was detected in a heterozygous state, close to the previously reported mutation p.G818E [11]. It was predicted to result in a truncated protein, which would severely impair ABCA4 protein function. The two mutations of this novel compound heterozygous combination were absent from public databases such as 1000 Genomes or the Exome Variant Server, excluding them as common polymorphisms.
In summary, we have used WES to characterize the clinical and genetic features of a Chinese family with Stargardt disease. To identify pathogenic variants, we subjected the detected variants to an analytical pipeline for high-confidence variant calling, annotation and filtration, and finally identified novel compound heterozygous mutations in ABCA4. Our study adds another compound heterozygous mutation in ABCA4 to those known to cause Stargardt disease.