CFTR Mutations Spectrum and the Efficiency of Molecular Diagnostics in Polish Cystic Fibrosis Patients

Cystic fibrosis (CF) is caused by mutations in the cystic fibrosis transmembrane conductance regulator gene (CFTR). In light of the strong allelic heterogeneity and regional specificity of the mutation spectrum, the strategy of molecular diagnostics and counseling in CF requires genetic tests to reflect the frequency profile characteristic of a given population. The goal of the study was to provide an updated comprehensive estimation of the distribution of CFTR mutations in Polish CF patients and to assess the effectiveness of INNOLiPA_CFTR tests in the Polish population. The analyzed cohort consisted of 738 patients with a clinically confirmed CF diagnosis, prescreened for molecular defects using INNOLiPA_CFTR panels from Innogenetics. The combined efficiency of the INNOLiPA CFTR_19 and CFTR_17_TnUpdate tests was 75.5%; both mutations were detected in 68.2%, and one mutation in 14.8% of the affected individuals. The group composed of all the patients with only one or with no mutation detected (109 and 126 individuals, respectively) was analyzed further using a mutation screening approach, i.e. SSCP/HD (single strand conformational polymorphism/heteroduplex) analysis of PCR products followed by sequencing of the coding sequence. As a result, 53 more mutations were found in 97 patients. The overall efficiency of CF allele detection was 82.5% (a 7.0% increase compared to the INNOLiPA tests alone). The distribution of the most frequent mutations in Poland was assessed. Most of the mutations found repeatedly in Polish patients had been previously described in other European populations. The most frequent mutated allele, F508del, represented 54.5% of Polish CF chromosomes. Another eight mutations had frequencies over 1%, and 24 had frequencies between 1 and 0.1%; c.2052_2053insA and c.3468+2_3468+3insT were the most frequent non-INNOLiPA mutations. The mutation distribution described herein is also relevant to the Polish diaspora.
Our study also demonstrates that the reported efficiency of mutation detection strongly depends on the diagnostic experience of referring health centers.


Introduction
Cystic fibrosis (CF; MIM 219700) is the most frequent autosomal recessive disease among Caucasians; its median incidence in Europe is 1 in 3,500 [1] and ranges from 1 in 1,350 to 1 in 25,000, depending on the population under study [2]. The disease is caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene (Ensembl ENSG00000001626) [3]. The comprehensive list of CFTR mutations maintained at the Cystic Fibrosis Mutation Database (CFMDB) (www.genet.sickkids.on.ca/cftr) contained approximately 1950 entries as of November 2013.
The most frequent CFTR mutation, F508del, accounts for ~66% of CF chromosomes in the general Caucasian population [4][5]. Few other widely spread mutations reach worldwide frequencies >1%, fewer than twenty have frequencies between 0.1 and 1.0%, and the majority are found only in certain geographical regions or populations, or are "private" (reported in single families). Local frequencies of the most common mutations vary among populations [4][5], with founder effect(s) shown to be responsible for a number of them. The strong allelic heterogeneity has a direct bearing on the strategy of molecular diagnostics and counseling in CF. In order to increase the rate of CFTR mutation detection and to correctly evaluate the residual risk of being a CF carrier after molecular analysis, it is essential that genetic tests are designed based on the frequency profile characteristic of a given population and that the sensitivity to detect mutations is as high as possible [4][5][6].
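The dependence of the residual carrier risk on test sensitivity can be illustrated with a short Bayesian calculation. The following is a minimal sketch rather than a method used in this study: it assumes Hardy-Weinberg proportions, takes the European median incidence of 1 in 3,500 as the prior, assumes that the test produces no false positives, and uses two hypothetical panel sensitivities (70% and 90%) chosen only to show the trend. The function names are illustrative.

```python
import math


def carrier_frequency(incidence: float) -> float:
    """Carrier frequency 2q(1-q) under Hardy-Weinberg, where q^2 is the disease incidence."""
    q = math.sqrt(incidence)
    return 2 * q * (1 - q)


def residual_carrier_risk(prior: float, sensitivity: float) -> float:
    """P(carrier | negative test) by Bayes' rule, assuming the test gives no false positives."""
    missed = prior * (1 - sensitivity)  # carriers whose mutation the panel does not cover
    return missed / (missed + (1 - prior))


# Prior from the European median incidence of 1 in 3,500 cited in the text.
prior = carrier_frequency(1 / 3500)

# Hypothetical panel sensitivities (assumptions, not values measured in this study).
for sensitivity in (0.70, 0.90):
    risk = residual_carrier_risk(prior, sensitivity)
    print(f"sensitivity {sensitivity:.0%}: residual carrier risk {risk:.4f}")
```

Under these assumptions the prior carrier frequency is roughly 1 in 30, and the residual risk after a negative result falls as the proportion of population-specific alleles covered by the panel rises.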
The most recent publication regarding the prevalence of CFTR mutations in Poland is now more than 12 years old [7]; the other three papers investigated only a limited number of mutations [8][9] or a specific subpopulation of adult CF patients [10].
The goal of the present study was to provide an updated comprehensive estimation of the distribution of CFTR mutations in Poland, based on the analysis of a representative cohort of CF patients. At the same time, we aimed to assess the effectiveness of INNOLiPA (IL) CFTR tests (Innogenetics) in the Polish (PL) population.

Results
Two IL tests, CFTR_19 and CFTR_17TnUpdate, used to examine a cohort of 738 PL CF patients, revealed mutations in 66.8% and 8.7% of the CF alleles, respectively (a combined efficiency of 75.5%). Two mutations were found in 68.2% and one mutation in 14.8% of the patients, and no mutation was identified in 17.1% of PL CF patients (Table 1). Of the 36 most frequent European-wide CFTR mutations targeted by both IL tests, 22 were found in at least one individual (Table 2).
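The per-allele and per-patient figures above can be cross-checked against each other. The sketch below is an illustrative calculation using only the percentages quoted in this paragraph; the variable names are ours. Because every patient carries two CF alleles, a patient with one mutation identified contributes half of a fully resolved genotype.

```python
# Per-allele detection rates reported for the two INNOLiPA panels (percent).
cftr_19 = 66.8
cftr_17_tn_update = 8.7
combined_from_panels = cftr_19 + cftr_17_tn_update

# Per-patient outcome (percent of patients with two, one, or no mutation found).
two_found, one_found, none_found = 68.2, 14.8, 17.1

# A patient with one mutation found has one of two alleles resolved.
combined_from_patients = two_found + one_found / 2

print(f"from panel rates:   {combined_from_panels:.1f}%")
print(f"from patient rates: {combined_from_patients:.1f}%")
```

The two routes agree to within the rounding of the published percentages.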
All the CF patients without two IL mutations were analyzed further using the PCR-SSCP/HD-sequencing approach. Among fifty-six non-IL mutations revealed in 99 patients, forty-five were already reported in the CFMDB, and eleven were novel, never described before (Table 2). All migration variants detected by SSCP/HD were confirmed/explained by dideoxy sequencing.
Two of the non-IL CFMDB mutations were relatively frequent. The c.2052_2053insA (legacy name, l.n., 2184insA) was found in 15 (1.0%) analyzed chromosomes, and the c.3468+2_3468+3insT (l.n. 3600+2insT) in 11 (0.7%). Both mutations were in compound heterozygosity, in most cases with F508del, and in single patients with another CF mutation (Table 2); in two patients, no mutations in trans were found. In the Clinical and Functional Translation of CFTR database (CFTR2; www.cftr2.org), c.2052_2053insA has been reported as causing pancreatic insufficiency (PI); here, most of the patients were PI but some were pancreatic sufficient (PS); the same was observed for patients carrying c.3468+2_3468+3insT.
Two other CFMDB mutations affected the c.1210-12Tn_1210-34_35TGm site in intron 9 (l.n. IVS8-Tn/TGm). Specific SSCP/HD patterns (confirmed by sequencing) allowed unambiguous differentiation of the Tn alleles associated with variable size of the adjacent TG repeat. Among nineteen T5 alleles found in the whole analyzed cohort, eight (42%) were in cis with TG12-13; these compound alleles were not found in 300 analyzed control chromosomes. The T5_TG12 combination was found in six patients (in three with F508del in trans, in three as the only mutation identified). The T5_TG13 was found in two patients (in one as the only mutation, in one with c.2012delT in trans). The IVS9 T5_TG12-13 compound alleles have been described in the CFTR2 database as having "varying consequences". Here, the CF manifestation in the majority of patients carrying these alleles was relatively mild.
Of the remaining non-IL CFMDB mutations, 13 were found on 2-5 chromosomes each, and 28 were found on a single chromosome each. Eleven were amino acid substitutions whose deleterious consequence was not supported by strong negative SVM (Support Vector Machine) values. Three of them (R352Q, Q359R and D1152H) were in compound heterozygosity with F508del; six (E217G, I506, V562L, G723V, D924N and L967S) had no accompanying mutation in trans. These nine substitutions were tentatively considered causative, but their pathogenic character requires further study. Another substitution, G576A (SVM +1.73), was found in three patients, in cis with a deleterious R668C allele (SVM -1.61); the latter was also present without G576A in two patients (in one with c.1585-1G>A in trans). In the UMD-CFTR database (www.umd.be/CFTR), G576A and R668C have been reported in cis; in the CFTR2 database both mutations are described as having "varying consequences". Three of our patients carrying R668C were PI, and two appeared PS; PS/PI status was independent of the presence of G576A. We considered R668C a pathogenic mutation, and G576A an associated element of a compound allele. Finally, S912L (SVM +2.12), found in a single chromosome with F508del in trans, has been reported in the CFMDB as pathogenic only if in cis with c.3067-3072del6; since the latter was not found in the analyzed patient, we considered S912L a neutral polymorphism.
Of the eleven novel changes, seven affected the length of the CFTR protein by: (i) introducing a premature STOP codon (two); (ii) introducing a frameshift or changing the protein length (three); (iii) changing the conserved splicing site positions (two); all were considered pathogenic mutations. They were present in a single patient each (major manifestation indicated in parentheses), and most were in trans with ...
In the case of three novel amino acid substitutions (G27V, R153I and I752V), the possibility that a change represented a nonpathological polymorphism was examined. None of these alleles was found in 300 healthy chromosomes from the Polish general population, nor were they reported in the NCBI database of human single nucleotide polymorphisms, which encompasses data from the 1000 Genomes Project (www.ncbi.nlm.nih.gov/SNP; build 137). The SVM values indicated a strongly deleterious effect of both R153I and G27V on the protein stability (-2.61 and -1.92, respectively), and both positions were conserved in the comparative analysis with orthologues from several Eutherian species (www.ensembl.org); R153I and G27V were therefore assumed to be pathogenic mutations. R153I was found in two unrelated patients with pulmonary manifestation and a known CF mutation on the other allele (F508del or c.3528delC); the first patient had elevated chloride values (94 and 98 mmol/L), and no chloride data were available for the second patient. G27V was found in a single patient with PI and pulmonary symptoms (chloride values >100 mmol/L), and was accompanied by F508del on the other allele. The deleterious consequence of I752V (found in a single ... (although present in Primates) was not conserved in other Mammalian species. I752V was therefore considered to represent a rare neutral polymorphism rather than a pathogenic mutation.
An atypical allele of the length polymorphism in intron 9, with an even number of Ts associated in cis with TG12 (IVS9 T8_TG12), was identified in one patient with meconium ileus (MI) and failure to thrive. No mutation in trans was found and no RNA was available from the patient to experimentally confirm the effect of this mutation on CFTR expression. However, given that a similar change (T6 allele) was reported in the UMD-CFTR database as pathogenic, we tentatively considered the T8 allele to be a CF mutation.
MLPA (multiplex ligation-dependent probe amplification) analysis was performed to detect the potential presence of large exonic deletions in 58 patients with only one mutation and in 46 of 100 patients with no pathogenic mutations identified in the IL tests combined with the SSCP/HD screening. No changes indicative of the presence of unknown large exonic deletions were identified.

Discussion
Using both the CFTR_19 and CFTR_17TnUpdate tests allowed identification of CFTR mutations in 75.5% of the CF chromosomes. SSCP/HD-based screening detected 53 more sequence changes assumed to be pathogenic mutations. After the extended screening of the CFTR coding sequence, the second mutation was found in 71 of the 109 patients with one IL mutation (Table 3). Among the 126 patients without IL mutations, one mutation was found in 20 and two in 6 individuals. The residual proportion of CF patients with none of the mutations detected was 13.6% (100/738), i.e. 3.5% less than after using only IL tests. The combined efficiency of CFTR mutation detection (82.5%) represented a 7.0% increase compared to the IL tests alone. These results indicate that IL panels, which are among the most popular CF diagnostic tests used in Europe, may not be optimal for mutation detection in the PL population. The most frequent non-IL alleles identified in this study (i.e. c.2052_2053insA; c.3468+2_3468+3insT; IVS9 T5_TG12-13; R668C; G314R) should be included in the PL population screening panel. The mutation distribution described herein is also relevant to the Polish diaspora; according to the Polish Press Agency, a report of the Polish Ministry of Foreign Affairs issued in July 2013 estimated the size of the Polish diaspora at over 18 million people, with the major centers in the US, Canada, Germany, the UK and other countries (see also www.wikipedia.org/wiki/Polish_diaspora). The aforementioned mutations could be considered when testing these populations.
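The efficiencies quoted in this paragraph follow directly from the patient counts. The sketch below is an illustrative reconstruction, not part of the original analysis; it uses only the counts given above (738 patients, 109 with one IL mutation, 126 with none, and the outcome of the extended screening), and the variable and function names are ours.

```python
total_patients = 738
one_il, no_il = 109, 126                    # patients with one / no IL mutation
two_il = total_patients - one_il - no_il    # patients with both IL mutations

# Outcome of the extended SSCP/HD screening, as reported in the text.
second_found_in_one_il = 71
one_found_in_no_il, two_found_in_no_il = 20, 6

two_final = two_il + second_found_in_one_il + two_found_in_no_il
one_final = (one_il - second_found_in_one_il) + one_found_in_no_il
none_final = no_il - one_found_in_no_il - two_found_in_no_il


def allele_detection_rate(two: int, one: int, patients: int) -> float:
    """Fraction of CF alleles identified, counting two alleles per patient."""
    return (2 * two + one) / (2 * patients)


il_rate = allele_detection_rate(two_il, one_il, total_patients)
final_rate = allele_detection_rate(two_final, one_final, total_patients)

print(f"IL tests alone:          {il_rate:.1%}")
print(f"after extended screening: {final_rate:.1%}")
print(f"patients with no mutation: {none_final}/{total_patients} = {none_final / total_patients:.1%}")
```

The calculation reproduces the reported 75.5% and 82.5% allele detection rates and the 100 patients left with no mutation identified.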
On the other hand, even the extended screening procedure left over 17% of CFTR mutations unidentified. It has to be kept in mind that, for budgetary reasons, sequencing was performed only when an altered migration pattern of an amplicon was detected in the SSCP/HD analysis. Our long experience with the SSCP/HD technique indicates that using appropriate conditions allows detection of ~85-90% of the existing sequence changes. Applying this estimate to the 158 chromosomes with no identified mutation, one may assume that 10-15% of them (~16-24 chromosomes) could harbor a sequence change undetected for technical reasons. Nevertheless, this would still leave over 130 chromosomes with no identified mutation. The presence of large novel exonic deletions, not detectable using the SSCP/HD-based approach, was excluded by MLPA analysis performed in ~60% of the chromosomes with no identified mutation. Undetected mutations could therefore be located in gene regions that were not targeted in the screening procedure, such as deep intronic sequences or regulatory regions [12]. Also, when estimating mutation detection efficiency, the accuracy of the clinical diagnosis is an important issue, as discussed below.
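The estimate of changes missed for technical reasons is simple proportional arithmetic. The sketch below follows the reasoning of the text as written (the miss rate is applied directly to the 158 unresolved chromosomes quoted above); it is an illustration, and the variable names are ours.

```python
unresolved_chromosomes = 158          # figure quoted in the text
sensitivity_range = (0.85, 0.90)      # estimated SSCP/HD sensitivity

# Chromosomes that could carry a change missed by SSCP/HD at each sensitivity.
missed = [unresolved_chromosomes * (1 - s) for s in sensitivity_range]
fewest_missed = round(min(missed))
most_missed = round(max(missed))

# Chromosomes still unexplained even under the most pessimistic sensitivity.
still_unexplained = unresolved_chromosomes - most_missed

print(f"possibly missed by SSCP/HD: {fewest_missed}-{most_missed} chromosomes")
print(f"still unexplained: at least {still_unexplained} chromosomes")
```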
Heterogeneity of the clinical manifestation may render clinical diagnosis of some cases difficult. This problem had already been noted in one of the previous studies on CFTR mutations, where patients had been divided into groups of "CF classical, atypical and doubtful" [7]. In this context it is also worth mentioning that the estimates of CF incidence in the PL population differ, very likely reflecting differences in diagnostic schemes [2]. The most-cited source, indicating an incidence of 1:2300, is now 40 years old [13]. The more recent estimates provide much lower values, ranging from 1:5000 [14] and 1:6000, cited in the WHO 2002 report [15], to 1:7500 for Southeastern Poland, estimated for a 1-year period of 2009-2010 [16]. Some of the discrepancies in mutation detection efficiency across populations and studies may reflect this type of inaccuracy.
To illustrate the potential impact of the care center on the correct clinical diagnosis and thus on the efficiency of mutation detection, we compared the results obtained for two subgroups of our cohort (Fig. 1): patients from the national CF reference center (Institute of Tuberculosis and Lung Diseases in Rabka; N = 368) and from other health centers (general pediatric hospitals in Poznan and other cities, excluding Warsaw (see below); N = 370). The success of the IL tests (detection of both mutations) was 79% in Rabka and 58% in other centers (p<0.0001; Pearson's chi square test). After the extended gene screening, the number of patients with no mutations detected remained significantly (p<0.0001) higher in patients from the general pediatric centers (~23%) than in patients from the specialized CF center in Rabka (~5%). The observed discrepancy can be interpreted as indicating a lower rate of successful clinical CF diagnosis outside the reference center. Of note, among the patients from Rabka with only one or with no mutation found, ~80% had high sweat chloride values; in the corresponding group from the peripheral centers, high chloride values were reported in only ~50% of the patients, while the other half had ambiguous chloride values or no test results had been reported.
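The comparison of IL test success between the two subgroups can be approximated with a 2x2 Pearson chi-square test. The sketch below is an illustration under stated assumptions: the patient counts are reconstructed from the rounded percentages (79% of 368 and 58% of 370), no continuity correction is applied, and the exact contingency table used in the original analysis is not given in the text. The helper function is ours and relies only on the Python standard library; for one degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(chi2/2)).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


rabka_n, other_n = 368, 370
# Counts reconstructed from rounded percentages (an assumption).
rabka_both = round(0.79 * rabka_n)
other_both = round(0.58 * other_n)

chi2, p_value = chi_square_2x2(
    rabka_both, rabka_n - rabka_both,
    other_both, other_n - other_both,
)
print(f"chi2 = {chi2:.1f}, p = {p_value:.1e}")
```

With these reconstructed counts the p-value is far below 0.0001, in line with the significance level reported above.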
The efficiency of non-IL mutation detection was about 5 times higher in the patients with one IL mutation than in those with no IL mutation (a non-IL mutation was found in ~65% and ~13% of the chromosomes, respectively; see Table 3). This discrepancy appears consistent with the scenario that some individuals of the latter group (composed mostly of patients from outside of Rabka) were in fact misdiagnosed as CF. In addition, in over 20% of the patients with no mutation found, the clinical diagnosis was based on the presence of MI. While MI occurs in 10-20% of all CF patients at birth [17], a considerable proportion (20-50%) of newborns with MI may not in fact have CF [18][19].
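The ~65% and ~13% figures can be recovered from the patient counts given earlier in the Discussion by counting, in each group, the chromosomes that were still unresolved after the IL tests. The sketch below is an illustrative reconstruction; the variable names are ours.

```python
# Patients with one IL mutation: one unresolved chromosome each.
one_il_patients = 109
found_in_one_il = 71                       # second mutation found by SSCP/HD

# Patients with no IL mutation: two unresolved chromosomes each.
no_il_patients = 126
found_in_no_il = 20 * 1 + 6 * 2            # 20 patients with one, 6 with two mutations

rate_one_il = found_in_one_il / one_il_patients
rate_no_il = found_in_no_il / (2 * no_il_patients)

print(f"one IL mutation: {rate_one_il:.1%} of unresolved chromosomes explained")
print(f"no IL mutation:  {rate_no_il:.1%} of unresolved chromosomes explained")
print(f"ratio: {rate_one_il / rate_no_il:.1f}")
```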
In conclusion, taking into account that up to ~10% of the whole analyzed cohort could have been incorrectly diagnosed as CF, it is possible that the real efficiency of mutation detection would be higher than the estimated 85%; a similar correction would apply to the estimated mutation frequencies.
Our study group consisted mostly of CF patients from Western and Southern PL. In order to assess the mutation spectrum in the whole country, we compared our data with those obtained in the Institute of Mother and Child in Warsaw, where most of the patients were from Central and Northeastern Poland (Prof. Jerzy Bal, personal communication). The data (based on direct sequencing of the coding region) were only available for patients (N = 480) with both mutations detected (the total number of analyzed CF patients and the number of patients with only one mutation detected were not provided). To compare data from both laboratories, we calculated allele frequencies in the subset of our patients with both mutations identified. The F508del frequency in this subset was significantly higher than in the full cohort (67.2% vs 54.5%, p<0.001; Pearson's chi square), which most probably results from the possible presence of misdiagnosed patients, as discussed above; the adjusted frequencies for other mutations did not differ between the two subsets of our cohort. Three mutations in our reduced data set were significantly (p<0.03) less frequent than in Warsaw (F508del: 67.2% vs 71.8%; G542X: 1.98% vs 2.92%; N1303K: 1.47% vs 2.92%), while three mutations were significantly more frequent (c.3468+2_3468+3insT: 0.86% vs 0.10%; c.489+1G>T: 0.43% vs 0%; G314R: 0.34% vs 0%). The discrepancies between the data from our laboratory and from Warsaw may reflect regional differences in sample origin.
The most frequent CFTR mutations found in PL patients (>0.1%, as suggested in the recommendations regarding selection of mutation panels [20]), listed in Table 4, were compared to those reported for several Central and Southeastern European countries [21][22][23][24][25][26][27][28][29]. The overall efficiency of the IL tests in Polish patients (75.5%; 87% in Rabka and 65% in Polish peripheral centers) was within the range reported for other populations (61-90%; see Table 4). Of note, the highest efficiencies of testing for INNOLiPA mutations have been reported in countries with a long-standing tradition of genetic CF testing. These data lend further support to the observation that the reported efficiency of genetic tests in a given population strongly depends on the quality of clinical diagnostics performed in the referring care centers of a given country or region. The frequency of F508del (~54.5%) was similar to the estimates for Western Ukrainians, Lithuanians, Romanians and Greeks but significantly (p<0.006; Pearson's chi square) lower than for Germans, Czechs, Slovaks, Bulgarians, and Serbs; this shows that the decreasing North-South gradient of F508del frequency across Europe [4] does not hold in Eastern Slav populations. The relatively high frequency of exon2.3del21kb, alias CFTRdele2,3(21 kb) (~4.5%), was consistent with its Slavic origin [30].
Table 3. Legend: a, including six alleles with T5_TG12-13 in intron 9; b, including four alleles with T5or8_TG12-13 in intron 9; c, also analyzed by MLPA. doi:10.1371/journal.pone.0089094.t003
Among non-IL mutations, the frequency of c.2052_2053insA (l.n. 2184insA, ~1.0%) was significantly (p<0.005) lower than the 7.2% or 5.0% reported for Western Ukraine or Eastern Hungary, respectively [24,25]. This confirms the proposed "Galician" origin of this mutation and indicates that Poland is on the decreasing side of its distribution. The c.3468+2_3468+3insT (originally reported in a single PL individual [7]) accounted for ~0.75% of PL CF chromosomes; so far, it has been reported only in Poles and in Czechs [21]. The regional origin of Polish patients carrying this insertion (Southeastern part of PL) and the homogeneous SNP-based background haplotype (not shown) indicate that this is probably a founder mutation. G314R, a novel missense change found in 0.37% of the studied CF chromosomes, might be another PL founder mutation. These observations would have to be confirmed by screening for c.3468+2_3468+3insT and G314R in larger numbers of CF patients from neighboring European populations.
The relatively high frequency (0.54%) of IVS9 T5_TG12-13 (without R117H in cis) in PL patients could not be compared with other populations. While TG12-13 in cis with T5 are known to contribute to less efficient splicing of exon 10 and to an abnormal phenotype [31][32], most CF tests do not determine the length of the TG repeat, and reporting the T5 in intron 9 (l.n. IVS8-T5) as a pathogenic mutation has been recommended only if in cis with R117H [12].

Table 4. Legend: Data are given in %. a, non-INNOLiPA mutations are in bold, and a novel mutation is underlined; b, very close to the earlier estimate of 54% based on a much smaller study group of PL patients [7]; c, only selected segments of the gene have been screened; d, frequency significantly higher or e, lower than in the Polish cohort (p<0.005, Pearson's chi square); f, [5]; NA, not analyzed/not available; Poland (mostly Southern and Western Poland; this study); Czech Republic [21]; Slovakia (based on [21]); Germany [22]; Lithuania [23]; Western Ukraine [24]; East Hungary [25]; Romania [26]; Bulgaria [27]; Serbia [28]; Greece [29]. doi:10.1371/journal.pone.0089094.t004

The research protocol was approved by the Ethics Committee of the Medical University in Poznan. Genetic data referred to in the manuscript were collected over many years as a component of a routine hospital diagnostic procedure. As such, they are covered by an implied consent, which, according to the local Institutional Review Board, does not require signing an additional informed consent form and is documented by the fact of a patient undergoing the diagnostic procedure. All further analyses were based on archival data stored in a database with no current connection to the patients' identifiers.

Patients
The samples, referred for molecular diagnosis during the period of 1995-2011, originated from different care centers (predominantly the Institute of Tuberculosis and Lung Diseases in Rabka, ~50%; and pediatric clinical hospitals in Poznan, ~25%); patients represented mostly Western and Southern PL. The initial diagnosis of CF was based on standard criteria [6,33], which included one or more of the characteristic clinical features (most often: chronic obstruction/infection of the respiratory tract; gastrointestinal abnormalities; failure to thrive; meconium ileus, MI; exocrine pancreatic insufficiency) and/or at least two positive sweat chloride tests in pilocarpine iontophoresis (>60 mmol/L). Chloride measurements were not collected for newborns and very young children, consistent with the well-known observation that sweat tests are not reliable during the first years of life (e.g. [34]); in particular, for most of the patients with MI no chloride data were available.

Mutation screening
The group of unrelated CF patients (N = 738) had been sequentially examined using IL tests: CFTR_19 and, if two mutations were not detected, CFTR_17TnUpdate. IL reverse dot blot kits rely on DNA amplification and hybridization with a membrane-immobilized set of probes (Line Probe Assay). The probes correspond to the normal and mutated alleles, covering the 36 most frequent European CFTR mutations. The procedure was performed as indicated by the manufacturer (InnoGenetics, Ghent, Belgium; see also [35]).
To comprehensively determine the population frequency profile of CFTR mutations, a follow-up molecular screening was performed. All the patients with only one IL mutation (N = 109) and with no mutation found (N = 126) were analyzed using the SSCP/HD (single strand conformational polymorphism and heteroduplex) technique as described in [11]. Briefly, specific primer pairs were designed for each of the 27 CFTR exons and for the 5' and 3' UTR regions; to keep the length of each amplicon <340 bp, some exons were analyzed as overlapping fragments. For the SSCP/HD analysis, PCR-amplified segments were denatured and separated in 7 or 8% polyacrylamide (29:1) gels in 0.5x or 1x TBE (optionally with ~2 M urea and 10% glycerol); the gels were run at 8-10 W for 20-40 h at RT or 4 °C. Primer sequences, PCR conditions and detailed conditions used to separate each of the analyzed fragments are available from the authors upon request.
Finally, two thirds of the samples in which one or both mutations remained unidentified were examined for possible intragenic rearrangements using the MLPA technique with the SALSA MLPA Kit P091-B1 CFTR (MRC-Holland). The P091-B1 CFTR probemix contains probes for each of the 24 CFTR exons and a second probe for exons 6, 14, 17 and 24. In addition, the P091-B1 probemix contains a mutation-specific probe that detects the wild-type allele of the F508del mutation.
The effect of amino acid changes on the protein stability was examined using the SNPs3D online software (www.snps3d.org); a negative SVM value (< -0.5) was assumed to indicate a deleterious effect.
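The threshold rule above can be written as a one-line classifier. The sketch below is illustrative only: it does not call SNPs3D, the SVM scores are those quoted in the Results, and the helper name is ours. In the study, the final interpretation of a variant also drew on control chromosomes, sequence conservation and the clinical picture, so the SVM score alone is not a verdict.

```python
SVM_THRESHOLD = -0.5  # scores below this value are taken to indicate a deleterious effect

# SVM scores as reported in the Results for selected substitutions.
svm_scores = {
    "R153I": -2.61,
    "G27V": -1.92,
    "R668C": -1.61,
    "G576A": 1.73,
    "S912L": 2.12,
}


def is_predicted_deleterious(svm_score: float, threshold: float = SVM_THRESHOLD) -> bool:
    """Apply the SVM cut-off; a score exactly at the threshold is not flagged."""
    return svm_score < threshold


for variant, score in svm_scores.items():
    label = "deleterious" if is_predicted_deleterious(score) else "not supported as deleterious"
    print(f"{variant}: SVM {score:+.2f} -> {label}")
```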

Nomenclature of mutations
The currently recommended numbering of the gene exons (www.ensembl.org) and the protein or cDNA-based HGVS nomenclature of mutations were used throughout the text; the commonly used legacy name (l.n.), if mentioned, was given in parentheses.