Performance Evaluation of NIPT in Detection of Chromosomal Copy Number Variants Using Low-Coverage Whole-Genome Sequencing of Plasma DNA

Objectives The aim of this study was to assess the performance of noninvasive prenatal testing (NIPT) for fetal copy number variants (CNVs) in clinical samples, using a whole-genome sequencing method. Method A total of 919 archived maternal plasma samples with karyotyping/microarray results, including 33 CNV samples and 886 normal samples collected from September 1, 2011 to May 31, 2013, were enrolled in this study. The samples were randomly rearranged and blindly sequenced by low-coverage (about 7M reads) whole-genome sequencing of plasma DNA. Fetal CNVs were detected by Fetal Copy-number Analysis through Maternal Plasma Sequencing (FCAPS) and compared with the karyotyping/microarray results. Sensitivity and specificity were evaluated. Results 33 samples with deletions/duplications ranging from 1 to 129 Mb were detected, with CNV size and location consistent with the karyotyping/microarray results. Ten false positive results and two false negative results were obtained. The sensitivity and specificity for detecting deletions/duplications were 84.21% and 98.42%, respectively. Conclusion Whole-genome sequencing-based NIPT performs well in detecting genome-wide CNVs, in particular CNVs >10Mb, using the current FCAPS algorithm. It is possible to implement the current method in NIPT to screen prenatally for fetal CNVs.


Introduction
Owing to the discovery of fetal cell-free DNA (cfDNA) in maternal plasma and the rapid development of next-generation sequencing (NGS), noninvasive prenatal testing (NIPT) has brought profound changes to antenatal healthcare in the past few years [1]. The clinical validity and utility of NIPT for testing common aneuploidies have been endorsed by various clinical guidelines for use in high-risk pregnancies [2]. Future application of NIPT may expand to average-risk pregnancies [2,3]. However, chromosomal CNVs such as deletions and duplications remain a challenge for NIPT because the abnormal chromosomal region is small [4]. CNVs are known to be common in the human genome, and diseases associated with CNVs, such as DiGeorge syndrome (22q11), Cri-du-chat syndrome (5p-), and 1p36 deletion syndrome, are documented [5]. Postnatally, pathogenic CNVs are important contributors to intellectual disability in newborns, while in prenatal practice increasing evidence shows that disease-causing CNVs are associated with adverse pregnancy outcomes. For instance, in samples with normal karyotype, clinically relevant CNVs were identified in 6% of those with ultrasound abnormalities and in 1.7% of those with advanced maternal age or positive serum screening results [6]. Recently, Dong et al showed that among samples referred for chromosomal analysis, 6.4% of samples of products of conception (e.g. spontaneous abortions and stillbirth), 13.5% of prenatal samples, and 26.3% of postnatal samples contained pathogenic CNVs [7]. Unlike aneuploidy, the risk of CNVs in the fetus is independent of maternal age, and thus younger pregnant women may face the same risk of pathogenic CNVs as older women [8]. Thus prenatal testing for clinically significant CNVs may benefit clinical management and genetic counseling on pregnancy outcomes.
Currently, amniocentesis or chorionic villus sampling (CVS) followed by karyotyping or microarray is the major approach to identify fetal CNVs, although a small but significant risk of miscarriage is associated with the procedures [9].
Several studies have shown the possibility of using whole genome sequencing-based NIPT to detect fetal CNVs [10][11][12]. However, these methods require very deep sequencing, which significantly increases the cost and difficulty of clinical use. Recently, several proof-of-concept studies also evaluated low-coverage sequencing for the detection of fetal CNVs. For instance, Yin et al developed a method that identified 71.8% of CNVs using 3.5 million reads, but the performance dropped to 41.2% when CNVs were below 5Mb [13]. Straver et al reported the detection of large CNVs (over 20Mb) with low sequencing depth (0.15-1.66X), which had limited clinical value [14]. Lo et al reported an accuracy of 64.5% (20/31) when 4-6 million reads were used to analyze samples with 3Mb to 42Mb CNVs [15]. However, when CNVs were smaller than 6Mb, only 5 of 13 cases were identified. Previously we also reported a low-coverage sequencing method for CNV detection (referred to as FCAPS) [16]. Using fewer than 8 million reads, the method can theoretically detect over 90% of CNVs >10Mb at 10% fetal fraction. Although the accuracy appeared to be high, the FCAPS method was confirmed with only four clinical samples containing CNVs. Thus this method needs to be further validated with a larger sample size. In this study, we evaluated the performance of FCAPS in detecting CNVs using selected clinical samples with known CNVs and calculated the estimated sensitivity and specificity.

Materials and Methods
Since September 2011, maternal blood samples were obtained for the NIPT service at BGI-Shenzhen under the following requirements: 1) maternal age above 18 years; 2) gestational age above 12 weeks; 3) singleton pregnancy. For NIPT, 5ml of maternal blood from each woman was collected into an EDTA-containing tube with informed consent, and plasma was extracted for NIPT as previously described [17]. Excess plasma was aliquoted and stored at -80°C. In this study, plasma samples that were confirmed by karyotyping or microarray as euploid or CNV-containing were retrospectively selected from our stored plasma collection from Sep 2011 to May 2013. FCAPS was introduced into the NIPT analytic pipeline after May 2013, so the selected samples had not been pre-screened for CNVs. This study was approved by the Institutional Review Board on Bioethics and Biosafety of BGI (BGI-IRB). Written informed consent was provided for this study, and research was not carried out until participants had signed the agreement.

DNA preparation and sequencing
The QIAamp Circulating Nucleic Acid Kit (QIAGEN) was used to extract plasma cfDNA following the manufacturer's instructions. The cfDNA then underwent library construction, quality control, and multiplexing for sequencing as described before [17]. Sixteen libraries were pooled and sequenced with 36-cycle sequencing on Illumina HiSeq2000 platforms. A barcode tracking system was employed during sample preparation. Sequencing reads were trimmed and aligned to a universal unique read set extracted from the human reference genome (hg19, NCBI build 37). Risk of chromosomal aneuploidy was calculated using the binary hypothesis t-test and logarithmic likelihood ratio L-score as previously reported [17]. The FCAPS algorithm was used for CNV identification; it employs a regression-based GC correction strategy, binary segmentation for breakpoint localization, and a dynamic threshold for signal filtering [16].
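
As an illustration of two of the steps named above, the following minimal Python sketch applies a linear-regression GC correction to per-bin read counts and then locates breakpoints by recursive binary segmentation. It is not the FCAPS implementation [16]: the linear fit, the squared-error split criterion, the function names (gc_correct, binary_segment) and the parameters min_size and min_gain are all assumptions made for this example, and the dynamic threshold step is omitted.

```python
from statistics import mean


def gc_correct(counts, gc):
    """Divide each bin count by a least-squares linear fit of count on GC content.

    Assumption: a straight-line fit; the actual FCAPS regression may differ.
    """
    mx, my = mean(gc), mean(counts)
    sxx = sum((x - mx) ** 2 for x in gc)
    slope = sum((x - mx) * (y - my) for x, y in zip(gc, counts)) / sxx
    intercept = my - slope * mx
    return [y / (intercept + slope * x) for x, y in zip(gc, counts)]


def _sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def binary_segment(ratios, start=0, end=None, min_size=3, min_gain=0.001):
    """Recursively split [start, end) where the squared error drops the most.

    Returns the breakpoint indices in increasing order. min_size and min_gain
    are arbitrary illustrative settings, not values from the study.
    """
    if end is None:
        end = len(ratios)
    if end - start < 2 * min_size:
        return []
    total = _sse(ratios[start:end])
    best_gain, best_k = 0.0, None
    for k in range(start + min_size, end - min_size + 1):
        gain = total - _sse(ratios[start:k]) - _sse(ratios[k:end])
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None or best_gain < min_gain:
        return []
    left = binary_segment(ratios, start, best_k, min_size, min_gain)
    right = binary_segment(ratios, best_k, end, min_size, min_gain)
    return left + [best_k] + right
```

On a noise-free profile in which a run of bins has a ratio of 1.05 against a background of 1.0, the sketch returns the two bin indices that flank the elevated run. With real low-coverage data the stopping rule would need to account for bin-to-bin noise.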

Fetal fraction estimation by URY
Fetal fraction was calculated in male pregnancies using the method described before [18]. Briefly, the fetal fraction of sample $i$ estimated from chromosome Y, $\varepsilon_{i,Y}$, was calculated as

$$\varepsilon_{i,Y} = \frac{cr_{i,Y} - cr'_{i,Y,f}}{cr'_{i,Y,m} - cr'_{i,Y,f}}$$

in which $cr'_{i,j,m} = f_{j,m}(GC_{i,j})$ ($j = X, Y$) indicates the fitted relative k-mer coverage from a regression of an adult male data set, and $cr'_{i,j,f} = f_{j,f}(GC_{i,j})$ ($j = X, Y$) indicates the fitted relative k-mer coverage from a regression of a fetal female data set.
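
The chromosome Y estimate above is a linear interpolation between the fitted coverage from the fetal female data set (fetal fraction 0) and the fitted coverage from the adult male data set (fetal fraction 1). A minimal sketch in Python follows; the function name, argument names, and example values are hypothetical and chosen only to show the arithmetic.

```python
def fetal_fraction_chr_y(cr_y, fitted_male, fitted_female):
    """Fetal fraction from chromosome Y relative k-mer coverage.

    cr_y          -- relative coverage of chromosome Y in the sample
    fitted_male   -- fitted coverage at the sample's GC from the adult male regression
    fitted_female -- fitted coverage at the sample's GC from the fetal female regression
    """
    denominator = fitted_male - fitted_female
    if denominator <= 0:
        raise ValueError("male fitted coverage must exceed female fitted coverage")
    return (cr_y - fitted_female) / denominator


# Hypothetical values, not taken from the study.
estimate = fetal_fraction_chr_y(cr_y=0.0005, fitted_male=0.0032, fitted_female=0.0002)
print(f"estimated fetal fraction: {estimate:.1%}")
```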

Evaluating performance of CNV identification
Before testing for CNVs, identity information of the selected samples was removed. Samples were randomly re-arranged and tested in a blinded manner by laboratory and bioinformatics personnel. Testing results were compared with karyotyping or microarray results to calculate sensitivity and specificity. Clinical information such as ultrasound findings, amniotic fluid results, and maternal white blood cell testing could also be taken into account to support the analysis of test results.

Results
From September 1, 2011 to May 31, 2013 there were 919 samples with karyotyping or microarray results, including 21 samples with CNVs >10Mb, 7 samples with CNVs <10Mb, 5 samples with two CNVs (collectively referred to as the positive sample set), and 886 euploid samples (referred to as the reference set). Table 1 shows the demographic characteristics of the positive sample set, which contained CNVs from 1Mb to 129Mb, including seven deletions causing Cri-du-chat syndrome (J01350, mic0014, mic0012, mic0005, HYQ19, H11001, H05010) and five reciprocal CNV syndromes (mic0009, Q00084, H34058, mic0017, H34056) (Table 2). Mean maternal age of the overall group was 31.71 years, with a range of 20 to 38 years. Mean gestational age was 20.24 weeks, ranging from 12 to 37 weeks (Table 1). Maternal age of the positive sample set was compared with that of the reference set by t-test, and the p-value of 0.78 indicated no significant difference between the two groups. After sample identities were removed, the 919 samples were sequenced and analyzed by FCAPS in a blinded manner. No testing failure was reported. With 24-plex sequencing, each sample yielded about 7M unique reads. Based on URY, the fetal fraction of male pregnancies was 9.7% on average.
In the positive sample set, at such low sequencing depth, CNVs were detected in 33 samples by FCAPS (Fig 1). When stratified by CNV size, FCAPS identified 25 samples containing 27 events of CNVs >10Mb, and 10 samples containing 11 events of CNVs <10Mb (Table 2). Three samples with CNVs >10Mb (K003762, K000219, AR00208) and two samples with CNVs <10Mb (R02423, mic0016) were undetected by FCAPS. Among samples containing multiple CNVs, FCAPS identified the same number of CNVs as karyotyping or microarray in 01HK67 and H34056, while in R02423, AR00208, and mic0016 the CNVs were only partly identified. In twenty-four samples, the CNV locations identified by FCAPS fully covered or overlapped by at least 50% with the karyotyping/microarray results, and these were classified as 'Consistent' (Tables 2 and 3). In contrast, in five samples FCAPS predicted the CNV on the correct chromosome yet with small (<50%) or no overlap with the karyotyping/microarray results, and these were classified as 'Partly Consistent'.
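
The concordance rule described above can be sketched as a small Python function. The text does not state how the overlap percentage was computed, so this sketch assumes it is the fraction of the karyotyping/microarray interval covered by the FCAPS call. The function name, the tuple layout, and the 'Not matched' label for calls on a different chromosome are hypothetical.

```python
def classify_call(call, truth, min_overlap=0.5):
    """Classify an FCAPS call against the karyotyping/microarray interval.

    call and truth are (chromosome, start, end) tuples in base pairs.
    Assumption: overlap is measured as a fraction of the truth interval.
    """
    call_chrom, call_start, call_end = call
    truth_chrom, truth_start, truth_end = truth
    if call_chrom != truth_chrom:
        return "Not matched"  # hypothetical label, not used in the text
    overlap = max(0, min(call_end, truth_end) - max(call_start, truth_start))
    fraction = overlap / (truth_end - truth_start)
    return "Consistent" if fraction >= min_overlap else "Partly Consistent"


# Hypothetical intervals: the call covers 8 Mb of a 10 Mb reference CNV.
print(classify_call(("chr5", 0, 10_000_000), ("chr5", 2_000_000, 12_000_000)))
```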
Of the 886 euploid samples, 872 had negative results from FCAPS analysis, resulting in 14 false positive results including 6 CNVs >10Mb and 8 CNVs <10Mb (Table 4). In these 14 false positive samples, 4.3-7.7M unique sequencing reads were obtained per sample and the fetal DNA fraction measured in male pregnancies was 5.4-10.3%, a sequencing depth consistent with the previous report [16]. Four of these 14 false positive results were caused by maternal CNV backgrounds, as shown by sequencing maternal white blood cells (Table 4). One false positive case (INC6) had CNV signals close to the chromosome telomere (Fig 2). The CNVs of the other six false positive cases were smaller than 10Mb.

Discussion
Using NIPT for CNV detection has been shown to be possible [19]. However, the efficacy of NIPT for CNV detection has not been extensively evaluated, mainly because of the lower disease prevalence [20]. Previously we developed a method to noninvasively detect CNVs, which relied on GC-bias correction, binary segmentation, and a dynamic threshold for signal filtering to reduce sequence variability and improve accuracy [16,21]. In this study, we evaluated the efficacy of CNV detection using archived samples and showed that CNVs >10Mb can be detected with high sensitivity whereas CNVs <10Mb have a reduced detection rate.
In the selected samples with CNVs ranging from 1 to 129Mb, the FCAPS method showed an overall sensitivity of 84.21% and specificity of 98.42%. Our method showed relatively high efficacy in detecting CNVs larger than 10Mb, and the efficacy was reduced for CNVs smaller than 10Mb. This trend fits our previous in silico simulation, as well as other studies showing that reduced CNV size leads to decreased detection power [16]. Overall, our method generated 46 positive CNV results, of which 32 were consistent with karyotyping/microarray confirmation, giving an accuracy of 69.57%. However, the real positive predictive value of our method could be different in practice, since the CNV samples were from a selected group and the occurrence rate did not represent that of a normal pregnancy population. Several previous studies reported preliminary results on the performance of noninvasive CNV detection. However, it is difficult to compare their results with ours because different sequencing platforms [13], sequencing parameters [13,15], and CNV sizes [14] were involved. Nonetheless, these studies commonly suggested the same factors affecting the performance of CNV detection, including CNV size, sequencing depth, fetal fraction, and GC content. In this study, the selected samples included a wide range of CNV sizes on different chromosomes, as well as various types of CNVs such as reported microdeletion or microduplication syndromes, imbalanced translocations, and CNV mosaicism. The ability to identify CNVs of different sizes and types with relatively high accuracy implies that a whole-genome sequencing-based method benefits the identification of genome-wide CNVs without prior knowledge of their locations.
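
The reported percentages can be reproduced from counts given in the Results, as the short Python check below shows. The specificity counts (872 of 886) and the 32 of 46 consistent positive results are stated in the text. The sensitivity denominator is not stated explicitly; this sketch assumes it is the 38 CNV events (27 events >10Mb plus 11 events <10Mb) listed in the Results, which is an inference.

```python
def percent(numerator, denominator):
    """Return a ratio as a percentage rounded to two decimals."""
    return round(100 * numerator / denominator, 2)


true_positives = 32       # positive results consistent with karyotyping/microarray
positive_calls = 46       # all positive CNV results generated by the method
true_negatives = 872      # euploid samples with a negative FCAPS result
euploid_samples = 886     # reference set
cnv_events = 27 + 11      # assumed sensitivity denominator (inferred, see lead-in)

sensitivity = percent(true_positives, cnv_events)
specificity = percent(true_negatives, euploid_samples)
fraction_confirmed = percent(true_positives, positive_calls)

print(sensitivity, specificity, fraction_confirmed)  # 84.21 98.42 69.57
```

Under this reading, the 46 positive results split into 32 confirmed calls and the 14 false positives reported for the euploid samples.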
Maternal CNV background has been reported to induce NIPT false positive results [13,17]. This is consistent with our data: maternal white blood cells were available for verification in four false positive cases, all of which were confirmed to have maternal CNV backgrounds. Another sample had a CNV close to the telomere. Because maternal white blood cells were not available, the reason for this false positive could not be validated. However, telomeric sequences may be prone to false positive or false negative results [22]. Among the remaining nine false positive samples, six had CNVs at the submicroscopic level, which may be difficult to confirm by karyotyping because of its limited resolution [22]. Thus our data support the use of microarray for prenatal diagnosis owing to its better resolution, as suggested by the American College of Medical Genetics and Genomics and the American College of Obstetricians and Gynecologists [23][24][25].
The existence of false positive results, and the fact that current NIPT methods cannot distinguish the source of a CNV (maternal background or fetal origin), indicate that CNVs identified by NIPT should be confirmed by prenatal diagnosis and maternal background testing to provide comprehensive information for post-test genetic counseling. However, this may significantly increase the screening cost and thus reduce the clinical utility of screening for CNVs by NIPT [15]. Nonetheless, future decreases in NIPT cost and improvements in accuracy may improve the cost-effectiveness of CNV screening and overcome this barrier to clinical use. Furthermore, the existence of false negative results in our study as well as in other previous studies implies that a negative CNV screening result by NIPT cannot rule out the possibility of clinically significant CNVs. Thus other clinical information such as ultrasound results should also be taken into account when interpreting results and providing post-test counseling.
Several limitations remain in this study. Firstly, fetal fraction and confirmation of maternal CNV background were available for only a limited number of samples, which made it difficult to explain the false positive and false negative results. Secondly, owing to the low occurrence rate of CNVs in prenatal samples, only a limited number of samples could be selected from archived storage, so the clinical performance of our method, in particular the positive predictive value, could not be assessed. Moreover, the positive samples may not be representative of clinically significant CNVs, which are commonly smaller than 3.5 Mb [26]. Further studies, ideally using prospective CNV samples, are needed for clinical validation of the method.
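
To illustrate why the positive predictive value depends on the tested population, the Python sketch below applies Bayes' rule to the sensitivity and specificity reported in this study at different prevalence values. The prevalence inputs reuse the 1.7% and 6% rates of clinically relevant CNVs cited in the Introduction [6] purely as examples. Those rates cover CNVs of all sizes, whereas the sensitivity here was measured on a selected sample set, so the outputs are illustrative and not estimates of clinical performance.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: P(CNV present | positive test)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


SENSITIVITY = 0.8421  # reported in this study
SPECIFICITY = 0.9842  # reported in this study

# Illustrative prevalence values only (see lead-in for caveats).
for prevalence in (0.017, 0.06):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

With sensitivity and specificity held fixed, the predictive value rises with prevalence, which is why a figure derived from an enriched archive cannot be transferred directly to a general pregnancy population.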

Conclusion
In conclusion, our study showed high sensitivity and specificity in detecting CNVs over 10Mb using a low-coverage sequencing method, which is consistent with the previous in silico analysis. The method also appeared to perform well in detecting CNVs smaller than 10Mb, but further evaluation is still required. Our results demonstrate that noninvasive prenatal screening for fetal CNVs is promising for clinical use, although its clinical utility needs to be further studied.