Germline copy number variations in BRCA1/2 negative families: Role in the molecular etiology of hereditary breast cancer in Tunisia

Hereditary breast cancer accounts for 5–10% of all breast cancer cases. So far, known genetic risk factors account for only 50% of the breast cancer genetic component, and almost a quarter of hereditary cases carry pathogenic mutations in the BRCA1/2 genes. Hence, the genetic basis of a significant fraction of familial cases remains unsolved. This missing heritability may be explained in part by copy number variations (CNVs). We herein aimed to evaluate the contribution of CNVs to hereditary breast cancer in Tunisia. Whole exome sequencing was performed for 9 BRCA negative cases with a strong family history of breast cancer and 10 matched controls. CNVs were called using the ExomeDepth R package and investigated by pathway analysis and web-based bioinformatic tools. Overall, 483 CNVs were identified in breast cancer patients. Rare CNVs affecting cancer genes were detected; of special interest were those disrupting APC2, POU5F1, DOCK8, KANSL1, TMTC3 and the mismatch repair gene PMS2. In addition, common CNVs known to be associated with breast cancer risk were identified, including CNVs in the APOBEC3A/B, UGT2B17 and GSTT1 genes, whereas those disrupting SULT1A1 and UGT2B15 seem to correlate with good clinical response to tamoxifen. Our study revealed new insights regarding CNVs and breast cancer risk in the Tunisian population. These findings suggest that rare and common CNVs may contribute to disease susceptibility. Those affecting mismatch repair genes are of interest and require additional attention since they may help to select candidates for immunotherapy, leading to better outcomes.


Introduction
Breast cancer is the most common malignancy in women worldwide, with approximately 2.09 million new cases diagnosed per year [1]. It is estimated that 5-10% of all breast cancers are hereditary [2,3]. Family-based linkage, gene re-sequencing and genome-wide association studies have allowed the identification of high, moderate and low penetrance variants that collectively explain only half of the breast cancer genetic component [3,4]. Thus, the genetic background of a substantial part of hereditary cases is yet to be discovered. Copy number variations (CNVs), typically defined as a gain or a loss of DNA sequences larger than 50 bp compared to a reference genome [5], might contribute to the remaining genetic basis of breast cancer risk [6]. Several CNVs have already been associated with many diseases, including complex disorders such as cancer [7]. CNVs may contribute to disease development through their impact on gene expression and protein structure. Indeed, deleterious CNVs in cancer patients have been observed in more than 30% of highly penetrant cancer-predisposing genes, including BRCA1, BRCA2, APC, as well as mismatch repair (MMR) genes [7,8].
Germline CNVs represent 4 to 28% of all inherited BRCA mutations depending on the study population [9]. Pathogenic CNVs are more frequent in BRCA1 than in BRCA2 and reach respectively 27% and 8% of BRCA genetic variations. This may be explained by the higher number of Alu sequences in BRCA1, and also by the homologous recombination events that occur between BRCA1 and its pseudogene [10,11]. The highest contribution of BRCA1 CNVs was reported in the Dutch population, in which 27% to 36% of all germline BRCA1 mutations are CNVs [6]. In the Tunisian population, the contribution of BRCA CNVs to breast cancer susceptibility is not well defined. A single report has been published, describing an exon 5 deletion and an exon 20 duplication in BRCA1, each identified in one patient [12]. Furthermore, several studies have been conducted in BRCA negative breast cancer patients and have led to the identification of rare candidate CNVs that might contribute to breast cancer susceptibility [3,8,13-15]. Common CNVs are also expected to play a role in cancer etiology. It was shown that approximately 40% of cancer-related genes are disrupted by a common CNV. By analogy with common cancer SNPs, these common cancer CNVs are each thought to confer only a minor increase in disease risk, but when considered collectively they may lead to a substantially elevated risk [16].
The association between common germline CNVs and breast cancer risk has been assessed in only a few reports. Recently, a genome-wide association study in a Chinese population revealed an association between a common copy number deletion at the APOBEC3 locus and breast cancer risk, with 1.31-fold and 1.76-fold increased risk associated with one-copy and two-copy deletions, respectively [17]. This finding was subsequently validated in a Caucasian population [18]. Moreover, other common CNVs were found to be associated with breast cancer risk through whole genome CNV genotyping studies, including those disrupting the OR4C11, OR4P4, OR4S2 and UGT2B17 genes [4]. These results were replicated in the study of Kumaran et al. (2017), which revealed the association of 200 common CNVs (frequency >10%) with breast cancer risk; of these, 21 CNVs were also associated with disease prognosis. Those disrupting the ZFP14, JAK1, LPA and PDGFRA genes were found to be associated with both recurrence-free survival and overall survival [19]. The association between CNVs and disease prognosis in breast cancer patients was also explored in earlier studies, in which CNVs in the drug metabolism genes GSTT1 and GSTM1 were found to predict treatment outcome [20]. The association between other metabolizing enzymes such as CYP2D6 and SULT1A1 and the clinical response to tamoxifen therapy in breast cancer patients has also been evaluated in several reports [21,22]. So far, several techniques have been used to characterize CNVs, such as multiplex ligation-dependent probe amplification (MLPA), real-time PCR and genomic arrays [7]. Nevertheless, thanks to advances in sequencing technologies such as next generation sequencing (NGS), which generates millions of sequences of the same target genomic region, it is now possible to detect CNVs from NGS data using the appropriate bioinformatics tools. These tools usually apply a read depth approach based on counting the number of reads aligned to a particular region of the human genome [10,23].
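The read depth idea mentioned above can be illustrated with a minimal sketch. It assumes per-exon read counts are already available for a test sample and for a reference, and it uses illustrative log2-ratio cutoffs (`loss_cutoff`, `gain_cutoff`) that are hypothetical and not taken from any cited tool. Dedicated callers rely on more elaborate statistical models, which this sketch does not attempt to reproduce.

```python
from math import log2

def log2_depth_ratios(test_counts, ref_counts):
    """Per-exon log2 ratio of library-size-normalised read depth (test vs reference)."""
    test_total = sum(test_counts)
    ref_total = sum(ref_counts)
    ratios = []
    for t, r in zip(test_counts, ref_counts):
        if t == 0 or r == 0:
            # No usable depth on one side: leave the exon uncalled.
            ratios.append(None)
            continue
        ratios.append(log2((t / test_total) / (r / ref_total)))
    return ratios

def classify(ratio, loss_cutoff=-0.5, gain_cutoff=0.4):
    """Map a log2 ratio to a copy number state using hypothetical cutoffs."""
    if ratio is None:
        return "no-call"
    if ratio <= loss_cutoff:
        return "deletion"
    if ratio >= gain_cutoff:
        return "duplication"
    return "neutral"

# Toy counts: exon 2 has half the expected depth, exon 4 has 1.5 times.
test_sample = [100, 50, 100, 150]
reference = [100, 100, 100, 100]
states = [classify(r) for r in log2_depth_ratios(test_sample, reference)]
```

A heterozygous deletion is expected to give a log2 ratio near -1 and a single-copy duplication a ratio near 0.58, which is why the illustrative cutoffs sit between those values and zero.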
Here, we used whole exome data to evaluate the contribution of germline CNVs to breast cancer risk in Tunisian patients who were negative for pathogenic mutations in known breast cancer susceptibility genes.

Patients
The studied cohort included 9 patients with a strong family history of breast cancer referred from the Departments of Medical Oncology of Abderrahman Mami Hospital, Surgical Oncology of Salah Azaiez Institute and Medical Oncology of the Military Hospital of Tunis. In addition, 10 non-affected unrelated individuals were included as matched controls for CNVs detection. Written informed consent was obtained from all study participants. The present study was conducted in accordance with the ethical standards of Helsinki declaration and approved by the biomedical ethics committee of Institut Pasteur de Tunis (2017/16/E/Hôpital A-M).

DNA isolation
Genomic DNA was isolated from peripheral blood, collected on EDTA, by the salt precipitation method [24]. DNA quantity and purity were evaluated using a NanoDrop™ spectrophotometer.

Whole Exome Sequencing (WES)
Whole Exome Sequencing was performed on breast cancer patients and control individuals. Samples were prepared according to Agilent's SureSelect Protocol Version 1.2 and enrichment was carried out according to Agilent SureSelect protocols. Paired-end (2 × 100) sequencing was performed on enriched samples on the Illumina HiSeq2000 platform using TruSeq v3 chemistry. Data were analyzed as described elsewhere [25]. In order to assess the quality of sequencing and to ensure that target regions are well covered, coverage analysis was performed using GATK [

Copy number variations detection and analysis
CNVs were called from WES data using the ExomeDepth R package, which uses read depth data to call CNVs from exome sequencing experiments. Each tested exome was compared to an optimized set of control exomes that had been generated by identical laboratory and computational procedures. ExomeDepth presumes that the CNV of interest is absent from the aggregate reference set [28]. Analysis was performed using the hg19 assembly as the human reference genome. Identified CNVs were annotated using the AnnotSV program, which is designed for annotating and ranking structural variations (SVs) [29]. This program provides several relevant annotations, including the allelic frequency computed from overlapping CNVs in the Database of Genomic Variants (DGV), the 1000 Genomes Project and the Deciphering Developmental Disorders (DDD) study, which contain a catalogue of SVs from control individuals of worldwide populations [30,31]. It also reports frequencies of overlapping CNVs from gnomAD and I.M. Hall's lab [32]. In addition to these annotations, this tool provides a systematic classification of CNVs based on the same categories delineated by the American College of Medical Genetics and Genomics (ACMG) (Class 1 = benign; Class 2 = likely benign; Class 3 = VOUS (variant of unknown significance); Class 4 = likely pathogenic; Class 5 = pathogenic). In order to prioritize clinically relevant CNVs, we first eliminated those considered common. A CNV was considered common if at least 70% of its length overlapped a documented CNV from the DGV, the 1000 Genomes database, the DDD study control sets, gnomAD or I.M. Hall's lab, and if it had a frequency ≥ 1%. Otherwise, the called CNV was considered rare. Subsequently, only CNVs classified as likely pathogenic or pathogenic were kept for further analysis.
In addition, we have searched published data on common CNVs and breast cancer risk to assess the possible contribution of this type of variations to hereditary breast cancer in the studied cohort.
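The prioritization rule described in this section (overlap of at least 70% with a documented CNV, a 1% frequency cutoff, then ACMG-style class 4 or 5) can be sketched as follows. This is an illustration under stated assumptions, not the AnnotSV implementation: the record layout (`start`, `end`, `acmg_class`), the function names and the toy catalogue are hypothetical, all intervals are assumed to lie on the same chromosome, and the 1% cutoff is treated as inclusive.

```python
def overlap_fraction(cnv, known):
    """Fraction of the called CNV (start, end) covered by a documented CNV (start, end)."""
    length = cnv[1] - cnv[0]
    if length <= 0:
        return 0.0
    shared = min(cnv[1], known[1]) - max(cnv[0], known[0])
    return max(0, shared) / length

def is_common(cnv, catalogue, min_overlap=0.70, min_freq=0.01):
    """True if any catalogue entry (start, end, frequency) meets both thresholds."""
    return any(
        overlap_fraction(cnv, (start, end)) >= min_overlap and freq >= min_freq
        for start, end, freq in catalogue
    )

def prioritise(calls, catalogue):
    """Keep rare calls whose class is 4 (likely pathogenic) or 5 (pathogenic)."""
    return [
        call for call in calls
        if not is_common((call["start"], call["end"]), catalogue)
        and call["acmg_class"] >= 4
    ]

# Hypothetical catalogue and calls, for illustration only.
catalogue = [(1000, 2000, 0.05), (5000, 6000, 0.001)]
calls = [
    {"id": "a", "start": 1100, "end": 1900, "acmg_class": 5},  # common: dropped
    {"id": "b", "start": 5000, "end": 6000, "acmg_class": 4},  # documented but rare: kept
    {"id": "c", "start": 1900, "end": 3000, "acmg_class": 5},  # overlap below 70%: kept
    {"id": "d", "start": 8000, "end": 9000, "acmg_class": 3},  # rare but VOUS: dropped
]
kept_ids = [call["id"] for call in prioritise(calls, catalogue)]
```

The overlap is measured relative to the called CNV rather than reciprocally, which matches the wording "at least 70% of this CNV"; a reciprocal criterion would be a different design choice.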

Gene set enrichment analysis and biological pathways investigation
Overrepresentation enrichment analysis was conducted using EnrichR, a web-based bioinformatics tool that contains a large collection of more than 100 gene set libraries [33]. Enriched pathways were visualized using ClueGO, a Cytoscape plug-in that allows the visualization of the non-redundant biological terms for large clusters of genes in a functionally grouped network [34].
We investigated the biological and functional features of genes contained within CNVs classified as likely pathogenic and pathogenic using different online databases: 1) Network of Cancer Genes version 6.0 to identify genes associated with malignancy [35], 2) Web-based Gene Set Analysis Toolkit V2 (WebGestalt2) to reveal common functions of the gene products [36], 3) Kyoto Encyclopedia of Genes and Genomes (KEGG) Mapper-Search Disease tool for searching disease genes in the KEGG DISEASE database [37], 4) The Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 which provides a comprehensive set of functional annotation tools for investigators to understand biological meaning behind large list of genes and to identify genomic loci associated with genetic disorders including cancer [38]. Moreover, and to select genes likely associated with malignancy a gene disease association (GDA) network was generated using the DisGeNET Cytoscape App. This latter interrogates the DisGeNET database, which integrates gene-disease associations from literature and from various expert curated databases [39].
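The statistic behind an overrepresentation analysis can be written down in a few lines. The sketch below computes a one-sided Fisher exact (hypergeometric) p-value for the overlap between a gene list and a gene set; the function name and the toy numbers are hypothetical, and the sketch does not reproduce any additional scoring or multiple-testing correction that a web tool may apply.

```python
from math import comb

def enrichment_p_value(hits, list_size, set_size, background_size):
    """One-sided p-value P(X >= hits) under the hypergeometric distribution.

    hits            genes shared by the input list and the gene set
    list_size       number of genes in the input list
    set_size        number of genes in the pathway or GO term
    background_size number of genes in the background universe
    """
    upper = min(list_size, set_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Toy example: all 3 listed genes fall in a 3-gene set drawn from 10 background genes.
p_all_hits = enrichment_p_value(hits=3, list_size=3, set_size=3, background_size=10)
```

With these toy numbers the only favourable draw is the gene set itself, so the p-value equals 1 divided by the number of ways to choose 3 genes out of 10.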

Results
In the current study, we performed whole exome sequencing for 9 BRCA negative breast cancer cases and 10 matched controls with the aim of assessing the contribution of germline CNVs to hereditary breast cancer in the Tunisian population. The mean age at diagnosis for the breast cancer patients included in this study was 43.9 years (range 29-60 years), and a family history of breast and/or ovarian cancer was present in all cases. Table 1 summarizes the epidemio-clinicopathological characteristics of these patients. WES data analysis showed no deleterious point mutations in any known breast cancer susceptibility gene. Coverage analysis demonstrated that target regions were well covered, making false negative results unlikely (additional data are given in S1 Table). These findings led us to hypothesize that other forms of variation, such as CNVs, may account for disease susceptibility.

Gene set enrichment analysis
Gene set enrichment analysis was performed based on biological process GO terms, WikiPathways and KEGG pathways to explore the main functions of the genes disrupted by CNVs in breast cancer patients. The top 10 enriched GO terms and pathways are listed in S2 Table. The results were visualized as a functionally organized network in order to group highly overlapping gene sets into functional clusters (Fig 2). Adaptive immune response, antigen processing and presentation, olfactory receptor activity and xenobiotic metabolism by cytochrome P450 were the main enriched functions. Interestingly, analysis of the biological pathways supplied by WikiPathways revealed an enrichment of tamoxifen metabolism (p-value = 0.01743, Fisher exact test).

Rare copy number variations likely associated with malignancy
In order to identify the most relevant CNVs that might be associated with hereditary breast cancer predisposition, we first looked for deletions and duplications within 37 genes frequently analyzed in high risk breast and ovarian cancer families [41]. The full list of genes investigated is shown in S3 Table. Two unrelated patients originating from two distinct geographical regions (BC22 and BC37) carried a 20.8 kb heterozygous deletion at the 7p22.1 locus overlapping the RSPH10B (exons 2-7) and PMS2 (exons 13-15) genes. This CNV is not described in the DGV database, yet it overlaps with a rare pathogenic deletion reported in the dbVar database (nssv8639488). No additional deleterious CNVs were identified in the remaining known genes. Therefore, we applied several filters to detect CNVs in other genes that may contribute to disease susceptibility. Common CNVs with a frequency ≥1% were eliminated. A total of 184 CNVs remained and were then filtered according to their potential pathogenicity. Only CNVs ranked as pathogenic or likely pathogenic were kept, which reduced the number of CNVs to 39 (Fig 3). The remaining CNVs were further filtered to keep only those disrupting cancer genes. Five relevant CNVs were identified as affecting the following cancer genes: APC2, POU5F1, KANSL1, DOCK8 and TMTC3. CNVs encompassing DOCK8 and KANSL1 were classified as pathogenic, while those identified in APC2, POU5F1 and TMTC3 were ranked as likely pathogenic (Table 2 and Fig 3). Functional gene annotation, biological pathway investigation and gene-disease network analysis (Fig 4) revealed relevant features for the five selected candidate CNVs. The KANSL1 gene, which is mapped to pathways affected in adenoid cystic carcinoma, was disrupted by two large deletions of 644.6 kb and 734.2 kb identified in two unrelated patients, BC37 and BC39, respectively. Based on our analysis, this gene seems to be associated with adenoid cystic carcinoma and leukemia. A duplication in DOCK8 was detected in one patient (BC40); this gene was found to be associated with neuroblastoma and hematologic neoplasms. The identification of CNVs within APC2 (BC47) and POU5F1 (BC39) was of interest, as these genes are assigned to the Wnt signaling pathway, which has a critical role in regulating cell proliferation and differentiation. Interrogation of the KEGG DISEASE and DisGeNET databases revealed an association between the APC2 gene and colorectal cancer, medulloblastoma and breast cancer, while POU5F1 was mainly associated with germ cell tumors. One patient (BC19) carried a duplication in the TMTC3 gene. According to the most recent update of the Network of Cancer Genes database, TMTC3 is considered a candidate cancer gene significantly mutated in pancreatic cancers, in which both point mutations and CNVs have been detected. In two families (BC1 and BC52), CNV prioritization did not reveal any potentially relevant rare CNVs. For family BC1, two related members were sequenced and we focused our analysis only on rare CNVs shared between the two members to confirm familial segregation. No rare CNVs were detected in this family. This was also the case for BC52, suggesting that rare CNVs do not contribute to breast cancer susceptibility in these two families.

Common copy number variations likely associated with breast cancer risk
To evaluate whether the detected common CNVs overlap with CNVs known to be associated with breast cancer risk, a literature review was conducted. Interestingly, several common CNVs identified in the current study overlap with CNV regions that were previously reported as associated with a 1.28-fold to 2.9-fold increased risk of breast cancer (p-value = 0.02 to 1.10 × 10⁻⁶). This mainly involves the following 8 genes: UGT2B15, UGT2B17, OR4C11, OR4P4, OR4S2, APOBEC3A, APOBEC3B and GSTT1 (additional data are given in Table 3). These CNVs may contribute to breast cancer heritability through a polygenic risk model, particularly for BC52. Indeed, this patient harbored several CNVs reported as associated with breast cancer risk, involving the UGT2B17, OR4C11, OR4P4, OR4S2 and GSTT1 genes. In addition, a homozygous deletion of UGT2B17 was detected in BC22 and BC40, and heterozygous deletions encompassing the APOBEC3A/B and GSTT1 genes were detected in BC39 and BC40, respectively.

CNVs in genes involved in tamoxifen metabolism and treatment outcome
The tamoxifen metabolism pathway was found to be enriched in breast cancer patients, involving the UGT2B15, SULT1A1 and CYP2D6 genes. CNVs in these genes might influence sensitivity to tamoxifen treatment. Based on the available clinical data, and taking into account the limited number of cases, we tried to assess the response to hormonal therapy of patients carrying these CNVs. BC22 carried deletions of UGT2B15 and SULT1A1 and a duplication of CYP2D6, while 3 other patients (BC37, BC40 and BC52) harbored deletions in the SULT1A1 gene. We observed that all these patients had a good clinical response to tamoxifen, with absence of disease recurrence for at least 12 months from the beginning of endocrine therapy (Table 1).

Identification of copy number variable regions and estimation of their frequencies in the Tunisian population
In order to assess the accuracy of our data, we mapped our CNV calls to data from the study of Romdhane et al. (Fig 5). Interestingly, 58 out of 280 (20.7%) of our CNVRs/CNVs overlapped with data reported in the Tunisian population. All shared CNVRs/CNVs were mapped to public data on structural variations from the DGV, the 1000 Genomes Project, the DDD study or I.M. Hall's lab, which provides confidence in the CNV calling method used in this study. The majority of these CNVRs/CNVs were common (having a frequency >1% in the public databases) and were found to affect enriched pathways such as olfactory receptor activity and xenobiotic metabolism. The remaining 222 CNVRs/CNVs were unique to breast cancer patients and are thought to contain CNVs associated with disease susceptibility, given their rarity in the Tunisian population. This is consistent with our analysis, since all the candidate CNVs that we identified as affecting the cancer genes PMS2, APC2, POU5F1, KANSL1, DOCK8 and TMTC3 belong to this category.
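One way to derive copy number variable regions (CNVRs) from individual calls and to count how many are shared with a reference data set is sketched below. The text does not state how CNVRs were constructed, so the merging rule (overlapping or touching calls on the same chromosome are fused) and the 1 bp overlap criterion are assumptions, and all names and coordinates are hypothetical.

```python
def merge_into_regions(cnvs):
    """Fuse overlapping or touching (chrom, start, end) calls into CNV regions."""
    regions = []
    for chrom, start, end in sorted(cnvs):
        last = regions[-1] if regions else None
        if last is not None and last[0] == chrom and start <= last[2]:
            # Extend the current region when the next call overlaps it.
            last[2] = max(last[2], end)
        else:
            regions.append([chrom, start, end])
    return [tuple(region) for region in regions]

def count_shared(regions, reference):
    """Count regions that overlap at least one reference interval by 1 bp or more."""
    return sum(
        any(
            chrom == ref_chrom and start < ref_end and ref_start < end
            for ref_chrom, ref_start, ref_end in reference
        )
        for chrom, start, end in regions
    )

# Hypothetical calls and reference intervals, for illustration only.
calls = [("chr1", 100, 200), ("chr1", 150, 300), ("chr1", 400, 500), ("chr2", 100, 200)]
reference = [("chr1", 250, 260), ("chr2", 500, 600)]
regions = merge_into_regions(calls)
shared = count_shared(regions, reference)
unique_to_cohort = len(regions) - shared
```

Applied to the figures reported above, the same bookkeeping gives 280 regions in total, 58 shared and 222 unique to the patients.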

Discussion
The contribution of germline DNA copy number variations to breast cancer risk remains relatively undefined compared with the well documented association between point mutations and breast cancer susceptibility. Over the last decades, much progress has been made in the field of CNV detection [50]. Nevertheless, assessing whether a CNV is benign or affects a vital biological function is still challenging [50]. In the current study, several CNVs were called, and overrepresentation enrichment analysis showed an enrichment in immune response, olfactory receptor activity and xenobiotic metabolism functions, in agreement with what has been described in the CNV map of the human genome [5]. Moreover, the called CNVs were found to be unequally distributed among chromosomes. Interestingly, we found a high proportion of copy number deletions within chromosome 17. Abnormalities affecting this chromosome are well recognized to play an important role in tumorigenesis and often arise in breast cancer. These aberrations include ERBB2 amplification, BRCA1 loss, P53 loss, and TOP2A amplification or deletion, which are known to play important roles in breast cancer pathophysiology and treatment response [51, 52]. Subsequent analyses allowed the identification of several rare and common CNVs that may contribute to hereditary predisposition in patients who do not harbor pathogenic mutations in known breast cancer susceptibility genes. Six rare CNVs were considered the most relevant. Of special interest was a rare pathogenic copy number deletion in the mismatch repair (MMR) gene PMS2, involving exons 13-15, that was detected in two unrelated patients. Mutations in PMS2 are linked to Lynch syndrome, which is characterized by early onset of colorectal cancer, along with an increased risk of other malignancies including endometrial, ovarian, small bowel, and brain carcinoma. 
This same pathogenic deletion was previously identified in two patients with transverse colon cancer [53]. In the current report, none of the two breast cancer patients had personal or family history of the traditional malignancies associated with the Lynch syndrome. A recent research study showed that women with alterations in PMS2 gene have a 3-fold increased risk for breast cancer and 37.7% cumulative risk by the age of 60 [54].
In the same study, it was shown that 11.1% of women with a Lynch syndrome alteration had no personal or family history of colorectal, endometrial, or ovarian cancer. Our findings along with those of the latter study suggest that women whose personal or family history is limited to breast cancer might carry PMS2 alterations. It was also reported that patients with germline mutations in MMR genes are candidates for immunotherapy with PD- Moreover, among genes affected by this CNV, PLEKHM1 (which is not deleted in Koolen-de Vries syndrome) is also considered as an ovarian cancer predisposing gene [58]. All these findings support the implication of KANSL1 and PLEKHM1 in cancer which may explain the phenotype of our two patients. In addition, other interesting genes were identified including APC2 and POU5F1. These genes are mapped to the Wnt signaling pathway which has been highly associated with cancer [59]. This pathway is activated in a large fraction of breast cancers which contributes to tumor recurrence and lower overall survival. Indeed, this pathway also has implications for therapeutic interventions in cancers [60]. Taking the example of POU5F1 gene, previous studies showed that the expression of this gene is required for the maintenance of transformed breast cancer cells and suggested its utility as a novel clinical biomarker and a potential target for gene-specific therapy of breast cancer [61]. In addition, alterations in APC2 through loss of heterozygosity, promoter hypermethylation and somatic copy number aberrations were also described in breast tumors [62]. Moreover, we have identified a duplication in DOCK8 gene that overlapped with a pathogenic CNV previously reported in individuals with developmental disabilities [63]. In addition to this, other reports suggested that DOCK8 may have tumor suppressor functions. 
In fact, copy number deletions in this gene have been described in human cancers, particularly in neuroblastomas [42], in primary lung cancers, and in gastric and breast cancer cell lines [64]. Furthermore, one patient harbored a duplication in the TMTC3 gene, which was found to be upregulated in breast cancer associated blood vessels and may therefore constitute a potential anti-angiogenic target for breast cancer therapy [65]. CNVs in this gene were also detected in pancreatic cancers [43]. For one family (BC1), CNV prioritization did not allow the identification of candidate rare CNVs potentially associated with breast cancer risk. Breast cancer susceptibility in this family is likely due to family-specific genetic variants [25]. In the present study, several common CNVs overlapping with CNV regions previously reported as associated with breast cancer risk were identified, including CNVs affecting the UGT2B15, UGT2B17, OR4C11, OR4P4, OR4S2, APOBEC3A, APOBEC3B and GSTT1 genes [4,19]. Several studies have found an association between APOBEC3 deletion and the risk of various cancers, particularly breast cancer, with up to 1.3-fold increased risk. This locus was shown to be significantly associated with breast cancer risk in different populations, including those of Chinese, Iranian, and European ancestries [18,47,48]. It was also demonstrated that deletion at the APOBEC3 locus disrupting the APOBEC3A and APOBEC3B genes leads to decreased expression of the corresponding genes [66,67]. Moreover, the association between GSTT1 gene deletion and breast cancer risk has been widely studied, and it was demonstrated that the GSTT1 null genotype is associated with increased breast cancer risk [68] and also with significant downregulation of the GSTT1 gene resulting in loss of protein expression [69-71]. This protein contributes to tumor cell survival by detoxification of numerous products induced by cancer therapy such as chemotherapy [49]. 
Interestingly GSTT1 was previously investigated in the Tunisian population and results have shown significant association between the gene deletion and the risk of early onset of breast carcinoma [49]. On the other hand, the absence of GSTT1 gene deletion was found to be significantly associated with poor clinical response to chemotherapy [49]. In the present study, response to chemotherapy cannot be effectively assessed due to the limited number of cases and since all patients received adjuvant treatment. Nevertheless, it is noteworthy that patients with GSTT1 gene deletion (BC40, BC52) had a good survival with absence of cancer recurrence for at least 5 years, while disease relapse was observed in 3 patients (BC1-1, BC22, BC47) with a normal copy of GSTT1. In addition, two recently published reports showed that OR4C11, OR4P4, OR4S2 and UGT2B17 are associated with breast cancer with respectively 2.6, 2.4, 2.1 and 2.2-fold increase in breast cancer risk [4,19] and it was proven also that the expression of UGT2B17 gene is correlated with the corresponding germline CNVs [19]. Based on these observations we have suggested a polygenic inheritance for one patient as she harbored CNVs in all the above genes. The assessment of whether these CNVs could be associated with breast cancer risk in the Tunisian population will be of keen interest and need to be conducted in a larger cohort. In addition, our pathway analysis resulted in mapping some common CNVs namely CYP2D6, UGT2B15 and SULT1A1 to tamoxifen metabolism. In the present report, SULT1A1 and UGT2B15 deletions seem to correlate with good clinical response to tamoxifen. In fact, tamoxifen and its metabolites are inactivated by these genes through sulfation and glucuronidation respectively. It has been demonstrated that SULT1A1 copy number is highly associated with the enzymatic activity, which is considered as a predictive biomarker for tamoxifen response [72]. 
A duplication within CYP2D6 was detected in one patient receiving tamoxifen treatment. This gene catalyzes the transformation of the tamoxifen to its active form 4-OH-TAM [73] and it was suggested that a subject with duplication of active CYP2D6 will metabolize drugs at an ultra-rapid rate, which could lead to a loss of therapeutic efficacy at standard doses [74]. Contrarily, in the present study, the patient carrying CYP2D6 gene duplication had a good clinical response to tamoxifen therapy. The evaluation of the clinical relevance of CNVs in tamoxifen-metabolizing genes to drug efficacy in Tunisian breast cancer patients is of important interest since it may help to improve therapeutic decisions.
Here we described a substantial number of CNVs that might be of clinical interest in Tunisian breast cancer patients using WES data. This report is the first to use WES in the analysis of CNVs in Tunisian BRCAx families, and it is among the first studies to elucidate the contribution of CNVs to disease susceptibility in BRCA negative families using WES data. Nonetheless, the findings of this study have to be seen in light of some limitations, mainly related to the small sample size investigated. This can be explained in part by the rarity of BRCA negative familial breast cancer cases, especially as the incidence of breast cancer in Tunisia is lower than in developed countries, and also by the limited resources that hamper the generation of a large number of exomes. Nevertheless, it is important to note that exome sequencing has previously been shown to be a valuable tool for detecting germline CNVs. Indeed, integration of CNV analysis into exome sequencing data analysis pipelines, which until now have mostly focused on single nucleotide variant analysis, seems to be a promising approach for detecting most of the alterations associated with disease susceptibility in a cost-effective manner. However, the specificity and the number of CNVs identified vary greatly depending on the platforms and the CNV detection algorithms used [75]. In fact, benchmarking of several tools for CNV detection from exome data showed that a significant fraction of called CNVs are reported by only a single tool [76]. It was also demonstrated that ExomeDepth is one of the most balanced tools in terms of sensitivity and specificity [77], and its integration into routine targeted NGS diagnostic services for Mendelian diseases has been supported [78]. Additionally, the clinically relevant CNVs reported by different breast cancer studies depend heavily on the bioinformatic tools and the methodology used to prioritize variants and to interpret results. 
To overcome these challenges, it is important to perform large scale studies, to pool data from previous reports, to analyze CNVs by combining different algorithms and to interpret the called CNVs using a consistent approach.

Conclusions
In this study, we have identified a number of germline CNVs that possibly increase the susceptibility to breast cancer and that could therefore explain a fraction of familial breast cancer cases particularly those with no mutations in the major susceptibility genes. Screening of CNVs found in Wnt and MMR pathways must be considered in breast cancer patients since it might help to guide personalized therapeutic decisions. Furthermore and taking into account the genetic proximity with other populations in Middle East and North Africa (MENA) region, the present study will have an impact on molecular diagnosis of breast cancer not only for Tunisian patients but also for patients from other neighboring countries.