Accuracy of frozen section in intraoperative margin assessment for breast-conserving surgery: A systematic review and meta-analysis

Background and objectives It is well established that a tumor-free margin is an important factor in reducing local recurrence and reoperation rates. This systematic review with meta-analysis of frozen section intraoperative margin assessment aims to evaluate its accuracy and the associated reoperation and survival rates, and to establish its importance in breast-conserving surgery. Methods A thorough review of online publication databases was conducted for the related literature up to March 2020. MeSH terms used: “Breast Cancer”, “Segmental Mastectomy” and “Frozen Section”. We included studies that evaluated the accuracy of frozen section, reoperation rates and survival rates. To assess the quality of the included articles, an adapted QUADAS-2 tool was employed. Publication bias was assessed graphically and statistically using the funnel plot and Egger’s test. The review protocol was registered in PROSPERO (CRD42019125682). Results Nineteen studies were deemed suitable, with a total of 6,769 cases. The average reoperation rate was 5.9%. Sensitivity was 0.81 (confidence interval 0.79–0.83, p < 0.0001, I² = 95.1%) and specificity was 0.97 (confidence interval 0.97–0.98, p < 0.0001, I² = 90.8%), for 17 studies and 5,615 cases. Accuracy was 0.98. Twelve studies described local recurrence, and the highest cumulative recurrence rate in 3 years was 7.5%. The quality of the included studies, assessed with the QUADAS-2 tool, showed a low risk of bias. No evidence of publication bias was found (p = 0.32), and the funnel plot showed symmetry. Conclusion Frozen section is a reliable procedure with high accuracy, sensitivity and specificity in intraoperative margin assessment of breast-conserving surgery. Therefore, this modality of margin assessment could be useful in reducing reoperation rates.



Introduction
Breast-conserving surgery (BCS) followed by radiation therapy (RT) to eradicate microscopic residual disease is the standard procedure in early-stage breast cancer treatment, since it provides similar survival rates and better cosmetic results when compared to total mastectomy [1][2][3][4].
Reoperation rates in breast-conserving surgery reported in the literature range from 20% to 40% [5], owing to positive margin status on H&E staining of the surgical specimen. The cause of such variation is multifactorial, but it is well established that excision with tumor-free margins reduces local recurrence and reoperation rates [6][7][8][9][10][11]. However, there is no consensus about the best method to achieve it, particularly regarding intraoperative margin assessment. There are several techniques to evaluate intraoperative margins, such as gross analysis, radiography, cytology and the frozen section procedure. Data from a cohort study of 24,217 patients showed that women whose lumpectomy for breast cancer did not include a frozen section procedure were four times more likely to need reoperation than those whose lumpectomy was followed by frozen section [12]. Macroscopic analysis has the advantage that it can be performed directly by the surgeon, and it has been reported to have higher accuracy (80%), sensitivity (49%) and specificity (86%) than other techniques [13].
Intraoperative frozen section analysis consists of selecting suspicious margins, freezing the samples, cutting histological sections from them, usually with the aid of a cryostat, and staining the sections for microscopic analysis. However, this implies an increase in surgery time [14][15][16], as well as the possibility of margin damage [17]. Furthermore, studies have not reached a consensus regarding its accuracy and its impact on local recurrence rates.
Based on the above, we propose a systematic review with meta-analysis of intraoperative frozen section assessment of margins to analyze its accuracy when compared to final formalin-fixed paraffin-embedded analysis, as well as the reoperation and survival rates of patients submitted to this technique. The results of this review may help establish the role of frozen section assessment of margins in breast-conserving surgery.

Methods
This systematic review followed the recommendations proposed by the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy [18] and the PRISMA statement (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) [19]. The review protocol was registered and accepted by the international prospective register of systematic reviews (PROSPERO) under CRD42019125682.

Search methods for identification of studies
In March 2020, we conducted a systematic literature search for articles on frozen section as a method of margin assessment in breast-conserving surgery using MEDLINE (via PubMed), Lilacs (via BVS), Embase (via Elsevier), ClinicalTrials.gov, Cochrane and "gray literature". No language or date restrictions were applied. MeSH terms: "Breast Cancer" [Title/Abstract], "Segmental Mastectomy" [Title/Abstract] and "Frozen Section" [Title/Abstract]. The search results were combined and exported to the EndNote bibliographic management tool, and duplicate results were removed [19]. Two trained reviewers (M.T.G. and N.C.) independently reviewed all titles for possible inclusion. All disagreements were resolved by consensus with a third senior researcher (B.S.M.).

Inclusion and exclusion criteria
All clinical trials and observational studies included in this review had the same type of target patients: women with invasive and/or in situ breast cancer who underwent breast-conserving surgery and had their margin samples submitted to frozen section assessment (index test). Only studies that reported the relevant outcome data were included: accuracy compared to the formalin-fixed paraffin-embedded analysis (reference standard test), reoperation rates, and/or overall survival rate.
The exclusion criteria took into consideration overlapping databases, frozen section of only sentinel lymph nodes, no comparison with paraffin analysis or different methods of intraoperative assessment.

Data extraction
Two researchers manually extracted the following data from all studies included in this review: number of patients, number of cases, staging, age, definition of free margin, intraoperative margin assessment method, follow-up time, number of true positives (frozen section and paraffin with positive margins), number of true negatives (frozen section and paraffin with free margins), number of false positives (positive frozen section margins and free paraffin margins), number of false negatives (free frozen section margins and positive paraffin margins), total positive cases with the paraffin method, total negative cases with the paraffin method and re-excision rate. For local recurrence and overall survival, data were combined using the inverse variance method on the log-HR scale, and on the log-RR scale for dichotomous outcomes. If the data were too diverse to permit combining effect sizes in a meaningful or valid manner, we presented the results individually using tables and graphs, as well as a narrative approach to summarize the data. In cases where accuracy was not explicitly reported, we constructed a 2 x 2 table to calculate the required data. All disagreements were resolved by consensus with a third senior reviewer.
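The 2 x 2 table described above maps directly onto the accuracy measures extracted for each study. The following Python sketch is illustrative only: the class name `TwoByTwo` and the counts are hypothetical and do not come from any included study.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


@dataclass(frozen=True)
class TwoByTwo:
    """Frozen section (index test) versus paraffin (reference standard)."""
    tp: int  # frozen section positive, paraffin positive
    fp: int  # frozen section positive, paraffin free
    fn: int  # frozen section free, paraffin positive
    tn: int  # frozen section free, paraffin free

    def sensitivity(self) -> float:
        # Share of paraffin-positive margins detected by frozen section.
        return _ratio(self.tp, self.tp + self.fn)

    def specificity(self) -> float:
        # Share of paraffin-free margins correctly called free.
        return _ratio(self.tn, self.tn + self.fp)

    def accuracy(self) -> float:
        # Share of all margins on which both methods agree.
        total = self.tp + self.fp + self.fn + self.tn
        return _ratio(self.tp + self.tn, total)


# Hypothetical counts, for illustration only.
table = TwoByTwo(tp=40, fp=5, fn=10, tn=145)
print(table.sensitivity(), table.specificity(), table.accuracy())
```

Note that the "accuracy" computed here is the simple proportion of concordant results; the Results section instead summarizes accuracy as the area under the SROC curve.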

Data collection and analysis
The next step was carried out by two reviewers, who screened all abstracts and potential articles to determine which would be submitted to a full manuscript evaluation. When a selected article lacked some necessary detail, including sensitivity and specificity, an attempt was made to contact the corresponding author.

Assessment of methodological quality
Two reviewers independently assessed quality of the articles using the QUADAS-2 tool [20] (University of Bristol, UK), adapted to this diagnostic accuracy meta-analysis. The resultant QUADAS-2 tool was used to assess studies in four key domains: patient selection, index test, reference standard, flow and timing. Questions in each domain were rated (low, high, unclear) in terms of risk of bias and concerns regarding applicability (for patient selection, index test and reference standard only). All disagreements were resolved via consensus by a third expert researcher.
The assessment of publication bias by graphical and statistical methods was performed using the funnel plot and the Egger's test.
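As a rough illustration of the statistical part of this assessment, Egger's test regresses each study's standardized effect (effect divided by its standard error) on its precision (the inverse of the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a minimal pure-Python version under stated assumptions: the function name `egger_intercept` is hypothetical, the choice of effect measure (for example, a log diagnostic odds ratio) is an assumption, and this is not the software routine used in this review.

```python
import math


def egger_intercept(effects, std_errors):
    """Ordinary least squares fit of effect/SE on 1/SE.

    Returns (intercept, standard error of the intercept, t statistic).
    The t statistic would be compared with a t distribution with
    n - 2 degrees of freedom to obtain a p-value.
    """
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical effects and standard errors, for illustration only.
intercept, se_int, t_stat = egger_intercept(
    [1.9, 2.4, 2.1, 3.0, 2.6], [0.20, 0.35, 0.25, 0.60, 0.45]
)
print(intercept, se_int, t_stat)
```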

Statistical analysis and data synthesis
A meta-analysis was conducted using methods recommended by the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. The accuracy of diagnostic tests was summarized by creating a 2 x 2 table for each study, based on information retrieved from the published papers. Test results were reported qualitatively (positive or negative), and their sensitivity and specificity (with 95% confidence intervals) were displayed in forest plots created with the Review Manager 5 software to assess the heterogeneity of diagnostic accuracy among the included studies [21]. The summary receiver operating characteristic (SROC) curve was used to measure diagnostic performance. R version 3.1 and Meta-DiSc software were also employed to perform statistical analyses. Sensitivity and subgroup analyses were carried out, taking into consideration type of study, cut-off margin and histological subtype.
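To make the pooled estimates and the inconsistency statistic (I²) reported in the Results easier to interpret, the sketch below pools per-study proportions (such as sensitivities) with fixed-effect inverse-variance weights on the logit scale, then derives Cochran's Q and I². It is a generic illustration, not the exact algorithm of Review Manager or Meta-DiSc; the function name `pool_proportions`, the 0.5 continuity correction and the example counts are assumptions.

```python
import math


def pool_proportions(events, totals):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.

    events[i] / totals[i] is one study's proportion, e.g. TP / (TP + FN).
    Returns the pooled proportion, its 95% confidence interval,
    Cochran's Q and I² (as a percentage).
    """
    if not events or len(events) != len(totals):
        raise ValueError("events and totals must be non-empty and equal length")
    logits, weights = [], []
    for e, n in zip(events, totals):
        a = e + 0.5          # continuity correction keeps the logit finite
        b = n - e + 0.5
        logits.append(math.log(a / b))
        weights.append(1.0 / (1.0 / a + 1.0 / b))  # inverse of logit variance
    w_sum = sum(weights)
    pooled_logit = sum(w * l for w, l in zip(weights, logits)) / w_sum
    q = sum(w * (l - pooled_logit) ** 2 for w, l in zip(weights, logits))
    df = len(logits) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    se = math.sqrt(1.0 / w_sum)

    def expit(z):
        return 1.0 / (1.0 + math.exp(-z))

    return {
        "pooled": expit(pooled_logit),
        "ci_low": expit(pooled_logit - 1.96 * se),
        "ci_high": expit(pooled_logit + 1.96 * se),
        "q": q,
        "i2": i2,
    }


# Hypothetical true-positive counts and paraffin-positive totals.
print(pool_proportions([30, 45, 12], [40, 50, 30]))
```

An I² close to 100% indicates that most of the observed variation between studies reflects real differences rather than chance, which is why a pooled estimate with high I² should be read with caution.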

Results
In total, 2,298 studies were identified and manually cross-referenced, and duplicates were excluded. Of those, 2,262 were excluded because they did not meet the inclusion criteria, leaving 36 articles for full-text evaluation. Five were classified as "awaiting classification" pending a reply to the e-mails sent to the corresponding authors, and twelve were excluded for the reasons described in Fig 1. In the end, 19 studies were deemed suitable for this review.
For each study, patients who underwent frozen section were evaluated to collect accuracy measures such as true positive, true negative, false positive and false negative rates. The average reoperation rate was 5.9%, ranging from 0 to 23.9% (Table 2).
A sensitivity analysis was also carried out considering only the cross-sectional studies. Sensitivity and specificity were evaluated in 6 studies [32][33][34][35][36][37]. Intraoperative assessment sensitivity was 0.64 (CI 0.59-0.69, p < 0.0001), with an inconsistency (I²) of 97.1%, in a total of 1,387 tests. Specificity was 0.98 (CI 0.97-0.99, p < 0.0001), with an I² of 91.5%, in the same sample. The accuracy, represented by the area under the SROC curve, was 0.98.
A sensitivity analysis by histological subtype was not possible because individual data for each test were lacking. Only two authors performed an evaluation by histological type. Osako et al. showed an increase of 11.9 in the chance of positive margins in the final pathology (p = 0.01) in patients with invasive lobular carcinoma, larger tumors, or extensive intraductal component (EIC), and who were 50 years old or younger. Jorn et al. claimed that only disease multifocality (histologically discrete tumors at least 2 cm apart) could be a risk factor for increased reoperation rates, with an OR of 3.41 (CI 1.38-8.40, p = 0.008). The article did not associate histological subtype or tumor size with further surgeries. The invasive ductal carcinoma subtype had an OR of 0.75 (CI 0.31-1.82, p = 0.37), the invasive lobular carcinoma subtype had an OR of 2.29 (CI 0.52-9.98, p = 0.37) and larger tumor size (> 2 cm) had an OR of 1.33 (CI 0.26-6.74, p = 0.733).
In two studies, no patients presented local recurrence during an average follow-up of 40 months and 12 months, respectively [15,16] [30].

Methodological quality of included studies
Using the adapted QUADAS-2 tool, the risk of bias was analyzed in each selected study (Fig 5).
Regarding participant selection, studies were considered to present low risk of bias since all studies included only patients with previous breast cancer diagnosis.
In the flow and timing assessment, 18 out of 19 studies were considered as having low risk of bias [15-17, 22-26, 28-37]. Olson et al. (2007) was considered as high risk of bias due to inadequate exclusion [27].
No evidence of publication bias was found (p = 0.32), and the funnel plot showed symmetry (Fig 6).

Discussion
Despite the large variability of negative margin definitions, it is well-known that positive margins in breast-conserving surgeries are associated with increased rates of local recurrence [39]. Reducing reoperation rates is the greatest advantage of intraoperative frozen section margin assessment, which consequently reduces patient anxiety and improves quality of life. Moreover, with the increase in BCS, more favorable cosmetic outcomes are made possible, sometimes preventing mastectomy altogether. This saves money on additional surgeries and hospital stays, and avoids delays in the start of adjuvant treatments. Main limitations relate to technical difficulties of the method, availability of a pathologist in the operating room, increased costs and additional time in the operating room.
Some oncology centers routinely perform intraoperative assessment of the margins with frozen section and/or touch imprint cytology (TIC). A meta-analysis that included 9 studies related to frozen section evaluated the accuracy of different intraoperative techniques for margin assessment and reported a sensitivity of 86% and a specificity of 91%, with 97% heterogeneity, for the frozen section technique [40]. Our sample, which is 50% larger (n = 4,293 exams), showed a slightly lower sensitivity (81%) and a higher specificity (97%) for the frozen section method, but still with high inconsistency (I² = 90.8%). This might be due to the setting in which the included studies were carried out, all in tertiary centers, which probably implies that the pathologists and surgeons are more experienced.
Our meta-analysis is novel in the sense that a methodological quality assessment of studies was included using the QUADAS-2 tool, thus associating the frozen section test to breast-conserving surgery and reoperation rates. Another strength of this study is the use of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy.
This study has some limitations, though, which are intrinsic to the quality of the included studies due to heterogeneity of the available data, including the definition of free margins, no reply to e-mails requesting raw data, and lack of stratified data on true positives, false positives, true negatives and false negatives for DCIS and IDC.
For patients with invasive tumors, a consensus statement (2014) suggested that a positive margin should be defined as "tumor at ink" [41]. Less than 1 mm of histologically normal tissue between the tumor and the resected border can be considered "clear" and therefore does not require re-excision. This consensus also considered this margin equally appropriate for patients with in situ tumors associated with invasive carcinoma, as long as the intraductal component is smaller than 25 percent of the tumor. Since 2013, a trend toward lower reoperation rates has been observed, as described by Yang et al. [42]. Accordingly, in 2016, Morrow et al. showed a decrease of 16% in re-excision rates among surgeons following the consensus [43]. For patients with exclusive ductal carcinoma in situ (DCIS), the National Comprehensive Cancer Network (NCCN) guidelines had previously suggested a margin of ≥ 1 mm, which could increase re-excision rates compared to the definition of negative margin as "no tumor at ink" [44]. In this review, it was not possible to perform separate analyses of IDC and DCIS. Even when studies included both neoplasias, none presented separate accuracies for each. Cabioglu et al. (2007) reported reoperation rates among DCIS twice as high (14%) as among IDC (7%) [45]. This is the core issue of this review and may influence future guidelines, since it could possibly be incorporated into clinical practice.
Analyzing some older studies, we are left with different definitions of free margins [39]. When a larger margin is required to be considered free, this could interfere in true positive, true negative, false positive and false negative rates and, therefore, would also interfere with the accuracy of the technique.
Five studies were left as "awaiting classification", since attempts to contact the corresponding authors by e-mail to obtain their stratified accuracy data received no reply, and the data therefore could not be extracted.
In clinical practice, avoiding readmission and reoperation would decrease hospital expenses; in that sense, Alvarado et al. estimated that frozen section assessments could result in a yearly saving of $3.7 billion, which means less than $20,000/QALY (quality-adjusted life year) and an 89.7% reduction in reoperation rates.
Despite false negative rates of up to 23%, the reoperation rate found is still much lower than expected, and this might be due to the great variability in the interpretation of test results among the studies. Ikeda et al. (1997) opted for radiotherapy for false negative cases based on the patient's opinion and the physician's advice [26]. Kim et al. defined positive margins as > 1 mm; however, they did not reoperate on false positive cases because cancer cells were not in the margin itself [16]. In the Noguchi study, only one patient with a false negative result refused a second operation, because the involvement was histologically minimal [35]. Osako et al. (2015) did not reoperate on 59 out of 60 false negative cases due to minimal residual disease [28].
This review, considering only studies that analyzed local recurrence (LR), found rates ranging from 0 to 7.5% over an average follow-up of 12-62 months. An LR rate of 4.2% has been reported for breast-conserving surgeries overall [46].
In the future, the findings of this meta-analysis will be used as the parameters required for the development of a Markov model to determine whether the implementation of intraoperative frozen section assessments in the Brazilian public health system is a cost-effective intervention. Since studies from different countries were included, this model could easily be adapted to other settings, private or public, in different countries, improving health care services at adequate costs.

Conclusion
Frozen section is a reliable technique for intraoperative margin assessment in breast-conserving surgery with high levels of accuracy, sensitivity and specificity. Due to this high precision for negative results, routine use of this test may aid surgeons in the pursuit of tumor-free surgical margins, therefore reducing reoperation rates.