Optical coherence tomography for glaucoma diagnosis: An evidence based meta-analysis

Purpose Early detection, monitoring and understanding of changes in the retina are central to the diagnosis of glaucomatous optic neuropathy, and vital to reduce visual loss from this progressive condition. The main objective of this investigation was to compare the glaucoma diagnostic accuracy of commercially available optical coherence tomography (OCT) devices (Zeiss Stratus, Zeiss Cirrus, Heidelberg Spectralis, Optovue RTVue, and Topcon 3D-OCT). Patients 16,104 glaucomatous and 11,543 normal eyes reported in 150 studies. Methods Between January 2017 and February 2017, MEDLINE®, EMBASE®, CINAHL®, Cochrane Library®, Web of Science®, and BIOSIS® were searched for studies assessing the glaucoma diagnostic accuracy of the aforementioned OCT devices. Meta-analysis was performed by pooling area under the receiver operating characteristic curve (AUROC) estimates for all devices, stratified by OCT type (RNFL, macula) and area imaged. Results 150 studies with 16,104 glaucomatous and 11,543 normal control eyes were included. Key findings: the AUROC of glaucoma diagnosis for RNFL average for all glaucoma patients was 0.897 (0.887–0.906, n = 16,782 patient eyes), for macula ganglion cell complex (GCC) was 0.885 (0.869–0.901, n = 4841 eyes), for macula ganglion cell inner plexiform layer (GCIPL) was 0.858 (0.835–0.880, n = 4211 eyes), and for total macular thickness was 0.795 (0.754–0.834, n = 1063 eyes). Conclusion The classification capability was similar across all 5 OCT devices. More diagnostically favorable AUROCs were demonstrated in patients with increased glaucoma severity. The diagnostic accuracies of RNFL and segmented macular (GCIPL, GCC) scans were similar, and both were higher than that of total macular thickness. This study provides a synthesis of contemporary evidence with robust inclusion criteria and a large sample size. These findings may provide guidance to clinicians when navigating this rapidly evolving diagnostic area characterized by numerous options.


Introduction
Glaucoma is the leading cause of irreversible blindness worldwide [1]. As the population continues to age, and average life expectancies increase, the prevalence of this debilitating disease will grow. Glaucoma is one of the leading causes of blindness in working-age populations of industrialized nations, and is the second most common cause of permanent vision loss in persons older than 40 years of age, after age-related macular degeneration [2][3][4].
Glaucoma is a multifactorial, chronic optic nerve neuropathy that is characterized by progressive loss of retinal ganglion cells (RGC), which leads to structural damage to the optic nerve head (ONH) and retinal nerve fiber layer (RNFL), and consequent visual field defects [5]. Early diagnosis and treatment of glaucoma have been shown to reduce the rate of disease progression and improve patients' quality of life [6]. The currently accepted gold standards for glaucoma diagnosis are optic disc assessment for structural changes, and achromatic white-on-white perimetry to monitor changes in function [7]. However, imaging technologies such as optical coherence tomography (OCT) are playing an increasing role in glaucoma diagnosis, monitoring of disease progress, and quantification of structural damage [8,9].
OCT is a non-invasive, non-contact imaging modality that provides high-resolution cross-sectional imaging of ocular tissues (retina, optic nerve, and anterior segment). Image acquisition is analogous to ultrasound, with light waves used in lieu of sound waves. Low-coherence infrared light is directed toward the tissue being imaged, from which it scatters at large angles. An interferometer (beam splitter) is used to record the path of scattered photons and create three-dimensional images [10][11][12][13]. OCT is highly reproducible, and is thus widely used as an adjunct in routine glaucoma patient management [14][15][16].
Peripapillary RNFL analysis is the most commonly used scanning protocol for glaucoma diagnosis [14][15][16], as it samples RGCs from the entire retina; however, it does suffer certain drawbacks related to inter-patient variability in ONH morphology [17,18]. To overcome some of these disadvantages, macular thickness has been proposed as a means of glaucoma detection [19]: 50% of RGCs are found in the macula, and RGC bodies are thicker than their axons and are thus potentially easier to detect. The older time-domain (TD) OCT devices, such as Zeiss Stratus, were able to measure only total macular thickness, which had been shown to have poorer glaucoma diagnostic accuracy than RNFL thickness [20][21][22]. Spectral-domain (SD) OCT (Zeiss Cirrus, Heidelberg Spectralis, Optovue RTVue, Topcon 3D-OCT) allows for measurement of specific retinal layers implicated in the pathogenesis of glaucoma, namely: macular nerve fiber layer (mNFL), ganglion cell layer with inner plexiform layer (GCIPL), and ganglion cell complex (GCC) (composed of mNFL and GCIPL). Segmented analysis is purported to have better diagnostic ability for glaucoma than total retinal thickness [23,24], and may be comparable to RNFL thickness [23,25,26].
Currently, several OCT devices are available on the market, each with unique technologies purported to provide better clinical information to the user. The technical features of these various systems have been described elsewhere [27,28]. Reichel et al. also provide images obtained from each of the OCT systems [27]. It is unclear, however, which OCT device should be selected by practitioners when making referral or treatment decisions. The aim of this meta-analysis was to provide pooled estimates for the accuracy and detection capability of the most commonly used OCT imaging devices (Zeiss Cirrus OCT, Zeiss Stratus OCT, Heidelberg Spectralis, Optovue RTVue, Topcon 3D-OCT) for glaucoma diagnosis and classification between patients and healthy individuals.

Overview of review methods
The main objective of this investigation was to compare the glaucoma diagnostic accuracy for each of the OCT devices commercially available, namely Zeiss Stratus, Zeiss Cirrus, Heidelberg Spectralis, Optovue RTVue and Topcon 3D-OCT. We compared diagnostic accuracies of RNFL and macular parameters obtained by these imaging devices. This review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement methodology [29]. A PRISMA flow diagram is used to illustrate the flow of records throughout this review (Fig 1).

Data sources and search strategy
The search strategy for this investigation was comprehensive, aiming to retrieve the largest possible number of relevant studies. An electronic search strategy was developed through consultation with an experienced ophthalmologist specializing in glaucoma management. The search end date was February 2017. There was no specified search start date. Any study providing information on area under receiver operating characteristic curve, sensitivity, specificity, negative predictive value, positive predictive value, likelihood ratio, or diagnostic odds ratio was included. Published and unpublished studies were considered.
The following bibliographic databases were searched: MEDLINE® (Ovid MEDLINE(R) Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Ovid MEDLINE(R) Daily, Ovid MEDLINE and Versions(R)), EMBASE® (Embase Classic+Embase), CINAHL®, Cochrane Library® (Wiley Library), Web of Science®, and BIOSIS®. Specific keywords used in the search included terms for glaucoma, optical coherence tomography, imaging device manufacturer (i.e., Zeiss, Heidelberg, RTVue, Topcon), and diagnostic testing, including terms for diagnostic evaluative tests (i.e., area under receiver operating characteristic curve, etc.). Search strategies for each of the devices are available in S1

Inclusion and exclusion criteria
All studies that assessed the diagnostic accuracy of OCT for detection of glaucoma were considered for inclusion in our review. As the goal of this investigation was to maximize generalizability and applicability to clinical practice, a broad gold standard was accepted for inclusion, i.e., white-on-white automated perimetry, optic disc appearance (clinically or by photograph), or a combination thereof. Accepting a wider gold standard more accurately reflects the reality of clinical practice, and allowed for inclusion of a larger number of articles, improving the robustness of the quantitative meta-analysis. Only human clinical studies published in the English language were accepted. Patients were 18 years of age or greater. No exclusions were made for patient ethnicity or the country where the study was conducted. Included studies assessed at least one of five devices, namely Stratus OCT (Carl Zeiss Meditec, Jena, Germany), Cirrus OCT (Carl Zeiss Meditec), Spectralis OCT (Heidelberg Engineering Inc., Heidelberg, Germany), RTVue (Optovue Inc., Fremont, United States), and 3D-OCT (Topcon, Tokyo, Japan). These devices were included as they represent the newest or most widely utilized OCT devices available for glaucoma diagnosis at the time of this review. Studies of both RNFL and macular areas for glaucoma diagnosis were included.
During full-text screening, articles were included if they reported area under receiver operating characteristic curve (AUROC) statistics. Manuscripts that did not report a standard error or confidence interval for the AUROC were excluded. Other exclusions were: duplicate manuscripts, non-diagnostic studies, studies of pediatric patients, studies without control participants, and investigations of OCT devices other than those previously specified.

Study selection
All studies included for consideration underwent two levels of screening by two independent reviewers. All records were uploaded to an online interface (Covidence, Veritas Health Innovation, Melbourne, Australia) to coordinate and support the screening process. First, a broad screen of titles, keywords and abstracts (Level 1) was performed. At this stage, studies were tagged as either "Relevant", "Irrelevant" or "Maybe Relevant". For all relevant studies, full-text screening was performed (Level 2) using the stricter a priori inclusion criteria detailed previously.
After each level of screening, disagreements between article screeners were resolved through consultation with the primary author. Reasons for exclusion were documented and are reported in the review. The PRISMA flow chart of studies during screening is illustrated in Fig 1.

Data extraction and quality assessment
An electronic data extraction form specific to this meta-analysis was developed a priori. Data collected included study identification information (title, authors, journal and year of publication), study methodology (design, inclusion/exclusion criteria, gold standard type), patient variables (number of patients/controls, glaucoma diagnosis, age, gender), OCT device used, area imaged (RNFL, macula subtype), and AUROC (with SE/CI).
The Quality Assessment of Diagnostic Accuracy Studies, version 2 (QUADAS-2) tool [30] was used to assess the risk of bias and applicability concerns of all manuscripts included in this review. This assessment tool comprises four key domains: 1) patient selection, 2) index test, 3) reference standard, and 4) flow of patients through the study and timing between index test and reference standard. Each domain was assessed in terms of risk of bias. The first three domains were also assessed for their applicability to the research question addressed by the review. Results of QUADAS-2 are summarized in Fig 2.

Data synthesis and statistical analysis
All statistical analyses were performed using MedCalc (Version 17.2, MedCalc Software, Ostend, Belgium). Meta-analysis of the AUROC was selected instead of other measures such as sensitivity and specificity. The AUROC is a commonly used metric for the diagnostic accuracy of medical tests. It was also reported more consistently in the included studies than other measures: some studies reported a combination of parameters, whereas others reported sensitivity values at particular specificity cut-offs, which in turn were not consistent across studies. The AUROC reflects both the sensitivity and specificity of a diagnostic test, can be compared across studies, and can be combined between similar studies when measures of uncertainty (standard error (SE) or confidence interval (CI)) are provided [31].
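Where a study reports a 95% CI rather than an SE for its AUROC, the two measures of uncertainty can be interconverted before pooling. The following is a minimal sketch of that conversion, assuming a symmetric normal-approximation (Wald-type) interval on the AUROC scale. The function names are illustrative, and the source does not state how (or whether) this conversion was carried out in MedCalc.

```python
Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper, z=Z_95):
    """Approximate the SE of an AUROC from a symmetric (Wald-type) CI.

    Assumption: the interval was built as estimate +/- z * SE on the
    AUROC scale; asymmetric intervals would need a different treatment.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return (upper - lower) / (2.0 * z)


def ci_from_se(estimate, se, z=Z_95):
    """Rebuild the symmetric CI implied by an estimate and its SE."""
    return estimate - z * se, estimate + z * se


# Example using the pooled RNFL-average interval quoted in the abstract.
se_rnfl = se_from_ci(0.887, 0.906)
```

Because the conversion assumes symmetry, a reported interval that is clearly asymmetric around its point estimate is a sign that this approximation does not apply to that study.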
The main outcome of this study was the pooled AUROC for each of the following groups: all glaucoma patients, perimetric glaucoma, pre-perimetric glaucoma, mild glaucoma, moderate to severe glaucoma, and myopic glaucoma. As there is currently no international consensus on the definition of glaucoma severity, there was heterogeneity in the way that each study defined its patient groups. For consistency, we defined each group as follows: 1) Perimetric glaucoma: glaucoma based on abnormal visual field measurements; 2) Pre-perimetric glaucoma: glaucoma diagnosed based on optic disc appearance, with normal visual field measurements; 3) Mild glaucoma: perimetric glaucoma with a mean deviation > -6.00 dB as per the Hodapp-Parrish-Anderson criteria [32] (patients with normal visual fields were not included in this group); 4) Moderate to severe glaucoma: perimetric glaucoma with a mean deviation < -6.00 dB [32]; 5) Myopic glaucoma: any definition of myopia used by the study authors, which could include a dioptric definition (e.g., spherical equivalent < -6.0) or an axial length definition (AL > 25 mm).
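The severity groupings above amount to a simple decision rule, which can be sketched as follows. This is an illustration only: the function name and arguments are hypothetical, the eye is assumed to already carry a glaucoma diagnosis, and a mean deviation of exactly -6.00 dB (which the definitions above leave unassigned) is placed in the moderate to severe group as an assumption.

```python
MD_THRESHOLD_DB = -6.00  # mean deviation cut-off used in this review


def classify_glaucoma_group(mean_deviation_db, visual_field_abnormal):
    """Assign a glaucomatous eye to one of the review's severity groups.

    Assumptions (not stated in the source): the eye is already diagnosed
    with glaucoma, and MD == -6.00 dB falls in 'moderate to severe'.
    """
    if not visual_field_abnormal:
        # Diagnosed on optic disc appearance with a normal visual field.
        return "pre-perimetric"
    if mean_deviation_db > MD_THRESHOLD_DB:
        return "mild"
    return "moderate to severe"
```

Myopic glaucoma is not handled here because it is an overlapping group defined by refraction or axial length rather than by visual field severity.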
Individual measures of AUROC from each study were pooled into a weighted summary AUROC for each group using the methods described in Zhou et al. [31]. Heterogeneity among included studies was tested by computing the I², Z-value and χ² statistics. An I² value of less than 50% implies low heterogeneity and supports the use of a fixed-effect meta-analysis model. A value of greater than or equal to 50% implies high heterogeneity and supports the use of a random-effects model. Additionally, a high Z-value, a low p-value (<0.01) and a large χ² value imply significant heterogeneity and support the use of a random-effects model using DerSimonian and Laird methods. Forest plots were generated to visualize results. Publication bias was assessed through evaluation of funnel plots of included studies for each pooled AUROC.
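The model-selection rule described above can be illustrated with a generic inverse-variance pooling routine. This is a minimal sketch, not a reproduction of MedCalc's implementation or of the exact method in Zhou et al.: it assumes each study contributes an AUROC and its SE, computes Cochran's Q and I², and switches from a fixed-effect to a DerSimonian-Laird random-effects estimate when I² is at least 50%. The function name and the returned dictionary keys are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def pool_auroc(aurocs, ses, i2_threshold=50.0):
    """Pool study AUROCs by inverse-variance weighting.

    Uses a fixed-effect model when I^2 < i2_threshold, otherwise a
    DerSimonian-Laird random-effects model.
    """
    if len(aurocs) != len(ses) or len(aurocs) < 2:
        raise ValueError("need at least two studies with matching AUROC/SE lists")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Fixed-effect (inverse-variance) estimate.
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * a for w, a in zip(weights, aurocs)) / sum_w

    # Cochran's Q and the I^2 heterogeneity statistic (in percent).
    q = sum(w * (a - fixed) ** 2 for w, a in zip(weights, aurocs))
    df = len(aurocs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    if i2 < i2_threshold:
        model, pooled, se, tau2 = "fixed", fixed, math.sqrt(1.0 / sum_w), 0.0
    else:
        # DerSimonian-Laird between-study variance estimate.
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c)
        re_weights = [1.0 / (s ** 2 + tau2) for s in ses]
        sum_re = sum(re_weights)
        pooled = sum(w * a for w, a in zip(re_weights, aurocs)) / sum_re
        se = math.sqrt(1.0 / sum_re)
        model = "random"

    return {
        "model": model,
        "pooled": pooled,
        "se": se,
        "ci": (pooled - Z_95 * se, pooled + Z_95 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }
```

For example, `pool_auroc([0.70, 0.95], [0.02, 0.02])` yields a large I² and therefore takes the random-effects branch, whereas three identical estimates take the fixed-effect branch. Pooling on the raw AUROC scale means the normal-approximation interval is not constrained to lie within [0, 1].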

Search results and study characteristics
Study flow is summarized in Fig 1. After removal of duplicates, 1301 records underwent title and abstract (Level 1) screening. 825 were excluded as irrelevant. The remaining 477 records underwent full-text screening (Level 2). Of these, 327 articles were excluded as they did not meet the study inclusion criteria or the manuscript could not be obtained. At the end of screening, 150 articles were included for meta-analysis [21][22][23][24].

Study quality
A summary of the methodological quality assessment for included studies is provided in Fig 2. Overall methodological quality of all included studies was strong in terms of risk of bias and applicability to the research question. Of note, there was an unclear risk of bias in patient selection for 39.3% of studies. This was largely due to inadequate reporting of patient selection methods in these manuscripts; thus, the risk of bias could not be ascertained.

Evaluation of publication bias
Funnel plots were constructed to evaluate publication bias in the meta-analysis. Several funnel plots were created, one for each imaging parameter (average, superior, inferior, etc.), of each area (RNFL, macula), for each OCT device, within each patient subgroup. No pattern was evident, i.e., no one patient group, OCT device, or scan type/parameter was noted to be more likely to have publication bias.
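A funnel plot places each study's AUROC against its SE, with pseudo 95% limits that widen as the SE grows. The review relied on visual inspection of these plots; the sketch below is only a numeric illustration of how the limits are formed and which studies would fall outside them. The helper names are hypothetical and the pooled value is assumed to be supplied by the caller.

```python
Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def funnel_limits(pooled, se, z=Z_95):
    """Pseudo 95% confidence limits of the funnel at a given SE."""
    return pooled - z * se, pooled + z * se


def studies_outside_funnel(aurocs, ses, pooled, z=Z_95):
    """Return the indices of studies lying outside the funnel limits."""
    outside = []
    for index, (auroc, se) in enumerate(zip(aurocs, ses)):
        lower, upper = funnel_limits(pooled, se, z)
        if auroc < lower or auroc > upper:
            outside.append(index)
    return outside
```

A cluster of outlying studies on only one side of the pooled estimate, particularly among small (high-SE) studies, is the asymmetry pattern that a visual funnel plot assessment looks for.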

Discussion
This meta-analysis demonstrates that OCT is a valuable adjunctive tool to aid in glaucoma diagnosis. Pooled estimates of diagnostic accuracy (AUROC) for the most commonly used OCT instruments (Zeiss Cirrus OCT, Zeiss Stratus OCT, Heidelberg Spectralis, Optovue RTVue, Topcon 3D-OCT) were determined based upon their ability to differentiate between normal participants and glaucoma patients. A summary of the technical features of each device is outlined in Table 8.
The 150 studies included reported the diagnostic capability of several RNFL and macular parameters. Macular scans were further subdivided by retinal segmentation (GCC, GCIPL, mNFL or total retinal thickness). The AUROCs for average, superior and inferior RNFL parameters were larger than for nasal and temporal areas, a finding that was consistent for the overall patient group as well as the glaucoma subgroups. This finding is explained by the work of Traynis et al. (2014), who proposed a schematic of glaucomatous damage to the macula. Retinal ganglion cells (RGCs) in the regions of the macula most vulnerable to glaucomatous damage (inferior macula and region outside of the central 8 degrees of macula) project to the inferior and superior quadrants of the optic disc, whereas RGCs in the less vulnerable regions (superior macula) project to the temporal region of the disc [179].
By comparison, in the macular GCIPL scans, we found that the inferonasal and superonasal parameters had poorer diagnostic efficacy than the average, superior, inferior, and temporal (infero- and superotemporal) parameters. These differences between parameters were not found in the macular GCC scans. Comparing between different scan types, RNFL thickness, macular GCIPL and macular GCC had similar diagnostic capability to differentiate between normal and glaucomatous eyes. Total macular thickness had a lower AUROC for glaucoma diagnosis than these more specific scan types. Through stratification of patients by disease severity for sub-analysis, we also note that the diagnostic capability of OCT improves with increased disease severity.
One major question we wished to address through this review was whether there were instrument-dependent differences in the diagnostic ability of OCT. It appears that for the majority of subgroups, there are no notable differences between devices.

Comparison with other reviews
Previous reviews on the diagnostic capability of OCT for glaucoma have been published [14][181][182][183][184][185]. The present review has some unique advantages over previous reports. First, as mentioned previously, a wide gold standard was accepted for inclusion, i.e., white-on-white standard automated perimetry, optic disc appearance, elevated IOP, or any combination thereof. This wide gold standard more accurately reflects true clinical practice, where patients undergoing OCT to aid in glaucoma diagnosis may have undergone many of these other diagnostic modalities previously. The majority of previous reviews have limited inclusion criteria to those patients who have exclusively undergone standard automated perimetry. Our approach enabled the inclusion of 150 OCT studies, markedly more than previous meta-analyses: a Cochrane review by Michelessi et al. [181] identified 63 OCT studies, Fallon et al. [185] identified 47 studies, Ahmed et al. [182] identified 84 studies, and Oddone et al. [183] identified 34 studies. The larger number of studies included enabled a more robust meta-analysis and the analysis of several patient subgroups.
Importantly, this meta-analysis provides pooled estimates of AUROCs, rather than sensitivity and specificity, as used in previous reviews. Only one other OCT review identified, by Chen et al. [184], reported pooled AUROCs; however, that review was limited to only 21 studies of Zeiss Stratus OCT. Reporting of AUROC is advantageous when describing the utility of a diagnostic test, as it represents the diagnostic capability of the test regardless of the specific cutoff used. We found that individual studies were inconsistent in their reporting of sensitivity and specificity, with certain studies reporting sensitivities at particular specificity cutoffs, and others reporting the "optimal" sensitivity/specificity cutoff. Meta-analysis of such inconsistent data is difficult.

Limitations
One limitation of this study was the relatively large number of case-control studies that were captured by the inclusion criteria. The case-control design has been suggested to overestimate accuracy [186]. As the main purpose was to compare the diagnostic performance of the most common currently used OCT devices and none were found to be superior, this limitation is unlikely to have introduced any significant bias. Another limitation may have resulted from choosing to compare a number of macular parameters. Unlike RNFL scans, studies were quite heterogeneous in terms of which macular parameters were reported, i.e., some reported GCIPL, GCC, mNFL, and total thickness. As such, these scan types had to be separated for meta-analysis, reducing sample sizes and consequently increasing the instability of AUROC estimates. Importantly, all studies included in the meta-analysis evaluated the ability to differentiate healthy controls from confirmed glaucoma patients, which does not reflect real clinical practice, where many patients are undifferentiated.

Conclusion
The currently available OCT devices (Zeiss Cirrus, Zeiss Stratus, Heidelberg Spectralis, Optovue RTVue, Topcon 3D-OCT) demonstrated good diagnostic accuracy in their ability to differentiate glaucoma patients from normal controls. This ability increased with the severity of the glaucoma. There were no major device-related differences in diagnostic capacity. Within RNFL scans, the nasal and temporal parameters have poorer diagnostic accuracy than the average, superior and inferior parameters. Across all macular GCIPL scans, the nasal (supero- and infero-nasal) parameters had lower AUROCs than the average, superior, inferior and temporal regions. The diagnostic capacity of RNFL is similar to that of segmented macular regions (GCIPL, GCC), and better than that of total macular thickness. As OCT technology continues to evolve at a faster pace than functional assessments of optic nerve health, future studies will be needed to fully understand its role in glaucoma management.

Perimetric glaucoma: pooled AUROCs (random-effects meta-analysis was used if I² > 50%; fixed-effects was used if I² < 50%). Columns: test parameter, location and OCT device; number of studies; pooled sample size; pooled AUROC; 95% CI.
AUROCs of OCT for glaucoma diagnosis in myopic patients are summarized in Table 7, and illustrated in Fig 8

Table 6. Pooled AUROCs of RNFL and macular OCT parameters for moderate to severe glaucoma patients. Random-effects meta-analysis was used if I² > 50%; fixed-effects was used if I² < 50%. Columns: test parameter, location and OCT device; number of studies; pooled sample size; pooled AUROC; 95% CI. Panels: RNFL; macula GCC. https://doi.org/10.1371/journal.pone.0190621.t006
Fig 7. Forest plot of diagnostic accuracies of RNFL and macular OCT parameters, moderate to severe glaucoma. https://doi.org/10.1371/journal.pone.0190621.g007

Table 7. Pooled AUROCs of RNFL and macular OCT parameters for myopic patients. Random-effects meta-analysis was used if I² > 50%; fixed-effects was used if I² < 50%. Columns: test parameter, location and OCT device; number of studies; pooled sample size; pooled AUROC; 95% CI. Panels: RNFL; macula GCC.