
The diagnostic performance of cochlear endolymphatic hydrops and perilymphatic enhancement in stratifying Ménière’s disease probabilities: A meta-analysis of semi-quantitative MRI-based grading systems

  • Neda Azarpey,

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Department of Radiology, Shahid Beheshti University, Tehran, Iran

  • Shahrzad-Sadat Seyed-Bagher-Nazeri,

    Roles Conceptualization, Methodology, Software, Validation, Writing – review & editing

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Omid Yazdani,

    Roles Methodology, Software, Validation

    Affiliation Department of Radiology, Shahid Beheshti University, Tehran, Iran

  • Romina Esbati,

    Roles Investigation, Methodology, Project administration, Software

    Affiliation Department of Radiology, Shahid Beheshti University, Tehran, Iran

  • Paria Boustani,

    Roles Formal analysis, Methodology, Software, Writing – original draft

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Mobasher Hajiabbasi,

    Roles Formal analysis, Methodology, Project administration

    Affiliation Faculty of Medicine, Islamic Azad University of Tonekabon, Tonekabon, Iran

  • Pouya Torabi,

    Roles Data curation, Investigation, Methodology, Visualization

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Dorreh Farazandeh,

    Roles Writing – original draft

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Hana Farzaneh,

    Roles Writing – original draft

    Affiliation Department of Radiology, Iran University of Medical Sciences, Tehran, Iran

  • Ashkan Azizi,

    Roles Data curation, Formal analysis, Software, Writing – original draft

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Behnam Amini,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – review & editing

    Behnamamini717@gmail.com

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Moein Ghasemi,

    Roles Writing – review & editing

    Affiliation Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran

  • Zohre Ghasemi

    Roles Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Department of Otorhinolaryngology-Head and Neck Surgery, Imam Khomeini Hospital Complex, Tehran University of Medical Sciences, Tehran, Iran

Abstract

Background

The diagnosis of Meniere’s Disease (MD) presents significant challenges due to its complex symptomatology and the absence of definitive biomarkers. Advancements in MRI technology have spotlighted endolymphatic hydrops (EH) as a key pathological marker, necessitating a reevaluation of its diagnostic utility amidst the need for standardized and validated MRI-based grading scales.

Methods

Our meta-analysis scrutinized the diagnostic efficacy of semi-quantitative MRI-based cochlear endolymphatic hydrops (EH) and perilymphatic enhancement (PLE) grading systems in delineating clinically relevant discriminations: “Spotting” the shift from normal or asymptomatic ears to possible/probable MD (pMD), “Confirming” the progression to definite MD (dMD), and “Establishing” the presence of dMD. A thorough literature search up to October 2023 resulted in 35 pertinent studies, forming the basis of our analysis through a bivariate mixed-effects regression model.

Results

Using criteria from the American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) and the Barany Society as reference standards, across varying thresholds and disease probabilities, the Establishment model at an EH grade 1 threshold showed a sensitivity of 85.4% and a specificity of 82.7%. Adjusting the threshold to EH grade 2 increased sensitivity to 92.1% (CI: 85.9–95.7) and decreased specificity to 70.6% (CI: 64.5–76.1), with a diagnostic odds ratio (DOR) of 28.056 (CI: 14.917–52.770). The Confirmation model yielded a DOR of 5.216, indicating lower diagnostic accuracy. The Spotting model showed a sensitivity of 48.3% (CI: 34.8–62.1) and a specificity of 88.0% (CI: 77.8–93.9), with a DOR of 6.882. The normal ears subgroup showed a notably high specificity of 89.7%, while Nakashima’s criteria yielded a reduced sensitivity of 74.9%, significantly different from other systems (p-value < 0.001). The PLE grading system showed a very high sensitivity of 98.4% (CI: 93.7–99.6, p-value < 0.001).

Conclusion

Our meta-analysis supports a tailored diagnostic approach for MD, emphasizing the need for effective grading systems at each stage. For "Spotting," the model shows high specificity but requires improved sensitivity, suggesting additional criteria are needed. The "Confirming" stage highlights the need for refined, sensitive grading systems due to lower diagnostic accuracy. In the "Establishing" stage, an EH grade 1 threshold is effective, but grade 2 enhances sensitivity while reducing specificity, indicating a need for balance. The PLE grading system excels in sensitivity, making it highly reliable. High specificity in the normal ears subgroup confirms accurate non-pathological distinction, though Nakashima’s criteria show reduced sensitivity, underscoring variability in grading systems. These findings advocate for a standardized, unified grading system balancing sensitivity and specificity across all MD stages to optimize diagnostics and clinical outcomes.

Introduction

Ménière’s Disease (MD) poses a significant challenge within otolaryngology and neurology, characterized by its indeterminate symptomatology and the absence of concrete biomarkers for diagnosis [1, 2]. The reliance on clinical criteria and patient-reported symptoms further complicates the diagnostic process, especially in the disease’s nascent stages [3, 4]. The etiology of MD remains unclear and is a topic of significant interest, with endolymphatic hydrops (EH), an aberrant accumulation of fluid within the inner ear’s endolymphatic spaces, posited as a pivotal factor in its pathogenesis [3–5]. This association has catalyzed advancements in diagnostic methodologies, notably through delayed post-gadolinium MRI techniques, narrowing the gap between subjective clinical assessments and objective diagnostic indicators [3, 4, 6].

The introduction of gadolinium contrast in MRI scans has been a pivotal development for MD diagnostics, enabling the demarcation of endolymphatic spaces as contrast defects and facilitating a correlation between the degree of EH and the clinical manifestations of MD [3, 4, 6]. Despite this progress, the diagnosis of MD is hindered by the variability of grading systems, which range from qualitative to semi-quantitative and volumetric scales. This diagnostic conundrum is further exacerbated by the incidental discovery of EH in asymptomatic individuals through MRI, which obscures the distinction between disease severity and physiological variance [3, 4, 6].

The development and implementation of standardized grading scales for MRI-based diagnostics of EH in MD patients remain a formidable challenge. The inconsistency in grading systems is a significant hurdle to achieving consensus on diagnostic criteria, complicating the interpretation and comparability of research findings [6]. The literature has seen various attempts [7–12] to consolidate MRI-based evaluations of EH in MD, with the meta-analysis by Connor et al. [12] standing out as a seminal work that examined diagnostic performance quantitatively. However, that study’s amalgamation of patient-based and ear-based EH measurements, its unilateral application of the Barany criteria as the definitive standard, and its inclusion of ears with alternate audio-vestibular disorders in the control group introduce potential biases that may skew the interpretation of EH’s role in MD.

Our study seeks to overcome these limitations through a methodologically robust approach that prioritizes categorization of disease probabilities, stringent selection criteria, and a discerning adoption of diagnostic standards. Our analysis examines the diagnostic accuracy of various EH grading systems, aiming at “Spotting” the shift from normal or asymptomatic ears to probable MD (pMD), “Confirming” the progression to definite MD (dMD), and “Establishing” the presence of dMD. By evaluating the efficacy of semi-quantitative grading systems for cochlear EH and perilymphatic enhancement (PLE), our study provides new insights into the diagnostic utility of these scales in refining the diagnostic approach for MD [1, 2].

Methods and materials

Search strategy

A comprehensive literature search was conducted across PubMed, Scopus, Web of Science, and the Cochrane Library to identify studies relevant to the diagnostic accuracy of endolymphatic hydrops (EH) grading in Meniere’s disease. This search, crafted in collaboration with a medical librarian, was peer-reviewed using the Peer Review of Electronic Search Strategies (PRESS) guidelines. The strategy combined keywords and Medical Subject Headings (MeSH) terms, including "Meniere", "cochlear Disease", "endolymphatic hydrops", "Imaging", "Magnetic Resonance", "MRI", "FLAIR", and "Three-Dimensional Imaging", using Boolean operators to ensure breadth and depth. The search was limited to articles published in English up to October 10, 2023, to capture the most current research within the constraints of available translation resources. In selecting studies, a phased approach was employed, focusing on reports detailing diagnostic metrics of cochlear EH grading, as well as perilymphatic enhancement (PLE) evaluation in Meniere’s disease. The PICOS framework guided the selection, encompassing participants diagnosed with MD per established criteria, MRI acquisition methodologies, control groups, outcomes related to EH visualization, and study designs including randomized case-control, prospective and retrospective cohort, and cross-sectional studies. Exclusions were made for reviews, editorials, insufficient data, and animal studies, along with studies with non-discriminatory EH reporting or focusing solely on audio-vestibular symptoms in non-control ears. The screening process began with titles and abstracts, advancing eligible studies to full-text review in adherence to PRISMA-Diagnostic Test Accuracy (PRISMA-DTA) guidelines [13]. This process was documented in a PRISMA flow diagram. Two independent reviewers (N.A./B.A.) conducted the screenings. Disagreements were resolved through discussion or consultation with a third reviewer in medical statistics, ensuring a thorough and unbiased study selection.

Data extraction

Data extraction for this study was carried out using a pre-designed Excel spreadsheet. The parameters included first author, publication year, patient inclusion criteria, sample size, methods of gadolinium-based contrast administration (either intratympanic or intravenous), levels of Meniere’s disease diagnostic certainty, locations of endolymphatic hydrops (EH), the status of control ears, and diagnostic test results. The extraction process involved manual entry by two independent reviewers into a shared database, minimizing transcription errors. Discrepancies encountered during data extraction were resolved through a reconciliation process, which included revisiting the original articles and, if necessary, consulting a third reviewer.

Quality assessment

The quality of the included studies was rigorously assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool [14]. This tool provides an in-depth evaluation across four domains: patient selection, index test, reference standard, and flow and timing. Each domain was evaluated independently by the reviewers, with the risk of bias rated as ’low’, ’high’, or ’unclear’. These ratings were based on predefined criteria aligned with QUADAS-2 guidelines. Additionally, the first three domains were also appraised for their relevance and applicability to the research question.

Data synthesis and analysis

A bivariate mixed-effects regression model was utilized to jointly model sensitivity and specificity, taking into account the heterogeneity observed between studies. The criteria for pooling studies in the meta-analysis were stringently defined, requiring at least three studies examining the same index test and diagnosis and congruence in population characteristics, test applications, and methodologies, as assessed through clinical judgment. To provide a comprehensive assessment of heterogeneity, various statistical measures were employed. These included the variances of logit-transformed sensitivity and specificity, bivariate I², and the area of the 95% prediction ellipse. We also performed subgroup analyses and meta-regression to explore potential sources of variability among the studies. All statistical analyses were carried out using R (version 4.3.0; R Foundation for Statistical Computing, Vienna, Austria), incorporating specialized packages such as lmtest for likelihood ratio tests, lme4 for mixed-effects models, and msm for standard error calculations. Notably, a formal assessment of publication bias was not included in this meta-analysis. This decision was made after considering the complexities and potential low statistical power associated with assessing publication bias in diagnostic accuracy reviews [15, 16].
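As a rough illustration of why the model works with logit-transformed sensitivity and specificity, the sketch below back-transforms a logit-scale estimate and its confidence interval to the probability scale. It is a univariate simplification written in Python, not the bivariate model fitted in R; the function names and the implied standard error are assumptions introduced here, and only the sensitivity figures (85.4%, CI: 78.5–90.3) come from the Establishment model at EH grade 1.

```python
import math

def logit(p):
    """Map a proportion in (0, 1) to the log-odds scale."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def back_transformed_ci(logit_est, se, z=1.96):
    """Symmetric interval on the logit scale, returned on the probability scale."""
    return inv_logit(logit_est - z * se), inv_logit(logit_est + z * se)

# Reported pooled sensitivity and 95% CI (Establishment model, EH grade 1).
est, lower, upper = 0.854, 0.785, 0.903

# Hypothetical step: infer the logit-scale standard error from the reported
# interval width, assuming the interval was built symmetrically in logit space.
se_implied = (logit(upper) - logit(lower)) / (2 * 1.96)

ci = back_transformed_ci(logit(est), se_implied)
print(f"implied SE on logit scale: {se_implied:.3f}")
print(f"back-transformed CI: {ci[0]:.3f} to {ci[1]:.3f}")
```

The reconstructed interval lands close to the reported one, which is what one would expect if the reported bounds were obtained on the logit scale; the interval is asymmetric around 85.4% on the probability scale for the same reason.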

Configurations of the generative pre-trained transformer

This research employed customized configurations of GPT-4, built through prompt structuring and schema modifications, to support specific tasks: refining the R scripts used for statistical analysis, validating data extracted from the included studies against their primary sources, designing comprehensive literature searches, and improving the language of the manuscript.

Results

Study selection process

Following PRISMA guidelines, our systematic search across databases yielded 1,793 records, with 699 duplicates removed during screening (Fig 1). A total of 1,094 records were scrutinized for relevance, leading to 931 exclusions. The remaining 163 articles underwent full-text review, resulting in 127 further exclusions due to non-conformity with our inclusion criteria, leaving 35 studies for the final quantitative synthesis (Table 1). In this study, no missing data were encountered. This outcome is attributed to the strict inclusion and exclusion criteria applied during the study selection process, which ensured that only studies with complete and relevant data were included in the final analysis.

Fig 1. PRISMA flowchart of study selection process.

This flowchart illustrates the systematic process of study selection, adhering to the PRISMA guidelines.

https://doi.org/10.1371/journal.pone.0310045.g001

Table 1. An overview of hydrops imaging in MD across diverse MRI grading systems.

https://doi.org/10.1371/journal.pone.0310045.t001

Risk of bias and applicability concerns

In the QUADAS-2 evaluation of the included studies (S4 File), patient selection consistently presented a high risk of bias due to the reliance on case-control designs, which inherently limit population diversity and introduce spectrum bias. The absence of a definitive gold standard for MD diagnosis further compounded this issue, resulting in a high-risk assignment for the reference standard across all studies. However, studies that employed comprehensive clinical diagnostic criteria were assigned a low risk of bias, as this approach ensures a representative patient population. The conduct and interpretation of diagnostic tests were also marked by high risk due to the challenges in blinding within MD cohorts, which heightened the potential for observer bias. Conversely, the use of predefined grading systems in all included studies supported a low-risk assessment in this domain, as standardization minimizes subjective interpretation. Applicability concerns were deemed low, given the consistent use of clinically validated assessment protocols, ensuring that the findings are relevant and generalizable to clinical practice.

Evaluative synopsis of MRI-based cochlear hydrops grading systems

In an analytical review of MRI-based cochlear hydrops classification, the gradated Nakashima and Baráth systems were contrasted with the binary classifications of PLE and Kahn (Table 2). The Nakashima system, focusing on Reissner’s membrane displacement and the scala vestibuli’s spatial dynamics, alongside Baráth’s approach to perilymphatic space dilation, reveals graded hydrops severity in MD staging. Studies utilizing the Baráth system often report fewer cases of Grade 2 hydrops compared to the Nakashima system, which may reflect the stricter criteria for severe hydrops classification in the former. Binary systems of PLE and Kahn focus on the presence of hydrops without grading severity. The aggregated data visualized in Fig 2 further solidifies these interpretations, with the clustered representation of hydrops severity offering a visual corroboration of the textual data. The distribution of hydrops across the ear categories shows a pronounced skew towards severe hydrops in dMD ears.

This figure displays a histogram comparing cochlear hydrops severity grading across MRI systems. The upper portion shows Nakashima and Baráth systems’ grading (Grade I for mild, Grade II for severe hydrops) alongside patient categorizations. The lower part contrasts binary classifications (normal vs. hydrops presence) by PLE and Kahn systems, including data for non-graded studies, across patient categories (asymptomatic, healthy controls, definite, and probable MD).

Quantitative synthesis of MRI-based EH grading in MD diagnosis

The analysis evaluates the diagnostic performance of MRI-based EH grading, using AAO-HNS and Barany criteria as reference standards (Table 3). At the EH grade 1 cutoff (S1 Fig in S1 File), the Establishment model, which distinguishes dMD from control groups (including normal and asymptomatic subgroups), achieved a sensitivity of 85.4% (CI: 78.5–90.3) and specificity of 82.7% (CI: 78.8–86.0), with a DOR of 27.888 (CI: 16.454–47.268) and a correlation coefficient of -0.076. At the EH grade 2 cutoff (S2 Fig in S1 File), sensitivity increased to 92.1% (CI: 85.9–95.7) with reduced specificity of 70.6% (CI: 64.5–76.1) and a DOR of 28.056 (CI: 14.917–52.770). The Confirmation model (distinguishing dMD from pMD, S3 Fig in S1 File) showed lower diagnostic accuracy, with a DOR of 5.216. For the Spotting model (distinguishing pMD from control, S4 Fig in S1 File), the sensitivity was 48.3% (CI: 34.8–62.1) and specificity 88.0% (CI: 77.8–93.9), with a DOR of 6.882 (CI: 2.725–17.382).
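The relationship between the reported sensitivity, specificity, and DOR can be checked with a short calculation. The Python sketch below uses the standard definition of the diagnostic odds ratio, the product of the odds of sensitivity and the odds of specificity; the function name and the 2% tolerance are illustrative assumptions, and small differences from the reported values are expected because the inputs are rounded percentages rather than model estimates.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = [sens / (1 - sens)] * [spec / (1 - spec)]."""
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))

# (sensitivity, specificity, reported DOR) as given in the text above.
reported = {
    "Establishment, EH grade 1": (0.854, 0.827, 27.888),
    "Establishment, EH grade 2": (0.921, 0.706, 28.056),
    "Spotting": (0.483, 0.880, 6.882),
}

for model, (sens, spec, dor_reported) in reported.items():
    dor = diagnostic_odds_ratio(sens, spec)
    rel_diff = abs(dor - dor_reported) / dor_reported
    print(f"{model}: recomputed DOR = {dor:.2f}, reported = {dor_reported}, "
          f"relative difference = {rel_diff:.1%}")
```

The calculation also makes the threshold trade-off concrete: the grade 2 cutoff gains sensitivity and loses specificity, yet the two odds multiply to nearly the same DOR as at grade 1.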

Table 3. MRI grading systems’ diagnostic accuracy for MD.

https://doi.org/10.1371/journal.pone.0310045.t003

Heterogeneity analysis

Heterogeneity was moderate to substantial, as indicated by bivariate I² values. Predictive uncertainties, shown by prediction ellipse areas (Table 3), were smallest in the Establishment model (0.272 for grade 1, 0.156 for grade 2), larger in the Confirmation model (0.624), and moderate in the Spotting model (0.366).
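For readers unfamiliar with the prediction ellipse as a heterogeneity measure, the Python sketch below shows one common way to compute the area of a 95% ellipse from a 2×2 covariance matrix of sensitivity and specificity. It assumes chi-square scaling with two degrees of freedom; the exact formula, scale (logit or ROC space), and covariance inputs behind the areas in Table 3 are not stated in the text, so the numbers used here are hypothetical and will not reproduce the reported values.

```python
import math

def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.
    With 2 df the CDF is 1 - exp(-x / 2), so the quantile has a closed form."""
    return -2.0 * math.log(1.0 - p)

def prediction_ellipse_area(var_sens, var_spec, cov, level=0.95):
    """Area of the ellipse (x - mu)' S^-1 (x - mu) <= c, which is pi * c * sqrt(det S)."""
    det = var_sens * var_spec - cov ** 2
    if det < 0:
        raise ValueError("covariance matrix is not positive semi-definite")
    return math.pi * chi2_quantile_2df(level) * math.sqrt(det)

# Hypothetical covariance entries, chosen only to show the behaviour.
uncorrelated = prediction_ellipse_area(0.04, 0.03, 0.0)
correlated = prediction_ellipse_area(0.04, 0.03, -0.02)
print(f"area without correlation: {uncorrelated:.3f}")
print(f"area with negative correlation: {correlated:.3f}")
```

Under these assumptions the area shrinks when between-study variances are small or when sensitivity and specificity are strongly correlated, which is why a smaller ellipse is read as lower predictive uncertainty.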

Subgroup perspective

Establishment model.

As illustrated in Fig 3 and Table 4, utilizing AAO-HNS criteria as a reference, cochlear EH indicated a sensitivity of 79.6% (CI: 64.9–89.1) and specificity of 79.7% (CI: 72.3–85.6), with a DOR of 15.318 (CI: 6.798–34.515). Barany Society criteria exhibited a higher sensitivity of 88.1% (CI: 80.6–92.9) and specificity of 84.1% (CI: 79.7–87.7), with a DOR of 39.149 (CI: 20.886–73.382), though statistical analysis indicated no significant differences (p-value: sensitivity = 0.182, specificity = 0.253, DOR = 0.188). The subgroup analysis based on gadolinium administration route revealed contrasting outcomes: the intratympanic (IT) route showed a sensitivity of 68.9% (CI: 39.7–88.2), specificity of 79.4% (CI: 64.5–89.1), and a DOR of 8.556 (CI: 2.205–33.199), while the intravenous (IV) route demonstrated higher sensitivity of 87.1% (CI: 80.5–91.7), specificity of 83.2% (CI: 79.1–86.6), and a DOR of 33.442 (CI: 19.38–57.705), with no statistically significant differences (p-value: sensitivity = 0.098, specificity = 0.545, DOR = 0.196). In the normal ears subgroup (n = 6), a notable specificity of 89.7% (CI: 82.7–94.1) was observed, significantly higher than in the asymptomatic subgroup (p-value = 0.025). Using Nakashima’s criteria (n = 18), a lower sensitivity of 74.9% (CI: 64.3–83.2) was recorded, significantly different from other systems (p-value < 0.001). Barath and Kahn systems showed high sensitivities of 89.5% (CI: 75.1–96.0) and 89.8% (CI: 68.3–97.3), respectively, without significant deviations. The PLE grading system (n = 7) revealed a high sensitivity of 98.4% (CI: 93.7–99.6) but a lower specificity of 74.9% (CI: 65.6–82.3), leading to a remarkably high DOR of 180.207 (CI: 41.13–789.566), significantly outperforming other systems in sensitivity and DOR (p-value < 0.001).

Fig 3. HSROC analysis for MD diagnosis at grade 1 and 2 thresholds.

Part (a) examines diagnostic efficacy at Grade 1, showing sensitivity and specificity for various criteria and gadolinium routes. Part (b) compares diagnostic accuracy at Grade 2, assessing criteria performance at this elevated severity. The opening plot of each section shows the aggregate model performance including all subgroups.

https://doi.org/10.1371/journal.pone.0310045.g003

Table 4. Diagnostic performance of MRI grading systems in MD subgroups.

https://doi.org/10.1371/journal.pone.0310045.t004

Establishment model at grade 2 cutoff.

Utilizing the grade 2 cutoff (Fig 3B), AAO-HNS and Barany systems showed comparable results. AAO-HNS criteria demonstrated a sensitivity of 90% (CI: 79.1–95.5) and specificity of 73.7% (CI: 64.5–81.3), closely aligned with the Barany system’s sensitivity of 93.3% (CI: 85.8–97.0) and specificity of 68.5% (CI: 60.6–75.4), yielding DORs of 25.146 (CI: 10.897–58.027) and 30.152 (CI: 13.401–67.841) respectively. The differences were minor and not statistically significant, suggesting their interchangeable utility in clinical assessments at this elevated diagnostic threshold. Nakashima’s criteria, however, exhibited a high sensitivity of 96.7% (CI: 85.9–99.3), albeit at a lower specificity of 60.5% (CI: 50.1–70.1), resulting in a DOR of 45.442 (CI: 9.256–223.084). This contrasts with the Barath system’s sensitivity of 89.8% (CI: 83.1–94.0), and specificity of 73.9% (CI: 68.0–79.1), with a DOR of 24.913 (CI: 13.922–44.584). Notably, statistical analysis indicated a significant difference in specificity (p-value = 0.030) between Nakashima’s and other criteria, marking it as less specific but potentially more sensitive for higher-grade cases.

Confirmation model.

As depicted in Fig 4A, the comparative diagnostic evaluation of dMD versus pMD using AAO-HNS criteria revealed a sensitivity of 69.6% (CI: 44.1–86.9) and a higher specificity of 86.0% (CI: 66.9–94.9), resulting in a DOR of 14.075 (CI: 5.803–34.136). Conversely, the Barany criteria exhibited a sensitivity of 76.7% (CI: 58.6–88.4) but a notably lower specificity of 48.3% (CI: 29.3–67.8), leading to a DOR of 3.07 (CI: 1.478–6.378). The significant disparity in specificity (p-value = 0.010) between these criteria suggests less discriminative capability for distinguishing between dMD and pMD. In the individual analysis of Nakashima versus Barath systems, neither showed significant deviation in sensitivity or DOR from other subgroups. Nakashima’s criteria indicated a sensitivity of 65.0% (CI: 45.5–80.4) and specificity of 71.6% (CI: 43.7–89.1), with a DOR of 4.681 (CI: 1.608–13.628); Barath’s criteria demonstrated a sensitivity of 76.0% (CI: 46.7–92.0) and specificity of 59.5% (CI: 23.8–87.4), with a DOR of 4.665 (CI: 1.045–20.836).

Fig 4. HSROC analysis for MD at grade 1 threshold.

Section (a) contrasts definite Meniere’s Disease (dMD) against probable Meniere’s Disease (pMD), while section (b) differentiates probable MD (pMD) from control groups. The opening plot of each section shows the aggregate model performance including all subgroups.

https://doi.org/10.1371/journal.pone.0310045.g004

Spotting model.

In the context of pMD versus control (Fig 4B), using Barany criteria as the reference standard, the cochlear EH grading systems achieved a sensitivity of 58.2% (CI: 36.8–76.9) and specificity of 93.0% (CI: 85.6–96.8), with a DOR of 18.539 (CI: 6.78–50.689). However, statistical significance was not observed. A notable decrease in sensitivity (p-value = 0.044) was seen in the Nakashima system at 36.4% (CI: 24.8–49.7), indicating a reduced efficacy in identifying pMD cases compared to other systems (Table 4).

Discussion

To be able to monitor the progress of the disease, it is essential to distinguish between the ears at different stages of the disease. Yet, the lack of consensus on grading scales illustrates the broader issues in medical diagnostics, emphasizing the challenges in achieving standardization. Methodologically, the selection of studies for MRI-based EH and perilymphatic space grading systems requires rigorous criteria to synthesize diverse research findings, pointing to the need for methodological refinements to enhance research reliability and validity.

In the quantitative analysis, the Establishment model demonstrated balanced diagnostic potential, whereas the Confirmation and Spotting models showed challenges in differentiating MD stages. In the Establishment model, sensitivity and specificity were both high at Grade 1, but specificity fell at Grade 2, indicating a potential compromise in accurately identifying true negatives at this higher threshold. The Confirmation model showed lower sensitivity and specificity in distinguishing dMD from pMD, a reflection of the intrinsic difficulty in separating these closely intertwined MD stages. In the Spotting model, the focus shifts to the detection of pMD, where the overall lower sensitivity, particularly pronounced in the Nakashima system, signals a risk of overlooking early-stage MD cases, despite the high specificity that underscores the systems’ efficiency in excluding non-MD individuals.

In our meta-analysis, we prioritize an intricate disease probability categorization, rigorous selection criteria, and a discerning adoption of diagnostic standards to address and rectify the heterogeneities and biases pervading prior quantitative syntheses. By striving for a more granular and unified diagnostic schema, our research proposes a clinical framework for applying these grading systems based on disease probability, thus laying the groundwork for improved patient outcomes through more accurate disease staging. Our findings, derived from a rigorous analysis of 35 studies, advocate for a deliberate, informed choice of grading system, aimed at optimizing patient outcomes in the challenging terrain of MD management, and reinforce the indispensable role of a customized diagnostic approach attuned to the clinical objectives.

Grading paradigms confront inconsistencies arising from divergent severity thresholds. The lack of standardization not only impedes precise diagnosis and categorization but also complicates the interpretation of MRI outputs, necessitating a more cohesive and multi-faceted approach that reconciles technological proficiency with the complex pathophysiology of MD [17]. The PLE system showed the highest sensitivity and DOR, particularly in the Establishment model. The Nakashima criteria, in contrast, had lower sensitivity at the Grade 1 threshold and lower specificity at the Grade 2 threshold; they also showed a decrease in sensitivity in the Spotting model, highlighting their reduced efficacy in identifying pMD from control cases. The Barath system maintained a balance between sensitivity and specificity across thresholds. The study’s heterogeneity ranged from moderate to substantial, with the smallest predictive uncertainties observed in Establishment models, both at Grade 1 and Grade 2.

The detailed visualization of EH shed light on the intricate pathology of the disease with an unprecedented level of clarity. The detailed approaches, focusing on specific morphological changes such as Reissner’s membrane displacement and perilymphatic space dilation, underscore the complex nature of MD. Conversely, the binary systems, despite their simplicity and ease of use, risk glossing over these subtleties, highlighting the overarching challenge in achieving a diagnostic balance that is both comprehensive and practically applicable.

In the Establishment model, the higher specificity in the normal ears subgroup than in the asymptomatic subgroup accentuates the former’s diagnostic precision in ruling out MD, which is pivotal for avoiding unwarranted interventions. The PLE system’s sensitivity and DOR markedly exceeded those of its counterparts, albeit with a specificity trade-off, positioning it as a potent diagnostic tool in scenarios that prioritize detecting as many true MD cases as possible. At the Grade 2 cutoff, the Establishment model showed closely matched performance under the AAO-HNS and Barany criteria, both with high sensitivity and moderate specificity, hinting at their interchangeable clinical utility. In the Confirmation model, the pronounced specificity disparity between the AAO-HNS and Barany criteria, with the former showing a balanced sensitivity and a notably higher specificity, indicated a robust capability to accurately exclude pMD cases under the AAO-HNS criteria. Conversely, the Barany criteria, despite a marginally superior sensitivity, lagged significantly in specificity, revealing less discriminative power in separating dMD from pMD.

The cochlear grading systems varied in their sensitivity and specificity for different comparisons of ear categories. The contrast between quantitative metrics, like Nakashima’s area measurements, and qualitative descriptors, as used by Baráth and PLE, further complicates the task of integrating these systems into a unified diagnostic framework. Additionally, the subjective nature of some systems, particularly those relying on qualitative assessments like Baráth and PLE, introduces the risk of interobserver variability. This can lead to biased or erroneous results, depending on the radiologist’s expertise and interpretive skills. Lastly, a crucial limitation of these grading systems is their failure to account for the dynamic progression of MD and other cochlear pathologies. They offer static snapshots that may not accurately reflect the evolving nature of these conditions.

Several limitations must be acknowledged to interpret the findings accurately. Firstly, the diversity in study designs, including participant selection and diagnostic approaches for Meniere’s disease (MD), introduces variability that may influence the overall results. Secondly, inherent biases in patient selection, diagnostic criteria, and reporting across studies compromise the integrity of the data. Additionally, the potential for data overrepresentation due to multiple publications by the same authors may introduce bias. Moreover, the lack of standardized diagnostic criteria for MD, the reliance on clinical history for diagnosis, and the limitations of the statistical methods used, such as I² values in diagnostic test accuracy reviews [18], further challenge the interpretation of the findings. These complexities underscore the imperative for rigorous methodological standards, transparent reporting, and advancements towards uniform diagnostic guidelines, thereby enhancing the reliability and validity of MD research.

Supporting information

S1 File. The forest plots of diagnostic models.

https://doi.org/10.1371/journal.pone.0310045.s001

(DOCX)

S3 File. Reasons for exclusion of database and register reports following full-text review for eligibility.

https://doi.org/10.1371/journal.pone.0310045.s003

(DOCX)

S4 File. Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2).

https://doi.org/10.1371/journal.pone.0310045.s004

(DOCX)

References

  1. Goebel JA. 2015 Equilibrium Committee Amendment to the 1995 AAO-HNS Guidelines for the Definition of Ménière’s Disease. Otolaryngol Head Neck Surg. 2016;154(3):403–4. Epub 2016/02/18. pmid:26884364.
  2. Lopez-Escamez JA, Carey J, Chung WH, Goebel JA, Magnusson M, Mandalà M, et al. Diagnostic criteria for Menière’s disease. J Vestib Res. 2015;25(1):1–7. Epub 2015/04/18. pmid:25882471.
  3. van Steekelenburg JM, van Weijnen A, de Pont LMH, Vijlbrief OD, Bommeljé CC, Koopman JP, et al. Value of Endolymphatic Hydrops and Perilymph Signal Intensity in Suspected Ménière Disease. AJNR Am J Neuroradiol. 2020;41(3):529–34. Epub 2020/02/08. pmid:32029469; PubMed Central PMCID: PMC7077918.
  4. Homann G, Vieth V, Weiss D, Nikolaou K, Heindel W, Notohamiprodjo M, et al. Semi-quantitative vs. volumetric determination of endolymphatic space in Menière’s disease using endolymphatic hydrops 3T-HR-MRI after intravenous gadolinium injection. PLoS One. 2015;10(3):e0120357. Epub 2015/03/15. pmid:25768940; PubMed Central PMCID: PMC4358992.
  5. Pyykkö I, Nakashima T, Yoshida T, Zou J, Naganawa S. Meniere’s disease: a reappraisal supported by a variable latency of symptoms and the MRI visualisation of endolymphatic hydrops. BMJ Open. 2013;3(2). Epub 2013/02/19. pmid:23418296; PubMed Central PMCID: PMC3586172.
  6. Nakashima T, Naganawa S, Pyykko I, Gibson WP, Sone M, Nakata S, et al. Grading of endolymphatic hydrops using magnetic resonance imaging. Acta Otolaryngol Suppl. 2009;(560):5–8. Epub 2009/04/22. pmid:19221900.
  7. Conte G, Lo Russo FM, Calloni SF, Sina C, Barozzi S, Di Berardino F, et al. MR imaging of endolymphatic hydrops in Ménière’s disease: not all that glitters is gold. Acta Otorhinolaryngol Ital. 2018;38(4):369–76. Epub 2018/09/11. pmid:30197428; PubMed Central PMCID: PMC6146579.
  8. Han A, Kontorinis G. A systematic review on delayed acquisition of post-gadolinium magnetic resonance imaging in Ménière’s disease: imaging of the endolymphatic spaces. J Laryngol Otol. 2023;137(3):239–45. Epub 2022/06/09. pmid:35674257.
  9. Zanetti D, Conte G, Scola E, Casale S, Lilli G, Di Berardino F. Advanced Imaging of the Vestibular Endolymphatic Space in Ménière’s Disease. Front Surg. 2021;8:700271. Epub 2021/09/10. pmid:34497826; PubMed Central PMCID: PMC8419327.
  10. Song CI, Pogson JM, Andresen NS, Ward BK. MRI With Gadolinium as a Measure of Blood-Labyrinth Barrier Integrity in Patients With Inner Ear Symptoms: A Scoping Review. Front Neurol. 2021;12:662264. Epub 2021/06/08. pmid:34093410; PubMed Central PMCID: PMC8173087.
  11. Lopez-Escamez JA, Attyé A. Systematic review of magnetic resonance imaging for diagnosis of Meniere disease. J Vestib Res. 2019;29(2–3):121–9. Epub 2019/07/30. pmid:31356219.
  12. Connor S, Grzeda MT, Jamshidi B, Ourselin S, Hajnal JV, Pai I. Delayed post gadolinium MRI descriptors for Meniere’s disease: a systematic review and meta-analysis. Eur Radiol. 2023;33(10):7113–35. pmid:37171493.
  13. Frank RA, Bossuyt PM, McInnes MDF. Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy: The PRISMA-DTA Statement. Radiology. 2018;289(2):313–4. pmid:30015590.
  14. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36. Epub 2011/10/19. pmid:22007046.
  15. Macaskill P, Gatsonis C, Deeks J, Harbord R, Takwoingi Y. Analysing and presenting results. Cochrane handbook for systematic reviews of diagnostic test accuracy: Cochrane Collaboration; 2010.
  16. Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol. 2005;58(9):882–93. pmid:16085191.
  17. Attyé A, Eliezer M, Boudiaf N, Tropres I, Chechin D, Schmerber S, et al. MRI of endolymphatic hydrops in patients with Meniere’s disease: a case-controlled study with a simplified classification based on saccular morphology. Eur Radiol. 2017;27(8):3138–46. Epub 2016/12/22. pmid:27999985.
  18. McInnes MD, Moher D, Thombs BD, McGrath TA, Bossuyt PM, Clifford T, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the PRISMA-DTA statement. JAMA. 2018;319(4):388–96. pmid:29362800.
  19. Bernaerts A, Vanspauwen R, Blaivie C, van Dinther J, Zarowski A, Wuyts FL, et al. The value of four stage vestibular hydrops grading and asymmetric perilymphatic enhancement in the diagnosis of Menière’s disease on MRI. Neuroradiology. 2019;61(4):421–9. Epub 2019/02/06. pmid:30719545; PubMed Central PMCID: PMC6431299.
  20. Bernaerts A, Janssen N, Wuyts FL, Blaivie C, Vanspauwen R, van Dinther J, et al. Comparison between 3D SPACE FLAIR and 3D TSE FLAIR in Menière’s disease. Neuroradiology. 2022;64(5):1011–20. Epub 2022/02/13. pmid:35149883; PubMed Central PMCID: PMC9005391.
  21. Chen W, Geng Y, Lin N, Yu S, Sha Y. Magnetic resonance imaging with intravenous gadoteridol injection based on 3D-real IR sequence of the inner ear in Meniere’s disease patient: feasibility in 3.5-h time interval. Acta Otolaryngol. 2021;141(10):899–906. Epub 2021/09/15. pmid:34520311.
  22. Pai I, Connor S. Low Frequency Air-Bone Gap in Meniere’s Disease: Relationship With Magnetic Resonance Imaging Features of Endolymphatic Hydrops. Ear Hear. 2022;43(6):1678–86. Epub 2022/05/19. pmid:35583512; PubMed Central PMCID: PMC9592161.
  23. Conte G, Caschera L, Calloni S, Barozzi S, Di Berardino F, Zanetti D, et al. MR Imaging in Menière Disease: Is the Contact between the Vestibular Endolymphatic Space and the Oval Window a Reliable Biomarker? AJNR Am J Neuroradiol. 2018;39(11):2114–9. Epub 2018/10/20. pmid:30337432; PubMed Central PMCID: PMC7655340.
  24. Domínguez P, Manrique-Huarte R, Suárez-Vega V, López-Laguna N, Guajardo C, Pérez-Fernández N. Endolymphatic Hydrops in Fluctuating Hearing Loss and Recurrent Vertigo. Front Surg. 2021;8:673847. Epub 2021/06/18. pmid:34136529; PubMed Central PMCID: PMC8202684.
  25. Guajardo-Vergara C, Suárez-Vega V, Dominguez P, Manrique-Huarte R, Arbizu L, Pérez-Fernández N. Endolymphatic hydrops in the unaffected ear of patients with unilateral Ménière’s disease. Eur Arch Otorhinolaryngol. 2022;279(12):5591–600. Epub 2022/05/17. pmid:35578137; PubMed Central PMCID: PMC9649467.
  26. Han SC, Kim YS, Kim Y, Lee SY, Song JJ, Choi BY, et al. Correlation of clinical parameters with endolymphatic hydrops on MRI in Meniere’s disease. Front Neurol. 2022;13:937703. Epub 2022/08/13. pmid:35959407; PubMed Central PMCID: PMC9361122.
  27. Jasińska A, Lachowska M, Wnuk E, Pierchała K, Rowiński O, Niemczyk K. Correlation between magnetic resonance imaging classification of endolymphatic hydrops and clinical manifestations and audiovestibular test results in patients with definite Ménière’s disease. Auris Nasus Larynx. 2022;49(1):34–45. Epub 2021/04/19. pmid:33865653.
  28. Kahn L, Hautefort C, Guichard JP, Toupet M, Jourdaine C, Vitaux H, et al. Relationship between video head impulse test, ocular and cervical vestibular evoked myogenic potentials, and compartmental magnetic resonance imaging classification in menière’s disease. Laryngoscope. 2020;130(7):E444–e52. Epub 2019/11/20. pmid:31742710.
  29. Kazemi MA, Ghasemi A, Casselman JW, Shafiei M, Zarandy MM, Sharifian H, et al. Correlation of semi-quantitative findings of endolymphatic hydrops in MRI with the audiometric findings in patients with Meniere’s disease. J Otol. 2022;17(3):123–9. Epub 2022/07/19. pmid:35847569; PubMed Central PMCID: PMC9270562.
  30. Kenis C, Crins T, Bernaerts A, Casselman J, Foer B. Diagnosis of Menière’s disease on MRI: feasibility at 1.5 Tesla. Acta Radiol. 2022;63(6):810–3. Epub 2021/05/19. pmid:34000823.
  31. Kirbac A, Incesulu SA, Toprak U, Caklı H, Ozen H, Saylisoy S. Audio-vestibular and radiological analysis in Meniere’s disease. Braz J Otorhinolaryngol. 2022;88 Suppl 3:S117–s24. Epub 2022/10/19. pmid:36257895.
  32. Li X, Wu Q, Sha Y, Dai C, Zhang R. Gadolinium-enhanced MRI reveals dynamic development of endolymphatic hydrops in Ménière’s disease. Braz J Otorhinolaryngol. 2020;86(2):165–73. Epub 2019/01/03. pmid:30600169; PubMed Central PMCID: PMC9422425.
  33. Mainnemarre J, Hautefort C, Toupet M, Guichard JP, Houdart E, Attyé A, et al. The vestibular aqueduct ossification on temporal bone CT: an old sign revisited to rule out the presence of endolymphatic hydrops in Menière’s disease patients. Eur Radiol. 2020;30(11):6331–8. Epub 2020/06/17. pmid:32537729.
  34. Morimoto K, Yoshida T, Sugiura S, Kato M, Kato K, Teranishi M, et al. Endolymphatic hydrops in patients with unilateral and bilateral Meniere’s disease. Acta Otolaryngol. 2017;137(1):23–8. Epub 2016/08/27. pmid:27564645.
  35. Morimoto K, Yoshida T, Kobayashi M, Sugimoto S, Nishio N, Teranishi M, et al. Significance of high signal intensity in the endolymphatic duct on magnetic resonance imaging in ears with otological disorders. Acta Otolaryngol. 2020;140(10):818–22. Epub 2020/07/11. pmid:32646259.
  36. Morita Y, Takahashi K, Ohshima S, Yagi C, Kitazawa M, Yamagishi T, et al. Is Vestibular Meniere’s Disease Associated With Endolymphatic Hydrops? Front Surg. 2020;7:601692. Epub 2021/01/05. pmid:33392247; PubMed Central PMCID: PMC7775543.
  37. Naganawa S, Yamazaki M, Kawai H, Bokura K, Iida T, Sone M, et al. MR imaging of Ménière’s disease after combined intratympanic and intravenous injection of gadolinium using HYDROPS2. Magn Reson Med Sci. 2014;13(2):133–7. Epub 2014/04/29. pmid:24769636.
  38. Nahmani S, Vaussy A, Hautefort C, Guichard JP, Guillonet A, Houdart E, et al. Comparison of Enhancement of the Vestibular Perilymph between Variable and Constant Flip Angle-Delayed 3D-FLAIR Sequences in Menière Disease. AJNR Am J Neuroradiol. 2020;41(4):706–11. Epub 2020/03/21. pmid:32193190; PubMed Central PMCID: PMC7144642.
  39. Oh SY, Dieterich M, Lee BN, Boegle R, Kang JJ, Lee NR, et al. Endolymphatic Hydrops in Patients With Vestibular Migraine and Concurrent Meniere’s Disease. Front Neurol. 2021;12:594481. Epub 2021/03/30. pmid:33776877; PubMed Central PMCID: PMC7991602.
  40. Okazaki Y, Yoshida T, Sugimoto S, Teranishi M, Kato K, Naganawa S, et al. Significance of Endolymphatic Hydrops in Ears With Unilateral Sensorineural Hearing Loss. Otol Neurotol. 2017;38(8):1076–80. Epub 2017/07/15. pmid:28708796.
  41. Pai I, Mendis S, Murdin L, Touska P, Connor S. Magnetic resonance imaging of Ménière’s disease: early clinical experience in a UK centre. J Laryngol Otol. 2020;134(4):302–10. Epub 2020/04/04. pmid:32241307.
  42. Sano R, Teranishi M, Yamazaki M, Isoda H, Naganawa S, Sone M, et al. Contrast enhancement of the inner ear in magnetic resonance images taken at 10 minutes or 4 hours after intravenous gadolinium injection. Acta Otolaryngol. 2012;132(3):241–6. Epub 2011/12/29. pmid:22201230.
  43. Shi S, Guo P, Wang W. Magnetic Resonance Imaging of Ménière’s Disease After Intravenous Administration of Gadolinium. Ann Otol Rhinol Laryngol. 2018;127(11):777–82. Epub 2018/08/30. pmid:30156867.
  44. Shiraishi K, Ohira N, Kobayashi T, Sato M, Osaki Y, Doi K. Comparison of furosemide-loading cervical vestibular-evoked myogenic potentials with magnetic resonance imaging for the evaluation of endolymphatic hydrops. Acta Otolaryngol. 2020;140(9):723–7. Epub 2020/07/24. pmid:32700983.
  45. Sousa R, Lobo M, Cadilha H, Eça T, Campos J, Luis L. Is there progression of endolymphatic hydrops in Ménière’s disease? Longitudinal magnetic resonance study. Eur Arch Otorhinolaryngol. 2022. Epub 2022/11/08. pmid:36344698.
  46. Suárez Vega VM, Dominguez P, Caballeros Lam FM, Leal JI, Perez-Fernandez N. Comparison between high-resolution 3D-IR with real reconstruction and 3D-flair sequences in the assessment of endolymphatic hydrops in 3 tesla. Acta Otolaryngol. 2020;140(11):883–8. Epub 2020/07/22. pmid:32692635.
  47. Tagaya M, Yamazaki M, Teranishi M, Naganawa S, Yoshida T, Otake H, et al. Endolymphatic hydrops and blood-labyrinth barrier in Ménière’s disease. Acta Otolaryngol. 2011;131(5):474–9. Epub 2011/01/05. pmid:21198346.
  48. Wu Q, Dai C, Zhao M, Sha Y. The correlation between symptoms of definite Meniere’s disease and endolymphatic hydrops visualized by magnetic resonance imaging. Laryngoscope. 2016;126(4):974–9. Epub 2015/09/04. pmid:26333096.
  49. Xie J, Zhang W, Zhu J, Hui L, Li S, Ren L, et al. Differential Diagnosis of Endolymphatic Hydrops Between "Probable" and "Definite" Ménière’s Disease via Magnetic Resonance Imaging. Otolaryngol Head Neck Surg. 2021;165(5):696–700. Epub 2021/02/03. pmid:33528304.
  50. Yamamoto M, Teranishi M, Naganawa S, Otake H, Sugiura M, Iwata T, et al. Relationship between the degree of endolymphatic hydrops and electrocochleography. Audiol Neurootol. 2010;15(4):254–60. Epub 2009/11/20. pmid:19923813.
  51. Yoshida T, Sugimoto S, Teranishi M, Otake H, Yamazaki M, Naganawa S, et al. Imaging of the endolymphatic space in patients with Ménière’s disease. Auris Nasus Larynx. 2018;45(1):33–8. Epub 2017/03/04. pmid:28256285.