Abstract
Aim of the study
The aim was to systematically review the literature and perform a meta-analysis to estimate the performance of artificial intelligence (AI) algorithms in detecting meniscal injuries.
Materials and methods
A systematic search was performed in the Scopus, PubMed, EBSCO, Cinahl, Web of Science, IEEE Xplore, and Cochrane Central databases in July 2024. The included studies’ reporting quality and risk of bias were evaluated using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) checklist and the Prediction Model Study Risk of Bias Assessment Tool (PROBAST), respectively. A meta-analysis was performed using contingency tables to estimate diagnostic performance metrics (sensitivity and specificity), and a meta-regression analysis investigated the effect of the following variables on the main outcome: imaging view, use of data augmentation and transfer learning, and presence of a meniscal tear in the injury, with corresponding 95% confidence intervals (CI) and a P-value of 0.05 as the threshold for significance.
Results
Among 28 included studies, 92 contingency tables were extracted from 15 studies. The reference standards of the studies were mostly expert radiologists, orthopedic surgeons, or surgical reports. On internal validation, the pooled sensitivity and specificity were 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83) for AI algorithms, and 85% (95% CI: 76, 91) and 88% (95% CI: 83, 92) for clinicians, respectively. The pooled sensitivity and specificity for studies validating algorithms with an external test set were 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively.
Citation: Mohammadi S, Jahanshahi A, Shahrabi Farahani M, Salehi MA, Frounchi N, Guermazi A (2025) Diagnosis of knee meniscal injuries using artificial intelligence: A systematic review and meta-analysis of diagnostic performance. PLoS One 20(6): e0326339. https://doi.org/10.1371/journal.pone.0326339
Editor: Osama Farouk, Assiut University Faculty of Medicine, EGYPT
Received: December 26, 2024; Accepted: May 27, 2025; Published: June 24, 2025
Copyright: © 2025 Mohammadi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting information files.
Funding: The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Competing interests: Ali Guermazi is a consultant to Novartis, Coval, Scarcell, 4Moving, Paradigm, Peptinov, Levicept, Pacira, TissueGene, Medipost, ICM and Formation Bio. He is a shareholder of BICL, LLC. This does not alter our adherence to PLOS ONE policies on sharing data and materials. All other authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article.
Introduction
The menisci are crescent-shaped fibrocartilaginous structures in the knee joint, vital for weight-bearing, shock absorption, and leg movements [1,2]. Meniscal damage is considered the most frequent knee injury and can be either traumatic or degenerative. Meniscal tears and macerations are the two types of meniscal damage that can lead to pain, functional limitations, and early onset progression of osteoarthritis. Therefore, rapid and accurate diagnosis of meniscal damage is of paramount importance as it enables early detection and prevention of osteoarthritis [3,4].
Magnetic resonance imaging (MRI) is considered the noninvasive modality of choice for the diagnosis of meniscal damage, while arthroscopy is highly accurate but invasive [5]. Increased signal intensity on MR images alongside soft tissue swelling can be interpreted as a meniscal tear, although depending on the type of tear, various signs might be present [6]. A recent meta-analysis on the diagnostic performance of MRI in meniscal injuries found that acquisition time, MRI system technology, and type of MRI sequence can affect the performance of this modality [7]. Additionally, previous studies have shown that the accuracy of MRI is markedly poorer in specific types of meniscal damage, such as degenerative tears and macerations, and in some cases of meniscal tears accompanying anterior cruciate ligament tears [8]. Furthermore, due to the time-consuming acquisition and interpretation process of MRI, as well as inter-reader and intra-reader variability, its diagnostic performance can vary widely [9]. These factors highlight the potential of artificial intelligence-based methods for image interpretation to improve early detection of meniscal damage.
Artificial intelligence (AI) is one of the vast branches of computer science that enables the accomplishment of various tasks without human intervention [10]. Machine learning is a component of AI in which algorithms are designed and programmed to learn tasks from data through experience [11]. Deep learning, which is structured similarly to neuronal networks, is a subset of machine learning that processes data through multiple layers so that the output of one layer serves as the input for the next layer [11,12].
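The layered structure described above can be made concrete with a toy example. The following sketch is purely illustrative (a tiny fully connected network with made-up weights, not any model from the included studies): each layer computes a weighted sum plus bias, applies an activation, and its output becomes the next layer's input.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then a ReLU activation."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# A toy two-layer network: the first layer's output is the second layer's input.
x = [0.5, -1.2, 3.0]                                               # input features
h = dense(x, weights=[[0.1, 0.4, -0.2], [0.3, -0.1, 0.05]],        # hidden layer
          biases=[0.0, 0.1])
y = dense(h, weights=[[0.7, 0.3]], biases=[0.05])                  # output layer
print(y)
```

Deep networks used for image analysis stack many such layers (typically convolutional rather than fully connected), but the data flow, layer output feeding the next layer's input, is the same.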
The utilization of AI in medicine is rapidly growing [13]. Previous studies have demonstrated that deep learning algorithms can perform with acceptable accuracy, especially in diagnostic tasks [12]. A study by Xue et al. [14] reported high sensitivity, specificity, and AUC for deep learning-assisted image-based cancer diagnosis (88%, 88%, and 0.94, respectively). Also, Kuo and colleagues [15], in a study comparing clinicians and AI in fracture detection, showed that AI performance is comparable to that of clinicians.
In various medical situations, such as meniscal damage, timely and accurate detection can significantly enhance the outcome. AI-assisted diagnosis has the potential to provide substantial benefits in terms of both time and cost. In this study, the aim was to systematically review the literature and perform a meta-analysis to estimate the performance of artificial intelligence algorithms in detecting meniscal injuries.
Materials and methods
Protocol and registration
This study was performed based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines with a major focus on the Diagnostic Test Accuracy extension (PRISMA-DTA) and adherence to the preferred essential items in writing a systematic review of diagnostic test accuracy studies [16,17]. Our study was registered on the International Prospective Register of Systematic Reviews (PROSPERO) website (Registration No. CRD42022323106). The aim was to include studies assessing the performance of AI in detecting meniscal injuries. Two reviewers (medical doctors with at least 4 years of experience in performing systematic reviews) independently worked on each step of this review, and any discrepancies were resolved through discussion under the supervision of a third reviewer (a medical doctor with at least 6 years of experience in performing systematic reviews).
Search strategy and study selection
A comprehensive literature search was first performed in the Scopus, PubMed, EBSCO, Cinahl, Web of Science, IEEE Xplore, and Cochrane Central databases in April 2022 using related keywords (S1 Table), and the systematic search was updated in July 2024 to find published articles that evaluated the validation and performance of an AI algorithm as a diagnostic tool for the detection of meniscal damage, regardless of study setting, language, target population, and publication time. The references of all included studies were also screened to find any studies that might have been missed. Articles meeting any of the following exclusion criteria were excluded: letters, opinions, book chapters, conference abstracts, reviews, animal studies, studies performing only segmentation analysis, and studies using natural language processing (NLP) on electronic health records (EHR).
Data extraction
The following data were extracted from the included studies: First author’s name and publication year, country, study design, their inclusion and exclusion criteria for the images or the participants, imaging modality, view of imaging, algorithms and architectures, dimension of images, evaluation and validation methods, validation size, number of images in each training, testing, and tuning sets, model output, the standard reference for diagnosis, values of true positive, true negative, false positive, false negative, sensitivity, specificity, positive and negative predictive values, the AUC, and accuracy. Moreover, an email was sent to the correspondence of the studies that had not reported their data thoroughly to ask for the complete results. The reported data were used to build contingency tables and calculate sensitivity and specificity, where applicable.
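As a concrete illustration of how a contingency table translates into the extracted metrics (the counts below are hypothetical and not taken from any included study), sensitivity, specificity, and the predictive values follow directly from the four cells of a 2×2 table:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from a 2x2 contingency table."""
    sensitivity = tp / (tp + fn)   # true positive rate among diseased
    specificity = tn / (tn + fp)   # true negative rate among non-diseased
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical counts: 80 true positives, 15 false positives,
# 20 false negatives, 85 true negatives.
sens, spec, ppv, npv = diagnostic_metrics(tp=80, fp=15, fn=20, tn=85)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")  # sensitivity=0.80 specificity=0.85
```

Studies reporting only sensitivity and specificity together with positive and negative group sizes can be inverted to recover the four cell counts, which is how incomplete reports were converted into usable tables.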
Statistical analysis
In order to determine the diagnostic performance of AI and clinicians, a random-effects meta-analysis was performed with corresponding 95% CIs, using contingency tables extracted from the included studies, provided at least three adequate studies were available to calculate pooled sensitivity and specificity. A meta-regression analysis was conducted to explore between-study heterogeneity for the following covariates: view of imaging, use of data augmentation and transfer learning, and presence of a meniscal tear in the injury. In addition, the data were divided based on whether the injury was on the medial or lateral side, and a separate meta-regression analysis was performed for the medial and lateral menisci. A P-value of 0.05 was considered the threshold for significance. The entire data analysis was conducted using Stata 16 software (Stata Corp, College Station, TX) [18,19]. The “midas” command in Stata fits a bivariate mixed-effects logistic regression model that jointly pools the logit-transformed sensitivity and specificity across studies while accounting for their correlation, and it extends this model for meta-regression to make comparisons between subgroups. This method has been widely used in previous meta-analyses [15,20].
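To make the pooling step concrete, here is a simplified univariate sketch with hypothetical study counts: each study's sensitivity is logit-transformed, the studies are combined with DerSimonian-Laird random-effects weights, and the pooled estimate is back-transformed. (Bivariate approaches additionally model the correlation between sensitivity and specificity; this sketch pools sensitivities alone for illustration.)

```python
import math

def pool_logit_random_effects(events, totals):
    """DerSimonian-Laird random-effects pooling of proportions on the logit scale.

    events/totals: per-study counts, e.g. TP and (TP + FN) for sensitivity.
    Returns the pooled proportion with its 95% CI, back-transformed.
    """
    # Per-study logit proportions and within-study variances (delta method).
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # Fixed-effect (inverse-variance) estimate, needed for the Q statistic.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance tau^2 (truncated at zero).
    k = len(y)
    tau2 = max(0.0, (q - (k - 1)) / (sum(w) - sum(wi ** 2 for wi in w) / sum(w)))

    # Random-effects weights, pooled logit estimate, and 95% CI.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    inv_logit = lambda t: 1 / (1 + math.exp(-t))
    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)

# Hypothetical sensitivities: TPs out of (TP + FN) for five studies.
pooled, lo, hi = pool_logit_random_effects([80, 45, 70, 88, 60], [100, 60, 90, 110, 70])
print(f"pooled sensitivity {pooled:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

Working on the logit scale keeps the pooled estimate and its confidence limits inside the (0, 1) interval, which is why diagnostic meta-analysis transforms proportions before combining them.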
Publication bias
The slope coefficient represents the relationship between the diagnostic odds ratio (DOR) and the inverse of the square root of the effective sample size (ESS). Publication bias refers to the tendency for studies with positive or significant results to be more likely to be published than those with negative or nonsignificant results. In the context of diagnostic performance studies, publication bias can occur if studies reporting higher DORs are more likely to be published, leading to an overestimation of the test’s accuracy. The slope coefficient is used to assess the presence of publication bias by examining whether there is a relationship between the DOR and the ESS. If there is no publication bias, the slope coefficient should be close to zero. A positive slope coefficient suggests that studies with higher DORs have smaller ESS, indicating a potential publication bias towards studies with positive results. Conversely, a negative slope coefficient suggests a potential publication bias towards studies with negative results [21].
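The test described above (Deeks' funnel plot asymmetry test) regresses the log diagnostic odds ratio against 1/√ESS. The sketch below is a simplified, unweighted illustration with hypothetical contingency tables; the published test weights studies by their effective sample size.

```python
import math

def deeks_slope(tables):
    """Slope of ln(DOR) regressed on 1/sqrt(ESS) -- simplified, unweighted sketch.

    tables: list of (tp, fp, fn, tn) tuples. A 0.5 continuity correction guards
    against zero cells. The full Deeks test uses an ESS-weighted regression;
    ordinary least squares is used here for clarity.
    """
    xs, ys = [], []
    for tp, fp, fn, tn in tables:
        tp, fp, fn, tn = (c + 0.5 for c in (tp, fp, fn, tn))
        dor = (tp * tn) / (fp * fn)        # diagnostic odds ratio
        n1, n2 = tp + fn, fp + tn          # diseased / non-diseased group sizes
        ess = 4 * n1 * n2 / (n1 + n2)      # effective sample size
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log(dor))
    # Ordinary least-squares slope of ln(DOR) on 1/sqrt(ESS).
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den  # near zero suggests no funnel-plot asymmetry

# Hypothetical contingency tables (tp, fp, fn, tn) from four imaginary studies.
slope = deeks_slope([(80, 15, 20, 85), (40, 10, 12, 50), (95, 20, 18, 90), (30, 8, 9, 35)])
print(f"slope = {slope:.2f}")
```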
Quality and risk of bias assessment
Since there are no perfectly suitable criteria to assess AI studies’ reporting quality and risk of bias, previous studies [15] were used to find the best tools for this attempt. The reporting quality of the included studies in reporting the results was assessed using the TRIPOD. This checklist consists of 22 items evaluating each section [22]. A modified version of TRIPOD was used due to the inapplicability of its few items for AI studies (S2 Table).
The risk of bias and applicability of the included studies were estimated using the PROBAST, consisting of several categorized questions in 4 domains of participants, predictors, outcomes, and analysis [23]. The predictors’ domain was omitted because it was irrelevant in the setting of our study, and both training and testing sets were assessed in the first domain (S3 Table).
Results
Study selection and characteristics
As a result of our systematic search, 3294 studies were retrieved and imported into EndNote version 20. After removing duplicates, 2822 studies underwent title and abstract screening. Full-text screening was carried out for 33 eligible studies, leading to the exclusion of five studies for the following reasons: lack of relevance [24–26], unavailability of the full text [27], and measuring variables other than those of interest. Finally, 28 studies [2,9,28–53] meeting the inclusion criteria were included. More detailed information on study selection can be obtained from the flow diagram in Fig 1.
All included studies used MRI to investigate meniscal damage, although one of them [36] also included computed tomography (CT) images in their study. Twenty-four studies [2,9,28–40,45–53] recruited an internal test set to estimate the diagnostic performance of their AI algorithm, of which two studies [2,38] had an external test set too. Four studies [41–44] used only an external test set for this purpose.
All MRI examinations, except in two studies [42,49], included the sagittal view. Also, thirteen studies [2,9,28,31,33,36,39,40,44,45,51–53] included multiple views of MRI, and five studies [39,40,46–48] developed different models for each view. Regarding the type of injury, eighteen studies (four externally validating and 14 internally validating) [2,9,29,31,34,36–39,41,43,44,46–50,52] did not report the subtype of the meniscal damage, while the remaining ten [28,30,32,33,35,40,42,45,51,53] did. Three studies (two with internal and one with both internal and external test sets) [28,40,45] reported degenerative meniscal injury. Two studies included complex tears [28,42], and three observed either partial or complete maceration of the menisci [30,35,42]. Furthermore, among the internal validation studies, two [30,32] had both vertical and horizontal tears in their dataset, while two studies [33,53] only noted horizontal tears.
Only five studies [39,40,44,49,53] used a cross-validation method to validate their AI algorithm, while the others used different validation methods, namely random split sampling, stratified split sampling, a validation set, or the training set. The model output of 21 studies [2,29–35,37,41–44,46–53] was binary classification, of which one study [30] also had its model perform multiclass classification on MRI. Among the others, the study by Chou et al. [39] assessed the algorithm’s performance in determining the probability of meniscal damage. Eleven studies [2,28,31,33,35,39,42,44,45,50,53] had a comparison group of expert clinicians, thirteen [9,30,32,36–38,42,46–49,52,53] carried out a comparison of multiple algorithms, and one study [51] compared multiple imaging views. All of the included studies except one [46], which did not state its reference standard, mentioned a reference standard for their diagnosis; three [32,41,43] used their own previous study as the reference, while the others used experts such as radiologists or surgeons. More detailed information can be obtained from Tables 1, 2, and S4.
Study participants
The number of participants represented by the training data in each study ranged from 28 [29] to 7903 [44] (median, 530.50; interquartile range, 963; S4 Table). The percentage of disease-positive participants varied widely (median, 42.65; range, 87; interquartile range, 32.92; S4 Table). However, six studies [9,32,34,37,43,49] did not report data on the number of participants, and eleven studies [2,32,35,37,38,40,42,44–46,49] did not report the proportion of disease-positive participants.
Algorithm development and model output
The number of images in training sets (median, 583; range, 18510; interquartile range, 1015.2; Table 1) and testing sets (median, 154.50; range, 3590; interquartile range, 182.50; Table 1) differed widely among the included studies, although one of the studies developing an internally validated algorithm [31], did not report the size of their testing set. The size of the datasets in the externally validating studies varied from 100 [41] to 1620 [45] (median, 296; interquartile range, 580; Table 2). Ten studies [34,35,39,42,43,45,46,48–50] used data augmentation, and five [29,31,37,46,47] carried out a transfer learning process.
The performance of the algorithms presented in the studies was assessed by various metrics, such as accuracy (n = 19), sensitivity and specificity (n = 19), negative or positive predictive value (n = 7), AUC (n = 18), and F1 score (n = 2). More detailed information on algorithm development is shown in Tables 1, 2, and S4–S6.
Quality assessment
In terms of adherence to the TRIPOD checklist, four applicable items were reported in less than or equal to 50% of the studies: sample size estimation, reporting the model’s performance, availability of supplementary information, and declaration of the funding source. Fig 2 demonstrates the pattern of articles’ adherence to this tool.
According to the PROBAST checklist, more than 70% of the studies were found to have low concern regarding the risk of bias and applicability in participant selection and outcome determination. However, there was a high concern in the analysis domain for more than 60% of the included studies. The overall results show that although all of the studies were at low risk in applicability, less than 30% of the studies were found to be at lower risk of bias based on this tool (Fig 2).
Meta-analysis
Of the 28 included studies, 15 [2,9,28–31,33,35,38,39,41,42,44,45,53] provided sufficient data and 92 contingency tables were extracted from them. Seventy contingency tables were extracted from 12 [2,9,28–31,33,35,38,39,45,53] studies, internally validating the AI algorithm. Five tables were extracted from four [2,41,42,44] external validation studies and 17 tables from four studies [28,31,33,38] assessing the clinicians’ diagnostic performance on internal test sets. Furthermore, 24 contingency tables were extracted from five studies internally validating AI algorithms [9,28,30,33,39], and six tables from two studies evaluating clinicians’ performance on internal test sets [28,33] presenting data for lateral and medial menisci separately.
The pooled results are shown in Table 3. The pooled sensitivity and specificity for internally validated algorithms were 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83), and for clinicians, 85% (95% CI: 76, 91) and 88% (95% CI: 83, 92), respectively. On the other hand, the pooled sensitivity and specificity for studies validating algorithms with an external test set were 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively. In addition, a subgroup analysis was performed on the studies reporting separate results for the lateral and medial menisci. Table 3 presents the detailed pooled results of our meta-analysis.
Regarding the meta-regression analysis, for studies internally validating their algorithm, statistically significantly higher sensitivity was associated with data augmentation usage (85%; 95% CI: 77, 93; P, 0.01) and multiple-view imaging (86%; 95% CI: 81, 90; P, 0.00). In contrast, no significant difference was observed for the usage of transfer learning. Moreover, lower specificity was associated with the presence of a tear in the injury (87%; 95% CI: 84, 90; P, 0.00) and multiple-view imaging (85%; 95% CI: 81, 89; P, 0.00) in the studies validating algorithms with external test sets. There was no significant difference between multiple- and single-view imaging in the studies assessing clinicians’ performance on internal test sets. S7–S11 Tables present more detailed results, including the meta-regression analysis of studies reporting the lateral and medial menisci separately. SROC curves and forest plots for each analysis are included in Figs 3 and 4, respectively.
Publication bias
The results of the publication bias analysis are presented in S12 Table. The slope coefficients for studies evaluating internally validated algorithms, externally validated algorithms, and clinicians’ performance on internal test sets were 0.12 (95% CI: −6.18, 6.44; P, 0.967), 8.39 (95% CI: −63.24, 80.04; P, 0.734), and −14.83 (95% CI: −24.81, −4.85; P, 0.006), respectively. Therefore, there is a high risk of publication bias for studies assessing clinicians’ performance on internal test sets. The funnel plots in Fig 5 further illustrate these results.
Discussion
The main findings of our study are as follows: AI algorithms had lower diagnostic accuracy than clinicians on internal validation, with pooled sensitivity and specificity of 81% (95% CI: 78, 85) and 78% (95% CI: 72, 83), respectively. Almost the same pattern held for the medial and lateral menisci, although, interestingly, AI algorithms reached better diagnostic accuracy for medial than for lateral meniscus damage, with pooled sensitivity and specificity of 83% (95% CI: 77, 87) and 82% (95% CI: 74, 89). Regarding external validation, the diagnostic accuracy of AI algorithms could not be compared with clinicians’ because no studies evaluated clinicians’ performance on external test sets. However, AI showed acceptable pooled sensitivity and specificity of 82% (95% CI: 74, 88) and 88% (95% CI: 84, 91), respectively. Eleven internal validation studies had insufficient data for the meta-analysis [32,34,36,37,46–52]. Among these, the reported sensitivities were above 90% only in the studies by Qiu et al. and Ölmez et al. [36,37]. Additionally, except for the models for the anterior and posterior horns and the body of the meniscus in the study by Tack and colleagues [34], axial MRI in Kara and colleagues’ study [46], and the study by Sharma et al. [52], all of the AUCs reported in the mentioned studies were high (≥0.9) [32,34,36].
Few studies concentrate on AI performance in detecting meniscus damage compared to other pathologies. A systematic review by Kunze et al. [54] identified five studies investigating meniscus tears. For AI algorithms, they reported an AUC range of 0.84 to 0.91 and a prediction accuracy range of 75% to 90%. Likewise, they mentioned that AI algorithms did not outperform expert clinicians [54]. Similarly, according to our pooled results in the internal validation setting, clinicians’ diagnostic performance surpassed AI algorithms, even separately for medial and lateral menisci in subgroup analysis. In another review by Fritz et al. [55], a wide range of results were reported. The sensitivity ranged from 58% to 89%, with considerably lower sensitivity for lateral meniscus, and specificity ranged from 74% to 92% [55]. Our result also had the same pattern, demonstrating higher pooled sensitivity, specificity, and AUC values for the medial meniscus. Another point of interest in the studies was to develop an algorithm for the segmentation, classification, and diagnosis of anterior cruciate ligament (ACL) tears. In the study by Dung et al. [56], they achieved an accuracy of 92% for fully ruptured ACL. Our primary assumption is that these algorithms might be beneficial in further studies to classify and detect meniscal damages.
Because of some limitations, our results should be interpreted cautiously. First, although adherence to the TRIPOD checklist was acceptable, 70% of the included studies were found to be at high risk of bias according to PROBAST. Second, subgroup and meta-regression analyses were conducted for data augmentation and transfer learning, view of imaging, and meniscal tear because of noticeable heterogeneity between the studies. Third, the publication bias analysis indicated a higher risk of publication bias among the studies assessing clinicians’ performance on internal test sets. Fourth, no studies externally validating clinicians were found, which complicates the interpretation of the external validation results due to the lack of a comparator. Moreover, none of the included studies reported data on the AI-aided diagnostic performance of clinicians. Correspondingly, further studies are needed in this field to accurately estimate the effect of implementing AI in detecting meniscal damage. Last but not least, the terminology of meniscal damage comprises various subtypes, including traumatic, degenerative, or mixed tears that can be partial or complete [57]. Meniscal displacement, fragment dislocation, disinsertion, etc., are other causes of meniscal damage [57]. In line with this, a high level of between-study heterogeneity was found in the terminology used to define meniscal damage subtypes. Almost half of the included studies did not report the meniscal damage subtype [2,9,29,31,34,36–39,41,43,44], and even some that did failed to report the exact category [30,32,33]. Therefore, performing a subgroup analysis based on the meniscal damage subtype was impossible. The between-study heterogeneity extended even beyond this: nine studies [29,30,32,34,35,37,38,41,43] tested their algorithm only on sagittal MRI planes, which can lead to misinterpretation of the results.
Another possible obstacle in interpreting our results is false positive findings due to conditions such as meniscal ossicles [58]. However, the details of the reported characteristics of meniscal damage are mentioned in S4 Table.
Conclusion
To conclude, the use of AI as a diagnostic tool is burgeoning, especially in image-based diagnoses. The results of this study indicate lower diagnostic performance of AI-based algorithms for knee meniscal injuries compared with radiologists. Future studies providing data on the performance of AI algorithms in detecting the various meniscal damage subtypes are warranted to shed light on the exact applicability of AI-based algorithms in real-world clinical settings. Another interest of future studies could be determining the validity of AI-based algorithms in identifying meniscal lesions that require surgical treatment.
Supporting information
S5 Table. Internal validation results for algorithms and clinicians.
https://doi.org/10.1371/journal.pone.0326339.s005
(DOCX)
S6 Table. External validation results for algorithms and clinicians.
https://doi.org/10.1371/journal.pone.0326339.s006
(DOCX)
S7 Table. Meta-regression, AI on internal validation.
https://doi.org/10.1371/journal.pone.0326339.s007
(DOCX)
S8 Table. Meta-regression, AI on internal validation lateral meniscus.
https://doi.org/10.1371/journal.pone.0326339.s008
(DOCX)
S9 Table. Meta-regression, AI on internal validation medial meniscus.
https://doi.org/10.1371/journal.pone.0326339.s009
(DOCX)
S10 Table. Meta-regression, AI on clinicians internal validation.
https://doi.org/10.1371/journal.pone.0326339.s010
(DOCX)
S11 Table. Meta-regression, AI on external validation.
https://doi.org/10.1371/journal.pone.0326339.s011
(DOCX)
References
- 1. Makris EA, Hadidi P, Athanasiou KA. The knee meniscus: structure-function, pathophysiology, current repair techniques, and prospects for regeneration. Biomaterials. 2011;32(30):7411–31. pmid:21764438
- 2. Hung TNK, Vy VPT, Tri NM, Hoang LN, Tuan LV, Ho QT, et al. Automatic detection of meniscus tears using backbone convolutional neural networks on knee MRI. J Magn Reson Imaging. 2023;57(3):740–9. pmid:35648374
- 3. Favero M, Ramonda R, Goldring MB, Goldring SR, Punzi L. Early knee osteoarthritis. RMD Open. 2015;1(Suppl 1):e000062.
- 4. Pache S, Aman ZS, Kennedy M, Nakama GY, Moatshe G, Ziegler C, et al. Meniscal root tears: current concepts review. Arch Bone Jt Surg. 2018.
- 5. Crawford R, Walley G, Bridgman S, Maffulli N. Magnetic resonance imaging versus arthroscopy in the diagnosis of knee pathology, concentrating on meniscal lesions and ACL tears: a systematic review. Br Med Bull. 2007;84:5–23. pmid:17785279
- 6. De Smet AA. How I diagnose meniscal tears on knee MRI. AJR Am J Roentgenol. 2012;199(3):481–99. pmid:22915388
- 7. Shakoor D, Kijowski R, Guermazi A, Fritz J, Roemer FW, Jalali-Farahani S, et al. Diagnosis of knee meniscal injuries by using three-dimensional MRI: a systematic review and meta-analysis of diagnostic performance. Radiology. 2019;290(2):435–45. pmid:30457479
- 8. Lecouvet F, Van Haver T, Acid S, Perlepe V, Kirchgesner T, Vande Berg B, et al. Magnetic resonance imaging (MRI) of the knee: identification of difficult-to-diagnose meniscal lesions. Diagn Interv Imaging. 2018;99(2):55–64. pmid:29396088
- 9. Shin H, Choi GS, Shon O-J, Kim GB, Chang MC. Development of convolutional neural network model for diagnosing meniscus tear using magnetic resonance image. BMC Musculoskelet Disord. 2022;23(1):510. pmid:35637451
- 10. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism. 2017;69S:S36–40. pmid:28126242
- 11. Mintz Y, Brodie R. Introduction to artificial intelligence in medicine. Minim Invasive Ther Allied Technol. 2019;28(2):73–81. pmid:30810430
- 12. Zaharchuk G, Gong E, Wintermark M, Rubin D, Langlotz CP. Deep learning in neuroradiology. AJNR Am J Neuroradiol. 2018;39(10):1776–84. pmid:29419402
- 13. Katzman BD, van der Pol CB, Soyer P, Patlas MN. Artificial intelligence in emergency radiology: a review of applications and possibilities. Diagn Interv Imaging. 2023;104(1):6–10. pmid:35933269
- 14. Xue P, Si M, Qin D, Wei B, Seery S, Ye Z, et al. Unassisted clinicians versus deep learning-assisted clinicians in image-based cancer diagnostics: systematic review with meta-analysis. J Med Internet Res. 2023;25:e43832. pmid:36862499
- 15. Kuo RYL, Harrison C, Curran T-A, Jones B, Freethy A, Cussons D, et al. Artificial intelligence in fracture detection: a systematic review and meta-analysis. Radiology. 2022;304(1):50–62. pmid:35348381
- 16. McInnes MDF, Moher D, Thombs BD, McGrath TA, Bossuyt PM, the PRISMA-DTA Group, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: The PRISMA-DTA statement. JAMA. 2018;319(4):388–96. pmid:29362800
- 17. Alabousi M, Soyer P, Patlas MN. Writing a successful systematic review manuscript for a radiology journal. Can Assoc Radiol J. 2023;74(3):471–3. pmid:36046850
- 18. Dwamena B. MIDAS: Stata module for meta-analytical integration of diagnostic test accuracy studies. 2009. Available from: https://EconPapers.repec.org/RePEc:boc:bocode:s456880
- 19. Harbord RM, Whiting P. Metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression. Stata J. 2009.
- 20. Mohammadi S, Salehi MA, Jahanshahi A, Shahrabi Farahani M, Zakavi SS, Behrouzieh S, et al. Artificial intelligence in osteoarthritis detection: a systematic review and meta-analysis. Osteoarthritis Cartilage. 2024;32(3):241–53. pmid:37863421
- 21. Lin L, Chu H. Quantifying publication bias in meta-analysis. Biometrics. 2018;74(3):785–94. pmid:29141096
- 22. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med. 2015;13:1. pmid:25563062
- 23. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170(1):51–8. pmid:30596875
- 24. Kostopoulos S, Sidiropoulos K, Glotsos D, Athanasiadis E, Boutsikou K, Lavdas E, et al. Pattern-recognition system, designed on GPU, for discriminating between injured normal and pathological knee cartilage. Magn Reson Imaging. 2013;31(5):761–70. pmid:23333579
- 25. Ebrahimkhani S, Dharmaratne A, Jaward MH, Wang Y, Cicuttini FM. Automated segmentation of knee articular cartilage: joint deep and hand-crafted learning-based framework using diffeomorphic mapping. Neurocomputing. 2022;467:36–55.
- 26. Lee J-I, Kim D-H, Yoo H-J, Choi H-G, Lee Y-S. Comparison of the predicting performance for fate of medial meniscus posterior root tear based on treatment strategies: a comparison between logistic regression, gradient boosting, and CNN algorithms. Diagnostics (Basel). 2021;11(7):1225. pmid:34359308
- 27. Key S, Baygin M, Demir S, Dogan S, Tuncer T. Meniscal tear and ACL injury detection model based on AlexNet and Iterative ReliefF. J Digit Imaging. 2022;35(2):200–12. pmid:35048231
- 28. Ramakrishna B, Safdar N, Siddiqui K, Kim W, Liu W, Saiprasad G, et al. Automated discovery of meniscal tears on MR imaging: a novel high-performance computer-aided detection application for radiologists. Medical Imaging 2008: Computer-Aided Diagnosis. SPIE; 2008. p. 691531. https://doi.org/10.1117/12.773167
- 29. Zarandi MHF, Khadangi A, Karimi F, Turksen IB. A computer-aided Type-II fuzzy image processing for diagnosis of meniscus tear. J Digit Imaging. 2016;29(6):677–95. pmid:27198133
- 30. Saygılı A, Albayrak S. An efficient and fast computer-aided method for fully automated diagnosis of meniscal tears from magnetic resonance images. Artif Intell Med. 2019;97:118–30. pmid:30527276
- 31. Bien N, Rajpurkar P, Ball RL, Irvin J, Park A, Jones E, et al. Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet. PLoS Med. 2018;15(11):e1002699. pmid:30481176
- 32. Couteaux V, Si-Mohamed S, Nempont O, Lefevre T, Popoff A, Pizaine G, et al. Automatic knee meniscus tear detection and orientation classification with Mask-RCNN. Diagn Interv Imaging. 2019;100(4):235–42. pmid:30910620
- 33. Fritz B, Marbach G, Civardi F, Fucentese SF, Pfirrmann CWA. Deep convolutional neural network-based detection of meniscus tears: comparison with radiologists and surgery as standard of reference. Skeletal Radiol. 2020;49(8):1207–17. pmid:32170334
- 34. Tack A, Shestakov A, Lüdke D, Zachow S. A multi-task deep learning method for detection of meniscal tears in MRI data from the osteoarthritis initiative database. Front Bioeng Biotechnol. 2021;9:747217. pmid:34926416
- 35. Astuto B, Flament I, K Namiri N, Shah R, Bharadwaj U, M Link T, et al. Automatic deep learning-assisted detection and grading of abnormalities in knee MRI studies. Radiol Artif Intell. 2021;3(3):e200165. pmid:34142088
- 36. Qiu X, Liu Z, Zhuang M, Cheng D, Zhu C, Zhang X. Fusion of CNN1 and CNN2-based magnetic resonance image diagnosis of knee meniscus injury and a comparative analysis with computed tomography. Comput Methods Programs Biomed. 2021;211:106297. pmid:34536633
- 37. Ölmez E, Akdoğan V, Korkmaz M, Er O. Automatic segmentation of meniscus in multispectral MRI using regions with convolutional neural network (R-CNN). J Digit Imaging. 2020;33(4):916–29. pmid:32488659
- 38. Li Y-Z, Wang Y, Fang K-B, Zheng H-Z, Lai Q-Q, Xia Y-F, et al. Automated meniscus segmentation and tear detection of knee MRI with a 3D mask-RCNN. Eur J Med Res. 2022;27(1):247. pmid:36372871
- 39. Chou Y-T, Lin C-T, Chang T-A, Wu Y-L, Yu C-E, Ho T-Y, et al. Development of artificial intelligence-based clinical decision support system for diagnosis of meniscal injury using magnetic resonance images. Biomed Signal Process Control. 2023;82:104523.
- 40. Wang Y, Li Y, Huang M, Lai Q, Huang J, Chen J. Feasibility of constructing an automatic meniscus injury detection model based on dual-mode magnetic resonance imaging (MRI) radiomics of the knee joint. Comput Math Methods Med. 2022;2022:2155132. pmid:35392588
- 41. Köse C, Gençalioğlu O, Şevik U. An automatic diagnosis method for the knee meniscus tears in MR images. Expert Systems with Applications. 2009;36(2):1208–16.
- 42. Pedoia V, Norman B, Mehany SN, Bucknor MD, Link TM, Majumdar S. 3D convolutional neural networks for detection and severity staging of meniscus and PFJ cartilage morphological degenerative changes in osteoarthritis and anterior cruciate ligament subjects. J Magn Reson Imaging. 2019;49(2):400–10. pmid:30306701
- 43. Roblot V, Giret Y, Bou Antoun M, Morillot C, Chassin X, Cotten A, et al. Artificial intelligence to diagnose meniscus tears on MRI. Diagn Interv Imaging. 2019;100(4):243–9. pmid:30928472
- 44. Rizk B, Brat H, Zille P, Guillin R, Pouchy C, Adam C, et al. Meniscal lesion detection and characterization in adult knee MRI: A deep learning model approach with external validation. Phys Med. 2021;83:64–71. pmid:33714850
- 45. Li J, Qian K, Liu J, Huang Z, Zhang Y, Zhao G, et al. Identification and diagnosis of meniscus tear by magnetic resonance imaging using a deep learning model. J Orthop Translat. 2022;34:91–101. pmid:35847603
- 46. Kara AC, Hardalaç F. Detection and classification of knee injuries from MR Images Using the MRNet Dataset with progressively operating deep learning methods. Mach Learn Knowl Extr. 2021;3(4):1009–29.
- 47. Thengade A, Rajurkar A. Comparative analysis of deep convolutional neural network for detection of knee injuries. Int J Eng Trends Technol. 2024;72(2):47–57.
- 48. Güngör E, Vehbi H, Cansın A, Ertan MB. Achieving high accuracy in meniscus tear detection using advanced deep learning models with a relatively small data set. Knee Surg Sports Traumatol Arthrosc. 2025;33(2):450–6. pmid:39015056
- 49. Harman F, Selver MA, Baris MM, Canturk A, Oksuz I. Deep learning-based meniscus tear detection from accelerated MRI. IEEE Access. 2023;11:144349–63.
- 50. Jiang K, Xie Y, Zhang X, Zhang X, Zhou B, Li M, et al. Fully and weakly supervised deep learning for meniscal injury classification, and location based on MRI. J Imaging Inform Med. 2025;38(1):191–202. pmid:39020156
- 51. Mangone M, Diko A, Giuliani L, Agostini F, Paoloni M, Bernetti A, et al. A machine learning approach for knee injury detection from magnetic resonance imaging. Int J Environ Res Public Health. 2023;20(12):6059. pmid:37372646
- 52. Sharma S, Umer M, Bhagat A, Bala J, Rattan P, Rahmani AW. A ResNet50-based approach to detect multiple types of knee tears using MRIs. Math Probl Eng. 2022;2022:1–9.
- 53. Ma Y, Qin Y, Liang C, Li X, Li M, Wang R, et al. Visual cascaded-progressive convolutional neural network (C-PCNN) for diagnosis of meniscus injury. Diagnostics (Basel). 2023;13(12):2049. pmid:37370944
- 54. Kunze KN, Rossi DM, White GM, Karhade AV, Deng J, Williams BT, et al. Diagnostic performance of artificial intelligence for detection of anterior cruciate ligament and meniscus tears: a systematic review. Arthroscopy. 2021;37(2):771–81. pmid:32956803
- 55. Fritz B, Fritz J. Artificial intelligence for MRI diagnosis of joints: a scoping review of the current state-of-the-art of deep learning-based approaches. Skeletal Radiol. 2022;51(2):315–29. pmid:34467424
- 56. Dung NT, Thuan NH, Van Dung T, Van Nho L, Tri NM, Vy VPT, et al. End-to-end deep learning model for segmentation and severity staging of anterior cruciate ligament injuries from MRI. Diagn Interv Imaging. 2023;104(3):133–41. pmid:36328943
- 57. Jarraya M, Roemer FW, Englund M, Crema MD, Gale HI, Hayashi D, et al. Meniscus morphology: Does tear type matter? A narrative review with focus on relevance for osteoarthritis research. Semin Arthritis Rheum. 2017;46(5):552–61. pmid:28057326
- 58. Caudal A, Guenoun D, Lefebvre G, Nisolle J-F, Gorcos G, Vuillemin V, et al. Medial meniscal ossicles: associated knee MRI findings in a multicenter case-control study. Diagn Interv Imaging. 2021;102(5):321–7. pmid:33339774