New AI-algorithms on smartphones to detect skin cancer in a clinical setting—A validation study

  • Teresa Kränke,

    Roles Data curation, Supervision, Validation, Writing – original draft

    teresa.kraenke@medunigraz.at

    Affiliation Department of Dermatology and Venereology, Medical University of Graz, Graz, Austria

  • Katharina Tripolt-Droschl,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Department of Dermatology and Venereology, Medical University of Graz, Graz, Austria

  • Lukas Röd,

    Roles Data curation, Formal analysis

    Affiliation Medical University of Graz, Graz, Austria

  • Rainer Hofmann-Wellenhof,

    Roles Data curation, Formal analysis, Project administration, Writing – review & editing

    Affiliation Department of Dermatology and Venereology, Medical University of Graz, Graz, Austria

  • Michael Koppitz,

    Roles Data curation, Formal analysis, Software, Writing – review & editing

    Affiliation Medical University of Graz, Graz, Austria

  • Michael Tripolt

    Roles Conceptualization, Data curation, Investigation, Project administration, Writing – review & editing

    Affiliation Department of Dermatology and Venereology, Medical University of Graz, Graz, Austria

Abstract

Background and objectives

The incidence of skin cancer is rising worldwide and there is a medical need to optimize its early detection. This study was conducted to determine the diagnostic and risk-assessment accuracy of two new diagnosis-based neural networks (analyze and detect), which comply with the CE-criteria, in evaluating the malignant potential of various skin lesions on a smartphone. Of note, the intention of our study was to evaluate the performance of these medical products in a clinical setting for the first time.

Methods

This was a prospective, single-center clinical study at one tertiary referral center in Graz, Austria. Patients who were scheduled either for a preventive skin examination or for removal of at least one skin lesion were eligible for participation. Patients were assessed by at least two dermatologists and by the integrated algorithms on different mobile phones. The lesions to be recorded were randomly selected by the dermatologists. The diagnosis of the algorithm was considered correct if it matched the diagnosis of the two dermatologists or the histology (if available). Histology was the reference standard; however, if both clinicians considered a lesion benign, no histology was performed and the dermatologists' diagnosis served as the reference standard.

Results

A total of 238 patients (86 female; 36.13%) with 1171 lesions and an average age of 66.19 years (SD = 17.05) were included. Sensitivity and specificity of the detect algorithm were 96.4% (CI 93.94–98.85) and 94.85% (CI 92.46–97.23); for the analyze algorithm a sensitivity of 95.35% (CI 93.45–97.25) and a specificity of 90.32% (CI 88.1–92.54) were achieved.

Discussion

The studied neural networks succeeded in analyzing the risk of skin lesions with high diagnostic accuracy, showing that they are suitable tools for calculating the probability of a skin lesion being malignant. In conjunction with the widespread use of smartphones, this new AI approach opens the opportunity for a higher early detection rate of skin cancer, with a consequently lower epidemiological burden of metastatic cancer and reduced health care costs. These neural networks moreover facilitate the empowerment of patients, especially in regions with a low density of medical doctors.

Registration

Approved and registered at the ethics committee of the Medical University of Graz, Austria (Approval number: 30–199 ex 17/18).

Introduction

The incidence of skin cancer, both malignant melanoma (MM) and non-melanoma skin cancer (NMSC), is rising worldwide. In Europe, over 144.000 new MM cases are reported each year, being responsible for more than 27.000 deaths per year [1, 2]. The most common NMSC are basal cell carcinomas (BCC) and squamous cell carcinomas (SCC). However, the exact count of NMSC in Europe cannot be determined, as not all tumors are recorded in local databases. Data from Germany suggest an incidence of 119–145/100.000 in 2010 [3, 4]. Both BCCs and SCCs usually have a favorable prognosis, but also have the potential for locally destructive growth and, in advanced cases, for metastatic disease. The reported metastasis rate of BCC ranges from 0.0029% to 0.55%, with common sites being the regional lymph nodes, lungs, bones, skin, and the liver. Focusing on SCC, it is reported that approximately 4% of all patients will develop metastases and 1.5% die from the disease [5–9]. Recent data from the American Academy of Dermatology [10] estimate that NMSC affects more than 3 million Americans per year and that 196.060 new cases of melanoma were diagnosed in 2020.

Although there have been advancements in the treatment of metastatic skin cancer in the last decade, the mortality rates, especially those of MM, still strongly depend on early detection [11–13]. While the 5-year survival rate according to the AJCC-Classification 8 (American Joint Committee on Cancer) is nearly 100% for very thin melanomas, it is less than 30% for advanced stages. Consequently, early detection of skin cancer is crucial to avoid metastatic disease as well as high morbidity and mortality rates. Of note, health care costs are another considerable factor that can be influenced by early detection. A recent Australian study showed yearly average costs of 115.109 AUS$ per case for metastatic melanoma; in contrast, the yearly costs for the early stages 0–1 are about 1681 AUS$ on average [14].

There is growing evidence that artificial intelligence is a valuable supplementary tool in various medical sectors (e.g., radiology and dermatology) [15, 16]. The emergence of new technological tools, especially convolutional neural networks (CNNs), has enabled an automated, in vitro image-based diagnosis of various skin diseases [17]. Several studies [18–28] investigated CNNs regarding their diagnostic accuracy in melanoma recognition. Notably, most skin cancer recognition networks have so far been used for the classification of high-quality images. However, in a realistic scenario a high variance in image quality and image characteristics has to be taken into account. Very recently, a meta-analysis [29] reported an unreliable performance for smartphone-based applications; the application with the best performance had a sensitivity of 80% and a specificity of 78%.

Neural networks

We used a classical convolutional neural network (CNN) and a novel stratification CNN based on a region proposal network (RPN). Both are usable on smartphones in daily routine. Notably, both algorithms were previously developed, already fulfill the CE-criteria and are registered as a medical product at the Austrian Federal Office for Safety in Health Care (registration number: 10965455). The effectiveness of similar neural networks was recently demonstrated in an “in-vivo” skin cancer classification task [28]. The RPN is more hardware-demanding and needs a longer processing time (2–5 seconds) on currently available smartphones, but this is still acceptable for analyzing the lesions. More detailed information on the algorithms is given in the supplementary material.

The aim of the study presented herein was to evaluate and validate the diagnostic accuracy in skin cancer recognition of the two different neural networks (image classifier/analyze and region proposal network / detect), which were tested separately and in conjunction on a mobile phone in a clinical setting. This constitutes a novel approach in optical clinical detection of skin lesions.

Methods

Study design

The study was prospective and designed as a single-center study at the Department of Dermatology and Venereology, a tertiary referral center, in Graz in order to evaluate the diagnostic and risk-assessment accuracy of two CNNs in comparison to the histopathological and clinical diagnosis. The study was approved by the local ethics committee (Approval number: 30–199 ex 17/18). All procedures were conducted according to the principles of the Declaration of Helsinki and patients gave written informed consent prior to enrollment. The risk classification of the CNN was considered correct if it matched the clinical diagnosis of two dermatologists or the histological diagnosis, if available. Notably, the histological diagnosis (in case a biopsy/excision was done) was always considered the reference standard. However, if both dermatologists evaluated a lesion as benign (based on distinct clinical and dermoscopic malignancy criteria), no histology was performed; this procedure explains the low number of histologically proven lesions compared with the number of included lesions. Consequently, the final decision (whether to biopsy/excise a lesion or not) was always made by the two dermatologists, even if the algorithm made an opposite risk classification. A small proportion of patients declined biopsy/excision of a suspicious lesion. In these cases, the diagnosis of the two dermatologists was considered the reference.

Moreover, a clinical and dermoscopic digital follow-up of all non-excised “dysplastic nevi” was performed 3–6 months after evaluation.

Participants

Patients with a minimum age of 18 years who were scheduled either for a preventive skin examination or for removal of at least one skin lesion were eligible for participation. The recruitment process was consecutive, meaning that every patient who was seen by one of the study authors was asked to participate. The only exclusion criterion was a residual tumor after incomplete resection of any skin cancer. These broadly defined inclusion criteria are based on the fact that the study was conducted at a tertiary referral center usually caring for patients at high risk of developing skin cancer of any type. The enrollment phase was between June 2018 and December 2019, and on average five lesions (both benign- and malignant-appearing) were scanned per patient.

Procedure and mobile devices

In a first step, patients were screened clinically and dermoscopically by at least two experienced dermatologists independently of each other. The evaluation of the lesions was then made by consensus. Scans of nails and scalp hair were excluded. Notably, clearly benign lesions were also selected for further AI evaluation. Second, images of the selected lesions were taken by a third dermatologist using the integrated camera and the flash of different mobile phones (Samsung S7, Honor 7A, iPhone Xs, iPhone 6s, Huawei P10) from a distance of approximately 15 cm. To ensure adequate blinding, the evaluating dermatologists in the first step had no knowledge of the algorithms' evaluation. As the detect algorithm was in development until March 2019, the first 132 patients were assessed only with the analyze algorithm.

Algorithms

We tested two different machine-learning algorithms, named analyze and detect. During their development phase, they were trained with 18.384 images, each labeled with one of 47 distinct subcategories. Both algorithms are based on a two-step approach: 1) calculating probabilities for the different labels per image, and 2) risk assessment based on these probabilities.

Inspired by previous publications, we developed a three-level decision-tree (subcategory, category, risk level) (Fig 1). The 47 subcategories were divided into five categories (benign, anatomical structure, non-neoplastic, precancerous, malignant). These categories were then divided into three risk levels (low, medium, high). Detailed information about the algorithms is given in the supplementary material.

Fig 1. Graphical depiction of the "three-level decision-tree".

https://doi.org/10.1371/journal.pone.0280670.g001
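The two-step approach and the three-level tree can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the subcategory-to-category table is a hypothetical excerpt (the study uses 47 subcategories), the category-to-risk mapping is inferred from the risk groups reported in the Results, and the aggregation rule (summing label probabilities per risk level and taking the largest sum) is only one plausible way to derive a risk level from probabilities. The actual rule is described in the supplementary material and may differ.

```python
from collections import defaultdict

# Hypothetical excerpt of the subcategory -> category table (assumption).
SUBCATEGORY_TO_CATEGORY = {
    "seborrheic keratosis": "benign",
    "hematoma": "non-neoplastic",
    "actinic keratosis": "precancerous",
    "dysplastic nevus": "precancerous",
    "basal cell carcinoma": "malignant",
    "malignant melanoma": "malignant",
}

# Assumed category -> risk level mapping, inferred from the reported risk groups.
CATEGORY_TO_RISK = {
    "benign": "low",
    "anatomical structure": "low",
    "non-neoplastic": "low",
    "precancerous": "medium",
    "malignant": "high",
}


def top_path(probabilities):
    """Walk the tree for the most probable label: (subcategory, category, risk)."""
    subcategory = max(probabilities, key=probabilities.get)
    category = SUBCATEGORY_TO_CATEGORY[subcategory]
    return subcategory, category, CATEGORY_TO_RISK[category]


def aggregated_risk(probabilities):
    """Sum label probabilities per risk level and return the level with the largest sum."""
    mass = defaultdict(float)
    for subcategory, p in probabilities.items():
        mass[CATEGORY_TO_RISK[SUBCATEGORY_TO_CATEGORY[subcategory]]] += p
    return max(mass, key=mass.get)


# Step 1 output (made-up probabilities for one image), then step 2.
example = {
    "seborrheic keratosis": 0.4,
    "basal cell carcinoma": 0.3,
    "malignant melanoma": 0.3,
}
print(top_path(example))         # path of the single most probable label
print(aggregated_risk(example))  # risk level with the largest summed probability
```

In this made-up example the single most probable label is benign, while the summed probability of the two malignant labels is larger. Under the assumed aggregation rule the risk level can therefore differ from the level of the top-ranked label.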

Statistical analysis

To quantify the risk-assessment accuracy of both algorithms, the sensitivity and specificity of these systems in comparison to the histopathological (if available) and clinical diagnosis were calculated. In contrast to binary diagnostic tools (healthy versus sick, high versus low risk), the studied approaches provided three risk levels, in concordance with the clinical practice of assessing a skin lesion as benign, precancerous or malignant. To calculate the sensitivity and specificity, the risk levels “medium” and “high” were combined as “non-benign” and the risk level “low” was accordingly termed “benign”. This classification is in line with clinical relevance, as both risk levels in the group “non-benign” need further medical action regardless of their risk level. Statistical parameters were thus based on the risk levels high and medium versus low, with the endpoint benign vs. non-benign.
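The dichotomization described above can be sketched in a few lines of Python. The function names and the toy data are hypothetical. The confidence interval uses the normal (Wald) approximation, which is an assumption: the reported intervals are symmetric about their point estimates, which is consistent with such an approximation, but the method is not stated in the text.

```python
import math


def is_non_benign(risk_level):
    """Collapse the three risk levels: 'medium' and 'high' count as non-benign."""
    return risk_level in ("medium", "high")


def proportion_with_ci(successes, n, z=1.96):
    """Point estimate with a Wald (normal approximation) 95% interval (assumed method)."""
    if n == 0:
        raise ValueError("no cases available for this proportion")
    estimate = successes / n
    half_width = z * math.sqrt(estimate * (1.0 - estimate) / n)
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)


def sensitivity_specificity(predicted, reference):
    """Compare algorithm risk levels with reference risk levels, lesion by lesion."""
    tp = fn = tn = fp = 0
    for pred, ref in zip(predicted, reference):
        p, r = is_non_benign(pred), is_non_benign(ref)
        if r and p:
            tp += 1
        elif r:
            fn += 1
        elif p:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "specificity": proportion_with_ci(tn, tn + fp),
    }


# Toy data (not study data): 100 non-benign and 200 benign reference lesions.
reference = ["high"] * 100 + ["low"] * 200
predicted = ["high"] * 90 + ["medium"] * 5 + ["low"] * 5 + ["low"] * 180 + ["medium"] * 20
result = sensitivity_specificity(predicted, reference)
print(result)
```

Note that a "high" reference lesion rated "medium" still counts as a true positive here, because both levels fall into the non-benign group.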

The specificity was calculated twice for each algorithm, once including and once excluding images of the risk category “benign”. For binary analyses, the lesions rated as “benign” were excluded in one calculation in order to adequately differentiate the risk categories “medium/yellow” and “high/red”. Performance differences between both algorithms were assessed via 2-sample tests for equality of proportions with continuity correction based on Pearson’s Chi-square statistic.
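A 2-sample test for equality of proportions with continuity correction can be written with the Python standard library alone, as in the sketch below. The function name and the example counts are hypothetical and are not study data. The statistic is Pearson's Chi-square on the 2×2 table with Yates' correction (capped at the absolute deviation, as is common in statistical software), and the p-value uses the fact that a Chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def two_proportion_test(x1, n1, x2, n2):
    """Yates-corrected Pearson Chi-square test for two proportions x1/n1 and x2/n2.

    Returns (statistic, p_value) with one degree of freedom.
    """
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_sums = [n1, n2]
    total = n1 + n2
    col_sums = [x1 + x2, total - x1 - x2]
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("test undefined when a row or column of the table is empty")

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            deviation = abs(observed[i][j] - expected)
            # Continuity correction, never larger than the deviation itself.
            corrected = deviation - min(0.5, deviation)
            statistic += corrected ** 2 / expected

    # For 1 degree of freedom: P(Chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: 90 of 100 correct versus 50 of 100 correct.
stat, p = two_proportion_test(90, 100, 50, 100)
print(f"chi2 = {stat:.2f}, p = {p:.2g}")
```

The same function applies to any pair of proportions compared in this way, for example correct diagnoses out of all scanned lesions per algorithm.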

Results

Patients' and lesions' characteristics

A total of 238 patients (86 female; 36.13%) with 1171 lesions were included; all of them were scanned with the analyze algorithm, and 92 patients (38.65%; 27 female, 29.35%) with 552 lesions were additionally screened with the detect algorithm. The average age was 66.19 years (SD = 17.05) in the analyze test group and 66.42 years (SD = 17) in the detect group. The distribution of skin types according to Fitzpatrick was as follows: Analyze: skin type I: 29 (12.2%); skin type II: 141 (59.2%); skin type III: 64 (26.9%); skin type IV: 4 (1.7%). Detect: skin type I: 13 (14.1%); skin type II: 54 (58.7%); skin type III: 22 (24%); skin type IV: 3 (3.2%). On average, 5 lesions per patient (range: 1 to 18) were selected and scanned. The detailed distribution is shown in Table 1.

Table 1. Distribution of the scanned lesions in the different age classes.

The total number of participants as well as the percentage in the respective algorithm-group is shown.

https://doi.org/10.1371/journal.pone.0280670.t001

Lesions allocated to the risk group “high” (n = 196): Malignant melanoma (n = 20), squamous cell carcinoma (n = 55), basal cell carcinoma (n = 114) and Bowen's disease (n = 7).

Lesions allocated to the risk group “medium” (n = 283): Actinic keratosis (n = 115) and dysplastic nevus (n = 168).

The assignment of the lesions to the respective risk groups by the algorithm is given in Tables 2–4. A histological examination was performed in 165 lesions (154 analyze and 67 detect).

Table 2. Assignment of the lesions to the respective risk groups by the algorithm.

https://doi.org/10.1371/journal.pone.0280670.t002

Table 3.

Shown are the absolute numbers of the six subcategories “red/high” and “yellow/medium” with their allocation to the three risk groups by each algorithm (a and b). 3c shows the histopathological results of all excised lesions. 3d and 3e display crosstabulations of the respective algorithm with the clinical category.

https://doi.org/10.1371/journal.pone.0280670.t003

Table 4. Crosstabulations risk*histology for ANALYZE (a) and DETECT (b).

https://doi.org/10.1371/journal.pone.0280670.t004

Mobile devices

Android was the predominant operating system on the mobile devices for both algorithms (85.09% of cases for analyze and 85.74% for detect). The remaining scans were performed with iOS. Apart from the excluded body regions mentioned above, imaging was straightforward in most cases, and usually no more than one attempt per lesion was needed to obtain a good image. The average imaging time, including the recording of the patient's data, was three minutes.

Accuracy of the algorithms

Sub-category accuracy.

When focusing on the 47 subcategories, the diagnostic accuracy of the detect algorithm was 88.35% (best subcategories [each exceeding 88.35%]: Bowen's disease, comedo, hematoma, keloid; worst subcategory [50%]: dermatofibroma), compared to 81.74% in the analyze group (best subcategories [each exceeding 81.74%]: actinic keratosis, hemangioma, seborrheic keratosis, cyst; worst subcategory [33.3%]: hypopigmentation) (p = 0.0005, χ2 = 12.05). Fig 2 shows detailed information on the diagnostic accuracy of both algorithms for the 47 subcategories, summarized by their respective risk levels.

Fig 2. Diagnostic accuracy of both algorithms in the three risk levels as well as the overall diagnostic accuracy.

The columns indicate the percentages of correct diagnoses in the respective risk level.

https://doi.org/10.1371/journal.pone.0280670.g002

Concerning the operating systems, analyze showed an average diagnostic accuracy of 81.2% on Android and 84.83% on iOS (p = 0.29, χ2 = 1.11). For detect, the average diagnostic accuracy was 88.24% for Android and 89.02% for iOS (p = 0.98, χ2 = 0).

Overall accuracy.

Statistical analysis in the classification groups “benign” versus “non-benign” showed a sensitivity of 95.35% (CI 93.45–97.25) for analyze and 96.4% (CI 93.94–98.85) for detect. The specificity, including non-neoplastic lesions, increased to 90.32% (CI 88.1–92.54) for analyze and 94.85% (CI 92.46–97.23) for detect. No significant difference in sensitivity was found between the two algorithms (p = 0.6). However, we found a significant difference between the algorithms in their specificity (including non-neoplastic lesions p = 0.02, excluding non-neoplastic lesions p = 0.04).

A notably lower diagnostic accuracy for malignant lesions compared to benign lesions (76% with detect and 72% with analyze) was also observed and was mostly (over 70%) attributable to incorrect diagnoses within the malignant category.

Discussion

Our results indicate that both tested networks have the potential to serve as cost- and time-effective skin screening tools for the general population. Due to the substantial gain in processing power and camera quality, most mobile devices fulfill the technological requirements to enable mobile skin screenings at home (Fig 3).

Fig 3. Easy use of the algorithm by the patient on the smartphone.

https://doi.org/10.1371/journal.pone.0280670.g003

The use of artificial intelligence in this context has certain advantages like cost- and time-savings [30]. Of note, neural networks cannot replace a dermatologist, mostly because these systems are not able to take further diagnostic and/or therapeutic measures [30]. However, these systems are a promising concept to alert patients with (early) skin cancer. Consequently, the patients can consult a dermatologist earlier in order to avert further harm.

Both networks were trained with identical datasets and achieved similarly good results despite differences in network structure. Test and training images were taken with a variety of mobile devices, in different settings and by non-trained individuals, making the algorithms robust against low image quality and variations.

Several conceptual and practical differences between both networks have to be noted. A broad range of evidence supports whole-image analysis [15] and support vector machines [31]. The new approach of using RPNs in the context of skin cancer diagnosis has certain advantages. We showed that the diagnoses of the RPN-based detect algorithm were significantly more accurate than those of the whole-image analyze algorithm. A reason for the higher diagnostic accuracy might be a better performance for images with several or disjoint lesions. As every lesion is evaluated separately, smaller (pre-)cancerous lesions can be detected even near larger benign lesions. Nevertheless, this benefit did not carry over to the sensitivity of the benign vs. non-benign assessment. Another advantage of RPNs is that they draw bounding boxes highlighting the analyzed sites of the image, thereby allowing users to confirm that the intended lesion was analyzed (Fig 4). The only drawback is that detect requires more CPU capacity: on up-to-date smartphones it takes several seconds to analyze a picture. In contrast, analyze is less hardware-demanding and capable of real-time classification, even on a video stream.

Fig 4. The RPN-based detect algorithm with bounding boxes highlighting the four analyzed lesions of one image.

Every detected lesion is assigned to a risk level as indicated by different colors.

https://doi.org/10.1371/journal.pone.0280670.g004

Some limitations of this study have to be mentioned. One major challenge for studies in a hospital setting [29–44] is posed by the diagnosis validation of clinically non-malignant lesions. Clinically malignant and most precancerous skin lesions are resected and histologically examined, whereas the diagnosis of clinically benign lesions is based only on visual evaluation. Given a visual classification accuracy of below 90% (sensitivity 87% and specificity 81%) [37] in dermatologists, test and training data can be expected to contain a number of misclassifications. We aimed to reduce this bias by having clinically benign lesions validated by at least two dermatologists and by the follow-up of dysplastic nevi.

A second limitation is the generalizability of the study population in terms of risk profile, age and skin type. The average age of our study population, at over 65 years, is considerably higher than the average age of the Austrian population. In addition, both networks were trained and tested solely with lesions of the Central European population (Fitzpatrick skin types I–IV). The distribution of skin types underrepresents individuals with dark skin (Fitzpatrick V and VI) [44], so future training and test image sets need to be expanded. Additionally, we chose a highly selected population (mostly patients at high risk of developing any kind of skin cancer), so the results are not fully transferable to the general population; however, our primary aim was to detect malignant lesions with high sensitivity, which explains this selection.

Our study investigated the effectiveness of two smartphone-compatible neural networks in the risk assessment of skin lesions. Both approaches proved to be reliable and effective tools in skin cancer detection. The notably lower diagnostic accuracy for malignant lesions (76% with detect and 72% with analyze) compared to the other categories was mostly (over 70%) attributable to incorrect diagnoses within the malignant category. This confusion therefore did not influence the accuracy of the risk level assessment. The remaining lesions were wrongly assigned to the precancerous category, which also had no effect on the risk level assessment. These incorrect diagnoses/labels within the non-benign category (precancerous and malignant lesions) have no clinical relevance, as any allocation to this category requires further medical attention. Furthermore, the only distinction relevant for patients is that between benign and non-benign lesions. In this context, the study by Tran et al. [45], who investigated the diagnostic accuracy of general practitioners and dermatologists in various skin conditions (e.g., benign and malignant skin tumors, bacterial/fungal infections, inflammatory diseases), should be mentioned. The overall diagnostic accuracy of the general practitioners in their study was much lower than that of the neural networks in our study. Obviously, neural networks will never achieve a diagnostic accuracy of 100%; however, in comparison to the abovementioned study, our neural networks surpassed the general practitioners. Given that, our networks could have a positive impact on the health system, as unnecessary visits and histological examinations would be reduced [45, 46].

Despite the above-mentioned limitations, these networks are a promising and market-ready technological innovation with the potential to further increase awareness of skin cancer and promote its early detection. Thereby, the health and financial burden of skin cancer could be decreased for patients and society.

Conclusion

Our study showed both tested AI-approaches (CNN and RPN) to be capable of classifying images of various skin lesions with a high accuracy regarding the subcategory diagnosis and the risk assessment. Three main parameters of interest were examined: the diagnostic accuracy (showing the efficacy of the AI) and the benign vs. non-benign sensitivity and specificity (showing the clinical relevance). Concerning the overall diagnostic accuracy, the detect algorithm outperformed the analyze algorithm significantly (88% versus 82%). In terms of risk assessment, no significant differences were found between the two approaches concerning the sensitivity (each exceeding 95%); however, when focusing on the respective specificity, the detect algorithm outperformed the analyze algorithm (94.85% versus 90.43%). These results could alert patients with malignant lesions to consult a physician quickly, whereas patients with benign lesions can be spared an unnecessary visit, which is especially relevant in times of the Covid-19 pandemic. In conclusion, both algorithms surpassed the performance of automated classifications [14, 32–34] and assessments by physicians [32–34, 47, 48] in comparable classification tasks. These neural networks moreover facilitate the empowerment of patients, especially in regions with a low density of medical doctors.

References

  1. Ferlay J, Colombet M, Soerjomataram I, et al. Cancer incidence and mortality patterns in Europe: Estimates for 40 countries and 25 major cancers in 2018. Eur J Cancer. 2018;103:356–387. pmid:30100160
  2. World Cancer Research Fund. Wcrf.org/dietandcancer/cancer-trends/skin-cancer-statistics. Last accessed: March 27th, 2021
  3. Lomas A, Leonardi-Bee J, Bath-Hextall F. A systematic review of worldwide incidence of nonmelanoma skin cancer. Br J Dermatol. 2012;166(5):1069–80. pmid:22251204
  4. Eisemann N, Waldmann A, Geller AC, et al. Non-melanoma skin cancer incidence and impact of skin cancer screening on incidence. J Invest Dermatol. 2014;134(1):43–50. pmid:23877569
  5. Wysong A, Aasi SZ, Tang JY. Update on metastatic basal cell carcinoma: A summary of published cases from 1981 through 2011. JAMA Dermatol. 2013;149(5):615–6. pmid:23677097
  6. von Domarus H, Stevens PJ. Metastatic basal cell carcinoma: Report of five cases and review of 170 cases in the literature. J Am Acad Dermatol. 1984;10(6):1043–60.
  7. Karia PS, Han J, Schmults CD. Cutaneous squamous cell carcinoma: Estimated incidence of disease, nodal metastasis, and deaths from disease in the United States, 2012. J Am Acad Dermatol. 2013;68(6):957–66. pmid:23375456
  8. Jambusaria-Pahlajani A, Kanetsky PA, Pritesh KS, et al. Evaluation of AJCC tumor staging for cutaneous squamous cell carcinoma and a proposed alternative tumor staging system. JAMA Dermatol. 2013;149(4):402–10. pmid:23325457
  9. Tanese K, Nakamura Y, Hirai I, Funakoshi T. Updates on the systemic treatment of advanced non-melanoma skin cancer. Front Med (Lausanne). 2019 Jul 10;6:160. pmid:31355203
  10. https://www.aad.org/media/stats-skin-cancer [last access: 10th May 2021]
  11. Svedman FC, Pillas D, Tayler A, et al. Stage-specific survival and recurrence in patients with cutaneous malignant melanoma in Europe—a systematic review of the literature. Clin Epidemiol. 2016;8:109–22. pmid:27307765
  12. Zörnig I, Halama N, Bermejo JL, et al. Prognostic significance of spontaneous antibody responses against tumor-associated antigens in malignant melanoma patients. Int J Cancer. 2015;136(1):138–51. pmid:24839182
  13. Mandalà M, Imberti GL, Piazzalunga D, et al. Clinical and histopathological risk factors to predict sentinel lymph node positivity, disease-free and overall survival in clinical stages I-II AJCC skin melanoma: Outcome analysis from a single-institution prospectively collected database. Eur J Cancer. 2009;45(14):2537–45. pmid:19553103
  14. Doran CM, Ling R, Byrnes J, et al. Estimating the economic costs of skin cancer in New South Wales, Australia. BMC Public Health. 2015;15:952. pmid:26400024
  15. Rezazade Mehrizi MH, van Ooijen P, Homan M. Applications of artificial intelligence (AI) in diagnostic radiology: a technography study. Eur Radiol. 2021;31(4):1805–1811. pmid:32945967
  16. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–118. pmid:28117445
  17. Haenssle HA, Fink C, Winkler J, et al. Man against machine reloaded: performance of a market-approved convolutional neural network in classifying a broad spectrum of skin lesions in comparison with 96 dermatologists working under less artificial conditions. Ann Oncol. 2020;31(1):137–143. pmid:31912788
  18. Marka A, Carter JB, Toto E, Hassanpour S. Automated detection of nonmelanoma skin cancer using digital images: A systematic review. BMC Med Imaging. 2019;19(1):21. pmid:30819133
  19. Maron RC, Weichenthal M, Utikal JS, et al. Systematic outperformance of 112 dermatologists in multiclass skin cancer image classification by convolutional neural networks. Eur J Cancer. 2019;119:57–65. pmid:31419752
  20. Brinker TJ, Hekler A, Enk AH, et al. A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task. Eur J Cancer. 2019;111:148–154. pmid:30852421
  21. Brinker TJ, Hekler A, Enk AH, et al. Deep learning outperformed 136 of 157 dermatologists in a head-to-head dermoscopic melanoma image classification task. Eur J Cancer. 2019;113:47–54. pmid:30981091
  22. Zhao XY, Wu X, Li FF, et al. The Application of Deep Learning in the Risk Grading of Skin Tumors for Patients Using Clinical Images. J Med Syst. 2019;43(8):283. pmid:31300897
  23. Cui X, Wei R, Gong L, et al. Assessing the effectiveness of artificial intelligence methods for melanoma: A retrospective review. J Am Acad Dermatol. 2019;81(5):1176–1180. pmid:31255749
  24. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373–381. pmid:29953582
  25. Haenssle HA, Fink C, Stolz W, et al. Dermoscopy in special locations: Nails, acral skin, face, and mucosa. Hautarzt. 2019;70(4):295–311.
  26. Al-Masni MA, Kim DH, Kim TS. Multiple skin lesions diagnostics via integrated deep convolutional networks for segmentation and classification. Comput Methods Programs Biomed. 2020;190:105351. pmid:32028084
  27. Han SS, Kim MS, Lim W, et al. Classification of the Clinical Images for Benign and Malignant Cutaneous Tumors Using a Deep Learning Algorithm. J Invest Dermatol. 2018;138(7):1529–1538. pmid:29428356
  28. Udrea A, Mitra GD, Costea D, et al. Accuracy of a smartphone application for triage of skin lesions based on machine learning algorithms. J Eur Acad Dermatol Venereol. 2020;34(3):648–655. pmid:31494983
  29. Freeman K, Dinnes J, Chuchu N, et al. Algorithm based smartphone apps to assess risk of skin cancer in adults: systematic review of diagnostic accuracy studies. BMJ. 2020;368:m127. pmid:32041693
  30. Seeja RD, Suresh A. Deep learning based skin lesion segmentation and classification of melanoma using Support Vector Machine (SVM). Asian Pac J Cancer Prev. 2019;20(5):1555–1561. pmid:31128062
  31. Du-Harpur X, Watt FM, Luscombe NM, Lynch MD. What is AI? Applications of artificial intelligence to dermatology. Br J Dermatol. 2020;183(3):423–430. pmid:31960407
  32. Tschandl P, Rinner C, Apalla Z, et al. Human–computer collaboration for skin cancer recognition. Nat Med. 2020;26(8):1229–1234. pmid:32572267
  33. Chuchu N, Takwoingi Y, Dinnes J, et al. Smartphone applications for triaging adults with skin lesions that are suspicious for melanoma. Cochrane Database Syst Rev. 2018;12(12):CD013192. pmid:30521685
  34. Thissen M, Udrea A, Hacking M, et al. mHealth App for risk assessment of pigmented and nonpigmented skin lesions—a study on sensitivity and specificity in detecting malignancy. Telemed J E Health. 2017;23(12):948–954. pmid:28562195
  35. Silveira CEG, Carcano C, Mauad EC, et al. Cell phone usefulness to improve the skin cancer screening: preliminary results and critical analysis of mobile app development. Rural Remote Health. 2019;19(1):4895. pmid:30673294
  36. Hekler A, Utikal JS, Enk AH, et al. Superior skin cancer classification by the combination of human and artificial intelligence. Eur J Cancer. 2019;120:114–121. pmid:31518967
  37. Phillips M, Greenhalgh J, Marsden H, Palamaras I. Detection of malignant melanoma using artificial intelligence: An observational study of diagnostic accuracy. Dermatol Pract Concept. 2019;10(1):e2020011. pmid:31921498
  38. Brinker TJ, Hekler A, Enk AH, von Kalle C. Enhanced classifier training to improve precision of a convolutional neural network to identify images of skin lesions. PLoS One. 2019;14(6):e0218713. pmid:31233565
  39. Dick V, Sinz C, Mittlböck M, et al. Accuracy of Computer-Aided Diagnosis of Melanoma: A Meta-analysis. JAMA Dermatol. 2019;155(11):1291–1299. pmid:31215969
  40. Piccolo D, Smolle J, Wolf IH, et al. Face-to-face diagnosis vs telediagnosis of pigmented skin tumors: A teledermoscopic study. Arch Dermatol. 1999;135(12):1467–71. pmid:10606051
  41. Wang YC, Ganzorig B, Wu CC, et al. Patient satisfaction with dermatology teleconsultation by using MedX. Comput Methods Programs Biomed. 2018;167:37–42. pmid:30501858
  42. Maron RC, Haggenmüller S, von Kalle C, et al. Robustness of convolutional neural networks in recognition of pigmented skin lesions. Eur J Cancer. 2021;145:81–91. pmid:33423009
  43. Yu Z, Jiang X, Zhou F, et al. Melanoma recognition in dermoscopy images via aggregated deep convolutional features. IEEE Trans Biomed Eng. 2019;66(4):1006–1016. pmid:30130171
  44. Haluza D, Simic S, Moshammer H. Sun exposure prevalence and associated skin health habits: Results from the Austrian population-based UVSkinrisk survey. Int J Environ Res Public Health. 2016;13(1):141. pmid:26797627
  45. Tran H, Chen K, Lim AC, et al. Assessing diagnostic skill in dermatology: a comparison between general practitioners and dermatologists. Australas J Dermatol. 2005;46(4):230–4. pmid:16197420
  46. Jinnai S, Yamazaki N, Hirano Y, et al. The development of skin cancer classification system for pigmented skin lesions using deep learning. Biomolecules. 2020;10(8):1123.
  47. Boyce Z, Gilmore S, Xu C, Soyer HP. The remote assessment of melanocytic skin lesions: A viable alternative to face-to-face consultation. Dermatology. 2011;223(3):244–50. pmid:22095005
  48. Bandic J, Kovacevic S, Karabeg R, et al. Teledermoscopy for skin cancer prevention: a comparative study of clinical and teledermoscopic diagnosis. Acta Inform Med. 2020;28(1):37–41. pmid:32210513