
Evaluating AI-based breastfeeding chatbots: quality, readability, and reliability analysis

Abstract

Background

In recent years, expectant and breastfeeding mothers have commonly used various breastfeeding-related social media applications and websites to seek breastfeeding-related information. At the same time, AI-based chatbots, such as ChatGPT, Gemini, and Copilot, have become increasingly prevalent on these platforms (or on dedicated websites), providing automated, user-oriented breastfeeding guidance.

Aim

The goal of our study is to understand the relative performance of three AI-based chatbots: ChatGPT, Gemini, and Copilot, by evaluating the quality, reliability, readability, and similarity of the breastfeeding information they provide.

Methods

Two researchers evaluated the information provided by three different AI-based chatbots: ChatGPT version 3.5, Gemini, and Copilot. A total of 50 frequently asked questions about breastfeeding were identified, divided into two categories (Baby-Centered Questions and Mother-Centered Questions), and evaluated using five scoring criteria: the Ensuring Quality Information for Patients (EQIP) tool, the Simple Measure of Gobbledygook (SMOG), the Similarity Index (SI), the Modified Dependability Scoring System (mDISCERN), and the Global Quality Scale (GQS).

Results

The evaluation of AI chatbots’ answers showed statistically significant differences across all criteria (p < 0.05). Copilot scored highest on the EQIP, SMOG, and SI scales, while Gemini excelled in mDISCERN and GQS evaluations. No significant difference was found between Copilot and Gemini for mDISCERN and GQS scores. All three chatbots demonstrated high reliability and quality, though their readability required university-level education. Notably, ChatGPT displayed high originality, while Copilot exhibited the greatest similarity in responses.

Conclusion

AI chatbots provide reliable answers to breastfeeding questions, but the information can be hard to understand. While more reliable than other online sources, their accuracy and usability are still in question. Further research is necessary to facilitate the integration of advanced AI in healthcare.

1. Introduction

Breastfeeding is the healthiest and most natural food source for newborns and infants and has significant short- and long-term effects on the health of mothers, infants, and young children [1]. Breastfeeding rates vary from society to society. Worldwide, only 44% of newborns are breastfed within the first few hours of life [2]. The reasons for low breastfeeding rates are generally the socio-demographic characteristics of mothers and infants, their social environment, health problems, and inadequate breastfeeding education received before and after birth [3].

With the development of technology in recent years, much health-related information has become accessible via the internet. Expectant and breastfeeding mothers commonly use various breastfeeding-related social media applications and websites [4]. Therefore, it should be ensured that expectant and breastfeeding mothers have access to accurate information from digital platforms such as the internet and social media [5,6]. According to a 2019 study, 2.79 billion people worldwide use social media and the internet. However, the usefulness of information is affected by its quality and patients’ ability to understand it [7]. Therefore, it is important to verify the accuracy and usability of internet-based information.

Artificial intelligence (AI) [8], especially its applicability to healthcare, is an exciting area of research today. AI research aims to create intelligent machines capable of performing tasks that typically require human intelligence, such as problem solving, decision making, and language understanding [9,10]. AI software can serve as a source of information for both healthcare professionals and patients. Many studies have evaluated the effectiveness of AI in areas such as diagnosis, treatment planning, and patient monitoring [11–13].

Despite the growing prevalence of AI chatbots, the potential of these tools in addressing breastfeeding-related health information remains largely untapped. This study addresses this critical gap by focusing on pediatric and maternal health topics within the context of AI applications. Unlike previous studies, which have primarily focused on general medical information or adult health topics, this research offers a novel perspective by examining the unique needs of expectant and breastfeeding mothers.

AI models, such as ChatGPT (OpenAI), Gemini (Google), and Copilot (Microsoft), make information understandable by simplifying complex medical terms. In this way, they facilitate access to information by helping patients and their families understand health issues more easily [14–17]. In addition, the accuracy and reliability of any information from chatbots, not just health-related information, is cause for concern [18,19]. Currently, there are a limited number of studies comparing these chatbots with respect to pediatric topics [20–22]. Therefore, it is important to further investigate and evaluate the role of AI software in healthcare [11].

AI chatbots that can interact with humans using natural language can provide useful information to help with care-related decisions. The goal of our study is to understand the relative performance of such chatbots by evaluating the quality, reliability, readability, and similarity of the information they provide.

2. Materials and methods

2.1. AI-Based chatbots included in the study

The focus of this study is to evaluate the breastfeeding information provided by three AI chatbots: ChatGPT version 3.5 (OpenAI, 2023), Gemini (Google, 2023), and Copilot (Microsoft Edge, 2023). ChatGPT is a general-purpose chatbot trained on a large text corpus. Gemini is Google's multimodal model, which can interpret images, text, video, and audio, is multilingual, and has reported strong performance on the Massive Multitask Language Understanding (MMLU) benchmark. Copilot uses artificial intelligence and natural language processing technologies to assist users in text-based conversations.

2.2. Identifying questions about breastfeeding

On February 1, 2024, two researchers (E.Ö.K. and İ.K.) conducted a Google search with the keyword “Frequently Asked Questions about Breastfeeding” to identify the questions most frequently asked by non-experts. E.Ö.K. is a pediatrician with 11 years of clinical experience, and İ.K. is a medical doctor with 6 years of clinical experience; both have at least 4 years of clinical experience in breastfeeding-related areas. The 50 most frequently asked and answered questions were identified and are shown in Table 1. The questions were classified into two categories: “Baby-Centered Questions,” which concern breastfeeding and the infant, and “Mother-Centered Questions,” which concern breastfeeding and the mother. The responses to the 25 questions in each category were evaluated separately. Each question was presented to each chatbot in a new user session and asked one by one. Table 2 shows some examples of the answers given by the chatbots. All chatbot responses are presented in the Annex, Chatbots’ Responses. The flowchart of the study procedure is shown in Fig 1.

Table 2. Some examples of the chatbots’ responses to the questions posed.

https://doi.org/10.1371/journal.pone.0319782.t002

2.3. Evaluation criteria

Five different scoring criteria were used to evaluate responses to the 25 questions in each category: the Ensuring Quality Information for Patients (EQIP) tool, the Modified Dependability Scoring System (mDISCERN), the Simple Measure of Gobbledygook (SMOG), the Global Quality Scale (GQS), and the Similarity Index (SI). These criteria were chosen based on their established validity and reliability in assessing written health information [23–27].

The EQIP tool, developed by Moult et al., is a 20-item scale used to comprehensively assess written medical information [23,28]. The first 14 items relate to the overall quality of the information in the text, while the remaining items relate to disease, procedure, or drug information. Each item is rated with one of four options: “yes,” “somewhat,” “no,” or “not applicable.” EQIP scores range from 0% to 100%, with higher scores indicating better quality [23].
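To make the percentage scale concrete, the following sketch applies a commonly used EQIP scoring convention (“yes” = 1, “somewhat” = 0.5, “no” = 0, with “not applicable” items excluded from the denominator); this weighting is an assumption for illustration, as the article does not spell it out.

```python
def eqip_score(ratings: list[str]) -> float:
    """Compute an EQIP percentage from the 20 item ratings.

    Assumed convention: 'yes' = 1, 'somewhat' = 0.5, 'no' = 0;
    'not applicable' items are removed from the denominator.
    """
    weights = {"yes": 1.0, "somewhat": 0.5, "no": 0.0}
    applicable = [r for r in ratings if r != "not applicable"]
    if not applicable:
        return 0.0
    return 100 * sum(weights[r] for r in applicable) / len(applicable)
```

For example, a response rated “yes” on 10 items and “no” on the other 10 would score 50%.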

SMOG is a scale used to estimate the level of education required for the average person to understand a given text, assuming 100% comprehension [29].
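McLaughlin's original SMOG formula estimates a U.S. grade level from the number of polysyllabic (3+ syllable) words, normalized to a 30-sentence sample [29]. A minimal sketch, using a crude vowel-group heuristic for syllable counting (real implementations use dictionaries or better heuristics):

```python
import math
import re

def smog_grade(text: str) -> float:
    """Estimate the grade level needed to understand `text`
    with McLaughlin's SMOG formula (1969)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def syllables(word: str) -> int:
        # Rough heuristic: count groups of consecutive vowels.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    polysyllables = sum(1 for w in words if syllables(w) >= 3)
    # Normalize the polysyllable count to a 30-sentence sample.
    return 1.043 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

A passage with no polysyllabic words scores the formula's floor of about 3.1, while dense technical prose quickly climbs past grade 13, which is consistent with the university-level readability reported in this study.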

The reliability and quality of the responses were assessed using the mDISCERN and GQS scales. DISCERN is a brief questionnaire which provides users with a valid and reliable way of assessing the quality of written information on treatment choices for a health problem. mDISCERN is a practical, easy-to-use tool adapted from the original DISCERN five-question reliability tool used in similar scientific studies. It consists of a total of five questions that can be answered “yes” or “no” and scored between 0 and 5 [30]. The GQS scale is a Likert scale developed by Bernard and colleagues to assess the overall quality of the data. It is scored on a scale from 1 to 5 [31]. In our study, these scales were used to evaluate the written texts provided by chatbots.

The similarity of the responses was calculated as a percentage using the Similarity Index through the iThenticate program (http://www.ithenticate.com). Scores of 0%–10% indicated high originality; 10%–20%, acceptable similarity; 20%–40%, high similarity; and over 40%, very high similarity.
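The banding above can be expressed as a small lookup; since the source ranges share their boundary values, the sketch below assumes each boundary belongs to the lower band (e.g. exactly 10% counts as high originality), which is an interpretation rather than something the article states.

```python
def similarity_category(score: float) -> str:
    """Map an iThenticate Similarity Index (0-100%) to the
    originality bands used in the study. Boundaries are
    assumed to fall into the lower band."""
    if score <= 10:
        return "high originality"
    elif score <= 20:
        return "acceptable similarity"
    elif score <= 40:
        return "high similarity"
    return "very high similarity"
```

Applied to the reported means, ChatGPT's 8.56% falls in the high-originality band and Copilot's 28.0% in the high-similarity band, matching the Results.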

2.4. Statistical analysis

Descriptive statistics, including maximum, minimum, mean, median, standard deviation, and the 25th–75th percentiles, were calculated for the collected data. Normality was assessed using the Shapiro–Wilk test; as the data were not normally distributed, the Kruskal–Wallis test was used to test for differences between chatbots and question categories. Intra- and inter-observer agreement was assessed by rescoring all questions after two weeks and calculating the intraclass correlation coefficient (ICC) with a 95% confidence interval. The Bonferroni multiple comparison test was applied for post hoc comparisons, and adjusted p-values are therefore reported. P values less than 0.05 were considered statistically significant. All statistical analyses were performed using Jamovi software (The Jamovi Project, 2022, version 2.3; Sydney, Australia; https://www.jamovi.org).
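The analyses were run in Jamovi; as an illustration only, an equivalent workflow could be scripted with SciPy as below. The score arrays are simulated placeholders, not study data, and the Bonferroni step uses pairwise Mann–Whitney U tests as one common post hoc choice after Kruskal–Wallis (the article does not specify which pairwise test Jamovi applied).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical EQIP scores for the three chatbots (illustrative only).
chatgpt = rng.normal(42, 14, 50)
gemini = rng.normal(45, 14, 50)
copilot = rng.normal(49, 14, 50)

# Normality check per group (Shapiro-Wilk).
normality = {name: stats.shapiro(scores).pvalue
             for name, scores in [("ChatGPT", chatgpt),
                                  ("Gemini", gemini),
                                  ("Copilot", copilot)]}

# Non-parametric omnibus comparison across the three chatbots.
h_stat, p_kw = stats.kruskal(chatgpt, gemini, copilot)

# Bonferroni-adjusted pairwise Mann-Whitney U tests (3 comparisons).
pairs = [(chatgpt, gemini), (chatgpt, copilot), (gemini, copilot)]
adj_p = [min(1.0, stats.mannwhitneyu(a, b).pvalue * len(pairs))
         for a, b in pairs]
```

Multiplying each pairwise p-value by the number of comparisons (capped at 1.0) is the standard Bonferroni adjustment and mirrors the "adjusted p-values" reported in the tables.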

2.5. Ethical approval

This study did not require ethical approval as it did not involve any material obtained from humans or animals.

3. Results

The answers given by different AI chatbots were evaluated and categorized according to the given criteria. Statistically significant differences between the chatbots were observed in all evaluation criteria (p < 0.05). Copilot stood out with the highest scores in the EQIP, SMOG, and SI evaluations (48.9 ± 14.2, 18.5 ± 2.03, and 28.0 ± 20.8, respectively), while Gemini achieved the highest scores in the mDISCERN and GQS evaluations (4.15 ± 0.936 and 4.12 ± 0.940, respectively). No statistically significant difference was found between Copilot and Gemini for the mDISCERN and GQS scores. The reliability and quality scores showed that all three AI-based chatbots had high reliability and quality. For readability, based on the SMOG index, the responses required at least a university education. According to the similarity evaluation, ChatGPT showed a high level of originality (8.56 ± 17.6), while Copilot showed a high level of similarity (28.0 ± 20.8). Descriptive statistics and post hoc comparisons of all categories are shown in Table 3 and Table 4.

Table 3. Descriptive statistics and post hoc comparisons of all categories.

https://doi.org/10.1371/journal.pone.0319782.t003

Table 4. Cohen’s d-values of chatbot comparisons of all categories.

https://doi.org/10.1371/journal.pone.0319782.t004

In the “Mother-Centered Questions” category, only EQIP, SMOG, and the Similarity Index showed statistically significant differences between the chatbots (p < 0.001). In terms of reliability and quality, all three chatbots showed high reliability and good quality. The highest mean scores for EQIP, SMOG, GQS, and SI were obtained with Copilot (46.3 ± 13.9, 17.7 ± 2.00, 4.10 ± 1.16, and 26.5 ± 23.0, respectively). The highest mean for mDISCERN was obtained with Gemini (4.47 ± 0.819). According to the SMOG index, the readability level of all three chatbots was above the university level. According to the similarity score, ChatGPT showed a high level of originality, Gemini showed an acceptable similarity, and Copilot showed a high level of similarity. Descriptive statistics and comparisons of the questions in the “Mother-Centered Questions” category are shown in Table 5 and Table 6.

Table 5. Descriptive statistics and post hoc comparisons of the ‘Mother-Centered Questions’ category.

https://doi.org/10.1371/journal.pone.0319782.t005

Table 6. Cohen’s d-values of chatbot comparisons of the ‘Mother-Centered Questions’ category.

https://doi.org/10.1371/journal.pone.0319782.t006

Statistically significant differences (p < 0.05) were observed among the three chatbots’ responses to the 25 questions in the “Baby-Centered Questions” category. All three chatbots showed high reliability and good quality but required university-level education for readability. Copilot had the highest scores for EQIP, mDISCERN, SMOG, GQS, and SI (51.4 ± 14.2, 4.07 ± 1.26, 18.9 ± 1.91, 4.13 ± 1.21, and 29.4 ± 18.6, respectively). According to the similarity index, ChatGPT showed an acceptable level of similarity, while Gemini and Copilot showed a high level of similarity. Descriptive statistics and comparisons of questions in the “Baby-Centered Questions” category are shown in Table 7 and Table 8.

Table 7. Descriptive statistics and post hoc comparisons of the ‘Baby-Centered Questions’ category.

https://doi.org/10.1371/journal.pone.0319782.t007

Table 8. Cohen’s d-values of chatbot comparisons of the ‘Baby-Centered Questions’ category.

https://doi.org/10.1371/journal.pone.0319782.t008

According to the averages of the chatbots’ responses to different categories, all three chatbots had higher EQIP scores in the “Baby-Centered Questions” category. When comparing the “Mother-Centered Questions” and “Baby-Centered Questions” categories, statistically significant differences were found for the mDISCERN score in ChatGPT; the EQIP and SMOG scores in Copilot; and the mDISCERN, EQIP, and SMOG scales in Gemini. In terms of reliability, all three chatbots scored higher in the “Mother-Centered Questions” category, but this difference was statistically significant in ChatGPT and Gemini. According to the SMOG scale, the responses of Copilot and Gemini were significantly lower in the “Mother-Centered Questions” category. The data on the responses of the chatbots are shown in Table 9 and Table 10.

Table 9. Comparisons of the Baby-Centered Questions and Mother-Centered Questions.

https://doi.org/10.1371/journal.pone.0319782.t009

Table 10. Cohen’s d-values of chatbot comparisons of the ‘Baby-Centered Questions’ and ‘Mother-Centered Questions’ categories.

https://doi.org/10.1371/journal.pone.0319782.t010

4. Discussion

To our knowledge, this study is the first to evaluate the quality, reliability, and readability of information provided by AI-based chatbots in the breastfeeding domain. In addition, a limited number of studies have evaluated the information differences between AI models and other online information sources [20].

According to our study data, all three chatbots demonstrated high reliability and good quality in their responses. Among the chatbots, ChatGPT provided the most authentic answers, as indicated by its higher originality scores in the Similarity Index evaluation. However, in terms of reliability, Gemini and Copilot outperformed ChatGPT, as they were able to explicitly reference their data sources more consistently. These findings suggest that while all three chatbots are effective in providing high-quality information, their strengths lie in different areas, highlighting the need for further refinement to balance reliability, originality, and transparency across AI platforms.

The Food and Drug Administration (FDA) has approved 521 medical AI models for some medical fields [32]. However, there are currently no regulations governing the use of chatbots for medical purposes. Legal issues that need to be considered when using chatbots in healthcare include source clarity, the impact of misinformation on patient decisions, conflicts of interest, copyrights (e.g., source material(s) chatbots might pull from), and data security.

Fahy et al. [33] discovered notable variations in the reliability of information sources, while Azak et al. [34] reported the low quality and reliability of information in their study. In our study, we evaluated the reliability of three chatbots using mDISCERN, adapted from DISCERN, and found it to be high. Additionally, we observed that Gemini and Copilot were statistically more reliable than ChatGPT. We believe this is because Copilot and Gemini are able to explicitly state the data sources they use.

Many studies based on visual and written data have been conducted in the area of breastfeeding [35]. A literature review evaluating the quality of web-based health information for patients reported significant differences between sources [33]. A study by Azak et al. using DISCERN to evaluate breastfeeding videos found that such videos had high viewership but low quality and reliability [34]. Similarly, Hopkins et al. used the DISCERN tool to evaluate the accuracy and quality of online breastfeeding information and found that 31 websites were included in the study, of which four websites were exemplary [36]. Another study using the GQS scale found that only 18.8% (31 videos out of 165) of breastfeeding education videos were rated as good or excellent. These examples show that videos are a limited resource for patients [37]. Another study emphasized that information provided by physicians or hospitals is of high quality, but educational programs on popular platforms accessible to the public should be developed by experts [38]. In our study, contrary to previous results, we found that the quality of the three different artificial intelligence-based chatbots was good. Thanks to their ability to obtain information directly from the sources in the current literature, we believe that the quality of chatbots’ responses has improved over time and will continue to improve in the future.

A study using SMOG to assess readability in postpartum women found that a high level of education was required [39]. Similarly, a study evaluating the readability of breastfeeding information websites reported that their reading level was quite difficult and that they could be readily understood only by university graduates [40]. In our study, the readability of the responses was assessed using SMOG; the responses were difficult to read and were generally aimed at people with a university education. Based on these data, it is clear that both chatbots and websites target an audience with a certain level of education. Solutions should be developed to improve the readability of this information so that individuals with lower levels of education can easily understand it.

Previous research, such as that by Agarwal et al. (2023) and Jedrzejczak and Kochanek (2024), emphasized the potential of AI models in delivering reasoning-based medical knowledge. For instance, Agarwal et al. demonstrated that AI could generate accurate, reasoning-based multiple-choice questions for medical education. Similarly, Jedrzejczak and Kochanek highlighted the variability in chatbot responses in the audiology domain. Our study complements these findings by demonstrating the applicability of AI-based chatbots in breastfeeding, an area previously underexplored.

Nonetheless, there are areas for improvement. As noted by Fahy et al. (2014) and Azak et al. (2023), the reliability and quality of digital health resources are inconsistent. While Gemini and Copilot were statistically more reliable than ChatGPT in our study, concerns about undisclosed sources remain, consistent with Warren et al.‘s (2024) findings on chatbot data transparency.

5. Conclusion

Artificial intelligence chatbots demonstrate the potential to provide reliable and high-quality information on breastfeeding. However, the complexity and readability of the information may limit its accessibility to individuals with lower educational levels. While AI-based chatbots outperform many traditional online platforms in terms of reliability and quality, concerns regarding their accuracy, usability, and source transparency remain. Our findings highlight the need for further refinement of AI technologies to ensure that they cater to a broader audience, including those with varying levels of health literacy. Future integration of AI systems into healthcare settings could enhance the accessibility and personalization of medical information. However, continued research is necessary to optimize these platforms and establish guidelines for their clinical and non-clinical use.

Limitations: This study has several limitations. First, the evaluations were performed with only three AI platforms, and the response quality of other platforms is unknown. Second, the study may not be fully representative of real clinical scenarios. Third, the chatbots use different language models, and AI technology is developing rapidly; the results may therefore change significantly with updates and new versions. Fourth, the same questions were not repeated multiple times for each chatbot, which may introduce variability in responses. Fifth, at the time of our study, none of the chatbots consistently provided sources for their responses.

Acknowledgments

The authors would like to thank İ.K. and M.D. for their contributions.

References

  1. Victora CG, Bahl R, Barros AJD, França GVA, Horton S, Krasevec J, et al. Breastfeeding in the 21st century: epidemiology, mechanisms, and lifelong effect. Lancet. 2016;387(10017):475–90. pmid:26869575
  2. World Health Organization, UNICEF. Implementation guidance: protecting, promoting and supporting breastfeeding in facilities providing maternity and newborn services: the revised Baby-friendly Hospital Initiative. Geneva: World Health Organization; 2018.
  3. Rollins NC, Bhandari N, Hajeebhoy N, Horton S, Lutter CK, Martines JC, et al. Why invest, and what it will take to improve breastfeeding practices? Lancet. 2016;387(10017):491–504. pmid:26869576
  4. Cangöl E, Şahin NH. Emzirmeyi Etkileyen Faktörler ve Emzirme Danışmanlığı (Factors Affecting Breastfeeding and Breastfeeding Counselling). ZKTB. 2014;45(3):100.
  5. Baker B, Yang I. Social media as social support in pregnancy and the postpartum. Sex Reprod Healthc. 2018;17:31–4. pmid:30193717
  6. Tomfohrde OJ, Reinke JS. Breastfeeding mothers’ use of technology while breastfeeding. Computers in Human Behavior. 2016;64:556–61.
  7. Duong CTP. Social media. A literature review. JMR. 2020;13(38):112–26.
  8. Radanliev P, De Roure D, Maple C, Ani U. Super-forecasting the “technological singularity” risks from artificial intelligence. Evol Syst (Berl). 2022;13(5):747–57. pmid:37521026
  9. Rao A, Kim J, Kamineni M, Pang M, Lie W, Succi M. Evaluating ChatGPT as an adjunct for radiologic decision-making. MedRxiv. 2023;2023(02):23285399.
  10. Kacer EO, Ipekten F. Can ChatGPT provide quality information about fever in children? Journal of Paediatrics and Child Health. 2024.
  11. Jia N, Luo X, Fang Z, Liao C. When and how artificial intelligence augments employee creativity. Academy of Management Journal. 2024;67(1):5–32.
  12. Issaiy M, Zarei D, Saghazadeh A. Artificial intelligence and acute appendicitis: a systematic review of diagnostic and prognostic models. World J Emerg Surg. 2023;18(1):59. pmid:38114983
  13. Revilla-León M, Gómez-Polo M, Barmak AB, Inam W, Kan JYK, Kois JC, et al. Artificial intelligence models for diagnosing gingivitis and periodontal disease: a systematic review. J Prosthet Dent. 2023;130(6):816–24. pmid:35300850
  14. Yurdakurban E, Topsakal KG, Duran GS. A comparative analysis of AI-based chatbots: assessing data quality in orthognathic surgery related patient information. J Stomatol Oral Maxillofac Surg. 2024;125(5):101757. pmid:38157937
  15. Carlbring P, Hadjistavropoulos H, Kleiboer A, Andersson G. A new era in Internet interventions: the advent of ChatGPT and AI-assisted therapist guidance. Internet Interv. 2023;32:100621. pmid:37273936
  16. Liévin V, Hother C, Motzfeldt A, Winther O. Can large language models reason about medical questions? Patterns. 2023.
  17. Ozdemir Kacer E, Kacer I. Evaluating the quality and reliability of YouTube videos on scabies in children: a cross-sectional study. PLoS One. 2024;19(10):e0310508. pmid:39418293
  18. Benichou L. The role of using ChatGPT AI in writing medical scientific articles. J Stomatol Oral Maxillofac Surg. 2023;124(5):101456. pmid:36966950
  19. Májovský M, Černý M, Kasal M, Komarc M, Netuka D. Artificial intelligence can generate fraudulent but authentic-looking scientific medical articles: Pandora’s box has been opened. J Med Internet Res. 2023;25:e46924. pmid:37256685
  20. Patil N, Huang R, van der Pol C, Larocque N. Comparative performance of ChatGPT and Bard in a text-based radiology knowledge assessment. Canadian Association of Radiologists Journal. 2023.
  21. Schukow C, Nguyen V-H. Addressing chatbots as artificial intelligence aids in pediatric pathology. Pediatr Dev Pathol. 2024;27(3):278–9. pmid:37981637
  22. Rokhshad R, Zhang P, Mohammad-Rahimi H, Pitchika V, Entezari N, Schwendicke F. Accuracy and consistency of chatbots versus clinicians for answering pediatric dentistry questions: a pilot study. J Dent. 2024;144:104938. pmid:38499280
  23. Moult B, Franck LS, Brady H. Ensuring quality information for patients: development and preliminary validation of a new instrument to improve the quality of written health care information. Health Expect. 2004;7(2):165–75. pmid:15117391
  24. Walsh TM, Volsko TA. Readability assessment of internet-based consumer health information. Respir Care. 2008;53(10):1310–5. pmid:18811992
  25. D’Souza RS, D’Souza S, Sharpe EE. YouTube as a source of medical information about epidural analgesia for labor pain. Int J Obstet Anesth. 2021;45:133–7. pmid:33339713
  26. Langille M, Bernard A, Rodgers C, Hughes S, Leddin D, van Zanten SV. Systematic review of the quality of patient information on the internet regarding inflammatory bowel disease treatments. Clin Gastroenterol Hepatol. 2010;8(4):322–8. pmid:20060070
  27. Sharafoddini A, Dubin JA, Lee J. Patient similarity in prediction models based on health data: a scoping review. JMIR Med Inform. 2017;5(1):e7. pmid:28258046
  28. Carlsson T, Axelsson O. Patient information websites about medically induced second-trimester abortions: a descriptive study of quality, suitability, and issues. J Med Internet Res. 2017;19(1):e8. pmid:28073735
  29. McLaughlin GH. SMOG grading: a new readability formula. Journal of Reading. 1969;12(8):639–46.
  30. Kumar N, Pandey A, Venkatraman A, Garg N. Are video sharing web sites a useful source of information on hypertension? J Am Soc Hypertens. 2014;8(7):481–90. pmid:25064770
  31. Bernard A, Langille M, Hughes S, Rose C, Leddin D, Veldhuyzen van Zanten S. A systematic review of patient inflammatory bowel disease information resources on the World Wide Web. Am J Gastroenterol. 2007;102(9):2070–7. pmid:17511753
  32. Joshi G, Jain A, Araveeti SR, Adhikari S, Garg H, Bhandari M. FDA-approved artificial intelligence and machine learning (AI/ML)-enabled medical devices: an updated landscape. Electronics. 2024;13(3):498.
  33. Fahy E, Hardikar R, Fox A, Mackay S. Quality of patient health information on the Internet: reviewing a complex and evolving landscape. Australas Med J. 2014;7(1):24–8. pmid:24567763
  34. Azak M, Yılmaz B, Şahin N. Analysis of YouTube© videos regarding breastfeeding during the coronavirus disease (COVID-19) pandemic. Matern Child Health J. 2023;27(9):1548–58. pmid:37256516
  35. Mohandas S, Rana R, Sirwani B, Kirubakaran R, Puthussery S. Effectiveness of interventions to manage difficulties with breastfeeding for mothers of infants under six months with growth faltering: a systematic review update. Nutrients. 2023;15(4):988. pmid:36839345
  36. Hopkins M, Meedya S, Ivers R, Charlton K. Review of online breastfeeding information for Aboriginal and Torres Strait Islander women. Women Birth. 2021;34(4):309–15. pmid:32653396
  37. Orbatu D, Yildirim Karaca S, Alaygut D, Karaca I. Educational features of YouTube videos depicting breastfeeding: quality, utility, and reliability analysis. Breastfeed Med. 2021;16(8):635–9. pmid:33739866
  38. Jo C-K, Lee S-Y, Kim M-J. Utility evaluation of information from YouTube on breastfeeding for preterm babies. Neonatal Med. 2019;26(4):185–90.
  39. Vnuk AK. An analysis of breastfeeding print educational material. Breastfeed Rev. 1997;5(2):29–35. pmid:9699471
  40. Dornan BA, Oermann MH. Evaluation of breastfeeding Web sites for patient education. MCN Am J Matern Child Nurs. 2006;31(1):18–23. pmid:16371821