Table 1.

Distribution of the 150 MCQs across chapters.

Fig 1.

Number of questions at each observed accuracy level.

Gemini 1.5 Flash achieved a mean accuracy of 68.3% (SD = 39.7), Gemini 2.0 Flash 76.3% (SD = 36.4), ChatGPT o1-mini 76.7% (SD = 35.3), and ChatGPT 4o 78.9% (SD = 37.0).

Fig 2.

Accuracy of the tested LLMs by chapter.

Each grey dot represents the accuracy for one question. An accuracy of 0 corresponds to 0/10 correct answers for that question. The chapter headings corresponding to each chapter number are listed in Table 1. (A) shows the accuracy by chapter for Gemini 1.5 Flash, (B) for Gemini 2.0 Flash, (C) for ChatGPT 4o, and (D) for ChatGPT o1-mini.
