Table 1.
Chat-based large-language models evaluated in this study.
Fig 1.
The anatomy icon was drawn de novo by the corresponding author in Adobe Illustrator 2024 (Adobe Inc., San Jose, CA, USA); no external images, stock assets, or anatomical atlases were used, retrieved, traced, or adapted.
Table 2.
Accuracy (%) of six large-language models in identifying anatomical labels.
Fig 2.
Correct-response rates and 95% confidence intervals of six large-language models when answering multiple-choice items.
Superscript letters above each bar denote pairwise comparisons; values sharing at least one common letter (a-d) do not differ significantly according to χ² tests with Benjamini-Hochberg adjustment.
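For readers unfamiliar with the Benjamini-Hochberg adjustment named in this caption, the following is a minimal sketch of the standard step-up procedure applied to a list of raw p-values. It is illustrative only and is not the authors' analysis code: the function name `benjamini_hochberg` and the example p-values are hypothetical, and the study's actual p-values are not reproduced here. With six models there are 15 pairwise comparisons, so in practice the input list would hold 15 values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Hypothetical helper for illustration; not taken from the study.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, scaling each by
    # m / rank and keeping a running minimum so that the adjusted values
    # stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values, for demonstration only.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
```

A pair of models would then be treated as differing significantly when its adjusted p-value falls below the chosen threshold (commonly 0.05, assumed here), and pairs that do not differ would share a letter in the figure.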
Table 3.
Consistency (%) of six large-language models across two repeated interactions.
Table 4.
Per-label response latency (median seconds, interquartile range) of six large-language models.
Fig 3.
Log-scale violin-and-box plots showing the distribution of per-label response latency (seconds) for six large-language models.
Fig 4.
Accuracy-versus-latency trade-off for six large-language models.
Each cross marker locates a model by its median response latency (seconds, log-scale) and overall accuracy. Marker size is proportional to consistency across repeated interactions.
Fig 5.
Example of landmark-name identification on an anterior facial muscle plate.
(A) Facial musculature redrawn for illustrative purposes. (B) Ground-truth terms with the corresponding outputs.
Table 5.
Pragmatic model selection for dental anatomy teaching.