Table 1.
Key metrics for evaluating translation accuracy.
Table 2.
BLEU Scores for Translations by ChatGPT (with Prompts 1 and 2), Google Translate, and Bing Translator across Chapters.
Table 3.
ROUGE-N Scores for Translations by ChatGPT (with Prompts 1 and 2), Google Translate, and Bing Translator across Chapters.
Table 4.
ROUGE-L Scores for Translations by ChatGPT (with Prompts 1 and 2), Google Translate, and Bing Translator across Chapters.
Table 5.
METEOR Scores for Translations by ChatGPT (with Prompts 1 and 2), Google Translate, and Bing Translator across Chapters.
Table 6.
BERT Scores for Translations by ChatGPT (with Prompts 1 and 2), Google Translate, and Bing Translator across Chapters.
Fig 1.
Post-Hoc Analysis of Translation Tools Across Multiple Evaluation Metrics.
Fig 2.
Comprehensive Performance Comparison of Translation Tools Across Five Evaluation Metrics.
Table 7.
Wilcoxon Signed-Rank Test Results Comparing ChatGPT Prompt Strategies Across Evaluation Metrics.
Fig 3.
Distribution of Evaluation Scores by Metric and Translation Tool.
Fig 4.
Comparison of ChatGPT Prompt Strategies Across Multiple Evaluation Metrics.
Table 8.
Inter-rater reliability by rating dimension (quadratic-weighted Cohen’s kappa).
Table 9.
Overall inter-rater reliability across all dimensions and systems.
Table 10.
Descriptive statistics by system × dimension (pooled raters).
Table 11.
Distribution of Translation Errors by Type and Chapter for Each Machine Translation Tool.
Table 12.
Descriptive Statistics of Error Types by Translation Tool (Mean ± SD).
Table 13.
One-way Analysis of Variance for Grammar, Spelling, and Style Errors Among Translation Tools.
Table 14.
Post-hoc Analysis of Translation Error Variations Among Four MT Systems.