Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation

doi:10.1371/journal.pone.0315829

Table 1.

Configuration of the LastBERT model.

Fig 1.

Student model architecture.

Fig 2.

Top-level overview of the teacher-student knowledge distillation process.

Table 2.

Sample posts from the ADHD dataset.

Fig 3.

Top-level overview for ADHD classification study.

Fig 4.

Training and validation metrics over epochs during the knowledge distillation process.

Table 3.

Performance of LastBERT on GLUE benchmark datasets.

Fig 5.

Accuracy, validation loss, and training loss over epochs for MRPC.

Fig 6.

Accuracy, validation loss, and training loss over epochs for SST-2.

Fig 7.

Validation loss, training loss, Matthews correlation, and accuracy over epochs for CoLA.

Fig 8.

Accuracy, validation loss, and training loss over epochs for QQP.

Fig 9.

Accuracy, validation loss, and training loss over epochs for MNLI.

Fig 10.

Validation loss, training loss, and Pearson and Spearman correlations over epochs for STS-B.

Fig 11.

Precision, recall, and F1 score over epochs for LastBERT model.

Fig 12.

Accuracy, training loss, and validation loss over epochs for LastBERT model.

Fig 13.

Confusion matrix for LastBERT model.

Fig 14.

ROC curve for LastBERT model.

Fig 15.

Precision, recall, and F1 score over epochs for DistilBERT model.

Fig 16.

Accuracy, training loss, and validation loss over epochs for DistilBERT model.

Fig 17.

Confusion matrix for DistilBERT model.

Fig 18.

Precision, recall, and F1 score over epochs for ClinicalBERT model.

Fig 19.

Accuracy, training loss, and validation loss over epochs for ClinicalBERT model.

Fig 20.

Confusion matrix for ClinicalBERT model.

Table 4.

Performance comparison of LastBERT, DistilBERT, and ClinicalBERT on ADHD dataset.

Table 5.

Weighted average comparison of LastBERT, DistilBERT, and ClinicalBERT on ADHD dataset.

Table 6.

Comparison of related works and our study on ADHD text classification.

Table 7.

Performance comparison of LastBERT with other BERT and knowledge distillation models on GLUE datasets.
