Fig 1.

Comprehensive workflow of the proposed solution, illustrating the iterative pipeline from dataset acquisition and preprocessing to model fine-tuning, validation, and the continuous feedback loop for deployment.

Table 1.

Structure of the raw Binhvq News Corpus.

Table 2.

Structure of the raw vi-error-correction-2.0 dataset.

Table 3.

Structure of the raw OPUS Tatoeba dataset.

Fig 2.

Data acquisition flowchart detailing the automated crawling mechanism, including link validation, content extraction, HTML cleaning, and deduplication logic for constructing the raw corpus.
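
The three text-handling stages named in this caption (link validation, HTML cleaning, deduplication) can be illustrated with a minimal sketch. It is not the authors' crawler: the function names, the http(s)-only validation rule, and the hash-based duplicate check are all assumptions chosen for illustration, and fetching pages over the network is left out.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urlparse


def is_valid_link(url: str) -> bool:
    """Assumed rule: keep only absolute http(s) URLs that name a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def clean_html(html: str) -> str:
    """Strip tags and return the remaining text joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def deduplicate(texts):
    """Drop texts whose case- and whitespace-normalized form was already seen."""
    seen = set()
    unique = []
    for text in texts:
        normalized = " ".join(text.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique
```

Exact-match hashing only catches duplicates that are identical after normalization; near-duplicate detection would need a different technique.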

Table 4.

Structure of the raw Vietnamese crawled dataset.

Table 5.

Structure of the raw English crawled dataset.

Table 6.

Structure of the Binhvq News Corpus after processing.

Table 7.

Structure of the vi-error-correction-2.0 dataset after processing.

Table 8.

Structure of the OPUS Tatoeba dataset after processing.

Table 9.

Structure of the Vietnamese crawled dataset after processing.

Table 10.

Structure of the English crawled dataset after processing.

Fig 3.

Class Imbalance in the Vietnamese Dataset.

Table 11.

Comparison of evaluation metrics for the XLM-RoBERTa, BARTpho, and E5 models.

Fig 4.

Comparative Model Performance.

(a) Evaluation loss scaled relative to the maximum observed loss (E5 = 0.0906, set to 100%). Lower percentages indicate better convergence; BARTpho reaches only 11.62% of the maximum loss. (b) Performance metrics (Accuracy, Precision, Recall, F1) scaled relative to the maximum score achieved across models.
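
The scaling described in this caption divides each value by the largest value in its group and expresses the result as a percentage. A minimal sketch follows; the helper name `scale_to_max` is hypothetical, and the BARTpho loss used here is back-computed from the caption's two figures (0.0906 and 11.62%) rather than read from Table 11, so treat it as approximate.

```python
def scale_to_max(values):
    """Express each value as a percentage of the largest value in the mapping."""
    peak = max(values.values())
    if peak <= 0:
        raise ValueError("scaling requires a positive maximum")
    return {name: 100.0 * value / peak for name, value in values.items()}


# E5's loss is stated in the caption. BARTpho's loss is an assumption,
# implied by the caption's 11.62% figure (roughly 0.0105).
losses = {"E5": 0.0906, "BARTpho": 0.0906 * 0.1162}
scaled = scale_to_max(losses)

for model, percent in scaled.items():
    print(f"{model}: {percent:.2f}% of the maximum loss")
```

The same helper applies to panel (b) by passing one metric's scores across models; there a higher percentage is better, whereas for loss a lower percentage is better.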

Table 12.

Example used to compare the models.
