Semantic textual similarity for modern standard and dialectal Arabic using transfer learning

doi:10.1371/journal.pone.0272991

Fig 1.

The main framework of BERT.

Fig 2.

Given parallel data from two languages, a student model can be trained so that the vectors it generates for the sentences in both languages are close to the teacher model's sentence vector.
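
The training objective described in this caption can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes mean squared error as the distance between vectors, and the names `mean_squared_error`, `distillation_loss`, `teacher_src`, `student_src`, and `student_tgt` are hypothetical. The vectors are plain lists standing in for sentence embeddings.

```python
def mean_squared_error(u, v):
    """Mean squared difference between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def distillation_loss(teacher_src, student_src, student_tgt):
    """Loss for one parallel sentence pair (assumed MSE formulation).

    teacher_src: teacher vector for the source-language sentence
    student_src: student vector for the same source-language sentence
    student_tgt: student vector for the translated sentence
    Both student vectors are pulled toward the single teacher vector.
    """
    return (mean_squared_error(student_src, teacher_src)
            + mean_squared_error(student_tgt, teacher_src))


# Toy example: the student already matches the teacher on the source
# sentence but not on the translation, so only the second term is non-zero.
loss = distillation_loss([1.0, 0.0], [1.0, 0.0], [0.0, 0.0])
print(loss)
```

Minimising this quantity over many parallel pairs would push the student's vectors for both languages toward the teacher's vector space, which is the behaviour the figure depicts.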

Table 1.

Examples of different levels of correlation between the sentences in the STS dataset.

Table 2.

Examples of the proposed translations alongside the original English sentences.

Fig 3.

Framework for generating models using the third approach.

Table 3.

Accuracy of machine-translation-based and interleaved MSA models, measured as the Spearman rank correlation between the cosine similarity of sentence representations and the reference labels of the testing dataset in [8].
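
The evaluation measure named in this and the following captions can be illustrated with a short sketch. It is a self-contained approximation under stated assumptions, not the authors' evaluation script: the function names (`cosine_similarity`, `average_ranks`, `spearman`, `evaluate_sts`) and the toy vectors and labels are hypothetical, and tied values receive average ranks.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate_sts(embeddings_a, embeddings_b, gold_scores):
    """Correlate predicted cosine similarities with reference labels."""
    predicted = [cosine_similarity(u, v)
                 for u, v in zip(embeddings_a, embeddings_b)]
    return spearman(predicted, gold_scores)


# Toy data (hypothetical): three sentence pairs with reference scores.
first = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
second = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
gold = [5.0, 2.5, 0.0]
score = evaluate_sts(first, second, gold)
print(round(score, 4))
```

Because Spearman correlation depends only on rank order, a model scores well whenever its cosine similarities order the sentence pairs the same way as the reference labels, regardless of the absolute similarity values.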

Table 4.

Accuracy of knowledge-distillation-based MSA models, measured as the Spearman rank correlation between the cosine similarity of sentence representations and the reference labels of the testing dataset in [8].

Table 5.

Accuracy of the main Egyptian models, measured as the Spearman rank correlation between the cosine similarity of sentence representations and the reference labels of the testing dataset in [8] after translation to Egyptian Arabic.

Table 6.

Accuracy of the main Saudi Arabian models, measured as the Spearman rank correlation between the cosine similarity of sentence representations and the reference labels of the testing dataset in [8] after translation to Saudi Arabic.

Table 7.

Comparison between the proposed models and current state-of-the-art Arabic STS models, measured as the Spearman rank correlation between the cosine similarity of sentence representations and the reference labels of the testing dataset in [8].
