
Fig 1.

Absolute positional encoding method.


Fig 2.

Relative positional embedding matrix.


Fig 3.

An example of the second positional encoding with k = 1.

and are divided into parts respectively and spliced into d parts as the second positional encoding.


Fig 4.

Visualization of two absolute positional encoding methods.


Fig 5.

Visualization of position-wise cosine similarity of different position embeddings.

Lighter shades denote higher similarity.


Fig 6.

The similarity between the 128th positional encoding and others.


Table 1.

Dataset statistics for the different tasks.


Table 2.

Parameter configurations of the different models.


Table 3.

Results on the WMT’14 En-De and WMT’16 En-Ro tasks with the Base model, and on the IWSLT’14 En-De and IWSLT’17 En-Fr tasks with the Small model.


Table 4.

Results on the WMT’14 En-De task with the Deep and Big models.


Table 5.

Results of OPR on the WMT’14 En-De task with different values of the hyperparameter k.


Table 6.

BLEU scores over 5 independent runs and statistical test results for APE vs. OPR on the WMT’14 En-De task with the Big model.


Table 7.

Experimental results on sentences of different lengths.


Fig 7.

Perplexity comparison during training.
