Dependency-aware self-attention for robust neural machine translation

Overview of the dependency-aware self-attention (DASA) mechanism.

Overall BLEU on four translation benchmarks.

Overall SacreBLEU on four translation benchmarks.

Translation results of different neural machine translation systems.

Ablation on encoder placement of DASA on IWSLT14 De→En.

Sensitivity of translation performance to the bias strength on IWSLT14 De→En.

Mean and standard deviation of BLEU scores over three runs with different random seeds on IWSLT14 De→En. All other training and decoding settings are identical to the main experiments.

Low-resource robustness on IWSLT14 De→En under varying fractions of training data.

Inference efficiency across datasets on two RTX 3090 Ti GPUs.

BLEU by sentence-length bins on IWSLT14 De→En.

Visualization of attention distributions in the baseline Transformer (left) and the proposed dependency-aware self-attention (DASA, right).