Fig 1.

Example of visual question answering tasks.

Fig 2.

Overall flowchart of the proposed SRRN model.

Fig 3.

Sparse attention mechanism encoder.

Fig 4.

The sparse attention mechanism.

Fig 5.

Co-attention structure diagram.

Table 1.

Performance of different SRRN variant models on the VQA 2.0 dataset.

Fig 6.

Changes in accuracy and loss during model training.

Fig 7.

(a) Accuracy of “Yes/No” based on different parameters. (b) Accuracy of “Number” based on different parameters. (c) Accuracy of “Other” based on different parameters. (d) Accuracy of “All” based on different parameters.

Table 2.

The performance of different layers of encoder and decoder.

Table 3.

Comparison of pre-trained model parameters and SRRN model on the VQA 2.0 dataset.

Table 4.

Performance comparison results on the VQA 2.0 dataset.

Table 5.

Performance comparison results on the GQA dataset.

Fig 8.

Example of visualization on VQA 2.0 and GQA datasets.
