MHS-VIT: Mamba hybrid self-attention vision transformers for traffic image detection

doi:10.1371/journal.pone.0325962

Fig 1.

Illustration of the Mamba Hybrid Self-Attention Vision Transformers (MHS-VIT) architecture.

Fig 2.

Illustration of the Focus Block architecture.

Fig 3.

(a) Detailed structure of the SVT Block. (b) Illustration of the VISSBlock architecture. (c) Illustration of the DLS Block. (d) Illustration of the LS Block.

Fig 4.

Illustration of the LSDetect architecture.

Table 1.

Ablation study on MHS-VIT.

Fig 5.

DLS Block integration designs explored in the ablation study.

Table 2.

Ablation study on the DLS Block.

Fig 6.

Comparison of detection results from different models on the TROD dataset.

Fig 7.

Performance comparison of different models on the TROD dataset.

Table 3.

Comparison of MHS-VIT with other image detection networks on the TROD dataset (bold indicates our framework).

Fig 8.

Comparison of detection results from different models on the TSLD dataset.

Fig 9.

Performance comparison of different models on the TSLD dataset.

Table 4.

Comparison of MHS-VIT with other image detection networks on the TSLD dataset (bold indicates our framework).
