Early warning on safety risk of highly aggregated tourist crowds based on VGGT-Count network model | PLOS One

Fig 1.

Analysis of loss variation across epochs for the VGGT-Count network.
A crowd image is first fed into the VGG-19 network for convolution. The flattened output feature map is then passed to the transformer encoder with Multi-Head Attention. Finally, a regression decoder predicts the density map. The Optimal Transport (OT) and Total Variation (TV) loss functions are optimized during training.
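
Two steps named in this caption can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The first function shows one common way to flatten a (channels, height, width) feature map into a token sequence for a transformer encoder. The second computes the total variation distance between two density maps, assuming the TV term is half the L1 distance between the maps after each is normalized to sum to 1. The function names `flatten_feature_map` and `total_variation_distance` are hypothetical, and the exact weighting used in the paper is not given here.

```python
from typing import List

# Assumed layout: fmap[channel][row][col], i.e. (C, H, W).
FeatureMap = List[List[List[float]]]
DensityMap = List[List[float]]


def flatten_feature_map(fmap: FeatureMap) -> List[List[float]]:
    """Flatten a (C, H, W) feature map into H*W tokens of length C.

    Each spatial location becomes one token whose entries are the
    channel values at that location, in row-major order.
    """
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    tokens = []
    for row in range(height):
        for col in range(width):
            tokens.append([fmap[ch][row][col] for ch in range(channels)])
    return tokens


def total_variation_distance(pred: DensityMap, target: DensityMap) -> float:
    """Half the L1 distance between two maps, each normalized to sum to 1.

    Assumption: this is the standard total variation distance between
    probability distributions, not necessarily the paper's exact loss.
    """
    pred_flat = [v for row in pred for v in row]
    target_flat = [v for row in target for v in row]
    pred_sum = sum(pred_flat)
    target_sum = sum(target_flat)
    if pred_sum <= 0 or target_sum <= 0:
        raise ValueError("both density maps must have positive total mass")
    return 0.5 * sum(
        abs(p / pred_sum - t / target_sum)
        for p, t in zip(pred_flat, target_flat)
    )
```

Under this definition the distance is 0 when the two maps have the same shape of distribution regardless of scale, and 1 when their mass does not overlap at all.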

Fig 2.

Analysis of loss variation across epochs for the VGGT-Count network.

Table 1.

Comparison with the state-of-the-art methods on ShanghaiTech A, ShanghaiTech B, and UCF-QNRF.
The top performance is highlighted in bold, while the second best is underlined.

Table 2.

Comparison of the real-time performance of different models by size, frames, and inference time.

Table 3.

Performance on the ShanghaiTech B dataset with different components and structures.

Fig 3.

Visualization results of VGGT-Count vs DM-Count.

Fig 4.

Visualization results of VGGT-Count in different scenarios.
