Fig 1.

Analysis of loss variation across epochs for the VGGT-Count network.

A crowd image is first fed into the VGG-19 network for convolutional feature extraction. The flattened output feature map is then passed to the transformer encoder with multi-head attention. Finally, a regression decoder predicts the density map. The Optimal Transport (OT) and Total Variation (TV) loss functions are optimized during training.
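The caption names a Total Variation (TV) loss but does not give its formula. As a rough illustration, the sketch below assumes a common definition: half the L1 distance between the predicted and ground-truth density maps after each is normalized to sum to one. The function name `tv_loss`, the `eps` parameter, and the nested-list input format are hypothetical and are not taken from the paper.

```python
def tv_loss(pred, target, eps=1e-8):
    """Total variation distance between two 2D density maps.

    Assumption: each map is normalized to sum to 1 before comparison,
    and the distance is half the L1 norm of the difference.
    """
    # Flatten the 2D maps (lists of rows) into 1D lists.
    pred_flat = [v for row in pred for v in row]
    target_flat = [v for row in target for v in row]

    # eps guards against division by zero for empty (all-zero) maps.
    pred_sum = sum(pred_flat) + eps
    target_sum = sum(target_flat) + eps

    return 0.5 * sum(
        abs(p / pred_sum - t / target_sum)
        for p, t in zip(pred_flat, target_flat)
    )


if __name__ == "__main__":
    same = [[0.0, 1.0], [2.0, 1.0]]
    shifted = [[1.0, 0.0], [1.0, 2.0]]
    print(tv_loss(same, same))     # identical maps
    print(tv_loss(same, shifted))  # same total count, different layout
```

Because both maps are normalized, this term compares only where the density sits, not how much there is in total. In a full training objective it would typically be combined with a separate count term and the OT term.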

Fig 2.

Analysis of loss variation across epochs for the VGGT-Count network.

Table 1.

Comparison with state-of-the-art methods on ShanghaiTech A, ShanghaiTech B, and UCF-QNRF.

The top performance is highlighted in bold, while the second best is underlined.

Table 2.

Comparison of real-time performance across models in terms of size, frames, and inference time.

Table 3.

Performance with different components and structures on the ShanghaiTech B dataset.

Fig 3.

Visualization results of VGGT-Count vs DM-Count.

Fig 4.

Visualization results of VGGT-Count in different scenarios.
