Mukara: A deep learning alternative to the four-step travel demand model with a case study on interurban highway traffic prediction in the UK

doi:10.1371/journal.pone.0345576

Table 1.

A comparison between the four-step travel demand model and existing deep learning models.

Fig 1.

Overview of the Mukara workflow.

Table 2.

Summary of notations used in the Mukara model.

Table 3.

Summary of data sources used in this study.

Fig 2.

Distributions of the five edge features used in the Mukara model.

The five features are driving distance, driving duration, straight-line distance, average driving speed, and detour factor. All were normalised before being input into the model.
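As a rough illustration of how the two derived features could relate to the three measured ones, the sketch below computes average driving speed and detour factor for a single edge and applies min-max scaling. The formulas (speed as driving distance over driving duration, detour factor as driving distance over straight-line distance), the function names, and the choice of min-max scaling are all assumptions; the caption does not state how the features were derived or normalised.

```python
from typing import Sequence


def derived_edge_features(drive_km: float, drive_min: float, straight_km: float) -> dict:
    """Return the two derived features for one edge (assumed definitions)."""
    speed_kmh = drive_km / (drive_min / 60.0)  # average driving speed in km/h
    detour = drive_km / straight_km            # 1.0 would mean a perfectly straight route
    return {"speed_kmh": speed_kmh, "detour_factor": detour}


def min_max(values: Sequence[float]) -> list:
    """Scale values to [0, 1]; one possible normalisation, assumed here."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]
```

For example, a 120 km drive that takes 90 minutes between points 100 km apart gives an average speed of 80 km/h and a detour factor of 1.2 under these definitions.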

Table 4.

Breakdown of raster input channels for population and employment features.

Fig 3.

Heat maps of aggregated population and employment across England and Wales in 2022.

Higher intensity indicates higher population and employment density. The maps are based on LSOA-level data rasterised to a 1 km × 1 km grid.

Table 5.

The categories, OpenStreetMap (OSM) keys, and associated values used to extract thematic raster layers for input features.

Fig 4.

Heat maps showing the distribution of 12 features including land use, road network, and POI across England and Wales.

The features were aggregated to a 1 km × 1 km grid.

Fig 5.

Histogram of average weekday daily traffic volumes.

The histogram shows traffic volumes across the 498 sensors, calculated over 8 years (2015–2022).

Fig 6.

Spatial visualisation of average weekday daily traffic volumes.

The figure shows traffic levels across the 498 sensors, averaged over 8 years (2015–2022) and aggregated for both directions of the same highway segment.

Fig 7.

A visual representation of the CNN block in the Mukara model.

The block processes grid features by extracting regions of interest (ROIs) around each node, applying convolutional and pooling layers, and generating node embeddings for subsequent graph attention processing.
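The ROI extraction step can be pictured as cropping a fixed-size window of raster cells centred on each node. The sketch below is a minimal single-channel version under stated assumptions: the function name `extract_roi`, the square window, and zero-padding at the raster boundary are illustrative choices, not details given in the caption.

```python
def extract_roi(grid, row, col, half):
    """Return a (2*half+1) x (2*half+1) window centred on cell (row, col).

    grid is a list of equal-length rows (one raster channel).
    Cells that fall outside the raster are filled with 0.0 (assumed padding).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    window = []
    for r in range(row - half, row + half + 1):
        line = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            line.append(grid[r][c] if inside else 0.0)
        window.append(line)
    return window
```

In a multi-channel setting the same crop would be applied to every channel, and the stacked windows would then feed the convolutional and pooling layers.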

Fig 8.

An overview of the GAT block in the Mukara model.

Initial node embeddings are refined through multiple GAT layers, incorporating edge embeddings. The final embeddings are concatenated and passed through an MLP to predict traffic volumes for edges. Solid and dashed lines represent training and test edges, respectively.

Table 6.

Model variants included in the ablation study.

Fig 9.

Learning curve of the default Mukara model.

Fig 10.

Loss values (MGEH and MAE) in the test set for models with different hyperparameter configurations.

Fig 11.

Prediction performance of the Mukara model.

(Left) Scatter plot comparing predicted traffic volumes with ground truth values for all sensor-year points, with GEH boundaries for reference. (Upper right) Histogram of mean GEH for each sensor, averaged over 8 years. (Lower right) Bar plots of MGEH and MAE for sensors grouped by traffic volume quartiles. Results are for the first fold of the cross-validation. Metrics shown are mean and standard deviation across folds.
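For readers unfamiliar with the GEH statistic, the sketch below computes it in its standard form, GEH = sqrt(2(M − C)² / (M + C)), where M is the modelled volume and C the observed count, along with its mean (MGEH) and the MAE. Treating a pair of zero volumes as a perfect match and the function names are assumptions; the caption does not say whether the paper rescales daily volumes before applying the formula, so this may differ from the authors' exact computation.

```python
import math


def geh(predicted: float, observed: float) -> float:
    """Standard GEH statistic for one predicted/observed volume pair."""
    total = predicted + observed
    if total == 0:
        return 0.0  # both volumes zero: treated as a perfect match (assumption)
    return math.sqrt(2.0 * (predicted - observed) ** 2 / total)


def mean_geh(predicted, observed) -> float:
    """Mean GEH (MGEH) over paired sequences of volumes."""
    pairs = list(zip(predicted, observed))
    return sum(geh(p, o) for p, o in pairs) / len(pairs)


def mae(predicted, observed) -> float:
    """Mean absolute error over paired sequences of volumes."""
    pairs = list(zip(predicted, observed))
    return sum(abs(p - o) for p, o in pairs) / len(pairs)
```

Because the squared error is divided by the sum of the two volumes, GEH penalises a given absolute error less on busy links than on quiet ones, which is why it is reported alongside MAE.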

Fig 12.

Error maps showing signed MGEH values for northbound and southbound traffic.

Positive values (red) indicate overestimation, while negative values (blue) indicate underestimation. The maps reveal localised errors, particularly around areas such as Manchester, but no clear geographical trends overall.

Table 7.

Model performance under random 5-fold CV and spatially blocked CV. Values are reported as mean (standard deviation) across folds.
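The difference between the two schemes is that spatially blocked CV keeps nearby sensors in the same fold, so the test set is geographically separated from the training set. The sketch below shows one simple way to do this, assuming square blocks on projected coordinates in kilometres and a round-robin assignment of blocks to folds. The block shape, block size, and assignment rule are illustrative assumptions and are not taken from the table.

```python
def spatial_block_folds(coords, block_km, n_folds):
    """Map each (x_km, y_km) sensor location to a fold via its square block.

    All sensors in the same block share a fold, so no block is split
    between training and test sets.
    """
    blocks = [(int(x // block_km), int(y // block_km)) for x, y in coords]
    # Deterministic block -> fold mapping: sort unique blocks, deal round-robin.
    fold_of_block = {b: i % n_folds for i, b in enumerate(sorted(set(blocks)))}
    return [fold_of_block[b] for b in blocks]
```

With 50 km blocks and two folds, sensors at (1, 1) and (2, 3) fall in the same block and therefore the same fold, whereas random 5-fold CV could place them on opposite sides of the train/test split.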

Fig 13.

Test set loss values for different population and employment stratification levels.

Increased stratification improves model performance for employment and combined features.

Fig 14.

Radar plots showing the percentage change in MGEH (red) and MAE (blue) when individual feature sets are removed.

The analysis is presented for overall performance and traffic volume tertiles (low, medium, high). Negative changes indicate a reduction in loss, suggesting possible overfitting or redundancy.
