Comparing UNet configurations for anthropogenic geomorphic feature extraction from land surface parameters

doi:10.1371/journal.pone.0325904

Fig 1.

UNet architecture [1].

FMs = feature maps, w = width, h = height. Arrows indicate operations while blocks indicate data: input predictor variables, intermediate feature maps, or output class logits.

Fig 2.

Residual connection [2] within a double-convolution block using (a) an identity connection and (b) a projection connection.

Fig 3.

Dilated convolution (DC) module implemented in geodl [27].
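
As a rough illustration of what dilation does, the sketch below applies a dilated convolution to a one-dimensional signal in plain Python. It is not the geodl DC module; the function names `dilated_conv1d` and `effective_kernel_size` are hypothetical, and the 1D setting, absence of padding, and unit stride are simplifying assumptions.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel once gaps of (dilation - 1) are inserted between taps."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded), stride-1 dilated cross-correlation of a 1D signal.

    Hypothetical helper for illustration only; not part of geodl.
    """
    span = effective_kernel_size(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Each kernel weight samples the signal at steps of `dilation`.
        total = sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        outputs.append(total)
    return outputs
```

With a three-tap kernel and a dilation of 2, each output draws on a five-sample window while still using only three weights, which is how dilation enlarges the receptive field without adding parameters.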

Fig 4.

Squeeze and excitation (SE) module [58].

Fig 5.

Attention gate (AG) module mechanism [61].

Fig 6.

UNet implementation provided by geodl [27].

Fig 7.

Training, validation, and test areas for (a) the agricultural terraces (terraceDL) dataset [29] in Iowa, USA; (b) the surface coal mining valley fill faces (vfillDL) dataset [30] in southern West Virginia, eastern Kentucky, and southwestern Virginia, USA; and (c) the historic mine benches (minebenchDL) dataset [31] in northern West Virginia, USA.

Panels (d) through (i) show example terrain surfaces and their associated geomorphic features.

Fig 8.

Training loss for terraceDL, mineBenchDL, and vfillDL datasets using all training samples and different model configurations across 25 training epochs.

Table 1.

Overview of the available training, validation, and test chips for each dataset used.

Table 2.

Summary of UNet-based models with descriptions of each configuration.

Table 3.

Model complexity and computational cost comparison. FLOPs = floating-point operations; MACs = multiply-accumulate operations; 1 GFLOP = 1 billion FLOPs; 1 GMAC = 1 billion MACs.
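
To make the units concrete, the sketch below counts multiply-accumulate operations for a single 2D convolution layer and converts the result to GMACs. It is a minimal illustration, not the procedure used to produce the table: the function names are hypothetical, the layer is assumed to use stride 1, "same" padding, and no bias, and the factor of two between MACs and FLOPs is a common convention rather than something stated in the caption.

```python
def conv2d_macs(height, width, in_channels, out_channels, kernel_size):
    """MACs for one stride-1, same-padded 2D convolution without bias (assumed setup)."""
    # Every output pixel in every output channel needs k * k * in_channels MACs.
    return height * width * out_channels * in_channels * kernel_size ** 2


def to_giga(count):
    """Convert a raw operation count to billions (GMACs or GFLOPs)."""
    return count / 1e9


def macs_to_flops(macs):
    """Assumed convention: one multiply-accumulate counts as two floating-point operations."""
    return 2 * macs


# Hypothetical layer: 512 x 512 feature map, 64 -> 64 channels, 3 x 3 kernel.
layer_macs = conv2d_macs(512, 512, 64, 64, 3)
layer_gmacs = to_giga(layer_macs)
layer_gflops = to_giga(macs_to_flops(layer_macs))
```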

Table 4.

Assessment metrics used in this study. TP = true positive; TN = true negative; FP = false positive; FN = false negative.
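
The counts defined in this caption combine into the standard binary classification metrics. The sketch below uses the textbook definitions of overall accuracy, precision, recall, and F1-score; the function name `binary_metrics` is hypothetical, and the choice to return 0.0 when a denominator is zero is an assumption, not something taken from the table.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts (hypothetical helper)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual positives that are found
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    else:
        f1 = 0.0
    return {
        "overall_accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Because features such as terraces or mine benches typically occupy a small share of the pixels, overall accuracy can stay high even when the positive class is poorly mapped, so precision, recall, and F1-score are usually the more informative figures.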

Fig 9.

Training loss for terraceDL, mineBenchDL, and vfillDL datasets using varying training sample sizes and different model configurations across 25 training epochs.

Fig 10.

Validation F1-score for terraceDL, mineBenchDL, and vfillDL datasets using varying training sample sizes and different model configurations across 25 training epochs.

Fig 11.

F1-score calculated from the withheld test data using different architectural configurations and training set sizes.

Red points indicate the Base UNet model.

Table 5.

Testing set assessment metrics for prediction of agricultural terraces using different architectural configurations and varying sample sizes.

Table 6.

Testing set assessment metrics for prediction of historic mine benches using different architectural configurations and varying sample sizes.

Table 7.

Testing set assessment metrics for prediction of surface coal mine valley fill faces using different architectural configurations and varying sample sizes.

Fig 12.

Comparison of agricultural terrace detection results using different architectural configurations.

Fig 13.

Comparison of historic mine bench detection results using different architectural configurations.

Fig 14.

Comparison of valley fill face detection results using different architectural configurations.

Fig 15.

Bootstrap analysis of overall accuracy, F1-score, precision, and recall across model configurations for the mineBenchDL, terraceDL, and vfillDL datasets.

Error bars represent variability from 10 bootstrap test samples, highlighting performance stability differences across configurations.
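
One way such error bars can be produced is sketched below: test chips are resampled with replacement, the F1-score is recomputed for each resample, and the spread across resamples is reported. This is a hedged illustration only. The caption does not describe the resampling scheme, so the per-chip `(TP, FP, FN)` representation, the function names, and the use of the standard deviation as the variability measure are all assumptions.

```python
import random
import statistics


def f1_from_counts(tp, fp, fn):
    """F1-score from pooled counts; returns 0.0 when undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1(chip_counts, n_boot=10, seed=0):
    """Mean and standard deviation of F1 over bootstrap resamples of test chips.

    chip_counts: list of (tp, fp, fn) tuples, one per test chip (hypothetical format).
    n_boot must be at least 2 so that a standard deviation can be computed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    scores = []
    for _ in range(n_boot):
        # Draw as many chips as the test set holds, with replacement.
        sample = [rng.choice(chip_counts) for _ in chip_counts]
        tp = sum(c[0] for c in sample)
        fp = sum(c[1] for c in sample)
        fn = sum(c[2] for c in sample)
        scores.append(f1_from_counts(tp, fp, fn))
    return statistics.mean(scores), statistics.stdev(scores)
```

With only 10 resamples, as in the figure, the spread estimate is itself noisy, so small differences in error-bar length between configurations should be read with caution.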
