A hybrid deep-learning-architecture for identifying cotton content in fabric materials

doi:10.1371/journal.pone.0346583

Fig 1.

Visualization of a possible application of our approach to fabric classification.

Table 1.

Overview of related work for fabric classification.

Fig 2.

Schematic visualization of the hybrid architecture.

Fig 3.

Visualization of the DenseNet121 architecture, in which an AFPN and a DConv layer are integrated between DenseNet blocks three and four.

Fig 4.

Visualization of deformable convolution with the example of a 3×3 convolution filter.

(1) normal convolution, (2) deformable convolution.
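
The difference between the two panels can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the layer used in the paper: the helper names `bilinear` and `conv3x3_at` are hypothetical, and the offsets are passed in by hand, whereas in a deformable convolution layer they are typically predicted by an additional convolution. A normal 3×3 convolution reads the nine points of a fixed grid; the deformable variant shifts each point by a fractional offset and reads the value by bilinear interpolation.

```python
import math

def bilinear(img, y, x):
    """Sample img at a fractional position; points outside the image count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * img[yy][xx]
    return value

def conv3x3_at(img, kernel, cy, cx, offsets=None):
    """Evaluate a 3x3 filter centred at (cy, cx).

    offsets=None gives the normal convolution (fixed grid).
    Otherwise offsets is a list of nine (dy, dx) shifts, one per kernel tap,
    supplied by hand here (an assumption made for illustration).
    """
    out = 0.0
    tap = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            oy, ox = offsets[tap] if offsets else (0.0, 0.0)
            out += kernel[dy + 1][dx + 1] * bilinear(img, cy + dy + oy, cx + dx + ox)
            tap += 1
    return out

# Toy 5x5 image whose value grows linearly along both axes.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
ones = [[1.0] * 3 for _ in range(3)]
normal = conv3x3_at(image, ones, 2, 2)
shifted = conv3x3_at(image, ones, 2, 2, offsets=[(0.0, 0.5)] * 9)
```

With every tap shifted half a pixel to the right, the deformable result differs from the normal one because each sample now blends two neighbouring pixels.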

Fig 5.

Illustration of an AFPN.

Fig 6.

Visualization of the Swin Transformer architecture.

Fig 7.

Training and evaluation approach: The dataset is split into a training set and a test set.

Data augmentation is applied to the training set, followed by model training using transfer learning and fine-tuning. Finally, the resulting model is evaluated on the test set.

Table 2.

Overview of the hyperparameters used for hyperparameter tuning. TL = transfer learning; FT = fine-tuning.

Fig 8.

Example images from the dataset for each class.

Shown are two images per cotton content class (13 classes in total) [4].

Table 3.

Overview of multiple performance indicators applied to measure the performance of the hybrid model across five folds.

Table 4.

Overview of the results achieved for global RMSE, MAE, TPR, PPV, TNR, NPV, F1-score, and the 95% Wilson confidence interval (CI) for the TPR of each class.
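
As a reading aid for these indicators, the sketch below shows one common way to compute the per-class values and a 95% Wilson interval for the TPR from a confusion matrix. It is an assumption-laden illustration rather than the authors' evaluation code: the function names are hypothetical, the matrix is assumed to have true classes as rows and predicted classes as columns, and each class is scored one-vs-rest.

```python
import math

def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def one_vs_rest_metrics(cm, k):
    """Per-class indicators for class k (rows = true class, columns = predicted)."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                    # class k samples predicted as another class
    fp = sum(row[k] for row in cm) - tp     # other samples predicted as class k
    tn = total - tp - fn - fp
    return {
        "TPR": safe_div(tp, tp + fn),       # sensitivity / recall
        "PPV": safe_div(tp, tp + fp),       # precision
        "TNR": safe_div(tn, tn + fp),       # specificity
        "NPV": safe_div(tn, tn + fn),
        "F1": safe_div(2 * tp, 2 * tp + fp + fn),
    }

def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a proportion; the default z gives 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Made-up two-class confusion matrix, used only to exercise the functions.
toy_cm = [[8, 2], [1, 9]]
toy_metrics = one_vs_rest_metrics(toy_cm, 0)
toy_ci = wilson_interval(8, 10)             # TPR of class 0: 8 of 10 correct
```

The Wilson interval is usually preferred over the simple normal approximation for per-class rates because it stays inside [0, 1] and behaves sensibly when a class has few test images.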

Table 5.

Statistical comparison of the hybrid architecture against DenseNet121 and Swin Transformer V2 based on classification accuracy. Paired tests were conducted across N = 10 measurements (two independent runs of 5-fold cross-validation). Significance levels: * p < 0.05, ** p < 0.01, *** p < 0.001.
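
The N = 10 paired measurements arise because every model is evaluated on the same ten train/test splits. A minimal sketch of how such splits could be generated is given below; the function name `repeated_kfold`, the seed handling, and the plain (non-stratified) shuffle are assumptions for illustration, since the caption does not state how the folds were drawn.

```python
import random

def repeated_kfold(n_samples, k=5, repeats=2, seed=0):
    """Yield (run, fold, train_indices, test_indices) for repeated k-fold CV.

    Each run reshuffles the sample indices, so the runs are independent
    partitions of the same dataset (plain shuffle, not stratified by class).
    """
    rng = random.Random(seed)
    for run in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]   # k nearly equal parts
        for f in range(k):
            test = folds[f]
            train = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield run, f, train, test

# Two runs of 5-fold CV give ten splits; the sample count here is arbitrary.
splits = list(repeated_kfold(23))
```

Evaluating all compared architectures on the identical ten splits is what makes a paired test appropriate: each accuracy of one model has a natural partner from the other model.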

Table 6.

Ablation study of the proposed architecture. DN = DenseNet121, Swin = Swin Transformer V2, DConv = deformable convolution layer, AFPN = adaptive feature pyramid network, 2nd FC = second fully connected layer. Paired tests were conducted across N = 10 measurements (two independent runs of 5-fold cross-validation). Gain denotes the accuracy difference in percentage points (pp) relative to the full architecture shown in the first row. The p-values result from paired t-tests comparing each configuration to the first-row architecture. Significance levels: * p < 0.05, ** p < 0.01, *** p < 0.001.
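
To make the Gain and p-value columns concrete, the sketch below computes an accuracy difference in percentage points and the paired t statistic for two sets of matched measurements. The function names and the sign convention (ablated configuration minus full architecture) are assumptions, and the example numbers are invented. The p-value itself would be read from a t distribution with N - 1 degrees of freedom, for instance via `scipy.stats.ttest_rel`, which is not called here so that the block needs only the standard library.

```python
import math
import statistics

def gain_pp(full_acc, variant_acc):
    """Mean accuracy difference in percentage points (variant minus full).

    Accuracies are given as fractions in [0, 1].
    """
    return (statistics.fmean(variant_acc) - statistics.fmean(full_acc)) * 100

def paired_t(a, b):
    """Return (t statistic, degrees of freedom) of a paired t-test on a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)            # sample standard deviation
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.fmean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Invented accuracies for four matched splits, for illustration only.
full = [0.90, 0.92, 0.91, 0.93]
ablated = [0.88, 0.89, 0.90, 0.90]
example_gain = gain_pp(full, ablated)
example_t, example_df = paired_t(full, ablated)
```

A negative gain under this convention means that removing the component lowered accuracy relative to the full architecture.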

Fig 9.

Mean confusion matrix across all five folds.

Fig 10.

Training and validation loss curves from the second run of cross-validation.

Training was stopped at epoch 20 with early stopping. The lowest validation loss occurred at epoch 10.
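
The stopping behaviour described here matches a patience-based early-stopping rule, sketched below. The function name is hypothetical, and the patience of 10 epochs is an assumption that happens to be consistent with the caption (best epoch 10, stop at epoch 20); the caption itself does not state the criterion or the patience value.

```python
def early_stopping(val_losses, patience=10):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once `patience` epochs have passed without the
    validation loss improving on its best value so far.
    """
    best_loss = float("inf")
    best_epoch = None
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch

# Invented loss curve: improves for 10 epochs, then plateaus at a higher value.
losses = [1.0 - 0.05 * i for i in range(10)] + [0.9] * 15
stop_epoch, best_epoch = early_stopping(losses, patience=10)
```

In practice the weights from the best epoch, not the last one, are usually restored before evaluation on the test set.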

Fig 11.

Scatter plot showing the distribution of the predicted probabilities with which each image was assigned to its correct class.

The test images (n = 653) from all 5 runs of the cross-validation are shown. The colored dots indicate the average of all individual probabilities for a class. The average probabilities per class are: class 30%: 86.47%; class 40%: 88.93%; class 50%: 92.21%; class 53%: 69.59%; class 58%: 69.76%; class 60%: 69.93%; class 63%: 56.12%; class 65%: 70.00%; class 66%: 81.71%; class 80%: 84.15%; class 95%: 78.83%; class 98%: 73.95%; class 99%: 74.53%.
