Prediction of cognitive impairment through speech data analysis: A comparative evaluation of deep learning models

doi:10.1371/journal.pone.0349412


Fig 2

Architecture of the AST (Audio Spectrogram Transformer) model.

This schematic shows how a 2D audio spectrogram is divided into local patches, which are then processed by transformer encoder blocks whose self-attention mechanisms learn global acoustic context.
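The patch-splitting step described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `split_into_patches`, the toy 8 x 12 input, and the patch size and stride are all assumptions chosen for readability, and the subsequent linear projection, positional embedding, and transformer encoder blocks are omitted.

```python
from typing import List

# Rows are frequency bins, columns are time frames (assumed layout).
Spectrogram = List[List[float]]


def split_into_patches(spec: Spectrogram, patch: int, stride: int) -> List[List[float]]:
    """Slide a patch x patch window over the spectrogram and flatten each window.

    Hypothetical helper for illustration only. A stride smaller than the
    patch size yields overlapping patches.
    """
    n_freq, n_time = len(spec), len(spec[0])
    patches = []
    for f0 in range(0, n_freq - patch + 1, stride):
        for t0 in range(0, n_time - patch + 1, stride):
            # Flatten the 2D window row by row into one token vector.
            flat = [spec[f][t]
                    for f in range(f0, f0 + patch)
                    for t in range(t0, t0 + patch)]
            patches.append(flat)
    return patches


# Toy 8 x 12 "spectrogram" whose values encode their own position.
toy = [[float(f * 12 + t) for t in range(12)] for f in range(8)]

# Non-overlapping 4 x 4 patches: 2 along frequency, 3 along time.
tokens = split_into_patches(toy, patch=4, stride=4)

# Overlapping patches with stride 2: 3 along frequency, 5 along time.
overlapping = split_into_patches(toy, patch=4, stride=2)

print(len(tokens), len(tokens[0]), len(overlapping))
```

Each flattened patch plays the role of one input token. In a transformer encoder, self-attention then lets every token attend to every other token, which is how context spanning the whole spectrogram can be learned from local patches.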

doi: https://doi.org/10.1371/journal.pone.0349412.g002