A New Approach to Model Pitch Perception Using Sparse Coding

Fig 7

Comparing the performance of different dictionaries over moderate and high amplitude stimulus levels.

All simulations have the same spectral structure (Eq 6). This spectral structure is simulated for various fundamental frequencies, f0, and the figures show the estimated pitch for each case (i.e., the maximum peak in each pdf). The estimates are taken from an interval of ±0.5 octaves around f0. Each row (panels A-B and panels C-D) shows the estimation results of the SC model for one of the two dictionaries, Dsine and Dstack, respectively (see text). The columns correspond to different stimulus levels: moderate (45 dB SPL) and high (90 dB SPL). The x-axis denotes the location of the first harmonic within the stimulus (i.e., the 3rd harmonic); the thick black dashed lines mark the main octave (f0), and the thin black dashed lines mark the lower and upper octaves, i.e., 0.5f0 and 2f0, respectively. (A-B) At low frequencies, up to about 4 kHz for the lowest harmonic in the complex stimulus, the estimates of the Dsine dictionary converge to the expected frequencies for both moderate and high stimulus levels. However, from 4 kHz and above, the pitch estimates for the high stimulus level diverge from the main octave to other ratios of f0. (C-D) The pitch estimates of the Dstack dictionary converge to the main octave better at both low and high frequencies and for both amplitudes.
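
The read-out rule described in the caption (take the maximum of the pdf within ±0.5 octaves of f0) can be illustrated with a minimal sketch. This is not the authors' code: the function name `estimate_pitch`, its arguments, and the toy pdf values are hypothetical, and the "maximum peak" is simplified here to the largest pdf value among the candidate frequencies that fall inside the search interval.

```python
def estimate_pitch(freqs, pdf, f0, half_width_octaves=0.5):
    """Return the candidate frequency with the largest pdf value
    inside [f0 * 2**-w, f0 * 2**w], where w = half_width_octaves.

    freqs and pdf are equal-length sequences (hypothetical inputs):
    candidate pitch frequencies in Hz and their probability values.
    """
    # Interval bounds in Hz: +/- w octaves around f0.
    lo = f0 * 2.0 ** (-half_width_octaves)
    hi = f0 * 2.0 ** half_width_octaves

    # Keep only the candidates inside the search interval.
    candidates = [(p, f) for f, p in zip(freqs, pdf) if lo <= f <= hi]
    if not candidates:
        raise ValueError("no candidate frequencies inside the search interval")

    # Simplification: the largest value in the interval stands in for
    # the "maximum peak" of the pdf.
    _, best_f = max(candidates, key=lambda c: c[0])
    return best_f


# Toy example (made-up numbers): the largest pdf value overall sits at
# 100 Hz, but it lies outside +/- 0.5 octaves of f0 = 200 Hz.
freqs = [100.0, 150.0, 200.0, 250.0, 400.0]
pdf = [0.5, 0.1, 0.3, 0.05, 0.05]
estimate = estimate_pitch(freqs, pdf, f0=200.0)
```

Restricting the search to a window around f0 means that a stronger peak at another octave, such as the one at 100 Hz in the toy data, does not affect the estimate for that window.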

doi: https://doi.org/10.1371/journal.pcbi.1005338.g007