
Fig 1.

Model overview.

For each batch of pairs, each consisting of a music track x and its corresponding multi-tag y, the music tracks undergo transformations (indicated by arrows) to compute the self-supervised learning loss and the metric learning loss. These losses define the overall loss function (Eq (20)) used to train the proposed model. After training, given a music track x, the embedding vector zexc and the estimated multi-tag probabilities are used for similarity-based retrieval and auto-tagging, respectively.

Table 1.

Results for supervised scenario of MagnaTagATune dataset.

Table 2.

Results for supervised scenario of MTG-Jamendo dataset.

Fig 2.

Similarity-based retrieval R@K results for semi-supervised scenario of MagnaTagATune dataset.

Fig 3.

Similarity-based retrieval M@K results for semi-supervised scenario of MagnaTagATune dataset.

Fig 4.

Auto-tagging results for semi-supervised scenario of MagnaTagATune dataset.

Fig 5.

Similarity-based retrieval R@K results for semi-supervised scenario of MTG-Jamendo dataset.

Fig 6.

Similarity-based retrieval M@K results for semi-supervised scenario of MTG-Jamendo dataset.

Fig 7.

Auto-tagging results for semi-supervised scenario of MTG-Jamendo dataset.

Fig 8.

t-SNE visualization of similarity latent space for MagnaTagATune dataset.

Green, blue, and yellow dots correspond to music tracks with ‘female vocal’ tags, ‘no vocal’ tags, and other tags, respectively. The percentage % indicates the reduction in labels used for training.

Fig 9.

t-SNE visualization of similarity latent space for MTG-Jamendo dataset.

Green, blue, and yellow dots correspond to music tracks with ‘instrument---voice’ tags, ‘genre---instrumentalpop’ tags, and other tags, respectively. The percentage % indicates the reduction in labels used for training.
