Fig 1.
Block diagram of the methodology, including the data collection procedure (text and music).
Data are collected from Reddit posts using an API (a). Titles reflect a variety of user-submitted posts about music in subreddits. In addition, instances of context used in combination with the music tracks are collected (b). The data are then processed using topic modeling, thematic analysis, and statistical tests to generate results (c).
Fig 2.
An example of a Reddit post used in our study with a title, description, and comments.
The title reflects the user's intention to use music, the description provides further details about the post, and the comments section shows the various tracks shared.
Table 1.
Examples of text snippets and their subsequent categorization as healthy or unhealthy using the Healthy-Unhealthy Music Scale (HUMS).
Fig 3.
Topic modeling pipeline used for clustering the Reddit posts.
All Reddit posts are clustered into different themes using the BERTopic technique. The figure illustrates the components of the pipeline with examples.
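As a rough illustration of how a BERTopic-style pipeline labels clusters, the sketch below covers only the final step: ranking words per cluster with class-based TF-IDF, using the weighting W = tf * log(1 + A / f) described in the BERTopic documentation. It assumes the cluster labels are already available from the earlier embedding, dimensionality reduction, and clustering steps. The example posts, the labels, and the function names `c_tf_idf` and `top_words` are hypothetical and are not taken from the study. Whether the figure shows exactly these components is also an assumption.

```python
import math
from collections import Counter


def c_tf_idf(docs, labels):
    """Return {cluster: {word: weight}} with weight = tf * log(1 + A / f).

    tf: frequency of the word within the cluster, normalized by cluster size.
    f:  frequency of the word across all clusters.
    A:  average number of words per cluster.
    """
    per_cluster = {}
    for doc, label in zip(docs, labels):
        # Treat all posts of a cluster as one concatenated document.
        per_cluster.setdefault(label, Counter()).update(doc.lower().split())

    total = Counter()
    for counts in per_cluster.values():
        total.update(counts)
    avg_words = sum(total.values()) / len(per_cluster)

    weights = {}
    for label, counts in per_cluster.items():
        n_words = sum(counts.values())
        weights[label] = {
            word: (count / n_words) * math.log(1 + avg_words / total[word])
            for word, count in counts.items()
        }
    return weights


def top_words(weights, k=3):
    """Return the k highest-weighted words per cluster (ties broken alphabetically)."""
    return {
        label: [w for w, _ in sorted(ws.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for label, ws in weights.items()
    }


# Hypothetical post titles and cluster labels, for illustration only.
example_docs = [
    "music helps me relax",
    "calm music to relax",
    "sad songs when sad",
    "sad music playlist",
]
example_labels = [0, 0, 1, 1]
example_weights = c_tf_idf(example_docs, example_labels)
```

Words that occur in every cluster (here, "music") receive a lower weight than words concentrated in one cluster, which is what makes the top-ranked words usable as theme descriptors.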
Table 2.
Results obtained from topic modeling and thematic analysis.
Fig 4.
Distribution of the number of posts assigned to each listening strategy identified in the r/depression subreddit.
Using music for relaxation and calming is the most common listening strategy overall, whereas deliberately listening to depressive music is the most common of the unhealthy music listening strategies.
Fig 5.
Violin plots for frequency scores of all the lyrical themes.
Significant differences, found for Self-reference, Optimism, and Blame, are marked with the p-values obtained from the Mann-Whitney U (MWU) test. The y-axis shows frequency scores for the occurrence of words from the dictionary list of each semantic variable.
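The two quantities in this caption can be sketched in a few lines. The code below is a minimal illustration, not the study's implementation. It assumes a frequency score is the percentage of lyric tokens that appear in a dictionary word list, and it computes a two-sided MWU (Mann-Whitney U) test with the normal approximation and tie correction, without continuity correction. The function names and the example word list are hypothetical. In practice, a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
import math
import re


def frequency_score(lyrics, dictionary_words):
    """Percentage of tokens in `lyrics` found in `dictionary_words` (assumed definition)."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary_words)
    return 100.0 * hits / len(tokens)


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to tied values and accumulate the tie term.
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_x, p_value
```

A typical use would compute `frequency_score` per track for one semantic variable, split the scores by healthy versus unhealthy listening strategy, and pass the two lists to `mann_whitney_u`. The normal approximation is only reasonable for moderately large samples.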
Fig 6.
Violin plots describing the distribution of valence and energy values for the tracks.
Tracks associated with unhealthy music listening strategies have significantly lower valence than those associated with healthy listening strategies.