From Motion to Emotion: Accelerometer Data Predict Subjective Experience of Music

Music is often described as emotional because it reflects expressive movements in audible form. Thus, a valid approach to measuring musical emotion could be to assess movement stimulated by music. In two experiments we evaluated the discriminative power of mobile-device generated acceleration data produced by free movement during music listening for the prediction of ratings on the Geneva Emotion Music Scales (GEMS-9). The quality of prediction for different dimensions of GEMS varied between experiments for tenderness (R₁² (first experiment) = 0.50, R₂² (second experiment) = 0.39), nostalgia (R₁² = 0.42, R₂² = 0.30), wonder (R₁² = 0.25, R₂² = 0.34), sadness (R₁² = 0.24, R₂² = 0.35), peacefulness (R₁² = 0.20, R₂² = 0.35), joy (R₁² = 0.19, R₂² = 0.33) and transcendence (R₁² = 0.14, R₂² = 0.00). For others like power (R₁² = 0.42, R₂² = 0.49) and tension (R₁² = 0.28, R₂² = 0.27) results could be almost reproduced. Furthermore, we extracted two principal components from GEMS ratings, one representing arousal and the other one valence of the experienced feeling. Both qualities, arousal and valence, could be predicted by acceleration data, indicating that they provide information on the quantity and quality of experience. On the one hand, these findings show how music-evoked movement patterns relate to music-evoked feelings. On the other hand, they contribute to integrating findings from the field of embodied music cognition into music recommender systems.


Introduction
Music is often used to regulate emotions, for example to reduce stress, or to influence one's mood, as shown by [1] and [2]. This means that listening to music is closely linked to the experience of emotions [3][4][5]. Nonetheless, these subjective qualities of music still play only a minor role in the field of Music Information Retrieval (MIR) and Music Recommender Systems (MRS), i.e. for the retrieval and recommendation of music offered by web-based services [6].
In the field of Affective Computing [6] there is increasing effort to connect the physical characteristics of music to the emotional valence and arousal values of the circumplex model from [7]. However, an approach based on the acoustical features of music and their implications for the perception of emotions does not yet take into account the motor origin hypothesis of emotion in music as proposed by [8]. Accordingly, music is often described as emotional because it reflects expressive movements in audible form [9][10][11]. Furthermore, the term emotion derives from the Latin movere, to move, and is used as a synonym for being moved. That is why Leman calls for new non-verbal, embodied possibilities to describe music and its experience [12]. He suggests using corporeal articulations as a bridge between linguistic self-report measures and measurements of physical energy like pitch, loudness or tempo because "human action can realize the transformation from physical energy to cultural abstraction, and vice versa" ([12] p. 77).
Describing music by moving a mobile device like a smartphone could build a bridge between physical energy and subjective experience. Smartphones are increasingly popular devices for listening to music [13]. Once corporeal articulations are available from smartphone-assessed motion data and are emotionally interpretable, missing emotional descriptions of music in MIR and MRS could be provided based on these embodied descriptions. Hence, a model able to translate between corporeal and verbal descriptions of music would not only offer innovative, multimodal access to the retrieval of music, but also provide additional semantic annotations of the emotional qualities of music that enrich conventional verbal search.

Verbal Models of Musical Emotion
The most widespread models to describe the emotional qualities of music listening experience are the basic emotions model, the circumplex model and the Geneva Emotion Music Scales (GEMS) [14].
The basic emotion model assumes that there are four to six basic emotions that have evolutionary meaning and that are culturally universal. They often include fear, anger, happiness, sadness, disgust, and surprise [14]. The circumplex model maps these basic emotions and many other emotional feelings onto a two-dimensional space that is spanned by valence ("how pleasant or unpleasant is the experience?") and arousal ("how intense is the experience?") [7]. Accordingly, the quality of each emotion can be described by these two underlying qualities.
The GEMS have been iteratively developed and evaluated in four studies [15]. The aim for music-related emotions was to find "a more nuanced affect vocabulary and taxonomy than is provided by current scales and models of emotion" ([15] p. 513). The original version of the GEMS comprises 45 terms, including feeling of transcendence, nostalgic, solemn or impatient, that are not part of any other emotional model. The items of the GEMS-9, a shortened version of the GEMS-45, are, like those of the long version, grouped into the categories of sublimity (wonder, transcendence, tenderness, nostalgia, peacefulness), vitality (power, joyful activation) and unease (tension, sadness) [15]. Torres-Eliard, Labbé and Grandjean collected self-report measures of the GEMS and suggested that it is a suitable model to assess musical emotion [16]. They concluded that "the results indicate a high reliability between listeners for different musical excerpts and for different contexts of listening including concerts, i.e. a social context, and laboratory experiments" ([16] p. 252).

Corporeal Articulation of Musical Emotion
Several studies have shown that there is a close link between movement and emotion in music. Sievers et al. asked participants to adjust the features rate, jitter, consonance/smoothness, step size and direction for each of the following five emotions: anger, happiness, peacefulness, sadness and fear [10]. For one group the adjustment of features led to different movement and appearance of a bouncing ball. For the second group adjusting features changed the melody and expression of a piano piece. Experiments were conducted both in the U.S.A. and Cambodia. The settings used for different emotions were highly similar for motion and music in both cultures. Accordingly, the authors conclude that emotion expression in music and movement seem to be based on the same universal features. Giordano et al. studied the relationship between walking and emotion and its implications for the expression of musical performances [8]. Slow, quiet, and irregular walking sounds were associated with expressing sadness while fast, loud, and regular walking sounds with happiness. Similar patterns are also used in music performance (see [17]). Thus, they concluded that their findings "support the motor-origin hypothesis of musical emotion expression that states that musicians and listeners make use of general movement knowledge when expressing and recognizing emotions in music" ([8] p. 29). This connection between musical rhythm and motor activities was supported by Parncutt already in 1987 [18].
There are many possibilities to express music listening experience in an embodied way. Among them are tapping or moving parts of the body along with the beat, singing, imitating the playing of a musical instrument, or dancing. Hedder also evaluated an approach based on facial expression as a form of embodiment [19]. Drawing, as described in De Bruyn, Moelants, and Leman [20] and [12], is another alternative as a means of graphical attuning to the experience. Last but not least, the use of acceleration sensor data generated by arm gestures, as in Amelynck, Grachten, van Noorden and Leman [21], was described as a very promising approach to multimodal querying on mobile devices.
Amelynck et al. [21] investigated how motion can be linked to emotion in the context of MIR. They asked participants to perform arm gestures while holding a Wii remote controller in order to describe the music. Afterwards the emotional qualities of the musical excerpts were rated on the dimensions of valence and arousal. Using motion features recorded with the Wii controller generated fairly good predictions for the dimension of arousal, but performed less precisely for the dimension of valence. The authors argue that this might be due to people rating sad music as pleasant (as described in [22]), and conclude that the circumplex model might be unsuitable for use with musical emotions. Juslin and Västfjäll also note that perceiving mixed emotions, which are positive and negative at the same time, limits the applicability of the circumplex model [23]. Accordingly, Amelynck et al. suggest the GEMS as an alternative to the circumplex model.

Aims and Experimental Design
The goal of the present study is to explore which of the GEMS can be predicted by mobile-device generated acceleration data, and how well. In this way, the present study continues the work of Amelynck et al. [21], testing an alternative emotion model and using different motion sensors. Hence, these findings will contribute to understanding how acceleration data can be used to integrate embodied music cognition into Music Recommender Systems. Furthermore, the study tests the often described similarities between certain motion features and emotional qualities [8] [10].
First, a pilot study was conducted to develop a measurement instrument for the following experiments. Afterwards, we conducted two experiments to test if and how accelerometer data can be used to describe musical experience. Here, the second experiment tested if results could be replicated for different music samples and if results changed when participants were free to choose the songs they felt like moving to. Furthermore, we also tested how different GEMS qualities relate to different movement patterns (rhythmic vs. gestalt), e.g. if music experienced as sad was less suitable for rhythmic movement patterns.

Ethics Statement
Prior to participating in either experiment, individuals were informed of the general goals of the experiments and of the procedures involved, i.e. describing the music corporeally and rating its emotional qualities. They gave oral consent to participating in the study and to the storing of the data collected during the experiment. No ethics approval was required from the Technical University Berlin for behavioral studies such as those reported in this manuscript. There was no institutional review board available at the department where the experiment was conducted. Neither of the experiments involved deception or stressful procedures. Participants were informed that they were free to leave the experiment at any time, and that their data would be analyzed anonymously. Participants in both experiments were recruited on a voluntary basis among students and interested acquaintances. Some students received course credit for participation. Others shared a professional or private interest in the study and its methodology and therefore volunteered to participate. The research reported in this manuscript was carried out according to the principles expressed in the Declaration of Helsinki. No personally identifying information other than that reported in this manuscript was collected (see Methods section).

Development of Measurement Smartphone App
We conducted a pilot study to design and accompany the development of the measurement instrument. In this phase, we interviewed 11 persons with different backgrounds, i.e. of different age and gender, musicians and non-musicians, about their preferred way of describing music experience in an embodied way. After testing two favorites in a prototypical stage, participants opted for performing free movements while holding their smartphone device over drawing lines. Hence, an Android app was developed iteratively, applying the think-aloud method to integrate participants' feedback as described by [24].
Android was chosen because of the wider spread of Android devices, which could enable us to repeat the experiment with a larger sample size in the future. The app presented the music stimuli to participants and simultaneously recorded accelerometer data from smartphone sensors (for more details, see the documentation for Android Developers [25]). Afterwards, it presented nine emotional attributes taken from the GEMS-9 short version. Here, participants were instructed to rate the emotional qualities of the music excerpts presented.

Stimulus Selection
Music used in both experiments was selected in a participatory approach [26] based on suggestions by the participants of the pilot study according to the following criteria:
1. account for a variety of field participants' preferences
2. cover the range of the GEMS-9
3. keep the balance between female and male artists
4. do not let emotions be covered in a stereotypical way like tenderness by female artists or tension by male artists
5. artists of different color
6. cover a variety of genres
For each musical piece, an excerpt of approximately 40 s duration was chosen such that it was as homogeneous as possible with respect to GEMS qualities during this time.

Motion Data Analysis
The general workflow for motion data analysis was as follows:
1. get raw acceleration data from motion sensor for x, y and z in 3D space
2. cut beginning (first 5 s) and end to standardize duration d of signals to d = 35 s
3. resample with a sample rate of approximately 5.7 Hz
4. apply PCA to x, y and z
5. extract motion features
6. normalize range of motion features intra-individually
7. split data set into training (50%) and test (50%) set
8. stepwise select features on training set and fit linear regression model for each GEMS feeling
9. evaluate model quality on test set for each GEMS feeling
Motion Feature Extraction. As participants needed a few seconds to fully get into the movement, the first five seconds were cut from the beginning of the motion data. Prior to feature extraction we applied a Principal Component Analysis (PCA) to each recording of accelerometer data (per stimulus and participant). That way, the x-, y- and z-dimensions were transformed to three principal components PC1, PC2 and PC3. This helped to enhance the comparability of movements between participants, e.g. to account for different ways of holding a device. We did not apply PCA in order to compress data; all dimensions were kept. Furthermore, we did not extract any direction-dependent features, which would have argued against applying a PCA. Table 1 shows an overview of the features extracted to characterize the movement, categorized into tempo, size, regularity and smoothness. The statistical features absolute skewness, median and standard deviation (std) are computed to get a time-compressed representation of the features extracted from the time series of motion data. As most features were not normally distributed, we chose to compute the median rather than the mean. During the selection of features we learned that the standard deviation still served as a significant feature to represent the degree of variance in the distribution.
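The PCA rotation and the time-compressed summary statistics described above can be sketched as follows. This is a minimal illustration, not the study's implementation: Table 1 is not reproduced here, so the definitions of mid-crossings (sign changes of the centred signal) and peaks (local maxima of the absolute signal) are assumptions, and the recording is synthetic.

```python
import numpy as np

def pca_rotate(acc):
    """Rotate an (n_samples, 3) recording onto its principal axes, keeping all three."""
    centered = acc - acc.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # PC1 carries the largest variance
    return centered @ eigvecs[:, order]

def abs_skewness(values):
    """Absolute sample skewness; 0.0 for fewer than three values or zero spread."""
    if values.size < 3 or values.std() == 0:
        return 0.0
    z = (values - values.mean()) / values.std()
    return float(abs(np.mean(z ** 3)))

def summarize(values, name):
    """Median, standard deviation and absolute skewness of one feature series."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return {f"median_{name}": 0.0, f"std_{name}": 0.0, f"skewness_{name}": 0.0}
    return {
        f"median_{name}": float(np.median(values)),
        f"std_{name}": float(values.std()),
        f"skewness_{name}": abs_skewness(values),
    }

def component_features(pc, sample_rate_hz):
    """Assumed tempo and size features for a single principal component."""
    signs = np.sign(pc)
    crossings = np.flatnonzero(signs[:-1] * signs[1:] < 0)    # mid-crossings
    dist_midcrosses = np.diff(crossings) / sample_rate_hz     # in seconds
    mag = np.abs(pc)
    is_peak = (mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
    features = summarize(dist_midcrosses, "dist_midcrosses")
    features.update(summarize(mag[1:-1][is_peak], "peak"))
    return features

# Synthetic 40 s recording at roughly 5.7 Hz (NOT data from the study).
SAMPLE_RATE_HZ = 5.7
rng = np.random.default_rng(0)
t = np.arange(0, 40, 1 / SAMPLE_RATE_HZ)
raw = np.column_stack([
    2.0 * np.sin(2 * np.pi * 1.0 * t),
    0.5 * np.sin(2 * np.pi * 0.5 * t),
    0.1 * rng.standard_normal(t.size),
])
trimmed = raw[t >= 5.0]                          # drop the first five seconds
pcs = pca_rotate(trimmed)
features = component_features(pcs[:, 0], SAMPLE_RATE_HZ)
```

Because all three components are retained, the rotation only re-expresses the recording in its own axes, which is what makes recordings comparable across different ways of holding the device.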
In order to remove any inter-individual differences in movement size, all features were subsequently range-normalized intra-individually to the interval [0, 1]. Fig 1 shows an example of one accelerometer recording before and after applying PCA and illustrates the extraction of different features. When results are described, we will refer to positive acceleration in eigenspace as forward movement and to negative acceleration in eigenspace as backward movement.
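A minimal sketch of intra-individual range normalization, assuming one row per recording and one column per motion feature. The function name, the toy data and the handling of features that are constant within a participant (mapped to 0) are illustrative assumptions.

```python
import numpy as np

def range_normalize_within(features, participant_ids):
    """Scale every feature column to [0, 1] separately for each participant."""
    features = np.asarray(features, dtype=float)
    participant_ids = np.asarray(participant_ids)
    out = np.zeros_like(features)
    for pid in np.unique(participant_ids):
        rows = participant_ids == pid
        block = features[rows]
        lo = block.min(axis=0)
        span = block.max(axis=0) - lo
        # Assumption: a feature that never varies within a participant maps to 0.
        safe_span = np.where(span > 0, span, 1.0)
        out[rows] = (block - lo) / safe_span
    return out

# Hypothetical data: two participants, three recordings each, two features.
X = np.array([[1.0, 10.0], [2.0, 30.0], [3.0, 20.0],
              [10.0, 5.0], [30.0, 5.0], [20.0, 5.0]])
ids = np.array(["p1", "p1", "p1", "p2", "p2", "p2"])
X_norm = range_normalize_within(X, ids)
```

In this toy example the second participant moves ten times as far as the first on the first feature, yet both end up with the same normalized values up to ordering, which is the intended removal of inter-individual size differences.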
Model Fitting. We fitted one linear regression model for every GEMS feeling, with all motion features as predictor variables. There was no strong multi-collinearity between motion features, as indicated by the fact that the Variance Inflation Factor was smaller than 10 for every predictor. Stimulus order effects could have resulted in an emotional afterglow effect of stimuli (e.g. the rating of the second piece is influenced by the rating of the first one), violating the assumption of observational independence in linear regression models. However, no such effect could be observed in test plots on autocorrelation of residuals and test plots mapping order against residuals. Features were selected using the forward stepwise algorithm for linear regression [27]. Before selecting the features and fitting the model, the data was partitioned into a training and a test set, approximately 50% each, in order to evaluate each model's ability to generalize to unseen data. The test set was compiled by randomly selecting five observations from each participant.
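The model-fitting steps can be illustrated with the following sketch. The stopping rule of the forward stepwise algorithm is not stated here, so an AIC criterion is assumed; the data are synthetic, and ten observations per participant are assumed so that five held-out observations give a 50% test set.

```python
import numpy as np

def design(X):
    """Prepend an intercept column to the predictor matrix."""
    return np.column_stack([np.ones(X.shape[0]), X])

def fit_ols(X, y):
    """Ordinary least-squares coefficients for y ~ intercept + X."""
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return coef

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def aic(y, y_hat, n_params):
    """Gaussian AIC up to an additive constant (assumed selection criterion)."""
    n = len(y)
    return n * np.log(np.sum((y - y_hat) ** 2) / n) + 2 * n_params

def forward_stepwise(X, y):
    """Greedily add the predictor that lowers AIC most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = aic(y, np.full(len(y), y.mean()), 1)       # intercept-only model
    while remaining:
        scored = []
        for j in remaining:
            cols = selected + [j]
            y_hat = design(X[:, cols]) @ fit_ols(X[:, cols], y)
            scored.append((aic(y, y_hat, len(cols) + 1), j))
        cand_aic, cand = min(scored)
        if cand_aic >= best:
            break
        best = cand_aic
        selected.append(cand)
        remaining.remove(cand)
    return selected

def split_by_participant(ids, n_test, rng):
    """Randomly hold out n_test observations per participant as the test set."""
    test = np.zeros(len(ids), dtype=bool)
    for pid in np.unique(ids):
        rows = np.flatnonzero(ids == pid)
        test[rng.choice(rows, size=n_test, replace=False)] = True
    return ~test, test

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest."""
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        y_hat = design(others) @ fit_ols(others, X[:, j])
        vifs.append(1.0 / (1.0 - r_squared(X[:, j], y_hat)))
    return np.array(vifs)

# Synthetic stand-in data (NOT the study's data): 22 participants x 10 excerpts,
# six range-normalized motion features, one rating driven by features 0 and 2.
rng = np.random.default_rng(1)
ids = np.repeat(np.arange(22), 10)
X = rng.uniform(0.0, 1.0, size=(ids.size, 6))
y = 60.0 * X[:, 0] - 30.0 * X[:, 2] + rng.normal(0.0, 5.0, size=ids.size)

train, test = split_by_participant(ids, n_test=5, rng=rng)
selected = forward_stepwise(X[train], y[train])
coef = fit_ols(X[train][:, selected], y[train])
r2_train = r_squared(y[train], design(X[train][:, selected]) @ coef)
r2_test = r_squared(y[test], design(X[test][:, selected]) @ coef)
vifs = variance_inflation_factors(X)
```

Comparing `r2_train` with `r2_test` mirrors the overfitting check used in the results: a model that generalizes should explain a comparable share of variance on both sets.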

Method
The following sections summarize the first experiment that was conducted after developing the measurement app in the pilot study.
Stimuli. The musical stimuli for the first experiment were compiled by the pilot study's participants according to the criteria described in the General Methods section. Table 2 shows the final list of musical excerpts. The list of samples was presented in random order for each participant. However, they were then free to choose the preferred order in which to assess the samples.
Participants. For this experiment, we recruited 22 participants from the Master of Science program Audio Communication and Technology at TU Berlin. They had an average age of 27 years (SD = 2.36). 73% identified as 'male', 18% as 'female', 5% as 'rather male' and another 5% did not identify with any gender. 91% were experienced in playing an instrument, producing music or singing in a choir. 5% had only short-term experience in making music beyond classes in school and 5% indicated that they had no experience at all. 36% had already participated in dancing classes or similar activities in which movement is related to music. 50% only danced occasionally in clubs or at concerts. 14% had no experience at all in moving to music. 82% used a smartphone regularly, 9% were experienced in using a smartphone but did not use one at the time, and 9% did not use one at all. Participants also indicated that for them they [...]
Procedure. The laboratory was only dimly lit and offered enough space to move freely. The app ran on a Motorola Moto G with Android version 4.4. Participants wore AKG headphones with a 1.5 m long cable. During the experiment, they were alone in the lab with the doors closed. Beforehand, a guided tour through the app was given in order to familiarize participants with the experiment. They were informed that the study was about describing the music corporeally and rating it in terms of the GEMS. They did not know that their GEMS ratings would subsequently be predicted from movement. We also told them that there was no right or wrong way to move to the music. After a participant selected a song, the first step was to listen to the song in order to be prepared for the corporeal articulation. Participants could stop the presentation of the excerpt early when they decided that they already knew the music well enough to describe it.
Afterwards, the movements were actually recorded by the device's acceleration sensor synchronized to the music. For this part of the study participants were instructed as follows: "Please move now with the device according to the music. It is important that you stand and don't sit during motion capturing. You can move freely, i.e. all parts of the body, but keep in mind that only movement of the device can be captured." After each embodied description participants rated the perceived emotional qualities of the musical excerpts according to the GEMS-9 on a 100-point, unipolar intensity scale initialized to '0' (Table 3). They were instructed as follows: "Please rate the perceived emotional quality of the music according to the GEMS-9. Do not rate how you felt during listening." Subsequent to the GEMS, participants were asked how suitable they considered both embodied and verbal descriptions for the music excerpt. At the end of the experiment participants were asked to fill out a short socio-biographical questionnaire.

Results and Discussion
Similarities between Movements and GEMS for First Experiment. The fixed effects from Table 4 indicate that music perceived as transcendent was related to a rather irregular tempo of movement (std_dist_midcrosses). For wonder, the movement's size was irregular (std_peak) in the second principal component and regular (skewness_peak) in the first principal component. Power was related to regular and large movements. Tenderness was described by small (median_peak) movements with regular backward (std_fall) and irregular forward (std_rise) gestures. Nostalgia was also described by small movements (median_peak), with irregularly smooth backward movements in the third component and regular ones in the second component (skewness_fall). In contrast to power, peacefulness was characterized by small movements. When participants rated music as joyful, they performed movements with regular backward phases and less regular forward movements, like jumping. For sadness, movements were slow, while for tension movements were primarily large with irregular tempo. Fig 2 shows that participants preferred rating on GEMS to describe their experience when they perceived nostalgia, sadness, tenderness or peacefulness. For joy, power and wonder participants preferred embodied descriptions. For tension and transcendence we did not observe such an association between feeling rating and description preference.
Prediction Results. Table 5 indicates that tenderness, power and nostalgia were predicted best, followed by tension, wonder and sadness. For peacefulness and joy, around 20% of the variance in the data could be explained by the fitted regression models. Transcendence was most difficult to predict: only 14% of the variance in the data could be explained by the motion data. Since a comparable amount of variance could be explained for both the training and the test data set, there was no evidence of overfitting of the model to the data. Since we asked participants to rate the intensity of experience of the different GEMS, most feelings are likely to be correlated with the overall intensity or arousal of emotional experience (see Fig 2). Here, most feelings show small to large correlations with power. Therefore, we applied a PCA to these rating data in order to represent emotional experience with fewer dimensions. A resulting arousal component in the GEMS ratings could be interpreted as describing the quantity or intensity of the feeling, whereas a valence component could be interpreted as describing its quality. That way, we could identify whether there are differences in predicting the quantity/intensity and the quality of emotional experience from movement data.
Based on the Elbow Method we extracted two components explaining most of the variance in the data. Table 6 shows the item loadings of the two eigenvectors with the highest eigenvalues. The first PC might be interpreted as the degree of relaxation, as it had high positive loadings for tenderness, peacefulness and nostalgia, but high negative loadings for power and tension. Accordingly, it represents the opposite of arousal. The second PC might be interpreted as positive valence, as it had high positive loadings for joy and wonder, but a negative loading for sadness. Hence, the interesting question was whether the motion features only predicted the degree of relaxation (vs. arousal and activation) or whether they could also explain degrees of valence inherent in the experienced emotionality. The estimates for the fixed effects in Table 4 show that relaxation was expressed by small movements while positive valence was related to fast movements. Furthermore, the prediction results from Table 5 indicate that for relaxation about 44% of the variance in the data could be explained by the fitted regression models, while only 8% of the variance in positive valence was explained. As prediction accuracy was similar on both the training and the test set, there was no overfitting of the model to the data.
Discussion. The findings on predicting the relaxation and valence components indicate that not only arousal or intensity was predicted in the GEMS ratings, but also the quality of perceived emotion, i.e. positive valence. However, only 8% of the variance in valence was covered by the approach. Therefore, these results are similar to those of [21]. This could be explained by the fact that there were several GEMS with little mean intensity and little variance, as can be seen from the polarity profile in Fig 3. Examples are transcendence, sadness, wonder, and tension. Even though joy showed a rather high degree of variance according to the polarity profile, its prediction turned out to be difficult in this experiment. There might not have been a common movement pattern throughout participants for music perceived as joyful. In general, it should also be noted that there were not enough music excerpts to sufficiently cover all states of the GEMS. We therefore conducted a second experiment with a different set of stimuli that were chosen to cover more different GEMS feelings.
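The extraction of rating components can be sketched as below. The ratings are synthetic and generated from two made-up latent factors, so the loadings are illustrative only and do not reproduce Table 6; a covariance-based PCA on centred ratings is assumed.

```python
import numpy as np

GEMS = ["wonder", "transcendence", "tenderness", "nostalgia", "peacefulness",
        "power", "joy", "tension", "sadness"]

def pca_loadings(ratings):
    """Explained-variance ratios and eigenvectors (columns) of the rating covariance."""
    centered = ratings - ratings.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs

# Synthetic ratings (NOT the study's data) built from two hypothetical factors.
rng = np.random.default_rng(2)
n_obs = 200
relaxation = 20.0 * rng.standard_normal(n_obs)
valence = 10.0 * rng.standard_normal(n_obs)
w_relax = np.array([0, 0, 1, 1, 1, -1, 0, -1, 0], dtype=float)
w_val = np.array([1, 0, 0, 0, 0, 0, 1, 0, -1], dtype=float)
ratings = (50.0 + np.outer(relaxation, w_relax) + np.outer(valence, w_val)
           + 5.0 * rng.standard_normal((n_obs, len(GEMS))))

explained, eigvecs = pca_loadings(ratings)

# Elbow inspection: list the share of variance carried by each component.
for rank, share in enumerate(explained, start=1):
    print(f"PC{rank}: {share:.2f}")
# Loadings of the first two components per GEMS item.
for item, row in zip(GEMS, eigvecs[:, :2]):
    print(f"{item:14s} {row[0]:+.2f} {row[1]:+.2f}")
```

The sign of an eigenvector is arbitrary, so the same component may come out with positive loadings on the relaxed items or on the aroused items depending on the implementation; only the contrast between item groups is interpretable.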

Experiment 2
Experiment 2 was conducted in order to test whether the findings from Experiment 1 could be replicated. Furthermore, different music stimuli were chosen in order to cover more states of the GEMS. We also asked participants for more feedback concerning the movement patterns they used. Doing so, we tested the following two hypotheses: "Participants prefer embodied rhythmic-related descriptions when they perceive power, tension or joy" and "Participants prefer embodied gestalt-related descriptions or verbal descriptions (GEMS) when they perceive sublimity-related feelings like nostalgia or transcendence". We assumed that sublimity-related musical emotions were more related to musical contour and melody. Thus, we expected participants to imitate musical contour by performing gestures that are more gestalt-like and less rhythmic.
Stimuli. After the first experiment, during which participants were not free to choose the music they described corporeally, we thought it might further improve prediction results if participants chose the music themselves. Liljeström, Juslin and Västfjäll observed that emotions were perceived more intensely when participants chose the music themselves [28]. Therefore, we asked them to propose music in order to compile a list from which they could choose 10 samples out of 20. Table 7 shows the final list of musical excerpts used and how often participants chose each excerpt in absolute numbers.
Participants. 21 students participated from the following courses: Audio Communication and Technology (40%), engineering in the field of technical environmental protection (20%), and others (40%). Participants had an average age of 28 years (SD = 3.0). 76% identified as 'male', 19% as 'female' and 5% did not identify with any gender. 72% were experienced in playing an instrument, producing music or singing in a choir. 29% had only short-term or no experience in making music beyond classes in school. 52% had already participated in dancing classes or similar activities in which movement is related to music. 81% only danced occasionally in clubs or at concerts. 9% had only little or no experience at all in moving to music. 81% used a smartphone regularly, 19% did not.
Procedure and Data Analysis. Procedure and data analysis were conducted as in the first experiment with the exception of the following minor changes:
1. All scales were initialized to '50' instead of '0' to equalize the effort to move the slider in either direction.
2. The German translation of the GEMS-9 was adjusted to the one from Lykartsis et al. that could be confirmed to fit [29].
3. Two additional questions were added to the post-experiment questionnaire: "How suitable do you consider a corporeal description of this music excerpt by movement according to musical contour?" and "How suitable do you consider a corporeal description of this music excerpt by movement according to rhythm?" (see S2 Fig). These two questions were intended to provide additional information about the preference for certain movement patterns.

Results and Discussion
Similarity between Movements and GEMS for Second Experiment. As you can observe from Tables 8 and 9, no significant features were found for transcendence. Unlike the first experiment, music featuring wonder was described by slow (median_dist_midcrosses)  movements with irregular and longer backward phases (median_fall and std_fall). Though, wonder was related to regular backward phases at the same time (skewness_fall). Furthermore, gestures were of regular size (std_peak). As in Experiment 1 power is reflected by large movements but of varying size (std_peak) and speed (std_dist_midcrosses). Tenderness was related to slow motion (max_freq_hz and median_dist_midcrosses) of regular size and longer backward phases. Nostalgia, too, was performed by slow and regularly sized movements but with irregularly long forward phases. Similar to the first experiment, peacefulness was characterized by small movements. During the second experiment, additionally slow and regularly sized gestures were observed. Joy, however, stimulated regular, large and fast gestures. In the second experiment sadness was correlated to tempo being slow and irregular. Here, tension was characterized by fast but large movements. The gesture's size was a significant feature during the first experiment, too.
Confirming both of our hypotheses on description preferences, participants preferred GEMS and gestalt movements referring to melodic contour to describe their experience when they perceived wonder, sadness, tenderness, peacefulness or nostalgia (see Fig 4). For joy and power participants preferred embodied and rhythmic descriptions. For tension and transcendence were not related to description preference.
Prediction Results for Second Experiment. Table 10 shows that power and tenderness were also among the top prediction results for the second experiment. This time, however, sadness, joy, peacefulness and wonder scored much higher in terms of R 2 whereas nostalgia, tension and transcendence were among the GEMS being most difficult to predict. Since for both, the training and test data set, a comparable amount of variance could be explained, there was no evidence for overfitting of the model to the data. GEMS ratings for the second experiment were similarly correlated like in the first one (see S1 Fig). Applying PCA to the GEMS rating data also resulted in two components explaining most of the variance in the data (according to the Elbow Method). Table 11 indicates that relaxation and negative valence were present as most dominant principle components. For relaxation loadings were again highly positive for tenderness, peacefulness and nostalgia, but highly negative loads for tension. The second PC got high negative loads for joy, but positive loads for sadness.
In contrast to the first experiment, estimates for the fixed effects (Table 9) show that relaxation was expressed by slow and regular movements while negative valence related mainly to [...] and negative valence about one third of the variance in the data could be described by the fitted regression models. Since a comparable amount of variance could be explained for both the training and the test data set, there was no evidence of overfitting of the model to the data.
Discussion. The second study confirmed that there were two principal components in the GEMS ratings: relaxation and (positive/negative) valence. This time, approximately the same amount of variance could be explained by the regression models for relaxation and valence. This is probably because joyful experiences could be predicted considerably better in the second experiment. Sadness and peacefulness were also predicted better, while for transcendence no regression model could be fitted since there were no significant features. In general, prediction results from the first experiment were slightly worse. That might be due to the chosen samples, the participants' movement patterns, or both. Another possibility is that participants did not sustain movement speed or size over the whole musical excerpt and, for example, only moved in half-time to fast music. They might therefore sometimes be unable to describe the experience corporeally even though they rate it as more powerful. Hence, for ergonomic and biomechanical reasons, corporeal descriptions and GEMS ratings might drift apart (cf. [12], pp. 112-114). One countermeasure could be to inspect the motion data in different time windows and to choose the window featuring maximum speed or size, instead of simply averaging over the whole time. The extracted motion features already captured rhythmic qualities to a good degree but ignored the course and direction of the movement. Such gestalt features could be particularly important for complex emotional expressions in music, such as nostalgia.
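The windowing countermeasure proposed above can be sketched as follows. This is an illustrative sketch and not the feature extraction used in the study: it assumes that the mean magnitude of the three-axis acceleration serves as a proxy for movement size, and the function names, window length and hop size are hypothetical choices.

```python
import math


def magnitudes(samples):
    """Euclidean norm of each (x, y, z) acceleration sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def window_means(values, window, hop):
    """Mean of `values` in each sliding window of `window` samples, advanced by `hop`."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if len(values) < window:
        # Recording shorter than one window: fall back to a single global mean.
        return [sum(values) / len(values)] if values else []
    return [
        sum(values[start:start + window]) / window
        for start in range(0, len(values) - window + 1, hop)
    ]


def peak_window_feature(samples, window, hop):
    """Largest windowed mean magnitude (raises ValueError on empty input)."""
    return max(window_means(magnitudes(samples), window, hop))


# Hypothetical recording: the participant moves during the first half only.
samples = [(3.0, 4.0, 0.0)] * 4 + [(0.0, 0.0, 0.0)] * 4
mags = magnitudes(samples)
global_mean = sum(mags) / len(mags)
peak = peak_window_feature(samples, window=4, hop=2)
```

In this toy recording the global average is halved by the motionless second half, whereas the peak window retains the magnitude of the active phase.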

General and Concluding Discussion
Results for both experiments showed that movements predicted arousal and, in Experiment 2, also valence. The quality of prediction for different dimensions of GEMS varied between experiments for joy, sadness, transcendence, wonder and tension; for others, like power or peacefulness, results could be reproduced. For many GEMS feelings, participants applied similar movement patterns across the two experiments (cf. Tables 4, 8 and 9).
These movement patterns often follow principles similar to those reported in the previous studies [17] and [8]. There, joy or happiness was consistently associated with medium sound levels, high tempi and small timing variation. Large movement size (corresponding to sound level) and small timing variation were significant features in the second experiment. Furthermore, like both studies cited, we showed that sadness was correlated with irregular and slow movements. Considering the prediction results, there are several possible explanations for the better results of the more rhythm-related feelings (e.g. power, cf. Fig 4) over those that presumably require additional gestalt elements. Emotions like nostalgia and transcendence call for features that are bound less to rhythm than to the musical contour, such as directional and time series features. Acceleration data do not seem to cover nostalgic or transcendent gestures sufficiently. As acceleration does not determine the absolute position in space, slow but big gestures are particularly difficult to capture. This calls for applying different and more sophisticated motion sensor fusion techniques. Participants preferred the embodied description for the more energetic and joyful musical excerpts (cf. Figs 2 and 4), and hence they might also not be trained to express certain feelings like sadness or transcendence in a corporeal way. Furthermore, there might be significant inter-individual differences in how well participants are able to express feelings through movement, which would be an interesting topic for future investigations. Also, given the often reported high inter-individual variance of musical emotion, a much larger data set would be desirable. This would make it possible to model the different individual or stimulus-specific sources of variance in the data.
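The remark above that acceleration does not determine absolute position can be illustrated numerically. Recovering position from acceleration requires integrating twice, so even a small constant sensor offset produces a position error that grows roughly quadratically with time. The sketch below assumes a simple rectangle-rule integration and a hypothetical constant bias; it is not the processing applied in the study, and real sensor fusion is considerably more involved.

```python
def integrate(values, dt):
    """Cumulative rectangle-rule integral of evenly spaced samples."""
    total, out = 0.0, []
    for v in values:
        total += v * dt
        out.append(total)
    return out


def position_from_acceleration(acc, dt):
    """Integrate acceleration twice (zero initial velocity and position assumed)."""
    return integrate(integrate(acc, dt), dt)


# Hypothetical setup: the device is at rest, but the sensor reports a constant bias.
bias = 0.5   # m/s^2, illustrative value only
dt = 0.25    # s between samples
drift = position_from_acceleration([bias] * 8, dt)

drift_after_1s = drift[3]   # apparent displacement after 4 samples (1 s)
drift_after_2s = drift[7]   # apparent displacement after 8 samples (2 s)
```

Doubling the duration more than triples the apparent displacement in this example. For slow, large gestures the true accelerations are small and the gesture lasts long, so such accumulated error can become comparable to the movement itself.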

Conclusion
This study evaluated the predictive power of mobile-device generated acceleration data produced by free movement during music listening for the prediction of ratings on the different dimensions of the Geneva Emotion Music Scales (GEMS-9). The results show that participants considered the corporeal description of music very suitable and produced movement data that could be used to predict emotion ratings. Hence, this study contributes to the envisioned use case of multimodal querying in two ways. First, it emphasizes the user need for, and acceptance of, an innovative embodied access to MIR and MRS. Second, it shows that such an approach is technically feasible. Since GEMS dimensions that are more related to rhythm could be predicted better, the observations suggest that there might still be considerable untapped potential in additional movement features that capture direction and position in order to describe the gestalt of the movement.