A computational lens into how music characterizes genre in film

doi:10.1371/journal.pone.0249957

Table 3

The six pooling functions, where x_i refers to the embedding vector of instance i in a bag set B and k is a particular element of the output vector h.
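The table's equations are not reproduced in this text, so the following is only a minimal sketch of the notation, using two widely used pooling functions (element-wise max and average) as stand-ins; whether these match the table's exact definitions is an assumption. Each instance embedding x_i in the bag B is a list of floats, and element k of the output h is computed from element k of every x_i. The function names are hypothetical.

```python
from typing import List, Sequence


def max_pool(bag: Sequence[Sequence[float]]) -> List[float]:
    """Assumed form: h_k = max over i in B of x_i[k]."""
    # zip(*bag) groups element k of every instance embedding together.
    return [max(column) for column in zip(*bag)]


def average_pool(bag: Sequence[Sequence[float]]) -> List[float]:
    """Assumed form: h_k = (1 / |B|) * sum over i in B of x_i[k]."""
    size = len(bag)
    return [sum(column) / size for column in zip(*bag)]


# A bag B with two instances, each a 2-dimensional embedding (made-up values).
example_bag = [[1.0, 4.0], [3.0, 2.0]]
h_max = max_pool(example_bag)
h_avg = average_pool(example_bag)
```

Both functions reduce a bag of any size to a single vector h with the same dimensionality as one instance embedding, which is the role a pooling function plays in this setting.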

In the multi-attention equation, L refers to the attended layer and w is a learned weight. The attention module outputs are concatenated before being passed to the output layer. In the feature-level attention equation, q(⋅) is an attention function on a representation of the input features, u(⋅).
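As a rough illustration of the feature-level attention description, the sketch below applies q(⋅) to the representation u(x_i) and uses the result to weight each element k of u(x_i) across the bag. The softmax normalisation over instances, the function name, and the callables passed in as `u` and `q` are assumptions made for illustration; the table's exact equation is not reproduced in this text.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def feature_level_attention_pool(
    bag: Sequence[Vector],
    u: Callable[[Vector], Vector],
    q: Callable[[Vector], Vector],
) -> List[float]:
    """Assumed form: h_k = sum_i a_ik * u(x_i)[k], where a_ik is a
    softmax over instances i of the score q(u(x_i))[k]."""
    representations = [u(x) for x in bag]
    scores = [q(r) for r in representations]
    pooled = []
    for k in range(len(representations[0])):
        # Subtract the maximum score for numerical stability.
        top = max(s[k] for s in scores)
        weights = [math.exp(s[k] - top) for s in scores]
        total = sum(weights)
        pooled.append(
            sum(w / total * r[k] for w, r in zip(weights, representations))
        )
    return pooled


# Hypothetical u and q: identity representation and constant scores.
# Constant scores give uniform attention, so the result equals the average.
example_bag = [[1.0, 4.0], [3.0, 2.0]]
h = feature_level_attention_pool(
    example_bag,
    u=lambda x: list(x),
    q=lambda r: [0.0 for _ in r],
)
```

In a trained model u and q would be learned layers rather than fixed functions; the fixed versions here only make the weighting step easy to follow.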

doi: https://doi.org/10.1371/journal.pone.0249957.t003