Statistical Patterns in Movie Rating Behavior

doi:10.1371/journal.pone.0136083

Fig 1.

Distribution of the number of votes for IMDb movies (S1 Dataset).

The dashed line is a fit to the data of the function P(n_v) = A n_v^(−α) exp(−λ n_v), where A is a normalization constant, α = 1.51 and λ = 4.0 × 10⁻⁶.
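
The quoted parameters can be explored numerically. The sketch below takes the fitted function to be a power law with exponential cutoff, P(n_v) = A n_v^(−α) exp(−λ n_v); this form is an assumption consistent with the parameters α and λ in the caption, and the function names are hypothetical. It shows that the log-log slope stays close to −α until n_v approaches the cutoff scale 1/λ.

```python
import math

ALPHA = 1.51     # exponent quoted in the caption
LAMBDA = 4.0e-6  # cutoff rate quoted in the caption


def unnormalized_p(n, alpha=ALPHA, lam=LAMBDA):
    """Assumed functional form n^(-alpha) * exp(-lam * n), without the constant A."""
    return n ** (-alpha) * math.exp(-lam * n)


def local_log_slope(n, alpha=ALPHA, lam=LAMBDA):
    """Slope on a log-log plot: d ln P / d ln n = -alpha - lam * n."""
    return -alpha - lam * n


if __name__ == "__main__":
    for n in (1e1, 1e3, 1e5, 1.0 / LAMBDA):
        print(f"n = {n:>10.0f}  slope = {local_log_slope(n):.3f}")
```

Under this assumed form the exponential factor only matters for movies with a number of votes of the order of 1/λ, that is, a few hundred thousand.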

Fig 2.

Color map of the number of votes n_i vs. the average rating 〈r_i〉 of each IMDb movie i.

In the color map, each bullet contains the number of movies indicated by the color scale. The white region indicates zero movies. The dotted and dashed lines represent the (binned) arithmetic and geometric mean values, respectively. The vertical lines indicate the quartiles, which divide the dataset into four groups of equal size.
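
As an illustration of the binned averages and quartiles mentioned in the caption, here is a minimal sketch using only the Python standard library. The bin width, the function names and the toy numbers are assumptions made for the example; they are not taken from the paper or from S1 Dataset.

```python
import math
import statistics
from collections import defaultdict


def binned_means(ratings, votes, bin_width=1.0):
    """Group movies by rating bin; return {bin_left_edge: (arithmetic, geometric)}."""
    bins = defaultdict(list)
    for r, n in zip(ratings, votes):
        left_edge = math.floor(r / bin_width) * bin_width
        bins[left_edge].append(n)
    return {
        edge: (statistics.fmean(ns), statistics.geometric_mean(ns))
        for edge, ns in sorted(bins.items())
    }


def quartile_cuts(ratings):
    """Three cut points that split the ratings into four groups of equal size."""
    return statistics.quantiles(ratings, n=4)


if __name__ == "__main__":
    # Hypothetical toy data: average rating and number of votes per movie.
    toy_ratings = [5.2, 5.8, 7.1, 7.9]
    toy_votes = [10, 1000, 100, 100]
    print(binned_means(toy_ratings, toy_votes))
    print(quartile_cuts(toy_ratings))
```

For broadly distributed vote counts the geometric mean is much less sensitive to a few very popular movies than the arithmetic mean, which is a plausible reason for plotting both.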

Fig 3.

Impact of ratings on the distribution P(n_v) of votes for IMDb movies, for (a) the two groups separated by the median and (b) the four groups determined by the quartiles, as indicated in Fig 2.

In this and other equivalent figures, the dashed line with slope -3/2 is drawn for comparison with the distribution of the entire dataset.
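
A distribution such as P(n_v) is usually estimated with logarithmically spaced bins when the data span several decades. The sketch below shows one possible way to split vote counts by the median rating and to build a log-binned density for each group. The number of bins per decade and the helper names are assumptions for illustration; the paper's actual binning procedure is not stated in the caption.

```python
import math
import statistics
from collections import Counter


def split_by_median(ratings, votes):
    """Return (votes of movies rated below the median, votes of the rest)."""
    med = statistics.median(ratings)
    low = [n for r, n in zip(ratings, votes) if r < med]
    high = [n for r, n in zip(ratings, votes) if r >= med]
    return low, high


def log_binned_density(counts, bins_per_decade=4):
    """Return (bin_center, density) pairs using logarithmically spaced bins."""
    hist = Counter()
    for n in counts:
        if n >= 1:
            hist[math.floor(math.log10(n) * bins_per_decade)] += 1
    total = sum(hist.values())
    density = []
    for k in sorted(hist):
        lo = 10 ** (k / bins_per_decade)
        hi = 10 ** ((k + 1) / bins_per_decade)
        # Divide by the bin width so that the density integrates to one.
        density.append((math.sqrt(lo * hi), hist[k] / (total * (hi - lo))))
    return density
```

On a log-log plot, a density decaying as n_v^(−3/2) appears as a straight line with slope −3/2, which is what the dashed reference line represents.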

Fig 4.

Color map of the number of votes n_i vs. the year of release of each IMDb movie i.

Each bullet contains the number of movies indicated by the color scale. The dotted and dashed lines represent the (binned) arithmetic and geometric mean values, respectively.

Fig 5.

Impact of a movie’s age on the distribution of votes.

P(n_v) for IMDb movies (a) younger than a given number of years and (b) released within the intervals indicated in the figure, chosen to contain the same number of movies.

Fig 6.

P(n_v) for TV series and feature movies.

Fig 7.

Impact of film genre on the distribution of votes P(n_v) for (a) dramas and comedies and (b) other genres.

Some movies belong to more than one genre.

Fig 8.

Color map of the number of votes n_i vs. the budget b_i of each IMDb movie i.

Each bullet contains the number of movies indicated by the color scale. The vertical lines indicate the quartiles. The dashed line was obtained by means of the non-parametric regression rLOESS [15].

Fig 9.

Impact of the budget of feature films on the distribution of votes P(n_v) for (a) the two groups separated by the median with respect to the budget and (b) the four groups using the quartiles, indicated in Fig 8, and the last percentile.

The dashed line with slope -3/2 is drawn for comparison, as well as the distribution for all films with budget information (only feature films with b_i ≥ 10³ US$ were considered).

Fig 10.

(a) Increment of the number of votes Δn_v = n_v(t₂) − n_v(t₁) as a function of n_v and (b) the relative increment Δn_v/n_v = [n_v(t₂) − n_v(t₁)]/n_v(t₁) as a function of the age of the movie for Δt = t₂ − t₁ ≃ 1 month (red, obtained with sets 2 and 3) and 22 months (green, obtained with sets 1 and 2).

See S1 Dataset. The same list of movies, with at least 5 votes at t₁, was considered. The symbols represent the arithmetic (circles) and geometric (diamonds) mean values. The dotted lines are a guide; the dashed line in panel (a) with slope 1 was drawn for comparison.
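
The two quantities plotted in this figure follow directly from two snapshots of the vote counts. A minimal sketch, assuming each snapshot is a dictionary mapping a movie identifier to its number of votes (this data layout and the function name are hypothetical, not the format of S1 Dataset):

```python
def vote_increments(votes_t1, votes_t2, min_votes=5):
    """Map movie id -> (absolute increment, relative increment).

    Only movies present in both snapshots and with at least `min_votes`
    votes at t1 are kept, mirroring the selection stated in the caption.
    """
    result = {}
    for movie, n1 in votes_t1.items():
        if n1 >= min_votes and movie in votes_t2:
            delta = votes_t2[movie] - n1
            result[movie] = (delta, delta / n1)
    return result


if __name__ == "__main__":
    # Hypothetical toy snapshots.
    snapshot_1 = {"a": 10, "b": 3, "c": 200}
    snapshot_2 = {"a": 15, "b": 9, "c": 210}
    print(vote_increments(snapshot_1, snapshot_2))
```

A slope of 1 in panel (a) corresponds to an increment proportional to the current number of votes, that is, a relative increment that does not depend on n_v.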

Fig 11.

Pictorial representation of the contagion process and the equivalent branching process.

(a) Underlying network of contacts. The contagion starts at an initiator node (largest node) and spreads (green arrows) to some of its neighbors, whose number we assume to be a random variable, and so on, so that an avalanche develops. (b) A branching tree is built from the contacts that participate in the contagion process. (c) Branching tree realization of a simple Galton-Watson process. The largest node represents the initiator, the first successive generations of the tree are identified with colors, and the final tree is shown as the result of a cascade that becomes extinct at the 13th generation. (d) Distribution of avalanche sizes from simulations of the contagion process: for a simple (network-free) Galton-Watson (GW) process and for the equivalent contagion process on top of Erdős-Rényi (ER) and Barabási-Albert (BA) networks of size 10⁶ and average connectivity 〈k〉 = 100. In all cases the probability p_j of influencing j individuals was arbitrarily chosen to be exponential with mean p ≲ 1.0, and 10⁶ realizations were considered.
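
The network-free case in panel (d) can be sketched as a plain Galton-Watson simulation. In the example below the "exponential" offspring distribution is interpreted as a geometric distribution on j = 0, 1, 2, ..., which is an assumption made here; the function names, the seed and the safety cap on the avalanche size are also hypothetical. This is an illustrative sketch, not the authors' simulation code.

```python
import math
import random


def sample_offspring(rng, mean):
    """Draw j from a geometric law P(j) = (1 - q) * q**j with the given mean."""
    q = mean / (1.0 + mean)   # mean of this law is q / (1 - q)
    u = 1.0 - rng.random()    # uniform in (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log(q)))


def avalanche_size(rng, mean, max_size=10**6):
    """Total number of individuals reached, initiator included."""
    size = 1      # the initiator
    pending = 1   # individuals who have not yet tried to influence others
    while pending > 0 and size < max_size:
        pending -= 1
        k = sample_offspring(rng, mean)
        pending += k
        size += k
    return size


if __name__ == "__main__":
    rng = random.Random(12345)
    sizes = [avalanche_size(rng, mean=0.99) for _ in range(10_000)]
    print("largest avalanche:", max(sizes))
```

For a subcritical process with mean offspring m < 1, the expected avalanche size is 1/(1 − m). As m approaches 1, the size distribution of a Galton-Watson process with finite offspring variance approaches a power law with exponent 3/2, the same slope used as a reference in the vote distributions.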
