Fig 1.

Diagram of the research route.

We take a twofold approach to characterizing the complexity and coherence of texts through the cognitive activity of reading. On one side, we perform an eye-tracking experiment to collect fixation data from a group of subjects. The fixation activity of each subject reading a given text is computed by binarizing the state of each word: positive (+1) if the subject fixates it at least twice, or negative (−1) if the subject does not fixate it or fixates it only once. We then compute the reading “magnetization” of a given text for each subject and average it over all subjects to obtain 〈m〉. From the pairwise cross-correlations between the fixation sequences of the subjects, we also infer a “Hamiltonian” for each text by means of the Maximum Entropy principle, using a Boltzmann machine-learning algorithm. A thermodynamic analysis of the energy fluctuations allows us to determine whether the text is near a “critical point”. In parallel, we collected reading-comprehension data from an extensive Internet survey of 400 people in an attempt to quantify the complexity 〈π〉 and coherence 〈ψ〉 of the texts.
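The binarization and averaging steps described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the toy fixation counts, and the choice to define a subject's magnetization as the mean of the ±1 word states (the usual spin analogy) are all assumptions.

```python
from statistics import mean


def binarize(fixation_counts, threshold=2):
    """Map per-word fixation counts to +1 (count >= threshold) or -1."""
    return [1 if c >= threshold else -1 for c in fixation_counts]


def magnetization(states):
    """Assumed definition: mean of the +1/-1 word states of one subject."""
    return mean(states)


# Hypothetical fixation counts: one row per subject, one column per word.
counts = [
    [0, 2, 3, 1, 2],
    [1, 1, 2, 0, 4],
    [2, 2, 2, 2, 0],
]

states = [binarize(row) for row in counts]
per_subject_m = [magnetization(s) for s in states]
avg_m = mean(per_subject_m)  # average over subjects, playing the role of <m>

print(states)
print(per_subject_m, avg_m)
```

With the threshold of two fixations stated in the caption, a word fixated once or not at all maps to −1, so a text whose words are rarely re-fixated yields a negative magnetization under this assumed definition.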

Table 1.

Texts information.

More »

Table 1 Expand

Fig 2.

Eye-tracking reading pattern.

Plot showing the sequence of gazes and fixations during a typical eye-tracking experiment. In this particular case, the data were collected while the subject was reading the MEL text. The blue circles represent the fixations, and their sizes represent the corresponding fixation durations. The solid lines between circles indicate the gaze trajectory along the text. For a given text, we measure the number of times during the entire reading that a fixation of subject i falls inside the rectangular box delimiting a word r.
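Counting how often a subject's fixations land inside each word's bounding box can be sketched as follows. The `Box` class, the pixel coordinates, and the assumption that word boxes do not overlap are all hypothetical; the actual eye-tracker output format is not given here.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned word box in screen coordinates (top < bottom)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def count_fixations(fixations, word_boxes):
    """Return, for each word box r, how many fixation points fall inside it."""
    counts = [0] * len(word_boxes)
    for x, y in fixations:
        for r, box in enumerate(word_boxes):
            if box.contains(x, y):
                counts[r] += 1
                break  # assumes boxes do not overlap
    return counts


# Hypothetical layout of three words on one line, and five fixation points.
boxes = [Box(0, 0, 50, 20), Box(60, 0, 120, 20), Box(130, 0, 170, 20)]
fixations = [(10, 5), (70, 10), (75, 12), (200, 10), (140, 8)]

word_counts = count_fixations(fixations, boxes)
print(word_counts)
```

Fixations that fall outside every box, such as the fourth point above, are simply ignored in this sketch.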

Fig 3.

Fixation activities.

Raster plots of the fixation activities obtained for all subjects while reading the texts. For each subject i, the state of a word is active (+1, blue) if the subject fixates it at least twice, or inactive (−1, white) if the subject does not fixate it or fixates it only once.

Table 2.

Average magnetizations and reading times per word.

Fig 4.

Heat capacity as a function of temperature for the system of fixation activities.

Heat capacity curves for all texts, with Cv maximal at the critical temperature Tc. The temperature at which the texts are being read is the operating temperature T = To = 1. For all texts, the system operates above and near the critical point; the RT1 and RT2 texts are clearly the furthest from it.
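A heat capacity curve of this kind can be obtained from energy fluctuations, C(T) = (〈E²〉 − 〈E²〉-free mean squared, i.e. 〈E²〉 − 〈E〉²)/T². The sketch below computes it by exact enumeration for a small pairwise model. It is only an illustration under stated assumptions: the energy form E(s) = −Σ h_i s_i − Σ J_ij s_i s_j, the toy fields `h` and couplings `J`, the temperature grid, and the units with k_B = 1 are all hypothetical and are not the inferred parameters of any text.

```python
import itertools
import math


def energy(state, h, J):
    """Assumed pairwise energy E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j."""
    n = len(state)
    e = -sum(h[i] * state[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e -= J[i][j] * state[i] * state[j]
    return e


def heat_capacity(h, J, T):
    """C(T) = (<E^2> - <E>^2) / T^2 by exact enumeration of all 2^n states."""
    n = len(h)
    energies = [energy(s, h, J) for s in itertools.product((-1, 1), repeat=n)]
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / T) for e in energies]
    z = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, energies)) / z
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies)) / z
    return (mean_e2 - mean_e ** 2) / T ** 2


# Hypothetical toy model: six units, no fields, uniform couplings.
n = 6
h = [0.0] * n
J = [[0.3 if i != j else 0.0 for j in range(n)] for i in range(n)]

temperatures = [0.2 + 0.05 * k for k in range(97)]  # grid from 0.2 to 5.0
curve = [heat_capacity(h, J, T) for T in temperatures]
T_peak = temperatures[curve.index(max(curve))]  # finite-size estimate of Tc

print(T_peak)
```

The location of the maximum, `T_peak`, can then be compared with the operating temperature To = 1. Exact enumeration scales as 2^n, so for systems with many units a sampling method would be needed instead.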

Table 3.

Distance to criticality.

Table 4.

Respondents panel data.

Fig 5.

Distributions of complexity ratings.

Distributions of complexity ratings among individuals for all texts read in the survey. The values π = 1, 2, 3, 4, 5 correspond to a scale ranging from a “very simple” text (π = 1) to a “very complex” text (π = 5).
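A mean rating such as 〈π〉 follows directly from a distribution of this kind. The sketch below is a minimal illustration; the function name and the toy counts are hypothetical and are not survey data.

```python
def mean_rating(counts):
    """Mean of a 1..5 rating scale, where counts[k-1] respondents chose rating k."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no ratings")
    return sum(k * c for k, c in enumerate(counts, start=1)) / total


# Hypothetical counts for ratings 1 ("very simple") to 5 ("very complex").
toy_counts = [10, 20, 40, 20, 10]
avg_pi = mean_rating(toy_counts)
print(avg_pi)
```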

Fig 6.

Distributions of coherence ratings.

Distributions of coherence ratings among individuals for all texts read in the survey. The values ψ = 1, 2, 3, 4, 5 correspond to a scale ranging from a “not coherent” text (ψ = 1) to a “very coherent” text (ψ = 5).

Table 5.

Complexity and coherence mean values.

Fig 7.

Reading times against text complexity.

(A) The average reading time per word 〈t〉 generally increases with 〈π〉, although the relation is not monotonic. (B) Plotting the rank of 〈t〉 against the rank of 〈π〉 reveals several discordant pairs between the two variables. The dashed line corresponds to the function y = x. The Kendall rank correlation coefficient is τ = 0.87 (p = 0.0001).
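The notion of discordant pairs behind the Kendall coefficient can be made concrete with a short sketch. It implements the tau-a variant, which assumes no ties, and uses hypothetical rank lists rather than the values plotted in the figure; which variant the authors used is not stated here.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs, no ties assumed."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # pair ordered the same way in both lists
        elif sign < 0:
            discordant += 1   # pair ordered oppositely in the two lists
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ranks for ten texts; two adjacent swaps give two discordant pairs.
rank_pi = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rank_t = [1, 2, 4, 3, 5, 6, 8, 7, 9, 10]

tau = kendall_tau(rank_pi, rank_t)
print(tau)
```

Points on the line y = x contribute only concordant pairs, so each departure from that line lowers τ below 1.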

Fig 8.

Average magnetization against text complexity.

(A) The average magnetization 〈m〉 of the fixation activities increases almost monotonically with 〈π〉, except for a local minimum at the complexity of GSV. (B) Ranking both measures in increasing order and plotting the ranks of 〈m〉 against the ranks of 〈π〉 for all texts shows that eight out of ten texts occupy exactly the same positions in the two lists. The dashed line corresponds to the function y = x. The Kendall rank correlation coefficient τ = 0.96 (p = 5 × 10⁻⁶) indicates a strongly monotonic relation between the two variables.

Fig 9.

Distance to criticality and text coherence.

Relation between the distance to criticality To − Tc and the average coherence 〈ψ〉 of the texts. Texts rated with low coherence (〈ψ〉 < 2.75) are associated with large values of To − Tc (RT1 and RT2), while texts considered coherent (〈ψ〉 > 3.25) are close to criticality (ST1, ST2, JUB, HCL, MEL, QUI), suggesting an implicit cohesive reading response among individuals. The two texts rated with intermediate values of 〈ψ〉 (GAU and GSV), however, induced rather distinct responses in terms of To − Tc.
