Effect of Age on Variability in the Production of Text-Based Global Inferences

As we age, our differences in cognitive skills become more visible, an effect especially true for memory and problem-solving skills (i.e., fluid intelligence). By contrast, few studies have examined variability in measures that rely on one’s world knowledge (i.e., crystallized intelligence). The current study investigated whether age increased the variability in text-based global inference generation, a measure of crystallized intelligence. Global inference generation requires the integration of textual information and world knowledge and can be expressed as a gist or lesson. Variability in generating two global inferences for a single text was examined in young-old (62 to 69 years), middle-old (70 to 76 years), and old-old (77 to 94 years) adults. The older two groups showed greater variability, with the middle-old group being the most variable. These findings suggest that variability may be a characteristic of both fluid and crystallized intelligence in aging.

Kintsch [15] and colleagues suggested that, in order to remember the information in a text, we need to reduce the amount of information by transforming the verbatim information into an abstract version of the text (see also [22,23]). This abstract representation comes in the form of global inferences, which represent holistic concepts such as the theme or main point of a text [15,16][22][23][24][25][26]. These global inferences reduce the amount of information to be stored in memory because they integrate the text-specific information with the individual's world knowledge and experience (i.e., extra-textual information). Moreover, because global inferences represent generalized information (i.e., the text information is extended to contexts beyond the text itself; see, e.g., [9]), we generate global inferences in order to fill informational gaps within the text, which allows us to incorporate the information from the text into our own world knowledge [27,28].
Interestingly, the capacity to generate global inferences appears stable across age. For example, Ulatowska et al. [9] reported that there was no age difference in forming global representations of text in a longitudinal study of global inference generation in older adults. Similarly, Olness [29] found no differences between college-aged, middle-aged, and older adults in generating global inferences for didactic and non-didactic texts. Yet, there is growing evidence that knowledge structures thought to remain stable in aging, such as vocabulary and global inferences, may in fact be variable. For example, Christensen [30] found increased variability in older adults for measures of memory, spatial, and reasoning skills (i.e., fluid intellectual abilities) as well as verbal abilities, including vocabulary (i.e., a crystallized intellectual ability). Similarly, Caskie, Schaie, and Willis [31] found considerable variability in verbal, spatial, and reasoning abilities in adults between 25 and 81 years of age. In addition, the patterns of variability were different for verbal abilities versus spatial and reasoning skills. In particular, the changes in verbal abilities showed later onset, greater variability in the timing of onset, and greater variability in the overall rate of change. At the level of text comprehension, Hertzog, Dixon, and Hultsch [32] found significant variability in memory for textual information not accounted for by text-related factors in a group of seven elderly women. Likewise, Dixon and colleagues [33] reported an age-related increase in variability in text recall for the stories used in the Logical Memory subtest of the Wechsler Memory Scale.
Together, these findings suggest that an age-related increase in variability of the knowledge structures underlying linguistic ability and global inference generation may be a hallmark of cognitive aging, in the same way as the age-related increase in variability in reaction time, memory, and other cognitive abilities [34][35][36][37][38][39][40][41][42]. Therefore, we investigated whether age increased the variability of participants' global inferences. To do so, we measured age-related variability in generating global inferences among three groups of older adults.

Results
Thirty-four participants between the ages of 62 and 94 years were divided into three age groups for the purpose of analysis. The young-old (Y-O) group consisted of 12 individuals (62 to 69 years of age), the middle-old (M-O) and old-old (O-O) elderly groups consisted of 11 participants each (70 to 76 and 77 to 94 years of age, respectively). Each participant gave 2 possible lessons for each of 12 Aesop fables ([43]; see Supplementary Information S.1). Each lesson was scored categorically according to the criteria outlined in the Method section 4.3.1. Data were analyzed using discriminant correspondence analysis (DICA) [44][45][46][47][48][49][50][51][52].
DICA is a multivariate technique developed to classify observations described by qualitative and/or quantitative variables into a priori defined groups and has been used to discriminate clinical populations, such as early versus middle stage Alzheimer's disease [51] and autistic paranoia from paranoid schizophrenia [52]. DICA is based on correspondence analysis (CA), a type of principal component analysis (PCA) specifically tailored for the analysis of categorical data, which represents the rows and columns as points in a (high dimensional) space [45][49][50][51][53][54][55][56][57]. Just like PCA, DICA finds the most important dimensions of variance of the data. These dimensions are uncorrelated with each other and ordered by the amount of the data variance that they explain. Rows and columns can be plotted as maps by using their coordinates on these dimensions. In order to reveal the pattern of variables associated with group differences, DICA analyzes a data table in which each row sums the behavior of the participants of a given group (see [51] for more information). DICA is then obtained from the CA of this summed table. This analysis reveals the similarities and differences in patterns of performance across the age groups. See Method section 4.3.3, File S2, Figure S1 and [44,51,58,59] for more information.
The DICA derived two factors accounting, respectively, for 85 percent and 15 percent of the data variance. The eigenvalues (λ), proportion of explained variance (τ), and the contributions of each variable and group to the total variance for Factors 1 and 2 are shown in Table 1. The higher the contribution, the more important that variable (or observation) is in defining a given factor.

Age-related Patterns of Global Inference Generation
The DICA uncovered age-related patterns in lesson generation performance. Factor 1 separated the Y-O from the M-O and O-O groups (see Figure 1). Because DICA reliably separated the Y-O from the other groups, the effect size is quite large and is detectable with our current sample size. However, to ensure that we could detect a reasonable effect size, we computed an a posteriori effect size analysis using G*Power 3 [60,61]. For the purpose of power analysis, multivariate discriminant analysis can be considered under the MANOVA framework [62,63]. With an α of .05 and achieved power (1 − β) of .95, we had an effect size (f²(V)) of .41. This effect size is equivalent to a critical Pillai's V of 0.6 across the 3 groups, meaning that the between-group variance is 60% of the total variance. This effect size and critical V were considered adequate to be able to discriminate between the Y-O, M-O, and O-O groups.
The results of the DICA are shown in Figure 1. The scoring categories are shown in separate displays to make the map easier to read. The variable contributing the most to Factor 1 is "switching perspectives between lesson types." The young elderly group more frequently switched perspectives than the middle and old elderly groups. Success in switching between LESSON 1 and LESSON 2 is more strongly associated with a LESSON 1 that incorporated information from outside of the text (i.e., extra-textual) and represented the main character's viewpoint. Successful switches in perspective also were more frequently stated as proverbs and showed themes consistent with the fable for both lesson types. By contrast, the middle and older elderly groups switched perspectives less frequently than the young elderly group. Failure to switch perspective between lesson types was associated with maintaining the main character's viewpoint for LESSON 2 and producing text-specific lessons for both lesson types (i.e., the information content of each lesson did not go beyond information stated explicitly in the fable). Switch failures also were characterized by more frequent use of non-proverbial linguistic forms (i.e., a concrete interpretation) and inaccurate representation of the fable theme for both LESSONs 1 and 2.
Table 1 notes. (a) In correspondence analysis, the eigenvalues (λ) are never greater than 1. (b) Contributions are the proportion of variance of a given factor explained by the age group or scoring category.
Factor 2 distinguished the middle and old elderly groups. The middle elderly group had a slightly greater tendency to maintain the main character's perspective on LESSON 2. Furthermore, the middle elderly group more frequently produced a LESSON 1 with an inaccurate fable theme than did the old elderly group. However, the old elderly group's LESSON 1 had a slightly greater tendency, on average, to adopt neither the main nor the supporting character's perspective. The old elderly group also tended to state both lessons in a non-proverbial form.
The performance of the groups and the individual participants by age group are shown in Figure 2. The young elderly participants clustered more tightly together, indicating that they were predominantly successful in switching perspectives. The tight grouping also suggests that the young elderly group showed less between-participant variability in generating lessons. The middle and old elderly participants, by contrast, were more dispersed. Some of the middle and old elderly participants showed a pattern of lesson generation similar to the young elderly participants, while others did not. This suggests greater between-participant variability, especially in the ability to switch perspectives between LESSONs 1 and 2.
Figure 1. Discriminant correspondence analysis. Variables shown along Factors 1 and 2. Lambda (λ) and tau (τ) are the eigenvalues and the proportion of explained inertia (i.e., variance) for a given factor (λ1 = .0105, τ1 = .8539; λ2 = .0018, τ2 = .1461). All sub-figures are plotted on the same scale along each factor. (A) Switch Perspective and Linguistic Form collapsed across both lesson types. Note that the young elderly group switched perspectives between lesson types more frequently than the middle or old elderly groups. (B) Generalization Level for each lesson type. Note that the young elderly group produced extra-textual lessons more frequently. Extra-textual lessons incorporate information from outside of the text. (C) Character Viewpoint for each lesson type. Note that the young elderly group more frequently adopted the viewpoint of the main character for the best lesson (LESSON 1) and the supporting character for the alternate lesson (LESSON 2). (D) Representation of Theme was included as a supplementary element. Supplementary elements are variables that were not included in the calculations, but were projected into the space to see their placement along the factors. They are used to aid with interpretation. Note that the young elderly group more frequently produced lessons reflecting accurate fable themes for both LESSON 1 and LESSON 2.

Variability in Global Inference Generation
The variability in generating global inferences within the age groups was evaluated using a bootstrap procedure [64][65][66]. The bootstrap produced 95% confidence interval ellipses for each age group (see Figure 3; a description of the bootstrap is presented in File S2.6.2). The area of a confidence interval ellipse represents the variability within each group. When the confidence ellipses do not overlap, there is a significant difference between the groups at the p = .05 level. Consequently, the confidence ellipses show that the young elderly group is reliably different from the middle and old elderly groups because there is no overlap with the confidence ellipses of the other two groups. In addition, the young elderly group's ellipse is smaller, indicating that there is less variability within this group. Although the middle and old elderly groups were not reliably distinguished, the middle elderly group, surprisingly, had the ellipse with the greatest area, indicating that the middle elderly group showed the most variability (see also Figures 2A and 2C for actual dispersion in group performance).
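The link between ellipse area and within-group variability can be illustrated with a minimal Python sketch. It is not the procedure of File S2.6.2. It assumes NumPy, hypothetical two-dimensional factor scores for the participants of a group, an equal number of responses per participant (so that a group's position is the plain mean of its participants' scores), and an ellipse based on a bivariate normal approximation. The function names `bootstrap_means` and `ellipse_area` are ours.

```python
import numpy as np

# 95% quantile of a chi-square distribution with 2 df (about 5.99),
# obtained from its closed-form CDF 1 - exp(-x / 2).
CHI2_2DF_95 = -2.0 * np.log(0.05)

def bootstrap_means(scores, n_boot=1000, seed=0):
    """Resample participants with replacement and return the group mean
    of each bootstrap sample as an (n_boot, 2) array."""
    rng = np.random.default_rng(seed)
    n = len(scores)
    idx = rng.integers(0, n, size=(n_boot, n))
    return scores[idx].mean(axis=1)

def ellipse_area(points, chi2_quantile=CHI2_2DF_95):
    """Area of the confidence ellipse of a 2-D point cloud under a
    bivariate normal approximation (an assumption of this sketch)."""
    cov = np.cov(points, rowvar=False)
    return np.pi * chi2_quantile * np.sqrt(np.linalg.det(cov))

# Hypothetical factor scores: one tightly clustered group, one dispersed group.
rng = np.random.default_rng(1)
tight_scores = rng.normal(0.0, 0.05, size=(12, 2))
wide_scores = rng.normal(0.0, 0.25, size=(11, 2))

boot_tight = bootstrap_means(tight_scores)
boot_wide = bootstrap_means(wide_scores)
area_tight = ellipse_area(boot_tight)
area_wide = ellipse_area(boot_wide)
```

In this toy setting the dispersed group yields the larger ellipse, which is the sense in which ellipse area is read as within-group variability above.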

Quality of the DICA Model
We evaluated the quality of our DICA model by computing the amount of variance explained by the DICA (R² = .23, p < .01; see Supplementary Information for details). We also evaluated how the model would generalize to new participants by using a jackknife procedure (also called a "leave one out" procedure). The jackknife procedure [64,67,68] removes, in turn, each of the participants from the sample and performs a new DICA on the remaining participants. The distance between the removed participant (projected into the new DICA space as a supplementary element) and each of the groups is computed and the participant is assigned to the closest group (see [44] and [68] for more information about the jackknife in DICA). The results of the jackknife are summarized in Table 2. The columns represent the original group assignment and the rows represent the DICA assignment. From Table 2, we found that of the 34 possible assignments, only 13 were correct. The young elderly participants were more reliably assigned to their group (9 out of 12 correctly assigned) than participants from the middle and old elderly groups (2 correct assignments out of 11 participants for each group).
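The leave-one-out logic described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: it assumes NumPy, a hypothetical participants-by-categories count matrix with no empty category, Euclidean distance in the factor space, and helper names (`ca_columns`, `project_rows`, `jackknife_assign`) of our own choosing.

```python
import numpy as np

def ca_columns(table):
    """CA of a group-by-category table; returns column factor scores
    and the nonzero singular values."""
    P = table / table.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sv > 1e-12                       # drop null dimensions
    sv, Vt = sv[keep], Vt[keep]
    col_scores = (Vt.T * sv) / np.sqrt(c)[:, None]
    return col_scores, sv

def project_rows(rows, col_scores, sv):
    """Project rows into the CA space as supplementary elements
    (row profiles times column scores, rescaled by singular values)."""
    profiles = rows / rows.sum(axis=1, keepdims=True)
    return profiles @ col_scores / sv

def jackknife_assign(counts, labels):
    """Refit the group-level CA without each participant in turn and
    assign the left-out participant to the closest group."""
    groups = np.unique(labels)
    predicted = []
    for i in range(len(labels)):
        mask = np.arange(len(labels)) != i
        table = np.vstack([counts[mask & (labels == g)].sum(axis=0)
                           for g in groups])
        col_scores, sv = ca_columns(table)
        centroids = project_rows(table, col_scores, sv)
        left_out = project_rows(counts[i:i + 1], col_scores, sv)
        dists = np.linalg.norm(centroids - left_out, axis=1)
        predicted.append(groups[np.argmin(dists)])
    return np.array(predicted)

# Hypothetical, well-separated data (3 participants per group, 3 categories).
counts = np.array([[9, 1, 2], [8, 2, 1], [9, 2, 1],
                   [1, 9, 2], [2, 8, 1], [1, 8, 2],
                   [2, 1, 9], [1, 2, 8], [2, 1, 8]])
labels = np.array(["Y-O"] * 3 + ["M-O"] * 3 + ["O-O"] * 3)
predicted = jackknife_assign(counts, labels)
n_correct = int((predicted == labels).sum())
```

Comparing `predicted` with `labels` gives the kind of assignment table summarized in Table 2. The toy groups here are deliberately easy to separate, unlike the study data.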

Discussion
Although most studies examining cognitive performance variability in the elderly have examined skills that are known to decrease with age (e.g., fluid intelligence abilities, reaction time (RT), or memory [39,40][69][70][71][72][73][74][75][76][77][78][79][80][81][82][83]), skills that remain stable or improve across age (i.e., crystallized intelligence) also show intertrial variability. However, this age-associated pattern of variability may differ between the two intelligence domains. For example, variability in RT for speeded tasks shows that older adults are consistently more variable than younger adults [81][82][83] and that this increased RT variability is associated with poorer cognitive performance in normally aging older adults [84][85][86]. The general increase in variability in the M-O and O-O groups relative to the Y-O group supports this view and may be associated with the older two groups' general difficulty in switching perspectives between LESSON 1 and LESSON 2.
By contrast, when older adults show increased variability in gist recall accuracy (rather than RT or detail recall), this increase in variability tends to be associated with poor health, rather than normal age-related change [32,87]. In normally aging adults, increased item-to-item variability in non-speeded tasks (such as gist recall) is associated with higher mean performance and may actually be an indicator of learning rather than decline [76,88]. The finding that the M-O group showed greater variability than the O-O group suggests that, at least in non-speeded tasks, increased variability may not be completely maladaptive. The strict view of increased variability indicating cognitive decline predicts a linear association between variability and age (see, e.g., [81][82][83][84][85][86]), yet the current data do not show this pattern. Rather, they suggest the possibility of varying patterns of variability at different life stages, especially given that the Y-O, M-O, and O-O individuals were cognitively normal and successfully performed the task.
If we consider that learning may also be a mechanism for increased variability in aging, then the M-O group would be expected to show the greatest amount of variability because this group has the largest proportion of recently retired individuals undergoing a major life change. For example, Adam and colleagues [89,90] have found sudden decreases in cognitive functioning immediately following retirement, a pattern which suggests that there may be an increase in variability in cognitive performance around this time. Such a change in variability would be similar to the recursive increases in variability and subsequent plateauing during periods of social and cognitive development during childhood and adolescence [88].
Although the current results show increased variability associated with age, the pattern is mixed. This suggests that multiple mechanisms may underlie the increase in performance variability for crystallized intellectual abilities in older adults and that the relationship between age and variability may not be as straightforward as with fluid intellectual skills. Nevertheless, these findings show that variability with age may not be just an indicator of decline, but may also signal new learning. As Garrett et al. [91] so aptly said, "variability is more than just noise" (p. 4914).

Participants
Thirty-four participants between the ages of 62 and 94 years were divided into three age groups for the purpose of analysis. The young elderly (Y-O) group consisted of 12 individuals (62 to 69 years of age), the middle (M-O) and older (O-O) elderly groups consisted of 11 participants each (70 to 76 and 77 to 94 years of age, respectively). All participants were highly educated, with an average of 15 years of formal education. All participants were living in the community and were self-reported native English speakers. None exhibited clinical signs of impaired cognitive performance as tested by the 7 Minute Screen [92,93]. All participants scored within normal age limits on a hearing screening that included the Erber Sentences [94], CID Sentences [95], and a self-report of hearing loss. No participant made errors on a visual narrative screener in which they read aloud an additional fable typeset in the same font as the stimulus fables. This study was approved by the Internal Review Board (IRB) of the University of Texas at Dallas. All participants gave written informed consent. Table 3 gives the participant characteristics.

Stimuli and Task
We selected twelve short narratives from George Townsend's translations of Aesop's fables [43]. We used fables because cultural knowledge is transmitted via their didactic form. This transmission of cultural knowledge takes the form of a lesson or moral (i.e., types of global inferences). In addition, the role of the fables in transmitting knowledge or "general truths" gives the fables a function similar to that of proverbs in discourse. However, unlike proverbs, fables require the theme, lesson, or moral to be inferred from the characters' actions and their consequences. Meaning in proverbs, by contrast, is derived from the text itself and not from its application to real-world contexts because proverbs are already stated in a global inference-like format [9,96,97]. Because fables are didactic, readers can interpret them at two levels: literally, at the level of the text itself (i.e., a textual interpretation), or metaphorically, as a guide to culturally appropriate behavior in real-life contexts (i.e., an extratextual interpretation; [98][99][100][101][102]). Given that multiple interpretations of each fable are possible, fables can be, at least in part, interpreted as each reader chooses [103], and therefore interpretations of a given fable can vary with the reader, the information that is chosen as salient during comprehension (e.g., a given character's actions), and the overall level of generalization (i.e., textual versus extratextual). All fables employed two characters, contained three episodes (i.e., setting, action, and resolution components), were between 10 and 21 propositions in length [15], and contained no mixture of anthropomorphized animal and human characters. We then modified the fables to exclude specific mention of character attributes (e.g., lazy, wise, etc.) and any specific mention of the moral or lesson. Fables are shown in File S1. We asked participants to generate two different lessons or morals for each of the 12 fables.
We instructed participants to first give what they considered to be the "best" lesson for the fable (LESSON 1). We then asked participants to generate a second possible lesson for each fable that reflected a different interpretation or perspective (LESSON 2). The examiner read the fable to participants, and a card with the printed fable was within view during generation of both lessons to minimize memory demands.

Analysis
4.3.1 Response coding. Lessons were scored categorically according to: (a) whether there was a switch in perspective between LESSON 1 and LESSON 2, (b) whether the lesson reflected text specific or extratextual content [9,29], (c) whether the lesson portrayed the viewpoint of the main or of the supporting character [98], and (d) whether the lesson was given in the form of a statement or proverb, that is, a literal or metaphorical interpretation [104]. The accuracy or semantic fit of each lesson theme was scored in reference to the original fable. Representation of theme was not included as an active variable in the analysis due to the high degree of accurate semantic representation produced by all three age groups (91% accurate). Table 4 shows further definitions of the scoring categories with examples.
4.3.2 Inter-rater reliability. Inter-rater reliability was analyzed on a random 20% of the data by comparing the first author's coding with the code ratings of a second trained rater. Point-by-point agreement was 79%. A Cohen's kappa was calculated to correct for chance agreement (κ = 0.621), corresponding to a "substantial" rating of agreement [105].
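For readers unfamiliar with the chance correction, the following minimal Python sketch computes point-by-point agreement and Cohen's kappa for two raters. The category labels and ratings are hypothetical and are not the study's codes, and the function name `cohens_kappa` is ours.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two equally long
    sequences of categorical codes."""
    n = len(rater_a)
    # Point-by-point (observed) agreement
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical codes for ten lessons ("S" = switch, "N" = no switch)
rater_1 = ["S", "S", "N", "N", "S", "N", "S", "N", "S", "S"]
rater_2 = ["S", "S", "N", "S", "S", "N", "S", "N", "N", "S"]
agreement, kappa = cohens_kappa(rater_1, rater_2)
```

Kappa is lower than the raw agreement because part of the agreement would be expected even if both raters coded at random with the same marginal frequencies.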

4.3.3 Statistical analysis.
We used discriminant correspondence analysis (DICA) to analyze the coded lesson responses. DICA combines the features of correspondence analysis (CA) and discriminant analysis ([44,106]; see also [51] for a tutorial on language oriented applications). Correspondence analysis (CA) is a type of principal component analysis (PCA) specifically tailored for the analysis of categorical data, which represents the rows and columns as points in a (high dimensional) space [45,49,50][53][54][55]. In addition, CA (and consequently, DICA) can handle data sets with few observations described by many nominal variables [44,45,51,107].
Just like PCA, CA finds orthogonal factors or dimensions that reveal the patterns and the associations between the row and column profiles. The importance of the factors is determined by their inertia (i.e., a quality akin to variance), denoted by λ, and the proportion of explained inertia, denoted by τ. CA converts contingency tables into visual displays (i.e., maps) in which the row profiles and column profiles represent points in the display. The proximity of the points within the display represents their degree of association. Points distributed more closely in space are more strongly associated than those that are farther apart. In addition, CA places no constraints on the data; therefore, the pattern seen in the maps represents associations contained within the data and not those superimposed by an external model [47,48].
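The computation behind these quantities can be sketched in a few lines of Python. This is a minimal illustration of plain CA via the singular value decomposition, assuming NumPy. The counts are hypothetical and are not the study's Table 5, and the function name `correspondence_analysis` is ours.

```python
import numpy as np

def correspondence_analysis(table):
    """Plain CA of a contingency table via the SVD.

    Returns the eigenvalues (inertia per factor), the proportions of
    explained inertia, and the row and column factor scores.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                          # correspondence matrix
    r = P.sum(axis=1)                        # row masses
    c = P.sum(axis=0)                        # column masses
    # Standardized residuals from the independence model
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sv > 1e-12                        # drop null dimensions
    U, sv, Vt = U[:, keep], sv[keep], Vt[keep, :]
    eigenvalues = sv ** 2                    # lambda
    tau = eigenvalues / eigenvalues.sum()    # proportion of inertia
    row_scores = (U * sv) / np.sqrt(r)[:, None]
    col_scores = (Vt.T * sv) / np.sqrt(c)[:, None]
    return eigenvalues, tau, row_scores, col_scores

# Hypothetical 3-group by 4-category counts
counts = np.array([[30, 10, 25, 15],
                   [18, 22, 20, 20],
                   [15, 25, 18, 22]])
eigenvalues, tau, row_scores, col_scores = correspondence_analysis(counts)
```

With three rows, at most two factors carry inertia, which is why a three-group analysis yields two dimensions. Plotting `row_scores` and `col_scores` on the same axes gives the kind of map described above.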
DICA is a multivariate technique developed to classify observations described by qualitative and/or quantitative variables into a priori defined groups and therefore adds a discriminative component to CA. Here, we used DICA to analyze LESSON  For the DICA, participants were grouped into the three age groups. Then, the pattern of performance of the participants in each group was combined into its common pattern of performance (see [51] for more information on how the common pattern is developed). Table 5 shows the age-group by lesson response contingency table, the common pattern of performance used for the DICA in the current study.
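The grouping step can be pictured with a short Python sketch, assuming NumPy. The participant counts, the four categories, and the function name `group_table` are hypothetical, and the resulting table is not the study's Table 5.

```python
import numpy as np

# Hypothetical counts: one row per participant, one column per scoring category
participant_counts = np.array([[10, 2, 8, 4],
                               [9, 3, 7, 5],
                               [5, 7, 6, 6],
                               [4, 8, 5, 7],
                               [3, 9, 6, 6],
                               [4, 8, 4, 8]])
age_group = np.array(["Y-O", "Y-O", "M-O", "M-O", "O-O", "O-O"])
group_order = ["Y-O", "M-O", "O-O"]

def group_table(counts, labels, order):
    """Sum the rows of all participants in each group, giving one row
    per group (the group-by-category contingency table)."""
    return np.vstack([counts[labels == g].sum(axis=0) for g in order])

table = group_table(participant_counts, age_group, group_order)
```

The CA is then run on `table` rather than on the individual rows, so the factors describe differences between groups. Individual participants can still be placed on the map afterwards as supplementary rows.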
We then ran a CA on the common performances, which allowed us to examine the similarities and differences in patterns of performance across the age groups. CA and DICA also can be used to estimate the amount of variability within and between each category. To do this, 95% confidence ellipses are constructed using a bootstrap resampling technique ([108,109]; see also File S2.6.2). A detailed mathematical appendix is included in the Supplementary Information.
Figure S1 (TIFF)

Supporting Information
File S1 Supporting Information PDF file. (PDF)
File S2 (PDF)
Author Contributions