Transparent Meta-Analysis: Does Aging Spare Prospective Memory with Focal vs. Non-Focal Cues?

Background. Prospective memory (ProM) is the ability to become aware of a previously-formed plan at the right time and place. For over twenty years, researchers have debated whether prospective memory declines with aging or is spared by aging and, most recently, whether aging spares prospective memory with focal vs. non-focal cues. Two recent meta-analyses examining these claims did not include all relevant studies, ignored prevalent ceiling effects and age confounds, and did not distinguish between prospective memory subdomains (e.g., ProM proper, vigilance, habitual ProM) (see Uttl, 2008, PLoS ONE). The present meta-analysis focuses on the following questions: Does prospective memory decline with aging? Does prospective memory with focal vs. non-focal cues decline with aging? Does the size of age-related declines with focal vs. non-focal cues vary across ProM subdomains? And are age-related declines in ProM smaller than age-related declines in retrospective memory?
Methods and Findings. A meta-analysis of event-cued ProM using data visualization and modeling, robust count methods, and conventional meta-analysis techniques revealed that, first, age-related declines in ProM with both focal and non-focal cues are large. Second, age-related declines in ProM with focal cues are larger in ProM proper and smaller in vigilance. Third, age-related declines in ProM proper with focal cues are as large as age-related declines in recall measures of retrospective memory.
Conclusions. The results are consistent with Craik's (1983) proposal that age-related declines on ProM tasks are generally large, support the distinction between ProM proper vs. vigilance, and directly contradict widespread claims that ProM, with or without focal cues, is spared by aging.


Introduction
Prospective memory (ProM) is the ability to become aware of a previously-formed plan at the right time and place, for example, becoming aware that one wishes to mail a letter while passing by a post office, or that one wishes to buy groceries while passing by a supermarket (see Figure 1) [1-3]. Several important distinctions have been made in the literature on prospective memory. First, Graf and Uttl [1,2] distinguished between different subdomains of prospective memory: prospective memory proper or episodic prospective memory (cf. Harris [4]), vigilance/monitoring, and habitual prospective memory. Prospective memory proper brings back to awareness previously-formed plans and intentions at the right place and time so that we can act upon those plans and intentions. For example, it is ProM proper that brings back to consciousness the plan to mail a letter when approaching the post office. Vigilance/monitoring differs from prospective memory proper in that the plan remains in consciousness. To illustrate, an air-traffic controller maintains a plan (to issue orders to maintain the separation of planes) in consciousness and watches out for cues to issue such orders. Although this distinction between ProM proper and vigilance/monitoring is widely recognized [1,2,5-8], it is rarely made explicit and readers must carefully read method sections to determine whether a specific study concerned ProM proper or vigilance/monitoring [2]. In habitual prospective memory, as in prospective memory proper, a plan is made, leaves consciousness, and then must be brought back into consciousness at the right place and time. However, in contrast to ProM proper, the plan must be brought back to consciousness repeatedly whenever the ProM cue calls for the plan's performance, and in contrast to vigilance/monitoring, the plan leaves consciousness between successive occurrences of ProM cues. A classic example of habitual prospective memory is taking one's medication every day at bedtime.
Second, Harris [4] and others have distinguished between event-cued and time-cued prospective memory. In event-cued prospective memory the ProM cue is an event, such as passing by the post office en route home, whereas in time-cued prospective memory the ProM cue is a time, for example, taking one's medication daily at 9:00 a.m.
For over twenty years, researchers have debated whether prospective memory declines with aging or is spared by aging. Craik's [9,10] theoretical analysis suggesting that age-related declines in prospective memory would be large (at least as large as, or larger than, age-related declines in retrospective memory) was quickly opposed by Einstein and McDaniel's [11] claim that ProM is an ''exciting exception to typically found age-related decrements in memory'' (p. 724). Almost twenty years later, McDaniel and colleagues [12] summarized the extant research with the following: ''Although the pattern of age-related effects is mixed, a significant number of studies show little or no age-related decrements in prospective memory performance on this [typical] event-based prospective memory task'' (p. 823).
Most recently, in an attempt to explain this ''puzzle of inconsistent age-related declines in prospective memory'' [13], McDaniel and Einstein [13,14] introduced the distinction between focal and non-focal cues, arguing that aging spares prospective memory with focal cues but impairs prospective memory with non-focal cues. For focal cues, the ongoing task requires processing of cue features relevant to the ProM plan, whereas for non-focal cues the ongoing task does not require processing of information relevant to the plan. To illustrate, encountering and talking to a friend to whom one intends to tell something is an example of a focal cue, whereas catching a glimpse of one's friend at a party while talking to someone else is an example of a non-focal cue (see McDaniel, Einstein, & Rendell [13], p. 142). McDaniel, Einstein, and their colleagues argue that prospective memory with focal cues does not decline with aging because retrieval of the plan in response to the appearance of a focal cue is ''automatic'', ''reflexive'', and ''obligatory'' [13-15].
In a comprehensive transparent meta-analysis of previous research, Uttl [2] has recently demonstrated that, for studies conducted under controlled laboratory conditions, prospective memory performance declines with aging for event-cued prospective memory proper (d = −1.13), event-cued vigilance/monitoring (d = −0.77), and time-cued vigilance/monitoring (d = −0.96), whereas for studies conducted in natural settings, prospective memory task performance improves with aging for time-cued prospective memory proper (d = +0.53) and time-cued habitual prospective memory (d = +0.76). Thus, the cumulative findings from laboratory studies are consistent with Craik's [9,10] theoretical proposal by demonstrating that age-related declines in ProM proper are large, at least as large as age-related declines in retrospective memory, and negate any claims that prospective memory does not decline with aging. In contrast, older adults' better performance on prospective memory tasks in uncontrolled natural settings can be explained by older adults' greater reliance on compensatory strategies, external memory aids, motivation, and other factors (see for example Maylor [16] for discussion of non-cognitive variables that can explain older vs. younger adults' superior performance on ProM tasks in natural settings).
Equally important, Uttl's [2,3] reviews revealed a number of methodological problems within the prospective memory research, such as: severe ceiling effects that artificially restrict the magnitude of age-related declines in individual studies; age confounds (e.g., intelligence, ongoing task difficulty) that almost always favor older adults; and failures to distinguish between the various subdomains of prospective memory. To illustrate some of the most critical methodological problems afflicting prior research on prospective memory and aging, Figure 2 focuses on event-cued prospective memory assessed under controlled laboratory conditions (collapsing across ProM proper, vigilance, and habitual ProM). Panels A and B show that the performance of younger and older adults, respectively, was perfect or nearly perfect in a substantial number of previous studies, severely limiting the size of observable age-related declines. Panel C shows the magnitude of raw ProM age-related declines as a function of performance of older adults; it highlights that the size of the age decline is directly dependent upon the performance of older adults, r = −0.63 (see Uttl [2,3,5]; see McDaniel & Einstein [14] for an independent replication of this finding). When the task was so easy that older adults scored perfect (p = 1.0 or 100%), no age-related declines could emerge, whereas when the task was more difficult, substantial age-related declines emerged. Accordingly, one may conclude that one of the most robust findings in prospective memory literature is that the size of age-related declines depends on the researcher's ability to avoid ceiling effects [5]. Panel D shows a strong relationship, r = −0.49, between the raw ProM age decline and one of the most common age confounds: the verbal intelligence advantage of older adults over younger adults expressed in terms of standard deviations. Thus, one way to avoid obtaining age-related declines in prospective memory research is to compare older adults who score two standard deviations higher than younger adults on verbal intelligence measures. Panel E shows the magnitude of raw ProM age-related declines for studies with no confounds vs. studies with age confounds favoring older adults (e.g., intelligence, ongoing task difficulty). As one would expect, the data show that confounds favoring older adults reduce the size of age-related declines in ProM. Consistent with Uttl's [2,3] reviews, Panel F shows the frequency distribution of raw age-related declines. These data highlight that, despite widespread ceiling effects and despite intelligence and ongoing task difficulty confounds favoring older adults, the vast majority of studies of event-cued prospective memory assessed under laboratory conditions have found that older adults perform more poorly than younger adults.
Figure 1. A typical situation requiring ProM proper is to buy groceries en route home from work. We make a plan to get groceries en route from work, engage in unrelated activities (work), and the function of ProM proper is to bring the plan back to consciousness at the right time and place, while driving home, in response to the ProM cue (supermarket) (from Uttl [2]). doi:10.1371/journal.pone.0016618.g001
Figure 2. Age-related declines in event-cued ProM: Summary of methodological problems and key findings. Panels A and B show that performance of younger and older adults, respectively, was perfect or nearly perfect in a substantial number of previous studies, severely restricting the size of observed age-related declines. Panel C shows the magnitude of raw ProM age-related declines as a function of the performance of older adults; it highlights that the size of the age decline is directly dependent upon the level of performance of older adults, r = −0.63. Panel D shows a strong relationship, r = −0.49, between the raw ProM age decline and one of the most common age confounds: the verbal intelligence advantage of older adults over younger adults, expressed in terms of standard deviations. Panel E shows the magnitude of raw ProM age-related declines for studies with no confounds vs. studies with age confounds favoring older adults (e.g., intelligence, ongoing task difficulty). Panel F shows the frequency distribution of raw age-related declines; it indicates that the vast majority of the previous studies have found age-related declines in event-cued prospective memory assessed in the laboratory (see Uttl [2]).
Although unlikely in light of the overwhelming evidence of large age-related declines in event-cued prospective memory proper and event-cued vigilance/monitoring, it is still possible that there may be no age-related declines with focal cues as argued by McDaniel, Einstein, and their colleagues [13-15]. McDaniel and Einstein [14] recently tabulated 82 age contrasts from previously published event-cued laboratory experiments, classified each contrast as arising from the use of ''focal'', ''non-focal'', or ''indeterminate'' ProM cues, and reported that raw age-related declines were larger with non-focal than with focal cues. However, they did not attempt to determine statistically whether age-related declines are actually absent with focal cues. Uttl [2] reviewed and formally analyzed McDaniel and Einstein's [14] Table 7.4 and concluded that the tabulated data are biased towards minimizing age differences for the following reasons. First, McDaniel and Einstein omitted over 50% of all laboratory event-cued age contrasts reported in the literature, and they excluded every nonconfounded age contrast of event-cued prospective memory proper (all showing substantial age-related declines; e.g., [17-23]) except that of Tombaugh et al. [24]. Given that age-related declines in ProM proper are much larger than in vigilance/monitoring [2,3], this exclusion of ProM proper studies necessarily reduced the size of age-related declines. Second, many of the studies with focal cues listed in McDaniel and Einstein's Table 7.4 confounded age with intelligence, whereas only a few studies with non-focal cues have done so. In turn, this bias artificially reduced the size of age-related declines with focal cues. Third, McDaniel and Einstein did not consider severe ceiling effects that artificially minimize age differences [2,3,5]. Fourth, McDaniel and Einstein did not consider the distinction between ProM proper and vigilance/monitoring even though they themselves have endorsed this distinction on several occasions (e.g., [6]). In summary, McDaniel and Einstein's selective mini meta-analysis of ProM age-related declines with focal vs. non-focal cues has several flaws due to the biases enumerated above and discussed in detail in Uttl [2]. However, based on Uttl's [2] analysis of McDaniel and Einstein's Table 7.4, we can conclude that even this biased data set, selected by McDaniel and Einstein themselves, strongly contradicts their claims that prospective memory with focal cues is spared by aging and that its retrieval is ''automatic'', ''reflexive'', or ''obligatory''.
More recently, Kliegel, Jager, and Phillips [25] conducted a meta-analysis of event-cued prospective memory with focal vs. non-focal cues and reported that age-related declines were larger for non-focal cues (d = 0.72) than focal cues (d = 0.54). Unfortunately, this most recent meta-analysis is also severely limited by numerous methodological problems. First, Kliegel et al. [25] omitted relevant studies (e.g., [19], Uttl et al. [21]), excluded all studies using continuous measures of prospective memory (e.g., Uttl [26]), and excluded all studies published in book chapters (e.g., Graf, Uttl, & Dixon [20]), with the exception of a not-yet-published study by Maylor et al. (cited in [25]) as these ''authors declared that they did not intend to submit the study to a journal'' [25]. Interestingly, the authors included ''in preparation'', ''submitted'', and ''in press'' works from their own labs (see Kliegel et al., Supplemental Table 1). Second, Kliegel et al. [25] did not consider the methodological problems with prospective memory studies enumerated and discussed by Uttl [2,3,5], including widespread ceiling effects that reduce age differences and standard deviations, invalidate reliabilities and correlations, and in turn invalidate any effect size index calculated from group means and standard deviations, such as the Hedges' d used by the authors. Third, Kliegel et al. [25] disregarded age confounds, including intelligence and ongoing task difficulty, and analyzed confound-free and age-confounded studies mixed together. Fourth, Kliegel et al. [25] did not take into account that age-related declines vary across subdomains of ProM. Thus, Kliegel et al.'s [25] results and conclusions are an artifact of a particular blend of selected confound-free and age-confounded studies from various subdomains of ProM mixed together and analyzed by an effect size index that is inappropriate for ceiling-limited age contrasts.
Accordingly, the present meta-analysis has three aims. The first aim is to determine if event-cued prospective memory with focal cues is spared by aging as argued by McDaniel, Einstein, and their colleagues [13,14]. The second aim is to examine whether the size of age-related declines with focal vs. non-focal cues varies with prospective memory subdomains (ProM proper vs. vigilance/monitoring). The third aim is to determine whether age-related declines with focal cues are smaller than age-related declines in retrospective memory. Finding that ProM with focal cues does not decline with age would support McDaniel and Einstein's [13,14] claim that there are no age-related declines in ProM with focal cues as well as their theory that retrieval of a prospective memory plan in response to focal cues is ''automatic'', ''reflexive'', and ''obligatory,'' whereas a finding of substantial age-related declines even with focal cues would contradict their claims and theories. Moreover, if age-related declines vary more across prospective memory subdomains and age confounds (e.g., intelligence, ongoing task difficulty) than across the focal vs. non-focal cue distinction, the results would suggest that at least some prospective memory researchers have been focusing, metaphorically, on the wrong tree or even the wrong forest in their attempts to explain what they believe are inconsistent age-related declines across studies.
Importantly, to minimize biases and artificial reductions in estimated effect sizes arising from methodological and measurement issues with the primary data, including widespread ceiling effects, low reliability, and the dichotomous nature of most prospective memory indexes, the present meta-analysis employs three ways of analyzing the data: graphical meta-analysis combined with effect size model fitting (see Uttl [2]), robust outcome count meta-analysis, and traditional meta-analysis using d_probit rather than the inferior d_p or d_phi based methods that derive d from means, standard deviations, or ts, ps, and Fs (see [2,27]).
Figure 3 depicts the search for relevant studies; the search proceeded in several steps, closely following the method employed by Uttl [2]. First, the PsycLIT database was searched from the earliest available date to the end of March 2010 for the terms ''prospective memory'' and ''memory for intentions'', and these two searches were combined with the OR operator. Second, the references in Birt [28], Henry et al. [29], Uttl [2], and Kliegel et al. [25] were examined for potentially relevant articles and the identified articles were examined for relevance. Next, the references in all relevant articles and book chapters, retrieved by any method, were examined for potentially relevant articles and the identified articles were examined for relevance. This search yielded 815 potentially relevant articles.

Selection of Studies Included in Meta-Analysis
The full text of all potentially relevant articles was examined for studies that reported performance on an event cued prospective memory task in laboratory settings for at least one group of younger and one group of older adults; the participant groups were healthy and without any diseases known to affect cognition (e.g., dementia); at least the mean performance for each age group was provided; and the studies were written in English. Tasks were considered to be prospective memory tasks if they required participants to perform some action in the future without any prompting from experimenters. For a few studies with more than two age groups spanning the adult lifespan, groups younger than 60 years of age were collapsed into the younger group, and groups older than 60 years of age were collapsed into the older group. This examination identified 62 articles, each reporting at least one age contrast conforming to the inclusion criteria above, and yielding 228 age contrasts in total.
Two age contrasts were excluded from the meta-analysis because age was confounded with conditions known to negatively affect cognition. For example, Mantyla and Nilsson [30] conducted a population-based study of prospective memory and, as a result, many of their older participants scored within the impaired range on the Mini Mental State Examination [31], a quick index of possible dementia. Three age contrasts were excluded because an examiner asked participants for their belongings and the participants' prospective memory task was to ask for their belongings at the end of the experiment (e.g., [32]). Given that the belongings turned over by participants differed across studies and participants, and likely also in terms of personal importance, it is unclear what the effect of this confound may be. Finally, six age contrasts were excluded because participants performed experiment-like tasks in uncontrolled naturalistic settings (e.g., home, online) and little is known about the participants and/or how they performed the tasks (e.g., [33,34]). To illustrate, in an ingenious study by Logie and Maylor [34], participants self-selected to complete various memory tasks linked from the BBC website, including a single-trial ProM proper task. Thousands of people participated (73,018) and the results showed large age-related declines across the adult life span. However, the study can be criticized because, for example, we do not know much about the participants (e.g., their verbal intelligence) or their state (e.g., sober, tired), and we do not know how they performed the task due to the uncontrolled setting. The excluded age contrasts, including the number and performance of younger and older adults are listed in

Classification of Age Contrasts Included in Meta-Analysis
Age-Related Confounds Favoring Older Adults. The methodological review of previously published studies reveals that a large proportion of studies and age contrasts are severely limited by age confounds favoring older adults [2,3] (see Figure 2). Thus, each age contrast was classified into one of two categories: age contrasts with no confounds and age contrasts with confounds favoring older adults (e.g., ongoing task confounds favoring older adults, intelligence confounds favoring older adults). Ongoing task confounds favoring older adults were introduced by Einstein and McDaniel [11] who made the ongoing task easier for older adults relative to younger adults; this design was subsequently adopted by a number of other investigators (e.g., [12,35-38]). However, making the ongoing task easier for older adults artificially reduces the size of age differences and makes it impossible to disentangle the effects of aging from the effects of giving older adults an easier ongoing task (see Uttl [2] for a discussion of this issue).
Intelligence confounds favoring older adults refer to designs where highly intelligent older adults were compared to less intelligent younger adults. Since intelligence is positively correlated with prospective memory performance [39,26,35], this confound is also likely to artificially reduce the size of age-related declines. Indeed, as seen in Figure 2, Panel D, the intelligence advantage of older adults is moderately strongly and negatively correlated with the size of age-related declines. The affected studies include Einstein and McDaniel [11]; Cherry and LeCompte [35]; Reese and Cherry [40]; Cherry and Plauche [38]; Farrimond, Knight, and Titov [41]; Kvavilashvili et al. [37] and others. For the purposes of this article, the data are considered confounded with intelligence if older adults score more than 1.0 standard deviation above the ability of younger adults.

Prospective Memory Proper, Vigilance, and Habitual Prospective Memory. Consistent with the definitions above, each prospective memory task was classified as measuring prospective memory proper, vigilance/monitoring, or habitual prospective memory. Tasks that included a time delay or intervening task between prospective memory instructions and commencement of an ongoing task were classified as measuring prospective memory proper, whereas tasks that included no delay between instructions and the ongoing task were classified as measuring vigilance/monitoring. This classification is consistent with the view expressed by many leading researchers in the field [1,2,6-8]. To illustrate, Marsh et al. [7] explain that ''this task was merely a distractor task placed between the prospective memory instruction and the onset of the rating [ongoing] task so that the prospective task did not become vigilance task…'' (p. 304). Similarly, Shapiro and Krishnan [8] note that ''this delay [15 min] has been shown to be sufficient to clear short-term memory and to ensure that it is not treated as a vigilance task…'' (p. 174). If a prospective memory proper task was to be executed repeatedly in response to the same cue and with the plan likely to leave consciousness between successive presentations of the cue, the task was classified as habitual prospective memory (see Uttl [2]).
Focal vs. Non-focal Cues. Focal cues are cues that participants must work with as part of the ongoing task, whereas non-focal cues are cues that need not be processed by participants during the course of an ongoing task. In other words, focal cues carry information relevant for performing an ongoing task, whereas non-focal cues do not provide any information relevant to performance of an ongoing task [13,14]. By this definition, a questionnaire that a participant is required to complete is considered a focal cue if prospective memory instructions require the participant to perform some action when they are presented with the questionnaire. In contrast, the color of a toy is considered a non-focal cue when the ongoing task requires participants to sort toys into semantic categories but does not require them to attend to each toy's color. Consistent with these definitions and examples, prospective memory cues were classified as focal or non-focal for each age contrast.
The classification of cues as focal vs. non-focal was compared with the cue classifications in the two previous meta-analyses by McDaniel and Einstein [14] and Kliegel et al. [25] using percentage agreement and Krippendorff's alpha [42]. Krippendorff's alpha measures the degree of inter-coder agreement or inter-rater reliability, with 1 indicating perfect reliability, 0 indicating the absence of reliability, and negative values indicating systematic disagreement. The values above 0.80
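To make the two agreement indices concrete, the following is a minimal sketch of percentage agreement and Krippendorff's alpha for the simplest case: two coders, nominal categories, and no missing codes. The function names and the example codes are hypothetical and are not taken from the coding data of the present meta-analysis.

```python
from collections import Counter


def percentage_agreement(coder_a, coder_b):
    """Share of units (0..1) on which the two coders assigned the same category."""
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return matches / len(coder_a)


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal data, no missing values.

    alpha = 1 - D_o / D_e, where D_o is the observed and D_e the expected
    disagreement computed from the pooled category frequencies.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same units")
    n_values = 2 * len(coder_a)  # total number of pairable values
    # Each disagreeing unit contributes two off-diagonal coincidences: (a, b) and (b, a).
    observed = sum(2 for a, b in zip(coder_a, coder_b) if a != b)
    totals = Counter(coder_a) + Counter(coder_b)
    expected = sum(totals[c] * totals[k] for c in totals for k in totals if c != k)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1 - (n_values - 1) * observed / expected


# Hypothetical cue classifications for four age contrasts.
ours = ["focal", "focal", "non-focal", "non-focal"]
theirs = ["focal", "focal", "non-focal", "focal"]
agreement = percentage_agreement(ours, theirs)
alpha = krippendorff_alpha_nominal(ours, theirs)
```

Unlike percentage agreement, alpha corrects for the agreement expected from the pooled category frequencies alone, which is why the two indices can diverge when one category dominates.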

Meta-Analysis Methodology
Multiple Effect Sizes from Single Studies. Effect sizes were calculated for each age contrast, that is, for each reported condition with both younger and older adults. However, to satisfy the independence assumption of meta-analysis, each participant could contribute to only one age contrast for statistical analysis purposes. Thus, when one group of participants was tested under two or more conditions, the following criteria were used to select one condition to be included in the statistical analyses: (1) the condition that was administered first was preferred; (2) the condition with the smaller retrospective memory load was preferred; (3) the condition with more lenient scoring was preferred (see [1,2,21]); and (4) if the preceding criteria were insufficient to unambiguously choose a condition, the condition was selected at random.
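The selection rule can be illustrated with a short sketch. It assumes that criteria (1) to (3) are applied in strict priority order, which the text implies but does not state outright; the `Condition` fields and the function name are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    label: str
    order: int             # 1 = administered first
    retro_load: int        # smaller = lighter retrospective memory load
    lenient_scoring: bool  # True = more lenient scoring


def pick_independent_condition(conditions, rng=None):
    """Pick one condition per participant group using criteria (1) to (4)."""
    rng = rng or random.Random()

    def priority(c):
        # Tuples compare left to right, so earlier criteria dominate later ones.
        return (c.order, c.retro_load, not c.lenient_scoring)

    best = min(priority(c) for c in conditions)
    tied = [c for c in conditions if priority(c) == best]
    # Criterion (4): a random choice only when (1) to (3) leave a tie.
    return tied[0] if len(tied) == 1 else rng.choice(tied)
```

Passing a seeded `random.Random` instance keeps the random tie-break reproducible.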
Data Visualization and Modeling. Following Uttl [2], data visualization and modeling techniques were used to determine effect size estimates that are the least affected by ceiling effects, skewed standard deviations, and other distribution problems that are widespread in prospective memory research [2,3]. Specifically, the performance of younger adults was plotted as a function of the performance of older adults and then the best fitting theoretical effect size curve and associated effect size were determined using double variate squared error minimization methods, both with and without weighting each point by its sample size. This modeling method minimizes the influence of ceiling-limited data because data points close to either the floor or ceiling have little or no effect on the determination of the best fitting curve. The 95% confidence intervals (CIs) on the fitted effects were derived and the differences between the effect sizes were tested using bootstrapping methods that are robust, conservative, and require few assumptions relative to classical meta-analytic methods [43].
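The general idea of the curve fitting can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: it assumes that the theoretical effect size curve is a constant shift d on the probit scale, it approximates the double variate squared error by the squared distance from each point to a densely sampled curve, and it finds d by a simple grid search. All names and the example contrasts are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
# Older-adult proportions at which the theoretical curve is sampled (open interval 0..1).
GRID = [i / 200 for i in range(1, 200)]
GRID_Z = [STD_NORMAL.inv_cdf(p) for p in GRID]


def effect_size_curve(d):
    """Points (older, younger) implied by a constant probit-scale age effect d.

    d is negative when older adults perform worse than younger adults.
    """
    return [(p, STD_NORMAL.cdf(z - d)) for p, z in zip(GRID, GRID_Z)]


def squared_error(points, d):
    """Weighted sum of squared distances, in both coordinates, to the curve."""
    curve = effect_size_curve(d)
    return sum(
        w * min((x - cx) ** 2 + (y - cy) ** 2 for cx, cy in curve)
        for x, y, w in points
    )


def fit_d(points, candidates):
    """Grid search for the d whose curve lies closest to the observed points."""
    return min(candidates, key=lambda d: squared_error(points, d))


def bootstrap_ci(points, candidates, n_boot=1000, seed=0):
    """Crude percentile 95% CI from resampling age contrasts with replacement."""
    rng = random.Random(seed)
    fits = sorted(
        fit_d([rng.choice(points) for _ in points], candidates)
        for _ in range(n_boot)
    )
    return fits[int(0.025 * n_boot)], fits[min(n_boot - 1, int(0.975 * n_boot))]


# Hypothetical age contrasts: (older proportion, younger proportion, weight).
contrasts = [(0.35, 0.70, 24), (0.50, 0.78, 40), (0.62, 0.80, 32), (0.80, 0.95, 20)]
candidates = [round(-2.0 + 0.05 * i, 2) for i in range(41)]  # -2.00 ... 0.00
best_d = fit_d(contrasts, candidates)
```

Setting every weight to 1 gives the unweighted fit. Under this curve shape, a contrast near the ceiling lies close to the curves for a wide range of d values, so it constrains the fit only weakly.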
Robust Count Techniques. Robust statistical techniques (counts and sign tests) were used to determine if specific prospective memory subdomains were affected by aging.
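A sign test on outcome counts can be sketched in a few lines. The version below is a two-sided exact binomial test that drops ties; whether the analyses reported here were one- or two-sided is not stated, so that choice is an assumption, as is the function name.

```python
from math import comb


def sign_test_p(declines, improvements):
    """Two-sided exact sign test; ties are dropped before the test."""
    n = declines + improvements
    k = min(declines, improvements)
    # Probability of a split at least this lopsided when declines and
    # improvements are equally likely (p = 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts reported in the Results for ProM proper with focal cues:
# 12 declines, 0 ties, 0 improvements.
p_value = sign_test_p(12, 0)
```

Because the test uses only the direction of each age contrast, it is unaffected by compressed effect sizes in ceiling-limited studies; a contrast in which both groups score perfectly simply counts as a tie.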
Conventional Meta-Analysis. To satisfy traditionalists, conventional meta-analytic techniques were used to estimate effect sizes. However, given the dichotomous nature of primary outcome measures in all but a few studies [1-3], the probit was chosen as an effect size index and then transformed to its d-equivalent, d_probit [27]. Theoretical and empirical research as well as examples discussed in Uttl [2] and Sanchez-Meca et al. [27] demonstrate that d_probit underestimates the true effect size much less than phi to d transformations or indices calculated using means and standard deviations such as d_p or the Hedges' d used in previous meta-analyses [25,28,29]. Although these results are not reported, the data were also analyzed using odds ratios and the odds ratios yielded nearly identical effect sizes. The I^2 measure of inconsistency between studies/age contrasts in a meta-analysis is also provided; it ranges from 0 to 100% and quantifies the percentage of total variation across the studies attributed to heterogeneity rather than chance [44]. Higgins et al. [44] suggest that I^2 values of 25% indicate low inconsistency, 50% moderate inconsistency, and 75% high inconsistency among the studies.
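For a single age contrast, the probit-based d and the I^2 index can be computed as sketched below. The sketch assumes the sign convention used in the Results (negative values indicate an age-related decline), uses the usual large-sample (delta-method) variance for a difference of probits, and computes I^2 from Cochran's Q with inverse-variance weights. The function names are hypothetical.

```python
from math import exp, pi
from statistics import NormalDist

STD_NORMAL = NormalDist()


def d_probit(p_older, n_older, p_younger, n_younger):
    """Probit-based d and its large-sample variance for one age contrast.

    Proportions must lie strictly between 0 and 1; inv_cdf raises an error
    at 0 or 1, so ceiling-limited contrasts need separate handling.
    """
    z_older = STD_NORMAL.inv_cdf(p_older)
    z_younger = STD_NORMAL.inv_cdf(p_younger)
    d = z_older - z_younger  # negative when older adults perform worse
    variance = (
        2 * pi * p_older * (1 - p_older) * exp(z_older ** 2) / n_older
        + 2 * pi * p_younger * (1 - p_younger) * exp(z_younger ** 2) / n_younger
    )
    return d, variance


def i_squared(effects, variances):
    """I^2 in percent: share of total variation attributed to heterogeneity."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
```

The error raised at proportions of exactly 0 or 1 is a reminder that a group scoring at ceiling yields no finite probit, which is one reason ceiling-limited contrasts are problematic for any effect size index.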
Blocking. To avoid misleading and biased results, the studies were blocked by confound (e.g., age contrasts with no confounds, age contrasts with confounds favoring older adults) and analyzed separately.

Results
The meta-analysis included 217 age contrasts from 57 articles, representing 6,765 younger (mean age = 24.2 years) and 5,906 older (mean age = 71.7 years) individuals, doubling or tripling the size of the meta-analyses reported by McDaniel and Einstein [14] and Kliegel et al. [25]. Thus, the present meta-analysis represents a substantial advance over the previous meta-analyses in its comprehensive coverage of previously published research.
For illustrative purposes only, and to allow comparison with the "mix everything together" approach adopted in the meta-analysis by Kliegel et al. [25], Figure 4 shows the performance of younger adults as a function of the performance of older adults with focal vs. non-focal cues for all available contrasts, that is, disregarding age confounds (e.g., intelligence, ongoing task difficulty) and prospective memory subdomains (e.g., prospective memory proper, vigilance). Figure 4 includes the best-fitting estimated d derived by double variate square error minimization methods and the associated 95% confidence intervals derived by bootstrapping methods using 10,000 samples [43]. It also highlights that the vast majority of studies in both focal and non-focal conditions show substantial age-related declines and that age-related declines on focal cues [d = -0.69; 95% CI = (-0.89, -0.50)] were comparable to age-related declines on non-focal cues [d = -0.64; 95% CI = (-0.78, -0.51)]. However, these results reflect a particular blend of confounds and prospective memory subdomains. Figure 5 shows the performance of younger adults as a function of the performance of older adults for conditions with focal vs. non-focal cues, for prospective memory proper and for vigilance (no studies of event-cued habitual prospective memory were identified in the review; see Uttl [2] for a more extensive discussion of this point), for confound-free age contrasts only. Several findings are readily apparent from the data. First, the majority of previous confound-free age contrasts examined vigilance/monitoring and a comparatively small number of age contrasts examined prospective memory proper. Only four ProM proper age contrasts with non-focal cues and binary outcome measures were identified by the review. An additional three contrasts not shown in the figure involved continuous measures of ProM and all showed an age decline (see [26,20]).
For each subdomain and type of cue (focal vs. non-focal), Table 12 shows the number of age contrasts available (k) as well as a summary of the outcomes: the number of age contrasts showing decline, age parity (i.e., no differences), and age improvement, both for all outcomes (i.e., a participant may have contributed data to more than one condition/age contrast) and for independent outcomes only (i.e., each participant contributed data to only one condition/age contrast). In addition, for independent outcomes only, Table 12 shows the results of the robust sign test meta-analyses and the conventional random-effects meta-analyses using d probit as the effect size index, including the inconsistency index.
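To make the d probit index concrete, the following is a minimal sketch that assumes the common definition of d probit as the difference between the probit-transformed (inverse normal CDF) success proportions of two groups. The function name, the sign convention (older minus younger, so that declines are negative), and the proportions are illustrative assumptions, not values or code from the present meta-analysis.

```python
from statistics import NormalDist


def d_probit(p_older: float, p_younger: float) -> float:
    """Difference between probit-transformed proportions (older minus younger).

    Assumed sign convention: negative values indicate an age-related decline.
    Proportions of exactly 0 or 1 have no finite probit, which is one reason
    ceiling effects are a problem for binary ProM measures.
    """
    probit = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return probit(p_older) - probit(p_younger)


# Hypothetical proportions of participants succeeding on a binary ProM task.
print(d_probit(p_older=0.50, p_younger=0.84))
```

Because `inv_cdf` raises an error for proportions of exactly 0 or 1, any real analysis would need an explicit rule for studies at ceiling or floor; that rule is not specified here.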
The data in Table 12 are consistent with the modeling results. Looking at the independent outcomes only, both the binomial tests and d probit show statistically significant, large age-related declines in all three subdomain and cue-type conditions with sufficient data: ProM proper with focal cues (12 declines, 0 ties, 0 improvements; d probit = -1.01), vigilance with focal cues (20 declines, 0 ties, 1 improvement; d probit = -0.58), and vigilance with non-focal cues (27 declines, 2 ties, 2 improvements; d probit = -0.61).
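The count-based logic behind these binomial tests can be sketched as an exact two-sided sign test. This is an illustration only: it assumes that ties are dropped and that the null hypothesis is an equal chance of decline and improvement, which may differ from the exact robust procedure used for Table 12. The function name is hypothetical; the counts are those quoted above.

```python
from math import comb


def sign_test_p(declines: int, improvements: int) -> float:
    """Exact two-sided sign test p-value under H0: P(decline) = 0.5.

    Ties are assumed to have been dropped before calling this function.
    """
    n = declines + improvements
    k = max(declines, improvements)
    # Probability of observing k or more outcomes in the majority direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts quoted in the text (declines, improvements), ties excluded.
for label, counts in {
    "ProM proper, focal": (12, 0),
    "Vigilance, focal": (20, 1),
    "Vigilance, non-focal": (27, 2),
}.items():
    print(label, sign_test_p(*counts))
```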
Finally, Table 12 shows the outcomes of studies with age confounds favoring older adults (ongoing task confounds, intelligence confounds, or both). Even though the age confounds favored older adults, the results of these confounded studies also show age-related declines in all conditions except in ProM proper with focal cues, d probit = -0.22, 95% CI = (-0.45, +0.01). Not surprisingly, however, age-related declines were smaller in these age-confounded studies favoring older adults than in the studies without age confounds (see Uttl [2]).
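For readers unfamiliar with how a pooled effect, its 95% confidence interval, and the inconsistency index are obtained in a conventional random-effects meta-analysis, the sketch below uses the DerSimonian-Laird estimator. The choice of estimator is an assumption (the text specifies only a conventional random-effects analysis), and the effect sizes and variances are hypothetical.

```python
from math import sqrt


def random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled effect, (ci_low, ci_high), I2 in percent).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * d for wi, d in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # inconsistency index
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical d probit values and sampling variances for five age contrasts.
d_values = [-1.20, -0.85, -1.05, -0.60, -1.30]
d_variances = [0.09, 0.12, 0.08, 0.15, 0.10]
print(random_effects(d_values, d_variances))
```

A confidence interval that includes zero, as in the confounded ProM proper condition above, corresponds to a pooled estimate that is not statistically significant at the conventional level.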

Discussion
The current comprehensive meta-analysis of age-related declines in ProM with focal vs. non-focal cues yielded several critical findings. First, age-related declines in ProM with both focal and non-focal cues are large. Second, age-related declines in ProM with focal cues vary across subdomains; they are large in ProM proper and smaller in vigilance. Third, age-related declines in ProM proper with focal cues (d = -1.09) are as large as or larger than age-related declines in recall measures of retrospective memory as reported in several independent meta-analyses of retrospective memory declines (d = 1.01, Spencer & Raz [45]; d = 0.97, LaVoie & Light [46]; d = 0.99, Verhaeghen, Marcoen, & Goossens [47]).
The substantial age-related declines in ProM with both focal and non-focal cues, which are consistent with previous findings by Uttl [2] and Kliegel et al. [25], directly contradict Einstein, McDaniel, and their colleagues' claims that aging spares prospective memory with focal cues. As discussed by Uttl [2,3], the evidence offered by McDaniel, Einstein, and their colleagues in support of their claim that prospective memory is an "exciting exception to age-related declines in memory" has been based on null findings due to (1) methodological artifacts such as ceiling effects [5], (2) intelligence confounds favoring older adults (see Figure 2), (3) ongoing task difficulty confounds favoring older adults (e.g., [11]), (4) studies with astonishingly low statistical power to detect even very large age differences in ProM (see Uttl [2] for discussion), and (5) studies of vigilance as opposed to ProM proper, where age-related declines are smaller [2,3]. Perhaps not surprisingly, claims of no age-related declines in ProM with focal cues are similarly based on data compromised by ceiling effects, intelligence age confounds, ongoing task age confounds, low statistical power, and studies of vigilance. When age-confounded studies are removed from the analyses and the studies are blocked by ProM subdomain, the accumulated evidence shows that age-related declines in ProM with focal cues are large in ProM proper (d = -1.09) and smaller but still substantial in vigilance (d = -0.59). In turn, the findings strongly contradict McDaniel and Einstein's claims that ProM with focal cues is spared by aging due to "automatic", "obligatory", or "reflexive" retrieval of the previously formed plan. On the contrary, smaller age-related declines in studies with vs. without ongoing task age confounds favoring older adults (e.g., easier ongoing tasks for older adults) suggest that retrieval of the plan requires cognitive resources and is anything but automatic, obligatory, or reflexive.
The current meta-analysis revealed much larger age-related declines in ProM with focal cues than the previous meta-analyses. As noted in the introduction, Uttl [2] reanalyzed data presented by McDaniel and Einstein [14] in their Table 7.4 and demonstrated that even McDaniel and Einstein's own very limited selection of studies and classification of ProM cues as focal vs. non-focal revealed large age-related declines in ProM with both focal and non-focal cues, contrary to their claims. However, Uttl also noted that McDaniel and Einstein's selection was biased towards smaller age-related declines, as their Table 7.4 omitted most of the studies of ProM proper, omitted over 50% of all studies of ProM, and ignored methodological artifacts such as ceiling effects, intelligence confounds, and ongoing task ease confounds. Similarly, Kliegel et al.'s [25] meta-analysis suffered from a number of shortcomings, including the omission of many published studies and the failure to consider methodological artifacts such as ceiling effects, intelligence confounds, and ongoing task ease confounds (see above for details). Thus, when confounded studies are removed and the data are analyzed separately for ProM proper and for vigilance, the current meta-analysis shows large age-related declines in ProM proper with focal cues and smaller but still large age-related declines in vigilance with focal cues. The present meta-analysis revealed that although the age-related declines in vigilance were numerically smaller with focal vs. non-focal cues, this difference was not statistically significant due to the small size of the difference. In contrast, age-related declines in ProM proper were numerically smaller for the non-focal vs. focal cues, but the statistical comparison would not be meaningful due to the small number of studies that have assessed ProM proper with non-focal cues. Considering the small and inconsistent effects of focal vs. 
non-focal cues on the size of age-related declines, the focal vs. non-focal cue distinction is unlikely to explain the "perplexing pattern" (i.e., many studies finding age-related declines but some finding no age-related declines) [48]. In addition, it is important to note that smaller age-related declines with focal vs. non-focal cues are consistent with all theories of prospective memory and aging including Craik's [9,10] account, Maylor's [39] task appropriate processing account, Meier and Graf's [49] transfer appropriate processing account, and McDaniel and Einstein's multiprocess view [50], and thus, do not favor  [13,14] predicts no age-related declines with focal cues. It has been argued, however, that to study age differences in prospective memory properly one ought to make the ongoing task easier for older vs. younger adults. Einstein and McDaniel [11] explained that they made their ongoing task easier for older vs. younger adults because "this [making word lists shorter for older vs. younger adults] equated functional difficulty" of the ongoing task. Similarly, Kvavilashvili et al. [37] explained: "in order to properly assess age effects on prospective memory it is necessary to ensure that both age groups have equal amounts of attentional resources available for the execution of prospective memory task." Thus, one may argue that the current meta-analysis actually supports McDaniel and Einstein's multiprocess view because the studies that confounded age with the ease of the ongoing task and/or verbal intelligence actually resulted in no statistically significant age-related declines (barely missing the conventional alpha = 0.05).
As discussed by Uttl [2], however, this line of reasoning is specious. First, functionally equating ongoing task difficulty or demands appears difficult, if not impossible, by simply making the ongoing task easier for older adults in some arbitrary way. To illustrate, Einstein and McDaniel [11] shortened word lists on their working memory task for older vs. younger adults by one item to make them equally difficult, but they did not succeed: older adults actually significantly outperformed younger adults. Second, ensuring that both age groups have "equal amounts of attentional resources available [left over] for the execution of prospective memory task" is even more daunting without some accurate measure of the left-over resources. Equal performance on the ongoing task does not mean that the amount of left-over resources is the same for younger and older adults, and arbitrarily making the ongoing task easier for older vs. younger adults is unlikely to achieve this objective. Third, younger and older adults need not use the same resource pools to achieve the same level of performance, rendering the entire exercise focused on a single resource pool rather superfluous. Fourth, any attempt to equate the "functional difficulty" of an ongoing task is likely to have limited ecological validity and real-life relevance. To illustrate, imagine some younger and older adults, all of whom have made plans to buy groceries en route home, traveling by the same bus from work (a university) past the supermarket (ProM cue) to their homes in the city center. Slowing down the ongoing task for older vs. younger adults would be equivalent to slowing down the progress (in time and space) of the seats occupied by older adults relative to the progress of the seats occupied by younger adults along the bus route while keeping all the seats on the same bus. Presently, this seems impossible. 
Fifth, the current meta-analysis shows that for ProM proper with focal cues, age-related declines are large even though the ongoing task demands in most of these studies were zero, as these studies did not include any resource-demanding ongoing tasks. For example, participants were listening to an experimenter saying "this is the end of the task" (ProM cue), doing nothing else, having all of their resources available to them, and yet large age-related declines emerged [21].
Accordingly, smaller age-related declines in confounded studies favoring older adults should not be interpreted as showing "no age-related declines with focal cues". A more appropriate description of these findings is: "If the ongoing task is made much easier for older vs. younger adults and/or if older adults are much smarter than younger adults, then age decline in ProM with focal cues is reduced." Indeed, these findings parallel those found with retrospective memory. For example, performance on recall tests declines substantially when attentional resources are divided at retrieval (e.g., [51]) and, thus, one can easily eliminate age-related declines in recall by dividing attention for younger adults more than for older adults at retrieval (this confounding would be appropriate because it would "equalize" the resources available to younger and older adults for retrieval of previously learned words, following Kvavilashvili et al.'s [37] reasoning and applying it to retrospective memory age-related declines). Similarly, given the moderately strong correlations between recall and verbal intelligence (e.g., [26,52]), one can easily eliminate age-related declines in recall by comparing less intelligent younger adults with more intelligent older adults.
One could argue that the current study's findings depend on accurate classification of ProM cues as focal vs. non-focal based on McDaniel and Einstein's description of the characteristics of these cues, and that McDaniel and Einstein would classify the cues differently. The analyses of inter-rater agreement between the cue classification in the current meta-analysis and the classification of cues by McDaniel and Einstein themselves (reported in the method section) set aside these concerns: the inter-rater agreement was very high using both the percentage agreement and Krippendorff's alpha measures.
The present meta-analysis has several limitations. First, the high prevalence of ceiling effects in ProM studies has likely reduced the estimated effect sizes even though modeling and d probit were used for estimation. Second, the estimated effect sizes are limited by the low reliability of binary indexes of ProM used in primary studies [2,3]. This low reliability of ProM measurement is also expected to reduce estimated effect sizes. Third, the operational definition of ProM proper vs. vigilance used in this meta-analysis classified age contrasts as ProM proper if there was an intervening task or a delay between ProM instructions and the start of an ongoing task. However, it is possible that performance in some studies classified as ProM proper was more dependent on vigilance, as these studies used multiple cues. Once a participant responds to one of the cues, he or she may start monitoring for cues and performance may reflect primarily vigilance rather than ProM proper [2,20]. Fourth, many reports included in the meta-analysis did not provide any assessment of participants' verbal intelligence. Thus, it is possible that intelligence confounds were also present in some of the studies classified as not confounded by verbal intelligence in the present meta-analysis. 
Finally, to my knowledge, there have been no longitudinal studies of age changes in ProM to date that would verify decline in memory prospectively, and all studies to date used cross-sectional designs. It is possible, although unlikely, that the pattern of findings could be different in longitudinal studies.
Figure 5. Age-related declines in ProM with focal vs. non-focal cues, for confound-free age contrasts only. Large age-related declines are readily apparent in all of the conditions where sufficient data are available: ProM proper with focal cues, vigilance/monitoring with focal cues, and vigilance/monitoring with non-focal cues. Moreover, for focal cues, age-related declines are much larger on ProM proper than on vigilance/monitoring, d difference = 0.40 with bootstrap 95% CI = (0.14, 0.68), and for vigilance/monitoring, age-related declines are only numerically larger with non-focal cues than with focal cues, d difference = 0.05 with bootstrap 95% CI = (-0.17, 0.27). doi:10.1371/journal.pone.0016618.g005
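The bootstrap confidence intervals reported for the d differences in Figure 5 can be illustrated with a percentile bootstrap. The sketch below is a simplified stand-in: it resamples age contrasts with replacement and uses the difference between simple group means, whereas the estimates in the figure come from model fitting. The function name, the per-contrast effect sizes, and the use of simple means are all assumptions made for illustration.

```python
import random


def bootstrap_diff_ci(group_a, group_b, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample age contrasts with replacement within each group.
        a = rng.choices(group_a, k=len(group_a))
        b = rng.choices(group_b, k=len(group_b))
        diffs.append(sum(a) / len(a) - sum(b) / len(b))
    diffs.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical per-contrast effect sizes (not data from the meta-analysis).
vigilance_focal = [-0.4, -0.7, -0.5, -0.6, -0.8, -0.3]
prom_proper_focal = [-1.2, -0.9, -1.1, -0.8, -1.3, -1.0]
print(bootstrap_diff_ci(vigilance_focal, prom_proper_focal))
```

An interval that excludes zero, such as the one reported for ProM proper vs. vigilance with focal cues, indicates a reliable difference; an interval that spans zero, such as the one for focal vs. non-focal cues within vigilance, does not.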

Conclusions
The current meta-analysis represents a substantial advancement over the previous meta-analyses of age-related declines in ProM with focal vs. non-focal cues. First, the present findings are supported by three meta-analytic approaches: graphical model fitting methods, robust count methods, and the more traditional meta-analysis based on d probit effect size indexes. Second, the present meta-analysis is more comprehensive, doubling to tripling the number of included studies relative to the previous meta-analyses by McDaniel and Einstein [14], Uttl [2] (a formal meta-analysis of McDaniel and Einstein's meta-analysis of age declines with focal vs. non-focal cues), and Kliegel et al. [25]. Third, the current meta-analysis did not combine non-confounded with confounded studies, nor ProM proper with vigilance studies, but rather analyzed them separately, which is necessary if one wishes to learn about age-related declines in ProM proper vs. vigilance, rather than age-related declines in a particular blend of ProM proper and vigilance, and non-confounded and confounded studies.
Lastly, this study highlights that age-related declines in ProM with focal cues are large, that even age-related declines in ProM with focal cues can vary across ProM subdomains with large age-related declines in ProM proper and smaller but substantial declines in vigilance, and that age-related declines in ProM proper with focal cues are as large as or even larger than age-related declines in retrospective memory. In turn, these results are consistent with Craik's [9,10] proposal that age-related declines on ProM tasks are generally large, as large as age-related declines in recall measures of retrospective memory, and vary with the degree of environmental support (i.e., larger on ProM proper vs. vigilance/monitoring). The results support the distinction between ProM proper vs. vigilance/monitoring (see Brandimonte [53], Graf and Uttl [1], Uttl [2,3]); they highlight the need for authors to explicitly and openly distinguish between ProM proper and vigilance/monitoring, rather than requiring the reader to pore over the method section with a fine-toothed comb to find out whether a particular study investigated vigilance/monitoring or ProM proper (e.g., [6][7][8]). The results directly contradict Einstein, McDaniel, and their colleagues' claims that ProM with focal cues is spared by aging [13,14,54]. Finally, the results strongly suggest that the distinction between ProM proper vs. vigilance/monitoring, age confounds, ceiling effects, and low statistical power are responsible for what some have called a "perplexing pattern" (lack of age-related declines in some studies vs. strong age-related declines in other studies) of age-related declines [48]. The "perplexing pattern" is not perplexing at all; it is due to methodological problems and conceptual confusions that have plagued ProM research.
Table 12 note. De = decline, Eq = equal, Im = improvement; k = number of age contrasts; N = total number of individuals; p = binomial test p; Insuf. = insufficient data (fewer than 6 age contrasts); *Independent outcomes; I² = inconsistency index. doi:10.1371/journal.pone.0016618.t012