Involving stakeholders in the design of ecological momentary assessment research: An example from smoking cessation

Peter D. Soyster; Aaron J. Fisher

doi:10.1371/journal.pone.0217150

Abstract

Ecological momentary assessment (EMA) is a data collection method that involves repeated sampling of participants’ real-time experience and behavior as they unfold in context. A primary challenge in EMA research is to design surveys that adequately assess constructs of interest while minimizing participant burden. To achieve this balance, researchers must make decisions regarding which constructs should be included and how those constructs should be assessed. To date, a dearth of direction exists for how to best design and carry out EMA studies. The lack of guidelines renders it difficult to systematically compare findings across EMA studies. Study design decisions may be improved by including input from potential research participants (stakeholders). The goal of the present paper is to introduce a general approach for including stakeholders in the development of EMA research design. Rather than suggesting rigid prescriptive guidelines (e.g., the correct number of survey items), we present a systematic and reproducible process through which extant research and stakeholder experience can be leveraged to make design decisions. To that end, we report methods and results for a series of focus group discussions with current tobacco users that were conducted to inform the design of an EMA study aimed at identifying person-specific mechanisms driving tobacco use. We conclude by providing recommendations for item-selection procedures in EMA studies.

Citation: Soyster PD, Fisher AJ (2019) Involving stakeholders in the design of ecological momentary assessment research: An example from smoking cessation. PLoS ONE 14(5): e0217150. https://doi.org/10.1371/journal.pone.0217150

Editor: Adam T. Perzynski, The MetroHealth System and Case Western Reserve University, UNITED STATES

Received: November 1, 2018; Accepted: May 6, 2019; Published: May 22, 2019

Copyright: © 2019 Soyster, Fisher. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant study data are available through the Open Science Framework at: https://osf.io/wch9m/. DOI 10.17605/OSF.IO/WCH9M.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Ecological momentary assessment (EMA) is a data collection method that involves repeated sampling of participants’ real-time experience and behavior as they unfold during daily life [1]. In contrast to other longitudinal data collection methods (e.g., longitudinal panel study), EMA typically results in many observations per participant over a relatively short period of time—often a few days to several weeks. Rather than assessing global retrospective self-reports, which can be limited by recall bias, EMA captures participants’ self-reported experiences and behaviors in real-time, in natural contexts. EMA studies routinely yield anywhere from dozens to hundreds of observations per participant (frequently referred to as intensive longitudinal data or intensive repeated measures [2]). The large number of observations per participant allows for the statistical examination of within-person processes as they unfold over time. EMA methods have been employed to study a wide range of behavioral and emotional phenomena, including emotional avoidance [3], substance use [4], and nutrition behaviors [5]. In practice, participants in EMA studies provide in-the-moment reports of their experience and behavior using a paper and pencil diary, responses to text messages, a smartphone app, or wearable device, as they go about their daily lives. These data can be quantitative (e.g., numerical ratings of mood) or qualitative (e.g., free-response text), and are collected actively through self-report or passively through mobile phone or wearable sensors (e.g., fitness tracking watch).

There are three sampling paradigms used in EMA studies: event-contingent (participants initiate a response when a predefined event has occurred), signal-contingent (participants are prompted at random times to get a representative sample of their experiences), and time-contingent (participants are prompted on a fixed time schedule) [6]. Many studies utilize more than one type of sampling process (e.g., signal-contingent sampling 4 times per day, with participants also initiating a report if an event of interest occurred between random surveys) [1,4]. Among published EMA studies, sampling frequency varies widely, from once per day to as many as 60 times per day [7]. The sampling period (the number of days or weeks that an individual participates in EMA) shows similar variability across studies.
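
As a rough illustration of how the time-contingent and signal-contingent paradigms differ in practice, the following Python sketch generates one day of prompt times under each. The function names, the 09:00 to 21:00 waking window, the date, and the choice to draw one random prompt from each equal block of the day are assumptions made for illustration; they are not taken from any study cited here.

```python
import random
from datetime import datetime


def time_contingent(start, end, n_prompts):
    """Return n_prompts evenly spaced prompt times strictly between start and end."""
    step = (end - start) / (n_prompts + 1)
    return [start + step * (i + 1) for i in range(n_prompts)]


def signal_contingent(start, end, n_prompts, rng):
    """Return n_prompts random prompt times, one drawn from each equal block.

    Splitting the window into blocks keeps prompts unpredictable while
    preventing them from clustering at one time of day (an assumed design).
    """
    block = (end - start) / n_prompts
    return [start + block * i + block * rng.random() for i in range(n_prompts)]


day_start = datetime(2017, 7, 1, 9, 0)   # hypothetical waking window: 09:00
day_end = datetime(2017, 7, 1, 21, 0)    # to 21:00
rng = random.Random(0)                   # seeded so the sketch is repeatable

fixed = time_contingent(day_start, day_end, 4)
randomized = signal_contingent(day_start, day_end, 4, rng)
```

Event-contingent reports are not scheduled at all, because the participant initiates them, so they have no counterpart in this sketch.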

In psychological and behavioral health research, EMA has traditionally been used to collect high-granularity data on within-person processes (e.g., the moment-to-moment relationship between stress and alcohol craving for Participant X). These data are collapsed across participants to understand generalizable patterns of, or between-person differences in, these within-person processes (e.g., the average moment-to-moment relationship between stress and alcohol craving for Population X). In early applications of EMA design, intensive longitudinal data were often substantially reduced in granularity by averaging responses across days [8] or weeks [9]. Fortunately, as the number of researchers using EMA designs has increased [2], the statistical methods used for analyzing such data have become more sophisticated [10], allowing for more detailed analysis of within-person processes. Recently, EMA has begun to be used for person-specific (idiographic) research [11, 12]. Such approaches utilize EMA and an accompanying suite of specialized statistical analyses (e.g., [13, 14]) to understand psychological phenomena as they exist within a specific person. In a person-specific approach, EMA surveys can measure a set of possible mechanisms leading to outcomes of interest (e.g., anxiety, substance use), and statistical analyses can identify which mechanisms appear to be relevant for a given person. These types of analyses can be completed pre-treatment and used to guide treatment selection and personalization [15].

Despite the variety of applications of EMA, there are currently no clear guidelines regarding how best to construct EMA studies. Some authors have characterized common elements of EMA studies [4] but often suggest a ‘common sense’ approach wherein prospective researchers use their intuition or best judgment to design surveys they think will best answer their research question. Thus, EMA surveys are highly variable in terms of the types of questions included, the number of assessments per day, and the length of the sampling period [16]. Flexibility in EMA design is desirable because it allows researchers to collect data in a way that fits the goals of a given study. However, as EMA studies become increasingly common, methods are needed to guide best practices in designing assessment protocols that balance idiosyncrasy and flexibility (for meeting individual study and participant needs) with consistency and standardization (for comparison of results across studies).

The question of how best to select EMA survey items is an open and important one. For many outcomes and predictors of interest, extant theoretical and empirical work may provide a valuable resource from which to select potential constructs or items for intensive repeated measurement. However, for a variety of reasons–particularly participant burden–it is impractical to assess the complete set of possible variables. Large batteries of self-report measures are not practical and are unlikely to be tolerable for research participants under conditions of intensive repeated measurement [17]. Thus, there is an impetus for researchers to select a restricted set of items that provide adequate coverage for constructs that may be otherwise best-measured by larger batteries.

There are several techniques EMA researchers could employ to select items from larger validated retrospective measures. For instance, researchers could select a subset of items that have the highest factor loading on the construct of interest. While intuitive, this approach may not be optimal as it neglects that cross-sectional and in-the-moment (i.e. time-varying) measures are inherently different and, therefore, can yield substantially different results [18]. Estimates may fail to generalize across time, individual persons, or both [19]. Items drawn from a validated cross-sectional measure, but one that has not been shown to generalize across time, would likely not be useful in the context of EMA. Additionally, latent factor indicators are assumed to be conditionally independent and interchangeable, rather than representing unique sources of variance. Such a position is contrary to recent network theories of measurement [20].

Another approach is to generate a single item to assess each construct. As has been reported elsewhere [21], there is mixed evidence about the effectiveness of single-item measures in cross-sectional research. In some studies, single items have been shown to perform as well as multi-item scales for concrete constructs [22, 23]; others have concluded that multi-item measures are more valid [24]. In EMA research, participants often report on unidimensional constructs in terms of their current experience [16] (e.g., intensity of fatigue). In these cases, a single well-chosen item has been shown to be sufficient [21, 25].

Finally, researchers could simply rely on the precedent set by previous EMA studies. The recent proliferation of EMA studies has resulted in a large body of work from which researchers can select items. However, due to the large variability in study designs described above, there is unlikely to be a clearly superior choice. Additionally, while extant studies report the items used in the investigation, detailed explanation of the rationale for selecting included items is not routinely reported.

As researchers work to select a face-valid, parsimonious, and comprehensive set of items, difficult decisions must be made regarding which constructs should be included and how those constructs should be assessed. We propose that engagement with potential research participants (stakeholders) will greatly inform and strengthen these choices [26].

Engaging stakeholders in research design

In 2008, the National Institute of Mental Health (NIMH) first published the Strategic Plan for Research, outlining four strategies to accelerate progress in both basic and clinical science [27]. Strategy 4.3 highlighted the need to involve stakeholders in all aspects of the research pipeline in order to increase the effectiveness of mental health interventions. Above and beyond the many benefits to including stakeholders in the research process [28], there are several benefits that are particularly relevant for EMA designs. Stakeholders can provide invaluable assistance to researchers in generating and narrowing down lists of items that could be included in EMA surveys to capture mechanisms related to the outcomes of interest. Additionally, stakeholders can provide important information–prior to expensive pilot testing–regarding the feasibility and acceptability of the proposed EMA assessment strategy within a given population.

Focus groups with stakeholders may be particularly helpful during the design phase of EMA research [29]. Focus groups are structured or semi-structured conversations conducted to gather stakeholder insights on shared or individual perspectives around specific topics. Advantages of focus groups include their ability to explore stakeholders’ individual knowledge and experiences, while simultaneously capitalizing on group dynamics and interpersonal exchanges among participants [30]. Thus, in addition to developing standardized methods for EMA survey construction, researchers should endeavor to include stakeholders in the early stages of design to improve the relevance and acceptability of EMA research.

The goal of the present paper is to introduce a general approach for including stakeholders in the development of EMA research design. To that end, we report methods and results from a series of focus group discussions with current tobacco users that were conducted to inform the design of an idiographic study aimed at identifying person-specific mechanisms driving tobacco use. We conclude by providing recommendations, based on our experiences, for item-selection procedures in future EMA studies.

An example from smoking cessation

Tobacco use is the leading cause of preventable death in the United States, resulting in an estimated 480,000 deaths per year [31]. Beyond early mortality, the consequences of tobacco use are wide-ranging, extending to physical health, mental health, and even the ability to find employment–particularly for those from vulnerable populations (e.g., those with mental illness, LGBTQ+) [31–34]. In light of the mortality and impairment associated with tobacco use, reducing the number of people who smoke is a major goal in behavioral and public health. There is a dire need for effective and affordable tobacco treatments (TT) to reduce use.

Decades of tobacco research have yielded insights into what mechanisms should be targeted in evidence-based TT [34]. Although many interventions have demonstrated effectiveness in clinical trials, the majority of people who receive TT fail to quit [35]. One potential area for improving TT outcomes is personalization: modifying the content and presentation of interventions to make them more relevant for a given person or population. Existing research is mixed regarding the effects of specific mechanisms on tobacco use (e.g., the relation between negative affect and subsequent tobacco use). As a result, there has recently been a call for novel assessment approaches that would yield person-specific information about mechanisms driving tobacco use, which would allow for the personalization of TT [36, 37].

Given that existing research is equivocal regarding which variables are most related to subsequent tobacco use, the present study utilized an approach to examine the experiences of stakeholders regarding what they believe drives their momentary choices to smoke or not. Specifically, we aimed to (1) review existing research in order to develop a list of constructs that have been empirically or theoretically linked to tobacco use; (2) examine current tobacco users’ thoughts and attitudes about the relevance of these constructs to their own smoking behavior; and (3) synthesize these information sources to develop survey items to be used in a future EMA study of tobacco use.

Methods

All study procedures were approved by the University of California, Berkeley Committee for Protection of Human Subjects and all participants provided informed consent prior to participation.

Aim 1

To address the first aim, we sought to compile a relatively short list of constructs for potential inclusion in a future EMA study of tobacco use, to be refined through the second and third aims. Specifically, we were interested in mechanisms relating to increased or decreased likelihood of current tobacco users choosing to smoke at a given point in time—rather than mechanisms related to the initiation of smoking in non-users. As the ultimate goal of the future EMA study was to identify personalized smoking cessation treatment targets, we were specifically interested in identifying constructs that (1) had a high probability of being relevant to current tobacco users, (2) would be likely to vary in level or intensity within individuals over the course of a day, and (3) were potentially modifiable through a known evidence-based intervention. We also aimed to understand the typical wording, response format, and sampling frequency used to measure these constructs in other published, peer-reviewed EMA studies of tobacco use.

To those ends, we conducted a systematized review [38] of literature regarding within-person mechanisms relating to tobacco use. Due to the applied nature of this work, we elected to forgo a formal systematic review, and instead conducted a comprehensive review of previously published studies without incorporating strict metrics for analyzing the quality of each study.

We began by utilizing a social cognitive theory [39] framework to guide our search for constructs related to health behaviors in general, expanding our search to include studies of tobacco-specific constructs. A trained research assistant searched the Google Scholar, PsycINFO, and PubMed databases to identify studies for possible inclusion. Search terms and phrases were combined and reflected a focus on tobacco use (tobacco, smoking, cigarette, and nicotine), the employed methodology (ecological momentary assessment and experience sampling methods), and constructs included in the EMA surveys (within-individual, mechanisms, social cognitive theory, self-efficacy, locus of control, expectancy, health beliefs, craving, withdrawal, motivation to quit, stages of change, negative affect, positive affect, stress, and impulsivity). No restriction was placed on the year of publication. Searches were restricted to English language, peer-reviewed studies. Databases were searched in May of 2017. For each combination of search terms, all indexed article titles were reviewed, and potentially relevant articles (i.e., those relating to within-person processes driving tobacco use) were retained.

After de-duplication, these search procedures resulted in an initial list of 279 unique articles. Two researchers then independently reviewed the abstracts for these articles and manually excluded papers that (1) did not include a focus on tobacco use, (2) did not include any application of EMA methods (reviews of EMA studies were retained), (3) only utilized EMA during or after a quit-smoking attempt, or (4) were secondary analyses of data presented in an already included study. Interrater reliability was high among the reviewers (92% agreement) and all disagreements were discussed and settled before thematic analysis. In total, 52 studies were retained.
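
The percent-agreement statistic reported above can be computed as in the following minimal sketch. The function name and the ten include/exclude decisions are hypothetical and serve only to show the calculation; they are not the study's screening data.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical include (1) / exclude (0) decisions for ten abstracts.
reviewer_1 = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
reviewer_2 = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]

agreement = percent_agreement(reviewer_1, reviewer_2)

# Indices of abstracts the reviewers would need to discuss and settle.
disagreements = [i for i, (a, b) in enumerate(zip(reviewer_1, reviewer_2)) if a != b]
```

Listing the disagreeing items alongside the summary statistic mirrors the procedure described above, in which every disagreement was discussed before analysis continued.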

Thematic analysis.

After identification, a trained research assistant reviewed the full text of each study to extract all reported variables included in the EMA surveys, the wording and response format for each item, item sampling rate, and the length of the sampling period (Table 1). The first author reviewed a random sample of 10 articles to confirm the validity of data extracted by the research assistant. No conflicting variable extraction was evidenced during this checking step.

Table 1. Frequencies of reviewed study characteristics.

https://doi.org/10.1371/journal.pone.0217150.t001

The authors then collaboratively thematically analyzed and coded the variables to reduce the data into meaningful discrete construct categories. There was a large degree of overlap in the definitions and component parts of purportedly distinct constructs in psychology [40]. For instance, having difficulty concentrating was found to be a shared indicator for depressive episodes, nicotine withdrawal, and impulsivity [41, 42]. To validly match potential constructs to specific evidence-based interventions (a criterion for construct inclusion in the present study), it was important to thematically analyze and classify the constructs used in previous studies in a way that resulted in minimal construct overlap. Initial construct categories were based on the descriptions provided in the reviewed studies (i.e., what the authors stated they were attempting to measure). This classification was done deductively; we utilized our knowledge of relevant psychological theory (e.g., happy, positive, and enthusiastic were coded as positive emotions [43]) to determine construct classification. Because these construct categories were identified collaboratively by the two authors, and no new constructs were proposed (i.e., we did not attempt to codify a previously unknown construct), we did not formally assess trustworthiness or reliability.

From these constructs, we then generated a list of potential survey items—at least one for each construct—to be considered for inclusion in the final study (Table 2). In the case that there was majority agreement in how a construct was assessed among the reviewed studies, the most prominent item wording was used. In the case that there was no consensus among the reviewed studies, the wording of individual items was either pulled directly from an established measure of the construct (e.g., nicotine withdrawal [42]), or a new item was generated by the authors.

Table 2. EMA survey items presented to focus groups and final items chosen for study inclusion.

https://doi.org/10.1371/journal.pone.0217150.t002

Finally, the sampling paradigm (e.g., event-, signal-, or time-contingent), sampling frequency, response format, and length of sampling period were reviewed, and the frequency of each approach was tabulated. When relevant, we weighted our proposed design choices to be consistent with the more frequently used design features. However, as the future EMA study intended to employ idiographic time series analyses, a relatively large number (>100) of roughly evenly spaced observations was required for each participant. This meant that in some cases, frequently used design methods (e.g., event-contingent sampling, only assessing constructs once per day) were deemed to be incompatible with the goals of the future EMA study.
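
The constraint of more than 100 observations per participant translates directly into a minimum sampling period for any given prompt frequency. The sketch below works through that arithmetic. The function name and the expected_compliance parameter (the assumed share of prompts a participant actually answers) are illustrative assumptions and are not part of the design described here.

```python
import math


def days_needed(target_observations, prompts_per_day, expected_compliance=1.0):
    """Smallest whole number of days expected to yield MORE than the target count.

    expected_compliance is a hypothetical planning parameter: the assumed
    proportion of prompts that a participant completes.
    """
    per_day = prompts_per_day * expected_compliance
    return math.floor(target_observations / per_day) + 1


once_daily = days_needed(100, 1)              # one survey per day
four_daily = days_needed(100, 4)              # four surveys per day, all answered
four_daily_80 = days_needed(100, 4, 0.8)      # four per day, 80% assumed answered
```

Under these assumptions, a once-daily design would need 101 days to exceed 100 observations, whereas four prompts per day would need 26 days, which illustrates why once-daily assessment was judged incompatible with the planned analyses.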

Aim 2

To address our second aim of understanding tobacco users’ attitudes about constructs relevant to their smoking behavior, we conducted three focus groups with current tobacco users to examine their thoughts and attitudes about the relevance of the items generated through Aim 1 to their own smoking behavior. Additionally, we sought participants’ input regarding additional constructs/items not identified through Aim 1. One group included participants from an undergraduate research participation pool, and two groups were composed of participants recruited from the larger San Francisco Bay Area community.

Participants.

Participants (N = 19) were adults (Mean age = 33.6, SD = 14.5, range = 19–60) who self-identified as current tobacco users. Participants were drawn from communities from which we intended to recruit participants for a future study of person-specific factors driving tobacco use. Across the three focus groups, 11 participants (57%) identified as female, 8 identified as male. The groups were diverse with respect to race/ethnicity (32% white, 26% Black or African American, 26% Asian/Pacific Islander, 16% mixed or ‘other’) and sexual orientation (68% heterosexual, 5% homosexual, 16% bisexual/queer, 11% unsure or prefer not to disclose). Participants smoked an average of 7.6 cigarettes per day (SD = 6.61, range = 1–27) for an average of 16.2 years (SD = 16.2, range = 1–52). Eighty-four percent indicated that they were seriously thinking about quitting smoking in the next six months, with 47% indicating they were thinking about quitting in the next 30 days.

Procedure.

Potential participants were recruited from advertisements on Craigslist, flyers posted in the community, and through an undergraduate research participation pool. Interested individuals were directed to an online screening survey designed to confirm eligibility for study inclusion. To be included in the study, participants were required to be adults with English-language proficiency, to report having smoked 100 or more cigarettes in their lifetime [44], and to report currently smoking ≥ 1 cigarette per week. Neither desire nor motivation to quit smoking was required to participate.

Eligible participants were invited to our lab at the University of California, Berkeley for enrollment and participation in a focus group. After completing consent procedures, participants’ self-reported smoking status was biochemically verified using exhaled carbon monoxide (M = 11.1 ppm, SD = 7.8 ppm, range = 2–30 ppm) [45]. Before the focus groups began, participants completed a computer-based survey, which assessed participant demographics, current and past tobacco use, and a variety of psychological and emotional variables that have been linked to differential patterns of tobacco use (e.g., personality facets were assessed using the NEO Five Factor Inventory [46]; past-week anxiety was assessed using the Overall Anxiety Severity and Impairment Scale [47]). In addition to providing descriptive information about the sample of focus group participants, these measures were included to understand the length of time participants required to complete the baseline battery of measures we planned to include in the future EMA study. We recruited 7 participants for the undergraduate focus group, and a total of 12 participants for the two community member focus groups. Participants who completed the focus group were reimbursed with $25 or partial course credit.

Focus groups were conducted between July and August of 2017 and had an average duration of 60 minutes. The groups were facilitated by one of two graduate student researchers with training and experience in conducting qualitative research (F1: white, male, PhD student; F2: white, female, PhD student), with assistance from two research assistants. At the beginning of the focus groups, the facilitator explained how EMA research is conducted, and outlined the goals of the planned tobacco EMA study with the following statement:

“The goal of this conversation is to share a list of factors that we think might cause people to smoke more or less throughout the day. We hope that you can tell us if you agree or disagree with our list, and let us know about any factors you think are missing from our list. We also hope that through this discussion you can share your experiences with smoking so that we make our final list of daily survey questions as relevant as possible”.

Participants were provided a written list of 39 potential EMA items and were asked to indicate whether or not they thought each item related to their smoking (0 = not related, 1 = related). The group then discussed their reasons for marking items as related or unrelated to their smoking. For any item marked as related to smoking, participants could indicate if that item was particularly strongly related to their smoking (i.e., the item is very related). Participants were encouraged to focus their responses as to whether or not an item applied specifically to them, not whether they believed it may be important for others. After this initial discussion, facilitators used a semi-structured interview guide to ask open-ended questions, covering (1) likes and dislikes about smoking, (2) causes of smoking/refraining from smoking, (3) thoughts and feelings that occur immediately before and after smoking, (4) experiences of cigarette craving, and (5) experiences with past quit-attempts.
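
The binary ratings described above lend themselves to a simple tabulation of how many participants endorsed each item. The following is a minimal sketch of that tabulation; the participant identifiers, item wordings, and ratings are invented for illustration and do not come from the study's item list or data.

```python
from collections import Counter

# Hypothetical ratings: participant id -> {item wording: 0 (not related) or 1 (related)}.
ratings = {
    "P01": {"I feel stressed": 1, "I am craving a cigarette": 1, "I feel bored": 0},
    "P02": {"I feel stressed": 0, "I am craving a cigarette": 1, "I feel bored": 1},
    "P03": {"I feel stressed": 1, "I am craving a cigarette": 1, "I feel bored": 0},
}


def endorsement_rates(ratings):
    """Return the proportion of participants who marked each item as related."""
    counts = Counter()
    for marks in ratings.values():
        for item, related in marks.items():
            counts[item] += related
    n_participants = len(ratings)
    return {item: counts[item] / n_participants for item in counts}


rates = endorsement_rates(ratings)

# Items ordered from most to least frequently endorsed.
ranked = sorted(rates, key=rates.get, reverse=True)
```

A tabulation of this kind summarizes only the written ratings; the reasons participants gave during discussion require the qualitative analysis described in the next section.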

Directed content analysis.

The focus groups were audio-recorded. One research assistant transcribed the audio recordings verbatim, and another checked the transcriptions for accuracy. After transcription, the group facilitators and two research assistants independently analyzed the transcripts using directed content analysis [48]. The goal of directed content analysis is to analyze new data (e.g., focus group transcripts) to validate or extend an extant theory. The present study did not specifically seek to validate the existence of the constructs identified through the first aim, as this work had already been completed by the reviewed studies. Instead, we sought to validate the assumption that each of the proposed items would be relevant for some focus group participants but not others. While we did not have specific hypotheses regarding which items would be relevant to which participants, we used a deductive approach to extend our understanding of which constructs were relevant for the largest proportion of participants (through tabulating frequencies for the presence of a code), and to understand more about the function of each construct in relation to smoking. The initial list of codes consisted of the constructs identified through Aim 1 and was used to classify the content of participant statements. These constructs were compared among the coders, and through an iterative process, were used to extract sub-themes relevant to increased or decreased smoking. After determining the final list of construct codes, two research assistants coded the transcripts and tabulated frequencies for the presence of each code (Table 1). Intercoder agreement was high across each of the three transcripts (community 1 = 94%; community 2 = 92%; college = 95%). All coding disagreements were reviewed among the coding research assistants and first author, and a consensus vote was used to make a final coding determination.

Given the process-focused nature of EMA research, the coding team utilized a functional analysis approach [49] to identify constructs that were (1) antecedents of smoking behavior choices, (2) the behavior prompted by those antecedents (i.e. smoking or not smoking), and (3) the consequences of choosing whether or not to smoke. Put another way, this content analysis was not intended to comment on or fine-tune the definition of a given construct (e.g., craving); instead, we assessed whether participants felt craving was relevant to their smoking. This was the principal interest—to identify the most common and influential predictors of smoking. Additionally, we sought to identify potential mechanistic pathways through which craving led to increased or decreased smoking, and the consequences of those actions (i.e. the antecedents and consequences of craving).

Aim 3

To synthesize the data collected through the previous aims, our research group held several meetings to review the data. Through these discussions, we developed a list of EMA survey items, integrating feedback from stakeholders and literature review. The goal of these discussions was to determine which constructs should be included (e.g., negative emotions) and the number and wording of survey items used to measure those constructs (e.g., I feel sad).

Construct and item selection.

We started by defining a set of decision rules for determining which constructs should be assessed in the EMA survey (Fig 1). These decision rules were designed to reduce the list of constructs identified through the previous aims to those that (1) had a high probability of being relevant to current tobacco users, (2) would be likely to vary in level or intensity within individuals over the course of a day, and (3) were potentially modifiable through a known evidence-based intervention. After a list of constructs was codified, a similar process (Fig 2) was used to determine the number of items needed to measure each construct, and the specific wording of each item. In the case that our predetermined decision rules were insufficient to determine inclusion/exclusion status, a ruling was made by consensus vote. Similarly, if two items measuring the same construct had equal support for inclusion, consensus vote was used to select one item for inclusion.
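
The three criteria stated above can be expressed as a simple filter, sketched below. This is not a reproduction of the decision rules in Fig 1: the class, the field names, the 0.5 relevance cutoff, and the example constructs with their values are all hypothetical, and the consensus-vote fallback for undecided cases is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Construct:
    name: str
    share_endorsing: float    # proportion of focus group participants rating it as related
    varies_within_day: bool   # criterion 2: level or intensity changes over a day
    modifiable: bool          # criterion 3: targetable by a known evidence-based intervention


# Hypothetical cutoff for "high probability of being relevant"; the source states none.
RELEVANCE_THRESHOLD = 0.5


def include(construct, threshold=RELEVANCE_THRESHOLD):
    """Return True only if a construct meets all three stated criteria."""
    return (
        construct.share_endorsing >= threshold
        and construct.varies_within_day
        and construct.modifiable
    )


# Hypothetical candidates with invented values.
candidates = [
    Construct("craving", 0.95, True, True),
    Construct("negative emotions", 0.74, True, True),
    Construct("trait impulsivity", 0.60, False, True),
]

selected = [c.name for c in candidates if include(c)]
```

Requiring all three criteria at once means that a construct endorsed by most participants is still excluded if it is unlikely to vary within a day, as with the third hypothetical candidate here.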

Fig 1. Decision rules used to determine construct inclusion.

Conceptual diagram of the decision rules used to determine construct inclusion.

https://doi.org/10.1371/journal.pone.0217150.g001

Fig 2. Decision rules used to determine survey item inclusion.

Conceptual diagram of the decision rules used to determine survey item inclusion.

https://doi.org/10.1371/journal.pone.0217150.g002