Factors associated with research participation in a large primary care practice-based pediatric cohort: Results from the TARGet Kids! longitudinal cohort study

Background All longitudinal cohort studies strive for high participant retention, although attrition is common. Understanding determinants of attrition is important to inform and develop targeted strategies to improve study participation. We aimed to identify factors associated with research participation in a large children’s primary care cohort study.
Methods In this longitudinal cohort study between 2008 and 2020, all children who participated in the Applied Research Group for Kids (TARGet Kids!) were included. TARGet Kids! is a large primary care practice-based pediatric research network in Canada with ongoing data collection at well-child visits. Several sociodemographic, health, and study design factors were examined for their associations with research participation. The primary outcome was attendance of eligible research follow-up visits. The secondary outcome was time to withdrawal from the TARGet Kids! study. Generalized linear mixed effects models and Cox proportional hazard models were fitted. We have engaged parent partners in all stages of this study.
Results A total of 10,412 children with 62,655 total eligible research follow-up visits were included. Mean age at enrolment was 22 months, 52% were male, and 52% had mothers of European ethnicity. Overall, 68.4% of the participants attended at least 1 research follow-up visit. Since 2008, 6.4% of the participants have submitted a withdrawal request. Key factors associated with research participation included child age, ethnicity, maternal age, maternal education level, family income, parental employment, child diagnosis of chronic health conditions, certain study sites, and missingness in questionnaire data.
Conclusions Socioeconomic status, demographic factors, chronic conditions, and missingness in questionnaire data were associated with research participation in this large primary care practice-based cohort study of children. Results from this analysis and input from our parent partners suggested that retention strategies could include continued parent engagement, creating brand identity and communication tools, using multiple languages, and avoiding redundancy in the questionnaires.


Introduction
In epidemiological research, the longitudinal cohort study is a powerful research design and offers numerous advantages by following a group of individuals over a period of time [1]. However, attrition, or loss of participants over the course of a study, is an inevitable occurrence in all longitudinal cohort studies and may compromise the power and validity of study findings and introduce bias due to differential loss to follow-up [2]. In order to develop strategies to reduce loss to follow-up and improve participant retention, it is crucial to identify factors associated with research participation.
Numerous pediatric longitudinal cohorts have studied research participation using a descriptive approach by comparing the characteristics of participants who continued in the study and those not followed up [3][4][5][6][7][8][9]. Compared to children who remained involved in the study, those lost to follow-up were more likely to be older, male, and from ethnic minority groups, to have higher birthweights, to have younger and foreign-born mothers, and to have lower socioeconomic status, and were less likely to live together with both parents [3][4][5][6][7][8][9]. Several longitudinal cohort studies in children examined the determinants of attrition using more rigorous analytical methods and found that sociodemographic factors, including older child age, younger maternal age, low parental educational level, single parenthood, and parents born in a foreign country, were associated with higher attrition [10][11][12][13]. In addition, overweight or obesity in children, low well-being in children, and smoking and drinking in adolescents were associated with poor study participation [10,11,14]. Missingness (non-response items) at baseline has also been shown to be positively associated with loss to follow-up, suggesting that individuals who did not participate fully at baseline were less likely to participate in future follow-ups [10]. Among these cohorts examining attrition determinants, two were school-based cohorts [10,11,14], one was a birth cohort [15], and two cohorts focused on children at increased risk of medical conditions [12,13]. No prior studies have examined factors associated with participation in healthy children recruited from a primary care setting.
Launched in 2008, the Applied Research Group for Kids (TARGet Kids!) [16] is a large primary care practice-based research network for children in Canada which recruits families with children under 5 years from primary care practices and invites participants to complete questionnaires and collect physical measures at each participant's annual well-child visit.
Identifying factors associated with research participation in the TARGet Kids! cohort study would help to inform and develop targeted strategies to improve participant retention not only in TARGet Kids! but also in other healthcare-based cohort studies for children. Moreover, results from this study and our experience with participant retention during the past 12 years may provide valuable information to other researchers when designing and implementing future cohort studies. The objective of this paper was to determine sociodemographic, health, and study design factors associated with attendance of research follow-up visits and withdrawal from the TARGet Kids! longitudinal cohort study. We also aimed to summarize our current engagement strategies and outline opportunities for improvement. We hypothesized that various factors, including but not limited to low socioeconomic status, parents not born in Canada, and missing questionnaire data, would be associated with a lower level of research participation.

Study design and population
In this longitudinal cohort study, all children and their parents/caregivers participating in TARGet Kids! [16] enrolled between June 2008 and March 2020 (prior to the start of the COVID-19 pandemic) were included. TARGet Kids! (www.targetkids.ca) is a primary care practice-based research network for children in Canada, with over 100 primary healthcare providers from 15 large primary care practices participating across the Greater Toronto Area, Kingston, and Montreal [16]. The overall goal of TARGet Kids! is to improve the health of Canadians by optimizing growth and development through preventive interventions in early childhood. Since 2008, TARGet Kids! has been enrolling healthy children from primary healthcare practices and inviting them to participate at follow-up well-child visits through adolescence [16]. At each regularly scheduled well-child visit, parents/caregivers of participating children were invited by a mailed letter to participate in a research follow-up visit and to complete an age-specific standardized questionnaire adapted from the Canadian Community Health Survey [17] with questions on socio-demographic information, physical and mental health, health behaviours (e.g., nutrition, screen time, physical activity, and sleep), school and childcare arrangement, health services use, and developmental screening [16]. In addition to parent-reported questionnaire data, anthropometric data (child and parent height/length, weight, and waist circumference) were measured by a trained research assistant and families were invited to participate in non-fasting blood sample collection by a trained phlebotomist at each practice site [16]. Children were recruited at a well-child visit at any time between 0 and 5 years of age [16]. All parents were invited by mail to participate in data collection twice a year before age 2 years, and then every year until age 18 years [16]. 
Research assistants would then approach participants who attended their well-child visit on site and invite them to participate. Children with health conditions affecting growth, children with chronic health conditions at enrolment except for asthma, children with severe developmental delay, and families who were not able to communicate in English or French were excluded from the TARGet Kids! cohort study [16]. Written consent was obtained from parents/caregivers of all participating children. This study was approved by the Research Ethics Boards at The Hospital for Sick Children (#10000-12436) and St. Michael's Hospital (#17-335). The design of this study has been informed by input from our parent partners [18].
Non-time-varying factors included child sex, maternal and paternal ethnicity, maternal education, maternal age at enrolment, biological mother and father place of birth, child immigration status, language spoken most often at home, maternal and paternal employment status, child birthweight, and TARGet Kids! practice site at enrolment. Employment status was only collected at enrolment and thus it was treated as a non-time-varying factor. Certain TARGet Kids! sites had few participants or shorter involvement with TARGet Kids! and therefore were grouped with larger sites.
Data on time-varying factors were repeatedly collected at each visit, and data at the closest time point prior to each research follow-up visit were used, which represents the most recent information available to the research team. Time-varying factors included child age at each visit, self-reported annual family income, number of siblings, child living arrangement, dwelling type, food insecurity within the past year, child weight status, parent weight status, parent history of chronic health conditions, child history of chronic health conditions, and month of the well-child visit. Child age and month of the well-child visit were estimated based on the expected visit date in the case of non-visits. Each child's height and weight were measured at each attended visit by trained research assistants at each site using standardized protocols from the National Health and Nutrition Examination Survey (NHANES) [19]. Child weight was measured using an infant scale for children <2 years of age, and a calibrated precision digital scale for older children (Seca, Hamburg, Germany, www.seca.com/en_us/products/all-products.html). Child length was measured using a length board for children <2 years of age and standing height was measured using a stadiometer (Seca, Hamburg, Germany, www.seca.com/en_us/products/all-products.html). Height and weight of the parent who accompanied their child at the visit were also measured by trained research assistants. Body mass index (BMI) was calculated by dividing weight in kilograms by height (or length) in meters squared. Child BMI was age- and sex-standardized into zBMI scores using the World Health Organization Child Growth Standards [20], as recommended in Canada [21].
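As a minimal illustration of the BMI definition above, the following Python sketch computes BMI from weight and height (or length). The function name and the example measurements are hypothetical and are not taken from the study data; standardization into zBMI requires the WHO reference tables and is not attempted here.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height or length (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical child: 12.5 kg and 0.875 m (values chosen for illustration only).
example_bmi = bmi(12.5, 0.875)
print(f"BMI = {example_bmi:.1f} kg/m^2")
```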
All other time-varying factors were collected through repeated parent-reported questionnaires at each well-child visit. Food insecurity (Yes/No) was categorized as "No" if the parent answered "Never true" to both of the following questions, and as "Yes" otherwise: "Within the past 12 months we worried whether our food would run out before we got money to buy more" (Never true/Sometimes true/Often true); "Within the past 12 months the food we bought just didn't last and we didn't have money to get more" (Never true/Sometimes true/Often true). Parent history of chronic health conditions (Any/None) was categorized as "Any" if the mother or the father of the child had been diagnosed with at least one of the following conditions: multiple sclerosis, diabetes, osteoporosis, heart disease, hypertension, high cholesterol, cancer, asthma, depression or anxiety, stroke, alcohol or drug problems, attention deficit hyperactivity disorder (ADHD), autism spectrum disorder (ASD), learning disability, overweight or obesity. Child history of chronic health conditions (Any/None) was categorized as "Any" if the child had been diagnosed with at least one of the following conditions: ADHD, allergies, asthma, ASD, developmental delay, diabetes, eczema, learning problem, overweight, cancer, or inflammatory bowel disease.
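The categorization rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's actual code: the function names, the lower-cased condition labels, and the string matching on responses are all hypothetical, and unanswered items are not handled here.

```python
# Child conditions listed in the text (labels lower-cased for matching; an assumption).
CHILD_CONDITIONS = {
    "adhd", "allergies", "asthma", "asd", "developmental delay", "diabetes",
    "eczema", "learning problem", "overweight", "cancer",
    "inflammatory bowel disease",
}


def food_insecurity(worried_response: str, did_not_last_response: str) -> str:
    """Return "No" only if both questions were answered "Never true", else "Yes"."""
    answers = (worried_response, did_not_last_response)
    if all(a.strip().lower() == "never true" for a in answers):
        return "No"
    return "Yes"


def child_chronic_conditions(diagnoses: list[str]) -> str:
    """Return "Any" if at least one reported diagnosis is on the list, else "None"."""
    reported = {d.strip().lower() for d in diagnoses}
    return "Any" if reported & CHILD_CONDITIONS else "None"


print(food_insecurity("Never true", "Sometimes true"))
print(child_chronic_conditions(["Eczema"]))
```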
For time-varying factors, we used data at the closest time point before each research follow-up visit when examining their associations with study participation. When these factors were missing for the closest time point, data from the most recent earlier visit of the same participant were carried forward (forward filling). When no data were available to fill in, these factors were coded as a separate category "Missing" for non-response. Some of the factors we examined were not included in the early versions of the TARGet Kids! questionnaires. For example, annual family income was not asked in 2008; food insecurity was not asked before 2013; dwelling type was not asked between 2011 and 2013. For these questions, a separate category "Question not asked" was assigned, to differentiate them from missingness. These approaches were chosen over other methods, such as imputation, because the variables in the model represent the information that would be available to a research team at that point in time, and not necessarily the true values of those variables if they had been successfully measured.
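The handling of missing time-varying data described above can be sketched as follows. This is a hypothetical illustration, not the study's code: the function name and the record layout are assumptions, as is the precedence rule that "Question not asked" applies when the question was absent from the questionnaire at the closest prior time point.

```python
from typing import Optional, Sequence

MISSING = "Missing"
NOT_ASKED = "Question not asked"


def value_before_visit(prior_records: Sequence[Optional[str]]) -> str:
    """Resolve a factor's value for one eligible follow-up visit.

    prior_records holds the factor's entries from earlier visits, oldest first.
    Each entry is a response, None (non-response), or NOT_ASKED (the question
    was not on the questionnaire version in use at that time).
    """
    if not prior_records:
        return MISSING
    # Assumption: the "not asked" category is decided by the closest prior time point.
    if prior_records[-1] == NOT_ASKED:
        return NOT_ASKED
    # Forward filling: walk back to the most recent actual response.
    for record in reversed(prior_records):
        if record is not None and record != NOT_ASKED:
            return record
    return MISSING


print(value_before_visit(["$80,000 or more", None]))  # carried forward
print(value_before_visit([None, None]))               # no response to carry forward
```

Keeping "Missing" and "Question not asked" as explicit categories, rather than imputing, mirrors the rationale given above: the model should see only what the research team would have known at that point in time.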

Measures of research participation
The primary outcome of this study was attendance of eligible research follow-up visits, measured dichotomously (attended/missed). Attendance of a research follow-up visit was defined as the return of the parent-reported questionnaire for that visit. The number of eligible research follow-up visits per child was determined based on the date of enrolment, child age at enrolment, site stop date (some sites stopped working with TARGet Kids! during the study period), and withdrawal status. The secondary outcome of this study was time to withdrawal. Withdrawal from the study occurred when a parent submitted a request to withdraw their child from TARGet Kids!. Data collected on the child up to the point of withdrawal remained in the study database. When feasible, reasons for withdrawal were collected by the research assistants at each site.

Statistical analysis
Participant characteristics were summarized using descriptive statistics. To determine factors associated with attendance of research follow-up visits, generalized linear mixed effects models with a logit link were fitted using repeated measures of the exposures and the primary outcome, with random intercepts for children and their families, since this study included some siblings from the same family. To analyze factors associated with time to withdrawal, Cox proportional hazard models with frailty terms for children and their families were fitted. For both analyses, two sets of models were fitted: separate models for each factor (Model 1), to examine the raw associations between each factor and the outcomes, and an adjusted model combining all factors in one model (Model 2), to control for confounding and infer potential causation. All p-values were two-tailed and statistical significance was set at alpha = 0.05. The Bonferroni-adjusted alpha was 0.0022 (0.05 divided by 23, as we examined 23 factors) to adjust for multiple comparisons. R version 4.0.2 for Mac was used for all analyses [22].
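The Bonferroni threshold reported above can be reproduced with a few lines of Python. This is only an arithmetic check (the analyses themselves were run in R); the function names and the example p-values are hypothetical.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni adjustment."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = 0.05, n_tests: int = 23) -> bool:
    """Compare a two-tailed p-value against the adjusted threshold."""
    return p_value < bonferroni_alpha(alpha, n_tests)


threshold = bonferroni_alpha(0.05, 23)
print(f"Adjusted alpha = {threshold:.4f}")
print(is_significant(0.001), is_significant(0.01))
```

Under this rule, a factor with p = 0.01 would meet the conventional 0.05 threshold but not the adjusted one.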

Results
For the primary analysis, 10,412 children with 62,655 total eligible research follow-up visits were included between June 3, 2008 and March 4, 2020 (see Fig 1 for sample size flowchart). Baseline, non-time-varying characteristics collected at enrolment are presented in Table 1: the mean age at enrolment was 22.2 months, 52% were male, and 52% had mothers and fathers of European ethnicity. Time-varying characteristics were based on eligible research follow-up visits and are shown in Table 2: the majority of the families had $80,000 or more annual household income and 11.6% of the children had at least one chronic condition diagnosis. Study participation is described in Table 3. Out of the 62,655 eligible research follow-up visits, 31.2% were attended. Out of the 10,412 participants eligible for follow-up, 7,124 participants (68.4%) attended at least 1 research follow-up visit. Since 2008, 696 of 10,914 participants (6.4%) have submitted a withdrawal request, and the mean time from enrolment to withdrawal was 3.5 years. According to the research assistants' documentation, the main reason for withdrawal was that participation was time-consuming.
Results on factors associated with attendance of research follow-up visits are shown in Table 4. In the adjusted model (Model 2), older children, certain TARGet Kids! sites, families with lower income (less than $39,999 and $40,000 to $79,999 relative to $150,000 or more), and children living in an apartment had higher odds of missing a research follow-up visit compared to the reference groups; older mothers at enrolment (≥38 years relative to <32 years), children with siblings, and children with chronic health conditions had lower odds of having a missed research follow-up visit compared to the reference groups (see odds ratios, confidence intervals and p-values in Table 4). Child immigration status was not included in the model due to its low variability. Missingness was consistently associated with higher odds of having a missed research follow-up visit.
The time-to-withdrawal analysis included 10,914 children (see Fig 1). Results on factors associated with time to withdrawal are shown in Table 5. In Model 2, the estimated hazard of withdrawal was 5% lower with each 1-month increase in child age (adjusted HR = 0.95; 95% CI: 0.94, 0.95; p < .001) after adjusting for the other variables in the model.
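To aid interpretation of a per-month hazard ratio, the sketch below shows how a hazard ratio for a 1-unit increase compounds over larger increases. It assumes that child age enters the Cox model as a continuous covariate with a log-linear effect, so that the hazard ratio for a k-month difference is the per-month hazard ratio raised to the power k. The 12-month figure is an illustrative extrapolation from the reported HR of 0.95, not a result reported in this study, and the function name is hypothetical.

```python
def hazard_ratio_for_increase(hr_per_unit: float, units: float) -> float:
    """Hazard ratio implied for a `units`-unit increase under a log-linear effect."""
    if hr_per_unit <= 0:
        raise ValueError("a hazard ratio must be positive")
    return hr_per_unit ** units


hr_one_month = hazard_ratio_for_increase(0.95, 1)
hr_twelve_months = hazard_ratio_for_increase(0.95, 12)
print(f"1 month: {(1 - hr_one_month) * 100:.0f}% lower hazard")
print(f"12 months: HR = {hr_twelve_months:.2f}")
```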

Discussion
In this primary care practice-based pediatric longitudinal cohort, we described research participation and investigated multiple sociodemographic, health, and study design factors and their associations with research participation. To our knowledge, our study is one of the first primary care cohort studies of children to examine factors associated with follow-up participation and withdrawal using a sophisticated methodological approach beyond descriptive statistics. It is worth noting that a missed research visit could be due to research assistants not being on site or being unable to approach multiple participants at the well-child visit, or to other administrative reasons that we were unable to measure in this study.
When we examined each factor's association with follow-up, children with mothers with lower education level, unemployed parents, and lower household income were more likely to have a missed research follow-up visit. Our findings align with other pediatric cohort studies [4,5,8,10,11] and intervention programs [3,23,24,25] demonstrating low socioeconomic status as a key attrition determinant. Consistent with previous literature, our study demonstrated that older children, [3,11] ethnic minority groups, [4] younger mothers, [5,8] parents born in a foreign country, [5,10,11] and children not living with 2 parents in the same household [3,10] were independently associated with a lower level of research participation. Our study showed that children and parents diagnosed with chronic health conditions had lower odds of having a missed research follow-up visit, suggesting that families with chronic health conditions may be more interested in participating in health-related research as they may be more concerned about their families' health. This is a novel finding of our study. In the multivariable analysis, family income remained a key attrition determinant, with a clear dose-response relationship between income and odds of missing a research follow-up visit.
Although participation in TARGet Kids! imposed no direct financial burden on the household, parents from low-income families may work multiple part-time jobs, making it difficult for them to commit to questionnaire completion. In addition to time demands, literacy demands and health and life stresses may also be barriers hindering research participation by low-income families [26]. Missingness in questionnaire data was also a key factor in both adjusted and unadjusted models, suggesting that individuals who did not complete the questionnaires were less likely to participate in future research visits [10].
In line with a large multi-centre children's cohort study in Europe, [10,11] children and parents with a higher weight status were more likely to miss a follow-up visit. Although obesity is considered by many to be a chronic disease, individuals with obesity may not feel comfortable participating in a cohort study which involves having their weight and height measured by a research assistant in practice. This may be related to ongoing weight bias in practice, as well as stigma and discrimination towards children and parents with obesity [27] in the healthcare system. Alternatively, weight status may be confounded by other social determinants of health; after adjusting for income, education, and ethnicity, the association with weight status was no longer significant in the multivariable model [28,29].
Various engagement and retention strategies have been utilized by longitudinal cohort studies to improve study retention. A systematic review [30] identified 95 strategies and classified them into 4 themes: reducing barriers to participation, creating a project community,

a Food insecurity is classified as "No" if they answered "Never True" for the following 2 questions, otherwise food insecurity is classified as "Yes": • Within the past 12 months we worried whether our food would run out before we got money to buy more (Never true/ sometimes true/ often true) • Within the past 12 months the food we bought just didn't last and we didn't have money to get more (Never true/ sometimes true/ often true)
b zBMI: age- and sex-standardized body mass index (BMI) based on the World Health Organization (WHO) Child Growth Standards. BMI was calculated using height/length and weight, measured by a trained RA. According to the WHO weight status cut-offs for children, zBMI < -2 is classified as underweight, -2 ≤ zBMI ≤ 1 is classified as normal weight, 1 < zBMI ≤ 2 is classified as at risk for overweight for children

fathers) from different families with diverse sociodemographic backgrounds, to partner with us. Since the establishment of the PACT, we have engaged PACT members in all stages of our research [31,32]. Ongoing collaboration with the PACT has ensured that our approach to families, measures, and follow-up procedures are sensitive to the needs of families facing socioeconomic barriers and children from racialized communities. We also have a Patient and Family Engagement Specialist working with TARGet Kids! as a core team member, providing ongoing communication and support for the team to address parent and clinician perspectives, and creating opportunities to engage underrepresented populations. The PACT participated in all stages of this study including setting priorities, determining research questions, and interpreting and communicating findings.
The use of social media as a tool to promote research participation is increasing and has shown effectiveness in a number of observational studies and clinical trials, including those with hard-to-reach populations [33,34]. Effective use of social media has also been a focus of our retention strategies. Informed by the PACT, we have recently enhanced the TARGet Kids! website (www.targetkids.ca), Instagram, and Twitter account, creating a brand identity (e.g., study logo, colour palettes, and typography) to engage study participants. Communication tools such as infographics, newsletters, and parent-friendly summaries of our publications are also among the strategies used to engage parents and families and to retain participants.
Our findings and input from the PACT members suggested a need for more tailored support and targeted strategies for families. In this study, families speaking a language other than English or French at home were less likely to have ongoing research follow-up visits. To reduce barriers for these families, we propose to adjust our inclusion criteria and implement translation services to translate TARGet Kids! consent forms and questionnaires into the languages, other than English or French, most commonly used among TARGet Kids! participating families. Families who reported an ethnicity other than European were more likely to miss a visit or withdraw from TARGet Kids!. We plan to ensure our staff is representative of the study population and to implement a cultural sensitivity training program focusing on Equity, Diversity, and Inclusion (EDI). Since families who withdrew from TARGet Kids! found that participation in the study was time-consuming, we have asked PACT members for feedback on questionnaires. We have reviewed our tools, removed overlapping questions, and started to administer shorter tools at less frequent intervals, to reduce respondent burden on participating families. Since families leaving TARGet Kids! participating practices was another reason for withdrawal, measures could be developed to follow these participants through online questionnaires, health insurance numbers, and at-home parent measurement of height and weight using standardized techniques. Furthermore, a more consistent approach to reimbursing participants for their time (gift card distribution) can be implemented to ensure participants are compensated punctually, and incentives may be offered to participants who complete all data collection points [35]. Lastly, a child-friendly reward program (e.g., stickers, colouring books, or puzzle pieces for bloodwork completion) could be implemented to motivate children and their families.
Strengths of this study include the longitudinal study design with 12 years of follow-up, a standardized data collection procedure, a large sample size of over 10,000 children allowing us to collect repeated measures on multiple participant characteristics, and a sophisticated methodological approach to study research participation. Moreover, ongoing collaboration with the PACT has ensured that our study was well-informed by participating parents. There are potential limitations to our study. First, most of our data were parent-reported, which may have introduced self-reporting bias and may be limited to parents' perspectives; research assistants' perspectives were not described in this study. Second, other factors may have contributed to research participation but were not included in the model (e.g., full-time vs. part-time research assistant at each site, workload of the research assistant, blood draw participation, family involvement in other research studies). Third, although we adjusted for multiple testing, type I error may still have occurred. Lastly, results of this study may only be applicable to healthcare-based studies since our study population was predominantly recruited from primary care settings in urban areas of Canada.

Conclusions
In this large primary care practice-based cohort study of children, we identified multiple factors associated with research participation. Our study confirmed that low-income families were less likely to have ongoing participation in research. Understanding determinants of attrition is crucial to inform and develop effective strategies to support target populations and improve study participation and completeness of retention data. Qualitative evidence from parents, research staff and health care providers may be helpful to further understand factors influencing parents' decision-making around research participation to aid in the development of targeted retention strategies.