Abstract
Background
Stroke is a leading cause of death and disability worldwide. In India, it is the fourth leading cause of death and the fifth leading cause of disability, posing a major public health concern. National surveys reveal an increasing trend in stroke risk factors such as tobacco use, physical inactivity, alcohol use, hypertension, and dyslipidemia. However, knowledge regarding the combined effect of these risk factors and their various combinations is limited. Understanding the individual, combined, and synergistic effects of known risk factors, along with new risk factors, is essential to address gaps in stroke epidemiology. This study aims to examine the effect of various risk factors of acute stroke and their association with stroke occurrence and its outcomes (survival, disability and quality of life).
Methods
This retrospective-prospective cohort study will be conducted in one taluka of Kolara district and two urban wards of Bengaluru, with a total population of ~400,000. All stroke-free individuals above 30 years of age in the selected sites (~200,000 individuals) will be participants in the stroke-free period, and all first-ever stroke patients in the community will be part of the stroke and post-stroke period. The study subjects will be recruited through a complete house-to-house survey at baseline and will undergo annual follow-ups during the stroke-free period, with specific assessments at defined time points over one year during the stroke and post-stroke period. Measures to minimize loss to follow-up include community engagement, a helpline number, and hospital-based surveillance.
Discussion
This large population-based cohort study addressing stroke epidemiology in the country is one of its kind, attempting to fill certain critical gaps in the natural history, management, and outcomes of stroke in India. This research has the potential to provide important insights into the effect of novel risk factors of stroke and various combinations of risk factors of stroke. Furthermore, the development of a stroke risk predictability calculator will add value to the existing Indian National Programme for Prevention & Control of Non-Communicable Diseases (NP-NCD) and, once developed, offer a model for similar countries.
Citation: Banandur PS, Sukumar GM, Arvind BA, P. R. S, V. S. B, Loganathan S, et al. (2024) Population-based cohort across stroke life course in India-The NIMHANS-NH-SKAN stroke project: A study protocol. PLoS ONE 19(10): e0310309. https://doi.org/10.1371/journal.pone.0310309
Editor: Dorothy Lall, CMC Vellore, IPH India, INDIA
Received: March 14, 2024; Accepted: August 29, 2024; Published: October 2, 2024
Copyright: © 2024 Banandur et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: No datasets were generated or analyzed during the current study. All relevant data from this study will be made available upon study completion. All data collected will be uploaded into a password protected (3-layers) server with access only to specified authorized study team members. Every access and activity within the server will be logged as per pertinent policies, rules and guidelines.
Funding: The study is funded by the non-commercial organization SKAN Research Trust (https://skanrt.in/). While our research project is funded, we wish to emphasize that our research endeavours are entirely non-commercial in nature and are dedicated to advancing academic knowledge within our respective field. Authors Dr. Thimappa Hegde, Dr. Komal Prasad, and Dr. Lavanya Garady contributed to the conceptualization of the study and reviewed and finalized the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Increasing life expectancy in most countries has resulted in an increased proportion of elderly people and an associated burden of chronic Non-Communicable Diseases (NCDs). NCDs account for about two-thirds of deaths both globally (68%) and in India (60%) [1], leading to significant long-term health, economic and social costs. The proven cost-effectiveness of preventive interventions makes NCD prevention and control a priority in the 21st century [2].
Stroke is the second leading cause of death and the leading cause of disability [2, 3]. Globally, between 1990 and 2019, there was a 70% increase in stroke incidence, a 43% rise in stroke-related deaths, a 102% increase in stroke prevalence, and a 143% surge in Disability-Adjusted Life Years (DALYs) [4]. India is currently undergoing an epidemiological transition from communicable diseases to NCDs [5]. Stroke is the fourth leading cause of death and the fifth leading cause of disability [6], and a major public health concern in India. In addition to being fatal, stroke is associated with short-term and lifelong disability among survivors, affecting their quality of life and productivity [7]. A high proportion of stroke survivors suffer from permanent impairments resulting in deficient self-care, requiring long-term support from care-givers and adding to individual health and social costs [8].
The epidemiology and care of stroke vary widely across settings in India. Stroke accounted for 7.73% of fatalities and 4.26% of DALYs in Karnataka, a state in southern India [9]. An estimated 500,000 stroke cases are prevalent at any given point of time in Karnataka, placing an immense burden on the health system [9]. The situation is compounded by the lack of regular, valid and reliable stroke-related epidemiological and treatment data for evidence-driven programming [10]. The incidence of stroke depends largely on the incidence of stroke risk factors, which are dynamic and are currently defined based on epidemiological understanding from other countries.
Ongoing national surveys and research studies indicate an increasing trend in prevalence of known stroke risk factors like tobacco use, physical inactivity, alcohol use, hypertension and dyslipidemia. However, current knowledge lacks clarity on the combined effect of multiple risk factors and their various combinations. Understanding the individual, combined and synergistic effect of these known stroke risk factors coupled with information on neo-risk factors for stroke, would help to bridge gaps in stroke epidemiology. This enables in-depth understanding of stroke and its risk factors and facilitates evidence-based health systems and services for risk reduction, case management and rehabilitation of stroke.
The NIMHANS-NH-SKAN stroke project, a large population-based cohort study, aims to examine all issues related to stroke, covering risk factors, occurrence, treatment and outcomes. This project also intends to develop an India-specific stroke risk predictability calculator for the studied risk factors that shall predict an individual’s risk of stroke over specific time periods. The NIMHANS-NH-SKAN stroke project is implemented by the Department of Epidemiology, Centre for Public Health, NIMHANS, Bengaluru.
We describe the methodology of NIMHANS-NH-SKAN Stroke project which aims to:
- estimate the incidence and incidence rates of first-ever stroke among risk factors (individually and combined) in Karnataka.
- estimate the strength of association between select risk factors and acute first-ever stroke in Karnataka.
- estimate short- and long-term survival, disability and quality of life of first-ever stroke patients in Karnataka.
- identify and estimate strength of association of factors associated with short- and long-term survival, disability and quality of life of first-ever stroke patients in Karnataka.
- develop a specific risk predictability calculator for development of stroke over specific time periods.
Methodology
This population-based cohort study assesses the effect of various risk factors on stroke occurrence and its outcomes, namely survival, disability and quality of life. For ease of understanding, the methodology is explained in two parts (Fig 1):
- Stroke-free period: the period from initial recruitment of a study subject until development of stroke or the end of the follow-up period.
- Stroke and post-stroke period: the period from when a study subject develops stroke until one year post-stroke or death, whichever is earlier.
Study design and study settings
Stroke-free period.
This is a retrospective-prospective cohort study (mixed/ambispective cohort study) conducted in two urban wards of Bengaluru city and one taluka (sub-district) of Kolara district in Karnataka, a southern Indian state. The specific wards and sub-district were selected for operational convenience. Our strong presence and established rapport within these communities will enable easy acceptance of the survey and effective community mobilization. This approach is expected to result in higher response rates and minimal loss to follow-up. Naturally existing risk groups constitute the exposures, and stroke is the outcome.
Identification and recruitment of study subjects
Stroke-free period.
All stroke-free permanent residents currently residing in the study area who are aged 30 years and above and willing to participate will be included. Individuals with a previous history of stroke, chronically debilitated/bed-ridden individuals, and those who are otherwise unable to participate will be excluded from the study. A baseline house-to-house enumeration of all eligible study subjects will be conducted in the study area. Eligible respondents who enter the study area during the first and second year of follow-up will also be included in the study. An IEC/community engagement plan to actively engage and mobilize the community throughout the project period is being developed. The strategy is to create awareness of the project in the community, encourage participation, and enable the community to inform the study team about cases occurring within the study areas. In addition, this community engagement is likely to minimize non-response and resolve any issues arising from misconceptions about the study that could influence participation.
Stroke and post-stroke period.
All first-ever stroke cases reported within the study area shall be recruited. Recruitment will primarily take place within the hospitals where the study subjects are admitted. Cases will be identified through community informants (family members, health workers, ASHAs, Anganwadi workers, community leaders, hospital staff, or anybody else likely to know of the occurrence of stroke) within the study area. All field workers and the community will be sensitized and trained to report the occurrence of stroke and will be compensated for factual reporting. As part of community engagement, an exclusive helpline will be established to enable case identification. Hospital-based surveillance will also be set up to identify cases from the study area admitted to the hospitals.
Setting and healthcare available
India’s healthcare system operates on a three-tier hierarchical model. At the primary level, basic health services are offered through Primary Health Centres and Sub-Centres (Health and Wellness Centres) located in villages or towns. The secondary level includes district hospitals that provide additional diagnostic and surgical services along with basic healthcare. Tertiary care is offered by medical colleges and hospitals, which deliver advanced medical treatments and surgeries. Stroke cases typically receive care at either the secondary or tertiary levels.
Based on previous records, study subjects from the study area mostly visit a network of one neuro specialty hospital and one medical college hospital at tertiary level and one district hospital, one medical college hospital and two government hospitals at the secondary level.
Alternatively, if study subjects are admitted to hospitals that are not within a reasonable distance, they will be recruited at their households after they return from hospital.
Follow-up
Stroke-free period.
After the baseline assessment, all survey respondents will be followed up annually at three time points (year 2, year 3 and year 4) (Fig 2). For respondents who move out of the study area, attempts will be made to collect data from them if they are available within a reasonable travel distance (e.g., within Kolara and nearby urban wards within Bengaluru). Efforts will also be made through telephone calls either to contact them personally or to ascertain their stroke status, to the extent possible.
The study end-point for subjects in the stroke-free period will be when the subject (Fig 2)
- completes the follow-up period of four years without developing stroke
- develops stroke during the four-year follow-up period
- dies/ migrates from the study area/ is lost to follow-up
- stops participation in the study for any other reason
Stroke and post-stroke period.
Any respondent who develops stroke during the study period will be assessed within the hospital and followed up at four time points (28th day, 3rd month, 6th month and 12th month) post-stroke (Fig 2). For respondents who move out of the study area, efforts similar to those in the stroke-free period shall be made to minimize loss to follow-up.
The study end-points for post-stroke period subjects will be when the subject
- completes the follow-up period of 1 year
- dies/ migrates/ is lost to follow-up
- stops participation in the study for any other reason
Ascertainment of exposure and outcome
Stroke-free period.
We plan to study multiple exposures in this study. Naturally existing stroke risk groups constitute the exposures: weight; mental health/stress (including depression, anxiety, stress, non-suicidal self-injury, suicidal behavior and workplace stress); non-communicable diseases (diabetes mellitus, hypertension, dyslipidemia and coronary vascular diseases); substance use, including tobacco (smoking/smokeless), alcohol and drugs (injecting/sniffing/oral); nutrition; physical inactivity; and certain neo-risk factors, namely gambling, COVID-19 infection and vaccination, sleep (insomnia and obstructive sleep apnea), exposure to media, risk of cell phone addiction and restless leg syndrome. The primary outcome assessed is stroke (including its sub-types).
Stroke and post-stroke period.
We plan to compare/assess the difference in survival (both short term and long term), disability and quality of life among stroke patients. Type of stroke will be considered the a priori exposure, with the exposures assessed for the stroke-free period treated as confounders.
The definitions/assessments proposed to ascertain the different exposures and outcomes are shown in Table 1.
Study instruments
All eligible participants will be interviewed using semi-structured, partially open-ended schedules specifically developed for data collection and hosted on a low-literacy, user-friendly digital platform. Separate digital schedules are used for the stroke-free period and for the stroke and post-stroke period. Each schedule will have multiple forms with multiple sections. The contents and scales used are standardized, validated scales approved by the expert advisory group specifically constituted for the project (Table 1).
Data collection
Data collection shall be done using a combination of face-to-face interviews utilizing a specifically developed semi-structured, partially open-ended interview schedule, review of clinical records and biological investigations. Initially, a complete household census will be done in the study area to identify eligible study respondents. The interview schedule consists of 30 sections (S1 Table). Data on socio-demographic characteristics, clinical history and family medical history, in addition to participant-reported exposures and information on disability and quality of life of the participant/patient, will be obtained using data collected during the stroke-free period. Data collection will be performed by trained individuals with previous experience in population- and/or hospital-based medical research data collection, community and social work. The trained data collectors will also measure height, hip and abdominal circumference, and blood pressure (measured after 15 minutes of resting in a comfortable sitting posture). Biological samples will be collected by trained laboratory investigators, stored and transported to the laboratories immediately. Samples from the district will be sent to the medical college hospital or district hospital. Samples from Bengaluru will be sent to the neuro-specialty care hospital. All data will be collected using a low-literacy, user-friendly digital data collection instrument specifically developed for the project. The field supervisors/coordinators, along with program coordinators, will perform 5% repeat data collection to assess the quality of the data collected. Any deviation/error rate in data collection of more than 10% calls for a repeat survey within the respective data collection area.
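The repeat-collection check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's quality-check tooling: the function names, the fixed random seed, and the reading of "more than 10%" as the share of re-collected items that disagree with the original entry are all hypothetical.

```python
import random

RECHECK_FRACTION = 0.05   # 5% repeat data collection (from the protocol)
ERROR_THRESHOLD = 0.10    # more than 10% deviation triggers a repeat survey


def select_for_recheck(record_ids, fraction=RECHECK_FRACTION, seed=0):
    """Draw a simple random sample of records for repeat data collection."""
    ids = list(record_ids)
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    k = max(1, round(len(ids) * fraction))
    return rng.sample(ids, k)


def needs_resurvey(items_compared, items_discrepant, threshold=ERROR_THRESHOLD):
    """Return True when the discrepancy share strictly exceeds the threshold."""
    if items_compared <= 0:
        raise ValueError("items_compared must be positive")
    return items_discrepant / items_compared > threshold


# Hypothetical data collection area with 2,000 records.
recheck_ids = select_for_recheck(range(1, 2001))
```

In this reading, an area with exactly 10% discrepant items would not be re-surveyed, because the protocol specifies "more than 10%".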
During data collection for stroke and post-stroke period, care-givers of subjects will be interviewed if the participants are unable to provide information. The participants/caregivers will be assisted by the data collectors in responding to the questionnaire. In case of death of the participant during the study period, the date of death along with other appropriate/relevant information (through medical records and caregiver information) will be obtained. For those respondents who move out of the study area, attempts will be made to collect data from them either face-to-face, if they are available within a reasonable travel distance (e.g.: within Kolara and close by urban wards within Bengaluru) or through telephonic interview.
Training of data collectors
The training for the data collectors and laboratory investigators will be participatory, employing different methods including classroom sessions, training in the hospital (observation and demonstration of interviews), community (both supervised and independent) and hands-on training (for digital data collection and biological sample collection) by specialized personnel for each domain. It is planned to conduct induction and refresher trainings at the beginning of each data collection cycle (once a year) for the study team ensuring standardized data collection. To ensure quality apart from rigorous training, weekly and fortnightly review and problem-solving meetings will be held both locally and with the NIMHANS-NH-SKAN team online.
Monitoring
An independent monitoring team (IMT) is constituted to oversee project implementation activities. Members of the team include mid-level specialists mainly from the field of Community medicine/ Epidemiology/Public Health, Neurology and Social work with relevant work experience. The IMT will maintain a constant and ongoing supervision to ensure that the project activities are carried out as per protocol. The IMT will conduct monitoring visits every quarter and submit a report to the study team emphasizing the progress and identifying critical bottle-necks/challenges in the implementation of the study.
Expert advisory group
The Expert Advisory Group (EAG), composed of specialists in Neurology, Neurosurgery, Epidemiology, Biostatistics, and Public Health with extensive research experience in Public Health and Stroke Epidemiology, is established to provide strategic oversight and technical support for the study. This group will play a crucial role in directing the study, monitoring its progress, ensuring its quality, and reviewing timelines. The group will provide regular and timely advice to study investigators on the study’s progress; it also played an integral role in reviewing the master protocol and suggesting modifications. The group convenes biannually (every 6 months) to assess the study’s progress and provide necessary input.
Data management
All data collected will be uploaded into a password protected (3-layers) server with access only to specified authorized study team members. Every access and activity within the server will be logged as per pertinent policies, rules and guidelines. The core team within NIMHANS shall check for completeness, quality and accuracy of data collected utilizing specifically developed quality check formats. Any discrepancy/ clarifications shall be sought with the field team and rectified/ modified within the server on a weekly basis. Only specific core team members authorized to access within NIMHANS shall have the access and rights to modify data as per field team recommendations/justification. All the modifications shall be time logged. All forms, data collected and stored will be linked through unique ID of the study participants including laboratory and follow-up data.
Biochemical sample collection and measurements
Laboratory confirmation for estimation of glucose and lipid profile will be done using venous blood sample collected from study participant. Approximately 5–7 ml of venous blood will be collected from the cubital vein using a needle and syringe. Blood sample will be drawn by qualified/trained personnel under aseptic precautions using disposable needles. Adequate pressure will be applied for sufficient duration of time to prevent bleeding from the site of blood collection. Immediately, the collected blood will be transferred to plasma separator vacutainer tubes. Plasma will be separated after centrifugation. Serum glucose and lipid profile will be estimated using hexokinase method and colorimetric assay respectively.
Sample size and power calculations
Stroke-free period.
The study sites constitute two urban wards in Bengaluru and Mulbagal taluka in Kolara district (with 20:80 –urban: rural distribution of population). Each urban ward is expected to have a minimum of 80,000 people. Approximate population within the taluka is 250,000. The proportion of population above 30 years is approximately 58.9% [11]. Thus, about 241,490 subjects (410,000*0.589) are expected to be available for this project. Assuming 20% attrition, we expect ~193,192 subjects to participate in the study.
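The arithmetic behind these figures can be reproduced in a few lines. The sketch below uses only the values quoted above; the variable names are ours.

```python
URBAN_WARDS = 2
MIN_POPULATION_PER_WARD = 80_000   # minimum expected population per urban ward
TALUKA_POPULATION = 250_000        # approximate population of the taluka
SHARE_AGED_30_PLUS = 0.589         # proportion of population above 30 years [11]
ATTRITION = 0.20                   # assumed attrition

total_population = URBAN_WARDS * MIN_POPULATION_PER_WARD + TALUKA_POPULATION
eligible = total_population * SHARE_AGED_30_PLUS
expected_participants = eligible * (1 - ATTRITION)

print(total_population, round(eligible), round(expected_participants))
```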
Assuming 150 new stroke cases per 100,000 population per year and a follow-up duration of 3 years at a 5% level of significance, sample sizes were calculated for the different risk groups (Table 2). Since the expected prevalence of obesity is the lowest among all the risk groups (6.2%), a sample size estimated with obesity as the exposure is expected to cover the required sample size for the other groups as well. With the expected population of ~240,000 subjects, the study would have 90% power to detect a Hazard Ratio (HR) of 1.4. Even with an attrition rate of 20% (~193,000 subjects), we would be able to achieve the proposed objective with >80% power.
Stroke and post-stroke period.
With the minimum estimated number of 193,000 respondents available (after accounting for 20% attrition) for stroke-free cohort and an expected incidence of 150 per 100,000 population per year, we expect around 290 cases of stroke occurring in the study areas every year. Thus, a total of 870 cases are expected to occur in the study area during the study period. Efforts shall be made to recruit all these cases into the stroke cohort study.
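The expected case counts follow directly from the cohort size and the assumed incidence. In the sketch below, the three-year horizon is inferred from the follow-up duration used in the sample size calculation, and the yearly figure is rounded before multiplying, as in the text; the variable names are ours.

```python
import math

PARTICIPANTS_AFTER_ATTRITION = 193_000
INCIDENCE_PER_100K_PER_YEAR = 150
FOLLOW_UP_YEARS = 3  # inferred from the 3-year follow-up assumed for sample size

cases_per_year_exact = (
    PARTICIPANTS_AFTER_ATTRITION * INCIDENCE_PER_100K_PER_YEAR / 100_000
)
cases_per_year = math.floor(cases_per_year_exact + 0.5)  # round half up
total_cases = cases_per_year * FOLLOW_UP_YEARS

print(cases_per_year_exact, cases_per_year, total_cases)
```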
Statistical analysis
Stroke-free period.
Study subjects will be assigned to different risk factor groups. Multiple individual and combined naturally existing exposure groups within the community shall be formed. Specific risk groups (comparison groups) of single, dual, triple or multiple risk factors for stroke will be formed to estimate and compare the incidence of stroke among multiple exposure groups. Formation, classification, and follow-up of risk factor groups will be done in the database. Neither the study subjects nor the data collectors shall be informed of the risk groups to which subjects are assigned. However, information regarding their health status shall be provided to respondents to enable them to seek appropriate care.
Incidence proportions and incidence rates for stroke and for different types of stroke will be calculated and presented with corresponding 95% confidence intervals (CIs). Actuarial life table methods will be used to estimate incidence rates. Person-months of follow-up and person-months of exposure (for each risk factor) will be estimated. Overall incidence rates will be estimated using person-months of follow-up as the denominator. Incidence rates for specific exposure groups will use person-months/person-years of exposure to the risk factor as the denominator. Measures of risk for stroke and type of stroke, for each risk factor, will be determined using appropriate regression methods. Attributable risk and population attributable risk (PAR), with the corresponding 95% CIs, will be obtained. For study subjects who exit the study prematurely (die, migrate from the study area, are lost to follow-up or stop participating for any other reason), the stroke-free duration up to exit will be used in the analysis.
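As an illustration of the person-time calculations described above, the sketch below computes an incidence rate with an approximate 95% CI and the two attributable measures as rate differences. It is not the study's analysis code: the log-scale Wald interval is one common approximation chosen here for simplicity, the function names are ours, and all counts are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def person_years_from_months(follow_up_months):
    """Sum person-time; early exits contribute only their stroke-free months."""
    return sum(follow_up_months) / 12


def incidence_rate(events, person_years, per=100_000):
    """Rate per `per` person-years with a log-scale Wald 95% CI."""
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years * per
    factor = math.exp(Z_95 / math.sqrt(events))  # SE of log(rate) ~ 1/sqrt(events)
    return rate, rate / factor, rate * factor


# Hypothetical illustration values, not study data.
exposed = incidence_rate(events=60, person_years=20_000)
unexposed = incidence_rate(events=90, person_years=80_000)
overall = incidence_rate(events=150, person_years=100_000)

attributable_risk = exposed[0] - unexposed[0]             # exposed minus unexposed
population_attributable_risk = overall[0] - unexposed[0]  # overall minus unexposed
```

A subject who migrates after 12 months contributes 12 person-months to the denominator, which is how the stroke-free duration of early exits enters the rate.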
The risk factors, namely weight, mental health/stress, NCDs, substance use, nutrition and physical inactivity, are considered a priori exposures, with stroke as the outcome. A conceptual framework based on the multifactorial web of causation developed by MacMahon and Pugh [16], consisting of the a priori exposures and other potential confounders influencing the association between them and the outcome, will be developed through desk review and expert consultation. This shall form the basis for the statistical analysis of the study. The difference in stroke risk between the risk groups will be assessed using Cox proportional hazards models with age at onset of the risk factor (e.g., start of smoking/tobacco use/alcohol use) as the time scale. Stroke as well as its sub-types (thrombotic/hemorrhagic/subarachnoid hemorrhage/cerebral venous thrombosis) will be examined as outcomes. All risk factors within the conceptual framework will be assessed for confounding during the Cox proportional hazards modeling. We propose to assess categorical representations in the event of non-linear relationships and to develop binary definitions (e.g., presence/absence of hypertension, obesity, diabetes, stress, etc.) to facilitate model comprehensibility. The proportional hazards assumption will be tested using the score test for each risk factor. If the proportionality assumption is found to be violated, a Cox model with time-varying covariates will be used to estimate adjusted time-varying hazard ratios (HRs). The HRs, their corresponding 95% CIs and p-values will be reported.
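For reference, the Cox proportional hazards model named above has the general form below. The notation is ours and is only a sketch: $t$ is the chosen time scale, $x$ the vector of exposures and confounders, and $h_0$ the unspecified baseline hazard. The second line shows one common way (an assumption here, not necessarily the form the study will adopt) to obtain a time-varying hazard ratio for covariate $j$, by adding its interaction with a function $g(t)$ of time.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad \mathrm{HR}_j = \exp(\beta_j)

h(t \mid x) = h_0(t)\,\exp\!\Big(\beta_j x_j + \gamma_j\, x_j\, g(t) + \sum_{k \neq j} \beta_k x_k\Big),
\qquad \mathrm{HR}_j(t) = \exp\!\big(\beta_j + \gamma_j\, g(t)\big)
```

Under the first form the hazard ratio is constant over time, which is the proportionality assumption being tested; under the second it is allowed to change with $t$.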
To develop a gender-specific stroke risk probability calculator, the estimated regression coefficients from the fitted gender-specific Cox proportional hazards regression models will be used along with the estimated baseline survival functions at 5 years. All continuous predictors will be transformed using natural logarithms to minimize the influence of extreme outliers. The ability of the predicted stroke risk probability to discriminate individuals who develop stroke from those who do not will be measured using the c statistic [17]. The goodness of fit of the risk prediction model will be evaluated using a modified Hosmer and Lemeshow chi-square statistic, which measures the agreement between observed and predicted events within 5 years [18].
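A calculator of this kind is often written as risk = 1 - S0(5)^exp(linear predictor). The sketch below illustrates that form together with a simple c statistic. It rests on several assumptions that are not in the protocol: predictors are centred on cohort means of their logarithms, the c statistic treats the outcome as binary and ignores censoring, and every coefficient, mean and baseline survival value is hypothetical.

```python
import math


def five_year_risk(values, betas, mean_logs, baseline_survival):
    """Risk = 1 - S0(5) ** exp(sum(beta * (ln(x) - mean ln(x))))."""
    linear_predictor = sum(
        betas[name] * (math.log(values[name]) - mean_logs[name])
        for name in betas
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


def c_statistic(risks, had_stroke):
    """Share of (stroke, no-stroke) pairs ranked correctly; ties count half."""
    cases = [r for r, y in zip(risks, had_stroke) if y]
    controls = [r for r, y in zip(risks, had_stroke) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    score = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return score / (len(cases) * len(controls))


# Hypothetical coefficients and cohort means, for illustration only.
BETAS = {"age": 2.0, "systolic_bp": 1.5}
MEAN_LOGS = {"age": math.log(50), "systolic_bp": math.log(125)}
S0_5Y = 0.99  # hypothetical baseline 5-year stroke-free survival

average_person = five_year_risk(
    {"age": 50, "systolic_bp": 125}, BETAS, MEAN_LOGS, S0_5Y
)
older_hypertensive = five_year_risk(
    {"age": 70, "systolic_bp": 160}, BETAS, MEAN_LOGS, S0_5Y
)
```

With centring, a person at the cohort mean on every predictor has a risk equal to one minus the baseline survival, which makes the baseline term easy to interpret. Separate coefficient sets would be supplied for each gender-specific model.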
Stroke and post-stroke period.
We intend to understand the effect of exposure groups from the stroke-free period on survival, disability and quality of life among these stroke cases. Survival probabilities with corresponding 95% CIs at 28 days and at 3, 6 and 12 months will be estimated using the Kaplan-Meier method. Univariate and multivariable Cox proportional hazards regression will be performed to identify the risk factors for death at 28 days and at the 3rd, 6th and 12th month. In the multivariable Cox proportional hazards model, confounders will be identified based on clinical importance as well as differences in baseline characteristics between participants who survived and those who did not at all four time points. The proportional hazards assumption will be tested using the score test for each risk factor. If the proportionality assumption is found to be violated, a Cox model with time-varying covariates will be used to estimate adjusted time-varying hazard ratios (HRs). The HRs, corresponding 95% confidence intervals and p-values will be reported.
The mRS is an ordinal scale with scores ranging from zero to six. A proportional odds model will be used to identify the factors associated with disability at the 28th day, 3rd month, 6th month and 12th month after onset of stroke. The quality-of-life variable is a continuous variable. Univariate and multiple linear mixed model regression analysis will be performed to identify the factors related to quality of life at the 28th day, 3rd month, 6th month and 12th month. In the mixed model, subjects will be considered a random effect and time a fixed effect. The linear or quadratic trend of quality of life over time will be checked using profile plots. Accordingly, time will be considered as a fixed continuous or categorical factor in the model. All confounders will be added as covariates in the model. The normality assumption for linear regression will be checked using the Shapiro-Wilk test as well as by examining the histogram along with skewness and kurtosis. Appropriate transformations will be used if the dependent variable is found to be non-normal.
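The Kaplan-Meier estimate at the four assessment points can be illustrated with a short, dependency-free sketch. The follow-up data are hypothetical, months are approximated as 90, 180 and 365 days, and confidence intervals are omitted for brevity; none of this is the study's analysis code.

```python
def kaplan_meier_survival(durations, died, at_time):
    """Kaplan-Meier estimate of the survival probability S(at_time).

    durations: days from stroke onset to death or censoring
    died:      True if the subject died, False if censored
    """
    survival = 1.0
    death_times = sorted(
        {d for d, e in zip(durations, died) if e and d <= at_time}
    )
    for t in death_times:
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, died) if e and d == t)
        survival *= 1.0 - deaths / at_risk
    return survival


# Hypothetical follow-up data (days), not study data.
DAYS = [10, 20, 30, 40]
DIED = [True, False, True, False]

survival_by_visit = {
    day: kaplan_meier_survival(DAYS, DIED, day) for day in (28, 90, 180, 365)
}
```

Subjects censored before a death time (for example, those who migrate) leave the risk set without lowering the survival estimate, which is why the estimate handles loss to follow-up.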
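The two models named above can be written compactly as follows. The notation is ours and is a sketch only: $Y$ is the mRS score, $Q_{it}$ the quality-of-life score of subject $i$ at time $t$, $x$ the covariates, and $b_i$ the subject-level random intercept. A linear time trend is shown; a quadratic or categorical time term would replace $\gamma_1 t$ if the profile plots suggest it.

```latex
% Proportional odds model for the mRS score Y \in \{0, \dots, 6\}
\operatorname{logit} P(Y \le j \mid x) = \alpha_j - \beta^{\top} x,
\qquad j = 0, \dots, 5

% Linear mixed model for quality of life of subject i at time t
Q_{it} = \gamma_0 + \gamma_1 t + \delta^{\top} x_i + b_i + \varepsilon_{it},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{it} \sim N(0, \sigma^2)
```

The proportional odds assumption is that the same $\beta$ applies at every cut-point $j$, so each covariate shifts the whole mRS distribution rather than a single threshold.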
Ethical clearance and consent
The study protocol was reviewed and approved by the Institutional Ethics Committee of NIMHANS vide letter NIMHANS/35th IEC (BS & NS DIV.)/2022; dated 17-06-2022. All eligible individuals shall be administered written informed consent. All concerned documents are available both in local language Kannada and English. The documents will be provided in the language preferred by the respondent/s. Illiterate individuals will have the consent form read to them in the presence of a literate witness and provide a thumb impression. All participants shall be given a copy of the signed informed consent form. The process of consent and data collection will be conducted at a place convenient to the participant ensuring adequate privacy and confidentiality. All study subjects detected to have any health issues during data collection, shall be informed about their health status and advised to seek appropriate care. All laboratory reports shall be sent to the respondent through an automated electronic reporting system to the phone numbers provided during data collection.
Discussion
This large-scale community-based prospective-retrospective cohort study is envisaged to assess the effect of known and neo-risk factors in the natural history of stroke beginning from the stroke-free period to development of stroke and its subsequent outcomes (survival, disability and quality of life post-stroke). This study enrolls a sample of ~200,000 individuals naturally stratified into multiple population-based risk groups within Bengaluru and Kolara districts in Karnataka, India. Stratification of the population into six risk/exposure groups of single, dual, or multiple combinations for stroke will be done. At the end of the study, we propose to develop an India-specific stroke risk predictability calculator for the studied risk factors that shall predict an individual’s risk of stroke over five-year periods. Further, data from this study shall also enable understanding the effect of these risk factors, individually and combined, on NCDs other than stroke and their outcomes.
A cohort study is the most suitable study design to accomplish our stated objectives. However, the prolonged latency period of development of stroke in an individual’s life span, multifactorial causation, and indefinite time of onset of risk factors/NCDs are known difficulties in conducting cohort studies on NCDs, including stroke [19]. The mixed (prospective-retrospective) cohort design employed here attempts to address these challenges. The study considers various a priori risk factors, i.e., weight, mental health/stress, NCDs, nutrition, physical inactivity and substance use. Recall-based retrospective or documentary evidence of the onset of certain risk factors counters the prolonged latency and provides valid information on the time of onset of risk factors. These study subjects are further followed prospectively to gain clear insight into the time period between onset of risk factors and development of stroke. Another major advantage of our mixed cohort study design is the ability to study multiple exposures and multiple outcomes, namely stroke, survival, disability, quality of life, and other NCDs. Cohort studies mostly assess single exposures and multiple outcomes [19]. The mixed cohort study design in this scenario allows assessment of multiple exposures as well. Since stroke is a rare event [20], a case-control design would ordinarily be better suited [21]. However, concerns with respect to temporality and recall bias, and the availability of funding, made us opt for a mixed (prospective-retrospective) cohort study. Other advantages, such as lower susceptibility to selection bias and the possibility of estimating true risk and measures of impact (population attributable risk and attributable risk), also favoured a cohort design over a case-control design. Furthermore, our objective to assess multiple exposures and outcomes, along with developing a stroke-risk predictability calculator, makes a cohort study a valuable design in comparison to other available study designs.
Alternatively, stroke registries are an invaluable resource for assessing the burden of stroke. However, stroke registries in India are predominantly urban-centric and assess only a few risk factors and/or outcomes [22, 23]. Difficulties in obtaining informed consent, unwillingness to share data, difficulty in retrieving medical records, and inadequate documentation of deaths are all known issues [24] that limit the utility of stroke registries for accomplishing our stated objectives. Further, our study follows naturally existing risk factor groups from the stroke-free period to the development of stroke and beyond, through to stroke outcomes. This life course approach to assessing the effect of risk factors on stroke and its outcomes is unique to our study and is best accomplished by the adopted mixed cohort study design.
Ascertainment of risk factors and confounders in our study relies on multiple validated and standardized tools suitable for community use, coupled with strong quality assessment and monitoring of study processes. Most of the proposed tools are used extensively in clinical and biomedical research, medical practice, auditing and policy making [10, 25–42]. Age of onset/diagnosis of risk factors is used to panelize the data in this study, instead of the more typical and accurate time of assessment of the risk factor during the study [43]. The Cox model implicitly matches subjects on the time scale used. Since stroke rates increase with age, and more so after onset of a risk factor, age of onset/diagnosis is assumed to have a larger effect than time of assessment. Thus, age of onset/diagnosis is considered the most appropriate time scale for the model.
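The statement that the Cox model implicitly matches subjects on the time scale can be made concrete with a small sketch. It assumes age is the time scale and that each subject enters observation at some age (delayed entry); the `Subject` type, the function name and all values are hypothetical illustrations, not the study's analysis code.

```python
from typing import NamedTuple


class Subject(NamedTuple):
    subject_id: str
    entry_age: float   # age at which the subject comes under observation
    exit_age: float    # age at stroke or at censoring
    had_stroke: bool


def risk_sets_on_age_scale(subjects):
    """For each stroke case, list the subjects under observation at that age.

    In the Cox partial likelihood, each case is compared only with the
    members of its risk set, i.e. subjects at risk at the same value of
    the time scale (here, the same age).
    """
    risk_sets = {}
    for case in subjects:
        if not case.had_stroke:
            continue
        event_age = case.exit_age
        at_risk = [
            s.subject_id
            for s in subjects
            if s.entry_age < event_age <= s.exit_age
        ]
        risk_sets[case.subject_id] = sorted(at_risk)
    return risk_sets


# Hypothetical cohort (not study data)
cohort = [
    Subject("A", 40.0, 62.0, True),
    Subject("B", 45.0, 70.0, False),
    Subject("C", 63.0, 68.0, True),
    Subject("D", 50.0, 60.0, False),
]
risk_sets = risk_sets_on_age_scale(cohort)
print(risk_sets)
```

Changing the time scale changes who sits in each risk set, and therefore which comparisons drive the estimated hazard ratios.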
One of the challenges in implementing such large-scale community-based cohort studies is ensuring community participation and high response rates [44]. Our study employs a strong community engagement plan to this end. These measures are intended to reduce non-response and loss to follow-up, which are known limitations of a cohort study. The study team’s familiarity with the cultural and social ethos of the study sites, along with its long working relationship with the respective district administrations within Bengaluru and Kolara, is likely to help reduce non-response.
Uniform and standardized data collection and the minimization of known information biases, such as rounding errors, social desirability, and recall issues related to the exact dates of onset/diagnosis of risk factors, assume importance when such large-scale cohort studies are implemented. Appropriate informed consent procedures and conducting interviews at a place convenient to the participant, with confidentiality and privacy ensured, are likely to reduce information bias. Gender-matched data collectors help minimize information bias, mainly social desirability bias related to certain sensitive information.
Eligible respondents are those aged ≥30 years. This is based on the recommendation of the EAG, since the Indian National Programme for Prevention & Control of Non-Communicable Diseases (NP-NCD) recommends opportunistic screening for risk factors and stroke among individuals aged 30 years and above [45]. Further, including these individuals will capture the dynamics of stroke in the young, an issue in countries of the South-East Asian region, especially India [22, 23]. The study also includes pregnant and lactating women, since they are known to develop stroke due to various risk factors, including those related to pregnancy [23].
The sample size for both the stroke-free and post-stroke periods is adequate in terms of size and power. Although the a priori risk factors are known risk factors for stroke, to our knowledge very few studies have examined their effect on stroke outcomes, namely survival, disability and quality of life. Few studies have assessed the effect of various combinations of stroke risk factors on the development of stroke and on stroke outcomes. Our study provides an opportunity to understand risk factors of stroke, outcomes of stroke, and various combinations of the same. This is a unique strength of our study.
The study employs specific quality assessment procedures along with an IMT that regularly monitors the different processes involved in the study. The entire study methodology has been validated and approved by an EAG. In addition, the project team will monitor various activities and processes, including data collection, data storage, training, and documentation, with a focus on quality. Strict protocols with access-controlled mechanisms are established for data transfer and management. All these efforts are likely to ensure that study activities are implemented to the highest standards.
Strengths and limitations
This study has several strengths and certain limitations that need mention. The large sample size of ~200,000 individuals is a strength of the study; it allows the effect of individual risk factors and of multiple combinations of risk factors on stroke to be estimated. Five of the six risk factors being assessed in this study are known risk factors. The sixth (the mental health/stress cohort) is unique and a strength of the study. Other novel risk factors are also being assessed, namely gambling, COVID-19 infection and vaccination, sleep (insomnia and obstructive sleep apnea), exposure to media, risk of cell phone addiction and restless leg syndrome; these are unique to our study and add to its strengths. The study setting includes both urban (slum and non-slum) and rural settings in Bengaluru and Kolara, which reflects the diversity and heterogeneity of the Indian population. Using a mixed (retrospective-prospective) cohort design allows the age of onset/diagnosis to be established and risk factors to be quantified from onset/diagnosis. This enables assessment of time and dose-response relationships between the risk factors, stroke and stroke outcomes, and thus a comprehensive understanding of the natural history of stroke across the lifespan, from the stroke-free period to outcomes of stroke. Development of a stroke risk predictability calculator is a further strength of this study; there are no such calculators available in India. Despite these strengths, certain limitations need mention. Although data collectors will have been trained to collect information on sensitive issues, social desirability bias in reporting sensitive information such as substance use cannot be completely ruled out. The adopted study design is known to increase the period of observation and negate the long latency associated with development of non-communicable outcomes.
While ascertaining exposures, it is ideal to take time of onset as the beginning of exposure. However, date of diagnosis was the best available alternative given the elusive nature of onset of NCDs. This might introduce lead time bias in relation to the development of stroke as well as subsequent survival, which will be assessed and adjusted for at the analysis stage [46]. Adoption of positive behaviors or change in risk behaviors during follow-up is a known challenge in cohort studies [19], and may reduce the incidence and risk of stroke among the study groups. However, the sample size is calculated on the lowest incidence of stroke reported in the literature, which is likely to counter, to a certain extent, the reduced incidence and risk of stroke due to adoption of positive behavior. Intermixing of risk factors might complicate ascertaining measures of effect and impact. However, we propose to calculate the rate ratio as the measure of effect, which negates the contamination of exposure ascertainment due to intermixing/changing of exposure status over the study period. The project is limited to specific areas, namely one taluka of Kolara district and two selected urban wards of Bengaluru city. Owing to this limited coverage, the project may not be fully statistically representative of the diverse population and stroke burden across the entire state or country, limiting the generalizability of results. Nevertheless, the choice of study setting has its significance. The residents of the study sites come from various socio-demographic backgrounds (urban, rural, linguistic, and slum/non-slum), which provides representation of various socio-demographic, economic and cultural strata reflecting the diversity of the state or country. However, this might need to be further explored to ensure generalizability.
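One way to read the rate-ratio argument above is that person-time is allocated to the exposure status in force at each point of follow-up, so a participant who changes status contributes to both denominators. The sketch below illustrates that reading only; it is an assumption, not a procedure stated in the protocol, and the function names and participant records are hypothetical.

```python
def split_person_time(follow_up_years, exposure_start_year):
    """Split one participant's follow-up into (unexposed, exposed) person-years.

    exposure_start_year is the time since baseline at which exposure began;
    None means the participant was never exposed during follow-up.
    """
    if exposure_start_year is None or exposure_start_year >= follow_up_years:
        return follow_up_years, 0.0
    return exposure_start_year, follow_up_years - exposure_start_year


def rate_ratio(participants):
    """Incidence rate ratio (exposed vs unexposed) from person-time data.

    Assumes at least one unexposed event and non-zero person-time in
    both groups; a stroke is assigned to the status held at the event.
    """
    person_years = {"exposed": 0.0, "unexposed": 0.0}
    events = {"exposed": 0, "unexposed": 0}
    for follow_up_years, exposure_start_year, had_stroke in participants:
        unexposed_py, exposed_py = split_person_time(follow_up_years, exposure_start_year)
        person_years["unexposed"] += unexposed_py
        person_years["exposed"] += exposed_py
        if had_stroke:
            status = "exposed" if exposed_py > 0 else "unexposed"
            events[status] += 1
    rate_exposed = events["exposed"] / person_years["exposed"]
    rate_unexposed = events["unexposed"] / person_years["unexposed"]
    return rate_exposed / rate_unexposed, person_years, events


# Hypothetical records: (follow-up in years, exposure start, stroke at end of follow-up)
participants = [
    (5.0, None, False),  # never exposed
    (5.0, 2.0, True),    # became exposed after 2 years, then had a stroke
    (4.0, 0.0, True),    # exposed from baseline
    (3.0, None, True),   # stroke while unexposed
    (5.0, 0.0, False),   # exposed from baseline, no stroke
]
irr, person_years, events = rate_ratio(participants)
print(f"rate ratio = {irr:.2f}", person_years, events)
```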
Conclusion
This large population-based stroke cohort assessing stroke and its outcomes, the first endeavor of its kind, would help address the need for sufficient epidemiological data on the natural history of stroke, its management and outcomes to support evidence-based stroke care in India. As one of the largest population-based cohort studies in India, the NIMHANS-NH-SKAN project has the potential to offer valuable information about stroke epidemiology, risk factors, and care in India. This has major implications for policy making and health programming at both community and national levels, including introduction of the stroke risk predictability calculator into the Indian National Programme for Prevention & Control of Non-Communicable Diseases (NP-NCD) in India in particular and in similar countries in general.
Supporting information
S1 Table. Definitions proposed to ascertain exposures and outcomes for stroke-free period, stroke and post stroke period along with standardized tools.
https://doi.org/10.1371/journal.pone.0310309.s001
(DOCX)
Acknowledgments
The authors extend their sincere gratitude to the Expert Advisory Group, whose invaluable insights and expertise significantly contributed to the development of this scientific paper. Special thanks to Dr. Pratima Murthy, Dr. Jeyaraj Durai Pandian, Dr. Sreekumaran Nair, Dr. Prashant Mathur, Dr. Srinivasa G A, Dr. Girish Baburao Kulkarni, Mr. Sundar Ramaswamy, and Dr. G Gururaj for their guidance and support. Additionally, the Independent Monitoring Team comprising Dr. Akshaya KM, Dr. Usha S, Dr. Ramesh Holla, Dr. K Vidusha, Dr. Aravind Karinagannanavar, Dr. Malatesh Undi, Dr. Sharankumar Holyachi, and Dr. Anwith HS played a crucial role in ensuring the robustness and quality of the research. Their dedication and expertise are sincerely acknowledged. We would also like to thank the State and district health administrations of Karnataka and Kolara, respectively, and the health authorities of BBMP for providing administrative approvals for the conduct of the study. We also thank the assistant research officers in our team, Ms. Ashwini A and Shraddha S Pada.
References
- 1. Nethan S, Sinha D, Mehrotra R. Non Communicable Disease Risk Factors and their Trends in India. Asian Pac J Cancer Prev APJCP. 2017;18(7):2005–10. pmid:28749643
- 2. Noncommunicable Diseases, Rehabilitation and Disability [Internet]. [cited 2024 Jan 29]. Available from: https://www.who.int/data/gho/data/themes/noncommunicable-diseases
- 3. Institute for Health Metrics and Evaluation (IHME). Findings from the Global Burden of Disease Study 2017. Seattle, WA: IHME; 2018.
- 4. Understanding epidemiological transition in India ‐ PMC [Internet]. [cited 2024 Jan 29]. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028906/
- 5. Yadav S, Arokiasamy P. Understanding epidemiological transition in India. Glob Health Action. 2014 May 15;7: pmid:24848651
- 6. Global, regional, and national burden of stroke and its risk factors, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019 ‐ PubMed [Internet]. [cited 2024 Jan 29]. Available from: https://pubmed.ncbi.nlm.nih.gov/34487721/
- 7. Jo YJ, Kim DH, Sohn MK, Lee J, Shin YI, Oh GJ, et al. Clinical Characteristics and Risk Factors of First-Ever Stroke in Young Adults: A Multicenter, Prospective Cohort Study. J Pers Med. 2022 Sep;12(9):1505. pmid:36143290
- 8. Sanchez-Gavilan E, Montiel E, Baladas M, Lallanas S, Aurin E, Watson C, et al. Added value of patient-reported outcome measures (PROMs) after an acute stroke and early predictors of 90 days PROMs. J Patient-Rep Outcomes. 2022 Jun 13;6(1):66.
- 9. Institute for Health Metrics and Evaluation [Internet]. [cited 2024 Jan 29]. GBD India Compare. Available from: http://vizhub.healthdata.org/gbd-compare/india
- 10. Mathur P, Rangamani S, Kulothungan V, Huliyappa D, Bhalla BB, Urs V. National Stroke Registry Programme in India for Surveillance and Research: Design and Methodology. Neuroepidemiology. 2020;54(6):454–61. pmid:33075771
- 11. Karnataka Population Census 2011, Karnataka Religion, Literacy, Sex Ratio ‐ Census India [Internet]. [cited 2024 Jan 29]. Available from: https://www.censusindia.co.in/states/karnataka
- 12. ICMR-NCDIR. National Noncommunicable Disease Monitoring Survey (NNMS) 2017–18. Bengaluru, India: ICMR-NCDIR; 2020.
- 13. Murthy RS. National Mental Health Survey of India 2015–2016. Indian J Psychiatry. 2017;59(1):21–6. pmid:28529357
- 14. Mathur P, Kulothungan V, Leburu S, Krishnan A, Chaturvedi HK, Salve HR, et al. National noncommunicable disease monitoring survey (NNMS) in India: Estimating risk factor prevalence in adult population. PLoS ONE. 2021 Mar 2;16(3):e0246712. pmid:33651825
- 15. Epidemiology | Study Design and Data Analysis, Third Edition | Mark Wo [Internet]. [cited 2024 Jan 29]. Available from: https://www.taylorfrancis.com/books/mono/10.1201/b16343/epidemiology-mark-woodward
- 16. Andersen H. History and Philosophy of Modern Epidemiology. 2007;
- 17. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation ‐ Pencina ‐ 2004 ‐ Statistics in Medicine ‐ Wiley Online Library [Internet]. [cited 2024 Jan 29]. Available from: https://onlinelibrary.wiley.com/doi/10.1002/sim.1802
- 18. D’Agostino RB, Nam BH. Evaluation of the performance of survival analysis models: discrimination and calibration measures. Handbook of statistics. 2003 Jan 1;23:1–25.
- 19. Park K. Text book of preventive and social medicine. 19th ed. Jabalpur: M/S Banarsidas Bhanot Publications; 2007.
- 20. What’s the Relative Risk? A Method of Correcting the Odds Ratio in Cohort Studies of Common Outcomes | Research, Methods, Statistics | JAMA | JAMA Network [Internet]. [cited 2024 Jan 29]. Available from: https://jamanetwork.com/journals/jama/fullarticle/188182
- 21. Tenny S, Kerndt CC, Hoffman MR. Case Control Studies. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 [cited 2024 Jan 29]. Available from: http://www.ncbi.nlm.nih.gov/books/NBK448143/
- 22. Dalal PM, Malik S, Bhattacharjee M, Trivedi ND, Vairale J, Bhat P, et al. Population-based stroke survey in Mumbai, India: incidence and 28-day case fatality. Neuroepidemiology. 2008;31(4):254–61. pmid:18931521
- 23. Nagaraja D, Gururaj G, Girish N, Panda S, Roy AK, Sarma GRK, et al. Feasibility study of stroke surveillance: data from Bangalore, India. Indian J Med Res. 2009 Oct;130(4):396–403. pmid:19942742
- 24. Pandian JD, Singh G, Bansal R, Paul BS, Singla M, Singh S, et al. Establishment of population-based stroke registry in Ludhiana city, northwest India: feasibility and methodology. Neuroepidemiology. 2015;44(2):69–77. pmid:25764983
- 25. Roy R, Sukumar GM, Philip M, Gopalakrishna G. Face, content, criterion and construct validity assessment of a newly developed tool to assess and classify work–related stress (TAWS– 16). PLOS ONE. 2023 Jan 6;18(1):e0280189. pmid:36608043
- 26. WHO technical specifications for blood glucose meter [Internet]. [cited 2024 Jan 29]. Available from: https://www.who.int/publications/m/item/who-technical-specifications-for-blood-glucose-meter
- 27. Alberti KG, Zimmet PZ. Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: diagnosis and classification of diabetes mellitus provisional report of a WHO consultation. Diabet Med J Br Diabet Assoc. 1998 Jul;15(7):539–53. pmid:9686693
- 28. Mohan V, Kaur T, Anjana RM, Pradeepa RG. ICMR-India DIABetes [INDIAB] Study Phase 1 Final Report (2008–2011). Indian Council of Medical Research; 2018.
- 29. National Programme for Prevention and Control of Cancer, Diabetes, Cardiovascular Diseases & Stroke (NPCDCS) Operational Guidelines Revised (2013–2017). Directorate General of Health Services, Ministry of Health & Family Welfare, Government of India; 2013.
- 30. Performance Measures Hypertension | ACP Online [Internet]. [cited 2024 Jan 29]. Available from: https://www.acponline.org/clinical-information/performance-measures/clinical-topic/Hypertension
- 31. Cleeman J. ATP III Guidelines At-A-Glance Quick Desk Reference.
- 32. Clinical values of resting electrocardiography in patients with known or suspected chronic coronary artery disease: a stress perfusion cardiac MRI study ‐ PMC [Internet]. [cited 2024 Jan 29]. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8714441/
- 33. Tata Institute of Social Sciences (TISS), Mumbai and Ministry of Health and Family Welfare, Government of India. Global Adult Tobacco Survey GATS 2 India 2016–17. New Delhi: Ministry of Health and Family Welfare; 2018.
- 34. Heatherton TF, Kozlowski LT, Frecker RC, Fagerström KO. The Fagerström Test for Nicotine Dependence: a revision of the Fagerström Tolerance Questionnaire. Br J Addict. 1991 Sep;86(9):1119–27.
- 35. Ebbert JO, Patten CA, Schroeder DR. The Fagerström Test for Nicotine Dependence-Smokeless Tobacco (FTND-ST). Addict Behav. 2006 Sep;31(9):1716–21.
- 36. Scoring the AUDIT [Internet]. [cited 2024 Jan 29]. Available from: https://auditscreen.org/about/scoring-audit
- 37. CAGE Questionnaire: Purpose, Questions, After Results [Internet]. [cited 2024 Jan 29]. Available from: https://www.verywellhealth.com/cage-questionnaire-5216479
- 38. Cohen S, Williamson G. Perceived Stress in a Probability Sample of the United States. In: Spacapan S, Oskamp S, editors. The Social Psychology of Health. Newbury Park, CA: Sage; 1988.
- 39. SDG Indicators—SDG Indicators [Internet]. [cited 2024 Jul 27]. Available from: https://unstats.un.org/sdgs/metadata/
- 40. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: Validity of a brief depression severity measure. J Gen Intern Med. 2001;16:606–13. pmid:11556941
- 41. Outcomes Validity and Reliability of the Modified Rankin Scale: Implications for Stroke Clinical Trials [Internet]. [cited 2024 Jan 29]. Available from: https://www.ahajournals.org/doi/epub/10.1161/01.STR.0000258355.23810.c6
- 42. Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. 2006 May 22;166(10):1092–7. pmid:16717171
- 43. Modeling Survival Data: Extending the Cox Model | SpringerLink [Internet]. [cited 2024 Jan 29]. Available from: https://link.springer.com/book/10.1007/978-1-4757-3294-8
- 44. Corry NH, Williams CS, Battaglia M, McMaster HS, Stander VA. Assessing and adjusting for non-response in the Millennium Cohort Family Study. BMC Med Res Methodol. 2017 Jan 28;17(1):16. pmid:28129735
- 45. Case-Control Studies ‐ an overview | ScienceDirect Topics [Internet]. [cited 2024 Jan 29]. Available from: https://www.sciencedirect.com/topics/agricultural-and-biological-sciences/case-control-studies
- 46. Duffy SW, Nagtegaal ID, Wallis M, Cafferty FH, Houssami N, Warwick J, et al. Correcting for lead time and length bias in estimating the effect of screen detection on cancer survival. Am J Epidemiol. 2008 Jul 1;168(1):98–104. Epub 2008 May 25. pmid:18504245.