Population-based identification and temporal trend of children with primary nephrotic syndrome: The Kaiser Permanente nephrotic syndrome study

Introduction: Limited population-based data exist about children with primary nephrotic syndrome (NS). Methods: We identified a cohort of children with primary NS receiving care in Kaiser Permanente Northern California, an integrated healthcare delivery system caring for >750,000 children. We identified all children <18 years between 1996 and 2012 who had nephrotic range proteinuria (urine ACR >3500 mg/g, urine PCR >3.5 mg/mg, 24-hour urine protein >3500 mg or urine dipstick >300 mg/dL) in laboratory databases or a diagnosis of NS in electronic health records. Nephrologists reviewed health records for clinical presentation and laboratory and biopsy results to confirm primary NS. Results: Among 365 cases of confirmed NS, 179 had confirmed primary NS attributed to presumed minimal change disease (MCD) (72%), focal segmental glomerulosclerosis (FSGS) (23%) or membranous nephropathy (MN) (5%). The overall incidence of primary NS was 1.47 (95% confidence interval: 1.27–1.70) per 100,000 person-years. Biopsy data were available in 40% of cases. Median age for patients with primary NS was 6.9 (interquartile range: 3.7 to 12.9) years, 43% were female and 26% were white, 13% black, 17% Asian/Pacific Islander, and 32% Hispanic. Conclusion: This population-based identification of children with primary NS leveraging electronic health records can provide a unique approach and platform for describing the natural history of NS and identifying determinants of outcomes in children with primary NS.


Introduction
Nephrotic syndrome (NS) is one of the most common kidney disorders in children, with an estimated population incidence ranging from 2 to 16 cases per 100,000 children depending on the setting and population studied [1]. Although children may exhibit NS secondary to other diseases, medication use, or infections, most pediatric NS is considered idiopathic and is attributed to three main etiologies: minimal change disease (MCD), focal segmental glomerulosclerosis (FSGS), and membranous nephropathy (MN) [1,2]. Children with NS are at risk for various adverse short- and long-term outcomes, including higher rates of infection, hypertension, venous thromboembolism, fractures, and progression to chronic kidney disease and end-stage kidney disease [3][4][5][6][7][8]. Therefore, appropriate systematic identification and population management strategies for children with NS, especially primary NS, are needed to facilitate prevention of subsequent clinical complications. Valuable insights on the etiology, management strategies, and outcomes of NS have come primarily from selected prospective cohort studies or registries for children with NS [9][10][11][12][13][14][15][16]. However, these studies often require biopsy-confirmed NS for inclusion, are limited to patients identified with diagnosis codes for NS, rely on data only from tertiary care medical centers or other selected referral settings, or combine all types of NS, such that results may not fully reflect the population-level burden of primary NS.
In this study, we estimate and characterize the population incidence of pediatric primary NS in a large, integrated healthcare delivery system through structured review of available data within electronic health records (EHR) and associated health system data sources.

Source population and study sample
The source population included members of Kaiser Permanente Northern California (KPNC), an integrated health care delivery system currently providing comprehensive care to >4.5 million members throughout Northern California. Its membership is highly representative of the regional and statewide population with regard to sociodemographic characteristics [17].
The study sample included all pediatric (age <18 years) health plan members who had nephrotic range proteinuria and/or a diagnosis code suggestive of possible NS between January 1, 1996 and December 31, 2012 using methods described in detail below. Eligible patients were identified for this study using laboratory test results and diagnosis codes associated with clinical encounters. After identifying the initial cohort of eligible patients based on EHR algorithms, targeted subgroups were selected for manual adjudication of medical records by board-certified nephrologists to confirm the presence of NS, and assign the type of NS (primary vs. secondary) and presumed cause of primary NS (MCD, FSGS or MN).
The study was approved by the KPNC institutional review board, and waiver of informed consent was obtained due to the nature of this retrospective data-only study.

Identification of potential nephrotic syndrome using proteinuria results
We identified children with nephrotic range proteinuria if they had ≥1 positive urine test result from any setting (inpatient or outpatient) found in health system laboratory databases using any of the following criteria: albumin-to-creatinine ratio (ACR) >3500 mg/g; protein-to-creatinine ratio (PCR) >3.5 mg/mg; 24-hour urine protein excretion >3500 mg; or ≥3+ on urine protein dipstick. These criteria are designed to be more specific for electronic ascertainment of potential nephrotic syndrome using thresholds of proteinuria values that are more stringent than typical clinical cutoffs [18].
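As an illustration only, the screening criteria above could be expressed as a simple rule over laboratory results. The following is a minimal sketch and not the study's actual code: the test-type labels, the `UrineResult` record, and the coding of dipstick results as integer grades (0 for negative through 4 for 4+) are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Iterable

# Thresholds are taken from the text; the dictionary keys are hypothetical labels.
QUANTITATIVE_THRESHOLDS = {
    "ACR_MG_PER_G": 3500.0,          # albumin-to-creatinine ratio, mg/g
    "PCR_MG_PER_MG": 3.5,            # protein-to-creatinine ratio, mg/mg
    "URINE_PROTEIN_24H_MG": 3500.0,  # 24-hour urine protein excretion, mg
}
DIPSTICK_MIN_GRADE = 3  # ">=3+" on urine protein dipstick (assumed integer coding)


@dataclass(frozen=True)
class UrineResult:
    test_type: str  # a key of QUANTITATIVE_THRESHOLDS or "DIPSTICK_GRADE"
    value: float    # numeric result; dipstick coded 0-4


def is_nephrotic_range(result: UrineResult) -> bool:
    """Return True if a single result meets the screening threshold."""
    if result.test_type == "DIPSTICK_GRADE":
        return result.value >= DIPSTICK_MIN_GRADE
    threshold = QUANTITATIVE_THRESHOLDS.get(result.test_type)
    if threshold is None:
        return False  # unrecognized test types never qualify
    return result.value > threshold  # strict inequality, as in the text


def has_qualifying_proteinuria(results: Iterable[UrineResult]) -> bool:
    """A child qualifies with at least one positive result from any setting."""
    return any(is_nephrotic_range(r) for r in results)
```

For example, a child with a single PCR of 4.1 mg/mg would be flagged, whereas a child whose only results are an ACR of exactly 3500 mg/g and a 2+ dipstick would not.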

Identification of potential nephrotic syndrome using diagnosis codes
We included children in the cohort if they had ≥1 primary discharge, outpatient, or emergency department diagnosis of NS during our sampling timeframe. Diagnoses of NS were ascertained from health plan databases based on relevant International Classification of Diseases, Ninth Revision (ICD-9) codes. The following ICD-9 codes were used for identification: 581, 581.0, 581.1, 581.2, 581.3, 581.8, 581.81, 581.89, and 581.9. The diagnosis codes were initially categorized into groups to facilitate the targeted physician adjudication process. Specifically, manual physician review of health records was prioritized for (1) children with documented nephrotic range proteinuria and a diagnosis of NS, (2) children with a diagnosis of NS only, and (3) children with documented nephrotic range proteinuria in the absence of a diagnosis of NS.
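The code list and review prioritization described above can be sketched as follows. This is an illustrative example under stated assumptions rather than the study's implementation: the function names are hypothetical, and codes are assumed to arrive as strings in the dotted form listed in the text.

```python
from typing import Iterable, Optional

# ICD-9 codes for NS exactly as listed in the text.
NS_ICD9_CODES = frozenset(
    {"581", "581.0", "581.1", "581.2", "581.3",
     "581.8", "581.81", "581.89", "581.9"}
)


def has_ns_diagnosis(codes: Iterable[str]) -> bool:
    """Return True if any recorded diagnosis code is on the NS list."""
    return any(code.strip() in NS_ICD9_CODES for code in codes)


def review_priority(has_proteinuria: bool, has_ns_dx: bool) -> Optional[int]:
    """Map a child to the review priority groups (1)-(3) described in the text.

    Returns None for children with neither criterion, who are not in the cohort.
    """
    if has_proteinuria and has_ns_dx:
        return 1  # nephrotic range proteinuria and a diagnosis of NS
    if has_ns_dx:
        return 2  # diagnosis of NS only
    if has_proteinuria:
        return 3  # nephrotic range proteinuria without a diagnosis of NS
    return None
```

A child with code 581.9 and no qualifying laboratory result would fall into group 2, for instance.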

Validation of nephrotic syndrome and assembly of final cohort
The overall cohort that underwent validation included 2,541 children identified with a qualifying proteinuria measurement or diagnosis codes of NS in the absence of diabetes mellitus, with the following specific validation groups: any NS diagnosis (N = 1,281); nephrotic range proteinuria based on laboratory measurements of urine ACR, PCR, or 24-hour urine protein and no documented diagnosis of kidney disease (N = 85); nephrotic range proteinuria based on ≥3 urine dipstick measurements of 3+ proteinuria and no documented diagnosis of kidney disease (N = 329); proteinuria based on ≥1 qualifying urine dipstick measurement and receiving ≥1 medical therapy used for NS (i.e., angiotensin-converting enzyme inhibitors, angiotensin II receptor blockers, azathioprine, cyclosporine, cyclophosphamide, methylprednisone, prednisone, prednisolone, tacrolimus, and mycophenolate mofetil) within 1 year of index date (N = 746); and nephrotic range proteinuria based on two urine dipstick measurements of 3+ proteinuria and no documented diagnosis of kidney disease. For the last criterion, a random sample of 100 patients was reviewed given the very large number of identified patients to evaluate the potential yield for identifying primary NS.
Children confirmed with NS using data from the EHR and manual review of medical records required evidence of symptoms and/or signs consistent with NS and a laboratory measurement indicating nephrotic range proteinuria at the time of diagnosis. Presumed cause of NS was ascertained by review of biopsy results in KPNC pathology databases, if available. For children without available biopsy results in KPNC, we incorporated information from nephrology or other treating physician notes, other laboratory values, and treatment patterns to assign the presumed etiology. Biopsies conducted outside of KPNC or before joining KPNC were not routinely captured in pathology databases and were reviewed only through manual review of provider notes. All causes of NS were considered presumed, unless a definitive biopsy result was available within or outside KPNC. More details on the specific methodology for adjudication of medical records, exclusion criteria, and rules are described in the S1 File. We excluded patients if the nephrologist reviewer could not confirm a diagnosis of NS using the criteria described above, or among confirmed NS cases, if we could not establish an index date (i.e., date of initial NS diagnosis), if the index date was before 1996 or after 2012, or if the patient was older than 18 years based on the index date found. The overall assembled cohort included 365 children with confirmed NS (Fig 1).

Covariates
Demographic characteristics (age, gender and self-reported race/ethnicity, if available) were obtained from health plan databases. We defined targeted comorbidities by diagnosis or procedure codes supplemented with available laboratory test results, outpatient vital sign data, and/or prescribed medications using EHR-based data that were cleaned and linked at the individual-patient level into the KPNC Virtual Data Warehouse as previously described and validated [19,20]. Household educational attainment and annual income were estimated using residential block-level information from U.S. census data. Low education was defined as living in a census block where more than 25% of those aged 25 years or older had less than a 12th-grade education; low income was defined as living in a block where median annual household income was less than $35,000 per year. Patient vital status was determined using comprehensive information from health plan administrative and clinical databases, member proxy reporting, Social Security Administration vital status files, and California state death certificate information.
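The census-based definitions above amount to two threshold rules. A minimal sketch is shown below; the function and parameter names are hypothetical, and the inputs are assumed to be block-level values already linked to each patient's residence.

```python
def is_low_education(pct_adults_below_12th_grade: float) -> bool:
    """Low education: more than 25% of residents aged >=25 years
    in the census block have less than a 12th-grade education."""
    return pct_adults_below_12th_grade > 25.0


def is_low_income(median_household_income_usd: float) -> bool:
    """Low income: block median annual household income below $35,000."""
    return median_household_income_usd < 35_000
```

Note that both rules use strict inequalities, so a block at exactly 25% or exactly $35,000 is not classified as low education or low income.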

Statistical analyses
We conducted all analyses using SAS version 9.3 (Cary, N.C.). We first characterized the overall cohort based on the presumed etiology of confirmed NS. Next, given the focus of the study on primary NS, we calculated crude incidence (per 100,000 person-years) and associated 95% confidence limits of primary NS during the study period by dividing the number of confirmed primary NS cases by the total person-years contributed by the pediatric KPNC population with no diabetes mellitus. We similarly calculated the crude annual incidence (per 100,000 person-years) and associated 95% confidence intervals of primary NS due to presumed MCD; the relatively small number of cases of either presumed FSGS or MN precluded our reliably calculating incidence for those types of primary NS. We averaged annual rates into the following categories of time (1996-1999, 2000-2003, 2004-2006, 2007-2009 and 2010-2012) for increased precision per time period. We calculated the age- and sex-adjusted incidence and associated 95% confidence limits of primary NS per category of time, directly standardized to the 2010-2012 population, as well as the age- and sex-stratified crude incidence of primary NS. We performed Cochran-Armitage tests to evaluate for significant linear trends over time.
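To make the standardization and trend-testing steps concrete, the sketch below shows one textbook formulation of each in Python. It is not the SAS code used in the study. The function names are hypothetical, the trend test uses equally spaced scores by default, and treating person-years as the binomial denominators in the trend test is an assumption made here for illustration.

```python
import math
from typing import Optional, Sequence, Tuple


def directly_standardized_rate(
    stratum_cases: Sequence[float],
    stratum_person_years: Sequence[float],
    reference_weights: Sequence[float],
    per: float = 100_000,
) -> float:
    """Weighted average of stratum-specific rates, with weights given by the
    reference population (e.g., age-sex strata of a chosen reference period)."""
    total_weight = sum(reference_weights)
    rate = 0.0
    for cases, py, weight in zip(stratum_cases, stratum_person_years, reference_weights):
        rate += (weight / total_weight) * (cases / py)
    return rate * per


def cochran_armitage_trend(
    cases: Sequence[float],
    totals: Sequence[float],
    scores: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Cochran-Armitage test for linear trend in proportions across ordered groups.

    Returns (z statistic, two-sided p-value from the normal approximation).
    """
    if scores is None:
        scores = list(range(len(cases)))  # equally spaced scores 0, 1, 2, ...
    n = sum(totals)
    p_bar = sum(cases) / n  # pooled proportion under the null hypothesis
    t_stat = sum(s * (c - m * p_bar) for s, c, m in zip(scores, cases, totals))
    sum_ms = sum(m * s for s, m in zip(scores, totals))
    sum_ms2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (sum_ms2 - sum_ms ** 2 / n)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value
```

With made-up inputs, `cochran_armitage_trend([5, 10, 20], [100, 100, 100])` yields a positive z statistic, indicating an increasing trend across the three ordered groups.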

Cohort assembly
Among 2,541 eligible children with potential NS whose medical records were manually reviewed, we confirmed NS in 365 children (179 [47%] classified with primary NS), and the proportion of confirmed NS cases varied substantially by method of EHR-based identification (Table 1). Children with documented proteinuria and a diagnosis code of NS were confirmed to have NS in 39% of cases as compared to only 6.5% of children who had a diagnosis code of NS but no documented proteinuria found in health system laboratory databases.

Baseline characteristics
Among children with primary NS, median age was 6.9 (interquartile range: 3.7 to 12.9) years, 43% were female, 26% were White, 13% were Black, 17% were Asian/Pacific Islander, and 32% identified as Hispanic (Table 2). Of note, one third of patients lived in a household with low educational attainment and 15% of patients in households with a median household income of less than $35,000 based on census-based classification during the study period. The prevalence of diagnosed chronic lung disease was 22%, while other documented comorbid conditions were infrequent. Data on body mass index, blood pressure, serum albumin, lipoproteins and hemoglobin were unavailable in a significant fraction of patients (Table 2).
Characteristics varied by presumed etiology of primary NS. Compared to those with presumed FSGS, children with presumed MCD were younger and had higher eGFR at index date. Although 30% of children had unknown self-reported race, children with FSGS were more likely to be Black or multiracial than patients with other etiologies of primary NS, and children with MCD were more likely to be of Asian or Pacific Islander descent. As noted previously, a large proportion of all patients did not have available body mass index, blood pressure and selected laboratory tests, which precluded comparison of those measures across types of primary NS. Kidney biopsies were performed at KPNC in 72 (40%) children classified with primary NS, with a higher biopsy rate in those with FSGS (86%) and MN (75%) than in those cases attributed to MCD (23%) (Table 2).

Temporal trends of primary NS
The overall incidence of primary NS within our pediatric population was 1.47 (95% confidence interval [CI]: 1.27-1.70) per 100,000 person-years. The incidence of all primary NS increased significantly over the study period, from 1.04 (95% CI: 0.72-1.51) per 100,000 person-years in 1996-1999, to 1.88 (95% CI: 1.38-2.55) per 100,000 person-years in 2007-2009 and 1.62 (95% CI: 1.17-2.24) per 100,000 person-years in 2010-2012 (Fig 3). The incidence of primary NS due to presumed MCD also increased over the study period (P = 0.01 for trend) (Fig 3). After standardizing for the distribution in age and sex using the 2010-2012 Kaiser Permanente Northern California overall pediatric population, the adjusted incidence was similar to the crude incidence (S1 Fig in S1 File). When stratified by age categories, we observed a significant increase in the incidence of primary NS among children aged 6-11 years, but not among children aged 0-5 years or 12-17 years (S2 Fig in S1 File). We also observed a significant increase in the incidence among boys over time, but not girls, with the overall rates being higher in boys than in girls (S3 Fig in S1 File).
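As a worked illustration of the crude incidence arithmetic, the sketch below back-calculates the person-years implied by the reported overall rate and then computes a 95% interval on the log scale. Two caveats apply. The person-years figure is not reported in the text and is only approximate because it is derived from a rounded rate. The log-scale normal approximation is one common choice for a Poisson rate and is an assumption here; the text does not state which interval method was used. Function names are hypothetical.

```python
import math
from typing import Tuple


def crude_incidence(cases: float, person_years: float, per: float = 100_000) -> float:
    """Crude incidence: cases divided by person-years at risk, scaled."""
    return cases / person_years * per


def log_scale_ci(cases: float, rate: float, z: float = 1.96) -> Tuple[float, float]:
    """Approximate CI for a Poisson rate: rate * exp(+/- z / sqrt(cases))."""
    factor = math.exp(z / math.sqrt(cases))
    return rate / factor, rate * factor


reported_cases = 179   # confirmed primary NS cases (from the text)
reported_rate = 1.47   # per 100,000 person-years (from the text)

# Back-calculated denominator; an approximation, not a reported value.
implied_person_years = reported_cases / reported_rate * 100_000

rate = crude_incidence(reported_cases, implied_person_years)
low, high = log_scale_ci(reported_cases, rate)

print(f"Implied person-years: about {implied_person_years / 1e6:.1f} million")
print(f"Incidence: {rate:.2f} ({low:.2f}-{high:.2f}) per 100,000 person-years")
```

Under these assumptions the implied denominator is roughly 12.2 million person-years, and the resulting interval is close to the reported 1.27-1.70.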

Discussion
Leveraging EHR, clinical and administrative data over a 15-year period within a pediatric population receiving care in an integrated healthcare delivery system in California, we developed a structured methodology to systematically delineate the population incidence and selected characteristics of children with primary NS. We identified 179 children who had confirmed primary NS, with an incidence of 1.47 per 100,000 person-years averaged over the entire study period. We also observed that the incidence of confirmed primary NS increased within our pediatric population from 1.04 per 100,000 person-years in 1996-1999 to 1.62 per 100,000 person-years in 2010-2012. This increase may be driven by increases in incidence among children aged 6-11 years, and particularly among boys compared to girls. However, it is unclear whether this increase reflects a true increase in NS incidence in the underlying populations, or a change in practice patterns or population demographics over time. Children with primary NS had low comorbidity burden overall, with the exception of diagnosed chronic lung disease, which has previously been described as associated with certain types of NS in children [21,22]. As expected, the availability of kidney biopsies in KPNC among children with presumed MCD was much lower than in children classified with FSGS or MN, as invasive procedures are frequently not performed in children with lower severity or steroid-sensitive NS. When applied retrospectively and prospectively, these methods have the potential to aid in the systematic identification of children with NS, in designing population-based care delivery strategies, and in facilitating observational research and recruitment into clinical trials.
Our observed incidence estimates are comparable to the limited number of other population-based studies on pediatric NS. In the Netherlands, Bakkali et al. observed an overall NS incidence of 1.52 per 100,000 person-years in children from 2003-2006 but did not distinguish between primary and secondary NS [12]. In a single-center, hospital-based study of diagnosed pediatric NS in Toronto, Banh et al. observed an increase in incidence from 1.99 to 4.71 per 100,000 person-years from 2001 to 2011 [23]. In addition, Kikunaga et al. reported an incidence of 6.49 per 100,000 person-years in Japan from 2010-2012, although this estimated incidence was based on survey results and may overestimate true incidence based on standardized   [25]. Variation in reported incidence across populations is not fully understood but is likely driven, at least in part, by differences in the identification methodology as well as racial and ethnic distributions of the source populations. For example, the majority of Toronto cohort members were South Asians, who are known to have a higher incidence of NS, and the Paris cohort used solely laboratory values to identify NS, likely leading to inclusion of secondary NS [23,25,26]. In contrast to other studies using only diagnosis codes to identify NS, we additionally performed targeted manual chart review by board-certified nephrologists for evidence of a clinical presentation consistent with NS, which enhanced our specificity and reduced misclassification compared with reliance only on administrative diagnosis codes. To our knowledge, only one other published study has attempted to evaluate data from EHR systems to identify rare kidney diseases in a pediatric population. In that study, Denburg et al. developed computable phenotypes to identify a broad set of glomerular diseases using a combination of diagnosis codes, transplant procedure codes, and kidney biopsies [27]. While the authors did not incorporate laboratory data or address primary NS, their findings further support the potential of using EHR data to accurately identify patients with glomerular diseases, including idiopathic NS, which can enable more population-based care and research for rarer kidney diseases.
A major strength of our study is the inclusion of a large, ethnically diverse pediatric population receiving care in a fully integrated healthcare delivery system where essentially all care is coordinated across inpatient, emergency department and outpatient settings. This allowed for the assessment of available laboratory data across all practice settings to systematically screen for children with evidence of nephrotic-range proteinuria, as well as the assessment of baseline demographic and risk factors and conditions, if documented in the EHR. In addition, all primary NS cases were confirmed through targeted manual review of medical records by nephrologists, which is considered the gold standard for retrospective case identification and which also demonstrated the suboptimal performance of diagnosis codes alone to identify children with NS.
Our study also has certain limitations. Because this retrospective study relied on data collected as part of clinical care across an earlier time period in children, comprehensive data were not available for all variables of interest, with a significant proportion of patients missing data for vital signs, body mass index and selected laboratory tests. Biopsy reports from procedures performed at other facilities were also not comprehensively available for review, and the presumed etiologies of MN and FSGS in those patients were derived using diagnosis codes and other provider notes. This lack of available laboratory and biopsy data is likely also driven by less severe or steroid-sensitive NS cases, in which a provider may choose not to perform a more invasive blood test or procedure if the proteinuria and clinical presentation are consistent with NS and the condition was controlled after an initial treatment regimen. This is in contrast to prospective studies using a structured research protocol where recruited children with NS can undergo regular data collection and laboratory testing. However, we believe that our approach could be widely used by health systems with available EHR-based laboratory and diagnosis data to facilitate identification of children with potential primary NS who may benefit from systematic evaluation and laboratory testing to confirm their diagnosis and subsequent disease management. We recognize that manual review of medical records by nephrologists to confirm cases of primary NS on a large scale can be time- and resource-intensive, so additional efforts are needed to refine diagnostic approaches in even more contemporary populations and advanced EHR systems, along with incorporating methods such as natural language processing and other machine learning methods applied to unstructured EHR notes and biopsy reports [28].
Given the lower number of children with primary NS due to presumed FSGS and MN, we were unable to calculate stable estimates of changes in incidence of these types of primary NS over the study period. We were also unable to further characterize racial differences in the incidence of primary NS due to the limited availability of data on self-reported race. Our proteinuria thresholds for the initial electronic identification of children with NS were more stringent by design to increase specificity. This may have resulted in underestimation of NS incidence among those without a diagnosis code and less severe proteinuria. Although the large majority of our insured population retains continuous membership for multiple years, some patients may disenroll before a diagnosis or laboratory result is obtained which could result in a potentially reduced incidence estimate. Finally, given that our study was conducted within an integrated healthcare delivery system in Northern California, our findings may not be fully generalizable to other geographic areas, health care systems, or uninsured patients.
In conclusion, we developed a large-scale approach based on EHR data combined with targeted physician adjudication to systematically identify and delineate the incidence and characteristics of children with primary NS treated in an integrated healthcare delivery system. Further efforts to refine our methods to incorporate more comprehensive EHR data and advanced machine learning methods are needed to promote more efficient population-based identification and characterization of patients with primary NS who may benefit from structured management and follow-up.