Urinary MicroRNA Profiling in the Nephropathy of Type 1 Diabetes

Background Patients with Type 1 Diabetes (T1D) are particularly vulnerable to development of Diabetic nephropathy (DN) leading to End Stage Renal Disease. Hence a better understanding of the factors affecting kidney disease progression in T1D is urgently needed. In recent years microRNAs have emerged as important post-transcriptional regulators of gene expression in many different health conditions. We hypothesized that urinary microRNA profile of patients will differ in the different stages of diabetic renal disease. Methods and Findings We studied urine microRNA profiles with qPCR in 40 T1D with >20 year follow up 10 who never developed renal disease (N) matched against 10 patients who went on to develop overt nephropathy (DN), 10 patients with intermittent microalbuminuria (IMA) matched against 10 patients with persistent (PMA) microalbuminuria. A Bayesian procedure was used to normalize and convert raw signals to expression ratios. We applied formal statistical techniques to translate fold changes to profiles of microRNA targets which were then used to make inferences about biological pathways in the Gene Ontology and REACTOME structured vocabularies. A total of 27 microRNAs were found to be present at significantly different levels in different stages of untreated nephropathy. These microRNAs mapped to overlapping pathways pertaining to growth factor signaling and renal fibrosis known to be targeted in diabetic kidney disease. Conclusions Urinary microRNA profiles differ across the different stages of diabetic nephropathy. Previous work using experimental, clinical chemistry or biopsy samples has demonstrated differential expression of many of these microRNAs in a variety of chronic renal conditions and diabetes. Combining expression ratios of microRNAs with formal inferences about their predicted mRNA targets and associated biological pathways may yield useful markers for early diagnosis and risk stratification of DN in T1D by inferring the alteration of renal molecular processes.


Matched samples from patients with microalbuminuria (MA)
This analysis concerns microRNA profiles in patients who will develop either intermittent (IMA) or persistent (PMA) microalbuminuria at two different occasions: a) before the development of MA i.e. when the urine appears to be "normal" by conventional clinical laboratory testing and b) upon development of microalbuminuria. For this analysis, a hierarchical linear mixed model was used accounting for the explicit matching between MA patients in pairs, as well as the implicit matching between normoalbuminuric and microalbuminuric samples from the same patient was used. By employing a random effect for patient pairs, the latter were allowed to have their own specific expression for each microRNA of interest, constrained to be normally distributed around the overall mean (random intercept model). Such a modification allows the model to account for confounding factors co-varying with the variables used in matching (age, sex, diabetic control i.e. level of Hemoglobin A 1 C) that could potentially affect microRNA expression. The fixed effects part of the regression equations in this model reflect the clinical classification of the patients (IMA v.s. PMA) and the laboratory classification of the corresponding samples (normoalbuminuric or microalbuminuric) through indicator variables: The corresponding set of regression coefficients is shown in When fitting the model we adopted the parameterization of Table 1 due to the higher numerical stability and better convergence of the Monte Carlo simulations. In order to compare microalbuminuric samples between patients with PMA and IMA without refitting, we formed the difference between the the second and the third coefficients of Table 2 after convergence had been achieved. The adopted model assumes the quantification cycle value of each of the control reactions (Y c i ) to be normally distributed around its mean µ c i with a standard deviation σ w (measurement noise): The mean is related to an experimental panel factor, E c (ID i ), and a control PCR specific factor B c (ID i , R i ) according to the regression equation: where ID i is the individual sample the ith control measurement came from, while R i indexes the well number of the three microRNAs (hsa-miR-423-5p, hsa-miR-103, hsa-miR-191 ) the three small RNAs (U6, SNORD38B, SNORD39A) 1 and the interplate calibrator UniSP3. The panel factors were modelled as normal random effects: which were used to effect a panel specific normalization of non-control C q values. In this equation, the standard deviation σ B can be interpreted as the magnitude of the noise in quantification cycle values due to the universal amplification step and run-to-run variability in signal acquisition by the real time PCR system. The control PCR specific factors are modelled as an additional level in the hierarchical model by exploiting the replicate PCR reactions (N R reactions in each plate using N P distinct primer sets). In the ith individual sample these are assumed to be normally distributed around their mean µ P CR i,R miR (j) with a standard deviation σ ID RmiR(j) : in which R miR(j) maps the jth PCR reaction to the corresponding primer set. To account for clinical, and laboratory classification of samples and the matching between pairs, the means were resolved with the aid of the regression: in which α c k is the expression level of the kth control reaction and a P c j,k is the (random) intercept of the kth control reaction in the the jth pair of patients. A slightly different regression was used for the analysis of the non-control PCR reactions since these were assayed only once in each sample. The N non-control measurements (from all patient samples) map to N miR unique microRNAs which are assumed to be normally distributed around their mean with the same standard deviation (measurement noise) as the control reactions: Similar to the control PCR reactions, indicator variables are used to classify the corresponding urine sample according to clinical and laboratory classification. For the non-control reactions, the panel specific factors appear as offsets that adjust individual measurements for the variable efficiency in the universal amplification step. Since these factors are not observed, they are treated as missing data which are estimated from the control reactions. This corresponds to a sequential two stage model for missing data but with full allowance for uncertainty in the estimation of unknown quantities.
In this analysis, non-informative priors were specified for the model parameters (regression coefficients, standard deviations of random effects): The values of the standard deviation of the normal priors for the mean parameters (1000) and the upper limit of the uniform priors for standard deviations (100 and 20) were motivated by numerical considerations as they yield essentially non-informative priors in the context of the finite precision arithmetic of digital computations.

Matched samples from patients with overt nephropathy and normals
This analysis is similar to the one carried out for patients with microalbuminuria in that a random effects are used to account for the matching of patients in pairs. The only difference lies in the fixed effects part of the model, i.e. the indicator variable (X Ov i = 1 if the sample came from a patient with overt nephropathy and 0 otherwise) and the corresponding regression coefficient (interpretable as ΔC q qPCR signal for overt nephropathy relative to the normal state): Non-informative N (0, 10 6 ) priors were specifed for the nephropathy effect for both control and non-control PCRs, as well as for the remaining model parameters similar to the previous sections.

Calculation of ΔΔC q values
Each of the three models allow the estimation of ΔC q values for all RNAs of interest assayed in the experiment.
To calculate the ΔΔC q values, one forms the difference between the ΔC q for the kth microRNA (given by a "beta" coefficient in the sample comparison contemplated) and the ΔC q for UniSP3. Since both ΔC q values are estimated from the data, these differences are averaged over the posterior distribution of these quantities given the raw C q signals. The corresponding integral can be approximated by forming the differences between the corresponding ΔC q samples from the Markov Chain Monte Carlo simulations. Means and Standard Errors of these ΔΔC q s were used in the meta-analysis of microRNA targets.

WinBUGS Code
WinBUGS code for the Bayesian estimation of the C q in three analyses are given in the listings below. In WinBUGS the normal distribution is parameterized in terms of the precision (inverse of the square of the standard deviation), necessitating the introduction of auxilliary variables in the code. The cut function was used in the code to ensure that only data from the control PCR reactions were used to "learn" the values of the panel specific normalization factors. During Monte Carlo Markov Chain (MCMC) simulation for the estimation of the quantification cycle (threshold crosing) values of the non-control PCR reactions, this uncertainty was taken fully into account by marginalizing over the posterior distribution of E c (ID i ) obtained by conditioning on the control data. For all models, we simulated three chains with overdispersed initial values and explored convergence both by visual inspection and formal convergence diagnostics (Gelman-Rubin). The first 10000 iterations of the simulations were discarded (burnin phase) and subsequently 200000 draws were obtained in WinBUGS. Only one draw in 100 was stored and used further, eliminating for all practical purposes the autocorrelation between consecutive samples. WinBUGS MCMC output was then imported in the R programming language for further processing (calculation of means and standard errors of ΔΔC q s, Credible Interval and pseudocontour probabilities computations, meta-analysis etc.).