HLA and non-HLA genes and familial predisposition to autoimmune diseases in families with a child affected by type 1 diabetes

Genetic predisposition could be assumed to cause clustering of autoimmunity in individuals and families. We tested whether HLA and non-HLA loci associate with such clustering of autoimmunity. We included 1,745 children with type 1 diabetes from the Finnish Pediatric Diabetes Register. Data on personal or family history of autoimmune diseases were collected with a structured questionnaire and, for a subset, with a detailed search for celiac disease and autoimmune thyroid disease. Children with multiple autoimmune diseases or with multiple affected first- or second-degree relatives were identified. We analysed type 1 diabetes-related HLA class II haplotypes and genotyped 41 single nucleotide polymorphisms (SNPs) outside the HLA region. The HLA-DR4-DQ8 haplotype was associated with having type 1 diabetes only, whereas the HLA-DR3-DQ2 haplotype was more common in children with multiple autoimmune diseases. Children with multiple autoimmune diseases showed nominal association with RGS1 (rs2816316), and children coming from an autoimmune family with rs11711054 (CCR3-CCR5). In multivariate analyses, the overall effect of non-HLA SNPs on both phenotypes was evident, associations with RGS1 and the CCR3-CCR5 region were confirmed, and additional associations were implicated: NRP1, FUT2, and CD69 for children with multiple autoimmune diseases. In conclusion, the HLA-DR3-DQ2 haplotype and some non-HLA SNPs contribute to the clustering of autoimmune diseases in children with type 1 diabetes and in their families.


Deriving the multivariate models
We use generalized linear models (e.g. McCullagh and Nelder 1989) to investigate the relationship between binary outcomes and different covariates of interest. These models are typically used as ad hoc tools, merely by assuming some link function (e.g. logit regression). In this section we show how a complementary log-log link function arises naturally from assuming proportional hazards between different genotypes.
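In models 1 and 3, we investigate whether the SNP covariates (HLA class II haplotypes in model 1, and non-HLA SNPs in model 3) help to predict additional autoimmune diseases (AIDs) in a child. Let us assume that the baseline risk of contracting any non-diabetic AID is a constant $\lambda_0$ per time unit,
$$\lambda(t) = \lambda_0. \tag{A1}$$
Then it follows that the probability of having at least one non-diabetic AID at age $t$ is
$$P(t) = 1 - \exp(-\lambda_0 t). \tag{A2}$$
Now, let us consider individual $i$ with individual-specific covariates $x_i$ and age at diagnosis $t_i$. If we assume proportional hazards, the individual-specific risk is given by
$$\lambda_i = \lambda_0 \exp(x_i^\top \beta), \tag{A3}$$
where $\beta$ is a vector of regression coefficients. It follows from (A2) that the individual-specific probability is given by
$$p_i = 1 - \exp\left(-\lambda_0 t_i \exp(x_i^\top \beta)\right) = 1 - \exp\left(-\exp(\log \lambda_0 + x_i^\top \beta + \log t_i)\right). \tag{A4}$$
Equation A4 can be estimated by using a generalized linear model with a complementary log-log link for binary data, since $\log(-\log(1 - p_i)) = \log \lambda_0 + x_i^\top \beta + \log t_i$. The age at diagnosis enters as $\log t_i$ with its coefficient fixed at one, that is, it is to be modelled as an offset. However, we also estimate an alternative model as a robustness check:
$$p_i = 1 - \exp\left(-\exp(\log \lambda_0 + x_i^\top \beta + \gamma \log t_i)\right). \tag{A5}$$
In this model, $\gamma$ is a parameter to be estimated freely, and consequently, log-age is treated as any other covariate.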
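The link between the hazard assumptions and the complementary log-log scale can be checked numerically. The following is a minimal sketch, assuming a constant baseline hazard and proportional hazards as described above; the function names and all numerical values (`lambda0`, `beta`, `x`, `t`) are hypothetical and chosen only for illustration, not estimates from the data.

```python
import math

def prob_aid(t, x, beta, lambda0, gamma=1.0):
    """Probability of at least one non-diabetic AID by age t.

    gamma = 1 corresponds to log-age entering as an offset; any other
    value mimics the alternative model with a free log-age coefficient.
    """
    linear = sum(b * v for b, v in zip(beta, x))
    cumulative_hazard = lambda0 * (t ** gamma) * math.exp(linear)
    return 1.0 - math.exp(-cumulative_hazard)

def cloglog(p):
    """Complementary log-log transform of a probability."""
    return math.log(-math.log(1.0 - p))

# Hypothetical values, for illustration only.
lambda0 = 0.01      # baseline hazard per year
beta = [0.4, -0.2]  # regression coefficients
x = [1, 0]          # covariates of one child
t = 8.5             # age at diagnosis in years

p = prob_aid(t, x, beta, lambda0)
# Linear predictor on the cloglog scale: intercept + covariates + offset.
eta = math.log(lambda0) + sum(b * v for b, v in zip(beta, x)) + math.log(t)
print(abs(cloglog(p) - eta) < 1e-9)
```

The check confirms that the transformed probability equals the intercept plus the covariate effects plus log-age, which is why log-age can be supplied as an offset to a complementary log-log regression.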
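In models 2 and 4, we investigate whether the SNP covariates (HLA class II haplotypes in model 2, and non-HLA SNPs in model 4) help to predict the AID status of a family. If we assume proportional hazards in line with A1, then the joint hazard of the family can be modelled as
$$\lambda_j = \sum_{k=1}^{n_j} \lambda_0 \exp(x_{jk}^\top \beta), \tag{A6}$$
where $x_{jk}$ denotes the vector of covariates for the $k$th member of family $j$, and $n_j$ denotes the size of family $j$. However, we do not observe $x_{jk}$ directly, and $n_j$ changes in time in an unobserved manner.
Thus, we assume that the family-specific risk depends on the covariates of the index patient (denoted by $x_j$) and the number of family members $n_j$ as
$$\lambda_j = \lambda_0 n_j \exp(x_j^\top \beta). \tag{A7}$$
In line with A3 and A4, we derive the model to be estimated as
$$p_j = 1 - \exp\left(-\exp(\beta_0 + x_j^\top \beta + \log n_j)\right), \tag{A8}$$
where the intercept $\beta_0$ absorbs $\log \lambda_0$ and the time at risk. We note that we do not include the log-age of the index patient, $\log t_j$, on the right-hand side of A8, as most of the person-time at risk comes from other family members, and $t_j$ is relatively unimportant regarding the whole family.
However, we run an alternative model as a robustness check:
$$p_j = 1 - \exp\left(-\exp(\beta_0 + x_j^\top \beta + \log n_j + \gamma \log t_j)\right). \tag{A9}$$
In A9, $\gamma$ is a parameter to be freely estimated.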
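The approximation behind the family-level model can be illustrated with a small numerical sketch. It assumes one reading of the text above: the joint family hazard is the sum of member-level proportional hazards, and substituting the index patient's covariates for every member gives a hazard proportional to family size. The function names, covariate vectors, and parameter values are hypothetical.

```python
import math

def linear_predictor(x, beta):
    return sum(b * v for b, v in zip(beta, x))

def joint_family_hazard(member_covariates, beta, lambda0):
    """Sum of the member-level proportional hazards within one family."""
    return sum(lambda0 * math.exp(linear_predictor(x, beta))
               for x in member_covariates)

def index_based_hazard(index_covariates, n_members, beta, lambda0):
    """Approximation that uses only the index patient's covariates."""
    return n_members * lambda0 * math.exp(linear_predictor(index_covariates, beta))

# Hypothetical values, for illustration only.
lambda0 = 0.01
beta = [0.4, -0.2]
index_x = [1, 0]

same = [index_x] * 4                       # all members share the index covariates
mixed = [[1, 0], [0, 0], [0, 1], [1, 1]]   # members differ from the index patient

exact_same = joint_family_hazard(same, beta, lambda0)
exact_mixed = joint_family_hazard(mixed, beta, lambda0)
approx = index_based_hazard(index_x, 4, beta, lambda0)
print(exact_same, exact_mixed, approx)
```

The approximation is exact only when relatives share the index patient's covariates. When they differ, the two hazards diverge, which is one reason the family-level specification is treated as an ad hoc simplification and checked for robustness.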

Robustness checks
1. Backward model selection
In the main matter, we estimate A4 and A8 by using complementary log-log regression and perform stepwise forward model selection for the SNP covariates. We note that stepwise model selection may stop at a locally optimal model (as opposed to a globally optimal one), and also that an exhaustive search over the whole model space of models 3 and 4 is impractical ($2^{33}$ = 8,589,934,592 candidate models). Thus, we check the robustness of our results by running stepwise model selection in the backward direction. The results are presented in Table T1.
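Backward elimination can be sketched generically as follows. This is a minimal illustration rather than the procedure used for the reported results: the `score` callable stands in for refitting the complementary log-log model and returning its AIC, and the covariate names and the toy scoring function are hypothetical.

```python
def backward_select(candidates, score):
    """Greedy backward elimination.

    Starts from the full model and repeatedly drops the covariate whose
    removal lowers the score (e.g. AIC) the most. Stops when no single
    removal improves the score, which may be a local optimum only.
    """
    current = list(candidates)
    best = score(current)
    while current:
        trials = [(score([c for c in current if c != d]), d) for d in current]
        trial_score, drop = min(trials)
        if trial_score >= best:
            break
        current.remove(drop)
        best = trial_score
    return current, best

# Toy stand-in for an AIC: each covariate reduces the deviance by a fixed
# (hypothetical) amount and costs a penalty of 2.
gain = {"snp_a": 10.0, "snp_b": 0.5, "snp_c": 3.0}

def toy_aic(model):
    return 100.0 - sum(gain[c] for c in model) + 2 * len(model)

selected, final_score = backward_select(list(gain), toy_aic)
print(selected, final_score)

# Size of the model space for 33 candidate covariates.
n_models = 2 ** 33
```

Each pass evaluates only as many models as there are remaining covariates, so the search is far cheaper than enumerating all subsets, at the price of possibly missing the globally optimal model.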
Regarding HLA class II haplotypes (models 1 and 2), stepwise backward model selection

Table T1. Multivariate models by using stepwise backward model selection.
The joint P values concern the HLA class II haplotypes or non-HLA SNPs.

2. Bayesian information criterion
The Bayesian information criterion (BIC, Schwarz 1978) is sometimes used as an alternative to AIC. It is based on different statistical principles, and it generally favors a more parsimonious model than AIC, because its penalty per parameter, $\ln n$, exceeds the AIC penalty of 2 whenever the sample size $n$ is larger than about 7. Thus, in this subsection, we run the stepwise forward model selection to minimize BIC. The results are presented in Table T2.
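The difference between the two criteria can be made concrete with their standard definitions, AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L. In the sketch below, the log-likelihood values are hypothetical, and n = 1745 is used only because it is the number of children in the register; the sample sizes actually entering each model may differ.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical fit: adding one covariate raises the log-likelihood by 2.
n = 1745
log_lik_small, k_small = -500.0, 5
log_lik_large, k_large = -498.0, 6

aic_prefers_large = aic(log_lik_large, k_large) < aic(log_lik_small, k_small)
bic_prefers_large = bic(log_lik_large, k_large, n) < bic(log_lik_small, k_small, n)
print(aic_prefers_large, bic_prefers_large)
```

With these numbers the extra covariate lowers AIC but raises BIC, which is the mechanism by which BIC prunes covariates that AIC retains.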
In these data, the use of BIC prunes away most SNPs (and HLA class II haplotypes), but DR4-DQ8 and rs11711054 (CCR3-CCR5) are still retained in models 1 and 4, respectively. Moreover, their effects are almost equal to those found in the main matter by using AIC. (The differences arise from different sets of other covariates.) The effects of the confounding factors are largely similar to those found in the main matter, with plasma glucose level and log-GADA level being the only significant confounders.

Table T2. Multivariate models by using Bayesian information criterion.
The joint P values concern the HLA class II haplotypes or non-HLA SNPs.

3. Alternative person-time
The models of the main matter are based on equations A4 and A8. The derivation of A4 assumes that the baseline risk is constant in time, and A8 is based mainly on ad hoc arguments.
Consequently, it is possible that the relationship between the person-time and the outcome of interest is misspecified in both models. As the person-time at risk affects the probability of a positive outcome, it is worthwhile to further investigate this issue.
As a robustness check, we run estimation and model choice by using alternative specifications for the person-time (equations A5 and A9). For these analyses, we use AIC and stepwise forward model selection. The results are presented in Table T3.
Regarding HLA class II haplotypes (models 1 and 2), model specifications A5 and A9 give the very same model as A4 and A8 (main matter), apart from the log-age covariate. In model 3 (non-HLA SNPs and AID clustering within the child), rs763361 (CD226) has been left out of the model, but the effects of the other SNPs have the same directions and a similar pattern of significance.
For model 4 (non-HLA SNPs and autoimmune families), the results are very similar to those given in the main matter. The same set of SNPs has been chosen by using both specifications (A8 in the main matter and A9 here).

Table T3. Multivariate models by using alternative specifications of person-time.
The joint P values concern the HLA class II haplotypes or non-HLA SNPs.

Conclusion
The results are largely insensitive to the direction of model choice (robustness check 1), although backward model selection chooses a few more non-HLA SNPs for autoimmune families (model 4).
The results are somewhat sensitive to using BIC in place of AIC as a model choice criterion (robustness check 2). However, the results we present in the main matter are justified not only by AIC but also by univariate and multivariate significance tests (Wald and LR, respectively). Moreover, the effects of DR4-DQ8 and rs11711054 (CCR3-CCR5) are still retained if BIC is used in place of AIC.
The results are insensitive to using log-age as a covariate, and thus to altering the specification of person-time (robustness check 3). In this case, the results differ only regarding one SNP (rs763361) in model 3.