Exome Sequencing and Prediction of Long-Term Kidney Allograft Function

Abstract Current strategies to improve graft outcome following kidney transplantation consider information at the HLA loci. Here, we used exome sequencing of DNA from ABO compatible kidney graft recipients and their living donors to determine recipient and donor mismatches at the amino acid level over entire exomes. We estimated the number of amino acid mismatches in transmembrane proteins, more likely to be seen as foreign by the recipient’s immune system, and designated this tally as the allogenomics mismatch score (AMS). The AMS can be measured prior to transplantation with DNA for potential donor and recipient pairs. We examined the degree of relationship between the AMS and post-transplantation kidney allograft function by linear regression. In a discovery cohort, we found a significant inverse correlation between the AMS and kidney graft function at 36 months post-transplantation (n=10 recipient/donor pairs; 20 exomes) (r2>=0.57, P<0.05). The predictive ability of the AMS persists when the score is restricted to regions outside of the HLA loci. This relationship was validated using an independent cohort of 24 recipient donor pairs (n=48 exomes) (r2>=0.39, P<0.005). In an additional cohort of living and mostly intra-familial recipient/donor pairs (n=19, 38 exomes), we validated the association after controlling for donor age at time of transplantation. Finally, a model that controls for donor age, HLA mismatches and time post-transplantation yields a consistent AMS effect across these three independent cohorts (P<0.05). Taken together, these results show that the AMS is a strong predictor of long-term graft function in kidney transplant recipients. One Sentence Summary Prediction of long-term kidney graft function with exome sequencing


Introduction 53 54
Survival of patients afflicted with End Stage Renal Disease (ESRD) is superior following kidney 55 transplantation compared to dialysis therapy. The short-term outcomes of kidney grafts have 56 steadily improved since the early transplants (performed in the 1960s) with refinements in 57 immunosuppressive regimens, use of DNA-based HLA typing, and better infection prophylaxis 58 (1)(2)(3). Despite these advances, data collected across the USA and Europe show that 40-50% of 59 kidney allografts fail within ten years of transplantation (4). This observation strongly suggests 60 that as yet uncharacterized factors, including genomic loci, may adversely impact long-term 61 post-transplantation outcomes. 62 Observational studies have demonstrated the importance of matching for the HLA-determined 63 proteins on kidney graft outcome. Therefore, in many countries, including the USA, donor 64 kidney allocation algorithms includes consideration of HLA matching. With widespread 65 incorporation of HLA matching in organ allocation decisions, it has become clearer that HLA 66 mismatching represents an important risk factor for kidney allograft failure but fails to fully 67 account for the invariable decline in graft function and failure in a large number of cases over 68 time. Indeed, only a 15% survival difference exist at 10 years post transplantation between the 69 fully matched kidneys and the kidneys mismatched for both alleles at the HLA-A, B and DR loci 70 (5). These observations suggest that mismatches at non-HLA loci in the genome could influence 71 long-term graft outcomes. The current clinical practice of prescribing life-long 72 immunosuppressive therapy to recipients of fully HLA matched donor kidneys, but not to 73 recipients of monozygotic identical twin kidneys, also suggests a role for non-HLA related 74 genomic factors on graft outcome. While tests of allelic frequencies are a hallmark of genetic research, transplantation has none of 76 the Mendelian characteristics for which genetic tests have been developed. Therefore, the 77 assumption of the Mendelian transmission model seems inadequate to develop predictors of graft 78 function following transplantation. Indeed, previous attempts at using this methodology have 79 identified small genotype effects on graft function in cohorts of hundreds of transplant patients, 80 but often could not be replicated in independent cohorts (reviewed in (6)). 81 In this work, we present a new method to estimate the genomic compatibility between the organ 82 graft recipient and donor. This approach, designated as allogenomics in this communication, 83 considers the entire coding sequence of both recipient and donor genomes, as determined by 84 exome sequencing. The allogenomics concept makes it possible to estimate a quantitative 85 compatibility score between the genomes of a recipient and potential donor and is calculated 86 from genotypes and genome annotations available before transplantation. The allogenomics 87 approach does not assume a Mendelian inheritance model but integrates the unique features of 88 transplantation such as the existence of two genomes in a single individual and the recipient's 89 immune system mounting an immune response directed at antigens displayed by the donor 90 kidney. In this report, we show that this new concept helps predict long-term kidney transplant 91 function from the genomic information available prior to transplantation. 92

95
The allogenomic concept and the allogenomics mismatch score (AMS) 96 97 The allogenomics concept is the hypothesis that interrogation of the coding regions of the entire 98 genome for both the organ recipient and organ donor DNA can identify the number of 99 incompatible amino-acids (recognized as non-self by the recipient) that inversely correlates with 100 long-term graft function post transplantation. Fig. 1A is a schematic illustration of the 101 allogenomics concept. Because human autosomes have two copies of each gene, we consider 102 two possible alleles in each genome of a transplant pair. To this end, we estimate allogenomics 103 score contributions between zero and two, depending on the number of different amino acids that 104 the donor genome encodes for at a given protein position. Fig. 1B shows the possible 105 allogenomics score contributions when the amino acids in question are either an alanine, or a 106 phenylalanine or an aspartate amino acid. The allogenomics mismatch score (AMS) is a sum of 107 amino acid mismatch contributions. Each contribution represents an allele coding for a protein 108 epitope that the donor organ may express and that the recipient immune system could recognize 109 as non-self (see Equation 1 and 2 in Fig. 1C and Materials and Methods and full description in 110 the supplementary appendix). 111 112 We have developed and implemented a computational approach to estimate the allogenomics 113 mismatch score from genotypes derived for pairs of recipient and donor genomes. (See Materials  114 and Methods for a detailed description of this approach and its software implementation, the 115 allogenomics scoring tool, available at http://allogenomics.campagnelab.org.) Our approach is 116 designed to consider the entire set of protein positions measured by a genotyping assay, or restrict the analysis to a subset of positions P in the genome. In this study, we focus on the subset 118 of genomic sites P that encode for amino acids in transmembrane proteins. In order to test the allogenomics hypothesis, we isolated DNA from 10 kidney graft recipients 125 and their living donors (Discovery Cohort), performed whole exome sequencing and analyzed 126 genotype data for these recipient/donor genome pairs (10 pairs, 20 exomes). These patients were 127 a subset of patients enrolled in a multicenter Clinical Trial in Organ Transplantation-04 (CTOT-128 04) study of urinary cell mRNA profiling, from whom tissue/cells were collected for future 129 mechanistic studies (7). Table 1 provides demographic and clinical information about the 130 patients included in the Discovery Cohort. Exome data were obtained with the Illumina TrueSeq 131 exome enrichment kit v3, covering 62Mb of the human genome. Primary sequence data analyses 132 were conducted with GobyWeb (8) (data and analysis management), Last (9) (alignment to the 133 genome) and Goby (10) (genotype calls). 134 Kidney graft function is a continuous phenotype and is clinically evaluated by measuring serum 135 creatinine levels or using estimated glomerular filtration rate (eGFR)(11). In this study, kidney 136 graft function was evaluated at months 12, 24, 36 and/or 48 following transplantation using 137 serum creatinine levels and eGFR, calculated using the 2011 MDRD (12) formula. We examined 138 whether the allogenomics mismatch score is associated with post-transplantation allograft 139 function. 140 We found positive linear associations between the allogenomics mismatch score and serum 142 creatinine level at 36 months post transplantation (r 2 adj.=0.78, P=0.002, n=10, at 36 months) but 143 not at 12 or 24 months following kidney transplantation ( Fig. 2A, B, C). We also found a 144 negative linear relationship between the score and eGFR at 36 months post transplantation (r 2 145 adj.=0.57, P=0.02) but not at 12 or 24 months following kidney transplantation (Fig. 2D, E, F). 146 These findings suggest that in the discovery cohort the allogenomics score is predictive of long-147 term graft function, but perhaps not short-term function. 148 149

The AMS correlates with graft function in a second, independent cohort of non-related living 150 kidney donor/recipient pairs 151 152
We sought to validate the observation that the AMS is associated with post-transplantation 153 kidney graft function by testing the association in an independent cohort of kidney transplant 154 patients. To this end, we sequenced DNA collected from 24 additional kidney donor/recipient 155 pairs (see Table 1 for information about subjects included in the Validation cohort). DNA 156 sequencing was performed using the Agilent Haloplex assay covering 37Mb of the coding 157 sequence of the human genome. We called the genotypes and estimated the AMS as described 158 for the discovery cohort (see Methods). Fig. 3 shows that, as observed with the Discovery 159 cohort, the AMS correlates progressively better with kidney graft function at longer times 160 following transplantation. At 36 months post transplantation, a small to moderate positive 161 association was observed between the allogenomics mismatch score and the serum creatinine 162 level (r 2 adj. 0.139, P=0.049) (Fig. 3C) and eGFR (r 2 adj. 0.078, P=0.11) (Fig. 3G). The 163 association between the score and graft function was stronger and reached significance at 48 164 months post transplantation for both creatinine level (r 2 adj. 0.394, P<0.01) (Fig. 3D), and eGFR (r 2 adj. 0.284, P=0.02) (Fig. 3H), further validating the association in the Validation cohort. In 166 order to test whether models trained on one cohort would generalize to another similar cohort, 167 we trained models on the Discovery cohort and used the fixed model to predict graft function in 168 the Validation cohort. Fig. S1 shows that such a fixed model does generalize when presented 169 with new recipient-donor pairs, and also exhibited better fit to the longer 48-month time-point 170 compared to the earlier time point (Fig. S1B vs. Fig. S1C). Similarly, models trained on the 171 Validation cohort generalize to the Discovery cohort (Fig. S2). These results establish that the 172 parameters of the models are stable, despite the relatively small numbers of kidney recipient-173 donor pairs included in the Discovery and Validation cohorts. 174

175
The AMS weakly correlates with graft function in a third cohort 176 In order to further test the strength of the relationship between AMS and graft function we 177 applied the approach to a third independent cohort. To this end, we assembled a new cohort 178 composed mostly of living related kidney donor pairs. We used a peculiarity of the French 179 transplant system, directed by the French national agency for organ procurement transplant, 180 which until recently did not allow living non-related kidney transplantation, to select 19 181 additional pairs from one French center. Demographic and clinical data for this cohort are shown 182 in Table 1. As expected in this situation, the range of the AMS is lower than in the other two 183 cohorts, ranging from 349 to 811 in the well-matched cohort (mean 559 ± 147). This range was 184 700 to 1630 (mean 1094 ± 259) in the validation cohort and 994 to 2033 in the discovery cohort 185 (mean 1335 ± 304). These data suggest that the effective range of variation of the AMS is 186 approximately 300-2000 (1700 AMS units). 187 In this third cohort, we did not find the simple association between the AMS and graft function 188 that we observed in the first two cohorts. However, after correction for the strong effect of donor age, we observe a trend in the same direction as the association observed in the first two cohorts

Impact of genotyping platform on the estimation of the AMS 251
We studied the impact of the genotyping platform on the estimation of the AMS (Fig. S3). Large 252 cohorts of matched recipient and donor DNA are being assembled and genotyped with SNP chip 253 array technology such as the Illumina 660W bead array platform (13). We asked whether such 254 platforms would be appropriate to validate the allogenomics model in large cohorts. S3B indicates that the exome assay captures many more sites with rare polymorphisms (minor 257 allele frequency <5%) than the GWAS array platform. This is expected because exome assays 258 directly sequence an individual DNA, while GWAS platforms are designed with a fixed set of 259

Impact on eGFR
polymorphisms and will not include many of the rare polymorphisms any given individual may 260 carry. Case-control designs are appropriate when studying phenotypes that are expected to be 282 associated with genotypes that follow a Mendelian inheritance mechanism. We reason that 283 transplant patients are not ideal subjects for this experimental design. Indeed, patients who 284 received a kidney transplant have two genomes in their body: their germline DNA, and the DNA 285 of the donor. In cases when the transplanted kidney was from an unrelated donor (e.g., organs 286 from deceased donors), it is clear that a Mendelian genetic transmission mechanism is not at 287 play. Importantly, even in cases where the donor is one of the parents of the transplant recipient 288 (familial, living related transplant), the genome of the parent will break the assumptions of Mendelian inheritance. Because the transplant recipient has two genomes after transplant, it is 290 not appropriate to assume that genomic markers can be identified when assuming a Mendelian 291 inheritance process. Yet, this assumption has been made in most of the transplantation genomic 292 studies published to date. 293

294
The allogenomics concept that we present in this manuscript postulates a different mechanism 295 for the development of the immune response in the transplant recipient: immunological and 296 biophysical principles strongly suggest that alleles present in the donor genome, but not in the 297 recipient genome, will have the potential to produce epitopes that the recipient immune system 298 will recognize as non-self. This reasoning explains why the allogenomics score is not equivalent 299 to the genetic measures of allele sharing distance that have been used to perform genetic 300 clustering of individuals (14). 301 302 Results presented in this manuscript suggest that allogenomic mismatches in proteins expressed 303 at the surface of donor cells could explain why some recipients' immune systems mount an 304 attack against the donor organ, while other patients tolerate the transplant for many years, when 305 given similar immunosuppressive regimens. If the results of this study are confirmed in 306 additional independent transplant cohorts (renal transplants, solid or hematologic transplants), 307 they may prompt the design of prospective clinical trials to evaluate whether allocating organs to 308 recipients with a combination of low allogenomics mismatch scores and low HLA mismatch 309 scores improves long term graft outcome. A positive answer to this question could profoundly 310 impact the current clinical and regulatory framework for assigning organs to ESRD patients. 311 In this study, we introduced the allogenomics concept to quantitatively estimate the 313 histoincompatibility between an organ donor from an ABO compatible living donor and recipient 314 outside of the HLA locus. We tested the simplest model derived from this concept to calculate an 315 allogenomics mismatch score (AMS). We demonstrated that the AMS, which can be estimated 316 before transplantation, helps predict post-transplantation kidney graft function. Interestingly The selection criteria for cadaveric donors include consideration of HLA matching and of the age 331 of the donor. Compared to related donors we expect that the range of the AMS will be 332 comparable to that in the discovery cohort (composed in majority of unrelated donors), 333 Therefore, we expect that the allogenomics approach would also be predictive in cadaveric 334 deceased donors, given that the AMS score should be high in these pairs. However, since many factors can independently influence graft function after transplantation from a cadaveric donor 336 (e.g. cold ischemia time), potentially much larger cohorts may be required in such settings to 337 achieve sufficient power to adequately control for the covariates relevant to cadaveric donors and 338 to detect the allogenomics effect. On the other hand, it is also possible that most polymorphisms that contribute to the score have a 358 low frequency in the population (e.g., minor allele frequency less than 5%), which would make 359 the identification of common sites of mismatches unlikely. Contributions are observed for each polymorphic site p in a set P, where P is determined by the 482 genotyping assay and analysis methods, and can be further restricted (e.g., to polymorphisms 483 within genes that code for membrane proteins). Score mismatch contributions σ p (G rp ,G dp ) are 484 calculated using the recipient genotype G rp and the donor genotype G dp at the polymorphic site p. 485 Here, we consider that a genotype can be represented as a set of alleles that were called in a 486 given genome. For instance, if a subject has two alleles at one polymorphic site, and we denote 487  Fig. 1C). 494 A contribution of 1 is added to the score for each polymorphic site where the donor genome has 495 an allele (a dp ) that is not also present in the recipient genome. When both donor and recipient 496 genome are called at polymorphic site P, no contribution is added. For example, assuming a 497 genomic site where the donor genome has two alleles, i.e., G dp ={A,B}, and the recipient genome 498 is homozygote with G rp ={A}. In this case, (G rp ,G dp )=1. Fig. 1B presents additional examples of 499 donor and recipient genotypes and indicates the resulting score contribution (the subscript p is 500 omitted for conciseness). Score contributions are summed across all polymorphism sites in the 501 set P to yield the allogenomic mismatch score (see Fig. 1C Equation 1). 502 503

Selection of Informative Polymorphisms 504
The selection of the set of polymorphic sites P is important to the effectiveness of the approach. 505 In the current method, we select exonic polymorphic sites that are (1) predicted to create non-506 synonymous change in a protein sequence, (2) are located in a gene that codes for one or more 507 membrane proteins (defined as any protein with at least one predicted transmembrane segment, 508 information obtained from Biomart (21), Ensembl database 75). Additional filters can be applied 509 to restrict P, which may lead to improved prediction of transplant clinical endpoints. 510 Constructing additional filters will require the study of a larger training set of matched recipient and donor genotypes, which currently does not exist. It is possible that such study will indicate 512 that other criteria than (2) also lead to predictive scores. 513

Implementation: the Allogenomics Scoring Tool 514
We developed the allogenomics scoring tool to process genotypes in the VCF format and 515 produce allogenomics mismatch scores for specific pairs of genomes in the input file. The 516 allogenomics scoring tool was implemented in Java with the Goby framework and is designed to 517   where an allogenomics mismatch score contribution different from zero are counted. We filtered the 837 exome genomic sites to exclude sites not found on the Illumina 660W genotyping platform (used in 838 (13)). After filtering, the allogenomics score is estimated with 1,820 remaining genomic sites. B) 839 The minor allele frequency (MAF) of the alleles described at each set of genomic sites is shown as 840 a histogram (MAF is estimated from the EVP database, see Methods). Exome sequencing is an 841 assay that directly observes variations in an individual DNA sample. The MAF distributions 842 confirm that exome sequencing helps estimate contributions from many rare (MAF<5%) 843 polymorphisms, whereas the chip genotyping platform estimates the score based on contributions 844 from frequent alleles. C) The scatterplot of the relationship between 36-month eGFR and the score 845 estimated from the exome sites, or the subset of sites also measured by the GWAS platform. While 846 some trend is still visible with sites measured on the GWAS platform, more samples would be 847 needed to reach significance in the combined Discovery and Validation cohorts (n=34 pairs). Note 848 that the magnitude of the scores is smaller on the GWAS platform because fewer contributions are 849 summed. In contrast, the exome assays (Illumina TrueSeq for the Discovery cohort or Agilent 850 Haloplex for the Validation cohort) result in stronger and significant correlations in the same set of 851 samples. 852