The assessment of response to lithium maintenance treatment in bipolar disorder (BD) is complicated by variable length of treatment, unpredictable clinical course, and often inconsistent compliance. Prospective and retrospective methods of assessment of lithium response have been proposed in the literature. In this study we report the key phenotypic measures of the “Retrospective Criteria of Long-Term Treatment Response in Research Subjects with Bipolar Disorder” scale currently used in the Consortium on Lithium Genetics (ConLiGen) study.
Materials and Methods
Twenty-nine ConLiGen sites took part in a two-stage case-vignette rating procedure to examine inter-rater agreement [Kappa (κ)] and reliability [intra-class correlation coefficient (ICC)] of lithium response. Annotated first-round vignettes and rating guidelines were circulated to expert research clinicians for training purposes between the two stages. Further, we analyzed the distributional properties of the treatment response scores available for 1,308 patients using mixture modeling.
Results
Substantial and moderate agreement was shown across sites in the first and second sets of vignettes (κ = 0.66 and κ = 0.54, respectively), without significant improvement from training. However, definition of response using the A score as a quantitative trait and selecting cases with B criteria of 4 or less showed an improvement between the two stages (ICC1 = 0.71 and ICC2 = 0.75, respectively). Mixture modeling of score distribution indicated three subpopulations (full responders, partial responders, non responders).
Citation: Manchia M, Adli M, Akula N, Ardau R, Aubry J-M, Backlund L, et al. (2013) Assessment of Response to Lithium Maintenance Treatment in Bipolar Disorder: A Consortium on Lithium Genetics (ConLiGen) Report. PLoS ONE 8(6): e65636. https://doi.org/10.1371/journal.pone.0065636
Editor: Kazutaka Ikeda, Tokyo Metropolitan Institute of Medical Science, Japan
Received: January 24, 2013; Accepted: April 26, 2013; Published: June 19, 2013
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Funding: The work on assessment of lithium response has been supported by a grant from Canadian Institutes of Health Research #64410 to MA. MG-S was supported by Unitatea Executiva pentru Finantarea Invatamantului Superior, a Cercetarii, Dezvoltarii si Inovarii (UEFISCDI), Bucharest, Romania (grant no. 89/2012). JMA and AN were supported by a grant from the Swiss National Foundation (#32003B_125469/1 to JM Aubry). ConLiGen is in part supported by funds from the Intramural Research Program of the National Institute of Mental Health (NIMH) at the National Institutes of Health (NIH), Department of Health and Human Services, United States Government. It is further supported by a grant from the Deutsche Forschungsgemeinschaft to MR, MB, and TGS (RI 908/7-1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: Co-authors James B. Potash, Andreas Reif, and Bernard T. Baune are PLOS ONE Editorial members. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials.
Introduction
Bipolar disorder (BD) is a lifelong and severe psychiatric illness characterized by recurrent episodes of depression and hypomania/mania. Lithium is among the first-line maintenance treatments for BD, preventing relapses and recurrences of opposite polarity. In addition, lithium decreases the risk of suicidal behaviour and all-cause mortality in mood disorders.
Naturalistic analyses show that approximately one third of BD patients achieve complete remission on lithium. Lithium-responsive BD patients have distinct clinical features, such as episodicity of clinical course, absence of rapid cycling, and a family history of BD, corresponding to the BD “core phenotype”.
Despite a significant genetic component for lithium-responsive BD, pharmacogenetic studies have not produced replicated results. One possible explanation for the lack of conclusive pharmacogenetic findings is the varying definition of lithium response across studies. Indeed, the assessment of lithium maintenance treatment response, and consequently the definition of the phenotype under study, is complicated by factors inherent to the natural history of BD. The irregular clinical course of BD as well as variable treatment adherence are only a few of the factors that contribute to the complexity of assessing the response to lithium maintenance treatment.
To reduce the impact of the clinical heterogeneity of BD in pharmacogenetics (and possibly to define genetically more homogeneous subgroups of BD patients), researchers have proposed selecting prospectively followed patients on lithium monotherapy with unequivocal clinical response. However, this may not be practical if large patient samples are needed. In such cases, we need to rely on retrospective evaluation of treatment response. Several such methods have been described in the literature, including the Affective Morbidity Index (AMI) and the Illness Severity Index. The AMI takes into account the duration and the severity of an episode, the latter scored on a 4-point scale (0 = no conspicuous affective disturbance, 1 = mild depression or mania, 2 = moderate depression or mania, 3 = severe depression or mania). The area under the curve can be calculated from these two variables and compared between defined treatment periods. Similarly, the Illness Severity Index measures the efficacy of lithium treatment in controlling mood episodes. It is defined as the frequency of affective episodes prior to starting lithium adjusted for age at the time lithium was started. However, changes in affective morbidity might not be solely a result of the treatment, but could be due to other factors. In the Consortium on Lithium Genetics (ConLiGen, www.ConLiGen.org) study, we adopted the “Retrospective Criteria of Long-Term Treatment Response in Research Subjects with Bipolar Disorder” as the principal method of evaluation of the response to lithium. In addition to measuring the degree of clinical improvement, this scale weighs clinical factors considered relevant in determining whether the observed clinical change is in fact due to the lithium treatment.
Since ConLiGen is an international multi-centre collaboration, it has been crucial to assess the reliability of the key phenotypic measures of response to long-term lithium treatment across the participating research groups. Here we present: 1) the results of the reliability analysis of response to lithium treatment across the participating centres, and 2) the distributional properties of the scale scores. These two sets of findings have been instrumental in obtaining stringent phenotypic definitions of lithium response. These analyses are of particular importance in light of the genome-wide association study (GWAS) currently being undertaken by ConLiGen.
Materials and Methods
Assessment of Clinical Response to Lithium Treatment
The response to lithium treatment was measured using a previously published and validated rating scale: the “Retrospective Criteria of Long-Term Treatment Response in Research Subjects with Bipolar Disorder”. Briefly, this scale quantifies the degree of improvement in the course of treatment (A criterion or A score) expressed as a composite measure of change in frequency and severity of mood symptoms. The A score is weighed against 5 factors (B criteria) which allow one to determine if the observed improvement is a result of the treatment rather than a spontaneous improvement or an effect of additional medication. Specifically, the B criteria consider: the number of episodes before/off the treatment (B1), the frequency of episodes before/off the treatment (B2), the duration of the treatment (B3), the compliance during period(s) of stability (B4) and the use of additional medication during the period of stability (B5). The total score (TS) is obtained by subtracting the B score from the A score.
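The scoring rule described above can be sketched in a few lines. This is only an illustration: the function name `total_score` is hypothetical, and the item ranges (A from 0 to 10, each B item from 0 to 2) and the flooring of negative totals at 0 are assumptions not stated in the text.

```python
def total_score(a_score, b_items):
    """Return TS = A - sum(B1..B5).

    Assumptions (not stated in the text): A lies in 0..10, each B item
    is 0, 1 or 2, and a negative difference is reported as 0.
    """
    if not 0 <= a_score <= 10:
        raise ValueError("A score is assumed to lie between 0 and 10")
    if len(b_items) != 5 or any(b not in (0, 1, 2) for b in b_items):
        raise ValueError("expected five B items, each assumed to be 0, 1 or 2")
    return max(0, a_score - sum(b_items))


# Hypothetical patient: marked improvement (A = 9) with few confounders (B = 2).
print(total_score(9, [0, 1, 0, 0, 1]))
```

Under these assumptions, a patient with a high A score but many confounders ends up with a low TS, which is the intended weighting.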
Analysis of the Inter-rater Agreement and Reliability of the Assessment of Lithium Response
The agreement and reliability of the assessment of lithium response between raters of 29 ConLiGen participating centres was measured using a two-stage case-vignette rating procedure (Table 1). Specifically, the study protocol had three phases: 1) twelve standardized case vignettes prepared by investigators (M.A., J.G., C.S.) at Dalhousie University were circulated and rated by 70 investigators; 2) annotated first-round vignettes and rating guidelines were circulated for training purposes after the first stage; 3) sixteen additional more complex vignettes prepared by senior researchers at Dalhousie University, Johns Hopkins University School of Medicine, National Institute of Mental Health (NIMH) and Academia Sinica of Taiwan (M.A., J.G., J.P., T.G.S., F.M., A.C.) were circulated and rated by 48 investigators at the participating sites. The first set of vignettes was based exclusively on BD patients who had been prospectively followed in a specialty program and with detailed clinical information on the course of illness and treatment history. The second set of vignettes was heterogeneous and included patients treated in various settings, some with limited clinical details assessed cross-sectionally. Since raters had no prior knowledge of the rating scale, this design allowed us to estimate the impact of training on agreement and reliability of lithium response assessment. The rating procedure was performed from April 2009 to October 2012.
The degree of concordance of lithium response definition was assessed with Cohen’s kappa (κ) and the intra-class correlation coefficient (ICC). These analytical methods were applied to the dichotomous and continuous definition of lithium response, respectively. The κ statistics (multiple raters with two outcomes) were calculated with 95% confidence interval (CI) for each cut off point of the TS scale in the range from 3 (non response to lithium) to 8 (full response to lithium). Interpretation of the strength of agreement was made according to Landis and Koch: poor (κ <0.00), slight (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), almost perfect (0.81–1.00).
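The agreement analysis for several raters and a dichotomised outcome can be sketched as follows. This is illustrative only: it uses Fleiss' multi-rater generalisation of kappa, which is assumed here because the text does not specify the exact multi-rater estimator. The function names (`dichotomize`, `fleiss_kappa`, `landis_koch`) and the toy ratings are hypothetical, and confidence intervals are not computed.

```python
def dichotomize(ts_ratings, cutoff):
    """Map each total score to 1 (response) if TS >= cutoff, else 0."""
    return [[1 if ts >= cutoff else 0 for ts in row] for row in ts_ratings]


def fleiss_kappa(ratings):
    """ratings[i][j] = category given to vignette i by rater j.

    Every vignette must be rated by the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({c for row in ratings for c in row})
    # counts[i][c]: number of raters placing vignette i in category c
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Observed agreement, averaged over vignettes
    p_obs = sum(
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects
    # Chance agreement from the marginal category proportions
    p_cat = [
        sum(row[c] for row in counts) / (n_subjects * n_raters)
        for c in range(len(categories))
    ]
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_obs - p_exp) / (1 - p_exp)


def landis_koch(kappa):
    """Label kappa with the Landis and Koch bands quoted in the text."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy example: 4 vignettes rated by 3 raters (hypothetical total scores).
toy_ts = [[8, 9, 7], [2, 3, 1], [7, 6, 8], [5, 4, 7]]
kappa = fleiss_kappa(dichotomize(toy_ts, cutoff=7))
print(round(kappa, 3), landis_koch(kappa))
```

Repeating the call for each cut off from 3 to 8 gives one κ per threshold, which is the shape of the comparison described above.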
The quantitative scores of the treatment response scale were analyzed in the first (ICC1) and second (ICC2) stage of ratings. Specifically, we analyzed the TS (weighted clinical improvement), the A score (uncorrected clinical improvement), the B score (quantification of confounders), and the A score when B score ≤4. The latter measure allows the identification of “valid cases” through selection at the B criteria. Subjects with B score ≤4 are likely to have a clinical improvement causally related to lithium treatment. The ICC was estimated with the two-way random effects model, which assumes that a random sample of K investigators is selected from a larger population and that each investigator rates all N targets (i.e., case vignettes), and with the two-way mixed effects model, in which each target is rated by the same K investigators, who are the only raters of interest. For both models we calculated the single and average measure reliability.
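A minimal sketch of the four ICC variants named above, using the standard two-way ANOVA mean squares in the Shrout and Fleiss formulation. The function name `icc_two_way`, the dictionary keys, and the toy ratings matrix are hypothetical. The sketch assumes a complete targets-by-raters table with no missing cells, and it does not compute confidence intervals.

```python
def icc_two_way(ratings):
    """ratings[i][j] = score given to target (vignette) i by rater j."""
    n = len(ratings)       # number of targets
    k = len(ratings[0])    # number of raters
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)

    bms = ss_rows / (n - 1)                                      # between targets
    jms = ss_cols / (k - 1)                                      # between raters
    ems = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))   # residual

    return {
        # Two-way random effects: raters are a sample from a larger population
        "random_single": (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
        "random_average": (bms - ems) / (bms + (jms - ems) / n),
        # Two-way mixed effects: these raters are the only ones of interest
        "mixed_single": (bms - ems) / (bms + (k - 1) * ems),
        "mixed_average": (bms - ems) / bms,
    }


# Toy matrix: 6 targets rated by 4 raters (not ConLiGen data).
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
for name, value in icc_two_way(toy).items():
    print(name, round(value, 2))
```

The random-effects single measure penalises systematic differences between raters (the `jms` term), whereas the mixed-effects version does not, so the two can differ markedly when some raters score consistently higher than others.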
Analysis of the Distributional Properties of the Treatment Response Scale
For the analysis of the distributional properties, we accessed TS data of 1,308 BD patients from the NIMH centralized ConLiGen phenotypic dataset.
Mixture analysis: frequentist and Bayesian approach.
We used mixture analysis to test whether we could identify subgroups of patients according to the degree of response to lithium as expressed by TS. The choice of the mixture model that best fit the distribution of TS was made according to Akaike’s and Schwarz’s Bayesian information criteria (AIC and BIC, respectively). Lower values of these two criteria indicate the most parsimonious model that best fits the empirical distribution of the total score. The analysis was performed using the “NMixEM” function implemented in the MixAk package of R software (version 2.13.2).
To verify the findings from the frequentist mixture analysis, we performed a Bayesian mixture analysis employing a minimum message length (MML) approach. Specifically, we used the Snob software to test whether the distribution results from a union of a number of “classes”, where the distributions “within-classes” are homogeneous and have a simple form, but vary significantly “between-classes”. The best fitting model was indicated as the most parsimonious model (i.e., the one with the lowest cost expressed in nits, the unit conventionally used to express message length). The analysis was performed using a measurement error equal to 2.5, empirically estimated by plotting the distribution of TS.
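The model-selection step can be illustrated with a small, self-contained EM routine. This is not the “NMixEM” implementation. It is a simplified sketch that fits normal mixtures with one shared variance, an assumption suggested by the identical component standard deviations reported in the Results. It runs on simulated data rather than ConLiGen scores, and the names `fit_mixture` and `normal_pdf` are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_mixture(x, k, n_iter=100):
    """EM for a k-component 1-D normal mixture with one shared variance."""
    n = len(x)
    xs = sorted(x)
    mu = [xs[int((j + 0.5) * n / k)] for j in range(k)]   # start at spaced quantiles
    w = [1.0 / k] * k
    centre = sum(x) / n
    var = sum((v - centre) ** 2 for v in x) / n / k ** 2  # heuristic starting variance
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point
        resp = []
        for v in x:
            dens = [w[j] * normal_pdf(v, mu[j], var) for j in range(k)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and the shared variance
        nk = [max(sum(r[j] for r in resp), 1e-12) for j in range(k)]
        w = [c / n for c in nk]
        mu = [sum(r[j] * v for r, v in zip(resp, x)) / nk[j] for j in range(k)]
        var = sum(r[j] * (v - mu[j]) ** 2
                  for r, v in zip(resp, x) for j in range(k)) / n
    loglik = sum(
        math.log(max(sum(w[j] * normal_pdf(v, mu[j], var) for j in range(k)), 1e-300))
        for v in x
    )
    n_params = 2 * k  # k means + (k - 1) free weights + 1 shared variance
    return {
        "weights": w, "means": mu, "sd": math.sqrt(var),
        "aic": 2 * n_params - 2 * loglik,
        "bic": n_params * math.log(n) - 2 * loglik,
    }


# Simulated scores from three well-separated groups (illustrative only).
random.seed(1)
data = [random.gauss(m, 1.0) for m in (1.0, 5.0, 9.0) for _ in range(200)]
for k in (1, 2, 3, 4):
    fit = fit_mixture(data, k)
    print(k, round(fit["aic"], 1), round(fit["bic"], 1))
```

The model with the lowest AIC and BIC is retained. With a shared variance a k-component model has 2k free parameters, which is why adding a component must improve the log-likelihood appreciably before either criterion prefers it.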
Cut off point calculation.
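Cut off points were derived using the theoretical TS function and calculating each data point’s probability of belonging to each class. Specifically, once the mixture model parameters were estimated, we calculated the posterior probability of any data point x belonging to the i-th class as

$$P(i \mid x) = \frac{\omega_i \, \mathcal{N}(x; \mu_i, \sigma_i)}{\sum_{j} \omega_j \, \mathcal{N}(x; \mu_j, \sigma_j)}, \qquad \mathcal{N}(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where ω is the weight, μ is the mean, and σ is the standard deviation of the corresponding component.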
The resulting probabilities were then compared in order to establish which class the data point belonged to.
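The posterior-probability rule described above can be sketched as follows. The function names `posteriors` and `cutoffs`, the grid step, and the 0 to 10 scan range are hypothetical choices. The component parameters are the frequentist estimates reported in the Results section, used here purely as an illustration.

```python
import math


def posteriors(x, weights, means, sds):
    """Posterior probability of x belonging to each normal component."""
    dens = [
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sds)
    ]
    total = sum(dens)
    return [d / total for d in dens]


def cutoffs(weights, means, sds, lo=0.0, hi=10.0, step=0.001):
    """Scan a grid of scores and report where the most probable class changes."""
    cuts = []
    previous = None
    for i in range(int(round((hi - lo) / step)) + 1):
        x = lo + i * step
        p = posteriors(x, weights, means, sds)
        current = p.index(max(p))
        if previous is not None and current != previous:
            cuts.append(round(x, 3))
        previous = current
    return cuts


# Non responders, partial responders, full responders (frequentist model).
weights = [0.37, 0.30, 0.33]
means = [0.76, 4.6, 8.3]
sds = [1.15, 1.15, 1.15]

print(cutoffs(weights, means, sds))
print([round(p, 3) for p in posteriors(7.0, weights, means, sds)])
```

With these inputs the class boundaries fall close to 2.8 and 6.4, broadly in line with the frequentist cut off points reported in the Results once rounded to the scale's resolution.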
Results
Inter-rater Agreement and Reliability of the Assessment of Lithium Response
Raters agreed to a substantial/moderate (first stage of case-vignettes ratings) and moderate/fair (second stage of case-vignettes ratings) degree in assessing lithium response as a dichotomous variable (response/non response) (Table 2). We did not detect an effect of training as shown by the lack of improvement in κ. Specifically, in the first stage of ratings, the κ score showed a substantial level of agreement when we considered the TS cut off for response to lithium at 6 (κ = 0.65, 95% CI = 0.36–0.85) and at 8 (κ = 0.61, 95% CI = 0.33–0.83). The highest κ value was for the TS cut off point of 7 (κ = 0.66, 95% CI = 0.38–0.86). The second stage of ratings had overall lower κ values than the first indicating a moderate level of agreement in the assessment of lithium response (TS = 6: κ = 0.51, 95% CI = 0.29–0.73; TS = 7: κ = 0.54, 95% CI = 0.31–0.76; TS = 8: κ = 0.54, 95% CI = 0.28–0.76). Again, the highest κ value was found for the TS cut off point of 7. Details can be found in Table 2.
We then analyzed the inter-rater reliability for the continuous definition of lithium response. We found that ICC values (two-way random and mixed effects models, single measure) were higher in the first stage of ratings for TS (ICC1 = 0.74 versus ICC2 = 0.55), for A score (ICC1 = 0.66 versus ICC2 = 0.52) and for total B score (ICC1 = 0.59 versus ICC2 = 0.34). However, the training improved the inter-rater reliability of the A score when B score was ≤4 (ICC1 = 0.71 versus ICC2 = 0.75). These results are outlined in Table 2.
Analysis of the Distributional Properties of the Treatment Response Scale
Distribution of the TS and joint distribution with score A.
Figure 1 illustrates the distribution of TS and A score in 1,308 BD patients characterized for lithium response. Two hundred eighty-three patients (21.6%) had TS equal to 0 and 104 patients (8%) had A score equal to 0. In the whole sample the mean (± standard deviation) A score was 6.1±3.1 and the mean TS was 4.4±3.1. The joint distribution of TS and A scores is represented in Figure 2. It illustrates the presence of two frequency peaks at the extreme ends of the scale, namely at 0 and in the area between an A score of 9 and a TS of 8–10. A third peak is present at the intersection of an A score of 6 and a TS of 4.
Figure 1. Histogram plot of the scale scores in 1,308 bipolar disorder patients characterized for response to lithium maintenance treatment.
Mixture analysis: frequentist and Bayesian approach.
The frequentist mixture analysis on TS showed a best-fitting theoretical model of three normal components (AIC = 6467.69, BIC = 6498.75) (Figure 3A). A model with four components did not improve the fit (AIC = 6471.68, BIC = 6513.09). The mean TS was 0.76±1.15 for the non responder component, 4.6±1.15 for the partial responder component and 8.3±1.15 for the full responder component, accounting for 37%, 30%, and 33% of the population, respectively.
Figure 3. Frequentist (A) and Bayesian minimum message length (B) mixture modeling identify three subpopulations of non responders (grey), partial responders (red), and full responders (blue) in total scores of 1,308 bipolar disorder patients characterized for response to lithium maintenance treatment.
The MML mixture analysis identified the most parsimonious model of three normal components [mean, (SD), (proportion of population): 0.5, (1.00), (32%); 4.5, (1.7), (38%); 8.4, (1.2), (30%)], representing the non responder, the partial responder and the full responder groups of patients. The model is displayed in Figure 3B.
Cut off point calculation.
The functions of TS identified with the two different mixture analysis approaches (frequentist and Bayesian) were used to derive the probability of belonging and to calculate the cut off point between the components. The frequentist mixture model suggested two cut off points at TS = 3 and TS = 6.4. Considering the Bayesian MML theoretical function, we obtained two cut off points at 2 and 7. These results confirmed that TS ≥7 is the most appropriate cut off for the definition of full response to lithium prophylaxis as suggested in previous studies.
Discussion
The purpose of this study was to assess the key phenotypic measures of response to lithium treatment in the large international collaborative Consortium on Lithium Genetics. To this end, two main analyses have been carried out: the inter-rater agreement and reliability of lithium response definition across the ConLiGen participating sites, and the analysis of the distributional properties of the lithium treatment response scale. We found that two definitions of lithium response, one dichotomous and the other continuous, had moderate to substantial inter-rater agreement and reliability. Specifically, the two-stage case vignettes inter-rater reliability analysis pointed to the measure of clinical improvement under lithium treatment expressed by the A score and with selection of “valid cases” through a total B score ≤4. This phenotypic definition of lithium response had a substantial inter-rater reliability in the first stage of ratings (ICC1 = 0.71) with further improvement in the second stage (ICC2 = 0.75).
Regarding the dichotomous definition of lithium response, a scale TS ≥7 was identified as the best cut off as shown by inter-rater agreement κ scores in the first (κ = 0.66) and second (κ = 0.54) stages of case vignette ratings. The analysis of the distributional properties of the treatment response scale further supported this dichotomous definition. In addition, this same measure of lithium response has been previously proposed in several clinical and genetic papers.
Some methodological considerations need to be made. For the analysis of the distributional properties, we applied mixture modeling, a method that has been extensively used in psychiatry for the identification of patient subgroups, reducing phenotypic heterogeneity and ultimately helping genetic research. It should be noted that this method is exploratory and it does not identify the factors determining the differences between the identified subgroups. A validation of the model can be obtained by comparison of the characteristics of each subgroup. In the ConLiGen study, we plan to use the clinical correlates of lithium response as external validators of the phenotypic measure suggested by the mixture modeling. Such analysis will test and compare the direction and magnitude of the association of a number of clinical variables with lithium response in its dichotomous and continuous definition.
Notably, the analysis of inter-rater reliability and agreement has involved investigators belonging to different research groups with different clinical backgrounds and training. Nevertheless, the use of standardized case vignettes and the training procedures have produced moderate to substantial agreement in the assessment of lithium response. These findings are of importance, given the evidence that even in the context of inpatient unit settings the inter-rater agreement can be unsatisfactory.
We performed a two-stage case-vignettes procedure aimed at testing the effect of training on the assessment of lithium response. Contrary to our expectations, we only detected improvement in the inter-rater reliability of lithium response expressed by the A score and with selection of “valid cases” through a total B score ≤4, but not in that expressed by TS or A score. Arguably, the second set of vignettes described more complicated clinical cases with comorbidities, lack of compliance and multiple treatments, all factors that could have influenced the scoring of the B criteria. Indeed, the ICC for the total B score decreased noticeably in the second stage of ratings, implying an increased variability in rating that impacted the discrimination among cases. This explanation is corroborated by the finding of the higher ICC2 of A score with total B score ≤4. By applying this cut-off we decreased the assessment variability, ultimately increasing the discrimination among cases.
Further, these findings confirm that patients with short duration of lithium treatment, poor compliance, and concomitant medications are unlikely to be assessed reliably. This argues against the inclusion of such complex, non-standard cases in pharmacogenomic studies of lithium response. Finally, the higher inter-rater agreement and reliability found in the first set of vignettes suggests that the assessment of lithium response is reliable if sufficient clinical details are available. On the other hand, if the information is limited, additional rater training will be of little help.
In conclusion, our findings support the use of two definitions of lithium response for the pharmacogenomic GWAS currently being performed by ConLiGen. Accurate phenotypic definitions of treatment response are crucial in pharmacogenomic studies. Heterogeneity in the phenotype definition of treatment response can be a problem, especially in the context of psychiatric disorders. In the absence of other reliable clinical measures of response to lithium, this study has suggested two plausible phenotypic definitions that await application and validation in other samples.
Conceived and designed the experiments: M. Alda TGS FJM MB MR. Performed the experiments: M. Manchia M. Alda RA JMA LB CEMB BTB FB SB CBP EB CVC ATAC CC S. Clarke PMC CD MDZ JRD BE PF LF MAF JF SG JG FSG PG OG R. Hashimoto JH R. Hoban SJ JPK LK TK JRK SKS SK PHK IK GL CL ML SGL CALJ M. Maj AM LM TM PBM FM PM AN MN TN CO UO NO RHP AP JBP DRE AR ER SR JKR MS PRS OKS BS FS MGS GS LRS CS JWS AS TS PS SKT AT AW DZ MB MR TGS M. Adli. Analyzed the data: M. Manchia M. Alda. Contributed reagents/materials/analysis tools: NA JMB S. Cichon SDD UH LH GAR GT JS NRW PPZ. Wrote the paper: M. Manchia M. Alda.
- 1. Goodwin FK, Jamison KR (2007) Manic-depressive illness. New York: Oxford University Press.
- 2. Yatham LN, Kennedy SH, O’Donovan C, Parikh S, MacQueen G, et al. (2005) Canadian Network for Mood and Anxiety Treatments (CANMAT) guidelines for the management of patients with bipolar disorder: consensus and controversies. Bipolar Disord 7 Suppl 3: 5–69.
- 3. Fountoulakis KN, Kasper S, Andreassen O, Blier P, Okasha A, et al. (2012) Efficacy of pharmacotherapy in bipolar disorder: a report by the WPA section on pharmacopsychiatry. Eur Arch Psychiatry Clin Neurosci 262 Suppl 1: 1–48.
- 4. Muller-Oerlinghausen B, Ahrens B, Grof E, Grof P, Lenz G, et al. (1992) The effect of long-term lithium treatment on the mortality of patients with manic-depressive and schizoaffective illness. Acta Psychiatr Scand 86: 218–222.
- 5. Cipriani A, Pretty H, Hawton K, Geddes JR (2005) Lithium in the prevention of suicidal behavior and all-cause mortality in patients with mood disorders: a systematic review of randomized trials. Am J Psychiatry 162: 1805–1819.
- 6. Muller-Oerlinghausen B, Felber W, Berghofer A, Lauterbach E, Ahrens B (2005) The impact of lithium long-term medication on suicidal behavior and mortality of bipolar patients. Arch Suicide Res 9: 307–319.
- 7. Prien RF, Caffey EM, Klett CJ (1974) Factors associated with treatment success in lithium carbonate prophylaxis: report of the Veterans Administration and National Institute of Mental Health Collaborative Study Group. Arch Gen Psychiatry 31: 189–192.
- 8. Solomon DA, Keitner GI, Miller IW, Shea MT, Keller MB (1995) Course of illness and maintenance treatments for patients with bipolar disorder. J Clin Psychiatry 56: 5–13.
- 9. Maj M, Pirozzi R, Magliano L, Bartoli L (1998) Long-term outcome of lithium prophylaxis in bipolar disorder: a 5-year prospective study of 402 patients at a lithium clinic. Am J Psychiatry 155: 30–35.
- 10. Baldessarini RJ, Tondo L (2000) Does lithium treatment still work? Evidence of stable responses over three decades. Arch Gen Psychiatry 57: 187–190.
- 11. Rybakowski JK, Chlopocka-Wozniak M, Suwalska A (2001) The prophylactic effect of long-term lithium administration in bipolar patients entering treatment in the 1970s and 1980s. Bipolar Disord 3: 63–67.
- 12. Grof P, Duffy A, Cavazzoni P, Grof E, Garnham J, et al. (2002) Is response to prophylactic lithium a familial trait? J Clin Psychiatry 63: 942–947.
- 13. Garnham J, Munro A, Slaney C, Macdougall M, Passmore M, et al. (2007) Prophylactic treatment response in bipolar disorder: results of a naturalistic observation study. J Affect Disord 104: 185–190.
- 14. Chillotti C, Deiana V, Manchia M, Lampus SF, Ardau R, et al. (2009) [Evaluation of lithium treatment response in Sardinian bipolar patients]. Riv Psichiatr 44: 28–35.
- 15. Grof P, Alda M, Grof E, Fox D, Cameron P (1993) The challenge of predicting response to stabilising lithium treatment. The importance of patient selection. Br J Psychiatry Suppl 16–19.
- 16. Kleindienst N, Engel R, Greil W (2005) Which clinical factors predict response to prophylactic lithium? A systematic review for bipolar disorders. Bipolar Disord 7: 404–417.
- 17. Alda M (2001) Genetic factors and treatment of mood disorders. Bipolar Disord 3: 318–324.
- 18. Alda M, Grof P, Rouleau GA, Turecki G, Young LT (2005) Investigating responders to lithium prophylaxis as a strategy for mapping susceptibility genes for bipolar disorder. Prog Neuropsychopharmacol Biol Psychiatry 29: 1038–1045.
- 19. Duffy A, Alda M, Milin R, Grof P (2007) A consecutive series of treated affected offspring of parents with bipolar disorder: is response associated with the clinical profile? Can J Psychiatry 52: 369–376.
- 20. Cruceanu C, Alda M, Turecki G (2009) Lithium: a key to the genetics of bipolar disorder. Genome Med 1: 79.
- 21. McCarthy MJ, Leckband SG, Kelsoe JR (2010) Pharmacogenetics of lithium response in bipolar disorder. Pharmacogenomics 11: 1439–1465.
- 22. Angst J, Sellaro R (2000) Historical perspectives and natural history of bipolar disorder. Biol Psychiatry 48: 445–457.
- 23. Sachs GS, Rush AJ (2003) Response, remission, and recovery in bipolar disorders: what are the realistic treatment goals? J Clin Psychiatry 64 Suppl 6: 18–22.
- 24. Turecki G, Grof P, Grof E, D’Souza V, Lebuis L, et al. (2001) Mapping susceptibility genes for bipolar disorder: a pharmacogenetic approach based on excellent response to lithium. Mol Psychiatry 6: 570–578.
- 25. Lopez de Lara C, Jaitovich-Groisman I, Cruceanu C, Mamdani F, Lebel V, et al. (2010) Implication of synapse-related genes in bipolar disorder by linkage and gene expression analyses. Int J Neuropsychopharmacol 13: 1397–1410.
- 26. Coppen A, Peet M, Bailey J, Noguera R, Burns BH, et al. (1973) Double-blind and open prospective studies on lithium prophylaxis in affective disorders. Psychiatr Neurol Neurochir 76: 501–510.
- 27. Maj M, Del Vecchio M, Starace F, Pirozzi R, Kemali D (1984) Prediction of affective psychoses response to lithium prophylaxis: the role of socio-demographic, clinical, psychological and biological variables. Acta Psychiatr Scand 69: 37–44.
- 28. Schulze TG, Alda M, Adli M, Akula N, Ardau R, et al. (2010) The International Consortium on Lithium Genetics (ConLiGen): an initiative by the NIMH and IGSLI to study the genetic basis of response to lithium treatment. Neuropsychobiology 62: 72–78.
- 29. Cohen J (1960) A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement 20: 37–46.
- 30. Shrout PE, Fleiss JL (1979) Intraclass correlations: uses in assessing rater reliability. Psychol Bull 86: 420–428.
- 31. Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33: 159–174.
- 32. Komarek A (2011) Mixture of methods including mixtures. CRAN.
- 33. Wallace CS, Dowe DL (2000) MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing 10: 73–83.
- 34. Wallace CS, Boulton DM (1968) An Information Measure for Classification. Computer Journal 11: 185–194.
- 35. Squassina A, Manchia M, Congiu D, Severino G, Chillotti C, et al. (2009) The diacylglycerol kinase eta gene and bipolar disorder: a replication study in a Sardinian sample. Mol Psychiatry 14: 350–351.
- 36. Squassina A, Manchia M, Borg J, Congiu D, Costa M, et al. (2011) Evidence for association of an ACCN1 gene variant with response to lithium treatment in Sardinian patients with bipolar disorder. Pharmacogenomics 12: 1559–1569.
- 37. Bellivier F, Golmard JL, Rietschel M, Schulze TG, Malafosse A, et al. (2003) Age at onset in bipolar I affective disorder: further evidence for three subgroups. Am J Psychiatry 160: 999–1001.
- 38. Etain B, Mathieu F, Rietschel M, Maier W, Albus M, et al. (2006) Genome-wide scan for genes involved in bipolar affective disorder in 70 European families ascertained through a bipolar type I early-onset proband: supportive evidence for linkage at 3p14. Mol Psychiatry 11: 685–694.
- 39. Hamshere ML, Gordon-Smith K, Forty L, Jones L, Caesar S, et al. (2009) Age-at-onset in bipolar-I disorder: mixture analysis of 1369 cases identifies three distinct clinical sub-groups. J Affect Disord 116: 23–29.
- 40. Lubke G, Neale M (2008) Distinguishing between latent classes and continuous factors with categorical outcomes: Class invariance of parameters of factor mixture models. Multivariate Behav Res 43: 592–620.
- 41. Cheniaux E, Landeira-Fernandez J, Versiani M (2009) The diagnoses of schizophrenia, schizoaffective disorder, bipolar disorder and unipolar depression: interrater reliability and congruence between DSM-IV and ICD-10. Psychopathology 42: 293–298.
- 42. Burdock EI, Fleiss JL, Hardesty AS (1963) A new view of inter-observer agreement. Personnel Psychology 16: 373–384.
- 43. Daly AK (2010) Genome-wide association studies in pharmacogenomics. Nat Rev Genet 11: 241–246.
- 44. Motsinger-Reif AA, Jorgenson E, Relling MV, Kroetz DL, Weinshilboum R, et al. (2010) Genome-wide association studies in pharmacogenomics: successes and lessons. Pharmacogenet Genomics.