Validity of the French version of the Autonomy Preference Index and its adaptation for patients with advanced cancer

Background
While patient-centered care is recommended as a key dimension of quality improvement, patients with serious illness may have differing expectations regarding information and participation in medical decision-making. In oncology, anticipating disease worsening remains difficult, especially when patients' preferences regarding prognostic information are unclear. Valid tools to explore patients' preferences could help target end-of-life discussions, which have been shown to decrease the aggressiveness of end-of-life care. Our aim was to establish the validity and reliability of the French version of the Autonomy Preference Index (API) among patients with incurable cancer and in a primary care setting. Three supplementary items were specifically developed to evaluate preparedness to anticipate disease deterioration among patients with incurable cancer.
Methods
The psychometric properties of the API translated into French were assessed among patients consecutively recruited from January to March 2017 in the waiting rooms of 19 general practitioners (N = 391) and in an oncology clinic (N = 187) in Paris. Relationships between the newly developed items and the API subscale scores were studied.
Results
A confirmatory model with three correlated factors (two factors related to decision-making and one factor related to information-seeking preferences) showed an acceptable fit on the whole sample, and no measurement invariance issue was found across settings, age, sex and educational level. Internal consistency and test-retest reliability were acceptable for the information-seeking and decision-making subscales. One of the newly developed items, on patients' ability to anticipate a decision on the use of artificial respiration if a sudden deterioration of their illness occurred, was not related to the API subscale scores.
Conclusion
The French version of the API was found valid and reliable for use in general practice and oncology settings. The additional items on patient preparedness to anticipate disease deterioration can be of interest to ensure that patient values guide all end-of-life clinical decisions.


Introduction
Shared decision-making is a process in which a choice is jointly made by a provider and a patient or a proxy decision-maker [1]. Taking its roots in the "patient-centered care" movement in healthcare, this process was pointed to, at the turn of the millennium, as a key aim to ensure that 21st-century Health Care Systems "cross the Quality chasm" [2,3]. Consideration of patient preferences as to their level of involvement in the decision-making process has now become an ethical imperative, and has been integrated into healthcare programs and legal texts in many countries [3,4].
While patient participation in decision-making processes is essential in all medical contexts, it is particularly complex in situations of incurable illness. Many informed decisions need to be made, for example, treatment limitation or cessation, Do-Not-Resuscitate orders, place of care, etc. [5]. Information-sharing between the patient and the physician is recognized as one of the main characteristics in the definition of shared decision-making in healthcare [6]. Information on disease evolution and prognosis is a prerequisite for patients to assess the risk-benefit ratios of their therapeutic options [7].
In oncology, numerous studies have shown that patients do not receive exhaustive information on their situation, possibly because delivering and receiving this information is difficult for both parties [8][9][10]. Physicians may worry about increasing patient anxiety, as it has been shown that patients have variable expectations towards prognostic medical information [11]. This may be the reason why anticipation of disease deterioration remains difficult for both patient and oncologist, although end-of-life discussions were suggested several years ago as a way to reduce the aggressiveness or invasiveness of end-of-life care by facilitating shared decisions and the traceability of do-not-resuscitate orders [12].
In this context, physicians need to adapt their communication according to patients' expectations regarding information and their desire to be involved in decisions, and also according to their preparedness to anticipate disease deterioration [13]. For that matter, the need for an assessment of these patients' preferences has been highlighted in the literature [4,13,14]. To our knowledge, three measurement tools aiming to assess both information preferences and the desire to participate in decision-making have already been used among patients with incurable or terminal cancer: 1) visual analog scales initially developed for patients in emergency wards [15,16], 2) the Krantz Health Opinion Survey, a self-administered 16-item questionnaire, initially developed for students, and concerning medical care in general with a focus on self-medication [17,18], and 3) the Autonomy Preference Index (API), a self-administered 23-item questionnaire, initially developed for patients in primary care settings [19].
The API has several advantages over the two other measurement tools for use among patients with incurable cancer. First, unlike the Krantz Health Opinion Survey, it does not focus on self-medication. Second, its psychometric properties have already been studied in English and German in various populations (primary care settings, patients with asthma, mental illness, chronic pain) [20][21][22][23][24][25]. Third, its original structure allows for adaptation to the context, as has already been done for psychiatric patients, for example [21,24,25]. Among the 23 items of its original version, 8 items assess information-seeking preferences (IS score) and the remaining 15 items assess preferences for participation in decision-making (DM), including 6 general items used to compute the DM score and 9 items related to three clinical vignettes representing different levels of severity: upper respiratory tract illness (URI score) represents a mild condition; hypertension (HBP score) a moderately severe condition; and myocardial infarction (MI score) a severe, life-threatening condition. In some previous studies, only the 14 items related to the IS and DM scores were used [23,25,26], while in others, vignettes were adapted to the context [20,21,24]. In our study, we therefore aimed to validate the API in a population of patients with incurable cancer, and to develop an additional vignette with supplementary items, specifically for these patients, to evaluate their preparedness to anticipate disease deterioration, as this was not addressed in the original API.
The working objectives of this study were thus to translate the API and to evaluate its psychometric properties (reliability and construct validity) in a population of primary care patients, as for the original version, and in a population of patients with incurable cancer. We also assessed measurement invariance, which is an essential property for a questionnaire, as for any measurement tool, to guarantee accurate group comparisons. According to Mokkink et al. and Millsap, "a measuring device should function in the same way across varied conditions, so long as those varied conditions are irrelevant to the attribute being measured" [27,28]. We studied measurement invariance across age, sex and educational level, as usually performed and recommended; across the French and English languages, to ensure the comparability of the scores from both language versions; and across both settings (primary care patients and patients with incurable cancer), to check the similarity of the API factor structure in these settings [28][29][30]. The Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) guidelines were followed to report the results [31].

The Autonomy Preference Index and the supplementary items for preparedness to anticipate disease deterioration
A 5-point Likert scale is used to answer the 23 items of the API (a score of 5 indicating the strongest preference) (S1 Questionnaire, S1 Table). The computation of the five scores from the API was explained in the original publication as follows: the IS and DM scores are computed as the sum of the 8 and 6 answers respectively, linearly adjusted to range from 0 to 100 (strongest desire possible). The URI, HBP and MI scores are each computed from the sum of the answers to the three items of the corresponding vignette, linearly adjusted to range from 0 to 10 (strongest desire possible) [19].
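The linear adjustment described above can be sketched as follows. This is a minimal illustration only, assuming that each answer is coded from 1 to 5 and that reverse-worded items have already been recoded; the helper name `rescale_sum` and the example answers are hypothetical and do not come from the original publication.

```python
def rescale_sum(answers, upper):
    """Linearly rescale the sum of 1-5 Likert answers to the range 0..upper.

    Assumption: every answer is an integer from 1 to 5 and any
    reverse-worded item has already been recoded (hypothetical helper).
    """
    if not answers:
        raise ValueError("at least one answer is required")
    if any(a < 1 or a > 5 for a in answers):
        raise ValueError("answers must lie between 1 and 5")
    n = len(answers)
    lowest, highest = 1 * n, 5 * n  # smallest and largest possible sums
    return (sum(answers) - lowest) / (highest - lowest) * upper


# Made-up answers: an 8-item IS score on 0-100 and a 3-item vignette score on 0-10.
is_score = rescale_sum([5, 4, 5, 3, 4, 5, 5, 4], upper=100)
uri_score = rescale_sum([2, 3, 1], upper=10)
```

With this mapping, a respondent choosing the lowest category on every item scores 0 and one choosing the highest category on every item scores the maximum of the scale.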
The additional clinical vignette was developed to address the preparedness of patients with advanced cancer to anticipate disease deterioration. This vignette concerns a chronic, terminal respiratory illness requiring oxygen therapy that can potentially evolve towards a sudden deterioration, requiring artificial respiration (S2 Questionnaire, Table 1). This situation was chosen to minimize the chances that a patient with advanced cancer would identify with it. The three items (answers on a 5-point Likert scale) concerned the desire to participate in the advance decision on whether to use artificial respiration, the preference regarding the anticipation of this decision, and the ability to decide on this point at a time when the situation has not yet arisen.

Translation process
Following the steps described in the current recommendations on the cross-cultural adaptation of questionnaires [32,33], four French experts from various disciplines (palliative care, general medicine, public health, epidemiology, biostatistics, psychometrics) with good English language proficiency and two English-French bilinguals independently translated the English version of the API into French [34]. A consensus meeting was then held to reach a consensual French version of the questionnaire, on the basis of the six independent translations. The author of the first version of the API, J. Ende, was contacted to ask for permission, but he was not available to participate in the translation process. No back-translation was performed as it is not required in this context [35]. Individual semi-structured cognitive debriefing sessions (acceptability, comprehensibility and consistent interpretation across participants) were organized with 13 subjects (7 with incurable cancer and 6 without any declared illness; 8 males; 4 under 30 years, 7 aged 30 to 70 years and 2 over 70 years) who tested this version (completion time: 3 to 10 min). Minor form changes were made on some items following the content analysis of these debriefing meetings, yielding the final French version of the API (S1 Questionnaire).
Table 1. Frequencies (%) of the answers to the items of the additional clinical vignette "preparedness to anticipate disease worsening" in the ONCO group.
Additional clinical vignette: "Suppose you are suffering from a chronic, terminal respiratory disease. At home, you need oxygen therapy all the time and your movements are limited. You know that in case of sudden deterioration (for example because of a lung infection), you may have to be put on artificial respiration (a tube connected to a machine that breathes for you, while you are asleep and unconscious), without you being able to give your opinion. Regarding the decision to use this artificial respiration:"

Study samples
Two samples of subjects were consecutively recruited from January 2017 to March 2017: 1) in the waiting rooms of 19 general practitioners involved in the general practice network of Paris-Sud University (France) and selected to ensure representation of the various social backgrounds in the Paris area (GP sample), 2) in the oncology outpatient clinic of Cochin Hospital in Paris (ONCO sample). Cochin Hospital is a tertiary care hospital treating around 4500 new cancer patients each year, with an oncology ward and three other medical specialty wards (gastroenterology, pneumology, dermatology) that have an oncologic activity of care and use the oncology outpatient clinic for ambulatory anticancer treatment and follow-up. Explanations on the study were provided to all consecutive French-speaking patients aged 18 years or older, without cognitive or psychopathologic disorders, by an independent researcher who was unknown to the patients in the two settings. Patients were included in the study if they agreed to participate and, for patients recruited in the oncology clinic, if their Eastern Cooperative Oncology Group (ECOG) performance status was 2 or below and if they had incurable cancer. There was no incentive to participate in this study. This study was approved by the ethics committee "Comité de Protection des Personnes Sud-Est VI" (n˚ID-RCB: 2016-A01960-51) and patients provided written informed consent to participate. Measurement invariance across the French and English language versions was studied for the IS and DM items using data from the only study known to us in which the API was used among patients with incurable cancer, which involved 120 patients in Australia [26].

Data collection
Using a self-administered questionnaire, the patients provided socio-demographic information including sex, age, educational level, profession and whether they were living with a partner or were single. In the ONCO sample, information on their cancer history and treatment was collected from medical files, while in the GP group, their perceived health status was collected using the following question: "Would you say that overall, your health is: excellent / very good / good / medium / poor?". The patients completed the French version of the API (and the additional vignette in the ONCO sample) and answered two questions on their global judgment concerning their information preferences (on a 4-point Likert scale) and their desire to participate in decisions (on a 5-point Likert scale) (Table 2). In the ONCO sample, patients were asked if they would agree to complete the API again at the time of their next scheduled visit (every 15-21 days).

Statistical analyses
Categorical data were summarized as frequencies (%) and quantitative data as means ± standard deviation or medians (first quartile-third quartile) as appropriate. For each item, we looked for ceiling and floor effects, defined a priori as more than 95% of respondents choosing the highest or the lowest category respectively.
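The floor and ceiling check can be illustrated with a short sketch. It assumes answers coded from 1 to 5, with missing answers represented as `None` and excluded from the denominator; the function name and the treatment of missing answers are illustrative assumptions, not details reported by the authors.

```python
from collections import Counter


def floor_ceiling_flags(answers, lowest=1, highest=5, threshold=0.95):
    """Return (floor_effect, ceiling_effect) for a single item.

    An effect is flagged when more than `threshold` of the valid
    answers fall in the lowest (floor) or highest (ceiling) category.
    Missing answers (None) are ignored; this is an assumption.
    """
    valid = [a for a in answers if a is not None]
    if not valid:
        raise ValueError("no valid answers for this item")
    counts = Counter(valid)
    floor_share = counts[lowest] / len(valid)
    ceiling_share = counts[highest] / len(valid)
    return floor_share > threshold, ceiling_share > threshold


# Made-up item where 97 of 100 respondents chose the highest category.
flags = floor_ceiling_flags([5] * 97 + [4] * 3)
```

Applied item by item, such a check returns two booleans per item, so an instrument with no floor or ceiling effect yields `(False, False)` throughout.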
Psychometric properties of the API.
The structural validity was studied using confirmatory factor analysis (CFA) with a robust estimator for categorical data, the weighted least squares mean and variance adjusted (WLSMV) estimator [36]. Two models were fitted, as both had previously been reported in the literature to have an acceptable fit: a three-factor model (8 IS items, 6 DM items, 9 clinical vignette items [24]), and a two-factor model (8 IS items, 15 DM and [...] 1, acceptable fit otherwise) and models were compared using a nested model test [37]. Measurement invariance was tested successively across groups defined by the inclusion setting (GP or ONCO sample), age (categorized according to quartiles), sex, educational level and language version. A multigroup CFA and the classic three-step sequence were used to investigate configural, metric and scalar invariance [38,39]. These three levels of invariance were tested in turn by fitting three nested models with increasing constraints. For invariance across sex, for example, the same model was hypothesized in both groups and the sequence of nested model tests was: 1) configural invariance: unconstrained factor loadings and item thresholds; 2) metric invariance: factor loadings constrained to be equal across sex groups and unconstrained item thresholds; 3) scalar invariance: factor loadings and item thresholds constrained to be equal across sex groups. Each level of measurement invariance was considered to be present if the differences in fit indices between nested models met both criteria: a ΔCFI of -0.01 or above and a ΔRMSEA of 0.015 or below [40][41][42].
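The three-step decision sequence can be sketched in a few lines. The sketch assumes the rule is read as "ΔCFI no lower than -0.01 and ΔRMSEA no higher than 0.015" when moving from a less constrained to a more constrained model, and it takes the fit indices as already estimated elsewhere (for instance by the CFA software). The function names and the numeric values are hypothetical and are not results from this study.

```python
def invariance_holds(cfi_less, cfi_more, rmsea_less, rmsea_more,
                     cfi_tol=0.01, rmsea_tol=0.015):
    """Compare a less constrained model with a more constrained nested one.

    Differences are rounded to 3 decimals, the precision at which fit
    indices are usually reported, to avoid floating-point edge cases.
    """
    delta_cfi = round(cfi_more - cfi_less, 3)
    delta_rmsea = round(rmsea_more - rmsea_less, 3)
    return delta_cfi >= -cfi_tol and delta_rmsea <= rmsea_tol


def highest_level_reached(fits):
    """fits: list of (label, cfi, rmsea) ordered configural, metric, scalar.

    Returns the label of the most constrained model that still satisfies
    the rule, assuming the configural model itself fits acceptably.
    """
    reached = fits[0][0]
    for previous, current in zip(fits, fits[1:]):
        if not invariance_holds(previous[1], current[1],
                                previous[2], current[2]):
            break
        reached = current[0]
    return reached


# Hypothetical fit indices, for illustration only.
example = [("configural", 0.945, 0.058),
           ("metric", 0.943, 0.057),
           ("scalar", 0.940, 0.059)]
level = highest_level_reached(example)
```

Stopping at the first failed comparison reflects the nested logic of the sequence: scalar invariance is only examined once metric invariance has been accepted.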
Internal consistency was assessed using Cronbach's alpha and McDonald's omega coefficients (acceptable if ≥0.7) [43,44]. Test-retest reliability was assessed among patients in the ONCO sample who agreed to complete the API again at their next scheduled visit, using intraclass correlation coefficients (ICC, acceptable if ≥0.7) for scores on each API subscale [45]. To assess convergent validity, the association between API subscale scores and the patients' global judgment on their information preferences and desire to participate in decisions was evaluated using a one-way analysis of variance. Finally, for hypothesis testing, mean API subscale scores were compared, using a one-way analysis of variance or Student's t-tests as appropriate, between patients according to sex (a priori hypothesis: lower scores among men), age (lower scores among older patients), marital status (higher scores for singles) and educational level (higher scores for higher education levels).
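For readers unfamiliar with Cronbach's alpha, the coefficient can be computed from item-level answers as sketched below. The sketch uses the standard formula, alpha = k/(k-1) × (1 - sum of item variances / variance of the total score), on complete cases only; the function name and the toy data are hypothetical, and the authors' own computation was done in Stata.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete cases.

    rows: list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("each respondent needs the same k >= 2 items")
    item_variances = [variance([r[j] for r in rows]) for j in range(k)]
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: four respondents answering two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 4], [4, 3]])
```

A value of 0.7 or above would be read as acceptable under the threshold stated above; McDonald's omega is not sketched here because it requires the loadings of a fitted factor model.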

Relationships between items in the additional vignette and API subscale scores.
In the ONCO sample, the relationships between answers to the items in the additional clinical vignette and the API subscale scores were studied using Kruskal-Wallis or Mann-Whitney tests as appropriate. Fisher's exact tests were also used to study associations with global judgments on information preferences and the desire to participate in decisions.
Statistical tests were two-sided and a p-value<0.05 was considered significant. Analyses were performed using Stata v.14 software for data management and basic statistics and Mplus v7.4 software for the confirmatory factor analysis (CFA), which implements full information maximum likelihood to handle missing data (lower than 2% whatever the item in the whole sample) [46,47].

Results
The characteristics of the 578 patients included are described for each sample in Table 2. In the GP sample, subjects were younger (49±17 vs 64±12 years), more frequently women, with a lower level of education and less frequently professionals or managers. In the ONCO sample, cancer had been diagnosed for a median time of 20 (8-41) months and the primary tumour sites were lung, colon and/or rectum, pancreas and ovary for 47(25%), 27(14%), 23(12%) and 22(12%) of the patients respectively. In the GP sample, 297(76%) patients rated their health as "excellent, very good or good" and 93(24%) rated their health as "medium or poor".
Frequencies of answers to each item in the API in the two samples are summarized in S1 Table. No floor or ceiling effect was identified and there were fewer than 2.5% missing answers to each item. Scores on each of the subscales are shown in Table 2. No difference was found concerning the DM and IS scores, but significantly higher scores were observed for the URI and HBP vignettes in the GP sample than in the ONCO sample. Frequencies of answers to each of the three items in the additional vignette in the ONCO sample are shown in Table 1. A third of the sample preferred an equally shared decision with the doctor concerning the use of artificial respiration in this fictional situation, and three quarters would wish to address this point with their doctor in advance and thought that it was possible to give their opinion on this decision at a time when the situation had not yet arisen.
The three-factor CFA model shown in Fig 1 provided an acceptable fit to the data (CFI = 0.94, TLI = 0.93 and RMSEA = 0.060, 95%CI: [0.055 to 0.065]), better (p<0.001) than the fit of the two-factor model (CFI = 0.65, TLI = 0.61 and RMSEA = 0.142, 95%CI: [0.137 to 0.146]). As revealed by the ΔCFI and ΔRMSEA, no measurement invariance issue was found across groups defined by inclusion setting, age, sex, educational level, and language version, as the highest level of measurement invariance studied (scalar invariance) was reached (S2 Table).
Cronbach's alpha coefficients were 0.69 for the 6 DM items, 0.73 for the 9 items related to the clinical vignettes and 0.71 for the 8 IS items. McDonald's omega coefficients were 0.72 for the 6 DM items, 0.73 for the 9 items related to the clinical vignettes and 0.75 for the 8 IS items. For the assessment of test-retest reliability, 96 patients from the ONCO sample completed the API again at their next scheduled visit (mean time from baseline: 17±4 days). The ICCs were 0.80 (95%CI: 0.70 to 0.87) for the DM score, 0.59 (95%CI: 0.45 to 0.71) for the URI vignette score, 0.68 (95%CI: 0.55 to 0.77) for the HBP vignette score, 0.59 (95%CI: 0.44 to 0.71) for the MI vignette score and 0.72 (95%CI: 0.70 to 0.87) for the IS score.
Results concerning convergent validity and hypothesis testing are shown in Table 3. Good convergent validity was observed for every subscale with statistically higher scores in groups defined by a stronger desire for decision-sharing and information. The a priori hypotheses were supported by the data for all patient characteristics studied on most of the subscales.
The relationships between answers to the items in the additional clinical vignette and the API subscale scores are shown in Table 4. The desire to participate in the advance decisions was strongly and positively related to DM and the vignette subscale scores (p<0.001), but not to the IS score (p = 0.223); preference regarding the anticipation of this decision was related to the IS score (p = 0.010) but not to other API scores (p>0.05), and the ability to decide on this point at a time when the situation had not yet arisen was not related to any of the API subscale scores (p>0.05). The same relationships (or lack of relationships) were observed with the global judgement on information preferences and desire to participate in decisions.

Discussion
In this study, the French version of the API showed adequate psychometric properties for use among patients in primary care settings or among patients with incurable cancer. The additional vignette specifically developed for use among patients with advanced cancer brought additional information on the patients' preparedness to anticipate disease deterioration: while its first item (desire to participate in the advance decision to use artificial respiration) and second item (preference regarding the anticipation of this decision) correlated with the DM and IS subscale scores in the API, its third item (addressing patients' "ability to decide on this issue at a time when the situation has not yet arisen") did not correlate with any of the API subscales. Indeed, the API does not address the question of anticipation of end-of-life decisions.
Interestingly, whereas the practice of end-of-life discussions is far from being common in France and very few patients have written their living wills [5], very few patients (0 to 2%) failed to answer these items. Research on end-of-life quality of care has recently shown that Advance Care Planning (ACP) is beneficial for shared decision-making, the traceability of do-not-resuscitate orders, and the reduction of aggressive end-of-life care [12,26,48], but that it can also disrupt coping mechanisms for some patients. Indeed, results from the Coping with Cancer study suggested that patients with psychosocial factors such as emotional numbness may instead have their fears exacerbated by end-of-life discussions, resulting in unreasonable demands for care and life-maintaining treatments [49]. Educational initiatives to improve communication and enhance involvement in decision-making among seriously ill patients are therefore needed and are currently being developed in protocols that involve the perspectives of both healthcare providers and patients or caregivers [50][51][52]. Since the additional vignette provides information on patient preferences for anticipation, a theme that is not addressed by the API, and since it is well accepted by patients, it can be used in conjunction with the API, as a comprehensive scale to guide doctor-patient communication in the context of advanced cancer.
Concerning the original 23-item API, a three-factor CFA model was found to have a better fit to the data than a two-factor model. While two factors were initially hypothesized [19], the authors of a recently published study rigorously assessed the structural validity of the API adjusted for the setting of mental health and also found that a three-factor model provided a better fit for the data than a two-factor model [24]. This finding is also more consistent with the API scoring system, which distinguishes the vignette scores from the 6-item DM score, suggesting that these 15 items are likely to be linked to more than one factor. In agreement with previous findings, the desire for information factor was uncorrelated or only weakly correlated with the decision-making factors [19,21,23], and the same items (4, 6 and 20; the reverse-coded items) were found to have low loadings [25]. No further analyses were performed to assess the fit of adapted models (i.e. without these reverse-coded items, as in Bonfils et al. [25]), as our aim was to adapt the classic version of the API into French to facilitate comparisons with studies that have already used this version. However, our results are consistent with those of previous studies and suggest that it would be worthwhile to carefully reconstruct this instrument to enhance its psychometric properties.
Measurement invariance was assessed precisely to guarantee that group comparisons would be accurately interpreted. In this study, we did not find any measurement invariance issues related to age, sex, educational level or population studied. This means that, for example, the URI and HBP score differences observed between the two samples studied were not due to a different interpretation of one or several items in these two vignettes according to the setting. They could result from confounding, as many characteristics were imbalanced between these two samples, or from real preference differences, but not from measurement error deriving from differential functioning of the API between these two groups. In addition, thanks to the authors of the Australian study [26], we were able to assess measurement invariance related to the language version (French and English) and found no issue for the 14 items, setting aside the vignettes (not included in the Australian study), meaning that a comparison of the IS and DM scores obtained using the two language versions can be accurately interpreted.
Finally, the assessment of the other psychometric properties of the original 23-item version of the API showed an acceptable level of internal consistency according to Cronbach's alpha coefficients, an acceptable level of test-retest reliability according to ICCs, good convergent validity and adequate a priori hypothesis testing for most of the API subscales.
Of course, this study is not without limitations. First of all, the size of the sample of patients with incurable cancer was too small to accurately assess the structural validity of the API. To circumvent the difficulty in recruiting patients with an incurable illness, we decided to also recruit patients in a primary care setting and to assess measurement invariance of the API across settings. This study design provided a sample size sufficient for an accurate assessment of the structural validity across the two settings. Another limitation is that the three vignettes were not used in the Australian study, which precluded the assessment of measurement invariance across language versions for these vignettes. In most of the studies on the API, these vignettes are not used, and to our knowledge, our study is the first in which measurement invariance across language versions has been assessed for the IS and DM scores. Finally, it would have been interesting to assess measurement invariance according to other characteristics, such as anxiety and depression, which may influence the interpretation of some items of the API. However, due to time constraints, we did not collect information on depression and anxiety levels in the GP sample.

Conclusions
Our findings suggest that the French version of the API is valid and reliable in both general practice and oncology settings, and that accurate score comparisons can be made across age, sex, educational level, setting and English and French versions. The additional vignette developed provides interesting information on the patients' preparedness to anticipate disease deterioration, which can be of interest in the development of research on advance-care planning discussions to promote patient-centered care, ensuring that patient values guide all clinical decisions in the end-of-life period.