Took the lead in conceiving and designing the study, interpreting the data, and writing the paper: RP. Contributed to the conception and design of the study, interpretation of the data, and writing and final approval of the paper: CMM AN RJH SM. Collected the data, contributed to the paper and approved final version: NSM OM AS MJG SL JS. Contributed to the interpretation of the data, writing the paper and approved final version: AMC.
The authors have declared that no competing interests exist.
The collection of accurate data on adherence and sexual behaviour is crucial in microbicide (and other HIV-related) research. In the absence of a “gold standard” the collection of such data relies largely on participant self-reporting. After reviewing available methods, this paper describes a mixed method/triangulation model for generating more accurate data on adherence and sexual behaviour in a multi-centre vaginal microbicide clinical trial. In a companion paper some of the results from this model are presented.
Data were collected from a random subsample of 725 women (7.7% of the trial population) using structured interviews, coital diaries, in-depth interviews, counting returned gel applicators, focus group discussions, and ethnography. The core of the model was a customised, semi-structured in-depth interview. There were two levels of triangulation: first, discrepancies between data from the questionnaires, diaries, in-depth interviews and applicator returns were identified, discussed with participants and, to a large extent, resolved; second, results from individual participants were related to more general data emerging from the focus group discussions and ethnography. A democratic and equitable collaboration between clinical trialists and qualitative social scientists facilitated the success of the model, as did the preparatory studies preceding the trial. The process revealed some of the underlying assumptions and routinised practices in “clinical trial culture” that are potentially detrimental to the collection of accurate data, as well as some of the shortcomings of large qualitative studies, and pointed to some potential solutions.
The integration of qualitative social science and the use of mixed methods and triangulation in clinical trials are feasible, and can reveal (and resolve) inaccuracies in data on adherence and sensitive behaviours, as well as illuminating aspects of “trial culture” that may also affect data accuracy.
The accurate measurement of product use and related behaviour in microbicide trials (but also in many other fields) is important for a number of reasons.
First, poor adherence reduces the chance of demonstrating effectiveness. If a trial shows overall benefit then relating the level of protection to adherence is valuable in interpreting the results, and has important implications for predicting effectiveness in real-life settings. Also, in order to properly interpret the results of trials that do not show a protective effect, it is necessary to be able to identify to what extent this may be due to the product not being efficacious; participants not using it, or not using it correctly; participants increasing protective behaviours such as condom use; increased risky behaviour related to the perceived protection of the product; or other high-risk behaviours such as anal sex.
Second, the use of investigational microbicides may negatively affect participants, either directly as a result of harmful side effects or indirectly as a result of changes in behaviour. Having accurate data on product use and related behaviour is important for assessing safety.
Third, understanding the reasons for different levels of adherence provides insights that are useful for the design of future clinical trials and for facilitating rollout and access if the product proves effective.
Finally, understanding the reasons for non-adherence and for not reporting or inaccurately reporting non-adherence and other relevant behaviours is also important because it can be fed back into the trial and used to improve adherence and the accuracy of adherence data. Similarly, understanding the issues involved in the inaccurate reporting of sexual behaviour and other relevant practices during the trial makes it possible to adjust data collection techniques and improve accuracy.
The assumption among biomedical researchers is that the best and most accurate measure of adherence (and other relevant behaviours) would be a validated biomarker – some objective biological indicator of whether the study product has been used or whether the participant has engaged in certain behaviours (such as condom use or unprotected sex). This could then be used as the “gold standard” against which the accuracy of other perhaps easier and cheaper methods could be measured. Unfortunately, although there are a number of potential biomarkers for both sexual behaviour and vaginal microbicide use, these have either not been adequately validated or are not feasible in large clinical trials due to issues such as cost, logistics and acceptability. So in the absence of a “gold standard”, the collection of data on adherence, sexual behaviour (including high-risk behaviours that are often stigmatised) and vaginal practices relevant to microbicide studies, relies largely on participant self-reporting, usually through structured questionnaires.
The limitations of structured questionnaires for collecting sexual behaviour and other sensitive data are well recognised. Also, because of the sensitivity of the topics and the likelihood of desirability bias, structured face-to-face interviews in a clinic setting are not ideal for collecting accurate data: if project staff promote condoms and ask participants to use gel every time they have sex, then participants are more likely to report that they have complied, and less likely to report stigmatised behaviours such as anal sex. Such interviews are also not ideal for understanding participants' reasons for non-adherence or the scope and reasons for inaccurate reporting. Various other methods are available for collecting self-reported sexual behaviour data, but these all have disadvantages too. In recent years behavioural research relating to HIV has moved increasingly toward using and comparing different methods, and microbicide researchers have started to experiment with methods based on participant self-assessment techniques such as computer-assisted self-interview (CASI). Usually these studies report on the fit between data from different self-report methods, or between self-report and biological data.
The use of mixed methods in a single study has often revealed inconsistencies between the data collected using different instruments, but not much attempt has been made to find out
What is required is to move beyond mere comparison to investigate and understand the reasons for divergent results, and then attempt to increase the accuracy of the results. This can be accomplished through the use of mixed methods and the triangulation of results, in dialogue with participants, and
In what follows we first review the pros and cons of the main methods currently available (or being developed) for measuring adherence and related behaviour in vaginal microbicide and similar studies. Then, after briefly describing the Microbicides Development Programme MDP301 Phase III trial, we describe in detail the mixed method/triangulation model that has been developed by the MDP team in an attempt to gather more accurate data on adherence and sexual behaviour. In a separate paper we discuss some of the findings from this process.
Because all self-reporting is ultimately dependent on the truthfulness, memory and accuracy of the study participants, the development and use of respondent-independent methods is seen as a priority.
All respondent-independent methods, but particularly biomarkers and smart applicators, also raise serious questions about trust and acceptability: how willing will study participants be to use products that have been designed on the assumption that they (the participants) are unreliable?
It is often argued that self-assessment reduces the risk of desirability bias and therefore generates more accurate reporting of sensitive behaviours. Various methods have been, or are being, tried:
Despite their individual merits for measuring adherence and sensitive behaviour, none of these self-assessment techniques are capable of generating in-depth understanding of behaviours or reasons for behaviours.
There are a number of other methods that are used (or could be used) in microbicide and related research to collect behavioural information.
One way of overcoming the disadvantages of individual methods and enhancing the accuracy (and depth) of self-report data is through the combination of different methods. There is a growing methodological literature on mixed method research
To develop or evaluate study tools and procedures.
To examine different aspects of the research question.
To broaden the scope of the research.
To triangulate results in order to get more accurate data.
Qualitative methods are not commonly used in the context of clinical trials, but when they are, this is generally in the form of small ancillary components aimed at collecting data on acceptability. They are also used to inform protocol development and questionnaire design in the early stages of the trial.
We do not limit the definition of “mixed methods” to the combination of quantitative and qualitative approaches and we consider that the use of different quantitative methods together, or different qualitative methods, could also be described as “mixed method” if they are used in the same project to study the same phenomenon or different aspects of the same phenomenon. We shall also avoid any theoretical discussion about the distinction between “methods” and “techniques”, mixed “methods” vs mixed “models”, and what exactly the term “mixed methods” does or should refer to.
Following this broader definition, there are various examples of the use of mixed methods to collect behavioural data in medical research on sexual and reproductive health. These studies have been concerned with the accuracy of data on sensitive behaviours and have often used biomarkers to validate self-report data. They have had mixed results. For example, one study of men and women attending an STI clinic in the US found that self-reported condom use was not supported by STI incidence
In microbicide research there is a trend towards experimenting with new methods – particularly CASI – in combination with more conventional ones, such as face-to-face interviews. CASI has been used in three MTN clinical trials in an attempt to get more accurate information on sexual behaviour. In the VOICE trial (MTN-003), a Phase IIb study of Tenofovir vaginal gel and Truvada tablets for the prevention of HIV infection in women, CASI is being used to ask the questions the researchers deem sensitive; at three of the South African Carraguard phase III trial sites CASI was assessed against various STI biomarkers
Until we have validated biomarkers for all the behaviours that are relevant to microbicide trials, such as adherence, sexual behaviour and condom use – something that does not seem likely in the near future – we will remain dependent on some form of self-reporting for most of these data. We will also continue to need self-report data to inform us about the details of and reasons for particular behaviours. It is therefore crucial to continue to develop and improve methods so that the data we collect are as accurate as they can possibly be. One way forward is to go beyond the parallel use of different methods to triangulation.
The term triangulation is derived from surveying and navigation, where it refers to finding a position – a fixed point – by getting bearings on different objects. The methodological use of the term is usually traced back to a 1959 article by Campbell and Fiske.
This is a simplification, however, and in the literature on social science research methods there has been heated discussion about what triangulation is and is not, and whether it is possible at all.
While a detailed consideration of these issues falls outside the scope of this paper, we are raising them here because we want to make clear that the triangulation model we describe here represents an attempt to move beyond simply comparing methods and trying to work out which is more accurate, toward developing a more composite and holistic picture, while at the same time accepting a necessary degree of uncertainty in the result. Perhaps the term “triangulation” is not the ideal term for this process, given its connotations of precision, but we will continue to use it here for want of a better one. The model described below is based on a model initially developed and used successfully for a number of years to study sexual behaviour change in Uganda.
MDP is an international partnership set up to evaluate vaginal microbicides to prevent HIV transmission.
Prior to the phase III trial, feasibility studies were conducted at each of the centres to assess retention of participants during 12 months of follow-up and to obtain estimates of HIV sero-incidence rates, pregnancy, and condom use in settings where condoms were promoted and provided free of charge and risk-reduction counselling and STI treatment were provided. The feasibility studies also assessed behavioural characteristics of the potential study populations. A pilot study, using placebo gel, followed the feasibility studies, and the results of this were used to inform the final protocol for the phase III trial.
A substantial social science component was included from the outset. The main objectives of this were to improve and assess the accuracy of adherence and sexual behaviour data, collect detailed data on sexual behaviour and vaginal hygiene practices, assess participants' comprehension of the study and the informed consent procedures, and assess the acceptability of the product and trial procedures.
At the start of the feasibility study it was not clear which methods, apart from the conventional clinical CRFs, would be used and what they would look like. Feasibility and pilot studies facilitated internal discussion and consultation with the study communities, as well as the development and testing of methods and approaches.
Because concerns had been voiced about the feasibility and acceptability of coital diaries in some of the study communities, the social science teams at each of the centres developed different formats and tested these in the study communities during the feasibility study. Key questions were: should the diaries be pictorial or text? How explicit should the pictures be, and would this be acceptable? Which behaviours should they cover? How detailed should they be? One site (Tanzania) tested five different formats.
One centre (Johannesburg) piloted CASI, in particular relating to sensitive topics such as anal sex, but the results were not very different from those achieved with interviews, and this, together with technical difficulties at the time and the impracticality for large study populations in some of the rural areas, led to the decision not to use CASI. The use of mobile phones was also considered and rejected for similar reasons.
Although the use and centrality of a case record form (CRF) for the collection of behavioural data was assumed from the start, the feasibility and pilot studies enabled it to be developed and tested in parallel to the coital diaries and in-depth interview guides, enabling the comparison of data for individual sex acts.
Another important aspect of the feasibility studies was to investigate the local cultural context and clarify key concepts and terms. This involved identifying key vernacular terms relating to relationships and sexual practices and exploring their meanings. Many of the relevant behavioural terms were highly ambiguous, and this was further complicated by the multilingual nature of some of the study sites. As a result there were often multiple possible translations, none of which reflected the exact meaning of the standard English terms that are used in this type of research. The feasibility and pilot studies facilitated some refining of translations and terminologies.
Different methodological options were assessed and developed during the feasibility study. An effort was made to select methods that were relatively simple and feasible across the different settings, and which had both complementary strengths and different weaknesses (to reduce the possibility that agreement in the results may be a result of sharing the same weakness). These methods were then tested in the pilot study and refined before being adopted in the trial:
Structured interviews recorded on case record forms (CRF)
Pictorial coital diaries (CD)
Semi-structured in-depth interviews (IDI)
Counting returned gel applicators
Focus group discussions (FGD)
Ethnography
This selection combines quantitative and qualitative, self-assessment and face-to-face, and self-report and a more respondent-independent technique. The only potential biomarker available at the time was the applicator stain test developed by the Population Council
Focus group discussions with community members and trial participants and ethnography carried out in the study communities and clinics provided additional contextual information. The CRF, CD and IDI were developed in parallel and covered the same topics and the same time period in order to facilitate comparison.
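Because the CRF, CD and IDI covered the same topics and the same time period, the first step of comparison (spotting where the sources disagree for a given participant) can be illustrated with a minimal sketch. This is an illustration only, not the procedure used in the trial: the function name, the source labels, the `tolerance` parameter and all counts are hypothetical, and in the model described here discrepancies were discussed and resolved with participants in the in-depth interview rather than by any automatic rule.

```python
from typing import Dict, List, Tuple

def flag_discrepancies(
    reports: Dict[str, Dict[str, int]], tolerance: int = 0
) -> Dict[str, List[Tuple[str, str, int]]]:
    """Return, per participant, the pairs of sources whose counts differ.

    `reports` maps a participant id to {source label: number of gel uses
    reported for the same recall period}. Labels and counts are hypothetical.
    """
    flagged: Dict[str, List[Tuple[str, str, int]]] = {}
    for pid, by_source in reports.items():
        sources = sorted(by_source)
        pairs = []
        # Compare every pair of sources once.
        for i, a in enumerate(sources):
            for b in sources[i + 1:]:
                diff = abs(by_source[a] - by_source[b])
                if diff > tolerance:
                    pairs.append((a, b, diff))
        if pairs:
            flagged[pid] = pairs
    return flagged

# Invented example data: one participant with a divergent CRF count.
example_reports = {
    "P001": {"CRF": 6, "CD": 4, "IDI": 4, "APPLICATORS": 4},
    "P002": {"CRF": 3, "CD": 3, "IDI": 3, "APPLICATORS": 3},
}

flagged = flag_discrepancies(example_reports)
for pid, pairs in flagged.items():
    print(pid, pairs)
```

A list of this kind would only indicate which participants and which instruments to raise in the interview; it says nothing about which source is correct.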
At each of the six African centres a subset of women was randomly assigned to the social science component of the study, which was responsible for the triangulation. The target sample size for this subset was at least 100 per centre (i.e. a total of 600 women across the trial). This number was thought to be small enough to enable the collection of detailed qualitative data and yet large enough to generate results that could be generalised to the whole trial population. By the end of recruitment we had recruited a total of 725 women (7.7% of the trial population) into the social science subsample.
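The random assignment of a subset of women at each centre can be sketched as follows. The mechanism actually used in the trial is not described here, so this is an assumption-laden illustration: the function name, centre labels, participant identifiers, enrolment numbers and seed are all hypothetical, and the sketch simply draws a fixed-size random sample per centre from a list of enrolled participants.

```python
import random
from typing import Dict, List

def draw_subsample(
    enrolled: Dict[str, List[str]], per_centre: int, seed: int
) -> Dict[str, List[str]]:
    """Draw up to `per_centre` participants at random from each centre.

    A seeded generator makes the draw reproducible and auditable.
    """
    rng = random.Random(seed)
    chosen: Dict[str, List[str]] = {}
    for centre in sorted(enrolled):          # fixed order keeps the draw deterministic
        ids = sorted(enrolled[centre])
        k = min(per_centre, len(ids))        # a small centre contributes everyone it has
        chosen[centre] = sorted(rng.sample(ids, k))
    return chosen

# Hypothetical enrolment lists (identifiers and sizes are invented).
enrolled = {
    "centre_A": [f"A{i:04d}" for i in range(1500)],
    "centre_B": [f"B{i:04d}" for i in range(80)],
}

subsample = draw_subsample(enrolled, per_centre=100, seed=2024)
print({centre: len(ids) for centre, ids in subsample.items()})
```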
All trial participants had 4-weekly clinic visits during which they received gel and condom supplies, returned used and remaining unused gel applicators, and were interviewed using a CRF. The visits at weeks 4, 24, 40 and 52 were longer as they included a clinical interview and examination and the CRF interview was more detailed, containing questions about gel use, vaginal washing and other practices, and detailed questions on each sex act during the last week (or four weeks if the participant did not have sex in the last week). The triangulation procedures were linked to three of these long clinic visits, at weeks 4, 24 and 52. (It was felt that it was sufficient to triangulate data early, in the middle and at the end of follow-up, and therefore unnecessary to include these procedures at week 40 as well.)
The social science component of the trial was made up of teams at each centre consisting of 4–5 interviewers led by a senior social scientist. The social science component was coordinated centrally to ensure standardised procedures and training.
The final social science dataset consists of 1866 in-depth interviews, most with matching CD, CRF and applicator count data, from 725 women. In addition there are 462 interviews with 244 male partners. There are also 100 FGDs with trial participants who were not randomised to the social science component, 119 FGDs with community members, and extensive ethnographic notes. These have all been transcribed and coded in NVivo.
In this section we discuss some of the practical issues relating to this approach, looking first at what worked and then considering some of the problems and potential solutions.
MDP301 demonstrated that it is possible to integrate a substantial qualitative component into a clinical trial, even one carried out under the stringent criteria required for product licensing. It showed that this approach can greatly enhance the quality and richness of key trial data.
Although feasibility and pilot studies are not essential for a mixed method approach, they did enhance the quality of the collaboration and the data collection tools. In addition to being important from a clinical trial perspective for assessing HIV incidence and retention, the feasibility studies provided space for the development of innovative approaches to collecting sensitive information and the opportunity to integrate these into a coherent methodological whole. They were crucial for understanding the terminologies and meanings that are necessary for developing valid instruments, especially in multicultural and multilinguistic research settings. They were also important for building trust between collaborators from disciplines with very different approaches. Having a pilot study between feasibility studies and trial was important for testing trial and clinic procedures, gel distribution, etc., but also for piloting the combination of methods and the triangulation procedures.
The combination of different but complementary methodological
Four main problem areas emerged from, or were highlighted by, this approach.
Although mitigated to some extent by the ethnographic work during the preparatory phase, the sensitivity of the topics and the lack of fit between the participants' messy descriptions and vague categories and the quantification and ostensibly precise categories of the trialists meant that some ambiguity persisted in the data (for example relating to what should be considered a “sex act”). More ethnographic work could have been done to clarify terminologies in the early stages. Here it is important to focus on how researched communities
Various tensions arose due to different disciplinary assumptions and epistemologies. We give two examples. First, quantitative medical researchers assume that in order for data to be standardised and comparable, respondents must be asked exactly the same question in exactly the same way. This requires reading the question and related explanatory information verbatim from the questionnaire. Here “the same question” refers to the wording and the delivery. From a qualitative perspective, however, the focus is more on the respondents' interpretation of the questions and what they mean by their answers. In other words, it might be necessary to word and ask questions differently in order to ask the “same” question and get comparable answers. For example, it is clear from the in-depth interviews that, as the trial progressed and participants became familiar with the trial definition of a sex act, the meaning of the questions about sex acts changed for them, while the wording of the CRF questions (and their meaning for the trialists) remained the same. This suggests that getting more reliable data might actually require using more open questions.
Second, although it was accepted by the trialists that a degree of flexibility was necessary in the collection of qualitative data, the relatively inflexible clinical trial culture tended to impinge on this freedom. For example, it was difficult to adjust the in-depth interview question guide during the study because of the assumption that it would then need new IRB approval. It also took many months to get agreement (and then only in some research centres) to carry out additional follow up interviews with participants about suspected gel sharing and dumping, because such interviews were not described in the trial protocol. This lack of flexibility is partly due to the assumptions underlying quantitative research and partly a result of the proliferation of GCP rules and IRB requirements, which are perhaps appropriate for clinical data collection but less so when applied to qualitative behavioural studies.
MDP invested much time and effort in training interviewers, and it is difficult to imagine that more could have been done. However, the triangulation process revealed that many of the inaccuracies that could be traced to the clinic CRF were a result of errors made by the interviewers. This was confirmed when we recorded and transcribed a sample of CRF interviews and compared these with the completed CRFs for the same interviews. A similar problem also bedevilled some of the in-depth interviews, which sometimes left much to be desired with regard to the depth of the probing and the follow-up of potentially interesting topics.
These problems might have been mitigated by more training, and by better quality control (for example through regular recording and comparison of a subsample of interviews, as mentioned above, and the integration of this into a system of ongoing training). But recruitment and selection of interviewers is perhaps more crucial. It should not be simply assumed that with a little interview training nurses and counsellors make good interviewers, or that because someone has a degree in social science they are naturally able to do rich in-depth interviews. Good interviewing techniques can be learnt, but some people have more aptitude for this type of social interaction than others, and this should be taken into account when recruiting interviewers (for example by getting applicants to do an interview as part of the selection process).
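The quality-control step suggested above, comparing what a participant said in a recorded interview with what the interviewer entered on the CRF, could be supported by a simple field-by-field check. The following is a minimal sketch under stated assumptions: the function name, the field names and the answers are hypothetical, and it presumes that the recorded answers have already been transcribed and coded into the same fields as the CRF.

```python
from typing import Dict, List, Tuple

def compare_crf_to_recording(
    transcribed: Dict[str, str], entered: Dict[str, str]
) -> Tuple[float, List[str]]:
    """Return the share of fields that disagree, and the names of those fields.

    `transcribed` holds answers coded from the recording; `entered` holds
    what the interviewer wrote on the CRF. A field missing from either
    side counts as a mismatch.
    """
    fields = sorted(set(transcribed) | set(entered))
    mismatched = [f for f in fields if transcribed.get(f) != entered.get(f)]
    rate = len(mismatched) / len(fields) if fields else 0.0
    return rate, mismatched

# Invented example: one of four fields was entered incorrectly.
transcribed = {
    "sex_acts_last_week": "3",
    "gel_used_every_act": "no",
    "condom_used": "yes",
    "vaginal_washing": "yes",
}
entered = dict(transcribed, gel_used_every_act="yes")

rate, mismatched = compare_crf_to_recording(transcribed, entered)
print(rate, mismatched)
```

Tracked per interviewer over time, a measure like this could feed into the ongoing training mentioned above, although it would not capture weak probing in in-depth interviews.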
This trial generated the largest set of qualitative data that has ever been collected in a single study, as far as we are aware. Although this had the advantage of enabling us to generate numbers from qualitative data for a relatively large and representative proportion of the trial population (7.7%) across multiple sites, it also brought with it a number of technical problems. For example, existing versions of software packages designed to manage and code qualitative data proved incapable of handling such a large dataset, and the data had to be spread across numerous databases. Qualitative software developers need to work toward increasing the capacity of the databases that are part of their programs.
Also, in-depth interviews are time consuming to conduct and, especially, to transcribe, translate and code; given the large number of interviews, this frequently led to backlogs. However, only part of the in-depth interview was devoted to the triangulation of adherence and sexual behaviour data, with the rest focusing on broader contextual issues. It should therefore be relatively easy to separate the triangulation process from the rest of the in-depth interview and make it into a much shorter process involving all trial participants, while limiting the rest of the in-depth interview to a smaller sub-sample of participants. This “triangulation interview” could then be the source of the final quantitative trial data on adherence and behaviour. These triangulation interviews could be routinely recorded and a sub-sample transcribed for training and quality control.
This paper has described the integration of qualitative and anthropological methods and innovative quantitative methods into a large multi-centre clinical trial, and the triangulation of results in order to obtain more accurate data on product use and sexual behaviour. While there are various examples of the use of mixed methods in clinical trials, the Microbicides Development Programme has, as far as we are aware, developed and implemented the most comprehensive combination of mixed methods and triangulation in a clinical trial to date. The study is unique in having integrated these into the trial in order to improve accuracy rather than using them for parallel or retrospective evaluations, and in the way that the qualitative data were collected from a substantial representative sample of the trial population. The key innovative aspect is the identification and resolution of inaccuracies in the data during the study in a process that involved a customised in-depth interview and dialogue between researchers and participants.
It is often argued that this type of research is time consuming, that it costs too much, and that the results are not “objective”. But if it is not done then trialists risk having spent millions of dollars and still ending up not knowing what it was they paid so much to find out. The experience in MDP301 suggests ways of re-thinking how we get a true grip on the most challenging aspect of HIV prevention research: adherence to protocol and to prevention behaviours that require enduring commitment.
Many people have contributed, directly or indirectly, to this paper. We would like to thank the members of the social science teams who collected the data: Jessica Philip, Mdu Mntambo, Sello Seoka, Florence Mathebula, Elisha Hilali, Amina Sufian, Veronica Selestine, Kagemlo Kiro, Salma Matari, Stella Namukwaya, Racheal Kawuma, Winifred Nalukenge, Henry Luwugge, Robert Lubega, Misiwe Mzimela, Sizakele Sukazi, Armstrong Nkhule Mngomezulu, Cebile Mdluli, Bisalomo Mwanza, Serah Kalumbilo. We would also like to thank Mary Rauchenberger and Steven Sheehan for their contribution to the development of the quantitative databases and their patient assistance whenever it was required, and Precious Lunga, Nicola Kagenson and Julie Bakobaki for their role in central coordination. Finally, a special word of thanks to the reviewer at PLoS One, Polly Harrison, whose various rounds of critical-but-constructive comments and suggestions have made this paper so much better.