Usability assessment of seven HIV self-test devices conducted with lay-users in Johannesburg, South Africa

Introduction The first 90 of the 90-90-90 initiative introduced by the World Health Organization (WHO) in 2015 requires that 90% of people with HIV be aware of their status by 2020. In South Africa, conventional facility-based testing had reached 84.9% in 2018; innovative new methods, like HIV self-testing (HIVST), may close the testing gap. This study aimed to determine the usability of seven HIVST kits among untrained South Africans.
Methods This cross-sectional study of 1400 adults in Johannesburg evaluated the usability of five blood fingerstick and two oral fluid HIVSTs, using WHO prequalification criteria, from June 2016 to June 2018. Participants were handed one kit, with no further information about the device or test procedure, and asked to perform the test in front of an observer. The observer used product-specific semi-structured questionnaires organized into a composite usability index (UI) using a HIVST process checklist, a contrived results interpretation and a post-test interview that expanded on participant experiences with the device and instructions-for-use (IFU). Participants did not receive their own test results, but were provided with contrived results to interpret.
Results The average UI was 92.8% (range 84.2%-97.6%); the major difficulty was obtaining and transferring the specimen. Participants correctly interpreted 96.1% of the non-reactive/negative, 97.0% of the reactive/positive, 98.0% of the invalid and 79.9% of the weak positive results. Almost all participants (97.0%) stated they would visit a clinic or seek treatment for positive results; with negative results, half (50.6%) stated they should re-test in the next three months while one-third (36.1%) said they should condomize. Nearly all found the devices easy to use (96.6%), the IFUs easy to understand (97.9%) and felt confident using the test unassisted (95.9%), but suggested improvements to packaging and IFUs to further increase usability; 19.9% preferred clinic-based testing to HIVST.
Conclusion The UI and interpretation of results were high and in line with previous usability studies, suggesting that these kits are appropriate for use by the general, untrained and unsupervised public.

Introduction
HIV testing and counselling (HTC) represents the first '90' of the 90-90-90 initiative [1]. In South Africa, 84.9% of adults living with HIV know their HIV status; however, there are concerns that the country will not reach the 90% target by 2020 [2]. Established facility-based testing presents barriers to accessing HTC [3][4][5], and innovative new methods like HIV self-testing (HIVST) are needed to close the testing gap by reducing barriers for priority populations [6][7][8]. HIVST refers to a process in which a person performs their own HIV test, which may be more convenient, require less travel and waiting time, and be more private [9][10][11].
Initially, home-based HIV tests required the user to send a blood sample to a laboratory, then wait days or weeks for results; however, modern HIVST rapid diagnostic tests (RDTs) can provide results in minutes, without a laboratory [12]. In 2012 the Food and Drug Administration in the United States approved the OraQuick ADVANCE Rapid HIV-1/2 Antibody Test as the first-ever over-the-counter HIVST RDT, for use by individuals including those without any prior experience in HIV testing [13].
Growing evidence from different contexts supports the acceptability and usability of HIVST [14][15][16][17], and guidelines suggest that only validated products be used for programmes. Since then, 40 countries, including South Africa, have incorporated HIVST into national policies [18], and the South African National Department of Health will permit only WHO-prequalified products to be used in public health programmes [19]. The WHO prequalification (PQ) process was designed in accordance with international standards, which include assessing clinical performance among a broad range of self-testing users, with studies that compare the device to a suitable reference standard test [20].
The HIV Self-Testing Assessments and Research (HSTAR) programme supports HIVST developers in the PQ process by independently providing data on usability. The purpose of this study was to determine the usability of seven prospective HIVSTs in the hands of untrained, intended users, in line with the specifications set out by WHO in the abovementioned prequalification document.

Study design
This usability evaluation was a cross-sectional study that used convenience sampling to recruit consenting adults from the general population in the inner city of Johannesburg, South Africa, from June 2016 until June 2018. Using WHO PQ guidance, this study evaluated the usability of seven HIVSTs that were in varying stages of development for the South African market. The HIVST devices were evaluated independently and in series, with one device evaluation finishing before the next started. No participant was enrolled for more than one device.

HIVSTs
To simulate the real-world experience, each HIVST device being evaluated was presented in shelf-ready packaging (similar to other devices already approved for distribution within South Africa), using the manufacturer's instructions for use (IFU) and kit components. Seven HIVST devices were assessed: five fingerstick whole-blood (FS) devices and two oral fluid (OF) devices. The five FS devices were produced by Atomo Diagnostics (Australia) (two devices; generation 1 and generation 2), Biolytical Laboratories (Canada), Biosure Ltd (United Kingdom), and Chembio Diagnostic Systems (USA), while the two OF tests were produced by Calypte Biomedical Corporation (USA) and Orasure Technologies (USA).

Study participants
Based on the WHO prequalification documents, a sample size of 200 participants [20] was suggested for the usability assessment of each device (1400 total participants). Recruiters were made aware during training that the study aimed to include diverse age groupings and education levels, as well as equal gender participation [20].
Participants were included if they were 18 years or older, able to read English, first-time HIV self-testers, willing to provide oral fluid or fingerstick blood samples (according to which test was being evaluated) and reported an unknown HIV status. Healthcare workers (including lay counsellors who do HIV testing), any person who had any prior experience with HIV self-testing, persons who had received an experimental HIV vaccine or were taking HIV pre-exposure prophylaxis, and persons known to be HIV positive or to have any extenuating condition which may interfere with the process (such as poor vision or intoxication) were excluded. Participants were registered onto a Biometric Enrolment System, which uses fingerprint scanning to eliminate the chance of duplicate enrolment.

Pilot
At the time of evaluation, no verified or standardized questionnaires for investigating the usability of HIVSTs for prequalification were available, so a product-specific semi-structured questionnaire was developed based on the WHO PQ literature [20][21][22][23]. The questionnaire was piloted in a sample of 50 participants for each HIVST device. Findings were shared with the manufacturers, and they were encouraged to incorporate the feedback into their final product for evaluation [16,24]. Not all manufacturers chose to amend their end product, but in some cases, there were major edits to their IFU to ensure clarity and the successful performance of critical steps.
In order to evaluate participants' actions and perceptions regarding usability, the questionnaire was organized into three parts: a usability index guided by a HIVST process checklist; the interpretation of contrived results; and a post-test interview that assessed the participants' competency and experiences, while also inviting recommendations (see supporting information S1 Data Collection Checklists).
The observer remained silent throughout and did not provide any assistance or interference. To document each participant's actions, the observer used a product-specific semi-structured questionnaire, which included a product-specific checklist and a post-test interview. As this study evaluated only usability, not performance of the HIVST, the HIVST devices were removed after the final processing step (before results could be observed) and replaced with contrived tests that were developed by each manufacturer. Participants were provided with four contrived devices (non-reactive/negative, reactive/positive, weak positive and invalid) that displayed the possible results, presented serially and in random order, and were asked to interpret each result.
Usability index. Prior studies often describe usability qualitatively, as a way to provide feedback that can be incorporated into future designs; however, at the time of this study, there were no validated data collection tools to quantify usability [22,23]. A product-specific HIVST process checklist was developed to calculate a usability index that could be applied to each of the HIVSTs independently. This usability index was motivated by previous HIVST briefing documents from the Blood Products Advisory Committee [25], which quantified operational error rates by identifying and tracking errors based on the IFU. Instead of tracking erroneous steps to identify the error rate (expressed as a percentage), this study tracked successful steps in order to quantify usability with the usability index, which was also expressed as a percentage. The checklist and steps used to calculate each usability index were product-specific, so direct comparisons between HIVSTs could not be made.
In order to calculate the usability index, a device and IFU assessment of each HIVST was used to create a product-specific yes/no checklist of all steps for the HIVST procedure, which ranged from 10 to 17 questions. A trained observer used the checklist to document each participant's use of the HIVST, recording whether each step was completed successfully; the number of participants who successfully completed each step was presented as a frequency and percentage. For negatively phrased steps (e.g. "Was it difficult for the participant to remove the test device from the pouch?"), the 'No' response was the value counted towards the final usability index. The success percentages of all steps were then averaged to provide the usability index for each device, from 0% (unusable) to 100% (highly usable) [26,27].
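The calculation described above can be illustrated with a minimal Python sketch. The `Step` structure, the function names and all counts below are hypothetical and chosen only for illustration; they are not taken from the study checklists or datasets. The sketch assumes that each checklist item records how many participants received a 'Yes' and whether the item is negatively phrased.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One yes/no checklist item (hypothetical structure, not from the study)."""
    question: str
    yes_count: int          # participants for whom the observer recorded 'Yes'
    negative: bool = False  # True when 'No' is the successful response


def step_success_pct(step: Step, n_participants: int) -> float:
    """Percentage of participants who completed this step successfully."""
    if step.negative:
        successes = n_participants - step.yes_count  # count the 'No' responses
    else:
        successes = step.yes_count
    return 100.0 * successes / n_participants


def usability_index(steps: list[Step], n_participants: int) -> float:
    """Unweighted mean of the per-step success percentages (0 to 100)."""
    return sum(step_success_pct(s, n_participants) for s in steps) / len(steps)


# Invented counts for a three-item checklist with 200 participants.
checklist = [
    Step("Did the participant read and use the IFU?", 190),
    Step("Was it difficult for the participant to remove the test device "
         "from the pouch?", 10, negative=True),
    Step("Was the participant able to form a blood droplet?", 160),
]

ui = usability_index(checklist, 200)
print(f"Usability index: {ui:.1f}%")  # prints "Usability index: 90.0%"
```

In this sketch every step carries equal weight, which matches the simple averaging described in the text; a checklist that weighted critical steps more heavily would need a different formula.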
Interpreting contrived results. To evaluate the participant's ability to interpret the device results, contrived tests were provided by each manufacturer to represent the four possible test outcomes: 1) non-reactive/negative, 2) reactive/positive, 3) weak positive, and 4) invalid (no control). Observers used a yes/no checklist to document whether the participant noted the control line and the test line, then whether their interpretation of the results was correct. Observers also rated the participants' apparent level of confidence and satisfaction, whether the participant appeared calm, nervous, verbally distressed or confused, and whether staff intervention was requested.
HIV test results were not recorded or reported to participants. Participants who wished to have an HIV test were referred to clinics adjacent to the study sites, where testing is readily available; this is conventional practice in similar studies where HIV results are not made known to participants for research design reasons, and is a local ethics board requirement.
Post-test interview. As part of the post-test interview, participants were asked what to do following both positive and negative results, in order to evaluate how well the participant understood the IFU recommendations for each type of test result.
The post-test interview evaluated each participant's comprehension of the IFU and packaging information, whether they would use the test again or recommend it to a friend, and experiences throughout the testing process. Open-ended questions asked for comments and recommendations for their test device.

Data analysis
Data were transcribed from the product-specific semi-structured questionnaire into an Excel database by field workers. Quantitative data were analyzed with descriptive statistics in Excel. Qualitative data on usability and experiences were categorized and assessed to provide context and supplement the quantitative results.

Ethical considerations
The Human Research Ethics Committee of the University of the Witwatersrand provided approval for the study (No. 160306). Once approved, the protocol was registered with the National Human Research Ethics Committee (www.ethicsapp.co.za), the South African National Clinical Trial Registry (www.sanctr.gov.za) and ClinicalTrials.gov. Trained study staff obtained written informed consent from all study participants using an information sheet and informed consent document approved by the Wits HREC. The informed consent form was translated into English, Zulu, Sotho and Xhosa. Participants were given no incentives or reimbursements for their participation in the study.

Usability index
The average usability index for all seven HIVSTs was 92.8% (range 84.2% to 97.6%). Each HIVST is described individually in further detail below (Table 2; slightly adapted from the original questionnaire for ease of presenting the results), while the complete datasets for each HIVST are available as supporting information S1-S7 Datasets.
Biosure. The overall usability index of Biosure was 84.2%. Nearly all participants (193/200; 96.5%) read and used the IFU and 23 (11.5%) participants experienced difficulties opening the packaging. Twenty-one (10.5%) participants had difficulties lancing their finger, while 37 (18.5%) participants had difficulties forming a blood droplet, which resulted in only 156 (78.0%) participants being able to fill the tube with an adequate amount of blood during specimen collection. Approximately one quarter of participants had trouble placing the buffer pot upright in the slot (52/200; 26.0%) and were unable to push the test tube to the bottom of the buffer pot (47/200; 23.5%). Despite any missed or incorrect steps, 178 (89.0%) participants reached the last step.
INSTI. For INSTI, the overall usability index was 97.4%. All 200 participants read and used the information sheet and 10 (5.0%) had difficulty removing the device from the packaging. All participants were also able to remove the cap of Bottle 1 and the lancet. With respect to specimen collection, 194 (97%) participants correctly massaged their finger, 198 (99%) correctly lanced their finger, 188 (94%) participants were able to form a blood droplet, and 171 (88.5%) were able to successfully transfer the droplet into Bottle 1. Two (1.0%) participants had difficulties capping Bottle 1, seven (3.5%) participants did not shake the bottle the required four times, and only two (1.0%) participants were unable to pour the liquid from Bottle 1 into the device and wait until it disappeared. Similarly, seven (3.5%) participants did not shake the second or third bottle four times, and two (1.0%) participants did not pour the liquid from Bottle 2 or Bottle 3 into the device and wait until it disappeared. Despite any missed or incorrect steps, 199 (99.5%) participants completed the entire process.
Atomo1. The usability index for Atomo1 was 89.1%. All 200 participants read and used the information sheet and only 2 (1.0%) experienced difficulty removing the device from the pouch; however, 64 (32.0%) were unable to correctly place the device once removed. During specimen collection, all the participants were able to prime the lancet and correctly massage their finger; however, 5 (2.5%) participants did not successfully prick their finger and 19 (8.5%) did not squeeze their finger hard enough to create a blood droplet. As a result, only 126 (64%) participants filled the blood tube with the correct volume of blood. One hundred and eighty-one (90.5%) participants flipped the blood tube into the well but only 126 (63.0%) ensured that the blood successfully moved into the well. The 3 drops of buffer were added correctly by 182 (91.0%) participants and the test fluid ran across the strip for 185 (92.5%) participants. A total of 187 (93.5%) participants completed the entire process, despite any missed or incorrect steps.
Atomo2. The Atomo2 usability index was 97.6%. All 200 (100%) participants read and used the information sheet and only one (0.5%) had difficulty removing the device from the package. For specimen collection, all participants were able to remove the tab on the lancet and push the grey button to prick the finger. Although only 163 (83.0%) participants correctly massaged their finger to stimulate blood flow, 197 (98.5%) participants were able to form a blood droplet and successfully touch the blood droplet to the channel. Fourteen (7.0%) participants did not produce enough blood to adequately fill the channel, but 198 (99.0%) participants were able to press the button to activate the test and the fluid successfully ran across the strip of all but one (0.5%) test. Despite any missed or incorrect steps, all of the participants (200/200; 100%) completed the entire process.
Chembio. The usability index for Chembio was 93.7%. Almost all participants (198/200; 99.0%) read and used the information sheet, while six (3.0%) had difficulty removing the device from the foil package. One hundred and ninety-nine (99.5%) participants successfully set up the stand on a flat surface, while four (2.0%) participants were unable to remove the buffer cap or correctly insert the buffer cap into the test stand. While preparing for specimen collection, six (3.0%) participants did not open the disinfectant wipe, 39 (19.5%) had difficulties opening the sterile pad and 10 (5.0%) did not disinfect the finger and allow it to dry. While three (1.5%) participants did not push down hard enough to properly prick the skin, almost all participants (198/200; 99.0%) successfully uncapped the lancet, correctly placed it against the side of their fingertip and were able to squeeze out the first drop of blood. One hundred and seventy-four (87.0%) were also able to squeeze out a second drop of blood and 194 (97.0%) filled the testing device with an adequate amount of blood. After specimen collection, 137 (68.5%) participants used the sterile pad to wipe up the blood. Once the specimen was collected, 194 (97.0%) participants were able to insert the tip of the device into the test stand opening, 162 (81.0%) were able to push firmly through the foil cap (confirmed by 3 snaps) and 175 (87.5%) participants checked for the formation of a pink stain within one minute of puncturing the buffer pot. A total of 197 (98.5%) participants completed the entire process, despite any missed or incorrect steps.
Orasure. For Orasure, the overall usability index was 92.2% and all 200 participants read and used the information sheet. Six (3.0%) participants had difficulty removing the test device from the pouch and 20 (10.0%) participants had difficulty removing the test tube from the pouch. Once removed, 196 (98.0%) were able to remove the test tube cap, and 151 (75.5%) were
Calypte. The usability index for Calypte was 95.5%. All 200 participants read and used the information sheet and only three (1.5%) experienced difficulty removing the test contents from the box. Eighty-one (40.5%) had difficulty inserting the test tube into the box and 24 (12.0%) had difficulty pulling the cap off the test tube. For specimen collection, all participants successfully removed the oral brush from the plastic bag, 194 (97.0%) correctly used the brush (brushing the upper and lower gums twice each) and 199 (99.5%) correctly inserted the brush into the test tube once collection was complete. One hundred and ninety-five (97.5%) correctly pushed the brush up and down in the test tube (six to eight times), 197 (98.5%) correctly squeezed fluid from the brush, and all of the participants correctly removed the brush from the test tube. Once the brush was removed, 198 (99.0%) participants properly removed the test strip from the foil pouch and correctly dropped it into the test tube. All 200 (100%) participants completed the entire process, despite any missed or incorrect steps.
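As a quick arithmetic check, the seven device-level usability indices reported above can be averaged directly. The short Python sketch below uses only the percentages stated in the text; the variable names are illustrative, and the unweighted mean is an assumption that is reasonable here because each device was evaluated with the same target of 200 participants.

```python
# Usability index (%) reported for each device in the text above.
reported_ui = {
    "Biosure": 84.2,
    "INSTI": 97.4,
    "Atomo1": 89.1,
    "Atomo2": 97.6,
    "Chembio": 93.7,
    "Orasure": 92.2,
    "Calypte": 95.5,
}

# Unweighted mean across devices, plus the lowest and highest scoring device.
mean_ui = sum(reported_ui.values()) / len(reported_ui)
lowest = min(reported_ui, key=reported_ui.get)
highest = max(reported_ui, key=reported_ui.get)

print(f"mean {mean_ui:.1f}%, range {reported_ui[lowest]}% ({lowest}) "
      f"to {reported_ui[highest]}% ({highest})")
# prints "mean 92.8%, range 84.2% (Biosure) to 97.6% (Atomo2)"
```

Because each index comes from a product-specific checklist, this mean summarizes the set of devices and should not be read as a ranking between them.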
Participants acknowledged difficulty with some steps due to kit engineering (e.g. difficulty assembling the device, or fitting parts together), particularly where there was a tight fit or firm pressure was required. Participants suggested that some issues could be resolved through the IFU, asking for some steps to be clarified or emphasized in places:
"Some steps were hard to follow, pictures help" -Male, 25 years old, FS
When asked what they liked least about their HIV self-test, participants in the FS studies identified a general dislike of the needle; however, this was reported by fewer than 12% (6%-11%) of the fingerstick participants. Many participants had never used a safety lancet and were nervous about the needle, but then expressed surprise that it wasn't as painful as anticipated. Some participants expressed frustration with the fingerstick process ("needle not working" and "needle not sharp enough") and could not acquire enough sample (particularly noted for device Atomo1). For the fingerstick step, participants suggested improved instruction to successfully apply the lancet ("press firmly" with a picture of the correct location) and to obtain the necessary sample volume ("keep squeezing" to form blood droplet):
"Safety lancet is bit confusing; the producer should make it more visible." -Male, 48 years old, FS
For the OF studies, participants had minimal apprehension about obtaining a specimen; most of the complaints focused on difficulties with assembling the kit components ("cap is too tight" and "where to insert the tube"). When asked what they liked best about their HIV self-test, both OF and FS participants overwhelmingly liked the convenience and confidentiality of a self-test. Approximately 7% of OF self-testers specifically stated that they liked that their test required "no needle" and "no blood" (though four OF participants stated a higher confidence in blood-based testing).
Both OF and FS participants preferred home-based self-testing to clinic-based testing, and appreciated that the HIV self-test was confidential, fast, did not require clinic queues and gave them autonomy for their health decisions.
"Home test is easier and less scary compared to clinic." -Male, 30 years old, OF
Several participants expressed a desire for a trained professional to be available to assist in using the device (both FS and OF), if necessary, while others were concerned about the lack of counselling available with the HIVST.
"Before the client buy this product, the must be someone to demonstrate to the clients at pharmacy on how to use this product."

Discussion
While this is the first large-scale usability study incorporating multiple HIVSTs in South Africa, the strong usability outcomes are consistent with recent studies conducted in other populations. A study of the Exacto Test HIV in the Central African Republic showed that 91% of participants found the test easy to use and 91.6% performed the test correctly; however, 23% asked for oral assistance [28]. Similarly, a Kenyan INSTI usability study showed that 94.3% of participants found the test easy to use [23]. The high average usability index of 92.8% is further corroborated by 96.6% of all participants stating that the HIVSTs were easy to use.
Despite the high usability scores, this evaluation also provided observations and perspectives that may improve upon current offerings by identifying points of confusion or hesitation, as well as any critical and non-critical errors made during the self-testing process. For each device, difficulties with packaging, instructions, and/or kit components were identified for improvement to increase ease-of-use and reduce misuse of the self-test. Each device manufacturer received a final usability report to help assess product readiness for the HIV self-testing market in South Africa, along with recommendations for any HIVST kit improvements.
User errors were most common when participants were obtaining and transferring the specimen. For the FS devices, the most common specimen collection errors included lancing mistakes, not acquiring a sufficient blood droplet, or not adequately filling the transfer capillary. These results were comparable to a previous usability study of Atomo1 in adolescents from Cape Town, South Africa [21]. Errors in FS sample acquisition might be decreased through improvements to the IFU and/or more reliable lancets. For the OF devices, the most common specimen collection errors came from incorrectly swabbing the mouth, which could also be decreased through improvements to the IFU. Since no diagnostic results were collected, these user errors cannot be directly linked to incomplete or incorrect test results; follow-on studies of full diagnostic performance are currently underway.
As participant responses suggest, it would be beneficial to have a choice between whole blood and oral fluid HIVST devices, as different users have indicated a preference for each; one kind of self-test modality is unlikely to suit everyone. This variance was also observed in a recent study of men who have sex with men (MSM) in Mpumalanga, South Africa, which showed that OF tests (OraSure) were easier to use, while the FS tests (Atomo) were preferred by participants [29]. Even with the choice of OF and FS devices, self-testing alone is unlikely to be beneficial for everyone who needs an HIV screening test, as roughly 20% of all participants indicated they were uncomfortable performing the test without guidance or counselling at hand. This proportion was similar across the seven devices, suggesting that people's views on this are unrelated to any technical aspect of HIVST devices and may instead indicate a social preference. Similar findings were published in an OraQuick study from Singapore, which suggested that the acceptability of HIVSTs may be influenced by socioeconomic status [30]. Age may also be a factor, as the Cape Town adolescent usability study showed slightly lower usability scores with a similar style of data collection tool, with participants scoring the device 4 out of a maximum rating of 5 [21]. A multi-dimensional approach including home-testing, community-supported testing and clinic-based testing will likely be needed to reach key and under-tested populations, including men, adolescents and serodiscordant partners. The ability to privately and easily perform an HIV self-test is a much-needed innovation to enable earlier diagnosis of HIV and empower individuals to monitor their own health and behaviour.
Although not part of the usability index, the interpretation of contrived results and of what to do after testing also identified some areas of possible improvement. While most participants correctly interpreted non-reactive/negative, reactive/positive and invalid (no control line) results, approximately one in five participants misinterpreted the weak reactive/positive result, most commonly as non-reactive/negative. The results for the positive, negative and invalid tests were similar to the Central African Republic study; however, that study did not investigate the interpretation of weak positive results [28]. The contrived weak positive results ranged in intensity and were specific to each product, so the weak results of some devices may have appeared fainter than others. Most of the IFUs provided simple recommendations for test results with the pictured examples, such as "go to clinic" for a reactive/positive result and "re-test in 3 months" for a non-reactive/negative result; however, some IFUs did not include recommendations for the non-reactive/negative or invalid test results. We noted that participants achieved the most accurate result interpretation when the test device could be placed next to "life sized" examples of the possible test outcomes in the IFU. "Life sized" comparison examples of all possible test outcomes should therefore be prominently displayed in the IFU of each HIVST, a suggestion that was also made by Bwana et al in the Kenyan study, as that IFU also lacked a weak positive image [23]. This should be coupled with simple IFU recommendations of what to do after testing for each possible result.
The results of the HSTAR001 Usability study present general guidelines for safe and easy-to-use HIVST kits: minimizing packaging and clearly marking kit components; simplifying and ordering IFU steps (placing particular emphasis on key steps where necessary); minimizing the number of words used in favour of simple pictures and universal icons (e.g. for FS devices, a picture of the target finger and the orientation of the lancet, emphasizing the need to press firmly; for OF devices, a picture of how the user must place or move the tool inside their mouth); providing interpretation of all test outcomes and including "life size" pictures for device comparison; and clearly indicating the next steps for each type of test outcome.

Limitations
This study had some limitations. The convenience sampling may present a selection bias, as there were different proportions of sub-demographics between devices, which may have been due to the different communities surrounding each study location. The evaluation of the devices in series ensured no cross-contamination; however, participants from the community may have become more aware of HIVST by the time the last device was tested. For data collection, there is no validated or standardized usability test for HIVSTs, so the product-specific semi-structured questionnaire was developed internally and used to quantify usability. This questionnaire only allowed for each device to be evaluated independently. No direct comparisons between products could be made, because device components and IFUs were not standardized across kits. For example, there was no universal standard for the intensity of the weak positive used to test readings of contrived results. The WHO prequalification only requires independent data on usability and does not require any direct comparisons between products; however, we are developing a more standardized tool as a separate project, as comparisons would be beneficial as more HIV self-tests reach the market.
The usability and comprehension of test instructions are also likely to be context and population specific (for instance, high levels of participants preferring English IFUs), limiting the generalisability of these findings. Some responses may have been influenced by participants' previous experience of HIV testing and post-test counselling, which is high in the sampling area. Since no diagnostic results were collected, we could not ascertain whether the user errors identified in this study led to incomplete or incorrect test results; follow-on studies of full diagnostic performance are currently underway.

Conclusions
The results of the HSTAR001 HIVST Usability studies indicate that several devices show overall high levels of usability (ease-of-use) and acceptability in the hands of lay people. Correlating these usability results to test outcome and diagnostic accuracy is the next step in forthcoming studies.
Each of the seven HIVST devices evaluated in this study is intended to enter the South African HIVST market, and more are likely to follow. Several HIVST candidates are close to final evaluation of test performance and international regulatory approval, and one has recently achieved approval from the WHO PQ (Orasure). Lessons learnt from this evaluation have resulted in guidelines for use which, if taken up by manufacturers, will assist their HIVST candidates in achieving WHO PQ. Test design, usability and performance, however, are only the first step; to pave the way for HIVST uptake, there needs to be policy engagement to approve and support distribution channels appropriate to both the private pharmacy-based and public health markets.