The Adverse Event Unit (AEU): A novel metric to measure the burden of treatment adverse events

  • Michael K. Hehir,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Writing – original draft, Writing – review & editing

    Affiliation Department of Neurological Sciences, University of Vermont Larner College of Medicine, Burlington, Vermont, United States of America

  • Mark Conaway,

    Roles Data curation, Formal analysis, Investigation, Methodology, Writing – original draft

    Affiliation University of Virginia, Charlottesville, Virginia, United States of America

  • Eric M. Clark,

    Roles Conceptualization, Data curation, Software, Writing – review & editing

    Affiliation University of Vermont Complex Systems Center, Burlington, Vermont, United States of America

  • Denise B. Aronzon,

    Roles Conceptualization, Data curation, Methodology, Writing – review & editing

    Affiliation Timberlane Pediatrics, South Burlington, Vermont, United States of America

  • Noah Kolb,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Department of Neurological Sciences, University of Vermont Larner College of Medicine, Burlington, Vermont, United States of America

  • Amanda Kolb,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Department of Family Medicine, University of Vermont Larner College of Medicine, Burlington, Vermont, United States of America

  • Katherine Ruzhansky,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Medical University of South Carolina, Charleston, South Carolina, United States of America

  • Reza Sadjadi,

    Roles Conceptualization, Writing – review & editing

    Affiliation Harvard Medical School Massachusetts General Hospital, Boston, Massachusetts, United States of America

  • Eduardo A. De Sousa,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Mercy Clinic Neurology, Neuroscience Institute of Oklahoma City, Oklahoma City, Oklahoma, United States of America

  • Ted M. Burns

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft

    Affiliation University of Virginia, Charlottesville, Virginia, United States of America



To design a physician and patient derived tool, the Adverse Event Unit (AEU), akin to currency (e.g. U.S. Dollar), to improve AE burden measurement independent of any particular disease or medication class.


A Research Electronic Data Capture (REDCap) online survey was administered to United States physicians with board certification or board eligibility in general neurology, subspecialty neurology, primary care internal medicine or family medicine, subspecialty internal medicine, general pediatrics, and subspecialty pediatrics. Physicians assigned value to 73 AE categories chosen from the Common Terminology Criteria of Adverse Events (CTCAE) relevant to neurologic disorder treatments. An online forced choice survey was administered to non-physician potential patients through Amazon Mechanical Turk (MTurk) to weight the severity of the same AE categories. Physician and non-physician data were combined to assign value to the AEU. Surveys were completed between 1/2017 and 3/2019.


363 physicians rated the 73 AE categories derived from CTCAE. 660 non-physicians completed forced choice experiments comparing AEs. The AEU provides 0–10, weighted values for the AE categories studied that differ from the ordinal 1–4 CTCAE scale. For example, CTCAE severe diabetes (category 4) is assigned an AEU score of 9. Although non-physician input changed physician assigned AEU values, there was general agreement among physicians and non-physicians about severity of AEs.


The AEU has promise to be a useful, practical tool to add precision to AE burden measurement in the clinic and in comparative efficacy research with neurology patients. AEU utility will be assessed in planned comparative efficacy clinical trials.


There is increasing emphasis on adverse event (AE) burden in neurology as new treatments are approved [1–3]. AEs cost more than $136 billion per year and add an average of 5 days to neurological hospitalizations [4–7]. AEs are important to patients and represent a barrier to treatment adherence. When structuring neurological treatment paradigms, among medications with equal efficacy, treatment decisions will be dictated by differences in AE burden, treatment burden, and cost. We remain without a practical metric to measure AEs that facilitates comparison of medications within and across different classes based on AEs alone.

The Adverse Event Unit (AEU) is a physician and patient weighted consensus unit, akin to currency (e.g. US dollar), designed to quantify and compare AE burden over time. Unlike previous measures, the AEU facilitates AE measurement independent of any disease or medication class, in terms of a number of AEUs that can be compared over time [8–13]. AEU scores can be combined with other outcome metrics and quality of life scores to better define the differences among treatments in comparative efficacy trials and in the clinic. Understanding AE tolerance in different neurological conditions and AEU validation against other disease metrics is planned for future studies. This manuscript describes the derivation of the AEU and potential applications for this new tool.


Development of the AEU was designed as a two-phase protocol to obtain input from physician experts and potential patients. In the first phase, US physicians assigned weight to the severity of AE associated with treatments for neurological illnesses. In the second phase, non-physician potential patients recruited through the Amazon Mechanical Turk (MTurk) service rated the severity of the same group of AE. Data obtained from both phases were combined to generate value for the AEU. Surveys were completed between 1/2017 and 3/2019.

Standard protocol approvals, subject consent

The institutional review board (IRB) at the University of Vermont approved this protocol with a waiver of consent as all subjects were recruited anonymously through on-line surveys. Survey completion implied consent.

Physician subjects

United States physicians completed an on-line survey utilizing the secure Research Electronic Data Capture tool (REDCap) [14] hosted at University of Vermont. The target population was physicians with board certification or board eligibility in general neurology, subspecialty neurology, primary care internal medicine or family medicine, subspecialty internal medicine, general pediatrics, and subspecialty pediatrics. These specialties were chosen to capture the broad range of physicians who provide medical care for neurological patients.

Champions (MKH, TMB, DBA, KR, NK, AK, and ED) identified at US centers recruited colleagues in their communities and at other centers through a combination of targeted emails and in person meetings with groups of physicians. The American Academy of Neurology facilitated recruitment of current and previous physician recipients of the development award that supported the current study. All respondents were encouraged to forward the survey to colleagues in the aforementioned medical specialties.

Potential patient subjects

The online survey tool, MTurk, was used to recruit potential patients to represent a sample of the general population in the United States. MTurk is a viable and validated method to collect data about clinical and social science populations [15, 16]. MTurk participants produced similar results when compared to in person university recruited populations in psychological surveys, behavioral tests, matched comparison groups, economic experiments, clinical studies, and social science studies [17–20]. In general, the MTurk participants tend to be of younger age. To sample a broad age range reflective of a typical neurology patient population, we stratified the surveys into the following available age cohorts: 25–30 years, 30–35 years, 35–45 years, 45–55 years, and greater than 55 years. The MTurk tool did not permit additional age stratification in the greater than age 55 years category. Subjects were paid $5 for survey completion.

Survey design and administration (Fig 1)

Items for analysis.

The investigators (TB and MH) chose 73 AE categories relevant to medications prescribed across the field of neurology from the Common Terminology Criteria of Adverse Events (CTCAE) version 4 for analysis (Appendix 1 in S1 File) [21]. The CTCAE is a physician expert derived, widely employed, ordinal [1–5], unweighted scale commissioned by the National Cancer Institute used to measure AE in many clinical trials [21]. Although AE severity increases along the CTCAE scale, items given the same value may not be of equal burden. For example, the AE of moderate hypertension (level 3), which carries the long-term risk of cardiovascular complications, is given the same score as a high fever of <24 hours duration (level 3). The CTCAE category 5 corresponding to death was not analyzed as we are interested in assigning value to AE that can be monitored over time while a patient undergoes treatment. The finite category of death can be measured independently without weighting because death from any cause is presumably of equal importance and consequence. Under the guidance of a board certified pediatrician (DA), items to measure congenital complications were adapted from the DSM-5 definitions of intellectual disability, an epilepsy research classification of congenital abnormalities, and neural tube defect classification systems [22–25].

Physician subjects.

Each physician subject was asked to assign values (0 = no significance to 10 = most significant) to a random sample of 30 AEs within and across the chosen 73 CTCAE and congenital categories of varying severities. They were asked to consider each AE independent of any one disease or treatment. Subjects were also asked to factor scores they assigned both within and across AE categories as they rated AE in the survey (Appendix 2 in S1 File). A separate pediatrician survey included all the congenital malformation AE evaluated in addition to non-congenital AE. Adult physicians also rated congenital AE. Median scores with associated interquartile ranges were calculated to assign initial value to the AEU. This method of assigning weighted values to the AE categories identified from the CTCAE was adapted from the method used by members of our research team (TB and MC) in the construction of the MG-Composite, a weighted, consensus, outcome measure validated for use in evaluating patients with myasthenia gravis [26].
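As an illustration of the summary statistic described above, the following minimal Python sketch computes a median and interquartile range for a single AE item. The ratings are made-up values, not study data, and the quartile convention (`method="inclusive"`) is an assumption, since the text does not state which convention was used.

```python
from statistics import median, quantiles

def summarize_ratings(ratings):
    """Return (median, first quartile, third quartile) for 0-10 ratings of one AE item."""
    # Assumption: "inclusive" quartiles; the article does not specify a convention.
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return median(ratings), q1, q3

# Hypothetical physician ratings for one AE item (not study data).
example_ratings = [5, 6, 6, 7, 7, 8, 9]
print(summarize_ratings(example_ratings))
```

The median serves as the initial AEU value for the item, and the interquartile range indicates how much the raters disagreed.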

Potential patient subjects.

A subset of AE derived from the CTCAE and weighted by the physicians in phase 1 were converted into lay descriptions informed by Mayo Clinic descriptions of symptoms and medical conditions. Potential patient friendly versions of the CTCAE have been employed in other studies [27].

Potential patient subjects reviewed pairs of AE descriptions (Table 1 and Appendix 3 in S1 File) assigned different AEU values by the physician subjects. AE pairs to review were randomly computer generated so that the compared AEs were from different AE categories and had been assigned different scores by the physician subjects. In the style of a discrete choice experiment, subjects were asked “After reviewing each pair of AE, please choose which of the two AE would be least tolerable (i.e. the most severe of the pair)” [28, 29]. They were also asked to consider: impact on quality of life (QOL), impact on life expectancy, future medical complication risk, likelihood for AE resolution following therapy change, and other factors considered important to the subject. Potential patients were not told how the physicians weighted the AE being evaluated.
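The pairing constraint described above (different AE categories, different physician scores) can be sketched in a few lines of Python. This is an illustrative sketch only: the `AEItem` structure, the function names, and the choice to draw distinct pairs without replacement are assumptions, not the study's actual survey software. The four example scores are physician values quoted in this article.

```python
import random
from collections import namedtuple
from itertools import combinations

# Hypothetical record type for one AE description.
AEItem = namedtuple("AEItem", ["category", "description", "physician_score"])

def eligible_pairs(items):
    """All pairs drawn from different AE categories with different physician scores."""
    return [
        (a, b)
        for a, b in combinations(items, 2)
        if a.category != b.category and a.physician_score != b.physician_score
    ]

def draw_comparisons(items, n_pairs, rng):
    """Randomly draw n_pairs distinct eligible pairs for one respondent."""
    return rng.sample(eligible_pairs(items), n_pairs)

ITEMS = [
    AEItem("headache", "severe headache", 6),
    AEItem("hallucinations", "moderate hallucinations", 6),
    AEItem("diarrhea", "mild diarrhea", 3),
    AEItem("malignancy", "treatment related malignancy", 9),
]

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
for a, b in draw_comparisons(ITEMS, 3, rng):
    print(a.description, "vs", b.description)
```

Filtering before sampling guarantees that no respondent is shown a pair with tied physician scores, such as severe headache versus moderate hallucinations in this example.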

Combining physician and potential patient data.

Bradley-Terry models were fit to the choices made by the potential patients using Firth’s method for penalized maximum likelihood logistic regression in SAS version 9.4 [30, 31]. The Bradley-Terry model uses the paired comparisons obtained through MTurk to estimate a set of ‘less preferred’ parameters for the AE. These parameters have the property that, if an AE with parameter A is compared to a second AE with parameter B, we would estimate that a proportion A/(A+B) of potential patients would choose the first AE as less tolerable. Once ‘less preferred’ parameters from potential patient choices were estimated, we created integer scores by applying K-means clustering to the less preferred estimates, setting the number of clusters equal to 9 to match the range of integer scores provided by the physicians (2 to 10). Final adjustment to AEU scores was done by comparing the physician and the potential patient AEU scores. If the potential patient AEU score was greater than the physician assigned AEU, we increased the physician AEU across an entire AE category (e.g. hypertension) rating by 1. If the potential patient AEU score was less than the physician assigned AEU, we decreased the physician AEU across an entire AE category rating by 1. This method of combining physician and potential patient AEU scores was chosen to give weight to the expertise of physicians in understanding the overall short and long-term sequelae of the rated AEs. It is essential to arrive at a single AEU scale to achieve the ultimate goal of developing a combined, easy to administer, best fit, weighted, consensus unit that would be feasible to administer in a clinical practice setting or clinical trial.
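The three steps above can be sketched in Python under stated assumptions. This is not the SAS analysis used in the study: the fit below uses plain maximum likelihood via the MM updates described by Hunter [31] rather than Firth's penalized logistic regression, so it needs every AE to be chosen at least once. The function names, the deterministic K-means initialization, and the per-score form of the adjustment rule are all illustrative choices; how the rule is aggregated across an entire AE category is not detailed in the text.

```python
from collections import defaultdict

def fit_bradley_terry(comparisons, iterations=200):
    """Estimate 'less preferred' parameters from (chosen, other) pairs.

    `chosen` is the AE picked as less tolerable. Unpenalized MM updates;
    assumes every AE is chosen at least once.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    for chosen, other in comparisons:
        wins[chosen] += 1
        wins[other] += 0  # register AEs that were never chosen
        pair_counts[frozenset((chosen, other))] += 1
    params = {item: 1.0 for item in wins}
    for _ in range(iterations):
        updated = {}
        for i in params:
            denom = sum(
                n / (params[i] + params[j])
                for pair, n in pair_counts.items() if i in pair
                for j in pair - {i}
            )
            updated[i] = wins[i] / denom
        total = sum(updated.values())
        # Rescale so the parameters average 1; only their ratios matter.
        params = {i: v * len(updated) / total for i, v in updated.items()}
    return params

def kmeans_1d_scores(values, k, lowest_score, iterations=100):
    """Cluster 1-D values into k groups; map clusters, ordered by centre, to integers."""
    ordered = sorted(values)
    # Assumption: deterministic start with centres spread over the sorted values.
    centres = [ordered[round(i * (len(ordered) - 1) / (k - 1))] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            groups[nearest].append(v)
        centres = [sum(g) / len(g) if g else centres[c] for c, g in enumerate(groups)]
    rank = {c: r for r, c in enumerate(sorted(range(k), key=lambda c: centres[c]))}
    return [
        lowest_score + rank[min(range(k), key=lambda c: abs(v - centres[c]))]
        for v in values
    ]

def adjust_physician_score(physician, patient):
    """Move the physician score one point toward the potential patient score."""
    if patient > physician:
        return physician + 1
    if patient < physician:
        return physician - 1
    return physician

# Hypothetical choices: "A" was judged less tolerable than "B" in 3 of 4 comparisons.
choices = [("A", "B")] * 3 + [("B", "A")]
params = fit_bradley_terry(choices)
print(params["A"] / (params["A"] + params["B"]))  # estimated proportion choosing A
print(kmeans_1d_scores([0.1, 0.12, 1.0, 1.1, 5.0, 5.2], k=3, lowest_score=2))
print(adjust_physician_score(physician=6, patient=8))
```

In the study's setting the clustering would be called with `k=9` and `lowest_score=2`, so that cluster ranks map onto the 2 to 10 range of physician scores.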


Physician subjects

The targeted medical specialties were well represented (Table 2). The group was experienced and covered a wide geographic region. Recruited physicians had male and academic practice predominance. Primary care physicians practicing through university associated medical centers largely self-identified as in academic practice.

Potential patient subjects

The potential patient cohort represented the wide range of ages typical of neurology patients (Table 2). The variables of geographic regions in the U.S. and sex were equally represented. Potential patients with college level of education or above were overrepresented. In addition, African Americans, Hispanic Americans, and Asian Americans were slightly underrepresented when compared to most recent U.S. Census data [32].

Phase 1: Physician weighting

Three hundred sixty three physicians provided data from 397 surveys; 34 physicians completed two different surveys with different sets of AE. On a 0–10 scale (0 = no importance and 10 = maximal importance), physician responses ranged from 2–10 across the 73 AE categories evaluated. Median values with interquartile ranges are available in Appendix 1 in S1 File. In many circumstances, the weighted values provided by the physicians did not match the rigid 1–4 ordinal CTCAE scale. For example, the CTCAE category 1, corresponding to a mild AE for pulmonary fibrosis, received an AEU score of 6. The CTCAE category 4, corresponding to a severe AE for diabetes, received an AEU value of 9. In contrast, the severe CTCAE category 4 for headache received an AEU value of 6, similar to the AEU values assigned for CTCAE category 2 diabetes and CTCAE category 1 for pulmonary fibrosis.

Phase 2: Potential patient forced choice

Each of 660 MTurk potential patient raters made 20 random paired AE discrete choice comparisons. Two sets of comparisons, presented to 20 participants each, could not be used because the computer randomly assigned items with the same initial physician derived AEU score. These two sets of comparisons did not allow the participants to distinguish the choices, leaving 11,463 comparisons for analysis. All 73 AE categories were used in at least one paired comparison. Appendix Table 4 in S1 File provides estimates and standard errors for the logistic regression parameters estimated using Firth’s method as well as a calibration plot. The model has excellent discrimination (c-index = 0.866) and calibration. Appendix 5 in S1 File shows the results of the K-means clustering used to assign integer scores to the preference parameters. Subsequent analyses adjusted the preference parameters for demographic characteristics of the MTurk respondents (age, sex, race/ethnicity, education, and region of the country), but none of the characteristics were statistically significant and, more importantly, the adjustment did not materially change the final ratings: it altered the final rating in only 3 of the 73 items, and never by more than 1 point. Given the additional complexity in interpreting the results with additional covariates, we present the final ratings based on the model without adjusting for demographic characteristics.
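For readers less familiar with the c-index reported above, the following Python sketch shows one common way to compute it for a binary outcome: the fraction of (event, non-event) pairs in which the event received the higher predicted probability, with ties counted as one half. The function name and the example numbers are hypothetical, and this is not the SAS output from the study.

```python
def c_index(predicted, outcomes):
    """Concordance between predicted probabilities and binary outcomes (1 = event)."""
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(events) * len(non_events))

# Hypothetical predicted probabilities and observed choices (not study data).
predicted = [0.9, 0.4, 0.3, 0.6]
observed = [1, 1, 0, 0]
print(c_index(predicted, observed))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination.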

The AE evaluated by the potential patients ranged from 2–10 AEU points on the scale generated by the physicians and the Bradley-Terry method (Table 3). Severity choice values ranged from 0.33 for a mild degree of diarrhea (physician AEU 3) to 8.5 for treatment related malignancy (physician AEU 9).

Table 3. Combined physician and potential patient AEU values.

Phase 3: Combining physician and potential patient values

Final physician and potential patient combined AEU scores are presented in Table 4. Fifty-five of the 73 items were adjusted from the originally assigned physician scores to reflect input from the potential patients (Table 3). In three categories (hallucinations, dyskinesia, and thrombosis), the physician assigned AEU value of a more severe adverse event in a category had a lower score than the immediately preceding AE. For example, moderate hallucinations had an AEU score of 6 and Severe Hallucinations/Medical Intervention Indicated (Hospitalization Not Indicated) had an AEU score of 5. This likely occurred as physicians were not shown the full range of side effects in each category when assigning scores. We did not show physicians all the AEs in a category to reduce survey burden and to prevent bias from being shown a group of side effects in a previously determined, ordinal fashion. In these three circumstances after discussion among the investigators, the decision was made to rate both categories with the higher AEU score prior to obtaining potential patient input; e.g. both hallucination categories in question have a final AEU score of 6. This decision is also supported by overlap in the interquartile range of these categories in the original physician subject weighted values.


In the age of precision medicine, well-designed, practical outcome measures and decision support tools expand the data we track about patients to better inform medical decisions [33, 34]. Unlike previous measures, the physician and potential patient derived AEU quantifies AE burden in a common currency independent of any disease or medication class, that can be compared among different medications over time. The AEU may facilitate movement from more gestalt AE burden measurement to more precise AE burden measurement, enriching treatment discussions between patients and physicians.

Individual patients and physicians may not value AE in the same way, e.g. patients with more severe conditions such as cancer, may tolerate a higher burden of AE. As a consensus metric, the AEU is not designed to be an absolute measure of burden and distress for any particular patient but rather a way to keep the AE burden score. The AEU is designed to best estimate the market price of specific AEs, similarly to how the price is set for a good or service, e.g. $10 for a basketball and $30,000 for a car. Consumers decide if they are willing to pay consensus prices for these goods. Similarly, patients can be given AEU scores corresponding to the number and type of AEs they develop on a given therapy. In combination with measures of disease improvement, financial burden, overall QOL, severity of a patient’s medical condition, patient age, and other factors unique to a particular patient, patients can decide whether they are willing to tolerate a specific AEU burden when making treatment decisions with their physician. Future validation projects, like one underway in a population of patients with myasthenia gravis, will attempt to understand clinically meaningful differences in AEU score over time for different patient populations.

Attempts have been made to develop disease and medication specific measures of AE [8–13]. Disease and medication specificity limit broad applicability. Quality Adjusted Life Year (QALY) is a useful measure of population cost effectiveness of varied treatments [35, 36]. Since the QALY encompasses all aspects of health, financial cost, and QOL, it cannot measure AE burden alone. As a population based tool, the QALY is a less practical way to measure treatment burden in a comparative efficacy trial or in the clinic.

The CTCAE is a medication independent, physician derived AE measurement tool [21]. Due to lack of weighting and patient input, it provides only granular AE burden measurement. We built the AEU based on the strengths of the CTCAE. The diverse physician group incorporated a wide range of opinions about AE impact on overall health accounting for both current effects (e.g. joint pain) as well as future secondary consequences (e.g. stroke due to new diabetes) to assign AEU values. Although all physicians surveyed could rate congenital complication AEs, all pediatricians surveyed weighted these items as they care for impacted children.

The AEU incorporates potential patient opinions in assigning AE burden values. The use of potential patients rather than patients with particular diagnoses allows AE burden to be scored independent of any particular disease or medication. While we were not able to stratify the sample by whether MTurk respondents were parents, many subjects who rated congenital AEs self-identified as parents in the comment section. Since MTurk does not permit stratification by ethnicity, some groups were slightly underrepresented in our sample. We observed even representation of U.S. geographic regions. Utilizing MTurk, we obtained hundreds of opinions within days of survey release. Although MTurk introduced bias due to the requirement of basic computing skills, it reduced other bias, including the selection bias of clinicians when choosing patients for participation in research. We found recruitment through this online tool to be a logistically practical and cost-effective strategy to obtain opinions from large samples. This approach has the potential to be powerful for studies like this one and for obtaining preliminary data for clinical study design while reducing the inherent bias of the small focus group method.

We believe the weighted consensus AEU values provide a more complete measurement of AE burden. A CTCAE category 1 is often classified as mild [37]. However, all CTCAE categories across different AEs are not of equal value and were not weighted the same among our cohort. A CTCAE grade 1 may not reflect a good outcome in all circumstances. For example, CTCAE Grade 1 pulmonary fibrosis received an AEU score of 7 and was rated the same as CTCAE Grade 4 osteoporosis (Table 4). We also believe the AEU’s independence of any particular disease or medication class is essential to allow comparison of treatments across medication classes. For example, prednisone and IVIG, treatments with different AE profiles, could be compared by the AEU in patients with myasthenia gravis.

Although 75% of AE categories required final AEU value adjustment when physician and potential patient values were combined, only 12% of items had a rating difference of 3 or more points between physicians and potential patients (Table 4). This suggests that while there are differences in physician and potential patient opinions on AE severity, there appears to be general agreement among the groups. Use of the Bradley-Terry paired comparisons model was a useful way to put the physician ratings, collected as scores, and the MTurk ratings, collected as a sequence of paired comparisons, on the same scale. Although physician opinions anchored AEU values, potential patient opinions were incorporated via the discrete choice surveys. We believe adjusting the AEU score to incorporate opinions of both groups strengthens the future applicability of this tool. In practice, patients often rely on physician expert caregivers to guide medical decisions.

We believe the AEU has great promise to be a useful, practical tool to add precision to AE burden measurement in the clinic and in comparative efficacy research for neurology patients. Future studies may show the AEU to be useful in other medical specialties. In comparative efficacy research, we anticipate that AE burden of drugs from different classes can be compared by AEU burden. Assigning an AEU score over time will account for more transient AEs that drop out over time (e.g. single headache) and more persistent AEs (e.g. new hypertension). The AEU score can be combined with other disease specific outcome metrics and QOL metrics to measure differences among medications over time. Evaluation of the validity, utility, and value of the AEU in comparative efficacy trials in myasthenia gravis and other neurological disorders is under way. If the AEU is useful in these studies, translation of some or all of the other items in the CTCAE could be performed to generalize the AEU to other medical subspecialties.
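One way AEU burden might be tracked over time is sketched below in Python. This is a sketch under assumptions: summing the AEU values of the AEs present at each visit is one plausible aggregation, not a scoring rule defined in this article, and the two AEU values shown are hypothetical placeholders rather than entries from Table 4.

```python
def aeu_burden_by_visit(visits, aeu_scores):
    """Total AEU burden at each visit, given the AEs present at that visit.

    Assumption: burden is the simple sum of the AEU values of all active AEs.
    """
    return [sum(aeu_scores[ae] for ae in active_aes) for active_aes in visits]

# Hypothetical AEU values (placeholders, not values from Table 4).
aeu_scores = {"single headache": 2, "new hypertension": 5}

# AEs present at three consecutive visits for one hypothetical patient.
visits = [
    ["single headache", "new hypertension"],  # visit 1: both AEs present
    ["new hypertension"],                     # visit 2: headache has resolved
    ["new hypertension"],                     # visit 3: hypertension persists
]
print(aeu_burden_by_visit(visits, aeu_scores))
```

Under this aggregation the transient headache contributes only at the first visit, while the persistent hypertension contributes at every visit, which is the distinction the paragraph above draws.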

Supporting information

S2 File. Physician subject raw data form 2.


S3 File. All potential patient choices raw data.


S4 File. Physician subject raw data form 5.


S6 File. Physician subject raw data form 1.


S7 File. Physician subject raw data pediatric form.


S8 File. Physician subject raw data form 6.


S9 File. Physician subject raw data form 4.


S10 File. Physician subject raw data form 3.



Dr. Robert Shapiro, Professor of Neurological Sciences at the University of Vermont Larner College of Medicine, provided the idea to utilize the Amazon Mechanical Turk program for this study. The authors are indebted to the 363 physicians who completed the surveys for this project without compensation.


  1. Smith AG. The Cost of Rare Diseases is Threatening the U.S. Health Care System. Harvard Business Review. 2017.
  2. Schepelmann K, Winter Y, Spottke AE, Claus D, Grothe C, Schroder R, et al. Socioeconomic burden of amyotrophic lateral sclerosis, myasthenia gravis and facioscapulohumeral muscular dystrophy. Journal of neurology. 2010;257(1):15–23. pmid:19629566
  3. Guptill JT, Marano A, Krueger A, Sanders DB. Cost analysis of myasthenia gravis from a large U.S. insurance database. Muscle & nerve. 2011;44(6):907–11. pmid:22102461
  4. Classen DC, Pestotnik SL, Evans RS, Lloyd JF, Burke JP. Adverse drug events in hospitalized patients. Excess length of stay, extra costs, and attributable mortality. JAMA: the journal of the American Medical Association. 1997;277(4):301–6. pmid:9002492
  5. Bates DW, Spell N, Cullen DJ, Burdick E, Laird N, Petersen LA, et al. The costs of adverse drug events in hospitalized patients. Adverse Drug Events Prevention Study Group. JAMA: the journal of the American Medical Association. 1997;277(4):307–11. pmid:9002493
  6. Agency for Healthcare Research and Quality. Reducing and Preventing Adverse Drug Events to Decrease Hospital Costs. https://archive.ahrq.gov/research/findings/factsheets/errors-safety/aderia/ade.html. 2001.
  7. Martinez-Ramirez D, Giugni JC, Little CS, Chapman JP, Ahmed B, Monari E, et al. Missing dosages and neuroleptic usage may prolong length of stay in hospitalized Parkinson’s disease patients. PloS one. 2015;10(4):e0124356. pmid:25884484
  8. Testa MA, Anderson RB, Nackley JF, Hollenberg NK. Quality of life and antihypertensive therapy in men. A comparison of captopril with enalapril. The Quality-of-Life Hypertension Study Group. The New England journal of medicine. 1993;328(13):907–13. pmid:8446137
  9. Katz NP, Mou J, Trudeau J, Xiang J, Vorsanger G, Orman C, et al. Development and preliminary validation of an integrated efficacy-tolerability composite measure for the evaluation of analgesics. Pain. 2015;156(7):1357–65. pmid:25867124
  10. Mohr DC, Likosky W, Boudewyn AC, Marietta P, Dwyer P, Van der Wende J, et al. Side effect profile and adherence to in the treatment of multiple sclerosis with interferon beta-1a. Multiple sclerosis. 1998;4(6):487–9. pmid:9987757
  11. Voruganti L, Cortese L, Oyewumi L, Cernovsky Z, Zirul S, Awad A. Comparative evaluation of conventional and novel antipsychotic drugs with reference to their subjective tolerability, side-effect profile and impact on quality of life. Schizophrenia research. 2000;43(2–3):135–45. pmid:10858632
  12. Smith SM, Wang AT, Katz NP, McDermott MP, Burke LB, Coplan P, et al. Adverse event assessment, analysis, and reporting in recent published analgesic clinical trials: ACTTION systematic review and recommendations. Pain. 2013;154(7):997–1008. pmid:23602344
  13. Hunsinger M, Smith SM, Rothstein D, McKeown A, Parkhurst M, Hertz S, et al. Adverse event reporting in nonpharmacologic, noninterventional pain clinical trials: ACTTION systematic review. Pain. 2014;155(11):2253–62. pmid:25123543
  14. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support. Journal of biomedical informatics. 2009;42(2):377–81. pmid:18929686
  15. Behrend TS, Sharek DJ, Meade AW, Wiebe EN. The viability of crowdsourcing for survey research. Behavior research methods. 2011;43(3):800–13. pmid:21437749
  16. Goodman JK CC, Cheema A. Data collection in a flat world: the strengths and weaknesses of mechanical turk samples. Journal of Behavioral Decision Making. 2013;26:213–24.
  17. Schmidt GB, Jettinghoff WM. Using Amazon Mechanical Turk and other compensated crowdsourcing sites. Bus Horizons. 2016;59(4):391–400.
  18. Paolacci G, Chandler J. Inside the Turk: Understanding Mechanical Turk as a Participant Pool. Current Directions in Psychological Science. 2014;23(3):184–8.
  19. Buhrmester M, Kwang T, Gosling SD. Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data? Perspectives on psychological science: a journal of the Association for Psychological Science. 2011;6(1):3–5. pmid:26162106
  20. Bernstein J, Calamia M. Characteristics of a Mild Traumatic Brain Injury Sample Recruited Using Amazon’s Mechanical Turk. PM & R: the journal of injury, function, and rehabilitation. 2017. pmid:28633999
  21. National Cancer Institute. Common Terminology Criteria for Adverse Events v4.0. NIH Publication 09–7473. 2009.
  22. American Psychiatric Association. Neurodevelopmental Disorders. Diagnostic and statistical manual of mental disorders (5th ed). 2013.
  23. Holmes LB, Harvey EA, Coull BA, Huntington KB, Khoshbin S, Hayes AM, et al. The teratogenicity of anticonvulsant drugs. The New England journal of medicine. 2001;344(15):1132–8. pmid:11297704
  24. Copp AJ, Stanier P, Greene ND. Neural tube defects: recent advances, unsolved questions, and controversies. The Lancet Neurology. 2013;12(8):799–810. pmid:23790957
  25. Greene ND, Copp AJ. Neural tube defects. Annual review of neuroscience. 2014;37:221–42. pmid:25032496
  26. Burns TM, Conaway MR, Cutter GR, Sanders DB, Muscle Study Group. Construction of an efficient evaluative instrument for myasthenia gravis: the MG composite. Muscle & nerve. 2008;38(6):1553–62. pmid:19016543
  27. Basch E, Iasonos A, McDonough T, Barz A, Culkin A, Kris MG, et al. Patient versus clinician symptom reporting using the National Cancer Institute Common Terminology Criteria for Adverse Events: results of a questionnaire-based study. The Lancet Oncology. 2006;7(11):903–9. pmid:17081915
  28. Reed Johnson F, Lancsar E, Marshall D, Kilambi V, Muhlbacher A, Regier DA, et al. Constructing experimental designs for discrete-choice experiments: report of the ISPOR Conjoint Analysis Experimental Design Good Research Practices Task Force. Value in health: the journal of the International Society for Pharmacoeconomics and Outcomes Research. 2013;16(1):3–13.
  29. Louviere JL, Flynn TN, Carson RT. Discrete Choice Experiments and Not Conjoint Analysis. Journal of Choice Modelling. 2010;3(3):57–72.
  30. Firth D. Bias reduction of maximum likelihood estimates. Biometrika. 1993;80(1):27–38.
  31. Hunter D. MM Algorithms for Generalized Bradley-Terry Models. Annals of Statistics. 2004;32(1):384–406.
  32. U.S. Census Bureau. Race and Hispanic Origin 2010 [Available from:].
  33. Mirnezami R, Nicholson J, Darzi A. Preparing for precision medicine. The New England journal of medicine. 2012;366(6):489–91. pmid:22256780
  34. Burns TM. The best of both worlds: Using patient-reported plus physician-scored measures during the evaluation of myasthenia gravis. Muscle & nerve. 2016;53(1):3–4. pmid:26506220
  35. Bravo Vergel Y, Sculpher M. Quality-adjusted life years. Practical neurology. 2008;8(3):175–82. pmid:18502950
  36. Sassi F. Calculating QALYs, comparing QALY and DALY calculations. Health policy and planning. 2006;21(5):402–8. pmid:16877455
  37. Sanders DB, Wolfe GI, Benatar M, Evoli A, Gilhus NE, Illa I, et al. International consensus guidance for management of myasthenia gravis: Executive summary. Neurology. 2016;87(4):419–25. pmid:27358333