The process of attrition in pre-medical studies: A large-scale analysis across 102 schools

The important but difficult choice of vocational trajectory often takes place in college, beginning with majoring in a subject and taking relevant coursework. Of all possible disciplines, pre-medical studies are often not a formally defined major but are pursued by a substantial proportion of the college population. Students’ experiences with pre-med coursework are important to understand but remain understudied, as most research on medical education focuses on the later stages of medical school and residency. We examined the pattern and predictors of attrition at various milestones along the pre-med coursework track during college. Using a College Board dataset, we analyzed a sample of 15,442 students spanning 102 institutions who began their post-secondary education in the years between 2006 and 2009. We examined whether students fulfilled the required coursework to remain eligible for medical schools at several milestones: 1) one semester of general chemistry, biology, and physics, 2) two semesters of general chemistry, biology, and physics, 3) one semester of organic chemistry, and 4) either the second semester of organic chemistry or one semester of biochemistry. We also examined predictors of persistence at each milestone. Only 16.5% of students who intended to major in pre-med graduate college with the required coursework for medical schools. Attrition rates are highest initially but drop as students take more advanced courses. Predictors of persistence include academic preparedness before college (e.g., SAT scores, high school GPA) and college performance (e.g., grades in pre-med courses). Students who perform better academically both in high school and in college courses are more likely to remain eligible for medical school.


Introduction
All students inevitably face the challenge of choosing their vocational path. For many, the process begins with choosing their college major. This is a difficult but extremely important choice with lasting consequences. Some of the most common regrets of Americans involve their educational and career choices [1]. The present study investigates a particular case of career planning: the process through which undergraduate students fulfill prerequisite coursework for medical school. It is no secret that a substantial proportion of high school graduates aspire to a Additionally, there has been some suggestion that academic preparedness and performance may influence decisions about persistence in pre-medical education. Among factors prior to college, high school GPA was found to significantly predict pre-medical student retention [17]. However, Barr and colleagues found no association between SAT scores and persisting interest in pre-medical education [3]. In a study by Lovecchio and Dundes, 68% of the former pre-med students surveyed pointed to low grades during college as a major concern for their dropping out [11].
When asked about specific coursework that deterred them from persisting in their interest in medicine in college, students frequently mentioned low grades obtained in difficult pre-medical "gateway" courses, especially the notorious organic chemistry [3,11,18,19]. Further, the discouraging effects of such chemistry courses have been found to be especially pronounced for women and for students from underrepresented minority (URM) groups [3,18,20].
While this body of research points to demographics, scholastic preparedness, and college performance as predictors of attrition in the pre-medical curriculum, many causes have been derived qualitatively using small-sample interviews and case studies [3,11,12,18,19,21]. Among studies that quantitatively and longitudinally examined predictors of medical education attrition, many used multiple cohorts from a single institution [22]. A factor that perhaps can partially account for this lack of large-scale, quantitative research on the undergraduate experience of pre-med students is that pre-medical studies is not a well-defined major in most post-secondary institutions in the United States. Students who are on the pre-med track often major in biological sciences, physical sciences, health sciences, and some even in humanities and social sciences [23]. Therefore, it can be difficult to identify students who are in various majors but are in actuality on the pre-med track. The present study takes an indirect approach. Rather than attempting to directly identify the group of pre-med undergraduates, we use all four years of students' coursework data to distinguish those who do not graduate with the basic prerequisite coursework for medical school from those who do. The combination of this coursework criterion and students' self-reported intentions to pursue a pre-med track before college lets us reasonably estimate the group of pre-med students.
Furthermore, much of the previous work focused on the singular, final status of persisted versus dropped-out. Thus, the approximately four-year process of pre-medical coursework has been glossed over. The 2019 Matriculating Student Questionnaire (MSQ) administered by the Association of American Medical Colleges (AAMC) reported that while a majority of respondents decided that they wanted to study medicine before college, a substantial percentage (34.8%) decided during their four years of college, most of whom (22.1%) decided during the first two years of college [5]. There is much to be gained from examining the patterns of attrition throughout the various stages or milestones of achieving a pre-medical degree and completing medical school prerequisite courses.
Our paper focuses on progress through the science prerequisites for medical school among students stating that pre-med is their intent when they take the SAT. Using data collected by the College Board, we have a sample of 15,442 students from 102 post-secondary institutions across the United States for whom we have a complete record of course-taking in college. Based on the required courses for entry into medical school, we are able to examine which students remain medical school-eligible at various milestones: 1) one semester of general chemistry, biology, and physics, 2) two semesters of general chemistry, biology, and physics, 3) one semester of organic chemistry, and 4) either the second semester of organic chemistry or one semester of biochemistry, as well as predictors of fulfillment at each milestone.
We note that our focus here is on whether or not a student stating an initial pre-med intent completes the academic requirements to be eligible to apply to medical school, not on whether the student does or does not apply to medical school. This is a consequence of using a large dataset which contains rich details on individual course-taking provided by 102 colleges and universities. The data are anonymized, prohibiting inquiry into students' plans and choices following graduation. This is clearly a limitation of the study, but we view the access to this rare large-scale database on medical school eligibility as a worthwhile tradeoff.

Sample
Data from students who began their post-secondary education in academic years between 2006 and 2009 were provided by The College Board. Of the 917,459 individuals whose intended major choice information was available, 170,866 individuals had four years of complete coursework data available and attended a school using a standard semester system. Of those, 153,512 students did not indicate any intention of studying pre-medicine at the time of the SAT; 1,912 either indicated pre-medical studies as their secondary or tertiary major choice, or indicated pre-medical studies as their first-choice major but were not certain of their choice; and 15,442 indicated pre-medicine as their first-choice major and were very or fairly certain of their choice. This resulted in our primary sample of 15,442 students spanning 102 institutions.

Measures
Demographics. Gender and race/ethnicity information were provided. Of 15,442 students, 9,852 (63.80%) were female and 5,590 (36.20%) were male. Other than 311 (2.01%) individuals whose race/ethnicity information was missing, 8,130 (52.65%) were White, 2,899 (18.77%) were Asian or Pacific Islander, 1,764 (11.42%) were Black or African American, 1,674 (10.84%) were Hispanic, 67 (.43%) were American Indian or Alaska Native, and 597 (3.87%) identified as Other race/ethnicity.
Socioeconomic status (SES). Three SES variables were available: father's education, mother's education, and parental income. Parental income was reported using response options that consisted of several income ranges. A dollar value for parental income was calculated by taking the natural logarithm of the midpoint of each income bracket, thus normalizing the distribution. An equally weighted composite of the three variables was calculated by standardizing each variable individually, summing the three, then standardizing the sum again, following the procedure specified by [24].
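The SES composite procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the coding of education as years of schooling, the example income brackets, and the use of the population standard deviation are all hypothetical. Only the sequence of steps (log of the bracket midpoint, standardize each variable, sum, standardize the sum) comes from the text.

```python
import math
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of numbers to mean 0 and SD 1 (population SD, an assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ses_composite(father_edu, mother_edu, income_brackets):
    """Equally weighted SES composite from three per-student lists.

    income_brackets holds hypothetical (low, high) dollar ranges; the
    midpoint of each bracket is log-transformed before standardizing.
    """
    log_income = [math.log((low + high) / 2) for low, high in income_brackets]
    parts = [zscores(father_edu), zscores(mother_edu), zscores(log_income)]
    # Sum the three standardized variables per student, then standardize again.
    summed = [sum(triple) for triple in zip(*parts)]
    return zscores(summed)

# Hypothetical data for four students (education coded as years of schooling).
father = [12, 16, 18, 14]
mother = [12, 14, 16, 16]
income = [(20_000, 30_000), (60_000, 70_000), (100_000, 120_000), (40_000, 50_000)]
composite = ses_composite(father, mother, income)
```

Standardizing the sum a second time puts the composite back on a z-score scale, so that a regression coefficient for SES can be read per 1 SD of the composite.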
Intended major choice. Students indicated their intended major choice from a list of 368 different majors at the time they took the SAT.
High school GPA (hsGPA). Self-reported GPA was used.
SAT scores. Three SAT scores were provided based on the three subsections: Verbal/Critical Reading (SATV), Writing (SATW), and Math (SATM), with possible scores ranging from 200 to 800. A composite SAT score (SATC) was calculated by summing the three subsection scores.
AP courses. Data were provided on whether students took any AP courses and their grades on the corresponding AP exams, ranging from 1 to 5. The ones relevant to the pre-medicine curriculum were included: Biology, Chemistry, Calculus AB, Calculus BC, English Language and Composition, English Literature and Composition, Physics B, Physics C: Electricity and Magnetism, Physics C: Mechanics, and Statistics.
College coursework. For every college course taken by each student, information was provided about the course name, the year and semester in which the course was taken, the content area in which the course fell, and the grade obtained.

Procedure
Students' eligibility to apply to medical schools (persistence in studying medicine) was operationalized as whether they fulfill the standard coursework required by most medical programs. To determine medical school prerequisites, we tallied definitions used by prior research [8,9,25], admission requirements specified by the Association of American Medical Colleges (AAMC), and course requirements of 105 medical schools across the United States in 2013.
AAMC suggested general prerequisites to be one year of Biology, one year of Physics, and two years of Chemistry that includes Organic Chemistry courses [26]. Some medical programs also allowed one biochemistry course to substitute for the second organic chemistry course. Therefore, we operationalized fulfillment of medical school prerequisites to include one year (two semesters) of biology, physics, general chemistry, and either one year of organic chemistry or one semester of organic chemistry with one semester of biochemistry.
The number of courses in each subject was counted to indicate continued eligibility for medical school at a number of milestones during coursework progression. For a course to be included, a numerical grade of 0.7 or higher on a scale of 0 to 4 needed to be achieved, as it is the lowest passing grade.
Further, many students receive college credit for achieving satisfactory grades on AP exams. AP exam scores of 3 or above were accepted by most schools as course credits [27], and were counted toward prerequisite fulfillment.
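The counting rule and the four milestones can be illustrated with a short Python sketch. It is a simplified reading of the procedure rather than the authors' implementation: the subject labels, the function names, and the assumption that each qualifying AP exam counts as exactly one course in its subject are hypothetical. The passing grade of 0.7 and the AP threshold of 3 are taken from the text.

```python
PASSING_GRADE = 0.7   # lowest passing grade on the 0-4 scale, per the text
AP_CREDIT_SCORE = 3   # AP scores of 3 or above counted as credit, per the text

def count_credits(courses, ap_scores):
    """Count passed courses per subject.

    courses: list of (subject, grade) tuples for college courses.
    ap_scores: dict mapping subject -> AP exam score.
    Assumption: each qualifying AP exam counts as one course.
    """
    counts = {}
    for subject, grade in courses:
        if grade >= PASSING_GRADE:
            counts[subject] = counts.get(subject, 0) + 1
    for subject, score in ap_scores.items():
        if score >= AP_CREDIT_SCORE:
            counts[subject] = counts.get(subject, 0) + 1
    return counts

def highest_milestone(courses, ap_scores):
    """Return the highest milestone (0-4) reached, checking milestones in order."""
    c = count_credits(courses, ap_scores)
    basics = [c.get(s, 0) for s in ("gen_chem", "biology", "physics")]
    orgo, biochem = c.get("organic_chem", 0), c.get("biochem", 0)
    checks = [
        min(basics) >= 1,            # milestone 1: one semester of each basic science
        min(basics) >= 2,            # milestone 2: two semesters of each
        orgo >= 1,                   # milestone 3: first organic chemistry course
        orgo >= 2 or biochem >= 1,   # milestone 4: second organic or biochemistry
    ]
    reached = 0
    for ok in checks:
        if not ok:
            break
        reached += 1
    return reached

# Hypothetical transcript: one failed biology course, made up for by AP credit.
transcript = [
    ("gen_chem", 3.0), ("gen_chem", 2.7),
    ("biology", 3.3), ("biology", 0.0),
    ("physics", 2.0), ("physics", 3.0),
    ("organic_chem", 2.3), ("biochem", 3.0),
]
with_ap = highest_milestone(transcript, {"biology": 4})
without_ap = highest_milestone(transcript, {})
```

In this example the AP Biology credit is what carries the hypothetical student past milestone 2, which shows how much the AP assumption can matter for individual classifications.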

Analyses
Descriptive statistics were calculated to examine differences between fulfillers of medical school course pre-requisites and non-fulfillers.
Subsequently, logistic regression analyses at several important coursework milestones were performed to examine the effects of demographics, academic preparedness, and college course performance on course fulfillment at each stage.
To allow comparisons, the number of students who fulfilled prerequisites for medical schools was also tallied for the group of 1,912 students who indicated some intention of studying pre-medicine but were not certain, and the group of 153,512 students who had no intention of studying pre-medicine. In the former group, 267 (14.0%) fulfilled the full set of medical school prerequisite coursework; in the latter group, 2,633 (3.9%) did so.

Comparison between fulfillers and non-fulfillers of medical school prerequisites
Demographic information of the 2,555 fulfillers and 12,887 non-fulfillers of medical school prerequisites can be found in Table 1. A larger proportion of males (21%) who reported premed intentions fulfilled the prerequisites than females (14%).
With regards to race/ethnicity, rates of fulfilling prerequisites among those who had intention of pursuing pre-medical studies were the highest for Asians (23%), followed by the Other Minority group (20%), then Whites (16%) and Hispanics (13%), and the lowest for Blacks (9%).
There were meaningful differences between fulfillers and non-fulfillers on a variety of variables, both prior to college and throughout college (see Table 2). Fulfillers scored higher than non-fulfillers on the SAT-Combined (d = .44), with the largest difference in the Math section (d = .51). Fulfillers also reported higher GPA, both in high school (d = .33), and in all four years of college (average d = .37). Higher socioeconomic status (d = .18) was reported by fulfillers than non-fulfillers. In cases where AP grades were available, fulfillers obtained higher AP scores than non-fulfillers, including AP Biology (d = .52), AP Chemistry (d = .38), and AP Physics B and C (average d = .50).
When college performance is broken down specifically by course, fulfillers were found to have taken a greater number of medical school prerequisite courses (average d = 1.31) as coursework definitionally distinguishes fulfillers from non-fulfillers. They were also found to have performed better in these courses (average d = .42).
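For readers less familiar with the d statistic used in these comparisons, the sketch below computes a standardized mean difference in Python. The text does not state which variant of d was used, so the pooled-standard-deviation form of Cohen's d shown here is an assumption, and the input values are made up for illustration.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample SD (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical scores: both groups have an SD of 2 and their means differ by 1.
d = cohens_d([2, 4, 6], [1, 3, 5])   # 0.5
```

A positive d means the first group (here, fulfillers) scored higher than the second, expressed in units of the pooled standard deviation.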

Logistic regression models for predicting fulfillment progression
Logistic regression analyses were performed for the four fulfillment milestones. Predictors included gender, ethnicity, the composite SAT score, high school GPA, SES, and mean college pre-med course grades when applicable. Continuous variables (SAT, GPA, SES, and course grades) were standardized to aid the comparison between regression coefficients. Variable intercorrelations within each subset of the sample can be found in S1, S2, S3 and S4 Tables in the supplement. The progression of logistic regression models can be found in Table 3. The same models with individual prior course grades as predictors were also tested and yielded similar results. Table 3 reports the odds ratios (OR) and their 95% confidence intervals (CI) for the various milestones. To aid the interpretation of these results, we also computed predicted likelihoods based on the regression weights. To compute fulfillment likelihoods of various demographic groups, we plugged in average values of all other predictors. To compute the continuous variables' effects on fulfillment likelihoods, we plugged in the reference groups (i.e., female for gender, White for race), average values of all other predictors, and the average and 1 standard deviation (SD) above average values of the predictor of interest.
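The step from an odds ratio to a difference in predicted likelihood can be made concrete with a small Python sketch. The function name and the baseline probability of .40 are hypothetical, since the baseline likelihoods for the reference groups are not reported in this section; the sketch therefore shows the mechanics only and does not reproduce the values reported for any model.

```python
def shifted_probability(baseline_p, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_p / (1 - baseline_p)   # probability -> odds
    new_odds = odds * odds_ratio           # the OR multiplies the odds
    return new_odds / (1 + new_odds)       # odds -> probability

# Hypothetical baseline: reference group with all other predictors at their means.
baseline = 0.40
shifted = shifted_probability(baseline, 1.44)
difference = shifted - baseline   # about 0.09, i.e., roughly 9 percentage points
```

Because the conversion is nonlinear, the same odds ratio yields different percentage-point differences at different baselines, which is why the predicted likelihoods are reported alongside the odds ratios.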
Model 1 examined whether students took first-semester general chemistry, biology, and physics courses with demographic and pre-college predictors (milestone 1). The predicted likelihood of males fulfilling milestone 1 was 8.50% higher than that of females (OR = 1.44), and the likelihood of Asian students completing the milestone was 11.97% higher than that of White students (OR = 1.65), net of all other predictors. Further, controlling for all other predictors, the predicted likelihood of fulfilling milestone 1 increased by 4.82% with every 1 SD increase in SAT score (OR = 1.23), 5.63% with every 1 SD increase in high school GPA (OR = 1.28), and .98% with every 1 SD increase in SES (OR = 1.04).
Model 2 examined whether students who completed a first semester of prerequisites proceeded to complete a second semester of general chemistry, biology, and physics (milestone 2), using not only demographic and pre-college predictors, but also the mean of first-semester general chemistry, biology, and physics grades. Among students who had fulfilled milestone 1, the predicted likelihood of males fulfilling milestone 2 was 4.48% higher than that of females (OR = 1.21), and the likelihood of Hispanic students was 8.34% higher than that of White females (OR = 1.44) when holding all other predictors constant. Further, a 1 SD increase in average first-semester course grades improved the likelihood of students fulfilling milestone 2 by 10.04% net of all other predictors (OR = 1.56).
Subsequently, Model 3 used the same set of predictors, with course grades computed as the average of grades across both semesters of general chemistry, biology, and physics, to determine whether students who had fulfilled milestone 2 took any organic chemistry courses (milestone 3). Among students who had completed milestone 2, the likelihood of Hispanic students taking and passing an organic chemistry course was 8.58% lower than that of White females when holding all other predictors constant at average (OR = .64). The likelihood of fulfilling milestone 3 also increased by 4.51% with every 1 SD increase in SAT score (OR = 1.33) and by 2.54% with every 1 SD increase in high school GPA (OR = 1.17), net of all other variables.
Finally, Model 4 used the same set of predictors along with the first organic chemistry grade to predict whether students took a second organic chemistry course or a biochemistry course, thereby fulfilling all science coursework prerequisites for medical school, conditional on having completed all required courses thus far (milestone 4). There were no statistically significant differences in fulfillment likelihood between genders or between racial majority and minority groups. Controlling for all other predictors, a 1 SD increase in SAT scores decreased the predicted likelihood of prerequisite completion by 2.24% (OR = .88), and a 1 SD increase in average coursework grades increased the likelihood by 3.09% (OR = 1.21).
To explore potential interactions among the demographic variables in predicting coursework fulfillment likelihoods, the same four models were also tested with interaction terms between the gender and race dummy variables. The only statistically significant and meaningful interaction was between being male and being Asian for fulfilling coursework milestone 2 (OR = .71, p < .001) and milestone 3 (OR = .69, p < .001). In other words, holding all other predictors constant at their average values, while the predicted likelihood of White males fulfilling milestone 2 exceeds that of White females by 8.67%, the difference in likelihood between Asian males and females is only 2.14%. Similarly, the difference between White males' and females' predicted likelihood for completing milestone 3 is 7.81%, while that between Asian males and females is 1.53%.
Additional logistic regression models with various interaction terms between the demographic dummy variables (i.e., gender and race) and the continuous predictors (i.e., SAT, high school GPA, SES, and grades) were also tested individually. For predicting milestones 1, 2, and 3, there was a significant interaction between gender and SAT score such that the difference in fulfillment likelihood between males and females (in favor of males) was reduced as SAT score increased (OR = .88, .84, and .88, respectively). A similar effect was found in the White-Asian comparison for milestones 1 and 2, such that the higher fulfillment likelihood of Asian students than White students was reduced with higher SAT scores (OR = .83 and .84, respectively). In addition, for predicting fulfillment of milestone 3, the higher likelihood of Asian than White students was reduced as average college course grade increased (OR = .88). Lastly, the higher likelihood of Asian students fulfilling milestone 4 than White students was enhanced with higher high school GPA (OR = 1.31).

Discussion
The current study examines the pre-medical coursework fulfillment patterns of the group of students who indicated intentions of studying pre-medicine prior to entering college. Only 16.5% of the students graduated with the coursework required by most medical schools. Attrition is highest at early stages and levels off as students commit to the medical education track by taking more of the required courses. Previous studies found that former pre-med students often mentioned a "distaste for the large pre-med classes" and the highly competitive environment [3], as well as a change in interest as a result of exposure to other subjects [3,12,19]. Thus, while attrition rates in the later college years are comparatively lower and may be attributed to challenging coursework, the initial high attrition may reflect students adjusting their expectations about medicine while discovering interest in non-medical disciplines. This is also consistent with earlier findings that students often change their majors because of interest in and positive perceptions of the new major more than because of negative factors about the old major [28]. Given the low acceptance rates into medical schools [6] and the attrition rates in medical schools [4,29], this early change in education track may actually prevent additional personal resources from being wasted in the process of applying to medical programs or institutional resources from being wasted when students drop out of medical programs.
Although a much higher percentage of intended pre-med students completed the full set of medical school prerequisite courses and ended up eligible for medical schools (16.5%) than students with no initial intent (3.9%), the absolute number of the latter group (2,633) was comparable with that of the former group (2,555). The 2019 MSQ by the AAMC reported that of the medical school matriculants, 55.3% had decided that they wanted to study medicine prior to entering college and 34.8% did so during college [6]. When interpreted in light of the present findings, it is evident that a higher percentage of the students who completed medical school prerequisite coursework with initial intent is accepted than without initial intent.

Predictors of pre-med persistence
Similar to previous findings on the association between gender and attrition [3,10] and consistent with the normative alternatives approach to explaining the persistence gap [9], being male was linked with a significantly higher likelihood of persisting in a pre-medical education at nearly all stages throughout college. In other words, women's lower likelihood to persist in medicine may be construed as a higher likelihood to accept alternative career choices.
With regard to ethnic and racial identities, while previous investigations suggest that ethnic and racial minority students are less likely to persist [3,8], results of the current study were less consistent. For Asian students, the odds of fulfilling the first semester of coursework were higher than those of White students, with persistence likelihood decreasing slightly throughout the later milestones but never significantly lower than that of White students. The odds of Hispanic students fulfilling the first year of coursework were higher than those of White students, but the odds of their completing organic chemistry were lower. Lastly, African American students did not differ from White students in their likelihood of persistence at any point after controlling for socio-economic status, SAT, and grades in high school and college.
Further, students who fulfilled all required coursework reported higher SES. Although SES predicted completion of the first semester of required coursework, it did not predict persistence at any of the later milestones. Thus, the advantage of coming from a family with higher SES found by [3] seems to wear off early on during college. However, its link with persistence in a medical education is likely to strengthen when students decide whether or not to attend medical school, as a $200,000 to $300,000 cost for medical school is no small expense [30].
Consistent with the reputation of a degree in pre-medical studies for being cognitively intensive and challenging, coursework fulfillers entered college with higher scores on all components of the SAT as well as higher high school GPA. Continuing with this advantage, the college GPAs of students who eventually fulfilled all required coursework were higher than the GPAs of those who did not for all four years. However, the differences declined over time, with the GPA difference in the fourth year of college being half as large as the GPA difference in the first year of college. This may be an indirect reflection of the high levels of difficulty in pre-medical courses compared with other courses. In terms of college performance, coursework fulfillers both by definition completed a larger number of courses in relevant science subjects and obtained higher grades in them than non-fulfillers.
When examining the predictive validities of academic preparedness measured by variables prior to college (SAT and high school GPA) and college performance (course grades), an interesting pattern is observed. For predicting completion of the first semester of coursework in the absence of any college grades, higher academic preparedness was associated with a greater likelihood of completion. However, the predictive validities of these more distal, pre-college factors were overtaken by those of the more proximal college grades. The exception is whether students complete the first organic chemistry course, for which academic preparedness rather than college grades was predictive. This may be due to the notoriously difficult organic chemistry being overwhelmingly identified as the culprit in the leaky pipeline [3,11]. Students might be extra cautious in deciding whether they would be able to succeed in the course and, to make the decision, resort to information about their academic effectiveness over a longer period of their lives rather than grades from the most recent one or two years. A recent study examining persistence in undergraduate STEM courses found that grades in the first general chemistry course were related to subsequent persistence in STEM more strongly among underrepresented individuals than among well-represented students [20]. Our examination of such interactions between demographic group and other predictors (e.g., academic preparedness before college, grades in college) yielded mixed results.

Limitations and future directions
There are several limitations in the current study. First, because we lacked information about whether students actively pursued pre-medicine throughout college, we focused on completion of coursework required for medical programs. This operationalization is indirect and imperfect. It is possible that students of other science majors completed a similar set of coursework, cases that were considered noise in the current study. On the other hand, it is also possible that some students do not complete, or only partially complete, prerequisite coursework during their undergraduate years, but later complete all required courses in a postbaccalaureate premedical program. A large number of such programs already existed during the time that the data used in the current study were collected [31]. Thus, by focusing on students' medical school eligibility in terms of their coursework at their undergraduate institution, the experiences of those who only become eligible later on were not captured.
Second, as we were not provided information about whether students who obtained satisfactory scores on AP exams actually used their AP credits toward their college degree, we assumed that any AP exam score of 3 or higher counted as a fulfillment course. There is the possibility that some students may forfeit their AP credits or take the equivalent college course.
Third, since the collection of data used in the present study, a new version of the MCAT that includes a larger social and behavioral science component has been implemented [32]. It is reasonable to assume that requirements by medical schools were also revised to contain psychology and sociology coursework. It will be valuable to examine the effects of these changes. Furthermore, major shifts have taken place with regard to the philosophy underlying the evaluation of medical school applicants. In the past decade, the AAMC has endorsed the value of "holistic review," which considers "applicants' experiences, attributes, and academic metrics as well as the value an applicant would contribute to learning, practice, and teaching" [33]. As such, academic record and performance are considered alongside many other factors for admission decisions, such as "distance traveled" or cumulative life experiences, and other contextual information for the applicants' accomplishments [34-36]. In response to the call for holistic evaluation, a number of medical schools have revised their admissions statements and requirements. For example, the Perelman School of Medicine at the University of Pennsylvania defines competencies "not based on specific courses, but rather on the cumulative achievement of knowledge and skills needed to become a physician" [37]. The Boston University School of Medicine emphasizes "experiential and personal qualities" in addition to academic rigor in its selection process [38]. There are even institutions like Stanford University School of Medicine that explicitly removed any specific prerequisite requirements, and only provide course recommendations instead [39]. Since the data used in the current study were collected prior to these changes, it will be important for future research to differently operationalize and examine pre-med intention and persistence.
Finally, as our study relied on archival data, we were limited in the extent to which we could explore underlying mechanisms that explain attrition in pre-medical studies. While the regression analyses revealed certain demographic and academic preparedness factors as predictors, no information about the specific reasons behind students' dropping out was available. However, our study provides a valuable quantitative complement to the prior qualitative body of research that described such reasons.

Conclusion
The present study quantitatively describes the process of attrition as reflected in coursework throughout pre-medical studies in postsecondary institutions. A number of pre-college preparedness factors including socio-economic status, SAT score, and high school GPA as well as grades during college were found to predict continued eligibility for medical studies at various milestones of relevant coursework.
Supporting information S1