Quantifying underreporting of law-enforcement-related deaths in United States vital statistics and news-media-based data sources: A capture–recapture analysis

Background
Prior research suggests that United States governmental sources documenting the number of law-enforcement-related deaths (i.e., fatalities due to injuries inflicted by law enforcement officers) undercount these incidents. The National Vital Statistics System (NVSS), administered by the federal government and based on state death certificate data, identifies such deaths by assigning them diagnostic codes corresponding to “legal intervention” in accordance with the International Classification of Diseases–10th Revision (ICD-10). Newer, nongovernmental databases track law-enforcement-related deaths by compiling news media reports and provide an opportunity to assess the magnitude and determinants of suspected NVSS underreporting. Our a priori hypotheses were that underreporting by the NVSS would exceed that by the news media sources, and that underreporting rates would be higher for decedents of color versus white, decedents in lower versus higher income counties, decedents killed by non-firearm (e.g., Taser) versus firearm mechanisms, and deaths recorded by a medical examiner versus coroner.

Methods and findings
We created a new US-wide dataset by matching cases reported in a nongovernmental, news-media-based dataset produced by the newspaper The Guardian, The Counted, to identifiable NVSS mortality records for 2015. We conducted 2 main analyses for this cross-sectional study: (1) an estimate of the total number of deaths and the proportion unreported by each source using capture–recapture analysis and (2) an assessment of correlates of underreporting of law-enforcement-related deaths (demographic characteristics of the decedent, mechanism of death, death investigator type [medical examiner versus coroner], county median income, and county urbanicity) in the NVSS using multilevel logistic regression. We estimated that the total number of law-enforcement-related deaths in 2015 was 1,166 (95% CI: 1,153, 1,184). There were 599 deaths reported in The Counted only, 36 reported in the NVSS only, 487 reported in both lists, and an estimated 44 (95% CI: 31, 62) not reported in either source. The NVSS documented 44.9% (95% CI: 44.2%, 45.4%) of the total number of deaths, and The Counted documented 93.1% (95% CI: 91.7%, 94.2%). In a multivariable mixed-effects logistic model that controlled for all individual- and county-level covariates, decedents injured by non-firearm mechanisms had higher odds of underreporting in the NVSS than those injured by firearms (odds ratio [OR]: 68.2; 95% CI: 15.7, 297.5; p < 0.01), and underreporting was also more likely outside of the highest-income-quintile counties (OR for the lowest versus highest income quintile: 10.1; 95% CI: 2.4, 42.8; p < 0.01). There was no statistically significant difference in the odds of underreporting in the NVSS for deaths certified by coroners compared to medical examiners, and the odds of underreporting did not vary by race/ethnicity. One limitation of our analyses is that we were unable to examine the characteristics of cases that were unreported in The Counted.

Conclusions
The media-based source, The Counted, reported a considerably higher proportion of law-enforcement-related deaths than the NVSS, which failed to report a majority of these incidents. For the NVSS, rates of underreporting were higher in lower income counties and for decedents killed by non-firearm mechanisms. There was no evidence suggesting that underreporting varied by death investigator type (medical examiner versus coroner) or race/ethnicity.


Author summary
Why was this study done?
• Several governmental and nongovernmental databases track the number of law-enforcement-related deaths in the US, but all are likely to undercount these deaths.
• To our knowledge, our study is the first to estimate the proportion of law-enforcement-related deaths properly captured by 2 data sources: official US mortality data, derived from death certificates, and The Counted, a nongovernmental database derived from news media reports.
• US mortality data include virtually all deaths that occur in the country, and law-enforcement-related deaths are supposed to be assigned a diagnostic code corresponding to "legal intervention." If a death is improperly assigned another code, it is considered to be misclassified, which leads to undercounting of the number of law-enforcement-related deaths. We investigated the extent of misclassification and the factors associated with misclassification.

Introduction
The National Vital Statistics System (NVSS), administered by the US government and based on state death certificates, is the longest-running national data source on law-enforcement-related deaths (i.e., those involving fatal injuries inflicted by law enforcement), but has long been suspected of underreporting a large number of such deaths [1-3]. Other databases run by the US Department of Justice similarly undercount law-enforcement-related deaths [4]. In recent years, a new type of data source on legal intervention mortality has emerged: national databases maintained by newspapers, nongovernmental organizations, and the US Bureau of Justice Statistics (BJS; a governmental organization) that identify incidents via web searches of news media reports [3,5-8].
The NVSS has identified law-enforcement-related deaths since 1949, following the inclusion of "injury by intervention of police" as a diagnostic category in the 6th revision to the International Classification of Diseases (ICD) [9]. While the category has since been renamed as "legal intervention," its definition remains unchanged up to the current ICD revision, ICD-10: "injuries inflicted by the police or other law-enforcing agents, including military on duty, in the course of arresting or attempting to arrest lawbreakers, suppressing disturbances, maintaining order, and other legal action" [10] (Table 1). A designation of legal intervention does not depend on whether the use of force resulting in the injury was lawful [11] or whether the injuries were inflicted intentionally.
Prior studies found that NVSS counts of legal intervention deaths were lower in at least some US states compared to counts reported by law enforcement data sources, suggesting that the NVSS misses some proportion of these deaths [1,13,14]. This underreporting occurs when a death certificate is misclassified: it is wrongly assigned an ICD code that does not correspond to legal intervention, and the death can therefore not be identified as law-enforcement-related in queries of NVSS data (Table 1). Misclassification primarily occurs because the coroner or medical examiner certifying the death fails to mention police involvement in the literal text fields of the death certificate's cause of death section (e.g., the field labeled "Describe how the injury occurred" does not state "killed by police"), although mistakes in the process of assigning ICD codes may still occur even when the death certificate indicates police involvement [15]. To our knowledge, there have been no prior national estimates of the misclassification rate for legal intervention deaths in the NVSS, nor has any research investigated factors associated with misclassification.
In recent years, a number of nongovernmental initiatives have sought to identify incidents of law-enforcement-related deaths in the US based on web searches of news media, and these databases provide counts that far exceed those reported in the NVSS and traditional US Department of Justice governmental data sources. Examples of such nongovernmental efforts include The Guardian's The Counted (covering 2015-2016) [5], The Washington Post's police shooting database (2015-present; excludes non-firearm deaths) [7], and Fatal Encounters (2014-present prospectively; 2000-2013 retrospectively) [6]. Prior analyses have found that, within the same time period, these sources report a nearly identical set of cases [16]. In addition to these nongovernmental efforts, the BJS redesigned its Arrest-Related Deaths (ARD) program in mid-2015 to track deaths in custody using a similar method: ARD first identifies cases based on a systematic internet search of news media reports, then requests more information about deaths from law enforcement agencies, medical examiners, and coroners [8]. Even as researchers have made increasing use of these news-media-based data sources [3,16-18] and the federal government has adopted their practices, there have been no prior estimates of the proportion of law-enforcement-related deaths that remain unreported in databases drawn from news media.
Our a priori hypotheses were that underreporting by the NVSS would exceed that by the news media sources, and that misclassification rates would be higher for decedents of color versus white, decedents in lower versus higher income counties, decedents killed by non-firearm versus firearm mechanisms, and deaths recorded by a medical examiner versus coroner. Our study aims to improve public health monitoring of law-enforcement-related deaths, which may ultimately aid efforts to improve accountability for both individual deaths and aggregate trends [18].
NVSS: Deaths will not appear if they are misclassified, i.e., assigned an ICD-10 code that does not correspond to legal intervention. This may happen because law enforcement involvement is not mentioned on the death certificate, or potentially due to coding errors by the National Center for Health Statistics.
The Counted "People killed by police and other law enforcement agencies in the United States" From The Counted website: "What is included in The Counted? Any deaths arising directly from encounters with law enforcement. This will inevitably include, but will likely not be limited to, people who were shot, tasered and struck by police vehicles as well those who died in police custody. What is not included in The Counted? Self-inflicted deaths during encounters with law enforcement. For instance, a person who died by crashing his or her vehicle into an oncoming car while fleeing from police at high speed is not regarded by the Guardian's database to have been killed by law enforcement. The database does not include suicides or self-inflicted deaths including drug overdoses in police custody or detention facilities." [12]

Methods
We created a dataset of law-enforcement-related deaths in 2015 by matching 2 sources: The Counted, a news-media-based dataset created by the newspaper The Guardian [5], and the NVSS, from which we obtained individually identifiable mortality data for cases that were reported by The Guardian. Our study was deemed exempt from review by the Harvard T.H. Chan School of Public Health institutional review board (IRB16-1146) because it did not involve living persons. We were not able to publish death counts for all US states and counties due to privacy restrictions for NVSS data. We did not have a written prospective analysis plan; we agreed on an analytic plan at an October 2016 meeting and conducted all analyses in January 2017. Our cross-sectional study involved 2 main analyses: (1) a capture-recapture analysis to estimate the total number of law-enforcement-related deaths in the US during 2015, as well as the proportions captured by The Counted and the NVSS, and (2) a multilevel logistic regression analysis investigating the correlates of misclassification for law-enforcement-related deaths in NVSS data. This report has been prepared according to STROBE guidelines, as suggested by the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network (S1 STROBE checklist). The Counted identified US law-enforcement-related deaths in 2015-2016 using web searches of news media reports; it defined these incidents as "any deaths arising directly from encounters with law enforcement . . . [such as] people who were shot, tasered and struck by police vehicles as well those who died in police custody" and excluded persons who died of self-inflicted injuries (Table 1) [12]. The website of the dataset also allowed members of the public to report cases; however, all deaths in the 2015-2016 dataset were substantiated based on local news media reports with the exception of 5 deaths identified via The Guardian's original reporting [19]. 
The Guardian staff extracted characteristics of each incident including the decedent's name, demographic information, street address of the police encounter, date of the injury occurrence, and mechanism of death. They also included a brief narrative description of events leading to the death. When necessary, reporting staff requested more information from local government agencies.
The NVSS receives electronic mortality data, based on death certificates, on deaths from all causes that are reported by 52 US-based independent registration areas ("states"; including the 50 states, District of Columbia, and New York City, which reports independently of New York State). On death certificates, funeral home directors record demographic information, and coroners or medical examiners report cause of death information. Staff at state vital statistics registries input death certificate information in a standardized electronic format. They send these data to the National Center for Health Statistics, which assigns up to 20 cause of death codes, following ICD-10, based on literal text written by the coroner/medical examiner. For a majority of decedents (approximately 60% of cases coded as legal intervention deaths in 2015), ICD codes are assigned by a computer program, SuperMICAR (National Center for Health Statistics; https://www.cdc.gov/nchs/nvss/mmds/super_micar.htm). Trained nosologists assign codes when automatic assignment fails.

Exclusion criteria
The Counted used a broader definition for law-enforcement-related deaths than the NVSS, which follows the ICD definition for legal intervention (Table 1). Unlike the ICD definition, The Counted did not require that the injury be inflicted by a law enforcement officer and made no differentiation as to whether the injury was inflicted while a law enforcement officer was acting in the line of duty. To ensure that both datasets were comparable, we excluded cases from The Counted that did not conform to the ICD definition of legal intervention, while also recognizing that ambiguity in the ICD definition can make it unclear whether the diagnostic category is appropriate for certain instances. One category for which the definition lacks clarity is motor-vehicle-related deaths involving law enforcement. While on duty, an officer may accidentally hit a pedestrian, although it is unclear whether this death occurred "in the course of arresting or attempting to arrest lawbreakers, suppressing disturbances, maintaining order, and other legal action." Because these injuries may not specifically relate to the officer's law enforcement role, we excluded decedents killed in motor-vehicle-related accidents unless they were being pursued by police or were intentionally injured in a police vehicle during transit. Another category for which definitional ambiguities arise is "deaths in custody," i.e., non-firearm deaths that occur during the course of arrest or in holding cells and jails. In such instances, the circumstances of the death may be unknown to the public, and it may not be clear to death investigators whether actions by officers contributed to the death [20].
We excluded deaths in custody unless The Counted described a clear mechanism through which law enforcement actions may have caused the death (medical neglect, use of a chokehold, use of a Taser) or the death was reportedly ruled a homicide in The Counted's narrative description (a homicide ruling can be made only if the injury was intentionally inflicted, while legal intervention, as defined by the ICD-10, does not require intentionality; however, a finding of homicide also provides evidence that law enforcement officers caused the death).
Additional exclusion criteria included instances of domestic violence perpetrated by law enforcement officers, as these did not occur in the course of carrying out "legal action." For the same reason, we excluded deaths by "friendly fire" (i.e., an accidental shooting of one officer by another; the only such death reported in the 2015 The Counted data occurred during a training). Finally, we also excluded the small number of decedents (N = 3; <0.3% of deaths) who were injured in 2015 but died in 2016, as they would not appear in the 2015 mortality data.

National Death Index plus matching process
The National Death Index (NDI) is a restricted-access database, administered by the National Center for Health Statistics, that researchers can use to access the same electronic mortality data reported in the NVSS [21,22]. Requestors submit a list of decedents, and the NDI returns either vital status only (i.e., confirmation of whether the individual has died) or, if the researcher pays a higher fee for "NDI Plus," all reported ICD-10 coded causes of death for each decedent. For all cases meeting our inclusion criteria, we submitted names and years of birth (based on media-reported age) using NDI Plus. NDI Plus requires that submitted data include exact matches for first names, near matches for last names, and near matches for year of birth (±1 year) [22]. Matched records return the state in which the death occurred, date of death, and multiple ICD-10 coded causes of death. We identified true matches from NDI Plus output by ensuring dates and states of death were consistent with media reports. We considered the date of death to match when it fell within 4 days of the injury occurrence date reported in The Counted.
We rejected cases for which the date of death preceded the reported date of injury by more than 4 days. For cases whose NDI record reported a date of death more than 4 days after the reported injury, we flagged the result as a match only if we were able to locate a news article reporting the later date of death. Similarly, for deaths whose matched record reported a state that differed from the location of injury reported by The Counted, we flagged it as a match if we were able to locate a news article confirming the state of death (differing states for injury and death can happen if a person is transported across state lines to a hospital before the death). Finally, we tabulated the characteristics of matched cases and unmatched cases and stratified by measured covariates for comparison. Unmatched cases were not included in any subsequent analyses.
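The matching rules described above can be summarized as a small decision function. The following is only an illustrative sketch: the `Candidate` record, its field names, and the two `news_confirms_*` flags are hypothetical stand-ins for the manual review steps, and the actual NDI Plus output format is not reproduced here.

```python
from dataclasses import dataclass
from datetime import date

WINDOW_DAYS = 4  # tolerance between injury date and date of death, as stated in the text


@dataclass
class Candidate:
    """One NDI Plus candidate record paired with a case from The Counted (hypothetical layout)."""
    injury_date: date                  # injury occurrence date reported in The Counted
    death_date: date                   # date of death returned by NDI Plus
    injury_state: str                  # state of the police encounter per The Counted
    death_state: str                   # state of death returned by NDI Plus
    news_confirms_date: bool = False   # hypothetical flag: a news article reports the later death date
    news_confirms_state: bool = False  # hypothetical flag: a news article confirms the state of death


def is_true_match(c: Candidate) -> bool:
    """Apply the date and state consistency checks to one candidate match."""
    delta = (c.death_date - c.injury_date).days
    # Death recorded more than 4 days before the reported injury: reject.
    if delta < -WINDOW_DAYS:
        return False
    # Death more than 4 days after the injury: accept only with news confirmation.
    if delta > WINDOW_DAYS and not c.news_confirms_date:
        return False
    # Differing states (e.g., transport to a hospital across state lines): need confirmation.
    if c.death_state != c.injury_state and not c.news_confirms_state:
        return False
    return True
```

For example, a candidate whose death is recorded 10 days after the injury would be accepted only when `news_confirms_date` is set, mirroring the manual check against a news article.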

Estimating the total number of law-enforcement-related deaths
Our first set of analyses used capture-recapture analysis (also known as multiple systems estimation) to estimate the number of US law-enforcement-related deaths in 2015. Using 2 or more matched, incomplete lists, capture-recapture analysis estimates the total size of a population, including the number of cases missed by all lists [23]. To conduct the capture-recapture analysis, we obtained monthly counts of deaths reported as legal intervention deaths in the 2015 NVSS public-use multiple cause of death file [24]. Using those counts along with the dataset derived from matching The Counted and NDI, we estimated the number of deaths (1) reported in The Counted only, (2) classified as legal intervention deaths in the NVSS only, and (3) reported in both systems. We considered a case to be reported as a legal intervention death in the NVSS when at least 1 of its multiple ICD-10 cause of death codes corresponded to legal intervention (ICD-10: Y35.0-Y35.4; Y35.6-Y35.7; Y89.0). We assumed unmatched cases from The Counted (95/1,086; 8.7%) were classified as legal intervention deaths in the NVSS at the same rate as matched cases: we added 43 of these deaths (45%) to the group that was captured by both the NVSS and The Counted, and added the remaining 52 cases (55%) to the group captured by The Counted only.
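The classification rule above (a record counts as a legal intervention death when at least 1 of its multiple cause of death codes falls in the listed ICD-10 range) can be sketched as follows. The function names are hypothetical, and the assumption that codes may arrive with or without a decimal point is ours rather than something stated in the text.

```python
# ICD-10 codes treated as legal intervention in this study:
# Y35.0-Y35.4, Y35.6-Y35.7, and Y89.0 (stored here without the decimal point).
LEGAL_INTERVENTION_CODES = frozenset(
    [f"Y35{digit}" for digit in (0, 1, 2, 3, 4, 6, 7)] + ["Y890"]
)


def normalize(code: str) -> str:
    """Strip whitespace, upper-case, and drop the decimal point (assumed input variations)."""
    return code.strip().upper().replace(".", "")


def is_legal_intervention(codes) -> bool:
    """True when at least one of the record's cause of death codes is in the list."""
    return any(normalize(code) in LEGAL_INTERVENTION_CODES for code in codes)


def is_misclassified(codes) -> bool:
    """A matched law-enforcement-related death with no legal intervention code."""
    return not is_legal_intervention(codes)
```

Under this sketch, a record carrying only an assault code such as `X95` would be flagged as misclassified, whereas a record listing `Y35.0` anywhere among its multiple causes would not.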
We used Poisson regression, with data stratified by 3-month periods, to conduct capture-recapture analysis. The counts for each group (deaths captured by The Counted only, the NVSS only, and both systems) analyzed by the Poisson model are presented in S1 Table. For capture-recapture analyses with only 2 data sources, the method assumes independence between the lists (i.e., the probability of a case appearing in one list is uncorrelated with its probability of appearing in the other list). This assumption is frequently violated in epidemiologic contexts, however: often there is positive list dependence, which leads to underestimated population sizes [25]. In our study, one possible source of list dependence is that both databases typically rely on reporting by police departments to ascertain cases, either when the agency issues press releases (in the case of media reports) or when it releases reports detailing the circumstances of the death to the coroner or medical examiner (in the case of the NVSS). With respect to the latter, journalists have revealed multiple incidents in which law enforcement agencies failed to release pertinent documents to death investigators for in-custody deaths or pressured death investigators to make a finding of non-homicide [26-28], although there is no evidence to suggest how frequently this occurs.
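To make the independence assumption concrete, the sketch below applies the standard closed-form two-list estimator (missed cases = list-A-only × list-B-only / both) to the annual counts reported in this paper. This is a simplification, not the stratified Poisson model we fitted: it ignores the 3-month strata and yields no confidence intervals. The function names are hypothetical, and the split of matched cases (444 classified as legal intervention, 547 not) is taken from the Results.

```python
def allocate_unmatched(n_unmatched: int, matched_both: int, matched_total: int):
    """Split unmatched cases from The Counted at the matched-case classification rate."""
    to_both = round(n_unmatched * matched_both / matched_total)
    return to_both, n_unmatched - to_both


def two_list_estimate(only_a: int, only_b: int, both: int):
    """Closed-form estimate under list independence (no stratification)."""
    missed = only_a * only_b / both            # cases captured by neither list
    total = only_a + only_b + both + missed    # estimated population size
    return missed, total


# Matched cases: 444 of 991 carried a legal intervention code; 95 cases were unmatched.
to_both, to_counted_only = allocate_unmatched(95, 444, 991)
both = 444 + to_both                    # captured by both lists
counted_only = 547 + to_counted_only    # captured by The Counted only
nvss_only = 36                          # captured by the NVSS only

missed, total = two_list_estimate(counted_only, nvss_only, both)
print(f"both={both}, counted_only={counted_only}, missed={missed:.1f}, total={total:.0f}")
print(f"NVSS coverage={(nvss_only + both) / total:.3f}, "
      f"The Counted coverage={(counted_only + both) / total:.3f}")
```

With these inputs the unstratified calculation gives roughly 44 missed deaths and a total near 1,166, in line with the reported point estimates; small differences in the coverage proportions relative to the published figures are expected because the published model was stratified.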
To address the potential for list dependence, we conducted a sensitivity analysis to estimate the maximum plausible number of cases adjusting for a prior correlation value between our 2 lists. We followed the method employed by Lum and Ball [29], who incorporated prior values, based on capture-recapture analyses of homicides from comparable sources in 5 other countries, when they estimated the number of law-enforcement-related deaths in the US from 2 probabilistically matched law enforcement datasets. By including the highest pairwise list correlation value they reported (0.93, based on a study of homicides in Syria) as an offset in our Poisson model, we calculated a maximum plausible estimate of the number of deaths in this sensitivity analysis.

Analyzing correlates of misclassification in National Vital Statistics System mortality data
Our next set of analyses sought to identify correlates of misclassification of legal intervention deaths in NVSS mortality data, with misclassification defined as there not being any ICD-10 codes for legal intervention among the reported multiple causes of death. For the purpose of these analyses, we assumed The Counted's matched cases were a random sample of the total population of US law-enforcement-related deaths in 2015. This is a tenable assumption because, as we report below, The Counted underreports relatively few incidents, and there appear to be no systematic differences between matched and unmatched cases. We used demographic data (age, gender, and race/ethnicity) reported in The Counted, which our prior research has found to be highly concordant with values reported on death certificates [15]. We also used The Counted data on mechanism of death (firearm or non-firearm) and the county where the fatal injury occurred. At the county level, we identified median household income quintiles based on 2011-2015 US Census data [30], urbanicity based on National Center for Health Statistics classifications [31], and death investigator type (medical examiner, elected coroner, or appointed coroner) based on a Centers for Disease Control and Prevention (CDC) dataset [32]. For counties with ambiguous CDC data regarding death investigator type, we contacted local government agencies directly.
After tabulating descriptive statistics on misclassified and properly classified legal intervention deaths, we calculated and mapped misclassification rates by state. We then conducted multilevel logistic regression, using Stata version 14.2 (StataCorp [https://www.stata.com]), to model the odds of misclassification. Our univariable and multivariable models included random intercepts for counties and states. We used post-estimation commands to calculate the average marginal effects for select covariates, and we report these as predicted probabilities of misclassification.
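For readers who prefer notation, one conventional way to write a logistic model with random intercepts for counties and states is shown below. The symbols are introduced here for illustration only, and the normal distributions for the random intercepts are the usual default assumption rather than a detail reported in this paper.

```latex
% i indexes decedents, j counties, k states.
% Y_{ijk} = 1 if the death was misclassified in the NVSS, 0 otherwise.
% x_{ijk}: decedent-level covariates (age, gender, race/ethnicity, mechanism of death).
% z_{jk}: county-level covariates (income quintile, urbanicity, death investigator type).
\operatorname{logit} \Pr(Y_{ijk} = 1)
  = \beta_0
  + \mathbf{x}_{ijk}^{\top} \boldsymbol{\beta}
  + \mathbf{z}_{jk}^{\top} \boldsymbol{\gamma}
  + u_{jk} + v_{k},
\qquad
u_{jk} \sim N(0, \sigma_u^2), \quad
v_{k} \sim N(0, \sigma_v^2)
```

In this form, an exponentiated coefficient such as $\exp(\beta_m)$ is the odds ratio for misclassification associated with covariate $m$, conditional on the county and state intercepts.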

Results
The Counted identified 1,146 law-enforcement-related deaths in the US during 2015. Applying our exclusion criteria, we eliminated 60 cases that did not conform to the ICD definition of legal intervention, such that the initial dataset included 1,086 observed deaths (Table 2).
Overall, 444 (44.8%) of the law-enforcement-related deaths were properly classified as legal intervention deaths in the NVSS. The most common underlying cause of death for misclassified cases was assault, which was more prevalent than legal intervention and accounted for 47.5% of all matched cases (N = 471). While nearly all firearm deaths were coded as legal intervention or assault (96.8% combined), the causes of death reported for non-firearm mechanisms were more heterogeneous. Deaths that followed the use of Tasers were reported as legal intervention (6/46; 13.0%), assault (10/46; 21.7%), missing/undetermined (8/46; 17.4%), accidental injury (10/46; 21.7%), and mental health/behavioral disorders (5/46; 10.9%). Struck by/against was the only other non-firearm mechanism for which any cases were classified as legal intervention (4/18 struck by/against injuries; 22.2%).

Estimates of the number of US law-enforcement-related deaths in 2015
There were 599 deaths reported in The Counted only, 36 reported in the NVSS only, 487 reported in both lists, and an estimated 44 (95% CI: 31, 62) not reported in either list. Assuming independence between lists, our capture-recapture model estimates that the total number of US law-enforcement-related deaths in 2015 was 1,166 (95% CI: 1,153, 1,184) (Fig 1). This suggests that the NVSS documented 44.9% (95% CI: 44.2%, 45.4%) of law-enforcement-related deaths, and The Counted documented 93.1% (95% CI: 91.7%, 94.2%). Our sensitivity analyses show that these estimates are robust to potential pairwise list correlation. Assuming the highest of the pairwise list correlation values reported by Lum and Ball [29], 0.93, the maximum number of deaths was only slightly higher, equaling 1,233 (95% CI: 1,200, 1,280).

Correlates of ICD-10 misclassification of law-enforcement-related deaths
We found that, among cases reported in The Counted and matched to NVSS data, 55.2% (547/991) were misclassified in the NVSS. These deaths occurred in 51 states (49 states, the District of Columbia, and New York City; The Counted did not report any cases from Rhode Island meeting our inclusion criteria) (Table 5; Fig 2) and in 491 of 3,144 US counties. Misclassification rates ranged from 0% to 100%; among states with ≥10 matched cases, rates ranged from 17.6% (Washington) to 100.0% (Oklahoma). Taken together, 5 states (California, Texas, Florida, Oklahoma, and Arizona) contained 42.4% of matched cases and accounted for a majority of the misclassified cases (50.3%). Among these 5 states, misclassification was 40% to <60% in
Mortality records report 1 underlying cause of death, defined as "(a) the disease or injury which initiated the train of events leading directly to death, or (b) the circumstances of the accident or violence which produced the fatal injury" [22]. The records also report up to 20 "multiple causes of death" based on any other health conditions reported on the death certificate. In rare instances (N = 2), legal intervention was reported as a multiple cause of death but not an underlying cause of death. We nonetheless present these cases in the column for legal intervention.

Discussion
We estimated the total number of law-enforcement-related deaths in the US in 2015 at 1,166 deaths (95% CI: 1,153, 1,184) and found that, as hypothesized, a much higher proportion of such deaths were captured by The Guardian's The Counted (93.1%; 95% CI: 91.7%, 94.2%) than by the NVSS (44.9%; 95% CI: 44.2%, 45.4%). We also found that misclassification rates in NVSS data for law-enforcement-related deaths varied widely both within and between states, and that misclassification was more likely for non-firearm deaths than firearm deaths and for deaths that occurred outside of the highest income counties. These findings together affirm that major shortcomings exist in official counts of law-enforcement-related deaths based on US vital statistics. The results additionally suggest these shortcomings could potentially be corrected by simultaneously (1) improving the extent and accuracy of the information recorded in death certificates and (2) expanding the types of data (such as media-based reports) used to generate official counts of these cases.
Our study is strengthened by its use of identifiable, national US mortality data to estimate the number of law-enforcement-related deaths and to analyze patterns of misclassification of these deaths in the NVSS. One limitation is that differential matching rates for our NVSS/The Counted dataset may bias results, although the high proportion of cases that we were able to match limits this bias. Additionally, we were unable to examine the characteristics of cases that were unreported in The Counted. One issue of concern is that law-enforcement-related deaths occurring in rural areas may not be reported in the news media, because there is less local news coverage available in rural areas and rural news sources may not be accessible on the internet [34].
Another issue is that we cannot know with complete certainty in which county the death was declared; The Counted reports the location where the fatal injury was inflicted. While data from California suggest that four-fifths of persons fatally injured by law enforcement die immediately [35], an unknown proportion of the remaining one-fifth may die at a hospital in another county. Facilities best equipped to treat gunshot wounds, such as level I trauma centers, are more likely to be located in urban and higher income counties [36], so this could lead to measurement error for county-level variables. Finally, The Counted data do not include deaths that occurred in 2015 due to an injury inflicted in 2014, so any such cases are absent from the analyses. However, this is likely a very small number of cases (for injuries inflicted in 2015, we identified only 3 cases, or <0.3% of deaths, for which the death occurred in 2016).
Our estimates, derived from capture-recapture analysis, for the total number of law-enforcement-related deaths in 2015 are robust to pairwise list dependence. Because of the high degree of overlap between our 2 data sources (i.e., a large proportion of deaths reported in the NVSS were also reported in The Counted), any potential list dependency had minimal effect on the overall estimate. The Counted was more effective at identifying deaths: a case was approximately twice as likely to be reported in The Counted compared to the NVSS. Comparing its coverage rate to previous estimates produced by the BJS, The Counted outperformed ARD (which captured an estimated 49% of deaths over the period 2003-2011, excluding 2010) as well as the FBI's Supplementary Homicide Reports data (which captured an estimated 46% of deaths over the same period) [4]. Only 2 prior studies have used capture-recapture analysis to estimate the number of US law-enforcement-related deaths. First, a BJS analysis for the period 2003-2011 (excluding 2010) was based on probabilistically matched deaths from 2 national law enforcement sources and estimated that there were on average 928 annual law-enforcement-related deaths in the US [4]. The authors of the BJS study note that many law enforcement agencies did not report any deaths to either system, and, once they accounted for nonresponse, their estimate was approximately 1,200, on par with our estimate. Second, Lum and Ball [29], adjusting for potential list dependency but not for agency nonresponse, used the same BJS data to estimate an annual mean of 1,500 deaths in the US, which is higher than our estimate. They state that adjusting for nonresponse would increase their estimate by an additional 30%.
Differences between these prior estimates and our own may be attributable to (1) an actual change in the incidence of law-enforcement-related deaths, (2) uncertainty in the magnitude of list dependence, or (3) potential error in the prior estimates introduced by the imprecision of probabilistic matching. We found that the majority of misclassified cases for the most common cause of death (fatal gunshot wounds by law enforcement) were incorrectly coded as assault. As hypothesized, a higher risk of misclassification occurred for the less common phenomenon of law-enforcement-related deaths involving injury mechanisms other than firearms. This may reflect a lack of consensus among coroners and medical examiners about how to report non-firearm deaths in police custody [37]. Notably, cause of death classification was especially inaccurate for law-enforcement-related deaths due to Taser shocks, which was the second most common mechanism after firearms.
While misclassification of law-enforcement-related deaths is a problem throughout the country, affecting 55% of mortality records nationally, the probability of misclassification varied widely both within and between states, and also by social and economic groups. Descriptive analyses found higher probabilities of misclassification among decedents who were under age 18 years, black, or residing in the poorest county income quintiles, suggesting researchers should exercise caution when comparing rates of law-enforcement-related mortality among various sociodemographic groups using only national-level data. However, in our analyses that accounted for systematic differences in odds of misclassification by state and county (i.e., the multilevel models), only county income quintile remained significantly associated with risk of misclassification. Possible explanations for the inverse association between county income and odds of misclassification may include better resources and training among coroners/medical examiners in wealthier counties and differences in the political culture in wealthier counties that lead to greater transparency in relation to law-enforcement-related deaths. Even so, contrary to our hypotheses, we did not find that misclassification differed by death investigator type. It may be that extent of training and resources matters more to mitigate misclassification than death investigator type.
Misclassification of cause of death is a longstanding and ongoing concern in US vital statistics, and the validity of these reported data may vary widely depending on the type of disease or injury [38,39]. However, evidence suggests that the accuracy of mortality classification for homicide, an outcome similar to law-enforcement-related mortality in that it is also certified by coroners and medical examiners, is very high. A prior study of large US cities found a near-perfect correlation between homicide counts reported in the NVSS and homicide counts reported in Supplementary Homicide Reports [40]. For law-enforcement-related deaths, however, correlations between the same 2 systems are considerably lower [1,13].

Future research and implications
Future studies could estimate the number of law-enforcement-related deaths, nationally or subnationally, using data from additional years and sources. Alternative data sources for these deaths include the National Violent Death Reporting System (NVDRS), which covers 40 US states and the District of Columbia as of 2017 [41], and deaths-in-custody lists maintained by the attorneys general of California [35] and Texas [42]. Additionally, state offices of vital statistics and departments of health can identify the shortcomings of their current vital statistics data by reviewing death certificates for law-enforcement-related deaths. It will also be useful to evaluate whether making such deaths a notifiable condition improves reporting [9], per new legislation enacted in Tennessee in 2017 [43].
There are multiple interventions that may improve public health monitoring of law-enforcement-related deaths. Examples include training medical examiners and coroners to indicate law enforcement involvement in death certificate literal text, increasing the use of news media reports as a data source for NVDRS states, and legally requiring disclosure of these deaths to health departments [43] or death investigators [27]. Additionally, health departments can create websites to provide the public with real-time reports of law-enforcement-related deaths that occur within their jurisdiction. This can be coupled with the inclusion of such deaths in a jurisdiction's list of notifiable conditions, which would allow for reporting of these deaths to health departments by medical staff, first responders, and members of the public [18].
Improving public health monitoring of law-enforcement-related mortality is a critical part of efforts to ensure public accountability for these incidents and prevent future incidents. Also warranting attention is improved monitoring of nonfatal injuries due to law enforcement, which currently are not captured by any official or media-based reporting system [44]. Better-quality data would allow researchers to quantify various forms of social inequality that may be linked to law-enforcement-related mortality (e.g., differences by race/ethnicity, socioeconomic position, and gender identity), compare rates between jurisdictions, and identify whether incidence is increasing or decreasing over time [18,44].