
Measuring the hidden burden of violence: use of explicit and proxy codes in Minnesota injury hospitalizations, 2004–2014



Commonly used violence surveillance systems are biased towards certain populations because of over-reporting or over-scrutiny. Hospital discharge data may offer a more representative view of violence through the use of proxy codes, i.e., diagnoses of injuries correlated with violence. The goals of this paper are to compare trends in violence in Minnesota, and associations of county-level demographic characteristics with violence rates, as measured through explicitly diagnosed violence and through proxy codes. It also explores how certain sub-populations are overrepresented in traditional surveillance systems.


Using Minnesota hospital discharge data linked with census data from 2004 to 2014, this study examined the distribution and time trends of explicit, proxy, and combined (proxy and explicit) codes for child abuse, intimate partner violence (IPV), and elder abuse. The associations between county-level risk factors (e.g., poverty) and county violence rates were estimated using negative binomial regression models with generalized estimating equations to account for clustering over time.


The main finding was that the patterns of county-level violence differed depending on whether one used explicit or proxy codes. In particular, explicit codes suggested that child abuse and IPV trends were flat or decreased slightly from 2004 to 2014, while proxy codes suggested the opposite. Elder abuse increased during this timeframe for both explicit and proxy codes, but more dramatically when using proxy codes. In regard to the associations between county level characteristics and each violence subtype, previously identified county-level risk factors were more strongly related to explicitly-identified violence than to proxy-identified violence. Given the larger number of proxy-identified cases as compared with explicit-identified violence cases, the trends and associations of combined codes align more closely with proxy codes, especially for elder abuse and IPV.


Violence surveillance utilizing hospital discharge data, and particularly proxy codes, may add important information that traditional surveillance misses. Most importantly, explicit and proxy codes indicate different associations with county sociodemographic characteristics. Future research should examine hospital discharge data for violence identification to validate proxy codes that can be utilized to help to identify the hidden burden of violence.


Violence is a common and serious public health problem. In 2019, there were an estimated 5.4 million violent victimization experiences among U.S. residents 12 or older (Morgan and Jennifer 2020). In order to track incidence, accurate surveillance of violent events is critical. Administrative data for violence identification typically occur in two systems: first, systems where the primary purpose is violence identification, and, second, systems where the primary purpose is not violence identification. An example of the former is child protective services (CPS) data, where the primary function is to identify children who are victims of violence and related victimization (e.g., neglect). These and other types of administrative systems designed to capture violence cases, such as police data, are most commonly used for violence surveillance. However, there has been concern about bias in these data, because they only capture a fraction of cases, due to under-reporting (Rochelle and Buonanno 2018; Gray et al. 2017; Lipsky et al. 2012; Feldman et al. 2017). Further, the cases that are captured may skew toward more highly scrutinized communities (e.g., those with interaction with mandated reporters through public benefits programs) (Maguire-Jack et al. 2018).

Because of concerns with administrative data designed for violence identification, there has been increasing attention to the utility of administrative data where the primary function is not violence identification. An example of this second type of administrative data is hospital claims databases, where the primary function is hospital billing. These databases are formed using medical records. The medical records are used by health information professionals to code a patient's visit for billing purposes, using three main types of codes: International Classification of Disease Clinical Modification, 9th Revision (ICD-9 CM codes or hereafter ICD-9 codes), external cause codes (E-codes), and supplementary classification of factors influencing health status codes (V-codes). Since October 2015, billing coding has been based on the ICD-10-CM. ICD-9 CM codes are primarily used to assess cost and reimbursement, but they are widely available and therefore are also useful in surveillance and research on disease morbidity and mortality (Thomas et al. 2002), and may be similarly used to study violence (Ahern et al. 2018).

A common way to identify violence in hospital billing data is through ICD codes that explicitly diagnose an injury as caused by violence (e.g., ICD-9 CM: 995.81 for physical abuse) (Scott et al. 2009). However, these so-called “explicit” codes are underutilized (Muldoon et al. 2019) and may be biased in similar ways to police or child protection data. For example, for an ICD violence code to be assigned, either the patient must reveal that the injury that brought them into the hospital was due to violence, or the provider must make a subjective assessment that the intent of the injury was violence, since the coding system does not allow for suspected abuse or neglect to be documented. There are several reasons why patients may not disclose violence, including the presence of the perpetrator in the hospital room with the patient/victim (Lipsky et al. 2009; Hymel et al. 2018; Lane et al. 2002). Other sources of bias in violence codes could be related to the patient's socio-economic status (Keenan et al. 2017), resource barriers (Beynon et al. 2012), and urban versus rural setting (Edwards 2015). More specifically, these codes could be overutilized for patients of color or individuals of low socioeconomic status and underutilized for rural communities, where violence tends to be underreported. This would then bias estimates of how these factors are associated with violence. Thus, cases identified through these explicit codes may not be representative of the true distribution of violence-related injuries (Scott et al. 2009; Schnitzer et al. 2011; Lloyd and Rissing 1985; McKenzie et al. 2011; Hooft et al. 2015).

A second option for assessing violence in hospital data is through “proxy” codes for injuries. Proxy codes identify injuries that are common outcomes of violence (Schnitzer et al. 2011; Bhargava et al. 2011). Using these “proxy” codes for violence identification may yield a more representative distribution of violence and be less prone to bias, because it does not rely on patient reports or subjective assessments by providers. Past research has examined the correlation of these proxy codes with “gold standard” violence identification methods. For example, in child maltreatment, one study (Schnitzer et al. 2011) used medical record review by project staff trained in child maltreatment, as well as consultation by an advisory board, to identify maltreatment cases. An ICD code was defined as “suggestive of child maltreatment” when greater than 66% of the visits with that code were thought to be caused by child maltreatment (Schnitzer et al. 2011). One example of these newly identified proxy codes is retinal hemorrhage. Intimate partner violence (IPV) proxy codes have been identified through a confluence of studies linking head, neck and face injuries to IPV (Bhargava et al. 2011; Perciaccante et al. 1999, 2010; Schafer et al. 2008; Davidov et al. 2015; Wu et al. 2010; Petridou et al. 2002; Bhandari et al. 2006; Halpern et al. 2009; Sheridan and Nash 2007). Specifically, one study used predictive models to examine which injuries correctly predicted IPV cases confirmed through telephone or clinical diagnoses (Bhargava et al. 2011). Another study found that head, neck and face injuries had a 91% sensitivity and 59% specificity for violence against women (Perciaccante et al. 1999). It is important to note that high specificity means a low type 1 error rate (a low likelihood of false positives), so a specificity of 59% implies that a substantial share of the injuries flagged this way are false positives; conversely, high sensitivity means a low type 2 error rate (few false negatives).
Lastly, elder abuse proxy codes have been identified through common diagnoses among individuals who reported to Adult Protective Services (APS) (Wiglesworth et al. 2009). These methodologies (and others, Lachs et al. 1997; Gironda et al. 2016) provide an evidence base for use of proxy ICD codes to identify violent injury.
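To make the sensitivity and specificity figures concrete, the following minimal sketch converts them into a positive predictive value. The 91% and 59% values are those quoted above from Perciaccante et al. (1999); the 10% prevalence and the function name are hypothetical and chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of flagged encounters that are true cases (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Sensitivity and specificity as quoted in the text; prevalence is an assumption.
ppv = positive_predictive_value(sensitivity=0.91, specificity=0.59, prevalence=0.10)
print(f"PPV at an assumed 10% prevalence: {ppv:.3f}")
```

Under these assumed inputs, fewer than one in five flagged encounters would be a true case, which illustrates why proxy codes are expected to include injuries not due to violence.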

These two options of using explicit vs. proxy ICD hospital codes for violence identification have trade-offs. For instance, explicit codes are very likely to reflect true violence, but are potentially biased. In contrast, proxy codes are likely to include injuries not due to violence, but this misclassification is more likely to be non-differential and may result in less systematic bias. While proxy codes are not currently used for violence surveillance, they could potentially address the impacts of systematic underreporting, improve case ascertainment, and thus reveal the hidden burden of violence. To move toward the use of proxy codes for surveillance, more needs to be understood about how proxy codes compare with explicit codes with regard to their temporal and geographic distribution. However, there is no research, to our knowledge, that compares proxy codes to explicit codes in terms of trends over time and associations with county-level predictors of violence such as poverty, racial/ethnic demographics, urbanicity, employment, and education level. Such a comparison adds important information on potentially differential patterns, especially in subgroups that are over-scrutinized and in settings where violence tends to go unreported (e.g., rural communities and elder abuse).

The goals of this paper are therefore to compare the trends in violence in Minnesota by county from 2004 to 2014, and associations of county-level demographic characteristics with violence rates as measured through explicit, proxy and a combination of explicit and proxy codes using Minnesota Hospital Discharge Data. Three violence subtypes (child maltreatment, elder abuse, and intimate partner abuse) are examined to represent three important types of violence over the lifetime.



Minnesota hospital discharge data

Population representative hospital administrative data from 2004 to 2014 were obtained through the Minnesota Hospital Association (MHA). Minnesota hospitals (n = 246) are required to submit all inpatient, outpatient, and emergency department claims data to MHA, which compiles these data in a statewide administrative claims database. This database contains a data point for each patient encounter with a health care provider, including diagnoses (ICD codes). Individuals could appear in the database multiple times.

ICD-9 CM codes are used to describe the diagnosis of the condition being treated. For injuries, they describe the nature of the injury (e.g., fracture, cut) and body part (e.g., skull, arm). ICD-9 CM codes are the main codes included in administrative datasets because they are required for billing and reimbursement. External cause codes (E-codes) are optional additional descriptors to ICD-9 CM codes that describe when and where the injury happened, to whom or by whom, how, and intentionality (Injury Data and Resources 2015). V-codes are a supplementary classification of factors influencing health status. Cross-sectional (not longitudinally linked) MHA data on ICD-9 CM, E-codes, and V-codes from 2004 to 2014 were used to measure cases of violence for this study. These data form the numerator for the violence rates.

Population data

Population denominators to calculate county-level violence-related injury rates were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program (National Cancer Institute 2021) which provides annual population level estimates by county, sex, and age for each year from 2004 to 2014. The following denominators were used for each violence subtype: age 0 to 18 for child abuse, 65 plus for elder abuse, and 16 plus for intimate partner violence.

Sociodemographic data

The 2010 Decennial Census (US Census Bureau 2021a) and the 2010 American Community Survey (ACS) (US Census Bureau 2021b) were used to provide county-level sociodemographic predictors (correlated with violence in traditional systems) of county-level violence (United States Census Bureau 2020) including: percent poverty, percent minority, urban, percent unemployed and percent less than high school education. The continuous variables are dichotomized at the mean.
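The dichotomization step can be sketched as follows. This is an illustration only: the county names and percentages are made up, the function name is hypothetical, and the "greater than or equal to the mean" convention is taken from how the cut points are described in the results.

```python
from statistics import mean

def dichotomize_at_mean(county_values):
    """Return (cutoff, {county: 1 if value >= cutoff else 0})."""
    cutoff = mean(county_values.values())
    flags = {county: int(value >= cutoff) for county, value in county_values.items()}
    return cutoff, flags

# Hypothetical percent-poverty values for three made-up counties.
percent_poverty = {"County A": 8.0, "County B": 12.0, "County C": 16.0}
cutoff, high_poverty = dichotomize_at_mean(percent_poverty)
print(cutoff, high_poverty)
```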

Case ascertainment

Violence-related injuries were identified using both explicit and proxy methods. To avoid duplication, encounters with both an explicit and a proxy code were identified using a unique encounter-specific identifier and counted only in the explicit count. Three subtypes of violence were analyzed: child maltreatment (people ages 0–17 years), elder abuse (people ages 65+ years) and intimate partner violence (people ages 16+ years). Appropriate population denominators were applied to create incidence rates. These three subtypes of violence were chosen because they represent different forms of violence throughout the lifespan and because research identifying proxy codes was available, as described below.
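The de-duplication rule above can be sketched in a few lines. The code sets here are stand-ins, not the study's lists (those are in Table 1): 995.81 is the explicit example given earlier in the text, and 362.81 is assumed here to be a retinal hemorrhage code. The encounter identifiers and the function name are hypothetical.

```python
from collections import Counter

# Stand-in code sets for illustration only; the study's lists are in Table 1.
EXPLICIT_CODES = {"995.81"}  # explicit example quoted in the text
PROXY_CODES = {"362.81"}     # assumed retinal hemorrhage code (proxy example)

def classify_encounter(codes):
    """Label an encounter, giving explicit codes priority over proxy codes."""
    codes = set(codes)
    if codes & EXPLICIT_CODES:
        return "explicit"
    if codes & PROXY_CODES:
        return "proxy"
    return None

# Hypothetical encounters keyed by a unique encounter identifier.
encounters = {
    "enc-1": ["995.81", "362.81"],  # both kinds of code: counted as explicit only
    "enc-2": ["362.81"],
    "enc-3": ["250.00"],            # neither kind of code: not counted
}
labels = (classify_encounter(codes) for codes in encounters.values())
counts = Counter(label for label in labels if label is not None)
combined = counts["explicit"] + counts["proxy"]
print(dict(counts), combined)
```

Because each encounter receives exactly one label, summing the explicit and proxy counts gives a combined count without double counting.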

Explicit operationalization of violence

ICD-9 CM codes, E-codes, and V-codes that indicate a diagnosis of violence (explicit codes) are listed in Table 1 along with corresponding average yearly counts.

Table 1 Explicit and proxy ICD-9, E- and V-codes used to define violence in Minnesota hospital discharge data

Proxy operationalization of violence

ICD-9, E-codes, and V-codes indicating injury suggestive of violence (proxy codes) are listed in Table 1 along with corresponding average yearly counts. The proxy operationalizations here were based on a review of the literature. The final selection of proxy codes was based on studies identifying these codes through confirmed violence cases via in-depth medical record review (Schnitzer et al. 2011; Gironda et al. 2016; Btoush et al. 2009; Barlow et al. 1998), predictive modeling (Bhargava et al. 2011; Perciaccante et al. 1999, 2010; Reis et al. 2009), common diagnoses of known violent encounters (Schafer et al. 2008; Davidov et al. 2015; Wu et al. 2010; Petridou et al. 2002; Bhandari et al. 2006; Halpern et al. 2009; Sheridan and Nash 2007; Wiglesworth et al. 2009; Nannini et al. 2008; Rosen et al. 2016), and linking hospital records with known cases of violence through administrative data systems (Schnitzer et al. 2011; Lachs et al. 1997) such as Child Protection Services (CPS) or Elder Protection Services. Codes from these studies were selected only if the literature was consistent for that type of injury and/or the study indicated that the injury was most likely due to violence. Motor vehicle crashes were excluded from IPV to minimize non-violence-related injuries (Sheridan and Nash 2007). Significantly less literature was available on elder abuse. Therefore, to increase certainty in elder abuse proxy codes, injuries identified as being of “unintentional intent” (e.g., E928.9) were excluded from elder abuse. The resulting elder abuse proxy codes were therefore restricted to either undetermined or intentional injuries. Further descriptions of the proxy codes are in Additional file 1: Appendix Table 1.

Combined operationalization of violence

The counts for the number of explicit and proxy operationalization of violence codes were summed to create a third operationalization of violence. The combined outcome was meant to serve as an intermediary approach for identifying violence.


The distribution and time trends of explicit and proxy child abuse, IPV, and elder abuse by all 87 counties in Minnesota from 2004 to 2014 were examined. For incidence rates, the yearly sum of cases in a county, defined by the given set of codes, served as the numerator and yearly county population data served as the denominator.
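The rate calculation described above amounts to a simple ratio. The sketch below assumes rates are expressed per 1000 population, as in the results; the counts, population, and function name are hypothetical.

```python
def rate_per_1000(cases, population):
    """Yearly county incidence rate per 1000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * cases / population

# Hypothetical county-year: 50 explicit and 700 proxy cases among 25,000 residents.
explicit_rate = rate_per_1000(50, 25_000)
proxy_rate = rate_per_1000(700, 25_000)
combined_rate = rate_per_1000(50 + 700, 25_000)
print(explicit_rate, proxy_rate, combined_rate)
```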

To estimate associations between county-level risk factors (e.g., poverty) and county violence rates, negative binomial regression models with generalized estimating equations were run to estimate incidence rate ratios with 95% CIs, accounting for within-county clustering over time. Two models, crude and adjusted, were run for each outcome. First, in crude models, the yearly count totals of each outcome were regressed separately on each individual socio-demographic variable and on year, with the yearly county-level population denominator as the offset (rate denominator). Second, fully adjusted models that included all the county-level socio-demographic variables and year were estimated. Finally, as a sensitivity analysis, the fully adjusted models were run on a subset of codes within each violence subtype (e.g., any burn code for IPV) to examine whether certain codes were driving these associations (Additional file 1: Appendix Table 2). No null hypothesis significance testing was conducted; results instead focus on estimation (Lash 2017). This study was deemed not human research by the University of Minnesota Institutional Review Board.
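One way to write the mean structure of the fully adjusted model described above is given below. The notation is an assumption introduced here for illustration and does not appear in the source: Y_ct is the case count in county c and year t, N_ct the population denominator (entering as a log offset), x_ck the dichotomized county characteristics, and gamma the yearly trend. The crude models would keep the year term and a single x_ck.

```latex
\log \mathbb{E}[Y_{ct}] = \log N_{ct} + \beta_0 + \gamma\,\mathrm{year}_t + \sum_{k} \beta_k x_{ck},
\qquad \mathrm{IRR}_k = e^{\beta_k}, \qquad \mathrm{IRR}_{\text{per year}} = e^{\gamma}
```

Under this form, each exponentiated coefficient is an incidence rate ratio, and the generalized estimating equations account for correlation among repeated yearly observations from the same county.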

Table 2 Mean violence incidence rate ratios and percent county population characteristics for Minnesota 2004–2014


Table 2 describes the average rates of explicit- and proxy-identified violence subtypes and socio-demographic characteristics across counties in the sample. Rates estimated using explicit and proxy codes were substantially different, especially for elder abuse (2 per 1000 for explicit vs. 106 per 1000 for proxy) and intimate partner violence (5 per 1000 for explicit vs. 294 per 1000 for proxy).

Table 3 describes the crude bivariate associations of year and each county socio-demographic factor with explicit-, proxy-, and combined-identified violence rates. Generally, the magnitude of association with county-level factors is stronger for explicit codes than for proxy codes.

Table 3 Crude bivariate negative binomial regression with GEE: incidence rate ratio for the association between each county-level socio-demographic characteristic and injury codes

The fully adjusted models are in Table 4. Using explicit codes, the rate of elder abuse appears to increase slightly from 2004 to 2014 (IRRexplicit per year: 1.03; 95% CI 1.01–1.06). The time trends for child abuse and IPV are both flat or slightly decreasing (child maltreatment IRRexplicit per year: 0.98; 95% CI 0.97–1.00, and IPV IRRexplicit per year: 0.98; 95% CI 0.96–1.01, respectively). In contrast, using proxy codes, there appears to be a substantial upward trend in elder abuse rates (IRRproxy per year: 1.12; 95% CI 1.11–1.13) from 2004 to 2014. Child abuse and IPV measured using proxy codes are slightly increasing over time (child abuse IRRproxy per year: 1.03; 95% CI 1.02–1.05, and IPV IRRproxy per year: 1.04; 95% CI 1.03–1.04). The combined explicit and proxy codes for child abuse, elder abuse and intimate partner violence mimic the proxy codes in magnitude and pattern.
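To give a sense of scale for the per-year estimates, the sketch below compounds an incidence rate ratio over the ten yearly steps from 2004 to 2014. It assumes the yearly trend is constant and multiplicative, as a log-linear year term implies; the function name is hypothetical, and confidence intervals are ignored.

```python
def cumulative_ratio(irr_per_year, years):
    """Rate ratio implied over `years` steps of a constant yearly IRR."""
    return irr_per_year ** years

# Point estimates quoted in the text; 2004 to 2014 spans ten yearly steps.
per_year_irr = {
    "elder abuse, explicit": 1.03,
    "elder abuse, proxy": 1.12,
    "child abuse, explicit": 0.98,
    "child abuse, proxy": 1.03,
}
for label, irr in per_year_irr.items():
    print(f"{label}: {cumulative_ratio(irr, 10):.2f}")
```

Read this way, the proxy estimate for elder abuse corresponds to roughly a threefold increase in the rate over the study period, compared with about a one-third increase under explicit codes.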

Table 4 Fully adjusted negative binomial regression with GEE: incidence rate ratio for the association between all county-level socio-demographic characteristics and injury codes

In the fully adjusted models, explicit codes for child abuse indicate that counties with greater than or equal to the mean (11.3%) of people living in poverty have 1.36 (95% CI 1.09–1.68) times the rate of child abuse compared to counties that had less than 11.3% of people living in poverty. In general, the associations of child abuse with county-level measures of poverty, people of color, unemployment, and education all decrease in magnitude when comparing explicit to proxy codes, although confidence intervals often overlap. For example, using the proxy measure of child abuse, counties with greater than or equal to 11.3% of people living in poverty have 1.12 (95% CI 0.88–1.43) times the rate of child abuse compared to counties with less than 11.3% of people living in poverty. Using combined explicit and proxy codes, counties with at least the average level of poverty have 1.27 (95% CI 1.03–1.55) times the rate of child abuse of counties with less than the average level of poverty. The estimates for combined codes tend to fall between those for proxy and explicit codes.

Using explicit elder abuse codes, counties with greater than or equal to the mean (5.8%) of people unemployed have a 58% increased rate of elder abuse compared to counties that had less than 5.8% unemployed. The associations between elder abuse and county-level socio-demographic characteristics also are lower in magnitude when using proxy rather than explicit codes, except for county-level education. For example, using proxy-identified and combined-identified elder abuse, counties with higher unemployment had about a 6% increased rate of elder abuse compared with counties that had lower unemployment (IRRproxy 1.06, 95% CI 0.90–1.25; IRRcombined 1.06, 95% CI 0.91–1.25), respectively.

As with child maltreatment and elder abuse, the associations between IPV and county-level socio-demographic characteristics were generally closer to the null when using proxy and combined codes compared with explicit codes, although two associations flipped directionality. For example, counties with higher percent unemployment, and those with more people of color, were found to have greater relative rates of IPV using the explicit versus proxy and combined codes (IRRexplicit 1.73, 95% CI 1.30–2.30 vs. IRRproxy 1.13, 95% CI 1.00–1.28 vs. IRRcombined 1.14, 95% CI 1.01–1.29) and (IRRexplicit 1.30, 95% CI 0.93–1.84 vs. IRRproxy 1.16, 95% CI 1.02–1.32 vs. IRRcombined 1.16, 95% CI 1.03–1.32), respectively. Associations between IPV and poverty, education, and urbanicity were close to the null regardless of whether explicit, proxy, or combined codes were used.

Lastly, the sensitivity analysis showed that, in the fully adjusted models, the malnutrition subset of codes drove most of the associations between elder abuse and each county-level sociodemographic characteristic. For child abuse and IPV, it was less clear which individual code subsets drove the associations with county-level sociodemographic characteristics.


The goal of this paper is to determine how proxy codes, in relation to and in combination with a more traditional approach (explicit codes), describe violence using population-based data for one state. Our main findings are that the magnitude of violence rates, and the patterns of violence across time and by county-level characteristics, differed depending on whether one used explicit or proxy codes. In particular, explicit codes suggested that child abuse and IPV trends were flat or decreased slightly from 2004 to 2014, while proxy codes suggested the opposite. Elder abuse increased during this timeframe for both explicit and proxy codes, but more dramatically when using proxy codes. In regard to the associations between county-level characteristics and each violence subtype, previously identified county-level risk factors were more strongly related to explicitly identified violence than to proxy-identified violence. Given the larger number of proxy-identified than explicit-identified violence cases, the trends and associations of combined codes align more closely with proxy codes, especially for elder abuse and IPV.

The finding of increasing violence over time using proxy codes contrasts with evidence of declining trends of child abuse (Finkelhor et al. 2014), elder abuse (Morgan and Mason 2014) and IPV (Catalano 2013) from data sources such as the Uniform Crime Reports (UCR), Child Protection (Finkelhor et al. 2013a), and the National Crime Victimization Survey (NCVS) (Powers and Kaukinen 2012; Morgan and Kena 2017). There are several possible reasons for differences in findings between this study and these other data sources. Traditional surveillance systems for violence may have systemic selection bias. For example, UCR relies on police data that may be likely to over-report crime in communities of color and under-report in white communities (Myers 1980; Mesic et al. 2018; Voigt et al. 2017). Further, UCR excludes sexual assault and crimes not reported to the police (Planty et al. 2014). Underreporting is also a problem in Child Protection data (Wildeman et al. 2014). For example, in 2011, the National Child Abuse and Neglect Data System (NCANDS) reported approximately three million U.S. children who received an investigation or response from a state child protection service agency (Maltreatment 2020), but in the same year, according to the National Survey of Children’s Exposure to Violence, approximately 10 million U.S. youth had experienced maltreatment by their caregiver (Finkelhor et al. 2013b). Therefore, different types of selection into and usage of these surveillance systems (such as health care utilization) could explain the different trends, because each system is measuring a different population or is measuring violence differently.

Generally, the associations between violence and county socio-demographic compositional factors are smaller for proxy codes than for explicit codes. For example, after adjustment for all other county-level demographic characteristics, explicit codes indicated that violent injuries for elder abuse are highest in counties that had population percentages at or above the Minnesota mean percent of people of color. When violence is measured with proxy codes or combined codes, these associations are still elevated but move toward the null. One possible explanation of this could be that proxy codes could be less systemically racially biased, or conversely, explicit codes are more greatly influenced by racial bias. Explicit codes require a subjective judgment by medical providers, which leaves them vulnerable to individuals’ implicit biases. Proxy codes do not require this judgment and may be less affected by this bias. On the other hand, proxy codes trade greater potential representativeness for lower specificity, potentially leading to non-differential misclassification, which could also move effect estimates closer to the null. These different associations by proxy versus explicit coding suggest that sole reliance on explicit coding of violence for surveillance and research may be insufficient and proxy codes may potentially help to address under- and biased reporting (Shepherd and Sivarajasingam 2005), yet research is required to understand the potential misclassification in proxy codes. Since ICD codes are a universal coding system in the U.S., further testing and application should be done to assess proxy codes’ validity for violence identification.

The reasons for the differences between proxy and explicit codes in predictors and trends over time are unknown. The differences, in part, are likely due to a trade-off between the specificity and sensitivity of the codes. For instance, explicit codes likely have high specificity (i.e., encounters with violence codes are very likely to reflect true violence), but may suffer from low sensitivity (i.e., many or most cases of violence are not coded as violence and are thus missed). If this misclassification is non-differential (Aschengrau and Seage 2021) with respect to predictor variables, then it may attenuate associations. However, if, as suspected, this misclassification is differential due to greater suspicion of violence in certain populations associated with our hardship measures, then associations may be biased away from the null. Proxy codes could see a decrease in specificity but an increase in sensitivity, and the final estimate may still be an underestimate of the actual effect but perhaps closer in magnitude to it. These codes are also misclassified, but perhaps in a less systematic (less biased) way. The use of proxy codes allows violence cases to be identified that may not otherwise have been detected, which is important for prevention. The utility of proxy codes for prevention may make false positives more tolerable. To our knowledge, this is the first study to compare the predictors and trends over time for proxy and explicit codes; therefore, future validation work is important.
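The direction of these biases can be illustrated with a small expected-value calculation. Every number below is hypothetical (the true prevalences, sensitivities, and specificities are not estimates from this study), and the function name is introduced only for this sketch.

```python
def observed_proportion(true_prevalence, sensitivity, specificity):
    """Expected share of encounters coded as violence under misclassification."""
    true_pos = sensitivity * true_prevalence
    false_pos = (1 - specificity) * (1 - true_prevalence)
    return true_pos + false_pos

# Hypothetical true prevalence in low- and high-hardship counties (true ratio 2.0).
p_low, p_high = 0.02, 0.04
true_ratio = p_high / p_low

# Non-differential error: same sensitivity and specificity in both groups.
nondiff_ratio = (observed_proportion(p_high, 0.90, 0.90)
                 / observed_proportion(p_low, 0.90, 0.90))

# Differential error: perfect specificity, but violence is recognized twice as
# often in high-hardship counties (sensitivity 0.40 versus 0.20).
diff_ratio = (observed_proportion(p_high, 0.40, 1.00)
              / observed_proportion(p_low, 0.20, 1.00))

print(f"true {true_ratio:.2f}, non-differential {nondiff_ratio:.2f}, "
      f"differential {diff_ratio:.2f}")
```

In this toy setting, the non-differential scenario pulls the ratio toward 1, while the differential scenario pushes it above the true value, matching the two directions of bias described above.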

This study has limitations that should be considered. First, this study missed those who experienced violent events but did not go to the hospital. Second, hospital data may oversample those in the population with health insurance. That said, more severe or urgent injuries may bring people in for care despite a lack of health insurance coverage (Sommers and Simon 2017). Thus, this study may be seen as an analysis of more severe violence-related injuries. Third, hospital data lack details such as perpetration and location of the event that are available in sources like UCR, NCVS, and NCANDS, which could help to identify potential points for intervention. Fourth, the analysis includes a set of county-level covariates from a single time point (the 2010 decennial census or ACS). This limits the ability to account for variation across time in the covariates. However, there is minimal change over this 11-year time frame in these measures at the state level (Minnesota Compass 2020), and a middle timepoint captures mean covariate levels across the study period. Fifth, while these data are representative of violence-related injuries in Minnesota given the census of hospital records as the data source, the results may not be generalizable outside of Minnesota. Sixth, this study uses ICD-9 codes while the current version of hospital discharge codes is ICD-10; thus, the study stopped at 2014. Despite the use of an older coding system, both injury coding systems (ICD-9 and ICD-10) continue to use similar approaches that are translatable (Gibson et al. 2016). In addition, the ICD-10 external cause framework was developed to be as consistent as possible with ICD-9 codes (Injury Data and Resources 2019).

This study has several strengths that mitigate its weaknesses. First, this study utilizes a population-based administrative dataset at the county level for Minnesota, allowing generalizability to the entire state population. Second, while proxy codes are likely to have some misclassification, they are subject to potentially less systemic bias and may thus better capture violence in communities where violence is not traditionally identified, such as whiter or wealthier communities (Sumner et al. 2015). Given the different strengths and weaknesses of explicit and proxy codes, and the lack of a gold standard for violence identification, it is useful to consider both approaches in research and for replication in other studies. The combined code approach could be a way to be more inclusive for studies that are attempting to target a broad pool. Third, this study utilizes county-level geography to distinguish associations between violence and county socio-demographic characteristics, which can be useful for local public health agency surveillance. In contrast, violence trend data such as NCVS are commonly reported at the national or regional level, which limits the more granular assessment of trends that occur in different parts of the United States.

There are several implications for future research from these findings. Violence surveillance utilizing hospital discharge data, and particularly proxy codes, may add important data that traditional surveillance lacks. Most importantly, explicit and proxy codes indicate different geographic patterns and trends over time. The use of proxy codes for violence identification may provide an avenue for capturing violence that traditional surveillance misses (Boyle and Kirkbride 2005). Accurate surveillance of violence is critical for resource allocation for prevention and intervention. Utilizing proxy codes in conjunction with explicit codes may be one step towards more comprehensive surveillance. More specifically, hospital records could be used as a syndromic surveillance system for violence, which could lead to potentially more timely and impactful interventions. Future research examining hospital discharge data for violence identification utilizing and verifying proxy codes can help to identify the hidden burden of violence.

Availability of data and materials

The majority of the data that support the findings of this study are available from the Minnesota Hospital Association, but restrictions apply to the availability of these data, which were used under license for the current study, and so they are not publicly available. The remaining data from the American Community Survey are available from the United States Census Bureau website.


Abbreviations

ICD-9-CM: International Classification of Diseases, 9th Revision, Clinical Modification
CPS: Child protective services
E-code: External cause
IPV: Intimate partner violence
APS: Adult protective services
MHA: Minnesota Hospital Association
NCVS: National Crime Victimization Survey
UCR: Uniform Crime Reports
NCANDS: National Child Abuse and Neglect Data System
ACS: American Community Survey
SEER: Surveillance, Epidemiology, and End Results


  1. Ahern J, Matthay EC, Goin DE, Farkas K, Rudolph KE. Acute changes in community violence and increases in hospital visits and deaths from stress-responsive diseases. Epidemiology. 2018;29(5):684–91.

  2. Aschengrau A, Seage GR. Essentials of epidemiology in public health, 4th edn. Independently Published; 2021. 538 p.

  3. Barlow KM, Milne S, Aitken K, Minns RA. A retrospective epidemiological analysis of non-accidental head injury in children in Scotland over a 15 year period. Scott Med J. 1998;43:112–4.

  4. Beynon CE, Gutmanis IA, Tutty LM, Wathen N, Macmillan HL. Why physicians and nurses ask (or don’t) about partner violence: a qualitative analysis. BMC Public Health. 2012;12:1.

  5. Bhandari M, Dosanjh S, Tornetta P, Matthews D. Musculoskeletal manifestations of physical abuse after intimate partner violence. J Trauma Injury Infect Crit Care. 2006;61(6):1473–9.

  6. Bhargava R, Temkin TL, Fireman BH, Eaton A, McCaw BR, Kotz KJ, et al. A predictive model to help identify intimate partner violence based on diagnoses and phone calls. Am J Prev Med. 2011;41(2):129–35.

  7. Boyle A, Kirkbride J. Record linkage of domestic assault victims between an emergency department and the police. J Epidemiol Community Health. 2005;59:909–10.

  8. Btoush R, Campbell JC, Gebbie KM. Care provided in visits coded for intimate partner violence in a national survey of emergency departments. Women’s Health Issues. 2009;19(4):253–62.

  9. Catalano S. Intimate partner violence: attributes of victimization, 1993–2011. 2013.

  10. Child Maltreatment. Children’s Bureau | ACF [Internet]. 2020. [cited 2020 Feb 11]. Available from:

  11. Davidov DM, Larrabee H, Davis SM. United States emergency department visits coded for intimate partner violence. J Emerg Med. 2015;48(1):94–100.

  12. Edwards KM. Intimate partner violence and the rural–urban–suburban divide. Trauma Violence Abuse. 2015;16(3):359–73.

  13. Feldman J, Gruskin S, Coull B, Krieger N. Quantifying underreporting of law-enforcement-related deaths in United States vital statistics and news-media-based data sources: a capture–recapture analysis. PLOS Med. 2017;14:e1002399.

  14. Finkelhor D, Jones LM, Shattuck AM. Updated trends in child maltreatment, 2012. Durham, NH; 2013a.

  15. Finkelhor D, Turner HA, Shattuck A, Hamby SL. Violence, crime, and abuse exposure in a national sample of children and youth an update. JAMA Pediatr. 2013b;167(7):614–21.

  16. Finkelhor D, Shattuck A, Turner HA, Hamby SL. Trends in children’s exposure to violence, 2003 to 2011. JAMA Pediatr. 2014;168(6):540–6.

  17. Gibson T, Casto A, Young J, Karnell L, Coenen N. Impact of ICD-10-CM/PCS on Research using administrative databases [Internet]. U.S. Agency for Healthcare Research and Quality; 2016. (HCUP Methods Series Report # 2016-02). Available from:

  18. Gironda MW, Nguyen AL, Mosqueda LM. Is this broken bone because of abuse? Characteristics and comorbid diagnoses in older adults with fractures. J Am Geriatr Soc. 2016;64(8):1651–5.

  19. Gray BJ, Barton ER, Davies AR, Long SJ, Roderick J, Bellis MA. A shared data approach more accurately represents the rates and patterns of violence with injury assaults. J Epidemiol Community Health. 2017;71(12):1218–24.

  20. Halpern LR, Parry BA, Hayward G, Peak D, Dodson TB. A comparison of 2 protocols to detect intimate partner violence. J Oral Maxillofac Surg. 2009;67:1453–9.

  21. Hooft AM, Asnes AG, Livingston N, Deutsch S, Cahill L, Wood JN, et al. The accuracy of ICD codes: identifying physical abuse in 4 children’s hospitals. Acad Pediatr. 2015;15(4):444–50.

  22. Hymel KP, Laskey AL, Crowell KR, Wang M, Armijo-Garcia V, Frazier TN, et al. Racial and ethnic disparities and bias in the evaluation and reporting of abusive head trauma. J Pediatr. 2018;198:137–44.

  23. Injury Data and Resources. ICD Injury Matrices [Internet]. Centers for Disease Control and Prevention. 2015 [cited 2020 Mar 11]. Available from:

  24. Injury Data and Resources. ICD Injury Matrices [Internet]. 2019. [cited 2021 Jan 30]. Available from:

  25. Keenan HT, Campbell KA, Page K, Cook LJ, Bardsley T, Olson LM. Perceived social risk in medical decision-making for physical child abuse: a mixed-methods study. BMC Pediatr. 2017;17(1):214.

  26. Lachs MS, Williams CS, O’Brien S, Hurst L, Kossack A, Siegal A, et al. ED use by older victims of family violence. Ann Emerg Med. 1997;30(4):448–54.

  27. Lane WG, Rubin DM, Monteith R, Christian CW. Racial differences in the evaluation for physical abuse. JAMA. 2002;288(13):1603–9.

  28. Lash TL. The harm done to reproducibility by the culture of null hypothesis significance testing. Am J Epidemiol. 2017;186(6):627–35.

  29. Lipsky S, Caetano R, Roy-Byrne P. Racial and ethnic disparities in police-reported intimate partner violence and risk of hospitalization among women. Women’s Health Issues. 2009;19(2):109–18.

  30. Lipsky S, Cristofalo M, Reed S, Caetano R, Roy-Byrne P. Racial and ethnic disparities in police-reported intimate partner violence perpetration: a mixed methods approach. J Interpers Violence. 2012;27(11):2144–62.

  31. Lloyd SS, Rissing JP. Physician and coding errors in patient records. JAMA J Am Med Assoc. 1985;254(10):1330–6.

  32. Maguire-Jack K, Cao Y, Yoon S. Racial disparities in child maltreatment: the role of social service availability. Child Youth Serv Rev. 2018;86:49–55.

  33. Minnesota Compass. Demographics : overview [Internet]. Wilder Research. 2020. [cited 2020 Mar 12]. Available from:

  34. McKenzie K, Scott DA, Waller GS, Campbell M. Reliability of routinely collected hospital data for child maltreatment surveillance. BMC Public Health. 2011.

  35. Mesic A, Franklin L, Cansever A, Potter F, Sharma A, Knopov A, et al. The relationship between structural racism and black–white disparities in fatal police shootings at the state level. J Natl Med Assoc. 2018;110(2):106–16.

  36. Morgan RE, Truman JL. Criminal victimization, 2019. 2020. p. 53. Report No.: NCJ 255113.

  37. Morgan R, Kena G. Criminal victimization, 2016. Bureau Justice Stat. 2017;2016:1–24.

  38. Morgan RE, Mason BJ. Crimes against the elderly, 2003–2013. 2014.

  39. Muldoon K, Smith G, Talarico R, Heimerl M, McLean C, Sampsel K, et al. A 15-year population-based investigation of sexual assault cases across the province of Ontario, Canada, 2002–2016. Am J Public Health. 2019;109(9):1280–7.

  40. Myers SL Jr. Why are crimes underreported? What is the crime rate? Does it really matter? Soc Sci Q. 1980;61(1):23–43.

  41. Nannini A, Lazar J, Berg C, Barger M, Tomashek K, Cabral H, et al. Physical injuries reported on hospital visits for assault during the pregnancy-associated period. Nurs Res. 2008;57(3):144–9.

  42. National Cancer Institute. SEER Data & Software [Internet]. 2021. [cited 2021 Mar 1]. Available from:

  43. Perciaccante VJ, Ochs HA, Dodson TB. Head, neck, and facial injuries as markers of domestic violence in women. J Oral Maxillofac Surg. 1999;57(7):760–2.

  44. Perciaccante VJ, Carey JW, Susarla SM, Dodson TB. Markers for intimate partner violence in the emergency department setting. J Oral Maxillofac Surg. 2010;68(6):1219–24.

  45. Petridou E, Browne A, Lichter E, Dedoukou X, Alexe D, Dessypris N. What distinguishes unintentional injuries from injuries due to intimate partner violence: a study in Greek ambulatory care settings. Injury Prev. 2002;8:197–201.

  46. Planty M, Langston L, Barnett-Ryan C. The nation’s two crime measures. The Bureau of Justice Statistics of the US Department of Justice. 2014.

  47. Powers RA, Kaukinen CE. Trends in intimate partner violence: 1980–2008. J Interpers Violence. 2012;27(15):3072–90.

  48. Reis BY, Kohane IS, Mandl KD. Longitudinal histories as predictors of future diagnoses of domestic abuse: modelling study. BMJ. 2009;339:1–9.

  49. Rochelle S, Buonanno L. Charting the attitudes of county child protection staff in a post-crisis environment. Child Youth Serv Rev. 2018;86(2017):166–75.

  50. Rosen T, Clark S, Bloemen EM, Mulcare MR, Stern ME, Hall JE, et al. Geriatric assault victims treated at U.S. trauma centers: five-year analysis of the national trauma data bank. Injury. 2016;47(12):2671–8.

  51. Schafer SD, Drach LL, Hedberg K, Kohn MA. Using diagnostic codes to screen for intimate partner violence in Oregon emergency departments and hospitals. Public Health Rep. 2008;123(5):628–35.

  52. Schnitzer PG, Slusher PL, Kruse RL, Tarleton MM. Identification of ICD codes suggestive of child maltreatment. Child Abuse Negl. 2011;35(1):3–17.

  53. Scott D, Tonmyr L, Fraser J, Walker S, McKenzie K. The utility and challenges of using ICD codes in child maltreatment research: a review of existing literature. Child Abuse Neglect. 2009;33(11):791–808.

  54. Shepherd J, Sivarajasingam V. Injury research explains conflicting violence trends. Injury Prev. 2005;11(6):324–5.

  55. Sheridan DJ, Nash KR. Acute injury patterns of intimate partner violence victims. Trauma Violence Abuse. 2007;8(3):281–9.

  56. Sommers BD, Simon K. Health insurance and emergency department use—a complex relationship. N Engl J Med. 2017;376(18):1708–11.

  57. Sumner SA, Mercy JA, Dahlberg LL, Hillis SD, Klevens J, Houry D. Violence in the United States: status, challenges, and opportunities. JAMA J Am Med Assoc. 2015;314(5):478–88.

  58. Thomas SK, Brooks SE, Mullins CD, Baquet CR, Merchant S. Use of ICD-9 coding as a proxy for stage of disease in lung cancer. Pharmacoepidemiol Drug Saf. 2002;11(8):709–13.

  59. United States Census Bureau (2020) When to Use 1-year, 3-year, or 5-year Estimates [Internet]. 2019 [cited 2020 Jun 30]. Available from:

  60. US Census Bureau. Decennial Census by Decades [Internet]. The United States Census Bureau. 2021a. [cited 2021 Sep 6]. Available from:

  61. US Census Bureau. American Community Survey (ACS) [Internet]. The United States Census Bureau. 2021b. [cited 2021 May 4]. Available from:

  62. Voigt R, Camp NP, Prabhakaran V, Hamilton WL, Hetey RC, Griffiths CM, et al. Language from police body camera footage shows racial disparities in officer respect. Proc Natl Acad Sci USA. 2017;114(25):6521–6.

  63. Wiglesworth A, Austin R, Corona M, Mosqueda L. Bruising as a marker of physical elder abuse. J Am Geriatr Soc. 2009;57(7):1191–6.

  64. Wildeman C, Emanuel N, Leventhal JM, Putnam-Hornstein E, Waldfogel J, Lee H. The prevalence of confirmed maltreatment among US children, 2004 to 2011. JAMA Pediatr. 2014;168(8):706–13.

  65. Wu V, Huff H, Bhandari M. Pattern of physical injury associated with intimate partner violence in women presenting to the emergency department: a systematic review and meta-analysis. Trauma Violence Abuse. 2010;11(2):71–82.


Not applicable.


The authors gratefully acknowledge support from the Minnesota Population Center (P2C HD041023) and the Interdisciplinary Population Health Science Training Program (T32HD095134). Both are funded by the Eunice Kennedy Shriver National Institute for Child Health and Human Development (NICHD).

Author information




NJ Santaularia completed the analyses, writing, and conceptualization of the study. MRR made substantial contributions to the conceptualization of the study and interpretation of the results. TLO supported data interpretation and provided a substantial theoretical framework. SMM oversaw the analyses, writing, and conceptualization of the study. All authors read and approved the final manuscript.

Corresponding author

Correspondence to N. Jeanie Santaularia.

Ethics declarations

Ethics approval and consent to participate

Not applicable. As a secondary study of de-identified county-level data, this research does not involve individual human subjects.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Appendix Table S1: Proxy Codes Used to Define Violence. Appendix Table S2: Fully Adjusted Negative Binomial Regression with GEE: Rate Ratio for the Association Between All County Level Socio-demographic Characteristics and Subtypes of Proxy Injury Codes.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Santaularia, N.J., Ramirez, M.R., Osypuk, T.L. et al. Measuring the hidden burden of violence: use of explicit and proxy codes in Minnesota injury hospitalizations, 2004–2014. Inj. Epidemiol. 8, 63 (2021).


  • Violent injury
  • Surveillance
  • Hospital data
  • Child abuse
  • Intimate partner violence
  • Elder abuse