Purpose
- This page explains the data sources and methods CDC uses to display data on the Mapping Injury, Overdose, and Violence Dashboard.
- Important technical notes and information about limitations are also provided.
Data on deaths
Data on deaths come from the National Vital Statistics System (NVSS), compiled by CDC's National Center for Health Statistics. NVSS contains information from death certificates completed by state vital records offices. The system includes causes of death reported by attending physicians, medical examiners, and coroners. NVSS data in the dashboard show where a person lived, which may not be the same as where their death occurred.
Types of data
Provisional and final data
The dashboard displays both provisional and final data on deaths. A blue box will appear in the dashboard indicating when the data is provisional.
Provisional data is preliminary: it is not final and may not yet be complete.
Provisional data is subject to change as information continues to be collected and submitted to CDC. Because some causes of death take longer for states to certify than others, the dashboard displays data with a delay appropriate to each cause of death. Drug overdose death data received from states is 90% complete within six months of when the deaths occurred and is available with a lag of six months. Data on the other causes of death shown in the dashboard, suicide and homicide, is more than 98% complete within four months of when the deaths occurred and is available with a lag of four months. However, the method of suicide also affects reporting time: suicides involving drug overdose typically take longer to report than suicides by other methods.
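As a rough illustration of the reporting lags described above, the sketch below estimates the most recent month of deaths that could be expected in the dashboard on a given date. The function name, the dictionary, and the treatment of a lag as a whole number of calendar months are assumptions made for this example; the dashboard's actual update schedule may differ.

```python
from datetime import date

# Lags in months as stated in the text; the dictionary itself is illustrative.
LAG_MONTHS = {"drug overdose": 6, "suicide": 4, "homicide": 4}


def latest_available_month(as_of: date, cause: str) -> tuple[int, int]:
    """Return (year, month) of the most recent deaths expected to be shown.

    Assumption: a lag of N months means subtracting N calendar months
    from the month of `as_of`.
    """
    lag = LAG_MONTHS[cause]
    # Count months since year 0 so the subtraction can cross year boundaries.
    months = as_of.year * 12 + (as_of.month - 1) - lag
    return months // 12, months % 12 + 1
```

For example, under these assumptions a viewer in March 2025 would expect drug overdose data through September 2024 and suicide or homicide data through November 2024.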
The completeness of provisional data can vary from state to state. Occasionally a state may submit a large batch of data, which can cause a shift in numbers. Because CDC continually receives new provisional data, there could be small differences between the dashboard and other CDC data systems, like CDC WONDER, depending on when updates are made. More information on provisional data is available through Technical Notes for Provisional Mortality.
Data is considered final after CDC has received all records from states and has fully reviewed them for completeness and quality. CDC certifies data from a given calendar year as final once a year.
How the dashboard displays data
The dashboard shows death data as counts (number of deaths) and rates per 100,000 people at national, state, and county levels and for census tracts or groups of census tracts. A rate expresses how often something happens in a group of a defined size, such as deaths per 100,000 people. More information on grouped census tracts and modeled rates is below.
The dashboard shows the number of deaths in each area. When there are fewer than 10 deaths in an area, the dashboard will display the deaths as a range "1-9." This helps protect the identity of people who have died. If there are no deaths in an area, the dashboard will display zero.
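The display rule for counts can be sketched as a small function. The function name is hypothetical; only the thresholds come from the text above.

```python
def display_count(deaths: int) -> str:
    """Format a death count the way the text describes the display."""
    if deaths < 0:
        raise ValueError("deaths must be non-negative")
    if deaths == 0:
        return "0"      # areas with no deaths show zero
    if deaths < 10:
        return "1-9"    # small counts are shown only as a range
    return str(deaths)  # 10 or more: the exact count is shown
```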
When the number of deaths in an area is 10 or more, the rate is calculated by dividing the number of deaths by the number of people living in that area and multiplying by 100,000. Areas of the map are colored based on their percentile ranking for the rate of deaths for the selected time period. Percentile rankings compare an area's rate to others; areas with higher percentile rankings have higher rates. For example, counties in the 95.1-100th percentile for 2023 had death rates that were higher than 95% of counties in that year. Information on the number of people living in an area comes from the following U.S. Census Bureau sources:
- For national, state, and county populations used to calculate death rates, the dashboard uses population measures from the 10-year census in years when available and U.S. Census Bureau estimates in years following a census.
- For rate calculations in census tracts, the dashboard uses population estimates from the U.S. Census Bureau's American Community Survey 5-Year Data estimates.
Population estimates can have a lag of one or more years. The dashboard uses the most recent estimates available for a given year, and changes in rates may occur as newer population estimates become available.
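The rate and percentile ideas above can be illustrated with a short sketch. The function names are hypothetical, and the percentile definition used here (the share of areas with a strictly lower rate) is an assumption; the source does not say exactly how the dashboard computes percentile rankings.

```python
from typing import Optional, Sequence


def crude_rate_per_100k(deaths: int, population: int) -> Optional[float]:
    """Deaths per 100,000 residents, or None when fewer than 10 deaths.

    The text describes direct calculation only for 10 or more deaths.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if deaths < 10:
        return None
    return deaths * 100_000 / population


def percentile_rank(rate: float, all_rates: Sequence[float]) -> float:
    """Percent of areas whose rate is strictly lower than `rate` (assumed definition)."""
    below = sum(1 for r in all_rates if r < rate)
    return 100.0 * below / len(all_rates)
```

For example, 25 deaths among 50,000 residents gives a rate of 50 per 100,000.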
When there are only a few deaths in an area (1-9), CDC uses a Bayesian model to estimate rates. A Bayesian model is a type of statistical model often used in geographic analysis. This model can improve stability of the rates in small population areas and protect privacy by taking into account information from neighboring areas.
Statistical modeling can provide more stable estimates because even one death can cause a large change in the rate for an area with a small population. The dashboard indicates when rates are modeled in both the text and a blue box below the map.
The data table below the map lists 95% credible intervals for all modeled rates. A 95% credible interval is the range that, under the model, has a 95% probability of containing the rate. Credible intervals vary in width and can be wider in areas with small numbers of deaths or small populations. They can also be wide in places where rates differ substantially from those of neighboring areas. Suppression criteria are applied to ensure that modeled rates are stable (see unstable rates and suppressed data below).
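To show how a Bayesian estimate can stabilize a small-area rate and produce a credible interval, here is a deliberately simplified sketch. It is not CDC's model: it uses a non-spatial gamma-Poisson model in which a prior rate (which could, for instance, be taken from surrounding areas) stands in for information from neighbors. The function name, the `prior_strength` pseudo-population, and the sampling approach are all assumptions made for illustration.

```python
import random


def modeled_rate(deaths, population, prior_rate_per_100k,
                 prior_strength=50_000, draws=20_000, seed=0):
    """Posterior mean and 95% credible interval, per 100,000 people.

    Prior on the per-person death rate: Gamma(shape=a, rate=b), where
    b = prior_strength (a pseudo-population) and a / b is the prior rate.
    With Poisson-distributed deaths the posterior is
    Gamma(a + deaths, b + population).
    """
    b = prior_strength
    a = prior_rate_per_100k / 100_000 * b
    post_shape = a + deaths
    post_rate = b + population
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    samples = sorted(
        rng.gammavariate(post_shape, 1.0 / post_rate) * 100_000
        for _ in range(draws)
    )
    mean = post_shape / post_rate * 100_000
    lower = samples[int(0.025 * draws)]       # 2.5th percentile of draws
    upper = samples[int(0.975 * draws) - 1]   # 97.5th percentile of draws
    return mean, lower, upper
```

With 3 deaths among 2,000 residents, the crude rate would be 150 per 100,000. Given a prior rate of 20 per 100,000, this sketch pulls the estimate toward the prior, and the interval widens as the population or death count shrinks.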
Bayesian models are used in other CDC dashboards, such as CDC's Atlas of Heart Disease and Stroke.
Death data is displayed by the census tract where the person who died lived, which might not be the location where the person died. This location information comes from the person's death certificate and is assigned to a census tract through a process called geocoding. Some census tracts have small populations, so the dashboard combines each such tract with the census tracts closest to it until the group has a population of at least 10,000 people. Grouping helps to protect privacy and to increase stability when calculating rates.
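The grouping idea can be sketched as a greedy merge. This is only one possible reading of the description: the source does not specify the order of merging or how closeness is measured, so the use of centroid coordinates, straight-line distance, and smallest-group-first ordering are assumptions, as are all names in the code.

```python
import math
from dataclasses import dataclass


@dataclass
class Group:
    tract_ids: list
    population: int
    x: float  # centroid coordinates (hypothetical planar units)
    y: float


def group_tracts(tracts, min_population=10_000):
    """Merge tracts with their nearest neighbor until each group is large enough.

    `tracts` is a list of (tract_id, population, x, y) tuples.
    """
    groups = [Group([tid], pop, x, y) for tid, pop, x, y in tracts]
    while len(groups) > 1:
        small = [g for g in groups if g.population < min_population]
        if not small:
            break
        g = min(small, key=lambda s: s.population)  # smallest group first
        others = [o for o in groups if o is not g]
        nearest = min(others, key=lambda o: math.hypot(o.x - g.x, o.y - g.y))
        total = g.population + nearest.population
        if total > 0:
            # Population-weighted centroid of the merged group.
            x = (g.x * g.population + nearest.x * nearest.population) / total
            y = (g.y * g.population + nearest.y * nearest.population) / total
        else:
            x, y = (g.x + nearest.x) / 2, (g.y + nearest.y) / 2
        merged = Group(g.tract_ids + nearest.tract_ids, total, x, y)
        groups = [o for o in others if o is not nearest] + [merged]
    return groups
```

If every tract in the input together holds fewer than 10,000 people, this sketch returns a single group that remains below the threshold.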
In some areas of the country, particularly rural areas with smaller populations, the combined census tracts represent areas larger than counties. You can compare the county and census tract views to see which provides more detailed information.
CDC began collecting geocoded death certificate data in 2022. CDC has used census tract estimates in other projects, like the U.S. Small-Area Life Expectancy Estimates Project.
In a small number of areas, rates are not shown (suppressed) because a stable rate cannot be estimated. Specifically, if a modeled rate has led to a shift larger than one step in the percentile category used in the legend, the rate is considered unstable. The map legend and chart notes indicate where rates are unstable and not shown.
Rarely, the number of deaths in an entire state is fewer than 10 and the exact number is not shown. The dashboard also does not show modeled rates for these states or locations within these states.