Interpreting Race and Ethnicity in Cancer Data

What to know

Data published in this report may be underestimated for Asian and Pacific Islander, American Indian and Alaska Native, and Hispanic people. This may be due to racial and Hispanic origin misclassification.

Overview

The North American Association of Central Cancer Registries (NAACCR) Race and Ethnicity Identifier Assessment Project confirmed the importance of publishing cancer rates by race and ethnicity (specifically, Hispanic origin).1

Cancer incidence

When reporting cancer incidence, race and ethnicity information is abstracted from medical records and grouped into race and ethnicity categories.2 Although registries use standardized data items and codes for both race and ethnicity, the initial collection of this information by health care facilities and practitioners and the procedures for assigning and verifying codes for race and ethnicity are not well standardized.1 Thus, some inconsistency is expected in this information.

Cancer mortality

When reporting cancer mortality, race and Hispanic origin are recorded separately on the death certificate by the funeral director as provided by an informant or on the basis of observation.3 Inconsistencies in the collection and coding of data on race and Hispanic origin and their effect on mortality statistics have been described.4

Effects of misclassification

The net effect of misclassification is greatest for American Indian and Alaska Native people. Misclassification is smaller for Asian and Pacific Islander people and Hispanic people, and minimal for Black people and White people. Therefore, incidence and mortality data published in this report may be underestimated for Asian and Pacific Islander, American Indian and Alaska Native, and Hispanic people. This may be due to racial and Hispanic origin misclassification.

Improving data accuracy

In the U.S. Cancer Statistics Data Visualizations tool, we have restricted analysis to non-Hispanic populations to overcome racial misclassification and to represent populations more accurately. CDC's National Center for Health Statistics is working with states to improve the reporting of race and ethnicity on death certificates.

The Data Visualizations tool presents cancer incidence data for all races combined and by bridged race and ethnicity (non-Hispanic White, non-Hispanic Black, non-Hispanic American Indian and Alaska Native, non-Hispanic Asian and Pacific Islander, and Hispanic) categories. Starting with 2018 deaths, mortality data are presented for all races combined and by single-race race and ethnicity (non-Hispanic White, non-Hispanic Black, non-Hispanic American Indian and Alaska Native, non-Hispanic Asian, non-Hispanic Native Hawaiian or Other Pacific Islander, and Hispanic) categories. Use caution when comparing with data in Archived Reports, because differences in rates are due to changes in data presentation and not to changes in the underlying rates. Also use caution when comparing bridged race and ethnicity incidence data to single-race race and ethnicity mortality data, because single-race and bridged-race data are not considered directly comparable.5 Puerto Rico data are available only for all races and ethnicities combined.

Asian and Pacific Islander people

Central cancer registries have codes for race that allow them to document the occurrence of cancer in 25 Asian and Pacific Islander subpopulations.2 But the subpopulations are grouped into a single Asian and Pacific Islander category because of small numbers and concerns about possible misclassification. The Asian and Pacific Islander category is restricted to non-Hispanic persons.

Studies show excellent agreement (k=0.90) between Asian and Pacific Islander race in Surveillance, Epidemiology, and End Results (SEER) registry data and self-reported data from the U.S. Census.6 Studies have also examined the misclassification of race for Asian and Pacific Islander subpopulations in cancer registries.6,7,8,9 Nearly all National Program of Cancer Registries (NPCR) and SEER registries reassigned cases coded as Asian, not otherwise specified, to a more specific Asian race through the standardized use of the NAACCR Asian and Pacific Islander Identification Algorithm (NAPIIA) version 1.2.
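The agreement statistic cited above (written k) can be illustrated with a small calculation, assuming it refers to Cohen's kappa for two sources that each classify the same people as belonging or not belonging to a group. The counts below are hypothetical and are not taken from any registry or census data; the function name is also illustrative.

```python
def cohens_kappa(both_yes: int, only_a: int, only_b: int, both_no: int) -> float:
    """Cohen's kappa for a 2x2 agreement table between sources A and B."""
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n
    # Agreement expected by chance, from each source's marginal totals.
    a_yes, a_no = both_yes + only_a, only_b + both_no
    b_yes, b_no = both_yes + only_b, only_a + both_no
    expected = (a_yes * b_yes + a_no * b_no) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 100 people classified by a registry (A) and by self-report (B).
kappa = cohens_kappa(both_yes=40, only_a=10, only_b=5, both_no=45)
print(f"kappa = {kappa:.2f}")  # approximately 0.70
```

Kappa discounts the agreement that would occur by chance, which is why it can be noticeably lower than a simple percent agreement computed from the same table (85% in this hypothetical example).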

For cases reported in 1999, Kansas opted not to present state- and county-specific counts and rates for non-Hispanic Asian and Pacific Islander persons. The national rates presented include data for Kansas.

A study reported 90% agreement between Asian and Pacific Islander race reported on death certificates and self-reported data from the U.S. Census.4

Hispanic people

The overall agreement between Hispanic ethnicity collected by SEER registries and self-reported ethnicity from the U.S. Census was substantial (k=0.61). Hispanic people were found to be underclassified in the SEER data compared to self-reports.6 Nearly all NPCR and SEER registries assigned Hispanic ethnicity through the standardized use of the NAACCR Hispanic Identification Algorithm (NHIA) version 2 (NHIA v2).10 After applying the NHIA v2, cases not classified as Hispanic are classified as non-Hispanic, leaving no cases with unknown Hispanic status.

A study reported an 88% record-by-record agreement between Hispanic origin on death certificates and self-reported data.4

Death counts and rates for Hispanic people are presented at the national and state levels for all 50 states and the District of Columbia. Hispanic origin is assigned to cancer mortality data on the basis of information collected from death certificates.

Improving estimation of cancer burden among American Indian and Alaska Native people

More American Indian and Alaska Native patients are misclassified as another race in cancer registry records than patients in other racial groups. Studies have found that this racial misclassification contributes to underestimates of cancer incidence and death rates among the American Indian and Alaska Native population.4,11 Accurate determination of disease burden is a critical first step toward identifying health disparities. Methods that can improve the accuracy of cancer burden estimates among the American Indian and Alaska Native population are described below.

Linkage with Indian Health Service administrative records

The Indian Health Service (IHS) provides medical services to American Indian and Alaska Native people who are enrolled members of federally recognized tribes. The IHS provides health care to about 2.2 million people, a number equivalent to about 64% of the U.S. American Indian and Alaska Native population.11 While IHS coverage of these populations varies by region, it does not include American Indian and Alaska Native people who are members of non-federally recognized tribes, and underrepresents those who live in certain urban areas. People who are eligible to receive IHS services have sufficient native ancestry in a federally recognized tribe to be classified accurately as an American Indian or Alaska Native person.

As a standard practice, central cancer registries classify race as coded in the medical record. To address American Indian and Alaska Native misclassification in cancer registry data, selected registries in CDC's NPCR and all registries in the National Cancer Institute's SEER program linked their central cancer registry data to the IHS administrative records database for cases diagnosed from 1995 to 2021 and 1988 to 2021, respectively. Results of the linkage were captured in the data element, IHS Link (NAACCR data item 192).2 Central cancer registries include race and IHS Link in their annual data submissions to CDC or NCI. Using the race and IHS Link data elements, CDC and NCI created a recoded race variable. If a cancer case had an IHS Link value that indicated a match to IHS and its race was coded as White, other, or unknown, then the recoded race variable was coded as American Indian and Alaska Native.
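The recoding rule described above can be sketched in a few lines. This is a minimal illustration, not the code CDC or NCI use: the text labels, the boolean match flag, and the function name are assumptions made for readability, whereas registry submissions carry coded NAACCR values.

```python
# Hypothetical text labels; real registry data use coded NAACCR values.
AIAN = "American Indian and Alaska Native"
RECODE_ELIGIBLE_RACES = {"White", "Other", "Unknown"}


def recode_race(race: str, ihs_link_matched: bool) -> str:
    """Apply the rule in the text: an IHS match plus White, other, or
    unknown race yields American Indian and Alaska Native; otherwise the
    originally coded race is kept."""
    if ihs_link_matched and race in RECODE_ELIGIBLE_RACES:
        return AIAN
    return race


# Hypothetical cases: (race as coded in the registry, matched to IHS?)
cases = [("White", True), ("White", False), ("Unknown", True), ("Black", True)]
for race, matched in cases:
    print(race, matched, "->", recode_race(race, matched))
```

In this sketch, a case coded with any race outside the three eligible values keeps its original code even when it matches IHS records, which follows the rule as worded above.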

Although the linkage with IHS does not completely resolve the classification of race for American Indian and Alaska Native cases, it helps provide a more comprehensive and accurate picture of the cancer burden in this population.

Restriction to non-Hispanic populations

Updated bridged intercensal population estimates significantly overestimated the number of American Indian and Alaska Native persons of Hispanic origin.11,12,13 Because these population estimates are used as denominators in rate calculations, larger than expected denominators can result in underestimation of rates. Studies demonstrate that restricting analysis to non-Hispanic populations can improve the accuracy of cancer incidence and death rate estimates among American Indian and Alaska Native people.13
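A short calculation shows how an inflated denominator lowers a rate. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they do not represent actual case counts or population estimates, and age adjustment is ignored.

```python
def rate_per_100k(case_count: int, population: int) -> float:
    """Crude rate per 100,000 population (no age adjustment)."""
    return case_count * 100_000 / population


case_count = 500                    # hypothetical number of cases
true_population = 100_000           # hypothetical true population
overestimated_population = 125_000  # same population, overestimated by 25%

true_rate = rate_per_100k(case_count, true_population)
biased_rate = rate_per_100k(case_count, overestimated_population)

print(true_rate)    # 500.0
print(biased_rate)  # 400.0
```

With the same number of cases, a denominator that is 25% too large produces a rate that is 20% too low in this example.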

Restriction to IHS Purchased/Referred Care Delivery Areas

The IHS Purchased/Referred Care Delivery Area (PRCDA) is the geographic area within which the IHS makes purchased or referred care available to members of an identified Indian community who reside in the area. The IHS uses it to determine eligibility for services not directly available within the IHS.

The IHS PRCDA consists of counties that include all or part of an American Indian or Alaska Native reservation or have a common boundary with a federally recognized tribal land, as defined in the October 14, 2020 Federal Register (82 FR 47004). There are 36 states that have at least one PRCDA-designated county. The PRCDA counties have higher proportions of American Indian and Alaska Native people in relation to the total population than non-PRCDA counties, with 53.5% of the U.S. American Indian and Alaska Native population residing in the 685 counties designated as PRCDA. Linkage studies have indicated more accurate race classification for American Indian and Alaska Native persons in PRCDA counties.11,13,14,15,16,17

Data on American Indian and Alaska Native people in the U.S. Cancer Statistics Data Visualizations tool

The U.S. Cancer Statistics Data Visualizations tool presents national, state, and county data by race, including non-Hispanic American Indian and Alaska Native people. The national data include non-Hispanic American Indian and Alaska Native populations in all U.S. counties. These data use the results from the linkage with IHS to classify race, and are restricted to non-Hispanic people only. As described in the section above, these restrictions can improve the accuracy of cancer burden estimates among the American Indian and Alaska Native population.

State- and county-specific data for non-Hispanic American Indian and Alaska Native persons are not presented for states that opted not to present these data: Illinois, Kansas, New Jersey, and New York.

Data on American Indian and Alaska Native people in the At a Glance section

In the At a Glance section of the U.S. Cancer Statistics Data Visualizations tool, the module for American Indian and Alaska Native people restricted to PRCDA counties presents data from the United States Cancer Statistics American Indian and Alaska Native Incidence Analytic Database (USCS AIAD). This database uses the three methods described above to improve the accuracy of cancer burden estimates among American Indian and Alaska Native people:

  • First, this database uses the recoded race variable to classify race. Only people of American Indian and Alaska Native race or White race (as comparison) are included in the module.
  • Second, the database is restricted to persons of non-Hispanic origin.
  • Third, the database is restricted to persons residing in PRCDA counties.
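The three restrictions listed above can be expressed as a single inclusion test. This is a sketch under stated assumptions: the record fields, the short race labels, and the county identifiers are hypothetical placeholders, not the actual USCS AIAD schema or a real list of PRCDA counties.

```python
from dataclasses import dataclass


@dataclass
class Case:
    recoded_race: str  # race after applying the IHS linkage recode
    hispanic: bool     # True if classified as Hispanic
    county: str        # county identifier for place of residence


INCLUDED_RACES = {"AIAN", "White"}            # White serves as the comparison group
PRCDA_COUNTIES = {"county_a", "county_b"}     # placeholder identifiers


def in_prcda_module(case: Case) -> bool:
    """Return True only if the case meets all three restrictions."""
    return (
        case.recoded_race in INCLUDED_RACES
        and not case.hispanic
        and case.county in PRCDA_COUNTIES
    )


cases = [
    Case("AIAN", False, "county_a"),   # meets all three restrictions
    Case("AIAN", True, "county_a"),    # excluded: Hispanic origin
    Case("AIAN", False, "county_z"),   # excluded: not a PRCDA county
    Case("Black", False, "county_b"),  # excluded: race not in the module
]
print([in_prcda_module(c) for c in cases])  # [True, False, False, False]
```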

This database includes data elements specific to the American Indian and Alaska Native population, such as IHS Region and PRCDA county.

The USCS AIAD data can be displayed for all IHS regions combined or by six IHS regions: Alaska, Pacific Coast, Southwest, Northern Plains, Southern Plains, and East. The states grouped by IHS region are:

  • Alaska: Alaska.
  • Pacific Coast: California, Idaho, Oregon, and Washington.
  • Southwest: Arizona, Colorado, Nevada, New Mexico, and Utah.
  • Northern Plains: Indiana, Iowa, Michigan, Minnesota, Montana, Nebraska, North Dakota, South Dakota, Wisconsin, and Wyoming.
  • Southern Plains: Kansas, Oklahoma, and Texas.
  • East: Alabama, Connecticut, Florida, Louisiana, Massachusetts, Maine, Mississippi, New York, North Carolina, Pennsylvania, Rhode Island, South Carolina, and Virginia.
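For analysis, the grouping above is convenient to hold as a lookup table. The sketch below copies the state lists exactly as given in this section; the variable and function names are illustrative, and states that do not appear in the list return no region.

```python
IHS_REGIONS = {
    "Alaska": ["Alaska"],
    "Pacific Coast": ["California", "Idaho", "Oregon", "Washington"],
    "Southwest": ["Arizona", "Colorado", "Nevada", "New Mexico", "Utah"],
    "Northern Plains": [
        "Indiana", "Iowa", "Michigan", "Minnesota", "Montana",
        "Nebraska", "North Dakota", "South Dakota", "Wisconsin", "Wyoming",
    ],
    "Southern Plains": ["Kansas", "Oklahoma", "Texas"],
    "East": [
        "Alabama", "Connecticut", "Florida", "Louisiana", "Massachusetts",
        "Maine", "Mississippi", "New York", "North Carolina", "Pennsylvania",
        "Rhode Island", "South Carolina", "Virginia",
    ],
}

# Invert the grouping so a state name maps directly to its region.
STATE_TO_REGION = {
    state: region for region, states in IHS_REGIONS.items() for state in states
}


def region_for_state(state: str):
    """Return the IHS region for a listed state, or None if it is not listed."""
    return STATE_TO_REGION.get(state)


print(region_for_state("Oklahoma"))  # Southern Plains
print(len(STATE_TO_REGION))          # 36
```

The lookup contains 36 states, which agrees with the number of states reported above as having at least one PRCDA-designated county.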

The percentages of the American Indian and Alaska Native population living in PRCDA-designated counties by IHS region from 2017 to 2021 were:

  • Alaska: 100%.
  • Pacific Coast: 61.0%.
  • Southwest: 86.2%.
  • Northern Plains: 53.9%.
  • Southern Plains: 55.7%.
  • East: 17.6%.
  • Total United States: 53.5%.

Studies have shown substantial variation in rates in the American Indian and Alaska Native population by IHS region.18,19 IHS regions have been presented in several publications focusing on American Indian and Alaska Native people. This approach was determined to be preferable to the use of smaller jurisdictions, such as IHS Administrative Areas, which yielded less stable estimates.14,20,21,22

  1. O'Malley C, Hu KU, West DW. North American Association of Central Cancer Registries: Race and Ethnicity Identifier Assessment Project. Springfield (IL): North American Association of Central Cancer Registries; 2001.
  2. Havener L, Hulstrom D. Standards for Cancer Registries Vol. II: Data Standards and Data Dictionary. 10th ed., version 11. Springfield (IL): North American Association of Central Cancer Registries; 2004.
  3. Miniño AM, Heron MP, Smith BL, Kochanek K. Deaths: final data for 2004. Natl Vital Stat Rep. 2007;55(19).
  4. Arias E, Heron M, Hakes JK. The validity of race and Hispanic-origin reporting on death certificates in the United States: an update. Vital Health Stat 2. 2016;2(172).
  5. Heron MP. Comparability of race-specific mortality data based on 1977 versus 1997 reporting standards. Natl Vital Stat Rep. 2021;70(3):1–31.
  6. Clegg LX, Reichman ME, Hankey BF, et al. Quality of race, Hispanic ethnicity, and immigrant status in population-based cancer registry data: implications for health disparity studies. Cancer Causes Control. 2007;18(2):177–187.
  7. NAACCR Race and Ethnicity Work Group. NAACCR Asian Pacific Islander Identification Algorithm [NAPIIA v1.2.1]. Springfield (IL): North American Association of Central Cancer Registries; 2008.
  8. Boscoe FP. Issues with the coding of Asian race in cancer registration. J Registry Manag. 2007;34(4):135–139.
  9. Boscoe FP, Schymura MJ, Hsieh M, Williams MA, Henry KA. Issues with the coding of Pacific Islanders in central cancer registries. J Registry Manag. 2008;35(2):47–51.
  10. NAACCR Race and Ethnicity Work Group. NAACCR Guideline for Enhancing Hispanic/Latino Identification: Revised NAACCR Hispanic/Latino Identification Algorithm [NHIA v2.2.1]. Springfield (IL): North American Association of Central Cancer Registries. September 2011.
  11. Jim MA, Arias E, Seneca DS, et al. Racial misclassification of American Indians and Alaska Natives by Indian Health Service Contract Health Service Delivery Area. Am J Public Health. 2014;104(6 suppl 3):S295–S302.
  12. Arias E, Xu J, Jim MA. Period life tables for the non-Hispanic American Indian and Alaska Native population, 2007–2009. Am J Public Health. 2014;104:S312–S319.
  13. Espey DK, Jim MA, Richards TB, Begay C, Haverkamp D, Roberts D. Methods for improving the quality and completeness of mortality data for American Indians and Alaska Natives. Am J Public Health. 2014;104:S286–S294.
  14. Espey DK, Wiggins CL, Jim MA, Miller BA, Johnson CJ, Becker TM. Methods for improving cancer surveillance data in American Indian and Alaska Native populations. Cancer. 2008;113(5 suppl):1120–1130.
  15. Sugarman JR, Holliday M, Ross A, Castorina J, Hui Y. Improving American Indian cancer data in the Washington State Cancer Registry using linkages with the Indian Health Service and tribal records. Cancer. 1996;78(7 suppl):1564–1568.
  16. Frost F, Taylor V, Fries E. Racial misclassification of Native Americans in a Surveillance, Epidemiology, and End Results cancer registry. J Natl Cancer Inst. 1992;84(12):957–962.
  17. Kwong SL, Perkins CL, Snipes KP, Wright WF. Improving American Indian cancer data in the California Cancer Registry by linkage with the Indian Health Service. J Registry Manag. 1998;25(1):17–20.
  18. Wiggins CL, Espey DK, Wingo PA, et al. Cancer among American Indians and Alaska Natives in the United States, 1999–2004. Cancer. 2008;113(5 suppl):1142–1152.
  19. White MC, Espey DK, Swan J, Wiggins CL, Eheman C, Kaur JS. Disparities in cancer mortality and incidence among American Indians and Alaska Natives in the United States. Am J Public Health. 2014;104:S377–S387.
  20. Espey DK, Wu XC, Swan J, et al. Annual report to the nation on the status of cancer, 1975–2004. Cancer. 2007;110(10):2119–2152.
  21. Espey DK, Jim MA, Richards TB, Begay C, Haverkamp D, Roberts D. Methods for improving the quality and completeness of mortality data for American Indians and Alaska Natives. Am J Public Health. 2014;104:S286–S294.
  22. Melkonian SC, Chen L, Jim MA, Haverkamp D, King JB. Disparities in incidence and trends of colorectal, lung, female breast, and cervical cancers among non-Hispanic American Indian and Alaska Native people, 1999–2018. Cancer Causes Control. 2023;34(8):657–670.