Key points
- Occupation and industry coding is the process of assigning standardized numeric codes to text descriptions of a person’s job and industry.
- Coding is done to prepare for data analysis.
- In the U.S., there are 3 classification systems commonly used to assign standardized codes.
- Use the NIOSH Industry and Occupation Computerized Coding System (NIOCCS) to autocode data.
The importance of data coding
What is coding
Occupation and industry coding is the process of converting a text description of a person’s job and type of business into a standardized numeric code.
Each occupation and industry has a unique number associated with it. These standardized codes are needed for data analysis and can be organized into broader groups.
Why coding is important
Once data are coded, public health officials and other researchers can analyze the data to look for patterns and trends in work-related diseases, injuries, and exposures.
Main classification systems
In the U.S., three classification systems are commonly used to assign codes to each industry and each occupation:
- North American Industry Classification System (NAICS) is used to code industry.
- Standard Occupational Classification (SOC) system is used to code occupation.
- Census Occupation and Industry Classification System can code BOTH occupation and industry.
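Because codes can be organized into broader groups, coded data are often rolled up before analysis. The sketch below assumes NAICS and SOC codes stored as strings. It relies on the fact that the first two digits of a NAICS code identify its sector and the first two digits of a SOC code identify its major group. The function names are hypothetical and are not part of any NIOSH tool.

```python
from collections import Counter
from typing import Iterable


def naics_sector(naics_code: str) -> str:
    """Return the 2-digit sector prefix of a NAICS code (2 to 6 digits)."""
    digits = naics_code.strip()
    if not (digits.isdigit() and 2 <= len(digits) <= 6):
        raise ValueError(f"not a NAICS code: {naics_code!r}")
    # Note: a few sectors span several prefixes (for example, 31, 32, and 33
    # are all Manufacturing), so a further mapping may be needed for reporting.
    return digits[:2]


def soc_major_group(soc_code: str) -> str:
    """Return the major group (XX-0000) of a SOC code such as '29-1141'."""
    prefix, sep, rest = soc_code.strip().partition("-")
    valid = (
        sep == "-"
        and len(prefix) == 2 and prefix.isdigit()
        and len(rest) == 4 and rest.isdigit()
    )
    if not valid:
        raise ValueError(f"not a SOC code: {soc_code!r}")
    return f"{prefix}-0000"


def sector_counts(naics_codes: Iterable[str]) -> Counter:
    """Count records per NAICS sector prefix."""
    return Counter(naics_sector(code) for code in naics_codes)
```

For example, `naics_sector("622110")` gives `"62"` and `soc_major_group("29-1141")` gives `"29-0000"`, so detailed codes can be tallied at the sector or major-group level.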
What classification system to use
If you are comparing your data to another dataset, the classification system you use will depend on the comparison dataset. For example, if the comparison set is coded using NAICS and SOC, then use NAICS and SOC.
Who uses what?
The National Health Interview Survey (NHIS) and Current Population Survey (CPS) are both coded using the Census classification system. Death certificate data are also coded using the Census system.
Some employer-based surveys use NAICS/SOC for business or economic analysis. These include the Quarterly Census of Employment and Wages (QCEW) and the Survey of Occupational Injuries and Illnesses (SOII).
If you are not comparing your data to another dataset, we recommend using the NAICS/SOC classification systems because NAICS and SOC:
- Provide a greater level of detail than the Census occupation and industry classification system.
- Are useful even if you don't have both occupation and industry data. This is because NAICS and SOC are independent of each other.
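The guidance above amounts to a simple decision rule, sketched below. The lookup table contains only the comparison datasets named in this section, and the function and variable names are hypothetical. Treat it as an illustration of the rule rather than a complete reference.

```python
from typing import Optional

# Classification system used by each comparison dataset named in this section.
COMPARISON_DATASETS = {
    "NHIS": "Census",                # National Health Interview Survey
    "CPS": "Census",                 # Current Population Survey
    "death certificates": "Census",
    "QCEW": "NAICS/SOC",             # Quarterly Census of Employment and Wages
    "SOII": "NAICS/SOC",             # BLS survey of occupational injuries and illnesses
}


def recommended_system(comparison_dataset: Optional[str] = None) -> str:
    """Return the classification system suggested by the guidance above."""
    if comparison_dataset is None:
        # No comparison dataset: NAICS/SOC is recommended for its greater
        # detail and because the two systems are independent of each other.
        return "NAICS/SOC"
    try:
        # Otherwise, match whatever the comparison dataset uses.
        return COMPARISON_DATASETS[comparison_dataset]
    except KeyError:
        raise ValueError(
            f"unknown dataset {comparison_dataset!r}; check how it is coded"
        ) from None
```

Calling `recommended_system("NHIS")` returns `"Census"`, while calling it with no argument returns `"NAICS/SOC"`.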
Coding data
Software applications, or "autocoders," are available to code occupation and industry data. Autocoders electronically assign occupation and industry codes to free text descriptions using NAICS, SOC, or Census.
There are many advantages to using an autocoder:
- Autocoding is much faster than manual coding.
- Codes are more consistent from one record to another, which means less random error.
- Autocoding does not require the training and experience that manual coding demands.
Code while collecting data
Use one of these to code WHILE you collect data:
- NIOCCS Single Record Coding Interface codes occupation and industry data one entry at a time. However, coding single records as you collect them requires you to switch back and forth between your data entry program and the NIOCCS site.
- NIOCCS Web API Autocoder can be incorporated into any data collection platform and requires only an internet connection. It converts free-text occupation and industry descriptions into standardized codes as data are collected.
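To show how a web API autocoder might be wired into a data collection program, here is a minimal sketch. The endpoint URL, the query parameter names (`industry`, `occupation`), and the shape of the response are all placeholders and assumptions, not the actual NIOCCS Web API; consult the NIOCCS documentation for the real values. The `fetch` argument is injectable so the logic can be tried without a network connection.

```python
import json
from typing import Callable, Dict
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: placeholder endpoint, not the real NIOCCS Web API address.
API_URL = "https://example.invalid/nioccs/code"


def build_url(industry_text: str, occupation_text: str,
              base_url: str = API_URL) -> str:
    """Build a GET URL carrying the free-text descriptions.

    ASSUMPTION: the parameter names 'industry' and 'occupation' are
    hypothetical stand-ins for whatever the real service expects.
    """
    query = urlencode({"industry": industry_text, "occupation": occupation_text})
    return f"{base_url}?{query}"


def http_get_json(url: str) -> Dict:
    """Fetch a URL and parse the body as JSON (needs an internet connection)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def code_record(industry_text: str, occupation_text: str,
                fetch: Callable[[str], Dict] = http_get_json) -> Dict:
    """Code one record at collection time, keeping the original free text."""
    result = fetch(build_url(industry_text, occupation_text))
    return {
        "industry_text": industry_text,
        "occupation_text": occupation_text,
        "coding_result": result,  # stored as returned; its fields are not assumed
    }
```

Keeping the original free text next to the returned result is a deliberate choice here: it lets records be reviewed or recoded later if needed.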
Code after collecting data
Use NIOCCS if you want to code AFTER you collect data. NIOCCS can be used to code single records or batches of data.