2010 Census Occupation Coding Scheme
1990-based occupation variables
The IPUMS CPS variable OCC provides un-recoded occupation information for each IPUMS CPS sample. However, because the meanings of these OCC codes shift over time, a consistent classification scheme is necessary. IPUMS CPS provides such a scheme in the variable OCC1990 [1]. Not all 1990 occupation codes are represented in the consistent 1990-based occupation codes assigned by IPUMS; some categories are combined to improve comparability over time.
The association between the original occupation codes and the corresponding 1990-based codes for 1960-2000 was created using a series of technical papers published by the Census Bureau. These papers provide "crosswalks" that double-code occupations using the original coding scheme and the 1990 coding scheme. The crosswalks identify the 1990 occupation codes associated with each occupation code from the original coding scheme, and the proportion of the original occupation code that should be allocated to each of the associated 1990 occupation codes. Typically, each original occupation code is assigned to the 1990 code that contains the largest proportion of the double-coded cases. The protocol for interpreting these crosswalks is described in detail elsewhere [2, 3]. While these technical papers and crosswalks allow researchers to replicate the IPUMS harmonization of the 1960-2000 occupation codes to a 1990 coding scheme, they do not include the 2010 Census occupation codes or address changes introduced by the 2010 Census occupation coding scheme. This document describes the process IPUMS uses to harmonize the 2010 occupation codes for OCC1990 and other 1990-based variables.
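The "largest proportion" rule described above can be sketched as follows. This is a minimal illustration, not the IPUMS implementation: the function name, the row layout, and the codes and shares in `rows` are all made up for the example.

```python
def assign_by_largest_share(crosswalk_rows):
    """Map each original code to the target code holding its largest share.

    crosswalk_rows: iterable of (original_code, code_1990, proportion) tuples,
    one per double-coded pairing. On a tie, the first row seen is kept.
    """
    best = {}
    for original, target, share in crosswalk_rows:
        if original not in best or share > best[original][1]:
            best[original] = (target, share)
    return {original: target for original, (target, _) in best.items()}


# Hypothetical crosswalk rows: codes "A"/"B" and the shares are illustrative only.
rows = [
    ("A", "1990-X", 0.70),
    ("A", "1990-Y", 0.30),
    ("B", "1990-Y", 1.00),
]
mapping = assign_by_largest_share(rows)
print(mapping)
```

Here code "A" goes entirely to "1990-X", because that pairing holds the larger share of its double-coded cases.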
2010 Census occupation coding scheme
The 2010 Census occupation coding scheme reflects modifications to the 2010 Standard Occupational Classification (SOC) [4]. The new coding scheme introduces new detailed occupations and changes both occupation titles and occupation definitions (a change in definition is not necessarily reflected by a change in title).
The Census Bureau provides a crosswalk comparing the 2002 and 2010 occupation coding schemes. Although this crosswalk identifies the associated 2010 occupation codes for each 2002 occupation code, it does not include the proportion of each original occupation code that should be allocated to each of the associated 2010 occupation codes. A conversion rate crosswalk that provides these proportions is available for comparing the 2008 and 2010 American Community Survey (ACS); the 2010 ACS uses the 2010 Census occupation coding scheme, so this crosswalk addresses the changes introduced in 2010. The 1960-2000 crosswalks directly compare the original coding scheme with 1990, whereas the 2010 crosswalks only compare 2010 with 2002. Converting 2010 occupation codes to a 1990 basis therefore requires that the 2010 occupation codes first be crosswalked to their comparable 2002 values, and then to the appropriate 1990 value [5]. The derived 2002 occupation codes are allocated to their 1990 counterparts following the 2000-1990 occupation code crosswalk [6]; this document only outlines the transition from 2010 to 2002 codes.
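The chain from a 2010 code to a 1990-based code can be sketched as three small steps. This is an illustrative sketch under stated assumptions: the function names are hypothetical, codes are held as integers (so "0205" appears as 205), the 2010-to-2002 entries are the assignments described in this document, and the 2000-to-1990 table uses placeholder labels rather than real 1990 codes.

```python
def to_2002(code_2010, crosswalk_2010_to_2002):
    """Return the 2002 code for a 2010 code; unchanged codes pass through."""
    return crosswalk_2010_to_2002.get(code_2010, code_2010)


def to_2000(code_2002):
    """Drop the trailing zero that the 2002 scheme added to the 2000 codes."""
    if code_2002 % 10 != 0:
        raise ValueError(f"{code_2002} does not end in zero, so it is not a 2002 code")
    return code_2002 // 10


def to_1990(code_2010, crosswalk_2010_to_2002, crosswalk_2000_to_1990):
    """Chain 2010 -> 2002 -> 2000 -> 1990."""
    code_2000 = to_2000(to_2002(code_2010, crosswalk_2010_to_2002))
    return crosswalk_2000_to_1990[code_2000]


# 2010 -> 2002 assignments taken from the examples in this document.
xw_2010_to_2002 = {205: 210, 630: 620, 6005: 6000, 1815: 1810}

# Placeholder 2000 -> 1990 table: the right-hand labels are NOT real 1990 codes.
xw_2000_to_1990 = {21: "1990_A", 62: "1990_B"}

print(to_1990(205, xw_2010_to_2002, xw_2000_to_1990))
```

In practice the last lookup would come from the 2000-1990 crosswalk; only the first two steps are specific to the 2010 scheme.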
Differences between the 2002 and 2010 Census occupation coding schemes
Differences in the Census occupational coding schemes between the 2002 and 2010 classification systems can be categorized as follows:
- No difference in label, SOC, or Census occupation code
- Occupations with differences in label or SOC code, but not in Census occupation code
- Direct change occupations
- Aggregated occupations
- Disaggregated occupations
- Occupations combined by the ACS conversion crosswalk
No difference in label, SOC, or Census occupation code
IPUMS CPS does not make modifications for cases where the 2002 and 2010 SOC, Census occupation codes, and occupation labels are the same. The 2010 occupation code is the same as the 2002 occupation code in these cases.
Occupations with differences in label or SOC code, but not in Census occupation code
IPUMS CPS does not make modifications for cases where the 2002 and 2010 Census occupation codes are the same, and only the label or SOC code has changed. The 2010 occupation code is the same as the 2002 occupation code in these cases.
Direct change occupations
A small number of occupations have an updated Census occupation code but are not aggregated or disaggregated in the transition between the 2002 and 2010 occupation coding schemes; there is a one-to-one relationship between the 2002 code and the new 2010 code. In these cases, IPUMS CPS associates the 2010 occupation code with its direct match in 2002. This includes cases where multiple Census occupation codes are combined into a single category in the 2008-2010 ACS conversion rate crosswalk but are available individually (i.e., not combined) in the underlying IPUMS USA or IPUMS CPS data.
Aggregated occupations
Between the 2002 and 2010 occupation coding schemes, only two occupations were aggregated to create a single category: the 2002 occupation code "0200: Farm, ranch, and other agricultural managers" was combined with "0210: Farmers and ranchers" to create a new occupation code in 2010, "0205: Farmers, ranchers, and other agricultural managers". To determine how to allocate the new 2010 occupation to a 2002 counterpart, we used 2005-2009 5-Year ACS data [7] from IPUMS USA to identify which of the 2002-based categories was larger. More persons were coded "0210: Farmers and ranchers" than "0200: Farm, ranch, and other agricultural managers", so all persons assigned a 2010 occupation code of "0205" are associated with the 2002 code "0210".
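The aggregation rule amounts to choosing the larger of the source categories. A minimal sketch follows; the function name is hypothetical and the counts are placeholders chosen only so that "0210" is larger, as the text reports. They are not the ACS estimates.

```python
def pick_larger_source(counts_2002):
    """Return the 2002 code with the most persons among the merged categories."""
    return max(counts_2002, key=counts_2002.get)


# Placeholder counts (NOT real ACS figures); only their ordering matters here.
merged_sources = {"0200": 100, "0210": 300}

assignment = {"0205": pick_larger_source(merged_sources)}
print(assignment)
```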
Disaggregated occupations
When a single 2002 occupation is disaggregated into several 2010 occupation codes, and these new 2010 codes are associated only with that single 2002 code, the allocation is simple. There is no question of which 2002 code to use: every 2010 code resulting from a one-to-many disaggregation is assigned to the 2002 code from which it is derived.
When the newly created 2010 occupation codes are associated with more than one 2002 occupation code, the allocation of the disaggregated occupation codes is more complex. For these cases, we again use the 2005-2009 5-Year ACS data from IPUMS USA. Using the ACS data, we identify the number of people reporting each 2002 occupation code that has a many-to-many relationship with 2010 occupation codes. We then apply the conversion rate from the 2008-2010 ACS to the number of persons in each 2002 occupation to estimate the number of persons who would be assigned the new 2010 occupation code. We repeat this for all many-to-many disaggregated occupations. Next, for each 2010 occupation code associated with multiple 2002 occupation codes, we compare the estimated frequencies and identify the largest subgroup. OCC1990 assigns all cases of a new 2010 occupation code to the 2002 occupation code associated with the largest subgroup of that 2010 code.
For example, between 2002 and 2010, the occupation category "620: Human resources, training, and labor relations specialists" was split into three new occupation codes:
- 630: Human resources workers
- 640: Compensation, benefits, and job analysis specialists
- 650: Training and development specialists
Similarly, the 2002 occupation category "6000: First-line supervisors/managers of farming, fishing, and forestry workers" was split into two occupation codes:
- 630: Human resources workers
- 6005: First-line supervisors of farming, fishing, and forestry workers
The new 2010 occupation code "630: Human resources workers" appears in both lists. Using the 5-year ACS data, we estimate which subgroup of 630 is larger. An estimated 71,688 persons are assigned to the 2002 occupation code "6000: First-line supervisors/managers of farming, fishing, and forestry workers"; the conversion rate indicates that 99.58% (71,387) of these cases would be categorized in the "6005" code under the 2010 scheme, and 0.42% (301) under the shared 2010 occupation code "630". An estimated 1,004,193 persons are assigned to the 2002 occupation code "620: Human resources, training, and labor relations specialists"; 73.85% (741,597) of them are estimated to be in the shared code "630". Because the subgroup coming from the 2002 code "620" is larger (741,597 versus 301), all 2010 occupation codes of "630" are associated with the 2002 code "620".
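The arithmetic in this example can be reproduced with a short sketch. The counts and conversion rates are the ones quoted above; the function name and data layout are assumptions made for illustration.

```python
def allocate_shared_code(counts_2002, rates_to_shared):
    """Estimate each 2002 code's contribution to a shared 2010 code.

    Returns the 2002 code with the largest estimated subgroup, plus the
    estimates themselves (2002 count multiplied by the conversion rate).
    """
    estimates = {
        source: counts_2002[source] * rate
        for source, rate in rates_to_shared.items()
    }
    winner = max(estimates, key=estimates.get)
    return winner, estimates


# 2005-2009 ACS counts for the two 2002 codes that feed the 2010 code "630".
counts_2002 = {"6000": 71_688, "620": 1_004_193}
# Share of each 2002 code converted to "630" (0.42% and 73.85%).
rates_to_630 = {"6000": 0.0042, "620": 0.7385}

winner, estimates = allocate_shared_code(counts_2002, rates_to_630)
print(winner, {code: round(n) for code, n in estimates.items()})
```

Rounded to whole persons, the estimates match the 301 and 741,597 reported above, and "620" is selected as the 2002 code for all cases of "630".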
Some other cases involve more overlapping occupations and smaller margins. Even so, we assign every 2010 occupation code to the 2002 category that corresponds to its largest subgroup. This approach is systematic, can be replicated in other datasets, and is easily checked by researchers. Researchers who do not agree with our approach can also choose to categorize these new occupations differently.
Occupations combined by the ACS conversion crosswalk
The ACS conversion rate crosswalk combines some occupations for the conversion rates it provides; many of these codes are, however, available without being combined in CPS data. For example, the ACS 2008-2010 crosswalk notes that the Census occupation code "1530: Miscellaneous engineers, including nuclear engineers" comprises two different occupation codes: "1510: Nuclear engineers" and "1530: Engineers, all other". However, these individual occupation codes may be available separately (i.e., not combined into a single category) in the data. When the individual occupation codes that are combined by the 2008-2010 ACS conversion rate crosswalk are available in the data, and do not change between 2002 and 2010, no modification is necessary.
There are a limited number of cases where the list of occupations included in a combined category in the 2008-2010 ACS conversion rate crosswalk differs between the 2002 and 2010 lists. In all of these cases, IPUMS uses additional information from the 2002-2010 crosswalk to identify the most appropriate 2002 code.
Same occupation code moving between combined groups
The original occupation code for "1940: Nuclear technicians" is identical in 2002 and 2010. The 2008-2010 ACS conversion rate crosswalk notes that, in 2002, this occupation is combined with "1960: Other life, physical, and social science technicians" into a category that uses the "1960" code. In 2010, however, "1940: Nuclear technicians" is instead included in the combined category "1930: Geological and petroleum technicians", which uses the "1930" code. IPUMS disregards these overarching categories and treats the underlying code "1940: Nuclear technicians" as an occupation code with no change; there is no modification to its SOC code, Census occupation code, or label. The only difference is the broader grouping used in the 2008-2010 ACS conversion rate crosswalk.
New occupation codes with a clear 2002 association that are added to a 2010 combined occupation
There are a limited number of cases where a 2002 occupation code was disaggregated in 2010, and one of the new 2010 occupation codes is included in a set of combined occupations in the 2008-2010 ACS conversion rate crosswalk. In some of these cases, the 2002 occupation code is not part of a combined occupation group, and only the 2010 code is in a combined group. If the ACS 2008-2010 conversion rate crosswalk specifies a conversion rate from the 2002 occupation code to the combined code (rather than to the newly added 2010 occupation code), but the 2002-2010 crosswalk specifies an association with the specific 2010 code (i.e., not the combined group code), IPUMS associates the 2010 code directly with the 2002 occupation code from which it was derived and NOT with the combined group.
Case study 1: Survey researchers
The 2010 coding scheme introduces the occupation code "1815: Survey researchers"; the 2008-2010 ACS conversion rate crosswalk includes this as part of the 1860 combined group, which also includes "1830: Sociologists" and "1860: Miscellaneous social scientists and related workers". The conversion rate crosswalk also notes that the 2002 occupation code "1810: Market and survey researchers" is split between the codes "0735: Market research analysts and marketing specialists" and the 1860 combined group. However, the original 2002-2010 crosswalk notes that the 2002 occupation code 1810 is split between two occupation codes in 2010: 0735, and 1815. The additional information in the 2002-2010 crosswalk makes the link between the 2002 occupation code 1810 and the 1860 combined occupation group more apparent; we assume the conversion rate to 1860 reflects that 1860 includes the new code 1815 in the 2008-2010 ACS conversion rate crosswalk. IPUMS associates the new 2010 occupation code 1815 with the 2002 occupation code 1810, and not to the combined occupation group code 1860.
Case study 2: Funeral Service Managers
The 2002 occupation code "320: Funeral directors" is split into two new occupation codes in 2010: "0325: Funeral service managers" and "4465: Morticians, undertakers, and funeral directors". The 2008-2010 ACS conversion rate crosswalk includes the new 2010 occupation code 0325 in a combined occupation group that includes "400: Postmasters and mail superintendents" and "430: Managers, all other" in both 2002 and 2010; however, this combined group does not include the occupation code "320" in 2002. We prioritize the 2002-2010 crosswalk information showing that code 0325 is derived from the "320: Funeral directors" occupation code, and associate the new code with the 2002 occupation "320" rather than the combined occupation group "430".
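The precedence applied in these cases can be sketched as a simple lookup in which the detailed 2002-2010 crosswalk always wins and the ACS combined group is never used as the 2002 source. The dictionary and function names are hypothetical; the code pairs are the ones discussed in this section.

```python
# Detailed 2002-2010 crosswalk: new 2010 code -> 2002 code it was derived from.
detailed_2002_source = {"1815": "1810", "0325": "320"}

# Combined group code that the ACS conversion rate crosswalk uses for each
# 2010 code. Kept only to show that it is deliberately ignored.
acs_combined_group = {"1815": "1860", "0325": "430", "1940": "1930"}


def source_2002(code_2010):
    """Return the 2002 code for a 2010 code, ignoring ACS combined groups.

    Codes absent from the detailed crosswalk are treated as unchanged.
    """
    return detailed_2002_source.get(code_2010, code_2010)


for code in ("1815", "0325", "1940"):
    print(code, "->", source_2002(code), "(combined group", acs_combined_group[code], "ignored)")
```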
[1] OCC reports the person's current occupation; OCC1990 uses the same occupation information to report the person's current occupation using the 1990-based coding scheme. There are similar pairs of variables to report occupation information from the previous year (OCCLY, OCC90LY), the job tenure supplement (JTOCC, JTOCC1990), and the displaced worker supplement (DWOCC, DWOCC1990). For brevity, the steps outlined here reference OCC and OCC1990, but apply to all of the 1990-based occupation variables.
[4] In 2002, the 2000 Census occupation codes were modified slightly to add a fourth digit; in all cases, this is a trailing zero. This fourth digit contains non-zero values beginning with the introduction of the 2010 Census occupation codes.
[5] The only difference between the 2002 and 2000 occupation codes is the trailing zero; technically, there is an interim step of removing this trailing zero to translate the 2002-based occupation code into a 2000-based code.
[6] See introduction and additional resources linked in footnotes 2 and 3.
[7] The 2005-2009 5-Year ACS data use the 2002 occupation codes.