What is IPUMS CPS?

About CPS

The Current Population Survey (CPS) is a powerful source of data for investigating social and economic trends in the U.S. over the past half century. It is a monthly U.S. household survey, conducted jointly by the U.S. Census Bureau and the Bureau of Labor Statistics and administered to over 65,000 households. Initiated in the 1940s in the wake of the Great Depression, the survey was designed to measure unemployment. It gathers information on education, labor force status, demographics, and other aspects of the U.S. population. The CPS is widely used by demographers, economists, sociologists, and other population researchers. In addition, the CPS is the basis on which monthly federal unemployment statistics are calculated.

A battery of labor force and demographic questions, known as the "basic monthly survey," is asked every month. These data are available from 1976 to the present. Over time, supplemental inquiries on special topics have been added for particular months. Among these supplemental surveys, the March Annual Social and Economic Supplement (ASEC) is the most widely used by social scientists and policymakers, and it provides one set of data for IPUMS-CPS. Supplemental surveys also collect information on topics such as food security, tobacco use, job displacement, and employee tenure. Elsewhere we describe the full set of supplemental topics available.

Despite their importance to the research community, the CPS files from the U.S. Census Bureau are inconvenient to use, particularly for novice researchers. Problems are especially acute for those attempting to form a time series by piecing together surveys from many different years. Variables change location and length over time, so obtaining a given set of variables across many years requires several different program formats. Old variables are dropped and new ones added to files over time. Variable coding changes -- as do the questions from which the variables are derived -- and changes in questionnaire content are often subtle. For example, the values at which monetary variables are top-coded (i.e., the thresholds above which all values are collapsed into a single open-ended top category, for instance 50+) vary over time, often in ways not clearly spelled out in the survey documentation.
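To make the layout problem concrete, here is a minimal Python sketch of reading the same two variables from fixed-width records whose column positions differ by year. The column positions, top-code thresholds, and variable names are all invented for illustration. They are not taken from any real CPS codebook.

```python
# Hypothetical layouts: (start, end) character positions per survey year.
# These positions are made up for illustration only.
LAYOUTS = {
    1980: {"age": (0, 2), "income": (2, 7)},
    1990: {"age": (3, 5), "income": (5, 11)},
}

# Hypothetical top-code thresholds for income, also invented.
TOPCODES = {1980: 50000, 1990: 199998}


def parse_record(line, year):
    """Slice one fixed-width record using the layout for its survey year."""
    layout = LAYOUTS[year]
    return {name: int(line[start:end]) for name, (start, end) in layout.items()}


def is_topcoded(record, year):
    """Return True if income is at or above that year's top-code threshold."""
    return record["income"] >= TOPCODES[year]


# The same person-level values stored under two different layouts.
rec_1980 = parse_record("3512000", 1980)
rec_1990 = parse_record("XYZ35012000", 1990)
```

Each additional year needs its own layout entry and its own top-code value, which is the bookkeeping burden described above.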

There are challenges in working with single samples as well. The Census-supplied documentation is sometimes incomplete and difficult to interpret, particularly for the early surveys. Determining the universe of respondents for questions is frequently not straightforward, requiring researchers to trace through skip patterns on questionnaires. Even the act of finding all variables on a specific topic, determining their coding, and ascertaining the context in which the appropriate questions were asked, can itself be a cumbersome process that requires a time-consuming manual search through CPS documentation.

About IPUMS-CPS

IPUMS-CPS is an integrated set of data from the Current Population Survey (CPS) from 1962 forward. IPUMS-CPS is microdata--it provides information about individual persons and households. Researchers can create tabulations and multivariate analyses tailored to their particular questions, using their desired set of variables.
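Because microdata are person-level records, a researcher can build a custom tabulation directly. The sketch below computes a weighted employment rate by education group. It assumes each record carries a sampling weight. The field names (`educ`, `employed`, `weight`) and all values are hypothetical and do not correspond to actual IPUMS-CPS variables.

```python
from collections import defaultdict

# Hypothetical person records; names and values are invented for illustration.
persons = [
    {"educ": "HS", "employed": 1, "weight": 1500.0},
    {"educ": "HS", "employed": 0, "weight": 500.0},
    {"educ": "BA", "employed": 1, "weight": 1000.0},
    {"educ": "BA", "employed": 1, "weight": 1000.0},
]


def weighted_rate(records, group_key, flag_key, weight_key="weight"):
    """Weighted share of records with flag == 1 within each group."""
    totals = defaultdict(float)  # sum of weights per group
    hits = defaultdict(float)    # sum of weights where the flag is set
    for r in records:
        totals[r[group_key]] += r[weight_key]
        hits[r[group_key]] += r[weight_key] * r[flag_key]
    return {g: hits[g] / totals[g] for g in totals}


rates = weighted_rate(persons, "educ", "employed")
```

The same function can tabulate any 0/1 indicator against any grouping variable, which published summary tables cannot offer.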

To make cross-time comparisons using the CPS data more feasible, variables in IPUMS-CPS are coded identically or "harmonized". IPUMS-CPS also facilitates the study of long-term change by providing detailed documentation covering comparability issues for each variable and an interactive data extraction system. IPUMS-CPS consists of all substantive variables from the original CPS samples. Various constructed variables from the original datasets have been excluded.
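Harmonization can be pictured as a year-specific crosswalk from raw codes to one common coding scheme. The following is only a sketch of that idea. The years, raw codes, and labels are hypothetical, and it does not describe how IPUMS-CPS actually implements its recoding.

```python
# Hypothetical crosswalks: the same raw code can mean different things
# in different survey years. All codes here are invented for illustration.
CROSSWALK = {
    1985: {1: "married", 2: "widowed", 3: "divorced", 4: "never married"},
    1995: {1: "married", 2: "divorced", 3: "widowed", 5: "never married"},
}


def harmonize(raw_code, year):
    """Map a year-specific raw code to the common category label."""
    return CROSSWALK[year].get(raw_code, "unknown")


# Raw code 2 resolves to different harmonized categories by year.
label_1985 = harmonize(2, 1985)
label_1995 = harmonize(2, 1995)
```

Once every sample is passed through its own crosswalk, a single category label means the same thing in every year, so samples can be pooled for time-series analysis.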

In addition to harmonizing the data, the IPUMS data extraction system allows researchers to make data files customized to their interests. Specifically, researchers may choose the variables and samples of interest for their research project rather than downloading and managing the entire set of variables available in a particular dataset. Because CPS data are collected monthly, a dataset containing several months can quickly become large and unwieldy, so this feature of IPUMS is particularly useful.

This harmonized dataset is also largely compatible with the data from the U.S. decennial censuses that are part of the Integrated Public Use Microdata Series (IPUMS-USA). Researchers can take advantage of the relatively large sample size of IPUMS-USA at ten-year intervals and fill in information for the intervening years using IPUMS-CPS. Documentation for individual variables in IPUMS-CPS covers comparability issues between IPUMS-CPS and IPUMS-USA. See sample notes and sample designs for more overview information about the CPS and IPUMS-CPS data.

Researchers can access the IPUMS-CPS data without charge by completing the registration form. The data are not suitable for studying small geographic areas or for genealogical research. Publications based on these data should appropriately cite the database and should be reported to IPUMS-CPS.

The IPUMS-CPS project was carried out by the Minnesota Population Center in collaboration with Unicon Research Corporation. Major funding comes from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with supplementary funding from the National Science Foundation program in Social Science Infrastructure and the Robert Wood Johnson Foundation.

