The samples and variables included in the ASEC Current Population Survey (CPS) vary over time. This document notes important characteristics of and changes in the samples of CPS data included in IPUMS-CPS. For an overview of the sampling strategy, see sample designs.
Children under age 14 are not included in the samples in these years. These datasets were not officially released by the U.S. Census Bureau as public use files. Because the datasets were used by researchers at the University of Wisconsin, they were preserved in the data archive at the Center for Demography and Ecology at the University of Wisconsin. Documentation for these files is particularly sparse.
The original 1962 dataset lacked "NIU" (not in universe) codes for individuals outside the universe for numerous variables. IPUMS-CPS has imposed such codes. In many cases (e.g., veteran status), a variable was supposedly available in the original data for 1962, but the codes were not reliable and the variable was excluded from IPUMS-CPS. For a few variables, such as relationship to household head, more than one variable appeared to cover the same material in the original dataset. Through diagnostic cross-tabs (e.g., cross-tabulating marital status with relationship to household head), we identified the most consistent variable and included it in IPUMS-CPS.
In the 1963 CPS dataset, 1,600 records have a code of "62" for the original "year" variable. Usually only one record within a household, rather than all members of the household, had the value "62." After careful examination of many cases, we decided to leave these cases in the dataset and coded them as "1963" in the IPUMS-CPS YEAR variable. In the original dataset, these 1,600 cases were truncated, with blanks for many variables relating to work and earnings during the previous calendar year. For such variables, the 1,600 truncated records were assigned the codes for "Not in universe" (NIU). These "1962" cases in the 1963 dataset can be identified using the REPORTYR variable.
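As a rough illustration of how the REPORTYR flag might be used, the Python sketch below separates the truncated records from the rest of a 1963 extract. The dictionary layout, the toy records, and the assumption that REPORTYR holds the value 1962 for these cases are illustrative only; check the REPORTYR codes in the IPUMS-CPS variable documentation before relying on them.

```python
# Toy person records; all values are invented for illustration.
# Assumption: REPORTYR stores the originally reported year as a four-digit number.
persons = [
    {"YEAR": 1963, "SERIAL": 1, "PERNUM": 1, "REPORTYR": 1963},
    {"YEAR": 1963, "SERIAL": 1, "PERNUM": 2, "REPORTYR": 1962},
    {"YEAR": 1963, "SERIAL": 2, "PERNUM": 1, "REPORTYR": 1963},
]


def split_truncated_1963(records, flagged_value=1962):
    """Separate 1963 records originally labelled "62" from all other records."""
    flagged, regular = [], []
    for rec in records:
        if rec["YEAR"] == 1963 and rec["REPORTYR"] == flagged_value:
            flagged.append(rec)
        else:
            regular.append(rec)
    return flagged, regular


flagged, regular = split_truncated_1963(persons)
print(len(flagged), "truncated record(s);", len(regular), "other record(s)")
```

Keeping the flagged records in a separate list makes it easy to run an analysis of prior-year work and earnings both with and without them.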
Given the absence of children under 14 and the institutionalized population, one would expect the weighted counts of the CPS for 1963 to equal about two-thirds of the total U.S. population. In fact, the original weighted population count for the 1963 ASEC, using individual-level records, was only about half as large as the U.S. population total for 1963. The cases present in the 1963 sample were representative of the 1963 U.S. population in every way we could measure. For this reason, we adjusted all original weighting values by a constant (1.3262), so that the weighted totals for the 1963 dataset accurately reflect the absolute number of persons with any given characteristic in the U.S. adult, non-institutionalized population.
The number of cases in the 1966 dataset is almost twice as large as in any other ASEC sample prior to 1968. The original weighted population count for the 1966 ASEC, using individual-level records, is about twice as large as the expected non-institutionalized U.S. population age 14+ in that year. IPUMS-CPS created a revised weight for 1966, multiplying all original weighting values by a constant (0.5043).
The original weighted population count for the 1967 ASEC, using individual-level records, was only about half as large as the U.S. population total for 1967. Since the cases present were representative of the 1967 population in every measurable way, IPUMS-CPS created a revised weight for 1967. The revised weight multiplies all original weighting values by a constant (1.5333).
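The arithmetic behind the revised weights for 1963, 1966, and 1967 can be sketched in a few lines of Python. The constants are the ones quoted in these notes; the function name and the `original_weight` argument are hypothetical. Because the notes describe the adjustment as already made by IPUMS-CPS, this is meant only to show the calculation, not a step users need to repeat.

```python
# Constants quoted in the notes for the three adjusted samples.
WEIGHT_ADJUSTMENT = {1963: 1.3262, 1966: 0.5043, 1967: 1.5333}


def revised_weight(year, original_weight):
    """Scale an original person weight by the constant for that sample year.

    Years without a listed constant are returned unchanged.
    """
    return original_weight * WEIGHT_ADJUSTMENT.get(year, 1.0)


for year in (1963, 1966, 1967):
    print(year, round(revised_weight(year, 1000.0), 1))
```

Because each adjustment is a single constant per year, it changes weighted totals but leaves weighted proportions within that year unchanged.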
These datasets were not officially released by the U.S. Census Bureau as public use files. Children under 14 were included in the ASEC datasets beginning in 1968. The relationships of children and of members of the Armed Forces to the household head were left blank in the original data; IPUMS-CPS codes such persons as "Under 14, relationship unknown" and "Armed forces, relationship unknown" for the RELATE variable.
This dataset is the first in IPUMS-CPS to use the full set of occupation and industry codes generally used by the Census Bureau.
The 1976 ASEC file is the first dataset to include household-level records in the original data. For earlier files, IPUMS-CPS created a household-level record using the record of the household head. As with earlier CPS data files included in IPUMS-CPS, the format of the original data from the Census Bureau was modified by programmers at the University of Wisconsin.
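The idea of deriving a household-level record from the household head's record, as described for the pre-1976 files, might look roughly like the Python sketch below. It is not the procedure IPUMS-CPS actually ran. The toy records, the choice of fields to copy, and the assumption that RELATE code 101 identifies the head are all illustrative and should be checked against the codebook.

```python
# Assumption: RELATE code 101 marks the household head; verify against the codebook.
HEAD_CODE = 101

# Toy person records; all values are invented for illustration.
persons = [
    {"SERIAL": 1, "PERNUM": 1, "RELATE": 101, "STATEFIP": 55},
    {"SERIAL": 1, "PERNUM": 2, "RELATE": 201, "STATEFIP": 55},
    {"SERIAL": 2, "PERNUM": 1, "RELATE": 101, "STATEFIP": 27},
]


def build_household_records(person_records, household_fields=("SERIAL", "STATEFIP")):
    """Return one household record per SERIAL, copied from the head's record."""
    households = {}
    for rec in person_records:
        # Keep only the first head found for each household identifier.
        if rec["RELATE"] == HEAD_CODE and rec["SERIAL"] not in households:
            households[rec["SERIAL"]] = {f: rec[f] for f in household_fields}
    return households


for serial, household in build_household_records(persons).items():
    print(serial, household)
```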
An oversample of Hispanic persons was first conducted for the 1976 ASEC survey. The person-level weights, WTSUPP and WTFINL, and the household-level weights, HWTSUPP and HWTFINL, correct for this oversample, so that the weighted number of Hispanic persons and households is consistent with the number of such persons and households in the non-institutionalized U.S. population.
The 1976 ASEC dataset, and datasets for subsequent years, include two different individual-level weights: WTSUPP and WTFINL. In prior years, only WTSUPP is available. For most purposes, researchers should rely on WTSUPP, which must be used to produce statistics representative of the non-institutionalized population of the United States. Analysts who wish to replicate published BLS statistics for variables collected in the basic monthly CPS survey should use WTFINL.
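To make the choice of weight concrete, the following Python sketch computes the same weighted share twice, once with WTSUPP and once with WTFINL. The records, the weight values, and the `EMPLOYED` flag are invented for illustration (`EMPLOYED` is not an IPUMS-CPS variable); only the weight names come from the text above.

```python
# Toy person records with both person-level weights; all values are invented.
persons = [
    {"EMPLOYED": True, "WTSUPP": 1500.0, "WTFINL": 1400.0},
    {"EMPLOYED": False, "WTSUPP": 500.0, "WTFINL": 600.0},
    {"EMPLOYED": True, "WTSUPP": 2000.0, "WTFINL": 2000.0},
]


def weighted_share(records, flag, weight):
    """Share of the weighted population for which `flag` is true."""
    total = sum(rec[weight] for rec in records)
    hits = sum(rec[weight] for rec in records if rec[flag])
    return hits / total


# ASEC-based estimate, weighted with the supplement weight.
print("WTSUPP:", weighted_share(persons, "EMPLOYED", "WTSUPP"))
# Estimate intended to line up with basic monthly statistics.
print("WTFINL:", weighted_share(persons, "EMPLOYED", "WTFINL"))
```

Passing the weight name as an argument keeps the estimator identical across the two runs, so any difference in results comes from the weights alone.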
The 1977 dataset included in IPUMS-CPS is the first dataset that was released as a public use file by the Census Bureau and was not modified in its original data formatting by programmers at the University of Wisconsin.
Up through 1979, persons age 14 and older were considered adults; beginning in 1980, persons age 15 and older were classified as adults. The lowest age limit of the universes for income variables and for many variables relating to employment rose from 14 to 15 beginning in 1980. Exceptions are the ABSENT, CLASSWKR, EMPSTAT, LABFORCE, LOOKING, OCC, and OCC1950 variables; these used age 14 as the youngest age group in the variable universe through 1987.
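The age rule above can be written as a small lookup, sketched here in Python. The function name is hypothetical, and two assumptions are built in: that the listed exception variables moved to a lower bound of 15 starting in 1988, and that the rule is applied only to the income and employment variables this paragraph covers. Individual variable universes should still be confirmed in the variable descriptions.

```python
# Variables the notes list as keeping age 14 as the youngest universe age through 1987.
LATE_CHANGE_VARS = {
    "ABSENT", "CLASSWKR", "EMPSTAT", "LABFORCE", "LOOKING", "OCC", "OCC1950",
}


def minimum_universe_age(variable, year):
    """Lower age bound of an income or employment variable's universe.

    Assumption: exception variables switch from 14 to 15 in 1988; all other
    covered variables switch in 1980.
    """
    first_year_at_15 = 1988 if variable in LATE_CHANGE_VARS else 1980
    return 14 if year < first_year_at_15 else 15


for variable, year in [("INCTOT", 1979), ("INCTOT", 1980), ("EMPSTAT", 1987), ("EMPSTAT", 1988)]:
    print(variable, year, minimum_universe_age(variable, year))
```

A lookup like this can help when restricting a pooled multi-year extract to persons who were in universe for a given variable in every year.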
Particularly notable is the increase in 1988 in the number of variables relating to income from specific sources during the previous calendar year. The 1988 dataset is the first to include separate variables on income from the following sources: unemployment insurance; workers' compensation; veterans' benefits; disability income; dividends; rent; educational assistance; child support; alimony; and personal assistance from persons outside the household. Prior to 1988, income from these sources was subsumed into a smaller number of income variables with a broader focus.
More detailed information on the relationship of unrelated persons to the householder is available beginning in 1988, with the inclusion of "partner/roommate" and "foster child" in the codes for RELATE.
A major redesign of the Current Population Survey was implemented in 1994. One aspect of the redesign was changes in question wording. The new wording reduced underreporting of labor force participation by women working part-time and more precisely measured the number of persons on temporary lay-off from jobs. A second aspect of the redesign was that CPS interviewers switched from using paper questionnaires to computer-assisted interviewing technology with skip patterns programmed into the interview format.
The 1994 ASEC added new questions about the date of immigration for foreign-born persons and about the birthplaces of each respondent's mother and father.
The computer-assisted interviewing format instituted in 1994 facilitated another change, beginning in 1995: respondents could report income from various sources for a number of short periods (e.g., bi-weekly or monthly) rather than as a lump sum for the previous calendar year.
New response categories for the relationship to the householder were put in place in 1995. The partner/roommate category was dropped and replaced with unmarried partner, housemate/roommate, and roomer/boarder in the RELATE variable.
A number of questions designed to measure participation in welfare reform programs (e.g., job training, transportation assistance) were first included in the 2001 ASEC survey. The sample included in IPUMS-CPS is the SCHIP file, which contains considerably more cases than the original 2001 data file.
For the first time, multiple race responses were allowed. Hispanic origin was ascertained through two questions, rather than the single question used in earlier years. The occupation and industry coding schemes of Census 2000 were adopted, with minor modifications.
See the Revision History page.