Data and methods

Data, and the methods used to analyze them, are the foundation for evidence-based research. Articles in this subject area discuss the value of different types of data collection, and explain important statistical and econometric methods that provide ways to summarize and present information, and to identify and quantify correlation or causality.

  • Why do we need longitudinal survey data?

    Knowing people’s history helps in understanding their present state and where they are heading

    Heather Joshi, November 2016
    Information from longitudinal surveys transforms snapshots of a given moment into something with a time dimension. It illuminates patterns of events within an individual’s life and records mobility and immobility between older and younger generations. It can track the different pathways of men and women and people of diverse socio-economic backgrounds through the life course. It can join up data on aspects of a person’s life, such as health, education, family, and employment, and show how these domains affect one another. It is ideal for bridging the different silos of policies that affect people’s lives.
  • What makes a good job? Job quality and job satisfaction

    Job satisfaction is important to well-being, but intervention may be needed only if markets are impeded from improving job quality

    Andrew E. Clark, December 2015
    Many measures of job satisfaction have been trending downward. Because jobs are a key part of most people’s lives, knowing what makes a good job (job quality) is vital to knowing how well society is doing. Integral to worker well-being, job quality also affects the labor market through related decisions on whether to work, whether to quit, and how much effort to put into a job. Empirical work on what constitutes a good job finds that workers value more than wages; they also value job security and interest in their work. Policy to affect job quality requires information on the cost of the different aspects of job quality and how much workers value them.
  • Using linear regression to establish empirical relationships

    Linear regression is a powerful tool for estimating the relationship between one variable and a set of other variables

    Marno Verbeek, February 2017
    Linear regression is a powerful tool for investigating how one variable relates to a set of other variables. It can identify the effect of one variable while adjusting for other observable differences. For example, it can analyze how wages relate to gender, after controlling for differences in background characteristics such as education and experience. A linear regression model is typically estimated by ordinary least squares, which minimizes the sum of squared differences between the observed sample values and the fitted values from the model. Multiple tools are available to evaluate the model.
  • Using instrumental variables to establish causality

    Even with observational data, causality can be recovered with the help of instrumental variables estimation

    Sascha O. Becker, April 2016
    Randomized controlled trials are often considered the gold standard for establishing causality. However, in many policy-relevant situations, these trials are not possible. Instrumental variables affect the outcome only via a specific treatment; as such, they allow for the estimation of a causal effect. However, finding valid instruments is difficult. Moreover, instrumental variables estimates recover a causal effect only for a specific part of the population. While those limitations are important, the objective of establishing causality remains, and instrumental variables are an important econometric tool for achieving it.
  • The use of natural experiments in migration research

    Data on rapid, unexpected refugee flows can credibly identify the impact of migration on native workers’ labor market outcomes

    Semih Tumen, October 2015
    Estimating the causal effect of immigration on the labor market outcomes of native workers has been a major concern in the literature. Because immigrants decide whether and where to migrate, immigrant populations generally consist of individuals with characteristics that differ from those of a randomly selected sample. One solution is to focus on events such as civil wars and natural catastrophes, which generate rapid and unexpected flows of refugees into a country that are unrelated to the refugees’ personal characteristics, location, and employment preferences. These “natural experiments” yield estimates showing small negative effects on native workers’ employment but not on wages.
  • The need for and use of panel data

    Panel data provide an efficient and cost-effective means to measure changing behaviors and attitudes over time

    Hans-Jürgen Andreß, April 2017
    Stability and change are essential elements of social reality and economic progress. Cross-sectional surveys are a means of providing information on specific issues at a particular point in time, though without providing any information about the prevailing stability. Limited information on change can be obtained by retrospective questioning, but this is often impaired by “recall bias.” However, valid information on change is essential for assessing whether phenomena such as poverty are permanent or only temporary. Panel data analyses can address these problems as well as provide an essential tool for effective policy design.
  • The importance of measuring dispersion in firm-level outcomes

    Ignoring the large variation in firm-level outcomes can create misunderstandings about the consequences of many policies

    Chad Syverson, May 2014
    Recent research has revealed enormous variation in performance and growth among firms, which both drives and is driven by large reallocations of inputs and outputs across firms (churning) within industries and markets. These differences in firm-level outcomes and the associated turnover of firms affect many economic policies (both labor- and non-labor-oriented), on both a microeconomic and a macroeconomic scale, and are affected by them. Properly evaluating these policies requires familiarity with the sources and consequences of firm-level variation and within-industry reallocation.
  • The importance and challenges of measuring work hours

    Measuring hours worked is important, but different surveys can tell different stories

    Jay Stewart, November 2014
    Work hours are key components in estimating productivity growth and hourly wages as well as being a useful cyclical indicator in their own right, so measuring them correctly is important. The US Bureau of Labor Statistics (BLS) collects data on work hours in several surveys and publishes three widely used series that measure average weekly hours. The series tell different stories about average weekly hours and trends in those hours but qualitatively similar stories about the cyclical behavior of work hours. The research summarized here explains the differences in levels, but only some of the differences in trends.
  • The challenges of linking survey and administrative data

    Combining survey and administrative data is growing in popularity, even though data access is still highly restricted

    Steffen Künn, December 2015
    Using administrative records data and survey data to enhance each other offers huge potential for scientific and policy-related research. Two recent changes have expanded the potential for creating such linked data: the improved availability of data sources and progress in data-matching technology. These developments are reflected, among other ways, in the growing number of academic papers in labor economics that use linked survey and administrative data. While the number of studies using linked data is still small, the trend is clearly upward. Slowing the growth, however, are concerns about data security and privacy, which impede data access.
  • Skill mismatch and overeducation in transition economies

    Substantial skill shortages coexist with overeducation, affecting both young and old workers

    Olga Kupets, December 2015
    Large imbalances between the supply of and demand for skills in transition economies are driven by rapid economic restructuring, misalignment of the education system with labor market needs, and underdeveloped adult education and training systems. The costs of mismatches can be large and long-lasting for workers, firms, and economies, with long periods of overeducation implying a loss of human capital for individuals and ineffective use of resources for the economy. To make informed decisions, policymakers need to understand how different types of workers and firms are affected by overeducation and skill shortages.
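
Two of the methods summarized in the list above, linear regression estimated by ordinary least squares and instrumental variables estimation, can be illustrated with a short sketch. This is not taken from any of the listed articles: the data are simulated, and every variable name, coefficient, and sample size below is an assumption chosen purely for illustration. The first part regresses a simulated log wage on gender while controlling for education and experience; the second compares a naive regression with a two-stage least squares estimate when an unobserved factor drives both treatment and outcome.

```python
import numpy as np


def ols(X, y):
    """Return the coefficients that minimize the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, treatment, instrument, controls):
    """2SLS with one endogenous treatment and one instrument.

    `controls` must be a 2-D array that already contains a constant column.
    The returned vector holds the control coefficients first and the
    treatment coefficient last.
    """
    # First stage: predict the treatment from the instrument and controls.
    Z = np.column_stack([controls, instrument])
    treatment_hat = Z @ ols(Z, treatment)
    # Second stage: regress the outcome on the predicted treatment.
    X_hat = np.column_stack([controls, treatment_hat])
    return ols(X_hat, y)


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 20_000                      # hypothetical sample size

# Part 1: wage regression with controls (all coefficients are made up).
female = rng.integers(0, 2, n).astype(float)
education = rng.normal(12, 2, n)
experience = rng.uniform(0, 30, n)
log_wage = (1.0 - 0.10 * female + 0.08 * education
            + 0.02 * experience + rng.normal(0, 0.3, n))
X = np.column_stack([np.ones(n), female, education, experience])
beta_ols = ols(X, log_wage)  # order: constant, female, education, experience

# Part 2: an unobserved confounder biases the naive regression.
confounder = rng.normal(0, 1, n)                  # not available to the analyst
instrument = rng.integers(0, 2, n).astype(float)  # shifts treatment only
treatment = 0.5 * instrument + 0.8 * confounder + rng.normal(0, 1, n)
outcome = 2.0 * treatment + 1.5 * confounder + rng.normal(0, 1, n)
const = np.ones((n, 1))
naive_effect = ols(np.column_stack([const, treatment]), outcome)[1]
iv_effect = two_stage_least_squares(outcome, treatment, instrument, const)[-1]

print("OLS wage coefficients:", beta_ols)
print("Naive effect:", naive_effect, "IV effect:", iv_effect)
```

In this simulated setup the instrument is valid by construction, which is exactly the condition that is hard to guarantee with real data. The manual second stage yields the 2SLS point estimate only; its default standard errors would not be correct, so applied work normally relies on a dedicated estimation routine.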