Machine learning for causal inference in economics

Discover how machine learning can help uncover causal insights from economic data and guide better-informed policy decisions.

UniDistance Suisse, Switzerland


Elevator pitch

Machine learning (ML) improves economic policy analysis by addressing the complexity of modern data. It complements traditional econometric methods by handling numerous control variables, managing interactions and non-linearities flexibly, and uncovering nuanced differential causal effects. However, careful validation and awareness of limitations such as risk of bias, transparency issues, and data requirements are essential for informed policy recommendations.


Key findings

Pros

Handling control variables: ML can handle (very) many control variables, improving causal effect identification.

Model flexibility: ML can manage diverse data types and complex, non-linear interactions effectively.

Efficient estimation: ML can improve the precision of average effect estimation, especially in small samples.

Uncovering differential effects: ML can reveal differential policy impacts across groups systematically.

Data-driven decisions: ML can support evidence-based policy making.

Cons

Prediction focus: ML's emphasis on prediction may lead to misinterpretation in causal contexts.

Lack of transparency: ML models are complex, making interpretation and decision-making less transparent.

Bias risks: Improper handling of model inputs (e.g. endogenous variables) can introduce or amplify biases.

Noisy estimates: Small samples can result in imprecise and noisy estimates of differential effects.

Model instability: Instability in ML models can increase the risk of misinterpretation.
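The bias risk from endogenous inputs can be made concrete with a small simulation. The sketch below is illustrative only: the data-generating process, the variable names (`x`, `d`, `m`, `y`), and the effect sizes are assumptions chosen for the example, not taken from any study. It shows that adding a variable that is itself affected by the policy (`m`) as a control changes the estimate from the total effect to only a part of it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (all numbers are assumptions):
x = rng.normal(size=n)                           # pre-treatment confounder
d = (x + rng.normal(size=n) > 0).astype(float)   # treatment take-up depends on x
m = 1.0 * d + rng.normal(size=n)                 # post-treatment variable, affected by d
y = 1.0 * d + 1.0 * m + x + rng.normal(size=n)   # total effect of d on y is 1 + 1 = 2


def ols_coef_on_d(controls):
    """Coefficient on d from an OLS regression of y on an intercept, d, and the controls."""
    design = np.column_stack([np.ones(n), d] + controls)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


effect_good = ols_coef_on_d([x])     # controls for the confounder only
effect_bad = ols_coef_on_d([x, m])   # additionally controls for the endogenous variable

print(f"controls = [x]:    {effect_good:.2f}")
print(f"controls = [x, m]: {effect_bad:.2f}")
```

In this setup the first specification should recover a value close to the total effect of 2, while the second should be close to 1, because part of the policy effect is absorbed by `m`. A data-driven variable selection step has no way of knowing this; the decision about which variables are admissible as controls still has to come from subject-matter knowledge.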

Author's main message

ML provides structured, data-driven tools that reduce the need for discretionary decisions in model specification, such as selecting control variables or defining functional forms. It is particularly useful for handling large sets of control variables and capturing complex interactions, improving the precision of policy effect estimates. Additionally, ML can systematically identify differential effects across groups, offering valuable insights for policy design. However, its effectiveness depends on the quality of input data and the appropriate selection of model parameters to minimize bias and overfitting. For evidence-based policy recommendations, researchers and policymakers need to be aware of the limitations of ML to ensure that its findings are properly interpreted and applied.
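As a minimal sketch of how ML can handle many control variables when estimating a policy effect, the example below uses cross-fitted partialling-out, an approach often referred to as double/debiased machine learning. The text above does not prescribe a specific estimator, so this choice is an assumption, as are the simulated data, the closed-form ridge regression used as a stand-in learner, and the fixed penalty value. In applied work the learner and its tuning parameters would be chosen by cross-validation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 2_000, 50
true_effect = 0.5  # assumed effect used to simulate the data

# Hypothetical data: 50 controls, of which only a few matter.
x = rng.normal(size=(n, p))
d = x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n)            # continuous treatment
y = true_effect * d + 2.0 * x[:, 0] - x[:, 2] + rng.normal(size=n)


def ridge_fit_predict(x_train, t_train, x_test, penalty=1.0):
    """Fit ridge regression (closed form, centered data) and predict on x_test."""
    x_mean, t_mean = x_train.mean(axis=0), t_train.mean()
    xc = x_train - x_mean
    gram = xc.T @ xc + penalty * np.eye(xc.shape[1])
    coef = np.linalg.solve(gram, xc.T @ (t_train - t_mean))
    return t_mean + (x_test - x_mean) @ coef


# Cross-fitting: predictions for each fold come from models trained on the other folds.
folds = np.array_split(rng.permutation(n), 5)
res_y = np.empty(n)
res_d = np.empty(n)
for test_idx in folds:
    train = np.ones(n, dtype=bool)
    train[test_idx] = False
    res_y[test_idx] = y[test_idx] - ridge_fit_predict(x[train], y[train], x[test_idx])
    res_d[test_idx] = d[test_idx] - ridge_fit_predict(x[train], d[train], x[test_idx])

# Effect estimate: regress the outcome residuals on the treatment residuals.
effect_hat = (res_d @ res_y) / (res_d @ res_d)

# Naive comparison: slope of y on d without any controls.
naive = np.cov(d, y)[0, 1] / np.var(d, ddof=1)

print(f"naive slope:        {naive:.2f}")
print(f"cross-fitted slope: {effect_hat:.2f}")
```

The cross-fitting step is what guards against overfitting here: because each residual is computed from a model that never saw that observation, flexible learners cannot mechanically explain away variation in the treatment or the outcome. In this simulation the naive slope should be well above the assumed effect of 0.5, while the cross-fitted estimate should be close to it.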

Full citation

