Beyond PSM: A Tour of the Causal Inference Toolkit (DiD, IV, RDD)

So far in this series, we’ve become proficient with Propensity Score Matching (PSM). It’s a powerful tool for untangling causation from correlation in observational data. But what if your data doesn’t have good overlap? What if you’re worried about unobserved confounders PSM can’t touch?

You reach for a different tool.

Think of causal inference not as a single method, but as a well-stocked toolbox. PSM is your trusty screwdriver—versatile and commonly used. But some jobs require a hammer, a wrench, or a socket set.

Today, we’ll tour three other essential tools in the causal inference toolkit: Difference-in-Differences (DiD), Instrumental Variables (IV), and Regression Discontinuity Design (RDD). For each, we’ll learn its superpower, its kryptonite, and when to use it.

1. Difference-in-Differences (DiD): The Before-and-After Comparator

Core Idea: Compare the change in outcome for a treated group to the change in outcome for an untreated control group, before and after the treatment.

The Classic Example: Measuring the effect of a new minimum wage law in one state. You can’t randomize laws, but you can compare employment growth in that state (treated) to employment growth in a similar neighboring state (control), both before and after the law passed.
The Finance Example: Your bank launches a new fee-free checking account in half of its branches. To measure its impact on customer satisfaction (CSAT), you can:
- Pre-Period: Measure CSAT in both treatment and control branches.
- Post-Period: Measure CSAT again after the launch.
- The DiD Effect: (CSAT_Treatment_After - CSAT_Treatment_Before) - (CSAT_Control_After - CSAT_Control_Before)
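
The arithmetic in the three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production estimator: the CSAT numbers, the `records` layout, and the helper names `cell_mean` and `did_estimate` are all hypothetical, invented for this example.

```python
from statistics import mean

# Hypothetical branch-level CSAT scores (illustrative numbers, not real data).
# Each record is (group, period, csat).
records = [
    ("treatment", "before", 70.0), ("treatment", "before", 72.0),
    ("treatment", "after", 78.0), ("treatment", "after", 80.0),
    ("control", "before", 68.0), ("control", "before", 70.0),
    ("control", "after", 71.0), ("control", "after", 73.0),
]


def cell_mean(rows, group, period):
    """Average CSAT for one group in one period."""
    return mean(c for g, p, c in rows if g == group and p == period)


def did_estimate(rows):
    """Return (DiD effect, treated change, control change)."""
    treated_change = (cell_mean(rows, "treatment", "after")
                      - cell_mean(rows, "treatment", "before"))
    control_change = (cell_mean(rows, "control", "after")
                      - cell_mean(rows, "control", "before"))
    return treated_change - control_change, treated_change, control_change


effect, treated_change, control_change = did_estimate(records)
print(f"Treated change: {treated_change:+.1f}")
print(f"Control change: {control_change:+.1f}")
print(f"DiD effect:     {effect:+.1f}")
```

With these made-up numbers, CSAT rises 8 points in the treatment branches and 3 points in the control branches, so the DiD estimate is 5 points. The 3-point control change is what gets attributed to everything other than the new account.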

Key Requirement: The Parallel Trends Assumption. This assumes that in the absence of the treatment, the difference between the treatment and control groups would have remained constant over time. The control group’s trend is your counterfactual for the treatment group.
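
Parallel trends cannot be tested directly, because the counterfactual is never observed. A common informal check is to see whether the gap between the groups was stable before the treatment. The sketch below assumes you have several pre-launch months of average CSAT per group; the numbers and the helper names `pre_period_gaps` and `gap_drift` are hypothetical.

```python
# Hypothetical mean CSAT per pre-launch month (illustrative numbers only).
pre_treatment = [70.0, 71.0, 72.0, 73.0]
pre_control = [67.0, 68.1, 68.9, 70.0]


def pre_period_gaps(treated, control):
    """Treatment-minus-control gap in each pre-period."""
    return [t - c for t, c in zip(treated, control)]


def gap_drift(gaps):
    """Largest deviation of any pre-period gap from the first one."""
    return max(abs(g - gaps[0]) for g in gaps)


gaps = pre_period_gaps(pre_treatment, pre_control)
drift = gap_drift(gaps)
print("Pre-period gaps:", [round(g, 2) for g in gaps])
print(f"Largest drift from the first gap: {drift:.2f}")
```

A gap that stays roughly flat before the launch is reassuring but not proof: the assumption is about what would have happened after the launch, which pre-period data can only make more or less plausible.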

When It Breaks: If an unrelated event (e.g., a recession) hits the treatment group harder than the control group at the same time, it will violate parallel trends and bias your result.

2. Instrumental Variables (IV): The Natural Experiment Finder

Core Idea: Use a third variable—an “instrument”—that influences the treatment but affects the outcome only through its effect on the treatment.

The Classic Example: Estimating the effect of military service on lifetime earnings. The problem: motivated and disciplined people are more likely to serve and earn more (confounding by “grit”). A good instrument: draft lottery number. It strongly predicts who serves (relevance) but likely doesn’t affect earnings except through military service (exclusion restriction).
The Finance Example: You want to know the true effect of using a robo-advisor (treatment) on investment returns (outcome). The problem: financially savvy people are more likely to use the tool and get better returns. A potential instrument: a promo code mailed randomly to a subset of customers. It encourages tool adoption (relevance). Because it is randomized, it is unrelated to financial savvy, and as long as the code changes returns only by nudging people to adopt the tool, the exclusion restriction holds.

Key Requirement: A valid instrument. This is the hardest part. The instrument must be:

- Relevant: Correlated with the treatment.
- Exogenous: Uncorrelated with the unobserved factors that drive the outcome, and affecting the outcome only through the treatment (the exclusion restriction).
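
To see why a valid instrument helps, here is a sketch of the promo-code example on simulated data. Everything in it is an assumption made for illustration: the data-generating process, the coefficients, and a true effect fixed at 1.0. With a binary instrument, the simplest IV estimator is the Wald ratio: the instrument's effect on the outcome divided by its effect on the treatment.

```python
import random
from statistics import mean


def simulate(n=40_000, true_effect=1.0, seed=0):
    """Simulate (promo, adopts, returns) rows with an unobserved confounder."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        savvy = rng.gauss(0, 1)                   # unobserved confounder
        promo = 1 if rng.random() < 0.5 else 0    # randomized instrument
        # Adoption depends on the promo code AND on savvy.
        adopts = 1 if promo + savvy + rng.gauss(0, 1) > 0.5 else 0
        # Returns depend on adoption and savvy, but not directly on the promo.
        returns = true_effect * adopts + 2.0 * savvy + rng.gauss(0, 1)
        rows.append((promo, adopts, returns))
    return rows


def naive_difference(rows):
    """Adopters minus non-adopters: confounded by savvy."""
    return (mean(y for _, d, y in rows if d == 1)
            - mean(y for _, d, y in rows if d == 0))


def wald_estimate(rows):
    """Return (IV estimate, first-stage difference in adoption rates)."""
    reduced_form = (mean(y for z, _, y in rows if z == 1)
                    - mean(y for z, _, y in rows if z == 0))
    first_stage = (mean(d for z, d, _ in rows if z == 1)
                   - mean(d for z, d, _ in rows if z == 0))
    return reduced_form / first_stage, first_stage


rows = simulate()
naive = naive_difference(rows)
iv, first_stage = wald_estimate(rows)
print(f"Naive estimate:  {naive:.2f}")
print(f"First stage:     {first_stage:.2f}")
print(f"IV (Wald) est.:  {iv:.2f}  (true effect in this simulation: 1.0)")
```

In this setup the naive comparison is pushed well above the true effect because savvy customers both adopt more and earn more, while the Wald ratio should land near 1.0. In real data with varying effects, the ratio estimates the effect for customers whose adoption was changed by the promo code, not for everyone.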

When It Breaks: If the instrument directly affects the outcome (“direct path”) or is correlated with an unobserved confounder, the results will be severely biased.

3. Regression Discontinuity Design (RDD): The Sharp Cutoff Exploiter

Core Idea: When treatment is assigned by whether a unit scores above or below a specific cutoff on a continuous variable, compare outcomes for units just barely above and just barely below the cutoff.

The Classic Example: Measuring the effect of receiving a scholarship awarded to students with a GPA of 3.5 or higher. You compare the future earnings of students with a 3.49 GPA to those with a 3.51 GPA. The idea is that these students are essentially identical; the only difference is the scholarship.
The Finance Example: A “premium” credit card is offered only to customers with a credit score of 700 or higher. To measure the effect of this card on customer loyalty, you compare the churn rates of customers with a 699 credit score to those with a 701 score.

Key Requirement: The continuity assumption. This assumes that all other factors affecting the outcome vary smoothly around the cutoff. The only thing that changes discontinuously at the cutoff is the probability of treatment.
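
One common way to turn this idea into an estimate is to fit a separate straight line on each side of the cutoff, using only observations within a bandwidth, and compare where the two lines meet the cutoff. The sketch below does this for the credit card example on simulated data. The churn model, the bandwidth of 30 points, and the assumed effect of -0.08 are all hypothetical choices for illustration.

```python
import random

CUTOFF = 700         # eligibility threshold from the credit card example
BANDWIDTH = 30       # hypothetical window (in score points) on each side
TRUE_EFFECT = -0.08  # assumed change in churn probability caused by the card


def simulate_customers(n=200_000, seed=0):
    """Return (centered_score, churned) pairs from a made-up churn model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(600, 800) - CUTOFF  # running variable, centered at cutoff
        has_card = x >= 0                   # sharp rule: 700 or higher gets the card
        p_churn = 0.30 - 0.001 * x + (TRUE_EFFECT if has_card else 0.0)
        rows.append((x, 1 if rng.random() < p_churn else 0))
    return rows


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rdd_estimate(rows):
    """Jump in the fitted churn rate at the cutoff (above minus below)."""
    below = [(x, y) for x, y in rows if -BANDWIDTH <= x < 0]
    above = [(x, y) for x, y in rows if 0 <= x < BANDWIDTH]
    intercept_below, _ = fit_line(below)
    intercept_above, _ = fit_line(above)
    return intercept_above - intercept_below


rdd_effect = rdd_estimate(simulate_customers())
print(f"Estimated jump in churn at the cutoff: {rdd_effect:+.3f}")
```

Because the score is centered at the cutoff, each line's intercept is its predicted churn rate at a score of exactly 700, so the difference in intercepts is the estimated jump. The bandwidth is a judgment call: narrower windows make the groups more comparable but leave fewer customers to estimate from.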

When It Breaks: If people can manipulate their score to just cross the cutoff, which side of the cutoff they land on is no longer as good as random. For example, if customers know about the 700 cutoff, the more motivated ones sitting at 698 might work to push their score over the line, so the 699 and 701 groups are no longer comparable.
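
A rough first check for manipulation is to count how many customers sit just below versus just above the cutoff: a pile-up on the eligible side is a warning sign. The sketch below is an informal stand-in for formal density tests, run on simulated scores. The 5-point bin width, the `manipulation_rate` parameter, and the helper names are hypothetical.

```python
import random

CUTOFF = 700   # from the credit card example
BIN_WIDTH = 5  # hypothetical choice of what counts as "just" above or below


def simulate_scores(n, manipulation_rate, seed):
    """Uniform scores; some customers just below the cutoff bump themselves over."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        score = rng.randint(600, 800)
        just_below = CUTOFF - BIN_WIDTH <= score < CUTOFF
        if just_below and rng.random() < manipulation_rate:
            score += BIN_WIDTH
        scores.append(score)
    return scores


def bunching_ratio(scores):
    """Customers just above the cutoff divided by customers just below it."""
    below = sum(1 for s in scores if CUTOFF - BIN_WIDTH <= s < CUTOFF)
    above = sum(1 for s in scores if CUTOFF <= s < CUTOFF + BIN_WIDTH)
    return above / below


clean_ratio = bunching_ratio(simulate_scores(50_000, 0.0, seed=1))
manipulated_ratio = bunching_ratio(simulate_scores(50_000, 0.5, seed=1))
print(f"Above/below ratio without manipulation: {clean_ratio:.2f}")
print(f"Above/below ratio with manipulation:    {manipulated_ratio:.2f}")
```

A ratio near 1 is what you would expect when scores vary smoothly through the cutoff. A ratio well above 1 suggests bunching and is a reason to distrust the comparison.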

Choosing Your Tool: A Summary

Method	Best When…	The “Magic” Ingredient	Key Assumption
PSM	You have observed confounders and good overlap.	Statistical twins	Ignorability (all confounders measured)
DiD	You have a pre- and post-period and a control group.	A parallel control trend	Parallel Trends
IV	You have unobserved confounding but a lucky natural experiment.	A good instrument	Instrument Validity
RDD	Treatment is assigned by a sharp cutoff rule.	An arbitrary cutoff	Continuity at cutoff

Conclusion: The Master Craftsman

There is no single “best” method for causal inference. The master craftsman isn’t the one who only knows how to use a hammer; it’s the one who looks at a problem, diagnoses its structure, and selects the right tool from the box.

The journey we’ve been on—from understanding why correlation lies, to seeing why A/B tests fail, to implementing PSM, and finally to surveying this powerful toolkit—equips you to be that craftsman. You can now ask better questions, challenge flawed analyses, and build more robust, truthful models that drive real business value.

Thank you for following along with this series on Practical Causal Inference for Finance. I hope it has been valuable.