Causal inference
February 19, 2024
Historically, reverse causality and omitted variable bias have plagued much social science research aimed at making causal claims.
Recently, the counterfactual approach has been embraced in the social sciences as a framework for causal inference.
This represents a big shift in research:
Being more precise about what we mean by causal effects.
Using randomization or designs with as-if randomization.
More partnerships between researchers and practitioners.
In the counterfactual approach: “If X had not occurred, then Y would not have occurred.”
Experiments help us learn about counterfactual and manipulation-based claims about causation.
It’s not wrong to conceptualize “cause” in another way. But it has been productive to work in this counterfactual framework (Brady 2008).
“X causes Y” need not imply that W and V do not cause Y: X is a part of the story, not the whole story. (The whole story is not necessary in order to learn about whether X causes Y).
“X causes Y” requires a context: matches cause flame but require oxygen; small classrooms improve test scores but require experienced teachers and funding (Cartwright and Hardie 2012).
“X causes Y” can mean “With X, the probability of Y is higher than would be without X.” or “Without X there is no Y.” Either is compatible with the counterfactual idea.
It is not necessary to know the mechanism to establish that X causes Y. The mechanism can be complex, and it can involve probability: X causes Y sometimes because of A and sometimes because of B.
Counterfactual causation does not require “a spatiotemporally continuous sequence of causal intermediates.”
Correlation is not causation.
Your friend says taking echinacea (a traditional remedy) reduces the duration of colds.
If we take a counterfactual approach, what does this statement implicitly claim about the counterfactual? What other counterfactuals might be possible and why?
For each unit we assume that there are two potential post-treatment outcomes: \(Y_i(1)\) and \(Y_i(0)\).
\(Y_i(1)\) is the outcome that would obtain if the unit received the treatment (\(T_i=1\)).
\(Y_i(0)\) is the outcome that would obtain if the unit received the control (\(T_i=0\)).
The causal effect of treatment (relative to control) is: \(\tau_i = Y_i(1) - Y_i(0)\)
Note that we’ve moved to using \(T\) to indicate our treatment (what we want to learn the effect of). \(X\) will be used for background variables.
You have to define the control condition to define a causal effect.
Each individual unit \(i\) has its own causal effect \(\tau_i\).
But we can’t measure the individual-level causal effect, because we can’t observe both \(Y_i(1)\) and \(Y_i(0)\) at the same time. This is known as the fundamental problem of causal inference. What we observe is \(Y_i\):
\(Y_i = T_iY_i(1) + (1-T_i)Y_i(0)\)
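This switching equation can be written directly in code. A minimal sketch, using hypothetical potential-outcome values (not from the text):

```python
# Switching equation: Y_i = T_i * Y_i(1) + (1 - T_i) * Y_i(0).
def observed_outcome(T, Y1, Y0):
    """Observed outcome for one unit, given its treatment indicator T (0 or 1)."""
    return T * Y1 + (1 - T) * Y0

# A treated unit reveals Y_i(1); a control unit reveals Y_i(0).
print(observed_outcome(T=1, Y1=5, Y0=3))  # treated: observe Y(1) = 5
print(observed_outcome(T=0, Y1=5, Y0=3))  # control: observe Y(0) = 3
```

Whichever value of \(T_i\) we observe, the other potential outcome stays counterfactual.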
| \(i\) | \(Y_i(1)\) | \(Y_i(0)\) | \(\tau_i\) |
|---|---|---|---|
| Andrei | 1 | 1 | 0 |
| Bamidele | 1 | 0 | 1 |
| Claire | 0 | 0 | 0 |
| Deepal | 0 | 1 | -1 |
We have the treatment effect for each individual.
Note the heterogeneity in the individual-level treatment effects.
But we observe at most one potential outcome for each individual, which means we cannot calculate these individual treatment effects.
\(\overline{\tau} = \frac{1}{N} \sum_{i=1}^N \left( Y_i(1)-Y_i(0) \right) = \overline{Y_i(1)-Y_i(0)}\)
The average causal effect is also known as the average treatment effect (ATE).
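With the god's-eye view of both potential outcomes in the table above (which we never have in practice), the ATE is just the mean of the individual effects:

```python
# Potential outcomes for Andrei, Bamidele, Claire, Deepal, from the table above.
Y1 = [1, 1, 0, 0]  # Y_i(1)
Y0 = [1, 0, 0, 1]  # Y_i(0)

tau = [y1 - y0 for y1, y0 in zip(Y1, Y0)]  # individual effects: [0, 1, 0, -1]
ate = sum(tau) / len(tau)
print(tau, ate)  # heterogeneous individual effects average to an ATE of 0.0
```

Note how heterogeneous individual effects (here +1 and -1) can cancel in the average.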
We explain how to estimate the ATE after we discuss randomization of treatment assignment in the next section.
Before we discuss randomization and how that allows us to estimate the ATE, note that the ATE is a type of estimand.
An estimand is a quantity you want to learn about (from your data). It’s the target of your research that you set.
Being precise about your research question means being precise about your estimand. For causal questions, this means specifying the treatment, the control condition, and the units over which you average.
What caused \(Y\)? Which cause was most important?
The counterfactual model is all about contribution, not attribution, except in a very conditional sense.
The focus is on non-rival contributions (consider the case of perfect complements…).
The question is not “what caused \(Y\)?” but “what is the effect of \(X\)?”
At most the counterfactual model provides a conditional account of what caused \(Y\).
This is a problem for research programs that define “explanation” as identifying the things that cause \(Y\) (e.g., mediation analysis; Heckman and Pinto 2015).
Difficult to conceptualize what it means to say one cause is more important than another cause.
“Erdogan’s increasing authoritarianism was the most important reason for the attempted coup.”
Is this more important than Turkey’s history of coups? What does that mean?
What does it mean to say that Aunt Pat voted for Brexit because she is old?
Compare:
What does it mean to say that Southern counties voted for Brexit because they have many old people?
Jack exploited Jill
It’s Jill’s fault that the bucket fell
Jack is the most obstructionist member of Congress
Melania Trump stole from Michelle Obama’s speech
Activists need causal claims