Postscript: Third Wave of Diff-in-Diff
More graphs, a synthetic control workshop announcement, and Little Mama, my long-lost feral cat, returns!
I thought I’d add a little more data to yesterday’s post on the “Two Waves of Difference-in-differences” and share an idea I had: that there has been a third wave of difference-in-differences. After collecting the data, though, the idea turned out not to be quite what I expected and I’m a little unsure about it, so instead I’m just going to post the data along with my thesis.
My thesis was really somewhat simple, but it was based on my own evolution. I graduated in 2007, and my third chapter used diff-in-diff and triple differences to study the effect of abortion legalization on future gonorrhea. It was an extension of Gruber, Levine and Staiger’s 1999 QJE paper on “the marginal child”, as well as Donohue and Levitt’s 2001 “abortion-crime” paper. My paper was in the ALER and was my first and only publication from my dissertation. I made a nice table illustrating the research design, based on an old table that I’d seen Bill Evans create but which I can’t find right now. If you’re interested you can read the article, but here’s the picture. I remember falling into a trance in Excel, trying out every conceivable color to make the patterns look like this. As an aside, it was actually in this chapter that I learned what diff-in-diff was. Philip Levine had this great book on abortion policy called “Sex and Consequences: Abortion, Public Policy, and the Economics of Fertility”. In it he patiently explains to the reader what a diff-in-diff is so that he can walk readers through a lot of papers on abortion. And when I saw the table, I thought, “No way. That’s what this is about? Causality?” That was the moment when I got absolutely hooked on causal inference. I guess 2007.
Anyway, back to where I was. From 2007 to now, I’ve been thinking about those years as though that’s when I was born. So when I saw in Currie et al.’s paper that diff-in-diff papers among NBER working papers plateaued in 2007, the very year I graduated, I just kept wondering whether my own lived experience overlapped with changes in the field that might help me better understand my own journey. I know that sounds odd, but that’s how my mind thought and thinks. For those who didn’t see the post yesterday, I’m referring to this graph.
And the 2007 to 2011 point is just this halting of diff-in-diff at NBER, at least as a share of the total, topping out at around 12%, maybe 13%. Then in 2011, as I said, it roars off and rises to 23%, a gain of maybe 10 to 11 percentage points in only 7 years. It’s a very fast clip, and in yesterday’s post I showed, using other data I collected from Google Scholar, that this later period, which I termed the second wave of diff-in-diff, had certain key features that made it different from that first run-up. The feature I noted was simple: exponential growth in the number of papers mentioning “event study” together with “parallel trends”.
In other words, it was the mention of “event study” and “parallel trends” that I focused on in yesterday’s Substack post. I tried to find the origin of the phrase “parallel trends” and couldn’t quite get there, as in the late 1990s and early 2000s the phraseology drifted a little. “Common trends” was sometimes used, but it seemed to refer more to secular unobserved shocks hitting all panel units at the same time, the kind that could be absorbed by a year dummy. But I noted that James Habyarimana had a 2003 job market paper from Harvard that used an event study and diff-in-diff, which was the same year that David Autor published an event study with a diff-in-diff in JOLE. But James didn’t mention parallel trends in that version. Then in 2005 the paper gets updated, and there are suddenly six references to “parallel trends”. James isn’t the first to do diff-in-diff with event studies (explicitly named as such) and with references to parallel trends as the identifying assumption, but it does seem like he’s right there at the very start of what would then become something bordering on a collective movement. See here. The black line is papers on Google Scholar that said “parallel trends” and “difference-in-differences”.
This really just went through the end of my data collection, which was 2022. But in the back of my mind, I also wanted to think about the last few years, what I was calling in my mind “the third wave of diff-in-diff”: this frenzy of obsessive focus on the properties of the two-way fixed effects (TWFE) estimator in difference-in-differences with differential timing. So that’s what this is about: the third wave of diff-in-diff. And for this, I’m just going to write off the cuff and not allow myself to collect any more data, as it’s probably only going to chew up scarce time I need to spend on other stuff, but I’d like to just add this to yesterday’s post. The rest is below the paywall. Remember, you get 7 days free when you subscribe, and then you can cancel.