Staggered Difference-in-Differences using Matlab

1 Overview

Learning about consequences of economic (policy) measures is at the heart of modern economic research. Correspondingly, causal inference methods have become indispensable in economic research, enabling scholars to move beyond mere correlations to uncover the underlying causal relationships that drive economic phenomena. Recent years have witnessed not only a strong increase in academic studies making causal claims (see for example Garg/Fetzer 2024), but also significant methodological advancements, particularly in the development of tools for estimating causal effects in complex, real-world settings. The 2021 Nobel Prize in Economic Sciences awarded to Joshua Angrist, Guido Imbens, and David Card underscores the transformative impact of these methods, with Imbens and his co-author Alberto Abadie pioneering techniques such as synthetic control and instrumental variable approaches.
Unfortunately, these methodological advancements are only available in a few specialized statistical software (mostly Stata). My favorite software for empirical research is Matlab, however, and there are hardly any estimators for panel data or causal inference available. This blog is an outcome of some of my efforts to conduct causal inference analyses using Matlab.
In particular, the blog serves to showcase the analysis of staggered difference-in-differences (DiD) settings, where a specific treatments of economic units (like firms, persons etc.) occurs at different point in times and possibly with different impacts. The Goodman-Bacon (2021) study is pivotal for understanding such heterogeneity in staggered DiD designs. It introduces a decomposition that reveals how overall treatment effects are weighted averages of all two-group, two-period comparisons in the data, with weights depending on group size and timing. This framework highlights potential biases arising from negative weights, which can occur when later-treated groups act as controls for earlier-treated groups.
In the following, I will use my Matlab implementation of the „Bacon decomposition“ to illustrate it, relying on simulated panel data with and without treatment heterogeneity. Another Matlab function will then conduct the Wooldridge (2021) Mundlak-type regressions to estimate and test for heterogeneous effects.

2 General Simulation Parameters

This is an overview on the parameter settings used to simulate panel data with staggered treatments-
clear;
ATT = 2; % simulated treatment effect (initial or fixed))
num_ids = 1000; % Number of entities
num_periods = 5; % Number of time periods
startPeriod = 3;
percTreated = 0.6;

3 Staggered but Constant Treatment

3.1 Simulate Data

treatType = „constantTime“; %choose „zero“,“constant“, „constantTime“, „timeIncrease“, „onOff“
display(„Treatment start from period “ + startPeriod);
„Treatment start from period 3“
Generate data and show descriptives in terms of treatment cohorts.
inputTable = genDIDdata(num_periods,num_ids, percTreated, „ATT“,ATT, „startPeriod“,startPeriod, „treatType“,treatType);
g_sum = groupsummary(inputTable(inputTable.t>=inputTable.group,:),„group“,„mean“,„y“)
g_sum = 4×3 table
group GroupCount mean_y
1 0 1920 0.9989
2 3 594 2.6255
3 4 382 2.8223
4 5 227 2.2438
% regression
isPrint = true;
display(„—- Single Regressions —-„)
„—- Single Regressions —-„
res = didReg(inputTable,isPrint);
resPanel = 2×3 table
Coefficient tStat pVal
1 treat x post 2.0414 20.9576 0
2 nObs / R2 5000 NaN 0.1792
resOLS = 4×3 table
Coefficient tStat pVal
1 const. 1.2630 14.8114 0
2 isTreated -0.0846 -1.0986 0.2720
3 treat x post 1.9784 18.7874 0
4 nObs / R2 5000 NaN 0.1001
„Panel estimate is 2.0414 , OLS estimate is 1.9784“

3.2 Monte Carlo-Simulation

Run a a short Monte Carlo-Simulation to test for systematic bias in Two-Way-Fixed-Effects (TWFE) regressions.
isPrint = false;
S=1000;
resATT=nan(S,1);
tic
parfor ii=1:S
simTable = genDIDdata(num_periods,num_ids, percTreated, „ATT“,ATT, „startPeriod“,startPeriod,
„treatType“,treatType);
res = didReg(simTable,isPrint);
resATT(ii) = res.resPanel{1,1};
end
Starting parallel pool (parpool) using the ‚Processes‘ profile … Connected to parallel pool with 24 workers.
toc
Elapsed time is 51.596066 seconds.
display(„—- Monte Carlo Panel Regression —-„)
„—- Monte Carlo Panel Regression —-„
display(„meanATT from „+S+“ sim: „+mean(resATT));
„meanATT from 1000 sim: 2.0022“
The MC results show that with the simulated data, where treatment occurs at different point in times but with the same average treatment effect on the treated (ATT), there is no bias of TWFE. Note, however, that this only holds if the treatment is absorbing, i.e. once treated, every treated individual remains treated with the same effect.

3.3 Bacon Decomposition

The Bacon decomposition dissects the estimated coefficient from the TWFE regression into cohorts of treatment, generally comparing each cohort outcomes for each year of treatment
  • to all observations never-treated,
  • to cohorts that will be treated later („Treated earlier vs later“) and
  • to cohorts that have already been treated („Treated later vs earlier“).
The key insight form the Goodman-Bacon study is that the TWFE coefficient is a weighted average of these groups. Comparing treated units with other treated units leads to the potential bias of staggerd DiD, when treatment effects are heterogeneous between cohorts over time, or if the treatments are not absorbing (i.e., remain in place once treated).
Note that the Bacon decomposition is implemented as a class object, for future expansion.
%% Test bacon decomposition
myData = table;
myData.time = inputTable.t;
myData.id = inputTable.id;
myData.treatVar = inputTable.group;
myData.outcome = inputTable.y;
bd = baconDecomp(myData, ‚time‘, ‚id‘, ‚treatVar‘, ‚outcome‘);
% Run decomposition
[results, results_agg] = bd.decompose()
results = 9×5 table
type group1 group2 weight estimate
1 ‚Treated vs Never Treated‘ 3 0 0.2480 1.8606
2 ‚Treated vs Never Treated‘ 4 0 0.2393 2.1166
3 ‚Treated vs Never Treated‘ 5 0 0.1896 1.9288
4 ‚Treated earlier vs later‘ 3 4 0.0411 2.3735
5 ‚Treated earlier vs later‘ 3 5 0.0977 1.9324
6 ‚Treated earlier vs later‘ 4 5 0.0707 2.3056
7 ‚Treated later vs earlier‘ 4 3 0.0411 2.6076
8 ‚Treated later vs earlier‘ 5 3 0.0489 2.0877
9 ‚Treated later vs earlier‘ 5 4 0.0236 2.0804
results_agg = 3×3 table
Type Weight Estimate
1 ‚Treated earlier vs later‘ 0.2096 2.1449
2 ‚Treated later vs earlier‘ 0.1136 2.2744
3 ‚Treated vs Never Treated‘ 0.6768 1.9702
% Compute statistics
stats = computeBaconStats(results)
stats = struct with fields:
ate: 2.0414 se: 0.0811 ci_lower: 1.8825 ci_upper: 2.2002 n_comparisons: 9
The results form the Bacon decomposition illustrate that in this sample about 70% of the TWFE coefficient weight comes form observation where treated observations are only compared to never-treated observations, while 30% come from the „dangerous“ comparison groups „earlier vs. later“ and „later vs. earlier“, respectively.
With the present data the Bacon decomposition shows that despite the relatively high weight of dangerous comparisons, the coefficient estimates per group are roughly the same, therefore not indicating heterogeneous treatment effects over time.

3.4 Run Wooldridge Regression

The Wooldridge (2021) paper describes how to estimate the ATT coefficients per treatment cohort over time, using pooled OLS in the spirit of the Mundlak estimator. woolDID is my Matlab implementation of this estimator with an output similar to Stata’s xthdidregress command. Standard errors are robust and clustered at the cohort level.
myData = inputTable(:,[„id“,„t“,„y“,„treat x post“]);
woolDID(myData);
Wooldridge(2021): Heterogeneous Treatment Effects Estimation Results:
results_table = 14×6 table
Coefficient StdErr tStat pValue ConfLow(2.5) ConfHigh(97.5)
1 Constant 1.2672 0.0412 30.7856 7.5297e-05 1.1865 1.3479
2 Year_2 -0.2698 0.0730 -3.6934 0.0344 -0.4129 -0.1266
3 Year_3 0.2162 0.0899 2.4032 0.0956 0.0399 0.3924
4 Year_4 -0.4424 0.0765 -5.7843 0.0103 -0.5923 -0.2925
5 Year_5 -0.8456 0.0412 -20.5425 2.5224e-04 -0.9262 -0.7649
6 Cohort_3 -0.2012 0.0341 -5.9036 0.0097 -0.2680 -0.1344
7 Cohort_4 0.0404 0.0173 2.3389 0.1013 0.0065 0.0743
8 Cohort_5 -0.1067 1.6228e-15 -6.5735e+13 0 -0.1067 -0.1067
9 ATT_C3_Y3 2.0902 0.0968 21.5963 2.1726e-04 1.9005 2.2799
10 ATT_C3_Y4 1.8642 0.0715 26.0824 1.2363e-04 1.7241 2.0043
11 ATT_C3_Y5 1.7960 0.0341 52.6973 1.5050e-05 1.7292 1.8628
12 ATT_C4_Y4 2.1918 0.0691 31.7005 6.8979e-05 2.0562 2.3273
13 ATT_C4_Y5 2.1255 0.0173 122.9710 1.1857e-06 2.0917 2.1594
14 ATT_C5_Y5 1.9288 2.8676e-15 6.7264e+14 0 1.9288 1.9288
Again, the estimates for the ATTs indicate that for all treatment cohorts (C) over the cohort’s treatment years (Y) coefficients are very similar for all cohorts and over all treatment year, so that TWFE will not be biased.

4 Staggered DiD with Heterogeneous Treatment Effects

4.1 Generate data

We will now generate staggered treatment effect data. The true treatment effect will always be 2 in the first period of treatment and then increase by dynEffect = 0.5 for each additional year of treatment.
treatType = „timeIncrease“; %choose „zero“,“constant“, „constantTime“, „timeIncrease“, „onOff“
display(„Treatment start from period “ + startPeriod);
„Treatment start from period 3“
dynEffect = 0.5;
Generate data and show descriptives in terms of treatment cohorts.
inputTable = genDIDdata(num_periods,1000, percTreated, „ATT“,ATT, „startPeriod“,startPeriod, „treatType“,treatType,„dynEffect“,dynEffect);
g_sum = groupsummary(inputTable(inputTable.t>=inputTable.group,:),„group“,„mean“,„y“)
g_sum = 4×3 table
group GroupCount mean_y
1 0 1965 0.7047
2 3 621 2.8930
3 4 388 2.7704
4 5 206 1.2567
Graph the simple mean values by treated cohorts and control group in event time. The grph gives some indication that treatment effect are heterogeneous, but it would be difficult to infer the magnitudes.
meanTreated = groupsummary(inputTable, [„group“,„eventTime“],‚mean‘,‚y‘);
nGroup = unique(meanTreated.group);
figure(1)
plot(meanTreated(meanTreated.group==0,:),„eventTime“,„mean_y“,‚LineWidth‘,2);
hold on
for ii = 2:length(nGroup)
plot(meanTreated(meanTreated.group==nGroup(ii),:),„eventTime“,„mean_y“,„LineWidth“,2);
end
hold off
xline(0,‚–‚,LineWidth=2)
legend([„Control“,„C“+[nGroup(2):nGroup(end)],„Treatment“],„Location“,„northwest“)

4.2 Bacon Decomposition

The Bacon decomposition indicates heterogeneous treatment effects. First, coefficients for the comparisons against the never-treated are different. Second, the „later vs. earlier“ comparison gives quite some difference compared to coefficients of never-treated.
However, the results indicate that the Bacon decomposition is not enough to understand the dynamic pattern inherent to treatments. The coeffcients do not allow to infer, that treatment over time amounts to 2, 2.5 and 3.
myData = table;
myData.time = inputTable.t;
myData.id = inputTable.id;
myData.treatVar = inputTable.group;
myData.outcome = inputTable.y;
bd = baconDecomp(myData, ‚time‘, ‚id‘, ‚treatVar‘, ‚outcome‘);
% Run decomposition
[results, results_agg] = bd.decompose()
results = 9×5 table
type group1 group2 weight estimate
1 ‚Treated vs Never Treated‘ 3 0 0.2645 2.5280
2 ‚Treated vs Never Treated‘ 4 0 0.2478 2.5562
3 ‚Treated vs Never Treated‘ 5 0 0.1754 1.9430
4 ‚Treated earlier vs later‘ 3 4 0.0435 1.9550
5 ‚Treated earlier vs later‘ 3 5 0.0924 2.1348
6 ‚Treated earlier vs later‘ 4 5 0.0650 2.0917
7 ‚Treated later vs earlier‘ 4 3 0.0435 1.7927
8 ‚Treated later vs earlier‘ 5 3 0.0462 1.2113
9 ‚Treated later vs earlier‘ 5 4 0.0217 1.4037
results_agg = 3×3 table
Type Weight Estimate
1 ‚Treated earlier vs later‘ 0.2009 2.0819
2 ‚Treated later vs earlier‘ 0.1114 1.4759
3 ‚Treated vs Never Treated‘ 0.6877 2.3889
TWFE effects regressions are biased downwards relatively to the true effects.
display(„—- Single Regressions —-„)
„—- Single Regressions —-„
res = didReg(inputTable,1);
resPanel = 2×3 table
Coefficient tStat pVal
1 treat x post 2.2256 15.8588 0
2 nObs / R2 5000 NaN 0.2691
resOLS = 4×3 table
Coefficient tStat pVal
1 const. 0.7281 8.8607 0
2 isTreated -0.0183 -0.2448 0.8066
3 treat x post 2.3208 22.7773 0
4 nObs / R2 5000 NaN 0.1763
„Panel estimate is 2.2256 , OLS estimate is 2.3208“

4.3 Wooldridge Regression

THe Wooldridge regressions finally let us understand the dynamic pattern inherent to the treatment effects. It becomes quite visible from the estimated ATT over time that treatment effects increase per year by 0.5 from a level of initially 2.
myData = inputTable(:,[„id“,„t“,„y“,„treat x post“]);
woolDID(myData);
Wooldridge(2021): Heterogeneous Treatment Effects Estimation Results:
results_table = 14×6 table
Coefficient StdErr tStat pValue ConfLow(2.5) ConfHigh(97.5)
1 Constant 0.7432 0.0283 26.2493 1.2130e-04 0.6877 0.7987
2 Year_2 0.6811 0.0950 7.1676 0.0056 0.4949 0.8674
3 Year_3 -0.1083 0.0261 -4.1477 0.0255 -0.1595 -0.0571
4 Year_4 0.6331 0.0925 6.8415 0.0064 0.4517 0.8144
5 Year_5 -1.3984 0.0283 -49.3929 1.8274e-05 -1.4539 -1.3429
6 Cohort_3 -0.0216 0.0348 -0.6192 0.5796 -0.0898 0.0467
7 Cohort_4 -0.0792 0.0288 -2.7557 0.0704 -0.1356 -0.0229
8 Cohort_5 -0.0311 6.8247e-16 -4.5554e+13 0 -0.0311 -0.0311
9 ATT_C3_Y3 2.0330 0.0307 66.1724 7.6047e-06 1.9728 2.0932
10 ATT_C3_Y4 2.4380 0.1202 20.2763 2.6225e-04 2.2023 2.6737
11 ATT_C3_Y5 2.9169 0.0348 83.7780 3.7485e-06 2.8486 2.9851
12 ATT_C4_Y4 2.2962 0.1150 19.9625 2.7473e-04 2.0708 2.5217
13 ATT_C4_Y5 2.6821 0.0288 93.2690 2.7169e-06 2.6257 2.7385
14 ATT_C5_Y5 1.9430 2.0814e-15 9.3351e+14 0 1.9430 1.9430

5 Conclusions

I’m happy that after some efforts, staggered DiD has become less of a hurdle to be estimated via Matlab. Still, there remains much to do, as for example doubly robust estimators (with propensity score matching, see e.g. Callaway/Sant’Anna 2021), treatment of covariates, missing covariate balance (i.e. general matching methods) and many more issues recently discussed in the econometrics literature are yet to be implemented. It would be fantastic if more people in the community (or Mathworks itself) could contribute to help making causal inference accessible in Matlab, so that researchers need less to rely on Stata (though Stata is a very nice statistics software).

Literature

  • Goodman-Bacon, A. (2021). Difference-in-Differences with Variation in Treatment Timing. Journal of Econometrics, 225(2), 254–277. https://doi.org/10.1016/j.jeconom.2021.03.014
  • Callaway, B., & Sant’Anna, P. H. C. (2021). Difference-in-Differences with Multiple Time Periods. Journal of Econometrics, 225(2), 200–230. https://doi.org/10.1016/j.jeconom.2020.12.001
  • Wooldridge, J. M. (2021). Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Difference-in-Differences Estimators. Available at SSRN: https://doi.org/10.2139/ssrn.3906345

Codes

The codes are provided as-is and meant to be first illustrative versions, without any liability for usage and/or errors).
  • baconDecomp, computeBaconStats, woolDID (Diff_in_Diff).
  • Test simData (a csv-file with simulated staggered heterogeneous treatment effects.

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert