Staggered Difference-in-Differences using Matlab

Table of Contents

Overview
General Simulation Parameters
Staggered but Constant Treatment
1 Simulate Data
2 Monte Carlo-Simulation
3 Bacon Decomposition
4 Run Wooldridge Regression
Staggered DiD with Heterogeneous Treatment Effects
1 Generate data
2 Bacon Decomposition
3 Wooldridge Regression
Conclusions
Literature
Codes 

1 Overview

Learning about consequences of economic (policy) measures is at the heart of modern economic research. Correspondingly, causal inference methods have become indispensable in economic research, enabling scholars to move beyond mere correlations to uncover the underlying causal relationships that drive economic phenomena. Recent years have witnessed not only a strong increase in academic studies making causal claims (see for example Garg/Fetzer 2024), but also significant methodological advancements, particularly in the development of tools for estimating causal effects in complex, real-world settings. The 2021 Nobel Prize in Economic Sciences awarded to Joshua Angrist, Guido Imbens, and David Card underscores the transformative impact of these methods, with Imbens and his co-author Alberto Abadie pioneering techniques such as synthetic control and instrumental variable approaches.

Unfortunately, these methodological advancements are only available in a few specialized statistical software (mostly Stata). My favorite software for empirical research is Matlab, however, and there are hardly any estimators for panel data or causal inference available. This blog is an outcome of some of my efforts to conduct causal inference analyses using Matlab.

In particular, the blog serves to showcase the analysis of staggered difference-in-differences (DiD) settings, where a specific treatments of economic units (like firms, persons etc.) occurs at different point in times and possibly with different impacts. The Goodman-Bacon (2021) study is pivotal for understanding such heterogeneity in staggered DiD designs. It introduces a decomposition that reveals how overall treatment effects are weighted averages of all two-group, two-period comparisons in the data, with weights depending on group size and timing. This framework highlights potential biases arising from negative weights, which can occur when later-treated groups act as controls for earlier-treated groups.

In the following, I will use my Matlab implementation of the „Bacon decomposition“ to illustrate it, relying on simulated panel data with and without treatment heterogeneity. Another Matlab function will then conduct the Wooldridge (2021) Mundlak-type regressions to estimate and test for heterogeneous effects.

2 General Simulation Parameters

This is an overview on the parameter settings used to simulate panel data with staggered treatments-

clear; 

ATT = 2; % simulated treatment effect (initial or fixed))

num_ids = 1000; % Number of entities

num_periods = 5; % Number of time periods

startPeriod = 3;

percTreated = 0.6;

3 Staggered but Constant Treatment

3.1 Simulate Data

treatType = „constantTime“; %choose „zero“,“constant“, „constantTime“, „timeIncrease“, „onOff“

display(„Treatment start from period “ + startPeriod);

„Treatment start from period 3“

Generate data and show descriptives in terms of treatment cohorts.

inputTable = genDIDdata(num_periods,num_ids, percTreated, „ATT“,ATT, „startPeriod“,startPeriod, „treatType“,treatType);

g_sum = groupsummary(inputTable(inputTable.t>=inputTable.group,:),„group“,„mean“,„y“)

g_sum = 4×3 table 




group
GroupCount
mean_y


1
0
1920
0.9989

2
3
594
2.6255

3
4
382
2.8223

4
5
227
2.2438

	group	GroupCount	mean_y
1	0	1920	0.9989
2	3	594	2.6255
3	4	382	2.8223
4	5	227	2.2438

% regression

isPrint = true;

display(„—- Single Regressions —-„)

„—- Single Regressions —-„

res = didReg(inputTable,isPrint);

resPanel = 2×3 table 




Coefficient
tStat
pVal


1 treat x post
2.0414
20.9576
0

2 nObs / R2
5000
NaN
0.1792


resOLS = 4×3 table 




Coefficient
tStat
pVal


1 const.
1.2630
14.8114
0

2 isTreated
-0.0846
-1.0986
0.2720

3 treat x post
1.9784
18.7874
0

4 nObs / R2
5000
NaN
0.1001


„Panel estimate is 2.0414
, OLS estimate is 1.9784“

	Coefficient	tStat	pVal
1 treat x post	2.0414	20.9576	0
2 nObs / R2	5000	NaN	0.1792

	Coefficient	tStat	pVal
1 const.	1.2630	14.8114	0
2 isTreated	-0.0846	-1.0986	0.2720
3 treat x post	1.9784	18.7874	0
4 nObs / R2	5000	NaN	0.1001

3.2 Monte Carlo-Simulation

Run a a short Monte Carlo-Simulation to test for systematic bias in Two-Way-Fixed-Effects (TWFE) regressions.

isPrint = false;

S=1000;

resATT=nan(S,1);

tic

parfor ii=1:S

 simTable = genDIDdata(num_periods,num_ids, percTreated, „ATT“,ATT, „startPeriod“,startPeriod,…

 „treatType“,treatType);

 res = didReg(simTable,isPrint);

 resATT(ii) = res.resPanel{1,1};

end

Starting parallel pool (parpool) using the ‚Processes‘ profile …
Connected to parallel pool with 24 workers.

toc

Elapsed time is 51.596066 seconds.

display(„—- Monte Carlo Panel Regression —-„)

„—- Monte Carlo Panel Regression —-„

display(„meanATT from „+S+“ sim: „+mean(resATT));

„meanATT from 1000 sim: 2.0022“

The MC results show that with the simulated data, where treatment occurs at different point in times but with the same average treatment effect on the treated (ATT), there is no bias of TWFE. Note, however, that this only holds if the treatment is absorbing, i.e. once treated, every treated individual remains treated with the same effect.

3.3 Bacon Decomposition

The Bacon decomposition dissects the estimated coefficient from the TWFE regression into cohorts of treatment, generally comparing each cohort outcomes for each year of treatment

to all observations never-treated,
to cohorts that will be treated later („Treated earlier vs later“) and
to cohorts that have already been treated („Treated later vs earlier“).

The key insight form the Goodman-Bacon study is that the TWFE coefficient is a weighted average of these groups. Comparing treated units with other treated units leads to the potential bias of staggerd DiD, when treatment effects are heterogeneous between cohorts over time, or if the treatments are not absorbing (i.e., remain in place once treated).

Note that the Bacon decomposition is implemented as a class object, for future expansion.

%% Test bacon decomposition

myData = table;

myData.time = inputTable.t;

myData.id = inputTable.id;

myData.treatVar = inputTable.group;

myData.outcome = inputTable.y;

bd = baconDecomp(myData, ‚time‘, ‚id‘, ‚treatVar‘, ‚outcome‘); 

% Run decomposition 

[results, results_agg] = bd.decompose() 

results = 9×5 table 




type
group1
group2
weight
estimate


1
‚Treated vs Never Treated‘
3
0
0.2480
1.8606

2
‚Treated vs Never Treated‘
4
0
0.2393
2.1166

3
‚Treated vs Never Treated‘
5
0
0.1896
1.9288

4
‚Treated earlier vs later‘
3
4
0.0411
2.3735

5
‚Treated earlier vs later‘
3
5
0.0977
1.9324

6
‚Treated earlier vs later‘
4
5
0.0707
2.3056

7
‚Treated later vs earlier‘
4
3
0.0411
2.6076

8
‚Treated later vs earlier‘
5
3
0.0489
2.0877

9
‚Treated later vs earlier‘
5
4
0.0236
2.0804


results_agg = 3×3 table 




Type
Weight
Estimate


1
‚Treated earlier vs later‘
0.2096
2.1449

2
‚Treated later vs earlier‘
0.1136
2.2744

3
‚Treated vs Never Treated‘
0.6768
1.9702

	type	group1	group2	weight	estimate
1	‚Treated vs Never Treated‘	3	0	0.2480	1.8606
2	‚Treated vs Never Treated‘	4	0	0.2393	2.1166
3	‚Treated vs Never Treated‘	5	0	0.1896	1.9288
4	‚Treated earlier vs later‘	3	4	0.0411	2.3735
5	‚Treated earlier vs later‘	3	5	0.0977	1.9324
6	‚Treated earlier vs later‘	4	5	0.0707	2.3056
7	‚Treated later vs earlier‘	4	3	0.0411	2.6076
8	‚Treated later vs earlier‘	5	3	0.0489	2.0877
9	‚Treated later vs earlier‘	5	4	0.0236	2.0804

	Type	Weight	Estimate
1	‚Treated earlier vs later‘	0.2096	2.1449
2	‚Treated later vs earlier‘	0.1136	2.2744
3	‚Treated vs Never Treated‘	0.6768	1.9702

% Compute statistics 

stats = computeBaconStats(results)

stats = struct with fields:
ate: 2.0414
se: 0.0811
ci_lower: 1.8825
ci_upper: 2.2002
n_comparisons: 9

The results form the Bacon decomposition illustrate that in this sample about 70% of the TWFE coefficient weight comes form observation where treated observations are only compared to never-treated observations, while 30% come from the „dangerous“ comparison groups „earlier vs. later“ and „later vs. earlier“, respectively.

With the present data the Bacon decomposition shows that despite the relatively high weight of dangerous comparisons, the coefficient estimates per group are roughly the same, therefore not indicating heterogeneous treatment effects over time.

3.4 Run Wooldridge Regression

The Wooldridge (2021) paper describes how to estimate the ATT coefficients per treatment cohort over time, using pooled OLS in the spirit of the Mundlak estimator. woolDID is my Matlab implementation of this estimator with an output similar to Stata’s xthdidregress command. Standard errors are robust and clustered at the cohort level.

myData = inputTable(:,[„id“,„t“,„y“,„treat x post“]);

woolDID(myData);

Wooldridge(2021): Heterogeneous Treatment Effects Estimation Results:
results_table = 14×6 table 




Coefficient
StdErr
tStat
pValue
ConfLow(2.5)
ConfHigh(97.5)


1 Constant
1.2672
0.0412
30.7856
7.5297e-05
1.1865
1.3479

2 Year_2
-0.2698
0.0730
-3.6934
0.0344
-0.4129
-0.1266

3 Year_3
0.2162
0.0899
2.4032
0.0956
0.0399
0.3924

4 Year_4
-0.4424
0.0765
-5.7843
0.0103
-0.5923
-0.2925

5 Year_5
-0.8456
0.0412
-20.5425
2.5224e-04
-0.9262
-0.7649

6 Cohort_3
-0.2012
0.0341
-5.9036
0.0097
-0.2680
-0.1344

7 Cohort_4
0.0404
0.0173
2.3389
0.1013
0.0065
0.0743

8 Cohort_5
-0.1067
1.6228e-15
-6.5735e+13
0
-0.1067
-0.1067

9 ATT_C3_Y3
2.0902
0.0968
21.5963
2.1726e-04
1.9005
2.2799

10 ATT_C3_Y4
1.8642
0.0715
26.0824
1.2363e-04
1.7241
2.0043

11 ATT_C3_Y5
1.7960
0.0341
52.6973
1.5050e-05
1.7292
1.8628

12 ATT_C4_Y4
2.1918
0.0691
31.7005
6.8979e-05
2.0562
2.3273

13 ATT_C4_Y5
2.1255
0.0173
122.9710
1.1857e-06
2.0917
2.1594

14 ATT_C5_Y5
1.9288
2.8676e-15
6.7264e+14
0
1.9288
1.9288

	Coefficient	StdErr	tStat	pValue	ConfLow(2.5)	ConfHigh(97.5)
1 Constant	1.2672	0.0412	30.7856	7.5297e-05	1.1865	1.3479
2 Year_2	-0.2698	0.0730	-3.6934	0.0344	-0.4129	-0.1266
3 Year_3	0.2162	0.0899	2.4032	0.0956	0.0399	0.3924
4 Year_4	-0.4424	0.0765	-5.7843	0.0103	-0.5923	-0.2925
5 Year_5	-0.8456	0.0412	-20.5425	2.5224e-04	-0.9262	-0.7649
6 Cohort_3	-0.2012	0.0341	-5.9036	0.0097	-0.2680	-0.1344
7 Cohort_4	0.0404	0.0173	2.3389	0.1013	0.0065	0.0743
8 Cohort_5	-0.1067	1.6228e-15	-6.5735e+13	0	-0.1067	-0.1067
9 ATT_C3_Y3	2.0902	0.0968	21.5963	2.1726e-04	1.9005	2.2799
10 ATT_C3_Y4	1.8642	0.0715	26.0824	1.2363e-04	1.7241	2.0043
11 ATT_C3_Y5	1.7960	0.0341	52.6973	1.5050e-05	1.7292	1.8628
12 ATT_C4_Y4	2.1918	0.0691	31.7005	6.8979e-05	2.0562	2.3273
13 ATT_C4_Y5	2.1255	0.0173	122.9710	1.1857e-06	2.0917	2.1594
14 ATT_C5_Y5	1.9288	2.8676e-15	6.7264e+14	0	1.9288	1.9288

Again, the estimates for the ATTs indicate that for all treatment cohorts (C) over the cohort’s treatment years (Y) coefficients are very similar for all cohorts and over all treatment year, so that TWFE will not be biased.

4 Staggered DiD with Heterogeneous Treatment Effects

4.1 Generate data

We will now generate staggered treatment effect data. The true treatment effect will always be 2 in the first period of treatment and then increase by dynEffect = 0.5 for each additional year of treatment.

treatType = „timeIncrease“; %choose „zero“,“constant“, „constantTime“, „timeIncrease“, „onOff“

display(„Treatment start from period “ + startPeriod);

„Treatment start from period 3“

dynEffect = 0.5;

Generate data and show descriptives in terms of treatment cohorts.

inputTable = genDIDdata(num_periods,1000, percTreated, „ATT“,ATT, „startPeriod“,startPeriod, „treatType“,treatType,„dynEffect“,dynEffect);

g_sum = groupsummary(inputTable(inputTable.t>=inputTable.group,:),„group“,„mean“,„y“)

g_sum = 4×3 table 




group
GroupCount
mean_y


1
0
1965
0.7047

2
3
621
2.8930

3
4
388
2.7704

4
5
206
1.2567

	group	GroupCount	mean_y
1	0	1965	0.7047
2	3	621	2.8930
3	4	388	2.7704
4	5	206	1.2567

Graph the simple mean values by treated cohorts and control group in event time. The grph gives some indication that treatment effect are heterogeneous, but it would be difficult to infer the magnitudes.

meanTreated = groupsummary(inputTable, [„group“,„eventTime“],‚mean‘,‚y‘);

nGroup = unique(meanTreated.group);

figure(1)

plot(meanTreated(meanTreated.group==0,:),„eventTime“,„mean_y“,‚LineWidth‘,2);

hold on

for ii = 2:length(nGroup) 

 plot(meanTreated(meanTreated.group==nGroup(ii),:),„eventTime“,„mean_y“,„LineWidth“,2);

end

hold off

xline(0,‚–‚,LineWidth=2)

legend([„Control“,„C“+[nGroup(2):nGroup(end)],„Treatment“],„Location“,„northwest“)

4.2 Bacon Decomposition

The Bacon decomposition indicates heterogeneous treatment effects. First, coefficients for the comparisons against the never-treated are different. Second, the „later vs. earlier“ comparison gives quite some difference compared to coefficients of never-treated.

However, the results indicate that the Bacon decomposition is not enough to understand the dynamic pattern inherent to treatments. The coeffcients do not allow to infer, that treatment over time amounts to 2, 2.5 and 3.

myData = table;

myData.time = inputTable.t;

myData.id = inputTable.id;

myData.treatVar = inputTable.group;

myData.outcome = inputTable.y;

bd = baconDecomp(myData, ‚time‘, ‚id‘, ‚treatVar‘, ‚outcome‘); 

% Run decomposition 

[results, results_agg] = bd.decompose() 

results = 9×5 table 




type
group1
group2
weight
estimate


1
‚Treated vs Never Treated‘
3
0
0.2645
2.5280

2
‚Treated vs Never Treated‘
4
0
0.2478
2.5562

3
‚Treated vs Never Treated‘
5
0
0.1754
1.9430

4
‚Treated earlier vs later‘
3
4
0.0435
1.9550

5
‚Treated earlier vs later‘
3
5
0.0924
2.1348

6
‚Treated earlier vs later‘
4
5
0.0650
2.0917

7
‚Treated later vs earlier‘
4
3
0.0435
1.7927

8
‚Treated later vs earlier‘
5
3
0.0462
1.2113

9
‚Treated later vs earlier‘
5
4
0.0217
1.4037


results_agg = 3×3 table 




Type
Weight
Estimate


1
‚Treated earlier vs later‘
0.2009
2.0819

2
‚Treated later vs earlier‘
0.1114
1.4759

3
‚Treated vs Never Treated‘
0.6877
2.3889

	type	group1	group2	weight	estimate
1	‚Treated vs Never Treated‘	3	0	0.2645	2.5280
2	‚Treated vs Never Treated‘	4	0	0.2478	2.5562
3	‚Treated vs Never Treated‘	5	0	0.1754	1.9430
4	‚Treated earlier vs later‘	3	4	0.0435	1.9550
5	‚Treated earlier vs later‘	3	5	0.0924	2.1348
6	‚Treated earlier vs later‘	4	5	0.0650	2.0917
7	‚Treated later vs earlier‘	4	3	0.0435	1.7927
8	‚Treated later vs earlier‘	5	3	0.0462	1.2113
9	‚Treated later vs earlier‘	5	4	0.0217	1.4037

	Type	Weight	Estimate
1	‚Treated earlier vs later‘	0.2009	2.0819
2	‚Treated later vs earlier‘	0.1114	1.4759
3	‚Treated vs Never Treated‘	0.6877	2.3889

TWFE effects regressions are biased downwards relatively to the true effects.

display(„—- Single Regressions —-„)

„—- Single Regressions —-„

res = didReg(inputTable,1);

resPanel = 2×3 table 




Coefficient
tStat
pVal


1 treat x post
2.2256
15.8588
0

2 nObs / R2
5000
NaN
0.2691


resOLS = 4×3 table 




Coefficient
tStat
pVal


1 const.
0.7281
8.8607
0

2 isTreated
-0.0183
-0.2448
0.8066

3 treat x post
2.3208
22.7773
0

4 nObs / R2
5000
NaN
0.1763


„Panel estimate is 2.2256
, OLS estimate is 2.3208“

	Coefficient	tStat	pVal
1 treat x post	2.2256	15.8588	0
2 nObs / R2	5000	NaN	0.2691

	Coefficient	tStat	pVal
1 const.	0.7281	8.8607	0
2 isTreated	-0.0183	-0.2448	0.8066
3 treat x post	2.3208	22.7773	0
4 nObs / R2	5000	NaN	0.1763

4.3 Wooldridge Regression

THe Wooldridge regressions finally let us understand the dynamic pattern inherent to the treatment effects. It becomes quite visible from the estimated ATT over time that treatment effects increase per year by 0.5 from a level of initially 2.

myData = inputTable(:,[„id“,„t“,„y“,„treat x post“]);

woolDID(myData);

Wooldridge(2021): Heterogeneous Treatment Effects Estimation Results:
results_table = 14×6 table 




Coefficient
StdErr
tStat
pValue
ConfLow(2.5)
ConfHigh(97.5)


1 Constant
0.7432
0.0283
26.2493
1.2130e-04
0.6877
0.7987

2 Year_2
0.6811
0.0950
7.1676
0.0056
0.4949
0.8674

3 Year_3
-0.1083
0.0261
-4.1477
0.0255
-0.1595
-0.0571

4 Year_4
0.6331
0.0925
6.8415
0.0064
0.4517
0.8144

5 Year_5
-1.3984
0.0283
-49.3929
1.8274e-05
-1.4539
-1.3429

6 Cohort_3
-0.0216
0.0348
-0.6192
0.5796
-0.0898
0.0467

7 Cohort_4
-0.0792
0.0288
-2.7557
0.0704
-0.1356
-0.0229

8 Cohort_5
-0.0311
6.8247e-16
-4.5554e+13
0
-0.0311
-0.0311

9 ATT_C3_Y3
2.0330
0.0307
66.1724
7.6047e-06
1.9728
2.0932

10 ATT_C3_Y4
2.4380
0.1202
20.2763
2.6225e-04
2.2023
2.6737

11 ATT_C3_Y5
2.9169
0.0348
83.7780
3.7485e-06
2.8486
2.9851

12 ATT_C4_Y4
2.2962
0.1150
19.9625
2.7473e-04
2.0708
2.5217

13 ATT_C4_Y5
2.6821
0.0288
93.2690
2.7169e-06
2.6257
2.7385

14 ATT_C5_Y5
1.9430
2.0814e-15
9.3351e+14
0
1.9430
1.9430

	Coefficient	StdErr	tStat	pValue	ConfLow(2.5)	ConfHigh(97.5)
1 Constant	0.7432	0.0283	26.2493	1.2130e-04	0.6877	0.7987
2 Year_2	0.6811	0.0950	7.1676	0.0056	0.4949	0.8674
3 Year_3	-0.1083	0.0261	-4.1477	0.0255	-0.1595	-0.0571
4 Year_4	0.6331	0.0925	6.8415	0.0064	0.4517	0.8144
5 Year_5	-1.3984	0.0283	-49.3929	1.8274e-05	-1.4539	-1.3429
6 Cohort_3	-0.0216	0.0348	-0.6192	0.5796	-0.0898	0.0467
7 Cohort_4	-0.0792	0.0288	-2.7557	0.0704	-0.1356	-0.0229
8 Cohort_5	-0.0311	6.8247e-16	-4.5554e+13	0	-0.0311	-0.0311
9 ATT_C3_Y3	2.0330	0.0307	66.1724	7.6047e-06	1.9728	2.0932
10 ATT_C3_Y4	2.4380	0.1202	20.2763	2.6225e-04	2.2023	2.6737
11 ATT_C3_Y5	2.9169	0.0348	83.7780	3.7485e-06	2.8486	2.9851
12 ATT_C4_Y4	2.2962	0.1150	19.9625	2.7473e-04	2.0708	2.5217
13 ATT_C4_Y5	2.6821	0.0288	93.2690	2.7169e-06	2.6257	2.7385
14 ATT_C5_Y5	1.9430	2.0814e-15	9.3351e+14	0	1.9430	1.9430

5 Conclusions

I’m happy that after some efforts, staggered DiD has become less of a hurdle to be estimated via Matlab. Still, there remains much to do, as for example doubly robust estimators (with propensity score matching, see e.g. Callaway/Sant’Anna 2021), treatment of covariates, missing covariate balance (i.e. general matching methods) and many more issues recently discussed in the econometrics literature are yet to be implemented. It would be fantastic if more people in the community (or Mathworks itself) could contribute to help making causal inference accessible in Matlab, so that researchers need less to rely on Stata (though Stata is a very nice statistics software).

Literature

Goodman-Bacon, A. (2021). Difference-in-Differences with Variation in Treatment Timing. Journal of Econometrics, 225(2), 254–277. https://doi.org/10.1016/j.jeconom.2021.03.014
Callaway, B., & Sant’Anna, P. H. C. (2021). Difference-in-Differences with Multiple Time Periods. Journal of Econometrics, 225(2), 200–230. https://doi.org/10.1016/j.jeconom.2020.12.001
Wooldridge, J. M. (2021). Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Difference-in-Differences Estimators. Available at SSRN: https://doi.org/10.2139/ssrn.3906345

Codes

The codes are provided as-is and meant to be first illustrative versions, without any liability for usage and/or errors).

baconDecomp, computeBaconStats, woolDID (Diff_in_Diff).
Test simData (a csv-file with simulated staggered heterogeneous treatment effects.

Schlagwörter: DiD, Matlab