DiD Toolbox for Matlab
(Dr. Ralf Elsas-Nicolle, 10/30/2025)
Table of Contents
Overview
As mentioned in my previous blog, recent years have witnessed a strong increase in academic studies making causal claims (see for example Garg/Fetzer 2024), and significant methodological advancements of tools for estimating causal effects in complex, real-world settings have been made. The 2021 Nobel Prize in Economic Sciences awarded to Joshua Angrist, Guido Imbens, and David Card underscores the transformative impact.
I have developed a Matlab DiD Toolbox to make these methods also available in Matlab. The toolbox can be downloaded and installed from Matlab File Exchange. The toolbox implements methods developed or discussed in the following key papers:
- Goodman-Bacon (2021): Provides the mathematical foundation for the decomposition of the TWFE estimator, explaining how it operates as a weighted average of DiD estimates and identifying the source of bias from „negative weights“ due to treatment effect heterogeneity.
- Wooldridge (2021): Establishes the algebraic equivalence between the Two-Way Fixed Effects (TWFE) estimator and the Two-Way Mundlak (TWM) regression, enabling flexible implementation using pooled OLS.
- Borusyak, Jaravel, and Spiess (2024): Derives the efficient and robust imputation estimator (BJS) for staggered DiD, which estimates counterfactual outcomes using only untreated observations to calculate heterogeneous causal effects, providing efficiency and avoiding spurious identification.
- de Chaisemartin and D’Haultfœuille (2020): Proposes the $ $DID_M$ $ estimator, which estimates a robust Average Treatment Effect across switching cells ($ $\delta_S$ $) and introduces robustness measures for assessing TWFE bias, particularly in designs where weights may be negative. And the unique feature is that the estimator can handle an on/off treatment, while all other estimators assume that treatment is an absorptive state.
- Callaway / Sant’Anna (2021): CS focus on cohort/time based analyses, looking at $ ATT(g,k) $ with a focus on taking covariates into account. They derive identification, estimation, and inference strategies assuming parametric nuisance models (i.e. linear outcome regression, logit for propensity scores) that admit standard regularity conditions. In an extension, the interaction weighted estimator by Sun/Abraham (2021) builds on this work, showing how get unbiased event studies in this setting.
The toolbox comes with an extensive manual as well as one worked real-life example for an analysis of a staggered treatment, using the example of the introduction of the so-called „Castle Doctrine“ in the U.S., where from 2000 to 2010, more than 20 states passed the “Castle Doctrine” law which extended lethal self-defense rights.
In this blog post, I will showcase some basic functionality of the toolbox to test whether two of the most popular estimators are unbiased using Monte Carlo simulation – the Wooldridge and Callaway/Sant’Anna estimators. Obviously, there is no doubt that these estimators under their assumptions are unbiased, but the Monte Carlo simulation shown below will test whether the code implementation is flawless.
Data Generation
In the first step, the genDIDdata functionality is used to generate data with staggered and heterogeneous treatment. This function allows a very flexible generation of panel data with different treatment effect modeling. Specifically, we will generate data comprising 12 year-observations for 500 individuals out of which 60% are treated, starting at one out of three defined periods (year 4, 7, and 10). We also model a dependency of all variables (except for treatment effects on two exogenous variables x1 and x2 (i.e. covariates).
Treatment effects are generated such the first cohort has an ATT of 2, the second of 2.5 and the third of 3. Knowing the actual ATTs allows in the Monte Carlo simulation to compare estimated ATTs to the true ones, i.e. testing unbiasedness.
clear;
T = did.genDIDdata(12, 500, 0.6, …
treatType=„constantTime“, startPeriod=4, …
cohortTimes=[4 7 10],xNum=2, … % creates x1, x2
betaX=[0.6, -0.4], … % effect of x1,x2 on y
xUnitStd=1, xTimeStd=1, xShockStd=1, CohortIncrease=0.5, Seed=42);
ds = did.Dataset.fromTable(T,idVar=„id“, timeVar=„time“, dVar=„D“, yVar=„y“,describe=true);
TWFE and Bacon Decomposition – Ignoring Staggered Treatment
As the first step of analyses, a standard two-way fixed effects regression is conducted, yielding an average estimate for the ATT. Since the Borusyiak and the Goodman-Bacon analyses, it is well known that the standard TWFE estimator is shown to be a weighted average of all possible Difference-in-Differences estimators based on the cohorts in the data. As some comparisons are not compatible with the standard DiD assumptions, they can contribute to the average estimate (even with negative) weight, thereby inducing bias.
resTWFE = did.fit(„TWFE“, ds,Covariates=[„x1“,„x2“]);
The Bacon-decomposition can be used to measure the contribution to the ATT estimate of 2.524 by decomposing the cohort comparisons:
- Treated vs. Untreated: Compares a treated cohort to a never-treated control group.
- Early vs. Late Treated (Admissible Timing): Earlier-treated groups act as treatment, later-treated groups act as controls during their pre-period.
- Late vs. Early Treated (Forbidden Comparison): Later-treated groups act as treatment, already-treated earlier groups act as controls.
res = did.fit(„bacon“, ds);
As a result we see very different ATT contributions for the (aggregated) cohort comparisons. Still, the „clean“ comparison (treated vs. never-treated) has a weight of about 90%, indicating a not too strong bias of the average ATT. Keep in mind though that treatment effects vary over time, which is necessarily ignored when calculating an average across all periods and cohorts using TWFE.
Staggered DiD Estimators
For illustration purposes, we first demonstrate a single estimation for both the Wooldridge and Callaway/Sant’Anna estimator. Cohort aggregated ATTs for both estimators show time variability of ATTs and are close to the true ATTs of 2, 2.5 and 3, respectively.
resWool = did.fit(„wooldridge“, ds, Covariates=[„x1“,„x2“],vcov=„clustered“, clusters=[„id“]);
resCS = did.fit(„cs“, ds, …
Comparison=„notyet“, Delta=0, Approach=„dr“, Covariates=[„x1“,„x2“], …
Seed=29, SEMethod=„multiplier“);
Monte Carlo Simulation
We now conduct a Monte Carlo simulation, where in 1.000 simulation runs each time new panel data (with the same structure but different random variability) are generated and the Wooldridge and CallaWay/Sant’Anna estimators are applied. These estimators will be unbiased when the mean ATT estimate across all simulation runs corresponds to the true ATT values on average.
S=1000; % num simulations
resATT=nan(S,3);
resCS=nan(S,3);
tic
parfor ii=1:S
% gen data
simTable = did.genDIDdata(12, 500, 0.6, …
treatType=„constantTime“, startPeriod=4, …
cohortTimes=[4 7 10],xNum=2, … % creates x1, x2
betaX=[0.6, -0.4], … % effect of x1,x2 on y
xUnitStd=1, xTimeStd=1, xShockStd=1, CohortIncrease=0.5);
% estimate Wooldridge
resSim = did.fit(„Wooldridge“,simTable,idVar=„id“, timeVar=„time“, dVar=„D“, yVar=„y“, …
Covariates=[„x1“,„x2“],…
details=false,describe=false,Print=false);
resATT(ii,:) = resSim.summaryTable.(„ATT(k)“)‘;
% estiamte Callway/Sant’Anna
resSimCS = did.fit(„cs“, simTable,idVar=„id“, timeVar=„time“, dVar=„D“, yVar=„y“, …
Comparison=„never“, Delta=0, Approach=„dr“, Covariates=[„x1“,„x2“], …
Seed=NaN, SEMethod=„multiplier“,Print=false);
resCS(ii,:) = resSimCS.Aggregates.byCohort.Estimate‘;
end
toc
% make tables of true ATTs versus average estimated ATTs per cohort over the
% simulation runs
trueTau = unique(T.ATT);
nameVar = „ATT_“+[1:size(resATT,2)];
resTbl = array2table([trueTau(2:end)‘;mean(resATT)],VariableNames=nameVar,RowNames=[„True“,„meanEstimate“]);
fprintf(„Wooldridge meanATT from „+ S + “ sims: \n“);
resTbl
resTblCS = array2table([trueTau(2:end)‘;mean(resCS)],VariableNames=nameVar,RowNames=[„True“,„meanEstimate“]);
fprintf(„Callaway/Sant’Anna(dr) meanATT from „+ S + “ sims: \n“);
resTblCS
As can be seen from the tables, both estimators take the control variates correctly into account and are unbiased as average ATT estimates across the simulations reveal the true coefficients. This is good news for the implementation of these estimators in the DID Toolbox.
Literature
- Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation, Review of Economic Studies, 00, 1–33.
- Callaway, B. / Sant’Anna, P. (2021): Difference-in-Differences with Multiple Time Periods, Journal of Econometrics 225 (2): 200–230.
- de Chaisemartin, C., & D’Haultfœuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects, American Economic Review, 110(9), 2964–2996.
- Garg, P. / Fetzer, T. (2025): Causal Claims in Economics, arXiv:2501.06873
- Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing, Journal of Econometrics, 225(2), 254–277.
- Wooldridge, J. (2021). Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Event Study Estimators. Working Paper.