Description
I discovered this issue while trying to reproduce an analysis with the two-stage difference-in-differences estimator in did2s.py.
I first noticed it a couple of months ago but only now had time to look into it, so I'm referring to the current master branch.
I cannot share the data used, but the problem can be reproduced by removing a key-value pair from the dict returned by fit1.fixef() right after the first-stage estimation on line 217 of did2s.py.
What happens is that one or more fixed effects cannot be estimated in the first-stage fit (for some reason, e.g. collinearity between a unique time and unit combination). As a result, some predicted y values are None/np.nan, and the second-stage residuals _second_u end up with a smaller dimension than _first_u.
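The mechanism can be illustrated outside pyfixest with a minimal NumPy sketch. This is not the library's code: the arrays `y`, `y_hat`, `weights` and the way `second_u` is built are hypothetical stand-ins, assuming only that rows with missing first-stage predictions are dropped before the second stage while the weights keep the full length.

```python
import numpy as np

# Hypothetical outcome and first-stage predictions for 6 observations.
# Two predictions are NaN because their fixed effect level was not estimated.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_hat = np.array([0.9, 2.1, np.nan, 3.8, 5.2, np.nan])
weights = np.ones_like(y)  # full-length weights, one per original row

# Residualized outcome; rows with NaN are dropped before the second stage.
y_tilde = y - y_hat
kept = ~np.isnan(y_tilde)

# Stand-in for second-stage residuals: only the kept rows remain.
second_u = y_tilde[kept] - y_tilde[kept].mean()

try:
    second_u *= weights  # shapes (4,) and (6,) do not match
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The in-place multiplication fails because the residual vector has lost the rows with missing predictions while the weights have not.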
When the did2s.vcov method is then called, I get:
Exception has occurred: ValueError
operands could not be broadcast together with shapes
File "/Users/mcn5r1x/Projects/pyfixest/pyfixest/did/did2s.py", line 372, in _did2s_vcov
second_u *= weights_array
File "/Users/mcn5r1x/Projects/pyfixest/pyfixest/did/did2s.py", line 115, in vcov
data=self._data,
^^^^^^^^^^^^^
...<9 lines>...
first_u=self._first_u,
File "/Users/mcn5r1x/Projects/pyfixest/pyfixest/did/estimation.py", line 117, in event_study
vcov, _G = did2s.vcov()
~~~~~~~~~~^^
File "/Users/mcn5r1x/Projects/pyfixest/debug.py", line 17, in <module>
data=df_estudy.to_pandas(),
^^^^^^^^^^^^^^^^^^^^
...<6 lines>...
ValueError: operands could not be broadcast together with shapes

The collinearity is clearly a problem, so I am fine with getting an error. However, I had no idea what caused it until I retraced the steps. The error should probably also be caught earlier, with a message that points to the first stage.
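As a rough idea of what an earlier check could look like, here is a sketch that compares the levels in the data with the levels that received a first-stage estimate and raises a descriptive error. The function names `find_unestimated_levels` and `check_first_stage`, and the plain column-name keys, are my own assumptions for illustration; they are not part of pyfixest, and the dict returned by fit1.fixef() would first need to be mapped into this shape.

```python
import pandas as pd


def find_unestimated_levels(data, fixef_levels):
    """Return, per fixed-effect column, the levels in `data` without an estimate."""
    missing = {}
    for col, estimated in fixef_levels.items():
        diff = set(data[col]) - set(estimated)
        if diff:
            missing[col] = diff
    return missing


def check_first_stage(data, fixef_levels):
    """Raise a descriptive error if any fixed-effect level was not estimated."""
    missing = find_unestimated_levels(data, fixef_levels)
    if missing:
        details = "; ".join(
            f"{col}: {sorted(levels)}" for col, levels in missing.items()
        )
        raise ValueError(
            "First-stage fixed effects could not be estimated for some levels "
            f"({details}); predictions for these rows would be missing."
        )


# Toy data: unit 3 appears in the data but has no estimated fixed effect.
data = pd.DataFrame(
    {"unit": [1, 1, 2, 2, 3, 3], "yearweek": [202401, 202402] * 3}
)
fixef_levels = {"unit": {1, 2}, "yearweek": {202401, 202402}}

missing = find_unestimated_levels(data, fixef_levels)
print(missing)

try:
    check_first_stage(data, fixef_levels)
except ValueError as exc:
    early_error = str(exc)
    print(early_error)
```

A check along these lines would fail right after the first stage and name the offending levels, instead of surfacing later as a shape mismatch in the variance computation.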
I diagnosed the issue with the following snippet, placed around line 231 of did2s.py:
# Collect the levels that received a fixed-effect estimate
unit_yw_sets = {}
for k, v in fit1.fixef().items():
    print(f"{k}\n ---")
    unit_yw_sets[k] = set()
    for _k, _v in v.items():
        print(f"{_k}: {_v}")
        unit_yw_sets[k].add(int(_k))
# Original sets of levels in the data
filiaal_set = set(data["unit"])
yearweek_set = set(data["yearweek"])
# Levels present in the data but missing from the estimated fixed effects
print(filiaal_set - unit_yw_sets["C(unit)"])
print(yearweek_set - unit_yw_sets["C(yearweek)"])