event_study did2s estimator gives ValueError when some fixed effects are not estimated in first stage #1244

@marcandre259

Description

I discovered this issue while trying to reproduce an analysis with the two-stage difference-in-differences estimator in did2s.py.

I first noticed it a couple of months ago but only now had time to look into it, so I am referring to the current master branch.

I cannot share the data used, but the issue can be reproduced by removing a key-value pair from the dictionary returned by fit1.fixef() right after the first-stage estimation on line 217 of did2s.py.

What happens is that one or more fixed effects cannot be estimated in the first-stage fit (for whatever reason; for example, collinearity between a unique time and unit combination). This leaves some predicted y values as None/np.nan. As a result, the second-stage residuals _second_u have a smaller dimension than _first_u.
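To illustrate the mechanism, here is a minimal NumPy sketch. It is a toy model of the shape mismatch, not the actual pyfixest computation: the arrays `y`, `y_hat`, and `weights` are made-up stand-ins, and I am assuming that rows with a missing prediction are dropped before the second-stage residuals are formed.

```python
import numpy as np

# Toy stand-ins (hypothetical values, not pyfixest internals).
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# NaN marks rows whose fixed effect was not estimated in the first stage.
y_hat = np.array([0.9, np.nan, 3.2, 3.9, np.nan])
weights = np.ones(len(y))

# Full-length residual vector: one entry per row, NaN where y_hat is NaN.
first_u = y - y_hat

# Assumed behaviour: rows without a prediction are dropped,
# so the second residual vector is shorter.
keep = ~np.isnan(y_hat)
second_u = (y - y_hat)[keep]

raised = False
try:
    # Same kind of in-place multiplication as in the traceback:
    # a length-3 array times a length-5 array cannot be broadcast.
    second_u *= weights
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")
```

The point is only that the two residual vectors end up with different lengths, so any elementwise operation against a full-length array fails.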

When the did2s.vcov method is then called, I get:

Exception has occurred: ValueError
operands could not be broadcast together with shapes
  File "/Users/mcn5r1x/Projects/pyfixest/pyfixest/did/did2s.py", line 372, in _did2s_vcov
    second_u *= weights_array
  File "/Users/mcn5r1x/Projects/pyfixest/pyfixest/did/did2s.py", line 115, in vcov
        data=self._data,
           ^^^^^^^^^^^^^
    ...<9 lines>...
        first_u=self._first_u,
    
  File "/Users/mcn5r1x/Projects/pyfixest/pyfixest/did/estimation.py", line 117, in event_study
    vcov, _G = did2s.vcov()
               ~~~~~~~~~~^^
  File "/Users/mcn5r1x/Projects/pyfixest/debug.py", line 17, in <module>
        data=df_estudy.to_pandas(),
               ^^^^^^^^^^^^^^^^^^^^
    ...<6 lines>...
    
    
ValueError: operands could not be broadcast together with shapes

The collinearity is clearly a problem, so I am fine with getting an error. However, I had no idea what caused it until I retraced the steps. The error should also probably be caught earlier.
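An earlier check could look roughly like the sketch below. This is only a suggestion, not existing pyfixest code: `find_missing_levels` and `check_first_stage` are hypothetical helper names, and I am assuming the estimated fixed effects arrive as a nested dict keyed like "C(unit)" with string level names, which is what I saw from fit1.fixef().

```python
def find_missing_levels(observed, estimated):
    """Return, per fixed effect, the observed levels that have no estimate.

    observed:  dict mapping a fixed-effect name to the levels seen in the data
    estimated: dict mapping the same names to {level: coefficient} dicts
    (both layouts are assumptions made for this sketch)
    """
    missing = {}
    for name, levels in observed.items():
        estimated_levels = {str(k) for k in estimated.get(name, {})}
        gap = {str(v) for v in levels} - estimated_levels
        if gap:
            missing[name] = sorted(gap)
    return missing


def check_first_stage(observed, estimated):
    """Raise a descriptive error as soon as any level lacks an estimate."""
    missing = find_missing_levels(observed, estimated)
    if missing:
        raise ValueError(
            f"First stage did not estimate fixed effects for: {missing}"
        )


# Made-up example: unit 2 has no estimated fixed effect.
observed = {"C(unit)": [1, 2, 3], "C(yearweek)": [202401, 202402]}
estimated = {
    "C(unit)": {"1": 0.5, "3": -0.2},
    "C(yearweek)": {"202401": 0.1, "202402": 0.0},
}
missing = find_missing_levels(observed, estimated)
print(missing)
```

Calling something like `check_first_stage` right after the first-stage fit would name the offending levels instead of failing later with a broadcasting error.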

I diagnosed the issue with:

    # Collect the fixed-effect levels that were actually estimated
    unit_yw_sets = {}
    for k, v in fit1.fixef().items():
        print(f"{k}\n ---")
        unit_yw_sets[k] = set()
        for _k, _v in v.items():
            print(f"{_k}: {_v}")
            unit_yw_sets[k].add(int(_k))

    # Levels present in the original data
    unit_set = set(data["unit"])
    yearweek_set = set(data["yearweek"])

    # Levels present in the data but missing from the first-stage estimates
    print(unit_set - unit_yw_sets["C(unit)"])
    print(yearweek_set - unit_yw_sets["C(yearweek)"])

around line 231 of did2s.py

Labels: bug
