
Mimic fixest::i() by relying on formulaic stateful transforms #782

@s3alfisc

Prompted by matthewwardrop/formulaic#238, implement a proper i() operator and replace the ugly string parsing.

Easily done via formulaic stateful transforms:

import numpy as np
import pandas as pd
from formulaic.transforms import stateful_transform
from formulaic.transforms.contrasts import C, TreatmentContrasts
from formulaic import model_matrix
import pyfixest as pf

data = pf.get_data()

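@stateful_transform
def i(factor_var, ref=None, _state=None, _metadata=None, _spec=None):
    # Encode factor_var with treatment contrasts, using `ref` as the reference level.
    # The encoded factor is cached in _state, so it is only built on the first call.
    if "i" not in _state:
        _state["i"] = C(data=factor_var, contrasts=TreatmentContrasts(ref))

    return _state["i"]

# `i` is picked up from the calling scope when the formula is evaluated.
model_matrix("i(f1, ref = 1.0)", data=data).head()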

Challenge: How to add a second variable that can be interacted with factor_var?

That is, an API along these lines:

@stateful_transform
def i(factor_var, var = None, ref=None, ref2 = None, _state=None, _metadata=None, _spec=None):

    if var is None: 
        if "i" not in _state:
            _state["i"] = Formula(C(data = factor_var, contrasts = TreatmentContrasts(ref)))
    else: 
        if "i" not in _state:
            # this does not work, need to find where : interaction implemented in formulaic
            _state["i"] = C(data = factor_var, contrasts = TreatmentContrasts(ref)) : C(data = var, contrasts = TreatmentContrasts(ref2))

    return _state["i"]
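Until the formulaic-native way to build the `:` interaction is found, the target output can at least be pinned down with plain pandas. The sketch below is not formulaic's interaction machinery; `interact_factor_with_var` is a hypothetical helper, and the `factor::level:var` column naming is an assumption chosen to resemble fixest's output. It only covers the case where `var` is continuous.

```python
import pandas as pd


def interact_factor_with_var(factor, var, ref=None):
    """Hypothetical helper: dummy-encode `factor`, drop `ref`, scale each dummy by `var`."""
    dummies = pd.get_dummies(factor, dtype=float)
    if ref is not None:
        if ref not in dummies.columns:
            raise ValueError(f"Reference level {ref!r} not found in {factor.name!r}.")
        dummies = dummies.drop(columns=[ref])
    # Row-wise product: each level column is non-zero only where that level is active.
    interacted = dummies.mul(var, axis=0)
    # Column naming is an assumption, chosen to resemble fixest's output.
    interacted.columns = [f"{factor.name}::{level}:{var.name}" for level in dummies.columns]
    return interacted


toy = pd.DataFrame({"f1": [1, 2, 3, 1], "x": [10.0, 20.0, 30.0, 40.0]})
result = interact_factor_with_var(toy["f1"], toy["x"], ref=1)
```

Something like this could serve as a reference implementation to test the eventual stateful transform against.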

Maybe too ambitious (but certainly useful for pooling DiD time periods, e.g. months into years): bin and bin2 arguments to combine multiple fixed effects levels.
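As a rough idea of what a `bin` argument could do before the factor is encoded, here is a minimal sketch in plain Python. The helper name `bin_levels` and the dict-of-lists format for `bins` are assumptions for illustration, only loosely modeled on fixest; levels not mentioned in any bin are passed through unchanged.

```python
def bin_levels(values, bins):
    """Hypothetical helper: map each level to its bin label, leaving unlisted levels as-is.

    `bins` maps a new label to the list of original levels it should absorb.
    """
    lookup = {}
    for label, members in bins.items():
        for member in members:
            if member in lookup:
                raise ValueError(f"Level {member!r} is assigned to more than one bin.")
            lookup[member] = label
    return [lookup.get(value, value) for value in values]


# Pool monthly periods into years.
months = ["2020-01", "2020-02", "2021-01", "2021-03"]
year_bins = {"2020": ["2020-01", "2020-02"], "2021": ["2021-01", "2021-03"]}
pooled = bin_levels(months, year_bins)
```

The binned levels would then be passed on to the contrast encoding, so `ref` would refer to the new bin labels rather than the original levels.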

Labels: enhancement (New feature or request)
