Closed
Description
What happened + What you expected to happen
I suspect a bug around the negative binomial distribution. Its performance seems to be off compared with the other available distributions, even on positive count data, where it is supposed to be effective.
Perhaps there is a conflict with the way the input data is scaled? I know that PyTorch Forecasting blocks the use of the negative binomial loss when centered normalization is applied: https://pytorch-forecasting.readthedocs.io/en/stable/_modules/pytorch_forecasting/metrics/distributions.html#NegativeBinomialDistributionLoss
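As a hypothetical illustration of the suspected conflict (this sketch is my own, not taken from neuralforecast internals): a centered scaler such as the robust scaler maps part of any positive count series below zero, which lies outside the non-negative support of the negative binomial distribution.

```python
import numpy as np

# Positive count data, as the negative binomial expects.
counts = np.array([3, 7, 12, 25, 40, 90], dtype=float)

# Robust scaling: subtract the median, divide by the interquartile range
# (roughly what scaler_type="robust" does).
median = np.median(counts)
iqr = np.percentile(counts, 75) - np.percentile(counts, 25)
scaled = (counts - median) / iqr

# Every value below the median becomes negative, i.e. it falls outside
# the non-negative support of the negative binomial distribution.
print(scaled)
print((scaled < 0).any())  # True
```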
I can't share the results on my data, but I've coded a quick example that illustrates the problem.
Versions / Dependencies
neuralforecast==1.7.4
torch==2.3.1+cu121
Reproduction script
import pandas as pd
import numpy as np
import itertools

from neuralforecast import NeuralForecast
from neuralforecast.models import DeepAR, TFT, NHITS
from neuralforecast.losses.pytorch import DistributionLoss
from neuralforecast.losses.numpy import mae
from neuralforecast.utils import AirPassengersPanel

Y_df = AirPassengersPanel

# Train every (model, distribution) pair with the same settings.
nf = NeuralForecast(
    models=[
        eval(model)(
            h=12,
            input_size=48,
            max_steps=100,
            scaler_type="robust",
            loss=DistributionLoss(distr, level=[]),
            alias=f"{model}-{distr}",
            enable_model_summary=False,
            enable_checkpointing=False,
            enable_progress_bar=False,
            logger=False,
        )
        for model, distr in itertools.product(
            ["DeepAR", "TFT", "NHITS"],
            ["Poisson", "Normal", "StudentT", "NegativeBinomial"],
        )
    ],
    freq="M",
)

cv_df = nf.cross_validation(Y_df, n_windows=5, step_size=12).reset_index()

def evaluate(df):
    # Use the lag-12 column already in AirPassengersPanel as a seasonal
    # naive baseline, then compute the MAE of every model's median forecast.
    eval_ = {}
    df = df.merge(Y_df[["unique_id", "ds", "y_[lag12]"]], how="left").rename(
        columns={"y_[lag12]": "seasonal_naive"}
    )
    models = ["seasonal_naive"] + list(df.columns[df.columns.str.contains("median")])
    for model in models:
        eval_[model] = {}
        eval_[model][mae.__name__] = int(np.round(mae(df["y"].values, df[model].values), 0))
    eval_df = pd.DataFrame(eval_).rename_axis("metric")
    return eval_df

cv_df.groupby("cutoff").apply(lambda df: evaluate(df))
Issue Severity
Medium: It is a significant difficulty but I can work around it.
