Dear Cristian Arteaga,
First of all, thank you for developing such a great package. I have been using it extensively and I think it works just great!
However, I think I found a weird behavior (most likely a bug) when predicting probabilities from a MIXL model.
I was trying to re-create the log-likelihood (LL) of the model from its predicted probabilities using equation (6) in your article.
Unfortunately, I noticed that I was not able to replicate it, and the differences were rather large (see the first chunk of code below). Oddly enough, when using only 1 draw, I am able to replicate the LL value (see the second chunk of code below).
I haven't had time to go through your source code, but I think there is something wrong with the .predict() method for the MIXL models, since, for MNL models, I checked that my method of retrieving the LL works just fine (not included here).
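For reference, the check I am doing boils down to the following, shown here on a made-up probability matrix (the numbers are purely illustrative and do not come from the model):

```python
import numpy as np

# Toy predicted probabilities: 3 choice situations x 3 alternatives (made-up values)
pred_proba = np.array([[0.2, 0.5, 0.3],
                       [0.6, 0.3, 0.1],
                       [0.1, 0.1, 0.8]])

# Column index of the chosen alternative in each choice situation
chosen = np.array([1, 0, 2]).reshape(-1, 1)

# Pick the probability of the chosen alternative row by row
proba_chosen = np.take_along_axis(pred_proba, chosen, axis=1)

# Log-likelihood: sum of the log-probabilities of the chosen alternatives
recreated_ll = np.sum(np.log(proba_chosen))
print("recreated LL:", recreated_ll)
```

Here the result is simply log(0.5) + log(0.6) + log(0.8).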
First chunk
#%%
#
import pandas as pd
import numpy as np
from xlogit.utils import wide_to_long
from xlogit import MixedLogit
# from your example
df_wide = pd.read_table("http://transp-or.epfl.ch/data/swissmetro.dat", sep='\t')
# Keep only observations for commute and business purposes that contain known choices
df_wide = df_wide[(df_wide['PURPOSE'].isin([1, 3]) & (df_wide['CHOICE'] != 0))]
df_wide['custom_id'] = np.arange(len(df_wide)) # Add unique identifier
df_wide['CHOICE'] = df_wide['CHOICE'].map({1: 'TRAIN', 2:'SM', 3: 'CAR'})
df = wide_to_long(df_wide, id_col='custom_id', alt_name='alt', sep='_',
                  alt_list=['TRAIN', 'SM', 'CAR'], empty_val=0,
                  varying=['TT', 'CO', 'HE', 'AV', 'SEATS'], alt_is_prefix=True)
df['ASC_TRAIN'] = np.ones(len(df))*(df['alt'] == 'TRAIN')
df['ASC_CAR'] = np.ones(len(df))*(df['alt'] == 'CAR')
df['TT'], df['CO'] = df['TT']/100, df['CO']/100 # Scale variables
annual_pass = (df['GA'] == 1) & (df['alt'].isin(['TRAIN', 'SM']))
df.loc[annual_pass, 'CO'] = 0 # Cost zero for pass holders
varnames=['ASC_CAR', 'ASC_TRAIN', 'CO', 'TT']
# Model building
model = MixedLogit()
model.fit(X=df[varnames],
          y=df['CHOICE'],
          varnames=varnames,
          alts=df['alt'],
          ids=df['custom_id'],
          avail=df['AV'],
          panels=df["ID"],
          randvars={'TT': 'n'},
          n_draws=100,
          optim_method='L-BFGS-B')
# Create predictions
predictions = model.predict(X=df[varnames],
                            varnames=varnames,
                            alts=df['alt'],
                            ids=df['custom_id'],
                            avail=df['AV'],
                            panels=df["ID"],
                            n_draws=100,
                            return_proba=True)
# Recovering the predicted probabilities
pred_proba = predictions[1]
# Map each observed choice to the column index of that alternative in pred_proba
chosen = np.array(df_wide['CHOICE'].map({'TRAIN': 0, 'SM': 1, 'CAR': 2})).reshape(-1, 1)
# Select the probability of the chosen alternative
proba_chosen = np.take_along_axis(pred_proba, chosen, axis=1)
# Compute the log-likelihood as the sum of the log-probabilities of the chosen alternatives
recreated_LL = np.sum(np.log(proba_chosen))
print("recreated LL:",recreated_LL)
print("model's LL :",model.loglikelihood)
# recreated LL: -5293.024645448918
# model's LL : -4360.226616589964
Second chunk
# The same again but with 1 draw only.
model.fit(X=df[varnames],
          y=df['CHOICE'],
          varnames=varnames,
          alts=df['alt'],
          ids=df['custom_id'],
          avail=df['AV'],
          panels=df["ID"],
          randvars={'TT': 'n'},
          n_draws=1,
          optim_method='L-BFGS-B')
# Create predictions
predictions = model.predict(X=df[varnames],
                            varnames=varnames,
                            alts=df['alt'],
                            ids=df['custom_id'],
                            avail=df['AV'],
                            panels=df["ID"],
                            n_draws=1,
                            return_proba=True)
# Recovering the predicted probabilities
pred_proba = predictions[1]
# Map each observed choice to the column index of that alternative in pred_proba
chosen = np.array(df_wide['CHOICE'].map({'TRAIN': 0, 'SM': 1, 'CAR': 2})).reshape(-1, 1)
# Select the probability of the chosen alternative
proba_chosen = np.take_along_axis(pred_proba, chosen, axis=1)
# Compute the log-likelihood as the sum of the log-probabilities of the chosen alternatives
recreated_LL = np.sum(np.log(proba_chosen))
print("recreated LL:",recreated_LL)
print("model's LL :",model.loglikelihood)
#recreated LL: -5331.206129776281
#model's LL : -5331.206129776281
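One possible explanation, which I have not verified against the xlogit source: if the estimation LL multiplies a panel's chosen-alternative probabilities within each draw before averaging over draws, while the predicted probabilities are averaged over draws separately for each choice situation, then the two quantities agree with a single draw and differ otherwise. The toy sketch below uses hypothetical binary-logit probabilities and no xlogit calls; the function name and all values are made up, and it only illustrates that arithmetic:

```python
import numpy as np

def ll_both_ways(n_draws, n_panels=4, n_obs=3, seed=0):
    """Return (panel-style LL, per-situation LL) for toy simulated probabilities."""
    rng = np.random.default_rng(seed)
    # Hypothetical attribute of the chosen alternative, one per (panel, situation)
    x = rng.uniform(0.5, 2.0, size=(n_panels, n_obs, 1))
    # One random-coefficient draw per (panel, draw), shared by that panel's situations
    beta = rng.standard_normal(size=(n_panels, 1, n_draws))
    # Toy probability of the chosen alternative, shape (n_panels, n_obs, n_draws)
    p = 1.0 / (1.0 + np.exp(-beta * x))
    # Panel-style: product over a panel's situations within each draw, then mean over draws
    ll_panel = np.sum(np.log(np.mean(np.prod(p, axis=1), axis=1)))
    # Per-situation: mean over draws for each situation, then sum of logs
    ll_obs = np.sum(np.log(np.mean(p, axis=2)))
    return ll_panel, ll_obs

one_panel, one_obs = ll_both_ways(n_draws=1)
many_panel, many_obs = ll_both_ways(n_draws=100)
print("1 draw    :", one_panel, one_obs)
print("100 draws :", many_panel, many_obs)
```

With one draw the averages over draws do nothing, so both expressions reduce to the same sum of logs; with many draws they generally differ. Whether this is what happens inside .fit() and .predict() is only my guess.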
Thank you in advance!
Álvaro
