potential bug in prediction of MIXL models

Dear Cristian Arteaga, 
First of all, thank you for developing such a great package. I have been using it extensively and I think it works just great!

However, I think I have found unexpected behavior (most likely a bug) when predicting probabilities from a MIXL model.

I was trying to recreate the log-likelihood (LL) of the model from its predicted probabilities using equation (6) in your article:

![image](https://user-images.githubusercontent.com/39359053/234081581-c003482f-4949-445e-9ee4-5c980d9cda37.png)

Unfortunately, I noticed that I could not replicate the LL, and the differences were rather large (see the first chunk of code below). Oddly enough, when using only 1 draw, I can replicate the LL value exactly (see the second chunk of code below).

I haven't had time to go through your source code, but I suspect something is wrong with the `.predict()` method for MIXL models, since I checked that the same way of recovering the LL works fine for MNL models (not included here).
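
For reference, the recreation step used in both chunks boils down to the small helper below, shown here on made-up probabilities so it runs without the dataset. The name `loglik_from_proba` and the toy numbers are illustrative assumptions, not part of xlogit.

```python
import numpy as np

def loglik_from_proba(pred_proba, chosen_idx):
    """Sum of log-probabilities of the chosen alternatives.

    pred_proba: (n_obs, n_alts) array of predicted probabilities.
    chosen_idx: length n_obs sequence with the column index of each choice.
    (Hypothetical helper, not an xlogit function.)
    """
    pred_proba = np.asarray(pred_proba, dtype=float)
    chosen_idx = np.asarray(chosen_idx).reshape(-1, 1)
    # Pick, for every row, the probability of the alternative that was chosen
    proba_chosen = np.take_along_axis(pred_proba, chosen_idx, axis=1)
    return float(np.sum(np.log(proba_chosen)))

# Toy data: two observations, three alternatives
toy_proba = np.array([[0.2, 0.5, 0.3],
                      [0.6, 0.3, 0.1]])
toy_chosen = [1, 0]
toy_ll = loglik_from_proba(toy_proba, toy_chosen)
print(toy_ll)  # equals log(0.5) + log(0.6)
```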

# First chunk

```
#%%
# 
import pandas as pd
import numpy as np
from xlogit.utils import wide_to_long
from xlogit import MixedLogit
# Data preparation, taken from your example
df_wide = pd.read_table("http://transp-or.epfl.ch/data/swissmetro.dat", sep='\t')
# Keep only observations for commute and business purposes that contain known choices
df_wide = df_wide[(df_wide['PURPOSE'].isin([1, 3]) & (df_wide['CHOICE'] != 0))]
df_wide['custom_id'] = np.arange(len(df_wide))  # Add unique identifier
df_wide['CHOICE'] = df_wide['CHOICE'].map({1: 'TRAIN', 2:'SM', 3: 'CAR'})
df = wide_to_long(df_wide, id_col='custom_id', alt_name='alt', sep='_',
                  alt_list=['TRAIN', 'SM', 'CAR'], empty_val=0,
                  varying=['TT', 'CO', 'HE', 'AV', 'SEATS'], alt_is_prefix=True)
df['ASC_TRAIN'] = np.ones(len(df))*(df['alt'] == 'TRAIN')
df['ASC_CAR'] = np.ones(len(df))*(df['alt'] == 'CAR')
df['TT'], df['CO'] = df['TT']/100, df['CO']/100  # Scale variables
annual_pass = (df['GA'] == 1) & (df['alt'].isin(['TRAIN', 'SM']))
df.loc[annual_pass, 'CO'] = 0  # Cost zero for pass holders
varnames=['ASC_CAR', 'ASC_TRAIN', 'CO', 'TT']


# Model building
model = MixedLogit()
model.fit(X=df[varnames],
          y=df['CHOICE'],
          varnames=varnames,
          alts=df['alt'],
          ids=df['custom_id'],
          avail=df['AV'],
          panels=df["ID"],
          randvars={'TT': 'n'},
          n_draws=100,
          optim_method='L-BFGS-B')


# Create predictions
predictions = model.predict(X=df[varnames],
                            varnames=varnames,
                            alts=df['alt'],
                            ids=df['custom_id'],
                            avail=df['AV'],
                            panels=df["ID"],
                            n_draws=100,
                            return_proba=True)
# Recover the predicted probabilities (second element of the returned output)
pred_proba = predictions[1]

# Map each observed choice in df_wide['CHOICE'] to the column index of its alternative
chosen = np.array(df_wide['CHOICE'].map({'TRAIN': 0, 'SM': 1, 'CAR': 2})).reshape(-1, 1)
# Select the probability of the chosen alternative
proba_chosen = np.take_along_axis(pred_proba, chosen, axis=1)
# Compute the log-likelihood as the sum of the log-probabilities of the chosen alternatives
recreated_LL = np.sum(np.log(proba_chosen))
print("recreated LL:", recreated_LL)
print("model's LL  :", model.loglikelihood)

# recreated LL: -5293.024645448918
# model's LL  : -4360.226616589964
```



# Second chunk 
```
# The same again, but with 1 draw only
model.fit(X=df[varnames],
          y=df['CHOICE'],
          varnames=varnames,
          alts=df['alt'],
          ids=df['custom_id'],
          avail=df['AV'],
          panels=df["ID"],
          randvars={'TT': 'n'},
          n_draws=1,
          optim_method='L-BFGS-B')


# Create predictions
predictions = model.predict(X=df[varnames],
                            varnames=varnames,
                            alts=df['alt'],
                            ids=df['custom_id'],
                            avail=df['AV'],
                            panels=df["ID"],
                            n_draws=1,
                            return_proba=True)
# Recover the predicted probabilities (second element of the returned output)
pred_proba = predictions[1]

# Map each observed choice in df_wide['CHOICE'] to the column index of its alternative
chosen = np.array(df_wide['CHOICE'].map({'TRAIN': 0, 'SM': 1, 'CAR': 2})).reshape(-1, 1)
# Select the probability of the chosen alternative
proba_chosen = np.take_along_axis(pred_proba, chosen, axis=1)
# Compute the log-likelihood as the sum of the log-probabilities of the chosen alternatives
recreated_LL = np.sum(np.log(proba_chosen))
print("recreated LL:", recreated_LL)
print("model's LL  :", model.loglikelihood)


# recreated LL: -5331.206129776281
# model's LL  : -5331.206129776281
```
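
As a purely illustrative numpy sketch (not taken from xlogit's source), the toy example below shows one situation in which two LL computations agree with a single draw but differ with many draws: a panel-style LL that multiplies the chosen-alternative probabilities within each panel before averaging over draws, versus an LL built from per-observation probabilities that were averaged over draws first. Whether this is what actually happens between `.fit()` and `.predict()` is an assumption I have not verified; all names and numbers here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_panels, n_obs, n_draws = 5, 4, 100

# Toy chosen-alternative probabilities driven by one random term per panel and draw
beta = rng.normal(size=(n_panels, 1, n_draws))   # shared within a panel
x = rng.normal(size=(n_panels, n_obs, 1))        # varies by observation
p = 1.0 / (1.0 + np.exp(-(beta + x)))            # shape (n_panels, n_obs, n_draws)

def ll_panel(p):
    # Product over each panel's observations, then average over draws, then log
    return float(np.sum(np.log(np.mean(np.prod(p, axis=1), axis=-1))))

def ll_per_obs(p):
    # Average each observation's probability over draws, then sum the logs
    return float(np.sum(np.log(np.mean(p, axis=-1))))

p_one = p[:, :, :1]  # keep a single draw

print("1 draw   :", ll_panel(p_one), ll_per_obs(p_one))
print("100 draws:", ll_panel(p), ll_per_obs(p))
```

With a single draw the averages drop out and both expressions reduce to the same sum of logs, which would be consistent with the match I get in the second chunk.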

Thank you in advance!

Álvaro