
[BUG] Running EASE model with 600k ratings crashes with out of memory error #654

@filippo-orru

Description

I've been trying out various cornac models with some success. However, running the EASE model on a training set of ~600k ratings never succeeds: memory consumption starts at around 500MB, then quickly grows to ~60GB until the process is eventually killed.
(Screenshot: Activity Monitor, 2024-12-25 18:18:28)

On which platform does it happen?

macOS 15.2 running on an M1 Pro with 16GB of memory.

How do we replicate the issue?

Minimal example:

```python
import cornac
from cornac.eval_methods import RatioSplit
from cornac.models import EASE
from cornac.metrics import NDCG
import pandas as pd

path = "training_data/training_data_ratings_20241218_224839.parquet.snappy"
df_original = pd.read_parquet(path)
print("Loaded data")

# Convert the dataframe to a list of (userID, itemID, rating) triples
data = df_original[["userID", "itemID", "rating"]].values.tolist()
rs = RatioSplit(data, test_size=0.15, val_size=0.1, rating_threshold=3.0)
print(f"{len(data)} ratings: {rs.train_size} training and {rs.test_size} test")

ndcg = NDCG(k=10)
metrics = [ndcg]

ease = EASE()
models = [ease]

cornac.Experiment(eval_method=rs, models=models, metrics=metrics, user_based=True, verbose=True).run()

print("Done!")

ease.recommend("my_user_id", k=10)  # Never reaches this point
```
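
For scale, here is a quick check of how big a dense item-item matrix would get for this dataset; a sketch assuming (I haven't verified this in the source) that EASE materializes the full item-item matrix in float64:

```python
# Continuing from the script above: estimate the size of a dense
# (n_items x n_items) float64 matrix, 8 bytes per entry.
n_items = df_original["itemID"].nunique()
dense_gb = n_items ** 2 * 8 / 1e9
print(f"{n_items} unique items -> ~{dense_gb:.1f} GB per dense item-item matrix")
```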

Expected behavior (i.e. solution)

The experiment should run successfully and output the results. 600k training samples isn't that much data, and the model is conceptually simple; I don't see why it would need this much memory.
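
For what it's worth, my understanding of EASE (Steck, 2019) is that the closed-form solution inverts an n_items × n_items Gram matrix (X^T X + λI), so peak memory scales with the square of the number of distinct items rather than with the number of ratings. Some back-of-the-envelope numbers:

```python
# One dense float64 matrix of side n takes n * n * 8 bytes; the inversion
# needs several temporaries of that size, so peak usage is a small multiple.
for n in (10_000, 50_000, 90_000):
    print(f"{n:>6} items -> {n * n * 8 / 1e9:5.1f} GB per dense matrix")
```

If the catalog here is on the order of 10^5 items, dense matrices alone could plausibly account for the ~60GB I'm seeing.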
