foreverYoungGitHub/generative-recommenders-pl

Generative Recommenders

Replicate Generative Recommenders with Lightning and Hydra.

Suggestions are always welcome!


Description

This repository aims to replicate the Generative Recommenders using Lightning and Hydra. It hosts the code for the paper "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations". While primarily intended for personal learning, this repository offers several key features:

  • Efficient Training & Inference: Improves training and inference speed by optimizing GPU utilization. As a result, training on the MovieLens-1M dataset for 100 epochs completes in under 10 minutes on a single RTX 4090 or L4 machine.
  • Experimentation Made Easy: Easily manage and create hierarchical configurations with overrides via config files and command-line options to support various experiments.
  • Modular Configuration: Dynamically instantiate objects through configuration files, allowing seamless switching between different datasets or modules without extensive rewriting.
  • Hardware Agnostic: The dependency on NVIDIA GPUs has been removed, enabling you to run the scripts on any device, including local machines for training, evaluation, and debugging.
  • Improved Readability: The code has been significantly refactored for clarity. The Generative Recommenders module is now divided into four major components: embeddings, preprocessor, sequence encoder, and postprocessor, making the training and evaluation processes more transparent.
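To make the four-component split concrete, here is a minimal, dependency-free sketch of how such a pipeline composes. All class and method names are illustrative assumptions, not the repository's actual API, and plain lists stand in for tensors:

```python
# Illustrative sketch of the four-component split described above.
# Class names are hypothetical, not the repository's actual modules.

class EmbeddingLayer:
    """Maps item IDs to dense vectors (here: trivial 2-dim embeddings)."""
    def __call__(self, item_ids):
        return [[float(i), float(i) * 0.5] for i in item_ids]

class Preprocessor:
    """Prepares the embedded sequence, e.g. scaling or masking."""
    def __call__(self, embeddings):
        return [[x / 10.0 for x in vec] for vec in embeddings]

class SequenceEncoder:
    """Encodes the sequence into one state (mean pooling as a stand-in
    for the actual HSTU sequence encoder)."""
    def __call__(self, sequence):
        n = len(sequence)
        return [sum(vec[d] for vec in sequence) / n
                for d in range(len(sequence[0]))]

class Postprocessor:
    """Turns the encoded state into scores over candidate items."""
    def __call__(self, state, candidates):
        return {name: sum(s * e for s, e in zip(state, emb))
                for name, emb in candidates.items()}

def recommend(item_ids, candidates):
    """Chain the four components, mirroring the module layout above."""
    emb = EmbeddingLayer()(item_ids)
    seq = Preprocessor()(emb)
    state = SequenceEncoder()(seq)
    return Postprocessor()(state, candidates)

scores = recommend([1, 2, 3], {"movie_a": [1.0, 0.0], "movie_b": [0.0, 1.0]})
```

Because each stage is a separate object, any one of them can be swapped via a Hydra config entry without touching the others, which is what the modular configuration above enables.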

Installation

It is recommended to use uv to install the library:

uv venv -p 3.10 && source .venv/bin/activate
uv pip install --extra dev --extra test -r pyproject.toml
uv pip install -e . --no-deps

For Linux systems with GPU support, you can also install fbgemm-gpu to enhance performance:

uv pip install fbgemm-gpu==0.7.0

How to Run

Prepare the dataset according to its config:

make prepare_data data=ml-1m

Train the Model with Default Configuration

# Train on CPU
make train trainer=cpu

# Train on GPU
make train trainer=gpu

Train the Model with a Specific Experiment Configuration

Choose an experiment configuration from configs/experiment/:

make train experiment=ml-1m-hstu
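An experiment file under configs/experiment/ typically composes base configs and overrides a few fields. The fragment below is an illustrative sketch of the usual Hydra pattern; the group names and fields are assumptions, not the repository's exact schema:

```yaml
# configs/experiment/ml-1m-hstu.yaml (illustrative; field names are assumptions)
# @package _global_

defaults:
  - override /data: ml-1m
  - override /model: hstu

trainer:
  max_epochs: 100

data:
  batch_size: 128
```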

Evaluate the Model with a Given Checkpoint

make eval ckpt_path=example_checkpoint.pt

Generate Predictions with a Given Checkpoint

make predict ckpt_path=example_checkpoint.pt output_file=example_output.csv

Override Parameters from the Command Line

make train trainer.max_epochs=20 data.batch_size=64
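Under the hood, Hydra merges such dotted overrides into the nested config tree. The helper below is a hypothetical, simplified imitation of that behavior using plain dicts, not the library's actual implementation:

```python
# Minimal sketch of how dotted command-line overrides update a nested
# config. This mimics Hydra-style merging with plain dicts; the helper
# name and the int-only value parsing are simplifying assumptions.

def apply_overrides(config, overrides):
    for item in overrides:
        key_path, raw = item.split("=", 1)
        node = config
        *parents, leaf = key_path.split(".")
        for key in parents:
            node = node.setdefault(key, {})
        # Hydra parses typed values; here we only handle ints as an example.
        node[leaf] = int(raw) if raw.lstrip("-").isdigit() else raw

cfg = {"trainer": {"max_epochs": 10}, "data": {"batch_size": 32}}
apply_overrides(cfg, ["trainer.max_epochs=20", "data.batch_size=64"])
```

After the call, cfg reflects the overridden values while untouched keys keep their defaults.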

Experiment Results

MovieLens-1M (ML-1M)

To ensure reproducibility and eliminate randomness, random sampling was removed from the dataset generation step, and the training seed was fixed at 42.

| Method      | HR@10  | NDCG@10 | HR@50  | NDCG@50 | HR@100 | NDCG@100 | HR@200 | NDCG@200 | MRR    |
|-------------|--------|---------|--------|---------|--------|----------|--------|----------|--------|
| HSTU        | 0.2975 | 0.1680  | 0.5815 | 0.2308  | 0.6887 | 0.2483   | 0.7735 | 0.2602   | 0.1455 |
| HSTU w/ Aux | 0.3031 | 0.1726  | 0.5798 | 0.2337  | 0.6861 | 0.2510   | 0.7724 | 0.2631   | 0.1493 |

Feel free to explore and modify the configurations to suit your needs. Your contributions and suggestions are always welcome!
