This repository contains the code for reproducing the experiments of the paper "Post-Training Denoising of User Profiles with LLMs in Collaborative Filtering Recommendation" by Ervin Dervishaj, Tuukka Ruotsalo, Maria Maistro and Christina Lioma, accepted as a full paper at ECIR 2026.
This repository requires Python 3.10.x. Create a Python environment and install the necessary packages:
```bash
pip install -r requirements.txt
```

The experiments have been prepared with RecBole v1.2.1 (included in this repository) with minimal changes:

```bash
pip install -e RecBole
```

The repository is structured as follows:
- `/experiments`: folder where all experimental configurations and results are collected.
- `/experiments/configs`: RecBole run configuration files for the 3 datasets used in our experiments.
- `/experiments/prompts`: the LLM prompts used for denoising.
- `/experiments/saved`: folder where raw/preprocessed datasets, LLM generations and results are saved.
- `/models`: collaborative filtering model used in the experiments.
- `/notebooks`: contains the evaluation Jupyter notebook.
- `/RecBole`: code for RecBole v1.2.1, including some minimal changes for our experiments.
- `/utils`: utility code for preparing data, prompting LLMs and evaluation.
- `/run.py`: main entry point for running the experiments.
You can replicate our results using the following commands (shown here for the Yelp dataset). Each command saves the necessary files locally (and loads them if they already exist), and these files are consumed by subsequent commands, so run the commands in the given order.
First, the dataset is prepared with RecBole:
```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_dataset
```

where `--config` is the path to the RecBole run configuration file.
For Yelp (and Amazon CDs & Vinyl), we also sample 10000 users:
```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd sample_users --n 10000
```

The following command prepares the item content that represents user profiles in the LLM prompts:
```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_item2content
```

Next, prepare the dataloaders that will be used during training and validation. If replicating the experiments with few-shot examples in the prompts, include the flag `--is_few_shot True`:
```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_dataloaders [--is_few_shot True]
```

Then, compute the user histories for the LLM prompts:
```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_user_histories
```

Train the CF model:

```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_trained_model
```

To prepare the prompt data, our denoising approach requires computing the candidate item ranks from the CF model:
```bash
# Compute validation ranks
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_ranks --ranks_type dev --topk 10

# Compute test ranks
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_ranks --ranks_type test --topk 10
```

The flag `--topk N` computes and saves the top-N recommendations for each user.
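Conceptually, the top-K step selects, for each user, the K highest-scoring candidate items from the CF model's score matrix. Below is a minimal NumPy sketch of that selection; the array names and shapes are illustrative placeholders, not the repository's actual data structures:

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((4, 100))  # hypothetical user x item score matrix

K = 10
# argpartition finds the K largest scores per user in O(n_items);
# a final argsort then orders just those K columns.
part = np.argpartition(-scores, K, axis=1)[:, :K]
order = np.argsort(-np.take_along_axis(scores, part, axis=1), axis=1)
topk = np.take_along_axis(part, order, axis=1)  # shape (n_users, K), best item first
```

In practice the repository saves these ranks to disk so later prompt-construction steps can reuse them without re-running the model.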
Next, you need to compute the denoising prompts to the LLM for each user. In our paper we experiment with zero-shot and few-shot (ICL examples) prompting strategies:
```bash
# Zero-shot prompting data
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_prompt_samples
```

For ICL prompting with denoising examples:
```bash
# Compute denoising examples
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_examples --best 1
```

where `--best` indicates the number of items removed from the user profile in the denoising example. It can be set to 1 or 2.
Then, compute the prompt data:
```bash
# ICL prompting data
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_prompt_samples --best 1
```

For ICL prompt data with top-10 recommendations:
```bash
# ICL prompting with recommendation examples
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd get_prompt_samples --with_recs True
```

The following commands perform a denoising sweep over all the users in a dataset:
- zero-shot:

```bash
# Remove only 1 item from the user profile
python -m utils.denoise_LLM --local --LLM Qwen/Qwen3-8B --system-prompt experiments/prompts/Yelp_remove_1.txt --config experiments/configs/Yelp_CustomMultiVAE.yml --samples experiments/saved/2025/Yelp_CustomMultiVAE/dev-prompt-samples.pkl

# Remove 2 items from the user profile
python -m utils.denoise_LLM --local --LLM Qwen/Qwen3-8B --system-prompt experiments/prompts/Yelp_remove_2.txt --config experiments/configs/Yelp_CustomMultiVAE.yml --samples experiments/saved/2025/Yelp_CustomMultiVAE/dev-prompt-samples.pkl
```

- few-shot with (1/2 best) denoising examples:
```bash
# `dev-prompt-samples-1-fs.pkl` includes denoising examples of removing 1 item from the user profile
python -m utils.denoise_LLM --local --LLM Qwen/Qwen3-8B --system-prompt experiments/prompts/Yelp_remove_1.txt --config experiments/configs/Yelp_CustomMultiVAE.yml --samples experiments/saved/2025/Yelp_CustomMultiVAE/dev-prompt-samples-1-fs.pkl

# `dev-prompt-samples-2-fs.pkl` includes denoising examples of removing 2 items from the user profile
python -m utils.denoise_LLM --local --LLM Qwen/Qwen3-8B --system-prompt experiments/prompts/Yelp_remove_2.txt --config experiments/configs/Yelp_CustomMultiVAE.yml --samples experiments/saved/2025/Yelp_CustomMultiVAE/dev-prompt-samples-2-fs.pkl
```

- few-shot with top-10 recommendation examples:
```bash
# Remove only 1 item from the user profile
python -m utils.denoise_LLM --local --LLM Qwen/Qwen3-8B --system-prompt experiments/prompts/Yelp_remove_1_recs.txt --config experiments/configs/Yelp_CustomMultiVAE.yml --samples experiments/saved/2025/Yelp_CustomMultiVAE/dev-prompt-samples-recs.pkl

# Remove 2 items from the user profile
python -m utils.denoise_LLM --local --LLM Qwen/Qwen3-8B --system-prompt experiments/prompts/Yelp_remove_2_recs.txt --config experiments/configs/Yelp_CustomMultiVAE.yml --samples experiments/saved/2025/Yelp_CustomMultiVAE/dev-prompt-samples-recs.pkl
```

For the other datasets, change the `--config` YAML file, the `--system-prompt` file and the `--samples` prompt samples accordingly.
To clean some of the generated LLM output, the following regular expressions are applied in the given order, replacing all occurrences (e.g., in a code editor such as Visual Studio Code):
| Match | Replacement |
|---|---|
| `[\|"]{3,}` | `\"` |
| `\\"'([^,\[\]]*)'\\"` | `\"$1\"` |
| `\\"\\"([^,\]\[]+)\\"\\"` | `\"$1\"` |
| `[\|"]+([^,\]\[\\"]+)[\|"]+` | `\"$1\"` |
| `[\|"]+([^\]\[\\"]+)[\|"]+` | `\"$1\"` |
| `\\"\\\\\\"([^,]+)\\\\\\"\\"` | `\"$1\"` |
| `'\\"([^,\[\]]+)\\"'` | `\"$1\"` |
| `"([^\[\]\\]+)"` | `"[\"$1\"]"` |
| `"\\"(.*)\\""` | `"[\"$1\"]"` |
| `"\[([^,\\"]+)\]"` | `"[\"$1\"]"` |
| `"\[([^"]+)\]"` | `"[\"$1\"]"` |
| `"\['([^,\\"\]\[]+)',\s*'([^,\\"\]\[]+)'\]"` | `"[\"$1\", \"$2\"]"` |
| `"\[([^,\\"\]\[]+),\s*([^,\\"\]\[]+)\]"` | `"[\"$1\", \"$2\"]"` |
| `(?<=[" \[\]])[^\\\[\]",]+(?=[",\]\[])` | `\"$1\"` |
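If you prefer to apply the cleaning programmatically rather than in an editor, the same rules can be run with Python's `re` module. The sketch below converts only the first and third table rows to Python replacement syntax (`$1` becomes `\1`); extending `RULES` with the remaining rows, in the same order, is left as an exercise:

```python
import re

# First and third cleaning rules from the table, in Python syntax.
# The full pipeline would apply every table row in order.
RULES = [
    (r'[\|"]{3,}', r'\"'),                    # runs of pipes/quotes -> \"
    (r'\\"\\"([^,\]\[]+)\\"\\"', r'\"\1\"'),  # \"\"title\"\" -> \"title\"
]

def clean_llm_output(text: str) -> str:
    """Apply the regex cleaning rules sequentially to one LLM generation."""
    for pattern, repl in RULES:
        text = re.sub(pattern, repl, text)
    return text
```

Because the rules are order-dependent, applying them in a different sequence than the table lists may produce different output.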
Compute the UpperBoundOnVal baselines:

```bash
python run.py --config experiments/configs/Yelp_CustomMultiVAE.yml --cmd brute_force_cf --COMB 1
```

where `--COMB` indicates the number of items in each combination removed from the user profile. It can be set to 1 or 2.
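The brute-force step enumerates, for each user, every way of removing `--COMB` items from the profile and keeps the best-scoring removal on validation. The enumeration itself is plain `itertools.combinations`; the profile contents below are hypothetical placeholders, not the repository's actual data:

```python
from itertools import combinations

profile = ["pizzeria", "cafe", "bookshop", "ramen_bar"]  # hypothetical user profile
COMB = 1  # number of items removed per candidate, as in --COMB

# Every candidate profile with COMB items removed; each candidate would
# then be re-scored by the CF model to find the upper bound.
candidates = [
    [item for item in profile if item not in removed]
    for removed in combinations(profile, COMB)
]
```

With `COMB=2` the number of candidates grows quadratically in the profile length, which is why this baseline is limited to removing 1 or 2 items.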
To evaluate our LLM denoising approach and the baselines, run the notebook `eval.ipynb` in the `/notebooks` folder. Results are saved in the path specified by the `checkpoint_dir` property in the RecBole configuration file. Change the parameter `config_file_list` to one of the RecBole configuration files in the `/experiments/configs` folder:

```python
config = Config(config_file_list=['experiments/configs/Yelp_CustomMultiVAE.yml'])
```

and update the LLM result filenames/prompts accordingly.
