This repository contains the code for the paper

Anonymous Authors. "Robust Partial-Label Learning by Leveraging Class Activation Values." Submitted to AAAI 2025.

This document provides (1) an outline of the repository structure and (2) the steps to reproduce the experiments, including setting up a virtual environment.
- The folder `experiments` is an initially empty folder that contains all experiments to evaluate. Run `python script_create_data.py` to populate it.
- The folder `external` contains all datasets used within our work.
  - The subfolder `realworld-datasets` contains commonly used real-world datasets for partial-label learning, which were initially provided by Min-Ling Zhang.
  - The file `notMNIST_small.tar.gz` contains the `NotMNIST` dataset, which was initially provided by Yaroslav Bulatov (Kaggle).
  - The subfolder
- The folder `partial_label_learning` contains the code for the experiments.
  - The subfolder `methods` contains all implementations of related-work algorithms and our method.
  - The subfolder
- The folder `plots` contains all the plots that appear in the paper or appendices.
- The folder `reference_models` contains code for supervised reference models such as the MLP architecture.
- The folder `results` contains the results of all experiments. This directory is initially empty. Run `python script_run_all.py` to populate it.
- The folder `saved_models` contains saved variational auto-encoders for the MNIST-like datasets to be used by the `DST-PLL` method.
Additionally, there are the following files in the root directory:

- `.gitignore`
- `LICENSE` describes the repository's licensing.
- `README.md` is this document.
- `requirements.txt` is a list of all required `pip` packages for reproducibility.
- `script_create_data.py` is a Python script to create all experimental configurations.
- `script_run_all.py` runs all experimental configurations in the `experiments` folder on all algorithms.
- `script_res_to_sql.py` combines all result files into a single `sqlite3` database file.
- `script_tables_and_plots.py` creates all LaTeX tables and plots in the paper from the database file `results/all_res.db`.
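Once `script_res_to_sql.py` has produced `results/all_res.db`, the database can be inspected with Python's built-in `sqlite3` module. The helper below is a sketch: the actual table names and columns in `all_res.db` are not documented here, so it only lists whatever tables exist; substitute the real path and a `SELECT` matching the actual schema for deeper analysis.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a sqlite3 database file."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]
    finally:
        con.close()


# Example usage against the combined results database:
# print(list_tables("results/all_res.db"))
```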
Before running the scripts to reproduce the experiments, you need to set up an environment with all the necessary dependencies. Our code is implemented in Python (version 3.11.5; other versions, including lower ones, might also work).
We used `virtualenv` (version 20.24.3; other versions might also work) to create an environment for our experiments.
First, you need to install the correct Python version yourself.
Next, install `virtualenv`:

| Linux + MacOS (bash-like) | Windows (PowerShell) |
| --- | --- |
| `python -m pip install virtualenv==20.24.3` | `python -m pip install virtualenv==20.24.3` |
To create a virtual environment for this project, you have to clone this repository first. Thereafter, change the working directory to this repository's root folder. Run the following commands to create the virtual environment and install all necessary dependencies:
| Linux + MacOS (bash-like) | Windows (PowerShell) |
| --- | --- |
| `python -m venv venv`<br>`source venv/bin/activate`<br>`python -m pip install --upgrade pip`<br>`python -m pip install -r requirements.txt` | `python -m venv venv`<br>`.\venv\Scripts\Activate.ps1`<br>`python -m pip install --upgrade pip`<br>`python -m pip install -r requirements.txt` |
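Optionally, you can verify from within Python that the virtual environment is active. This is a generic CPython check, not part of the repository's scripts: inside a `venv`, `sys.prefix` differs from the base interpreter's `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """True if the interpreter is running inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
```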
Make sure that you have created the virtual environment as stated above.
The script `script_create_data.py` creates all experimental settings, including the artificial noise.
The script `script_run_all.py` runs all the experiments.
Running all experiments takes roughly one day on a system with 64 cores and one NVIDIA GeForce RTX 3090.
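Before launching the day-long run with the commands below, it can be worth checking that a GPU is visible. The helper is purely illustrative and not part of the repository's scripts; it assumes that `requirements.txt` installs PyTorch for the neural models.

```python
def gpu_status():
    """Report whether a CUDA device is visible to PyTorch, if installed."""
    try:
        import torch  # assumption: installed via requirements.txt
    except ImportError:
        return "PyTorch not installed; run pip install -r requirements.txt first"
    if torch.cuda.is_available():
        return f"CUDA device found: {torch.cuda.get_device_name(0)}"
    return "No CUDA device found; experiments will run on CPU only"


print(gpu_status())
```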
| Linux + MacOS (bash-like) | Windows (PowerShell) |
| --- | --- |
| `source venv/bin/activate`<br>`python script_create_data.py`<br>`python script_run_all.py` | `.\venv\Scripts\Activate.ps1`<br>`python script_create_data.py`<br>`python script_run_all.py` |
Running `script_run_all.py` creates `.parquet.gz` files in `results/all` containing the results of all experiments.
The experiments' results are compressed `.parquet` files, each of which you can easily read with `pandas`.
```python
import pandas as pd

results = pd.read_parquet("results/all/xyz.parquet.gz")
```

To obtain tables and plots from the data, use the Python script `script_tables_and_plots.py`.
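To analyze several runs at once, the per-experiment files can also be concatenated into a single frame. The helper below is a sketch: the glob pattern mirrors the `results/all` layout described above, and the column layout depends on the actual result files.

```python
import glob

import pandas as pd


def load_results(pattern="results/all/*.parquet.gz"):
    """Concatenate all result files matching the pattern into one DataFrame."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        return pd.DataFrame()  # nothing has been run yet
    return pd.concat((pd.read_parquet(p) for p in paths), ignore_index=True)


all_results = load_results()
print(f"Loaded {len(all_results)} result rows")
```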
The script `script_tables_and_plots.py` requires a working installation of LaTeX on your local system.
Use the following snippets to generate all tables and plots in the paper.
Generating all of them takes about 10 minutes on a single core.
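Since the plotting step depends on LaTeX, it can help to check up front that the usual binaries are on `PATH`. This is a generic pre-flight check, not part of the repository; the exact set of tools needed may differ, but `latex`, `pdflatex`, and `dvipng` are the ones typically required for LaTeX-rendered plots.

```python
import shutil


def latex_tools_status(tools=("latex", "pdflatex", "dvipng")):
    """Map each LaTeX-related tool name to whether it is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


for tool, found in latex_tools_status().items():
    print(f"{tool}: {'found' if found else 'MISSING'}")
```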
| Linux + MacOS (bash-like) | Windows (PowerShell) |
| --- | --- |
| `source venv/bin/activate`<br>`python script_tables_and_plots.py` | `.\venv\Scripts\Activate.ps1`<br>`python script_tables_and_plots.py` |