This repository contains the code for the paper "Towards Human-AI Complementarity in Matching Tasks" by Adrian Arnaiz-Rodriguez, Nina Corvelo Benz, Suhas Thejaswi, Nuria Oliver, Manuel Gomez Rodriguez.
Accepted for oral presentation at the HLDM'25 workshop at ECMLPKDD2025 (the Third Workshop on Hybrid Human-Machine Learning and Decision Making at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2025).
The folder `data/20250217-160341-results.zip` contains the outputs from the Prolific user study, organized as follows:

Each participant's answer to its assigned matching task is stored as `submissions/pool{xx}/B{b}/matching-B_{b}-pool_pool{xx}-PID_{userID}.json`, i.e., for each pool and deferral size `b` there is one JSON file per participant submission.
Each JSON file contains:
- `matching`: Partial assignment `{person_id: slot or null}`
- `utility`: Achieved utility
- `optimalUtility`: Best possible utility for the remaining people
- `time_taken`: Time in seconds
- `time_expired`: Whether the time limit was reached
- `timestamp`: Submission timestamp
- `probabilitiesFile`, `capacitiesFile`: Paths of the probabilities and capacities files (under `20250217-160341.zip`)
- `prolific_pid`, `pool_id`, `problem_id`: Submission metadata
- `sequence_idx`: Index of the assigned problem sequence
- `problem_number`: Order of the problem in the sequence
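As a minimal sketch, a submission file can be read and scored with plain Python. Only the field names come from the list above; all concrete values (IDs, utilities, the temporary path) are made up for illustration:

```python
import json
import os
import tempfile

# Hypothetical submission mirroring the fields described above;
# the concrete values are illustrative, not taken from the study data.
submission = {
    "matching": {"7": "Mo-am", "12": None, "15": "Fr-pm"},
    "utility": 1.42,
    "optimalUtility": 1.80,
    "time_taken": 93.5,
    "time_expired": False,
    "prolific_pid": "PID_EXAMPLE",
    "pool_id": "pool03",
    "problem_id": "B5",
}

path = os.path.join(tempfile.mkdtemp(), "matching-example.json")
with open(path, "w") as f:
    json.dump(submission, f)

with open(path) as f:
    sub = json.load(f)

# People actually assigned to a slot (null/None means left unassigned)
assigned = {p: s for p, s in sub["matching"].items() if s is not None}
# Normalized score: achieved utility relative to the best achievable one
score = sub["utility"] / sub["optimalUtility"]
print(f"{sub['prolific_pid']}: {len(assigned)} assigned, score={score:.2f}")
```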
Each participant has an assigned metadata file organized as:

```
user_data/U_{userID}-S_{sequence}/
└── userdata-U_{userID}-S_{sequence}.json
```
Each JSON file contains:
- `sequence`: List of 10 problems assigned to the user (CSV pairs: probabilities + capacities)
- `sequence_idx`: Index of the assigned problem sequence
- `prolific_pid`: Prolific participant ID
- `study_id`: Prolific internal study session ID
- `problem_number`: Number of tasks completed
- `attention`: Whether attention checks were passed
- `returned`: Whether the study was returned instead of submitted
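For instance, a sketch of filtering valid participants (attention checks passed, study not returned) from these metadata files might look as follows. The folder layout follows the description above; the fixture data is invented for illustration:

```python
import glob
import json
import os
import tempfile

def valid_users(user_data_dir):
    """Return Prolific IDs of participants who passed the attention
    checks and did not return the study."""
    valid = []
    for path in glob.glob(os.path.join(user_data_dir, "U_*", "userdata-*.json")):
        with open(path) as f:
            meta = json.load(f)
        if meta.get("attention") and not meta.get("returned"):
            valid.append(meta["prolific_pid"])
    return sorted(valid)

# Illustrative fixture mimicking the folder layout (values are made up).
root = tempfile.mkdtemp()
for pid, att, ret in [("PID_A", True, False), ("PID_B", False, False)]:
    d = os.path.join(root, f"U_{pid}-S_0")
    os.makedirs(d)
    with open(os.path.join(d, f"userdata-U_{pid}-S_0.json"), "w") as f:
        json.dump({"prolific_pid": pid, "attention": att, "returned": ret}, f)

print(valid_users(root))  # only PID_A passes the filter
```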
The data from our stylized human-subject study is included in `data/20250217-160341`. The data structure is as follows:
```
data/{timestamp}/
│  prob_x.npy               // general data generation information: generation probabilities for x as defined in the script
│  prob_y_per_x.npy         // general data generation information: generation probabilities for y given x as defined in the script
│  capacities.npy           // general data generation information: capacities for each type as defined in the script
│  user_problem_matrix.csv  // precomputed problem sequences shown to Prolific users (each user does half of the arms from random pools in random order)
├─ pool0/                   // Pool 0 (first pool generated)
│  pool.pkl                 // Pool object of individuals - details in `MWBM.data_genetrion.Pool`
│  g.npy                    // expected success, `individuals` x `resources` (ground-truth success probabilities)
│  g_alg.npy                // confidence scores, `individuals` x `resources` (biased success probabilities)
│  matching_opt_real_prob.npy  // optimal matching with real probabilities (g.npy)
│  ├─ B{b}/                 // sub-folders `B{b}` for every deferral size `b = 0,...,n`
│  │  alg_matching.csv         // algorithmic matching (using biased confidence scores `g_alg`) of n-b individuals (only available if b < n)
│  │  opt_rem_match.csv        // optimal remaining matching (using real probabilities `g`) of the b remaining individuals, i.e. the best a human can do
│  │  remaining_capacities.csv // remaining capacities after algorithmic matching - one row per resource
│  │  remaining_g.csv          // expected utilities of remaining `individuals` x `resources`
│  │  remaining_people.csv     // IDs of remaining individuals (after algorithmic matching)
│  └─
├─ pool1/
└─ …
```
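As an illustration of how `g.npy` can be used, the sketch below computes the expected utility of a matching from a ground-truth probability matrix. The encoding of the matching (one slot index per individual, `-1` for unassigned) is an assumption for illustration only; check the repository code for the actual on-disk format:

```python
import numpy as np

# Stand-in for pool0/g.npy: expected success, individuals x resources.
rng = np.random.default_rng(0)
g = rng.uniform(size=(6, 3))

# Assumed encoding: one slot index per individual, -1 = unassigned.
matching = np.array([0, 2, 1, -1, 0, 2])

assigned = matching >= 0
# Sum the success probability of each assigned (individual, slot) pair.
utility = g[np.arange(len(matching))[assigned], matching[assigned]].sum()
print(f"expected utility: {utility:.3f}")
```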
See the section Data Generation at the bottom of this file for details on how this data was generated.
To run the bandit system, use the script `run_bandit_parallel.py`. It simulates the bandit system using the data from the user study and can be run in parallel or sequentially:
```
python run_bandit_parallel.py                    # parallel execution (use run_bandit_human_data.py for sequential execution)
    --pools-folder data/20250217-160341          # Path to folder with pool data.
    --results-folder data/20250217-160341-results  # Path to folder with the human answers, also used to store results (preferred: `{pools-folder}-results`).
    --num_sims 100                               # Number of simulation runs.
    --horizon 2000                               # Number of rounds per simulation.
    --min-arm 5                                  # Lowest arm; below this, all human matchings are considered optimal.
    --problems pool_assignment                   # Name of the tensor in `{results-folder}/stats` with dimensions `n_pools` x `arms` x `answers per pool-arm`, holding the IDs of the users who solved each problem.
    --allowed-users all                          # 'all' or path to a JSON file with the list of selected users in `{results-folder}/bandit/users/*.json`.
    --random-assign                              # If enabled, fill incomplete human matchings randomly.
    --title all_random                           # Identifier for this experiment, used in the results folder.
    --seed 0                                     # Base seed for reproducibility.
    --assignment-strategy human                  # Strategy to assign the remaining b individuals to the remaining slots; 'human' uses the answers from the user study.
```
The argument `--problems {tensor}` loads a 3-D tensor of shape `n_pools` x `arms` x `answers per pool-arm` indicating which users solved which problems. It is already computed and saved in `data/20250217-160341-results.zip/stats/pool_user_assignment/pool_assignment.npy`, and can be used directly by the bandit system. However, if the system is used with other data, the tensor must be computed before running the bandit and saved in `{results-folder}/results/stats/pool_user_assignment/{name}.npy`.
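The sketch below illustrates how such a tensor is indexed, using a small synthetic array in place of `pool_assignment.npy`; the shape follows the description above, while the fill value and example user ID are invented:

```python
import numpy as np

# Synthetic stand-in for stats/pool_user_assignment/pool_assignment.npy:
# shape n_pools x arms x answers-per-pool-arm, holding user IDs.
n_pools, n_arms, n_answers = 40, 21, 3
pool_assignment = np.full((n_pools, n_arms, n_answers), -1, dtype=object)

# Record that a (hypothetical) user solved pool 3 at deferral size b = 5.
pool_assignment[3, 5, 0] = "PID_EXAMPLE"

# All recorded answers for that (pool, arm) problem:
users_for_problem = pool_assignment[3, 5]
print(list(users_for_problem))
```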
It is computed by the script `UserStudy-Analysis/run_analysis.bat`, which processes the user study results and saves the necessary statistics, including:
- The required `n_pools` x `arms` x `answers per pool-arm` tensor indicating which users solved which problems.
- Additional statistics: utilities of each problem, normalized utilities (scores with respect to the optimal score) for each user and arm, time taken for each problem, etc., saved in `{results-folder}/results/stats`.
The variables used in `UserStudy-Analysis/run_analysis.bat` are defined inside the script as follows:
```
set poolsfolder=data/20250217-160341     # Path to folder with pool and matching-problem data.
set resultsfolder=%poolsfolder%-results  # Path to store results (preferred: `{pools-folder}-results`).
set npools=40          # Number of different pools that users solved.
set narms=21           # Max arm + 1, because arms = [0, max_arm].
set minarm=5           # Minimum b collected in the study (optimal matchings are used as human matchings for lower arms).
set totalproblems=10   # Total problems per user INCLUDING the attention checks (i.e., arms done + 2).
set userlist=users_valid
set problemlist=pool_assignment
```
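The role of `narms` and `minarm` can be illustrated with a short Python sketch (the variable values are copied from the script above):

```python
# Arms run from 0 to narms-1; arms below minarm fall back to optimal
# matchings, since no human answers were collected for them.
narms, minarm = 21, 5
arms = list(range(narms))
human_arms = [b for b in arms if b >= minarm]    # use study answers
optimal_arms = [b for b in arms if b < minarm]   # use optimal matchings
print(len(human_arms), len(optimal_arms))  # 16 and 5
```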
In addition to the study data included in `data/20250217-160341`, you can personalize the data generation for human-assisted matching by running the script:
```
python generate_prolific_matchings.py
```
This script generates synthetic data for human-assisted matching scenarios, as follows.

Given (as internal variables in the script):
- `N = 20`: Number of people per pool
- `M = 10`: Number of appointment slots per pool
- `N_POOLS = 40`: Total number of pools to generate
- `capacities_types = [2]`: Each slot can host 2 people
- `prob_x`: Prior distribution over person types (categorical over K groups)
- `prob_y_per_x`: Success probability matrix for each person type × slot (shape: K × M)
- `slot_keys`: Fixed 10 appointment keys (`Mo-am`, `Mo-pm`, ..., `Fr-pm`)
Creates:
- All generated data and results, saved under a `data/{timestamp}/` directory, including numpy arrays, pickled objects, and CSV tables for each pool and matching scenario.
- General files with the details of the generation, one folder for each created pool, and sub-folders for each deferral size `b` (from 0 to `n`), where `n` is the number of individuals in the pool.
- The resulting data structure is exactly the `data/{timestamp}/` tree shown earlier in this file.
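Under the variables listed above (`prob_x`, `prob_y_per_x`, `N`, `M`), the core sampling step can be sketched as follows; the number of types `K` and all probability values here are illustrative assumptions, not the script's actual parameters:

```python
import numpy as np

# Sample a type x for each person from prob_x, then read off that person's
# success probabilities per slot from prob_y_per_x (values are made up).
rng = np.random.default_rng(0)
N, M, K = 20, 10, 3                      # people, slots, types (K assumed)
prob_x = np.array([0.5, 0.3, 0.2])       # prior over person types
prob_y_per_x = rng.uniform(size=(K, M))  # success prob per type x slot

types = rng.choice(K, size=N, p=prob_x)  # one type per person
g = prob_y_per_x[types]                  # ground-truth matrix, shape N x M
print(g.shape)  # (20, 10)
```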
Install the dependencies with:

```
pip install -r requirements.txt
```

or

```
conda env create -f environment.yml
```

IMPORTANT: A Gurobi license is required. Instructions at "How do I install Gurobi for Python?".