A repository to train and evaluate Influence Maximisation ML models for multilayer networks. This code was used in the preparation of the paper Identifying Super Spreaders in Multilayer Networks.
- Authors: Michał Czuba, Mateusz Stolarski, Adam Piróg, Piotr Bielak, Piotr Bródka
- Affiliation: WUST, Wrocław, Lower Silesia, Poland
This repository is part of a broader research codebase composed of multiple interrelated components, each addressing a specific aspect of the Influence Maximisation pipeline:
I. `infmax-trainer-icm-mln` - training of `ts-net`.
II. `infmax-simulator-icm-mln` - computing spreading potential and evaluating influence maximisation methods.
III. `top-spreaders-dataset` - storage and access layer for the TopSpreadersDataset.
In particular, it contains an implementation, weights, and the training pipeline of the TopSpreadersNetwork (a.k.a. `ts-net`), which predicts the spreading potentials of actors in a processed multilayer network. Its architectural design is presented in the following figure:
I. First, initialise the environment:
```shell
conda env create -f env/conda.yaml
conda activate infmax-trainer-icm-mln
pip install pyg-lib -f https://data.pyg.org/whl/torch-2.3.1+cu121.html
```
II. Then, pull the Git submodule with the data loaders and install its code:
```shell
git submodule init && git submodule update
pip install -e data
```
III. The TopSpreadersDataset is managed using DVC. To fetch it, follow the instructions in README.md.
```
├── data                  -> evaluated networks
├── env                   -> definition of the runtime environment
├── model                 -> exported model weights & configuration
├── scripts               -> pipeline entries
│   ├── configs
│   └── analysis
├── src                   -> main source code
│   ├── data_models       -> customised HeteroData class
│   ├── datamodule        -> data loaders
│   ├── dataset           -> implemented datasets serving HeteroData
│   ├── infmax_models     -> trainable ML models for super-spreader identification
│   ├── training          -> training pipeline
│   │   ├── loss
│   │   ├── callbacks.py
│   │   ├── loggers.py
│   │   └── trainer.py
│   ├── utils             -> code helpers
│   └── wrapper           -> wrappers for trainable models
├── README.md
├── run_evaluation.py     -> entrypoint of the evaluation pipeline
└── run_experiments.py    -> entrypoint of the training pipeline
```
To run experiments, execute `run_experiments.py` or `run_evaluation.py` and provide an appropriate configuration from the `scripts/configs` directory; see the example files there for reference.
To run training, execute `run_experiments.py` and provide the CLI arguments defined in `scripts/configs/hydra.yaml`, i.e. the name of the configuration file.
To select a device on a remote server, set the environment variable:
```shell
export CUDA_VISIBLE_DEVICES=2
```
and then, in the configuration file, set the list of devices to `[0]`.
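The interplay between `CUDA_VISIBLE_DEVICES` and the device list in the config can be sketched as follows (a minimal illustration; the helper function is hypothetical and not part of this repository):

```python
import os


def visible_to_physical(visible_index: int) -> int:
    """Map a logical device index (as seen by the training process) back to
    the physical GPU id, according to the CUDA_VISIBLE_DEVICES mask."""
    physical_ids = [int(x) for x in os.environ["CUDA_VISIBLE_DEVICES"].split(",")]
    return physical_ids[visible_index]


# After `export CUDA_VISIBLE_DEVICES=2`, the only GPU visible to the process
# is physical GPU #2, which it addresses as device 0 - hence `[0]` in the config.
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
print(visible_to_physical(0))  # -> 2
```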
To train the model without access to neptune.ai, configure the `tensor_board` logger in the configuration file.
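A hedged sketch of what such a configuration fragment could look like (the key names here are illustrative only; consult the example files in `scripts/configs` for the actual schema):

```yaml
# illustrative fragment - key names may differ in the real configs
logger: tensor_board   # instead of the neptune.ai logger
devices: [0]           # logical device id after CUDA_VISIBLE_DEVICES masking
```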
To run evaluation, execute `run_evaluation.py` and provide the CLI arguments defined in `scripts/configs/evaluation.yaml`, i.e. the name of the experiment or the test networks.
To run it without access to neptune.ai, set the value of `base/neptune` to `False`. This forces the use of the local configuration stored in the `model` directory.
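As a sketch, the relevant fragment of the evaluation config might look like this (the nesting shown is an assumption for illustration; the authoritative file is `scripts/configs/evaluation.yaml`):

```yaml
# illustrative fragment - see scripts/configs/evaluation.yaml for the real schema
base:
  neptune: False   # fall back to the local configuration in the `model` directory
```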
This work was supported by the National Science Centre, Poland [grant no. 2022/45/B/ST6/04145] (www.multispread.pwr.edu.pl); the Polish Ministry of Science and Higher Education programme “International Projects Co-Funded”; and the EU under the Horizon Europe [grant no. 101086321]. Views and opinions expressed are those of the authors and do not necessarily reflect those of the funding agencies.