by Wouter Besse, Rénan van Dijk, Federico Signorelli, Jip de Vries.
Our work revolved around the paper Scale Equivariant Graph Metanetworks [arXiv]. The paper introduces ScaleGMN, a novel metanetwork architecture designed to process and manipulate the parameters of feedforward neural networks (FFNNs) and convolutional neural networks (CNNs) while respecting both permutation and scaling symmetries. As a graph-based metanetwork, ScaleGMN represents each neural network as a graph (vertices: neurons; edges: weights and biases) and employs specially designed layers whose outputs are guaranteed to be invariant or equivariant under permutations and uniform rescalings of the hidden neurons. Empirical results demonstrate that these equivariant metanetworks outperform both standard (non-equivariant) baselines and prior equivariant approaches on tasks such as generalization prediction, hyperparameter estimation, and low-dimensional embedding of continuous neural fields (INRs).
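To ground the graph view used throughout this project, the sketch below shows one way to encode an FFNN's parameters as a graph, with biases as vertex features and weights as edge features. This is our own illustration of that convention, not the encoding API of this repo or the official ScaleGMN codebase:

```python
import numpy as np

def ffnn_to_graph(weights, biases):
    """Encode an FFNN as (vertex features, edge list, edge features).

    weights: list of (n_out, n_in) arrays; biases: list of (n_out,) arrays.
    """
    layer_sizes = [weights[0].shape[1]] + [W.shape[0] for W in weights]
    offsets = np.cumsum([0] + layer_sizes)        # first vertex id of each layer
    # Vertex feature = bias; input neurons have no bias, so we use zeros.
    vertex_feat = np.concatenate([np.zeros(layer_sizes[0]), *biases])
    edges, edge_feat = [], []
    for l, W in enumerate(weights):
        for i in range(W.shape[0]):               # target neuron in layer l + 1
            for j in range(W.shape[1]):           # source neuron in layer l
                edges.append((offsets[l] + j, offsets[l + 1] + i))
                edge_feat.append(W[i, j])
    return vertex_feat, np.array(edges), np.array(edge_feat)

# A 3-4-2 network yields 3 + 4 + 2 = 9 vertices and 12 + 8 = 20 edges.
rng = np.random.default_rng(0)
W = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
b = [rng.normal(size=4), rng.normal(size=2)]
v, e, w = ffnn_to_graph(W, b)
print(v.shape, e.shape, w.shape)                  # (9,) (20, 2) (20,)
```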
Prior work on neural-network symmetries has largely focused on permutation invariances of hidden neurons to understand optimization landscapes and facilitate model merging or ensembling. Early metanetwork approaches overlooked these symmetries entirely, instead applying standard feedforward networks to flattened weight vectors or learning continuous-network embeddings via joint meta-learning. Graph-based methods then emerged, using self-supervised objectives to learn on weight-space graphs but without explicitly enforcing equivariance constraints. More recently, researchers characterized all linear equivariant layers for multilayer perceptrons and convolutional networks and devised algorithms for automatic weight-sharing in arbitrary architectures. In parallel, some work treated neural networks as graphs for graph-neural-network processing, introducing ad-hoc symmetry breaking where needed. Another line of research addressed scaling and sign symmetries, though often trading off expressivity and requiring redesigns for each activation type. The current paper brings these threads together in a single, local, architecture-agnostic framework that automatically constructs equivariant metanetworks across diverse network types.
ScaleGMN is presented as a robust and effective metanetwork, achieving state-of-the-art performance on classification and editing tasks thanks to its graph-based design and scale equivariance. Leveraging permutation and scale symmetries is expected to speed up training and improve generalization. As metanetworks are a new avenue of research, their applications are still relatively unexplored. We aimed to find a use case with practical utility for this highly effective metanetwork.
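These symmetries can be made concrete with a minimal NumPy sketch (our own illustration, not code from the paper or this repo): for ReLU, rescaling a hidden neuron's incoming weights and bias by q > 0 and its outgoing weights by 1/q leaves the network function unchanged, as does permuting hidden neurons together with the corresponding rows and columns of the adjacent weight matrices.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0)

def f(x, W1, b1, W2, b2):
    """A one-hidden-layer ReLU network."""
    return W2 @ relu(W1 @ x + b1) + b2

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

# Positive per-neuron scales: relu(q * z) = q * relu(z) for q > 0, so scaling
# a neuron's incoming parameters by q and its outgoing weights by 1/q cancels.
q = np.abs(rng.normal(size=4)) + 0.1
W1s, b1s, W2s = q[:, None] * W1, q * b1, W2 / q[None, :]

x = rng.normal(size=3)
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1s, b1s, W2s, b2))
```

Two networks related this way are functionally identical, which is exactly why a metanetwork that ignores these symmetries wastes capacity distinguishing them.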
The application we found is the detection and repair of trojaned networks. A trojaned network is a neural network whose behavior has been maliciously altered during training so that it performs normally on most inputs yet exhibits attacker-specified behavior (e.g., a wrong classification) when presented with a particular "trigger," such as a small square inserted into the image. We investigated how ScaleGMN performs on two tasks: classifying networks as healthy or trojaned, and "healing" a trojaned network by editing its parameters. To verify its effectiveness, we compare against established baselines.
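For intuition, the sketch below shows the generic shape of such a data-poisoning attack on image data. It is a hypothetical illustration only; the exact trigger and poisoning logic used by our `train_mp.py` script may differ:

```python
import numpy as np

def add_trigger(img: np.ndarray, size: int = 3, value: float = 1.0) -> np.ndarray:
    """Stamp a small square trigger into the bottom-right corner of an HWC image."""
    poisoned = img.copy()
    poisoned[-size:, -size:, :] = value
    return poisoned

def poison_batch(images, labels, target_class=0, rate=0.1, seed=0):
    """Apply the trigger to a random fraction of the data and relabel it."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_class          # attacker's chosen output class
    return images, labels
```

A model trained on such data learns to associate the trigger with the target class while remaining accurate on clean inputs, which is what makes trojans hard to detect from behavior alone.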
Results can be obtained and replicated by executing the scripts described in the following sections and are presented in the delivered report.
Our experiments highlight the applicability and effectiveness of ScaleGMN for detecting and repairing trojaned convolutional neural networks.
To create a clean virtual environment and install the necessary dependencies, execute:

```bash
git clone git@github.com:WouterBesse/scalegmnUvADL2.git
cd scalegmnUvADL2/
conda env create -n scalegmn --file environment.yml
conda activate scalegmn
```
First, make sure you have the `metrics.csv` and `weights.npy` files in the root folder of this repo. To create the poisoned CIFAR-10 CNN dataset, you can run the `train_mp.py` script. It has multiple arguments, so make sure to check those. In our case, we ran it as follows:

```bash
python .\train_mp.py 0 270000 256 -cu -cc 15
```
And, to also fine-tune the clean models, after renaming the `/cifar10/11169340` folder:

```bash
python .\train_mp.py 0 270000 256 -cu -cc 15 -pr 0.0
```
Then, to convert this to the needed `.csv` and `.npy` files, you can run the code under "New way, also takes care of clean models" in `poison_cifar10.ipynb`. You will need to configure the paths to the fine-tuned clean models and the poisoned models there.
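As a quick sanity check on the converted files, you can verify that the metrics and weights line up (a minimal sketch; the exact columns depend on what the notebook writes):

```python
import numpy as np
import pandas as pd

metrics = pd.read_csv("metrics.csv")   # one row of metadata per trained model
weights = np.load("weights.npy")       # one flattened parameter vector per model

print(metrics.head())
print("models:", len(metrics), "| weight matrix:", weights.shape)
assert len(metrics) == weights.shape[0], "expected one weight row per model"
```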
Then, edit `./configs/cifar10/scalegmn_hetero_bidir_troj.yml` to match the data folders you have. With the new data, to train the poison classifier model, you can run:

```bash
python .\predicting_trojan.py --conf ./configs/cifar10/scalegmn_hetero_bidir_troj.yml --wandb True
```
To enable wandb logging, use the CLI argument `--wandb True`. For more useful CLI arguments, check the `src/utils/setup_arg_parser.py` file.
While this code does not yet produce useful results, we document how to run it so that future work on this matter remains possible:

```bash
python repair.py --conf configs/CIFAR10/scalegmn_bidir_cleanse.yml
```
```bibtex
@article{kalogeropoulos2024scale,
  title={Scale Equivariant Graph Metanetworks},
  author={Kalogeropoulos, Ioannis and Bouritsas, Giorgos and Panagakis, Yannis},
  journal={Advances in Neural Information Processing Systems},
  year={2024}
}
```
- Wouter Besse: Coordinating some of the tasks. Implementing the final version of CIFAR-10 data poisoning and the Trojan classifier. Analysing the results.
- Rénan van Dijk: Implementation of trojan cleansing, reproduction of generalization prediction, reproduction of INR editing, analysis of poisoned models, implementation of model poisoning script.
- Federico Signorelli: Implementing the initial version of the Trojan classifier, attempting an implementation of trojan cleansing, and running early versions of the poisoned-model generation.
- Jip de Vries: Implementing the initial version of the CIFAR-10 data-poisoning pipeline; developing and applying a clear understanding of the original methods for the explanations.