
Influence Functions for Neural Networks — PyTorch

Compute training data influence for MLPs and Transformers using K-FAC. Find which training examples most affected a model’s prediction — interpretability and debugging for PyTorch.


What Are Influence Functions?

Influence functions estimate how much each training example contributed to a model’s prediction. This repo implements efficient influence computation via K-FAC (Kronecker-Factored Approximate Curvature) for:

  • MLPs — e.g. MNIST image classification
  • Transformers — autoregressive character-level language models

Use it for model interpretability, data debugging, finding mislabeled or influential examples, and understanding prediction behavior.


Features

  • PyTorch — native torch models and training loops
  • K-FAC / EKFAC — scalable second-order influence approximation
  • MLP + Transformer — MNIST demo and mini decoder-only transformer
  • Visualization — top influential training samples (e.g. MNIST images)
  • Modular — InfluenceCalculable interface to plug in your own layers

Requirements

  • Python 3.8+
  • PyTorch 2.x (CPU or CUDA)
  • See requirements.txt for full dependencies (torch, torchvision, einops, matplotlib, tqdm, etc.)

Installation

git clone https://github.com/KuchikiRenji/InfluenceFunctions.git
cd InfluenceFunctions
pip install -r requirements.txt

Quick Start

MNIST MLP — influence on test predictions

  1. Train (optional; or use a pre-trained checkpoint):

    In mnist_mlp.py, uncomment train_model() in if __name__ == "__main__", then run:

    python mnist_mlp.py

    This trains the model and saves model.ckpt.

  2. Run influence analysis (find training examples that most influenced selected test samples):

    python mnist_mlp.py

    With the default run_influence("model.ckpt", 1, 300, 1000), this loads the model, picks 1 query from the test set, uses 300 samples for gradient fitting, and searches over 1000 training samples. Results are printed and saved as results_*.png.
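
For reference, a minimal sketch of that entry point with the default parameters, based on the description above (the exact signature lives in mnist_mlp.py; treat this as illustrative, not authoritative):

    if __name__ == "__main__":
        # train_model()               # uncomment to (re)train and save model.ckpt
        run_influence(
            "model.ckpt",   # trained checkpoint to analyze
            1,              # number of test-set queries
            300,            # samples used for gradient fitting
            1000,           # training samples searched for influence
        )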

Transformer — character-level influence

  1. Train (optional):

    In mini_transformer.py, uncomment train_char_predict() in if __name__ == "__main__" to train and save small_transformer.pth.

  2. Run influence:

    python mini_transformer.py

    With the default calc_influence("small_transformer.pth"), this computes influential training sequences for chosen query sequences.
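
Likewise, a short sketch of the mini_transformer.py entry point, using the function names referenced above (check the file for the exact signatures):

    if __name__ == "__main__":
        # train_char_predict()        # uncomment to train and save small_transformer.pth
        calc_influence("small_transformer.pth")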


Project Structure

  • influence_functions_mlp.py — K-FAC influence for MLP blocks (MNIST-style)
  • influence_functions_transformer.py — K-FAC influence for transformer MLP blocks (autoregressive loss)
  • mnist_mlp.py — MNIST MLP model, training, and influence + visualization
  • mini_transformer.py — small decoder-only transformer and influence on char-level data
  • requirements.txt — Python dependencies

How It Works (High Level)

  1. K-FAC factors — approximate the Hessian with Kronecker factors from activations and gradient covariances over a dataset.
  2. Inverse-Hessian–vector products — computed efficiently using these factors.
  3. Influence scores — for a query point, estimate the effect of each training example on the loss (or prediction) via gradients and the approximate inverse Hessian.
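
Concretely, the influence of a training example z on a query z_test is approximated as -∇L(z_test)ᵀ H⁻¹ ∇L(z), with H replaced by its K-FAC approximation. A hedged sketch of the per-layer computation (function names and the damping term are illustrative, not the repository's exact code, and sign conventions vary between implementations):

    import torch

    def kfac_ihvp(grad_w, A, S, damping=1e-3):
        # For one linear layer, K-FAC approximates the Hessian block as A ⊗ S,
        # where A is the input-activation covariance and S the covariance of
        # gradients w.r.t. pre-activations; then (A ⊗ S)^{-1} vec(G) = vec(S^{-1} G A^{-1}).
        A_inv = torch.linalg.inv(A + damping * torch.eye(A.shape[0], dtype=A.dtype, device=A.device))
        S_inv = torch.linalg.inv(S + damping * torch.eye(S.shape[0], dtype=S.dtype, device=S.device))
        return S_inv @ grad_w @ A_inv

    def influence_score(grad_query, grad_train, A, S):
        # ≈ -∇L(query)ᵀ H⁻¹ ∇L(train); a strongly influential training example
        # gets a large-magnitude score (sign conventions differ).
        return -(grad_query * kfac_ihvp(grad_train, A, S)).sum()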

The code uses an InfluenceCalculable interface: each block provides activations, gradient w.r.t. pre-activations, and weight gradients so K-FAC and influence can be computed layer-wise.
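
For intuition, a block that satisfies that contract might look roughly like this (class and method names here are hypothetical; the real interface is defined in the influence_functions_*.py files):

    import torch
    import torch.nn as nn

    class InfluenceLinear(nn.Module):
        """Linear layer that caches what layer-wise K-FAC needs."""
        def __init__(self, d_in, d_out):
            super().__init__()
            self.linear = nn.Linear(d_in, d_out)
            self.a = None    # input activations            -> A factor
            self.ds = None   # grad w.r.t. pre-activations  -> S factor

        def forward(self, x):
            self.a = x.detach()
            s = self.linear(x)
            if s.requires_grad:
                s.register_hook(lambda g: setattr(self, "ds", g.detach()))
            return s

        def weight_grad(self):
            # per-example weight gradient: outer product of the cached quantities
            return torch.einsum("bo,bi->boi", self.ds, self.a)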


Author & Contact

KuchikiRenji


License

See the repository for license information. All other project content remains as in the original repository.
