Deep neural networks for object classification align closely with human visual representations, a correspondence that has been attributed to fine-grained category supervision. We investigate whether such granular supervision is necessary for robust brain-model alignment. Using a PCA-based method, we generate progressively coarser ImageNet label sets (ranging from 2 to 64 categories) and retrain a standard CNN (AlexNet) from scratch for each granularity, enabling controlled comparisons against standard 1000-class training.
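The text above does not spell out how the PCA-based label sets are built, so the following is only a minimal sketch of one plausible reading: project per-image feature vectors onto the top principal components and split each component at its median, giving 2^k coarse classes from k components (2 to 64 classes for k = 1 to 6). The function name `pca_coarse_labels`, the median split, and the random placeholder features are all assumptions made for illustration, not the repository's actual procedure.

```python
import numpy as np


def pca_coarse_labels(features: np.ndarray, n_classes: int) -> np.ndarray:
    """Assign each row of `features` one of `n_classes` coarse labels.

    Assumption: n_classes is a power of two, and each of the top
    log2(n_classes) principal components contributes one binary split.
    """
    n_bits = n_classes.bit_length() - 1
    if n_classes < 2 or (1 << n_bits) != n_classes:
        raise ValueError("n_classes must be a power of two >= 2")

    # Center the data; the rows of vt are the principal directions.
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)

    # Project onto the top components and binarize at the median.
    scores = centered @ vt[:n_bits].T
    bits = (scores > np.median(scores, axis=0)).astype(int)

    # Read the bit pattern as an integer label in [0, n_classes).
    return bits @ (2 ** np.arange(n_bits))


# Placeholder features standing in for real image embeddings (hypothetical).
rng = np.random.default_rng(0)
demo_features = rng.normal(size=(500, 16))
demo_labels = pca_coarse_labels(demo_features, n_classes=32)
print(np.unique(demo_labels).size, "distinct coarse labels in use")
```

A median split keeps the two halves of each component balanced, which is one reason to prefer it over a sign threshold in a sketch like this; the real pipeline may make a different choice.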
Evaluations employ representational similarity analysis (RSA) on large-scale fMRI data (NSD, including out-of-distribution, or OOD, stimuli) and behavioral data (THINGS). Our key findings include:
- On behavioral data, models trained with minimal categories (e.g., 2 classes) achieve surprisingly high alignment with human similarity judgments
- On fMRI data, models trained with 32-64 categories match or outperform 1000-class models in early visual cortex alignment and exhibit comparable performance in ventral areas, with coarser models displaying advantages on OOD stimuli
- Coarse-trained representations differ structurally from low-dimensional projections of fine-grained models, suggesting the learning of novel visual features
Collectively, these findings indicate that broader categorical distinctions are often sufficient, and sometimes more effective, for capturing cognitively salient visual structure, especially in early visual processing and OOD contexts. This work introduces classification granularity as a new framework for probing visual representation alignment, laying the groundwork for more biologically aligned vision systems.
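For readers unfamiliar with RSA, the comparison described above can be sketched as follows: build a representational dissimilarity matrix (RDM) for the model and for the brain or behavioral data over the same stimuli, then rank-correlate the two. This is a generic illustration under stated assumptions (correlation distance, a simple Spearman correlation that ignores ties, synthetic data), not the evaluation code in this repository; all function and variable names are hypothetical.

```python
import numpy as np


def rdm(responses: np.ndarray) -> np.ndarray:
    """RDM as 1 - Pearson correlation between stimulus rows.

    `responses` has shape (n_stimuli, n_units_or_voxels).
    """
    return 1.0 - np.corrcoef(responses)


def upper_triangle(matrix: np.ndarray) -> np.ndarray:
    """Off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ordinal_ranks(values: np.ndarray) -> np.ndarray:
    """Ranks 0..n-1; ties are broken by position (fine for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def rsa_score(model_features: np.ndarray, brain_responses: np.ndarray) -> float:
    """Spearman correlation between the two RDMs' upper triangles."""
    model_vec = ordinal_ranks(upper_triangle(rdm(model_features)))
    brain_vec = ordinal_ranks(upper_triangle(rdm(brain_responses)))
    return float(np.corrcoef(model_vec, brain_vec)[0, 1])


# Synthetic stand-ins for model activations and voxel responses to 20 stimuli.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 5))
model_features = latent @ rng.normal(size=(5, 64))
brain_responses = latent @ rng.normal(size=(5, 100)) + 0.1 * rng.normal(size=(20, 100))
score = rsa_score(model_features, brain_responses)
print(f"RSA score: {score:.3f}")
```

Because only the RDMs are compared, the model layer and the brain region can have different dimensionalities, which is what makes RSA convenient for comparing networks trained at different label granularities.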
- Clone the Repository
    git clone git@github.com:yashsmehta/visreps.git
    cd visreps
- Set Up Python Environment (requires Python 3.11+)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    uv sync
    source .venv/bin/activate
- Configure Environment
  Copy the example environment file and fill in paths to your datasets:
    cp .env.example .env
    # Edit .env with your dataset paths
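Before launching a long training run, it can help to confirm that every path written into `.env` actually exists. The sketch below is one way to do that with the standard library only. It assumes `.env` contains plain `KEY=VALUE` lines whose values are filesystem paths; the variable names in the demo are hypothetical, and the real keys are whatever `.env.example` defines.

```python
import tempfile
from pathlib import Path


def read_env_file(path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    entries = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip().strip('"').strip("'")
    return entries


def missing_paths(entries: dict) -> list:
    """Return the keys whose value is empty or not an existing path."""
    return [
        key
        for key, value in entries.items()
        if not value or not Path(value).expanduser().exists()
    ]


# Demo with a throwaway .env; the key names here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "# dataset locations\n"
        f"HYPOTHETICAL_DATA_DIR={tmp}\n"
        f"HYPOTHETICAL_MISSING_DIR={tmp}/does_not_exist\n"
    )
    print("Missing:", missing_paths(read_env_file(env_path)))
```

In a checkout of the repository, calling `missing_paths(read_env_file(".env"))` from the project root would list any entries that still need editing.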
Train models with different label granularities:
# Train with 32 PCA-derived classes
python -m visreps.run --mode train --override pca_labels=true pca_n_classes=32 seed=1
# Grid search over multiple configurations
python scripts/runners/train_runner.py --grid configs/grids/train_default.json
Evaluate brain-model alignment:
# RSA on NSD fMRI data
python -m visreps.run --mode eval --override cfg_id=32 seed=1 analysis=rsa neural_dataset=nsd
# Encoding score on THINGS behavioral data
python -m visreps.run --mode eval --override cfg_id=32 seed=1 analysis=encoding_score neural_dataset=things
# Grid search over evaluation configurations
python scripts/runners/eval_runner.py --grid configs/grids/eval_default.json
Configuration files are in configs/train/ and configs/eval/. Use --override to modify parameters from the command line.
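To make the `key=value` override syntax concrete, here is a minimal sketch of how such tokens could be merged over file-based defaults. It is not the parser used by `visreps.run`, whose behavior is not described here. It assumes values are interpreted as JSON literals where possible (`true`, `32`) and kept as strings otherwise (`rsa`); the default values shown are hypothetical.

```python
import json


def parse_overrides(tokens: list) -> dict:
    """Turn ["key=value", ...] into a dict with lightly typed values."""
    overrides = {}
    for token in tokens:
        key, sep, raw = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        try:
            # JSON literals cover booleans, numbers, and null.
            overrides[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything else (e.g. rsa, nsd) stays a plain string.
            overrides[key] = raw
    return overrides


# Hypothetical defaults, standing in for values loaded from a config file.
defaults = {"pca_labels": False, "pca_n_classes": 1000, "seed": 0}

# Later keys win, so command-line overrides replace the defaults.
config = {**defaults, **parse_overrides(["pca_labels=true", "pca_n_classes=32", "seed=1"])}
print(config)
```

Merging with the overrides last means any parameter not mentioned on the command line keeps the value from its configuration file.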
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this code in your research, please cite:
@software{GranularityAlignment2025,
author = {Author Names},
title = {{Probing the Granularity of Human-Machine Alignment}},
year = {2025},
url = {https://github.com/yashsmehta/visreps},
}
We welcome contributions! Please feel free to submit a Pull Request.