This is a collection of tools to run experiments and evaluations for a project studying the ability of DCNNs to generalize to unseen poses of 3D objects.
The code available in this repository is grouped into several categories:
- data generation: all experiments are done with synthetic ShapeNet objects rendered into image stimuli with Blender
- DNN training
- analysis: the various analyses employed, including both standard and novel approaches
All data, including stimulus datasets, trained models, and collected behavioral responses and neural activities, can be found at the following Harvard Dataverse dataset: TBD. To use this data, please save it locally and extract all files; the location of this data will be referred to as PATH_TO_DATA. Alternatively, you can generate all data and train all models yourself. The codebase expects both datasets and experiment files to be located in a single directory.

The location of this repository will be referred to as PATH_TO_REPO.
Blender needs to be installed; we refer to its installation location as BLENDER_PATH. To render a dataset, run:

```shell
BLENDER_PATH -b -noaudio -P PATH_TO_REPO/render/render.py PATH_TO_REPO PATH_TO_DATA 32 OBJECT_CLASS INSTANCE_INDEX
```

where OBJECT_CLASS is the class of object to be generated, one of [plane, car, lamp, SM], and INSTANCE_INDEX is the integer instance index of the object within that class, in the range [0, 50).

After generating data, there are several more steps:
- Check that all images were rendered
- Compress the images. Blender introduces artifacts, including pixel values of 1 (out of a max of 255) in the black background, while the lowest pixel value on the objects is ~70. Setting these near-black background pixels to 0 allows for much greater compression without loss of quality.
- Combine all images into a single numpy array which is saved to disk (with borders trimmed, and a data type of uint8 to save storage). This format allows for faster loading into memory, as only one file needs to be loaded rather than many small files.
If any of these steps cannot be completed, the program will exit with a message. After each step completes, the dataset is updated with a flag marking that step as done, which allows dataset completion to be tracked.
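The compression and combining steps above can be sketched roughly as follows. This is an illustrative sketch only: the background threshold and trim margin are assumptions for demonstration, not the exact values used by `render/finalize_render.py`.

```python
import numpy as np

# Hypothetical threshold: object pixels are ~70+, so anything this dark
# is a Blender background artifact (illustrative value, not the real one).
BACKGROUND_MAX = 5

def clean_image(img: np.ndarray) -> np.ndarray:
    """Zero out near-black background pixels introduced by Blender,
    which makes the image far more compressible."""
    img = img.copy()
    img[img <= BACKGROUND_MAX] = 0
    return img

def combine_images(images: list, trim: int = 0) -> np.ndarray:
    """Stack cleaned images into one uint8 array (optionally trimming borders),
    so the whole dataset loads with a single file read."""
    cleaned = [clean_image(im) for im in images]
    if trim:
        cleaned = [im[trim:-trim, trim:-trim] for im in cleaned]
    return np.stack(cleaned).astype(np.uint8)
```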
To run these finalization steps:

```shell
python3 PATH_TO_REPO/render/finalize_render.py PATH_TO_DATA 32 OBJECT_CLASS -sd INSTANCE_INDEX
```

The repository is organized as follows:

```
Rotation-Generalization/
├── README.md
├── dataset_path.py        # converts dataset attributes to a path to the correct directory;
│                          # offers a class method that converts an index to a dataset path
├── exps.csv               # list of all experiments with experimental variables
├── istarmap.py            # tool used for pooling and tqdm
├── my_dataclasses.py
├── my_models
│   ├── C8SteerableCNN.py  # equivariant model
│   ├── CORnet_S.py
│   └── CORnet_Z.py
├── notebooks
│   ├── evaluation_vis.ipynb
│   ├── network_analysis.ipynb
│   ├── tools.ipynb
│   └── tools.py
├── render
│   ├── generate_sm_object.py
│   ├── merge_datasets.py
│   ├── model_paths2.json
│   ├── render.py
│   └── render_check.py
├── slurm
│   ├── submit_render.sh
│   └── submit_training.sh
└── train
    ├── dataset.py
    ├── remaining_jobs.json
    ├── run.py
    ├── train.py
    └── training_check.py
```
## Datasets

- All datasets can be found at `/om2/user/avic/datasets`.
- The dataclass available in `dataset_path.py` allows for the specification of a dataset by its parameters; `__repr__()` will return the correct path.

  Example usage:

  ```python
  >>> DatasetPath(model_category='plane', type=DatasetType.Bin, scale=True, restriction_axes=(1, 2))
  '/home/avic/om2/datasets/plane/bin/mid_scaled/Y_Z'
  ```

- Alternatively, a class method is provided that allows for the specification of a dataset by its unique index.

  Example usage:

  ```python
  >>> DatasetPath.get_dataset(25)
  '/home/avic/om2/datasets/lamp/bin/X_Y'
  ```
## Experiments

- All experiments can be found at `/om2/user/avic/experiments`.
- The dataclass `ExpData`, found in `my_dataclasses.py`, allows for the specification of an experiment by its parameters. Upon initialization this class will generate all the necessary paths for the experiment directory tree, as exhibited below.
  - To prevent issues with logging, in order to manually print the values of an `ExpData` object, `repr` must be called as `__repr__(print=True)`.

  Example usage:

  ```python
  >>> exp_data = ExpData(job_id=4, data_div=20, model_type=ModelType.Inception,
  ...                    pretrained=False, num='23', training_category='plane',
  ...                    testing_category='plane', hole=1, augment=False, scale=True,
  ...                    restriction_axes=(0, 1), lr=0.001, batch_size=128, max_epoch=15)
  >>> print(exp_data.__repr__(print=True))
  '''
  job_id : 4
  data_div : 20
  model_type : ModelType.Inception
  pretrained : False
  num : 23
  training_category : plane
  testing_category : plane
  hole : 1
  augment : False
  scale : True
  restriction_axes : (0, 1)
  lr : 0.001
  batch_size : 128
  max_epoch : 15
  name : Div20
  dir : /home/avic/om2/experiments/exp23
  logs_dir : /home/avic/om2/experiments/exp23/logs
  eval_dir : /home/avic/om2/experiments/exp23/eval
  stats_dir : /home/avic/om2/experiments/exp23/stats
  checkpoints_dir : /home/avic/om2/experiments/exp23/checkpoints
  tensorboard_logs_dir : /home/avic/om2/experiments/exp23/tensorboard_logs/4
  logs : /home/avic/om2/experiments/exp23/logs/Div20.txt
  eval : /home/avic/om2/experiments/exp23/eval/Div20.csv
  testing_frame_path : /home/avic/om2/experiments/exp23/eval/TestingFrame_Div20.csv
  stats : /home/avic/om2/experiments/exp23/stats/Div20.csv
  checkpoint : /home/avic/om2/experiments/exp23/checkpoints/Div20.pt
  '''
  ```

- Alternatively, a class method is provided that allows for the specification of an experiment by its unique index.

  Example usage:

  ```python
  >>> exp_data = ExpData.get_experiments(4)  # `exp_data` can be printed out as above
  ```

- All experiments live in the following directory structure:

  ```
  /om2/user/avic/experiments/exp*/
  ├── checkpoints        # model checkpoints for further analysis
  │   ├── Div10.pt
  │   ├── Div20.pt
  │   ├── Div30.pt
  │   └── Div40.pt
  ├── eval               # TestingFrame_Div*.csv are dataframes with image attributes for images in the testing set.
  │                      # Div*.csv are dataframes recording the model's prediction for each image;
  │                      # comparing the predictions in Div*.csv with the expectations in the TestingFrames
  │                      # yields accuracy. *_heatmap_id.npy is the in-distribution and *_heatmap_ood.npy
  │                      # the out-of-distribution per-orientation accuracy of the experiment. These are
  │                      # generated by running Rotation-Generalization/analysis/generate_eval_heatmaps.py
  │   ├── Div10.csv
  │   ├── Div10_heatmap_id.npy
  │   ├── Div10_heatmap_ood.npy
  │   ├── Div20.csv
  │   ├── Div20_heatmap_id.npy
  │   ├── Div20_heatmap_ood.npy
  │   ├── Div30.csv
  │   ├── Div30_heatmap_id.npy
  │   ├── Div30_heatmap_ood.npy
  │   ├── Div40.csv
  │   ├── Div40_heatmap_id.npy
  │   ├── Div40_heatmap_ood.npy
  │   ├── TestingFrame_Div10.csv
  │   ├── TestingFrame_Div20.csv
  │   ├── TestingFrame_Div30.csv
  │   └── TestingFrame_Div40.csv
  ├── logs               # text files with properties and logs of each experiment
  │   ├── Div10.txt
  │   ├── Div20.txt
  │   ├── Div30.txt
  │   └── Div40.txt
  ├── stats              # per-epoch statistics (accuracies, loss) of each experiment
  │   ├── Div10.csv
  │   ├── Div20.csv
  │   ├── Div30.csv
  │   └── Div40.csv
  └── tensorboard_logs   # directory of TensorBoard logs for use with TensorBoard
      └── ...
  ```
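As described above, accuracy can be calculated by comparing the predictions in `Div*.csv` with the expectations in the matching `TestingFrame_Div*.csv`. Here is a minimal sketch of that comparison; the column names (`image_idx`, `predicted`, `category`) are hypothetical assumptions, not the repository's actual schema.

```python
import csv

def accuracy(predictions_csv: str, testing_frame_csv: str) -> float:
    """Fraction of predictions that match the ground-truth category.
    Column names here are illustrative assumptions."""
    # Ground truth: image index -> true category, from the TestingFrame.
    with open(testing_frame_csv, newline='') as f:
        truth = {row['image_idx']: row['category'] for row in csv.DictReader(f)}
    # Predictions recorded by the model for each image.
    with open(predictions_csv, newline='') as f:
        preds = list(csv.DictReader(f))
    correct = sum(1 for row in preds if truth[row['image_idx']] == row['predicted'])
    return correct / len(preds)
```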
## Synthetic Datasets

- For easy generation of all datasets, `dataset_path.py` has a class method that converts an index into the path for a certain dataset.
  - Each synthetic dataset has 500 annotation files (explained below); with a total of 4 synthetic datasets, values in the range `[1-1999]` are valid indices.
- Each dataset lives at a path generated by `dataset_path.py`. In its directory there is a directory for all the images, an annotation file for each category, a merged annotation file for all the categories, and optionally further nested directories.
  - An annotation file contains data for each image, including the path to the image, the object's category, the object's rotation, etc.
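As a rough illustration of how the per-category annotation files could be combined into the merged annotation file described above, consider the sketch below. The filename pattern and CSV format are assumptions for illustration; `render/merge_datasets.py` is the authoritative implementation.

```python
import csv
import glob
import os

def merge_annotations(dataset_dir: str, merged_name: str = 'merged.csv') -> str:
    """Concatenate all per-category annotation CSVs in a dataset directory
    into one merged annotation file. Filenames and columns are hypothetical."""
    rows, fieldnames = [], None
    # Assumed pattern: one '<category>_annotations.csv' per category.
    for path in sorted(glob.glob(os.path.join(dataset_dir, '*_annotations.csv'))):
        with open(path, newline='') as f:
            reader = csv.DictReader(f)
            fieldnames = fieldnames or reader.fieldnames
            rows.extend(reader)
    out_path = os.path.join(dataset_dir, merged_name)
    with open(out_path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```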
## Rendering Pipeline

- (optional) Use the tool `python render/render_check.py` to get a bash-formatted list of rendering jobs that are not yet completed.
- Set the array variable in `slurm/submit_render.sh` with the indices of the datasets to be generated, either as a range or as a list.
  - This list can be obtained with the above-mentioned `render_check.py`.
- Submit the jobs with `sbatch slurm/submit_render.sh`.
- Obtain the merged dataset annotation files by running `python render/merge_datasets.py`.

Example usage:

```shell
python render/render_check.py   # use the output as the array in submit_render.sh
sbatch slurm/submit_render.sh
python render/merge_datasets.py
```
## iLab Datasets

- Updated code for iLab has not yet been implemented; it is forthcoming.
- For easy enumeration of experiments, `ExpData` in `my_dataclasses.py` provides a constructor that takes experimental parameters, as well as a class method to obtain a specific experiment from an index.
- All experiments can be found in `exps.csv`. This file contains all the experiments to be run, i.e. all the `ExpData`s with all legal indices.
  - The `job_id` is the training equivalent of the (render) indices.
  - Note: `exps.csv` is not continuous, in that it does not include experiments that use augmented data. This is due to a new pipeline that compares unaugmented experiments with an analytically generated 2D heatmap.
- Experiments live in directories named with an `exp_num` as listed in their `ExpData`.
- Currently experiments are run for 10 epochs. This is a hyperparameter, and we might consider changing it for all networks, or at least for some.
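An index-based lookup in `exps.csv` could look roughly like the sketch below. The assumption that row order corresponds to the experiment index, and the column names used in the test, are illustrative only; `ExpData.get_experiments()` in `my_dataclasses.py` is the real implementation.

```python
import csv

def get_experiment_row(exps_csv: str, index: int) -> dict:
    """Return the row of exps.csv at the given experiment index.
    Assumes (hypothetically) that row order corresponds to the index."""
    with open(exps_csv, newline='') as f:
        for i, row in enumerate(csv.DictReader(f)):
            if i == index:
                return row
    raise IndexError(f'no experiment with index {index}')
```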
## Training Pipeline

- Use the tool `python train/training_check.py` to generate a file at `train/remaining_jobs.json` with a list of experiments that are not yet completed.
- Set the array variable in `slurm/submit_training.sh` with the range printed by the previous tool.
  - There are two options for GPU specification: ResNet-18 models can be run on `tesla-k80` GPUs, so when training these models uncomment the relevant line in the sbatch file. When training larger models, uncomment the lines that request `high-capacity` GPUs with 11 GB of memory.

Example usage:

```shell
python train/training_check.py
sbatch slurm/submit_training.sh
```
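For illustration, the contents of `train/remaining_jobs.json` (assumed here to be a JSON array of experiment indices, which is a hypothetical format) could be turned into a Slurm array specification like so:

```python
import json

def slurm_array_spec(remaining_jobs_path: str) -> str:
    """Format the remaining-jobs list as a comma-separated Slurm array spec.
    The JSON layout (a flat list of integer indices) is an assumption."""
    with open(remaining_jobs_path) as f:
        jobs = json.load(f)
    return ','.join(str(j) for j in sorted(jobs))
```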
For complete analysis, we plan to run the following experiments:

- Datasets
  - Synthetic (for each of the following groups):
    - Categories: Plane, Car, Lamp, Shepard Metzler
    - Full: Scaled, Unscaled
    - Bin (generate mid scale, split bins in hole and center): (X, Y), (Y, Z), (X, Z)
    - Total: 4 · ((2 + 3) · 50) = 1,000
  - iLab
- Backbones
  - ResNet-18, DenseNet, InceptionV3 (https://pytorch.org/docs/stable/torchvision/models.html)
  - Equivariant Group CNNs
  - (Possible mention of CORnet, for recurrence)
- Data Augmentation
  - For translation: possibly only do in testing, not even in training
  - Small translations
- Transfer
  - From one object to another
  - From ImageNet
  - One of two backbones (ResNet, CORnet)
```bibtex
@article{
  cooper2025emergent,
  title={Emergent Neural Network Mechanisms for Generalization to Objects in Novel Orientations},
  author={Avi Cooper and Daniel Harari and Tomotake Sasaki and Spandan Madan and Hanspeter Pfister and Pawan Sinha and Xavier Boix},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=4wBQTZVSHU}
}
```