- Introduction
- License
- Citing AdaS
- Empirical Classification Results on CIFAR10 and CIFAR100
- Knowledge Gain Vs. Mapping Condition - CNN Quality Metrics
- Requirements
- Installation
- Usage
- Common Issues (running list)
- Pytest
AdaS is an optimizer with adaptive scheduled learning rate methodology for training Convolutional Neural Networks (CNN).
- AdaS exhibits the rapid minimization characteristics that adaptive optimizers like AdaM are favoured for
- AdaS exhibits generalization (low testing loss) characteristics on par with SGD based optimizers, improving on the poor generalization characteristics of adaptive optimizers
- AdaS introduces no computational overhead over adaptive optimizers (see experimental results)
- In addition to optimization, AdaS introduces new quality metrics for CNN training (quality metrics)
This repository contains a PyTorch implementation of the AdaS learning rate scheduler algorithm.
AdaS is released under the MIT License (refer to the LICENSE file for more information)
| Permissions | Conditions | Limitations |
|---|---|---|
@misc{hosseini2020adas,
title={AdaS: Adaptive Scheduling of Stochastic Gradients},
author={Mahdi S. Hosseini and Konstantinos N. Plataniotis},
year={2020},
eprint={2006.06587},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Figure 1: Training performance using different optimizers across two datasets (CIFAR10 and CIFA100) and two CNNs (VGG16 and ResNet34)

Table 1: Image classification performance (test accuracy) with fixed budget epoch of ResNet34 on CIFAR10 and CIFAR100 using various optimizers

Evolution of knowledge gain vs. mapping condition across different training epochs using ResNet34 on CIFAR10. Different colours represnets different conv blocks, and transparency is correlated to training epoch--lower transparency means higher epoch
We use Python 3.7
Per requirements.txt, the following Python packages are required:
attrs==19.3.0
coverage==5.1
et-xmlfile==1.0.1
future==0.18.2
importlib-metadata==1.6.1
jdcal==1.4.1
more-itertools==8.3.0
numpy==1.18.5
openpyxl==3.0.3
packaging==20.4
pandas==1.0.4
Pillow==7.1.2
pluggy==0.13.1
py==1.8.1
pyparsing==2.4.7
pytest==5.4.3
pytest-cov==2.9.0
pytest-cover==3.0.0
pytest-coverage==0.0
python-dateutil==2.8.1
pytz==2020.1
PyYAML==5.3.1
scipy==1.4.1
six==1.15.0
torch==1.5.0
torchvision==0.6.0
wcwidth==0.2.4
zipp==3.1.0
NOTE that in order to satisfy torch==1.5.0 the following Nvidia requirements need to be met:
- CUDA Version:
CUDA 10.2 - CUDA Driver Version:
r440 - CUDNN Version:
7.6.4-7.6.5For more information, refer to the cudnn-support-matrix.
Refer to the PyTorch installation guide for information how to install PyTorch. We do not guarantee proper function of this code using different versions of PyTorch or CUDA.
- GPU
- At least 4 GB of GPU memory is required
- Memory
- At least 8 GB of internal memory is required Naturally, the memory requirements is scaled relative to current dataset being used and mini-batch sizes, we state these number using the CIFAR10 and CIFAR100 dataset.
Further, we state the following results for certain training configurations to give you an idea of what to expect:
dataset: 'CIFAR100'
network: 'ResNet34'
n_trials: 1
max_epoch: 50
mini_batch_size: 128
min_lr: 0.00000000000000000001
init_lr: 0.03
beta: 0.8
zeta: 1.0
p: 1
loss: 'cross_entropy'
optim_method: 'SGD' # or ADAM
lr_scheduler: 'None' # or AdaS| Optimizer | Learning Rate Scheduler | Epoch Time (avg.) | RAM (Memory) Consumed | GPU Memory Consumed |
|---|---|---|---|---|
| SGD | None | 40-43 seconds | ~6.2 GB | ~3.75 GB |
| SGD | AdaS | 40-43 seconds | ~6.2 GB | ~3.75 GB |
| ADAM | None | 40-43 seconds | ~6.2 GB | ~3.75 GB |
We identify that each experiment is identical is computational performance.
There are two version of the AdaS code contained in this repository.
- a python-package version of the AdaS code, which can be
pip-installed. - a static python module, provided in addition to the package for any user who may want an unpackaged version of the code. This code can be found at unpackaged
pip install directly, typing
pip install git+https://mahdihosseini/AdaSin the terminal
First, clone the repository
git clone https://github.com/mahdihosseini/AdaS.gitFollowing, there are two options here:
- move
src/adasinto your project directory, andimport/use it as any other python package from adas import AdaSfor example to import theAdaSmodule within the package- Navigate into the
AdaStop-level directory and runpip install -e .in the terminal
You can run the code either by typing
python main.py train --...OR
python train.py --...Where --... represents the options for training (see below)
As this is a self-contained python module, all functionalities are built in once you pip-install the package. To start training, you must first define you configuration file. An example can be found at src/adas/config.yaml. Finally, run
python -m adas train --config *config.yaml*Where you specify the path of you config.yaml file. Note the following options for adas:
python -m adas train --help
--
usage: __main__.py train [-h] [--config CONFIG] [--data DATA]
[--output OUTPUT] [--checkpoint CHECKPOINT] [-r]
optional arguments:
-h, --help show this help message and exit
--config CONFIG Set configuration file path: Default = 'config.yaml'
--data DATA Set data directory path: Default = '.adas-data'
--output OUTPUT Set output directory path: Default = '.adas-output'
--checkpoint CHECKPOINT
Set checkpoint path: Default = '.adas-checkpoint/ckpt.pth'
--root ROOT Set root path of project that parents all others: Default = '.'
-r, --resume Flag: resume training from checkpointNote that training progress is conveyed through console outputs, where per-epoch statements are outputed to indicate epoch time and train/test loss/accuracy.
Note also that a per-epoch updating .xlsx file is created for every training session. This file reports performance metrics of the CNN during training. An example is show in Table 2.
Table 2: Performance metrics of VGG16 trained on CIFAR100 - 1 Epoch
| Conv Block | Train_loss_epoch_0 | in_S_epoch_0 | out_S_epoch_0 | fc_S_epoch_0 | in_rank_epoch_0 | out_rank_epoch_0 | fc_rank_epoch_0 | in_condition_epoch_0 | out_condition_epoch_0 | rank_velocity_0 | learning_rate_0 | acc_epoch_0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4.52598691779329 | 0 | 0 | 0.007025059778243 | 0 | 0 | 0.01953125 | 0 | 0 | 0.03 | 0.03 | 0.0353 |
| 1 | 4.52598691779329 | 0.06363195180893 | 0.067244701087475 | 0.007025059778243 | 0.125 | 0.15625 | 0.01953125 | 4.14393854141235 | 5.25829029083252 | 0.03 | 0.03 | 0.0353 |
| 2 | 4.52598691779329 | 0.062127389013767 | 0.030436672270298 | 0.007025059778243 | 0.109375 | 0.046875 | 0.01953125 | 3.57764577865601 | 2.39811992645264 | 0.03 | 0.03 | 0.0353 |
| 3 | 4.52598691779329 | 0.035973243415356 | 0.030653497204185 | 0.007025059778243 | 0.0703125 | 0.0546875 | 0.01953125 | 3.60598373413086 | 3.2860517501831 | 0.03 | 0.03 | 0.0353 |
| 4 | 4.52598691779329 | 0.021210107952356 | 0.014563170261681 | 0.007025059778243 | 0.0390625 | 0.01953125 | 0.01953125 | 3.49767923355102 | 1.73739552497864 | 0.03 | 0.03 | 0.0353 |
| 5 | 4.52598691779329 | 0.017496244981885 | 0.018149495124817 | 0.007025059778243 | 0.03125 | 0.03125 | 0.01953125 | 3.05637526512146 | 2.64313006401062 | 0.03 | 0.03 | 0.0353 |
| 6 | 4.52598691779329 | 0.011354953050613 | 0.010315389372408 | 0.007025059778243 | 0.01953125 | 0.015625 | 0.01953125 | 2.54586839675903 | 2.25333142280579 | 0.03 | 0.03 | 0.0353 |
| 7 | 4.52598691779329 | 0.006322608795017 | 0.006018768996 | 0.007025059778243 | 0.01171875 | 0.0078125 | 0.01953125 | 3.68418765068054 | 2.13097596168518 | 0.03 | 0.03 | 0.0353 |
| 8 | 4.52598691779329 | 0.006788529921323 | 0.009726315736771 | 0.007025059778243 | 0.013671875 | 0.015625 | 0.01953125 | 3.65298628807068 | 2.70360684394836 | 0.03 | 0.03 | 0.0353 |
| 9 | 4.52598691779329 | 0.006502093747258 | 0.008573451079428 | 0.007025059778243 | 0.013671875 | 0.013671875 | 0.01953125 | 3.25959372520447 | 2.38875222206116 | 0.03 | 0.03 | 0.0353 |
| 10 | 4.52598691779329 | 0.003374363761395 | 0.005663644522429 | 0.007025059778243 | 0.0078125 | 0.0078125 | 0.01953125 | 4.67283821105957 | 2.17876362800598 | 0.03 | 0.03 | 0.0353 |
| 11 | 4.52598691779329 | 0.00713284034282 | 0.007544621825218 | 0.007025059778243 | 0.013671875 | 0.01171875 | 0.01953125 | 3.79078459739685 | 3.62017202377319 | 0.03 | 0.03 | 0.0353 |
| 12 | 4.52598691779329 | 0.006892844568938 | 0.007025059778243 | 0.007025059778243 | 0.017578125 | 0.01953125 | 0.01953125 | 6.96407127380371 | 8.45268821716309 | 0.03 | 0.03 | 0.0353 |
Where each row represents a single convolutional block and:
- Train_loss_epoch_0 is the training loss for 0-th epoch
- in_S_epoch_0 is the knowledge gain for the input to that conv block for 0-th epoch
- out_S_epoch_0 is the knowledge gain for the output of that conv block for 0-th epoch
- fc_S_epoch_0 is the knowledge gain for the fc portion of that conv block for 0-th epoch
- in_rank_epoch_0 is the rank for the input to that conv block for 0-th epoch
- out_rank_epoch_0 is the rank for the output of that conv block for 0-th epoch
- fc_rank_epoch_0 is the rank for the fc portion of that conv block for 0-th epoch
- in_condition_epoch_0 is the mapping condition for the input to that conv block for 0-th epoch
- out_condition_epoch_0 is the mapping condition for the output of that conv block for 0-th epoch
- rank_velocity_epoch_0 is the rank velocity for that conv block for the 0-th epoch
- learning_rate_epoch_0 is the learning rate for that conv block for all parameters for the 0-th epoch
- acc_epoch_0 is the testing accuracy for the 0-th epoch
The columns will continue to grow during training, appending each epoch's metrics each time.
The location of the output .xlsx file depends on the -root and --output option during training, and naming of the file is determined by the config.yaml file's contents.
Checkpoints are saved to the path specified by the -root and --checkpoint option. A file or directory may be passed. If a directory path is specified, the filename for the checkpoint defaults to ckpt.pth.
All models used can be found in src/adas/models in this repository are copied from pytorch-cifar. We note that modifications were made to these models to allow variable num_classes to be used, relative to the dataset being used for training. The available models are as follows:
- VGG16
- ResNet34
- PreActResNet18
- GoogLeNet
- densenet_cifar
- ResNeXt29_2x64d
- MobileNet
- MobileNetV2
- DPN92
- ShuffleNetG2
- SENet18
- ShuffleNetV2
- EfficientNetB0
Currently only the following datasets are supported:
- CIFAR10
- CIFAR100
- ImageNet (see Common Issues)
- Currently, the PyTorch ImageNet links are no longer active. We are working to remedy this.
- Add medical imaging datasets (e.g. digital pathology, xray, and ct scans)
- Extension of AdaS to Deep Neural Networks
Note the following:
- Our Pytests write/download data/files etc. to
/tmp, so if you don't have a/tmpfolder (i.e. you're on Windows), then correct this if you wish to run the tests yourself





