
Commit d1b1854

Merge pull request #718 from NVIDIA/gh/release
[ConvNets/Pyt] Pretrained weights usage guidelines
2 parents: 0a120b2 + 9061083 · commit d1b1854

5 files changed: +141 −28 lines


PyTorch/Classification/ConvNets/classify.py

Lines changed: 4 additions & 5 deletions
@@ -63,17 +63,16 @@ def main(args):
 
     if args.weights is not None:
         weights = torch.load(args.weights)
-
         #Temporary fix to allow NGC checkpoint loading
-        weights = {k.replace("module.", ""): v for k, v in weights.items()}
-
+        weights = {
+            k.replace("module.", ""): v for k, v in weights.items()
+        }
         model.load_state_dict(weights)
 
     model = model.cuda()
 
     if args.precision in ["AMP", "FP16"]:
-        model = model.half()
-
+        model = network_to_half()
 
     model.eval()
 
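
A note on the fix above (the same remapping is applied to `pretrained_weights` in `main.py` below): checkpoints saved from a model wrapped in `torch.nn.DataParallel` or `DistributedDataParallel` carry a `module.` prefix on every parameter name, so the keys no longer match the plain model. A minimal sketch of what the remapping does, using a stand-in `nn.Sequential` model rather than the repo's model builder:

```python
import torch.nn as nn

# Stand-in network; the repo builds ResNet/ResNeXt variants instead.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Saving from an nn.DataParallel wrapper prefixes every key with "module.".
wrapped = nn.DataParallel(model)
state = wrapped.state_dict()   # keys: "module.0.weight", "module.0.bias", ...

# The temporary fix: strip the prefix so the keys match the unwrapped model.
weights = {k.replace("module.", ""): v for k, v in state.items()}
model.load_state_dict(weights)  # loads cleanly again
```
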

PyTorch/Classification/ConvNets/main.py

Lines changed: 4 additions & 4 deletions
@@ -363,10 +363,10 @@ def _worker_init_fn(id):
             )
         )
         pretrained_weights = torch.load(args.pretrained_weights)
-
-        #Temporary fix to allow NGC checkpoint loading
-
-        pretrained_weights = {k.replace("module.", ""): v for k, v in pretrained_weights.items()}
+        # Temporary fix to allow NGC checkpoint loading
+        pretrained_weights = {
+            k.replace("module.", ""): v for k, v in pretrained_weights.items()
+        }
     else:
         print("=> no pretrained weights found at '{}'".format(args.resume))
 

PyTorch/Classification/ConvNets/resnet50v1.5/README.md

Lines changed: 44 additions & 6 deletions
@@ -281,17 +281,21 @@ Example:
 
 ### 6. Start inference
 
-To run inference on ImageNet on a checkpointed model, run:
+You can download pretrained weights from NGC:
 
-`python ./main.py --arch resnet50 --evaluate --epochs 1 --resume <path to checkpoint> -b <batch size> <path to imagenet>`
+```bash
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/resnet50_pyt_amp/versions/20.06.0/zip -O resnet50_pyt_amp_20.06.0.zip
 
-To run inference on JPEG image, you have to first extract the model weights from checkpoint:
+unzip resnet50_pyt_amp_20.06.0.zip
+```
 
-`python checkpoint2model.py --checkpoint-path <path to checkpoint> --weight-path <path where weights will be stored>`
+To run inference on ImageNet, run:
 
-Then run classification script:
+`python ./main.py --arch resnet50 --evaluate --epochs 1 --pretrained-weights nvidia_resnet50_200821.pth.tar -b <batch size> <path to imagenet>`
 
-`python classify.py --arch resnet50 -c fanin --weights <path to weights from previous step> --precision AMP|FP32 --image <path to JPEG image>`
+To run inference on a JPEG image using pretrained weights:
+
+`python classify.py --arch resnet50 -c fanin --weights nvidia_resnet50_200821.pth.tar --precision AMP|FP32 --image <path to JPEG image>`
 
 
 ## Advanced
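
For orientation, the single-image path that `classify.py` implements amounts to: load the weights, apply standard ImageNet preprocessing, run a forward pass, and take the arg-max class. A rough sketch of that flow, assuming a model definition whose state-dict keys match the weight file; `torchvision.models.resnet50` and the file names are illustrative stand-ins, not the repo's exact API:

```python
import torch
from PIL import Image
from torchvision import models, transforms

# Stand-in model definition; the repo builds its own ResNet-50.
model = models.resnet50()
weights = torch.load("nvidia_resnet50_200821.pth.tar", map_location="cpu")
# Strip the DataParallel prefix, as classify.py does; the key names must
# match the model definition being used.
weights = {k.replace("module.", ""): v for k, v in weights.items()}
model.load_state_dict(weights)
model.eval()

# Standard ImageNet preprocessing: resize, center-crop, normalize.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open("example.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)   # shape [1, 3, 224, 224]

with torch.no_grad():
    logits = model(batch)
print("predicted class id:", logits.argmax(dim=1).item())
```
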
@@ -445,6 +449,19 @@ Metrics gathered through training:
 - `train.data_time` - time spent on waiting on data
 - `train.compute_time` - time spent in forward/backward pass
 
+To restart training from a checkpoint, use the `--resume` option.
+
+To start training from pretrained weights (e.g. downloaded from NGC), use the `--pretrained-weights` option.
+
+The difference between the two is that pretrained weights contain only the model weights,
+while a checkpoint also contains the optimizer state, LR scheduler state, and RNG state.
+
+Checkpoints are suitable for splitting the training into parts, for example to divide
+the job into shorter stages, or to restart training after an infrastructure failure.
+
+Pretrained weights can be used as a base for finetuning the model on a different dataset,
+or as a backbone for detection models.
+
 ### Inference process
 
 Validation is done every epoch, and can be also run separately on a checkpointed model.
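
The distinction drawn above is easiest to see in what each file would contain: a resumable checkpoint bundles the model weights together with optimizer, scheduler, and RNG state, while a pretrained-weights file is just the model's `state_dict`. A minimal sketch; the dictionary keys are illustrative, not necessarily the exact layout this repo writes:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30)

# A resumable checkpoint: weights plus everything needed to continue training.
checkpoint = {
    "epoch": 42,
    "state_dict": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "lr_scheduler": scheduler.state_dict(),
    "rng_state": torch.get_rng_state(),
}
torch.save(checkpoint, "checkpoint.pth.tar")

# Pretrained weights: only the model parameters, for finetuning or as a backbone.
torch.save(model.state_dict(), "weights.pth.tar")

# Roughly what checkpoint2model.py does: pull the weights out of a checkpoint.
ckpt = torch.load("checkpoint.pth.tar", map_location="cpu")
torch.save(ckpt["state_dict"], "weights_from_checkpoint.pth.tar")
```
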
@@ -470,6 +487,27 @@ Then run classification script:
 
 `python classify.py --arch resnet50 -c fanin --weights <path to weights from previous step> --precision AMP|FP32 --image <path to JPEG image>`
 
+You can also run ImageNet validation using pretrained weights:
+
+`python ./main.py --arch resnet50 --evaluate --epochs 1 --pretrained-weights <path to pretrained weights> -b <batch size> <path to imagenet>`
+
+#### NGC pretrained weights
+
+Pretrained weights can be downloaded from NGC:
+
+```bash
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/resnet50_pyt_amp/versions/20.06.0/zip -O resnet50_pyt_amp_20.06.0.zip
+
+unzip resnet50_pyt_amp_20.06.0.zip
+```
+
+To run inference on ImageNet, run:
+
+`python ./main.py --arch resnet50 --evaluate --epochs 1 --pretrained-weights nvidia_resnet50_200821.pth.tar -b <batch size> <path to imagenet>`
+
+To run inference on a JPEG image using pretrained weights:
+
+`python classify.py --arch resnet50 -c fanin --weights nvidia_resnet50_200821.pth.tar --precision AMP|FP32 --image <path to JPEG image>`
 
 
 ## Performance

PyTorch/Classification/ConvNets/resnext101-32x4d/README.md

Lines changed: 45 additions & 7 deletions
@@ -266,17 +266,21 @@ Example:
 
 ### 6. Start inference
 
-To run inference on ImageNet on a checkpointed model, run:
+You can download pretrained weights from NGC:
 
-`python ./main.py --arch resnext101-32x4d --evaluate --epochs 1 --resume <path to checkpoint> -b <batch size> <path to imagenet>`
+```bash
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/resnext101_32x4d_pyt_amp/versions/20.06.0/zip -O resnext101_32x4d_pyt_amp_20.06.0.zip
 
-To run inference on JPEG image, you have to first extract the model weights from checkpoint:
+unzip resnext101_32x4d_pyt_amp_20.06.0.zip
+```
 
-`python checkpoint2model.py --checkpoint-path <path to checkpoint> --weight-path <path where weights will be stored>`
+To run inference on ImageNet, run:
 
-Then run classification script:
+`python ./main.py --arch resnext101-32x4d --evaluate --epochs 1 --pretrained-weights nvidia_resnext101-32x4d_200821.pth.tar -b <batch size> <path to imagenet>`
 
-`python classify.py --arch resnext101-32x4d -c fanin --weights <path to weights from previous step> --precision AMP|FP32 --image <path to JPEG image>`
+To run inference on a JPEG image using pretrained weights:
+
+`python classify.py --arch resnext101-32x4d -c fanin --weights nvidia_resnext101-32x4d_200821.pth.tar --precision AMP|FP32 --image <path to JPEG image>`
 
 
 ## Advanced
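
The `wget`/`unzip` step above can also be scripted; a small sketch using only the Python standard library, with the URL and archive name taken from the commands above (the weights file name inside the archive is the one the README references):

```python
import urllib.request
import zipfile

URL = ("https://api.ngc.nvidia.com/v2/models/nvidia/"
       "resnext101_32x4d_pyt_amp/versions/20.06.0/zip")
ARCHIVE = "resnext101_32x4d_pyt_amp_20.06.0.zip"

# Download the NGC archive (equivalent to the wget command above)...
urllib.request.urlretrieve(URL, ARCHIVE)

# ...and unpack it next to the training scripts (equivalent to unzip).
with zipfile.ZipFile(ARCHIVE) as archive:
    archive.extractall(".")
    # Expected to contain nvidia_resnext101-32x4d_200821.pth.tar
    print(archive.namelist())
```
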
@@ -431,6 +435,19 @@ Metrics gathered through training:
 - `train.data_time` - time spent on waiting on data
 - `train.compute_time` - time spent in forward/backward pass
 
+To restart training from a checkpoint, use the `--resume` option.
+
+To start training from pretrained weights (e.g. downloaded from NGC), use the `--pretrained-weights` option.
+
+The difference between the two is that pretrained weights contain only the model weights,
+while a checkpoint also contains the optimizer state, LR scheduler state, and RNG state.
+
+Checkpoints are suitable for splitting the training into parts, for example to divide
+the job into shorter stages, or to restart training after an infrastructure failure.
+
+Pretrained weights can be used as a base for finetuning the model on a different dataset,
+or as a backbone for detection models.
+
 ### Inference process
 
 Validation is done every epoch, and can be also run separately on a checkpointed model.
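
To make the finetuning use case above concrete: load the pretrained weights, replace the ImageNet classification head with one sized for the new dataset, and optionally freeze the backbone at first. A sketch using `torchvision`'s ResNeXt-101 32x4d as a stand-in for the repo's model builder; the commented-out load mirrors the weight handling in `classify.py`/`main.py`:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # classes in the new (non-ImageNet) dataset

# Stand-in backbone; the repo would build resnext101-32x4d itself and load
# the NGC weights (stripping the "module." prefix) instead:
model = models.resnext101_32x4d()
# weights = torch.load("nvidia_resnext101-32x4d_200821.pth.tar", map_location="cpu")
# weights = {k.replace("module.", ""): v for k, v in weights.items()}
# model.load_state_dict(weights)

# Replace the 1000-way ImageNet head with one sized for the new task.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Optionally freeze the backbone and train only the new head at first.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc.")

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01, momentum=0.9
)
```
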
@@ -454,8 +471,29 @@ To run inference on JPEG image, you have to first extract the model weights from
 
 Then run classification script:
 
-`python classify.py --arch resnext101-32x4d -c fanin --weights <path to weights from previous step> --precision AMP|
+`python classify.py --arch resnext101-32x4d -c fanin --weights <path to weights from previous step> --precision AMP|FP32 --image <path to JPEG image>`
+
+You can also run ImageNet validation using pretrained weights:
+
+`python ./main.py --arch resnext101-32x4d --evaluate --epochs 1 --pretrained-weights <path to pretrained weights> -b <batch size> <path to imagenet>`
+
+#### NGC pretrained weights
+
+Pretrained weights can be downloaded from NGC:
+
+```bash
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/resnext101-32x4d_pyt_amp/versions/20.06.0/zip -O resnext101-32x4d_pyt_amp_20.06.0.zip
+
+unzip resnext101-32x4d_pyt_amp_20.06.0.zip
+```
+
+To run inference on ImageNet, run:
+
+`python ./main.py --arch resnext101-32x4d --evaluate --epochs 1 --pretrained-weights nvidia_resnext101-32x4d_200821.pth.tar -b <batch size> <path to imagenet>`
+
+To run inference on a JPEG image using pretrained weights:
 
+`python classify.py --arch resnext101-32x4d -c fanin --weights nvidia_resnext101-32x4d_200821.pth.tar --precision AMP|FP32 --image <path to JPEG image>`
 
 
 ## Performance

PyTorch/Classification/ConvNets/se-resnext101-32x4d/README.md

Lines changed: 44 additions & 6 deletions
@@ -267,17 +267,21 @@ Example:
 
 ### 6. Start inference
 
-To run inference on ImageNet on a checkpointed model, run:
+You can download pretrained weights from NGC:
 
-`python ./main.py --arch se-resnext101-32x4d --evaluate --epochs 1 --resume <path to checkpoint> -b <batch size> <path to imagenet>`
+```bash
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/seresnext101_32x4d_pyt_amp/versions/20.06.0/zip -O seresnext101_32x4d_pyt_amp_20.06.0.zip
 
-To run inference on JPEG image, you have to first extract the model weights from checkpoint:
+unzip seresnext101_32x4d_pyt_amp_20.06.0.zip
+```
 
-`python checkpoint2model.py --checkpoint-path <path to checkpoint> --weight-path <path where weights will be stored>`
+To run inference on ImageNet, run:
 
-Then run classification script:
+`python ./main.py --arch se-resnext101-32x4d --evaluate --epochs 1 --pretrained-weights nvidia_se-resnext101-32x4d_200821.pth.tar -b <batch size> <path to imagenet>`
 
-`python classify.py --arch se-resnext101-32x4d -c fanin --weights <path to weights from previous step> --precision AMP|FP32 --image <path to JPEG image>`
+To run inference on a JPEG image using pretrained weights:
+
+`python classify.py --arch se-resnext101-32x4d -c fanin --weights nvidia_se-resnext101-32x4d_200821.pth.tar --precision AMP|FP32 --image <path to JPEG image>`
 
 
 ## Advanced
@@ -432,6 +436,19 @@ Metrics gathered through training:
 - `train.data_time` - time spent on waiting on data
 - `train.compute_time` - time spent in forward/backward pass
 
+To restart training from a checkpoint, use the `--resume` option.
+
+To start training from pretrained weights (e.g. downloaded from NGC), use the `--pretrained-weights` option.
+
+The difference between the two is that pretrained weights contain only the model weights,
+while a checkpoint also contains the optimizer state, LR scheduler state, and RNG state.
+
+Checkpoints are suitable for splitting the training into parts, for example to divide
+the job into shorter stages, or to restart training after an infrastructure failure.
+
+Pretrained weights can be used as a base for finetuning the model on a different dataset,
+or as a backbone for detection models.
+
 ### Inference process
 
 Validation is done every epoch, and can be also run separately on a checkpointed model.
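
As a reference for what `--evaluate` measures: at its core it is a top-1 accuracy pass over the ImageNet validation split. A bare-bones sketch with placeholder paths and a torchvision stand-in model (the repo's `main.py` uses its own data pipeline and model builder):

```python
import torch
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# <path to imagenet>/val laid out as one directory per class.
val_set = datasets.ImageFolder("/data/imagenet/val", transform=preprocess)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=256, num_workers=4)

model = models.resnet50()  # stand-in; load pretrained weights as shown earlier
model.eval()

correct = total = 0
with torch.no_grad():
    for images, labels in val_loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"top-1 accuracy: {correct / total:.4f}")
```
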
@@ -457,6 +474,27 @@ Then run classification script:
 
 `python classify.py --arch se-resnext101-32x4d -c fanin --weights <path to weights from previous step> --precision AMP|FP32 --image <path to JPEG image>`
 
+You can also run ImageNet validation using pretrained weights:
+
+`python ./main.py --arch se-resnext101-32x4d --evaluate --epochs 1 --pretrained-weights <path to pretrained weights> -b <batch size> <path to imagenet>`
+
+#### NGC pretrained weights
+
+Pretrained weights can be downloaded from NGC:
+
+```bash
+wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/seresnext101_32x4d_pyt_amp/versions/20.06.0/zip -O seresnext101_32x4d_pyt_amp_20.06.0.zip
+
+unzip seresnext101_32x4d_pyt_amp_20.06.0.zip
+```
+
+To run inference on ImageNet, run:
+
+`python ./main.py --arch se-resnext101-32x4d --evaluate --epochs 1 --pretrained-weights nvidia_se-resnext101-32x4d_200821.pth.tar -b <batch size> <path to imagenet>`
+
+To run inference on a JPEG image using pretrained weights:
+
+`python classify.py --arch se-resnext101-32x4d -c fanin --weights nvidia_se-resnext101-32x4d_200821.pth.tar --precision AMP|FP32 --image <path to JPEG image>`
 
 
 ## Performance
