--------------------------------------------------------------------------------------------------------
| epoch | train_loss | val_loss | train_acc | val_acc | ema_val_acc | total_time_seconds |
--------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "main.py", line 621, in <module>
main()
File "main.py", line 540, in main
for epoch_step, (inputs, targets) in enumerate(get_batches(data, key='train', batchsize=batchsize)):
File "main.py", line 428, in get_batches
images = batch_crop(data_dict[key]['images'], 32) # TODO: hardcoded image size for now?
File "main.py", line 390, in batch_crop
cropped_batch = torch.masked_select(inputs, crop_mask_batch).view(inputs.shape[0], inputs.shape[1], crop_size, crop_size)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.58 GiB (GPU 0; 5.81 GiB total capacity; 835.23 MiB already allocated; 2.35 GiB free; 1.24 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
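For reference, the allocator setting suggested at the end of that message can be tried without any code changes. It only works around fragmentation and will not make a 4.58 GiB allocation fit, but it is cheap to test. A minimal sketch, assuming the variable is set before the first CUDA allocation (e.g. at the very top of main.py); the split size below is just an example value, not a recommendation:

```python
import os

# Must be set before torch initializes the CUDA caching allocator.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

import torch  # imported after the variable is set so the allocator picks it up
```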
I have not looked at the code too closely, but it might be possible to shave off a few MB when preparing batches.
Thank you for this comment by the way.
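Not a definitive fix, but one way batch preparation could avoid the full-size boolean mask that batch_crop builds for masked_select: gather each image's random crop with advanced indexing, and do it in chunks so the temporary buffers never cover the whole dataset at once. The function name, chunk size, and exact indexing here are my own sketch, not code from main.py:

```python
import torch

def batch_crop_chunked(images: torch.Tensor, crop_size: int, chunk: int = 5000) -> torch.Tensor:
    """Hypothetical memory-leaner crop: per-image random offsets gathered with
    advanced indexing (no full-size boolean mask), processed in chunks so the
    temporaries are bounded by the chunk size rather than the dataset size."""
    n, c, h, w = images.shape
    out = torch.empty((n, c, crop_size, crop_size), dtype=images.dtype, device=images.device)
    ar = torch.arange(crop_size, device=images.device)
    for start in range(0, n, chunk):
        block = images[start:start + chunk]
        m = block.shape[0]
        top = torch.randint(0, h - crop_size + 1, (m,), device=images.device)
        left = torch.randint(0, w - crop_size + 1, (m,), device=images.device)
        rows = (top[:, None] + ar)[:, :, None]                      # (m, crop, 1)
        cols = (left[:, None] + ar)[:, None, :]                     # (m, 1, crop)
        batch = torch.arange(m, device=images.device)[:, None, None]  # (m, 1, 1)
        # Advanced indexing keeps the channel dim as a slice, so the gathered
        # result is (m, crop, crop, c) and gets permuted back to NCHW.
        out[start:start + chunk] = block[batch, :, rows, cols].permute(0, 3, 1, 2)
    return out
```

I have not measured whether this actually saves memory in practice; it is only meant to illustrate where the mask and gather buffers could be trimmed.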
Line 523 in 132829f:

    ## has a timing feature too, but there's no synchronizes so I suspect the times reported are much faster than they may be in actuality
I totally forgot to add torch.cuda.synchronize(), but it is finally fixed in https://github.com/99991/cifar10-fast-simple now. Fortunately, it did not make much of a difference: I now get 14.3 seconds with my code vs 15.7 seconds with your code. Perhaps there is something during batch preparation that makes a difference?
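For completeness, this is a minimal sketch of the synchronize-before-timing pattern being discussed, so queued GPU work is counted in the measured interval. The helper name is illustrative, not a function from either repo:

```python
import time
import torch

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds). Assumes a CUDA device;
    synchronizing on both sides keeps pending kernels inside the measurement."""
    torch.cuda.synchronize()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    torch.cuda.synchronize()
    return result, time.perf_counter() - start
```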