I tried running the examples from the README.md. The first example (using the full dataset) worked as expected. However, the second example (using train_info_batch) resulted in the following error:
```
$ nohup /usr/bin/time -v python examples/cifar_example.py --model r50 --optimizer lars --max-lr 5.2 --num_epoch 5 --delta 0.875 --ratio 0.5 --use_info_batch >> log_test.log 2>&1
nohup: ignoring input
==> Building model..
use normal data parallel
Use info batch.
<class 'infobatch.infobatch.IBSampler'>
Epoch: 0, iterations 391
Traceback (most recent call last):
  File "/home/vm03/Desktop/barbara/infobatch/InfoBatch/examples/cifar_example.py", line 269, in <module>
    train_info_batch(epoch) if args.use_info_batch else train_normal(epoch)
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vm03/Desktop/barbara/infobatch/InfoBatch/examples/cifar_example.py", line 191, in train_info_batch
    lr_scheduler.step()
  File "/home/vm03/anaconda3/envs/cp/lib/python3.11/site-packages/torch/optim/lr_scheduler.py", line 241, in step
    values = self.get_lr()
             ^^^^^^^^^^^^^
  File "/home/vm03/anaconda3/envs/cp/lib/python3.11/site-packages/torch/optim/lr_scheduler.py", line 2153, in get_lr
    raise ValueError(
ValueError: Tried to step 1956 times. The specified number of total steps is 1955
```
To speed up testing I initially reduced the number of epochs, but I also ran the original example with the full 200 epochs and hit the same error, just with a higher total step count.
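For context, judging by the message and the traceback, this `ValueError` comes from `torch.optim.lr_scheduler.OneCycleLR`, which raises once `step()` has been called more times than its configured `total_steps`. A minimal sketch reproducing the same failure mode (the model and optimizer below are placeholders; only the step counts, which mirror the log above, matter):

```python
import torch

# Placeholder model/optimizer; only the scheduler bookkeeping matters here.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 5 epochs x 391 iterations = 1955 total steps, matching the numbers in the log.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=5.2, total_steps=5 * 391
)

for _ in range(5 * 391 + 1):  # one extra call, as appears to happen in train_info_batch
    optimizer.step()
    scheduler.step()  # the 1956th call raises "Tried to step 1956 times. ..."
```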
It seems that the error originates in the batch loop inside the train_info_batch function, which appears to call lr_scheduler.step() one more time than the scheduler's total_steps allows:
```python
for batch_idx, blobs in enumerate(trainloader):
    inputs, targets = blobs
    inputs, targets = inputs.to(device), targets.to(device)
```
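A simple way to check this would be to count the scheduler steps per epoch and compare them with the 391 iterations reported in the log. A hypothetical diagnostic (not part of the example; it reuses the trainloader and lr_scheduler names from cifar_example.py):

```python
# Hypothetical diagnostic: count lr_scheduler.step() calls per epoch and overall,
# to see whether the per-epoch iteration count drifts from the expected 391.
total_steps_taken = 0
for epoch in range(num_epochs):  # num_epochs as set via --num_epoch
    steps_this_epoch = 0
    for batch_idx, blobs in enumerate(trainloader):
        # ... forward/backward/optimizer.step() as in train_info_batch ...
        lr_scheduler.step()
        steps_this_epoch += 1
        total_steps_taken += 1
    print(f"epoch {epoch}: {steps_this_epoch} steps, {total_steps_taken} total")
```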