How to keep track of all outputs during training? #16127
-
Dear Lightning discussion channel, thank you for building such an amazing tool and development environment. I am currently trying to build a simple callback that keeps hold of all outputs and targets of a model during training. I have a rough draft, but the images are not fed in a fixed order (`shuffle=True`), so I wonder if there is a way to recover each image's index in the dataset. My current approach is to define the following method:

```python
def on_train_batch_end(
    self, trainer: "pl.Trainer", pl_module: "pl.LightningModule",
    outputs: STEP_OUTPUT, batch: Any, batch_idx: int,
) -> None:
    ...
    self.outputs[trainer.current_epoch, batch_size * batch_idx : (batch_idx + 1) * batch_size] = outputs["outputs"]
    self.targets[trainer.current_epoch, batch_size * batch_idx : (batch_idx + 1) * batch_size] = outputs["targets"]
```

However, due to shuffling, the slice `batch_size * batch_idx : (batch_idx + 1) * batch_size` does not correspond to a fixed set of samples across epochs, so I am still looking for a simple way to do this. I thought about hashing the value of each image, but data augmentation makes that unreliable. Hopefully you can help me!
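For reference, a minimal self-contained version of such a callback might look like the sketch below. The preallocation shapes, the `num_epochs`/`num_samples` constructor arguments, and the dict keys returned by `training_step` are assumptions for illustration, not from the original post:

```python
import torch
from pytorch_lightning.callbacks import Callback


class OutputTracker(Callback):
    # Keeps a copy of every training output and target, indexed by epoch.
    # Assumes training_step returns a dict with "outputs" and "targets",
    # each holding one scalar per sample.
    def __init__(self, num_epochs: int, num_samples: int, batch_size: int) -> None:
        self.batch_size = batch_size
        self.outputs = torch.zeros(num_epochs, num_samples)
        self.targets = torch.zeros(num_epochs, num_samples)

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        start = batch_idx * self.batch_size
        end = start + len(outputs["outputs"])  # the last batch may be smaller
        self.outputs[trainer.current_epoch, start:end] = outputs["outputs"].detach().cpu()
        self.targets[trainer.current_epoch, start:end] = outputs["targets"].detach().cpu()
```

As the question notes, this only stores results in dataset order when the dataloader does not shuffle; with `shuffle=True` the `start:end` slice maps to a different set of samples every epoch.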
Replies: 1 comment
-
In the end, the solution was simply to modify the dataset class dynamically:

```python
from typing import Any, Tuple


def index_return_wrapper(dataset_cls: type) -> type:
    # Subclass the given dataset so __getitem__ also returns the sample index.
    class IndexReturnWrapper(dataset_cls):
        def __getitem__(self, index: int) -> Tuple[Any, Any, int]:
            return super().__getitem__(index) + (index,)

    return IndexReturnWrapper
```
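For illustration, here is how the wrapper might be used, and how the returned index lets the callback store results in dataset order regardless of shuffling. The concrete dataset class and the dict keys are assumptions, not part of the original answer:

```python
from torchvision.datasets import CIFAR10

# Wrap any dataset class so each sample also yields its dataset index.
IndexedCIFAR10 = index_return_wrapper(CIFAR10)
dataset = IndexedCIFAR10(root="data", train=True, download=True)

image, target, index = dataset[0]  # __getitem__ now returns (image, target, index)
```

With the index available in every batch, `training_step` can pass it through (e.g. `return {"loss": loss, "outputs": out.detach(), "targets": y, "indices": idx}`), and the callback can then scatter by index instead of relying on batch order:

```python
epoch = trainer.current_epoch
self.outputs[epoch, outputs["indices"]] = outputs["outputs"].detach().cpu()
self.targets[epoch, outputs["indices"]] = outputs["targets"].detach().cpu()
```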