CometLogger can modify logged metrics in-place  #7021

@neighthan

Description

When logger.log_metrics(metrics) is called with a CometLogger, metrics may be modified in place. This can lead to confusing errors. For example, if the user does

def training_step(self, batch, batch_idx):
    losses = self._get_losses(batch)
    self.logger.log_metrics(losses)
    return losses

then losses will have all of its tensors moved to the CPU and detached from the autograd graph, so the subsequent backward pass fails with an error like RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.
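
A minimal reproduction with plain torch, simulating the logger's conversion loop, shows the failure mode:

import torch

loss = (torch.ones(1, requires_grad=True) * 2).sum()
metrics = {"loss": loss}

# Simulate CometLogger's in-place conversion:
for key, val in metrics.items():
    if torch.is_tensor(val):
        metrics[key] = val.cpu().detach()

# The caller's dict now holds the detached tensor, so backprop fails:
metrics["loss"].backward()
# RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn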

None of the other loggers change metrics in place when log_metrics is called. All of them except neptune annotate the argument as metrics: Dict[str, float], though some (e.g. the tensorboard logger) have code to handle torch.Tensors or other types as well.

The CSVLogger uses the following for handling tensors:

def _handle_value(value):
    if isinstance(value, torch.Tensor):
        return value.item()
    return value
...
metrics = {k: _handle_value(v) for k, v in metrics_dict.items()}

The TensorBoardLogger similarly has

for k, v in metrics.items():
    if isinstance(v, torch.Tensor):
        v = v.item()
    ...
    self.experiment.add_scalar(k, v, step)

In the CometLogger, the current tensor conversion code is

for key, val in metrics.items():
    if is_tensor(val):
        metrics[key] = val.cpu().detach()

but then the entire metrics dictionary is copied later in the same function anyway, so mutating the caller's dict first and then copying everything doesn't really make sense. A non-mutating version is sketched below.
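
One possible fix (a sketch only, not the actual patch) is to build a new dictionary instead of mutating the one passed in, reusing the is_tensor already imported there:

# Build a fresh dict; the caller's metrics dictionary is left untouched.
metrics = {key: (val.cpu().detach() if is_tensor(val) else val) for key, val in metrics.items()}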

I'm happy to submit a PR to fix this so that the CometLogger doesn't modify the original metrics dictionary. I just wanted to ask for a couple of opinions before changing things:

  1. Should I keep the current tensor conversion behavior for CometLogger (val.cpu().detach()) or switch to using val.item()? My preference is the latter, though this does change the behavior (see the comparison at the end).
  2. Should I update the other loggers so they all accept metrics: Dict[str, Union[float, torch.Tensor]] and use the same method (probably imported from loggers/base.py) to convert to a Dict[str, float]? A sketch of such a helper follows this list.
    • I don't know the other loggers well, so I'm not sure whether tensors are actually unsupported or the type annotation is just imprecise and the conversion happens in third-party code.
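
For concreteness, the shared helper could look something like the following (the name _sanitize_metrics and its placement in loggers/base.py are my suggestion, not existing API):

from typing import Dict, Union

import torch

def _sanitize_metrics(metrics: Dict[str, Union[float, torch.Tensor]]) -> Dict[str, float]:
    # Return a new dict with tensors converted to Python floats; the input is never mutated.
    return {k: v.item() if isinstance(v, torch.Tensor) else v for k, v in metrics.items()}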

val.cpu().detach() vs val.item()

  • Comet has partial support for tensors with more than one element, so the first method keeps logging such tensors "valid" while the second would raise an error (see the snippet below). However, I don't think anybody relies on this behavior on purpose. If you do logger.log_metrics({"test": torch.tensor([1.0, 10.0])}), you get COMET WARNING: Cannot safely convert array([ 1., 10.], dtype=float32) object to a scalar value, using its string representation for logging. The metric doesn't even appear in the CometML web interface, so I assume you can only access it by querying for it directly through their API.
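
To illustrate the difference (assuming a recent torch; the exact exception wording may vary):

t = torch.tensor([1.0, 10.0])
t.cpu().detach()  # works: returns a detached CPU copy, which Comet then stringifies
t.item()          # raises ValueError: only one element tensors can be converted to Python scalars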

Labels: bug (Something isn't working), good first issue (Good for newcomers), help wanted (Open to be worked on), logger (Related to the Loggers), priority: 2 (Low priority task)
