PyTorchProfiler: not showing CPU memory used even with profile_memory=True #20339

@Jack12xl

Bug description

I am trying to use PyTorchProfiler (https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.profilers.PyTorchProfiler.html) to track down some CPU out-of-memory (OOM) issues.
I configure it as follows:

profiler = PyTorchProfiler(
    dirpath=log_dir,  # Directory to save logs
    filename="memory_profile",  # Name of the file to save results
    sort_by_key="self_cpu_memory_usage",  # Sort by CPU memory usage
    export_to_chrome=True,  # Export as JSON for Chrome
    row_limit=16,
    activities=[torch.profiler.ProfilerActivity.CPU],
    profile_memory=True,  # Record CPU memory usage
    with_stack=True,
    record_shapes=True,
)

trainer = pl.Trainer(..., profiler=profiler, ...)
trainer.fit(model)  # model: the LightningModule being trained (not shown)

I expected results similar to the native PyTorch profiler (https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html#using-profiler-to-analyze-memory-consumption), but the output still shows only CPU/GPU time, as here:
image

I don't know whether this is a bug. I thought PyTorchProfiler was a wrapper around the native PyTorch profiler, so it should behave similarly when I set profile_memory=True.
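
For comparison, here is a minimal sketch of what I mean by the native behavior, using the same flags. The toy `torch.nn.Linear` model and the random input are placeholders (assumptions for illustration, not my real training code). The point is only that the summary table gains `CPU Mem` and `Self CPU Mem` columns when `profile_memory=True`:

```python
import torch
from torch.profiler import ProfilerActivity, profile

# Placeholder model and input, only here to generate some CPU allocations.
model = torch.nn.Linear(64, 32)
inputs = torch.randn(8, 64)

with profile(
    activities=[ProfilerActivity.CPU],
    profile_memory=True,  # record tensor allocations and frees
    record_shapes=True,
) as prof:
    model(inputs)

# Same sort key and row limit as in the PyTorchProfiler config above.
memory_table = prof.key_averages().table(
    sort_by="self_cpu_memory_usage", row_limit=16
)
print(memory_table)
```

This is the kind of table I expected to find in the file written by PyTorchProfiler.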

Thanks! Please correct me if I am wrong!

What version are you seeing the problem on?

v2.4

How to reproduce the bug

import torch
import lightning.pytorch as pl
from lightning.pytorch.profilers import PyTorchProfiler

profiler = PyTorchProfiler(
    dirpath=log_dir,  # Directory to save logs
    filename="memory_profile",  # Name of the file to save results
    sort_by_key="self_cpu_memory_usage",  # Sort by CPU memory usage
    export_to_chrome=True,  # Export as JSON for Chrome
    row_limit=16,
    activities=[torch.profiler.ProfilerActivity.CPU],
    profile_memory=True,  # Record CPU memory usage
    with_stack=True,
    record_shapes=True,
)

trainer = pl.Trainer(..., profiler=profiler, ...)
trainer.fit(model)  # model: the LightningModule being trained (not shown)

Error messages and logs

# Error messages and logs here please

Environment

Current environment
* CUDA:
	- GPU:
		- NVIDIA 30xx GPU
	- available:         True
	- version:           12.1
* Lightning:
	- lightning:         2.4.0
	- lightning-utilities: 0.11.7
	- pytorch-lightning: 2.4.0
	- torch:             2.3.1
	- torchaudio:        2.3.1
	- torchdata:         0.8.0
	- torchmetrics:      1.4.1
	- torchvision:       0.18.1

Python: 3.12.4

More info

This is not directly related to the issue above, but is there some way to get the export_memory_timeline (https://pytorch.org/docs/main/profiler.html#torch.profiler._KinetoProfile.export_memory_timeline) behavior with Lightning's PyTorchProfiler?

Thanks!
