flop_counts missing for deepspeed in 2.5.5 #21231

@bnestor

Bug description

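In version 2.5.5, `DeepSpeedSummary` has no `flop_counts` attribute, so the model summary fails when training with a DeepSpeed stage 3 strategy.

`flop_counts` does exist on the main branch: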

def flop_counts(self) -> dict[str, dict[Any, int]]:

Commenting out this line in `model_summary_deepspeed.py`:

("FLOPs", list(map(get_human_readable_count, (sum(x.values()) for x in self.flop_counts.values())))),

lets `trainer.fit` run without the error.
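A less invasive variant of the same idea is to build the FLOPs column only when the attribute is present. The following is a minimal sketch of that guard using stand-in classes, not Lightning's actual code: `SummaryWithoutFlops`, `SummaryWithFlops`, `layer_names`, and `build_columns` are hypothetical names introduced for illustration.

```python
from typing import Any


class SummaryWithoutFlops:
    """Hypothetical stand-in for a summary class that lacks `flop_counts`."""

    layer_names = ["encoder", "decoder"]


class SummaryWithFlops(SummaryWithoutFlops):
    """Hypothetical stand-in that exposes per-layer FLOP counts."""

    flop_counts: dict[str, dict[Any, int]] = {
        "encoder": {"mm": 1_000, "add": 24},
        "decoder": {"mm": 2_000},
    }


def build_columns(summary: object) -> list[tuple[str, list]]:
    """Build summary columns, adding FLOPs only if the attribute exists."""
    columns: list[tuple[str, list]] = [("Name", list(summary.layer_names))]
    # getattr with a default avoids the AttributeError on classes
    # that do not define `flop_counts`.
    flop_counts = getattr(summary, "flop_counts", None)
    if flop_counts is not None:
        # Sum the per-operation counts of each layer, as in the failing line.
        columns.append(("FLOPs", [sum(x.values()) for x in flop_counts.values()]))
    return columns
```

With this guard, a summary object without `flop_counts` simply produces a table with no FLOPs column instead of raising.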

What version are you seeing the problem on?

v2.5

Reproduced in studio

No response

How to reproduce the bug

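from torch.utils.data import DataLoader
from pytorch_lightning import Trainer

# my_dataset and AnyModel are placeholders for any dataset and LightningModule
dataloader = DataLoader(my_dataset)
ptl_model = AnyModel()

trainer = Trainer(accelerator="auto", strategy="deepspeed_stage_3")

trainer.fit(ptl_model, dataloader)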

Error messages and logs

Traceback (most recent call last):
  File "myfile.py", line 373, in main
    trainer.fit(ptl_model, training_dataloader, validation_dataloader)
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 560, in fit
    call._call_and_handle_interrupt(
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 48, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 105, in launch
    return function(*args, **kwargs)
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 598, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 991, in _run
    call._call_callback_hooks(self, "on_fit_start")
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 228, in _call_callback_hooks
    fn(trainer, trainer.lightning_module, *args, **kwargs)
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/callbacks/model_summary.py", line 65, in on_fit_start
    summary_data = model_summary._get_summary_data()
  File "envs/torch/lib/python3.10/site-packages/pytorch_lightning/utilities/model_summary/model_summary_deepspeed.py", line 102, in _get_summary_data
    ("FLOPs", list(map(get_human_readable_count, (sum(x.values()) for x in self.flop_counts.values())))),
AttributeError: 'DeepSpeedSummary' object has no attribute 'flop_counts'
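
The traceback shows the failure inside the `ModelSummary` callback's `on_fit_start` hook, so one possible workaround that avoids editing the installed package is to turn the model summary off. The sketch below assumes that passing `enable_model_summary=False` to `Trainer` keeps that callback from being registered; this has not been verified against 2.5.5 with DeepSpeed. `TRAINER_KWARGS` and `build_trainer` are hypothetical names used only for illustration.

```python
# Trainer arguments from the reproduction, plus the summary switch.
TRAINER_KWARGS = {
    "accelerator": "auto",
    "strategy": "deepspeed_stage_3",
    # Assumption: with the summary disabled, the ModelSummary callback
    # (and therefore DeepSpeedSummary._get_summary_data) never runs.
    "enable_model_summary": False,
}


def build_trainer():
    """Create the Trainer; imported lazily so this module loads without Lightning."""
    from pytorch_lightning import Trainer

    return Trainer(**TRAINER_KWARGS)
```

The trade-off is that no summary table is printed at all, including the parameter counts, so this only makes sense as a stopgap.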

Environment

Current environment
#- PyTorch Lightning Version (e.g., 2.5.0): 2.5.5
#- PyTorch Version (e.g., 2.5): 2.8.0
#- Python version (e.g., 3.12): 3.10
#- OS (e.g., Linux): Linux
#- CUDA/cuDNN version: 12.6
#- GPU models and configuration: NVIDIA RTX series
#- How you installed Lightning(`conda`, `pip`, source): pip install pytorch-lightning

More info

No response

cc @lantiga
