"ValueError: You selected an invalid strategy name" When DDPStrategy(process_group_backend="gloo") is passed #20526

@11philip22

Bug description

When I run the code below on Python 3.12.8 with pytorch-lightning 2.4.0, I get a ValueError saying the strategy name is invalid, even though I pass a DDPStrategy instance.

What version are you seeing the problem on?

v2.4

How to reproduce the bug

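# Imports are not shown in the original report; these are inferred from the
# traceback (Trainer from pytorch_lightning, DDPStrategy from lightning.pytorch).
from pytorch_lightning import Trainer
from lightning.pytorch.strategies import DDPStrategy

ddp_gloo = DDPStrategy(process_group_backend="gloo")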

trainer = Trainer(
    devices=2,
    # devices=1,
    accelerator='gpu',
    strategy=ddp_gloo,
    benchmark=True,
    logger=logger,
    callbacks=[checkpoint_callback, lr_monitor],
    check_val_every_n_epoch=1,
    max_epochs=30,
    # max_epochs=3,
)
trainer.fit(model, data_module)

Error messages and logs

Traceback (most recent call last):
  File "C:\Users\Philip\source\repos\insightface_alignment_lightning\src\train.py", line 59, in <module>
    main()
  File "C:\Users\Philip\source\repos\insightface_alignment_lightning\src\train.py", line 43, in main
    trainer = Trainer(
              ^^^^^^^^
  File "C:\Users\Philip\.conda\envs\lightning\Lib\site-packages\pytorch_lightning\utilities\argparse.py", line 70, in insert_env_defaults
    return fn(self, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Philip\.conda\envs\lightning\Lib\site-packages\pytorch_lightning\trainer\trainer.py", line 395, in __init__
    self._accelerator_connector = _AcceleratorConnector(
                                  ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Philip\.conda\envs\lightning\Lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py", line 130, in __init__
    self._check_config_and_set_final_flags(
  File "C:\Users\Philip\.conda\envs\lightning\Lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py", line 193, in _check_config_and_set_final_flags
    raise ValueError(
ValueError: You selected an invalid strategy name: `strategy=<lightning.pytorch.strategies.ddp.DDPStrategy object at 0x0000023622FA2240>`. It must be either a string or an instance of `pytorch_lightning.strategies.Strategy`. Example choices: auto, ddp, ddp_spawn, deepspeed, ... Find a complete list of options in our documentation at https://lightning.ai
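
A note on reading this error: the traceback runs through pytorch_lightning, while the rejected object is a lightning.pytorch.strategies.ddp.DDPStrategy, and the message asks for an instance of pytorch_lightning.strategies.Strategy. That suggests, but does not by itself confirm, that the Trainer and the strategy were imported from two different packages. The sketch below is a hypothetical helper (top_level_package and namespace_mismatch are not part of Lightning) that compares the top-level package of two objects so a mix-up of this kind can be spotted before the Trainer is constructed.

```python
from typing import Optional


def top_level_package(obj) -> str:
    """Return the top-level package that defines obj (a class or an instance)."""
    cls = obj if isinstance(obj, type) else type(obj)
    return cls.__module__.split(".")[0]


def namespace_mismatch(trainer_cls, strategy) -> Optional[str]:
    """Return a hint if the two objects come from different packages, else None.

    Hypothetical helper for illustration only; it inspects __module__ and does
    not call into Lightning.
    """
    trainer_pkg = top_level_package(trainer_cls)
    strategy_pkg = top_level_package(strategy)
    if trainer_pkg == strategy_pkg:
        return None
    return (
        f"Trainer comes from '{trainer_pkg}' but the strategy comes from "
        f"'{strategy_pkg}'; import both from the same package."
    )


# Assumed usage with the objects from the report:
#   hint = namespace_mismatch(Trainer, ddp_gloo)
#   if hint:
#       print(hint)
```

If the helper reports a mismatch, importing both Trainer and DDPStrategy from the same package (either both from lightning.pytorch or both from pytorch_lightning) is the first thing to try.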

Environment

Current environment
  • CUDA:
    - GPU:
        - Quadro P6000
        - Quadro P6000
    - available: True
    - version: 12.4
  • Lightning:
    - efficientnet-pytorch: 0.7.1
    - lightning: 2.4.0
    - lightning-utilities: 0.11.9
    - pytorch-lightning: 2.4.0
    - segmentation-models-pytorch: 0.3.5.dev0
    - torch: 2.5.1
    - torchmetrics: 1.6.0
    - torchvision: 0.20.1
  • Packages:
    - absl-py: 2.1.0
    - aiohappyeyeballs: 2.4.4
    - aiohttp: 3.11.11
    - aiosignal: 1.3.2
    - albucore: 0.0.21
    - albumentations: 1.4.23
    - annotated-types: 0.7.0
    - attrs: 24.3.0
    - autocommand: 2.2.2
    - backports.tarfile: 1.2.0
    - brotli: 1.1.0
    - certifi: 2024.12.14
    - cffi: 1.17.1
    - charset-normalizer: 3.4.0
    - colorama: 0.4.6
    - contourpy: 1.3.1
    - cycler: 0.12.1
    - efficientnet-pytorch: 0.7.1
    - eval-type-backport: 0.2.0
    - filelock: 3.16.1
    - fonttools: 4.55.3
    - frozenlist: 1.5.0
    - fsspec: 2024.10.0
    - grpcio: 1.68.1
    - h2: 4.1.0
    - hpack: 4.0.0
    - huggingface-hub: 0.27.0
    - hyperframe: 6.0.1
    - idna: 3.10
    - importlib-metadata: 8.0.0
    - inflect: 7.3.1
    - jaraco.collections: 5.1.0
    - jaraco.context: 5.3.0
    - jaraco.functools: 4.0.1
    - jaraco.text: 3.12.1
    - jinja2: 3.1.4
    - kiwisolver: 1.4.7
    - lightning: 2.4.0
    - lightning-utilities: 0.11.9
    - markdown: 3.7
    - markupsafe: 3.0.2
    - matplotlib: 3.10.0
    - more-itertools: 10.3.0
    - mpmath: 1.3.0
    - multidict: 6.1.0
    - munch: 4.0.0
    - networkx: 3.4.2
    - numpy: 2.2.0
    - opencv-python: 4.10.0.84
    - opencv-python-headless: 4.10.0.84
    - packaging: 24.2
    - pillow: 10.4.0
    - pip: 24.3.1
    - platformdirs: 4.2.2
    - pretrainedmodels: 0.7.4
    - propcache: 0.2.1
    - protobuf: 5.29.2
    - pycocotools: 2.0.8
    - pycparser: 2.22
    - pydantic: 2.10.4
    - pydantic-core: 2.27.2
    - pyparsing: 3.2.0
    - pysocks: 1.7.1
    - python-dateutil: 2.9.0.post0
    - pytorch-lightning: 2.4.0
    - pyyaml: 6.0.2
    - requests: 2.32.3
    - safetensors: 0.5.0
    - scipy: 1.14.1
    - segmentation-models-pytorch: 0.3.5.dev0
    - setuptools: 75.6.0
    - simsimd: 6.2.1
    - six: 1.17.0
    - stringzilla: 3.11.2
    - sympy: 1.13.1
    - tensorboard: 2.18.0
    - tensorboard-data-server: 0.7.2
    - timm: 1.0.12
    - tomli: 2.0.1
    - torch: 2.5.1
    - torchmetrics: 1.6.0
    - torchvision: 0.20.1
    - tqdm: 4.67.1
    - typeguard: 4.3.0
    - typing-extensions: 4.12.2
    - urllib3: 2.2.3
    - werkzeug: 3.1.3
    - wheel: 0.45.1
    - win-inet-pton: 1.1.0
    - yarl: 1.18.3
    - zipp: 3.19.2
    - zstandard: 0.23.0
  • System:
    - OS: Windows
    - architecture:
        - 64bit
        - WindowsPE
    - processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
    - python: 3.12.8
    - release: 10
    - version: 10.0.19045
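
The package list above contains both lightning 2.4.0 and pytorch-lightning 2.4.0. As a minimal sketch, assuming only the standard library's importlib.metadata, the following checks which of the two distributions are installed in the active environment. The helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Both distributions being present means both import paths
    # (lightning.pytorch and pytorch_lightning) are available.
    for name, ver in installed_versions(["lightning", "pytorch-lightning"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

When both are installed, it is easy to import from each of them by accident in the same script.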

More info

No response

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working), repro needed (The issue is missing a reproducible example), ver: 2.4.x
