One-step-off-policy does not support IPv6 in distributed training #4771

### System Info

I am using the one-step-off-policy method for multi-machine training, but I encounter the following error during execution:

<img width="1151" height="198" alt="Image" src="https://github.com/user-attachments/assets/5300168a-4e06-4453-a15a-28c4bca28a8b" />

After investigating, I traced the error to the following code path:

File: `./recipe/one_step_off_policy/distributed_util.py`, line 61

<img width="767" height="645" alt="Image" src="https://github.com/user-attachments/assets/7c34ca48-c174-45a0-a1b8-a78ca9314c9a" />


At this line, the code calls a utility function from vLLM, defined in `./vllm/distributed/utils.py`:

<img width="767" height="747" alt="Image" src="https://github.com/user-attachments/assets/c49e6958-d0f4-4cbd-8bc8-24300265eabc" />

However, the implementation in `vllm/distributed/utils.py` only supports IPv4 addresses. When the training environment uses IPv6, distributed initialization fails.

Does this mean the one-step-off-policy distributed training pipeline currently does not support IPv6?
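
For illustration, here is a minimal sketch of the kind of IPv6-aware address handling that I think is missing. It rests on an assumption I have not confirmed against the vLLM source: that the failure comes from building a `tcp://host:port` rendezvous address without wrapping an IPv6 literal in square brackets. The helper names `is_ipv6_literal` and `format_tcp_uri` are hypothetical and exist in neither verl nor vLLM.

```python
import ipaddress


def is_ipv6_literal(host: str) -> bool:
    """Return True if `host` is an IPv6 address literal (not a hostname or IPv4)."""
    try:
        return ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal at all, e.g. a DNS hostname.
        return False


def format_tcp_uri(host: str, port: int) -> str:
    """Build a tcp:// URI, bracketing the host when it is an IPv6 literal.

    Hypothetical helper: in a URI, an IPv6 literal must be enclosed in
    brackets so its colons are not confused with the port separator.
    """
    host = host.strip("[]")  # tolerate input that is already bracketed
    if is_ipv6_literal(host):
        return f"tcp://[{host}]:{port}"
    return f"tcp://{host}:{port}"


if __name__ == "__main__":
    print(format_tcp_uri("10.0.0.1", 29500))
    print(format_tcp_uri("fd00::1", 29500))
```

If the assumption holds, passing the result of such a helper as the rendezvous address, instead of a plain `f"tcp://{host}:{port}"`, might be enough to let initialization proceed on IPv6-only hosts.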
