Closed
18 changes: 14 additions & 4 deletions verl/utils/model.py
@@ -724,12 +724,22 @@
         if "image_bound" in inputs:
             has_image_bound = True
         for key, value in inputs.items():
-            if value is not None:
-                if key not in multi_modal_inputs_collected:
-                    multi_modal_inputs_collected[key] = []
-                multi_modal_inputs_collected[key].append(value)
+            # Skip None values and empty tensors/lists to handle text-only samples
+            # in mixed text/multi-modal batches
+            if value is None:
+                continue
+            if isinstance(value, torch.Tensor) and value.numel() == 0:
+                continue
+            if isinstance(value, (list, tuple)) and len(value) == 0:

Check failure on line 733 in verl/utils/model.py
GitHub Actions / pre-commit (3.12)
Ruff (UP038)
verl/utils/model.py:733:16: UP038 Use `X | Y` in `isinstance` call instead of `(X, Y)`

+                continue
+            if key not in multi_modal_inputs_collected:
+                multi_modal_inputs_collected[key] = []
+            multi_modal_inputs_collected[key].append(value)
 
     for key, values in multi_modal_inputs_collected.items():
+        # Skip if no valid tensors were collected for this key
+        if not values:
+            continue
         if has_image_bound: # minicpm-o logic
             multi_modal_inputs[key] = values
         else: