[BUG] SFT validation fails when use_sequence_packing is enabled #294

@XChen-Zero

Description

When use_sequence_packing is enabled in SFT training, validation fails during val_step.

In the current implementation:

  • train_step wraps the loss function with SequencePackingSFTLossWrapper
  • val_step directly uses the raw loss function

As a result, when sequence packing is enabled, forward_step receives incompatible
inputs and validation fails at runtime.

Affected Code

roll/pipeline/sft/sft_worker.py

Expected Behavior

Validation should apply the same loss wrapper as training when
use_sequence_packing is enabled, so that SFT training and validation work
correctly with sequence packing.
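
One way to picture the expected behavior is a single selection point that both steps call, so the wrapper can never be applied in one path and skipped in the other. The sketch below is illustrative only: `raw_loss`, `PackedLossWrapper`, `select_loss_fn`, and the simplified `train_step`/`val_step` signatures are hypothetical stand-ins, not the code in roll/pipeline/sft/sft_worker.py, and the real SequencePackingSFTLossWrapper constructor and behavior may differ.

```python
from typing import Callable, Sequence

# A loss function maps (predictions, targets) to a scalar.
LossFn = Callable[[Sequence[float], Sequence[float]], float]


def raw_loss(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Stand-in loss: mean squared error over one sequence."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


class PackedLossWrapper:
    """Hypothetical stand-in for SequencePackingSFTLossWrapper.

    Splits a packed (concatenated) batch back into sequences by length,
    applies the wrapped loss to each sequence, and averages the results.
    """

    def __init__(self, loss_fn: LossFn, seq_lens: Sequence[int]):
        self.loss_fn = loss_fn
        self.seq_lens = list(seq_lens)

    def __call__(self, preds: Sequence[float], targets: Sequence[float]) -> float:
        losses, start = [], 0
        for n in self.seq_lens:
            losses.append(self.loss_fn(preds[start:start + n], targets[start:start + n]))
            start += n
        return sum(losses) / len(losses)


def select_loss_fn(loss_fn: LossFn, use_sequence_packing: bool,
                   seq_lens: Sequence[int]) -> LossFn:
    """Single selection point shared by train_step and val_step."""
    if use_sequence_packing:
        return PackedLossWrapper(loss_fn, seq_lens)
    return loss_fn


def train_step(preds, targets, seq_lens, use_sequence_packing):
    loss_fn = select_loss_fn(raw_loss, use_sequence_packing, seq_lens)
    return loss_fn(preds, targets)


def val_step(preds, targets, seq_lens, use_sequence_packing):
    # Same selection as train_step, so validation sees the same wrapper.
    loss_fn = select_loss_fn(raw_loss, use_sequence_packing, seq_lens)
    return loss_fn(preds, targets)
```

With this shape, toggling use_sequence_packing changes both steps together, which is the property the issue asks for.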

Additional Context

I have submitted a PR that fixes this issue by applying
SequencePackingSFTLossWrapper in val_step:
