sequence packing for gpt-oss #1782

@jordane95

Description

Is your feature request related to a problem? Please describe.
I want to use sequence packing for better efficiency when fine-tuning gpt-oss, but I hit an assertion error.

Describe the solution you'd like
I would like to know why sequence packing is currently not supported for gpt-oss, and if there is no fundamental blocker, I would like sequence packing support to be added for gpt-oss.

Describe alternatives you've considered
I tried removing the assert. The log then shows that no dot product attention backend is available for the provided inputs. After adding some debug output, I found that all attention backends are disabled because softmax_type = 'learnable' is combined with qkv_format = 'thd'.
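
For reference, a minimal sketch of the failing configuration is below. This is an assumption-based repro, not a confirmed one: the argument names (in particular softmax_type) and the exact forward signature may differ across Transformer Engine versions.

```python
# Minimal sketch (assumptions, not a confirmed repro): exercise Transformer Engine's
# DotProductAttention with packed sequences (qkv_format='thd') together with the
# gpt-oss sink softmax (softmax_type='learnable'), the combination reported above.
import torch
import transformer_engine.pytorch as te

heads, head_dim = 8, 64

attn = te.DotProductAttention(
    num_attention_heads=heads,
    kv_channels=head_dim,
    qkv_format="thd",                # packed "total-tokens x heads x dim" layout
    attn_mask_type="padding_causal",
    softmax_type="learnable",        # assumed name of the knob behind the debug output above
)

# Two packed sequences of lengths 3 and 5 -> total_tokens = 8.
cu_seqlens = torch.tensor([0, 3, 8], dtype=torch.int32, device="cuda")
q = torch.randn(8, heads, head_dim, dtype=torch.bfloat16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# With this combination, backend selection reportedly finds no usable
# dot product attention backend and the call fails.
out = attn(
    q, k, v,
    cu_seqlens_q=cu_seqlens,
    cu_seqlens_kv=cu_seqlens,
    max_seqlen_q=5,
    max_seqlen_kv=5,
)
```

As far as I know, the per-backend selection reasons can also be printed by running with NVTE_DEBUG=1 NVTE_DEBUG_LEVEL=2, which may help confirm which constraint rules out each backend.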

