Labels: community-request, enhancement, model-gptoss, t-seqpacking, x-open
Description
Is your feature request related to a problem? Please describe.
I want to use sequence packing for better efficiency when fine-tuning gpt-oss, but I hit an assertion error.
Describe the solution you'd like
I would like to know why sequence packing is currently not supported for gpt-oss, and if there is no fundamental blocker, I would like sequence packing support to be added for gpt-oss.
Describe alternatives you've considered
I tried removing the assert. The log then shows that no dot product attention backend is available for the provided inputs. After adding debug output, I found that all attention backends are disabled because softmax_type = 'learnable' and qkv_format = 'thd'.
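For reference, a minimal probe of the conflicting configuration might look like the sketch below. It is an assumption, not my actual fine-tuning setup: it assumes TransformerEngine's `transformer_engine.pytorch.DotProductAttention` is the module rejecting the inputs, that `softmax_type` (the name quoted from the debug log) is accepted as a constructor argument, and that packed sequences are passed in THD layout via `cu_seqlens`. Argument names in the real gpt-oss recipe may differ.

```python
# Hypothetical minimal probe of softmax_type='learnable' + qkv_format='thd'.
# Requires a CUDA device; shapes and sequence lengths are arbitrary.
import torch
from transformer_engine.pytorch import DotProductAttention

heads, head_dim = 8, 64
# Two packed sequences of lengths 3 and 5 -> 8 total tokens in THD layout.
cu_seqlens = torch.tensor([0, 3, 8], dtype=torch.int32, device="cuda")
total_tokens = int(cu_seqlens[-1])

q = torch.randn(total_tokens, heads, head_dim, dtype=torch.bfloat16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

attn = DotProductAttention(
    num_attention_heads=heads,
    kv_channels=head_dim,
    attn_mask_type="padding_causal",
    qkv_format="thd",
    softmax_type="learnable",  # attention sinks, as used by gpt-oss (assumed constructor arg)
)

# Expected failure point: with softmax_type='learnable' and qkv_format='thd',
# every fused/flash backend reports itself unavailable.
# (Any learnable sink/offset parameters that the forward pass may require are omitted here.)
out = attn(
    q, k, v,
    cu_seqlens_q=cu_seqlens,
    cu_seqlens_kv=cu_seqlens,
    max_seqlen_q=5,
    max_seqlen_kv=5,
)
print(out.shape)
```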
Additional context
Add any other context or screenshots about the feature request here.