[Question] Why delete q_b_scale, kv_b_scale, k_b_trans_scale? #2970

@bobbych94

Description

Why were the following parameters removed from the gpt_attention function in the latest code:

q_b_scale: Optional[Tensor] = None,
kv_b_scale: Optional[Tensor] = None,
k_b_trans_scale: Optional[Tensor] = None,

along with the is_fp8_model_flag argument?
Can anyone explain the reason?
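
For background, parameter names like q_b_scale and kv_b_scale usually denote per-tensor dequantization scales for FP8-quantized projection weights (here presumably the q_b / kv_b attention projections). Below is a minimal sketch of what such a scale represents, assuming that interpretation; quantize_fp8 and dequantize_fp8 are hypothetical helpers for illustration, not part of the gpt_attention API:

```python
import torch

# A weight stored in FP8 is recovered as w ≈ w_fp8.float() * scale,
# where scale = max(|w|) / FP8_E4M3_MAX was chosen at quantization time.
FP8_E4M3_MAX = 448.0  # largest finite value in torch.float8_e4m3fn

def quantize_fp8(w: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Quantize a weight tensor to FP8 with a single per-tensor scale."""
    scale = w.abs().max() / FP8_E4M3_MAX
    return (w / scale).to(torch.float8_e4m3fn), scale

def dequantize_fp8(w_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximation of the original weight."""
    return w_fp8.float() * scale

w = torch.randn(128, 64)
w_fp8, scale = quantize_fp8(w)          # `scale` plays the role of q_b_scale etc.
w_hat = dequantize_fp8(w_fp8, scale)
print((w - w_hat).abs().max().item())   # small quantization error
```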

Labels: not a bug (some known limitation, but not a bug)
