[FEA] Refine training benchmark model configuration. #121

@JacoCheung

Description

Currently, our benchmark configuration does not reflect real-world scenarios well:

  1. The batch size and sequence length are not typical.
  2. Input data is uniformly distributed; we need a power-law distribution for both sequence length and key range.
  3. The embedding dim is equal to the hidden dim.
  4. The attention hyperparameters are not well set.
  5. The model weights are not large enough (a consequence of 3 and 4).
  6. We only have a single-node test.
  7. We do not benchmark TP.

Therefore, we need to improve our benchmark suite.
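As a starting point for item 2, here is a minimal sketch of power-law (Zipf-like) input generation using only the Python standard library. The names `make_zipf_sampler` and `generate_batch`, the exponents `alpha_len` and `alpha_key`, and the sizes used are all hypothetical placeholders, not part of the existing benchmark code; real values should be fitted to production traces.

```python
import bisect
import itertools
import random


def make_zipf_sampler(n, alpha, rng):
    """Return a sampler over ranks [0, n) with P(rank=k) proportional to 1/(k+1)**alpha."""
    weights = [1.0 / (k + 1) ** alpha for k in range(n)]
    cdf = list(itertools.accumulate(weights))
    total = cdf[-1]

    def sample():
        # Inverse-CDF lookup; min() guards against float rounding at the top edge.
        return min(bisect.bisect_left(cdf, rng.random() * total), n - 1)

    return sample


def generate_batch(batch_size, max_seqlen, num_keys,
                   alpha_len=1.2, alpha_key=1.05, seed=0):
    """Build a batch of variable-length key sequences.

    Both the sequence lengths and the key ids follow a power law.
    The default exponents are illustrative assumptions only.
    """
    rng = random.Random(seed)
    sample_len = make_zipf_sampler(max_seqlen, alpha_len, rng)
    sample_key = make_zipf_sampler(num_keys, alpha_key, rng)
    batch = []
    for _ in range(batch_size):
        seqlen = sample_len() + 1  # lengths fall in [1, max_seqlen]
        batch.append([sample_key() for _ in range(seqlen)])
    return batch


if __name__ == "__main__":
    batch = generate_batch(batch_size=8, max_seqlen=64, num_keys=1000)
    print([len(seq) for seq in batch])
```

With this shape, most sequences are short and a few are long, and a small set of "hot" keys dominates lookups, which stresses the embedding path differently from uniform input.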

Labels

enhancement: Improvement for existing feature
