Enable post-RHT amax estimation

**Is your feature request related to a problem? Please describe.**

The RHT+amax kernel prevents fusion.

**Describe the solution you'd like**

Estimate the post-RHT amax from the pre-RHT amax with a linear function. This eliminates the RHT+amax kernel. Make the feature optional, and make its hyperparameters (the amax estimation scale) tunable.
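
A minimal sketch of the proposal in plain Python, for discussion only. All names here (`AmaxEstimationConfig`, `estimate_post_rht_amax`, `post_rht_amax`) and the `offset` term are hypothetical and not an existing API; the default values are placeholders, not tuned recommendations. The exact path assumes the RHT is a sign flip followed by an orthonormal Hadamard transform over a power-of-two block.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AmaxEstimationConfig:
    """Hypothetical knobs for the optional estimation path."""
    enabled: bool = False  # feature is opt-in
    scale: float = 1.0     # tunable slope of the linear map (placeholder value)
    offset: float = 0.0    # tunable intercept (assumption: issue only names a scale)


def estimate_post_rht_amax(pre_rht_amax: float, cfg: AmaxEstimationConfig) -> float:
    """Linear estimate: post_amax ~= scale * pre_amax + offset, clamped at 0."""
    if pre_rht_amax < 0:
        raise ValueError("amax must be non-negative")
    return max(cfg.scale * pre_rht_amax + cfg.offset, 0.0)


def random_hadamard_transform(block, signs):
    """Reference RHT: apply signs, then an orthonormal fast Walsh-Hadamard transform."""
    n = len(block)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    out = [x * s for x, s in zip(block, signs)]
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = 1.0 / math.sqrt(n)
    return [v * norm for v in out]


def post_rht_amax(block, signs, cfg: AmaxEstimationConfig) -> float:
    """Return the estimated amax if enabled, else the exact post-RHT amax."""
    if cfg.enabled:
        pre_amax = max(abs(v) for v in block)  # only needs the pre-RHT data
        return estimate_post_rht_amax(pre_amax, cfg)
    return max(abs(v) for v in random_hadamard_transform(block, signs))
```

With `enabled=True` the amax depends only on pre-RHT values, so no separate pass over the transformed data is needed; with `enabled=False` the sketch falls back to the exact computation, which is useful as a reference when tuning `scale`.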

**Validation**

Validate LM loss with dense and MoE models. Ensure convergence is the same or better.
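
One possible way to make "the same or better" checkable is sketched below. The helper name `convergence_matches`, the tail-averaging window, and the relative tolerance are all assumptions for illustration, not criteria stated in this issue; it also assumes positive loss values logged per step.

```python
from statistics import fmean


def convergence_matches(baseline_losses, candidate_losses, tail=100, rel_tol=0.01):
    """Return True if the candidate's mean loss over the last `tail` steps
    is at most `rel_tol` worse (relatively) than the baseline's."""
    if not baseline_losses or not candidate_losses:
        raise ValueError("loss curves must be non-empty")
    base = fmean(baseline_losses[-tail:])
    cand = fmean(candidate_losses[-tail:])
    return cand <= base * (1.0 + rel_tol)
```

The baseline curve would come from a run with the exact post-RHT amax and the candidate from a run with estimation enabled, repeated for both a dense and an MoE model.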

Enable post-RHT amax estimation #2578
