SUMMARY:
The calculate_mse_min_max function previously performed a full grid
search over 0.8 × 100 = 80 candidate points. After discussing with Alex
and Eldar last week, we reduced maxshrink to 0.2, which cuts the search
to 0.2 × 100 = 20 points and improves performance without sacrificing
accuracy.
Additionally, implemented an early stopping mechanism. The function now
tracks the best quantization error seen so far and stops if no
improvement is observed over 5 consecutive steps (patience = 5).
maxshrink is now configurable in the recipe file, and patience (for
early stopping) can be passed in as well.
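
The shrinking grid search with early stopping can be sketched roughly as
follows. This is a minimal pure-Python illustration under stated
assumptions, not the code in this change: the names mse_min_max and
fake_quantize, the 256-level uniform quantizer, and the grid and norm
parameters are hypothetical and chosen only for the example.

```python
def fake_quantize(values, lo, hi, num_levels=256):
    """Clamp to [lo, hi], snap to a uniform grid, and map back to floats.

    Hypothetical helper; a real observer would use the library's quantizer.
    """
    if hi <= lo:
        return [lo for _ in values]
    scale = (hi - lo) / (num_levels - 1)
    return [lo + round((min(max(v, lo), hi) - lo) / scale) * scale for v in values]


def mse_min_max(values, maxshrink=0.2, grid=100, patience=5, norm=2.4):
    """Search shrunken (min, max) ranges and stop early when the error stalls.

    Returns (best_range, best_error, steps_evaluated).
    """
    min_val, max_val = min(values), max(values)
    best_err = float("inf")
    best_range = (min_val, max_val)
    no_improve = 0
    steps = 0

    # maxshrink * grid candidate ranges, e.g. 0.2 * 100 = 20.
    for i in range(round(maxshrink * grid)):
        steps += 1
        p = 1 - i / grid                      # shrink factor for this step
        lo, hi = p * min_val, p * max_val
        quantized = fake_quantize(values, lo, hi)
        err = sum(abs(q - v) ** norm for q, v in zip(quantized, values))

        if err < best_err:
            best_err, best_range = err, (lo, hi)
            no_improve = 0                    # reset the patience counter
        else:
            no_improve += 1
            if no_improve >= patience:        # no gain for `patience` steps
                break

    return best_range, best_err, steps


if __name__ == "__main__":
    best_range, best_err, steps = mse_min_max([-1.0, 0.0, 1.0])
    print(best_range, best_err, steps)
```

In this sketch the unshrunk range is always evaluated first, so the
returned error is never worse than plain min/max, and patience only
bounds how many non-improving candidates are tried before giving up.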
TEST PLAN:
All lm_eval tests were run. No regressions in accuracy were observed.
Performance improved significantly after maxshrink was updated.
**Switching from the MinMax observer to the MSE observer adds a 3-7
minute slowdown per test.**
USAGE:
Tested the recipe by adding:
```yaml
observer: "mse"
observer_kwargs:
  maxshrink: 0.3
```
More details can be found in this [notion
page](https://www.notion.so/Accuracy-test-1d930c7e73f3803bb057fd17d6d45302?pvs=4)
Raw timing data are stored
[here](https://drive.google.com/drive/folders/1I69QNGKxLJJZ06k9jSw0f0BRhPchV_nt?usp=drive_link)
---------
Signed-off-by: shanjiaz <[email protected]>
Co-authored-by: Brian Dellabetta <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>