Take #267 and run an HPO study otherwise identical to #268, but with the phase I-a model recompiled to reset the optimizer before running Phase I-b training.
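A minimal, self-contained sketch (not the actual Cerebros study code) of what "recompile to reset the optimizer" means in Keras: recompiling the model with a freshly constructed optimizer discards the accumulated optimizer state (e.g. Adam's moment estimates and step count) while keeping the trained weights. The toy model, optimizer choice, and learning rates below are illustrative assumptions.

```python
import tensorflow as tf

# Stand-in for the phase I-a model; the real model comes from the Cerebros run.
phase_1a_model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Phase I-a compile/fit (toy data, assumed hyperparameters).
phase_1a_model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
x = tf.random.normal((256, 16))
y = tf.random.normal((256, 1))
phase_1a_model.fit(x, y, epochs=1, verbose=0)

# Recompile with a new optimizer instance before phase I-b: this resets the
# optimizer state but keeps the weights learned in phase I-a.
phase_1a_model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")

# Phase I-b training would then continue from the phase I-a weights with a
# fresh optimizer, e.g.:
# phase_1a_model.fit(phase_1b_x, phase_1b_y, epochs=...)
```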
The problem:
- TL;DR: The perplexity improves very slowly over a large number of epochs in phase I-b.
- This may be optimal behavior, or it may be sub-optimal. The slow change is a sign of stability, but we would like to see whether the metrics can improve faster, and ideally improve further in fewer epochs, as Cerebros models usually do, including phase I-a models trained on the same number of samples as the sum of the phase I-a and phase I-b samples in recent studies ...
From #267