
model-recompiled-after-phase-1-small-scale-hpo-267 #269


Description

@david-thrower

Take #267 and run an HPO study otherwise identical to #268, but with the phase I-a model recompiled to reset the optimizer before running phase I-b training.
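For reference, a minimal sketch of what "recompile to reset the optimizer" could look like in Keras/TensorFlow, assuming the model is a standard Keras model. `phase_1a_model`, the toy data, the loss, and the Adam hyperparameters are hypothetical stand-ins, not values from #267 or #268:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the phase I-a model; in the real study this
# would be the model produced by the Cerebros run in #267.
phase_1a_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10),
])
phase_1a_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Phase I-a training (toy data as a placeholder).
x = np.random.rand(256, 16).astype("float32")
y = np.random.randint(0, 10, size=(256,))
phase_1a_model.fit(x, y, epochs=2, verbose=0)

# Recompile with a *fresh* optimizer instance. This discards the optimizer
# state accumulated during phase I-a (e.g. Adam's moment estimates and step
# count); the trained weights are kept.
phase_1a_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Phase I-b training then starts from the phase I-a weights with a reset
# optimizer.
phase_1a_model.fit(x, y, epochs=2, verbose=0)
```

Note that `compile()` does not touch the model's weights; only the optimizer (and compiled loss/metric state) is replaced, so phase I-b resumes from the phase I-a weights.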

The problem:

  • TL;DR: The perplexity improves very slowly over a large number of epochs in phase I-b.
  • This may be optimal behavior, or it may be sub-optimal. The slow change is a sign of stability, but we would like to see whether the metrics can improve faster, and ideally improve further in fewer epochs, as Cerebros models usually do, including phase I-a models trained on as many samples as the sum of the phase I-a and phase I-b samples in recent studies.

From #267
