Optimizer state for pre-trained model #23638
VinaySingh561 asked this question in Q&A (Unanswered)
Replies: 3 comments · 1 reply
-
Howard Cho replied:
I added some comments on your code. I don't suggest resetting the optimizer state with opt_state = opt.init(params) while you are continuing training; that call reinitializes the optimizer state from scratch. You would only want that when starting a completely new training run or switching to a new task.
```python
import optax
import pickle

# Here I see your optimizer:
learning_rate_schedule = optax.exponential_decay(
    init_value=1e-2,       # initial learning rate
    transition_steps=900,  # how often to decay
    decay_rate=0.9,        # the decay rate
    staircase=True         # if True, decay happens at discrete intervals
)

opt = optax.chain(
    optax.scale_by_adam(),
    optax.scale_by_schedule(learning_rate_schedule),
    optax.scale(-1.0)      # multiply by -1.0 to perform gradient descent
)

# opt_state = opt.init(params)  <-- Why do you want to reset the optimizer state?

# Here you load the parameters and optimizer state from the checkpoint
with open('best_energy_model_0.9978.pkl', 'rb') as f:
    checkpoint = pickle.load(f)

# Load saved params and opt_state
params = checkpoint['params']
opt_state = checkpoint['opt_state']  # This restores your optimizer's state

# Now you can continue training without resetting the optimizer
```
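A minimal sanity check, assuming `params` and `opt` are defined as above: the loaded opt_state should share the pytree structure of a fresh opt.init(params), which confirms the checkpoint was saved with the same optimizer chain.
```python
import jax

# Compare the loaded state's pytree structure against a fresh init; a
# mismatch would mean the checkpoint came from a different optimizer chain.
assert jax.tree_util.tree_structure(opt_state) == \
       jax.tree_util.tree_structure(opt.init(params))
```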
-
VinaySingh561 replied:
Thanks for your descriptive answer. Please clarify just one more point: I am also changing the loss function. The initial loss function was (energy_loss + force_loss), and the updated loss function is (energy_loss + 100 * force_loss); a sketch of the reweighted loss is shown after this comment. Can I still use the opt_state saved from the last training?
Thanks for your time.
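For concreteness, the reweighted loss might look like the sketch below; `model_energy_and_forces` is a hypothetical stand-in for the e3nn forward pass, not the author's actual code.
```python
import jax.numpy as jnp

def loss_fn(params, batch, force_weight=100.0):
    # Hypothetical forward pass: assumed to return predicted energies
    # and forces for the batch (the real e3nn model call goes here).
    pred_energy, pred_forces = model_energy_and_forces(params, batch)
    energy_loss = jnp.mean((pred_energy - batch['energy']) ** 2)
    force_loss = jnp.mean((pred_forces - batch['forces']) ** 2)
    # Updated weighting from the question: force term scaled by 100.
    return energy_loss + force_weight * force_loss
```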
-
Howard Cho replied:
Yes, you can reuse the opt_state from your last training, even though you're tweaking the loss function. The optimizer state (opt_state) keeps track of things like momentum, the schedule's step count, and other internal statistics derived from your model's parameters, not from the loss function itself, so gradients from the reweighted loss can be fed through it directly (see the sketch below).
VinaySingh561 replied:
Ok, thanks for the clarification.
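A minimal sketch of one continued-training step, assuming `opt`, `params`, the restored `opt_state`, and a `loss_fn` like the one sketched earlier: the gradients come from the new loss, while the restored optimizer state is passed in unchanged.
```python
import jax
import optax

@jax.jit
def train_step(params, opt_state, batch):
    # Gradients come from the *new* loss; the restored opt_state is reused as-is.
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    updates, opt_state = opt.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss
```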
-
VinaySingh561 asked:
Hi,
I am running the e3nn model in JAX, and I have trained it for 100 epochs with a loss function that includes both energy and force terms. I am currently achieving a good R² score for energy. Now I want to increase the weight of the force loss in training, so I would like to reuse the parameters from my last trained model. However, I am unsure how to initialize the optimizer state.
I have already saved the optimizer state and parameters from the previous model. Should I use the opt_state from the trained model, or initialize it with opt.init(params)? I have attached the code below for your convenience.
Thank you for your time and consideration.
```python
import optax
import pickle  # needed for loading the checkpoint below

learning_rate_schedule = optax.exponential_decay(
    init_value=1e-2,       # initial learning rate
    transition_steps=900,  # how often to decay
    decay_rate=0.9,        # the decay rate
    staircase=True         # if True, decay happens at discrete intervals
)

opt = optax.chain(
    optax.scale_by_adam(),
    optax.scale_by_schedule(learning_rate_schedule),
    optax.scale(-1.0)      # multiply by -1.0 to perform gradient descent
)

opt_state = opt.init(params)  # the line in question: fresh initialization

with open('best_energy_model_0.9978.pkl', 'rb') as f:
    checkpoint = pickle.load(f)

params = checkpoint['params']
opt_state = checkpoint['opt_state']
```
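For reference, the checkpoint loaded above appears to be a plain pickled dict; a minimal sketch of how it might have been written (an assumption, since the saving code isn't shown):
```python
import pickle

# Hypothetical saving side of the checkpoint: a dict holding both the model
# parameters and the optimizer state, pickled to disk.
checkpoint = {'params': params, 'opt_state': opt_state}
with open('best_energy_model_0.9978.pkl', 'wb') as f:
    pickle.dump(checkpoint, f)
```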