The "deepspeed" parameter of DeepSpeed OneBitAdam/ZeroOneAdam #13795

BlinkDL · 2022-07-22T05:56:26Z

I am currently using the DeepSpeed strategy and would like to try DeepSpeed's OneBitAdam/ZeroOneAdam, but initialize it in Python code (instead of the JSON config).

However, OneBitAdam/ZeroOneAdam take a "deepspeed" parameter, and you can't pass None because the optimizer accesses deepspeed.mpu:
https://deepspeed.readthedocs.io/en/latest/optimizers.html#zerooneadam-gpu

May I ask where I can find this "deepspeed" object inside DeepSpeedStrategy, so I can pass it to the optimizer?

I am using the following code to initialize DeepSpeedStrategy:

trainer = Trainer(strategy=DeepSpeedStrategy(config='deepspeed.json'), devices=NUM_GPUS, accelerator="gpu", precision=16)
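One possible workaround, sketched under assumptions rather than confirmed in this thread: DeepSpeedStrategy's config argument also accepts a Python dict, so the optimizer section can be built in code without a deepspeed.json file. The assumption here is that when the optimizer is declared in the config, DeepSpeed constructs it and supplies the "deepspeed" argument itself, so the object never has to be located by hand. The helper name build_deepspeed_config and all hyperparameter values are hypothetical placeholders.

```python
# Sketch only: build the DeepSpeed config as a Python dict instead of deepspeed.json.
# build_deepspeed_config is a hypothetical helper; the values are placeholders,
# not tuned recommendations.

def build_deepspeed_config(lr: float = 1e-4, freeze_step: int = 1000) -> dict:
    """Return a DeepSpeed-style config dict that requests OneBitAdam."""
    return {
        "optimizer": {
            "type": "OneBitAdam",
            "params": {
                "lr": lr,
                "freeze_step": freeze_step,  # warmup steps before compression starts
                "cuda_aware": False,
                "comm_backend_name": "nccl",
            },
        },
    }


config = build_deepspeed_config()

# Assumed usage (needs pytorch_lightning and deepspeed installed; NUM_GPUS as above):
# from pytorch_lightning import Trainer
# from pytorch_lightning.strategies import DeepSpeedStrategy
# trainer = Trainer(strategy=DeepSpeedStrategy(config=config), devices=NUM_GPUS,
#                   accelerator="gpu", precision=16)
```

This does not instantiate the optimizer class directly in Python, so it only partly addresses the question. ZeroOneAdam should be selectable the same way through the "type" field, though its parameter names differ from OneBitAdam's and should be checked against the DeepSpeed documentation linked above.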

zhuangxy · 2023-05-31T09:25:13Z

Hi @BlinkDL, did you ever solve this issue? I'm currently trying to retrain RWKV-4-7B using ZeroOneAdam and hit exactly the same issue.

