
Trust issue running deepseek-moe-16B-base #952

@torsli

Description


Describe the bug

I'm trying to run deepseek-ai/deepseek-moe-16b-base and cannot get around a trust_remote_code prompt from HF that causes the script to time out.

[rank3]: Traceback (most recent call last):
[rank3]:   File "/opt/venv/lib/python3.12/site-packages/transformers/dynamic_module_utils.py", line 757, in resolve_trust_remote_code
[rank3]:     answer = input(
[rank3]:              ^^^^^^
[rank3]:   File "/opt/venv/lib/python3.12/site-packages/transformers/dynamic_module_utils.py", line 696, in _raise_timeout_error
[rank3]:     raise ValueError(
[rank3]: ValueError: Loading this model requires you to execute custom code contained in the model repository on your local machine. Please set the option `trust_remote_code=True` to permit loading of this model.

[rank3]: During handling of the above exception, another exception occurred:

[rank3]: Traceback (most recent call last):
[rank3]:   File "/lustre/fsw/coreai_dlalgo_llm/dthorsley/nemo_bench/bench_targets/llm_automodel/_pretrain.py", line 190, in <module>
[rank3]:     main()
[rank3]:   File "/lustre/fsw/coreai_dlalgo_llm/dthorsley/nemo_bench/bench_targets/llm_automodel/_pretrain.py", line 151, in main
[rank3]:     trainer.setup()
[rank3]:   File "/opt/Automodel/nemo_automodel/recipes/llm/train_ft.py", line 1015, in setup
[rank3]:     self.dataloader, self.tokenizer = build_dataloader(
[rank3]:                                       ^^^^^^^^^^^^^^^^^
[rank3]:   File "/opt/Automodel/nemo_automodel/recipes/llm/train_ft.py", line 585, in build_dataloader
[rank3]:     hf_model_config = AutoConfig.from_pretrained(_get_model_name(cfg_model))
[rank3]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/opt/venv/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line 1341, in from_pretrained
[rank3]:     trust_remote_code = resolve_trust_remote_code(
[rank3]:                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/opt/venv/lib/python3.12/site-packages/transformers/dynamic_module_utils.py", line 769, in resolve_trust_remote_code
[rank3]:     raise ValueError(
[rank3]: ValueError: The repository deepseek-ai/deepseek-moe-16b-base contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/deepseek-ai/deepseek-moe-16b-base .
[rank3]: Please pass the argument `trust_remote_code=True` to allow custom code to be run.

Steps/Code to reproduce bug

Run TrainFinetuneRecipeForNextTokenPrediction for this model with pipeline parallelism (PP) enabled.

Expected behavior

Setting trust_remote_code in the appropriate place in the config file should bypass this prompt.
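For concreteness, this is the sort of config entry I'd expect to work (hypothetical key placement; I haven't verified where the recipe would read it):

```yaml
model:
  pretrained_model_name_or_path: deepseek-ai/deepseek-moe-16b-base
  # opt-in flag that would need to be forwarded to the HF loaders
  trust_remote_code: true
```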

Additional context

As far as I can tell, there's currently no path to setting trust_remote_code from the config: the build_dataloader call in the traceback invokes AutoConfig.from_pretrained with only the model name, so the flag is never forwarded.
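A minimal sketch of the kind of plumbing that seems to be missing, assuming a hypothetical trust_remote_code key on the model config section (the helper name read_trust_remote_code is illustrative, not an actual Automodel API):

```python
from types import SimpleNamespace


def read_trust_remote_code(cfg_model) -> bool:
    """Read an opt-in trust_remote_code flag from the model config.

    Defaults to False so custom repository code is never executed
    unless the user explicitly enables it.
    """
    return bool(getattr(cfg_model, "trust_remote_code", False))


# The flag could then be forwarded wherever the recipe touches HF loaders,
# e.g. AutoConfig.from_pretrained(name, trust_remote_code=flag).
cfg = SimpleNamespace(
    pretrained_model_name_or_path="deepseek-ai/deepseek-moe-16b-base",
    trust_remote_code=True,
)
print(read_trust_remote_code(cfg))              # → True
print(read_trust_remote_code(SimpleNamespace()))  # → False (safe default)
```

With a default of False, existing configs would keep the current (safe) behavior, and only configs that explicitly opt in would suppress the interactive prompt.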

Labels: bug (Something isn't working)