
Bug: ValueError: tokenizer_name is None when adding a new Hugging Face model due to prod_env behavior #3709

@ztmyjs

Description


I am reporting a persistent and difficult-to-debug issue where helm-run fails with a ValueError: tokenizer_name is None when adding a new, standard Hugging Face model.

After a very lengthy debugging session, we have concluded that this is not a user configuration error but rather an unexpected behavior in how HELM handles the prod_env directory, which seems to be auto-created and then overrides otherwise correct configurations.

The Problem in Detail
When a new model is correctly defined in src/helm/config/model_deployments.yaml, running helm-run still fails. The traceback points to the WindowService receiving a None value for tokenizer_name.

The logs show that even if the prod_env directory is deleted or renamed beforehand, the helm-run script appears to find or recreate it, and then enters a "local mode with base path: prod_env". This action seems to invalidate the configurations loaded from src/helm/config/.
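My working hypothesis is that configuration entries found under the local base path (prod_env) are overlaid onto the packaged entries by deployment name, so an auto-created or stale prod_env/model_deployments.yaml entry can shadow a correct packaged entry wholesale. The following is a minimal sketch of that suspected precedence, not HELM's actual code; `resolve_deployments` and the dict shapes are hypothetical:

```python
# Sketch of the SUSPECTED override precedence (hypothetical, not HELM's code):
# entries from the local base path (prod_env) shadow packaged entries by name.

def resolve_deployments(packaged: list, local: list) -> dict:
    """Merge deployment entries; local (prod_env) entries win on name clashes."""
    merged = {d["name"]: d for d in packaged}
    for d in local:
        merged[d["name"]] = d  # local entry replaces the packaged one wholesale
    return merged

# Correct packaged entry, as in src/helm/config/model_deployments.yaml:
packaged = [{"name": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
             "tokenizer_name": "meta-llama/Llama-4-Scout-17B-16E-Instruct"}]
# Hypothetical auto-created prod_env entry that lacks tokenizer_name:
local = [{"name": "meta-llama/Llama-4-Scout-17B-16E-Instruct"}]

deployment = resolve_deployments(packaged, local)["meta-llama/Llama-4-Scout-17B-16E-Instruct"]
print(deployment.get("tokenizer_name"))  # None -> would explain the ValueError
```

If something like this is what happens, it would explain why the packaged YAML is provably correct yet the WindowService still receives tokenizer_name=None.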

Environment
OS: Linux (NVIDIA V100 GPU machine)

Python Version: 3.10

HELM Installation Method: from source (git clone + pip install -e .)

Model under test: https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E-Instruct

Steps to Reproduce
Start with a clean clone of the crfm-helm repository.

Bash

git clone https://github.com/stanford-crfm/helm.git
cd helm
Set up the environment (e.g., pip install -e .).

Ensure no prod_env directory exists.

Bash

rm -rf prod_env
Add the following complete and correct model deployment configuration to the end of src/helm/config/model_deployments.yaml:

YAML

  - name: meta-llama/Llama-4-Scout-17B-16E-Instruct
    model: meta-llama/Llama-4-Scout-17B-16E-Instruct
    tokenizer_name: meta-llama/Llama-4-Scout-17B-16E-Instruct
    client_spec:
      class_name: helm.proxy.clients.huggingface_client.HuggingFaceClient
      args:
        pretrained_model_name_or_path: meta-llama/Llama-4-Scout-17B-16E-Instruct
        trust_remote_code: true

Use any standard run spec, for example, by creating a file named my_eval.conf with the following content:

entries: [
{description: "MMLU test", priority: 1, groups: ["mmlu"]},
]
Execute the helm-run command:

Bash

helm-run --conf-paths my_eval.conf --suite my_suite --models-to-run meta-llama/Llama-4-Scout-17B-16E-Instruct --max-eval-instances 1
Expected Behavior
The helm-run command should start successfully, load the specified model, and run the evaluation without configuration errors.

Actual Behavior (The Bug)
The command fails with the ValueError: tokenizer_name is None traceback. Crucially, a prod_env directory is observed to be present during or after the run, and the log shows Running in local mode with base path: prod_env.

(Please paste the full traceback from your failed helm-run command here)

Evidence and Diagnostics (The "Smoking Gun")
To prove that the YAML configuration itself is loaded correctly by HELM's registry system, I ran a separate diagnostic script. This script successfully loads the configuration and finds the correct tokenizer_name. This proves the configuration file is correct, but the information is lost or ignored later in the helm-run process.

Output of the successful diagnostic script:

-Starting Final HELM Debug Script -

Here is the FINAL configuration HELM actually sees for your model:
{'name': 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'client_spec': ClientSpec(class_name='helm.proxy.clients.huggingface_client.HuggingFaceClient', args={'pretrained_model_name_or_path': 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'trust_remote_code': True}), 'model_name': 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'tokenizer_name': 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'window_service_spec': None, 'max_sequence_length': None, 'max_request_length': None, 'max_sequence_and_generated_tokens_length': None, 'deprecated': False}
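The diagnostic boils down to looking the deployment up by name in the parsed configuration and checking its tokenizer_name. A stripped-down, registry-free version of that check (`find_deployment` is a hypothetical helper operating on the already-parsed model_deployments list, not HELM's registry API):

```python
def find_deployment(model_deployments: list, name: str) -> dict:
    """Return the deployment entry matching `name`, or raise if absent."""
    for entry in model_deployments:
        if entry.get("name") == name:
            return entry
    raise KeyError(f"no deployment named {name!r}")

# Entry as parsed from src/helm/config/model_deployments.yaml:
deployments = [{
    "name": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "tokenizer_name": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
}]

entry = find_deployment(deployments, "meta-llama/Llama-4-Scout-17B-16E-Instruct")
assert entry["tokenizer_name"] is not None  # the config file itself is fine
```

This check passes against the packaged config, which is why I believe the value is lost later in the helm-run process rather than at parse time.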

Confirmed Workaround
The only way to make the run succeed is to manually disable the prod_env directory right before running the command:

Bash

mv prod_env prod_env_disabled
helm-run ... # This now works correctly
This confirms the issue is tied to the prod_env override logic.
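Until the override logic is fixed, the workaround can be scripted so that prod_env is moved aside only for the duration of the run and restored afterwards. A sketch (the helm-run arguments are elided; `with_dir_disabled` is a helper I wrote, not part of HELM):

```shell
#!/bin/sh
# Temporarily disable a directory, run a command, then restore the directory.
# Usage: with_dir_disabled prod_env helm-run --conf-paths my_eval.conf ...
with_dir_disabled() {
    dir="$1"; shift
    moved=0
    if [ -d "$dir" ]; then
        mv "$dir" "${dir}_disabled"
        moved=1
    fi
    "$@"                       # run the wrapped command (e.g. helm-run ...)
    status=$?
    if [ "$moved" -eq 1 ]; then
        mv "${dir}_disabled" "$dir"   # put prod_env back afterwards
    fi
    return $status
}
```

This keeps the workaround reversible and avoids leaving the checkout in a renamed state between runs.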

Thank you for looking into this complex issue.


(helm) junyao@goofy-1:~/helm$ cat src/helm/config/model_deployments.yaml
model_deployments:

  - name: meta-llama/Llama-4-Scout-17B-16E-Instruct
    model: meta-llama/Llama-4-Scout-17B-16E-Instruct
    tokenizer_name: meta-llama/Llama-4-Scout-17B-16E-Instruct
    client_spec:
      class_name: helm.proxy.clients.huggingface_client.HuggingFaceClient
      args:
        pretrained_model_name_or_path: meta-llama/Llama-4-Scout-17B-16E-Instruct
        trust_remote_code: true
    window_service_spec:
      class_name: helm.benchmark.window_services.local_window_service.LocalWindowService
      args:
        tokenizer_name: meta-llama/Llama-4-Scout-17B-16E-Instruct

(helm) junyao@goofy-1:~/helm$ cat src/helm/benchmark/run_specs/my_llama4_eval.conf
entries: [
  {
    name: "mmlu:computer_security",
    description: "mmlu:subject=computer_security,method=multiple_choice_joint,model=meta-llama/Llama-4-Scout-17B-16E-Instruct",
    priority: 1,
    scenario_spec: {
      class_name: "helm.benchmark.scenarios.mmlu_scenario.MMLUScenario",
      args: {
        subject: "computer_security"
      }
    },
    adapter_spec: {
      method: "multiple_choice_joint",
      model: "meta-llama/Llama-4-Scout-17B-16E-Instruct",
      model_deployment: "meta-llama/Llama-4-Scout-17B-16E-Instruct"
    },
    metric_specs: [
      {
        class_name: "helm.benchmark.metrics.basic_metrics.BasicGenerationMetric",
        args: {
          names: ["exact_match", "quasi_exact_match", "prefix_exact_match", "quasi_prefix_exact_match"]
        }
      },
      {
        class_name: "helm.benchmark.metrics.basic_metrics.BasicReferenceMetric"
      },
      {
        class_name: "helm.benchmark.metrics.basic_metrics.InstancesPerSplitMetric"
      }
    ]
  }
]

(helm) junyao@goofy-1:~/helm$ cat src/helm/config/tokenizer_configs.yaml
tokenizer_configs:

  - name: meta-llama/Llama-4-Scout-17B-16E-Instruct
    tokenizer: meta-llama/Llama-4-Scout-17B-16E-Instruct
    tokenizer_spec:
      class_name: AutoTokenizer
      tokenizer_name: meta-llama/Llama-4-Scout-17B-16E-Instruct

(helm) junyao@goofy-1:~/helm$ cat src/helm/config/model_metadata.yaml
models:

  - name: meta-llama/Llama-4-Scout-17B-16E-Instruct
    display_name: Llama-4 Scout 17B
    creator_organization_name: Meta
    description: |
      Instruction-tuned Llama-4 Scout 17B model (16 experts), multimodal (text+image input, text output).
    access: gated
    release_date: 2024-01-30
    license: Llama 4 Community License
    tags:
      - TEXT_MODEL_TAG
      - INSTRUCTION_FOLLOWING_MODEL_TAG
    tokenizer_name: meta-llama/Llama-4-Scout-17B-16E-Instruct
(helm) junyao@goofy-1:~/helm$
