ExecuTorch 0.7.0 - Llama 3.2 and Qwen 2.5 Export Issues #14810

@andytriboletti

Description

Summary

Unable to export Llama 3.2 1B and Qwen 2.5 1.5B models using ExecuTorch 0.7.0's export_llama.py script. Both exports fail with the same AttributeError during model initialization.

Environment

  • ExecuTorch Version: 0.7.0 (installed via pip)
  • Python Version: 3.10.18
  • OS: macOS (darwin)
  • Installation Method: pip install executorch==0.7.0
  • HuggingFace Access: Confirmed (token valid, Llama 3.2 access approved)

Issue 1: Llama 3.2 1B Export Failure

Command

python -m executorch.examples.models.llama.export_llama \
  -c meta-llama/Llama-3.2-1B \
  --pt2e_quantize xnnpack_dynamic \
  --max_seq_length 256 \
  --max_context_length 512 \
  -n llama32_1b.pte \
  -o . \
  -v

Error

AttributeError: 'BaseConfig' object has no attribute 'adapter_checkpoint'

Full Traceback

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/[email protected]/3.10.18_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Cellar/[email protected]/3.10.18_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/export_llama.py", line 56, in <module>
    main()  # pragma: no cover
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/export_llama.py", line 52, in main
    export_llama(remaining_args)
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/export_llama_lib.py", line 635, in export_llama
    builder = _export_llama(llm_config)
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/export_llama_lib.py", line 1057, in _export_llama
    builder_exported = _prepare_for_llama_export(llm_config).export()
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/export_llama_lib.py", line 673, in _prepare_for_llama_export
    edge_manager = _load_llama_model(llm_config)
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/export_llama_lib.py", line 1194, in _load_llama_model
    EagerModelFactory.create_model(
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/model_factory.py", line 44, in create_model
    model = model_class(**kwargs)
  File "/Users/andytriboletti/Documents/GitHub/alientavern-v2/mobile/executorch/venv/lib/python3.10/site-packages/executorch/examples/models/llama/model.py", line 50, in __init__
    adapter_checkpoint_path = self.llm_config.base.adapter_checkpoint
AttributeError: 'BaseConfig' object has no attribute 'adapter_checkpoint'

Investigation

  1. Checked available model classes in executorch.examples.models.llama.model:

    • Only Llama2Model and EagerModelBase are available
    • No Llama3_2Model class exists in ExecuTorch 0.7.0
  2. Searched for adapter_checkpoint in the codebase:

    • The only occurrence is the read in model.py (line 50, per the traceback above); no definition was found
    • The attribute is referenced but never defined on BaseConfig (see the inspection snippet after this list)
  3. Tried various command variations:

    • With --model llama3_2 flag
    • Without model specification (auto-detect)
    • With --params params.json
    • All result in the same error
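
For reference, this is roughly how we inspected the installed package (a quick sketch against the 0.7.0 pip install; the module path is taken from the traceback above):

# Inspect the 0.7.0 pip package; module path taken from the traceback above.
import inspect
import executorch.examples.models.llama.model as llama_model

# List the model-related classes actually defined in the module.
print([name for name in dir(llama_model) if "Model" in name])

# Count how often adapter_checkpoint appears in the module source
# (it is read, but never defined, as far as we can tell).
print(inspect.getsource(llama_model).count("adapter_checkpoint"))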

Issue 2: Qwen 2.5 1.5B Export Failure

Command

python -m executorch.examples.models.llama.export_llama \
  -c Qwen/Qwen2.5-1.5B-Instruct \
  --pt2e_quantize xnnpack_dynamic \
  --max_seq_length 256 \
  --max_context_length 512 \
  -n qwen25_1.5b.pte \
  -o . \
  -v

Error

Same adapter_checkpoint error as Llama 3.2.

Notes

  • Qwen 2.5 models are publicly accessible (no gating)
  • Model downloads successfully from HuggingFace
  • The ExecuTorch llama exporter is documented as supporting Qwen 2.5 (the --model flag accepts qwen2_5, and the architecture is similar)

Questions

  1. Does ExecuTorch 0.7.0 support Llama 3.2 models?

    • The --model flag accepts llama3_2 as an option
    • But the model class doesn't seem to exist
    • Is there a different export method for Llama 3.2?
  2. Does ExecuTorch 0.7.0 support Qwen 2.5 models?

    • The --model flag accepts qwen2_5 as an option
    • Same adapter_checkpoint error occurs
    • Is there additional setup required?
  3. What is the adapter_checkpoint attribute?

    • Where should it be defined?
    • Is it required for all models or only specific ones?
    • How should it be configured?
  4. Is there a workaround for ExecuTorch 0.7.0?

    • Can we patch the code to make it work? (a possible local patch is sketched after this list)
    • Should we use a different export method?
    • Is there a pre-release version that supports these models?
  5. When will ExecuTorch 0.8.0 be released?

    • Will it include proper Llama 3.2 and Qwen 2.5 support?
    • Is there a beta/RC version we can test?
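
On question 4, the only workaround we have come up with so far is a local edit to the installed package. This is an untested sketch, not something we want to rely on, and we do not know whether later code expects other adapter-related fields:

# Hypothetical local patch (untested) in
# .../site-packages/executorch/examples/models/llama/model.py, around line 50.
# The traceback shows the failing line reads the attribute directly:
#     adapter_checkpoint_path = self.llm_config.base.adapter_checkpoint
# Defaulting to None when BaseConfig does not define the field:
adapter_checkpoint_path = getattr(self.llm_config.base, "adapter_checkpoint", None)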

What Works

  • SmolLM-135M: Exports and runs successfully
    python -m executorch.examples.models.llama.export_llama \
      -c HuggingFaceTB/SmolLM-135M \
      --pt2e_quantize xnnpack_dynamic \
      --max_seq_length 128 \
      -n smollm_135m.pte

Expected Behavior

Based on the documentation and command-line help, we expected:

  1. Llama 3.2 models to export with --model llama3_2 or auto-detection
  2. Qwen 2.5 models to export with --model qwen2_5 or auto-detection
  3. The export process to complete successfully and produce a .pte file

Actual Behavior

Both Llama 3.2 and Qwen 2.5 exports fail with AttributeError: 'BaseConfig' object has no attribute 'adapter_checkpoint' during model initialization.

Use Case

We're building a React Native mobile app (iOS/Android/macOS) that uses on-device AI for chat personalities. We need:

  • Llama 3.2 1B: Better quality responses than SmolLM-135M
  • Qwen 2.5 1.5B: Alternative option with good quality/speed balance
  • 4-bit quantization: To fit on mobile devices (~500-600 MB)

We are currently stuck with SmolLM-135M, which works but produces poor-quality output.
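
For reference, the 4-bit export we are ultimately aiming for looks roughly like the command below. The flags are taken from the ExecuTorch Llama example README, the checkpoint/params paths are placeholders, and we have not been able to test it ourselves because the export fails before quantization is reached:

python -m executorch.examples.models.llama.export_llama \
  --model llama3_2 \
  --checkpoint <path/to/consolidated.00.pth> \
  --params <path/to/params.json> \
  -kv --use_sdpa_with_kv_cache -X \
  -qmode 8da4w --group_size 128 -d fp32 \
  --max_seq_length 256 \
  --output_name llama32_1b_4bit.pte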

Requested Help

  1. Confirmation of whether Llama 3.2 and Qwen 2.5 are supported in ExecuTorch 0.7.0
  2. Correct export commands or workarounds if they are supported
  3. Timeline for ExecuTorch 0.8.0 if these models require a newer version
  4. Any patches or fixes we can apply to make it work with 0.7.0

Additional Context

  • HuggingFace token is valid and working
  • Llama 3.2 access was approved by Meta
  • Qwen 2.5 models are publicly accessible
  • We successfully exported and ran SmolLM-135M
  • Python environment is clean (fresh venv with only executorch dependencies)

Thank you for any guidance!

cc @larryliu0820 @mergennachin @cccclai @helunwencser @jackzhxng
