
The ExecuTorch SwiftPM binary distribution (v0.7.0) fails to load the "forward" method from exported PTE models with Error 32 (NotFound), while the same PTE files work perfectly with the Python ExecuTorch runtime (v0.7.0) #14809

@andytriboletti

Description

πŸ› Describe the bug

ExecuTorch SwiftPM Binary Distribution Bug Report

Summary

The ExecuTorch SwiftPM binary distribution (v0.7.0) fails to load the "forward" method from exported PTE models with Error 32 (NotFound), while the same PTE files work perfectly with the Python ExecuTorch runtime (v0.7.0).

Environment

  • Platform: macOS 26.0.1 (Apple Silicon, M1 Max)
  • Xcode: 26.0.1
  • ExecuTorch Python: 0.7.0 (pip package)
  • ExecuTorch SwiftPM: 0.7.0 (branch: swiftpm-0.7.0)
  • Model: SmolLM-135M exported with LLM exporter
  • Export Command:
python -m executorch.examples.models.llama.export_llama \
  --model smollm2 \
  -X --pt2e_quantize xnnpack_dynamic \
  --max_seq_length 128 \
  --params params.json \
  -n smollm2_simple.pte \
  -o . \
  -v

Issue Description

What Works (Python Runtime)

import torch
from executorch.extension.pybindings.portable_lib import _load_for_executorch_from_buffer

with open("smollm2_simple.pte", "rb") as f:
    module = _load_for_executorch_from_buffer(f.read())

# This works perfectly
tokens = torch.tensor([[1, 72, 101, 108, 108, 111]], dtype=torch.long)
result = module.forward([tokens])
print(result[0].shape)  # Output: torch.Size([1, 49152])

Result: ✅ Works perfectly, generates coherent text

What Fails (C++ SwiftPM Runtime)

// Load model
auto module = std::make_unique<torch::executor::Module>(modelPath, ...);
auto load_err = module->load();
// load_err == Error::Ok ✅

// Try to load the forward method
auto load_method_err = module->load_method("forward");
// load_method_err == Error::NotFound (32) ❌

// Try to execute forward
std::vector<executorch::runtime::EValue> inputs;
inputs.emplace_back(*input_tensor);
auto result = module->forward(inputs);
// result.error() == Error::NotFound (32) ❌

// Try execute("forward", ...)
auto result2 = module->execute("forward", inputs);
// result2.error() == Error::NotFound (32) ❌

Result: ❌ Error 32 (NotFound) - method cannot be loaded or executed
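For readability in the sections below, the raw codes can be decoded with a small helper. This is a minimal sketch covering only the values observed in this report; it assumes ExecuTorch's hex-valued Error enum (runtime/core/error.h), where decimal 32 is NotFound = 0x20, 20 is OperatorMissing = 0x14, and 18 is InvalidArgument = 0x12:

#include <executorch/runtime/core/error.h>

// Map the raw error codes seen in this report to readable names.
// The enum is hex-valued: 32 == 0x20, 20 == 0x14, 18 == 0x12.
const char* error_name(executorch::runtime::Error e) {
  using executorch::runtime::Error;
  switch (e) {
    case Error::Ok:              return "Ok (0x00)";
    case Error::InvalidArgument: return "InvalidArgument (0x12 / 18)";
    case Error::OperatorMissing: return "OperatorMissing (0x14 / 20)";
    case Error::NotFound:        return "NotFound (0x20 / 32)";
    default:                     return "unrecognized";
  }
}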

Detailed Observations

1. Method Exists in Metadata

auto method_names = module->method_names();
// Returns: ["enable_dynamic_shape", "use_sdpa_with_kv_cache", "get_n_layers", 
//           "get_eos_ids", "get_max_seq_len", "get_vocab_size", "get_bos_id", 
//           "get_max_context_len", "use_kv_cache", "forward"]

The "forward" method is listed in method_names(), but cannot be loaded.

2. Method Metadata is Accessible

auto meta_result = module->method_meta("forward");
// meta_result.ok() == true ✅

auto meta = meta_result.get();
// meta.num_inputs() == 1
// meta.input_tensor_meta(0).dtype() == ScalarType::Long (4)
// meta.input_tensor_meta(0).sizes() == [1, dynamic]
// meta.num_outputs() == 1
// meta.output_tensor_meta(0).dtype() == ScalarType::Float (6)
// meta.output_tensor_meta(0).sizes() == [1, 49152]

Metadata is accessible and shows correct input/output signatures.
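The input we pass is constructed to match that metadata exactly. A minimal sketch using the tensor extension, assuming make_tensor_ptr() deduces ScalarType::Long from int64_t data as in the 0.7.0 extension/tensor headers:

#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using executorch::extension::make_tensor_ptr;

// Shape [1, 6], dtype int64 -- matches the reported input_tensor_meta.
auto tokens = make_tensor_ptr({1, 6}, std::vector<int64_t>{1, 72, 101, 108, 108, 111});
auto result = module->forward(tokens);
// Still fails with Error::NotFound (32) despite the matching signature.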

3. All Operators Are Present

We verified that all required operators are present in the SwiftPM binaries by:

  • Extracting operator list from PTE using gen_oplist.py
  • Confirming all operators exist in kernels_optimized and backend_xnnpack
  • Disabling custom operator registration → error remains 32 (not 20/OperatorMissing)

4. KV Cache Methods Also Fail

auto kv_result = module->execute("use_kv_cache", {true});
// kv_result.error() == Error::InvalidArgument (18) ❌

Helper methods added by the LLM exporter also fail.
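One caveat: the helpers the LLM exporter embeds are typically zero-argument methods that return constants, so passing {true} could by itself produce InvalidArgument (18) as an argument-count mismatch. Probing them with an empty input list separates that from the loading failure; a sketch, assuming execute() accepts an empty input vector:

// Probe the exporter-added helpers with no inputs, so a failure reflects
// method loading rather than a possible argument-count mismatch.
for (const char* name : {"get_vocab_size", "get_max_seq_len", "use_kv_cache"}) {
  auto r = module->execute(name, std::vector<executorch::runtime::EValue>{});
  std::printf("%s -> %s\n", name, r.ok() ? "ok" : "error");
}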

Attempted Solutions

✅ Tried: Custom Operator Registration

  • Generated selected_operators.yaml from PTE
  • Created RegisterCodegenUnboxedKernelsEverything.cpp
  • Linked all required kernels (optimized, xnnpack, custom, quantized)
  • Result: Same error (32)

✅ Tried: Different Input Signatures

Attempted a broad range of input combinations (see the probe sketch after this list):

  • [ids:int64 1xL], [ids:int32 1xL]
  • [ids:int64 L], [ids:int32 L]
  • [ids:int64, mask:int64], [ids:int32, mask:int32]
  • [ids:int64, seq_len:int32], [ids:int32, seq_len:int32]
  • [ids:int64, positions:int64], [ids:int32, positions:int32]
  • Fixed sequence lengths: K ∈ {16, 32, 64, 128, 256}
  • Result: All return error 32
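A condensed sketch of the sequence-length sweep from the list above (dummy token ids; make_tensor_ptr as in the metadata example earlier):

// Sweep the fixed sequence lengths K with dummy int64 token ids.
for (int k : {16, 32, 64, 128, 256}) {
  std::vector<int64_t> ids(static_cast<size_t>(k), 1);
  auto t = executorch::extension::make_tensor_ptr({1, k}, ids);
  auto r = module->forward(t);
  // Every variant returns Error::NotFound (32).
}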

✅ Tried: Different Module Loading Patterns

  • Owner module (path-based): Module(path, ...)
  • Allocator-backed module: Module(path, ..., allocator)
  • Explicit load_forward() calls
  • Result: All return error 32

✅ Tried: Version Alignment

  • Downgraded Python to 0.4.0 → same error
  • Upgraded Python to 0.7.0 → works in Python, fails in C++
  • Result: Python works, C++ fails regardless of version

Root Cause Hypothesis

The SwiftPM binary distribution appears to have a fundamental incompatibility with PTE files exported by the Python ExecuTorch 0.7.0 package. Specifically:

  1. Method Plan Loading: The C++ runtime cannot load method plans from PTE files, even though:

    • The method exists in metadata
    • All operators are present
    • The Python runtime loads the same file successfully
  2. SwiftPM Branch Discrepancy: The swiftpm-0.7.0 branch doesn't exist in the official PyTorch/ExecuTorch repository. The latest official SwiftPM branch is swiftpm-0.4.0, suggesting the 0.7.0 binaries may be:

    • An unofficial build
    • An outdated/incomplete build
    • Missing critical runtime components

Workaround

We implemented a Python subprocess bridge (see the sketch below) that:

  1. Spawns Python process running ExecuTorch
  2. Communicates via JSON over stdin/stdout
  3. Uses the working Python runtime for inference
  4. Adds ~1-5ms IPC overhead per token (negligible)

Result: ✅ Fully functional, generates coherent text
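For reference, a minimal sketch of such a bridge (plain POSIX pipe/fork/exec; runner.py is a hypothetical script that loads the PTE with the Python runtime, reads one JSON request per stdin line, and writes one JSON response per stdout line, so line-delimited JSON keeps the framing trivial):

#include <cstdio>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Bidirectional pipe to a persistent Python child process.
struct PyBridge {
  FILE* to_child = nullptr;    // our writes become the child's stdin
  FILE* from_child = nullptr;  // the child's stdout becomes our reads

  bool start(const char* script) {
    int in_pipe[2], out_pipe[2];  // [0] = read end, [1] = write end
    if (pipe(in_pipe) != 0 || pipe(out_pipe) != 0) return false;
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {  // child: wire the pipes to stdin/stdout, exec Python
      dup2(in_pipe[0], STDIN_FILENO);
      dup2(out_pipe[1], STDOUT_FILENO);
      close(in_pipe[1]);
      close(out_pipe[0]);
      execlp("python3", "python3", script, (char*)nullptr);
      _exit(127);  // only reached if exec failed
    }
    close(in_pipe[0]);   // parent keeps the write end of the child's stdin
    close(out_pipe[1]);  // ... and the read end of the child's stdout
    to_child = fdopen(in_pipe[1], "w");
    from_child = fdopen(out_pipe[0], "r");
    return to_child && from_child;
  }

  // Send one JSON request line; block until the child's JSON response line.
  std::string request(const std::string& json_line) {
    std::fprintf(to_child, "%s\n", json_line.c_str());
    std::fflush(to_child);
    char buf[1 << 16];
    if (!std::fgets(buf, sizeof(buf), from_child)) return {};
    return std::string(buf);
  }
};

The ~1-5ms per-token overhead quoted above is the full round trip through request().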

Expected Behavior

The C++ SwiftPM runtime should be able to:

  1. Load the "forward" method: load_method("forward") → Error::Ok
  2. Execute forward pass: forward(inputs) → valid output tensor
  3. Match Python runtime behavior exactly

Actual Behavior

  • load_method("forward") β†’ Error::NotFound (32)
  • forward(inputs) β†’ Error::NotFound (32)
  • execute("forward", inputs) β†’ Error::NotFound (32)

Reproduction Steps

  1. Export model (Python):
python -m executorch.examples.models.llama.export_llama \
  --model smollm2 \
  -X --pt2e_quantize xnnpack_dynamic \
  --max_seq_length 128 \
  --params params.json \
  -n model.pte
  2. Test in Python (works):
from executorch.extension.pybindings.portable_lib import _load_for_executorch_from_buffer
import torch

with open("model.pte", "rb") as f:
    module = _load_for_executorch_from_buffer(f.read())

tokens = torch.tensor([[1, 2, 3]], dtype=torch.long)
result = module.forward([tokens])
print("Success:", result[0].shape)
  3. Test in C++ with SwiftPM (fails):
#import <executorch/extension/module/module.h>

auto module = std::make_unique<torch::executor::Module>(modelPath);
module->load();

auto result = module->load_method("forward");
// result == Error::NotFound (32)

Additional Information

  • PTE File Size: 238 MB
  • Model Architecture: SmolLM-135M (30 layers, 9 heads, vocab_size=49152)
  • Quantization: XNNPACK dynamic quantization
  • Backend: XNNPACK + optimized kernels
  • Platform: macOS (arm64)

Questions

  1. Is the swiftpm-0.7.0 branch an official release?
  2. Are there known incompatibilities between Python-exported PTEs and SwiftPM binaries?
  3. Should we use a different export path for SwiftPM compatibility?
  4. Is there a recommended way to debug Error::NotFound (32) in method loading?

Files Available

  • smollm2_simple.pte - Exported model file (works in Python, fails in C++)
  • selected_operators.yaml - Extracted operator list
  • Full Xcode project with reproduction case

Happy to provide any additional information or test patches!

Versions

(venv) andytriboletti@macbookpro alientavern-v2 % python collect_env.py

Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: macOS 26.0.1 (arm64)
GCC version: Could not collect
Clang version: 17.0.0 (clang-1700.3.19.1)
CMake version: version 4.1.2
Libc version: N/A

Python version: 3.13.7 (main, Aug 14 2025, 11:12:11) [Clang 17.0.0 (clang-1700.3.19.1)] (64-bit runtime)
Python platform: macOS-26.0.1-arm64-arm-64bit-Mach-O
Is CUDA available: N/A
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
Is XPU available: N/A
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A

CPU:
Apple M1 Max

Versions of relevant libraries:
[pip3] No relevant packages
[conda] Could not collect

cc @shoumikhin @cbilgin
