# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

ExecuTorch is an end-to-end solution for on-device inference powering AI experiences across Meta's products. It provides a portable runtime for executing PyTorch models on resource-constrained devices including mobile phones, embedded systems, and microcontrollers.

## Key Development Commands

### Build Commands

```bash
# Configure CMake build (one-time setup)
rm -rf cmake-out && mkdir cmake-out && cd cmake-out && cmake ..

# Build the project (use core count + 1 for -j value)
cmake --build cmake-out -j9

# Build with common extensions enabled
cmake . \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -Bcmake-out
```

### Testing Commands

```bash
# Run all C++ tests
./test/run_oss_cpp_tests.sh

# Run Python tests
pytest

# Run a specific test directory
./test/run_oss_cpp_tests.sh runtime/core/test/

# Build and run size tests
./test/build_size_test.sh
```

### Code Quality

```bash
# Set up lintrunner
pip install lintrunner==0.12.7 lintrunner-adapters==0.12.4
lintrunner init

# Run linter
lintrunner

# Auto-fix linting issues
lintrunner -a

# Run specific linters
lintrunner --take CLANGFORMAT
```

### Python Environment Setup

```bash
# Install requirements
./install_requirements.sh

# Install in development mode
pip install -e .
```

## Architecture Overview

### Core Components

**Export Pipeline (exir/)**
- Captures PyTorch models via torch.export
- Lowers models through EXIR dialects (ATen/Core ATen → Edge → Backend)
- Performs optimizations like memory planning and operator fusion
- Serializes models to the .pte ExecuTorch program format
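
A minimal end-to-end sketch of this pipeline in Python (the module and file names are illustrative; exact APIs can shift between ExecuTorch releases):

```python
import torch
from torch.export import export

from executorch.exir import to_edge


class MulModel(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * x


# 1. Capture the eager model as an exported graph (ATen dialect).
example_inputs = (torch.randn(2, 2),)
exported_program = export(MulModel().eval(), example_inputs)

# 2. Lower to the Edge dialect, then to an ExecuTorch program;
#    memory planning and other passes run inside to_executorch().
et_program = to_edge(exported_program).to_executorch()

# 3. Serialize the flatbuffer-backed program to a .pte file.
with open("mul.pte", "wb") as f:
    f.write(et_program.buffer)
```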

**Runtime (runtime/)**
- Loads and executes .pte files
- Core structures: Tensor, EValue, Program, Method
- Memory management with static allocation
- Delegate interface for hardware acceleration

**Kernels (kernels/)**
- `portable/`: Reference C++ implementations for all operators
- `optimized/`: SIMD-optimized implementations
- `quantized/`: Quantized operator implementations
- Operators registered via static initialization

**Backends (backends/)**
- Hardware-specific accelerators (Apple CoreML/MPS, ARM, Qualcomm, etc.)
- Each backend provides:
  - Partitioner: Identifies subgraphs for delegation
  - Preprocess: Converts subgraphs to backend format
  - Runtime: Executes delegated operations
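
Delegation is driven from the export script. A hedged sketch using the XNNPACK backend (the partitioner import path reflects current ExecuTorch sources and may differ in older releases; the model and output filename are illustrative):

```python
import torch
from torch.export import export

from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
exported_program = export(model, (torch.randn(1, 8),))

# The partitioner tags XNNPACK-compatible subgraphs; to_backend() replaces
# them with delegate calls that the XNNPACK runtime executes on-device.
edge_program = to_edge(exported_program).to_backend(XnnpackPartitioner())

with open("linear_xnnpack.pte", "wb") as f:
    f.write(edge_program.to_executorch().buffer)
```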

### Key Design Patterns

**Memory Management**
- No dynamic allocation in core runtime
- Static memory planning at export time
- MemoryAllocator interface for custom allocation strategies

**Error Handling**
- Result<T, Error> type for fallible operations
- No exceptions (disabled for portability)
- ET_CHECK macros for assertions

**Backend Delegation**
- Subgraphs marked with delegation tags during export
- Backend runtimes register with the core runtime and initialize their delegated payloads when the program is loaded
- Operators that are not delegated fall back to the portable CPU kernels

### Model Export Flow

1. Capture PyTorch model with torch.export
2. Lower through EXIR passes (quantization, delegation, optimization)
3. Run memory planning to determine buffer sizes
4. Emit to flatbuffer format (.pte file)
5. Load in runtime and execute

### Extension Modules

**Module (extension/module/)**
- Simplified C++ API for loading and running models
- Handles method execution and tensor I/O

**LLM Support (extension/llm/)**
- Specialized runtime for large language models
- Custom operators for KV caching, rotary embeddings
- Token sampling strategies

**Platform Bindings**
- Android: JNI wrappers in extension/android/
- iOS: Objective-C++ wrappers in extension/apple/
- Python: pybind11 bindings in extension/pybindings/
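
For quick verification of an exported model, the pybind11 bindings let you run a .pte file directly from Python. A minimal sketch (the `portable_lib` module and `_load_for_executorch` entry point come from the pybindings extension and may change between releases; the .pte path is illustrative):

```python
import torch

from executorch.extension.pybindings.portable_lib import _load_for_executorch

# Load the serialized program and invoke its default "forward" method.
module = _load_for_executorch("model.pte")  # path to a previously exported .pte
outputs = module.forward([torch.randn(2, 2)])  # inputs must match the exported shapes
print(outputs[0])
```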

## Working with Models

Example models are in `examples/models/`. Each model directory typically contains:
- `export_<model>.py`: Exports PyTorch model to .pte
- `model.py`: Model definition
- `runner/`: C++ runner implementation
- `README.md`: Model-specific instructions

Common workflow:
1. Export model: `python export_llama.py`
2. Build runner: `cmake --build cmake-out --target llama_runner`
3. Run inference: `./cmake-out/examples/models/llama/llama_runner --model_path=model.pte`

## Code Style Guidelines

- C++ follows Google style with modifications:
  - Functions use `lower_snake_case()`
  - Files use `lower_snake_case.cpp`
  - Headers use `#pragma once`
  - All includes use `<angle_brackets>`
- Python follows PyTorch style
- No dynamic memory allocation in runtime
- Avoid templates to minimize binary size
- Document public APIs with Doxygen comments