Open
80 commits
5f789c0
Fix channels_last_tagged_reshape_pass to handle mixed memory format t…
leafs1 Jun 16, 2025
2d09ab8
Revert vulkan changes from D76646172 fixup patch
ahmtox Jun 16, 2025
3a6c664
Update CMakeLists.txt for extension/apple to strip debug symbols path…
shoumikhin Jun 16, 2025
9eb8d01
Implement ReplaceMulTensorWithMulAndFullOpsPass.
eigen-k Jun 16, 2025
7b39a0c
Fixed linter (#11742)
guangy10 Jun 16, 2025
20ea230
Fix for overriding installed executorch when running optimum-et model…
guangy10 Jun 17, 2025
962db1b
Update documents for buffer allocator (#11467)
neuropilot-captain Jun 17, 2025
1309849
Add function for input preprocessing in numerical comparator
Jun 17, 2025
be8ffd1
[llm] Add generate_from_pos API to LLM runner (#11570)
larryliu0820 Jun 17, 2025
6af28c9
Arm backend: Added decomposition for MaxPool2d with dilation > 0. (#1…
wwwind Jun 17, 2025
b22a2be
Arm backend: Add support for per-channel quantization (#11752)
oscarandersson8218 Jun 17, 2025
5365c55
Arm backend: Fix bug in decompose linear vector norm (#11755)
wwwind Jun 17, 2025
3b1c7fd
Add numerical comparator base class and L1 comparator
Jun 17, 2025
d1bfa4d
Fix text_llm_runner unit test
larryliu0820 Jun 17, 2025
0286927
Use wrappers from xnnpack.h for unary and binary ops (#11584) (#11666)
abhinaykukkadapu Jun 17, 2025
078fb23
Qualcomm AI Engine Direct - GA Albert, Bert, Distilbert, Eurobert (#1…
winskuo-quic Jun 17, 2025
7503bb3
Qualcomm AI Engine Direct - Deprecate convert_bmm_to_matmul pass (#11…
DannyYuyang-quic Jun 17, 2025
5638657
Add MSE numerical comparator
Jun 17, 2025
830631d
Refactor XNNPACK tester to extract delegate-independent tester classe…
GregoryComer Jun 17, 2025
a6d8440
Reapply D74208085: "Switch fbcode builds of ExecuTorch and PyTorch to…
pytorchbot Jun 17, 2025
d984a2c
[ET-VK][Ops] quantization op shaders and impl (#11767)
pytorchbot Jun 17, 2025
9051d2d
[ET-VK][Ops] dequantization op shaders and impl (#11768)
pytorchbot Jun 17, 2025
03ffa48
[ET-VK][Ops] choose_qparams op shaders and impl (#11769)
pytorchbot Jun 17, 2025
7bd15b9
skip et quantizer numeric debugging tests for infra update
Gasoonjia Jun 17, 2025
5960a4b
[llm] Fix start_pos not being updated in prefill_chunk()
larryliu0820 Jun 18, 2025
57e0765
[llm] Update metadata max_seq_len based on the max range of dynamic s…
larryliu0820 Jun 18, 2025
7dfb47d
Support Span to construct from a single element similar to ArrayRef (…
pytorchbot Jun 18, 2025
44d2643
Introduce extension/llm/export_llm
jackzhxng Jun 18, 2025
7565342
Arm backend: Prevent illegal fusion in FuseEqualPlaceholdersPass (#11…
YufengShi-dudu Jun 18, 2025
5ca9876
Use MAP_SHARED to allow sharing memory between processing (#11733)
shoumikhin Jun 18, 2025
6b47a16
Set pyre-strict for passes unit tests.
eigen-k Jun 18, 2025
3c05b6c
Update CODEOWNERS (#11785)
mergennachin Jun 18, 2025
5531caf
Re-introduce the type-erasing tensor class.
shoumikhin Jun 18, 2025
ee7e29f
Propagate core_aten_exceptions to quantize_and_export_to_executorch a…
eigen-k Jun 18, 2025
29db57b
Improve executor_runner cmake instructions (#11773)
guangy10 Jun 18, 2025
4543412
Add legacy mode test
mcr229 Jun 18, 2025
5006a14
[ET-VK][Ops] enabling double support for quantization and dequantizat…
pytorchbot Jun 18, 2025
fcc7f3b
Arm backend: Change QAT weight observer (#11787)
wwwind Jun 18, 2025
3943938
Tensor description
shoumikhin Jun 18, 2025
5136175
Value description (#11798)
shoumikhin Jun 18, 2025
daebcde
dtype selective build from model API in OSS (#11760)
BujSet Jun 18, 2025
137163f
Static attention Python I/O manager
sxu Jun 19, 2025
9bb0735
Temporarily fix Moshi test (#11799)
jackzhxng Jun 19, 2025
a1dec07
Fix wheel build CI jobs by installing torchvision (#11806)
larryliu0820 Jun 19, 2025
994752e
Arm backend: Add unittest for MHA (#11812)
oscarandersson8218 Jun 19, 2025
28b8198
Arm backend: Add support for grouped convolution (#11817)
AdrianLundell Jun 19, 2025
99b3a68
Arm backend: Add support for aten.round (#11813)
oscarandersson8218 Jun 19, 2025
91dfd62
Arm backend: Improve pooling args handling (#11819)
AdrianLundell Jun 19, 2025
bc605b8
Arm backend: Test temp memory allocation return code in backend (#11814)
zingo Jun 19, 2025
10f0d22
Arm backend: Fix bug of inserting unnecessary casts for aten.where.se…
YufengShi-dudu Jun 19, 2025
28905e7
Arm backend: Remove hard coded TOSA profile in VGF backend (#11818)
mansnils Jun 19, 2025
aae0dba
Make the requant pass call the per_tensor overload
mcremon-meta Jun 19, 2025
5c91435
Fix static attention non-HF RoPE implementation
sxu Jun 20, 2025
496cb05
[llm] Add sentencepiece tokenizer support to llm runner
larryliu0820 Jun 20, 2025
9345972
Update install script and building from source docs (#10652)
keyprocedure Jun 20, 2025
e0f81d8
Qualcomm AI Engine Direct - Delegate mutable buffer and fix the mutab…
shewu-quic Jun 20, 2025
496022e
ET_HAS_EXCEPTIONS: require defined(_MSC_VER) to conclude _HAS_EXCEPTI…
pytorchbot Jun 20, 2025
6a787ce
[ET-VK][codegen][fix] Split codegen and SPIR-V compilation into separ…
pytorchbot Jun 20, 2025
63b047b
Fix typo for 16a4w_block quantization
rohansjoshi Jun 20, 2025
a12a005
Arm backend: Build c/c++ code with -Wall -Werror (#11815)
zingo Jun 21, 2025
608a745
Added quantization for evaluation script
rohansjoshi Jun 21, 2025
695c7d5
Replaced int4 string with torch.int4
rohansjoshi Jun 23, 2025
121714a
Fix apple benchmark app installation issue caused by not installing s…
larryliu0820 Jun 23, 2025
4cb71a0
Arm backend: Add missing bias in pass (#11847)
oscarandersson8218 Jun 23, 2025
d59fddc
Arm backend: Support ScalarType::Bool in EthosUBackend (#11850)
zingo Jun 23, 2025
1793bae
Arm backend: Add partial support for index.Tensor (#11851)
iliyan-georgiev-arm Jun 23, 2025
18e4240
Implement load_into for Mmap Data Loader (#11654)
keyprocedure Jun 23, 2025
925ef20
Pull c10 headers directly from PyTorch internally, not from the c10 m…
pytorchbot Jun 23, 2025
da36d8a
Fix shadowing error in mmap_data_loader.cpp (#11858)
GregoryComer Jun 23, 2025
be07160
Create a MemoryPlanningAlgo class.
hsharma35 Jun 23, 2025
7f2fcb0
Add script to fetch benchmark results for execuTorch (#11734)
yangw-dev Jun 23, 2025
d83636d
Ability to specify full file configs for export_llm (#11809)
jackzhxng Jun 23, 2025
222d9e3
Add inspector numeric gap calculation between AOT and runtime interme…
Jun 23, 2025
ff3c3b6
[Quantized DeConv Support] Enable Quantized Transposed Convs with gro…
pytorchbot Jun 23, 2025
3b02c99
Fix LlmConfig enum usage (#11833)
jackzhxng Jun 23, 2025
4df9290
Update export_llama in READMEs to use export_llm (#11811)
jackzhxng Jun 23, 2025
0c12dcd
Add tanh op to XNNPACK backend (#11804)
leafs1 Jun 23, 2025
09f621b
[Quantized DeConv Support] Enable Quantized Transposed Convs with gro…
mcr229 Jun 17, 2025
7d3ec3d
[Quantized DeConv Support] Dynamically Quantized Deconvolutions with …
mcr229 Jun 17, 2025
b7572d0
[XNNPACK Quantizer] Select between TConvs and Convs
mcr229 Jun 17, 2025
3 changes: 3 additions & 0 deletions .ci/docker/requirements-ci.txt
@@ -28,3 +28,6 @@ matplotlib>=3.9.4
myst-parser==0.18.1
sphinx_design==0.4.1
sphinx-copybutton==0.5.0

# script unit test requirements
yaspin==3.1.0
172 changes: 172 additions & 0 deletions .ci/scripts/benchmark_tooling/README.md
@@ -0,0 +1,172 @@
# ExecuTorch Benchmark Tooling

A library providing tools for fetching, processing, and analyzing ExecuTorch benchmark data from the HUD Open API. This tooling helps compare performance metrics between private and public devices with identical settings.

## Table of Contents

- [Overview](#overview)
- [Installation](#installation)
- [Tools](#tools)
  - [get_benchmark_analysis_data.py](#get_benchmark_analysis_datapy)
    - [Quick Start](#quick-start)
    - [Command Line Options](#command-line-options)
    - [Example Usage](#example-usage)
    - [Working with Output Files](#working-with-output-files-csv-and-excel)
    - [Python API Usage](#python-api-usage)
- [Running Unit Tests](#running-unit-tests)

## Overview

The ExecuTorch Benchmark Tooling provides a suite of utilities designed to:

- Fetch benchmark data from the HUD Open API for specified time ranges
- Clean and process data by filtering out failures
- Compare metrics between private and public devices with matching configurations
- Generate analysis reports in various formats (CSV, Excel, JSON)
- Support filtering by device pools, backends, and models

This tooling is particularly useful for performance analysis, regression testing, and cross-device comparisons.

## Installation

Install dependencies:

```bash
pip install -r requirements.txt
```

## Tools

### get_benchmark_analysis_data.py

This script is mainly used to generate analysis data comparing private devices with public devices using the same settings.

It fetches benchmark data from the HUD Open API for a specified time range, cleans the data by removing entries with FAILURE indicators, and retrieves all private device metrics along with the equivalent public device metrics, matched on the [model, backend, device_pool_names, arch] configuration. Users can filter the data by specifying private device_pool_names, backends, and models.
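
The cleaning and matching steps described above can be sketched in plain Python. This is a minimal illustration and not the script's actual implementation: the record layout (dicts with `model`, `backend`, `device`, `arch`, `private`, and `value` keys) and the helper names `drop_failures` and `match_public` are assumptions made for the example, and the real HUD API response may look different.

```python
from collections import defaultdict

# Hypothetical record layout; the real HUD API response may differ.
RECORDS = [
    {"model": "mv3", "backend": "xnnpack_q8", "device": "s22",
     "arch": "android", "private": True, "value": "12.5"},
    {"model": "mv3", "backend": "xnnpack_q8", "device": "s22",
     "arch": "android", "private": False, "value": "13.1"},
    {"model": "mv3", "backend": "xnnpack_q8", "device": "s22",
     "arch": "android", "private": False, "value": "FAILURE"},
    {"model": "mv3", "backend": "coreml", "device": "iphone15",
     "arch": "ios", "private": False, "value": "9.0"},
]

# Fields that must agree for a private and a public group to be compared.
KEY_FIELDS = ("model", "backend", "device", "arch")


def drop_failures(records):
    """Remove entries whose value carries a FAILURE indicator."""
    return [r for r in records if "FAILURE" not in str(r["value"])]


def match_public(records):
    """Pair every private group with the public group sharing its key."""
    private, public = defaultdict(list), defaultdict(list)
    for r in records:
        key = tuple(r[f] for f in KEY_FIELDS)
        (private if r["private"] else public)[key].append(r)
    # Only keys with private data are kept; public-only groups are ignored.
    return {key: (rows, public.get(key, [])) for key, rows in private.items()}


matched = match_public(drop_failures(RECORDS))
```

In this sketch the public-only `coreml` group is dropped because no private device shares its configuration, which mirrors the "private metrics plus equivalent public metrics" behavior described above.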

#### Quick Start

```bash
# Generate Excel sheets for all private devices with public devices using the same settings
python3 .ci/scripts/benchmark_tooling/get_benchmark_analysis_data.py \
  --startTime "2025-06-11T00:00:00" \
  --endTime "2025-06-17T18:00:00" \
  --outputType "excel"

# Generate the benchmark stability analysis
python3 .ci/scripts/benchmark_tooling/analyze_benchmark_stability.py \
  --primary-file private.xlsx \
  --reference-file public.xlsx
```

#### Command Line Options

##### Basic Options:
- `--startTime`: Start time in ISO format (e.g., "2025-06-11T00:00:00") (required)
- `--endTime`: End time in ISO format (e.g., "2025-06-17T18:00:00") (required)
- `--env`: Choose environment ("local" or "prod", default: "prod")
- `--no-silent`: Show processing logs (default: only show results & minimum logging)

##### Output Options:
- `--outputType`: Choose output format (default: "print")
  - `print`: Display results in console
  - `json`: Generate JSON file
  - `df`: Display results in DataFrame format: `{'private': List[{'groupInfo': Dict, 'df': DF}, ...], 'public': List[{'groupInfo': Dict, 'df': DF}, ...]}`
  - `excel`: Generate Excel files with multiple sheets; the cell in the first row and first column contains the JSON string of the raw metadata
  - `csv`: Generate CSV files in separate folders; the cell in the first row and first column contains the JSON string of the raw metadata
- `--outputDir`: Directory to save output files (default: current directory)

##### Filtering Options:

- `--device-pools`: Filter by private device pool names (e.g., "samsung-galaxy-s22-5g", "samsung-galaxy-s22plus-5g")
- `--backends`: Filter by specific backend names (e.g., "xnnpack_q8")
- `--models`: Filter by specific model names (e.g., "mv3", "meta-llama-llama-3.2-1b-instruct-qlora-int4-eo8")
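
For readers who want to see how the options above fit together, here is a minimal `argparse` sketch. The flag names, choices, and defaults are taken from the lists above; everything else is an assumption. In particular, the use of `nargs="*"` for the filter options is inferred from the usage examples passing several space-separated values, and the real script may define its parser differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    # Basic options
    parser.add_argument("--startTime", required=True, help="start time in ISO format")
    parser.add_argument("--endTime", required=True, help="end time in ISO format")
    parser.add_argument("--env", choices=["local", "prod"], default="prod")
    parser.add_argument("--no-silent", action="store_true", help="show processing logs")
    # Output options
    parser.add_argument(
        "--outputType",
        choices=["print", "json", "df", "excel", "csv"],
        default="print",
    )
    parser.add_argument("--outputDir", default=".")
    # Filtering options: nargs="*" lets each flag take several values.
    parser.add_argument("--device-pools", nargs="*", default=[])
    parser.add_argument("--backends", nargs="*", default=[])
    parser.add_argument("--models", nargs="*", default=[])
    return parser


args = build_parser().parse_args([
    "--startTime", "2025-06-01T00:00:00",
    "--endTime", "2025-06-11T00:00:00",
    "--device-pools", "apple_iphone_15_private", "samsung_s22_private",
    "--models", "mv3",
])
```

Note that `argparse` exposes `--device-pools` as `args.device_pools` and `--no-silent` as `args.no_silent`, since dashes in option names become underscores.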

#### Example Usage

Filter by multiple private device pools and models:
```bash
# This fetches all private table data for models 'llama-3.2-1B' and 'mv3'
python3 get_benchmark_analysis_data.py \
  --startTime "2025-06-01T00:00:00" \
  --endTime "2025-06-11T00:00:00" \
  --device-pools 'apple_iphone_15_private' 'samsung_s22_private' \
  --models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
```

Filter by specific device pool and models:
```bash
# This fetches all private iPhone table data for models 'llama-3.2-1B' and 'mv3',
# and associated public iPhone data
python3 get_benchmark_analysis_data.py \
  --startTime "2025-06-01T00:00:00" \
  --endTime "2025-06-11T00:00:00" \
  --device-pools 'apple_iphone_15_private' \
  --models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
```

#### Working with Output Files CSV and Excel

You can use the methods in `common.py` to convert the file data back to DataFrame format. These methods read the first row of each CSV/Excel file and return the results as a list of `{"groupInfo": dict, "df": pandas.DataFrame}` entries.

```python
import logging
logging.basicConfig(level=logging.INFO)
from .ci.scripts.benchmark_tooling.common import read_all_csv_with_metadata, read_excel_with_json_header

# For CSV files (assuming the 'private' folder is in the current directory)
folder_path = './private'
res = read_all_csv_with_metadata(folder_path)
logging.info(res)

# For Excel files (assuming the Excel file is in the current directory)
file_path = "./private.xlsx"
res = read_excel_with_json_header(file_path)
logging.info(res)
```
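
To make the file layout concrete, the following standard-library sketch writes and reads back a CSV whose first cell holds the JSON metadata. It is an illustration only and not the implementation in `common.py`: the function name `read_csv_with_json_header`, the assumption that the second row holds the column names, the `rows` key (used here in place of a DataFrame), and the sample metric name are all hypothetical.

```python
import csv
import io
import json


def read_csv_with_json_header(stream):
    """Return {"groupInfo": dict, "rows": list of dicts} from a CSV stream.

    Assumes row 1, column 1 holds a JSON string and row 2 holds column names.
    """
    reader = csv.reader(stream)
    group_info = json.loads(next(reader)[0])  # metadata cell
    header = next(reader)                     # column names
    rows = [dict(zip(header, row)) for row in reader]
    return {"groupInfo": group_info, "rows": rows}


# Build a small in-memory file in the assumed layout.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([json.dumps({"model": "mv3", "backend": "xnnpack_q8"})])
writer.writerow(["metric", "value"])
writer.writerow(["avg_inference_latency(ms)", "12.5"])
buf.seek(0)

result = read_csv_with_json_header(buf)
```

Keeping the metadata in the first cell means each file is self-describing, so a folder of CSVs can be regrouped without relying on file names.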

#### Python API Usage

To use the benchmark fetcher in your own scripts:

```python
from .ci.scripts.benchmark_tooling.get_benchmark_analysis_data import ExecutorchBenchmarkFetcher

# Initialize the fetcher
fetcher = ExecutorchBenchmarkFetcher(env="prod", disable_logging=False)

# Fetch data for a specific time range
fetcher.run(
    start_time="2025-06-11T00:00:00",
    end_time="2025-06-17T18:00:00",
)

# Get results in different formats
# As DataFrames
df_results = fetcher.to_df()

# Export to Excel
fetcher.to_excel(output_dir="./results")

# Export to CSV
fetcher.to_csv(output_dir="./results")

# Export to JSON
json_path = fetcher.to_json(output_dir="./results")

# Get raw dictionary results
dict_results = fetcher.to_dict()

# Use the output_data method for flexible output
results = fetcher.output_data(output_type="excel", output_dir="./results")
```

## Running Unit Tests

The benchmark tooling includes unit tests to ensure functionality.

### Using pytest for unit tests

```bash
# From the executorch root directory
pytest -c /dev/null .ci/scripts/tests/test_get_benchmark_analysis_data.py
```