Commit a9958a3

Update base for Update on "[ET-VK] Allow specifying multiple storage types/memory layouts for an operator + register group norm operator"

## Changes

* Handle cases where an operator needs to specify a separate storage type / memory layout for each individual output.

## Motivation

Required for the group norm operator.

## Future Work

Currently, the `tag_memory_meta_pass` graph pass assumes that all tensors participating in a computation (aside from weights) have the same storage type and memory layout. As more operators are added, there are more exceptions to this rule. The pass may need an update in the near future to make it possible to specify required storage types and memory layouts at a more granular level.

Differential Revision: [D77038781](https://our.internmc.facebook.com/intern/diff/D77038781/)

[ghstack-poisoned]

2 parents d7607af + 222d9e3 commit a9958a3

53 files changed (+4633, -556 lines)

.ci/docker/requirements-ci.txt (3 additions, 0 deletions)

```diff
@@ -28,3 +28,6 @@ matplotlib>=3.9.4
 myst-parser==0.18.1
 sphinx_design==0.4.1
 sphinx-copybutton==0.5.0
+
+# script unit test requirements
+yaspin==3.1.0
```
New file (172 additions, 0 deletions):
# Executorch Benchmark Tooling

A library providing tools for fetching, processing, and analyzing ExecutorchBenchmark data from the HUD Open API. This tooling helps compare performance metrics between private and public devices with identical settings.

## Table of Contents

- [Overview](#overview)
- [Installation](#installation)
- [Tools](#tools)
  - [get_benchmark_analysis_data.py](#get_benchmark_analysis_datapy)
    - [Quick Start](#quick-start)
    - [Command Line Options](#command-line-options)
    - [Example Usage](#example-usage)
    - [Working with Output Files](#working-with-output-files-csv-and-excel)
    - [Python API Usage](#python-api-usage)
- [Running Unit Tests](#running-unit-tests)
## Overview

The Executorch Benchmark Tooling provides a suite of utilities designed to:

- Fetch benchmark data from the HUD Open API for specified time ranges
- Clean and process data by filtering out failures
- Compare metrics between private and public devices with matching configurations
- Generate analysis reports in various formats (CSV, Excel, JSON)
- Support filtering by device pools, backends, and models

This tooling is particularly useful for performance analysis, regression testing, and cross-device comparisons.
## Installation

Install dependencies:

```bash
pip install -r requirements.txt
```
## Tools

### get_benchmark_analysis_data.py

This script generates analysis data comparing private devices with public devices that use the same settings.

It fetches benchmark data from the HUD Open API for a specified time range, cleans the data by removing entries with FAILURE indicators, and retrieves all private device metrics along with the equivalent public device metrics that match on [model, backend, device_pool_names, arch]. Users can filter the data by specifying private device_pool_names, backends, and models.
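The matching step above can be sketched as follows. This is an illustrative reconstruction, not the script's actual code: the row fields (`model`, `backend`, `arch`) are assumed names based on the configuration key described above, and the key is simplified to three fields.

```python
# Illustrative sketch of matching private benchmark rows to public rows that
# share the same configuration. Field names are assumptions for illustration.

def config_key(row):
    # A simplified configuration key; the real tool also considers
    # device_pool_names when pairing private and public data.
    return (row["model"], row["backend"], row["arch"])

def match_private_to_public(private_rows, public_rows):
    # Index public rows by configuration, then pair each private row with
    # every public row that shares its configuration.
    public_by_key = {}
    for row in public_rows:
        public_by_key.setdefault(config_key(row), []).append(row)
    return [(row, public_by_key.get(config_key(row), [])) for row in private_rows]

private_rows = [{"model": "mv3", "backend": "xnnpack_q8", "arch": "ios_17"}]
public_rows = [
    {"model": "mv3", "backend": "xnnpack_q8", "arch": "ios_17"},
    {"model": "mv2", "backend": "xnnpack_q8", "arch": "ios_17"},
]
pairs = match_private_to_public(private_rows, public_rows)
```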
#### Quick Start

```bash
# Generate Excel sheets for all private devices alongside public devices with the same settings
python3 .ci/scripts/benchmark_tooling/get_benchmark_analysis_data.py \
  --startTime "2025-06-11T00:00:00" \
  --endTime "2025-06-17T18:00:00" \
  --outputType "excel"

# Generate the benchmark stability analysis
python3 .ci/scripts/benchmark_tooling/analyze_benchmark_stability.py \
  --primary-file private.xlsx \
  --reference-file public.xlsx
```
#### Command Line Options

##### Basic Options:

- `--startTime`: Start time in ISO format, e.g. "2025-06-11T00:00:00" (required)
- `--endTime`: End time in ISO format, e.g. "2025-06-17T18:00:00" (required)
- `--env`: Environment to query ("local" or "prod"; default: "prod")
- `--no-silent`: Show processing logs (by default, only results and minimal logging are shown)
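The time-window arguments use exactly the format accepted by Python's `datetime.fromisoformat`, so they can be validated before querying. A small sketch (the helper name is hypothetical, not part of the tool's API):

```python
from datetime import datetime

def parse_window(start_time: str, end_time: str):
    # Hypothetical helper: validate the ISO-format --startTime/--endTime
    # arguments and ensure the window is non-empty.
    start = datetime.fromisoformat(start_time)
    end = datetime.fromisoformat(end_time)
    if start >= end:
        raise ValueError("startTime must be earlier than endTime")
    return start, end

start, end = parse_window("2025-06-11T00:00:00", "2025-06-17T18:00:00")
```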
##### Output Options:

- `--outputType`: Choose output format (default: "print")
  - `print`: Display results in the console
  - `json`: Generate a JSON file
  - `df`: Return results as DataFrames: `{'private': List[{'groupInfo': Dict, 'df': DF}, ...], 'public': List[{'groupInfo': Dict, 'df': DF}, ...]}`
  - `excel`: Generate Excel files with multiple sheets; the cell in the first row and first column contains the raw metadata as a JSON string
  - `csv`: Generate CSV files in separate folders; the cell in the first row and first column contains the raw metadata as a JSON string
- `--outputDir`: Directory to save output files (default: current directory)
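To give a sense of the `df` output shape, here is a sketch that walks the structure. The `df` values below are plain-dict stand-ins for the real pandas DataFrames, and the metric column name is hypothetical; only the top-level `{'private': [...], 'public': [...]}` shape comes from the description above.

```python
# Stand-in for the `df` output: each side holds a list of groups, and each
# group pairs its configuration metadata with a table of metrics.
results = {
    "private": [
        {"groupInfo": {"model": "mv3", "backend": "xnnpack_q8"},
         "df": {"latency_ms": [12.1, 12.4]}},  # hypothetical column name
    ],
    "public": [
        {"groupInfo": {"model": "mv3", "backend": "xnnpack_q8"},
         "df": {"latency_ms": [11.9, 12.2]}},
    ],
}

def summarize(results):
    # Collect (source, model, backend) for every group in the result dict.
    rows = []
    for source, groups in results.items():
        for group in groups:
            info = group["groupInfo"]
            rows.append((source, info["model"], info["backend"]))
    return rows

summary = summarize(results)
```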
##### Filtering Options:

- `--device-pools`: Filter by private device pool names (e.g., "samsung-galaxy-s22-5g", "samsung-galaxy-s22plus-5g")
- `--backends`: Filter by specific backend names (e.g., "xnnpack_q8")
- `--models`: Filter by specific model names (e.g., "mv3", "meta-llama-llama-3.2-1b-instruct-qlora-int4-eo8")
#### Example Usage

Filter by multiple private device pools and models:

```bash
# Fetch all private table data for the models 'llama-3.2-1B' and 'mv3'
python3 get_benchmark_analysis_data.py \
  --startTime "2025-06-01T00:00:00" \
  --endTime "2025-06-11T00:00:00" \
  --device-pools 'apple_iphone_15_private' 'samsung_s22_private' \
  --models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
```

Filter by a specific device pool and models:

```bash
# Fetch all private iPhone table data for the models 'llama-3.2-1B' and 'mv3',
# along with the associated public iPhone data
python3 get_benchmark_analysis_data.py \
  --startTime "2025-06-01T00:00:00" \
  --endTime "2025-06-11T00:00:00" \
  --device-pools 'apple_iphone_15_private' \
  --models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
```
#### Working with Output Files (CSV and Excel)

You can use the methods in `common.py` to convert the file data back to DataFrame form. These methods read the JSON metadata stored in the first cell of each CSV/Excel file and return results in the form `List[{"groupInfo": Dict, "df": pd.DataFrame}]`.

```python
import logging
import sys

logging.basicConfig(level=logging.INFO)

# Make the tooling importable when running from the executorch root directory
sys.path.append(".ci/scripts/benchmark_tooling")
from common import read_all_csv_with_metadata, read_excel_with_json_header

# For CSV files (assuming the 'private' folder is in the current directory)
folder_path = "./private"
res = read_all_csv_with_metadata(folder_path)
logging.info(res)

# For Excel files (assuming the Excel file is in the current directory)
file_path = "./private.xlsx"
res = read_excel_with_json_header(file_path)
logging.info(res)
```
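For reference, the CSV layout described above (metadata JSON in the first cell, table below it) can be parsed with the standard library alone. This is a minimal sketch of the idea, not the actual `common.py` implementation, and the sample metric row is invented for illustration:

```python
import csv
import io
import json

def read_csv_with_json_header(text):
    # Minimal sketch: the first row's first cell holds the group metadata as
    # a JSON string; the remaining rows hold the table (header row followed
    # by data rows).
    rows = list(csv.reader(io.StringIO(text)))
    metadata = json.loads(rows[0][0])
    header, data = rows[1], rows[2:]
    records = [dict(zip(header, row)) for row in data]
    return {"groupInfo": metadata, "rows": records}

# The JSON metadata cell is quoted, with inner quotes doubled per CSV rules.
sample = '"{""model"": ""mv3"", ""backend"": ""xnnpack_q8""}"\nmetric,value\nlatency_ms,12.1\n'
result = read_csv_with_json_header(sample)
```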
#### Python API Usage

To use the benchmark fetcher in your own scripts:

```python
import sys

# Make the tooling importable when running from the executorch root directory
sys.path.append(".ci/scripts/benchmark_tooling")
from get_benchmark_analysis_data import ExecutorchBenchmarkFetcher

# Initialize the fetcher
fetcher = ExecutorchBenchmarkFetcher(env="prod", disable_logging=False)

# Fetch data for a specific time range
fetcher.run(
    start_time="2025-06-11T00:00:00",
    end_time="2025-06-17T18:00:00",
)

# Get results in different formats
# As DataFrames
df_results = fetcher.to_df()

# Export to Excel
fetcher.to_excel(output_dir="./results")

# Export to CSV
fetcher.to_csv(output_dir="./results")

# Export to JSON
json_path = fetcher.to_json(output_dir="./results")

# Get raw dictionary results
dict_results = fetcher.to_dict()

# Use the output_data method for flexible output
results = fetcher.output_data(output_type="excel", output_dir="./results")
```
## Running Unit Tests

The benchmark tooling includes unit tests to verify its functionality.

### Using pytest for unit tests

```bash
# From the executorch root directory
pytest -c /dev/null .ci/scripts/tests/test_get_benchmark_analysis_data.py
```

.ci/scripts/benchmark_tooling/__init__.py

Whitespace-only changes.
