Commit 99df0fe

fix error test
Signed-off-by: Yang Wang <[email protected]>
1 parent ed48f5f commit 99df0fe

1 file changed

+53
-21
lines changed

.ci/scripts/benchmark_tooling/README.md

Lines changed: 53 additions & 21 deletions
@@ -1,20 +1,48 @@
1-
# Benchmark Tooling
1+
# Executorch Benchmark Tooling
22

3-
A library providing tools for fetching, processing, and analyzing ExecutorchBenchmark data from the HUD Open API.
3+
A library providing tools for fetching, processing, and analyzing ExecutorchBenchmark data from the HUD Open API. This tooling helps compare performance metrics between private and public devices with identical settings.
4+
5+
## Table of Contents
6+
7+
- [Overview](#overview)
8+
- [Installation](#installation)
9+
- [Tools](#tools)
10+
- [get_benchmark_analysis_data.py](#get_benchmark_analysis_datapy)
11+
- [Quick Start](#quick-start)
12+
- [Command Line Options](#command-line-options)
13+
- [Example Usage](#example-usage)
14+
- [Working with Output Files](#working-with-output-files-csv-and-excel)
15+
- [Python API Usage](#python-api-usage)
16+
- [Running Unit Tests](#running-unit-tests)
17+
18+
## Overview
19+
20+
The Executorch Benchmark Tooling provides a suite of utilities designed to:
21+
22+
- Fetch benchmark data from HUD Open API for specified time ranges
23+
- Clean and process data by filtering out failures
24+
- Compare metrics between private and public devices with matching configurations
25+
- Generate analysis reports in various formats (CSV, Excel, JSON)
26+
- Support filtering by device pools, backends, and models
27+
28+
This tooling is particularly useful for performance analysis, regression testing, and cross-device comparisons.
429

530
## Installation
631

732
Install dependencies:
33+
834
```bash
935
pip install -r requirements.txt
1036
```
1137

1238
## Tools
1339

1440
### get_benchmark_analysis_data.py
15-
This script mainlu used to generate analysis data between private device and public device with same settings.
1641

17-
It fetches benchmark data from HUD Open API for a time range, then cleans the data with FAILURE inidcator, and retrieves all private device metrics and equivalent public device metrics based on [model, backend, device_pool_names, arch]. User can filter the data by specifying private device_pool_names, backends, and models for private devices.
42+
This script is mainly used to generate analysis data comparing private devices with public devices using the same settings.
43+
44+
It fetches benchmark data from HUD Open API for a specified time range, cleans the data by removing entries with FAILURE indicators, and retrieves all private device metrics along with equivalent public device metrics based on matching [model, backend, device_pool_names, arch] configurations. Users can filter the data by specifying private device_pool_names, backends, and models.
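
The clean-then-match flow described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual implementation: the flat record layout, the `failed`, `pool`, `device`, and `value` field names, and the `match_private_public` helper are all hypothetical, and the real HUD Open API response schema is not shown in this README.

```python
from collections import defaultdict

# Hypothetical flat records; field names are assumptions for illustration only.
records = [
    {"model": "mv3", "backend": "qnn-q8", "device": "s22", "arch": "android-13",
     "pool": "private", "value": 12.5, "failed": False},
    {"model": "mv3", "backend": "qnn-q8", "device": "s22", "arch": "android-13",
     "pool": "public", "value": 14.0, "failed": False},
    {"model": "mv3", "backend": "qnn-q8", "device": "s22", "arch": "android-13",
     "pool": "public", "value": 99.0, "failed": True},
    {"model": "mv3", "backend": "xnnpack-q8", "device": "s22", "arch": "android-13",
     "pool": "public", "value": 20.0, "failed": False},
]


def match_private_public(records, models=None, backends=None):
    """Drop failed entries, apply optional filters, and group the remaining
    values by configuration so private and public results sit side by side."""
    groups = defaultdict(lambda: {"private": [], "public": []})
    for rec in records:
        if rec["failed"]:
            continue  # cleaning step: skip entries flagged as failures
        if models and rec["model"] not in models:
            continue
        if backends and rec["backend"] not in backends:
            continue
        key = (rec["model"], rec["backend"], rec["device"], rec["arch"])
        groups[key][rec["pool"]].append(rec["value"])
    # Keep only configurations that have private data; public data rides along.
    return {key: pools for key, pools in groups.items() if pools["private"]}


matched = match_private_public(records)
```

Grouping on a tuple key keeps the comparison symmetric: any public result that shares the full configuration with a private result ends up in the same bucket, and configurations with no private data are dropped.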
45+
1846
#### Quick Start
1947

2048
```bash
@@ -37,38 +65,42 @@ python3 .ci/scripts/benchmark_tooling/get_benchmark_analysis_data.py \
3765
- `print`: Display results in console
3866
- `json`: Generate JSON file
3967
- `df`: Display results in DataFrame format: `{'private': List[{'groupInfo': Dict, 'df': DF}, ...], 'public': List[{'groupInfo': Dict, 'df': DF}, ...]}`
40-
- `excel`: Generate Excel files with multiple sheets, the field in first row and first column contains the json string of the raw metadata
41-
- `csv`: Generate CSV files in separate folders, the field in first row and first column contains the json string of the raw metadata
68+
- `excel`: Generate Excel files with multiple sheets; the cell in the first row and first column contains the JSON string of the raw metadata
69+
- `csv`: Generate CSV files in separate folders; the cell in the first row and first column contains the JSON string of the raw metadata
4270
- `--outputDir`: Directory to save output files (default: current directory)
4371

4472
##### Filtering Options:
4573

4674
- `--private-device-pools`: Filter by private device pool names (e.g., "samsung-galaxy-s22-5g", "samsung-galaxy-s22plus-5g")
47-
- `--backends`: Filter by specific backend names (e.g., "qnn-q8" , ""llama3-spinquan)
48-
- `--models`: Filter by specific model names (e.g "mv3" "meta-llama-llama-3.2-1b-instruct-qlora-int4-eo8")
75+
- `--backends`: Filter by specific backend names (e.g., "qnn-q8", "llama3-spinquan")
76+
- `--models`: Filter by specific model names (e.g., "mv3", "meta-llama-llama-3.2-1b-instruct-qlora-int4-eo8")
4977

5078
#### Example Usage
51-
call multiple private device pools and models:
52-
this fetches all the private table data that has model `llama-3.2-1B` and `mv3`
79+
80+
Filter by multiple private device pools and models:
5381
```bash
82+
# This fetches all private table data for models 'llama-3.2-1B' and 'mv3'
5483
python3 get_benchmark_analysis_data.py \
55-
--startTime "2025-06-01T00:00:00" \
56-
--endTime "2025-06-11T00:00:00" \
57-
--private-device-pools 'apple_iphone_15_private' 'samsung_s22_private' \
58-
--models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
84+
--startTime "2025-06-01T00:00:00" \
85+
--endTime "2025-06-11T00:00:00" \
86+
--private-device-pools 'apple_iphone_15_private' 'samsung_s22_private' \
87+
--models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
5988
```
6089

61-
this fetches all the private iphone table data that has model `llama-3.2-1B` and `mv3`, and associated public iphone
90+
Filter by specific device pool and models:
6291
```bash
92+
# This fetches all private iPhone table data for models 'llama-3.2-1B' and 'mv3',
93+
# and associated public iPhone data
6394
python3 get_benchmark_analysis_data.py \
64-
--startTime "2025-06-01T00:00:00" \
65-
--endTime "2025-06-11T00:00:00" \
66-
--private-device-pools 'apple_iphone_15_private' \
67-
--models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
95+
--startTime "2025-06-01T00:00:00" \
96+
--endTime "2025-06-11T00:00:00" \
97+
--private-device-pools 'apple_iphone_15_private' \
98+
--models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
6899
```
100+
69101
#### Working with Output Files CSV and Excel
70102

71-
You can use methods in `common.py` to convert the file data back to DataFrame format, those methods read the first row in csv/excel file, and return result with format list of {"groupInfo":DICT, "df":df.Dataframe{}} format.
103+
You can use methods in `common.py` to convert the file data back to DataFrame format. These methods read the first row of each CSV/Excel file to recover the metadata and return a list of entries of the form `{"groupInfo": dict, "df": DataFrame}`.
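
To make the file layout concrete, here is a standard-library sketch of reading such a CSV back. It is not the `common.py` implementation: the `read_group_csv` helper is hypothetical, the layout below the first row (a header row followed by data rows) is an assumption, and the table is returned as a list of dicts under a `rows` key rather than as a DataFrame.

```python
import csv
import io
import json


def read_group_csv(text):
    """Parse CSV text whose first cell holds a JSON metadata string.

    Assumed layout: row 1 = metadata cell, row 2 = column names, rest = data.
    """
    rows = list(csv.reader(io.StringIO(text)))
    group_info = json.loads(rows[0][0])  # first row, first column
    header, *data = rows[1:]
    table = [dict(zip(header, row)) for row in data]
    return {"groupInfo": group_info, "rows": table}


# Build a small sample in memory so the sketch runs without any files.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([json.dumps({"model": "mv3", "backend": "qnn-q8"})])
writer.writerow(["metric", "value"])
writer.writerow(["latency_ms", "12.5"])

result = read_group_csv(buf.getvalue())
```

Writing the sample with `csv.writer` matters here: the JSON string contains commas and quotes, so it must be quoted as a single CSV field for the first cell to round-trip.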
72104

73105
```python
74106
import logging
@@ -126,7 +158,7 @@ results = fetcher.output_data(output_type="excel", output_dir="./results")
126158

127159
The benchmark tooling includes unit tests to ensure functionality.
128160

129-
### Using pytest
161+
### Using pytest for unit tests
130162

131163
```bash
132164
# From the executorch root directory
