Commit b873570

Trace Archive Tool (#276)
This PR introduces the Trace Archive Tool along with documentation improvements in `traces/stf_trace_archive/README.md`.

The trace archive tool comes with four commands:

* **Upload.** Upload workload and/or trace.
* **List.** List items by category.
* **Get.** Download a specified trace file.
* **Setup.** Create or edit current tool configurations.

The files in this PR follow this structure:

```text
README.md                 # Main documentation file
src/
├── data/                 # Core data models and classes
│   └── storage/          # Storage backend implementations (local, cloud, etc.)
├── handlers/             # Command handlers (upload, get, list, setup)
├── utils/                # Utility functions and helpers
└── trace_archive.py      # Main CLI entry point
```

### Quickstart

```bash
# Install dependencies
pip install -r requirements.txt

# Configure initial storage (local)
python trace_archive.py setup

# Upload a workload and trace
python trace_archive.py upload --workload ../../stf_metadata/example/dhrystone --trace ../../stf_metadata/example/dhrystone.zstf
```

Command definitions and examples are in the README file.

### Assumptions:

* Every metadata file for simpointed traces will contain the property `stf` -> `trace_interval` -> `start_instruction_index`
* Trace files have a `.zstf` file type

### Missing Work / Next Steps:

* Add Bundle / Suite grouping
* Add script to generate metadata for traces
* Add search command
* Make it possible to upload workload-related files, like objdump
* Add a test class
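The two assumptions listed in the description can be expressed as a small check. This is only a sketch: the helper names `start_instruction_index` and `is_trace_file` are hypothetical and do not come from the PR. Only the nested key path and the `.zstf` suffix are taken from the description.

```python
from pathlib import Path
from typing import Optional


def start_instruction_index(metadata: dict) -> Optional[int]:
    """Return stf -> trace_interval -> start_instruction_index, or None if absent."""
    interval = (metadata.get("stf") or {}).get("trace_interval") or {}
    return interval.get("start_instruction_index")


def is_trace_file(path: str) -> bool:
    """Trace files are assumed to use the .zstf file type."""
    return Path(path).suffix == ".zstf"


# Hypothetical metadata, shaped like a parsed <trace>.metadata.yaml file.
simpointed = {"stf": {"trace_interval": {"start_instruction_index": 1_000_000}}}
full_trace = {"stf": {}}

print(start_instruction_index(simpointed))  # 1000000
print(start_instruction_index(full_trace))  # None
print(is_trace_file("dhrystone.zstf"))      # True
```

Returning `None` for a missing key is one possible design; a stricter tool could raise an error for simpointed traces that lack the property.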
1 parent f2a1476 commit b873570

27 files changed

+1333
-92
lines changed

.gitignore

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,4 +38,7 @@
3838
[Ff]ast[Dd]ebug
3939

4040
# Backup files
41-
*~
41+
*~
42+
43+
# Python cache file
44+
__pycache__/

traces/stf_trace_archive/README.md

Lines changed: 160 additions & 91 deletions
@@ -1,113 +1,182 @@
11
# Trace Archive Tool
22

3-
A Python command-line interface (CLI) tool to manage shared trace files,such as uploding, searching and downloading traces.
3+
## Table of Contents
4+
5+
1. [Quickstart](#quickstart)
6+
2. [Introduction](#introduction)
7+
3. [Dependencies](#dependencies)
8+
4. [Project Structure](#project-structure)
9+
5. [Usage](#usage)
10+
    1. [Initial Setup](#initial-setup)
11+
    2. [Upload Command](#upload-command)
12+
    3. [List Command](#list-command)
13+
    4. [Get Command](#get-command)
14+
6. [Examples](#examples)
15+
    1. [Uploading a Trace](#uploading-a-trace)
16+
    2. [Downloading a Trace](#downloading-a-trace)
17+
    3. [Downloading a Workload](#downloading-a-workload)
18+
    4. [Creating and using second storage source](#creating-and-using-second-storage-source)
19+
7. [Trace ID](#trace-id)
20+
    1. [Example Trace IDs](#example-trace-ids)
21+
8. [Storage Folder Structure](#storage-folder-structure)
22+
23+
## Quickstart
424

5-
## Usage
25+
```bash
26+
# Install dependencies
27+
pip install -r requirements.txt
628

7-
Run the script using:
29+
# Configure initial storage (local)
30+
python trace_archive.py setup
831

9-
```bash
10-
python trace_archive.py <command> [options]
32+
# Upload a workload and trace
33+
python trace_archive.py upload --workload ../../stf_metadata/example/dhrystone --trace ../../stf_metadata/example/dhrystone.zstf
1134
```
1235

13-
To view all available commands and options use `--help` or `-h`:
36+
## Introduction
1437

15-
```bash
16-
$ python trace_archive.py --help
17-
Usage: python trace_archive.py COMMAND [OPTIONS]
38+
A Python CLI tool for uploading, organizing, and sharing trace and workload files.
39+
Currently supports local storage, with planned extensions for cloud sources (e.g., Google Drive).
40+
41+
## Dependencies
1842

19-
CLI tool for Olympia traces exploration
43+
To use the trace archive tool, ensure you have the following installed:
2044

21-
Commands:
22-
connect Connect to the system or database.
23-
upload Upload workload and trace.
24-
search Search traces by specified expression.
25-
list List items by category.
26-
get Download a specified trace file.
45+
- **Python 3** (recommended: Python 3.8 or newer)
46+
- **Required Python packages**: Install dependencies with:
47+
  ```bash
48+
  pip install -r requirements.txt
49+
  ```
50+
  The main requirements are:
51+
  - `pandas`
52+
  - `PyYAML`
2753

28-
Run 'trace_archive COMMAND --help' for more information on a command.
54+
## Project Structure
2955

30-
For more help on how to use trace_archive, head to GITHUB_README_LINK
56+
```text
57+
src/
58+
├── data/ # Core data models and classes
59+
│ └── storage/ # Storage backend implementations (local, cloud, etc.)
60+
├── handlers/ # Command handlers (upload, get, list, setup)
61+
├── utils/ # Utility functions and helpers
62+
└── trace_archive.py # Main CLI entry point
3163
```
3264

33-
---
3465

35-
## Available Commands
66+
## Usage
67+
68+
The tool can be used with the following commands:
69+
70+
* **[Upload](#upload-command).** Upload workload and/or trace.
71+
* **[List](#list-command).** List items by category.
72+
* **[Get](#get-command).** Download a specified trace file.
73+
* **[Setup](#initial-setup).** Create or edit current tool configurations.
3674

37-
### `upload`
3875

39-
Uploads a trace file along with its associated workload and metadata.
76+
### Initial Setup
77+
78+
To set up the trace archive tool, run the `setup` command to configure the initial storage type and its path. For example, to set up a local storage type with the name `local` and the storage folder path `/home/user/trace_archive`, run:
4079
4180
```bash
42-
$ python trace_archive.py upload --help
43-
Usage: python trace_archive.py upload [OPTIONS]
81+
$ python trace_archive.py setup
82+
Creating a new storage source.
83+
Registred storage type options: local-storage
84+
Select your storage type: local-storage
85+
Enter your storage name: local
86+
Enter the storage folder path: /home/user/trace_archive
87+
```
88+
89+
Every storage source has a type and a name. The type identifies the kind of storage backend, such as `local-storage` or `google-drive`, while the name identifies the storage configuration in the tool's commands.
4490
45-
Upload a workload, trace and metadata to the database
91+
With the initial setup done, you can add new storage sources or change the default storage source by running the `setup` command again with the `--add-storage` or `--set-default-storage` option, respectively.
4692
47-
Options:
48-
--workload Path to the workload file.
49-
--trace Path to the trace file.
50-
--it Iteractive files selection mode.
93+
```bash
94+
$ python trace_archive.py setup --add-storage
95+
$ python trace_archive.py setup --set-default-storage
5196
```
5297
53-
> Requires a metadata file located at `<trace>.metadata.yaml`.
98+
All configurations are stored in the `config.yaml` file, which is created in the current working directory when the `setup` command is run for the first time.
5499
55-
> For every upload, a unqiue [trace id](#trace-id) will be generated
100+
Check out the [Creating and using second storage source](#creating-and-using-second-storage-source) section for more details on how to create and use a second storage source.
56101
57-
---
102+
### Upload Command
58103
59-
### `search`
104+
The `upload` command uploads a trace and its workload. It takes a trace file and, if the workload is not yet present in the storage, the workload file. The tool also expects a metadata file, which is a YAML file named `<trace>.metadata.yaml`, where `<trace>` is the name of the trace file. Multiple traces can be uploaded at once, as long as they are from the same trace attempt.
60105

61-
Search can be used to search for the given regex term in the list of available traces and metadata matches
106+
The `upload` command options are:
62107

63-
```bash
64-
$ python trace_archive.py search --help
65-
Usage: python trace_archive.py search [OPTIONS] [REGEX]
108+
* `--workload`: Path to the workload file.
109+
* `--trace`: Path to the trace file.
110+
* `--it`: Interactive file selection mode. If this option is used, the tool will prompt the user to select the workload and trace files.
66111

67-
Search for traces and metadata using a regular expression.
112+
For every upload, a unique [`trace-id`](#trace-id) will be generated and filled into the metadata file.
68113

69-
Arguments:
70-
REGEX Regex expression to search with.
114+
### List Command
71115

72-
Options:
73-
--names-only Search only by trace id (ignore metadata).
74-
```
116+
The `list` command is used to list the available traces or workloads in the archive.
75117

76-
---
118+
### Get Command
77119

78-
### `list`
120+
The `get` command is used to download a specified trace, workload or metadata file from the archive. The command options are:
79121

80-
```bash
81-
$ python trace_archive.py list --help
82-
Usage: python trace_archive.py list [OPTIONS]
122+
* `--trace`: Id of the trace to download.
123+
* `--workload`: Id of the workload to download.
124+
* `--metadata`: Id of the metadata (same as the trace id) to download.
125+
* `-o, --output`: Output file path. If not specified, the file will be downloaded to the current working directory.
83126

84-
List database traces or related entities.
127+
## Examples
85128

86-
Options:
87-
--traces Lists available traces (default)
88-
--workloads Lists available workloads
129+
Assuming the trace archive tool is set up with a local storage type named `local`, you can use the following commands:
130+
131+
### Uploading a Trace
132+
133+
To upload a trace file named `dhrystone.zstf` and its workload `dhrystone`, present in the metadata example folder, you can run:
134+
135+
```bash
136+
$ python trace_archive.py --storage-name local upload --workload ../../stf_metadata/example/dhrystone --trace ../../stf_metadata/example/dhrystone.zstf
137+
138+
Uploading workload: ../../stf_metadata/example/dhrystone with id: 0
139+
Uploading trace: ../../stf_metadata/example/dhrystone.zstf with id: 0.0.0000_dhrystone
89140
```
90141

91-
---
142+
### Downloading a Trace
143+
144+
To download the trace `0.0.0000_dhrystone` and its metadata, you can run:
145+
146+
```bash
147+
$ python trace_archive.py get --trace 0.0.0000_dhrystone
92148
93-
### `get`
149+
Trace 0.0.0000_dhrystone saved on ./0.0.0000_dhrystone.zstf
150+
Metadata 0.0.0000_dhrystone saved on ./0.0.0000_dhrystone.zstf.metadata.yaml
151+
```
152+
153+
### Downloading a Workload
94154

95-
Downloads a specified trace file.
155+
To download the workload `0` (dhrystone), you can run:
96156

97157
```bash
98-
$ python trace_archive.py get --help
99-
Usage: python trace_archive.py get [OPTIONS] TRACE
158+
$ python trace_archive.py get --workload 0
100159
101-
Download a specified trace file.
160+
Workload 0 saved on ./dhrystone
161+
```
102162

103-
Arguments:
104-
TRACE Name of the trace to download.
163+
### Creating and using second storage source
105164

106-
Options:
107-
--revision Revision number. If not specified, the latest revision is used.
108-
--company Filter by associated company.
109-
--author Filter by author.
110-
-o, --output Output file path.
165+
To create a second storage source you can run the `setup` command with the `--add-storage` option:
166+
167+
```bash
168+
$ python trace_archive.py setup --add-storage
169+
Creating a new storage source.
170+
Registred storage type options: local-storage
171+
Select your storage type: local-storage
172+
Enter your storage name: private-storage
173+
Enter the storage folder path: ./private
174+
```
175+
176+
This will create a new storage source named `private-storage` with the path `./private`. You can then use this storage source in the `upload` command by specifying the `--storage-name` option:
177+
178+
```bash
179+
$ python trace_archive.py --storage-name private-storage upload --workload ../../stf_metadata/example/dhrystone --trace ../../stf_metadata/example/dhrystone.zstf
111180
```
112181

113182
## Trace ID
@@ -135,13 +204,13 @@ Where:
135204

136205
| Upload # | Description | Trace ID |
137206
| -------- | --------------------------------------------------------- | ----------------------- |
138-
| 1st | `dhrystone` compiled with `-O3`, fully traced | `000.000.000_dhrystone` |
139-
| 2nd | `dhrystone` `-O3`, traced from instruction 0 to 1,000,000 | `000.001.000_dhrystone` |
140-
| 3rd | `dhrystone` `-O3`, traced from 1,000,000 to 2,000,000 | `000.001.001_dhrystone` |
141-
| 4th | `dhrystone` `-O3`, traced from 2,000,000 to 3,000,000 | `000.001.002_dhrystone` |
142-
| 5th | Same trace as 1st (re-uploaded) | `000.002.000_dhrystone` |
143-
| 6th | `dhrystone` compiled with `-O2`, fully traced | `001.000.000_dhrystone` |
144-
| 7th | `embench` compiled with `-O3`, fully traced | `002.000.000_embench` |
207+
| 1st | `dhrystone` compiled with `-O3`, fully traced | `0.0.0000_dhrystone` |
208+
| 2nd | `dhrystone` `-O3`, traced from instruction 0 to 1,000,000 | `0.1.0000_dhrystone` |
209+
| 3rd | `dhrystone` `-O3`, traced from 1,000,000 to 2,000,000 | `0.1.0001_dhrystone` |
210+
| 4th | `dhrystone` `-O3`, traced from 2,000,000 to 3,000,000 | `0.1.0002_dhrystone` |
211+
| 5th | Same trace as 1st (re-uploaded) | `0.2.0000_dhrystone` |
212+
| 6th | `dhrystone` compiled with `-O2`, fully traced | `1.0.0000_dhrystone` |
213+
| 7th | `embench` compiled with `-O3`, fully traced | `2.0.0000_embench` |
145214

146215
---
147216

@@ -152,33 +221,33 @@ For the trace archive structure, each workload is stored in its own folder, iden
152221
The tree graph below illustrates a setup of the [Trace Id Example](#example-trace-ids):
153222

154223
```text
155-
000/
224+
0000_dhrystone/
156225
├── dhrystone
157226
├── dhrystone.objdump
158227
├── dhrystone.stdout
159-
├── 000/
160-
│ ├── 000.000.000_dhrystone.zstf
161-
│ └── 000.000.000_dhrystone.zstf.metadata.yaml
162-
├── 001/
163-
│ ├── 000.001.000_dhrystone.zstf
164-
│ ├── 000.001.000_dhrystone.zstf.metadata.yaml
165-
│ ├── 000.001.001_dhrystone.zstf
166-
│ ├── 000.001.001_dhrystone.zstf.metadata.yaml
167-
│ ├── 000.001.002_dhrystone.zstf
168-
│ └── 000.001.002_dhrystone.zstf.metadata.yaml
169-
001/
228+
├── attempt_0000/
229+
│ ├── 0.0.0000_dhrystone.zstf
230+
│ └── 0.0.0000_dhrystone.zstf.metadata.yaml
231+
├── attempt_0001/
232+
│ ├── 0.1.0000_dhrystone.zstf
233+
│ ├── 0.1.0000_dhrystone.zstf.metadata.yaml
234+
│ ├── 0.1.0001_dhrystone.zstf
235+
│ ├── 0.1.0001_dhrystone.zstf.metadata.yaml
236+
│ ├── 0.1.0002_dhrystone.zstf
237+
│ └── 0.1.0002_dhrystone.zstf.metadata.yaml
238+
0001_dhrystone/
170239
├── dhrystone
171240
├── dhrystone.objdump
172241
├── dhrystone.stdout
173-
└── 000/
174-
├── 001.000.000_dhrystone.zstf
175-
└── 001.000.000_dhrystone.zstf.metadata.yaml
176-
002/
242+
└── attempt_0000/
243+
    ├── 1.0.0000_dhrystone.zstf
244+
    └── 1.0.0000_dhrystone.zstf.metadata.yaml
245+
0002_embench/
177246
├── embench.zip
178247
├── embench.objdump
179248
├── embench.stdout
180-
└── 000/
181-
├── 002.000.000_embench.zstf
182-
└── 002.000.000_embench.zstf.metadata.yaml
249+
└── attempt_0000/
250+
    ├── 2.0.0000_embench.zstf
251+
    └── 2.0.0000_embench.zstf.metadata.yaml
183252
184253
```
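The table and tree above suggest that a trace ID has the shape `<workload>.<attempt>.<segment>_<name>`, with the last number zero-padded to four digits, and that traces live under `<workload, 4 digits>_<name>/attempt_<attempt, 4 digits>/`. The sketch below assumes that reading. `TraceId`, `parse_trace_id`, `storage_dir`, and the field names are hypothetical and are not the tool's actual implementation.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath

PAD = 4  # assumed padding width, matching the four-digit fields in the examples

_TRACE_ID = re.compile(r"^(\d+)\.(\d+)\.(\d+)_(.+)$")


@dataclass(frozen=True)
class TraceId:
    workload: int
    attempt: int
    segment: int
    name: str

    def __str__(self) -> str:
        # Only the last numeric part is zero-padded in the README examples.
        return f"{self.workload}.{self.attempt}.{self.segment:0{PAD}d}_{self.name}"

    def storage_dir(self) -> PurePosixPath:
        # Folder layout as read from the tree: <workload>_<name>/attempt_<attempt>/
        workload_dir = f"{self.workload:0{PAD}d}_{self.name}"
        attempt_dir = f"attempt_{self.attempt:0{PAD}d}"
        return PurePosixPath(workload_dir) / attempt_dir


def parse_trace_id(text: str) -> TraceId:
    match = _TRACE_ID.match(text)
    if match is None:
        raise ValueError(f"Not a trace id: {text!r}")
    workload, attempt, segment, name = match.groups()
    return TraceId(int(workload), int(attempt), int(segment), name)


trace = parse_trace_id("0.1.0002_dhrystone")
print(trace)                # 0.1.0002_dhrystone
print(trace.storage_dir())  # 0000_dhrystone/attempt_0001
```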
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
from dataclasses import dataclass
2+
from typing import Dict, Optional, Type, Union, List
3+
4+
5+
@dataclass
6+
class LocalStorageConfig:
7+
    path: str
8+
9+
10+
CONFIG_TYPE_MAP: Dict[str, Type] = {
11+
    "local-storage": LocalStorageConfig,
12+
}
13+
14+
15+
@dataclass
16+
class StorageConfig:
17+
    type: str
18+
    name: str
19+
    config: Union[LocalStorageConfig]
20+
21+
    @staticmethod
22+
    def from_dict(data: dict):
23+
        specific_config_type = data['type']
24+
        if specific_config_type not in CONFIG_TYPE_MAP:
25+
            raise ValueError(f"Unknown storage type: {specific_config_type}")
26+
27+
        specific_config_class = CONFIG_TYPE_MAP.get(specific_config_type)
28+
        specific_config = specific_config_class(**data['config'])
29+
        return StorageConfig(type=data['type'], name=data['name'], config=specific_config)
30+
31+
32+
@dataclass
33+
class Config:
34+
    storages: List[StorageConfig]
35+
    default_storage: Optional[str]
36+
37+
    @staticmethod
38+
    def from_dict(data: dict):
39+
        if not data:
40+
            return None
41+
42+
        storages = []
43+
        if 'storages' in data:
44+
            storages = [StorageConfig.from_dict(s) for s in data['storages']]
45+
46+
        return Config(storages=storages, default_storage=data.get('default_storage'))
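A minimal sketch of how `Config.from_dict` might be fed, assuming `config.yaml` parses to a dictionary with the `storages` and `default_storage` keys that the code above reads. The dataclasses are repeated in condensed form so the snippet runs by itself. The sample values reuse names from the README examples, and the exact layout of `config.yaml` is an assumption, since the file itself is not shown in this commit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type


@dataclass
class LocalStorageConfig:
    path: str


CONFIG_TYPE_MAP: Dict[str, Type] = {"local-storage": LocalStorageConfig}


@dataclass
class StorageConfig:
    type: str
    name: str
    config: LocalStorageConfig

    @staticmethod
    def from_dict(data: dict) -> "StorageConfig":
        if data["type"] not in CONFIG_TYPE_MAP:
            raise ValueError(f"Unknown storage type: {data['type']}")
        specific = CONFIG_TYPE_MAP[data["type"]](**data["config"])
        return StorageConfig(type=data["type"], name=data["name"], config=specific)


@dataclass
class Config:
    storages: List[StorageConfig]
    default_storage: Optional[str]

    @staticmethod
    def from_dict(data: dict) -> Optional["Config"]:
        if not data:
            return None
        storages = [StorageConfig.from_dict(s) for s in data.get("storages", [])]
        return Config(storages=storages, default_storage=data.get("default_storage"))


# Assumed shape of a parsed config.yaml with two local storage sources.
raw = {
    "default_storage": "local",
    "storages": [
        {"type": "local-storage", "name": "local", "config": {"path": "/home/user/trace_archive"}},
        {"type": "local-storage", "name": "private-storage", "config": {"path": "./private"}},
    ],
}

config = Config.from_dict(raw)
print(config.default_storage)             # local
print([s.name for s in config.storages])  # ['local', 'private-storage']
print(Config.from_dict({}))               # None
```

Because an empty or missing configuration yields `None` rather than an empty `Config`, callers need to handle that case, for example by prompting the user to run `setup` first.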
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
from dataclasses import dataclass
2+
3+
4+
@dataclass(frozen=True)
5+
class Const():
6+
    PAD_LENGHT = 4
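`PAD_LENGHT` has no type annotation, so the dataclass decorator does not treat it as a field; it remains a plain class attribute. The sketch below copies the class and adds a hypothetical `pad` helper (not present in the commit) to show how such a constant could be used for zero-padding, and what `frozen=True` does for a class like this.

```python
from dataclasses import FrozenInstanceError, dataclass, fields


@dataclass(frozen=True)
class Const:
    PAD_LENGHT = 4  # spelling kept as in the commit


def pad(number: int) -> str:
    """Hypothetical helper: zero-pad a number to Const.PAD_LENGHT digits."""
    return str(number).zfill(Const.PAD_LENGHT)


print(pad(2))         # 0002
print(fields(Const))  # ()

try:
    Const().PAD_LENGHT = 8  # assignment on an instance is rejected
except FrozenInstanceError as error:
    print(f"blocked: {error}")
```

Note that `frozen=True` only guards instances. Rebinding `Const.PAD_LENGHT` on the class itself is not blocked, so the constant is protected by convention rather than by the decorator.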
