An interactive visualization tool for [TaskVine](https://github.com/cooperative-computing-lab/cctools), a task scheduler for large workflows to run efficiently on HPC clusters. This tool helps you analyze task execution patterns, file transfers, resource utilization, storage consumption, and other key metrics.

-## Quick Install
+## Installation
+
+### For Users (Recommended)
+
+Install directly from PyPI:
+
+```bash
+pip install taskvine-report-tool
+```
+
+After installation, you can use the commands `vine_parse` and `vine_report` directly from anywhere.
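To confirm that both commands ended up on your `PATH` after installation, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper name `find_commands` is hypothetical and is not part of the tool.

```python
import shutil


def find_commands(names):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The two command names come from the README; everything else is illustrative.
    for name, path in find_commands(["vine_parse", "vine_report"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a command is reported as missing, the Python scripts directory used by `pip` is probably not on your `PATH`.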
+
+### For Developers
+
+If you want to contribute to development or modify the source code:
- 🔍 `vine_parse` - Parse TaskVine logs and generate analysis data
 - 🌐 `vine_report` - Start web visualization server

-Follow these steps to use the visualization tool:
+### Command Reference

-### 1. Prepare Log Files
+#### `vine_parse` - Parse TaskVine Logs

-After running your TaskVine workflow, you'll find the logs in a directory named with a timestamp (e.g., `2025-05-20T110437`) or your specified workflow name. The default structure looks like this:
+**Required Parameters:**
+- `--templates`: List of log directory names/patterns (required)

-```
-workflow_name/
-└── vine-logs/
-    ├── debug
-    ├── performance
-    ├── taskgraph
-    ├── transactions
-    └── workflow.json
-```
+**Optional Parameters:**
+- `--logs-dir`: Base directory containing log folders (default: current directory)

-To use these logs with the visualization tool:
+**Usage Examples:**

-1. Copy the entire workflow directory to a logs directory:
 ```bash
-mkdir -p logs
-cp -r /path/to/workflow_name logs/
-```
+# Basic usage - parse specific log directories (--templates is required)
+vine_parse --templates experiment1 experiment2

-2. Parse the logs and generate visualization data:
- If no `--logs-dir` is specified, uses current working directory
+- The `--templates` parameter is **required** - the command will fail without it
+- Patterns support shell glob expansion (*, ?, [])
+- Automatically filters out directories that don't contain `vine-logs` subdirectory
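The notes above describe two behaviors: glob patterns are expanded, and directories without a `vine-logs` subdirectory are dropped. The following is a minimal sketch of that selection logic, assuming patterns are resolved relative to the logs directory. The function name `expand_templates` is hypothetical, and this is not the tool's actual implementation.

```python
import glob
import os
import tempfile


def expand_templates(templates, logs_dir="."):
    """Expand glob patterns under logs_dir, keeping only directories
    that contain a vine-logs subdirectory."""
    matches = []
    for pattern in templates:
        for path in sorted(glob.glob(os.path.join(logs_dir, pattern))):
            has_logs = os.path.isdir(os.path.join(path, "vine-logs"))
            if has_logs and path not in matches:
                matches.append(path)
    return matches


if __name__ == "__main__":
    # Build a throwaway layout to show the filtering.
    with tempfile.TemporaryDirectory() as base:
        os.makedirs(os.path.join(base, "experiment1", "vine-logs"))
        os.makedirs(os.path.join(base, "experiment2", "vine-logs"))
        os.makedirs(os.path.join(base, "experiment3"))  # no vine-logs: skipped
        found = expand_templates(["experiment*"], logs_dir=base)
        print([os.path.basename(p) for p in found])
```

Running a check like this before `vine_parse --templates ...` can show which directories a pattern would actually select.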
+
+#### `vine_report` - Start Web Server
+
+**All Parameters are Optional:**
+
+- `--logs-dir`: Directory containing log folders (default: current directory)
+- `--port`: Port number for the web server (default: 9122)
+- `--host`: Host address to bind to (default: 0.0.0.0)
+
+**Usage Examples:**
+
 ```bash
+# Basic usage - start server with all defaults
 vine_report
-```

-4. View the report in your browser at `http://localhost:9122`
+# Specify custom port and logs directory
+vine_report --port 8080 --logs-dir /path/to/logs

-Note: In the web interface, you'll only see log collections that have been successfully processed by `vine_parse`. You can process multiple log collections at once:
+# Bind to specific host (restrict access)
+vine_report --host 127.0.0.1 --port 9122

-```bash
-vine_parse logs/log1 logs/log2 logs/log3
+# Allow remote access (default behavior)
+vine_report --host 0.0.0.0 --port 9122
 ```

-### 2. Command Reference
+**Default Behavior:**
+- Uses current working directory as logs directory
+- Starts server on port 9122
+- Binds to all interfaces (0.0.0.0) allowing remote access
+- Displays all available IP addresses where the server can be accessed
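Because the default binding is `0.0.0.0` on port 9122, it can be useful to verify from another shell, or another machine, that the server is actually reachable. Here is a minimal sketch using only the standard `socket` module; the helper name `is_port_open` is hypothetical, and the host and port defaults are taken from the text above.

```python
import socket


def is_port_open(host="127.0.0.1", port=9122, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("vine_report reachable:", is_port_open())
```

If you started the server with `--host 127.0.0.1`, the check should succeed locally but fail from other machines.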

-#### `vine_parse` - Parse TaskVine Logs
+## Quick Start
+
+Follow these steps to use the visualization tool:
+
+### 1. Navigate to Your Log Directory
+
+After running your TaskVine workflow, the logs are automatically saved in the `vine-run-info` directory within your workflow's working directory. Navigate to this directory:
Instead of manually copying logs, you can configure TaskVine to generate logs directly in the correct location. When creating your TaskVine manager, set these parameters:
+By default, TaskVine creates a `vine-run-info` directory in your working directory. You can customize this location when creating your TaskVine manager:

 ```python
 manager = vine.Manager(
@@ -124,62 +186,57 @@ This will automatically create the correct directory structure:

 After your workflow completes, simply:
 1. Navigate to your analysis directory: `cd ~/my_analysis_directory`
-2. Parse the logs: `vine_parse your_workflow_name`
+2. Parse the logs: `vine_parse --templates your_workflow_name`
 3. Start the server: `vine_report`
 4. View at `http://localhost:9122`

-### 4. Multiple Log Collections
-
-You can have multiple log collections. For example:
+### 4. Generated Data Structure

+After parsing, each experiment will have multiple generated directories:
 ```
-logs/
-├── experiment1/
-│   └── vine-logs/
-├── large_workflow/
-│   └── vine-logs/
-└── test_run/
-    └── vine-logs/
-```
-
-Parse all of them at once:
-```bash
-vine_parse experiment1 large_workflow test_run
+vine-run-info/
+└── experiment1/
+    ├── vine-logs/ # Original log files
+    │   ├── debug
+    │   ├── performance
+    │   ├── taskgraph
+    │   ├── transactions
+    │   └── workflow.json
+    ├── pkl-files/ # Raw parsed data (generated by vine_parse)
+    │   ├── manager.pkl # Manager information
+    │   ├── workers.pkl # Worker statistics
+    │   ├── tasks.pkl # Task execution details
+    │   ├── files.pkl # File transfer information
+    │   └── subgraphs.pkl # Task dependency graphs
+    ├── csv-files/ # Visualization-ready data (generated from pkl-files)
+    │   ├── task_concurrency.csv
+    │   ├── worker_lifetime.csv
+    │   ├── file_transfers.csv
+    │   └── ... # Various CSV files for different charts
+    └── svg-files/ # Cached graph visualizations
+        ├── task_subgraphs_1.svg
+        ├── task_dependencies_graph.svg
+        └── ... # Cached SVG files for complex graphs
 ```

-Or use the --all option:
-```bash
-vine_parse --all
-```
+**Directory Breakdown:**

-### 5. Complete Workflow Example
+- **`pkl-files/`**: Contains the raw parsed data extracted directly from log files. These are Python pickle files containing structured data about workers, tasks, files, and other workflow components. This is the primary output of `vine_parse`.
- **`csv-files/`**: Contains visualization-ready data files generated from the pkl-files. The web frontend uses these CSV files as the data source for all charts and graphs. Each CSV file corresponds to a specific visualization module.

-# 2. Start the web server
-vine_report --logs-dir ~/my_logs --port 9122
+- **`svg-files/`**: Contains cached SVG files for complex graph visualizations (such as task dependency graphs and subgraphs). Since building these graphs is computationally expensive and time-consuming, we cache the generated SVG files to avoid rebuilding them on subsequent loads.

-# 3. Open browser to http://localhost:9122
-```
+**For Developers:**

-### 6. Generated Data Structure
+If you want to work with the raw data programmatically, you can load the pkl files into memory using the `restore_pkl_files()` function. The data structures are defined in the following files:
+- `data_parser.py` - Main data parsing logic and file restoration
+- `task.py` - Task data structure and methods
+- `worker.py` - Worker data structure and methods
+- `file.py` - File data structure and methods
+- `manager.py` - Manager data structure and methods

-After parsing, each log collection will have a `pkl-files` directory:
-```
-logs/
-└── experiment1/
-    ├── vine-logs/
-    │   ├── debug
-    │   └── transactions
-    └── pkl-files/ # Generated by vine_parse
-        ├── manager.pkl # Manager information
-        ├── workers.pkl # Worker statistics
-        ├── tasks.pkl # Task execution details
-        ├── files.pkl # File transfer information
-        └── subgraphs.pkl # Task dependency graphs
-```
+This allows you to build custom visualizations based on the original parsed data. You can also customize the CSV generation logic by editing the `generate_csv_files()` function to create your own visualization-ready data formats.
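The diff names `restore_pkl_files()` but does not show its signature, so the sketch below does not call it. Instead it reads the pickle files directly with the standard `pickle` module, assuming the file names shown in the generated directory tree. The helper `load_pkl_files` and the example path are hypothetical. Unpickling the real files most likely requires the tool's classes (`task.py`, `worker.py`, and so on) to be importable, and pickle files should only be loaded from sources you trust.

```python
import os
import pickle


def load_pkl_files(pkl_dir, names=("manager", "workers", "tasks", "files", "subgraphs")):
    """Load each <name>.pkl found in pkl_dir; missing files are skipped."""
    data = {}
    for name in names:
        path = os.path.join(pkl_dir, f"{name}.pkl")
        if os.path.isfile(path):
            with open(path, "rb") as fh:
                data[name] = pickle.load(fh)
    return data


if __name__ == "__main__":
    # Hypothetical location; adjust to your own experiment directory.
    loaded = load_pkl_files(os.path.join("vine-run-info", "experiment1", "pkl-files"))
    for name, obj in loaded.items():
        print(name, type(obj).__name__)
```

Printing the type of each loaded object is a low-risk first step before building a custom visualization on top of the parsed data.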