Commit 942e622

update readme
1 parent 2629c74 commit 942e622

1 file changed, +19 -19 lines changed

README.md

Lines changed: 19 additions & 19 deletions
@@ -149,47 +149,47 @@ The tool provides various visualization modules to analyze different aspects of
 ![Example](imgs/example_task_execution_details.png)
 - **Task Concurrency**: Visualizes task states over time from the manager's perspective, tracking five distinct states: waiting (committed but not dispatched), committing (dispatched but not yet executed), executing (currently running on workers), waiting retrieval (completed with outputs pending retrieval), and done (fully completed, whether succeeded or failed).
 ![Example](imgs/example_task_concurrency.png)
-- **Task Response Time**: Distribution of time between task submission and start of execution
+- **Task Response Time**: Measures the duration between task commitment to the manager and its dispatch to a worker. High response times may indicate task queue congestion or scheduler inefficiencies when available cores are significantly outnumbered by waiting tasks.
 ![Example](imgs/example_task_response_time.png)
-- **Task Execution Time**: Distribution of actual task execution durations
+- **Task Execution Time**: Displays the actual runtime duration of each task, providing insights into computational performance and resource utilization.
 ![Example](imgs/example_task_execution_time.png)
-- **Task Retrieval Time**: Time taken to retrieve task dependencies
+- **Task Retrieval Time**: Tracks the time required to retrieve task outputs, beginning when a task completes and sends its completion message to the manager. This phase ends when outputs are successfully retrieved or an error is identified.
 ![Example](imgs/example_task_retrieval_time.png)
-- **Task Completion Percentiles**: Distribution of task completion times
+- **Task Completion Percentiles**: Shows the time required to complete specific percentages of the total workflow. For instance, the 10th percentile indicates the time needed to complete the first 10% of all tasks.
 ![Example](imgs/example_task_completion_percentiles.png)
-- **Task Dependencies**: Visualization of task dependency relationships
+- **Task Dependencies**: Visualizes the number of parent tasks for each task. A task can only execute after all its parent tasks have completed and their outputs have been successfully retrieved by the manager.
 ![Example](imgs/example_task_dependencies.png)
-- **Task Dependents**: Shows which tasks depend on each task
+- **Task Dependents**: Shows the number of child tasks that depend on each task's outputs as their inputs.
 ![Example](imgs/example_task_dependents.png)
-- **Task Subgraphs**: Displays task dependency subgraphs
+- **Task Subgraphs**: Displays the workflow's independent Directed Acyclic Graphs (DAGs), where each subgraph represents a set of tasks connected by input-output file dependencies.
 ![Example](imgs/example_task_subgraphs.png)
 
 ### Worker Analysis
-- **Worker Storage Consumption**: Storage usage patterns across workers
+- **Worker Storage Consumption**: Monitors the actual storage usage of each worker over time, specifically tracking worker cache consumption. Note that this metric excludes task-related sandboxes as they represent virtual resource allocation.
 ![Example](imgs/example_worker_storage_consumption.png)
-- **Worker Concurrency**: Number of concurrent tasks per worker
+- **Worker Concurrency**: Tracks the number of active workers over time, providing insights into cluster utilization and scalability.
 ![Example](imgs/example_worker_concurrency.png)
-- **Worker Incoming Transfers**: File transfer patterns to workers
+- **Worker Incoming Transfers**: Shows the number of file download requests received by each worker over time. These transfers occur when other workers need files from this worker or when the manager is retrieving task outputs.
 ![Example](imgs/example_worker_incoming_transfers.png)
-- **Worker Outgoing Transfers**: File transfer patterns from workers
+- **Worker Outgoing Transfers**: Displays the number of file download requests initiated by each worker over time, including transfers from the cloud, other workers, or the manager.
 ![Example](imgs/example_worker_outgoing_transfers.png)
-- **Worker Executing Tasks**: Tasks currently running on each worker
+- **Worker Executing Tasks**: Tracks the number of tasks actively running on each worker over time.
 ![Example](imgs/example_worker_executing_tasks.png)
-- **Worker Waiting Retrieval Tasks**: Tasks waiting for file retrieval
+- **Worker Waiting Retrieval Tasks**: Shows the number of completed tasks on each worker that are pending output retrieval.
 ![Example](imgs/example_worker_waiting_retrieval_tasks.png)
-- **Worker Lifetime**: Worker availability and uptime patterns
+- **Worker Lifetime**: Visualizes the active period of each worker throughout the workflow, accounting for varying connection times and potential crashes.
 ![Example](imgs/example_worker_lifetime.png)
 
 ### File Analysis
-- **File Sizes**: Distribution of file sizes in the workflow
+- **File Sizes**: Displays the size of each task-related file, including both input and output files.
 ![Example](imgs/example_file_sizes.png)
-- **File Concurrent Replicas**: Number of simultaneous file replicas
+- **File Concurrent Replicas**: Shows the maximum number of file replicas at any given time. Higher values indicate better redundancy and fault tolerance. Replication occurs automatically for temporary files when specified by the manager's `temp-replica-count` parameter, or naturally when workers fetch inputs from other workers.
 ![Example](imgs/example_file_concurrent_replicas.png)
-- **File Retention Time**: How long files are kept in the system
+- **File Retention Time**: Measures the duration between file creation and removal from the cluster. Longer retention times provide better redundancy but consume more disk space. This can be optimized through the manager's file pruning feature.
 ![Example](imgs/example_file_retention_time.png)
-- **File Transferred Size**: Tracks the cumulative size of data transferred between workers over time.
+- **File Transferred Size**: Tracks the cumulative size of data transferred between workers over time.
 ![Example](imgs/example_file_transferred_size.png)
-- **File Created Size**: Size of files created during execution
+- **File Created Size**: Shows the cumulative size of distinct files created during workflow execution.
 ![Example](imgs/example_file_created_size.png)
 
 ## Troubleshooting
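
As a reading aid for the updated **Task Response Time** and **Task Completion Percentiles** descriptions in this diff, here is a minimal Python sketch of how such values could be derived from per-task timestamps. The record layout (`committed`, `dispatched`, `done`) and both function names are hypothetical and chosen only for illustration; they are not the tool's actual data model or API.

```python
import math

# Hypothetical task records; timestamps are seconds since workflow start.
# Field names are assumptions for illustration, not the tool's schema.
tasks = [
    {"id": i, "committed": 0.0, "dispatched": 0.5 * i, "done": float(i)}
    for i in range(1, 11)
]


def response_times(tasks):
    """Duration between a task's commitment and its dispatch to a worker."""
    return [t["dispatched"] - t["committed"] for t in tasks]


def completion_percentile(tasks, pct):
    """Time by which the first `pct` percent of all tasks have completed."""
    done = sorted(t["done"] for t in tasks)
    # Number of tasks that make up pct% of the workflow (at least one).
    k = max(1, math.ceil(len(done) * pct / 100))
    return done[k - 1]


if __name__ == "__main__":
    print("max response time:", max(response_times(tasks)))
    for pct in (10, 50, 100):
        print(f"{pct}th percentile:", completion_percentile(tasks, pct))
```

Under this reading, the 10th percentile is simply the completion time of the task that closes out the first 10% of the workflow, which matches the example given in the new README text.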

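The updated **Task Dependencies**, **Task Dependents**, and **Task Subgraphs** entries describe parent counts, child counts, and independent graphs linked by input-output files. The sketch below shows one way these could be computed, assuming a hypothetical mapping from task IDs to the files each task reads and writes. The data layout and function names are illustrative assumptions, not taken from the tool.

```python
# Hypothetical workflow: each task lists the files it reads and writes.
tasks = {
    "t1": {"inputs": [], "outputs": ["a"]},
    "t2": {"inputs": ["a"], "outputs": ["b"]},
    "t3": {"inputs": ["a"], "outputs": ["c"]},
    "t4": {"inputs": [], "outputs": ["d"]},
    "t5": {"inputs": ["d"], "outputs": []},
}


def build_edges(tasks):
    """Return (parents, children) maps derived from file dependencies."""
    # Map each output file to the task that produces it.
    producer = {f: tid for tid, t in tasks.items() for f in t["outputs"]}
    parents = {tid: set() for tid in tasks}
    children = {tid: set() for tid in tasks}
    for tid, t in tasks.items():
        for f in t["inputs"]:
            if f in producer:
                parents[tid].add(producer[f])
                children[producer[f]].add(tid)
    return parents, children


def subgraphs(parents, children):
    """Group tasks into independent subgraphs (connected components)."""
    seen, components = set(), []
    for start in parents:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            # Follow edges in both directions to collect the whole component.
            stack.extend(parents[node] | children[node])
        seen |= comp
        components.append(comp)
    return components


if __name__ == "__main__":
    parents, children = build_edges(tasks)
    print({tid: len(p) for tid, p in parents.items()})   # parent counts
    print({tid: len(c) for tid, c in children.items()})  # child counts
    print(subgraphs(parents, children))
```

Treating the edges as undirected when grouping is a deliberate simplification: two tasks belong to the same subgraph whenever any chain of shared files connects them, regardless of direction.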