Commit 1dc2f4b: feat: write docs on producer (parent fe1f88e)

docs/ARCHITECTURE.md (104 additions, 0 deletions)
# High-level architecture

OpenDT runs two concurrent threads that communicate via Kafka:

## 1. Producer Loop (Workload Replay)

The producer streams workload traces to Kafka, simulating real-time datacenter activity.
### Data Sources

- **tasks.parquet** (7,850 tasks)
  - Columns: `id`, `submission_time`, `duration`, `cpu_count`, `cpu_capacity`, `mem_capacity`
  - Tasks have timestamps indicating when they were submitted
- **fragments.parquet** (2.3M fragments)
  - Columns: `id` (task_id), `duration`, `cpu_count`, `cpu_usage`
  - Fragments have NO submission times in the raw data
  - Each task is composed of ≥1 fragments for fine-grained resource modeling

The SURF workload trace captures approximately **7 days of wall-clock time**.
### Key Transformations

**Fragment Timestamp Synthesis:**
Fragments inherit their parent task's submission time, offset by the cumulative durations of all fragments up to and including that fragment. The synthesized timestamp is stored in the `submission_time` key of each fragment message:

```
Task starts at T=0
Fragment 1.submission_time = T=0 + duration[0]
Fragment 2.submission_time = T=0 + duration[0] + duration[1]
Fragment 3.submission_time = T=0 + duration[0] + duration[1] + duration[2]
...
```

This creates a sequential execution timeline where fragments "chain" together.
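The chaining rule above can be sketched in Python (a minimal sketch; the function name and data shapes are illustrative, not the actual producer code):

```python
from datetime import datetime, timedelta
from itertools import accumulate

def synthesize_submission_times(task_start: datetime,
                                fragment_durations_ms: list[int]) -> list[datetime]:
    """Offset the parent task's submission time by the cumulative
    durations of fragments up to and including each fragment."""
    return [task_start + timedelta(milliseconds=offset)
            for offset in accumulate(fragment_durations_ms)]
```

For a task submitted at T=0 with fragment durations `[2000, 2000]`, this yields submission times at T+2000 ms and T+4000 ms, matching the pattern above.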
**Time-Scaled Replay:**
- Original trace: ~7 days of workload data
- Replay speed: 10x accelerated (`TIME_SCALE = 0.1`)
- Real-time gaps between events are preserved but compressed
- Example: a 1-hour workload gap → a 6-minute real wait
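The gap compression can be expressed as a small helper (`TIME_SCALE` is the constant named above, but the function itself is illustrative):

```python
TIME_SCALE = 0.1  # 10x accelerated replay

def real_wait_seconds(prev_event_ms: int, next_event_ms: int) -> float:
    """Compress a workload-time gap (ms) into a real-time wait (s).
    Gaps keep their relative order and spacing, just scaled down."""
    gap_ms = next_event_ms - prev_event_ms
    return (gap_ms / 1000.0) * TIME_SCALE
```

A 1-hour gap (3,600,000 ms) maps to a 360-second (6-minute) wait, as in the example above; the replay loop would `time.sleep()` for this amount between sends.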
### Workload Semantics

**Task-Fragment Relationship:**
In the AtLarge traces, tasks are _always_ split into equal-duration fragments. For example, a task with `duration=10000ms` can be split into 5 fragments of `duration=2000ms` each.
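The equal split could look like this (illustrative helper, not producer code):

```python
def split_into_fragments(task_duration_ms: int, n_fragments: int) -> list[int]:
    """AtLarge traces split each task into equal-duration fragments."""
    fragment_ms, remainder = divmod(task_duration_ms, n_fragments)
    assert remainder == 0, "traces always split tasks evenly"
    return [fragment_ms] * n_fragments
```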
**Computing Clock Cycles:**
Fragments specify the total number of clock cycles needed, calculated as:
```
total_cycles = duration (ms) × cpu_usage (MHz) × 1,000
```
The factor of 1,000 follows from the units: 1 MHz is 10⁶ cycles/second and duration is in milliseconds, so `(duration / 1000) s × (cpu_usage × 10⁶) cycles/s = duration × cpu_usage × 1,000 cycles`.
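In code, the unit conversion reads (illustrative helper):

```python
def total_cycles(duration_ms: int, cpu_usage_mhz: float) -> float:
    """(duration_ms / 1000) s * (cpu_usage_mhz * 1e6) cycles/s
    simplifies to duration_ms * cpu_usage_mhz * 1000."""
    return duration_ms * cpu_usage_mhz * 1_000
```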
**Simulation Behavior:**
The `duration` field represents the actual measured runtime of the original workload. However, when simulating on a different topology, the execution time is recomputed from the fragment's cycle count and the topology's CPU specifications.

**Example:**
A fragment with `cpu_count=1`, `cpu_usage=1000 MHz`, `duration=5000 ms`:
- Total cycles needed: `5000 × 1000 × 1,000 = 5,000,000,000 cycles` (5 billion)

When simulated on different hardware:
- **2 CPUs @ 1000 MHz each:** total throughput = 2,000 MHz → `duration = 5,000,000,000 / (2000 × 1,000,000) = 2500 ms`
- **1 CPU @ 5000 MHz:** total throughput = 5,000 MHz → `duration = 5,000,000,000 / (5000 × 1,000,000) = 1000 ms`

This is a simplification of reality, but it makes the simulation's behavior easier to reason about.
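The rescaling in the example can be written as follows (a sketch; the function and parameter names are assumptions, not the simulator's API):

```python
def simulated_duration_ms(total_cycles: float,
                          cpu_count: int,
                          cpu_speed_mhz: float) -> float:
    """Execution time = cycles / aggregate throughput, converted to ms."""
    throughput_cycles_per_s = cpu_count * cpu_speed_mhz * 1_000_000
    return total_cycles / throughput_cycles_per_s * 1000
```

Plugging in the example's 5 billion cycles reproduces the 2500 ms and 1000 ms figures.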
### Output to Kafka

**Topic: "tasks"**
```json
{
  "key": {"id": "task_id"},
  "value": {
    "submission_time": "2022-10-06T22:00:00Z",
    "duration": 27935000,
    "cpu_count": 16,
    "cpu_capacity": 33600.0,
    "mem_capacity": 100000
  }
}
```

**Topic: "fragments"**
```json
{
  "key": {"id": "task_id"},
  "value": {
    "submission_time": "2022-10-06T22:00:30Z",
    "duration": 30000,
    "cpu_usage": 1953.0
  }
}
```
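Building a message in the "tasks" shape could be sketched as follows (the helper and JSON-bytes serialization are illustrative; the actual producer's serializers may differ):

```python
import json

def task_message(task_id: str, submission_time: str, duration: int,
                 cpu_count: int, cpu_capacity: float,
                 mem_capacity: int) -> tuple[bytes, bytes]:
    """Build the (key, value) byte pair for the "tasks" topic."""
    key = json.dumps({"id": task_id}).encode()
    value = json.dumps({
        "submission_time": submission_time,
        "duration": duration,
        "cpu_count": cpu_count,
        "cpu_capacity": cpu_capacity,
        "mem_capacity": mem_capacity,
    }).encode()
    return key, value
```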
### Implementation

- Two parallel threads: one for tasks, one for fragments
- Both synchronized via a barrier so they start simultaneously
- Pacing via `sleep()` between messages to maintain the temporal structure
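The barrier start can be sketched with Python's `threading.Barrier` (a minimal stand-in; lists replace the Kafka producer, and the names are illustrative):

```python
import threading

start_barrier = threading.Barrier(2)  # one party per replay thread

def replay(topic: str, events: list[str], out: list[str]) -> None:
    start_barrier.wait()  # both threads block here, then start together
    for event in events:
        out.append(f"{topic}:{event}")  # stand-in for producer.send(...)

tasks_out: list[str] = []
fragments_out: list[str] = []
threads = [
    threading.Thread(target=replay, args=("tasks", ["t1"], tasks_out)),
    threading.Thread(target=replay, args=("fragments", ["f1"], fragments_out)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```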
---
## 2. Consumer Loop (Digital Twin)

_[To be documented]_
