Commit 62c29e7

fixing an issue that came up when testing in arena.
1 parent 0729a79 commit 62c29e7

File tree

2 files changed: +104 −1 lines changed


SpiffWorkflow/bpmn/serializer/default/workflow.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -65,7 +65,7 @@ def from_dict(self, dct, workflow):
         task.internal_data = self.registry.restore(dct['internal_data'])

         delta = dct.get('delta')
-        if delta:
+        if delta and task.parent is not None:
             data = DeepMerge.merge({}, task.parent.data)
             data.update(self.registry.restore(delta.get('updates', {})))
             for key in delta.get('deletions', {}):
```
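The new guard matters because the workflow's root task has no parent whose data the delta could be merged against. A self-contained sketch of the guarded restore, using stand-in classes rather than SpiffWorkflow's own (`Task` and `restore_delta` here are hypothetical stand-ins, and a plain `dict` copy stands in for `DeepMerge.merge`):

```python
class Task:
    """Minimal stand-in for a workflow task with parent-linked data."""
    def __init__(self, parent=None, data=None):
        self.parent = parent
        self.data = data if data is not None else {}

def restore_delta(task, delta):
    # Skip delta restoration for the root task: it has no parent
    # whose data the recorded updates/deletions could apply to.
    if delta and task.parent is not None:
        data = dict(task.parent.data)          # stand-in for DeepMerge.merge
        data.update(delta.get('updates', {}))  # apply recorded changes
        for key in delta.get('deletions', []):
            data.pop(key, None)                # drop deleted keys
        task.data = data
```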
Lines changed: 103 additions & 0 deletions
# Periodic Serialization Performance Test Design

**Date:** 2026-03-03
**Status:** Approved
## Problem Statement

The `BpmnWorkflowSerializer.to_dict()` method walks the entire workflow task tree on every call. Current performance tests only measure serialization once, after workflow completion. We need to understand the cost of repeated serialization calls during workflow execution, which is critical for checkpoint/state-saving scenarios.
## Context

Recent commits show significant work on serialization optimization:

- Commit 26fe7390: Reduced serialized workflow size using delta storage for child tasks
- Commit 5a865caa: Simplified serialization change detection
- Commit 7ac8477b: Made engine steps non-recursive

Analysis of `SpiffWorkflow/bpmn/serializer/default/workflow.py` confirms that `WorkflowConverter.to_dict()` calls `mapping_to_dict(workflow.tasks)`, which iterates through all tasks and converts each one. The delta optimization reduces output size but doesn't avoid the tree traversal.
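Schematically, the full-tree conversion looks like the following (a simplified illustration, not the library's actual code; `serialize_all_tasks` and the `convert` callback are hypothetical):

```python
def serialize_all_tasks(tasks, convert):
    # Every call converts every task, so each serialization costs O(n)
    # in the task count; the delta optimization shrinks the *output*,
    # not this traversal.
    return {str(task_id): convert(task) for task_id, task in tasks.items()}
```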
## Design

### Test Method

Add one new test method to `tests/SpiffWorkflow/bpmn/test_performance_test.py`:

```python
test_performance_periodic_serialization_300_items()
```
### Execution Pattern

1. Create a workflow with 300 items using the existing `_create_workflow_with_item_count(300)` helper
2. Execute the workflow in batches of 10 engine steps using `workflow.complete_next(10)`
3. After each batch:
   - Measure serialization time: `serializer.to_dict(workflow)`
   - Record task count: `len(workflow.tasks)`
   - Accumulate metrics
4. Continue until `workflow.is_completed()` returns True
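The steps above can be sketched as a loop (an illustrative sketch, not the final test: `complete_next`, `to_dict`, and `is_completed` are the names used in this design, and the `perf_counter`-based timing approach is an assumption):

```python
import time

def run_with_periodic_serialization(workflow, serializer, batch_size=10):
    """Execute in batches, timing a full serialization after each batch."""
    checkpoints = []  # (steps_completed, task_count, serialization_seconds)
    steps = 0
    start = time.perf_counter()
    while not workflow.is_completed():
        workflow.complete_next(batch_size)   # run a batch of engine steps
        steps += batch_size
        t0 = time.perf_counter()
        serializer.to_dict(workflow)         # the operation under test
        checkpoints.append((steps, len(workflow.tasks),
                            time.perf_counter() - t0))
    execution_time = time.perf_counter() - start
    return execution_time, checkpoints
```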
### Metrics Collected

At each checkpoint:

- Individual serialization time
- Number of tasks in the workflow tree
- Step count

Summary metrics:

- Total execution time
- Total serialization time
- Number of serializations performed
- Average serialization time
- Serialization overhead percentage (total serialization time / execution time)
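The summary metrics follow directly from the per-checkpoint records; a small sketch (the `(steps, task_count, seconds)` checkpoint tuple shape is an assumption carried over from the loop described above):

```python
def summarize(execution_time, checkpoints):
    # checkpoints: list of (steps, task_count, serialization_seconds)
    total = sum(seconds for _, _, seconds in checkpoints)
    return {
        'total_serialization_time': total,
        'serializations': len(checkpoints),
        'average_serialization_time': total / len(checkpoints),
        'overhead_pct': 100.0 * total / execution_time,
    }
```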
### Output Format

```
PERIODIC SERIALIZATION TEST (performance_test.bpmn)
================================================================
300 items (serialize every 10 steps):
Execution time: X.XXXXXX seconds

Serialization checkpoints:
After 10 steps (N tasks): X.XXXXXX seconds
After 20 steps (N tasks): X.XXXXXX seconds
After 30 steps (N tasks): X.XXXXXX seconds
...

Total serialization time: X.XXXXXX seconds
Serialization overhead: XX.X% of execution time
Number of serializations: N
Average per serialization: X.XXXXXX seconds
================================================================
```
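One way a checkpoint line in the report above might be assembled (a sketch; the exact helper name and formatting are illustrative, not part of the design):

```python
def format_checkpoint(steps, task_count, seconds):
    # One line of the "Serialization checkpoints" section.
    return f"After {steps} steps ({task_count} tasks): {seconds:.6f} seconds"
```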
## Expected Outcomes

This test will reveal:

1. How serialization time grows as the task tree expands during execution
2. The cumulative cost of repeated tree traversals
3. Whether serialization overhead becomes significant relative to execution time
4. Whether there are opportunities for incremental or differential serialization
## Implementation Approach

Following TDD:

1. Write the failing test first
2. Verify it fails for the expected reason (method not implemented)
3. Implement minimal code to pass
4. Refactor if needed
## Alternatives Considered

**Multiple serializations on a completed workflow:** Simpler, but doesn't show how cost changes as the tree grows. Rejected because it doesn't capture the realistic scenario of serializing during execution.

**Serialize after every step:** Too granular; it would dominate test execution time. Using 10-step batches provides sufficient granularity while keeping test runtime reasonable.

**Test all item counts (20, 100, 200, 300):** Would provide more data points but significantly increase test suite runtime. Starting with 300 items (the largest tree) provides the most meaningful signal. More counts can be added if needed.
## Success Criteria

- Test executes without errors
- Serialization happens multiple times during workflow execution
- Output clearly shows serialization time growth correlated with task count
- Results help inform serialization optimization decisions
