Conversation
| cutoff = datetime.utcnow() | ||
| cutoff = cutoff.replace( | ||
| day=cutoff.day - older_than_days | ||
| ) |
There was a problem hiding this comment.
Incorrect date arithmetic causes ValueError in cleanup
Medium Severity
The SQLite backend's cleanup method uses cutoff.replace(day=cutoff.day - older_than_days) to calculate a date cutoff, but datetime.replace() expects a valid day value (1-31). If the current day minus older_than_days results in zero or negative (e.g., January 5th with older_than_days=10 yields -5), Python raises ValueError: day is out of range for month. The correct approach is to use datetime.utcnow() - timedelta(days=older_than_days) for date arithmetic.
| "storage_id": file_id, | ||
| } | ||
| else: | ||
| result["files"][fname] = value |
There was a problem hiding this comment.
Wrong variable assigned in file serialization loop
Medium Severity
In _serialize_tool_output, when iterating over value.items() to get fname, fdata pairs, the else branch incorrectly assigns value (the entire files dictionary) instead of fdata (the individual file's data). This would cause each file entry to contain the entire files dictionary rather than its own content, corrupting the serialized output.
|
|
||
| # === CRITICAL: Set resume loop === | ||
| # Agent._run_agent will start from this loop instead of 1 | ||
| self._resume_from_loop = state.get("current_loop", 0) |
There was a problem hiding this comment.
Agent resume starts from completed loop instead of next
Medium Severity
The RFC example code has an off-by-one error in agent loop resume logic. _resume_from_loop is set to state.get("current_loop", 0), but checkpoints are saved AFTER each loop iteration completes. When resuming, range(start_loop, max_loops + 1) re-executes the already-completed loop. The conversation history already contains that loop's messages (restored from checkpoint), so this would cause duplicate LLM calls and potential message duplication. The fix is state.get("current_loop", 0) + 1 to skip the completed loop.
Additional Locations (1)
| except (TypeError, ValueError): | ||
| # Large or non-serializable results: store truncated | ||
| if isinstance(value, str) and len(value) > 10000: | ||
| serialized[cache_key] = value[:10000] + "...[truncated]" |
There was a problem hiding this comment.
Non-serializable tool cache entries silently dropped during checkpoint
Low Severity
In _serialize_tool_cache, when a tool result value is not JSON-serializable and is not a large string, the entry is silently dropped with no fallback or warning. On resume, these missing cache entries would cause the corresponding tools to be re-executed unnecessarily, contradicting the RFC's goal of using the tool cache to "skip re-executing identical tool calls on resume." Non-serializable results (e.g., custom objects) are simply not added to serialized.
ec64e12 to
d912a41
Compare
Note
Adds a complete RFC suite for opt-in checkpoint/resume across Dynamiq workflows with strong backward compatibility and HITL support.
docs/RFC-001-CHECKPOINT-RESUME.mdwith overview, alternatives, and timelinedocs/rfc-001-checkpoint-resume/covering: industry research, runtime integration, node analysis, data models, storage backends (File/SQLite/Redis/PostgreSQL), flow integration, testing/migration, and UI/chat integrationFlowCheckpoint,NodeCheckpointState,CheckpointConfig) and statuses, plus protocol (CheckpointableNode) and mixinPENDING_INPUTbehaviorWritten by Cursor Bugbot for commit d912a41. This will update automatically on new commits. Configure here.