Skip to content

Move to Redis lua scripts#444

Merged
cschleiden merged 6 commits intomainfrom
redis-move-to-lua-scripts
Oct 17, 2025
Merged

Move to Redis lua scripts#444
cschleiden merged 6 commits intomainfrom
redis-move-to-lua-scripts

Conversation

@cschleiden
Copy link
Copy Markdown
Owner

@cschleiden cschleiden commented Oct 17, 2025

This pull request refactors how workflow and activity events are persisted and queued in the Redis backend by replacing multi-step Go-side pipelining with new atomic Lua scripts. The changes improve consistency, reliability, and transactional safety when handling workflow events, activity completions, instance cancellations, and signals. The update also removes now-unnecessary helper methods and streamlines event marshaling.

Fixes: #440

@cschleiden cschleiden requested a review from Copilot October 17, 2025 18:14
@cschleiden
Copy link
Copy Markdown
Owner Author

!bench !large

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request refactors the Redis backend to use atomic Lua scripts for workflow and activity event operations, replacing Go-side pipelining with server-side execution. This improves transactional safety and consistency when persisting events, completing activities, canceling instances, and handling signals.

Key changes:

  • Introduces four new Lua scripts for atomic operations: signal_workflow.lua, delete_instance.lua, complete_activity_task.lua, and cancel_workflow_instance.lua
  • Removes helper methods addWorkflowInstanceEventP, addEventPayloadsP, and addEventToStreamP that are now replaced by Lua scripts
  • Standardizes event marshaling by consistently using marshalEvent instead of splitting marshaling logic

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
backend/redis/workflow.go Updated CompleteWorkflowTask to use unified marshalEvent helper and removed deprecated addWorkflowInstanceEventP method
backend/redis/signal.go Replaced pipelined operations with atomic signalWorkflowCmd Lua script execution
backend/redis/scripts/signal_workflow.lua New Lua script for atomically adding signal events and queueing workflow tasks
backend/redis/scripts/delete_instance.lua New Lua script for atomic instance deletion
backend/redis/scripts/complete_activity_task.lua New Lua script combining activity completion, event addition, and workflow queueing
backend/redis/scripts/cancel_workflow_instance.lua New Lua script for atomic workflow cancellation
backend/redis/redis.go Updated script loading to include new Lua scripts and removed obsolete script preloading logic
backend/redis/instance.go Refactored CreateWorkflowInstance and CancelWorkflowInstance to use marshalEvent and new Lua scripts
backend/redis/events.go Removed deprecated helper methods replaced by Lua scripts
backend/redis/delete.go Migrated from inline script to external deleteInstanceCmd Lua script
backend/redis/activity.go Replaced pipelined activity completion with atomic completeActivityTaskCmd Lua script

local instanceSegment = ARGV[4]

-- Add event payload
redis.pcall("HSETNX", payloadHashKey, eventId, payload)
Copy link

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent error handling: Line 24 uses redis.pcall (which suppresses errors) while lines 27, 30, and 32 use redis.call (which propagates errors). Since payload storage is critical for event processing, use redis.call here for consistent error propagation.

Suggested change
redis.pcall("HSETNX", payloadHashKey, eventId, payload)
redis.call("HSETNX", payloadHashKey, eventId, payload)

Copilot uses AI. Check for mistakes.
local instanceSegment = ARGV[4]

-- Add event payload
redis.pcall("HSETNX", payloadHashKey, eventId, payload)
Copy link

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent error handling: Line 12 uses redis.pcall (which suppresses errors) while lines 15, 18, and 20 use redis.call (which propagates errors). Since payload storage is critical for event processing, use redis.call here for consistent error propagation.

Suggested change
redis.pcall("HSETNX", payloadHashKey, eventId, payload)
redis.call("HSETNX", payloadHashKey, eventId, payload)

Copilot uses AI. Check for mistakes.

-- Store payload if provided (only if not empty)
if ARGV[5] ~= "" then
redis.pcall("HSETNX", KEYS[4], ARGV[3], ARGV[5])
Copy link

Copilot AI Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent error handling: Line 33 uses redis.pcall (which suppresses errors) while other Redis operations use redis.call (which propagates errors). Since payload storage is critical for event processing, use redis.call here for consistent error propagation.

Suggested change
redis.pcall("HSETNX", KEYS[4], ARGV[3], ARGV[5])
redis.call("HSETNX", KEYS[4], ARGV[3], ARGV[5])

Copilot uses AI. Check for mistakes.
Comment thread backend/redis/scripts/complete_activity_task.lua
@github-actions
Copy link
Copy Markdown

Benchmark Results

SQLite run

Command Mean [ms] Min [ms] Max [ms] Relative
sqlite-main 249.7 ± 27.1 220.2 310.8 1.06 ± 0.16
sqlite-pr 236.6 ± 25.5 200.8 284.7 1.00

Large SQLite payload run (1MB)

Command Mean [s] Min [s] Max [s] Relative
sqlite-main 5.358 ± 0.417 4.685 6.055 1.02 ± 0.10
sqlite-pr 5.230 ± 0.280 4.948 5.744 1.00

Redis run

Command Mean [ms] Min [ms] Max [ms] Relative
redis-main 178.1 ± 28.2 144.7 238.8 1.07 ± 0.20
redis-pr 165.7 ± 17.3 144.2 199.7 1.00

Large Redis payload run (1MB)

Command Mean [s] Min [s] Max [s] Relative
redis-main 3.992 ± 0.424 3.357 4.672 1.00
redis-pr 4.156 ± 0.442 3.330 4.657 1.04 ± 0.16

@cschleiden cschleiden merged commit 5003264 into main Oct 17, 2025
4 checks passed
@cschleiden cschleiden deleted the redis-move-to-lua-scripts branch October 17, 2025 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Recover from Redis restart during workflow exection

2 participants