Member

@kcons kcons commented Aug 9, 2025

Buffer reliability is critical, and the Buffer has proven to be a potential stability risk as we scale up.
To allow Workflows infrastructure to fail without impacting the legacy system, and to enable the use
of newer backends without migrating live load, it's helpful to give Workflows its own Buffer.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Aug 9, 2025

codecov bot commented Aug 9, 2025

Codecov Report

❌ Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
.../sentry/workflow_engine/tasks/delayed_workflows.py 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #97549      +/-   ##
==========================================
+ Coverage   79.98%   80.64%   +0.65%     
==========================================
  Files        8588     8587       -1     
  Lines      378443   378374      -69     
  Branches    24642    24632      -10     
==========================================
+ Hits       302709   305128    +2419     
+ Misses      75362    72874    -2488     
  Partials      372      372              

@kcons kcons marked this pull request as ready for review August 12, 2025 17:45
@kcons kcons requested review from a team as code owners August 12, 2025 17:45
@kcons kcons requested a review from saponifi3d August 12, 2025 17:46
cursor[bot]

This comment was marked as outdated.

Contributor

@saponifi3d saponifi3d left a comment

lgtm if we're okay to remove the buffer hook registry in the legacy code.

FLUSH = "flush"


class BufferHookRegistry:
Contributor

can we remove this yet? i'm mostly concerned about the legacy delayed_processor code.

Member Author

I think this should be okay, though it is hard to reason about.
The task now calls process_buffer directly, and process_buffer uses the registry to pick the right backend for the right delayed processor.
Though, I didn't think too hard about what the registry was trying to accomplish, so I may be missing something. 😬
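
To make the shape of that concrete, here is a minimal sketch of "one buffer per delayed processor, picked via a registry". Every name in it (`FakeBuffer`, `register_processor`, the registry dict, the processor names) is hypothetical and chosen for illustration; it is not the actual Sentry code, and the real `process_buffer` task may differ.

```python
from typing import Callable, Dict, List, Tuple


class FakeBuffer:
    """Hypothetical in-memory stand-in for a buffer backend."""

    def __init__(self) -> None:
        self._pending: List[int] = []

    def push(self, project_id: int) -> None:
        self._pending.append(project_id)

    def drain(self) -> List[int]:
        # Hand back everything queued so far and reset the queue.
        drained, self._pending = self._pending, []
        return drained


# Hypothetical registry: processor name -> (its buffer, its handler).
Handler = Callable[[List[int]], None]
_registry: Dict[str, Tuple[FakeBuffer, Handler]] = {}


def register_processor(name: str, buffer: FakeBuffer, handler: Handler) -> None:
    _registry[name] = (buffer, handler)


def process_buffer() -> None:
    # Each processor only ever sees items drained from its own buffer.
    for buffer, handler in _registry.values():
        batch = buffer.drain()
        if batch:
            handler(batch)


# Usage: two processors, each with a separate buffer.
legacy_buffer, workflows_buffer = FakeBuffer(), FakeBuffer()
legacy_seen: List[int] = []
workflows_seen: List[int] = []
register_processor("legacy_rules", legacy_buffer, legacy_seen.extend)
register_processor("workflows", workflows_buffer, workflows_seen.extend)

legacy_buffer.push(1)
workflows_buffer.push(2)
process_buffer()
```

In this sketch, a failure or backlog in one buffer has no path to the other processor, which is the isolation the PR description is after.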

buffer_keys,
min=0,
max=fetch_time,
)


if not redis_buffer_registry.has(BufferHookEvent.FLUSH):
Contributor

🎉 - i never liked this code. just curious, if we want to add another handler when the buffer is flushed, would we have to manually add it to tasks/process_buffer.py::process_pending_batch? should we have a registry hook there or is this uncommon enough that we can be explicit about it?

Member Author

I think we've made it so there isn't just One buffer anymore, and the dispatching to the appropriate buffer is now handled by the config in delayed_processing_registry.
For the delayed processing-style batch scheduling, this seems sufficiently extensible (presumably, at some point it'll only be dispatching to workflows, and we may want to split "Buffer" up, as our method set is fairly independent and can pretty much work with just a redis client), so I'm not too concerned with generalizing the pattern.
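
As a rough illustration of the "can pretty much work with a redis client" point, the scheduling methods could be plain functions over a small sorted-set interface. This is only a sketch under assumptions: `SortedSetClient`, `FakeSortedSetClient`, `schedule`, and `fetch_due` are hypothetical names, and the two client methods are modeled on redis-py's `zadd` and `zrangebyscore` rather than taken from the Sentry code.

```python
from typing import Dict, List, Mapping, Protocol


class SortedSetClient(Protocol):
    """The only two operations this sketch needs from a client."""

    def zadd(self, name: str, mapping: Mapping[str, float]) -> int: ...

    def zrangebyscore(self, name: str, min: float, max: float) -> List[str]: ...


class FakeSortedSetClient:
    """Hypothetical in-memory client so the sketch runs without Redis."""

    def __init__(self) -> None:
        self._sets: Dict[str, Dict[str, float]] = {}

    def zadd(self, name: str, mapping: Mapping[str, float]) -> int:
        target = self._sets.setdefault(name, {})
        added = sum(1 for member in mapping if member not in target)
        target.update(mapping)
        return added  # number of newly added members

    def zrangebyscore(self, name: str, min: float, max: float) -> List[str]:
        members = self._sets.get(name, {})
        hits = [(score, m) for m, score in members.items() if min <= score <= max]
        return [m for _, m in sorted(hits)]  # ordered by score, then member


def schedule(client: SortedSetClient, key: str, project_id: int, now: float) -> None:
    # Score each project by the time it was scheduled.
    client.zadd(key, {str(project_id): now})


def fetch_due(client: SortedSetClient, key: str, fetch_time: float) -> List[int]:
    # Everything scheduled at or before fetch_time is due for processing.
    return [int(m) for m in client.zrangebyscore(key, min=0, max=fetch_time)]


client = FakeSortedSetClient()
schedule(client, "workflows", 1, now=10.0)
schedule(client, "workflows", 2, now=20.0)
schedule(client, "workflows", 3, now=30.0)
due = fetch_due(client, "workflows", fetch_time=20.0)
```

With functions like these there is no shared Buffer class to generalize; each caller just passes whichever client it owns.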

Contributor

Seer failed to run.

@kcons kcons merged commit 73d312e into master Aug 13, 2025
64 checks passed
@kcons kcons deleted the kcons/ourbuf branch August 13, 2025 21:11

sentry-io bot commented Aug 14, 2025

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

priscilawebdev pushed a commit that referenced this pull request Aug 25, 2025
andrewshie-sentry pushed a commit that referenced this pull request Aug 26, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Aug 30, 2025