feat(sessions): Add EAP double-write for user sessions #5588

Open

noahsmartin wants to merge 4 commits into master from sessions-eap-double-write

Conversation

@noahsmartin (Contributor) commented Jan 28, 2026

When the `UserSessionsEap` feature flag is enabled, session data is sent both through the legacy metrics pipeline and directly to the snuba-items topic as `TRACE_ITEM_TYPE_USER_SESSION` TraceItems.

This enables migration to the new EAP-based user sessions storage.

Includes a Datadog metric `sessions.eap.produced` tagged with `session_type` (update/aggregate) to track EAP writes.
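
As a rough sketch, the flow looks like this (all type names and helper functions below are illustrative stand-ins, not the actual Relay code; only the topic, the item type, and the metric name come from this change):

```rust
// Illustrative sketch of the double-write flow; all types and helpers are
// stand-ins, not the actual Relay implementation.

#[derive(Clone, Copy)]
enum SessionKind {
    Update,
    Aggregate,
}

struct SessionPayload {
    kind: SessionKind,
    // remaining session fields elided
}

fn produce_session(session: &SessionPayload, eap_enabled: bool) {
    // Legacy path stays untouched: sessions keep flowing through the
    // metrics pipeline.
    produce_legacy_session_metrics(session);

    if eap_enabled {
        // New path: also emit the session as a TRACE_ITEM_TYPE_USER_SESSION
        // TraceItem on the snuba-items topic.
        produce_eap_trace_item(session);

        // Track EAP writes in Datadog, tagged with the session type.
        let session_type = match session.kind {
            SessionKind::Update => "update",
            SessionKind::Aggregate => "aggregate",
        };
        emit_counter("sessions.eap.produced", &[("session_type", session_type)]);
    }
}

// Stubs standing in for the real Kafka producers and statsd client.
fn produce_legacy_session_metrics(_session: &SessionPayload) {}
fn produce_eap_trace_item(_session: &SessionPayload) {}
fn emit_counter(_name: &str, _tags: &[(&str, &str)]) {}
```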

Cursor Bugbot found 2 potential issues for commit 1a745cd

@noahsmartin noahsmartin force-pushed the sessions-eap-double-write branch from 21e22bf to 6acd029 Compare January 28, 2026 20:49

@Dav1dde (Member) left a comment

I didn't review in depth, as we're going to need a different approach.

Session processing can stay as it is: we still need to extract sessions into metrics and aggregate them through the metrics pipeline to keep aggregation in place.

Right now, you can check the implementation in services/store.rs, where session metrics are separated from other "generic" metrics into the sessions topic. This is going to be the entry point where we can start double-writing.
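
Roughly, I'm imagining something like the following (the names and structure are made up for illustration; the actual code in services/store.rs looks different):

```rust
// Illustrative routing sketch, not the actual services/store.rs code.
enum BucketKind {
    Session,
    Generic,
}

struct Bucket {
    kind: BucketKind,
    // payload elided
}

fn route_metric_bucket(bucket: &Bucket, eap_double_write: bool) {
    match bucket.kind {
        BucketKind::Session => {
            // Existing behavior: session metrics go to the sessions topic.
            produce_to_sessions_topic(bucket);

            // Proposed entry point for the EAP double write.
            if eap_double_write {
                produce_user_session_trace_item(bucket);
            }
        }
        // Everything else keeps going through the generic metrics topic.
        BucketKind::Generic => produce_to_generic_metrics_topic(bucket),
    }
}

fn produce_to_sessions_topic(_bucket: &Bucket) {}
fn produce_to_generic_metrics_topic(_bucket: &Bucket) {}
fn produce_user_session_trace_item(_bucket: &Bucket) {}
```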

Long term, we'd want a more strictly typed version of metrics aggregation, re-using the generic metrics pipeline but with proper types. This will require some more work, though.

Regarding rollout: is a feature flag the preferred rollout strategy? I recommend instead using a global rollout rate defined in the "global config"; this is usually the approach we take for controlling double writes to Kafka.
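
A minimal sketch of what I mean, with a hypothetical `session_eap_rollout_rate` option (the field name and the per-org hashing are just for illustration, not an existing config option):

```rust
// Illustrative only: gate the double write on a rollout rate from the
// global config instead of a per-project feature flag.
struct GlobalConfig {
    /// Fraction of traffic (0.0..=1.0) to double-write to EAP.
    /// Field name is hypothetical.
    session_eap_rollout_rate: f32,
}

fn should_double_write(config: &GlobalConfig, org_id: u64) -> bool {
    // Deterministic per-organization decision: an org is either fully in or
    // fully out of the rollout at a given rate.
    let bucket = (org_id % 100_000) as f32 / 100_000.0;
    bucket < config.session_eap_rollout_rate
}
```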

@noahsmartin noahsmartin force-pushed the sessions-eap-double-write branch 3 times, most recently from ed7c367 to 1a745cd Compare February 4, 2026 20:33
@noahsmartin noahsmartin marked this pull request as ready for review February 4, 2026 23:11
@noahsmartin noahsmartin requested a review from a team as a code owner February 4, 2026 23:11
@noahsmartin noahsmartin force-pushed the sessions-eap-double-write branch from 1a745cd to d56e867 Compare February 5, 2026 17:29

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


let received = Some(prost_types::Timestamp {
    seconds: now.as_secs() as i64,
    nanos: 0,
});

EAP received timestamp uses current time instead of actual receive time

Medium Severity

The received timestamp for EAP TraceItems is set to now (current time when emitting to EAP) instead of using bucket.metadata.received_at (the actual time Relay received the data). Other EAP producers like logs, trace attachments, and trace metrics consistently use the actual received_at time from their context. The bucket's received_at metadata is available and already used for delay tracking in the legacy pipeline (lines 539-544), but the EAP emission ignores it. This causes the received field in EAP to reflect when data was processed for Kafka production rather than when it was originally received, leading to incorrect timestamps and inconsistency between legacy and EAP data.
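
A possible shape of the fix, assuming the receive time is available from the bucket metadata as Unix seconds (the exact field and type in Relay may differ):

```rust
// Build the EAP `received` field from the bucket's received_at metadata
// instead of the current time. `received_at_secs` stands in for whatever
// representation the bucket metadata actually uses.
fn eap_received_timestamp(received_at_secs: i64) -> Option<prost_types::Timestamp> {
    Some(prost_types::Timestamp {
        seconds: received_at_secs,
        nanos: 0,
    })
}
```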

