
Conversation

devin-ai-integration[bot]
Contributor

Summary

Adds three new histogram metrics to the Hermes service to track latency in price feed updates (the measurements are sketched after this list):

  • publish_to_receive_latency: Time from publish_time to receive_time (tracked for both SSE and WS)
  • receive_to_ws_send_latency: Time from receive_time to WebSocket send time
  • receive_to_sse_send_latency: Time from receive_time to SSE send time
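
In terms of Unix-second timestamps, the three measurements reduce to simple saturating differences (variable names here are illustrative; the diff uses the same pattern):

// All values are Unix timestamps in seconds; saturate at zero to guard against clock skew.
let publish_to_receive = (receive_time - publish_time).max(0) as f64;
let receive_to_ws_send = (ws_send_time - receive_time).max(0) as f64;
let receive_to_sse_send = (sse_send_time - receive_time).max(0) as f64;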

Rationale

These metrics provide visibility into latency at each stage of the price feed pipeline, enabling monitoring and optimization of Hermes service performance. They help identify bottlenecks between price publication, reception, and delivery to clients.

Implementation Details

  • Extended the WsMetrics struct to include publish_to_receive_latency and receive_to_ws_send_latency histograms (see the sketch after this list)
  • Created a new SseMetrics struct with a receive_to_sse_send_latency histogram
  • Updated ApiState to include the SSE metrics
  • Instrumented both the WebSocket and SSE endpoints to observe latency at send time
  • Used histogram buckets ranging from 0.1 to 20.0 seconds
  • Bumped the version to 0.10.5
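
As a rough sketch, assuming the prometheus-client crate, the declarations and registration could look like the following; the metric names come from the summary above, while the bucket boundaries (within the stated 0.1-20.0 s range), help strings, and struct contents are illustrative rather than the PR's exact code:

use prometheus_client::metrics::histogram::Histogram;
use prometheus_client::registry::Registry;

// Illustrative bucket boundaries spanning 0.1 to 20.0 seconds.
const LATENCY_BUCKETS: [f64; 8] = [0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 20.0];

pub struct WsMetrics {
    pub publish_to_receive_latency: Histogram,
    pub receive_to_ws_send_latency: Histogram,
    // ...existing WS metrics elided...
}

impl WsMetrics {
    pub fn new(registry: &mut Registry) -> Self {
        let publish_to_receive_latency = Histogram::new(LATENCY_BUCKETS.into_iter());
        let receive_to_ws_send_latency = Histogram::new(LATENCY_BUCKETS.into_iter());
        registry.register(
            "publish_to_receive_latency",
            "Seconds from publish_time to receive_time",
            publish_to_receive_latency.clone(),
        );
        registry.register(
            "receive_to_ws_send_latency",
            "Seconds from receive_time to WebSocket send",
            receive_to_ws_send_latency.clone(),
        );
        Self {
            publish_to_receive_latency,
            receive_to_ws_send_latency,
        }
    }
}

pub struct SseMetrics {
    pub receive_to_sse_send_latency: Histogram,
}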

Architecture Notes

The publish_to_receive_latency metric is registered once in WsMetrics but observed by both endpoints; the SSE handler accesses it via state.ws.metrics.publish_to_receive_latency to avoid duplicate registration.
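
Concretely, the SSE send path then records both observations roughly as follows; the path for the shared histogram matches the note above, while the SSE metrics field name on ApiState is a placeholder:

// Inside the SSE update loop, with Unix-second timestamps already computed.
// The shared histogram lives on the WS metrics and is reused rather than re-registered.
state
    .ws
    .metrics
    .publish_to_receive_latency
    .observe((received_at - publish_time).max(0) as f64);

// The SSE-only histogram; `sse_metrics` is a placeholder field name on ApiState.
state
    .sse_metrics
    .receive_to_sse_send_latency
    .observe((now_secs - received_at).max(0) as f64);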

How has this been tested?

  • Current tests cover my changes (compilation verified with cargo build)
  • Added new tests
  • Manually tested the code

Testing performed: Verified compilation success with cargo build. No runtime testing of metrics collection was performed.

Review Checklist

Please pay special attention to:

  1. Metrics architecture: Is the cross-endpoint access pattern (SSE -> WS metrics) acceptable or should publish_to_receive_latency be registered differently?

  2. Timing accuracy: Are the wall-clock measurements using SystemTime::now() appropriate for latency tracking in hot paths?

  3. Performance impact: Review the overhead of adding histogram observations to every price update in both WS and SSE paths

  4. Borrow checker solutions: Verify that the approach of capturing received_at and publish_time before moving the update structs is sound (a sketch follows this list)

  5. Histogram buckets: Validate that the 0.1-20.0 second range with the chosen bucket distribution makes sense for expected latencies
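
On point 4, the borrow-checker workaround amounts to copying the Copy-typed timestamps into locals before the update struct is moved into the send path. A minimal sketch, with placeholder types and field names rather than the PR's exact ones:

use prometheus_client::metrics::histogram::Histogram;

// Illustrative stand-ins for the real update type and send path.
struct PriceUpdate {
    publish_time: i64,
    received_at: Option<i64>,
    payload: Vec<u8>,
}

async fn send_payload(_payload: Vec<u8>) { /* serialization and socket write elided */ }

async fn handle_update(update: PriceUpdate, publish_to_receive_latency: &Histogram) {
    // Copy the i64 timestamps out first; `update` can then be moved freely.
    let publish_time = update.publish_time;
    let received_at = update.received_at;

    // `update` (or its payload) is moved into the send path here.
    send_payload(update.payload).await;

    // Safe to observe after the move because only the copied locals are used.
    if let Some(received_at) = received_at {
        publish_to_receive_latency.observe((received_at - publish_time).max(0) as f64);
    }
}

On point 2, SystemTime::now() is hard to avoid for the publish-to-receive measurement, since publish_time is a wall-clock timestamp from another machine; for the purely in-process receive-to-send spans, a monotonic Instant captured at receive time would sidestep clock-adjustment artifacts, at the cost of carrying an extra field alongside the Unix timestamp.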


Link to Devin run: https://app.devin.ai/sessions/e9af526709c3456c8bd160207b8ec67a
Requested by: Tejas Badadare ([email protected])

Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


vercel bot commented Aug 20, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Preview | Comments | Updated (UTC)
api-reference | Ready | Preview | Comment | Aug 21, 2025 0:26am
component-library | Ready | Preview | Comment | Aug 21, 2025 0:26am
developer-hub | Ready | Preview | Comment | Aug 21, 2025 0:26am
entropy-explorer | Ready | Preview | Comment | Aug 21, 2025 0:26am
insights | Ready | Preview | Comment | Aug 21, 2025 0:26am
proposals | Ready | Preview | Comment | Aug 21, 2025 0:26am
staking | Ready | Preview | Comment | Aug 21, 2025 0:26am

Comment on lines 512 to 530
if let Some(received_at) = received_at_opt {
    let pub_to_recv = (received_at - publish_time).max(0) as f64;
    self.ws_state
        .metrics
        .publish_to_receive_latency
        .observe(pub_to_recv);

    let now_secs = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .ok()
        .and_then(|d| i64::try_from(d.as_secs()).ok())
        .unwrap_or(received_at);
    let recv_to_send = (now_secs - received_at).max(0) as f64;
    self.ws_state
        .metrics
        .receive_to_ws_send_latency
        .observe(recv_to_send);
}

Contributor


This is in the wrong place; we should observe the latency after self.sender.flush().
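
For illustration, the reordering this comment asks for might look roughly like the following (adapted from the snippet above; the send/feed calls that queue the message on self.sender are elided):

// ...message queued on self.sender above...
self.sender.flush().await?;

// Observe only after the flush succeeds, so the measured send-side latency
// includes the time spent actually writing the message to the socket.
if let Some(received_at) = received_at_opt {
    self.ws_state
        .metrics
        .publish_to_receive_latency
        .observe((received_at - publish_time).max(0) as f64);

    let now_secs = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .ok()
        .and_then(|d| i64::try_from(d.as_secs()).ok())
        .unwrap_or(received_at);
    self.ws_state
        .metrics
        .receive_to_ws_send_latency
        .observe((now_secs - received_at).max(0) as f64);
}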
