Skip to content

Conversation

@jmhossler
Copy link

Hi! We've been troubleshooting some latency issues around our push notifications, and that led to adding some additional telemetry to this library so we could turn on some detailed logging and generate some good data to help us track down these problems. I'm happy to update more docs as well to document the usage of these events if desired. Also, I'm happy to update this to make :telemetry an optional dependency and update how we emit these so that it is only done if the :telemetry module is loaded, but I wasn't sure of the value of that.

I added telemetry events & metrics to the following modules:

  • lib/connection.ex - telemetry event [:kadabra, :connection, :stop] added, with the duration of the connection logged. This has been helpful for understanding the lifecycle of both our FCM and APNS connections.
  • lib/socket.ex - telemetry events [:kadabra, :socket, :recv_frame], [:kadabra, :socket, :send], [:kadabra, :socket, :closed]. I just went ahead and included the frame received and data being sent. I found these useful when trying to understand why our connections were being closed, like seeing the GOAWAY frames or the messages we sent leading up to that.
  • lib/stream.ex - telemetry events [:kadabra, :stream, :start], [:kadabra, :stream, :stop], with the duration of the lifetime of the stream included. This was useful for us to gather metrics on the lifetime of requests as well as to see how many streams were being created.

In this change is also an API breaking change -- updates to ping. Another thing we attempted to gather metrics on was the RTT of pings over these connections, and by allowing for us to specify an optional payload in the ping, we could calculate the RTT time from the pong response received without having to worry about out-of-order ping responses when sent around the same time. You can still just call Kadabra.ping(pid), but I couldn't think of a way to return the payload when we get the response from the server without making it so that the data in the ping frame was always included in the message sent to the calling process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant