@rkistner rkistner commented Jun 30, 2025

Background

RSocket handles keepalive as follows:

  1. Client sends a keepalive message once every keepAlive ms (20s).
  2. Server responds with a keepalive message.
  3. Client checks that it gets one keepalive response every lifetime ms (30s).

That gives a 10s grace period between sending and receiving the message.
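
As a rough illustration of the arithmetic above, here is a minimal sketch. The constant and function names are made up for this example and are not RSocket or PowerSync identifiers, and the model is simplified: it only considers the delay of a single keepalive response.

```typescript
// Values taken from the description above. The names are illustrative only
// and are not RSocket or PowerSync identifiers.
const KEEP_ALIVE_INTERVAL_MS = 20_000; // client sends a keepalive this often
const KEEP_ALIVE_LIFETIME_MS = 30_000; // client expects a response this often

// How long a keepalive response may be delayed before the check fails.
function gracePeriodMs(intervalMs: number, lifetimeMs: number): number {
  return lifetimeMs - intervalMs;
}

// Simplified model: a response delayed by more than the grace period
// (for example by TCP buffering or slow parsing) is treated as a timeout.
function ackArrivesInTime(ackDelayMs: number): boolean {
  return ackDelayMs <= gracePeriodMs(KEEP_ALIVE_INTERVAL_MS, KEEP_ALIVE_LIFETIME_MS);
}
```

With these values, a response that is held up for 12s behind buffered data would miss the 10s window, even though the connection itself is healthy.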

The issue here comes in when there is either:

  1. A slow network connection that buffers a lot of data at the TCP level.
  2. CPU overhead from parsing the BSON data (observed on React-Native), which could have the same buffering effect.

Either of the above could block the keepalive response for longer than that 10s grace period, resulting in: Error: Closed. Original cause [Error: No keep-alive acks for 30000 millis].

One workaround was added in #494, reducing the number of messages that are requested by the client at a time. However, this can decrease throughput.

Keepalive Changes

This implements an additional workaround that:

  1. Checks that we get one message of any type every 30s.
  2. Increases the keepalive lifetime to 90s.

This applies per-message, not per-byte or per-frame. This means we can still get a timeout error if a single message takes more than 10s to send over the connection. Our messages are generally limited to 1MB in size (unless a single operation is larger than that), so it should cover most cases.
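
The two checks described above could be sketched roughly as follows. This is a hypothetical watchdog written for illustration, not the code in this PR or in RSocket; the class name, method names, and error strings are assumptions. Timestamps are passed in explicitly so the logic can be followed without real timers.

```typescript
// Hypothetical watchdog, for illustration only.
class KeepAliveWatchdog {
  private lastMessageAt: number;
  private lastKeepAliveAt: number;
  private readonly messageTimeoutMs: number;
  private readonly keepAliveLifetimeMs: number;

  constructor(startedAt: number, messageTimeoutMs = 30_000, keepAliveLifetimeMs = 90_000) {
    this.lastMessageAt = startedAt;
    this.lastKeepAliveAt = startedAt;
    this.messageTimeoutMs = messageTimeoutMs;
    this.keepAliveLifetimeMs = keepAliveLifetimeMs;
  }

  // Call once per complete message of any type. A partially received
  // message does not count, which is why the check is per-message.
  onMessage(now: number, isKeepAlive: boolean): void {
    this.lastMessageAt = now;
    if (isKeepAlive) {
      this.lastKeepAliveAt = now;
    }
  }

  // Returns a reason string if the connection should be closed, else null.
  check(now: number): string | null {
    if (now - this.lastMessageAt > this.messageTimeoutMs) {
      return `No messages for ${this.messageTimeoutMs} millis`;
    }
    if (now - this.lastKeepAliveAt > this.keepAliveLifetimeMs) {
      return `No keep-alive acks for ${this.keepAliveLifetimeMs} millis`;
    }
    return null;
  }
}
```

In this sketch, a steady stream of data messages keeps the first check satisfied, while the longer 90s lifetime gives keepalive responses more room to arrive behind buffered data.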

BSON parsing changes

This now delays parsing of BSON data until the line is processed, rather than when it is received. This helps since:

  1. The serialized buffer should use less memory than the parsed data, resulting in slightly less memory usage when we buffer a lot of data.
  2. This makes the websocket handling code more responsive, making it less likely that the keepalive messages are significantly delayed.

The Rust implementation takes this further to move the BSON parsing into the SQLite extension, which helps even more.
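
The idea of deferring parsing can be sketched as a queue that stores the raw bytes and only decodes when the consumer asks for the next item. This is a simplified, hypothetical sketch, not the DataStream code in this PR. To stay dependency-free it uses JSON as a stand-in for BSON; the decoder is passed in, so a real BSON decoder could be substituted.

```typescript
// Hypothetical queue that buffers raw frames and decodes them lazily.
class LazyDecodeQueue<T> {
  private readonly pending: Uint8Array[] = [];
  private readonly decode: (raw: Uint8Array) => T;

  constructor(decode: (raw: Uint8Array) => T) {
    this.decode = decode;
  }

  // Called from the socket message handler: cheap, no parsing happens here.
  enqueue(raw: Uint8Array): void {
    this.pending.push(raw);
  }

  // Called by the consumer: parsing happens here, one message at a time.
  dequeue(): T | undefined {
    const raw = this.pending.shift();
    return raw === undefined ? undefined : this.decode(raw);
  }

  get size(): number {
    return this.pending.length;
  }
}

// Stand-in decoder (JSON rather than BSON) that counts how often it runs.
let decodeCalls = 0;
function decodeJson(raw: Uint8Array): unknown {
  decodeCalls++;
  return JSON.parse(new TextDecoder().decode(raw));
}
```

Because `enqueue` does no parsing, the message handler returns quickly even when many messages arrive at once, and the buffered items stay in their serialized form until they are needed.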

This refactors the DataStream handling a bit to support this. This also fixes some type issues we had, such as mixing up Uint8Array and ArrayBuffer.
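
As an example of the kind of Uint8Array and ArrayBuffer mix-up that can occur (an assumption for illustration; the specific type issues fixed here are not listed in this description), a view's underlying `buffer` can be larger than the view itself. A hypothetical normalizing helper might look like this:

```typescript
// Hypothetical helper: normalize binary input to a Uint8Array that covers
// exactly the bytes of the input, without copying.
function toUint8Array(data: ArrayBuffer | ArrayBufferView): Uint8Array {
  if (data instanceof Uint8Array) {
    return data;
  }
  if (ArrayBuffer.isView(data)) {
    // Respect byteOffset and byteLength: the view may cover only part
    // of its underlying buffer.
    return new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  }
  return new Uint8Array(data);
}

// A 2-byte view into a 5-byte buffer.
const backing = new Uint8Array([1, 2, 3, 4, 5]);
const view = new DataView(backing.buffer, 2, 2);
const bytes = toUint8Array(view);
```

Passing `view.buffer` to a decoder instead of `bytes` would hand over all 5 bytes rather than the 2 that the view covers.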

Future options

  1. For NodeJS and Chrome, we can consider using WebSocketStream instead of WebSocket. It has built-in backpressure support, which may mitigate this issue somewhat. This may require significant changes to RSocket though.
  2. We could implement "dynamic" backpressure, rather than always requesting 10 messages at a time. However, implementing an effective strategy here could be quite difficult.

Testing

NodeJS

Did regression testing on NodeJS using all 4 combinations of connectionMethod and clientImplementation - no issues found.

React Native

On React-Native, tested using websockets only, for both clientImplementation methods. For the JavaScript implementation, overall sync performance appears to be slightly better now, but I didn't do proper benchmarking.

Before, data would typically be processed in batches of 10x deserialize, then 10x saveSyncData, alternating between them. Now, one message is deserialized at a time.

For op-sqlite, I used react-native-barebones-opsqlite. I had trouble with the fetch implementation - both write checkpoints and http streams get Value is undefined, expected a String (see facebook/react-native#27741 (comment)). But websockets do work with both the JS and Rust implementations.

Web

Tested using the diagnostics app, using all 4 combinations of connectionMethod and clientImplementation - no issues found.

TODO

  • Fix tests hanging

@rkistner rkistner requested a review from Chriztiaan June 30, 2025 15:33

changeset-bot bot commented Jun 30, 2025

🦋 Changeset detected

Latest commit: 430f99f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages
Name Type
@powersync/react-native Patch
@powersync/diagnostics-app Patch
@powersync/common Patch
@powersync/node Patch
@powersync/op-sqlite Patch
@powersync/web Patch
@powersync/tanstack-react-query Patch

@rkistner rkistner force-pushed the websocket-keepalives branch from 360b135 to a31ddb5 Compare July 1, 2025 09:41
@rkistner rkistner marked this pull request as ready for review July 1, 2025 13:34
@rkistner rkistner requested a review from simolus3 July 1, 2025 13:35
Chriztiaan previously approved these changes Jul 2, 2025

@stevensJourney stevensJourney left a comment

The changes here look good to me. Happy from my side :)

@rkistner rkistner merged commit ffe3095 into main Jul 2, 2025
11 of 12 checks passed
@rkistner rkistner deleted the websocket-keepalives branch July 2, 2025 14:05