Merged
Changes from 3 commits
9 changes: 7 additions & 2 deletions mint.json
@@ -358,9 +358,10 @@
"pages": [
"self-hosting/lifecycle-maintenance",
"self-hosting/lifecycle-maintenance/securing-your-deployment",
"self-hosting/lifecycle-maintenance/server-specs",
"self-hosting/lifecycle-maintenance/healthchecks",
"self-hosting/lifecycle-maintenance/telemetry"
"self-hosting/lifecycle-maintenance/telemetry",
"self-hosting/lifecycle-maintenance/migrating",
"self-hosting/lifecycle-maintenance/multiple-instances"
]
},
"self-hosting/enterprise",
@@ -575,6 +576,10 @@
{
"source": "/integration-guides/supabase",
"destination": "/integration-guides/supabase-+-powersync"
},
{
"source": "/self-hosting/lifecycle-maintenance/server-specs",
"destination": "/self-hosting/lifecycle-maintenance"
}
],
"footerSocials": {
2 changes: 1 addition & 1 deletion self-hosting/installation/powersync-service-setup.mdx
@@ -29,7 +29,7 @@ The PowerSync Service requires a storage backend for sync buckets. You can use e

### MongoDB Storage

MongoDB requires at least one replica set node. A single node is fine for development/staging environments, but a 3-node replica set is recommended [for production](/self-hosting/lifecycle-maintenance/server-specs).
MongoDB requires at least one replica set node. A single node is fine for development/staging environments, but a 3-node replica set is recommended [for production](/self-hosting/lifecycle-maintenance).

[MongoDB Atlas](https://www.mongodb.com/products/platform/atlas-database) enables replica sets by default for new clusters.

113 changes: 108 additions & 5 deletions self-hosting/lifecycle-maintenance.mdx
@@ -1,12 +1,114 @@
---
title: "Lifecycle / Maintenance"
description: "Notes for sysadmins"
description: "Self-hosting setup and maintenance"
sidebarTitle: Overview
---

## Migrations
## Minimal Setup

Migrations run automatically by default.
A minimal "development" setup (e.g. for a staging or a QA environment) is:

1. A single PowerSync "compute" container (API + replication) with 512MB memory, 1 vCPU.
2. A single MongoDB node in replica set mode, 2GB memory, 1 vCPU. M10+ when using Atlas.
3. Load balancer for TLS.

This setup has no redundancy. If the replica set fails, you may need to recreate it from scratch, which will re-sync all clients.

## Production

For production, we recommend running a high-availability setup:

1. 1x PowerSync replication container, 1GB memory, 1 vCPU
2. 2+ PowerSync API containers, 1GB memory each, 1 vCPU each.
3. A 3-node MongoDB replica set, 2+GB memory each. Refer to the MongoDB documentation for deployment requirements. M10+ when using Atlas.
4. A load balancer with redundancy.
5. Run a daily compact job.

For scaling up, add 1x PowerSync API container per 100 connections. The MongoDB replica set should be scaled based on CPU and memory usage.
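
The scaling rule above can be sketched as a small sizing helper. This is only an illustration: the function name `api_containers_needed` is hypothetical and not part of PowerSync, and it assumes the 100-connections-per-container target and the two-container minimum from the production list.

```python
import math


def api_containers_needed(concurrent_connections: int,
                          target_per_container: int = 100,
                          minimum: int = 2) -> int:
    """Estimate the API container count from peak concurrent connections."""
    if concurrent_connections < 0:
        raise ValueError("concurrent_connections must be non-negative")
    # Round up so a partially filled container still counts as a full one.
    needed = math.ceil(concurrent_connections / target_per_container)
    # Never go below the high-availability minimum.
    return max(minimum, needed)


print(api_containers_needed(750))  # prints 8
```

Treat the result as a starting point and adjust it based on observed CPU and memory usage.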

### Replication Container

The replication container handles replicating from the source database to PowerSync's bucket storage.

The replication process is run using the docker command `start -r sync`, for example `docker run powersync start -r sync`.

Only one process can replicate at a time. If multiple are running concurrently, you may see an error `[PSYNC_S1003] Sync rules have been locked by another process for replication`.
If you use rolling deploys, it is normal to see this error for a short duration while multiple processes are running.

Memory and CPU usage of the replication container is primarily driven by write load on the source database. A good starting point is 1GB memory and 1 vCPU for the container, but this may be scaled down depending on the load patterns.

Set the environment variable `NODE_OPTIONS=--max-old-space-size=800` for 800MB, or set to 80% of the total assigned memory if scaling up or down.
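
As a rough sketch of the 80% rule, the helper below derives the `NODE_OPTIONS` value from the memory assigned to a container. The function name `node_options_for` is hypothetical, and the example treats 1GB as 1000MB so that the result matches the 800MB figure above.

```python
def node_options_for(container_memory_mb: int, heap_percent: int = 80) -> str:
    """Build a NODE_OPTIONS value that caps the V8 old-space heap."""
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    heap_mb = container_memory_mb * heap_percent // 100
    return f"NODE_OPTIONS=--max-old-space-size={heap_mb}"


print(node_options_for(1000))  # prints NODE_OPTIONS=--max-old-space-size=800
```

The same calculation applies to the API containers and the compact job.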

### API Containers

The API container handles streaming sync connections, as well as any other API calls.

The API process is run using the docker command `start -r api`, for example `docker run powersync start -r api`.

Each API container is limited to 200 concurrent connections, but we recommend targeting 100 concurrent connections or less per container. This may change as we implement additional performance optimizations.

Memory and CPU usage of API containers is driven by:
1. Number of concurrent connections.
2. Number of buckets per connection.
3. Amount of data synced to each connection.

A good starting point is 1GB memory and 1 vCPU per container, but this may be scaled up or down depending on the specific load patterns.

Set the environment variable `NODE_OPTIONS=--max-old-space-size=800` for 800MB, or set to 80% of the total assigned memory if scaling up or down.

### Compact Job

We recommend running a compact job daily as a cron job, or after any large maintenance jobs. For details, see the documentation on [Compacting Buckets](/usage/lifecycle-maintenance/compacting-buckets).

Run the compact job using the docker command `compact`, for example `docker run powersync compact`.

The compact job uses up to 1GB memory for compacting, if available. Set the environment variable `NODE_OPTIONS=--max-old-space-size=800` for 800MB, or set to 80% of the total assigned memory if scaling up or down.

### Load Balancer

A load balancer is required in front of the API containers to provide TLS support and load balancing. Most cloud providers have built-in options for load balancing, such as ALB on AWS.

It is currently required to host the API container on a dedicated subdomain - we do not support running it on the same subdomain as another service.

For self-hosting, [nginx](https://nginx.org/en/) is always a good option. A basic nginx configuration could look like this:

```nginx
server {
    listen 443 ssl;
    server_name powersync.example.org;

    # SSL configuration here

    # Reverse proxy settings
    location / {
        proxy_pass http://powersync_server_ip:powersync_port; # Replace with your PowerSync details
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Disable proxy buffering - important for HTTP streaming connections
        proxy_buffering off;
    }
}
```

When using nginx as a Kubernetes ingress, set the proxy buffering option as an annotation on the ingress:

```yaml
nginx.ingress.kubernetes.io/proxy-buffering: "off"
```

### Health Checks

If the load balancer supports health checks, it may be configured to poll the API container at `/probes/liveness`. This endpoint is expected to have a 200 response when the container is healthy. See [Healthchecks](./lifecycle-maintenance/healthchecks) for details.
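
For load balancers without built-in health checks, the same probe can be scripted. The following is a minimal sketch using only the Python standard library. The function name `is_healthy` and the example base URL are assumptions; only the `/probes/liveness` path and the expected 200 response come from the text above.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout_seconds: float = 2.0) -> bool:
    """Return True if GET {base_url}/probes/liveness answers with HTTP 200."""
    url = base_url.rstrip("/") + "/probes/liveness"
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection failures, timeouts, and 4xx/5xx responses count as unhealthy.
        return False


if __name__ == "__main__":
    # Hypothetical address; replace with your API container's host and port.
    print(is_healthy("http://localhost:8080"))
```

A monitoring job could call this periodically and take a container out of rotation after several consecutive failures.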

### Migrations

By default, migrations are run as part of the replication and API containers. In some cases, a migration may add significant delay to the container startup.

To avoid this, the migrations may be run as a separate job on each update, before replacing the rest of the containers. To run the migrations, run the docker command `migrate up`, for example `docker run powersync migrate up`.

In this case, disable automatic migrations in the config:

```yaml
# powersync.yaml
@@ -16,10 +118,11 @@ migrations:
# When set to true, migrations must be triggered manually by modifying the container `command`.
disable_auto_migration: true
```
MongoDB locks ensure migrations are executed exactly once, even when multiple containers start simultaneously.

## Backups

We recommend using Git to back up your configuration files.

The sync bucket storage database doesn't require backups as it can be easily reconstructed.
None of the containers use any local storage, so no backups are required there.

The sync bucket storage database may be backed up using the recommendations for the storage database system. This is not a strong requirement, since this data can be recovered by re-replicating from the source database.
14 changes: 14 additions & 0 deletions self-hosting/lifecycle-maintenance/migrating.mdx
@@ -0,0 +1,14 @@
---
title: "Migrating between instances"
description: "Migrating users between PowerSync instances"
---

## Overview

In some cases, you may want to migrate users between PowerSync instances. This may be between cloud and self-hosted instances, or even just to change the endpoint.

If the PowerSync instances use the same source database and have the same basic configuration and sync rules, you can migrate users by just changing the endpoint to the new instance.

To make this process easier, we recommend using an API to retrieve the PowerSync endpoint, instead of hardcoding the endpoint in the client application. If you're using custom authentication, this can be done in the same API call as getting the authentication token.

There should be no downtime for users when switching between endpoints. The client will have to re-sync all data, but this will all happen automatically, and the client will atomically switch between the two. The main effect visible to users will be a delay in syncing new data while the client is re-syncing. All data will remain available to read on the client for the entire process.
31 changes: 31 additions & 0 deletions self-hosting/lifecycle-maintenance/multiple-instances.mdx
@@ -0,0 +1,31 @@
---
title: "Multiple PowerSync Instances"
description: "Scaling using multiple instances"
---

## Overview

<Warning>
Multiple instances are not required in most cases. See the [Overview](/self-hosting/lifecycle-maintenance) for details on standard horizontal scaling setups.
</Warning>

When exceeding a couple thousand concurrent connections, the standard PowerSync setup may not scale sufficiently to handle the load. In this case, we recommend you [contact us](/resources/contact-us) to discuss the options. However, we give a basic overview of using multiple PowerSync instances to scale here.

Each PowerSync "instance" is a single endpoint (URL) that is backed by:
1. One replication container.
2. Multiple API containers, scaling horizontally.
3. One bucket storage database.

This setup is described in the [Overview](/self-hosting/lifecycle-maintenance).

To scale further, multiple copies of this setup can be run, using the same source database.

## Mapping user -> PowerSync endpoint

Since each PowerSync instance maintains its own copy of the bucket data, the exact list of operations and associated checksum will be different between them. This means the same client must connect to the same endpoint every time, otherwise they will have to re-sync all their data every time they switch. Multiple PowerSync instances cannot be load-balanced behind the same subdomain.

To ensure the same user always connects to the same endpoint, we recommend:
1. Do an API lookup from the client application to get the PowerSync endpoint, rather than hardcoding it in the application.
2. Either store the endpoint associated with each user, or compute it automatically using a hash function on the user id.
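
The hash-based option in the list above can be sketched as follows. This is an illustration only: the endpoint URLs and the function name `endpoint_for_user` are hypothetical, and SHA-256 is one reasonable choice of stable hash rather than a PowerSync requirement.

```python
import hashlib
from typing import Sequence

# Hypothetical endpoint list; replace with your own instance URLs.
ENDPOINTS = (
    "https://powersync-0.example.com",
    "https://powersync-1.example.com",
    "https://powersync-2.example.com",
)


def endpoint_for_user(user_id: str, endpoints: Sequence[str] = ENDPOINTS) -> str:
    """Map a user id to one endpoint, stable across processes and restarts."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    # Use a cryptographic digest instead of the built-in hash(), which is
    # randomized per process for strings and would not give a stable mapping.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(endpoints)
    return endpoints[index]


print(endpoint_for_user("user-123") == endpoint_for_user("user-123"))  # prints True
```

Note that with a simple modulo mapping, changing the number of endpoints reassigns most users, and each reassigned client will re-sync its data. Storing the endpoint per user avoids this at the cost of an extra lookup.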


24 changes: 0 additions & 24 deletions self-hosting/lifecycle-maintenance/server-specs.mdx

This file was deleted.