@sanjivanipatrax
Contributor

This PR adds test coverage and automation support for Edge Server, including new test cases, helpers, provisioning logic, and installation scripts for end-to-end validation.

Features added:

  • End-to-end Edge Server test coverage across authentication, CRUD, replication, conflict resolution, and system verification scenarios (in tests/QE/edge_server/).
  • Helper modules for REST API interactions, configuration generation, and remote shell operations (in client/src/cbltest/api/ and client/src/cbltest/).
  • Environment setup scripts to install, uninstall, and configure Couchbase Server, Sync Gateway, and Edge Server components (in environment/cbs/, environment/sg/, and environment/edge_server/).
  • VM provisioning from the Mobile QE VM pool (in environment/vm_management/).

@sanjivanipatrax self-assigned this Oct 31, 2025
@sanjivanipatrax marked this pull request as draft October 31, 2025 13:26
@sanjivanipatrax removed the request for review from borrrden October 31, 2025 14:38
@@ -0,0 +1,694 @@
import asyncio
Member

This entire class is suspicious to me and is getting way too far into conflating orchestration and execution. Why is this class needed instead of a test server? What is it for?

Contributor Author

For edge server testing, we are using HTTP clients instead of test servers. This class helps us create several clients per VM (about 10–20) and make them send requests at the same time to different components.

Member

I don't see how that will create any sort of reliable testing. Is that any different from using 10-20 threads inside a test server to send requests? I worry about fragmentation and about opaque, hard-to-understand names like "http client", which doesn't really describe its purpose.
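
For readers following this thread, the "several clients sending requests at the same time" idea can be sketched with asyncio, which the file under review imports. This is only an illustration under stated assumptions, not the class from this PR: `fake_request` and `run_clients` are hypothetical names, and the HTTP call is replaced by a short sleep so the sketch runs without a network.

```python
import asyncio


async def fake_request(client_id: int, target: str) -> tuple[int, str]:
    """Hypothetical stand-in for one HTTP call made by one client."""
    await asyncio.sleep(0.01)  # simulate network latency instead of a real request
    return client_id, target


async def run_clients(targets: list[str], clients_per_target: int) -> list[tuple[int, str]]:
    """Start every simulated client at once and wait for all of them to finish."""
    tasks = [
        fake_request(client_id, target)
        for target in targets
        for client_id in range(clients_per_target)
    ]
    # gather() schedules the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(run_clients(["edge-server", "sync-gateway"], 10))
    print(len(results))
```

Whether this runs in a separate client process or as concurrent tasks inside a test server is the design question raised above; the sketch does not settle it.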

@@ -0,0 +1,183 @@
import subprocess
Member

I'm very skeptical about having tests that actually connect to and execute commands on a remote machine because now there are assumptions here about the environment on the other side. I guess there might be no other way to do this for edge server but at least I'd like to keep this as minimal as possible. Having a class called "remote shell" is vague and hard to understand.


-    def __init__(self, key: str, id: str, revid: str | None, cv: str | None) -> None:
+    def __init__(
+        self, key: str, id: str, revid: Optional[str], cv: Optional[str]
Member

Why change the signature?

Contributor Author

I hadn’t used the newer str | None syntax yet, so I went with Optional[str]. Totally fine to change it if you’d like.

@@ -0,0 +1,91 @@
import json
Member

This is seemingly redoing all the work that the orchestrator already does.

@@ -0,0 +1,248 @@
import json
Member

This is also redoing a bunch of orchestrator work

@@ -0,0 +1,139 @@
import json
Member

This is also redoing a bunch of orchestrator work, as well as doing some weird stuff with /etc/hosts....

@@ -0,0 +1,366 @@
import logging
Member

Is there any benefit to using the QE VM pool rather than AWS?

Contributor Author

We added code last December-January to test Server, Sync Gateway, Edge Server, and HTTP clients on QE VMs. While AWS setup covers Server and Sync Gateway, it doesn’t yet include Edge Server and HTTP clients. This PR adds QE VM support for all four components.

As soon as we get time, we could look at moving the tests to AWS, but right now we’re focusing on finishing the migration. Raised this PR since it’s been a while and an Edge Server release is expected in 1–2 weeks.

Member

I'd like this to be better organized then. Someone looking at this folder is likely to not understand why all of this is here. I added Edge Server orchestration months ago, so that is taken care of for AWS as well. As for HTTP client this PR is the first I'm hearing about it and I'm not a fan so far.

Member

After talking to other dev members, it would be ideal if this testing also moved to AWS rather than the VM pool. As such, let's try to work on that before merging. This branch can be used for testing for the impending release, and then we can work toward the AWS goal.

sanjivanipatrax and others added 20 commits November 13, 2025 11:11
* JS Server: Implemented snapshot & verify

Also updated CBL dependency to 1.0.0-5

* JS Server: Implemented blobs for /updateDatabase

* JS Server: Normalize collection names everywhere

i.e. convert `_default._default` to `_default`

* JS Server: Actually save doc after updating it

* Disable more JS tests and a few more test server fixes

1. Make it an error to have a snapshot update to a document that doesn't exist in a snapshot
2. Only return result.document if the snapshot update type is an update (not delete or purge)
3. Sometimes we make a snapshot of a document that does not exist in the DB in order to verify that it didn't get pulled unintentionally, so check that (it will show up as an entry with an "undefined" value)
4. updateDatabase should be allowed to create documents
5. Bug in update remove properties handling (it was iterating the actual strings and sending them one character at a time)

* Fix API spec for conflict resolvers and run query test

* JS Server: Added auth support for replicator

* JS Server: Implemented conflict resolvers

* JS Server: Throw error when updating keypath fails

* JS Server: Implemented push/pull filters

* JS Server: Resolve relative blob URLs

* JS Server: support updatedBlobs in /verifyDocuments

* JS Server: Just prettified tdkSchema.ts

* JS Server: Fixed transaction error in /updateDatabase

Updating a blob would trigger an exception because it's illegal to
make non-database async calls within a transaction.
So use Collection.updateMultiple() instead.

NOTE: This assumes that a /updateDatabase request does not list the
same document multiple times! If so, this will probably fail.

* A few small fixes

1. Normalize collection name in replicator config
2. Correct blob base URL

* Correct snapshot tests and behavior

Only update verifications (not delete or purge) return the document.  Also, there is a distinction between null and missing for snapshot entries:  null means "I want to verify this later, but it doesn't exist right now".  undefined is "Not included in the snapshot and ineligible for verification"

* Further correct Snapshot

- Spurious `!` on line 40 messes up the test for undefined
- I accidentally used `!!` (Kotlin syntax) instead of `!` on line 59
- Using `T | null` in DocumentMap is incorrect; nulls are handled in
  the declaration of #documents, which is
  `DocumentMap<cbl.CBLDocument | null>`.

* Correct an error code and prettify message

The TDK expects an HTTP 400 when a nonexistent blob is requested.  Remove "self" from error message since it results in it being printed twice, and make the error message "returned XXX" not weirdly formatted.

* Update travel JS dataset

Needs two empty collections to be compatible with SGW's setup

* Switch SGW certificate strategy

The cert is now issued by a CA that can easily be trusted in python, as well as browser.  Existing SDK cert pinning should still be fine as well.  Also only copy prerelease RPM if it's not already on the remote machine (uploads are slow)

* Emit "transport: ws" for Javascript topology setups

* Need to separate push and pull filter

Their functions are different signatures so a premade CreateFilter is not possible

* Relax smoke test requirements for error on connection failure

Since JS has no access to that information :(

---------

Co-authored-by: Jim Borden <[email protected]>
…modified overwritten changes to non-edge server code
6 participants