55 changes: 55 additions & 0 deletions rust/Cargo.lock


1 change: 1 addition & 0 deletions rust/Cargo.toml
@@ -60,6 +60,7 @@ dashmap = "5"
futures = "0.3"
tokio-util = "0.7"
async-trait = "0.1"
async-channel = "2.3"

# Error handling
thiserror = "2"
111 changes: 111 additions & 0 deletions rust/spec/e2e-test-spec.md
@@ -0,0 +1,111 @@
<!--
Copyright (c) 2025 ADBC Drivers Contributors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# E2E Test Specification — Rust ADBC Driver (CloudFetch)

This document captures connection parameters and execution instructions for
CloudFetch end-to-end testing of the Databricks Rust ADBC driver.
A separate agent reads this document to generate test implementations.

---

## Connection Parameters

Source: `/home/e.wang/databricks-driver-test/databricks-test-config.json`

| Config field | ADBC option key | Env var | Value |
|---|---|---|---|
| `uri` / `hostName` | `OptionDatabase::Uri` | `DATABRICKS_HOST` | `https://adb-6436897454825492.12.azuredatabricks.net` |
| `path` | `databricks.http_path` | `DATABRICKS_HTTP_PATH` | `/sql/1.0/warehouses/2f03dd43e35e2aa0` |
| `token` | `databricks.access_token` | `DATABRICKS_TOKEN` | *(read from config)* |

> **Note:** `uri` in the config is the full warehouse URL. Pass only its host
> portion to the driver as `OptionDatabase::Uri`; `path` maps separately to
> `databricks.http_path`.
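The host/path split can be sketched in a few lines of Rust. The helper name `split_warehouse_uri` is purely illustrative and is not part of the driver API:

```rust
/// Split a full warehouse URL into the host portion (passed as the driver
/// URI) and the HTTP path (passed as `databricks.http_path`).
/// Illustrative helper only.
fn split_warehouse_uri(uri: &str) -> (String, String) {
    // Skip past "https://" so the first '/' we find starts the path.
    let after_scheme = uri.find("://").map(|i| i + 3).unwrap_or(0);
    match uri[after_scheme..].find('/') {
        Some(rel) => {
            let cut = after_scheme + rel;
            (uri[..cut].to_string(), uri[cut..].to_string())
        }
        None => (uri.to_string(), String::new()),
    }
}

fn main() {
    let full = "https://adb-6436897454825492.12.azuredatabricks.net/sql/1.0/warehouses/2f03dd43e35e2aa0";
    let (host, http_path) = split_warehouse_uri(full);
    println!("{host}");      // -> OptionDatabase::Uri
    println!("{http_path}"); // -> databricks.http_path
}
```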

---

## CloudFetch Test Scenarios

### Large result sets
- **1M rows**: all rows received across multiple batches.
- **10M rows**: exercises the link-prefetch loop beyond the server's 32-link-per-response limit; all rows received.

### Fault injection (external harness only, via mitmproxy)

| Scenario | What to verify |
|---|---|
| Link expiry | Proxy returns 403 on presigned-URL download; driver refetches and retries |
| Download timeout | Proxy delays response past `chunk_ready_timeout_ms`; driver surfaces timeout error |
| Partial download | Proxy closes connection mid-stream; driver retries or fails cleanly |
| SEA 500 on chunk-link fetch | `GET /result/chunks` returns 500; driver retries up to `max_retries` then fails |
| Slow consumer | Consumer reads with deliberate delay; `chunks_in_memory` backpressure prevents unbounded memory use |
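The slow-consumer expectation boils down to bounded buffering: once `chunks_in_memory` chunks are queued, the download side must stall rather than keep allocating. A std-only sketch of that shape (capacity, chunk sizes, and names are illustrative; the driver itself uses async channels):

```rust
use std::sync::mpsc::sync_channel;
use std::thread;
use std::time::Duration;

fn main() {
    // The channel capacity stands in for `chunks_in_memory`: send() blocks
    // once 3 undelivered chunks sit in the buffer, so producer-side memory
    // stays bounded no matter how slow the consumer is.
    let (tx, rx) = sync_channel::<Vec<u8>>(3);

    let producer = thread::spawn(move || {
        for i in 0..10u8 {
            tx.send(vec![i; 1024]).unwrap(); // blocks when buffer is full
        }
    });

    let mut total = 0usize;
    for chunk in rx {
        thread::sleep(Duration::from_millis(1)); // deliberately slow consumer
        total += chunk.len();
    }
    producer.join().unwrap();
    assert_eq!(total, 10 * 1024); // every chunk still arrives, in order
}
```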

---

## How to Run

### Option 1: Native cargo tests

Set credentials in `rust/.cargo/config.toml` (gitignored):

```toml
[env]
DATABRICKS_HOST = "https://adb-6436897454825492.12.azuredatabricks.net"
DATABRICKS_HTTP_PATH = "/sql/1.0/warehouses/2f03dd43e35e2aa0"
DATABRICKS_TOKEN = "<pat-token>"
```

```bash
cargo test --test e2e -- --include-ignored
```
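A skeleton for such a test might look like the following. The helper, test name, and assertions are illustrative, not the actual suite:

```rust
use std::env;

/// Gather the three credentials via a lookup function; `None` means at
/// least one is unset. Illustrative helper only.
fn load_e2e_config(get: impl Fn(&str) -> Option<String>) -> Option<(String, String, String)> {
    Some((
        get("DATABRICKS_HOST")?,
        get("DATABRICKS_HTTP_PATH")?,
        get("DATABRICKS_TOKEN")?,
    ))
}

fn main() {
    // In the real suite this would live in a test, not main().
    match load_e2e_config(|k| env::var(k).ok()) {
        Some((host, ..)) => println!("configured for {host}"),
        None => println!("credentials not set; e2e tests would be skipped"),
    }
}

// Marked #[ignore] so plain `cargo test` skips it; the `--include-ignored`
// flag shown above opts it back in.
#[test]
#[ignore]
fn test_cloudfetch_smoke() {
    let (host, path, _token) =
        load_e2e_config(|k| env::var(k).ok()).expect("credentials not set");
    assert!(host.starts_with("https://"));
    assert!(path.starts_with("/sql/"));
}
```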

### Option 2: External harness (required for fault injection)

```bash
cd /home/e.wang/databricks-driver-test

# Test against local driver
DATABRICKS_DRIVER_PATH=/home/e.wang/databricks/rust \
DATABRICKS_TEST_CONFIG_FILE=./databricks-test-config.json \
./adbc_rust_tests.sh

# Filter by test name
DATABRICKS_DRIVER_PATH=/home/e.wang/databricks/rust \
DATABRICKS_TEST_CONFIG_FILE=./databricks-test-config.json \
./adbc_rust_tests.sh --filter "test_cloudfetch"

# Run a specific test binary
DATABRICKS_DRIVER_PATH=/home/e.wang/databricks/rust \
DATABRICKS_TEST_CONFIG_FILE=./databricks-test-config.json \
./adbc_rust_tests.sh --test cloud_fetch
```

> If a `./driver` symlink is left over from a previous run, remove it first: `rm ./driver`

**How it works:** `DATABRICKS_DRIVER_PATH` causes the script to symlink `./driver →
/home/e.wang/databricks/rust` instead of cloning from GitHub, then runs
`cargo build --release` in `$DRIVER_DIR/rust`.

**All env vars:**

| Variable | Required | Description |
|---|---|---|
| `DATABRICKS_TEST_CONFIG_FILE` | **Yes** | Path to `databricks-test-config.json` |
| `DATABRICKS_DRIVER_PATH` | No | Local `rust/` dir; skips GitHub clone |
| `DATABRICKS_DRIVER_REPO` | No | Git URL (default: `https://github.com/adbc-drivers/databricks.git`) |
| `RUST_TEST_FILTER` | No | Test name substring filter |
| `PROXY_PORT` | No | mitmproxy port (default: `8080`) |
| `PROXY_CONTROL_PORT` | No | Proxy control API port (default: `18081`) |
120 changes: 120 additions & 0 deletions rust/spec/orchestration_spec.md
@@ -0,0 +1,120 @@
<!--
Copyright (c) 2025 ADBC Drivers Contributors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Orchestration Spec — E2E Agent

The E2E Agent is the final task of every sprint whose spec directory contains
an `e2e-test-spec.md`. It runs after all implementation tasks are complete.

It reads only the design doc and the e2e spec — it has no knowledge of
individual task specs or implementation details.

---

## Inputs

| Input | Purpose |
|---|---|
| Design doc (e.g. `cloudfetch-pipeline-redesign.md`) | What the feature does — used to understand intent when writing tests |
| `e2e-test-spec.md` | Connection parameters, test scenarios, and run instructions |

---

## Protocol

```
After all impl tasks complete:
└─ spawn E2E Agent
├─ Phase 1: Write or locate tests
│ Read e2e-test-spec.md and design doc
│ For each scenario in e2e-test-spec.md:
│ If not covered by an existing test → write it
│ Run all e2e tests per instructions in e2e-test-spec.md
├─ if all tests pass → DONE
└─ if any test fails → Phase 2: Fix the stack
├─ Identify the offending commit
│ Read git log, match commits to feature descriptions
│ Use git bisect if the regression is not obvious
├─ Checkout the branch containing the bad commit
├─ Fix the code (amend the commit)
├─ Rebase the entire stack forward
│ Rebase every downstream branch in dependency order
├─ Return to the tip branch
└─ Re-run all e2e tests → repeat until all pass
```

---

## Exit Criteria

All scenarios in `e2e-test-spec.md` pass. If the agent cannot resolve
failures, it escalates to the human with:
- Which test(s) failed
- Which commit was identified as the source
- What fix was attempted and why it did not resolve the failure

---

## Prompt Template

```
You are the E2E Agent. All implementation tasks are complete.
Your job is to validate the feature end-to-end and fix any regressions.

## Inputs
- Design doc: <path>
- E2E spec: <path to e2e-test-spec.md>

## Phase 1 — Write or locate tests
Read the e2e spec. For each scenario:
1. Check if an existing test covers it.
2. If not, write the test using the connection parameters in the e2e spec.
3. Run all e2e tests using the run instructions in the e2e spec.

## Phase 2 — Fix regressions (if any tests fail)
1. Run `git log --oneline` to see the commit stack.
2. Match the failing behavior to the most likely responsible commit.
Use git bisect if it is not obvious.
3. Checkout that branch/commit.
4. Fix the code and amend the commit.
5. Rebase all downstream branches in order.
6. Return to the tip branch and re-run all e2e tests.
7. Repeat until all tests pass.

## Exit criteria
Every scenario in the e2e spec passes.
Report DONE when complete, or ESCALATE with a summary if you cannot resolve.
```

---

## File Locations

| File | Purpose |
|---|---|
| `rust/spec/orchestration_spec.md` | This file |
| `rust/spec/e2e-test-spec.md` | Connection parameters, scenarios, and run instructions |
| `rust/spec/sprint-plan-*.md` | Source for task generation; triggers E2E task when e2e-test-spec.md exists |
9 changes: 0 additions & 9 deletions rust/src/database.rs
@@ -228,15 +228,6 @@ impl Optionable for Database {
Err(DatabricksErrorHelper::set_invalid_option(&key, &value).to_adbc())
}
}
"databricks.cloudfetch.chunk_ready_timeout_ms" => {
if let Some(v) = Self::parse_int_option(&value) {
self.cloudfetch_config.chunk_ready_timeout =
Some(Duration::from_millis(v as u64));
Ok(())
} else {
Err(DatabricksErrorHelper::set_invalid_option(&key, &value).to_adbc())
}
}
"databricks.cloudfetch.speed_threshold_mbps" => {
if let Some(v) = Self::parse_float_option(&value) {
self.cloudfetch_config.speed_threshold_mbps = v;