|
| 1 | +<!-- |
| 2 | + Copyright (c) 2025 ADBC Drivers Contributors |
| 3 | +
|
| 4 | + Licensed under the Apache License, Version 2.0 (the "License"); |
| 5 | + you may not use this file except in compliance with the License. |
| 6 | + You may obtain a copy of the License at |
| 7 | +
|
| 8 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 9 | +
|
| 10 | + Unless required by applicable law or agreed to in writing, software |
| 11 | + distributed under the License is distributed on an "AS IS" BASIS, |
| 12 | + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| 13 | + See the License for the specific language governing permissions and |
| 14 | + limitations under the License. |
| 15 | +--> |
| 16 | + |
1 | 17 | # CloudFetch Pipeline Redesign: DashMap → Channel-Based Pipeline |
2 | 18 |
|
3 | 19 | **Status:** Proposed |
@@ -61,14 +77,22 @@ graph LR |
61 | 77 | **Two channels replace the DashMap:** |
62 | 78 |
|
63 | 79 | - **`download_channel`** — scheduler pushes `ChunkDownloadTask` items; download workers pull |
64 | | - from it. Unbounded (backpressure is applied via `result_channel`). |
| 80 | + from it. Bounded to `max_chunks_in_memory` using `async-channel` (MPMC). The scheduler's |
| 81 | + sequential loop means `download_channel` and `result_channel` are always filled in lockstep, |
| 82 | + so the same bound caps both. |
65 | 83 | - **`result_channel`** — scheduler pushes `ChunkHandle` items *in chunk-index order*, bounded |
66 | 84 | to `max_chunks_in_memory`. The consumer reads from it in order and awaits each handle. |
67 | 85 |
|
68 | 86 | Items are enqueued to `result_channel` by the scheduler before the download starts, preserving |
69 | 87 | sequential ordering even when downloads complete out of order. This matches the C# pattern in |
70 | 88 | `CloudFetchDownloader.cs` (result enqueued before download task is awaited). |
71 | 89 |
|
| 90 | +**Why `async-channel` for `download_channel`?** `tokio::sync::mpsc` is single-consumer — |
| 91 | +N workers cannot each call `.recv()` on the same receiver. `async-channel` is MPMC: each |
| 92 | +worker clones the receiver and blocks on `.recv()` independently. When a task arrives, exactly |
| 93 | +one worker wakes up. No mutex needed. Equivalent to C#'s `BlockingCollection` consumed via |
| 94 | +`GetConsumingEnumerable` with a `SemaphoreSlim` controlling parallelism. |
| 95 | + |
72 | 96 | --- |
73 | 97 |
|
74 | 98 | ## Key Types |
@@ -125,9 +149,14 @@ sequenceDiagram |
125 | 149 | ``` |
126 | 150 |
|
127 | 151 | - `fetch_links()` returns a **batch** of `CloudFetchLink` values (`Vec<CloudFetchLink>`), not a |
128 | | - single link. The scheduler iterates the batch and creates one oneshot pair per link. |
129 | | -- `SeaChunkLinkFetcher` caches and prefetches links internally; the scheduler simply consumes |
130 | | - what `fetch_links()` returns, advancing `next_chunk_index` after each batch. |
| 152 | + single link. The batch size is **server-determined** — one `fetch_links()` call maps to one |
| 153 | + API call and returns however many links the server provides in that page. |
| 154 | +- Calling `fetch_links(next_chunk_index)` internally updates `SeaChunkLinkFetcher`'s |
| 155 | + `current_consumer_index` and triggers a background prefetch task if the prefetch window |
| 156 | + (`link_prefetch_window`, default 128 chunks) needs filling. The scheduler does not need to |
| 157 | + signal prefetch explicitly — `fetch_links()` handles it. |
| 158 | +- `SeaChunkLinkFetcher` serves links from its `DashMap` cache immediately when they are
| 159 | +  available, falling back to a synchronous server fetch only on a cache miss (i.e., the prefetch hasn't caught up).
131 | 160 | - Creates a `oneshot` channel pair per chunk. |
132 | 161 | - Sends `ChunkDownloadTask` to `download_channel` and `ChunkHandle` to `result_channel`. |
133 | 162 | - The bounded `result_channel` provides backpressure automatically — no manual |
@@ -171,8 +200,16 @@ sequenceDiagram |
171 | 200 | end |
172 | 201 | ``` |
173 | 202 |
|
174 | | -The worker owns `ChunkDownloadTask` outright. URL refresh mutates a local `link` variable — |
175 | | -no map lookup, no lock, no guard. Mirrors C#'s `DownloadFileAsync` directly. |
| 203 | +The worker owns `ChunkDownloadTask` outright. When `refetch_link()` returns a fresh |
| 204 | +`CloudFetchLink`, it is assigned to the worker's local `link` variable — no shared state is |
| 205 | +mutated, no lock needed, no map lookup. This is the key improvement over the DashMap approach |
| 206 | +where calling `refetch_link().await` while holding a `get_mut()` guard risked deadlocking the shard.
| 207 | +Mirrors C#'s `DownloadFileAsync` directly. |
| 208 | + |
| 209 | +**Why retry logic lives in the worker, not a shared HTTP client layer:** the 401/403 path |
| 210 | +requires calling `refetch_link()` on `ChunkLinkFetcher`, which is CloudFetch-specific state |
| 211 | +unavailable to a generic HTTP client. A future shared HTTP client could own 5xx/network |
| 212 | +retries, but URL refresh must remain in the worker to access `ChunkLinkFetcher`. |
176 | 213 |
|
177 | 214 | **Proactive expiry check** (C# parity — `IsExpiredOrExpiringSoon`): before the first HTTP |
178 | 215 | request for each chunk, the worker checks `link.is_expired()` using the |
@@ -232,6 +269,9 @@ pub struct StreamingCloudFetchProvider { |
232 | 269 |
|
233 | 270 | // Cancellation |
234 | 271 | cancel_token: CancellationToken, |
| 272 | + |
| 273 | +    // Worker task handles — drained on explicit shutdown (dropping a JoinSet aborts tasks)
| 274 | + worker_handles: JoinSet<()>, |
235 | 275 | } |
236 | 276 | ``` |
237 | 277 |
|
@@ -259,11 +299,17 @@ Both types are no longer needed and can be deleted. The equivalent state lives o |
259 | 299 | | Component | Concurrency primitive | Reason | |
260 | 300 | |---|---|---| |
261 | 301 | | Scheduler | Single `tokio::spawn` task | Sequential chunk ordering required | |
262 | | -| Download workers | N `tokio::spawn` tasks sharing `download_channel` | Parallel downloads | |
| 302 | +| Download workers | N `tokio::spawn` tasks, each holding a cloned `async_channel::Receiver` | All N workers block on `.recv()` simultaneously; exactly one wakes per task | |
263 | 303 | | Consumer | Caller's task (no spawn) | Sequential result consumption | |
264 | | -| `result_channel` | Bounded `mpsc` (capacity = `max_chunks_in_memory`) | Backpressure without manual counter | |
| 304 | +| `download_channel` | `async_channel::bounded(max_chunks_in_memory)` | MPMC required; bounded to match `result_channel` capacity | |
| 305 | +| `result_channel` | `tokio::sync::mpsc` bounded (capacity = `max_chunks_in_memory`) | Single consumer; backpressure without manual counter | |
265 | 306 | | URL refresh in worker | Local variable mutation | No shared state — no lock needed | |
266 | 307 |
|
| 308 | +**Worker lifecycle:** `StreamingCloudFetchProvider` holds a `tokio::task::JoinSet<()>`
| 309 | +containing the N worker task handles. On shutdown: `cancel_token.cancel()` → workers exit
| 310 | +their recv loop → scheduler drops the `download_channel` sender → the `JoinSet` is drained
| 311 | +with `join_next().await` in an explicit shutdown step (`Drop` cannot `.await`; dropping a `JoinSet` aborts still-running tasks).
| 312 | + |
267 | 313 | **Thread safety:** No shared mutable state between components. Each `ChunkDownloadTask` is |
268 | 314 | owned by exactly one worker at a time. Each `ChunkHandle` is owned by the consumer. |
269 | 315 |
|
@@ -309,7 +355,7 @@ field is **removed**, and two defaults are **corrected** to match C#. |
309 | 355 |
|
310 | 356 | | Field | Old use | New use | Default change? | |
311 | 357 | |---|---|---|---| |
312 | | -| `max_chunks_in_memory` | Manual `AtomicUsize` counter | `mpsc::channel(max_chunks_in_memory)` capacity bound | No | |
| 358 | +| `max_chunks_in_memory` | Manual `AtomicUsize` counter | `async_channel::bounded` + `mpsc::channel` capacity for both channels | **Yes: → 10** (C# equivalent: 200 MB ÷ 20 MB max chunk size) | |
313 | 359 | | `max_retries` | Per-download retry limit | Unchanged | **Yes: 5 → 3** (align with C# `MaxRetries = 3`) | |
314 | 360 | | `retry_delay` | Constant sleep | Linear backoff: `retry_delay * (attempt + 1)` — matches C# `RetryDelayMs * (retry + 1)` | **Yes: 1500ms → 500ms** (align with C# `RetryDelayMs = 500`) | |
315 | 361 | | `link_prefetch_window` | Background prefetch ahead of consumer | Unchanged — owned by `SeaChunkLinkFetcher` | No | |
@@ -337,6 +383,37 @@ address the root issue of co-locating three concerns in one structure. |
337 | 383 |
|
338 | 384 | --- |
339 | 385 |
|
| 386 | +## Implementation Phases |
| 387 | + |
| 388 | +### Phase 1: Foundation — Config and Pipeline Types |
| 389 | + |
| 390 | +- Update `CloudFetchConfig`: add `max_refresh_retries`, `num_download_workers`, |
| 391 | + `url_expiration_buffer_secs`; remove `chunk_ready_timeout`; correct defaults for |
| 392 | + `max_chunks_in_memory`, `max_retries`, `retry_delay` |
| 393 | +- Remove legacy types `ChunkEntry` and `ChunkState` from `types.rs` |
| 394 | +- Define `ChunkDownloadTask` and `ChunkHandle` structs |
| 395 | + |
| 396 | +Corresponds to PECO-2927 and the type definitions from PECO-2928. |
| 397 | + |
| 398 | +### Phase 2: Core Pipeline — Scheduler, Workers, Consumer |
| 399 | + |
| 400 | +- Implement the Scheduler task (fetch links → create oneshot pairs → push to both channels) |
| 401 | +- Implement Download Workers (retry contract, proactive expiry check, refetch on 401/403/404) |
| 402 | +- Implement Consumer (`next_batch` reads from `result_channel`, awaits each `ChunkHandle`) |
| 403 | + |
| 404 | +Corresponds to PECO-2928, PECO-2929, PECO-2930. |
| 405 | + |
| 406 | +### Phase 3: Integration — Provider Rewrite and Tests |
| 407 | + |
| 408 | +- Refactor `StreamingCloudFetchProvider` struct: replace `DashMap` fields with channels, |
| 409 | + wire up scheduler and workers in constructor, clean shutdown via `JoinSet` + `CancellationToken` |
| 410 | +- Unit tests (9 tests — see Test Strategy below) |
| 411 | +- Integration tests (3 tests — see Test Strategy below) |
| 412 | + |
| 413 | +Corresponds to PECO-2931, PECO-2932, PECO-2933. |
| 414 | + |
| 415 | +--- |
| 416 | + |
340 | 417 | ## Test Strategy |
341 | 418 |
|
342 | 419 | ### Unit Tests |
|