
Conversation

@andygrove
Member

@andygrove andygrove commented Jan 17, 2026

Summary

Adds a BallistaClientPool that caches gRPC connections to executors by (host, port). This avoids the overhead of establishing new connections for each partition fetch during shuffle reads.

Changes

  • Add BallistaClientPool struct in client.rs with:
    • get_or_connect(host, port, max_message_size) - returns the cached connection or creates a new one
    • remove(host, port) - removes a connection (useful for error recovery)
    • clear() - clears all cached connections
  • Add global_client_pool() function for shared pool access across all shuffle reads (a usage sketch follows this list)
  • Update fetch_partition_remote() in shuffle_reader.rs to use the connection pool
  • Remove stale connections from pool on fetch errors
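
For illustration, a usage sketch of the API described above. The method names and arguments come from the change list; the module path, parameter types, return type, and the max_message_size value are assumptions rather than the actual signatures in client.rs.

```rust
// Sketch only: paths and types are assumed; method names and arguments are
// taken from the change list above.
use ballista_core::client::global_client_pool;

async fn fetch_from_executor(host: &str, port: u16) -> Result<(), Box<dyn std::error::Error>> {
    // One shared pool for all shuffle reads in this process.
    let pool = global_client_pool();

    // A cache hit returns the existing client; otherwise a new connection is
    // created and cached under (host, port). 16 MiB is an example value.
    let _client = pool.get_or_connect(host, port, 16 * 1024 * 1024).await?;

    // ... issue the partition fetch with the client ...

    // If the fetch fails, evict the possibly stale connection so the next
    // attempt reconnects.
    pool.remove(host, port);
    Ok(())
}
```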

Implementation Details

The pool uses a read-write lock pattern with parking_lot::RwLock:

  • Fast path: read lock to check for existing connection (no contention for cache hits)
  • Slow path: create the connection without holding the lock, then take the write lock to cache it

Race handling: if multiple tasks try to connect to the same host simultaneously, each creates a connection but only one gets cached. The others detect this, return the cached client, and drop their extra connection. This case is rare, and the approach avoids holding a lock across the async connection establishment.
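
A minimal sketch of that locking pattern, using a stand-in Client type instead of the real BallistaClient and illustrative names throughout (the actual implementation lives in client.rs):

```rust
use std::collections::HashMap;
use std::sync::Arc;

use parking_lot::RwLock;

// Stand-in for BallistaClient, which is Clone and wraps an Arc, so cloned
// clients share the underlying connection.
#[derive(Clone)]
struct Client;

impl Client {
    async fn try_new(_host: &str, _port: u16, _max_message_size: usize) -> Result<Self, String> {
        Ok(Client)
    }
}

#[derive(Default)]
struct ClientPool {
    clients: Arc<RwLock<HashMap<(String, u16), Client>>>,
}

impl ClientPool {
    async fn get_or_connect(
        &self,
        host: &str,
        port: u16,
        max_message_size: usize,
    ) -> Result<Client, String> {
        let key = (host.to_owned(), port);

        // Fast path: read lock only, so cache hits do not contend.
        if let Some(client) = self.clients.read().get(&key) {
            return Ok(client.clone());
        }

        // Slow path: establish the connection without holding any lock ...
        let client = Client::try_new(host, port, max_message_size).await?;

        // ... then take the write lock to cache it. If another task connected
        // to the same executor first, keep its entry and drop ours.
        let mut clients = self.clients.write();
        Ok(clients.entry(key).or_insert(client).clone())
    }
}
```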

Error Handling

On FetchFailed or GrpcActionError, the connection is removed from the pool so subsequent requests create fresh connections (see the sketch after this list). This handles cases where:

  • The remote server closes the connection
  • Network issues cause connection problems
  • The executor is restarted
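
Continuing the ClientPool sketch from Implementation Details, the eviction on error could look roughly like this (do_fetch stands in for the actual Arrow Flight request; names are illustrative, not the real fetch_partition_remote() code):

```rust
impl ClientPool {
    // Drop a cached connection so the next get_or_connect() reconnects.
    fn remove(&self, host: &str, port: u16) {
        self.clients.write().remove(&(host.to_owned(), port));
    }
}

// Stand-in for the actual request that streams a shuffle partition.
async fn do_fetch(_client: &Client) -> Result<Vec<u8>, String> {
    Ok(Vec::new())
}

async fn fetch_partition(
    pool: &ClientPool,
    host: &str,
    port: u16,
    max_message_size: usize,
) -> Result<Vec<u8>, String> {
    let client = pool.get_or_connect(host, port, max_message_size).await?;

    match do_fetch(&client).await {
        Ok(data) => Ok(data),
        Err(e) => {
            // The cached connection may be stale (closed by the server,
            // broken by the network, or pointing at a restarted executor),
            // so evict it; the next request establishes a fresh connection.
            pool.remove(host, port);
            Err(e)
        }
    }
}
```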

Motivation

This addresses the TODO comment in shuffle_reader.rs:

// TODO for shuffle client connections, we should avoid creating new connections again and again.
// And we should also avoid to keep alive too many connections for long time.

Connection establishment involves TCP handshake, TLS negotiation (if enabled), and HTTP/2 setup. Reusing connections eliminates this overhead for repeated fetches from the same executor.

Note

This PR was created with AI assistance using Claude Code. All changes were reviewed and approved by a human maintainer.

Test plan

  • Existing shuffle_reader tests pass (cargo test -p ballista-core shuffle_reader)
  • Existing client tests pass (cargo test -p ballista-core client)
  • Manual testing with distributed queries to verify connection reuse

Adds a BallistaClientPool that caches gRPC connections to executors
by (host, port). This avoids the overhead of establishing new
connections for each partition fetch during shuffle reads.

Key changes:
- Add BallistaClientPool struct with get_or_connect() method
- Add global_client_pool() function for shared pool access
- Update fetch_partition_remote() to use the connection pool
- Remove stale connections from pool on fetch errors

The pool uses a read-write lock pattern with parking_lot::RwLock:
- Fast path: read lock to check for existing connection
- Slow path: create connection without holding lock, then
  write lock to cache (handles races by preferring existing)

Error handling: on FetchFailed or GrpcActionError, the connection
is removed from the pool so subsequent requests create fresh
connections. This handles cases where connections become stale
or the remote server closes them.

This addresses the TODO comment in shuffle_reader.rs about avoiding
creating new connections repeatedly.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@andygrove andygrove force-pushed the shuffle-reader-connection-pool branch from 70f145d to 4b28e0a on January 17, 2026 16:27
@andygrove andygrove requested a review from milenkovicm January 17, 2026 16:29
@milenkovicm
Contributor

milenkovicm commented Jan 17, 2026

Is there a chance we could cover #1367 somehow as well (if it is obvious)?

@milenkovicm
Contributor

There is a test

async fn should_force_local_read_with_flight(

which always forces a remote read; not sure if that would cover the last task in your list.

/// access during connection creation. The `BallistaClient` itself is `Clone`
/// (wrapping an `Arc`), so cloned clients share the underlying connection.
#[derive(Default)]
pub struct BallistaClientPool {
Contributor

Just wondering, would it make sense to use something off-the-shelf like deadpool or r2d2 instead of implementing our own?

If I'm not mistaken, the current implementation can, in theory, leak connections: a connection is only removed from the pool when it fails, but an executor could be removed or replaced and its connection never gets a chance to fail.

Member Author

@andygrove andygrove Jan 17, 2026

I pushed a TTL improvement, but moved this to draft for now

@andygrove andygrove marked this pull request as draft January 17, 2026 21:54
Connections in the pool were only removed when they failed. If an
executor was removed/replaced without causing connection errors,
stale connections would leak indefinitely.

This adds a configurable TTL (default 5 minutes) so connections
automatically expire and are replaced with fresh ones on the next
access attempt.
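
A minimal sketch of the expiry check, assuming each cache entry records its creation time (field and type names are illustrative, not the actual implementation):

```rust
use std::time::{Duration, Instant};

// Illustrative stand-in for the cached gRPC client.
#[derive(Clone)]
struct Client;

// Each cache entry remembers when its connection was created.
struct PooledClient {
    client: Client,
    created_at: Instant,
}

impl PooledClient {
    // An entry older than the TTL (e.g. Duration::from_secs(300) for the
    // 5-minute default) is treated as a cache miss, so the caller reconnects
    // and replaces it on the next access.
    fn get_if_fresh(&self, ttl: Duration) -> Option<Client> {
        if self.created_at.elapsed() < ttl {
            Some(self.client.clone())
        } else {
            None
        }
    }
}
```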

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@milenkovicm
Contributor

I guess we can't use r2d2 or deadpool, as they pool instances for a single destination; their interfaces have a get() method that returns a pooled instance, whereas we need get(address).

@milenkovicm
Contributor

Closes #736, I believe.
