Skip to content

feat(rpc): Add core RPC types, client interface, and mock client [1/8] (OSS)#16645

Open
zhichenxu-meta wants to merge 1 commit intofacebookincubator:mainfrom
zhichenxu-meta:export-D95008955
Open

feat(rpc): Add core RPC types, client interface, and mock client [1/8] (OSS)#16645
zhichenxu-meta wants to merge 1 commit intofacebookincubator:mainfrom
zhichenxu-meta:export-D95008955

Conversation

@zhichenxu-meta
Copy link
Contributor

Summary:

Stack Overview

Problem

Presto SQL queries need to call external services during execution — LLM
inference, embedding generation, vector search, etc. These calls are
high-latency (100ms–10s) and must be non-blocking so Velox pipelines stay
productive. Today there is no framework for this in Velox; each service
integration is built ad-hoc.

User Experience

Adding a new RPC function requires writing a single C++ file:

// 1. Implement the AsyncRPCFunction interface
class MyRpcFunction : public AsyncRPCFunction {
  std::string name() const override { return "my_rpc_function"; }
  TypePtr resultType() const override { return VARCHAR(); }
  std::vector<RPCRequest> prepareRequests(...) const override { ... }
  VectorPtr buildOutput(...) const override { ... }
  std::shared_ptr<IRPCClient> getClient() const override { ... }
};

// 2. Register with signatures (one line per alias)
static AsyncRPCFunctionRegistrar __reg(
    "my_rpc_function",
    []() { return std::make_shared<MyRpcFunction>(); },
    MyRpcFunction::signatures());

That's it. The function is automatically:

  • Discovered by the native sidecar (GET /v1/functions)
  • Available in SQL by unqualified name (SELECT my_rpc_function(...))
  • Intercepted by the Java planner and rewritten to an RPCNode
  • Executed asynchronously by the RPCOperator on workers

No Java code changes needed. No hardcoded function name lists.

Design

Single-operator architecture: TableScan -> RPCOperator -> downstream

The RPCOperator handles both dispatch (send) and join (receive) within
one operator, using RPCState for async state management. It supports two
streaming modes:

  • PER_ROW: Emit rows as they complete (low tail latency for
    high-variance workloads like LLM inference)
  • BATCH: Wait for all rows before emitting (lower overhead for
    uniform-latency workloads)

Key components: per-tier rate limiting (RPCRateLimiter), pluggable RPC
client interface (IRPCClient), and function registry with automatic
sidecar discovery (AsyncRPCFunctionRegistry).

Stack Contents

  • [1/8] Core types, client interface, mock client (OSS)
  • [2/8] RPCNode plan node and RPCState shared state (OSS)
  • [3/8] RPCOperator, RPCRateLimiter, RPCPlanNodeTranslator (OSS)
  • [4/8] Unit tests for core framework (OSS)
  • [5/8] AsyncRPC function registry and PlanTransformer (OSS)
  • [6/8] Production RPC clients and factory (Meta)
  • [7/8] Async RPC function implementation (Meta)
  • [8/8] Java planner, Presto protocol, plan converter (OSS)

This Diff

Defines the foundational types and interfaces for the RPC framework:

  • RPCTypes.h: RPCRequest and RPCResponse structs for passing data
    between the operator and RPC clients. RPCStreamingMode enum (PER_ROW
    vs BATCH).
  • IRPCClient.h: Abstract client interface with call() for single-row
    RPCs and callBatch() for batch RPCs, both returning SemiFuture.
    Default callBatch() fans out to individual call() invocations.
  • AsyncRPCFunction.h: Base class that function implementations extend.
    Provides prepareRequests() to convert input vectors into RPCRequests,
    buildOutput() to convert RPCResponses back into result vectors, and
    getClient()/getBatchClient() factory methods.
  • MockRPCClient: Test client with configurable latency, error injection,
    and echo behavior for unit testing.

Reviewed By: xiaoxmeng

Differential Revision: D95008955

@netlify
Copy link

netlify bot commented Mar 5, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 2317377
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69a9d47df56fd7000841d516

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 5, 2026
@meta-codesync
Copy link

meta-codesync bot commented Mar 5, 2026

@zhichenxu-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D95008955.

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhichenxu-meta you need to add cmake file?

@zhichenxu-meta
Copy link
Contributor Author

zhichenxu-meta commented Mar 5, 2026

@zhichenxu-meta you need to add cmake file?
Added cmake

zhichenxu-meta added a commit to zhichenxu-meta/velox that referenced this pull request Mar 5, 2026
…] (OSS) (facebookincubator#16645)

Summary:

## Stack Overview

### Problem

Presto SQL queries need to call external services during execution — LLM
inference, embedding generation, vector search, etc. These calls are
high-latency (100ms–10s) and must be non-blocking so Velox pipelines stay
productive. Today there is no framework for this in Velox; each service
integration is built ad-hoc.

### User Experience

Adding a new RPC function requires writing a single C++ file:

```cpp
// 1. Implement the AsyncRPCFunction interface
class MyRpcFunction : public AsyncRPCFunction {
  std::string name() const override { return "my_rpc_function"; }
  TypePtr resultType() const override { return VARCHAR(); }
  std::vector<RPCRequest> prepareRequests(...) const override { ... }
  VectorPtr buildOutput(...) const override { ... }
  std::shared_ptr<IRPCClient> getClient() const override { ... }
};

// 2. Register with signatures (one line per alias)
static AsyncRPCFunctionRegistrar __reg(
    "my_rpc_function",
    []() { return std::make_shared<MyRpcFunction>(); },
    MyRpcFunction::signatures());
```

That's it. The function is automatically:
- Discovered by the native sidecar (`GET /v1/functions`)
- Available in SQL by unqualified name (`SELECT my_rpc_function(...)`)
- Intercepted by the Java planner and rewritten to an RPCNode
- Executed asynchronously by the RPCOperator on workers

No Java code changes needed. No hardcoded function name lists.

### Design

Single-operator architecture: `TableScan -> RPCOperator -> downstream`

The RPCOperator handles both dispatch (send) and join (receive) within
one operator, using RPCState for async state management. It supports two
streaming modes:
- **PER_ROW**: Emit rows as they complete (low tail latency for
  high-variance workloads like LLM inference)
- **BATCH**: Wait for all rows before emitting (lower overhead for
  uniform-latency workloads)

Key components: per-tier rate limiting (RPCRateLimiter), pluggable RPC
client interface (IRPCClient), and function registry with automatic
sidecar discovery (AsyncRPCFunctionRegistry).

### Stack Contents

- [1/8] Core types, client interface, mock client (OSS)
- [2/8] RPCNode plan node and RPCState shared state (OSS)
- [3/8] RPCOperator, RPCRateLimiter, RPCPlanNodeTranslator (OSS)
- [4/8] Unit tests for core framework (OSS)
- [5/8] AsyncRPC function registry and PlanTransformer (OSS)
- [6/8] Production RPC clients and factory (Meta)
- [7/8] Async RPC function implementation (Meta)
- [8/8] Java planner, Presto protocol, plan converter (OSS)

## This Diff

Defines the foundational types and interfaces for the RPC framework:

- `RPCTypes.h`: `RPCRequest` and `RPCResponse` structs for passing data
  between the operator and RPC clients. `RPCStreamingMode` enum (PER_ROW
  vs BATCH).
- `IRPCClient.h`: Abstract client interface with `call()` for single-row
  RPCs and `callBatch()` for batch RPCs, both returning `SemiFuture`.
  Default `callBatch()` fans out to individual `call()` invocations.
- `AsyncRPCFunction.h`: Base class that function implementations extend.
  Provides `prepareRequests()` to convert input vectors into RPCRequests,
  `buildOutput()` to convert RPCResponses back into result vectors, and
  `getClient()`/`getBatchClient()` factory methods.
- `MockRPCClient`: Test client with configurable latency, error injection,
  and echo behavior for unit testing.

Reviewed By: xiaoxmeng

Differential Revision: D95008955
…] (OSS) (facebookincubator#16645)

Summary:

## Stack Overview

### Problem

Presto SQL queries need to call external services during execution — LLM
inference, embedding generation, vector search, etc. These calls are
high-latency (100ms–10s) and must be non-blocking so Velox pipelines stay
productive. Today there is no framework for this in Velox; each service
integration is built ad-hoc.

### User Experience

Adding a new RPC function requires writing a single C++ file:

```cpp
// 1. Implement the AsyncRPCFunction interface
class MyRpcFunction : public AsyncRPCFunction {
  std::string name() const override { return "my_rpc_function"; }
  TypePtr resultType() const override { return VARCHAR(); }
  std::vector<RPCRequest> prepareRequests(...) const override { ... }
  VectorPtr buildOutput(...) const override { ... }
  std::shared_ptr<IRPCClient> getClient() const override { ... }
};

// 2. Register with signatures (one line per alias)
static AsyncRPCFunctionRegistrar __reg(
    "my_rpc_function",
    []() { return std::make_shared<MyRpcFunction>(); },
    MyRpcFunction::signatures());
```

That's it. The function is automatically:
- Discovered by the native sidecar (`GET /v1/functions`)
- Available in SQL by unqualified name (`SELECT my_rpc_function(...)`)
- Intercepted by the Java planner and rewritten to an RPCNode
- Executed asynchronously by the RPCOperator on workers

No Java code changes needed. No hardcoded function name lists.

### Design

Single-operator architecture: `TableScan -> RPCOperator -> downstream`

The RPCOperator handles both dispatch (send) and join (receive) within
one operator, using RPCState for async state management. It supports two
streaming modes:
- **PER_ROW**: Emit rows as they complete (low tail latency for
  high-variance workloads like LLM inference)
- **BATCH**: Wait for all rows before emitting (lower overhead for
  uniform-latency workloads)

Key components: per-tier rate limiting (RPCRateLimiter), pluggable RPC
client interface (IRPCClient), and function registry with automatic
sidecar discovery (AsyncRPCFunctionRegistry).

### Stack Contents

- [1/8] Core types, client interface, mock client (OSS)
- [2/8] RPCNode plan node and RPCState shared state (OSS)
- [3/8] RPCOperator, RPCRateLimiter, RPCPlanNodeTranslator (OSS)
- [4/8] Unit tests for core framework (OSS)
- [5/8] AsyncRPC function registry and PlanTransformer (OSS)
- [6/8] Production RPC clients and factory (Meta)
- [7/8] Async RPC function implementation (Meta)
- [8/8] Java planner, Presto protocol, plan converter (OSS)

## This Diff

Defines the foundational types and interfaces for the RPC framework:

- `RPCTypes.h`: `RPCRequest` and `RPCResponse` structs for passing data
  between the operator and RPC clients. `RPCStreamingMode` enum (PER_ROW
  vs BATCH).
- `IRPCClient.h`: Abstract client interface with `call()` for single-row
  RPCs and `callBatch()` for batch RPCs, both returning `SemiFuture`.
  Default `callBatch()` fans out to individual `call()` invocations.
- `AsyncRPCFunction.h`: Base class that function implementations extend.
  Provides `prepareRequests()` to convert input vectors into RPCRequests,
  `buildOutput()` to convert RPCResponses back into result vectors, and
  `getClient()`/`getBatchClient()` factory methods.
- `MockRPCClient`: Test client with configurable latency, error injection,
  and echo behavior for unit testing.

Reviewed By: xiaoxmeng

Differential Revision: D95008955
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants