Skip to content

Conversation

@LukeButters
Copy link
Contributor

@LukeButters LukeButters commented Oct 20, 2025

Summary

This PR adds a new execution mode where RPC requests are executed over the Polling Pending Request Queue (e.g. Redis Queue) without a TCP connection. This means, given two nodes that are configured to use the same Redis queue those nodes could execute RPCs on each other through the Redis queue, without the need for a direct connection between the two.

Background

Halibut is a RPC framework where the "client" initiates the RPC call and the "Service" executes that call. Ordinarily that Service is some remote machine and the RPC is made over a TCP connection. In Polling mode the remote Service connects to the client (the remote creates the TCP connection). When the clients are configured in a multi-node setup the Remote Polling Service may not connect to the client that wants to the initiate the RPC, for this case a Redis queue is used.

The Change

This is a trivial change as it takes the existing infrastructure for executing a RPC calls and just cuts out TCP.

Current:
image

Currently in a multi node setup, the RPC call goes via the Redis Queue and then down a TCP connection to the Service that executes the work.
src

New Queue Execution:

image

The new Queue based execution mode simply removes the TCP part and so allows the RPC call to go over the existing Redis Queue and be executed by another node connected to Redis.

Motivation

Requesting Halibut logs between nodes.

Halibut provides an in memory rolling log of the last 100 log lines per Endpoint. In a multi-node setup currently one must go to each node to get these logs. Since a multi-node setup would already have a shared Redis, the support for RPC over Redis makes it trivial to request the logs from each node.

Clients behind a load balancer.

We are sometimes in the situation in which we need work to by picked up by specific nodes e.g. a client is connected to only one node and we need that node to process the work.

With this change and a distributed queue (e.g. the Redis one), it would be possible to setup something like:

  • Client connects to a node lets call the Client "bob"
  • That node would call halibutRunTime.PollLocalAsync(new Uri("local://bob"), workerCts.Token) and so would begin to processes messages sent to "local://bob".
  • A different node is able to send a request to bob in the usual halibut way: var echo = client.CreateAsyncClient<IEchoService, IAsyncClientEchoService>(new ("local://test-worker");
  • and the node connected to bob will collect the request and do it.

Changes

Core Implementation

  • HalibutRuntime.PollForRPCOverQueueAsync() - New method that polls a local:// queue and executes RPCs locally
  • queue:// URI scheme support - Added to routing logic in SendOutgoingRequestAsync()
  • Workers directly access the queue via GetQueue() and execute requests using ServiceInvoker
  • Simple polling loop: dequeue → invoke locally → apply response

Usage

see: RPCOverQueueExecutionModeFixture.SimpleRPCOverQueueExecutionExample

@LukeButters LukeButters requested a review from a team as a code owner October 20, 2025 03:56
LukeButters and others added 4 commits December 22, 2025 11:28
This adds a new execution mode where RPC requests are executed locally on
the worker node that dequeues work, rather than being proxied over TCP.

Changes:
- Add PollLocalAsync() method to HalibutRuntime for local queue polling
- Support local:// URI scheme for local execution endpoints
- Workers poll queue directly and execute RPCs locally via ServiceInvoker
- Add comprehensive design document explaining architecture and usage
- Add test fixture demonstrating local execution mode

Benefits:
- 10-100x lower latency (no TCP/SSL overhead)
- True horizontal scaling via worker pools
- Queue-agnostic (works with in-memory and Redis queues)
- Backward compatible with existing code

Usage:
```csharp
// Worker
var worker = new HalibutRuntime(serviceFactory);
worker.Services.AddSingleton<IMyService>(new MyServiceImpl());
await worker.PollLocalAsync(new Uri("local://worker-pool-a"), cancellationToken);

// Client
var client = new HalibutRuntime(serviceFactory);
var service = client.CreateAsyncClient<IMyService, IAsyncClientMyService>(
    new ServiceEndPoint("local://worker-pool-a", null));
await service.DoWorkAsync();
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…ModeFixture

The LocalExecutionModeFixture test uses Redis functionality (RedisFacadeBuilder,
RedisPendingRequestQueueFactory) which is only available in .NET 8.0 or greater.
Added #if NET8_0_OR_GREATER directive to match the pattern used in other Redis
queue tests in the codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Added back the SimplePollingExample test implementation that demonstrates
basic polling mode with TCP. This test serves as a reference example for
the Halibut polling pattern.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
LukeButters and others added 2 commits December 23, 2025 11:32
The hardcoded limit of 100 log events is now accessible via InMemoryConnectionLog.MaxLogEvents, allowing external code to reference this configuration value.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Making the class public so that the MaxLogEvents field can be accessed from outside the assembly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
public class HalibutExamplesFixture : BaseTest
{
[Test]
public async Task SimplePollingExample()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this since we lack simple examples of how to use halibit.

@LukeButters LukeButters changed the title Add local execution mode for queue-based RPC without TCP Add support for RPC over Redis. Dec 23, 2025
{
public static class InMemoryConnectionLogLimits
{
public static readonly int MaxLogEventsStored = 100;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be used in Octopus to limit returned logs in multi node setups.

@LukeButters LukeButters changed the title Add support for RPC over Redis. Add support for RPC over queue (e.g. RPC over Redis) execution mode Jan 5, 2026
@LukeButters LukeButters requested a review from rhysparry January 6, 2026 04:52
Copy link
Contributor

@rhysparry rhysparry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just need to clarify the final scheme.

{
public class HalibutRuntime : IHalibutRuntime
{
public const string QueueEndpointScheme = "local";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were we switching this to queue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was sure I did that......

{
if (queueOnlyEndpoint.Scheme.ToLowerInvariant() != QueueEndpointScheme)
{
throw new ArgumentException($"Only 'queue://' endpoints are supported. Provided: {queueOnlyEndpoint.Scheme}://", nameof(queueOnlyEndpoint));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use the constant here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

using var workerCts = new CancellationTokenSource();
var pollingTask = Task.Run(async () =>
{
await worker.PollForRPCOverQueueAsync(new Uri("local://test-worker"), workerCts.Token);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do change the scheme, this will need updating


// Client creates proxy to local://test-worker and makes request
var echo = client.CreateAsyncClient<IEchoService, IAsyncClientEchoService>(
new ServiceEndPoint("local://test-worker", null, client.TimeoutsAndLimits));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@rhysparry rhysparry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💚 LGTM

{
var request = await queue.DequeueAsync(cancellationToken);

if (request != null)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like if there is no request we immediately poll again? Is it worth adding a delay when there is no request waiting?

Notice that while the configuration code changed, the request/response code didn't apart from the endpoint. Logically, the Octopus is still the request/response client, and the Tentacle is still the request/response server, even though the transport layer has Octopus as the TCP listener and Tentacle as the TCP client polling for work.
Notice that while the configuration code changed, the request/response code didn't apart from the endpoint. Logically, the Octopus is still the request/response client, and the Tentacle is still the request/response server, even though the transport layer has Octopus as the TCP listener and Tentacle as the TCP client polling for work.

## RPC over Redis
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you added this section to the readme. This feels like a significant extra capability you've added to halibut here. I understand the value and our use case for it, I'm interested to hear how you are feeling about the conceptual overhead that this new capability adds.

For example I notice that in this readme you talk about "nodes" for the first time. Halibut is a client and server communication framework, so what is a "node" in this client serve model.

Also I notice in the "how it works" section. You talk about a client making an RPC call to a server... however in our use case of collecting in memory halibut logs, isn't it actually a client sending an RPC call to another client?

Lets use this thread to discuss the big picture question "How do we feel about the conceptual overhead we've added" and once we've covered that I can start new threads to nit pick the examples I've listed to give weight to my question.

I'm happy for you to merge while this discussion is ongoing as long as we're committed to addressing this question 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants