@gabotechs
Collaborator

Allow people to pass hooks that trigger at specific points during the lifetime of an Arrow Flight do_get query. This exposes a new API, used while constructing an ArrowFlightEndpoint, that lets people provide their own callbacks:

let mut endpoint = ArrowFlightEndpoint::try_new(DefaultSessionBuilder);
endpoint.on_plan(move |plan| {
    // log something based on the plan
    plan
});

I was tempted to introduce more hooks in this PR, but I think it'd be better to wait for use-cases rather than trying to guess them now.

The main use case for this right now is for wrapping nodes with https://github.com/datafusion-contrib/datafusion-tracing.
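As a rough illustration of the callback shape described above, here is a minimal, self-contained sketch. It does not use the real crate: `Plan`, `Scan`, `Endpoint`, `handle_plan`, and `counting_endpoint` are hypothetical stand-ins for `ExecutionPlan`, a plan node, `ArrowFlightEndpoint`, and the point in do_get where a plan is formed. Only the hook signature mirrors the `on_plan` field in this PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for DataFusion's `ExecutionPlan` trait.
trait Plan: Send + Sync {
    fn name(&self) -> String;
}

// Hypothetical leaf node used only for this sketch.
struct Scan;

impl Plan for Scan {
    fn name(&self) -> String {
        "Scan".to_string()
    }
}

// Same shape as the `on_plan` field in this PR, with `Plan` in place of `ExecutionPlan`.
type OnPlan = Arc<dyn Fn(Arc<dyn Plan>) -> Arc<dyn Plan> + Send + Sync>;

// Hypothetical stand-in for `ArrowFlightEndpoint`.
struct Endpoint {
    on_plan: OnPlan,
}

impl Endpoint {
    fn new() -> Self {
        // With no hook registered, the plan passes through untouched.
        Self {
            on_plan: Arc::new(|plan: Arc<dyn Plan>| plan),
        }
    }

    // Registers the callback, replacing any previous one.
    fn on_plan(&mut self, hook: impl Fn(Arc<dyn Plan>) -> Arc<dyn Plan> + Send + Sync + 'static) {
        self.on_plan = Arc::new(hook);
    }

    // Stand-in for the moment a plan has been formed while serving a request.
    fn handle_plan(&self, plan: Arc<dyn Plan>) -> Arc<dyn Plan> {
        (self.on_plan)(plan)
    }
}

// Builds an endpoint whose hook counts the plans it sees and returns them unchanged.
fn counting_endpoint() -> (Endpoint, Arc<AtomicUsize>) {
    let plans_received = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&plans_received);
    let mut endpoint = Endpoint::new();
    endpoint.on_plan(move |plan| {
        counter.fetch_add(1, Ordering::SeqCst);
        plan
    });
    (endpoint, plans_received)
}
```

The hook here only observes the plan and hands it back, which is the safest use of the callback; a tracing integration would return a wrapped plan instead.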

Comment on lines 19 to 22
#[allow(clippy::type_complexity)]
pub(super) struct ArrowFlightEndpointHooks {
    pub(super) on_plan: Arc<dyn Fn(Arc<dyn ExecutionPlan>) -> Arc<dyn ExecutionPlan> + Sync + Send>,
}
Collaborator Author

IMO clippy is being a bit overdramatic here. Not sure why this counts as too complex while line 59 does not.

Collaborator

@geoffreyclaude geoffreyclaude left a comment

Super generic hook, which is great! I think the API should be a bit more DataFusion idiomatic though, and, more critically, allow multiple hooks, so that various extensions can each add their own dedicated hooks.

Something like:

let mut endpoint = ArrowFlightEndpoint::try_new(DefaultSessionBuilder);
// appends two "pre-get" hooks
endpoint = endpoint.with_pre_get_hook(tracing_hook).with_pre_get_hook(some_other_hook);
// overwrites all "pre-get" hooks
endpoint = endpoint.with_pre_get_hooks(vec![tracing_hook, some_other_hook]);
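The append-versus-overwrite semantics proposed here can be sketched without the real crate. Everything below is hypothetical: `Endpoint` stands in for `ArrowFlightEndpoint`, `Plan` is a plain list of node names instead of `Arc<dyn ExecutionPlan>`, and `tag` is a helper that builds a hook appending a marker. The builder method names are taken from the suggestion, not from a merged API.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a plan: the list of node names seen so far.
type Plan = Vec<&'static str>;

// A hook takes a plan and returns a plan.
type Hook = Arc<dyn Fn(Plan) -> Plan + Send + Sync>;

// Hypothetical stand-in for `ArrowFlightEndpoint`.
#[derive(Default)]
struct Endpoint {
    pre_get_hooks: Vec<Hook>,
}

impl Endpoint {
    /// Appends one hook, keeping the ones already registered.
    fn with_pre_get_hook(mut self, hook: Hook) -> Self {
        self.pre_get_hooks.push(hook);
        self
    }

    /// Replaces every registered hook with the given list.
    fn with_pre_get_hooks(mut self, hooks: Vec<Hook>) -> Self {
        self.pre_get_hooks = hooks;
        self
    }

    /// Runs the hooks in registration order, feeding each one the previous output.
    fn apply_hooks(&self, plan: Plan) -> Plan {
        self.pre_get_hooks.iter().fold(plan, |plan, hook| hook(plan))
    }
}

/// Builds a hook that appends `name` to the plan, so the call order is observable.
fn tag(name: &'static str) -> Hook {
    Arc::new(move |mut plan: Plan| {
        plan.push(name);
        plan
    })
}
```

Running hooks in registration order is an assumption of this sketch; the suggestion itself does not specify an order.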

@gabotechs
Collaborator Author

Makes sense! I followed your suggestion, with two caveats:

  • I still did not add the ability to pass multiple hooks, as I don't think there's a use case for it yet, and we can always add it when there is one.
  • I kept the name as add_on_plan_hook, because it is not strictly correct to say it runs on every get request: it runs whenever a plan is formed, which happens in some get requests but not all.

Collaborator

@jayshrivastava jayshrivastava left a comment

I have a suggestion but I don't feel strongly. Looks good overall!

}
}
// As many plans as tasks should have been received.
assert_eq!(plans_received.load(Ordering::SeqCst), task_keys.len());
Collaborator

Not a blocker, but an independent test where you swap out a plan node (e.g. DataSourceExec with EmptyExec) would be nice.

Collaborator Author

🤔 But that would fail pretty badly. Part of the contract with this hook is that people should not do that. Or do you expect that to pass?

Collaborator

@jayshrivastava jayshrivastava Oct 22, 2025

I think swapping out a leaf node like DataSourceExec will work because it preserves the plan structure, as long as the schema matches.

Collaborator Author

But that's not something we are willing to support, right? The contract is more like "don't touch anything that changes the plan"; we just don't enforce it at runtime for performance reasons.
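One way to keep such a contract cheap is to check it only in debug builds. The sketch below is an assumption, not something this PR does: `Node` is a hypothetical stand-in for a physical plan node, and `shape`, `same_shape`, and `apply_hook` are illustrative helpers. It compares the pre-order list of child counts before and after the hook, which is enough to detect added or removed nodes, and wraps the comparison in `debug_assert!` so release builds skip it.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a physical plan node.
struct Node {
    name: &'static str,
    children: Vec<Arc<Node>>,
}

/// Records the number of children of every node, in pre-order.
/// Two ordered trees have the same structure exactly when these lists match.
fn shape(node: &Node, out: &mut Vec<usize>) {
    out.push(node.children.len());
    for child in &node.children {
        shape(child, out);
    }
}

/// Returns true when both trees have the same structure, ignoring node names.
fn same_shape(a: &Node, b: &Node) -> bool {
    let (mut shape_a, mut shape_b) = (Vec::new(), Vec::new());
    shape(a, &mut shape_a);
    shape(b, &mut shape_b);
    shape_a == shape_b
}

/// Applies the hook and, in debug builds only, verifies the structure is unchanged.
fn apply_hook(hook: impl Fn(Arc<Node>) -> Arc<Node>, plan: Arc<Node>) -> Arc<Node> {
    let original = Arc::clone(&plan);
    let hooked = hook(plan);
    // `debug_assert!` is compiled out in release builds, so the traversal costs nothing there.
    debug_assert!(
        same_shape(&original, &hooked),
        "hook changed the plan structure"
    );
    hooked
}
```

The check still walks both trees in debug builds, so it trades test-time cost for earlier detection; whether that trade is worthwhile is a judgment call for the maintainers.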

///
/// The callback takes the plan and returns another plan that must be either the same,
/// or equivalent in terms of execution. Mutating the plan by adding nodes or removing them
/// will make the query blow up in unexpected ways.
Collaborator

You can add an assertion that the final schema matches within the ArrowFlightEndpoint so we fail early with "schema mismatch in hook" to make it clear that a hook caused this sort of error.
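A minimal sketch of that early schema check, using stand-in types rather than the real crate: `Schema`, `Plan`, `Fixed`, and `apply_hook_checked` are hypothetical names, and the schema is modelled as a list of (column name, type name) pairs. In DataFusion the comparison would presumably be between the values returned by `ExecutionPlan::schema()` before and after the hook.

```rust
use std::sync::Arc;

// Hypothetical schema: a list of (column name, type name) pairs.
#[derive(Debug, Clone, PartialEq)]
struct Schema(Vec<(String, String)>);

// Hypothetical stand-in for DataFusion's `ExecutionPlan` trait.
trait Plan: Send + Sync {
    fn schema(&self) -> Schema;
}

// Hypothetical node that simply reports a fixed schema.
struct Fixed(Schema);

impl Plan for Fixed {
    fn schema(&self) -> Schema {
        self.0.clone()
    }
}

// Same shape as the hook in this PR, with `Plan` in place of `ExecutionPlan`.
type OnPlan = dyn Fn(Arc<dyn Plan>) -> Arc<dyn Plan> + Send + Sync;

/// Runs the hook and fails early if the returned plan reports a different schema.
fn apply_hook_checked(hook: &OnPlan, plan: Arc<dyn Plan>) -> Result<Arc<dyn Plan>, String> {
    let expected = plan.schema();
    let hooked = hook(plan);
    let actual = hooked.schema();
    if actual != expected {
        // Naming the hook in the message points users at the likely culprit.
        return Err(format!(
            "schema mismatch in hook: expected {expected:?}, got {actual:?}"
        ));
    }
    Ok(hooked)
}
```

This only compares the top-level schema, so it is cheap but catches only a subset of contract violations; it says nothing about the structure of the plan underneath.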

The harder part is asserting that the plan structure is the same, which is important for metrics. Traversing the plan to assert this would be expensive. Since we only support wrapping (because the plan structure cannot change), can we leverage with_new_children in the hook itself?

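For example:

use std::sync::Arc;

use datafusion::common::tree_node::{Transformed, TreeNode};
use datafusion::error::Result;
use datafusion::physical_plan::ExecutionPlan;

trait Hook {
    /// Whether this hook wants to wrap `node`.
    fn apply(&self, node: &Arc<dyn ExecutionPlan>) -> bool;
    /// Builds the wrapper node for `node`.
    fn new(&self, node: Arc<dyn ExecutionPlan>) -> Arc<dyn ExecutionPlan>;
}

// in the endpoint

fn apply_hook<H: Hook>(hook: &H, plan: Arc<dyn ExecutionPlan>) -> Result<Arc<dyn ExecutionPlan>> {
    let transformed = plan.transform_up(|node| {
        if !hook.apply(&node) {
            return Ok(Transformed::no(node));
        }
        // Re-attach the original children so the wrapper cannot change the plan structure.
        let children: Vec<_> = node.children().into_iter().cloned().collect();
        Ok(Transformed::yes(hook.new(node).with_new_children(children)?))
    })?;
    Ok(transformed.data)
}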

This makes hooks a bit more expensive because we necessarily traverse the whole plan, where previously we did not. But for datafusion-tracing, we would have traversed the whole plan anyway.

This might be more effort than it's worth so I'll leave the decision up to you. If you do implement something like this, then I'm happy to take another look :)

Collaborator Author

🤔 I imagine in most cases there's just going to be no hook, I think it's worth not penalizing those cases.

The "schema mismatch" check might be a good idea though.

Collaborator

Sure that makes sense

Collaborator Author

I think for now I'm going to keep it simple and not do anything. In case we find it necessary to add some checks we can refer back to this conversation.

@gabotechs gabotechs merged commit df78006 into main Oct 22, 2025
4 checks passed
@gabotechs gabotechs deleted the gabrielmusat/add-endpoint-callbacks branch October 22, 2025 15:54