Find a way for users to thread arbitrary metadata across network calls #93

@gabotechs

Users might expect certain metadata adjacent to the query to be present in their own physical nodes. With normal DataFusion, this can be done by injecting arbitrary objects into the config extensions with config.setExtension, and they will be recursively propagated across nodes.

This project introduces network boundaries between different execution nodes, so any config extensions set at the head of the plan will not be automatically propagated to nodes executed on different hosts.

One solution to this would be to allow users access to the gRPC request metadata both in ArrowFlightReadExec and in ArrowFlightEndpoint:

  • users could choose to write to the gRPC metadata in ArrowFlightReadExec, dumping into it a serialized version of anything they want to propagate
  • users could access the gRPC metadata in the SessionBuilder trait passed to the ArrowFlightEndpoint, read anything from it, build their own objects, and inject them into the SessionContext through the normal session building methods from SessionBuilder
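
To make the two bullets above concrete, here is a minimal sketch of the round trip, assuming the user's metadata can be flattened into string key/value headers. It uses only the standard library: the `Metadata` alias is a stand-in for the gRPC request metadata, and `TenantContext`, the header names, `write_metadata`, and `read_metadata` are all hypothetical names invented for illustration, not part of this project's API.

```rust
use std::collections::HashMap;

/// Stand-in for gRPC request metadata (string key/value headers).
/// Assumption: in a real setup this would be the gRPC library's metadata map.
type Metadata = HashMap<String, String>;

/// Hypothetical user-defined context that should follow the query across hosts.
#[derive(Debug, Clone, PartialEq)]
struct TenantContext {
    tenant_id: String,
    trace_id: String,
}

// Hypothetical header names chosen by the user.
const TENANT_KEY: &str = "x-tenant-id";
const TRACE_KEY: &str = "x-trace-id";

/// Sender side (the role ArrowFlightReadExec would play):
/// serialize the user's context into the outgoing metadata.
fn write_metadata(ctx: &TenantContext, md: &mut Metadata) {
    md.insert(TENANT_KEY.to_string(), ctx.tenant_id.clone());
    md.insert(TRACE_KEY.to_string(), ctx.trace_id.clone());
}

/// Receiver side (the role a SessionBuilder implementation would play):
/// rebuild the user's object from the incoming metadata.
/// Returns None if any expected header is missing.
fn read_metadata(md: &Metadata) -> Option<TenantContext> {
    Some(TenantContext {
        tenant_id: md.get(TENANT_KEY)?.clone(),
        trace_id: md.get(TRACE_KEY)?.clone(),
    })
}
```

On the receiving host, the rebuilt `TenantContext` is what would then be injected into the session as a config extension, so downstream physical nodes see it the same way they would in a single-process plan.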

gRPC already has a specification for passing through metadata like this, and the API is really nice, so my proposal is to just use that.
