Conversation

Contributor
@adriangb adriangb commented Oct 26, 2025

Closes #206

Open to opinions, but I think there's little point in limiting message sizes in this application. Should we just go with usize::MAX as the recommendation?

I don't think we can set a default value like usize::MAX at the library level, because if you don't also configure your gRPC client you'd run into serious issues that surface seemingly at random (i.e. possibly not until you put it into prod).
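To illustrate the failure mode, here is a minimal sketch of the client-side configuration that would have to match the server (the address handling and the usize::MAX choice are illustrative, not the crate's actual setup code):

    use arrow_flight::flight_service_client::FlightServiceClient;
    use tonic::transport::Channel;

    async fn connect(
        addr: &'static str,
    ) -> Result<FlightServiceClient<Channel>, tonic::transport::Error> {
        let channel = Channel::from_static(addr).connect().await?;
        // tonic clients reject decoded messages larger than 4 MB by default,
        // so a server encoding huge Flight data would fail here unless the
        // client limit is raised to match.
        Ok(FlightServiceClient::new(channel)
            .max_decoding_message_size(usize::MAX)
            .max_encoding_message_size(usize::MAX))
    }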

Comment on lines 140 to 146
        encoder_builder = encoder_builder.with_max_flight_data_size(max_message_size);
    }

    let stream =
        encoder_builder
            .with_max_flight_data_size(usize::MAX)
            .build(stream.map_err(|err| {
Collaborator

🤔 I think there's something wrong here. See how on line 140 you set the user-provided max_message_size, only for it to be unconditionally replaced with usize::MAX on line 145.

Collaborator

What do you think about leaving this as:

        let stream = FlightDataEncoderBuilder::new()
            .with_schema(stream.schema().clone())
            .with_max_flight_data_size(self.max_message_size)
            .build(stream.map_err(|err| {
                FlightError::Tonic(Box::new(datafusion_error_to_tonic_status(&err)))
            }));

And instead of self.max_message_size being an Option<usize>, make it a reasonable non-optional usize default? Maybe 128 * 1024 * 1024? (2 * 1024 * 1024 would match the current FlightDataEncoderBuilder default, but I imagine we probably need something bigger.)
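For illustration, a sketch of the non-optional shape that suggestion implies (the constant, field layout, and builder method are assumptions, not the crate's actual API):

    /// Assumed default, per the 128 * 1024 * 1024 suggestion above.
    pub const DEFAULT_MAX_MESSAGE_SIZE: usize = 128 * 1024 * 1024;

    pub struct ArrowFlightEndpoint {
        /// Always a concrete limit instead of Option<usize>, so the encoder
        /// builder can apply it unconditionally.
        max_message_size: usize,
        // ... remaining fields elided ...
    }

    impl ArrowFlightEndpoint {
        /// Hypothetical override for callers who know their payload sizes.
        pub fn with_max_message_size(mut self, size: usize) -> Self {
            self.max_message_size = size;
            self
        }
    }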

Collaborator

If we choose to have a higher default, it would probably also make sense to add a

    impl ArrowFlightEndpoint {
        fn into_flight_server(self) -> FlightServiceServer<Self>
    }

Where we drive the message sizes ourselves.
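For illustration, a rough sketch of that idea, reusing the assumed DEFAULT_MAX_MESSAGE_SIZE constant from the sketch above and assuming ArrowFlightEndpoint implements the generated FlightService trait:

    use arrow_flight::flight_service_server::FlightServiceServer;

    impl ArrowFlightEndpoint {
        /// Wraps the endpoint in a tonic service with both message-size
        /// limits pre-applied, so callers can't accidentally ship the 4 MB
        /// decoding default.
        fn into_flight_server(self) -> FlightServiceServer<Self> {
            FlightServiceServer::new(self)
                .max_decoding_message_size(DEFAULT_MAX_MESSAGE_SIZE)
                .max_encoding_message_size(DEFAULT_MAX_MESSAGE_SIZE)
        }
    }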

What do you think?

Contributor Author

I thought about that; it feels to me like it pushes datafusion-distributed to be more of a "package" than a "library". It's also a footgun as long as you can still create your own incorrectly configured FlightServiceServer. So I'm split. If you want to go that way I'm happy to do so, but let me fix up the current thing first.

Collaborator

Let me do some quick benchmarks. If I see that it's pretty obvious that saner defaults improve performance, let's go for it.

Collaborator

==== Comparison with the previous benchmark from 2025-10-27 15:32:35 UTC ====
os:        macos
arch:      aarch64
cpu cores: 16
threads:   2 -> 2
workers:   8 -> 8
=============================================================================
 Query 1: prev=2672 ms, new=2658 ms, diff=1.01 faster ✔
 Query 2: prev= 359 ms, new= 383 ms, diff=1.07 slower ✖
 Query 3: prev=1067 ms, new=1088 ms, diff=1.02 slower ✖
 Query 4: prev=1126 ms, new=1010 ms, diff=1.11 faster ✔
 Query 5: prev=1587 ms, new=1553 ms, diff=1.02 faster ✔
 Query 6: prev=1127 ms, new=1200 ms, diff=1.06 slower ✖
 Query 7: prev=2717 ms, new=2532 ms, diff=1.07 faster ✔
 Query 8: prev=2486 ms, new=2495 ms, diff=1.00 slower ✖
 Query 9: prev=3106 ms, new=2963 ms, diff=1.05 faster ✔
Query 10: prev=1589 ms, new=1513 ms, diff=1.05 faster ✔
Query 11: prev= 585 ms, new= 491 ms, diff=1.19 faster ✔
Query 12: prev=1674 ms, new=1641 ms, diff=1.02 faster ✔
Query 13: prev=2273 ms, new=1927 ms, diff=1.18 faster ✔
Query 14: prev=1528 ms, new=1154 ms, diff=1.32 faster ✅
Query 15: prev=1323 ms, new=1176 ms, diff=1.12 faster ✔
Query 16: prev= 258 ms, new= 256 ms, diff=1.01 faster ✔
Query 17: prev=4174 ms, new=3533 ms, diff=1.18 faster ✔
Query 18: prev=4411 ms, new=4304 ms, diff=1.02 faster ✔
Query 19: prev=2539 ms, new=2760 ms, diff=1.09 slower ✖
Query 20: prev=1558 ms, new=2006 ms, diff=1.29 slower ❌
Query 21: prev=4545 ms, new=4137 ms, diff=1.10 faster ✔
Query 22: prev= 385 ms, new= 457 ms, diff=1.19 slower ✖

It has always been a bit flaky to run these benchmarks on our own laptops, so there might be a fair amount of noise here. However, I do see far more green checks than red ones, so I'm inclined to go for higher defaults.

    -        .add_service(FlightServiceServer::new(endpoint))
    +        .add_service(
    +            FlightServiceServer::new(endpoint)
    +                .max_decoding_message_size(MAX_MESSAGE_SIZE)
Collaborator

🤔 I see that this value is 4 MB by default, which seems fine for your normal day-to-day CRUD API, but is a fairly low default for a distributed query engine.

I'd advocate for taking this PR one step further and actually providing more reasonable defaults, in addition to letting people configure them.
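As a usage sketch, this is how both limits get raised when wiring the service into a tonic server (the address and the MAX_MESSAGE_SIZE value are placeholders, not the crate's actual constants):

    use arrow_flight::flight_service_server::FlightServiceServer;
    use tonic::transport::Server;

    // Placeholder value for illustration.
    const MAX_MESSAGE_SIZE: usize = 128 * 1024 * 1024;

    async fn serve(endpoint: ArrowFlightEndpoint) -> Result<(), tonic::transport::Error> {
        let addr = "127.0.0.1:50051".parse().expect("valid socket address");
        Server::builder()
            .add_service(
                FlightServiceServer::new(endpoint)
                    // Lift both limits well past tonic's 4 MB decoding default.
                    .max_decoding_message_size(MAX_MESSAGE_SIZE)
                    .max_encoding_message_size(MAX_MESSAGE_SIZE),
            )
            .serve(addr)
            .await
    }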

Contributor Author

I do think that if we have a default, it should be very high, if not unlimited. I'll let you make a decision based on #207 (comment), and then I'm happy to implement either way.

Collaborator

I'd say let's go for it

Contributor Author

let me know what you think of b653b64

Collaborator
@gabotechs gabotechs left a comment

💯 thanks for all the extensive docs and comments!

@gabotechs gabotechs merged commit 6f69516 into datafusion-contrib:main Oct 27, 2025
4 checks passed
@adriangb adriangb deleted the message-sizes branch October 27, 2025 18:34


Linked issue: Consider increasing IPC message size