
Conversation

@aseigo
Contributor

@aseigo aseigo commented Dec 8, 2025

As the Fetch API does not expose HTTP trailers to the JavaScript runtime, grpcweb mandates that trailers are included in the message payload: a frame with the most-significant bit of the leading (flags) byte set to 1, followed by a length-prefixed block of text that uses the same formatting as normal headers.
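
As a rough illustration only (not the actual GRPC.Server code; the trailer values and variable names are made up), such a frame could be built like this:

    # Sketch: encode trailers as an HTTP-header-style block, then prepend the
    # 1-byte flags (MSB set) and the 4-byte big-endian length used by grpcweb.
    trailers = [{"grpc-status", "0"}, {"grpc-message", "ok"}]

    encoded =
      trailers
      |> Enum.map(fn {k, v} -> "#{k}: #{v}\r\n" end)
      |> IO.iodata_to_binary()

    len = byte_size(encoded)
    frame = <<0x80, len::unsigned-big-32, encoded::binary>>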

Most extant grpcweb libraries written in JS/TS are lenient about this and will happily forego receiving trailers. However, some are more picky and REQUIRE trailers (the buf.build Connect libraries are an example).

With this changeset:

GRPC.Server follows the spec when sending protos over grpcweb, allowing errors and other custom trailers to be sent in a way that is visible to the client.

GRPC.Message recognizes trailers and parses them appropriately: it extracts partial-buffer messages using the length-prefix bytes (which it previously ignored quietly, also letting malformed buffers, e.g. from network problems, sneak through), it respects the trailers flag, and it returns appropriate data in each of these cases.
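
For illustration only (the module and function names here are hypothetical, not the PR's actual API), a frame walker over such a buffer might look like:

    defmodule GrpcWebFramesSketch do
      import Bitwise

      # Walk a grpcweb buffer using the 5-byte prefix (1 flags byte +
      # 4-byte big-endian length) and tag each frame as data or trailers.
      def parse(<<flags, len::unsigned-big-32, payload::binary-size(len), rest::binary>>) do
        kind = if band(flags, 0x80) == 0x80, do: :trailers, else: :data
        [{kind, payload} | parse(rest)]
      end

      def parse(<<>>), do: []

      # A short read (e.g. a buffer truncated by a network hiccup) is
      # surfaced instead of being silently accepted.
      def parse(partial) when is_binary(partial), do: [{:incomplete, partial}]
    end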

The GRPC client now also works with embedded trailers.

Overhead for non-grpcweb traffic should be negligible, as the new code paths are hidden behind grpcweb checks, while the additional binary checks are placed in front of the error paths (so errors may be marginally slower to reach, but the happy paths should be untouched).

This has been tested with both Google's own grpc-web library and buf.build's connect/connect-web libraries, against a real-world API served by elixir-grpc's grpc libraries.

This does need more testing (what doesn't!), and there are some decisions made in the details of the code that could be discussed.

aseigo and others added 3 commits December 8, 2025 13:33
As the Fetch API does not expose HTTP trailers to the JavaScript runtime,
grpcweb mandates that trailers are included in the message payload with
the most-significant bit of the leading byte (flags) set to 1.

What follows is a length-prefixed block of text that uses the same
formatting as normal headers.

Most extant grpcweb libraries written in JS/TS are lenient about this
and will happily forego receiving trailers. However, some are more
picky about this and REQUIRE trailers (the buf.build Connect libraries
are an example of this).

GRPC.Server follows the spec when sending protos over grpcweb, allowing
errors and other custom trailers to be sent in a way that is visible to
the client.

GRPC.Message also now recognizes trailers and parses them appropriately:
it extracts partial-buffer messages using the length-prefix bytes
(which it was previously quietly ignoring, also letting malformed
buffers, e.g. from network problems, sneak through anyway), it respects
the trailers flag, and returns appropriate data in each of these cases.

The GRPC client now also works with embedded trailers.

Overhead for non-grpcweb should be nominal as new code paths are hidden
behind grpcweb checks, while the additional binary checks are placed in
front of the error paths (so errors may be nominally slower to reach,
but the happy paths should be untouched).
Contributor

@polvalente polvalente left a comment


PR generally looks good, but I wanna defer to @sleipnir. I'm not sure if we should add an option for accepting those trailers only when we're parsing grpcweb.

Also, I feel like grpc_server could use a new test or test change too.

{<<1, 2, 3, 4, 5, 6, 7, 8>>, <<>>}
"""
@spec from_data(binary) :: binary
@spec from_data(binary) :: {message :: binary, rest :: binary}
Contributor


This is technically a breaking change. I don't think it impacts as much if we merge before releasing 1.0, but we need to keep this in mind.
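
For callers, the shape change looks roughly like this (illustrative only, assuming the specs shown above belong to GRPC.Message.from_data/1):

    # Before: from_data/1 returned just the decoded message binary.
    # message = GRPC.Message.from_data(data)

    # After: it returns the message plus the unconsumed remainder of the buffer.
    # {message, rest} = GRPC.Message.from_data(data)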

data
|> String.split("\r\n")
|> Enum.reduce(%{}, fn line, acc ->
[k, v] = String.split(line, ":")
Contributor


Suggested change
[k, v] = String.split(line, ":")
[k, v] = String.split(line, ":", parts: 2)

Otherwise this can raise a MatchError whenever a header value itself contains a colon.
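For example (purely illustrative values):

    String.split("grpc-message: 12:30 deadline exceeded", ":")
    #=> ["grpc-message", " 12", "30 deadline exceeded"]

    String.split("grpc-message: 12:30 deadline exceeded", ":", parts: 2)
    #=> ["grpc-message", " 12:30 deadline exceeded"]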

Contributor Author


see 054f370

@sleipnir
Collaborator

sleipnir commented Dec 8, 2025

Hi @aseigo, thank you for this PR.

I need to be very careful here because I'm currently working on these APIs to incorporate a new adapter for the server, and therefore I need to work out the least-effort way to merge before proceeding. Give me some time to analyze this.

I think it's worthwhile to run the benchmark against this PR as well and measure how much it adds to the hot path. Perhaps an optional feature, as Paulo suggested, would be interesting to ensure optimal performance in all cases, but I haven't evaluated this in depth, just a suggestion.

That said, I will take a closer look, as well as the other PRs in the queue this week, and I will get back to you soon with my opinions.

Thank you again, and we'll talk soon.

@aseigo
Contributor Author

aseigo commented Dec 8, 2025

I need to be very careful here because I'm currently working on these APIs to incorporate a new adapter for the server, and

Oh, nice! Out of curiosity: Will it replace the current cowboy-based adapter, or is it using another webstack altogether (e.g. bandit)?

I did find the current adapter modules a bit of a maze as they call each other back and forth, so knowing there's some work happening there is really nice to hear!

@sleipnir
Collaborator

sleipnir commented Dec 8, 2025

I need to be very careful here because I'm currently working on these APIs to incorporate a new adapter for the server, and

Oh, nice! Out of curiosity: Will it replace the current cowboy-based adapter, or is it using another webstack altogether (e.g. bandit)?

I did find the current adapter modules a bit of a maze as they call each other back and forth, so knowing there's some work happening there is really nice to hear!

I explain more here #482

@sleipnir
Collaborator

sleipnir commented Dec 8, 2025

I did find the current adapter modules a bit of a maze as they call each other back and forth, so knowing there's some work happening there is really nice to hear!

I don't know if I'm doing a much better job hahaha... let me know your opinion. 😄

@aseigo
Contributor Author

aseigo commented Dec 8, 2025

Benchmarks incoming!

With this PR:

Total requests: 100000
Total time: 24.54 seconds
Requests per second: 4075.05
Average latency: 0.245 ms

Total requests: 100000
Total time: 24.876 seconds
Requests per second: 4019.95
Average latency: 0.249 ms

Total requests: 100000
Total time: 24.461 seconds
Requests per second: 4088.14
Average latency: 0.245 ms

From upstream master branch:

Total requests: 100000
Total time: 24.748 seconds
Requests per second: 4040.76
Average latency: 0.247 ms

Total requests: 100000
Total time: 24.49 seconds
Requests per second: 4083.22
Average latency: 0.245 ms

Total requests: 100000
Total time: 24.708 seconds
Requests per second: 4047.28
Average latency: 0.247 ms

This is quite repeatable, with the times between different runs being within ~1% of each other.

@polvalente
Contributor

Benchmarks incoming!

…

This is quite repeatable, with the times between different runs being within ~1% of each other.

Great! So the decision is whether we want to accept "incorrect" payloads regardless, i.e. when grpcweb-formatted data is sent outside that scope. I'm ok with just keeping the single code path.

@aseigo
Contributor Author

aseigo commented Dec 8, 2025

I explain more here #482

Wow, that is a crazy amount of work, but it's clearly paying off! I haven't tested the PR (yet!) but have skimmed the code (and read the more interesting parts with a bit more care), and so far it looks really nice. It's a bit unfortunate to have to implement an http2 stack, but I can see how it makes sense in this case, given that this is an absolutely core part of this framework of libraries.

I'm a big fan of thousand_island, always impressed by the performance of it given it is straight Elixir code. <3

In any case, I can see how merging the (frankly annoying) grpcweb trailers support together with your work can be a bit of a chore. Happy to see this go in in whatever order makes sense to you. IMHO the new adapter has clear priority, given it stands to provide a significant performance improvement and would be the foundation for "features" needed by e.g. grpcweb.

@aseigo
Contributor Author

aseigo commented Dec 8, 2025

Also, I feel like grpc_server could use a new test or test change too.

A small related comment:

The existing tests for GRPCWeb exercise these code paths and do catch when they fail. I can add a few more tests for variations (preferably once we've decided on the final shape of things so as to test actual code that may be merged :) ), but I was actually able to use the existing tests to drive this towards a working state.

In fact, once the tests were passing, it all Just Worked(tm) on the first try with the buf.build Connect libraries.

Kudos to everyone who's worked on them as they made my effort here a lot easier!
