
Conversation

@sebthom (Contributor) commented Oct 25, 2025

  • Add MessageJsonHandler.serialize(Message, OutputStream, Charset)
  • Serialize into ByteArrayOutputStream and write via writeTo(output)
  • Remove String.getBytes(...) and toByteArray() clone
  • Cache Charset instead of using encoding String

No breaking changes: existing constructors retained; new overloads are additive.
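
A minimal, self-contained sketch of the intended write path, using a plain JSON string as a stand-in for the serialized Message and a hypothetical helper class (the real change lives in MessageJsonHandler/StreamMessageConsumer and may differ in detail):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamWriteSketch {

    // Cache the Charset once instead of resolving it from an encoding String per message.
    private static final Charset CHARSET = StandardCharsets.UTF_8;

    /** Frames and writes one message without String.getBytes(...) or a toByteArray() clone. */
    static void consume(String jsonPayload, OutputStream output) throws IOException {
        // Serialize straight into a ByteArrayOutputStream; in lsp4j this would go through the
        // proposed MessageJsonHandler.serialize(Message, OutputStream, Charset) overload.
        ByteArrayOutputStream content = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(content, CHARSET)) {
            writer.write(jsonPayload); // stand-in for Gson writing the Message
        }

        output.write(("Content-Length: " + content.size() + "\r\n\r\n")
                .getBytes(StandardCharsets.US_ASCII));
        content.writeTo(output); // copies the internal buffer directly, no toByteArray() clone
        output.flush();
    }

    public static void main(String[] args) throws IOException {
        consume("{\"jsonrpc\":\"2.0\",\"method\":\"initialized\"}", System.out);
    }
}
```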

@sebthom changed the title from "perf: eliminate intermediate byte[] copies in StreamMessageConsume" to "perf: eliminate intermediate byte[] copies in StreamMessageConsumer" on Oct 25, 2025
@sebthom force-pushed the StreamMessageConsumer branch from e3c9abe to 4462bbd on November 8, 2025 20:19
@pisv (Contributor) commented Nov 12, 2025

@sebthom Many thanks for all your contributions to the project.

In general, for performance-related improvements I'd like to see more detail about the issue being addressed, including realistic benchmarks and the actual before-and-after measurements.

Sometimes a small amount of micro-optimization can make a huge difference. However, it is important to have evidence that we are optimizing an actual bottleneck. Otherwise, the code can end up harder to maintain, and we may well find that we've either missed the real bottleneck or that our micro-optimizations are harming performance instead of helping.

Again, these general notes apply to all performance-related improvements.
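
Purely as an illustration of what such a micro-benchmark could look like, here is a JMH sketch comparing the two encoding paths (the payload size, method names, and harness wiring are assumptions, not measurements from this PR):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class ConsumeBenchmark {

    String json;
    OutputStream sink;

    @Setup
    public void setup() {
        json = "{\"jsonrpc\":\"2.0\",\"result\":\"" + "x".repeat(16_384) + "\"}";
        sink = OutputStream.nullOutputStream(); // discard the bytes; we only measure the copies
    }

    @Benchmark
    public void viaStringGetBytes() throws IOException {
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8); // old path: String -> byte[]
        sink.write(bytes);
    }

    @Benchmark
    public void viaByteArrayOutputStream() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(); // new path: stream encoder
        try (Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8)) {
            writer.write(json);
        }
        buffer.writeTo(sink);
    }
}
```

Running it with JMH's GC profiler (`-prof gc`) would also report allocation rates, which is the relevant metric for the GC-pressure claim.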

@sebthom (Contributor, Author) commented Nov 12, 2025

I don't see how to provide realistic benchmarks here. What exactly would the criteria be, and which tools would you accept? These PRs address issues like #815 (The current parsing is memory inefficient). These improvements (similar to #816) reduce CPU churn and GC pressure.

@jonahgraham what is your opinion?

@pisv (Contributor) commented Nov 12, 2025

> I don't see how to provide realistic benchmarks.

OK. But have you measured the actual increase in performance somehow?

@pisv (Contributor) commented Nov 12, 2025

In this particular case, the benefit is not that obvious when taking a deeper look at the code.

StreamMessageConsumer.consume before the change:

  • A byte-array is created for a StringWriter (AbstractStringBuilder.value)
  • It is then copied in StringWriter.toString (but note that StringBuffer.toString is annotated with @HotSpotIntrinsicCandidate, so it must be efficient, I guess)
  • A byte-array is created as the result of String.getBytes. (The bulk-encoding to UTF-8, which is the only encoding supported right now in LSP, is a special case, and must be quite efficient)

StreamMessageConsumer.consume after the change:

  • A byte-array is created for a ByteArrayOutputStream (buf)
  • A byte-buffer is allocated for a StreamEncoder of the OutputStreamWriter
  • The StreamEncoder creates a new char array and wraps it into a new CharBuffer in each write call

As we can see, the two implementations are quite different. Which one would be more efficient, and to what extent? I don't know without actually measuring it. But I do know that the implementation before the change looks more straightforward and readable to me; the content is written in exactly the same way as the header. This is just an example to illustrate my general point.
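
For concreteness, a minimal, self-contained sketch of the two write paths just described, with comments marking the allocations listed above (a plain JSON string stands in for the serialized Message; header framing and locking are omitted, and this is not the actual lsp4j code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class WritePathComparison {

    /** Pre-change path: StringWriter -> String -> byte[]. */
    static void writeViaString(String json, OutputStream output) throws IOException {
        StringWriter sw = new StringWriter();    // buffer inside the StringBuffer (AbstractStringBuilder.value)
        sw.write(json);                          // stand-in for Gson serializing the Message
        String content = sw.toString();          // copies the buffer into a new String
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8); // bulk-encodes into a new byte array
        output.write(bytes);
    }

    /** Post-change path: ByteArrayOutputStream + OutputStreamWriter -> writeTo. */
    static void writeViaStream(String json, OutputStream output) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();              // byte array (buf)
        try (Writer writer = new OutputStreamWriter(buf, StandardCharsets.UTF_8)) { // StreamEncoder allocates its byte buffer
            writer.write(json);                  // each write goes through the encoder's CharBuffer handling
        }
        buf.writeTo(output);                     // copies buf directly to output, no toByteArray() clone
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"jsonrpc\":\"2.0\",\"method\":\"initialized\"}";
        writeViaString(json, OutputStream.nullOutputStream());
        writeViaStream(json, OutputStream.nullOutputStream());
    }
}
```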
