Description
What version of protobuf and what language are you using?
Version: main/v3.25 (also verified the issue on the latest version)
Language: Java
What operating system (Linux, Windows, ...) and version?
Linux and Windows
What runtime / compiler are you using (e.g., python version or gcc version)
Java 21
What did you do?
Steps to reproduce the behavior:
Parse the message below from a byte stream. Parsing is ~2.5x slower in versions 3.21.7 and later.
syntax = "proto3";
option java_package = "com.uber.debugprotobuf";
option java_outer_classname = "DebugProtobufProto";
message BenchmarkRequest {
NodeA root = 1;
}
message BenchmarkResponse {
}
// Tree structure for benchmarking
// Level 1
message NodeA {
NodeB child1 = 1;
NodeB child2 = 2;
NodeB child3 = 3;
NodeB child4 = 4;
NodeB child5 = 5;
NodeB child6 = 6;
NodeB child7 = 7;
NodeB child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 2
message NodeB {
NodeC child1 = 1;
NodeC child2 = 2;
NodeC child3 = 3;
NodeC child4 = 4;
NodeC child5 = 5;
NodeC child6 = 6;
NodeC child7 = 7;
NodeC child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 3
message NodeC {
NodeD child1 = 1;
NodeD child2 = 2;
NodeD child3 = 3;
NodeD child4 = 4;
NodeD child5 = 5;
NodeD child6 = 6;
NodeD child7 = 7;
NodeD child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 4
message NodeD {
NodeE child1 = 1;
NodeE child2 = 2;
NodeE child3 = 3;
NodeE child4 = 4;
NodeE child5 = 5;
NodeE child6 = 6;
NodeE child7 = 7;
NodeE child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 5
message NodeE {
NodeF child1 = 1;
NodeF child2 = 2;
NodeF child3 = 3;
NodeF child4 = 4;
NodeF child5 = 5;
NodeF child6 = 6;
NodeF child7 = 7;
NodeF child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 6
message NodeF {
NodeG child1 = 1;
NodeG child2 = 2;
NodeG child3 = 3;
NodeG child4 = 4;
NodeG child5 = 5;
NodeG child6 = 6;
NodeG child7 = 7;
NodeG child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 7
message NodeG {
NodeH child1 = 1;
NodeH child2 = 2;
NodeH child3 = 3;
NodeH child4 = 4;
NodeH child5 = 5;
NodeH child6 = 6;
NodeH child7 = 7;
NodeH child8 = 8;
string name = 9;
int64 value = 10;
}
// Level 8
message NodeH {
string name = 1;
int64 value = 2;
}
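A minimal timing harness for the reproduction step might look like the sketch below. The class name `ParseBenchmark`, the `Parser` interface, and the stand-in payload are hypothetical and not part of the original report. In an actual run, the parser argument would presumably be `DebugProtobufProto.BenchmarkRequest::parseFrom` and the payload a serialized `BenchmarkRequest` with the full tree populated. A JMH benchmark would give more trustworthy numbers; this only shows the shape of the measurement.

```java
import java.nio.charset.StandardCharsets;

public class ParseBenchmark {
    /** Hypothetical parser hook; permits checked exceptions such as InvalidProtocolBufferException. */
    public interface Parser<T> {
        T parse(byte[] data) throws Exception;
    }

    /** Wraps checked exceptions so the timing loop stays simple. */
    private static <T> T parseUnchecked(Parser<T> parser, byte[] payload) {
        try {
            return parser.parse(payload);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Returns the average nanoseconds per parse over `iterations` runs, after `warmup` untimed runs. */
    public static <T> double averageNanos(Parser<T> parser, byte[] payload, int warmup, int iterations) {
        long sink = 0; // consume results so the JIT is less likely to drop the parse calls
        for (int i = 0; i < warmup; i++) {
            sink += System.identityHashCode(parseUnchecked(parser, payload));
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += System.identityHashCode(parseUnchecked(parser, payload));
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.print(""); // keeps `sink` observably live
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        // Stand-in payload and parser so the sketch runs without the generated classes.
        byte[] payload = "placeholder".getBytes(StandardCharsets.UTF_8);
        Parser<String> standIn = data -> new String(data, StandardCharsets.UTF_8);
        double ns = averageNanos(standIn, payload, 10_000, 100_000);
        System.out.printf("avg parse time: %.1f ns%n", ns);
    }
}
```

Running the same harness against generated code from a release before 3.21.7 and from a later one, with an identical payload, is one way to compare the two parsing paths.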
Issue Summary
A performance regression was introduced by PR #10665, where the Java code generator removed the optimized parsing constructor and moved all parsing logic to the Builder. With the builder approach, parsing is 2–3× slower.
Before this PR, the Java generator emitted a specialized private parsing constructor for each message type whenever the file was not configured with optimize_for = CODE_SIZE.
- This constructor provided a fast path for parsing: it directly populated message fields while reading from the input stream.
- The builder path was used only when users explicitly optimized for code size (optimize_for = CODE_SIZE), accepting lower speed.
In PR #10665, this constructor was removed entirely. The changelog even states:
Move proto wireformat parsing functionality from the private “parsing constructor” to the Builder class.
As a result, all parsing now goes through the Builder, regardless of optimization settings.
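The difference between the two paths can be illustrated with a toy message. This is a simplified sketch, not the code protoc generates: the names `ParsePaths`, `Node`, `parseDirect`, and `parseViaBuilder` and the length-prefixed encoding are all invented for illustration. Real generated code reads tags from a `CodedInputStream` and handles nested messages and unknown fields. The sketch shows only the structural difference (direct field assignment versus an intermediate builder followed by `build()`) and does not by itself reproduce the slowdown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ParsePaths {
    /** Toy immutable message: a length-prefixed UTF-8 name followed by a 64-bit value. */
    public static final class Node {
        public final String name;
        public final long value;

        // "Parsing constructor" style: fields are assigned directly while reading the input.
        private Node(ByteBuffer in) {
            byte[] raw = new byte[in.getInt()];
            in.get(raw);
            this.name = new String(raw, StandardCharsets.UTF_8);
            this.value = in.getLong();
        }

        // Used by the builder path: copies the builder's state into the message.
        private Node(Builder b) {
            this.name = b.name;
            this.value = b.value;
        }

        public static Node parseDirect(ByteBuffer in) {
            return new Node(in);
        }

        public static Node parseViaBuilder(ByteBuffer in) {
            // Extra object and extra copy step compared with parseDirect.
            return new Builder().mergeFrom(in).build();
        }
    }

    /** Mutable builder that parses into its own fields before producing a Node. */
    public static final class Builder {
        String name = "";
        long value;

        public Builder mergeFrom(ByteBuffer in) {
            byte[] raw = new byte[in.getInt()];
            in.get(raw);
            this.name = new String(raw, StandardCharsets.UTF_8);
            this.value = in.getLong();
            return this;
        }

        public Node build() {
            return new Node(this);
        }
    }

    /** Encodes a node in the toy format used above. */
    public static byte[] encode(String name, long value) {
        byte[] raw = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + raw.length + 8)
                .putInt(raw.length).put(raw).putLong(value).array();
    }

    public static void main(String[] args) {
        byte[] data = encode("root", 7L);
        Node direct = Node.parseDirect(ByteBuffer.wrap(data));
        Node built = Node.parseViaBuilder(ByteBuffer.wrap(data));
        System.out.println(direct.name.equals(built.name) && direct.value == built.value);
    }
}
```

In a deep tree like the one in the reproduction, every nested message goes through whichever path is chosen, so any per-message overhead of the builder path would be paid at every node.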
Fix
Reintroduced the parsing constructor and confirmed there is no performance degradation.
https://github.com/protocolbuffers/protobuf/compare/v25.5...amishra-u:protobuf:v3.25.5-UBER1?expand=1