upstream: cache metadata hash for O(1) comparison in EDS updates #43351
wdauchy wants to merge 4 commits into envoyproxy:main
botengyao left a comment
Thanks for the contribution!
/wait
     if (host->metadata() && existing_host->second->metadata()) {
    -  metadata_changed = !Protobuf::util::MessageDifferencer::Equivalent(
    -      *host->metadata(), *existing_host->second->metadata());
    +  metadata_changed = MessageUtil::hash(*host->metadata()) !=
This could cause a slight behavior change (Equivalent vs. hash). If we are going to change this, we would need a benchmark to compare the two before deciding whether to move forward.
On the behavior change: hash is stricter than Equivalent. Barring hash collisions, the only impact is potentially treating an unchanged message as changed (a false positive), never the reverse. In practice, EDS metadata comes from the control plane with consistent serialization, so this shouldn't cause spurious updates?
On benchmarking: in our setup with >7k endpoints, updateDynamicHostList is a hot path. Would a test on our side be sufficient, or do you mean adding a micro-benchmark to the tests? (I'm not very familiar with that so far.)
Thanks for adding the micro-benchmark! Do you mind sharing the results?
I've reworked the approach based on your suggestion to add a benchmark.
Instead of recomputing the hash on every comparison (which only gave ~8% improvement), I'm now caching the metadata hash on the host object. The hash is computed once when metadata is set (at construction or update) via a new metadataHash() method on HostDescription. The comparison in updateDynamicHostList becomes a simple integer != check.
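A minimal sketch of the caching idea described above, not the PR's actual code. It assumes a `std::map` as a stand-in for the protobuf metadata message, a hypothetical `hashMetadata` helper in place of MessageUtil::hash, and a hypothetical `HostSketch` class in place of HostDescription. Only the name `metadataHash()` comes from the description; everything else is illustrative.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the protobuf metadata message (assumption: the real type is a proto).
using Metadata = std::map<std::string, std::string>;

// Counts hash computations so the caching effect is observable.
static int hash_calls = 0;

// Hypothetical stand-in for MessageUtil::hash: FNV-1a over keys and values.
inline std::uint64_t hashMetadata(const Metadata& m) {
  ++hash_calls;
  std::uint64_t h = 0xcbf29ce484222325ull;
  auto mix = [&h](const std::string& s) {
    for (unsigned char c : s) {
      h ^= c;
      h *= 0x100000001b3ull;
    }
    h ^= 0xff; // separator between strings
    h *= 0x100000001b3ull;
  };
  for (const auto& kv : m) {
    mix(kv.first);
    mix(kv.second);
  }
  return h;
}

// Hypothetical host type: the hash is recomputed only inside the setter.
class HostSketch {
public:
  explicit HostSketch(std::shared_ptr<const Metadata> md) { metadata(std::move(md)); }

  void metadata(std::shared_ptr<const Metadata> md) {
    metadata_ = std::move(md);
    metadata_hash_ = metadata_ ? hashMetadata(*metadata_) : 0;
  }

  std::uint64_t metadataHash() const { return metadata_hash_; }

private:
  std::shared_ptr<const Metadata> metadata_;
  std::uint64_t metadata_hash_{0};
};

// The per-host check reduces to an integer comparison; no hashing happens here.
inline bool metadataChanged(const HostSketch& incoming, const HostSketch& existing) {
  return incoming.metadataHash() != existing.metadataHash();
}
```

In this sketch, calling `metadataChanged` any number of times leaves `hash_calls` untouched; the counter only moves when a host is constructed or its metadata is replaced.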
Benchmark results (aarch64, -c opt):
| Benchmark (5000 hosts) | 5 fields | 20 fields |
|---|---|---|
| Equivalent (before) | 4.39 ms | 14.1 ms |
| Hash (recompute each time) | 4.07 ms | 12.6 ms |
| CachedHash (new approach) | 0.001 ms | 0.001 ms |
The cached hash is ~4000x faster than Equivalent for the common case (metadata unchanged across EDS updates).
Note: recomputing the hash on every comparison (without caching) is actually slower than Equivalent when metadata differs, because Equivalent can short-circuit on the first mismatch while the hash must process the full message. The cached approach avoids this tradeoff entirely: the comparison is always a simple integer check, regardless of whether metadata changed. The hash cost is paid once in the setter, only when metadata actually changes.
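The short-circuit point can be illustrated with a simplified sketch. It assumes a flat list of string fields as a stand-in for a proto message; `equalShortCircuit` and `hashAll` are hypothetical helpers, not MessageDifferencer or MessageUtil, and the `visited` counters exist only to make the amount of work visible.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a message: an ordered list of (name, value) fields.
using Fields = std::vector<std::pair<std::string, std::string>>;

// Field-by-field comparison that returns at the first mismatch.
inline bool equalShortCircuit(const Fields& a, const Fields& b, std::size_t& visited) {
  visited = 0;
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    ++visited;
    if (a[i] != b[i]) {
      return false; // stop early: the remaining fields are never inspected
    }
  }
  return true;
}

// Hashing has no early exit: every field contributes to the result.
inline std::uint64_t hashAll(const Fields& f, std::size_t& visited) {
  visited = 0;
  std::uint64_t h = 0xcbf29ce484222325ull; // FNV-1a offset basis
  for (const auto& field : f) {
    ++visited;
    for (unsigned char c : field.first + "=" + field.second) {
      h ^= c;
      h *= 0x100000001b3ull; // FNV-1a prime
    }
  }
  return h;
}
```

With 20 fields that differ in the first one, the comparison inspects one field, while hashing both sides walks all 40. When the messages are identical, both approaches have to look at everything, which is consistent with the uncached hash only helping slightly in that case.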
On the behavioral difference: since both hosts compute the hash the same way, the only edge case vs Equivalent is if two semantically equivalent messages have different serializations, which would cause a false "changed" detection. In practice, EDS metadata from the control plane has consistent serialization so this shouldn't happen.
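One way such a false "changed" detection could arise is sketched below. This is an illustration under stated assumptions, not a statement about how MessageUtil::hash behaves: `MetaSketch`, `equivalent`, and `strictHash` are hypothetical, with `equivalent` mimicking the documented MessageDifferencer::Equivalent rule that an unset field matches a field explicitly set to its default, and `strictHash` assumed to include field presence.

```cpp
#include <cstdint>

// Hypothetical message with one optional integer field.
struct MetaSketch {
  bool has_weight = false;
  int weight = 0;
};

// Equivalent-style comparison: an unset field is treated as its default (0).
inline bool equivalent(const MetaSketch& a, const MetaSketch& b) {
  const int wa = a.has_weight ? a.weight : 0;
  const int wb = b.has_weight ? b.weight : 0;
  return wa == wb;
}

// Stricter hash (assumption): presence is part of the hashed representation.
inline std::uint64_t strictHash(const MetaSketch& m) {
  std::uint64_t h = 0xcbf29ce484222325ull; // FNV-1a offset basis
  auto mix = [&h](unsigned char c) {
    h ^= c;
    h *= 0x100000001b3ull; // FNV-1a prime
  };
  mix(m.has_weight ? 1 : 0);
  const std::uint32_t w = static_cast<std::uint32_t>(m.has_weight ? m.weight : 0);
  for (int shift = 0; shift < 32; shift += 8) {
    mix(static_cast<unsigned char>((w >> shift) & 0xff));
  }
  return h;
}
```

Here an unset `weight` and an explicit `weight = 0` are equivalent but hash differently, so a hash-based check would report a change. If the control plane always emits the same representation, as the comment above argues, both sides take the same path and the hashes agree.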
…y comparison

Replace Protobuf::util::MessageDifferencer with MessageUtil::hash for equality checks in hot paths to reduce CPU during large EDS updates:
- updateDynamicHostList: metadata comparison (upstream_impl.cc)
- LocalityEndpointEqualTo: locality/endpoint comparison (locality_endpoint.h)

Hash comparison is typically faster than reflection-based MessageDifferencer and matches Envoy's MessageLiteDifferencer approach when full protos are disabled. Helps with clusters of ~5k endpoints where these comparisons run per host on each EDS update.

Signed-off-by: William Dauchy <william.dauchy@datadoghq.com>
Signed-off-by: William Dauchy <william.dauchy@datadoghq.com>
Commit Message:
Cache metadata hash on HostDescription to avoid expensive per-host comparisons during EDS updates. The hash is computed once when metadata is set (at host creation or update) and stored alongside the metadata. The comparison in updateDynamicHostList becomes a simple integer comparison instead of MessageDifferencer::Equivalent.
Additional Description:
For large clusters (~5-7k endpoints), updateDynamicHostList is a hot path where metadata comparison runs per host on each EDS update. Caching the hash eliminates the repeated serialization/comparison cost.
Benchmark results (5000 hosts, aarch64, -c opt):
- Equivalent: baseline (reflection-based proto comparison)
- Hash: recomputing hash each time (~8% faster)
- CachedHash: pre-computed hash comparison (~4000x faster)
Risk Level: Low
Testing: Benchmark added (test/common/upstream/metadata_comparison_benchmark.cc)
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A