-
Notifications
You must be signed in to change notification settings - Fork 9
Unified Storage Performance Optimizations #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: performance-optimization-baseline
Are you sure you want to change the base?
Unified Storage Performance Optimizations #10
Conversation
…#97529) * dont lazy init unified storage * Inits index when creating new resource server. Fixes trace propagation by passing span ctx. Update some logging. * Use finer grained cache locking when building indexes to speed things up. Locking the whole function was slowing things down. * formatting * linter fix * go mod * make update-workspace * fix workspaces check error * update dependency owner in mod file * wait 1 second before querying metrics * try with big timeout, see if fixes CI. Wont fail locally. * skips postgres integration test. Only fails in drone. Will fix later. * put delay back to 500 ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR implements several performance optimizations for Grafana's unified storage system. The changes focus on reducing request latency and improving observability while addressing some technical debt.
The most significant change is in pkg/storage/unified/resource/server.go, which refactors the ResourceServer initialization pattern. Instead of calling server.Init() on every request (which still incurred atomic operation overhead despite using sync.Once), initialization now happens once during server construction. This eliminates per-request initialization overhead and ensures the server is ready immediately after creation. The initialization sequence is also reordered to set up the search index before starting the watcher, preventing potential race conditions.
In pkg/storage/unified/resource/search.go, several improvements enhance observability and fix context handling. The changes properly capture contexts from tracer.Start() calls, ensuring distributed tracing works correctly and context cancellation propagates through search operations. New logging provides visibility into search index performance metrics, including initialization duration and document count.
The pkg/storage/unified/sql/backend.go file receives a critical fix for context propagation in the GetResourceStats method, ensuring the enriched context from OpenTelemetry's tracer is properly used for trace linking.
In pkg/storage/unified/search/bleve.go, the cache mutex lock duration is reduced by moving it from protecting the entire index building process to only the final cache update operation. This optimization aims to reduce lock contention during resource-intensive index building.
Finally, pkg/server/module_server_test.go adds a skip condition for PostgreSQL tests in the CI environment, addressing a Drone-specific test failure while acknowledging this as technical debt.
These changes align with the unified storage performance optimization goals by reducing per-request overhead, improving observability, and addressing bottlenecks in the search and storage systems.
Confidence score: 3/5
• This PR contains mostly solid optimizations but has one concerning race condition risk that could cause production issues
• The Bleve cache lock optimization introduces a potential race condition where multiple goroutines could simultaneously build the same index, leading to duplicate work and cache inconsistency
• Files pkg/storage/unified/search/bleve.go needs more attention due to the race condition risk in the cache lock optimization
5 files reviewed, 1 comment
|
|
||
| // The builder will write all documents before returning | ||
| builder func(index resource.ResourceIndex) (int64, error), | ||
| ) (resource.ResourceIndex, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Race condition: Multiple goroutines can now build the same index simultaneously, causing duplicate work and potential cache inconsistency
|
|
||
| // The builder will write all documents before returning | ||
| builder func(index resource.ResourceIndex) (int64, error), | ||
| ) (resource.ResourceIndex, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Race condition: Multiple goroutines can now build the same index simultaneously, causing duplicate work and potential cache inconsistency
|
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 2 weeks if no further activity occurs. Please feel free to give a status update or ping for review. Thank you for your contributions! |
|
This pull request has been automatically closed because it has not had any further activity in the last 2 weeks. Thank you for your contributions! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
Test 10