Skip to content

Conversation

@everettbu
Copy link

Test 10

…#97529)

* dont lazy init unified storage

* Inits index when creating new resource server. Fixes trace propagation by passing span ctx. Update some logging.

* Use finer grained cache locking when building indexes to speed things up. Locking the whole function was slowing things down.

* formatting

* linter fix

* go mod

* make update-workspace

* fix workspaces check error

* update dependency owner in mod file

* wait 1 second before querying metrics

* try with big timeout, see if fixes CI. Wont fail locally.

* skips postgres integration test. Only fails in drone. Will fix later.

* put delay back to 500 ms
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR implements several performance optimizations for Grafana's unified storage system. The changes focus on reducing request latency and improving observability while addressing some technical debt.

The most significant change is in pkg/storage/unified/resource/server.go, which refactors the ResourceServer initialization pattern. Instead of calling server.Init() on every request (which still incurred atomic operation overhead despite using sync.Once), initialization now happens once during server construction. This eliminates per-request initialization overhead and ensures the server is ready immediately after creation. The initialization sequence is also reordered to set up the search index before starting the watcher, preventing potential race conditions.

In pkg/storage/unified/resource/search.go, several improvements enhance observability and fix context handling. The changes properly capture contexts from tracer.Start() calls, ensuring distributed tracing works correctly and context cancellation propagates through search operations. New logging provides visibility into search index performance metrics, including initialization duration and document count.

The pkg/storage/unified/sql/backend.go file receives a critical fix for context propagation in the GetResourceStats method, ensuring the enriched context from OpenTelemetry's tracer is properly used for trace linking.

In pkg/storage/unified/search/bleve.go, the cache mutex lock duration is reduced by moving it from protecting the entire index building process to only the final cache update operation. This optimization aims to reduce lock contention during resource-intensive index building.

Finally, pkg/server/module_server_test.go adds a skip condition for PostgreSQL tests in the CI environment, addressing a Drone-specific test failure while acknowledging this as technical debt.

These changes align with the unified storage performance optimization goals by reducing per-request overhead, improving observability, and addressing bottlenecks in the search and storage systems.

Confidence score: 3/5

• This PR contains mostly solid optimizations but has one concerning race condition risk that could cause production issues
• The Bleve cache lock optimization introduces a potential race condition where multiple goroutines could simultaneously build the same index, leading to duplicate work and cache inconsistency
• Files pkg/storage/unified/search/bleve.go needs more attention due to the race condition risk in the cache lock optimization

5 files reviewed, 1 comment

Edit Code Review Bot Settings | Greptile


// The builder will write all documents before returning
builder func(index resource.ResourceIndex) (int64, error),
) (resource.ResourceIndex, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Race condition: Multiple goroutines can now build the same index simultaneously, causing duplicate work and potential cache inconsistency


// The builder will write all documents before returning
builder func(index resource.ResourceIndex) (int64, error),
) (resource.ResourceIndex, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Race condition: Multiple goroutines can now build the same index simultaneously, causing duplicate work and potential cache inconsistency

@github-actions
Copy link
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 2 weeks if no further activity occurs. Please feel free to give a status update or ping for review. Thank you for your contributions!

@github-actions github-actions bot added the stale label Aug 28, 2025
@github-actions
Copy link
Contributor

This pull request has been automatically closed because it has not had any further activity in the last 2 weeks. Thank you for your contributions!

@github-actions github-actions bot closed this Sep 12, 2025
@everettbu everettbu reopened this Sep 12, 2025
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants