Skip to content

Conversation

@akshayutture-augment
Copy link
Contributor

…#97529)

* dont lazy init unified storage

* Inits index when creating new resource server. Fixes trace propagation by passing span ctx. Update some logging.

* Use finer grained cache locking when building indexes to speed things up. Locking the whole function was slowing things down.

* formatting

* linter fix

* go mod

* make update-workspace

* fix workspaces check error

* update dependency owner in mod file

* wait 1 second before querying metrics

* try with big timeout, see if fixes CI. Wont fail locally.

* skips postgres integration test. Only fails in drone. Will fix later.

* put delay back to 500 ms
@akshayutture-augment
Copy link
Contributor Author

@claude review this PR

1 similar comment
@akshayutture-augment
Copy link
Contributor Author

@claude review this PR

@claude
Copy link

claude bot commented Nov 20, 2025

Claude finished @akshayutture-augment's task —— View job


PR Review Complete

Tasks

  • Analyze PR changes and context
  • Review modified files for code quality
  • Check for bugs and performance issues
  • Provide detailed feedback

Summary

This PR makes important performance optimizations and initialization improvements to the unified storage system. The main changes include moving from lazy initialization to eager initialization, fixing trace context propagation, and improving concurrent index building performance.

Key Changes

1. Eager Initialization (pkg/storage/unified/resource/server.go)

  • Good: Moved Init() call to constructor instead of lazy initialization on first request
  • Good: Removed repetitive Init() checks from every method (Create, Read, Update, Delete, List, Watch, Search, etc.)
  • Benefit: Reduces latency on first requests and simplifies code

2. Trace Context Propagation Fixes (multiple files)

  • Good: Fixed trace context propagation in search.go:173, search.go:310, and backend.go:126
  • Changed from _, span := s.tracer.Start(ctx, ...) to ctx, span := s.tracer.Start(ctx, ...)
  • Benefit: Proper trace context flow for observability

3. Concurrent Index Building Optimization (pkg/storage/unified/search/bleve.go)

  • Excellent: Moved cache lock from entire BuildIndex() function to just the cache write operation
  • Before: Lock held during entire index building (lines 88-138)
  • After: Lock only held for cache insertion (lines 137-139)
  • Benefit: Significant performance improvement for parallel index building

4. Minor Improvements

  • ✅ Added logging for search index initialization with duration (search.go:216)
  • ✅ Fixed incorrect logger import in search.go (line 11, removed grafana-cli/logger)
  • ✅ Removed unnecessary log statement in bleve.go (line 99-101)

Issues & Recommendations

🟡 Medium Priority

1. Initialization Order Change (pkg/storage/unified/resource/server.go:303-310)

// Before: Init watcher before search index
// After: Init search index before watcher

Concern: The order was swapped - search index initialization now happens before watcher initialization. While this appears intentional (search needs to be ready before watching events), the PR description doesn't explicitly mention this change.

Recommendation: Verify this order change is intentional and consider adding a comment explaining why search must be initialized before the watcher.

2. Test Skip for Postgres (pkg/server/module_server_test.go:35-38)

if dbType == "postgres" {
    t.Skip("skipping - test not working with postgres in Drone. Works locally.")
}

Concern: Skipping tests hides potential issues. The comment says it "works locally" but not in CI.

Recommendation: This should be tracked in a follow-up issue. Flaky tests can hide real problems and reduce confidence in the test suite.

🟢 Low Priority / Observations

3. Error Handling After Init (pkg/storage/unified/resource/server.go:258-262)
The new eager initialization properly returns errors, but callers need to handle them. The previous lazy init pattern would fail later.

Observation: This is actually an improvement - fail fast at startup rather than on first request.

4. Performance Metrics
The PR claims "speed up indexing" but doesn't include benchmarks or metrics showing the improvement.

Suggestion: Consider adding benchmark results or load test metrics in the PR description to quantify the performance gains from the locking optimization.

Security & Correctness

✅ No security vulnerabilities identified
✅ No race conditions introduced (locking is now more granular, not removed)
✅ Error handling appears correct
✅ Context propagation now correct

Code Quality

✅ Code is cleaner with removed repetitive Init() checks
✅ Proper use of sync.Once pattern for initialization
✅ Good use of defer for lock/unlock
✅ Consistent error handling

Testing

⚠️ One test skipped for Postgres (see issue #2 above)
✅ No other test modifications needed (which is good - indicates backward compatibility)

Verdict

Overall: Approve with minor concerns

This is a solid performance optimization PR that simplifies the codebase while improving performance. The main improvements are:

  1. Cleaner code with eager initialization
  2. Better observability with fixed trace propagation
  3. Significant performance gains from optimized locking

The concerns are minor and don't block merging, but should be tracked for follow-up.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants