Dual Storage Architecture #5
base: dual-storage-baseline
Conversation
* Dual writer: mode 3
* Add integration tests for playlits in mode 3
* Remove todo
* Update pkg/apiserver/rest/dualwriter_mode3.go
  Co-authored-by: Arati R. <[email protected]>
* Admin: Fixes an issue where user accounts could not be enabled (#88117)
  Fix: unable to enable user
* [REVIEW] FInish mode 3 and add tests
* Improve logging
* Update dependencies
* Update pkg/apiserver/rest/dualwriter_mode3_test.go
  Co-authored-by: maicon <[email protected]>
* remove test assertion
* Use mode log when dual writer is initiated
---------
Co-authored-by: Arati R. <[email protected]>
Co-authored-by: gonvee <[email protected]>
Co-authored-by: maicon <[email protected]>
@coderabbitai review
✅ Actions performed
Review triggered.
Walkthrough
This pull request adds a […]
Changes
Sequence Diagram(s)
sequenceDiagram
participant Test
participant DualWriterMode3
participant Storage
participant Legacy
Test->>DualWriterMode3: List(ctx, options)
DualWriterMode3->>Storage: List from unified storage
Storage-->>DualWriterMode3: runtime.Object
DualWriterMode3-->>Test: runtime.Object, error
Note over DualWriterMode3: Logs context & tracks duration
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
pkg/tests/apis/playlist/playlist_test.go (1)
132-144: Mode3 playlist integration coverage for file/unified/etcd backends mirrors existing modes
The three new subtests, "with dual write (file, mode 3)", "with dual write (unified storage, mode 3)", and "with dual write (etcd, mode 3)", all correctly:
- Set the expected APIServerStorageType ("file", "unified", "etcd").
- Enable featuremgmt.FlagKubernetesPlaylists, plus featuremgmt.FlagUnifiedStorage for unified.
- Configure DualWriterDesiredModes to grafanarest.Mode3 for playlistv0alpha1.GROUPRESOURCE.
- Reuse doPlaylistTests so Mode3 flows get the same CRUD and cross-API assertions as modes 0–2.
This is a straightforward and consistent extension of coverage. If duplication across modes becomes painful, you could later switch this whole group to a table-driven pattern, but it’s not necessary for this change.
Also applies to: 191-204, 287-312
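As a rough illustration of the table-driven pattern mentioned above, the three Mode3 subtests could be described as data and generated in a loop. This is only a sketch: storageCase, mode3Cases, subtestName, featureFlags, and the flag strings are hypothetical stand-ins, not the helpers or constants used in playlist_test.go.

```go
package main

import "fmt"

// storageCase describes one dual-write subtest. All names are hypothetical
// stand-ins for the options the real test helper accepts.
type storageCase struct {
	storageType  string // value for the storage-type option
	label        string // backend name used in the subtest title
	needsUnified bool   // whether the unified-storage flag is also enabled
}

// mode3Cases returns the three backends exercised by the Mode3 subtests.
func mode3Cases() []storageCase {
	return []storageCase{
		{storageType: "file", label: "file"},
		{storageType: "unified", label: "unified storage", needsUnified: true},
		{storageType: "etcd", label: "etcd"},
	}
}

// subtestName builds the title that would be passed to t.Run.
func subtestName(c storageCase, mode int) string {
	return fmt.Sprintf("with dual write (%s, mode %d)", c.label, mode)
}

// featureFlags lists the flags a case would enable (placeholder strings).
func featureFlags(c storageCase) []string {
	flags := []string{"kubernetesPlaylists"}
	if c.needsUnified {
		flags = append(flags, "unifiedStorage")
	}
	return flags
}

func main() {
	for _, c := range mode3Cases() {
		// In a real test this body would be t.Run(subtestName(c, 3), ...).
		fmt.Println(subtestName(c, 3), c.storageType, featureFlags(c))
	}
}
```

Adding a fourth backend or another mode would then mean adding a table row rather than copying a subtest.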
pkg/apiserver/rest/dualwriter_mode3_test.go (1)
17-78: Mode3 unit tests provide good CRUD/List coverage; consider tighter backend-specific assertions later
The new table-driven tests for Mode3:
- Cover Create, Get, List, Delete, DeleteCollection, and Update success and error paths.
- Use legacyStoreMock/storageMock wrappers to plug a single *mock.Mock into both Legacy and Storage.
- Verify:
  - Returned objects vs exampleObj/anotherObj.
  - List outputs vs exampleList/anotherList.
  - Error vs non-error flows based on test cases.
This is solid baseline coverage for the new List method and Mode3’s “write to both, read from Storage” semantics.
If you later need stricter guarantees, two incremental improvements to consider:
- Differentiate legacy vs storage interactions
  Using separate mocks for Legacy and Storage (or asserting call counts per backend) would let you express expectations like “Legacy is not called when Storage fails on Create” or “DeleteCollection always triggers the async legacy delete”. Today the shared *mock.Mock makes those distinctions harder to encode.
- Optionally assert metrics side-effects
  Once metrics wiring in dualwriter_mode3.go is corrected, focused tests around dualWriterMetrics would help catch regressions in storage vs legacy duration labeling.
None of this blocks the current PR; the existing tests already add meaningful coverage over Mode3 behavior.
Also applies to: 80-133, 135-187, 189-241, 243-295, 297-356
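To make the "separate mocks" suggestion concrete, here is a minimal sketch that uses one hand-rolled fake per backend and checks call counts. It assumes a simplified, synchronous dual writer (the reviewed Mode3 code writes to Legacy in a goroutine); fakeBackend, dualWriter, and newFailingBackend are hypothetical names, not types from the Grafana codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeBackend is a hand-rolled stand-in for one backend that counts calls.
type fakeBackend struct {
	createCalls int
	err         error
}

func (f *fakeBackend) Create(obj string) (string, error) {
	f.createCalls++
	if f.err != nil {
		return "", f.err
	}
	return obj, nil
}

// newFailingBackend returns a backend whose Create always fails.
func newFailingBackend() *fakeBackend {
	return &fakeBackend{err: errors.New("storage unavailable")}
}

// dualWriter is a simplified model of "write to Storage first, then Legacy".
type dualWriter struct {
	storage *fakeBackend
	legacy  *fakeBackend
}

func (d *dualWriter) Create(obj string) (string, error) {
	created, err := d.storage.Create(obj)
	if err != nil {
		return "", err // Legacy is never touched when Storage fails
	}
	// Legacy errors do not fail the request in this sketch.
	_, _ = d.legacy.Create(obj)
	return created, nil
}

func main() {
	d := &dualWriter{storage: newFailingBackend(), legacy: &fakeBackend{}}
	_, err := d.Create("playlist-1")
	fmt.Println(err != nil, d.storage.createCalls, d.legacy.createCalls)
}
```

Because each backend has its own counter, the expectation "Legacy is not called when Storage fails on Create" becomes a direct assertion on legacy.createCalls.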
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- go.work.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
- pkg/apiserver/rest/dualwriter_mode1_test.go (0 hunks)
- pkg/apiserver/rest/dualwriter_mode3.go (2 hunks)
- pkg/apiserver/rest/dualwriter_mode3_test.go (1 hunks)
- pkg/tests/apis/playlist/playlist_test.go (3 hunks)
💤 Files with no reviewable changes (1)
- pkg/apiserver/rest/dualwriter_mode1_test.go
🔇 Additional comments (2)
pkg/apiserver/rest/dualwriter_mode3.go (2)
24-26: Constructor and mode constant look consistent with other dual-writer modes
newDualWriterMode3 wiring (legacy, storage, metrics, and a per-instance logger labeled with mode3Str) is coherent and should integrate cleanly with existing dual-writer infrastructure.
Also applies to: 33-34
62-76: Get/List behavior for Mode3 reads from Storage looks correct and consistent
Get and the new List method both:
- Attach a per-call logger with useful fields (kind, name or resourceVersion, method).
- Measure Storage latency via recordStorageDuration(err != nil, ...).
- Read exclusively from Storage, which matches Mode3’s “read from new storage” semantics.
- Asynchronous legacy writes are intentionally absent from List, which is appropriate since Mode3’s read-only behavior doesn’t require legacy interaction on list.
No functional issues spotted here.
Also applies to: 78-92
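The read path described above (time the Storage call, pass err != nil as the failure flag, never touch Legacy) can be sketched as follows. The recorder type is an in-memory stand-in rather than the real metrics implementation. The argument order of recordStorageDuration follows the calls quoted in this review, and the "3" mode label is an assumed value for mode3Str.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// sample is one recorded observation.
type sample struct {
	failed  bool
	mode    string
	kind    string
	method  string
	elapsed time.Duration
}

// recorder collects samples in memory instead of exporting real metrics.
type recorder struct{ samples []sample }

func (r *recorder) recordStorageDuration(failed bool, mode, kind, method string, start time.Time) {
	r.samples = append(r.samples, sample{
		failed: failed, mode: mode, kind: kind, method: method,
		elapsed: time.Since(start),
	})
}

// list reads only from the storage function and records one sample per call.
func list(r *recorder, kind string, storageList func() ([]string, error)) ([]string, error) {
	start := time.Now()
	items, err := storageList()
	r.recordStorageDuration(err != nil, "3", kind, "list", start) // "3" is an assumed mode label
	return items, err
}

func okList() ([]string, error) {
	return []string{"a", "b"}, nil
}

func failingList() ([]string, error) {
	return nil, errors.New("storage unavailable")
}

func main() {
	r := &recorder{}
	list(r, "playlist", okList)
	list(r, "playlist", failingList)
	for _, s := range r.samples {
		fmt.Printf("method=%s kind=%s failed=%t\n", s.method, s.kind, s.failed)
	}
}
```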
 func (d *DualWriterMode3) Create(ctx context.Context, obj runtime.Object, createValidation rest.ValidateObjectFunc, options *metav1.CreateOptions) (runtime.Object, error) {
-	log := klog.FromContext(ctx)
+	var method = "create"
+	log := d.Log.WithValues("kind", options.Kind, "method", method)
+	ctx = klog.NewContext(ctx, log)

+	startStorage := time.Now()
 	created, err := d.Storage.Create(ctx, obj, createValidation, options)
 	if err != nil {
 		log.Error(err, "unable to create object in storage")
+		d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage)
 		return created, err
 	}
+	d.recordStorageDuration(false, mode3Str, options.Kind, method, startStorage)

-	if _, err := d.Legacy.Create(ctx, obj, createValidation, options); err != nil {
-		log.WithValues("object", created).Error(err, "unable to create object in legacy storage")
-	}
-	return created, nil
+	go func() {
+		ctx, cancel := context.WithTimeoutCause(ctx, time.Second*10, errors.New("legacy create timeout"))
+		defer cancel()

+		startLegacy := time.Now()
+		_, errObjectSt := d.Legacy.Create(ctx, obj, createValidation, options)
+		d.recordLegacyDuration(errObjectSt != nil, mode3Str, options.Kind, method, startLegacy)
+	}()

+	return created, err
 }
🧩 Analysis chain
Metrics helpers mis-wired and one logging-context inconsistency in Mode3 write paths
There are a few subtle but important inconsistencies in how metrics and logging are wired:
- Create error path (Lines 43–46)
  On storage failure you record a legacy duration using the storage start time and never record a storage failure metric:
  d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage)
  This should be a storage metric to accurately reflect that the error originates from Storage.Create.
- Delete success path label mismatch (Line 106)
  You record storage duration using name instead of options.Kind for the “kind” label:
  d.recordStorageDuration(false, mode3Str, name, method, startStorage)
  All other methods use options.Kind in this position, so this will pollute metrics with object names instead of resource kinds.
- Delete logging context (Line 97)
  You build an enriched logger as
  log := d.Log.WithValues("name", name, "kind", options.Kind, "method", method)
  but then attach the base logger to the context:
  ctx = klog.NewContext(ctx, d.Log)
  This is inconsistent with Create/Get/List/DeleteCollection/Update and drops the per-call fields from the context logger.
- Update error path (Lines 127–130)
  On storage failure you again record a legacy duration with the storage start time:
  d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage)
  This should be a storage failure metric.
- DeleteCollection legacy path (Lines 161–167)
  In the legacy goroutine you call d.recordStorageDuration instead of d.recordLegacyDuration:
  d.recordStorageDuration(err != nil, mode3Str, options.Kind, method, startLegacy)
  This double-counts storage metrics and never records legacy DeleteCollection timings.
These don’t break functional behavior but will significantly skew observability and make it harder to reason about Mode3 traffic across backends.
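To illustrate why the Delete label mismatch matters, the sketch below counts how many distinct metric series appear when the "kind" label carries the resource kind versus the object name. The labels struct and counter type are assumptions made for this example and do not reflect how dualWriterMetrics is actually defined.

```go
package main

import "fmt"

// labels identifies one metric series (an assumed layout for this sketch).
type labels struct {
	backend string
	kind    string
	method  string
	failed  bool
}

// counter counts observations per distinct label set.
type counter map[labels]int

func (c counter) observe(backend, kind, method string, failed bool) {
	c[labels{backend, kind, method, failed}]++
}

// deleteAll records one storage sample per deleted object, using labelFor
// to choose the value placed in the "kind" label.
func deleteAll(c counter, names []string, labelFor func(name string) string) {
	for _, n := range names {
		c.observe("storage", labelFor(n), "delete", false)
	}
}

// byKind labels every sample with the resource kind.
func byKind(string) string {
	return "playlist"
}

// byName reproduces the mismatch by labelling with the object name.
func byName(name string) string {
	return name
}

func main() {
	names := []string{"a", "b", "c"}
	good, bad := counter{}, counter{}
	deleteAll(good, names, byKind)
	deleteAll(bad, names, byName)
	fmt.Println(len(good), len(bad)) // prints "1 3"
}
```

With the kind label the three deletes land in a single series; with the name label every object creates its own series, so series count grows with the number of objects.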
A minimal, consistent fix would look like:
@@ func (d *DualWriterMode3) Create(...
- if err != nil {
- log.Error(err, "unable to create object in storage")
- d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage)
- return created, err
- }
- d.recordStorageDuration(false, mode3Str, options.Kind, method, startStorage)
+ if err != nil {
+ log.Error(err, "unable to create object in storage")
+ d.recordStorageDuration(true, mode3Str, options.Kind, method, startStorage)
+ return created, err
+ }
+ d.recordStorageDuration(false, mode3Str, options.Kind, method, startStorage)
@@ func (d *DualWriterMode3) Delete(...
- log := d.Log.WithValues("name", name, "kind", options.Kind, "method", method)
- ctx = klog.NewContext(ctx, d.Log)
+ log := d.Log.WithValues("name", name, "kind", options.Kind, "method", method)
+ ctx = klog.NewContext(ctx, log)
@@
- d.recordStorageDuration(false, mode3Str, name, method, startStorage)
+ d.recordStorageDuration(false, mode3Str, options.Kind, method, startStorage)
@@ func (d *DualWriterMode3) Update(...
- if err != nil {
- log.Error(err, "unable to update in storage")
- d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage)
- return res, async, err
- }
- d.recordStorageDuration(false, mode3Str, options.Kind, method, startStorage)
+ if err != nil {
+ log.Error(err, "unable to update in storage")
+ d.recordStorageDuration(true, mode3Str, options.Kind, method, startStorage)
+ return res, async, err
+ }
+ d.recordStorageDuration(false, mode3Str, options.Kind, method, startStorage)
@@ func (d *DualWriterMode3) DeleteCollection(...
- _, err := d.Legacy.DeleteCollection(ctx, deleteValidation, options, listOptions)
- d.recordStorageDuration(err != nil, mode3Str, options.Kind, method, startLegacy)
+ _, err := d.Legacy.DeleteCollection(ctx, deleteValidation, options, listOptions)
+ d.recordLegacyDuration(err != nil, mode3Str, options.Kind, method, startLegacy)
Also applies to: 94-117, 120-144, 147-170
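The attribution rule behind the fix above is that each sample is recorded against the backend that produced it. The following sketch shows a write path with a background Legacy write. The metrics type, the "backend:outcome" keys, and the returned WaitGroup are hypothetical simplifications chosen so the behavior can be observed in a small program; the reviewed code neither returns a WaitGroup nor uses these keys.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// metrics counts samples per backend and outcome.
type metrics struct {
	mu     sync.Mutex
	counts map[string]int
}

func newMetrics() *metrics {
	return &metrics{counts: map[string]int{}}
}

func (m *metrics) record(backend string, failed bool) {
	key := backend + ":ok"
	if failed {
		key = backend + ":failed"
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counts[key]++
}

func (m *metrics) get(key string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counts[key]
}

type createFunc func(obj string) error

// create writes to storage synchronously and to legacy in the background.
// The WaitGroup lets callers wait for the background write to finish.
func create(m *metrics, storage, legacy createFunc, obj string) (*sync.WaitGroup, error) {
	wg := &sync.WaitGroup{}
	if err := storage(obj); err != nil {
		m.record("storage", true) // the failure belongs to storage, not legacy
		return wg, err
	}
	m.record("storage", false)
	wg.Add(1)
	go func() {
		defer wg.Done()
		m.record("legacy", legacy(obj) != nil)
	}()
	return wg, nil
}

func succeed(string) error {
	return nil
}

func fail(string) error {
	return errors.New("backend error")
}

func main() {
	m := newMetrics()
	wg, err := create(m, fail, succeed, "playlist-1")
	wg.Wait()
	fmt.Println(err != nil, m.get("storage:failed"), m.get("legacy:failed"))
}
```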
I'll verify the review comment by examining the actual code in the repository. Let me retrieve the file and check each claim.
🏁 Script executed:
#!/bin/bash
# Extract the full dualwriter_mode3.go file to verify all claims
cat pkg/apiserver/rest/dualwriter_mode3.go
Length of output: 7431
🏁 Script executed:
#!/bin/bash
# Alternative: use rg to find the specific methods mentioned
rg -n "func \(d \*DualWriterMode3\) (Create|Delete|Update|DeleteCollection)" pkg/apiserver/rest/dualwriter_mode3.go -A 50
Length of output: 6805
All five issues confirmed as accurate; critical metrics and logging inconsistencies need fixing.
Verification of the actual code confirms every claim in the review comment:
- Create error path (line 45): Correctly identified. recordLegacyDuration is used on Storage.Create failure instead of recordStorageDuration.
- Delete label mismatch (line 106): Correctly identified. Delete is the only method using name instead of options.Kind for the "kind" metric label; all other methods (Create, Get, List, Update, DeleteCollection) use options.Kind.
- Delete logging context (line 97): Correctly identified. Delete attaches base d.Log to context instead of the enriched log created on line 96; all other methods attach the enriched logger.
- Update error path (line 129): Correctly identified. recordLegacyDuration is used on Storage.Update failure instead of recordStorageDuration.
- DeleteCollection legacy path (line 166): Correctly identified. The legacy goroutine calls recordStorageDuration with startLegacy timing instead of recordLegacyDuration, double-counting storage metrics.
The suggested fixes in the review are appropriate. These inconsistencies will indeed cause observability issues and misattribution of metrics to the wrong backend paths.
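The Delete logging-context issue can be reproduced in isolation. The sketch below uses the standard library's log/slog in place of klog, with local newContext and fromContext helpers playing the role of klog.NewContext and klog.FromContext. The field values are made up for the example.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
	"strings"
)

type loggerKey struct{}

// newContext and fromContext are local helpers that stand in for
// klog.NewContext and klog.FromContext.
func newContext(ctx context.Context, l *slog.Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, l)
}

func fromContext(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.Default()
}

// downstream stands in for a backend call that logs via the context logger.
func downstream(ctx context.Context) {
	fromContext(ctx).Info("deleting object")
}

// logDelete returns what downstream logs when either the enriched logger or
// the base logger is attached to the context.
func logDelete(attachEnriched bool) string {
	var buf bytes.Buffer
	base := slog.New(slog.NewTextHandler(&buf, nil))
	enriched := base.With("name", "my-playlist", "kind", "playlist", "method", "delete")
	ctx := context.Background()
	if attachEnriched {
		ctx = newContext(ctx, enriched)
	} else {
		ctx = newContext(ctx, base) // the reported bug: per-call fields are lost
	}
	downstream(ctx)
	return buf.String()
}

// hasCallFields reports whether a log line carries the per-call fields.
func hasCallFields(line string) bool {
	return strings.Contains(line, "kind=playlist") && strings.Contains(line, "method=delete")
}

func main() {
	fmt.Println(hasCallFields(logDelete(true)), hasCallFields(logDelete(false)))
}
```

Only the context that carries the enriched logger produces downstream log lines with the kind and method fields.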
🤖 Prompt for AI Agents
pkg/apiserver/rest/dualwriter_mode3.go lines 36-60 (and related locations noted
in review): multiple metric/logging mis-attributions need correction—1) at
Storage.Create error path (around line 45) replace the incorrect call to
recordLegacyDuration with recordStorageDuration and keep the rest of the error
handling intact; 2) align Delete metric "kind" label (around line 106) to use
options.Kind instead of name so it matches other methods; 3) fix Delete logging
context (around line 97) to attach the enriched log (log) to the context instead
of the base d.Log; 4) at Storage.Update error path (around line 129) replace the
incorrect recordLegacyDuration call with recordStorageDuration; 5) in
DeleteCollection legacy goroutine (around line 166) replace the incorrect
recordStorageDuration call with recordLegacyDuration and pass the correct
startLegacy timing so storage and legacy durations aren’t double-counted.
Test 6
Replicated from ai-code-review-evaluation/grafana-coderabbit#6