MB-39292: 1/n set_collections persist current manifest
set_collections only allows 'forward' progress after checking
that the new manifest is a successor of the current manifest,
however after warm-up we have to accept whatever we are given.
This commit updates set_collections so that, for persistent buckets,
the new manifest is first stored in the database directory and
we then update from that manifest on success. Warm-up now brings
the manifest back from storage, so we can validate that further
updates are a valid successor.
Ephemeral buckets just update with no background task.
This patch stores using Manifest::toJSON and reloads JSON using
Manifest's existing construction with no integrity checking.
Change-Id: Ie548e31f56c4847ecf4c0c4ad866544f6bcd2a5c
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/137161
Reviewed-by: Dave Rigby <[email protected]>
Tested-by: Jim Walker <[email protected]>
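
The flow described in the commit message can be sketched roughly as below. This is a minimal illustration, not the kv_engine code: the Bucket class, the manifest.json file name, the dict-shaped manifest and the "uid must not go backwards" successor check are all assumptions made for the example, as is the write-to-temp-then-rename step.

```python
import json
import os
import tempfile


class Bucket:
    """Hypothetical stand-in for a bucket; not the kv_engine class."""

    def __init__(self, db_dir=None):
        # db_dir is None for an ephemeral bucket, which persists nothing.
        self.db_dir = db_dir
        self.current = {"uid": 0, "scopes": []}

    def _path(self):
        # Hypothetical file name inside the database directory.
        return os.path.join(self.db_dir, "manifest.json")

    def warmup(self):
        # Reload the stored JSON as-is, with no integrity checking.
        if self.db_dir is not None and os.path.exists(self._path()):
            with open(self._path(), encoding="utf-8") as f:
                self.current = json.load(f)

    def set_collections(self, new):
        # Placeholder successor check: only forward progress is accepted.
        if new["uid"] < self.current["uid"]:
            return False
        if self.db_dir is not None:
            # Persistent bucket: store first, update only once that succeeded.
            fd, tmp = tempfile.mkstemp(dir=self.db_dir)
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(new, f)
            os.replace(tmp, self._path())
        # Ephemeral buckets fall straight through to the update.
        self.current = new
        return True
```

After a restart, a new Bucket on the same directory can call warmup() and then reject a manifest that is older than the stored one, which is the validation that was not possible before.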
Aim: Better manifest comparison - full path checks on update

E.g. If I currently have /scope1/c1 and those have ids sid:8 and cid:8 I want to detect:

/scope2/c1 (sid:9 and cid:8) // collection moved to a new scope
/scope2/c1 (sid:8 and cid:8) // scope changed name
/scope1/c2 (sid:8 and cid:8) // collection name changed

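
A rough sketch of the three detections listed in the aim, assuming both manifests are available with names. The dict layout ("scopes", "sid", "cid", "name") and the function names are made up for the example and are not the Collections::Manifest API.

```python
def flatten(manifest):
    """Map each collection id to (scope id, scope name, collection name)."""
    out = {}
    for scope in manifest["scopes"]:
        for coll in scope["collections"]:
            out[coll["cid"]] = (scope["sid"], scope["name"], coll["name"])
    return out


def path_changes(current, incoming):
    """List (cid, description) for every collection whose path changed."""
    old, new = flatten(current), flatten(incoming)
    changes = []
    for cid in sorted(old.keys() & new.keys()):
        old_sid, old_scope, old_name = old[cid]
        new_sid, new_scope, new_name = new[cid]
        if old_sid != new_sid:
            changes.append((cid, "collection moved to a new scope"))
        elif old_scope != new_scope:
            changes.append((cid, "scope changed name"))
        if old_name != new_name:
            changes.append((cid, "collection name changed"))
    return changes


def single(sid, scope, cid, name):
    """Build a manifest holding one scope with one collection."""
    return {"scopes": [{"sid": sid, "name": scope,
                        "collections": [{"cid": cid, "name": name}]}]}


current = single(8, "scope1", 8, "c1")
moved = path_changes(current, single(9, "scope2", 8, "c1"))
renamed_scope = path_changes(current, single(8, "scope2", 8, "c1"))
renamed_coll = path_changes(current, single(8, "scope1", 8, "c2"))
```

Note this only works when names are known on both sides, which is exactly what the problems below are about.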
Problem(s)

1) vbucket_manifest only has IDs
So if we try and do these detections during the vb.update code we have no name
for comparison and cannot detect any of the above cases

2) Collections::Manifest stores the last received manifest from ns_server, we
can apply detections there as it has scope and collection names - but if we
warmup this object is 'empty'.

3) Leading into how to solve warmup issue, we don't persist scope name, only id
in _local


Per vbucket we do store collection names 1) in _local and 2) in system events
Per vbucket we store scope names in 1) system events only (nothing in _local)


Ideas:

I think comparing Collections::Manifest A against Collections::Manifest B is something to consider, and it means we need a way to warm up something comparable. Could also mean some weird stuff from a replica?


Weird stuff:

Node {
  Active VB1: Thinks /scope1/c1
  Replica VB2: Thinks /scope2/c1 (as it got that over replication) - is this possible in the quorum case?
}
or

Node {
  Active VB1: Thinks /scope1/c1 (sid:8, cid:8)
  Replica VB2: Thinks /scope2/c2 (sid:8, cid:8) (as it got that over replication) - is this possible in the quorum case?
}

promote VB2 to active

new data structure that can be populated from warmup (_local) and allows storage of {id} ?? this is the manifest right?

hashes?


Options:

A) Be defensive for *every* manifest update - this means KV

a) checks manifest.uid is incrementing
b) checks, for any scope/collection it already knows, that the immutable properties are equal (i.e. id == id && name == name)

Reject if the incoming manifest fails checks
Apply manifest otherwise

This has challenges in terms of comparison (b) - only the 'bucket' Manifest stores names in-memory and that data is populated from ns_server. Following warmup we cannot validate the input.

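
Option A's accept/reject rule could look something like the sketch below. The manifest shape ({"uid", "scopes": {sid: name}, "collections": {cid: (sid, name)}}) and the validate_update name are assumptions for illustration; passing current=None models the post-warmup case where there is nothing in memory to compare against.

```python
def validate_update(current, incoming):
    """Return (accepted, reason) for an incoming manifest under option A."""
    if current is None:
        # Following warmup there are no names in memory, so (b) cannot run.
        return True, "nothing to compare against"
    # (a) manifest.uid must be incrementing.
    if incoming["uid"] <= current["uid"]:
        return False, "uid is not incrementing"
    # (b) immutable properties of known scopes must be equal.
    for sid, name in current["scopes"].items():
        if sid in incoming["scopes"] and incoming["scopes"][sid] != name:
            return False, "scope %d changed name" % sid
    # (b) immutable properties of known collections must be equal.
    for cid, props in current["collections"].items():
        if cid in incoming["collections"] and incoming["collections"][cid] != props:
            return False, "collection %d changed scope or name" % cid
    return True, "ok"


known = {"uid": 4, "scopes": {8: "scope1"}, "collections": {8: (8, "c1")}}
good = {"uid": 5, "scopes": {8: "scope1", 9: "scope2"},
        "collections": {8: (8, "c1"), 9: (9, "c2")}}
bad = {"uid": 5, "scopes": {8: "scope2"}, "collections": {8: (8, "c1")}}
```

Scopes and collections missing from the incoming manifest are treated as drops and are not checked here.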
Solutions?

Store manifest:
Persist each manifest (we always trust the first Manifest when KV goes from uid 0 to uid 0+). Means storing a new file and possibly changing the collection update logic - i.e. steps become newfile.onpersist(update vbuckets to new manifest).
A new file, hmm: flatbuffers + crc32c, not a couchstore file. I guess we already have extra files such as the access log - however if this file gets damaged, warmup fails.

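
To make the "damaged file means warmup fails" point concrete, here is a small sketch of a checksummed manifest file. It deliberately swaps in JSON for flatbuffers and zlib.crc32 (plain CRC-32) for crc32c, since those are what the Python standard library provides; the 4-byte-checksum-then-payload layout and the function names are invented for the example.

```python
import json
import struct
import zlib


def encode_manifest(manifest):
    """Serialise a manifest as: 4-byte big-endian CRC-32, then the payload."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return struct.pack(">I", zlib.crc32(payload)) + payload


def decode_manifest(blob):
    """Parse a blob from encode_manifest, raising ValueError if damaged."""
    if len(blob) < 4:
        raise ValueError("manifest file is truncated")
    (stored,) = struct.unpack(">I", blob[:4])
    payload = blob[4:]
    if zlib.crc32(payload) != stored:
        # The caller (warmup) has nothing trustworthy to load at this point.
        raise ValueError("manifest checksum mismatch")
    return json.loads(payload.decode("utf-8"))


blob = encode_manifest({"uid": 5, "scopes": []})
damaged = blob[:-1] + bytes([blob[-1] ^ 0xFF])  # flip the bits of the last byte
```

The checksum only detects damage; it does not say what warmup should do next, which is the open question in the note above.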
Use vbuckets persisted data:
Re-assemble a manifest from the active vbuckets at warmup. We could warm up with vb 0 being ahead of vb 1 (if we crashed before vb 1 persisted a collection change), which is not necessarily a problem: given that an update only ever begins if the manifest is progressing, we could use the greatest manifest found? However we support a push that can create/drop many collections and these get split over many flush batches; only the final flush updates the manifest id, so we could get vbuckets reporting equal manifest uid but being different to each other.
This is getting messy and the partial persistence issue is a problem.

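
The partial persistence issue can be shown with a toy model. Each vbucket is reduced to (persisted manifest uid, set of collection ids); that shape and the reassemble function are assumptions for illustration, not how the vbucket state is actually stored.

```python
def reassemble(vbuckets):
    """Pick the greatest persisted manifest uid and report whether every
    vbucket at that uid agrees on the set of collections.

    vbuckets maps vbid -> (manifest_uid, frozenset of collection ids).
    """
    best_uid = max(uid for uid, _ in vbuckets.values())
    views = {cids for uid, cids in vbuckets.values() if uid == best_uid}
    return best_uid, len(views) == 1


# vb 0 is simply ahead of vb 1: taking the greatest uid is unambiguous.
ahead = {0: (5, frozenset({8, 9})), 1: (4, frozenset({8}))}

# A multi-collection push was split over flush batches and we crashed before
# the final flush: both vbuckets still report uid 4 but differ in content.
partial = {0: (4, frozenset({8, 9})), 1: (4, frozenset({8}))}
```

In the second case there is no single "greatest manifest" to pick, so the uid alone cannot be used to rebuild something comparable.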
Overall though, do we need option A), or can we trust that ns_server updates (non forced) are compliant?

B) Only care for 'force' update (quorum was forcefully removed and we may be given a manifest which changes collection state in some of the unusual ways identified earlier)

In the force update case, we can 'afford' to take our time? I.e. read system events or _local data to get names - ephemeral will read system events, persistent can just do one read if it wants - scope names are not on disk.