-
Notifications
You must be signed in to change notification settings - Fork 2
Fix triggering replication when multiple destinations are set #331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -114,16 +114,19 @@ class ReplicationStatusUpdater { | |
| * Determines if an object should be updated based on its replication metadata properties. | ||
| * @private | ||
| * @param {ObjectMD} objMD - The metadata of the object. | ||
| * @param {string} site - The destination site name. | ||
| * @returns {boolean} True if the object should be updated. | ||
| */ | ||
| _objectShouldBeUpdated(objMD) { | ||
| _objectShouldBeUpdated(objMD, site) { | ||
| return this.replicationStatusToProcess.some(filter => { | ||
| if (filter === 'NEW') { | ||
| // Either site specific replication info is missing | ||
| // or are initialized with empty fields. | ||
| return (!objMD.getReplicationInfo() | ||
| || objMD.getReplicationInfo().status === ''); | ||
| || !objMD.getReplicationSiteStatus(site)); | ||
| } | ||
| return (objMD.getReplicationInfo() | ||
| && objMD.getReplicationInfo().status === filter); | ||
| && objMD.getReplicationSiteStatus(site) === filter); | ||
| }); | ||
| } | ||
|
|
||
|
|
@@ -172,36 +175,54 @@ class ReplicationStatusUpdater { | |
| // codebase easier to maintain and upgrade, as opposed to having multiple branches or versions of | ||
| // the code for different schema versions. | ||
| objMD = new ObjectMD(JSON.parse(mdRes.Body)); | ||
| if (!this._objectShouldBeUpdated(objMD)) { | ||
| if (!this._objectShouldBeUpdated(objMD, storageClass)) { | ||
| skip = true; | ||
| return process.nextTick(next); | ||
| } | ||
| // Initialize replication info, if missing | ||
| // This is particularly important if the object was created before | ||
| // enabling replication on the bucket. | ||
| if (!objMD.getReplicationInfo() | ||
| || !objMD.getReplicationSiteStatus(storageClass)) { | ||
| let replicationInfo = objMD.getReplicationInfo(); | ||
| if (!replicationInfo || !replicationInfo.status) { | ||
| const { Rules, Role } = repConfig; | ||
| const destination = Rules[0].Destination.Bucket; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this may not be correct, if there is more than 1 desitination....
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. even without triggering multiple sites at once : if a user requests replicatoin to the second "site", this will initialize with the 1st destination (--> trigger replication!), then we will add the second one... When there is no replicationInfo (i.e. typically when it is empty), should we not just initialize an empty replicationInfo, and let the next block ("Update replication info with site specific info") fill the details for the requested destination?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in cloudserver, it seems this is initialized to
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a weird one... So, we currently only support having a single destination bucket for all replication rules of a bucket. When creating the rules/workflows via UI it's even worse, the destination bucket becomes the name of the bucket we are replicating from. |
||
| // set replication properties | ||
| const ops = objMD.getContentLength() === 0 ? ['METADATA'] | ||
| : ['METADATA', 'DATA']; | ||
| const backends = [{ | ||
| site: storageClass, | ||
| status: 'PENDING', | ||
| dataStoreVersionId: '', | ||
| }]; | ||
| const replicationInfo = { | ||
| replicationInfo = { | ||
| status: 'PENDING', | ||
| backends, | ||
| content: ops, | ||
| backends: [], | ||
| destination, | ||
| storageClass, | ||
| storageClass: '', | ||
| role: Role, | ||
| storageType: this.storageType, | ||
| storageType: '', | ||
|
Comment on lines
+197
to
+199
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what are these 2 fields (
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. that is how these fields are used... my question is really what information do they actually store, what these fields represent (for example, the storageClass of the object is already known, and stored in same for
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the code which initializes these in cloudserver: --> it seems these fields (along with
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In Zenko, these fields are duplicates of information we already have, as backbeat/cloudserver have a list of all location information, we could just use the storage class stored in the rules to get the info we want. I think these are more of a relic from S3C that we can't really remove right now as S3C uses them. In CRR, the role is also a list. I don't think this works in Zenko tho.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
"soon", we will have a single arsenal, cloudserver & backbeat : so we can start planning to clean this up, possibly removing (deprecating) redudant fields and aligning the way CRR and multi-backend replication store/manage the state? can you please create a tech debt ticket, to track this (future) work and document the problem/situation/state of the analysis? |
||
| }; | ||
| objMD.setReplicationInfo(replicationInfo); | ||
| } | ||
| // Update replication info with site specific info | ||
| if (!objMD.getReplicationSiteStatus(storageClass)) { | ||
| // When replicating to multiple destinations, | ||
| // the storageClass and storageType properties | ||
| // become comma-separated lists of the storage | ||
| // classes and types of the replication destinations. | ||
| const storageClasses = objMD.getReplicationStorageClass() | ||
| ? `${objMD.getReplicationStorageClass()},${storageClass}` : storageClass; | ||
| objMD.setReplicationStorageClass(storageClasses); | ||
| if (this.storageType) { | ||
| const storageTypes = objMD.getReplicationStorageType() | ||
| ? `${objMD.getReplicationStorageType()},${this.storageType}` : this.storageType; | ||
| objMD.setReplicationStorageType(storageTypes); | ||
| } | ||
| // Add site to the list of replication backends | ||
| const backends = objMD.getReplicationBackends(); | ||
| backends.push({ | ||
| site: storageClass, | ||
| status: 'PENDING', | ||
| dataStoreVersionId: '', | ||
| }); | ||
| objMD.setReplicationBackends(backends); | ||
| } | ||
|
|
||
| objMD.setReplicationSiteStatus(storageClass, 'PENDING'); | ||
| objMD.setReplicationStatus('PENDING'); | ||
|
|
@@ -273,13 +294,16 @@ class ReplicationStatusUpdater { | |
| }), | ||
| (repConfig, next) => { | ||
| const { Rules } = repConfig; | ||
| const storageClass = Rules[0].Destination.StorageClass || this.siteName; | ||
| const storageClass = this.siteName || Rules[0].Destination.StorageClass; | ||
| if (!storageClass) { | ||
| const errMsg = 'missing SITE_NAME environment variable, must be set to' | ||
| + ' the value of "site" property in the CRR configuration'; | ||
| this.log.error(errMsg); | ||
| return next(new Error(errMsg)); | ||
| } | ||
| if (!this.siteName) { | ||
| this.log.warn(`missing SITE_NAME environment variable, triggering replication to the ${storageClass} storage class`); | ||
| } | ||
| return eachLimit(versions, this.workers, (i, apply) => { | ||
| const { Key, VersionId } = i; | ||
| this._markObjectPending(bucket, Key, VersionId, storageClass, repConfig, apply); | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this true as well for CRR (in s3c) ?
esp. we should check that there is no "legacy" data format where we would not have the per-site status, and where we should rely only on the global state...
(e.g. maybe if CRR sometimes has objet which are marked with
replicationInfo.status === 'COMPLETE'andsitelisted instorageClass, but no site status...)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicolas2bert @jonathan-gramain what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replication metadata are managed the same in both S3C and Zenko so yes this should work fine.
I looked back to the initial format of the metadata, there was always a global status.