1) Add functionality to restore cloud-transitioned objects on demand.
The current commit does the following:
* Given <bucket, object>, fetch the object from the cloud endpoint.
* If <days> is provided and > 0, the restore is marked temporary with an expiry date.
* Without <days>, it is marked as a permanent restore.
2) Use ObjectExpirer/delete_at attr to delete temp objects
For temporarily restored objects, set the delete_at attr to the expiration time.
This adds those objects to the ObjectExpirer list. The LC worker thread scans
that list and deletes expired objects. "Delete" here means deleting the
restored object data and resetting the HEAD object to the cloud-transitioned
state it was in before the restore.
In addition, the following changes are made:
* If temporary, the object is still marked RGWObj::CloudTiered and mtime is set to the
transition time.
* If permanent, the object is marked RGWObj::Main and mtime is set to the restore time (now()).
* rgw_restore_debug_interval option added to configure restore Days (similar to rgw_lc_debug_interval)
There is an issue in the ObjectExpirer code wherein an object that is added
to the ObjectExpirer list and then re-written is not deleted from the expirer list,
and hence the new object may get deleted. Fixed this and also addressed
minor review comments.
3) Design doc added
4) ObjCategory should be set to CloudTiered only for cloud-transitioned
objects and temporarily restored objects. Permanent copies are to be
treated as regular objects.
Signed-off-by: Soumya Koduri <[email protected]>
The [`cloud-transition`](https://docs.ceph.com/en/latest/radosgw/cloud-transition) feature enables data transition to a remote cloud service as part of Lifecycle Configuration via Storage Classes. However, the transition is unidirectional; data cannot be transitioned back from the remote zone.

The `cloud-restore` feature enables restoration of those transitioned objects from the remote cloud S3 endpoints back into RGW.

The objects can be restored either by using the S3 `restore-object` CLI or via `read-through`. The restored copies can be either temporary or permanent.

## S3 restore-object CLI

The goal here is to implement the minimal functionality of the [`S3RestoreObject`](https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html) API so that users can restore cloud-transitioned objects.

```sh
aws s3api restore-object \
    --bucket <value> \
    --key <value> (can be object name or * for Bulk restore) \
    [--version-id <value>] \
    --restore-request (structure) {
        // for temporary restore
        { "Days": integer }
        // if Days is not provided, it will be considered as a permanent copy
    }
```
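As a minimal sketch of how the temporary/permanent distinction maps onto the request, the helper below builds the arguments for a RestoreObject call. The function name `build_restore_request` is hypothetical. The resulting dictionary uses the parameter names of boto3's `restore_object` call (`Bucket`, `Key`, `VersionId`, `RestoreRequest`), on the assumption that a boto3 client is pointed at the RGW endpoint. Rejecting a non-positive `Days` value is a choice made for this sketch only.

```python
from typing import Any, Dict, Optional


def build_restore_request(bucket: str, key: str,
                          days: Optional[int] = None,
                          version_id: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for an S3 RestoreObject request (hypothetical helper)."""
    if days is not None and days <= 0:
        # Sketch-only choice: treat a non-positive Days value as a caller error.
        raise ValueError("Days must be a positive integer")
    params: Dict[str, Any] = {"Bucket": bucket, "Key": key}
    if version_id is not None:
        params["VersionId"] = version_id
    # Days present -> temporary copy; Days absent -> permanent copy.
    params["RestoreRequest"] = {"Days": days} if days is not None else {}
    return params
```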
This CLI may be extended in the future to include custom parameters (like target-bucket, storage-class, etc.) specific to RGW.


## read-through

As per the cloud-transition feature functionality, cloud-transitioned objects cannot be read. `GET` on those objects fails with an ‘InvalidObjectState’ error.

With this restore feature, however, transitioned objects can be restored and read. New tier-config options `allow_read_through` and `read_through_restore_days` are added for this purpose. Only when `allow_read_through` is enabled will `GET` on a transitioned object restore it from the S3 endpoint.

Note: The object copy restored via `read-through` is temporary and is retained only for the duration of `read_through_restore_days`.
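The read-through decision described above can be sketched as follows. This is an illustration only: `TierConfig`, `handle_get`, and the returned action strings are hypothetical names, and the default of 1 for `read_through_restore_days` is an assumption of the sketch, not a documented default.

```python
from dataclasses import dataclass


@dataclass
class TierConfig:
    # Field names follow the tier-config options named in the text.
    allow_read_through: bool = False
    read_through_restore_days: int = 1  # assumed default, for this sketch only


def handle_get(is_cloud_transitioned: bool, is_restored: bool, cfg: TierConfig) -> str:
    """Return the action a GET would take on an object (hypothetical action names)."""
    if not is_cloud_transitioned or is_restored:
        return "serve-data"
    if cfg.allow_read_through:
        # Read-through copies are always temporary.
        return f"restore-temporary:{cfg.read_through_restore_days}"
    return "InvalidObjectState"
```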

## Design

* Similar to the cloud-transition feature, this feature currently works **only for S3-compatible cloud endpoints**.
* This feature works only for **cloud-transitioned objects**. In order to validate this, the `retain_head_object` option should be set to true so that the object’s `HEAD` object can be verified before restoring the object.

* **Request flow:**
  * Once the `HEAD` object is verified, its cloudtier storage class config details are fetched.
    Note: In case the cloudtier storage-class is deleted/updated, the object may not be restored.
  * RestoreStatus for the `HEAD` object is marked `RestoreAlreadyInProgress`.
  * Object restore is done asynchronously by issuing either an S3 `GET` or S3 `RESTORE` request to the remote endpoint.
  * Once the object is restored, RestoreStatus is updated to `CloudRestored` and RestoreType is set to either `Temporary` or `Permanent`.
  * In case the operation fails, RestoreStatus is marked as `RestoreFailed`.

* **New attrs:** Below are the new attrs being added
  * `user.rgw.restore-expiry-date`: <Expiration time in case of temporary copies>
  * `user.rgw.cloudtier_storage_class`: <CloudTier storage class used in case of temporarily restored copies>
```cpp
enum RGWRestoreStatus : uint8_t {
  None = 0,
  RestoreAlreadyInProgress = 1,
  CloudRestored = 2,
  RestoreFailed = 3
};
enum class RGWRestoreType : uint8_t {
  None = 0,
  Temporary = 1,
  Permanent = 2
};
```
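To illustrate how the request flow moves an object through these states, here is a rough Python model. It mirrors the two enums above, but the functions `start_restore` and `finish_restore` are hypothetical, and resetting the restore type to `NONE` on failure is an assumption of this sketch rather than documented behaviour.

```python
from enum import IntEnum
from typing import Optional, Tuple


class RestoreStatus(IntEnum):
    NONE = 0
    RESTORE_ALREADY_IN_PROGRESS = 1
    CLOUD_RESTORED = 2
    RESTORE_FAILED = 3


class RestoreType(IntEnum):
    NONE = 0
    TEMPORARY = 1
    PERMANENT = 2


def start_restore(current: RestoreStatus) -> RestoreStatus:
    """Mark the HEAD object as being restored; reject a concurrent restore."""
    if current == RestoreStatus.RESTORE_ALREADY_IN_PROGRESS:
        raise RuntimeError("RestoreAlreadyInProgress")
    return RestoreStatus.RESTORE_ALREADY_IN_PROGRESS


def finish_restore(succeeded: bool,
                   days: Optional[int]) -> Tuple[RestoreStatus, RestoreType]:
    """Return the final status and type once the asynchronous fetch completes."""
    if not succeeded:
        # Assumption: the type is cleared when the restore fails.
        return RestoreStatus.RESTORE_FAILED, RestoreType.NONE
    if days is not None and days > 0:
        return RestoreStatus.CLOUD_RESTORED, RestoreType.TEMPORARY
    return RestoreStatus.CLOUD_RESTORED, RestoreType.PERMANENT
```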

* **Response:**
  * `S3 restore-object CLI` returns SUCCESS - either the 200 OK or 202 Accepted status code.
    * If the object is not previously restored, then RGW returns 202 Accepted in the response.
    * If the object is previously restored, RGW returns 200 OK in the response.
    * Special errors:
      * Code: RestoreAlreadyInProgress (Cause: Object restore is already in progress.)
      * Code: ObjectNotFound (if the object is not found in the cloud endpoint)
      * Code: I/O error (for any other I/O errors during restore)
  * `GET request` continues to return an ‘InvalidObjectState’ error until the object is successfully restored.
    * S3 head-object can be used to verify if the restore is still in progress.
    * Once the object is restored, GET will return the object data.
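The response rules above could be sketched as a small lookup on the current restore status. The function names are hypothetical. The 409 status for `RestoreAlreadyInProgress` is an assumption borrowed from the AWS S3 error list; the text above does not say which HTTP code RGW uses for it.

```python
from typing import Tuple


def restore_object_response(status: str) -> Tuple[int, str]:
    """Map the current restore status to a restore-object reply (hypothetical)."""
    if status == "CloudRestored":
        return 200, "OK"  # object was previously restored
    if status == "RestoreAlreadyInProgress":
        # Assumed HTTP code; AWS S3 documents 409 Conflict for this error.
        return 409, "RestoreAlreadyInProgress"
    return 202, "Accepted"  # restore newly accepted


def get_object_response(status: str) -> str:
    """GET returns data only after the restore has completed."""
    return "data" if status == "CloudRestored" else "InvalidObjectState"
```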


* **StorageClass**: By default, the objects are restored to the `STANDARD` storage class. However, as per [AWS S3 Restore](https://docs.aws.amazon.com/cli/latest/reference/s3api/restore-object.html), the storage-class remains the same for restored objects. Hence for the temporary copies, the `x-amz-storage-class` returned contains the original cloudtier storage-class.
  * Note: A new tier-config option may be added to select the storage-class to restore the objects to.

* **mtime**: If the restored object is temporary, the object is still marked `RGWObj::CloudTiered` and mtime is not changed, i.e., it is still set to the transition time. But in case the object is a permanent copy, it is marked `RGWObj::Main` and mtime is updated to the restore time (now()).
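The StorageClass and mtime rules can be summarised in a small sketch. `HeadMeta` and `restored_head` are hypothetical names, and the category strings `"CloudTiered"` and `"Main"` simply stand in for `RGWObj::CloudTiered` and `RGWObj::Main`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class HeadMeta:
    category: str        # stands in for the RGWObj category
    mtime: datetime
    storage_class: str   # value reported as x-amz-storage-class


def restored_head(transition_time: datetime, cloudtier_storage_class: str,
                  permanent: bool, now: Optional[datetime] = None) -> HeadMeta:
    """Compute the HEAD metadata after a restore (illustrative only)."""
    if permanent:
        restore_time = now or datetime.now(timezone.utc)
        # Permanent copies become regular objects in the default storage class.
        return HeadMeta("Main", restore_time, "STANDARD")
    # Temporary copies keep the transition time and the cloudtier storage class.
    return HeadMeta("CloudTiered", transition_time, cloudtier_storage_class)
```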

* **Lifecycle**:
  * `Temporary` copies are not subjected to any further transition to the cloud. However (as is the case with cloud-transitioned objects), they can be deleted via regular LC expiration rules or via an external S3 Delete request.
  * `Permanent` copies are treated as regular objects and are subject to any applicable LC rules.

* **Replication**: The restored objects (both temporary and permanent) are also replicated like regular objects and will be deleted across the zones post expiration.

* **VersionedObjects**: In case of versioning, if any object is cloud-transitioned, it would have been non-current. Post restore too, the same non-current object will be updated with the downloaded data and its HEAD object will be updated accordingly, as is the case with regular objects.

* **Temporary Object Expiry**: This is done via the Object Expirer.
  * When the object is restored as temporary, `user.rgw.restore-expiry-date` is set accordingly and the `delete_at` attr is also updated with the same value.
  * This object is then added to the list used by `ObjectExpirer`.
  * The `LC` worker thread scans that list and, post expiry, resets the objects back to the cloud-transitioned state, i.e.,
    * HEAD object with size=0
    * new attrs removed
    * `delete_at` reset
  * Note: A new RGW option `rgw_restore_debug_interval` is added, which when set will be considered as the `Days` value (similar to `rgw_lc_debug_interval`).
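A rough sketch of how the expiry time for a temporary copy might be derived is given below. It assumes that `rgw_restore_debug_interval`, like `rgw_lc_debug_interval`, shortens each "day" to the configured number of seconds; that reading is an assumption, and `restore_expiry` and `is_expired` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SECONDS_PER_DAY = 24 * 60 * 60


def restore_expiry(restored_at: datetime, days: int,
                   debug_interval: Optional[int] = None) -> datetime:
    """Return the time at which a temporary copy expires (the delete_at value)."""
    # Assumption: a positive debug interval replaces the length of one day.
    if debug_interval is not None and debug_interval > 0:
        seconds_per_day = debug_interval
    else:
        seconds_per_day = SECONDS_PER_DAY
    return restored_at + timedelta(seconds=days * seconds_per_day)


def is_expired(delete_at: datetime, now: datetime) -> bool:
    """True once the worker scanning the expirer list should reset the object."""
    return now >= delete_at
```

With a debug interval of 60, a 2-day temporary restore would expire two minutes after it completes, which keeps test cycles short.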

* **FAILED Restore**: In case the restore operation fails,
  * The HEAD object will be updated accordingly, i.e., the storage-class is reset to the original cloud-tier storage class.
  * All the new attrs added will be removed, except for `user.rgw.restore-status`, which will be updated to `RestoreFailed`.

* **Check Restore Progress**: Users can issue an S3 `head-object` request to check if the restore is done or still in progress for any object.
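For illustration, a client could interpret the restore information returned by `head-object` roughly as below. The sketch assumes RGW reports restore state in the same `ongoing-request="...", expiry-date="..."` format that AWS S3 uses for the `x-amz-restore` header; the text above does not confirm the exact format, and `parse_restore_header` is a hypothetical helper.

```python
import re
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple

# Matches key="value" pairs such as ongoing-request="true".
_FIELD = re.compile(r'([\w-]+)="([^"]*)"')


def parse_restore_header(value: str) -> Tuple[bool, Optional[datetime]]:
    """Return (restore still in progress?, expiry date if one is reported)."""
    fields = dict(_FIELD.findall(value))
    ongoing = fields.get("ongoing-request") == "true"
    expiry = fields.get("expiry-date")
    return ongoing, parsedate_to_datetime(expiry) if expiry else None
```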

* **RGW down/restarts** - Since the restore operation is asynchronous, we need to keep track of the objects being restored. In case RGW goes down or restarts, this data will be used to retrigger ongoing restore requests or do appropriate cleanup for the failed requests.

* **Compression** - If the placement-target to which the objects are being restored has compression enabled, the data will be compressed accordingly (bug2294512)

* **Encryption** - If the restored object is encrypted, the old sse-related xattrs/keys from the HEAD stub will be copied back into object metadata (bug2294512)

* **Delete cloud object post restore** - Once the object is successfully restored, the object at the remote endpoint is still retained. However, we could choose to delete it for permanently restored copies by adding a new tier-config option.


## Future work

* **Bulk Restore**: In the case of BulkRestore, some of the objects may not be restored. The user needs to manually cross-check the objects to see which ones are restored or InProgress.

* **Admin CLIs**: Admin debug commands will be provided to start, check the status of, and cancel restore operations.