Commit 82d56dc

Move the section about "is_ready_to_use" to under "CreateSnapshot"
1 parent a90cc88 commit 82d56dc

1 file changed

+27
-27
lines changed

spec.md

Lines changed: 27 additions & 27 deletions
Original file line number | Diff line number | Diff line change
@@ -1420,9 +1420,34 @@ CO SHOULD then reissue the same `CreateSnapshotRequest` periodically until boole
14201420
If an error occurs during the process, `CreateSnapshot` SHOULD return a corresponding gRPC error code that reflects the error condition.
14211421

14221422
A snapshot MAY be used as the source to provision a new volume.
1423-
A CreateVolumeRequest message may specify an OPTIONAL source snapshot parameter.
1423+
A CreateVolumeRequest message MAY specify an OPTIONAL source snapshot parameter.
14241424
Reverting a snapshot, where data in the original volume is erased and replaced with data in the snapshot, is an advanced functionality not every storage system can support and therefore is currently out of scope.
14251425

1426+
##### The is_ready_to_use Parameter
1427+
1428+
Some SPs MAY "process" the snapshot after the snapshot is cut, for example, maybe uploading the snapshot somewhere after the snapshot is cut.
1429+
The post-cut process may be a long process that could take hours.
1430+
The CO MAY freeze the application using the source volume before taking the snapshot.
1431+
The purpose of `freeze` is to ensure the application data is in consistent state.
1432+
When `freeze` is performed, the container is paused and the application is also paused.
1433+
When `thaw` is performed, the container and the application start running again.
1434+
During the snapshot processing phase, since the snapshot is already cut, a `thaw` operation can be performed so application can start running without waiting for the process to complete.
1435+
The `is_ready_to_use` parameter of the snapshot will become `true` after the process is complete.
1436+
1437+
For cloud providers and storage systems that don't have the process, the `is_ready_to_use` parameter SHOULD be `true` after the snapshot is cut.
1438+
`thaw` can be done when the `is_ready_to_use` parameter is `true` in this case.
1439+
1440+
The `is_ready_to_use` parameter provides guidance to the CO on when it can "thaw" the application in the process of snapshotting.
1441+
If the cloud provider or storage system needs to process the snapshot after the snapshot is cut, the `is_ready_to_use` parameter returned by CreateSnapshot SHALL be `false`.
1442+
CO MAY continue to call CreateSnapshot while waiting for the process to complete until `is_ready_to_use` becomes `true`.
1443+
Note that CreateSnapshot no longer blocks after the snapshot is cut.
1444+
1445+
A gRPC error code SHALL be returned if an error occurs during any stage of the snapshotting process.
1446+
A CO SHOULD explicitly delete snapshots when an error occurs.
1447+
1448+
Based on this information, CO can issue repeated (idempotent) calls to CreateSnapshot, monitor the response, and make decisions.
1449+
Note that CreateSnapshot is a synchronous call and it MUST block until the snapshot is cut.
1450+
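The freeze, cut, thaw, and poll sequence described in this section can be sketched from the CO's side as follows. This is only an illustrative sketch, not part of the specification: `Snapshot`, `SnapshotError`, `snapshot_with_thaw`, and the four callables are hypothetical stand-ins for the real gRPC client calls and the CO's own freeze/thaw hooks, and the polling interval and retry limit are arbitrary assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Snapshot:
    """Hypothetical stand-in for the snapshot returned by CreateSnapshot."""
    snapshot_id: str
    is_ready_to_use: bool


class SnapshotError(Exception):
    """Hypothetical stand-in for a gRPC error code returned by the plugin."""


def snapshot_with_thaw(
    create_snapshot: Callable[[], Snapshot],
    delete_snapshot: Callable[[], None],
    freeze: Callable[[], None],
    thaw: Callable[[], None],
    poll_interval: float = 1.0,  # assumed value, not taken from the spec
    max_polls: int = 100,        # assumed value, not taken from the spec
) -> Snapshot:
    """Freeze, cut the snapshot, thaw, then poll until is_ready_to_use."""
    freeze()
    try:
        try:
            # CreateSnapshot blocks until the snapshot is cut.
            snap = create_snapshot()
        finally:
            # Cut or failed, the application no longer needs to stay frozen.
            thaw()
        polls = 0
        while not snap.is_ready_to_use:
            if polls >= max_polls:
                raise TimeoutError("snapshot was not ready in time")
            time.sleep(poll_interval)
            # Reissue the same (idempotent) request to check progress.
            snap = create_snapshot()
            polls += 1
    except SnapshotError:
        # The CO SHOULD explicitly delete snapshots when an error occurs.
        delete_snapshot()
        raise
    return snap
```

In this sketch the application is thawed as soon as the first call returns, regardless of the value of `is_ready_to_use`, because the snapshot has already been cut at that point. Later calls only track post-cut processing.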
14261451
```protobuf
14271452
message CreateSnapshotRequest {
14281453
// The ID of the source volume to be snapshotted.
@@ -1508,7 +1533,7 @@ The CO MUST implement the specified error recovery behavior when it encounters t
15081533
|-----------|-----------|-------------|-------------------|
15091534
| Snapshot already exists but is incompatible | 6 ALREADY_EXISTS | Indicates that a snapshot corresponding to the specified snapshot `name` already exists but is incompatible with the specified `volume_id`. | Caller MUST fix the arguments or use a different `name` before retrying. |
15101535
| Operation pending for snapshot | 10 ABORTED | Indicates that there is already an operation pending for the specified snapshot. In general the Cluster Orchestrator (CO) is responsible for ensuring that there is no more than one call "in-flight" per snapshot at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same snapshot. The Plugin, SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified snapshot, and then retry with exponential back off. |
1511-
| Not enough space to create snapshot | 13 RESOURCE_EXHAUSTED | There is not enough space on the storage system to handle the create snapshot request. | Caller should fail this request. Future calls to CreateSnapshot may succeed if space is freed up. |
1536+
| Not enough space to create snapshot | 13 RESOURCE_EXHAUSTED | There is not enough space on the storage system to handle the create snapshot request. | Caller should fail this request. Future calls to CreateSnapshot MAY succeed if space is freed up. |
15121537

15131538

15141539
#### `DeleteSnapshot`
@@ -1638,31 +1663,6 @@ If a `CreateSnapshot` operation times out before the snapshot is cut, leaving th
16381663

16391664
It is NOT REQUIRED for a controller plugin to implement the `LIST_SNAPSHOTS` capability if it supports the `CREATE_DELETE_SNAPSHOT` capability: the onus is upon the CO to take into consideration the full range of plugin capabilities before deciding how to proceed in the above scenario.
16401665

1641-
##### The is_ready_to_use Parameter
1642-
1643-
Some cloud providers will process the snapshot after the snapshot is cut, i.e., uploading the snapshot to a location in the cloud (i.e., an object store) after the snapshot is cut.
1644-
A process such as uploading may be a long process that could take hours.
1645-
If a `freeze` operation was done on the application before taking the snapshot, it could be a long time before the application can be running again if we wait until the process is complete to `thaw` the application.
1646-
The purpose of `freeze` is to ensure the application data is in consistent state.
1647-
When `freeze` is performed, the container is paused and the application is also paused.
1648-
When `thaw` is performed, the container and the application start running again.
1649-
During the snapshot processing phase, since the snapshot is already cut, a `thaw` operation can be performed so application can start running without waiting for the process to complete.
1650-
The `is_ready_to_use` parameter of the snapshot will become `true` after the process is complete.
1651-
1652-
For cloud providers and storage systems that don't have the process, the `is_ready_to_use` parameter should be `true` after the snapshot is cut.
1653-
`thaw` can be done when the `is_ready_to_use` parameter is `true` in this case.
1654-
1655-
If the cloud provider or storage system needs to process the snapshot after the snapshot is cut, the `is_ready_to_use` parameter returned by CreateSnapshot SHALL be `false`.
1656-
CO MAY continue to call CreateSnapshot while waiting for the process to complete until `is_ready_to_use` becomes `true`.
1657-
Note that CreateSnapshot no longer blocks after the snapshot is cut.
1658-
1659-
A gRPC error code SHALL be returned if an error occurs during any stage of the snapshotting process.
1660-
A CO SHOULD explicitly delete snapshots when an error occurs.
1661-
1662-
The `is_ready_to_use` parameter provides guidance to the CO on what action can be taken in the process of snapshotting.
1663-
Based on this information, CO can issue repeated (idempotent) calls to CreateSnapshot, monitor the response, and make decisions.
1664-
Note that CreateSnapshot is a synchronous call and it must block until the snapshot is cut.
1665-
16661666
ListSnapshots SHALL return with current information regarding the snapshots on the storage system.
16671667
When processing is complete, the `is_ready_to_use` parameter of the snapshot from ListSnapshots SHALL become `true`.
16681668
The downside of calling ListSnapshots is that ListSnapshots will not return a gRPC error code if an error occurs during the processing. So calling CreateSnapshot repeatedly is the preferred way to check if the processing is complete.
