Commit 881281c

OADP 2041 timeouts
1 parent 5e9c18e commit 881281c

9 files changed: +458 -0 lines changed

backup_and_restore/application_backup_and_restore/troubleshooting.adoc

Lines changed: 8 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -77,6 +77,14 @@ include::modules/migration-debugging-velero-admission-webhooks-ibm-appconnect.ad
7777
* xref:../../architecture/admission-plug-ins.adoc#admission-webhook-types_admission-plug-ins[Types of webhook admission plugins]
7878

7979
include::modules/oadp-installation-issues.adoc[leveloffset=+1]
80+
include::modules/oadp-timeouts.adoc[leveloffset=+1]
81+
include::modules/oadp-restic-timeouts.adoc[leveloffset=+2]
82+
include::modules/oadp-velero-timeouts.adoc[leveloffset=+2]
83+
include::modules/oadp-datamover-timeouts.adoc[leveloffset=+2]
84+
include::modules/oadp-csi-snapshot-timeouts.adoc[leveloffset=+2]
85+
include::modules/oadp-velero-default-timeouts.adoc[leveloffset=+2]
86+
include::modules/oadp-item-restore-timeouts.adoc[leveloffset=+2]
87+
include::modules/oadp-item-backup-timeouts.adoc[leveloffset=+2]
8088
include::modules/oadp-backup-restore-cr-issues.adoc[leveloffset=+1]
8189
include::modules/oadp-restic-issues.adoc[leveloffset=+1]
8290

modules/oadp-csi-snapshot-timeouts.adoc

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="CSIsnapshot-timeout_{context}"]
7+
= CSI snapshot timeout
8+
9+
`CSISnapshotTimeout` specifies how long to wait during snapshot creation for the CSI `VolumeSnapshot` status to become `ReadyToUse` before returning a timeout error. The default value is `10m`.
10+
11+
Use the `CSISnapshotTimeout` for the following scenarios:
12+
13+
* With the CSI plugin.
14+
* For very large storage volumes that might take longer than 10 minutes to snapshot. Adjust this timeout if timeout errors appear in the logs.
15+
16+
[NOTE]
17+
====
18+
Typically, the default value for `CSISnapshotTimeout` does not require adjustment, because the default setting can accommodate large storage volumes.
19+
====
20+
21+
.Procedure
22+
* Edit the value of `spec.csiSnapshotTimeout` in the `Backup` CR manifest, as in the following example:
23+
+
24+
[source,yaml]
----
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
spec:
  csiSnapshotTimeout: 10m
# ...
----
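
As an illustration of adjusting this setting, the following sketch raises the timeout for a backup that includes a very large storage volume. The `30m` value is an assumption chosen for the example, not a recommended setting. Base the actual value on the snapshot durations and timeout errors that appear in your logs.

[source,yaml]
----
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
spec:
  csiSnapshotTimeout: 30m # illustrative value only; the default is 10m
# ...
----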

modules/oadp-datamover-timeouts.adoc

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="Datamover-timeout_{context}"]
7+
= Data Mover timeout
8+
9+
`timeout` is a user-supplied limit on how long `VolumeSnapshotBackup` and `VolumeSnapshotRestore` operations can take to complete. The default value is `10m`.
10+
11+
Use the Data Mover `timeout` for the following scenarios:
12+
13+
* If creation of `VolumeSnapshotBackups` (VSBs) and `VolumeSnapshotRestores` (VSRs) times out after 10 minutes.
14+
* For large-scale environments with total PV data usage that is greater than 500 GB. Set the timeout to `1h`.
15+
* With the `VolumeSnapshotMover` (VSM) plugin.
16+
* Only with OADP 1.1.x.
17+
18+
.Procedure
19+
* Edit the value of `spec.features.dataMover.timeout` in the `DataProtectionApplication` CR manifest, as in the following example:
20+
+
21+
[source,yaml]
----
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  features:
    dataMover:
      timeout: 10m
# ...
----
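
For the large-scale scenario listed in this module (total PV data usage greater than 500 GB), the same manifest with the timeout set to `1h` might look like the following sketch. Only the `timeout` value differs from the example above.

[source,yaml]
----
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  features:
    dataMover:
      timeout: 1h # suggested in this module for more than 500 GB of PV data
# ...
----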

modules/oadp-item-backup-timeouts.adoc

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="Item-operation-timeout-backup_{context}"]
7+
= Item operation timeout - backup
8+
9+
`ItemOperationTimeout` specifies how long to wait for asynchronous
10+
`BackupItemAction` operations. The default value is `1h`.
11+
12+
Use the backup `ItemOperationTimeout` for the following scenarios:
13+
14+
* Only with Data Mover 1.2.x.
15+
* For Data Mover uploads to and downloads from the `BackupStorageLocation`. If the backup action is not complete when the timeout is reached, it is marked as failed. If Data Mover operations fail because of timeouts caused by large storage volumes, you might need to increase this timeout setting.
16+
17+
.Procedure
18+
* Edit the value of `spec.itemOperationTimeout` in the `Backup` CR manifest, as in the following example:
19+
+
20+
[source,yaml]
----
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
spec:
  itemOperationTimeout: 1h
# ...
----
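
If Data Mover backups of large volumes are marked as failed when the one-hour default is reached, a longer timeout might look like the following sketch. The `2h` value is an assumption made for illustration and is not taken from the source; size it to the upload times that you observe.

[source,yaml]
----
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_name>
spec:
  itemOperationTimeout: 2h # illustrative value only; the default is 1h
# ...
----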
30+

modules/oadp-item-restore-timeouts.adoc

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="Item-operation-timeout-restore_{context}"]
7+
= Item operation timeout - restore
8+
9+
`ItemOperationTimeout` specifies how long to wait for `RestoreItemAction` operations. The default value is `1h`.
10+
11+
Use the restore `ItemOperationTimeout` for the following scenarios:
12+
13+
* Only with Data Mover 1.2.x.
14+
* For Data Mover uploads to and downloads from the `BackupStorageLocation`. If the restore action is not complete when the timeout is reached, it is marked as failed. If Data Mover operations fail because of timeouts caused by large storage volumes, you might need to increase this timeout setting.
15+
16+
.Procedure
17+
* Edit the value of `spec.itemOperationTimeout` in the `Restore` CR manifest, as in the following example:
18+
+
19+
[source,yaml]
----
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore_name>
spec:
  itemOperationTimeout: 1h
# ...
----
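
The restore-side setting is independent of the value set on the `Backup` CR, so a restore of large volumes might need its own increase. The following sketch assumes a `2h` value purely for illustration; it is not a value from the source.

[source,yaml]
----
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore_name>
spec:
  itemOperationTimeout: 2h # illustrative value only; the default is 1h
# ...
----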
29+

modules/oadp-restic-timeouts.adoc

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc
4+
5+
:_content-type: PROCEDURE
6+
[id="Restic-timeout_{context}"]
7+
= Restic timeout
8+
9+
`timeout` defines how long Restic pod volume backups and restores are allowed to run before they time out. The default value is `1h`.
10+
11+
Use the Restic `timeout` for the following scenarios:
12+
13+
* For Restic backups with total PV data usage that is greater than 500 GB.
14+
* If backups are timing out with the following error:
15+
+
16+
[source,terminal]
17+
----
18+
level=error msg="Error backing up item" backup=velero/monitoring error="timed out waiting for all PodVolumeBackups to complete"
19+
----
20+
21+
.Procedure
22+
* Edit the value of `spec.configuration.restic.timeout` in the `DataProtectionApplication` CR manifest, as in the following example:
23+
+
24+
[source,yaml]
----
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  configuration:
    restic:
      timeout: 1h
# ...
----
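
If the `timed out waiting for all PodVolumeBackups to complete` error appears with the default setting, a raised timeout might look like the following sketch. The `3h` value is an assumption for illustration only, because the source does not recommend a specific value for this setting.

[source,yaml]
----
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <dpa_name>
spec:
  configuration:
    restic:
      timeout: 3h # illustrative value only; the default is 1h
# ...
----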
