Commit 4c43d99

Merge pull request #67917 from RichardHoch/silent_op
2 parents 9f8bb4e + fd5e0a6 commit 4c43d99

4 files changed: +96 -2 lines changed

backup_and_restore/application_backup_and_restore/troubleshooting.adoc

Lines changed: 1 addition & 0 deletions
@@ -79,6 +79,7 @@ include::modules/migration-debugging-velero-admission-webhooks-ibm-appconnect.ad
 * xref:../../architecture/admission-plug-ins.adoc#admission-webhook-types_admission-plug-ins[Types of webhook admission plugins]
 
 include::modules/oadp-installation-issues.adoc[leveloffset=+1]
+include::modules/oadp-operator-issues.adoc[leveloffset=+1]
 include::modules/oadp-timeouts.adoc[leveloffset=+1]
 include::modules/oadp-restic-timeouts.adoc[leveloffset=+2]
 include::modules/oadp-velero-timeouts.adoc[leveloffset=+2]

modules/oadp-operator-issues.adoc

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+// Module included in the following assemblies:
+//
+// * backup_and_restore/application_backup_and_restore/troubleshooting.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="oadp-operator-issues_{context}"]
+= OADP Operator issues
+
+The {oadp-first} Operator might encounter issues caused by problems it is not able to resolve.
+
+[id="oadp-operator-fails-silently_{context}"]
+== OADP Operator fails silently
+
+The S3 buckets of an OADP Operator might be empty, but when you run the command `oc get po -n <OADP_Operator_namespace>`, you see that the Operator has a status of `Running`. In such a case, the Operator is said to have _failed silently_ because it incorrectly reports that it is running.
+
+.Cause
+
+The problem is caused when cloud credentials provide insufficient permissions.
+
+.Solution
+
+Retrieve a list of backup storage locations (BSLs) and check the manifest of each BSL for credential issues.
+
+.Procedure
+
+. Run one of the following commands to retrieve a list of BSLs:
+
+.. Using the OpenShift CLI:
++
+[source,terminal]
+----
+$ oc get backupstoragelocation -A
+----
+
+.. Using the Velero CLI:
++
+[source,terminal]
+----
+$ velero backup-location get -n <OADP_Operator_namespace>
+----
+
+. Using the list of BSLs, run the following command to display the manifest of each BSL, and examine each manifest for an error.
++
+[source,terminal]
+----
+$ oc get backupstoragelocation -n <namespace> -o yaml
+----
+
+.Example result
+
+[source, yaml]
+----
+apiVersion: v1
+items:
+- apiVersion: velero.io/v1
+  kind: BackupStorageLocation
+  metadata:
+    creationTimestamp: "2023-11-03T19:49:04Z"
+    generation: 9703
+    name: example-dpa-1
+    namespace: openshift-adp-operator
+    ownerReferences:
+    - apiVersion: oadp.openshift.io/v1alpha1
+      blockOwnerDeletion: true
+      controller: true
+      kind: DataProtectionApplication
+      name: example-dpa
+      uid: 0beeeaff-0287-4f32-bcb1-2e3c921b6e82
+    resourceVersion: "24273698"
+    uid: ba37cd15-cf17-4f7d-bf03-8af8655cea83
+  spec:
+    config:
+      enableSharedConfig: "true"
+      region: us-west-2
+    credential:
+      key: credentials
+      name: cloud-credentials
+    default: true
+    objectStorage:
+      bucket: example-oadp-operator
+      prefix: example
+    provider: aws
+  status:
+    lastValidationTime: "2023-11-10T22:06:46Z"
+    message: "BackupStorageLocation \"example-dpa-1\" is unavailable: rpc
+      error: code = Unknown desc = WebIdentityErr: failed to retrieve credentials\ncaused
+      by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus
+      code: 403, request id: d3f2e099-70a0-467b-997e-ff62345e3b54"
+    phase: Unavailable
+kind: List
+metadata:
+  resourceVersion: ""
+----
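
Note on the added module: step 2 of its procedure asks the reader to examine each BSL manifest by eye. As a rough sketch only, and not part of this commit, the same check could be scripted against the JSON form of the list. The function name `unavailable_bsls` and the `SAMPLE` payload are hypothetical; the sample is a trimmed stand-in modeled on the example result in the module, and the sketch assumes that a healthy BSL reports `status.phase: Available`.

```python
import json


def unavailable_bsls(bsl_list_json):
    """Return one summary dict per BSL whose status.phase is not "Available"."""
    problems = []
    for item in json.loads(bsl_list_json).get("items", []):
        meta = item.get("metadata", {})
        status = item.get("status", {})
        phase = status.get("phase", "")
        if phase != "Available":
            problems.append(
                {
                    "namespace": meta.get("namespace", ""),
                    "name": meta.get("name", ""),
                    # A BSL with no phase has not been validated yet.
                    "phase": phase or "Unknown",
                    "message": status.get("message", ""),
                }
            )
    return problems


# Hypothetical sample: a trimmed JSON equivalent of the example result above.
SAMPLE = json.dumps(
    {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            {
                "apiVersion": "velero.io/v1",
                "kind": "BackupStorageLocation",
                "metadata": {
                    "name": "example-dpa-1",
                    "namespace": "openshift-adp-operator",
                },
                "status": {
                    "message": (
                        'BackupStorageLocation "example-dpa-1" is unavailable: '
                        "rpc error: code = Unknown desc = WebIdentityErr: "
                        "failed to retrieve credentials\ncaused by: "
                        "AccessDenied: Not authorized to perform "
                        "sts:AssumeRoleWithWebIdentity\n\tstatus code: 403"
                    ),
                    "phase": "Unavailable",
                },
            }
        ],
    }
)

for bsl in unavailable_bsls(SAMPLE):
    print(f'{bsl["namespace"]}/{bsl["name"]}: {bsl["phase"]}')
    print(bsl["message"])
```

Against a live cluster, the input would presumably be the output of `oc get backupstoragelocation -A -o json` rather than the sample string. An `AccessDenied` message like the one above points at the cloud credentials, which matches the cause the module describes.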

modules/oadp-velero-default-timeouts.adoc

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 
 :_mod-docs-content-type: PROCEDURE
 [id="velero-default-item-operation-timeout_{context}"]
-= Velereo default item operation timeout
+= Velero default item operation timeout
 
 `defaultItemOperationTimeout` defines how long to wait on asynchronous `BackupItemActions` and `RestoreItemActions` to complete before timing out. The default value is `1h`.
 
modules/oadp-velero-timeouts.adoc

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 
 :_mod-docs-content-type: PROCEDURE
 [id="velero-timeout_{context}"]
-= Velereo resource timeout
+= Velero resource timeout
 
 `resourceTimeout` defines how long to wait for several Velero resources before timeout occurs, such as Velero custom resource definition (CRD) availability, `volumeSnapshot` deletion, and repository availability. The default is `10m`.
 