Commit c393a8c

adding support for workload identity in neo4j admin backup (#1131) (#1154)
Cherry-picks #1131
1 parent e511f57 commit c393a8c

1 file changed

+137
-31
lines changed
modules/ROOT/pages/kubernetes/operations/backup-restore.adoc

Lines changed: 137 additions & 31 deletions
@@ -13,29 +13,28 @@ For more information, see xref:kubernetes/accessing-neo4j.adoc[Accessing Neo4j].
1313

1414
You can back up one or more Neo4j databases to a bucket on any cloud provider (AWS, GCP, or Azure) using the _neo4j/neo4j-admin_ Helm chart.
1515
From Neo4j 5.10.0, the _neo4j/neo4j-admin_ Helm chart also supports performing a backup of multiple databases.
16+
From Neo4j 5.13.0, the _neo4j/neo4j-admin_ Helm chart also supports workload identity integration for GCP, AWS, and Azure.
1617

1718
=== Prerequisites
1819

1920
Before you can back up a database and upload it to your bucket, verify that you have the following:
2021

2122
* A cloud provider bucket (AWS, GCP, or Azure) with read and write access to be able to upload the backup.
2223
* Credentials to access the cloud provider bucket, such as a service account JSON key file for GCP, a credentials file for AWS, or storage account credentials for Azure.
24+
* A service account with workload identity if you want to use workload identity integration to access the cloud provider bucket.
25+
** For more information on setting up a service account with workload identity on GCP and AWS, see:
26+
*** link:https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity[Google Kubernetes Engine (GKE) -> Use Workload Identity]
27+
*** link:https://docs.aws.amazon.com/eks/latest/userguide/associate-service-account-role.html[Amazon EKS -> Configuring a Kubernetes service account to assume an IAM role]
28+
** For more information on setting up an Azure storage account with workload identity, see link:https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview?tabs=go[Microsoft Azure -> Use Microsoft Entra Workload ID with Azure Kubernetes Service (AKS)].
2329
* A Kubernetes cluster running on one of the cloud providers with the Neo4j Helm chart installed.
2430
For more information, see xref:kubernetes/quickstart-standalone/index.adoc[Quickstart: Deploy a standalone instance] or xref:kubernetes/quickstart-cluster/index.adoc[Quickstart: Deploy a cluster].
31+
* The latest Neo4j Helm charts.
32+
You can update the repository to get the latest charts using `helm repo update`.
2533

26-
=== Steps
34+
=== Create a Kubernetes secret
2735

28-
To perform a backup of a Neo4j database to any cloud provider (AWS, GCP, and Azure) bucket, follow these steps:
36+
You can create a Kubernetes secret with the credentials that can access the cloud provider bucket using one of the following options:
2937

30-
. Update the repository to get the latest charts:
31-
+
32-
[source, shell, role='noheader']
33-
----
34-
helm repo update
35-
----
36-
37-
. Create a Kubernetes secret with the credentials to access the cloud provider bucket using one of the following options:
38-
+
3938
[.tabbed-example]
4039
=====
4140
[.include-with-gke]
@@ -86,14 +85,19 @@ kubectl create secret generic azurecred --from-file=credentials=/path/to/your/cr
8685
======
8786
=====
8887

89-
. Configure the backup parameters in the _backup-values.yaml_ file using one of the following options:
90-
+
88+
=== Configure the backup parameters
89+
90+
You can configure the backup parameters in the _backup-values.yaml_ file either by using the `secretName` and `secretKeyName` parameters or by mapping the Kubernetes service account
91+
to the workload identity integration.
92+
9193
[NOTE]
9294
====
9395
The following examples show the minimum configuration required to perform a backup to a cloud provider bucket.
9496
For more information about the available backup parameters, see <<kubernetes-neo4j-backup-parameters, Backup parameters>>.
9597
====
96-
+
98+
99+
==== Configure the _backup-values.yaml_ file using the `secretName` and `secretKeyName` parameters
100+
97101
[.tabbed-example]
98102
=====
99103
[.include-with-gke]
@@ -171,36 +175,117 @@ consistencyCheck:
171175
----
172176
======
173177
=====
174-
+
178+
179+
==== Configure the _backup-values.yaml_ file using service account workload identity integration
180+
181+
In certain situations, it may be useful to assign a Kubernetes Service Account with workload identity integration to the Neo4j backup pod.
182+
This is particularly relevant when you want to improve security and have more precise access control for the pod.
183+
Doing so ensures that secure access to resources is granted based on the pod's identity within the cloud ecosystem.
184+
For more information on setting up a service account with workload identity, see https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity[Google Kubernetes Engine (GKE) -> Use Workload Identity], https://docs.aws.amazon.com/eks/latest/userguide/associate-service-account-role.html[Amazon EKS -> Configuring a Kubernetes service account to assume an IAM role], and https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview?tabs=go[Microsoft Azure -> Use Microsoft Entra Workload ID with Azure Kubernetes Service (AKS)].
185+
186+
To configure the Neo4j backup pod to use a Kubernetes service account with workload identity, set `serviceAccountName` to the name of the service account to use.
187+
For Azure deployments, you also need to set the `azureStorageAccountName` parameter to the name of the Azure storage account, where the backup files will be uploaded.
188+
For example:
189+
190+
[.tabbed-example]
191+
=====
192+
[.include-with-gke]
193+
======
194+
[source, yaml, role='noheader']
195+
----
196+
neo4j:
197+
  image: "neo4j/helm-charts-backup"
198+
  imageTag: "5.13.0"
199+
  jobSchedule: "* * * * *"
200+
  successfulJobsHistoryLimit: 3
201+
  failedJobsHistoryLimit: 1
202+
  backoffLimit: 3
203+
204+
backup:
205+
  bucketName: "my-bucket"
206+
  databaseAdminServiceName: "standalone-admin" # This is the Neo4j Admin Service name.
207+
  database: "neo4j,system"
208+
  cloudProvider: "gcp"
209+
  secretName: ""
210+
  secretKeyName: ""
211+
212+
consistencyCheck:
213+
  enabled: true
214+
215+
serviceAccountName: "demo-service-account"
216+
----
217+
======
218+
219+
[.include-with-aws]
220+
======
221+
[source, yaml, role='noheader']
222+
----
223+
neo4j:
224+
  image: "neo4j/helm-charts-backup"
225+
  imageTag: "5.13.0"
226+
  jobSchedule: "* * * * *"
227+
  successfulJobsHistoryLimit: 3
228+
  failedJobsHistoryLimit: 1
229+
  backoffLimit: 3
230+
231+
backup:
232+
  bucketName: "my-bucket"
233+
  databaseAdminServiceName: "standalone-admin"
234+
  database: "neo4j,system"
235+
  cloudProvider: "aws"
236+
  secretName: ""
237+
  secretKeyName: ""
238+
239+
consistencyCheck:
240+
  enabled: true
241+
242+
serviceAccountName: "demo-service-account"
243+
----
244+
======
245+
246+
[.include-with-azure]
247+
======
248+
[source, yaml, role='noheader']
249+
----
250+
neo4j:
251+
  image: "neo4j/helm-charts-backup"
252+
  imageTag: "5.13.0"
253+
  jobSchedule: "* * * * *"
254+
  successfulJobsHistoryLimit: 3
255+
  failedJobsHistoryLimit: 1
256+
  backoffLimit: 3
257+
258+
backup:
259+
  bucketName: "my-bucket"
260+
  databaseAdminServiceName: "standalone-admin"
261+
  database: "neo4j,system"
262+
  cloudProvider: "azure"
263+
  azureStorageAccountName: "storageAccountName"
264+
265+
consistencyCheck:
266+
  enabled: true
267+
268+
serviceAccountName: "demo-service-account"
269+
----
270+
======
271+
=====
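The `serviceAccountName` used in the examples above must refer to a Kubernetes service account that already exists and is bound to a cloud identity.
As a minimal sketch for GKE, assuming a hypothetical Google service account `backup-gsa` in a project `my-project` and the `default` namespace (none of these names come from the chart), the manifest could look like this:

[source, yaml, role='noheader']
----
# Hypothetical ServiceAccount for the backup pod; the name matches the examples above.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: demo-service-account
  namespace: default # assumption: the namespace where the neo4j-admin chart is installed
  annotations:
    # GKE Workload Identity: links this Kubernetes service account to a Google service account.
    iam.gke.io/gcp-service-account: backup-gsa@my-project.iam.gserviceaccount.com
    # On EKS, the corresponding annotation is eks.amazonaws.com/role-arn with an IAM role ARN.
    # On AKS, it is azure.workload.identity/client-id with the client ID of the managed identity.
----

The cloud-side binding (IAM policy binding, role trust policy, or federated credential) still has to be configured as described in the provider documentation linked above.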
175272
The _/backups_ mount created by default is an _emptyDir_ type volume.
176273
This means that the data stored in this volume is not persistent and will be lost when the pod is deleted.
177274
To use a persistent volume for backups, add the following section to the _backup-values.yaml_ file:
178-
+
275+
179276
[source, yaml, role='noheader']
180277
----
181278
tempVolume:
182279
  persistentVolumeClaim:
183280
    claimName: backup-pvc
184281
----
185-
+
282+
186283
[NOTE]
187284
====
188285
You need to create the persistent volume and persistent volume claim before installing the _neo4j-admin_ Helm chart.
189286
For more information, see xref:kubernetes/persistent-volumes.adoc[Volume mounts and persistent volumes].
190287
====
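
As a minimal sketch, a persistent volume claim matching the `claimName` above could be declared as follows.
The access mode, the requested size, and the reliance on dynamic provisioning through the cluster's default storage class are assumptions; adjust them to your environment.

[source, yaml, role='noheader']
----
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc # must match tempVolume.persistentVolumeClaim.claimName
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi # assumption: size it to hold the backup files of all selected databases
----

Create the claim in the same namespace in which you install the _neo4j-admin_ Helm chart, because a pod can only mount claims from its own namespace.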
191288

192-
. Install _neo4j-admin_ Helm chart using the _backup-values.yaml_ file:
193-
+
194-
[source, shell, role='noheader']
195-
----
196-
helm install backup-name neo4j-admin -f /path/to/your/backup-values.yaml
197-
----
198-
+
199-
The _neo4j/neo4j-admin_ Helm chart installs a cronjob that launches a pod based on the job schedule. This pod performs a backup of one or multiple databases, a consistency check of the backup file(s), and uploads them to the cloud provider bucket.
200-
201-
. Monitor the backup pod logs using `kubectl logs pod/<neo4j-backup-pod-name>` to check the progress of the backup.
202-
. Check that the backup files and the consistency check reports have been uploaded to the cloud provider bucket.
203-
204289
[[kubernetes-neo4j-backup-parameters]]
205290
=== Backup parameters
206291

@@ -228,7 +313,7 @@ disableLookups: false
228313
229314
neo4j:
230315
  image: "neo4j/helm-charts-backup"
231-
  imageTag: "5.11.0"
316+
  imageTag: "5.13.0"
232317
  podLabels: {}
233318
# app: "demo"
234319
# acac: "dcdddc"
@@ -303,7 +388,9 @@ backup:
303388
  secretName: ""
304389
  # provide the keyname used in the above secret
305390
  secretKeyName: ""
306-
391+
  # provide the azure storage account name
392+
  # this must be provided when you are using workload identity integration for azure
393+
  azureStorageAccountName: ""
307394
  #setting this to true will not delete the backup files generated at the /backup mount
308395
  keepBackupFiles: true
309396
@@ -334,6 +421,10 @@ consistencyCheck:
334421
verbose: true
335422
336423
# Set to name of an existing Service Account to use if desired
424+
# See the following links for setting up a service account with workload identity
425+
# Azure - https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview?tabs=go
426+
# GCP - https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
427+
# AWS - https://docs.aws.amazon.com/eks/latest/userguide/associate-service-account-role.html
337428
serviceAccountName: ""
338429
339430
# Volume to use as temporary storage for files before they are uploaded to cloud. For large databases local storage may not have sufficient space.
@@ -399,6 +490,21 @@ tolerations: []
399490
# effect: "NoSchedule"
400491
----
401492

493+
=== Install the _neo4j-admin_ Helm chart
494+
495+
. Install the _neo4j-admin_ Helm chart using the _backup-values.yaml_ file:
496+
+
497+
[source, shell, role='noheader']
498+
----
499+
helm install backup-name neo4j-admin -f /path/to/your/backup-values.yaml
500+
----
501+
+
502+
The _neo4j/neo4j-admin_ Helm chart installs a cronjob that launches a pod based on the job schedule.
503+
This pod performs a backup of one or multiple databases, a consistency check of the backup file(s), and uploads them to the cloud provider bucket.
504+
505+
. Monitor the backup pod logs using `kubectl logs pod/<neo4j-backup-pod-name>` to check the progress of the backup.
506+
. Check that the backup files and the consistency check reports have been uploaded to the cloud provider bucket.
507+
402508
[[kubernetes-neo4j-restore]]
403509
== Restore a single database
404510
