| 1 | +--- |
| 2 | +title: Alternate OneLake configuration for Cloud Ingest Edge Volumes |
| 3 | +description: Learn about an alternate Cloud Ingest Edge Volumes configuration. |
| 4 | +author: sethmanheim |
| 5 | +ms.author: sethm |
| 6 | +ms.topic: how-to |
| 7 | +ms.custom: linux-related-content |
| 8 | +ms.date: 08/26/2024 |
| 9 | +--- |
| 10 | + |
| 11 | +# Alternate: OneLake configuration for Cloud Ingest Edge Volumes |
| 12 | + |
| 13 | +This article describes an alternate configuration for [Cloud Ingest Edge Volumes](cloud-ingest-edge-volume-configuration.md) (blob upload with local purge) for OneLake Lakehouses. |
| 14 | + |
| 15 | +This configuration is an alternative that you can use with key-based authentication methods. Before you proceed, review the recommended configuration, which uses system-assigned managed identities, described in [Cloud Ingest Edge Volumes configuration](cloud-ingest-edge-volume-configuration.md). |
| 16 | + |
| 17 | +## Configure OneLake for Extension Identity |
| 18 | + |
| 19 | +### Add Extension Identity to OneLake workspace |
| 20 | + |
| 21 | +1. Navigate to your OneLake portal; for example, `https://youraccount.powerbi.com`. |
| 22 | +1. Create or navigate to your workspace. |
| 23 | + :::image type="content" source="media/onelake-workspace.png" alt-text="Screenshot showing workspace ribbon in portal." lightbox="media/onelake-workspace.png"::: |
| 24 | +1. Select **Manage Access**. |
| 25 | + :::image type="content" source="media/onelake-manage-access.png" alt-text="Screenshot showing manage access screen in portal." lightbox="media/onelake-manage-access.png"::: |
| 26 | +1. Select **Add people or groups**. |
| 27 | +1. Enter your extension name from your Azure Container Storage enabled by Azure Arc installation. This name must be unique within your tenant. |
| 28 | + :::image type="content" source="media/add-extension-name.png" alt-text="Screenshot showing add extension name screen." lightbox="media/add-extension-name.png"::: |
| 29 | +1. Change the drop-down for permissions from **Viewer** to **Contributor**. |
| 30 | + :::image type="content" source="media/onelake-set-contributor.png" alt-text="Screenshot showing set contributor screen." lightbox="media/onelake-set-contributor.png"::: |
| 31 | +1. Select **Add**. |
| 32 | + |
| 33 | +### Create a Cloud Ingest Persistent Volume Claim (PVC) |
| 34 | + |
| 35 | +1. Create a file named `cloudIngestPVC.yaml` with the following contents. Replace the `metadata::name` value with a name for your Persistent Volume Claim. This name is referenced on the last line of `deploymentExample.yaml` in the next step. You must also set the `metadata::namespace` value to the namespace of your intended consuming pod. If you don't have an intended consuming pod, use `default` as the `metadata::namespace` value. |
| 36 | + |
| 37 | + [!INCLUDE [lowercase-note](includes/lowercase-note.md)] |
| 38 | + |
| 39 | + ```yaml |
| 40 | + kind: PersistentVolumeClaim |
| 41 | + apiVersion: v1 |
| 42 | + metadata: |
| 43 | +   ### Create a name for your PVC ### |
| 44 | +   name: <create-a-pvc-name-here> |
| 45 | +   ### Use a namespace that matches your intended consuming pod, or "default" ### |
| 46 | +   namespace: <intended-consuming-pod-or-default-here> |
| 47 | + spec: |
| 48 | +   accessModes: |
| 49 | +     - ReadWriteMany |
| 50 | +   resources: |
| 51 | +     requests: |
| 52 | +       storage: 2Gi |
| 53 | +   storageClassName: cloud-backed-sc |
| 54 | + ``` |
| 55 | + |
| 56 | +1. To apply `cloudIngestPVC.yaml`, run: |
| 57 | + |
| 58 | + ```bash |
| 59 | + kubectl apply -f "cloudIngestPVC.yaml" |
| 60 | + ``` |
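
If you create several claims, you might prefer to generate the manifest rather than edit it by hand. The following is a minimal sketch, not part of the product: `render_pvc` is a hypothetical helper name, and the manifest fields are copied from the `cloudIngestPVC.yaml` example above. It assumes that an omitted namespace should fall back to `default`, as described in the step above.

```bash
# Hypothetical helper (an illustration, not a product command):
# prints a Cloud Ingest PVC manifest for a claim name and namespace.
render_pvc() {
  local pvc_name="$1"
  local namespace="${2:-default}"   # fall back to "default" when no namespace is given

  cat <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ${pvc_name}
  namespace: ${namespace}
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: cloud-backed-sc
EOF
}
```

For example, `render_pvc my-cloud-ingest-pvc default > cloudIngestPVC.yaml` writes the manifest to a file, which you can then apply with the `kubectl apply` command shown above.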
| 61 | + |
| 62 | +### Attach sub-volume to Edge Volume |
| 63 | + |
| 64 | +You can use the following process to create a sub-volume using Extension Identity to connect to your OneLake Lakehouse. |
| 65 | + |
| 66 | +1. Get the name of your Edge Volume using the following command: |
| 67 | + |
| 68 | + ```bash |
| 69 | + kubectl get edgevolumes |
| 70 | + ``` |
| 71 | + |
| 72 | +1. Create a file named `edgeSubvolume.yaml` and copy/paste the following contents. The following variables must be updated with your information: |
| 73 | + |
| 74 | + [!INCLUDE [lowercase-note](includes/lowercase-note.md)] |
| 75 | + |
| 76 | + - `metadata::name`: Create a name for your sub-volume. |
| 77 | + - `spec::edgevolume`: This name was retrieved from the previous step using `kubectl get edgevolumes`. |
| 78 | + - `spec::path`: Create your own subdirectory name under the mount path. Note that the following example already contains an example name (`exampleSubDir`). If you change this path name, line 33 in `deploymentExample.yaml` must be updated with the new path name. If you choose to rename the path, don't use a preceding slash. |
| 79 | + - `spec::container`: Details of your OneLake Data Lake Lakehouse (for example, `<WORKSPACE>/<DATA_LAKE>/Files`). |
| 80 | + - `spec::storageaccountendpoint`: Your storage account endpoint is the prefix of your Power BI web link. For example, if your OneLake page is `https://contoso-motors.powerbi.com/`, then your endpoint is `https://contoso-motors.dfs.fabric.microsoft.com`. |
| 81 | + |
| 82 | + ```yaml |
| 83 | + apiVersion: "arccontainerstorage.azure.net/v1" |
| 84 | + kind: EdgeSubvolume |
| 85 | + metadata: |
| 86 | +   name: <create-a-subvolume-name-here> |
| 87 | + spec: |
| 88 | +   edgevolume: <your-edge-volume-name-here> |
| 89 | +   path: exampleSubDir # If you change this path, line 33 in deploymentExample.yaml must be updated. Don't use a preceding slash. |
| 90 | +   auth: |
| 91 | +     authType: MANAGED_IDENTITY |
| 92 | +   storageaccountendpoint: "https://<Your AZ Site>.dfs.fabric.microsoft.com/" # Your AZ site is the root of your Power BI OneLake interface URI, such as https://contoso-motors.powerbi.com |
| 93 | +   container: "<WORKSPACE>/<DATA_LAKE>/Files" # Details of your OneLake Data Lake Lakehouse |
| 94 | +   ingestPolicy: edgeingestpolicy-default # Optional: See the following instructions if you want to update the ingestPolicy with your own configuration |
| 95 | + ``` |
| 96 | + |
| 97 | +1. To apply `edgeSubvolume.yaml`, run: |
| 98 | + |
| 99 | + ```bash |
| 100 | + kubectl apply -f "edgeSubvolume.yaml" |
| 101 | + ``` |
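
Two of the values in `edgeSubvolume.yaml` are easy to get wrong: the storage account endpoint and the sub-volume path. The following sketch shows one way to check them before you apply the file. Both function names are hypothetical, and the URL mapping is an assumption based only on the `contoso-motors` example in this article (a `*.powerbi.com` host maps to the same prefix on `dfs.fabric.microsoft.com`).

```bash
# Hypothetical helper: derive the OneLake endpoint from a Power BI URL.
# Assumes the URL has the form https://<prefix>.powerbi.com[/...].
onelake_endpoint() {
  local url="$1"
  local host="${url#https://}"          # drop the scheme
  host="${host%%/*}"                    # drop any trailing path
  local prefix="${host%.powerbi.com}"   # keep only the account prefix

  if [ "$prefix" = "$host" ]; then
    echo "error: expected a *.powerbi.com URL" >&2
    return 1
  fi
  echo "https://${prefix}.dfs.fabric.microsoft.com"
}

# Hypothetical helper: reject sub-volume paths that are empty
# or that start with a slash.
check_subvolume_path() {
  case "$1" in
    ""|/*)
      echo "error: path must be non-empty and must not start with '/'" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

For example, `onelake_endpoint "https://contoso-motors.powerbi.com/"` prints `https://contoso-motors.dfs.fabric.microsoft.com`, and `check_subvolume_path "/exampleSubDir"` fails because of the preceding slash.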
| 102 | + |
| 103 | +#### Optional: Modify the `ingestPolicy` from the default |
| 104 | + |
| 105 | +1. If you want to change the `ingestPolicy` from the default `edgeingestpolicy-default`, create a file named `myedgeingest-policy.yaml` with the following contents. The following variables must be updated with your preferences: |
| 106 | + |
| 107 | + [!INCLUDE [lowercase-note](includes/lowercase-note.md)] |
| 108 | + |
| 109 | + - `metadata::name`: Create a name for your `ingestPolicy`. This name must be updated and referenced in the `spec::ingestPolicy` section of your `edgeSubvolume.yaml`. |
| 110 | + - `spec::ingest::order`: The order in which dirty files are uploaded. This is best effort, not a guarantee (defaults to `oldest-first`). Options for order are: `oldest-first` or `newest-first`. |
| 111 | + - `spec::ingest::minDelaySec`: The minimum number of seconds before a dirty file is eligible for ingest (defaults to 60). This number can range between 0 and 31536000. |
| 112 | + - `spec::eviction::order`: How files are evicted (defaults to `unordered`). Options for eviction order are: `unordered` or `never`. |
| 113 | + - `spec::eviction::minDelaySec`: The number of seconds before a clean file is eligible for eviction (defaults to 300). This number can range between 0 and 31536000. |
| 114 | + |
| 115 | + ```yaml |
| 116 | + apiVersion: arccontainerstorage.azure.net/v1 |
| 117 | + kind: EdgeIngestPolicy |
| 118 | + metadata: |
| 119 | +   name: <create-a-policy-name-here> # This will need to be updated and referenced in the spec::ingestPolicy section of the edgeSubvolume.yaml |
| 120 | + spec: |
| 121 | +   ingest: |
| 122 | +     order: <your-ingest-order> |
| 123 | +     minDelaySec: <your-min-delay-sec> |
| 124 | +   eviction: |
| 125 | +     order: <your-eviction-order> |
| 126 | +     minDelaySec: <your-min-delay-sec> |
| 127 | + ``` |
| 128 | + |
| 129 | +1. To apply `myedgeingest-policy.yaml`, run: |
| 130 | + |
| 131 | + ```bash |
| 132 | + kubectl apply -f "myedgeingest-policy.yaml" |
| 133 | + ``` |
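
Before you apply a custom policy, it can help to check your chosen values against the options and ranges listed above. The following is a minimal sketch of such a check. `validate_ingest_policy` is a hypothetical function name, and the accepted values are taken from the list in this section, not from the product's own validation logic.

```bash
# Hypothetical helper: check EdgeIngestPolicy values against the
# options and ranges documented in this section.
# Arguments: ingest order, ingest minDelaySec, eviction order, eviction minDelaySec
validate_ingest_policy() {
  local ingest_order="$1" ingest_delay="$2"
  local eviction_order="$3" eviction_delay="$4"
  local max_delay=31536000   # upper bound for both minDelaySec values
  local delay

  case "$ingest_order" in
    oldest-first|newest-first) ;;
    *) echo "error: ingest order must be oldest-first or newest-first" >&2; return 1 ;;
  esac

  case "$eviction_order" in
    unordered|never) ;;
    *) echo "error: eviction order must be unordered or never" >&2; return 1 ;;
  esac

  for delay in "$ingest_delay" "$eviction_delay"; do
    # Reject empty strings and anything that contains a non-digit character.
    case "$delay" in
      ""|*[!0-9]*) echo "error: minDelaySec must be a non-negative integer" >&2; return 1 ;;
    esac
    if [ "$delay" -gt "$max_delay" ]; then
      echo "error: minDelaySec must not exceed ${max_delay}" >&2
      return 1
    fi
  done

  return 0
}
```

For example, `validate_ingest_policy oldest-first 60 unordered 300` succeeds with the documented defaults, while `validate_ingest_policy oldest-first 31536001 never 300` fails the range check.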
| 134 | + |
| 135 | +## Attach your app (Kubernetes native application) |
| 136 | + |
| 137 | +1. To configure a generic single pod (Kubernetes native application) against the Persistent Volume Claim (PVC), create a file named `deploymentExample.yaml` with the following contents. Replace the values for `containers::name` and `volumes::persistentVolumeClaim::claimName` with your own. If you updated the path name from `edgeSubvolume.yaml`, `exampleSubDir` on line 33 must be updated with your new path name. |
| 138 | + |
| 139 | + [!INCLUDE [lowercase-note](includes/lowercase-note.md)] |
| 140 | + |
| 141 | + ```yaml |
| 142 | + apiVersion: apps/v1 |
| 143 | + kind: Deployment |
| 144 | + metadata: |
| 145 | +   name: cloudingestedgevol-deployment ### This must be unique for each deployment you choose to create. |
| 146 | + spec: |
| 147 | +   replicas: 2 |
| 148 | +   selector: |
| 149 | +     matchLabels: |
| 150 | +       name: wyvern-testclientdeployment |
| 151 | +   template: |
| 152 | +     metadata: |
| 153 | +       name: wyvern-testclientdeployment |
| 154 | +       labels: |
| 155 | +         name: wyvern-testclientdeployment |
| 156 | +     spec: |
| 157 | +       affinity: |
| 158 | +         podAntiAffinity: |
| 159 | +           requiredDuringSchedulingIgnoredDuringExecution: |
| 160 | +           - labelSelector: |
| 161 | +               matchExpressions: |
| 162 | +               - key: app |
| 163 | +                 operator: In |
| 164 | +                 values: |
| 165 | +                 - wyvern-testclientdeployment |
| 166 | +             topologyKey: kubernetes.io/hostname |
| 167 | +       containers: |
| 168 | +         ### Specify the container in which to launch the busy box. ### |
| 169 | +         - name: <create-a-container-name-here> |
| 170 | +           image: mcr.microsoft.com/azure-cli:2.57.0@sha256:c7c8a97f2dec87539983f9ded34cd40397986dcbed23ddbb5964a18edae9cd09 |
| 171 | +           command: |
| 172 | +             - "/bin/sh" |
| 173 | +             - "-c" |
| 174 | +             - "dd if=/dev/urandom of=/data/exampleSubDir/esaingesttestfile count=16 bs=1M && while true; do ls /data &>/dev/null || break; sleep 1; done" |
| 175 | +           volumeMounts: |
| 176 | +             ### This name must match the following volumes::name attribute ### |
| 177 | +             - name: wyvern-volume |
| 178 | +               ### This mountPath is where the PVC is attached to the pod's filesystem ### |
| 179 | +               mountPath: "/data" |
| 180 | +       volumes: |
| 181 | +         ### User-defined name that's used to link the volumeMounts. This name must match volumeMounts::name as previously specified. ### |
| 182 | +         - name: wyvern-volume |
| 183 | +           persistentVolumeClaim: |
| 184 | +             ### This claimName must refer to your PVC metadata::name |
| 185 | +             claimName: <your-pvc-metadata-name-from-line-5-of-pvc-yaml> |
| 186 | + ``` |
| 187 | + |
| 188 | +1. To apply `deploymentExample.yaml`, run: |
| 189 | + |
| 190 | + ```bash |
| 191 | + kubectl apply -f "deploymentExample.yaml" |
| 192 | + ``` |
| 193 | + |
| 194 | +1. Use `kubectl get pods` to find the name of your pod. Copy this name, as you need it in the next step. |
| 195 | + |
| 196 | + > [!NOTE] |
| 197 | + > Because `spec::replicas` from `deploymentExample.yaml` was specified as `2`, two pods appear using `kubectl get pods`. You can choose either pod name to use for the next step. |
| 198 | + |
| 199 | +1. Run the following command and replace `POD_NAME_HERE` with your copied value from the previous step: |
| 200 | + |
| 201 | + ```bash |
| 202 | + kubectl exec -it POD_NAME_HERE -- sh |
| 203 | + ``` |
| 204 | + |
| 205 | +1. Change directories into the `/data` mount path as specified in `deploymentExample.yaml`. |
| 206 | + |
| 207 | +1. You should see a directory with the name you specified as your `path` in step 2 of the [Attach sub-volume to Edge Volume](#attach-sub-volume-to-edge-volume) section. Now, `cd` into `YOUR_PATH_NAME_HERE`, replacing `YOUR_PATH_NAME_HERE` with your path name. |
| 208 | + |
| 209 | +1. As an example, create a file named `file1.txt` and write to it using `echo "Hello World" > file1.txt`. |
| 210 | + |
| 211 | +1. In the Azure portal, navigate to your storage account and find the container specified from step 2 of [Attach sub-volume to Edge Volume](#attach-sub-volume-to-edge-volume). When you select your container, you should find `file1.txt` populated within the container. If the file hasn't yet appeared, wait approximately 1 minute; Edge Volumes waits a minute before uploading. |
| 212 | + |
| 213 | +## Next steps |
| 214 | + |
| 215 | +After you complete these steps, begin monitoring your deployment using Azure Monitor and Kubernetes Monitoring, or third-party monitoring with Prometheus and Grafana. |
| 216 | + |
| 217 | +[Monitor Your Deployment](monitor-deployment-edge-volumes.md) |