| 1 | +# Permissions migration logic and data structures |
| 2 | + |
| 3 | +At a high level, the permissions inventory process is split into two steps: |
| 4 | + |
| 5 | +1. Collect all existing permissions into persistent storage. |
| 6 | +2. Apply the collected permissions to the target resources. |
| 7 | + |
| 8 | +The first step is performed by the `Crawler` and the second by the `Applier`. |
| 9 | + |
| 10 | +The crawler and the applier are tightly coupled through their SerDe (serialization/deserialization) logic: the applier must be able to deserialize exactly what the crawler serialized. |
| 11 | + |
| 12 | +We implement a separate crawler and applier for each supported resource type. |
| 13 | + |
| 14 | +Note that table ACL logic is currently handled separately from the logic described in this document. |
| 15 | + |
| 16 | +## Logical objects and relevant APIs |
| 17 | + |
| 18 | + |
| 19 | +### Group level properties (uses SCIM API) |
| 20 | + |
| 21 | +- [x] Entitlements (one of `workspace-access`, `databricks-sql-access`, `allow-cluster-create`, `allow-instance-pool-create`) |
| 22 | +- [x] Roles (AWS only) |
| 23 | + |
| 24 | +These are workspace-level properties that are not associated with any specific resource. |
| 25 | + |
| 26 | +Additional info: |
| 27 | + |
| 28 | +- object ID: `group_id` |
| 29 | +- listing method: `ws.groups.list` |
| 30 | +- get method: `ws.groups.get(group_id)` |
| 31 | +- put method: `ws.groups.patch(group_id)` |
| 32 | + |
| 33 | +### Compute infrastructure (uses Permissions API) |
| 34 | + |
| 35 | +- [x] Clusters |
| 36 | +- [x] Cluster policies |
| 37 | +- [x] Instance pools |
| 38 | +- [x] SQL warehouses |
| 39 | + |
| 40 | +These are compute infrastructure resources that are associated with a specific workspace. |
| 41 | + |
| 42 | +Additional info: |
| 43 | + |
| 44 | +- object ID: `cluster_id`, `policy_id`, `instance_pool_id`, `id` (SQL warehouses) |
| 45 | +- listing method: `ws.clusters.list`, `ws.cluster_policies.list`, `ws.instance_pools.list`, `ws.warehouses.list` |
| 46 | +- get method: `ws.permissions.get(object_id, object_type)` |
| 47 | +- put method: `ws.permissions.update(object_id, object_type)` |
| 48 | +- get response object type: `databricks.sdk.service.iam.ObjectPermissions` |
| 49 | + |
| 50 | + |
| 51 | +### Workflows (uses Permissions API) |
| 52 | + |
| 53 | +- [x] Jobs |
| 54 | +- [x] Delta Live Tables |
| 55 | + |
| 56 | +These are workflow resources that are associated with a specific workspace. |
| 57 | + |
| 58 | +Additional info: |
| 59 | + |
| 60 | +- object ID: `job_id`, `pipeline_id` |
| 61 | +- listing method: `ws.jobs.list`, `ws.pipelines.list` |
| 62 | +- get method: `ws.permissions.get(object_id, object_type)` |
| 63 | +- put method: `ws.permissions.update(object_id, object_type)` |
| 64 | +- get response object type: `databricks.sdk.service.iam.ObjectPermissions` |
| 65 | + |
| 66 | +### ML (uses Permissions API) |
| 67 | + |
| 68 | +- [x] MLflow experiments |
| 69 | +- [x] MLflow models |
| 70 | + |
| 71 | +These are ML resources that are associated with a specific workspace. |
| 72 | + |
| 73 | +Additional info: |
| 74 | + |
| 75 | +- object ID: `experiment_id`, `id` (models) |
| 76 | +- listing method: custom listing |
| 77 | +- get method: `ws.permissions.get(object_id, object_type)` |
| 78 | +- put method: `ws.permissions.update(object_id, object_type)` |
| 79 | +- get response object type: `databricks.sdk.service.iam.ObjectPermissions` |
| 80 | + |
| 81 | + |
| 82 | +### SQL (uses SQL Permissions API) |
| 83 | + |
| 84 | +- [x] Alerts |
| 85 | +- [x] Dashboards |
| 86 | +- [x] Queries |
| 87 | + |
| 88 | +These are SQL resources that are associated with a specific workspace. |
| 89 | + |
| 90 | +Additional info: |
| 91 | + |
| 92 | +- object ID: `id` |
| 93 | +- listing method: `ws.alerts.list`, `ws.dashboards.list`, `ws.queries.list` |
| 94 | +- get method: `ws.dbsql_permissions.get` |
| 95 | +- put method: `ws.dbsql_permissions.set` |
| 96 | +- get response object type: `databricks.sdk.service.sql.GetResponse` |
| 97 | +- Note: this API does not support an UPDATE operation; only PUT (overwrite) is supported. |
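Because only overwrite is supported, a caller that wants update-like behaviour would presumably have to read the current ACL, merge the changes, and write the full list back. The sketch below illustrates that merge step with plain Python objects. `AclEntry` and `merge_acl` are hypothetical names used for illustration; they are not part of the SDK or of this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    """Hypothetical, simplified ACL entry: one group and one permission level."""
    group_name: str
    permission_level: str


def merge_acl(existing: list[AclEntry], changes: list[AclEntry]) -> list[AclEntry]:
    """Build the complete ACL to send in an overwrite (PUT) request.

    Entries in `changes` replace existing entries for the same group.
    All other existing entries are kept, so the overwrite does not drop them.
    """
    merged = {entry.group_name: entry for entry in existing}
    for entry in changes:
        merged[entry.group_name] = entry
    return list(merged.values())


current = [AclEntry("analysts", "CAN_VIEW"), AclEntry("admins", "CAN_MANAGE")]
wanted = [AclEntry("analysts", "CAN_RUN"), AclEntry("engineers", "CAN_VIEW")]
full_acl = merge_acl(current, wanted)
```

The merged list, not just the changed entries, is what would be passed to the overwrite call.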
| 98 | + |
| 99 | +### Security (uses Permissions API) |
| 100 | + |
| 101 | +- [x] Tokens |
| 102 | +- [x] Passwords |
| 103 | + |
| 104 | +These are security resources that are associated with a specific workspace. |
| 105 | + |
| 106 | +Additional info: |
| 107 | + |
| 108 | +- object ID: `tokens` (static value), `passwords` (static value) |
| 109 | +- listing method: N/A |
| 110 | +- get method: `ws.permissions.get(object_id, object_type)` |
| 111 | +- put method: `ws.permissions.update(object_id, object_type)` |
| 112 | +- get response object type: `databricks.sdk.service.iam.ObjectPermissions` |
| 113 | + |
| 114 | +### Workspace (uses Permissions API) |
| 115 | + |
| 116 | +- [x] Notebooks |
| 117 | +- [x] Directories |
| 118 | +- [x] Repos |
| 119 | +- [x] Files |
| 120 | + |
| 121 | +These are workspace resources that are associated with a specific workspace. |
| 122 | + |
| 123 | +Additional info: |
| 124 | + |
| 125 | +- object ID: `object_id` |
| 126 | +- listing method: custom listing |
| 127 | +- get method: `ws.permissions.get(object_id, object_type)` |
| 128 | +- put method: `ws.permissions.update(object_id, object_type)` |
| 129 | +- get response object type: `databricks.sdk.service.iam.ObjectPermissions` |
| 130 | + |
| 131 | +### Secrets (uses Secrets API) |
| 132 | + |
| 133 | +- [x] Secrets |
| 134 | + |
| 135 | +These are secrets resources that are associated with a specific workspace. |
| 136 | + |
| 137 | +Additional info: |
| 138 | + |
| 139 | +- object ID: `scope_name` |
| 140 | +- listing method: `ws.secrets.list_scopes()` |
| 141 | +- get method: `ws.secrets.list_acls(scope_name)` |
| 142 | +- put method: `ws.secrets.put_acl` |
| 143 | + |
| 144 | + |
| 145 | +## Crawler and serialization logic |
| 146 | + |
| 147 | +Each crawler returns a list of callables that are executed later to fetch the permissions. |
| 148 | +Each callable returns a `PermissionInventoryItem` that must be serializable into a Delta table. |
| 149 | +Because the permission payload differs between crawlers, each crawler implements its own serialization |
| 150 | +method. |
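A minimal sketch of this contract, using only the standard library. Only the name `PermissionInventoryItem` comes from the text above; its fields, the `ClusterCrawler` class, its method names, and the JSON payload format are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from functools import partial
from typing import Callable


@dataclass
class PermissionInventoryItem:
    # Field names are assumptions; only the class name comes from the document.
    object_id: str
    crawler: str  # name of the crawler that produced the item
    raw: str      # serialized permission payload


class ClusterCrawler:
    """Hypothetical crawler for a single resource type."""

    name = "clusters"

    def __init__(self, list_ids: Callable[[], list[str]], get_acl: Callable[[str], dict]):
        self._list_ids = list_ids
        self._get_acl = get_acl

    def _serialize(self, acl: dict) -> str:
        # Each crawler owns its payload format; JSON is used here for simplicity.
        return json.dumps(acl, sort_keys=True)

    def _crawl_one(self, object_id: str) -> PermissionInventoryItem:
        payload = self._serialize(self._get_acl(object_id))
        return PermissionInventoryItem(object_id, self.name, payload)

    def get_crawler_tasks(self) -> list[Callable[[], PermissionInventoryItem]]:
        # Nothing is fetched here: each task is a deferred call.
        return [partial(self._crawl_one, object_id) for object_id in self._list_ids()]


# In-memory stand-ins for the real listing and get methods.
fake_acls = {"c1": {"admins": "CAN_MANAGE"}, "c2": {"analysts": "CAN_RESTART"}}
crawler = ClusterCrawler(lambda: list(fake_acls), fake_acls.__getitem__)
tasks = crawler.get_crawler_tasks()
items = [task() for task in tasks]
```

Returning deferred callables rather than results lets the caller decide how and when to execute them, for example in a thread pool.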
| 151 | + |
| 152 | +## Applier and deserialization logic |
| 153 | + |
| 154 | +Each applier accepts a list of `PermissionInventoryItem` objects and generates a list of callables that apply the |
| 155 | +given permissions. |
| 156 | +Each applier implements a deserialization method that converts the raw payload into a typed one. |
| 157 | +Each permission item records the crawler type that produced it, so that the applier can select the correct |
| 158 | +deserialization method. |
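The applier side could look roughly like the following. This is a sketch under assumptions: the fields of `PermissionInventoryItem`, the `AccessControl` type, the `ClusterApplier` class, and the JSON payload are all hypothetical and chosen only to show deserialization keyed by crawler type.

```python
import json
from dataclasses import dataclass
from functools import partial
from typing import Callable


@dataclass
class PermissionInventoryItem:
    # Field names are assumptions; only the class name comes from the document.
    object_id: str
    crawler: str
    raw: str


@dataclass
class AccessControl:
    """Hypothetical typed payload: one group and its permission level."""
    group_name: str
    permission_level: str


class ClusterApplier:
    """Hypothetical applier paired with a crawler named "clusters"."""

    name = "clusters"

    def __init__(self, update: Callable[[str, list[AccessControl]], None]):
        self._update = update

    def _deserialize(self, raw: str) -> list[AccessControl]:
        # Inverse of the crawler's serialization: raw JSON to typed entries.
        return [AccessControl(group, level) for group, level in json.loads(raw).items()]

    def get_apply_tasks(self, items: list[PermissionInventoryItem]) -> list[Callable[[], None]]:
        tasks = []
        for item in items:
            if item.crawler != self.name:
                raise ValueError(f"item {item.object_id!r} was produced by {item.crawler!r}")
            tasks.append(partial(self._update, item.object_id, self._deserialize(item.raw)))
        return tasks


applied: dict[str, list[AccessControl]] = {}


def record_update(object_id: str, acl: list[AccessControl]) -> None:
    # Stand-in for the real put method; it only records what would be applied.
    applied[object_id] = acl


applier = ClusterApplier(record_update)
item = PermissionInventoryItem("c1", "clusters", '{"admins": "CAN_MANAGE"}')
for task in applier.get_apply_tasks([item]):
    task()
```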
| 159 | + |
| 160 | +## Relevance identification |
| 161 | + |
| 162 | +Since we save all objects into the permission table, we need to filter out the objects that are not relevant to the |
| 163 | +current migration. |
| 164 | +We do this inside the `applier`, by returning a `noop` callable if the object is not relevant to the current migration. |
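The `noop` approach can be sketched as follows. The relevance rule used here (an item is relevant when at least one of its principals is a group being migrated) is an assumption for illustration, as are the function names.

```python
from functools import partial
from typing import Callable


def noop() -> None:
    """Task that does nothing; returned for items outside the current migration."""
    return None


def make_apply_task(
    acl: dict[str, str],
    migrated_groups: set[str],
    apply: Callable[[dict[str, str]], None],
) -> Callable[[], None]:
    # Assumed relevance rule: at least one principal in the ACL is being migrated.
    if not migrated_groups.intersection(acl):
        return noop
    return partial(apply, acl)


applied: list[dict[str, str]] = []
relevant = make_apply_task({"analysts": "CAN_RUN"}, {"analysts"}, applied.append)
irrelevant = make_apply_task({"other-team": "CAN_RUN"}, {"analysts"}, applied.append)
relevant()
irrelevant()
```

Returning a callable in both cases keeps the executor simple: it runs every task without needing to know which ones were filtered out.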
| 165 | + |
| 166 | +## Crawling the permissions |
| 167 | + |
| 168 | +To crawl the permissions, we use the following logic: |
| 169 | +1. Go through the list of all crawlers. |
| 170 | +2. Get the list of all objects of the given type. |
| 171 | +3. For each object, generate a callable that will return a `PermissionInventoryItem`. |
| 172 | +4. Execute the callables in parallel. |
| 173 | +5. Collect the results into a list of `PermissionInventoryItem`. |
| 174 | +6. Save the list of `PermissionInventoryItem` into a Delta Table. |
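Steps 1 to 5 above can be sketched with a thread pool from the standard library. The `crawl_all` function, the shape of `listings`, and the fields of `PermissionInventoryItem` are assumptions. Step 6 (writing to a Delta table) is left out because it depends on the storage layer.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import partial
from typing import Callable


@dataclass
class PermissionInventoryItem:
    # Field names are assumptions; only the class name comes from the document.
    object_id: str
    crawler: str
    raw: str


def crawl_all(
    listings: dict[str, Callable[[], list[str]]],
    fetch: Callable[[str, str], PermissionInventoryItem],
    max_workers: int = 4,
) -> list[PermissionInventoryItem]:
    # Steps 1-3: for every crawler and every listed object, build a deferred task.
    tasks = [
        partial(fetch, crawler_name, object_id)
        for crawler_name, list_ids in listings.items()
        for object_id in list_ids()
    ]
    # Steps 4-5: run the tasks in parallel and collect results in submission order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def fake_fetch(crawler_name: str, object_id: str) -> PermissionInventoryItem:
    # Stand-in for a real permissions lookup.
    return PermissionInventoryItem(object_id, crawler_name, "{}")


listings = {"clusters": lambda: ["c1", "c2"], "jobs": lambda: ["j1"]}
items = crawl_all(listings, fake_fetch)
```

Threads are a reasonable fit here because the work is dominated by waiting on API calls rather than by CPU.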
| 175 | + |
| 176 | +## Applying the permissions |
| 177 | + |
| 178 | +To apply the permissions, we use the following logic: |
| 179 | + |
| 180 | +1. Read the Delta Table with raw permissions. |
| 181 | +2. Map the items to the relevant `support` object. If no relevant `support` object is found, an exception is raised. |
| 182 | +3. Deserialize the items using the relevant applier. |
| 183 | +4. Generate a list of callables that will apply the permissions. |
| 184 | +5. Execute the callables in parallel. |
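A sketch of steps 2 to 5, assuming the items have already been read from the Delta table. The `supports` mapping (crawler type to a function that deserializes an item and returns an apply callable), the `apply_all` function, and the item fields are hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import partial
from typing import Callable


@dataclass
class PermissionInventoryItem:
    # Field names are assumptions; only the class name comes from the document.
    object_id: str
    crawler: str
    raw: str


Support = Callable[[PermissionInventoryItem], Callable[[], None]]


def apply_all(items: list[PermissionInventoryItem], supports: dict[str, Support], max_workers: int = 4) -> int:
    tasks = []
    for item in items:
        # Step 2: map the item to its support object, failing loudly if none exists.
        support = supports.get(item.crawler)
        if support is None:
            raise ValueError(f"no support object for crawler type {item.crawler!r}")
        # Steps 3-4: the support object deserializes the item and returns a task.
        tasks.append(support(item))
    # Step 5: execute in parallel; result() re-raises any error from a worker.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for future in [pool.submit(task) for task in tasks]:
            future.result()
    return len(tasks)


applied: dict[str, dict] = {}


def cluster_support(item: PermissionInventoryItem) -> Callable[[], None]:
    # Stand-in applier: records the deserialized ACL instead of calling an API.
    return partial(applied.__setitem__, item.object_id, json.loads(item.raw))


supports = {"clusters": cluster_support}
count = apply_all([PermissionInventoryItem("c1", "clusters", '{"admins": "CAN_MANAGE"}')], supports)
```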