
Commit 9f1521c

Merge pull request #27 from databrickslabs/feature/dlt-meta-uc-cli
- Fixed integration tests for non-uc flows
2 parents 33a8e16 + 31bc10b commit 9f1521c

File tree: 15 files changed (+276 / -140 lines)

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -154,4 +154,6 @@ deployment-merged.yaml
 integration-tests/conf/dlt-meta/onboarding.json
 
 .databricks
-.databricks-login.json
+.databricks-login.json
+demo/conf/onboarding.json
+integration_tests/conf/onboarding.json

README.md

Lines changed: 40 additions & 35 deletions
@@ -71,42 +71,47 @@ Refer to the [Getting Started](https://databrickslabs.github.io/dlt-meta/getting
 ### Databricks Labs DLT-META CLI lets you run onboard and deploy in interactive python terminal
 - ``` git clone dlt-meta ```
 - ``` cd dlt-meta ```
+- ``` python -m venv .venv ```
+- ```source .venv/bin/activate ```
+- ``` pip install databricks ```
+- ``` pip install databricks-sdk ```
 - ```databricks labs dlt-meta onboard```
-- - Above command will prompt you to provide onboarding details. If you have cloned dlt-meta git repo then accept defaults which will launch config from demo folder
-```Provide onboarding file path (default: demo/conf/onboarding.template):
-Provide onboarding files local directory (default: demo/):
-Provide dbfs path (default: dbfs:/dlt-meta_cli_demo):
-Provide databricks runtime version (default: 14.2.x-scala2.12):
-Run onboarding with unity catalog enabled?
-[0] False
-[1] True
-Enter a number between 0 and 1: 1
-Provide unity catalog name: ravi_dlt_meta_uc
-Provide dlt meta schema name (default: dlt_meta_dataflowspecs_203b9da04bdc49f78cdc6c379d1c9ead):
-Provide dlt meta bronze layer schema name (default: dltmeta_bronze_cf5956873137432294892fbb2dc34fdb):
-Provide dlt meta silver layer schema name (default: dltmeta_silver_5afa2184543342f98f87b30d92b8c76f):
-Provide dlt meta layer
-[0] bronze
-[1] bronze_silver
-[2] silver
-Enter a number between 0 and 2: 1
-Provide bronze dataflow spec table name (default: bronze_dataflowspec):
-Provide silver dataflow spec table name (default: silver_dataflowspec):
-Overwrite dataflow spec?
-[0] False
-[1] True
-Enter a number between 0 and 1: 1
-Provide dataflow spec version (default: v1):
-Provide environment name (default: prod): prod
-Provide import author name (default: ravi.gawai):
-Provide cloud provider name
-[0] aws
-[1] azure
-[2] gcp
-Enter a number between 0 and 2: 0
-Do you want to update ws paths, catalog, schema details to your onboarding file?
-[0] False
-[1] True
+- - Above command will prompt you to provide onboarding details. If you have cloned dlt-meta git repo then accept defaults which will launch config from demo folder.
+
+``` Provide onboarding file path (default: demo/conf/onboarding.template):
+Provide onboarding files local directory (default: demo/):
+Provide dbfs path (default: dbfs:/dlt-meta_cli_demo):
+Provide databricks runtime version (default: 14.2.x-scala2.12):
+Run onboarding with unity catalog enabled?
+[0] False
+[1] True
+Enter a number between 0 and 1: 1
+Provide unity catalog name: ravi_dlt_meta_uc
+Provide dlt meta schema name (default: dlt_meta_dataflowspecs_203b9da04bdc49f78cdc6c379d1c9ead):
+Provide dlt meta bronze layer schema name (default: dltmeta_bronze_cf5956873137432294892fbb2dc34fdb):
+Provide dlt meta silver layer schema name (default: dltmeta_silver_5afa2184543342f98f87b30d92b8c76f):
+Provide dlt meta layer
+[0] bronze
+[1] bronze_silver
+[2] silver
+Enter a number between 0 and 2: 1
+Provide bronze dataflow spec table name (default: bronze_dataflowspec):
+Provide silver dataflow spec table name (default: silver_dataflowspec):
+Overwrite dataflow spec?
+[0] False
+[1] True
+Enter a number between 0 and 1: 1
+Provide dataflow spec version (default: v1):
+Provide environment name (default: prod): prod
+Provide import author name (default: ravi.gawai):
+Provide cloud provider name
+[0] aws
+[1] azure
+[2] gcp
+Enter a number between 0 and 2: 0
+Do you want to update ws paths, catalog, schema details to your onboarding file?
+[0] False
+[1] True
 ```
 - Goto your databricks workspace and located onboarding job under: Workflow->Jobs runs
 - Once onboarding jobs is finished deploy `bronze` and `silver` DLT using below command
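Note on the README hunk above: the interactive prompts boil down to a fixed set of onboarding parameters. The sketch below is illustrative only (the dict and key names are assumptions, not the CLI's internal data structure); the values mirror the defaults and answers shown in the added lines.

```python
# Illustrative summary of what `databricks labs dlt-meta onboard` asks for.
# Key names are assumed for readability; values mirror the README prompts above.
onboarding_answers = {
    "onboarding_file": "demo/conf/onboarding.template",
    "onboarding_files_local_dir": "demo/",
    "dbfs_path": "dbfs:/dlt-meta_cli_demo",
    "databricks_runtime": "14.2.x-scala2.12",
    "unity_catalog_enabled": True,
    "uc_catalog_name": "ravi_dlt_meta_uc",
    "dlt_meta_schema": "dlt_meta_dataflowspecs_<generated-suffix>",
    "bronze_schema": "dltmeta_bronze_<generated-suffix>",
    "silver_schema": "dltmeta_silver_<generated-suffix>",
    "layer": "bronze_silver",
    "bronze_dataflowspec_table": "bronze_dataflowspec",
    "silver_dataflowspec_table": "silver_dataflowspec",
    "overwrite_dataflow_spec": True,
    "dataflow_spec_version": "v1",
    "environment": "prod",
    "import_author": "ravi.gawai",
    "cloud_provider": "aws",
    "update_onboarding_file_paths": True,
}
```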

demo/README.md

Lines changed: 7 additions & 7 deletions
@@ -27,12 +27,12 @@ This Demo launches Bronze and Silver DLT pipleines with following activities:
 - cloud_provider_name : aws or azure or gcp
 - db_version : Databricks Runtime Version
 - dbfs_path : Path on your Databricks workspace where demo will be copied for launching DLT-META Pipelines
-- you can provide --profile=databricks_profile name in case you already have databricks cli otherwise command prompt will ask host and token
+- you can provide `--profile=databricks_profile name` in case you already have databricks cli otherwise command prompt will ask host and token.
 
-6a. Databricks Workspace URL:
-- Enter your workspace URL, with the format https://<instance-name>.cloud.databricks.com. To get your workspace URL, see Workspace instance names, URLs, and IDs.
+- - 6a. Databricks Workspace URL:
+- - Enter your workspace URL, with the format https://<instance-name>.cloud.databricks.com. To get your workspace URL, see Workspace instance names, URLs, and IDs.
 
-6b. Token:
+- - 6b. Token:
 - In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop down.
 
 - On the Access tokens tab, click Generate new token.

@@ -65,12 +65,12 @@ This demo will launch auto generated tables(100s) inside single bronze and silver
 - cloud_provider_name : aws or azure or gcp
 - db_version : Databricks Runtime Version
 - dbfs_path : Path on your Databricks workspace where demo will be copied for launching DLT-META Pipelines
-- you can provide --profile=databricks_profile name in case you already have databricks cli otherwise command prompt will ask host and token
+- you can provide `--profile=databricks_profile name` in case you already have databricks cli otherwise command prompt will ask host and token
 
-6a. Databricks Workspace URL:
+- - 6a. Databricks Workspace URL:
 - Enter your workspace URL, with the format https://<instance-name>.cloud.databricks.com. To get your workspace URL, see Workspace instance names, URLs, and IDs.
 
-6b. Token:
+- - 6b. Token:
 - In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop down.
 
 - On the Access tokens tab, click Generate new token.
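The `--profile` option and the workspace URL/token prompts described in these hunks correspond to the standard Databricks SDK authentication paths. A minimal sketch, assuming the `databricks-sdk` package is installed; the profile name and token placeholder are hypothetical:

```python
from databricks.sdk import WorkspaceClient

# Option 1: reuse an existing Databricks CLI profile (profile name is hypothetical).
ws = WorkspaceClient(profile="my-databricks-profile")

# Option 2: authenticate with an explicit workspace URL and personal access token,
# which is what the demo launcher prompts for when no --profile is given.
ws = WorkspaceClient(
    host="https://<instance-name>.cloud.databricks.com",
    token="<personal-access-token>",
)

print(ws.current_user.me().user_name)  # quick sanity check of the connection
```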

demo/conf/onboarding.template

Lines changed: 9 additions & 0 deletions
@@ -12,6 +12,7 @@
 },
 "bronze_database_prod": "{uc_catalog_name}.{bronze_schema}",
 "bronze_table": "customers",
+"bronze_table_path_prod": "{dbfs_path}/data/bronze/customers",
 "bronze_reader_options": {
 "cloudFiles.format": "csv",
 "cloudFiles.rescuedDataColumn": "_rescued_data",

@@ -20,8 +21,10 @@
 "bronze_data_quality_expectations_json_prod": "{dbfs_path}/demo/conf/dqe/customers.json",
 "bronze_database_quarantine_prod": "{uc_catalog_name}.{bronze_schema}",
 "bronze_quarantine_table": "customers_quarantine",
+"bronze_quarantine_table_path_prod": "{dbfs_path}/data/bronze/customers_quarantine",
 "silver_database_prod": "{uc_catalog_name}.{silver_schema}",
 "silver_table": "customers",
+"silver_table_path_prod": "{dbfs_path}/data/silver/customers",
 "silver_cdc_apply_changes": {
 "keys": [
 "customer_id"

@@ -50,6 +53,7 @@
 },
 "bronze_database_prod": "{uc_catalog_name}.{bronze_schema}",
 "bronze_table": "transactions",
+"bronze_table_path_prod": "{dbfs_path}/data/bronze/transactions",
 "bronze_reader_options": {
 "cloudFiles.format": "csv",
 "cloudFiles.rescuedDataColumn": "_rescued_data",

@@ -61,6 +65,7 @@
 "bronze_quarantine_table_path_prod": "{dbfs_path}/demo/resources/data/bronze/transactions_quarantine",
 "silver_database_prod": "{uc_catalog_name}.{silver_schema}",
 "silver_table": "transactions",
+"silver_table_path_prod": "{dbfs_path}/data/silver/transactions",
 "silver_cdc_apply_changes": {
 "keys": [
 "transaction_id"

@@ -90,6 +95,7 @@
 },
 "bronze_database_prod": "{uc_catalog_name}.{bronze_schema}",
 "bronze_table": "products",
+"bronze_table_path_prod": "{dbfs_path}/data/bronze/products",
 "bronze_reader_options": {
 "cloudFiles.format": "csv",
 "cloudFiles.rescuedDataColumn": "_rescued_data",

@@ -102,6 +108,7 @@
 "bronze_quarantine_table_path_prod": "{dbfs_path}/demo/resources/data/bronze/products_quarantine",
 "silver_database_prod": "{uc_catalog_name}.{silver_schema}",
 "silver_table": "products",
+"silver_table_path_prod": "{dbfs_path}/data/silver/products",
 "silver_cdc_apply_changes": {
 "keys": [
 "product_id"

@@ -131,6 +138,7 @@
 },
 "bronze_database_prod": "{uc_catalog_name}.{bronze_schema}",
 "bronze_table": "stores",
+"bronze_table_path_prod": "{dbfs_path}/data/bronze/stores",
 "bronze_reader_options": {
 "cloudFiles.format": "csv",
 "cloudFiles.rescuedDataColumn": "_rescued_data",

@@ -143,6 +151,7 @@
 "bronze_quarantine_table_path_prod": "{dbfs_path}/demo/resources/data/bronze/stores_quarantine",
 "silver_database_prod": "{uc_catalog_name}.{silver_schema}",
 "silver_table": "stores",
+"silver_table_path_prod": "{dbfs_path}/data/silver/stores",
 "silver_cdc_apply_changes": {
 "keys": [
 "store_id"

demo/launch_dais_demo.py

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,6 @@
 import uuid
 from databricks.sdk.service import jobs
+from src.install import WorkspaceInstaller
 from integration_tests.run_integration_tests import (
     DLTMETARunner,
     DLTMetaRunnerConf,

@@ -21,6 +22,7 @@ class DLTMETADAISDemo(DLTMETARunner):
     def __init__(self, args, ws, base_dir):
         self.args = args
         self.ws = ws
+        self.wsi = WorkspaceInstaller(ws)
         self.base_dir = base_dir
 
     def init_runner_conf(self) -> DLTMetaRunnerConf:

demo/launch_techsummit_demo.py

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,7 @@
 from databricks.sdk.service.catalog import VolumeType, SchemasAPI
 from databricks.sdk.service.workspace import ImportFormat
 from dataclasses import dataclass
+from src.install import WorkspaceInstaller
 from integration_tests.run_integration_tests import (
     DLTMETARunner,
     DLTMetaRunnerConf,

@@ -65,6 +66,7 @@ class DLTMETATechSummitDemo(DLTMETARunner):
     def __init__(self, args, ws, base_dir):
         self.args = args
         self.ws = ws
+        self.wsi = WorkspaceInstaller(ws)
         self.base_dir = base_dir
 
     def init_runner_conf(self) -> TechsummitRunnerConf:
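Both demo launchers gain the same two lines: import `WorkspaceInstaller` from `src.install` and build one from the already-authenticated workspace client, presumably so the demos can call the installer's helpers against the same workspace. A condensed sketch of the shared wiring; the class name is an illustrative stand-in for the two demo classes:

```python
from databricks.sdk import WorkspaceClient
from src.install import WorkspaceInstaller  # import added in this commit


class DemoRunner:  # illustrative stand-in for DLTMETADAISDemo / DLTMETATechSummitDemo
    def __init__(self, args, ws: WorkspaceClient, base_dir: str):
        self.args = args
        self.ws = ws
        # New in this commit: a WorkspaceInstaller built from the same client,
        # kept on the runner as self.wsi for later use.
        self.wsi = WorkspaceInstaller(ws)
        self.base_dir = base_dir
```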

docs/content/getting_started/additionals.md

Lines changed: 26 additions & 21 deletions
@@ -11,35 +11,40 @@ draft: false
 
 2. Goto to DLT-META directory
 
-3. If you already have datatbricks CLI installed with profile as given [here](https://docs.databricks.com/en/dev-tools/cli/profiles.html), you can skip above export step and provide --profile=<your databricks profile name> option while running command 4.
+3. Set python environment variable into terminal
+```
+export PYTHONPATH=<<local dlt-meta path>>
+```
 
-4. Run integration test against cloudfile or eventhub or kafka using below options:
-4a. Run the command for cloudfiles ```python integration-tests/run-integration-test.py --cloud_provider_name=aws --dbr_version=11.3.x-scala2.12 --source=cloudfiles --dbfs_path=dbfs:/tmp/DLT-META/```
+4. If you already have datatbricks CLI installed with profile as given [here](https://docs.databricks.com/en/dev-tools/cli/profiles.html), you can skip above export step and provide `--profile=<your databricks profile name>` option while running command in step:4
 
-4b. Run the command for eventhub ```python integration-tests/run-integration-test.py --cloud_provider_name=azure --dbr_version=11.3.x-scala2.12 --source=eventhub --dbfs_path=dbfs:/tmp/DLT-META/ --eventhub_name=iot --eventhub_secrets_scope_name=eventhubs_creds --eventhub_namespace=int_test-standard --eventhub_port=9093 --eventhub_producer_accesskey_name=producer --eventhub_consumer_accesskey_name=consumer```
+5. Run integration test against cloudfile or eventhub or kafka using below options:
+- 5a. Run the command for cloudfiles ```python integration-tests/run-integration-test.py --cloud_provider_name=aws --dbr_version=11.3.x-scala2.12 --source=cloudfiles --dbfs_path=dbfs:/tmp/DLT-META/```
 
-For eventhub integration tests, the following are the prerequisites:
-1. Needs eventhub instance running
-2. Using Databricks CLI, Create databricks secrets scope for eventhub keys
-3. Using Databricks CLI, Create databricks secrets to store producer and consumer keys using the scope created in step 2
+- 5b. Run the command for eventhub ```python integration-tests/run-integration-test.py --cloud_provider_name=azure --dbr_version=11.3.x-scala2.12 --source=eventhub --dbfs_path=dbfs:/tmp/DLT-META/ --eventhub_name=iot --eventhub_secrets_scope_name=eventhubs_creds --eventhub_namespace=int_test-standard --eventhub_port=9093 --eventhub_producer_accesskey_name=producer --eventhub_consumer_accesskey_name=consumer```
 
-Following are the mandatory arguments for running EventHubs integration test
-1. Provide your eventhub topic : --eventhub_name
-2. Provide eventhub namespace : --eventhub_namespace
-3. Provide eventhub port : --eventhub_port
-4. Provide databricks secret scope name : --eventhub_secrets_scope_name
-5. Provide eventhub producer access key name : --eventhub_producer_accesskey_name
-6. Provide eventhub access key name : --eventhub_consumer_accesskey_name
+- - For eventhub integration tests, the following are the prerequisites:
+1. Needs eventhub instance running
+2. Using Databricks CLI, Create databricks secrets scope for eventhub keys
+3. Using Databricks CLI, Create databricks secrets to store producer and consumer keys using the scope created in step 2
 
+- - Following are the mandatory arguments for running EventHubs integration test
+1. Provide your eventhub topic : --eventhub_name
+2. Provide eventhub namespace : --eventhub_namespace
+3. Provide eventhub port : --eventhub_port
+4. Provide databricks secret scope name : --eventhub_secrets_scope_name
+5. Provide eventhub producer access key name : --eventhub_producer_accesskey_name
+6. Provide eventhub access key name : --eventhub_consumer_accesskey_name
 
-4c. Run the command for kafka ```python3 integration-tests/run-integration-test.py --cloud_provider_name=aws --dbr_version=11.3.x-scala2.12 --source=kafka --dbfs_path=dbfs:/tmp/DLT-META/ --kafka_topic_name=dlt-meta-integration-test --kafka_broker=host:9092```
 
-For kafka integration tests, the following are the prerequisites:
-1. Needs kafka instance running
+- 5c. Run the command for kafka ```python3 integration-tests/run-integration-test.py --cloud_provider_name=aws --dbr_version=11.3.x-scala2.12 --source=kafka --dbfs_path=dbfs:/tmp/DLT-META/ --kafka_topic_name=dlt-meta-integration-test --kafka_broker=host:9092```
 
-Following are the mandatory arguments for running EventHubs integration test
-1. Provide your kafka topic name : --kafka_topic_name
-2. Provide kafka_broker : --kafka_broker
+- - For kafka integration tests, the following are the prerequisites:
+1. Needs kafka instance running
+
+- - Following are the mandatory arguments for running EventHubs integration test
+1. Provide your kafka topic name : --kafka_topic_name
+2. Provide kafka_broker : --kafka_broker
 
 6. Once finished integration output file will be copied locally to
 ```integration-test-output_<run_id>.txt```
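Prerequisites 2 and 3 for the EventHubs test create a secret scope and two secrets with the Databricks CLI; the equivalent can also be scripted with the Python SDK. A minimal sketch, assuming the scope and key names from the example command above (`eventhubs_creds`, `producer`, `consumer`); the environment variable names are made up for the example:

```python
import os

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up your CLI profile or environment-based auth

# Prerequisite 2: a secret scope to hold the Event Hubs keys.
w.secrets.create_scope(scope="eventhubs_creds")

# Prerequisite 3: store producer and consumer access keys in that scope.
w.secrets.put_secret(
    scope="eventhubs_creds",
    key="producer",
    string_value=os.environ["EVENTHUB_PRODUCER_KEY"],  # hypothetical env var
)
w.secrets.put_secret(
    scope="eventhubs_creds",
    key="consumer",
    string_value=os.environ["EVENTHUB_CONSUMER_KEY"],  # hypothetical env var
)
```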
Two binary files changed (not shown): -171 bytes and -26 bytes.
