Commit 0ad5d67

- Added Hugo docs for the Databricks Labs CLI option
- Added changelog entry for version 0.0.5

1 parent e0932fe commit 0ad5d67

File tree: 12 files changed (+338 −99 lines)

12 files changed

+338
-99
lines changed

CHANGELOG.md

Lines changed: 2 additions & 10 deletions

The Keep a Changelog / Semantic Versioning boilerplate at the top of the file is removed, and the v0.0.5 entries now link to their PRs:

# Changelog

## [v0.0.5]
- Enabled Unity Catalog support: [PR](https://github.com/databrickslabs/dlt-meta/pull/28)
- Added databricks labs cli: [PR](https://github.com/databrickslabs/dlt-meta/pull/28)

## [v0.0.4] - 2023-10-09
### Added
Lines changed: 90 additions & 0 deletions (new file)

---
title: "Additionals"
date: 2021-08-04T14:25:26-04:00
weight: 21
draft: false
---
#### [DLT-META](https://github.com/databrickslabs/dlt-meta) DEMOs
1. [DAIS 2023 DEMO](#dais-2023-demo): Showcases DLT-META's ability to create Bronze and Silver DLT pipelines automatically, in both initial and incremental mode.
2. [Databricks Tech Summit Demo](#databricks-tech-summit-fy2024-demo): Ingests hundreds of data sources into Bronze and Silver DLT pipelines automatically.

##### DAIS 2023 DEMO
This demo launches Bronze and Silver DLT pipelines with the following activities:
- Customer and Transactions feeds for the initial load
- Adds new Product and Stores feeds to the existing Bronze and Silver DLT pipelines via metadata changes
- Runs Bronze and Silver DLT pipelines in incremental mode for CDC events

##### Steps:
1. Launch a terminal/command prompt

2. Install the [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)

3. ```git clone https://github.com/databrickslabs/dlt-meta.git```

4. ```cd dlt-meta```

5. Set the Python path environment variable in your terminal:
```
export PYTHONPATH=<<local dlt-meta path>>
```

6. Run the command ```python demo/launch_dais_demo.py --username=<<your databricks username>> --source=cloudfiles --uc_catalog_name=<<uc catalog name>> --cloud_provider_name=aws --dbr_version=13.3.x-scala2.12 --dbfs_path=dbfs:/dais-dlt-meta-demo-automated_new```
    - cloud_provider_name: aws, azure, or gcp
    - dbr_version: Databricks Runtime version
    - dbfs_path: path in your Databricks workspace where the demo will be copied for launching the DLT-META pipelines
    - You can pass `--profile=<databricks profile name>` if you already have a configured Databricks CLI profile; otherwise the command prompts for a host and token:
        - 6a. Databricks workspace URL:
            - Enter your workspace URL, in the format `https://<instance-name>.cloud.databricks.com`. To get your workspace URL, see Workspace instance names, URLs, and IDs.
        - 6b. Token:
            - In your Databricks workspace, click your Databricks username in the top bar, then select User Settings from the drop-down.
            - On the Access tokens tab, click Generate new token.
            - (Optional) Enter a comment that helps you identify this token in the future, and change the token's default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty.
            - Click Generate.
            - Copy the displayed token.
            - Paste it at the command prompt.

##### Databricks Tech Summit FY2024 DEMO:
This demo launches hundreds of auto-generated tables inside a single Bronze and Silver DLT pipeline using dlt-meta.

1. Launch a terminal/command prompt

2. Install the [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)

3. ```git clone https://github.com/databrickslabs/dlt-meta.git```

4. ```cd dlt-meta```

5. Set the Python path environment variable in your terminal:
```
export PYTHONPATH=<<local dlt-meta path>>
```

6. Run the command ```python demo/launch_techsummit_demo.py --username=<<your databricks username>> --source=cloudfiles --cloud_provider_name=aws --dbr_version=13.3.x-scala2.12 --dbfs_path=dbfs:/techsummit-dlt-meta-demo-automated```
    - cloud_provider_name: aws, azure, or gcp
    - dbr_version: Databricks Runtime version
    - dbfs_path: path in your Databricks workspace where the demo will be copied for launching the DLT-META pipelines
    - You can pass `--profile=<databricks profile name>` if you already have a configured Databricks CLI profile; otherwise the command prompts for the workspace URL and token, as described in step 6 of the DAIS demo above.
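The demo launchers above are driven entirely by command-line flags. As a rough illustration (a hypothetical sketch with argparse, not the actual code in `demo/`), parsing such flags might look like this:

```python
import argparse

def parse_demo_args(argv=None):
    # Hypothetical re-creation of the flag parsing the demo launchers use;
    # the real scripts in demo/ may differ in names and defaults.
    parser = argparse.ArgumentParser(description="Launch a DLT-META demo")
    parser.add_argument("--username", required=True)
    parser.add_argument("--source", default="cloudfiles")
    parser.add_argument("--uc_catalog_name")
    parser.add_argument("--cloud_provider_name",
                        choices=["aws", "azure", "gcp"], required=True)
    parser.add_argument("--dbr_version", default="13.3.x-scala2.12")
    parser.add_argument("--dbfs_path", required=True)
    parser.add_argument("--profile",
                        help="existing Databricks CLI profile (optional)")
    return parser.parse_args(argv)

args = parse_demo_args([
    "--username=someone@example.com",
    "--cloud_provider_name=aws",
    "--dbfs_path=dbfs:/dais-dlt-meta-demo-automated_new",
])
print(args.cloud_provider_name)
```

With `choices=["aws", "azure", "gcp"]`, an invalid `--cloud_provider_name` fails fast with a usage message instead of propagating into the workspace setup.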

docs/content/getting_started/additionals.md renamed to docs/content/getting_started/additionals2.md

Lines changed: 2 additions & 3 deletions

The weight changes from 21 to 22, the intro sentence is removed, and the heading level changes:

---
title: "Additionals"
date: 2021-08-04T14:25:26-04:00
weight: 22
draft: false
---

#### Run Integration Tests
1. Launch a terminal/command prompt

2. Go to the DLT-META directory

docs/content/getting_started/buildwhl.md

Lines changed: 0 additions & 13 deletions
This file was deleted.
Lines changed: 72 additions & 0 deletions (new file)

---
title: "Launch Generic DLT pipeline"
date: 2021-08-04T14:25:26-04:00
weight: 20
draft: false
---
## Option#1: Databricks Labs CLI
##### Prerequisites:
- [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/tutorial.html)
- Python 3.8.0+
##### Steps:
```shell
git clone https://github.com/databrickslabs/dlt-meta.git
cd dlt-meta
python -m venv .venv
source .venv/bin/activate
pip install databricks-sdk
databricks labs dlt-meta onboard
```

- Once the onboarding job has finished, deploy the `bronze` and `silver` DLT pipelines using the command below.

#### Deploy Bronze DLT
```shell
databricks labs dlt-meta deploy
```
- The command above prompts for DLT pipeline details. Provide the same schema details you entered during onboarding:
```shell
Deploy DLT-META with unity catalog enabled?
[0] False
[1] True
Enter a number between 0 and 1: 1
Provide unity catalog name: uc_catalog_name
Deploy DLT-META with serverless?
[0] False
[1] True
Enter a number between 0 and 1: 1
Provide dlt meta layer
[0] bronze
[1] silver
Enter a number between 0 and 1: 0
Provide dlt meta onboard group: A1
Provide dlt_meta dataflowspec schema name: dlt_meta_dataflowspecs_203b9
Provide bronze dataflowspec table name (default: bronze_dataflowspec):
Provide dlt meta pipeline name (default: dlt_meta_bronze_pipeline_2aee):
Provide dlt target schema name: dltmeta_bronze_cf595
```

#### Deploy Silver DLT
```shell
databricks labs dlt-meta deploy
```
- The command above prompts for DLT pipeline details. Provide the same schema details you entered during onboarding:
```shell
Deploy DLT-META with unity catalog enabled?
[0] False
[1] True
Enter a number between 0 and 1: 1
Provide unity catalog name: uc_catalog_name
Deploy DLT-META with serverless?
[0] False
[1] True
Enter a number between 0 and 1: 1
Provide dlt meta layer
[0] bronze
[1] silver
Enter a number between 0 and 1: 1
Provide dlt meta onboard group: A1
Provide dlt_meta dataflowspec schema name: dlt_meta_dataflowspecs_203b9
Provide silver dataflowspec table name (default: silver_dataflowspec):
Provide dlt meta pipeline name (default: dlt_meta_silver_pipeline_21475):
Provide dlt target schema name: dltmeta_silver_5afa2
```
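The deploy prompts shown above are simple numbered choices that re-ask until a valid index is entered. A minimal sketch of such a prompt loop (hypothetical; the actual Databricks Labs CLI implementation may differ):

```python
def choose(question, options, read=input):
    # Print numbered options, then keep asking until the reply
    # is a digit inside the valid index range.
    print(question)
    for i, opt in enumerate(options):
        print(f"[{i}] {opt}")
    while True:
        raw = read(f"Enter a number between 0 and {len(options) - 1}: ")
        if raw.strip().isdigit() and int(raw) < len(options):
            return options[int(raw)]

# Simulated session: "2" is out of range and rejected, "1" selects silver.
answers = iter(["2", "1"])
layer = choose("Provide dlt meta layer", ["bronze", "silver"],
               read=lambda _: next(answers))
print(layer)
```

Injecting `read` instead of calling `input()` directly keeps the loop unit-testable without a terminal.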

docs/content/getting_started/dltpipeline.md renamed to docs/content/getting_started/dltpipelineopt2.md

Lines changed: 3 additions & 2 deletions

The weight changes from 20 to 21, an "Option#2: Manual" heading is added, and the first step's heading level changes:

---
title: "Launch Generic DLT pipeline"
date: 2021-08-04T14:25:26-04:00
weight: 21
draft: false
---
### Option#2: Manual

#### 1. Create a Delta Live Tables launch notebook

1. Go to your Databricks landing page and select Create a notebook, or click New in the sidebar and select Notebook. The Create Notebook dialog appears.
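Behind the manual option, the launch notebook ends up attached to a DLT pipeline via the pipeline's settings. A hypothetical settings fragment (the notebook path, names, and the dlt-meta configuration keys here are illustrative placeholders, not confirmed values):

```json
{
  "name": "dlt_meta_bronze_pipeline",
  "libraries": [
    {"notebook": {"path": "/Users/<user>/dlt_meta_launch_notebook"}}
  ],
  "configuration": {
    "layer": "bronze"
  },
  "target": "dltmeta_bronze"
}
```

The `libraries` and `target` fields are standard DLT pipeline settings; anything dlt-meta-specific under `configuration` should be taken from the project's own docs rather than this sketch.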

Lines changed: 54 additions & 44 deletions

The title changes from "Running Onboarding" to "Run Onboarding", and the old "Option#1: Python whl job" instructions (a Workflows job with a Python wheel task: package `dlt_meta`, entry point `run`, dependent PyPI library `dlt-meta`, and a JSON keyword-arguments block carrying the onboarding file path, database, and dataflowspec table names) are replaced with Databricks Labs CLI instructions:

---
title: "Run Onboarding"
date: 2021-08-04T14:25:26-04:00
weight: 17
draft: false
---

#### Option#1: Databricks Labs CLI
##### Prerequisites:
- [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/tutorial.html)
- Python 3.8.0+
##### Steps:
1. ```git clone https://github.com/databrickslabs/dlt-meta.git```
2. ```cd dlt-meta```
3. ```python -m venv .venv```
4. ```source .venv/bin/activate```
5. ```pip install databricks-sdk```

##### Run the dlt-meta CLI command:
```shell
databricks labs dlt-meta onboard
```
- The command above prompts for onboarding details.
- If you have cloned the dlt-meta git repo, accepting the defaults launches the config from the [demo/conf](https://github.com/databrickslabs/dlt-meta/tree/main/demo/conf) folder.
- You can create your own onboarding files (e.g. onboarding.json), data quality rules, and silver transformations, and put them in the conf folder as shown in [demo/conf](https://github.com/databrickslabs/dlt-meta/tree/main/demo/conf).

```shell
Provide onboarding file path (default: demo/conf/onboarding.template):
Provide onboarding files local directory (default: demo/):
Provide dbfs path (default: dbfs:/dlt-meta_cli_demo):
Provide databricks runtime version (default: 14.2.x-scala2.12):
Run onboarding with unity catalog enabled?
[0] False
[1] True
Enter a number between 0 and 1: 1
Provide unity catalog name: uc_catalog_name
Provide dlt meta schema name (default: dlt_meta_dataflowspecs_203b9):
Provide dlt meta bronze layer schema name (default: dltmeta_bronze_cf595):
Provide dlt meta silver layer schema name (default: dltmeta_silver_5afa2):
Provide dlt meta layer
[0] bronze
[1] bronze_silver
[2] silver
Enter a number between 0 and 2: 1
Provide bronze dataflow spec table name (default: bronze_dataflowspec):
Provide silver dataflow spec table name (default: silver_dataflowspec):
Overwrite dataflow spec?
[0] False
[1] True
Enter a number between 0 and 1: 1
Provide dataflow spec version (default: v1):
Provide environment name (default: prod): prod
Provide import author name (default: ravi.gawai):
Provide cloud provider name
[0] aws
[1] azure
[2] gcp
Enter a number between 0 and 2: 0
Do you want to update ws paths, catalog, schema details to your onboarding file?
[0] False
[1] True
```

- Go to your Databricks workspace and locate the onboarding job under Workflows -> Job runs.
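The onboarding prompts collect roughly the same parameters the removed Python-wheel JSON arguments carried. A hypothetical sketch of assembling and validating such a parameter set before submitting a job (key names follow the prompts above; this is not a confirmed dlt-meta API):

```python
# Required keys mirror the removed JSON keyword arguments; illustrative only.
REQUIRED_KEYS = {"onboarding_file_path", "onboard_layer", "env",
                 "version", "import_author"}
VALID_LAYERS = {"bronze", "silver", "bronze_silver"}

def validate_onboard_params(params):
    # Fail early with a clear message instead of letting a half-configured
    # onboarding job fail inside the workspace.
    missing = REQUIRED_KEYS - params.keys()
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    if params["onboard_layer"] not in VALID_LAYERS:
        raise ValueError(f"unknown layer: {params['onboard_layer']}")
    return params

params = validate_onboard_params({
    "onboarding_file_path": "demo/conf/onboarding.template",
    "onboard_layer": "bronze_silver",
    "env": "prod",
    "version": "v1",
    "import_author": "ravi.gawai",
})
print(sorted(params))
```

Validating locally mirrors what the interactive prompts enforce: the layer choice is restricted to bronze, silver, or both, and every required field must be present.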
