 ---
-title: "Running Onboarding"
+title: "Run Onboarding"
 date: 2021-08-04T14:25:26-04:00
 weight: 17
 draft: false
 ---
 
-#### Option#1: Python whl job
-1. Go to your Databricks landing page and do one of the following:
+#### Option#1: Databricks Labs CLI
+##### Prerequisites:
+- [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/tutorial.html)
+- Python 3.8.0+ (both prerequisites can be verified as shown below)
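+
+A quick way to confirm both prerequisites from a terminal (a minimal check; it assumes the CLI is already on your PATH):
+```shell
+databricks --version   # Labs commands require the unified (Go) CLI
+python3 --version      # should report 3.8.0 or later
+```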
+##### Steps:
+1. ```git clone https://github.com/databrickslabs/dlt-meta.git```
+2. ```cd dlt-meta```
+3. ```python -m venv .venv```
+4. ```source .venv/bin/activate```
+5. ```pip install databricks-sdk``` (then register the Labs project as shown below)
 
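+The dlt-meta Labs project also needs to be registered with your CLI before the `databricks labs dlt-meta` command below is available; per the project README this is done with the Labs installer:
+```shell
+databricks labs install dlt-meta
+```
+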
-2. In the sidebar, click Workflows and click the Create Job button.
+##### Run the dlt-meta CLI command:
+```shell
+databricks labs dlt-meta onboard
+```
+- The command above will prompt you to provide the onboarding details.
+- If you have cloned the dlt-meta git repo, accepting the defaults will launch the config from the [demo/conf](https://github.com/databrickslabs/dlt-meta/tree/main/demo/conf) folder.
+- You can also create your own onboarding files, e.g. onboarding.json plus data quality and silver transformation files, and put them in the conf folder as shown in [demo/conf](https://github.com/databrickslabs/dlt-meta/tree/main/demo/conf); a sketch of a minimal onboarding file follows below.
 
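+A minimal onboarding entry looks roughly like the sketch below, adapted from the demo templates. The field values are illustrative, and the `_dev`-suffixed keys correspond to the environment name given at the prompts:
+```json
+[
+  {
+    "data_flow_id": "100",
+    "data_flow_group": "A1",
+    "source_system": "MYSQL",
+    "source_format": "cloudFiles",
+    "source_details": {
+      "source_path_dev": "tests/resources/data/customers"
+    },
+    "bronze_database_dev": "dltmeta_bronze",
+    "bronze_table": "customers",
+    "bronze_reader_options": {
+      "cloudFiles.format": "json",
+      "cloudFiles.rescuedDataColumn": "_rescued_data"
+    }
+  }
+]
+```
+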
-3. In the sidebar, click New and select Job from the menu.
-
-4. In the task dialog box that appears on the Tasks tab, replace Add a name for your job… with your job name, for example, Python wheel example.
-
-5. In Task name, enter a name for the task, for example, ```dlt_meta_onboarding_pythonwheel_task```.
-
-6. In Type, select Python wheel.
-
-7. In Package name, enter ```dlt_meta```.
-
-8. In Entry point, enter ```run```.
-
-9. Click Add under Dependent Libraries. In the Add dependent library dialog, under Library Type, click PyPI and enter Package = ```dlt-meta```.
-
-10. Click Add.
-
-11. In Parameters, select keyword arguments, then select JSON. Paste the JSON parameters below:
-```json
-{
-    "onboard_layer": "bronze_silver",
-    "database": "dlt_demo",
-    "onboarding_file_path": "dbfs:/onboarding_files/users_onboarding.json",
-    "silver_dataflowspec_table": "silver_dataflowspec_table",
-    "silver_dataflowspec_path": "dbfs:/onboarding_tables_cdc/silver",
-    "bronze_dataflowspec_table": "bronze_dataflowspec_table",
-    "import_author": "Ravi",
-    "version": "v1",
-    "bronze_dataflowspec_path": "dbfs:/onboarding_tables_cdc/bronze",
-    "uc_enabled": "False",
-    "overwrite": "True",
-    "env": "dev"
-}
+```shell
+Provide onboarding file path (default: demo/conf/onboarding.template):
+Provide onboarding files local directory (default: demo/):
+Provide dbfs path (default: dbfs:/dlt-meta_cli_demo):
+Provide databricks runtime version (default: 14.2.x-scala2.12):
+Run onboarding with unity catalog enabled?
+[0] False
+[1] True
+Enter a number between 0 and 1: 1
+Provide unity catalog name: uc_catalog_name
+Provide dlt meta schema name (default: dlt_meta_dataflowspecs_203b9):
+Provide dlt meta bronze layer schema name (default: dltmeta_bronze_cf595):
+Provide dlt meta silver layer schema name (default: dltmeta_silver_5afa2):
+Provide dlt meta layer
+[0] bronze
+[1] bronze_silver
+[2] silver
+Enter a number between 0 and 2: 1
+Provide bronze dataflow spec table name (default: bronze_dataflowspec):
+Provide silver dataflow spec table name (default: silver_dataflowspec):
+Overwrite dataflow spec?
+[0] False
+[1] True
+Enter a number between 0 and 1: 1
+Provide dataflow spec version (default: v1):
+Provide environment name (default: prod): prod
+Provide import author name (default: ravi.gawai):
+Provide cloud provider name
+[0] aws
+[1] azure
+[2] gcp
+Enter a number between 0 and 2: 0
+Do you want to update ws paths, catalog, schema details to your onboarding file?
+[0] False
+[1] True
 ```
 
-Alternately, you can enter keyword arguments: click + Add and enter a key and value. Click + Add again to enter more arguments.
-
-12. Click Save task.
-
-13. Click Run now.
-
-14. Make sure the job runs successfully. Verify the metadata in the dataflow spec tables you entered in step 11, e.g. ```dlt_demo.bronze_dataflowspec_table```, ```dlt_demo.silver_dataflowspec_table```.
+- Go to your Databricks workspace and locate the onboarding job under Workflows -> Job runs; the spec tables can also be inspected directly, as sketched below.
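+
+Once the run finishes, you can also list the dataflow spec tables the job created (a sketch, assuming the Unity Catalog `tables list` command of recent CLI versions; the catalog and schema names below match the defaults accepted at the prompts above, so substitute your own):
+```shell
+databricks tables list uc_catalog_name dlt_meta_dataflowspecs_203b9
+```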