
Commit a15c517

Merge pull request #83 from databrickslabs/82-create-demo-to-showcase-fanout-architecture-in-silver-layer-using-dlt-meta
Added demo to showcase the silver fanout architecture: a cars input dataset containing rows for different countries is used to create 5 silver tables from a single cars table based on filter conditions
2 parents 759623e + e085a0a commit a15c517

20 files changed: +30,470 −24 lines

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@
 - Added support for Bring your own custom transformation: [Issue](https://github.com/databrickslabs/dlt-meta/issues/68)
 - Added support to Unify PyPI releases with GitHub OIDC: [PR](https://github.com/databrickslabs/dlt-meta/pull/62)
 - Added demo for append_flow and file_metadata options: [PR](https://github.com/databrickslabs/dlt-meta/issues/74)
+- Added demo for silver fanout architecture: [PR](https://github.com/databrickslabs/dlt-meta/pull/83)
 - Added documentation in docs site for new features: [PR](https://github.com/databrickslabs/dlt-meta/pull/64)
 - Added unit tests to showcase silver layer fanout examples: [PR](https://github.com/databrickslabs/dlt-meta/pull/67)
 - Fixed issue for No such file or directory: '/demo': [PR](https://github.com/databrickslabs/dlt-meta/issues/59)

demo/README.md

Lines changed: 55 additions & 1 deletion
@@ -3,6 +3,7 @@
 2. [Databricks Techsummit Demo](#databricks-tech-summit-fy2024-demo): 100s of data sources ingestion in bronze and silver DLT pipelines automatically.
 3. [Append FLOW Autoloader Demo](#append-flow-autoloader-file-metadata-demo): Write to same target from multiple sources using [dlt.append_flow](https://docs.databricks.com/en/delta-live-tables/flows.html#append-flows) and adding [File metadata column](https://docs.databricks.com/en/ingestion/file-metadata-column.html)
 4. [Append FLOW Eventhub Demo](#append-flow-eventhub-demo): Write to same target from multiple sources using [dlt.append_flow](https://docs.databricks.com/en/delta-live-tables/flows.html#append-flows) and adding [File metadata column](https://docs.databricks.com/en/ingestion/file-metadata-column.html)
+5. [Silver Fanout Demo](#silver-fanout-demo): This demo showcases the implementation of fanout architecture in the silver layer.
@@ -35,7 +36,7 @@ This Demo launches Bronze and Silver DLT pipelines with following activities:
     export PYTHONPATH=$dlt_meta_home
     ```
-6. Run the command ```python demo/launch_dais_demo.py --source=cloudfiles --uc_catalog_name=<<uc catalog name>> --cloud_provider_name=aws --dbr_version=15.3.x-scala2.12 --dbfs_path=dbfs:/dais-dlt-meta-demo-automated_new```
+6. Run the command ```python demo/launch_dais_demo.py --source=cloudfiles --uc_catalog_name=<<uc catalog name>> --cloud_provider_name=aws --dbr_version=15.3.x-scala2.12 --dbfs_path=dbfs:/dais-dlt-meta-demo-automated```
    - cloud_provider_name : aws or azure or gcp
    - db_version : Databricks Runtime Version
    - dbfs_path : Path on your Databricks workspace where demo will be copied for launching DLT-META Pipelines
@@ -202,3 +203,56 @@ This demo will perform following tasks:
![af_eh_demo.png](docs/static/images/af_eh_demo.png)

# Silver Fanout Demo
- This demo showcases the onboarding process for the silver fanout pattern.
- Run the onboarding process for the bronze cars table, which contains data from various countries.
- Run the onboarding process for the silver tables, which have a `where_clause` based on the country condition specified in [silver_transformations_cars.json](https://github.com/databrickslabs/dlt-meta/blob/main/demo/conf/silver_transformations_cars.json).
- Run the Bronze DLT pipeline, which produces the cars table.
- Run the Silver DLT pipeline, fanning out from the bronze cars table into country-specific tables such as cars_usa, cars_uk, cars_germany, and cars_japan (see the sketch below).
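To make the pattern concrete, here is a minimal, illustrative DLT sketch of a silver fanout: one bronze `cars` source feeding several country-filtered silver tables. This is not the code DLT-META generates; the table and column names are taken from the demo configuration, and the catalog/schema handling is simplified.

```python
import dlt
from pyspark.sql.functions import col

# Illustrative fanout: one bronze source, several filtered silver tables.
# Table names mirror the demo configs; adjust catalog/schema for your workspace.
COUNTRY_TABLES = {
    "cars_usa": "United States",
    "cars_uk": "United Kingdom",
    "cars_germany": "Germany",
    "cars_japan": "Japan",
}

def make_silver_table(table_name: str, country: str):
    @dlt.table(name=table_name, comment=f"Cars records for {country}")
    def _silver():
        # dlt.read assumes the bronze `cars` table is defined in the same pipeline;
        # for a table in another pipeline/schema use spark.read.table(...) instead.
        return dlt.read("cars").where(col("country") == country)
    return _silver

for table_name, country in COUNTRY_TABLES.items():
    make_silver_table(table_name, country)
```

In the demo itself, the equivalent tables are driven purely by the onboarding and silver-transformation JSON files shown later in this commit.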
### Steps:
1. Launch Terminal/Command prompt

2. Install [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html)

3. ```commandline
   git clone https://github.com/databrickslabs/dlt-meta.git
   ```

4. ```commandline
   cd dlt-meta
   ```

5. Set the python environment variable in the terminal
    ```commandline
    dlt_meta_home=$(pwd)
    ```
    ```commandline
    export PYTHONPATH=$dlt_meta_home
    ```

6. Run the command ```python demo/launch_silver_fanout_demo.py --source=cloudfiles --uc_catalog_name=<<uc catalog name>> --cloud_provider_name=aws --dbr_version=15.3.x-scala2.12 --dbfs_path=dbfs:/dais-dlt-meta-silver-fanout```
    - cloud_provider_name : aws or azure
    - db_version : Databricks Runtime Version
    - dbfs_path : Path on your Databricks workspace where the demo will be copied for launching DLT-META pipelines
    - You can provide `--profile=<databricks profile name>` if you already have the Databricks CLI configured; otherwise the command prompt will ask for the host and token.

    - 6a. Databricks Workspace URL:
      - Enter your workspace URL, in the format https://<instance-name>.cloud.databricks.com. To get your workspace URL, see Workspace instance names, URLs, and IDs.

    - 6b. Token:
      - In your Databricks workspace, click your Databricks username in the top bar, and then select User Settings from the drop-down.
      - On the Access tokens tab, click Generate new token.
      - (Optional) Enter a comment that helps you identify this token in the future, and change the token's default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).
      - Click Generate.
      - Copy the displayed token.
      - Paste it into the command prompt.

![silver_fanout_workflow.png](docs/static/images/silver_fanout_workflow.png)

![silver_fanout_dlt.png](docs/static/images/silver_fanout_dlt.png)

demo/conf/onboarding_cars.template

Lines changed: 21 additions & 0 deletions
```json
[
    {
        "data_flow_id": "100",
        "data_flow_group": "A1",
        "source_system": "mysql",
        "source_format": "cloudFiles",
        "source_details": {
            "source_path_demo": "{dbfs_path}/demo/resources/data/cars"
        },
        "bronze_database_demo": "{uc_catalog_name}.{bronze_schema}",
        "bronze_table": "cars",
        "bronze_reader_options": {
            "cloudFiles.format": "csv",
            "cloudFiles.rescuedDataColumn": "_rescued_data",
            "header": "true"
        },
        "silver_database_demo": "{uc_catalog_name}.{silver_schema}",
        "silver_table": "cars_usa",
        "silver_transformation_json_demo": "{dbfs_path}/demo/conf/silver_transformations_cars.json"
    }
]
```
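The curly-brace tokens ({dbfs_path}, {uc_catalog_name}, {bronze_schema}, {silver_schema}) are placeholders that the demo launcher fills in before onboarding. Below is a minimal sketch of that substitution, using hypothetical values; the actual launch_silver_fanout_demo.py may handle this differently.

```python
import json
from pathlib import Path

# Hypothetical substitution values; the demo launcher derives these from CLI args.
params = {
    "dbfs_path": "dbfs:/dais-dlt-meta-silver-fanout",
    "uc_catalog_name": "my_uc_catalog",
    "bronze_schema": "dltmeta_bronze",   # assumed schema name
    "silver_schema": "dltmeta_silver",   # assumed schema name
}

# Plain string replacement avoids tripping over the JSON's own braces.
template = Path("demo/conf/onboarding_cars.template").read_text()
for key, value in params.items():
    template = template.replace("{" + key + "}", value)

onboarding = json.loads(template)               # rendered template is plain onboarding JSON
print(onboarding[0]["bronze_database_demo"])    # e.g. my_uc_catalog.dltmeta_bronze
```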
Lines changed: 29 additions & 0 deletions
```json
[
    {
        "data_flow_id": "101",
        "data_flow_group": "A1",
        "bronze_database_demo": "{uc_catalog_name}.{bronze_schema}",
        "bronze_table": "cars",
        "silver_database_demo": "{uc_catalog_name}.{silver_schema}",
        "silver_table": "cars_germany",
        "silver_transformation_json_demo": "{dbfs_path}/demo/conf/silver_transformations_cars.json"
    },
    {
        "data_flow_id": "102",
        "data_flow_group": "A1",
        "bronze_database_demo": "{uc_catalog_name}.{bronze_schema}",
        "bronze_table": "cars",
        "silver_database_demo": "{uc_catalog_name}.{silver_schema}",
        "silver_table": "cars_uk",
        "silver_transformation_json_demo": "{dbfs_path}/demo/conf/silver_transformations_cars.json"
    },
    {
        "data_flow_id": "103",
        "data_flow_group": "A1",
        "bronze_database_demo": "{uc_catalog_name}.{bronze_schema}",
        "bronze_table": "cars",
        "silver_database_demo": "{uc_catalog_name}.{silver_schema}",
        "silver_table": "cars_japan",
        "silver_transformation_json_demo": "{dbfs_path}/demo/conf/silver_transformations_cars.json"
    }
]
```
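All three entries above read from the same bronze cars table; together with cars_usa from onboarding_cars.template, one bronze table fans out into four silver tables. The small sketch below just makes that mapping explicit (the entries are inlined here rather than read from the template files).

```python
from collections import defaultdict

# Minimal copies of the onboarding entries shown above (ids and table names only).
entries = [
    {"data_flow_id": "100", "bronze_table": "cars", "silver_table": "cars_usa"},
    {"data_flow_id": "101", "bronze_table": "cars", "silver_table": "cars_germany"},
    {"data_flow_id": "102", "bronze_table": "cars", "silver_table": "cars_uk"},
    {"data_flow_id": "103", "bronze_table": "cars", "silver_table": "cars_japan"},
]

# Group silver targets by their bronze source to expose the fanout.
fanout = defaultdict(list)
for entry in entries:
    fanout[entry["bronze_table"]].append(entry["silver_table"])

print(dict(fanout))
# {'cars': ['cars_usa', 'cars_germany', 'cars_uk', 'cars_japan']}
```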
Lines changed: 50 additions & 0 deletions
```json
[
    {
        "target_table": "cars_usa",
        "select_exp": [
            "concat(first_name,' ',last_name) as full_name",
            "country",
            "brand",
            "model",
            "color",
            "cc_type"
        ],
        "where_clause": ["country = 'United States'"]
    },
    {
        "target_table": "cars_germany",
        "select_exp": [
            "concat(first_name,' ',last_name) as full_name",
            "country",
            "brand",
            "model",
            "color",
            "cc_type"
        ],
        "where_clause": ["country = 'Germany'"]
    },
    {
        "target_table": "cars_uk",
        "select_exp": [
            "concat(first_name,' ',last_name) as full_name",
            "country",
            "brand",
            "model",
            "color",
            "cc_type"
        ],
        "where_clause": ["country = 'United Kingdom'"]
    },
    {
        "target_table": "cars_japan",
        "select_exp": [
            "concat(first_name,' ',last_name) as full_name",
            "country",
            "brand",
            "model",
            "color",
            "cc_type"
        ],
        "where_clause": ["country = 'Japan'"]
    }
]
```
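Each entry pairs a `select_exp` projection with a `where_clause` filter for one target table, which conceptually maps to a selectExpr followed by a where on the bronze cars table. The PySpark sketch below illustrates that mapping only; it is not DLT-META's implementation, and the bronze table location is an assumption.

```python
from pyspark.sql import DataFrame, SparkSession

def apply_transformation(bronze_df: DataFrame, entry: dict) -> DataFrame:
    """Apply one silver_transformations entry: projection first, then filters."""
    df = bronze_df.selectExpr(*entry["select_exp"])
    for condition in entry.get("where_clause", []):
        df = df.where(condition)  # each clause is a SQL boolean expression
    return df

spark = SparkSession.builder.getOrCreate()
cars = spark.read.table("dltmeta_bronze.cars")  # assumed bronze table location

entry = {
    "target_table": "cars_japan",
    "select_exp": [
        "concat(first_name,' ',last_name) as full_name",
        "country", "brand", "model", "color", "cc_type",
    ],
    "where_clause": ["country = 'Japan'"],
}

cars_japan = apply_transformation(cars, entry)
```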

demo/dbc/afam_eventhub_runners.dbc

196 Bytes
Binary file not shown.

demo/dbc/silver_fout_runners.dbc

1.49 KB
Binary file not shown.

demo/launch_af_cloudfiles_demo.py

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ def launch_workflow(self, runner_conf: DLTMetaRunnerConf):
     "--profile": "provide databricks cli profile name, if not provide databricks_host and token",
     "--uc_catalog_name": "provide databricks uc_catalog name, this is required to create volume, schema, table",
     "--cloud_provider_name": "provide cloud provider name. Supported values are aws , azure , gcp",
-    "--dbr_version": "Provide databricks runtime spark version e.g 11.3.x-scala2.12",
+    "--dbr_version": "Provide databricks runtime spark version e.g 15.3.x-scala2.12",
     "--dbfs_path": "Provide databricks workspace dbfs path where you want run integration tests \
         e.g --dbfs_path=dbfs:/tmp/DLT-META/"
 }

demo/launch_af_eventhub_demo.py

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ def launch_workflow(self, runner_conf: DLTMetaRunnerConf):
     "--profile": "provide databricks cli profile name, if not provide databricks_host and token",
     "--uc_catalog_name": "provide databricks uc_catalog name, this is required to create volume, schema, table",
     "--cloud_provider_name": "provide cloud provider name. Supported values are aws , azure , gcp",
-    "--dbr_version": "Provide databricks runtime spark version e.g 11.3.x-scala2.12",
+    "--dbr_version": "Provide databricks runtime spark version e.g 15.3.x-scala2.12",
     "--dbfs_path": "Provide databricks workspace dbfs path where you want run integration tests \
         e.g --dbfs_path=dbfs:/tmp/DLT-META/",
     "--eventhub_name": "Provide eventhub_name e.g --eventhub_name=iot",

demo/launch_dais_demo.py

Lines changed: 1 addition & 1 deletion
@@ -181,7 +181,7 @@ def create_daisdemo_workflow(self, runner_conf: DLTMetaRunnerConf):
     "--uc_catalog_name": "provide databricks uc_catalog name, \
         this is required to create volume, schema, table",
     "--cloud_provider_name": "provide cloud provider name. Supported values are aws , azure , gcp",
-    "--dbr_version": "Provide databricks runtime spark version e.g 11.3.x-scala2.12",
+    "--dbr_version": "Provide databricks runtime spark version e.g 15.3.x-scala2.12",
     "--dbfs_path": "Provide databricks workspace dbfs path where you want run integration tests \
         e.g --dbfs_path=dbfs:/tmp/DLT-META/"}
