
Commit 4fa0ff5

updated docs, readme for:
1. Updated feature table with latest API names
2. Updated docs README for metadata prep info
3. Updated docs CLI README to add dev requirements
1 parent 0ffe82e commit 4fa0ff5

3 files changed: +56 −18 lines changed


README.md

Lines changed: 4 additions & 3 deletions
@@ -53,14 +53,15 @@ In practice, a single generic pipeline reads the Dataflowspec and uses it to orc
  | Custom transformations | Bronze, Silver layer accepts custom functions |
  | Data Quality Expectations Support | Bronze, Silver layer |
  | Quarantine table support | Bronze layer |
- | [apply_changes](https://docs.databricks.com/en/delta-live-tables/python-ref.html#cdc) API support | Bronze, Silver layer |
- | [apply_changes_from_snapshot](https://docs.databricks.com/en/delta-live-tables/python-ref.html#change-data-capture-from-database-snapshots-with-python-in-delta-live-tables) API support | Bronze layer |
+ | [create_auto_cdc_flow](https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-apply-changes) API support | Bronze, Silver layer |
+ | [create_auto_cdc_from_snapshot_flow](https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-apply-changes-from-snapshot) API support | Bronze layer |
  | [append_flow](https://docs.databricks.com/en/delta-live-tables/flows.html#use-append-flow-to-write-to-a-streaming-table-from-multiple-source-streams) API support | Bronze layer |
  | Liquid cluster support | Bronze, Bronze Quarantine, Silver tables |
  | [DLT-META CLI](https://databrickslabs.github.io/dlt-meta/getting_started/dltmeta_cli/) | ```databricks labs dlt-meta onboard```, ```databricks labs dlt-meta deploy``` |
  | Bronze and Silver pipeline chaining | Deploy dlt-meta pipeline with ```layer=bronze_silver``` option using Direct publishing mode |
- | [DLT Sinks](https://docs.databricks.com/aws/en/delta-live-tables/dlt-sinks) | Supported formats: external ```delta table```, ```kafka```. Bronze, Silver layers |
+ | [create_sink](https://docs.databricks.com/aws/en/dlt-ref/dlt-python-ref-sink) API support | Supported formats: external ```delta table```, ```kafka```. Bronze, Silver layers |
  | [Databricks Asset Bundles](https://docs.databricks.com/aws/en/dev-tools/bundles/) | Supported |
+ | [DLT-META UI](https://github.com/databrickslabs/dlt-meta/tree/main/lakehouse_app#dlt-meta-lakehouse-app-setup) | Uses Databricks Lakehouse DLT-META App |

  ## Getting Started
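The updated rows point at the renamed DLT Python APIs: `apply_changes` is now `create_auto_cdc_flow`, `apply_changes_from_snapshot` is now `create_auto_cdc_from_snapshot_flow`, and sinks are declared with `create_sink`. A minimal sketch of what the renamed calls look like in pipeline code (table, key, and topic names are hypothetical; assumes it runs inside a DLT pipeline where `dlt` and `spark` are provided):

```python
import dlt
from pyspark.sql.functions import col

# CDC into a streaming table (formerly dlt.apply_changes; arguments unchanged).
dlt.create_streaming_table("customers_silver")

dlt.create_auto_cdc_flow(
    target="customers_silver",      # streaming table declared above
    source="customers_bronze",      # upstream change feed
    keys=["customer_id"],           # primary-key column(s)
    sequence_by=col("event_ts"),    # ordering column for out-of-order events
    stored_as_scd_type=1,           # SCD type 1: keep only the latest row per key
)

# External sink plus an append_flow that feeds it (formerly "DLT Sinks").
dlt.create_sink(
    name="customer_events_sink",
    format="kafka",
    options={"kafka.bootstrap.servers": "broker:9092", "topic": "customer_events"},
)

@dlt.append_flow(name="events_to_kafka", target="customer_events_sink")
def events_to_kafka():
    return spark.readStream.table("customers_silver")
```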

docs/content/getting_started/dltmeta_cli.md

Lines changed: 51 additions & 13 deletions
@@ -5,20 +5,58 @@ weight: 7
  draft: false
  ---

- ### pre-requisites:
- - [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/tutorial.html)
- - Once you install Databricks CLI, authenticate your current machine to a Databricks Workspace:
-
- ```commandline
- databricks auth login --host WORKSPACE_HOST
- ```
+ ### Prerequisites:
  - Python 3.8.0+
- ##### Steps:
- 1. ``` git clone https://github.com/databrickslabs/dlt-meta.git ```
- 2. ``` cd dlt-meta ```
- 3. ``` python -m venv .venv ```
- 4. ``` source .venv/bin/activate ```
- 5. ``` pip install databricks-sdk ```
+ - [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/tutorial.html)
+
+ ### Steps:
+ 1. Install and authenticate the Databricks CLI:
+ ```commandline
+ databricks auth login --host WORKSPACE_HOST
+ ```
+
+ 2. Install dlt-meta via the Databricks CLI:
+ ```commandline
+ databricks labs install dlt-meta
+ ```
+
+ 3. Clone the dlt-meta repository:
+ ```commandline
+ git clone https://github.com/databrickslabs/dlt-meta.git
+ ```
+
+ 4. Navigate to the project directory:
+ ```commandline
+ cd dlt-meta
+ ```
+
+ 5. Create a Python virtual environment:
+ ```commandline
+ python -m venv .venv
+ ```
+
+ 6. Activate the virtual environment:
+ ```commandline
+ source .venv/bin/activate
+ ```
+
+ 7. Install the required packages:
+ ```commandline
+ # Core requirements
+ pip install "PyYAML>=6.0" setuptools databricks-sdk
+
+ # Development requirements
+ pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
+
+ # Integration test requirements
+ pip install "typer[all]==0.6.1"
+ ```
+
+ 8. Set environment variables:
+ ```commandline
+ dlt_meta_home=$(pwd)
+ export PYTHONPATH=$dlt_meta_home
+ ```
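Once step 7 completes, a quick sanity check that the installed `databricks-sdk` can reach the workspace authenticated in step 1 (a minimal sketch; it reuses the credentials cached by `databricks auth login`):

```python
from databricks.sdk import WorkspaceClient

# Picks up the profile cached by `databricks auth login`.
w = WorkspaceClient()

# Prints the identity the CLI authenticated as; an error here usually
# means the host or auth profile needs another look.
print(w.current_user.me().user_name)
```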

  ![onboardingDLTMeta.gif](/images/onboardingDLTMeta.gif)

docs/content/getting_started/metadatapreperation.md

Lines changed: 1 addition & 2 deletions
@@ -64,8 +64,7 @@ The `onboarding.json` file contains links to [silver_transformations.json](https
  | silver_transformation_json | Silver table sql transformation json path |
  | silver_data_quality_expectations_json_{env} | Silver table data quality expectations json file path |
  | silver_append_flows | Silver table append flows json. e.g. `"silver_append_flows":[{"name":"customer_bronze_flow",
- | silver_apply_changes_from_snapshot | Silver apply changes from snapshot Json e.g. Mandatory fields: keys=["userId"], scd_type=`1` or `2` optional fields: track_history_column_list=`[col1]`, track_history_except_column_list=`[col2]` |
- "create_streaming_table": false,"source_format": "cloudFiles", "source_details": {"source_database": "APP","source_table":"CUSTOMERS", "source_path_dev": "tests/resources/data/customers", "source_schema_path": "tests/resources/schema/customer_schema.ddl"},"reader_options": {"cloudFiles.format": "json","cloudFiles.inferColumnTypes": "true","cloudFiles.rescuedDataColumn": "_rescued_data"},"once": true}]`|
+ | silver_apply_changes_from_snapshot | Silver apply changes from snapshot Json e.g. Mandatory fields: keys=["userId"], scd_type=`1` or `2`; optional fields: track_history_column_list=`[col1]`, track_history_except_column_list=`[col2]` |
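For reference, the `silver_apply_changes_from_snapshot` value is a small JSON object. A hypothetical fragment (shown here as the equivalent Python dict) with the mandatory fields and one of the optional history-tracking fields:

```python
# Hypothetical silver_apply_changes_from_snapshot fragment; "keys" and
# "scd_type" are mandatory, the track_history_* fields are optional
# (supply one or the other, not both).
silver_apply_changes_from_snapshot = {
    "keys": ["userId"],                      # primary-key column(s)
    "scd_type": "2",                         # "1" or "2"
    "track_history_column_list": ["col1"],   # SCD2: columns to track
    # "track_history_except_column_list": ["col2"],  # alternative: track all but these
}
```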