
Commit ba57f05

updated readme with schema example
1 parent 191de41 commit ba57f05

File tree

1 file changed: +21 -10 lines changed

README.md

Lines changed: 21 additions & 10 deletions
@@ -51,28 +51,39 @@ import dataworkbench
 
 To use it on your local machine, it requires you to set a set of variables to connect to the Veracity Dataworkbench API.
 
-### Basic Example
 
+
+## Examples
+
+### Saving a Spark DataFrame to the Data Catalogue
 ```python
 from dataworkbench import DataCatalogue
 
 df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["letter", "number"])
 
-datacatalogue = DataCatalogue() # Naming subject to change
-datacatalogue.save(df, "Dataset Name", "Description", tags={"environment": ["test"]})
+datacatalogue = DataCatalogue()
+datacatalogue.save(
+    df,
+    "Dataset Name",
+    "Description",
+    tags={"environment": ["test"]}
+) # schema_id is optional - if not provided, schema will be inferred from the dataframe
 ```
-
-## Examples
-
-### Saving a Spark DataFrame to the Data Catalogue
-
+#### Using an existing schema
+When you have an existing schema that you want to reuse:
 ```python
 from dataworkbench import DataCatalogue
 
 df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["letter", "number"])
 
-datacatalogue = DataCatalogue() # Naming subject to change
-datacatalogue.save(df, "Dataset Name", "Description", tags={"environment": ["test"]})
+datacatalogue = DataCatalogue()
+datacatalogue.save(
+    df,
+    "Dataset Name",
+    "Description",
+    tags={"environment": ["test"]},
+    schema_id="abada0f7-acb4-43cf-8f54-b51abd7ba8b1" # Using an existing schema ID
+)
 ```
 
 ## API Reference
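
The comment added in the first example says the schema is inferred from the DataFrame when no `schema_id` is supplied. As an illustrative sketch only, using plain PySpark (no Data Workbench calls) and assuming a standard `SparkSession`, this is the schema Spark itself infers for the example DataFrame:

```python
from pyspark.sql import SparkSession

# Assumption: a local SparkSession; in a Data Workbench notebook a `spark`
# session is usually already available.
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["letter", "number"])

# Inspect the schema Spark infers from the tuples and column names; per the
# README comment above, an inferred schema is used when no schema_id is passed.
df.printSchema()
# root
#  |-- letter: string (nullable = true)
#  |-- number: long (nullable = true)
```

Passing `schema_id`, as in the second example, is what lets the dataset reuse an existing schema instead of an inferred one.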
