11---
2- title: Provide Access to Your Database
2+ title: Replicate your SQL Database
33sidebar_position: 3
44---
55
6- OSO's dagster infrastructure has support for database replication into our data
6+ OSO's Dagster infrastructure has support for database replication into our data
77warehouse by using Dagster's "embedded-elt" that integrates with the library
88[dlt](https://dlthub.com/).
99
10- ## Configure your database as a dagster asset
10+ ## Configure your database as a Dagster asset
1111
12- There are many possible ways to configure a database as a dagster asset,
13- however, to reduce complexity of configuration we provide a single interface for
14- specifying a SQL database for replication. The SQL database _must_ be a database
15- that is [supported by
16- dlt](https://dlthub.com/devel/dlt-ecosystem/verified-sources/sql_database). In
17- general, we replicate _all_ columns and for now custom column selection is not
12+ There are many possible ways to configure a database as a Dagster asset.
13+ To simplify things, we have built a factory function, `sql_assets`,
14+ to automatically replicate any SQL database.
15+ The SQL database _must_ be a database that is
16+ [supported by dlt](https://dlthub.com/devel/dlt-ecosystem/verified-sources/sql_database).
17+ In general, we replicate _all_ columns and for now custom column selection is not
1818available in our interface.
1919
20- This section shows how to setup a database with two tables as a set of sql
21- assets. The table named `some_incremental_database` has a chronologically
22- organized or updated dataset and can therefore be loaded incrementally. The
23- second table, `some_nonincremental_database`, does not have a way to be loaded
20+ This section shows how to replicate 2 tables in a database.
21+ The first table, named `some_incremental_database`, has a time column
22+ and can be loaded incrementally.
23+ The second table, `some_nonincremental_database`, does not have a way to be loaded
2424incrementally and will force a full refresh upon every sync.
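
The difference between the two loading modes can be illustrated with a small, self-contained sketch. This is not how `dlt` or the OSO factory implements syncing; `rows_to_sync` and the sample rows are hypothetical and exist only to show why a time column makes incremental loading possible, while a table without one has to be copied in full on every sync.

```python
from datetime import datetime


def rows_to_sync(rows, cursor_column=None, last_seen=None):
    """Return the rows a single sync would copy (illustrative only).

    With a cursor column and a previously seen value, only newer rows are
    returned (incremental load). Without them, every row is returned
    (full refresh).
    """
    if cursor_column is None or last_seen is None:
        return list(rows)
    return [row for row in rows if row[cursor_column] > last_seen]


# Hypothetical source rows with a "time" column.
source_rows = [
    {"id": 1, "time": datetime(2024, 1, 1)},
    {"id": 2, "time": datetime(2024, 1, 2)},
    {"id": 3, "time": datetime(2024, 1, 3)},
]

# Incremental: only rows newer than the last synced timestamp are copied.
incremental_batch = rows_to_sync(source_rows, "time", datetime(2024, 1, 1))

# Non-incremental: no cursor column, so the whole table is copied again.
full_batch = rows_to_sync(source_rows)
```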
2525
2626To set up this database replication, you can add a new Python file to
@@ -52,25 +52,21 @@ my_database = sql_assets(
5252```
5353
5454The first three lines of the file import some necessary tooling to configure a
55- sql database:
56-
57- - The first import, `sql_assets`, is an asset factory created by the OSO team
58- that enables this "easy" configuration of sql assets.
59- - The second import, `SecretReference`, is a tool used to reference a secret in
60- a secret resolver. The secret resolver can be configured differently based on
61- the environment, but on production we use this to reference a cloud based secret
62- manager.
63- - The final import, `incremental`, is used to specify a column to use for
64- incremental loading. This is a `dlt` constructor that is passed to the
65- configuration.
66-
67- The `sql_assets`, factory takes 3 arguments:
68-
69- - The first argument is an asset key prefix which is used to both specify an
70- asset key prefix and also used when generating asset related names inside the
71- factory. In general, this should match the filename of the containing python
72- file unless you have a more complex set of assets to configure. This name is
73- also used as the dataset name into which this data will be loaded.
55+ SQL database:
56+
57+ - `sql_assets`: an asset factory created by the OSO team
58+ that enables this simple configuration of SQL assets.
59+ - `SecretReference`: a reference to a secret in the OSO secret resolver.
60+ The secret resolver can be configured differently based on
61+ the environment. In production, we use a cloud-based secret manager.
62+ - `incremental`: used to specify a column to use for incremental loading.
63+ This is a `dlt` constructor that is passed to the configuration.
64+
65+ The `sql_assets` factory takes 3 arguments:
66+
67+ - The first argument is an asset key prefix, used to group assets generated
68+ by the factory. In general, this should match the filename of the Python
69+ file unless you have more complex requirements.
7470- The second argument must be a `SecretReference` object that will be used to
7571 retrieve the credentials that you will provide at a later step to the OSO
7672 team. The `SecretReference` object has two required keyword arguments:
@@ -81,11 +77,10 @@ The `sql_assets`, factory takes 3 arguments:
8177 - `key` - This is an arbitrary name for the secret.
8278
8379- The third argument is a list of dictionaries that define options for tables
84- that should be replicated into the data warehouse. The most important options
85- here are:
80+ that should be replicated into OSO.
8681
87- - `table` - The table name
88- - `destination_table_name` - The table name to use in the data warehouse
82+ - `table` - The source table name
83+ - `destination_table_name` - The destination table name to use in the OSO data lake
8984 - `incremental` - An `incremental` object that defines the time/date-based column
9085 to use for incrementally loading a database.
9186
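
As a rough sketch of the shape of the third argument, the snippet below builds a list of table option dictionaries and summarizes how each table would be loaded. Everything here is an assumption made for illustration: `Incremental` is a hypothetical stand-in for the `dlt` `incremental` object, `plan_replication` is not part of the OSO factory, the destination names `one_table` and `another_table` are made up, and treating `table` and `destination_table_name` as required is a guess rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incremental:
    """Hypothetical stand-in for dlt's `incremental`; stores only the cursor column."""

    cursor_path: str


# Assumption: both keys are treated as required in this sketch.
REQUIRED_KEYS = {"table", "destination_table_name"}


def plan_replication(tables):
    """Summarize each table's options as (source, destination, load mode)."""
    plan = []
    for options in tables:
        missing = REQUIRED_KEYS - options.keys()
        if missing:
            raise ValueError(f"table options missing keys: {sorted(missing)}")
        cursor = options.get("incremental")
        mode = f"incremental on {cursor.cursor_path}" if cursor else "full refresh"
        plan.append((options["table"], options["destination_table_name"], mode))
    return plan


# Hypothetical option dictionaries mirroring the two tables described earlier.
tables = [
    {
        "table": "some_incremental_database",
        "destination_table_name": "one_table",
        "incremental": Incremental("time"),
    },
    {
        "table": "some_nonincremental_database",
        "destination_table_name": "another_table",
    },
]

plan = plan_replication(tables)
```

A table without an `incremental` entry falls back to a full refresh in this sketch, which matches the behavior described for the non-incremental table.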
@@ -95,11 +90,11 @@ The `sql_assets`, factory takes 3 arguments:
9590
9691## Enabling access to your database
9792
98- Before the OSO infrastructure can begin to synchronize your database to the data
99- warehouse, it will need to be provided access to the database. At this time
100- there is no automated process for this. Once you're ready to get your database
101- integrated, you will want to contact the OSO team on our
102- [Discord](https://www.opensource.observer/discord). Be prepared to provide
103- credentials (we will work out a secure method of transmission) and also ensure
104- that you have access to update any firewall settings that may be required for us
93+ For the asset to run in OSO production, we will need access to
94+ your secrets (e.g. password or connection string).
95+ At this time there is no automated process for this.
96+ You can contact the OSO team on our
97+ [Discord](https://www.opensource.observer/discord).
98+ Be prepared to provide credentials via a secure method of transmission.
99+ Also remember to update any firewall settings that may be required for us
105100to access your database server.