Commit 22ac4b9

mallikasahay, Mallika Sahay, and sanscontext authored
Small updates to reflect new Glue DB name setting (#1017)
* Small updates to reflect new Glue DB name setting
* Removed info box about Glue DB name
* Update src/connections/storage/data-lakes/index.md

Co-authored-by: LRubin <[email protected]>

* Update src/connections/storage/catalog/data-lakes/index.md

Co-authored-by: LRubin <[email protected]>
Co-authored-by: Mallika Sahay <[email protected]>
Co-authored-by: LRubin <[email protected]>
1 parent c7caca2 commit 22ac4b9

2 files changed: +5 -3 lines changed

src/connections/storage/catalog/data-lakes/index.md

Lines changed: 4 additions & 2 deletions
@@ -46,13 +46,15 @@ After you set up the necessary AWS resources:
 - **IAM Role ARN**: The ARN of the IAM role that Segment will use to connect to Data Lakes.
 - **S3 Bucket**: Name of the S3 bucket used by Data Lakes. The EMR cluster will store logs in this bucket.

-5. _(Optional)_ **Date Partition**: Optional setting to change the date partition structure, with a default structure `day=<YYYY-MM-DD>/hr=<HH>`. To use the default, leave this setting unchanged. To partition the data by a different date structure, choose one of the following options:
+5. _(Optional)_ **Date Partition**: Optional advanced setting to change the date partition structure, with a default structure `day=<YYYY-MM-DD>/hr=<HH>`. To use the default, leave this setting unchanged. To partition the data by a different date structure, choose one of the following options:
 - Day/Hour [YYYY-MM-DD/HH] (Default)
 - Year/Month/Day/Hour [YYYY/MM/DD/HH]
 - Year/Month/Day [YYYY/MM/DD]
 - Day [YYYY-MM-DD]

-6. Enable the Data Lakes destination by toggling the switch next to the “Setup Guide” button to on.
+6. _(Optional)_ **Glue Database Name**: Optional advanced setting to change the name of the Glue Database which is set to the source slug by default. Each source connected to Data Lakes must have a different Glue Database name otherwise data from different sources will collide in the same database.
+
+7. Enable the Data Lakes destination by clicking the toggle near the **Set up Guide** button.

 Once the Data Lakes destination is enabled, the first sync will begin approximately 2 hours later.

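For reviewers who want to see what the four Date Partition options in this hunk look like for a concrete timestamp, here is a minimal Python sketch. The bracket patterns and the default `day=<YYYY-MM-DD>/hr=<HH>` structure come from the changed text. The names `PARTITION_FORMATS`, `partition_path`, and `default_partition_path` are hypothetical, and how Segment actually renders the non-default options as S3 prefixes is an assumption.

```python
from datetime import datetime, timezone

# Bracket patterns copied from the option list, translated to strftime codes.
# Assumption: each option renders exactly as its bracket pattern suggests.
PARTITION_FORMATS = {
    "Day/Hour": "%Y-%m-%d/%H",
    "Year/Month/Day/Hour": "%Y/%m/%d/%H",
    "Year/Month/Day": "%Y/%m/%d",
    "Day": "%Y-%m-%d",
}


def partition_path(option: str, ts: datetime) -> str:
    """Render the bracket pattern of the given option for one timestamp."""
    return ts.strftime(PARTITION_FORMATS[option])


def default_partition_path(ts: datetime) -> str:
    """Render the documented default structure day=<YYYY-MM-DD>/hr=<HH>."""
    return ts.strftime("day=%Y-%m-%d/hr=%H")


if __name__ == "__main__":
    ts = datetime(2021, 3, 5, 7, tzinfo=timezone.utc)
    for option in PARTITION_FORMATS:
        print(f"{option}: {partition_path(option, ts)}")
    print(f"default structure: {default_partition_path(ts)}")
```

The sketch only illustrates granularity: the hourly options produce one partition per hour, while the two daily options group a whole day together.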

src/connections/storage/data-lakes/index.md

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ Data Lakes stores the inferred schema and associated metadata of the S3 data in
 New columns are appended to the end of the table in the Glue Data Catalog as they are detected.

 #### Glue Database
-The Glue database stores the schema inferred by Segment. Segment stores the schema for each source in its own Glue database to organize the data so it is easier to query. To make it easier to find, Segment writes the schema to a Glue database named using the source slug.
+The Glue database stores the schema inferred by Segment. Segment stores the schema for each source in its own Glue database to organize the data so it is easier to query. To make it easier to find, Segment writes the schema to a Glue database named using the source slug by default. The database name can be modified from the Data Lakes settings.

 > info ""
 > The recommended IAM role permissions grant Segment access to create the Glue databases on your behalf. If you do not grant Segment these permissions, you must manually create the Glue databases for Segment to write to.
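The new Glue Database Name setting makes collisions possible, since an override on one source can match the default (the source slug) of another. The following Python sketch shows one way to check a set of sources for that before enabling the destination. It is an illustration only: `effective_glue_db_name` and `find_collisions` are hypothetical helpers and not part of Segment's tooling, and the source slugs are made up.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def effective_glue_db_name(source_slug: str, override: Optional[str] = None) -> str:
    """Return the override if one is set, otherwise the source slug (the default)."""
    return override if override else source_slug


def find_collisions(sources: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Map each Glue database name used by more than one source to those sources.

    `sources` maps a source slug to its Glue Database Name override, or None.
    """
    by_db: Dict[str, List[str]] = defaultdict(list)
    for slug, override in sources.items():
        by_db[effective_glue_db_name(slug, override)].append(slug)
    return {db: slugs for db, slugs in by_db.items() if len(slugs) > 1}


if __name__ == "__main__":
    # Made-up example: ios_prod overrides its name to match web_prod's default.
    sources = {"web_prod": None, "ios_prod": "web_prod", "android_prod": "mobile"}
    for db, slugs in find_collisions(sources).items():
        print(f"Glue database '{db}' is shared by: {', '.join(slugs)}")
```

An empty result means every source writes to its own Glue database, which is the condition the updated setting description requires.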
