Commit c9355f9

datalakes_featuregate (#1079)
* datalakes_featuregate

Updating set up instructions to replace the need to contact Support to get the in-app set up link

* Update src/connections/storage/catalog/data-lakes/index.md

Co-authored-by: LRubin <[email protected]>

* Update src/connections/storage/catalog/data-lakes/index.md

Co-authored-by: LRubin <[email protected]>

Co-authored-by: LRubin <[email protected]>
1 parent d82fb92 commit c9355f9

1 file changed, +6 -9 lines changed

src/connections/storage/catalog/data-lakes/index.md

Lines changed: 6 additions & 9 deletions
@@ -27,24 +27,21 @@ The Terraform module and manual set up instructions both provide a base level of
 
 ## Step 2 - Enable Data Lakes Destination
 
-After you set up the necessary AWS resources:
+After you set up the necessary AWS resources, the next step is to set up the Data Lakes destination within Segment:
 
-1. [Contact the Support team](https://segment.com/help/contact/) to receive a link to the Data Lakes landing page in your workspace.
+1. In the [Segment App](https://app.segment.com/goto-my-workspace/overview), click **Add Destination**, then search for and select **Data Lakes**.
 
-2. Click the link provided, and from the Data Lakes landing page, click **Configure Data Lakes**.
-
-3. Select the source to connect to the Data Lakes destination.
-
-Each source must be individually connected to the Data Lakes destination. However, you can copy the settings from another source by clicking the “…” button (next to the button for “Setup Guide”).
-
-> **Note**: You must include all source ids in the external ID list in the IAM policy, or else the source data cannot be synced to S3.
+2. Click **Configure Data Lakes** and select the source to connect to the Data Lakes destination.
+> **Warning**: You must include all source ids in the external ID list in the IAM policy, or else the source data cannot be synced to S3.
 
 4. In the Settings tab, enter and save the following connection settings:
 - **AWS Region**: The AWS Region where your EMR cluster, S3 Bucket and Glue DB reside.
 - **EMR Cluster ID**: The EMR Cluster ID where the Data Lakes jobs will be run.
 - **Glue Catalog ID**: The Glue Catalog ID (this must be the same as your AWS account ID).
 - **IAM Role ARN**: The ARN of the IAM role that Segment will use to connect to Data Lakes.
 - **S3 Bucket**: Name of the S3 bucket used by Data Lakes. The EMR cluster will store logs in this bucket.
+
+You must individually connect each source to the Data Lakes destination. However, you can copy the settings from another source by clicking **** ("more") (next to the button for “Set up Guide”).
 
 5. _(Optional)_ **Date Partition**: Optional advanced setting to change the date partition structure, with a default structure `day=<YYYY-MM-DD>/hr=<HH>`. To use the default, leave this setting unchanged. To partition the data by a different date structure, choose one of the following options:
 - Day/Hour [YYYY-MM-DD/HH] (Default)
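
For reference, the default date partition structure named in step 5, `day=<YYYY-MM-DD>/hr=<HH>`, can be illustrated with a short sketch. This is not Segment's implementation: the function name `default_date_partition` is hypothetical, and computing the partition in UTC is an assumption, since the diff does not say which time zone is used.

```python
from datetime import datetime, timezone


def default_date_partition(event_time: datetime) -> str:
    """Format a timestamp as day=<YYYY-MM-DD>/hr=<HH> (hypothetical helper)."""
    # Assumption: partitions are computed in UTC; the documentation in the
    # diff does not state the time zone.
    utc_time = event_time.astimezone(timezone.utc)
    return f"day={utc_time:%Y-%m-%d}/hr={utc_time:%H}"


example = datetime(2021, 1, 15, 9, 30, tzinfo=timezone.utc)
print(default_date_partition(example))  # prints day=2021-01-15/hr=09
```

The hour is zero-padded to two digits, matching the `<HH>` placeholder in the default structure.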
