Commit 748c177

DOC-467 Draft of setup procedures
1 parent 56092dd commit 748c177

3 files changed

+53
-6
lines changed

src/_data/sidenav/main.yml

Lines changed: 2 additions & 0 deletions
@@ -192,6 +192,8 @@ sections:
192192
title: Set Up Data Lakes
193193
- path: /connections/storage/data-lakes/sync-reports
194194
title: Sync Reports and Error Reporting
195+
- path: /connections/storage/data-lakes/lake-formation
196+
title: AWS Lake Formation
195197
- path: /connections/storage/data-lakes/sync-history
196198
title: Data Lakes Sync History and Health
197199
- path: /connections/storage/data-lakes/comparison

src/connections/storage/data-lakes/lake-formation-setup.md

Lines changed: 0 additions & 6 deletions
This file was deleted.
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
---
2+
title: AWS Lake Formation
3+
---
4+
AWS Lake Formation is a fully managed service built on top of the AWS Glue Data Catalog that provides one central set of tools to securely build and manage a data lake. The tools fall into two categories: setup and data management, and security management. Setup and data management tools help you import, catalog, transform, and deduplicate data, and optimize your storage and security. Security management tools help you define and enforce encryption and access controls, and implement audit logging.
5+
6+
> note "Learn more about AWS Lake Formation features"
7+
> To learn more about AWS Lake Formation features, refer to the [Amazon Web Services documentation](https://aws.amazon.com/lake-formation/features/).
8+
9+
<!---add description of how the security works, because the secure aspect is a big selling point-->
10+
11+
## Configuring Lake Formation
12+
You can configure Lake Formation using the [`IAMAllowedPrincipals` group](#configuring-lake-formation-using-the-iamallowedprincipals-group) or by [using IAM policies for access control](#configuring-lake-formation-using-iam-policies). With the `IAMAllowedPrincipals` group,
13+
<!--add use case explanation, finish sentence here-->
14+
15+
> info "Permissions required to configure Data Lakes"
16+
> To configure Lake Formation, you must be logged in to AWS with data lake administrator or database creator permissions.
17+
18+
### Configuring Lake Formation using the IAMAllowedPrincipals group
19+
20+
#### Existing databases
21+
1. Open the [AWS Lake Formation service](https://console.aws.amazon.com/lakeformation/).
22+
2. Under **Data catalog**, select the settings tab. Ensure the check boxes under the **Default permissions for newly created databases and tables** are not checked.
23+
3. Under **Permissions**, select the **Admins and database creators** section and add your EMR instance profile role (`EMR_EC2_DefaultRole` if you created your EMR cluster manually, or `segment_emr_instance_profile` if you set it up using Terraform) to the **Database creators** section.
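
The console steps above can also be sketched as API request payloads. The Python below is a minimal sketch, not part of the documented procedure: it assumes that clearing the two default-permission check boxes corresponds to emptying `CreateDatabaseDefaultPermissions` and `CreateTableDefaultPermissions` in the Lake Formation settings, and that adding a role under **Database creators** corresponds to granting `CREATE_DATABASE` on the catalog. The function names and the role ARN are hypothetical.

```python
import copy


def settings_without_iam_defaults(current_settings: dict) -> dict:
    """Return a copy of the data lake settings with both default-permission
    lists emptied (assumed equivalent to clearing the two check boxes)."""
    updated = copy.deepcopy(current_settings)
    updated["CreateDatabaseDefaultPermissions"] = []
    updated["CreateTableDefaultPermissions"] = []
    return updated


def database_creator_grant(role_arn: str) -> dict:
    """Build keyword arguments for a Lake Formation grant that lets the
    given role create databases in the Data Catalog."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"Catalog": {}},
        "Permissions": ["CREATE_DATABASE"],
    }


# Hypothetical usage with boto3 (requires AWS credentials, not run here):
#   lf = boto3.client("lakeformation")
#   current = lf.get_data_lake_settings()["DataLakeSettings"]
#   lf.put_data_lake_settings(DataLakeSettings=settings_without_iam_defaults(current))
#   lf.grant_permissions(**database_creator_grant(
#       "arn:aws:iam::123456789012:role/segment_emr_instance_profile"))  # placeholder ARN
```

Building the payloads as plain dictionaries keeps the sketch testable without AWS access; the commented calls show where they would be passed.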
24+
25+
#### New databases
26+
1. Open the [AWS Lake Formation service](https://console.aws.amazon.com/lakeformation/).
27+
2. Under **Data catalog**, select the settings tab. Ensure the check boxes under the **Default permissions for newly created databases and tables** are not checked.
28+
3. Select the Databases tab. Click the **Create database** button, and create your database:
29+
1. Select the **Database** button.
30+
2. Name your database.
31+
3. Set the location to `s3://$datalake_bucket/segment-data/`. <br/> **Optional:** Add a description to your database.
32+
4. Select the **Use only IAM access control for new tables in this database** check box.
33+
5. Click **Create database**.
34+
4.
35+
<!---asked Udit where the next step lives for the new databases section: doc isn't super clear?-->
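
For readers who script this instead of using the console, the database creation steps can be sketched as a Glue `DatabaseInput` payload. This is a minimal sketch under stated assumptions: the function name, database name, and bucket name are hypothetical, and it assumes the **Use only IAM access control for new tables in this database** option corresponds to granting `ALL` to the `IAM_ALLOWED_PRINCIPALS` identifier in `CreateTableDefaultPermissions`.

```python
def segment_database_input(name: str, datalake_bucket: str,
                           description: str = "", iam_only: bool = True) -> dict:
    """Build a Glue DatabaseInput for a database located at
    s3://<datalake_bucket>/segment-data/."""
    database_input = {
        "Name": name,
        "LocationUri": f"s3://{datalake_bucket}/segment-data/",
    }
    if description:
        # The description is optional, as in the console flow.
        database_input["Description"] = description
    if iam_only:
        # Assumed equivalent of selecting the IAM-only access control option.
        database_input["CreateTableDefaultPermissions"] = [{
            "Principal": {"DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"},
            "Permissions": ["ALL"],
        }]
    else:
        # No default grants: access is governed by Lake Formation permissions.
        database_input["CreateTableDefaultPermissions"] = []
    return database_input


# Hypothetical usage with boto3 (requires AWS credentials, not run here):
#   glue = boto3.client("glue")
#   glue.create_database(
#       DatabaseInput=segment_database_input("segment_lake", "my-datalake-bucket"))
```

Passing `iam_only=False` gives the variant without the IAM-only option, which matches the database creation steps in the IAM policies flow.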
36+
37+
### Configuring Lake Formation using IAM policies
38+
39+
#### Existing databases
40+
1. Open the [AWS Lake Formation service](https://console.aws.amazon.com/lakeformation/).
41+
42+
#### New databases
43+
1. Open the [AWS Lake Formation service](https://console.aws.amazon.com/lakeformation/).
44+
2. Under **Data catalog**, select the settings tab. Ensure the check boxes under the **Default permissions for newly created databases and tables** are not checked.
45+
3. Select the Databases tab. Click the **Create database** button, and create your database:
46+
1. Select the **Database** button.
47+
2. Name your database.
48+
3. Set the location to `s3://$datalake_bucket/segment-data/`. <br/> **Optional:** Add a description to your database.
49+
4. Click **Create database**.
50+
4.
51+
<!---same as note above: not sure where next step lives for either new/existing databases-->
