Commit b804094

DOC-266 Resolved comments
1 parent aaf3033 commit b804094

File tree

  • src/connections/storage/catalog/aws-s3

1 file changed (+17 -162 lines)

src/connections/storage/catalog/aws-s3/index.md

Lines changed: 17 additions & 162 deletions

@@ -3,11 +3,13 @@ title: AWS S3 with IAM Role Support Destination
 hide-personas-partial: true
 ---
 
-{% include content/beta-note.md %}
+> info "This document is about a destination which is in beta"
+> This means that the AWS S3 with IAM Role Support destination is in active development, and some functionality may change before it becomes generally available.
+
 
 ## Getting Started
 
-The Amazon S3 destination puts the raw logs of the data Segment receives into your S3 bucket, encrypted, no matter what region the bucket is in.
+The AWS destination puts the raw logs of the data Segment receives into your S3 bucket, encrypted, no matter what region the bucket is in.
 
 > info ""
 > Segment copies data into your bucket every hour around the :40 minute mark. You may see multiple files over a period of time depending on the amount of data Segment copies.
@@ -28,25 +30,7 @@ Complete the following steps to configure the AWS S3 Destination with IAM Role S
 
 To complete this section, you need access to your AWS dashboard.
 
-1. Create a new S3 bucket in your preferred region. For more information, see Amazon's documentation, [Create your first S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html){:target="_blank"}. Add the following policy to the bucket to allow Segment to copy files into it:
-   ```json
-   {
-     "Version": "2008-10-17",
-     "Id": "Policy1425281770533",
-     "Statement": [
-       {
-         "Sid": "AllowSegmentUser",
-         "Effect": "Allow",
-         "Principal": {
-           "AWS": "arn:aws:iam::107630771604:user/s3-copy"
-         },
-         "Action": "s3:PutObject",
-         "Resource": "arn:aws:s3:::<YOUR_BUCKET_NAME>/segment-logs/*"
-       }
-     ]
-   }
-   ```
-   This adds the ability to `s3:PutObject` for the Segment s3-copy user for your bucket.
+1. Create a new S3 bucket in your preferred region. For more information, see Amazon's documentation, [Create your first S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html){:target="_blank"}.
 2. Create a new IAM role for Segment to assume. For more information, see Amazon's documentation, [Creating a role to delegate permissions to an IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user.html){:target="_blank"}.
 3. Attach the following trust relationship document. Be sure to add your Workspace ID to the `sts:ExternalId` field.
    ```json
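The hunk is truncated before the body of the trust document. As a rough sketch only, under the assumption of a generic cross-account setup (the principal ARN, workspace ID placeholder, and role name below are illustrative, not values from this commit), a trust relationship of this shape can be created and attached with the AWS CLI:

```bash
# Sketch under stated assumptions: <SEGMENT_PRINCIPAL_ARN>, <YOUR_WORKSPACE_ID>,
# and the role name are illustrative placeholders, not values from this commit.
cat > trust-relationship.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "<SEGMENT_PRINCIPAL_ARN>" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "<YOUR_WORKSPACE_ID>" }
      }
    }
  ]
}
EOF

# Create the IAM role that Segment will assume, with the trust document attached.
aws iam create-role \
  --role-name segment-s3-iam-role \
  --assume-role-policy-document file://trust-relationship.json
```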
@@ -95,10 +79,10 @@ To complete this section, you need access to your AWS dashboard.
         "Sid": "AllowKMS",
         "Effect": "Allow",
         "Action": [
-            "kms:GenerateDataKey",
-            "kms:Decrypt"
+          "kms:GenerateDataKey",
+          "kms:Decrypt"
         ],
-        "Resource": "YOUR_KEY_ARN"
+        "Resource": "<YOUR_KEY_ARN>"
       }
     ]
   }
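Once the permissions policy shown in this hunk is saved locally (for example as `policy.json`), it can be attached to the role with the AWS CLI; the key alias, role name, and policy name below are illustrative assumptions:

```bash
# Look up the ARN to substitute for <YOUR_KEY_ARN> (the alias name is a placeholder).
aws kms describe-key --key-id alias/<YOUR_KEY_ALIAS> \
  --query KeyMetadata.Arn --output text

# Attach the completed permissions policy to the role from the previous step.
aws iam put-role-policy \
  --role-name segment-s3-iam-role \
  --policy-name segment-s3-access \
  --policy-document file://policy.json
```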
@@ -111,7 +95,7 @@ If you have server-side encryption enabled, see the [required configuration](#en
 
 To finish configuration, enable the AWS S3 Destination with IAM Role Support destination in your workspace.
 
-1. Add the destination from the Data Storage catalog.
+1. Add the AWS S3 destination from the Data Storage section of the Destinations catalog.
 2. Select the data source you'll connect to the destination.
 3. Provide a unique name for the destination.
 4. Complete the destination settings:
@@ -124,18 +108,17 @@ To finish configuration, enable the AWS S3 Destination with IAM Role Support des
 ## Migrate an existing destination
 To migrate an existing Amazon S3 destination to the AWS S3 with IAM Role Support Destination:
 
-1. Configure the IAM role and IAM policy permissions as described in steps 3 and 4 [above](#create-an-iam-role-in-aws).
+1. Configure the IAM role and IAM policy permissions as described in steps 2 - 4 [above](#create-an-iam-role-in-aws).
 2. Add the AWS S3 with IAM Role Support Destination and add the AWS Region and IAM role ARN. For the bucket name, enter `<YOUR_BUCKET_NAME>/segment-logs/test`. Enable the destination, and verify data is received at `<YOUR_BUCKET_NAME>/segment-logs/test/segment-logs`. If the folder receives data, continue to the next step. If you don't see log entries, check the trust relationship document and IAM policy attached to the role.
 3. Update the bucket name in the new destination to `<YOUR_BUCKET_NAME>`.
 4. After 1 hour, disable the original Amazon S3 destination to avoid data duplication.
 5. Verify that the `<YOUR_BUCKET_NAME>/segment-logs` folder receives data.
 6. Remove the test folder created in step 2 from the bucket.
 
-{% comment %}
-### Migration steps for users with multiple sources per environment
 
-In cases where users have multiple sources per environment, for example staging sources pointing to a staging bucket, and production sources going to a production bucket, they need two IAM roles, one for staging, and one for production.
+### Migration steps for scenarios with multiple sources per environment
 
+In cases where you have multiple sources per environment, for example staging sources pointing to a staging bucket, and production sources going to a production bucket, you need two IAM roles, one for staging, and one for production.
 
 For example:
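Step 2 above asks you to verify that data reaches the test prefix; one way to spot-check this (the bucket name is a placeholder) is to list the prefix with the AWS CLI:

```bash
# List the most recently written objects under the test prefix.
aws s3 ls s3://<YOUR_BUCKET_NAME>/segment-logs/test/segment-logs/ --recursive | tail -10
```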

@@ -146,10 +129,9 @@ For example:
 - prod_source_2 → prod_bucket
 - prod_source_N → prod_bucket
 
-In this scenario, for `stage_source_1`:
-1.
+For each source in the scenario, complete the steps described in [Migrate an existing destination](#migrate-an-existing-destination), and ensure that you have separate IAM Roles and Permissions set for staging and production use.
+
 
-{% endcomment %}
 ## Data format
 
 Segment stores logs as gzipped, newline-separated JSON containing the full call information. For a list of supported properties, see the [Segment Spec](/docs/connections/spec/) documentation.
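To make the format described above concrete (the object key is hypothetical; real keys follow the `segment-logs/<source-id>/<received-day>/` layout), a single gzipped, newline-separated JSON log can be streamed and inspected with standard tools:

```bash
# Stream one log object, decompress it, and pretty-print the first event.
# The key below is illustrative, not a real object from this commit.
aws s3 cp s3://<YOUR_BUCKET_NAME>/segment-logs/<SOURCE_ID>/<RECEIVED_DAY>/<FILE>.gz - \
  | gunzip \
  | head -n 1 \
  | python3 -m json.tool
```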
@@ -162,98 +144,7 @@ The received-day refers to the UTC date unix timestamp, that the API receives th
 
 ## Encryption
 
-This section contains information for enabling encryption on your S3 bucket.
-
-### Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3)
-
-Segment supports optional, S3-managed Server-Side Encryption, which you can disable or enable from the Destination Configuration UI. By default, the destination now automatically enables encryption, and Segment recommends that you continue to encrypt.
-If you've had the S3 destination enabled since before October 2017, you might need to enable encryption manually on your bucket.
-
-While most client libraries transparently decrypt the file when fetching it, you should make sure that any applications that consume data in the S3 bucket are ready to decrypt the data before you enable this feature. When you're ready, you can enable encryption from the setting in the destination configuration UI.
-
-### Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS)
-Segment can also write to S3 buckets with Default Encryption set to AWS-KMS. This ensures that objects written to your bucket are encrypted using customer managed keys created in your AWS Key Management Service (KMS).
-Follow the steps below to enable encryption using AWS KMS Managed Keys:
-
-#### Create a new customer-managed key and grant the Segment user permissions to generate new keys
-The Segment user must have the permission to `GenerateDataKey` from your AWS Key Management Service. Here is a sample policy document that grants the Segment user the necessary permissions.
-
-```json
-{
-  "Version": "2012-10-17",
-  "Id": "key-consolepolicy-3",
-  "Statement": [
-    {
-      "Sid": "Allow Segment S3 user to generate key",
-      "Effect": "Allow",
-      "Principal": {
-        "AWS": "arn:aws:iam::107630771604:user/s3-copy"
-      },
-      "Action": "kms:GenerateDataKey",
-      "Resource": "*"
-    }
-  ]
-}
-```
-
-![creating customer managed key screenshot](images/customer-managed-key.png)
-
-#### Update S3 bucket default encryption property
-The target S3 bucket should have the "Default encryption" property enabled and set to `AWS-KMS`. Choose the customer-managed key generated in the above step for encryption.
-
-![update default encryption property](images/bucket-property.png)
-
-#### Disable ServerSideEncryption in Segment S3 Destination settings
-Disable the Server Side Encryption setting in the Segment destination configuration. This allows you to enable bucket-level encryption, so Amazon can encrypt objects using KMS managed keys.
-
-![disable segment s3 destination property](images/disable-segment-sse.png)
-
-### Enforcing encryption
-To further secure your bucket by ensuring that all files upload with the encryption flag present, you can add to the bucket policy to strictly enforce that all uploads trigger encryption.
-
-Segment recommends doing this as a best practice. The following policy strictly enforces upload encryption with Amazon S3-Managed keys.
-
-```json
-{
-  "Version": "2008-10-17",
-  "Id": "Policy1425281770533",
-  "Statement": [
-    {
-      "Sid": "AllowSegmentUser",
-      "Effect": "Allow",
-      "Principal": {
-        "AWS": "arn:aws:iam::107630771604:user/s3-copy"
-      },
-      "Action": "s3:PutObject",
-      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/segment-logs/*"
-    },
-    {
-      "Sid": "DenyIncorrectEncryptionHeader",
-      "Effect": "Deny",
-      "Principal": "*",
-      "Action": "s3:PutObject",
-      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*",
-      "Condition": {
-        "StringNotEquals": {
-          "s3:x-amz-server-side-encryption": "AES256"
-        }
-      }
-    },
-    {
-      "Sid": "DenyUnEncryptedObjectUploads",
-      "Effect": "Deny",
-      "Principal": "*",
-      "Action": "s3:PutObject",
-      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*",
-      "Condition": {
-        "Null": {
-          "s3:x-amz-server-side-encryption": "true"
-        }
-      }
-    }
-  ]
-}
-```
+Configure encryption at the bucket-level from within the AWS console. For more information, see Amazon's documentation [Protecting data using encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingEncryption.html){:target="_blank"}.
 
 ## Region
 
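The replacement text defers to the AWS console; as an alternative sketch (bucket name and key ARN are placeholders), bucket-level default encryption can also be configured from the AWS CLI:

```bash
# Set SSE-KMS as the bucket's default encryption; values are placeholders.
aws s3api put-bucket-encryption \
  --bucket <YOUR_BUCKET_NAME> \
  --server-side-encryption-configuration '{
    "Rules": [
      {
        "ApplyServerSideEncryptionByDefault": {
          "SSEAlgorithm": "aws:kms",
          "KMSMasterKeyID": "<YOUR_KEY_ARN>"
        }
      }
    ]
  }'
```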

@@ -265,49 +156,13 @@ To use a custom key prefix for the files in your bucket, append the path to the
 
 ### How can I download the data from my bucket?
 
-Segment recommends using the [AWS CLI](http://aws.amazon.com/cli/) and writing a short script to download specific days, one at a time. The AWS CLI is faster than [s3cmd](http://s3tools.org/s3cmd) because it downloads files in parallel.
-
-> info ""
-> S3 transparently decompresses the files for most clients. To access the raw gzipped data you can programmatically download the file using [the AWS SDK](http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html) and setting `ResponseContentEncoding: none`. (This functionality isn't available in the AWS CLI.) You can also manually remove the metadata on the file (`Content-Type: text/plain` and `Content-Encoding: gzip`) through the AWS interface, which allows you to download the file as gzipped.
-
-To configure the AWS CLI, see Amazon's documentation [here](http://docs.aws.amazon.com/cli/latest/userguide/installing.html). For Linux systems, run the following command:
-
-
-```bash
-$ sudo apt-get install awscli
-```
-
-Then configure the AWS CLI with your Access Key ID and Secret Access Key. You can create or find these keys in your [Amazon IAM user management console](https://console.aws.amazon.com/iam/home#users). Then run the following command, which will prompt you for the access keys:
-
-```bash
-$ aws configure
-```
-
-To see a list of the most recent log folders:
-
-```bash
-$ aws s3 ls s3://{bucket}/segment-logs/{source-id}/ | tail -10
-```
-
-To download the files for a specific day:
-
-```bash
-$ aws s3 sync s3://{bucket}/segment-logs/{source-id}/{received-day} .
-```
-
-Or to download *all* files for a source:
-
-```bash
-$ aws s3 sync s3://{bucket}/segment-logs/{source-id} .
-```
-
-To put the files in a specific folder, replace the `.` at the end ("current directory") with the desired directory, like `~/Downloads/logs`.
+Amazon provides several methods to download data from an S3 bucket. For more information, see [Downloading an object](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html){:target="_blank"}.
 
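For a quick reference, the common case from the deleted walkthrough reduces to a single command (placeholders as in the deleted lines above):

```bash
# Mirror one received-day of logs into the current directory.
aws s3 sync s3://{bucket}/segment-logs/{source-id}/{received-day} .
```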

 
 ## Personas
 
 > warning ""
-> As mentioned above, the Amazon S3 destination works differently than other destinations in Segment. As a result, Segment sends **all** data from a Personas source to S3 during the sync process, not only the connected audiences and traits.
+> As mentioned above, the AWS S3 destination works differently than other destinations in Segment. As a result, Segment sends **all** data from a Personas source to S3 during the sync process, not only the connected audiences and traits.
 
 You can send computed traits and audiences generated using [Segment Personas](/docs/personas) to this destination as a **user property**.
 