src/connections/sources/catalog/cloud-apps/amazon-s3/index.md
- an AWS IAM execution role that grants the permissions your Lambda function needs through the permissions policy associated with this role
- an AWS S3 source bucket with a notification configuration that invokes the Lambda function

## Prerequisites
This tutorial assumes that you have some basic understanding of S3, Lambda and the `aws cli` tool. If you haven't already, follow the instructions in [Getting Started with AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html){:target="_blank"} to create your first Lambda function. If you're unfamiliar with `aws cli`, follow the instructions in [Setting up the AWS Command Line Interface](https://docs.aws.amazon.com/polly/latest/dg/setup-aws-cli.html){:target="_blank"} before you proceed.
[Install NPM](https://www.npmjs.com/get-npm){:target="_blank"} to manage the function's dependencies.
## Getting started
### 1. Create an S3 source in Segment
Remember the write key for this source; you'll need it in a later step.
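For context, the write key is what the function later uses to authenticate to Segment's HTTP API. Here's a minimal sketch of how such a request could be assembled (illustrative only — the tutorial's `index.js` may use a client library instead; the endpoint and Basic-auth scheme follow Segment's public HTTP tracking API, and `buildSegmentRequest` is a hypothetical helper):

```javascript
// Build (but don't send) the options for an HTTPS request to Segment's
// HTTP tracking API. The write key is the Basic-auth username; the
// password is empty.
function buildSegmentRequest(writeKey, endpoint, payload) {
  const auth = Buffer.from(`${writeKey}:`).toString('base64');
  return {
    hostname: 'api.segment.io',
    path: `/v1/${endpoint}`,
    method: 'POST',
    headers: {
      Authorization: `Basic ${auth}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  };
}

const req = buildSegmentRequest('MY_WRITE_KEY', 'track', {
  userId: 'user-123',
  event: 'File Processed'
});
console.log(req.path); // "/v1/track"
```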
### 2. Create the execution role
Create the [execution role](https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html){:target="_blank"} that gives your function permission to access AWS resources.
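For a Lambda function, the role's trust policy typically allows the Lambda service to assume the role. A standard trust policy document looks like this (a sketch — your account's policy may differ):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```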
The **AWSLambdaExecute** policy has the permissions that the function needs to manage objects in Amazon S3, and write logs to CloudWatch Logs.
### 3. Create local files, an S3 bucket and upload a sample object
Follow these steps to create your local files and S3 bucket, and to upload a sample object.
3. Create your bucket. **Record your bucket name** - you'll need it later!
4. In the source bucket, upload `track_1.csv`.
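A minimal `track_1.csv` might look like this (the columns shown are illustrative; they follow the track structure described in the "CSV formats" section later in this page, where `properties.revenue` is an example property name):

```csv
userId,event,properties.revenue,timestamp
user-123,Order Completed,19.99,1627657200
```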
### 4. Create the function
Next, create the Lambda function, install dependencies, and zip everything up so it can be deployed to AWS.
In this step, you invoke the Lambda function manually using sample Amazon S3 event data.
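Sample S3 event data generally takes the following shape (a trimmed sketch — real events carry additional fields such as the region, event time, and ARNs; the bucket name and key here are placeholders):

```json
{
  "Records": [
    {
      "eventSource": "aws:s3",
      "eventName": "ObjectCreated:Put",
      "s3": {
        "bucket": { "name": "your-source-bucket" },
        "object": { "key": "track_1.csv" }
      }
    }
  ]
}
```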
**To test the Lambda function**
1. Create an empty file named `output.txt` in the `S3-Lambda-Segment` folder - the `aws cli` complains if it's not there.
**Note**: Calls to Segment's Object API don't show up in the Segment debugger.
### Configure Amazon S3 to publish events
In this step, you add the remaining configuration so that Amazon S3 can publish object-created events to AWS Lambda and invoke your Lambda function.
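Once configured, the bucket's notification configuration will look something like this (a sketch — the function ARN is a placeholder, and `S3-Lambda-Segment` as the function name is an assumption borrowed from the folder name used in this tutorial):

```json
{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-west-2:123456789012:function:S3-Lambda-Segment",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
```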
You'll do the following:
### Timestamps
This script automatically transforms all CSV timestamp columns named `createdAt` and `timestamp` to timestamp objects, regardless of nesting, in preparation for Segment ingestion. If your timestamps have a different name, search the example `index.js` code for the `colParser` function, and add your column names there for automatic transformation. If you make this modification, re-zip the package (using `zip -r function.zip .`) and upload the new zip to Lambda.
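The transformation itself amounts to something like the following (an illustration of the behavior, not the tutorial's actual `colParser` code; `coerceTimestamps` is a hypothetical name):

```javascript
// Convert any column named `createdAt` or `timestamp` to a Date,
// regardless of how deeply the column is nested in the parsed row.
const TIMESTAMP_COLUMNS = new Set(['createdAt', 'timestamp']);

function coerceTimestamps(row) {
  for (const [key, value] of Object.entries(row)) {
    if (value !== null && typeof value === 'object') {
      coerceTimestamps(value); // recurse into nested objects
    } else if (TIMESTAMP_COLUMNS.has(key)) {
      // Unix seconds arrive as numeric strings; ISO 8601 strings also work
      row[key] = isNaN(value) ? new Date(value) : new Date(Number(value) * 1000);
    }
  }
  return row;
}

const row = coerceTimestamps({
  userId: 'user-123',
  timestamp: '1627657200',
  traits: { createdAt: '2021-07-30T15:00:00Z' }
});
console.log(row.timestamp instanceof Date); // true
```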
## CSV formats
Define your CSV file structure based on the method you want to execute.
> warning "CSV support recommendation"
>
> Implementing a production-grade solution with this tutorial can be complex. Segment recommends that you submit feature requests for Segment reverse ETL for CSV support.

#### Identify structure
An `identify_XXXXX`.csv file uses the following field names:
In the above structure, the `userId` is required, but all other items are optional. Start all traits with `traits.` and then the trait name, for example `traits.account_type`. Similarly, start context fields with `context.` followed by the canonical structure. The same structure applies to `integrations.` too.
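As an illustration (not the tutorial's actual code), flat dot-notated CSV headers expand into the nested structure an Identify call expects roughly like this; `expandRow` is a hypothetical helper:

```javascript
// Expand flat, dot-notated CSV headers (e.g. "traits.account_type")
// into nested JSON.
function expandRow(flatRow) {
  const out = {};
  for (const [path, value] of Object.entries(flatRow)) {
    const parts = path.split('.');
    let node = out;
    for (const part of parts.slice(0, -1)) {
      node = node[part] = node[part] || {}; // create intermediate objects
    }
    node[parts[parts.length - 1]] = value;
  }
  return out;
}

const identify = expandRow({
  userId: 'user-123',
  'traits.account_type': 'pro',
  'context.ip': '8.8.8.8'
});
console.log(identify);
// { userId: 'user-123', traits: { account_type: 'pro' }, context: { ip: '8.8.8.8' } }
```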
#### Page/Screen structure
For example, a `screen_XXXXX` or `page_YYYY` file has the following field names:
7. `timestamp` (Unix time) - Optional
8. `integrations.<integration>` - Optional
#### Track structure
For example, a `track_XXXXX` file has the following field names:
The example `index.js` sample code above does not support ingestion of arrays. If you need this functionality, you can modify the sample code as needed.
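One possible modification (hypothetical — not part of the tutorial's `index.js`) is to treat any cell that looks like a JSON array as an array value:

```javascript
// Parse cells whose string value looks like a JSON array, e.g. '["a1","b2"]'.
// Malformed cells are left untouched.
function parseArrayCells(row) {
  for (const [key, value] of Object.entries(row)) {
    if (typeof value === 'string' && value.trim().startsWith('[')) {
      try {
        row[key] = JSON.parse(value);
      } catch (err) {
        // not valid JSON - keep the original string
      }
    }
  }
  return row;
}

const parsed = parseArrayCells({
  userId: 'user-123',
  'properties.skus': '["a1","b2"]'
});
console.log(parsed['properties.skus']); // [ 'a1', 'b2' ]
```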
#### Object structure
There are cases when Segment's tracking API is not suitable for datasets that you might want to move to a warehouse. This could be e-commerce product data, media content metadata, campaign performance, and so on.