Commit 354c698

author
markzegarelli
authored
Merge pull request #1487 from segmentio/DOC-146_Data-Res-Updates
DOC 146 Data Res Updates
2 parents 79627d3 + 39d3fb9 commit 354c698

4 files changed

+169
-6
lines changed

src/connections/data-residency.md

Lines changed: 169 additions & 6 deletions
@@ -2,14 +2,177 @@
22
title: Data Residency
33
beta: true
44
---
5-
Segment offers customers the option to mitigate risk by providing regional infrastructure across Europe, Middle East, Africa and Asia Pacific. The default region for all customers is United States (Oregon). The regional infrastructure has the same [rate limits and SLA](/docs/connections/rate-limits/) as the default region.
5+
Segment offers customers the option to mitigate risk by providing regional infrastructure across Europe, Middle East, Africa and Asia Pacific. The default region for all customers is in Oregon, United States. The regional infrastructure has the same [rate limits and SLA](/docs/connections/rate-limits/) as the default region.
66

7-
If your workspace is enabled to support regional services, you can select on the region on a per-source basis in that Source's settings.
7+
## Enable Local Data Ingest and Storage
88

9-
![Data Residency Settings](images/data-residency.png)
9+
This feature is in Public Preview. You can enable Local Data Ingest and Storage from the Regional Settings tab within your Workspace settings. Enabling the feature here makes the feature available to both client-side and server-side sources.
1010

11-
All Segment client-side libraries dynamically read your preferred region when the SDK is loaded when your app starts or restarts. Changing regions does not require changes to your code when you change regions.
11+
![enable](images/enable-regional-ingest.png)
1212

13-
For server-side routing, you can view endpoint details in the source's settings, and use the [`host`](https://github.com/segmentio/analytics-python/blob/c9f5ba6b58813eba1c3e5c778b0fc8d86f937f55/analytics/__init__.py#L9) configuration parameter to send data to the desired region.
13+
## Local Data Ingest
1414

15-
Regional infrastructure can failover across locations within the region, but does not failover across regions.
15+
Local Data Ingest enables you to send data to Segment from both client-side and server-side sources through locally hosted API ingest points. The regional infrastructure can fail over across locations within a region, but never across regions.
16+
17+
### Client-side sources
18+
19+
You can configure Segment's client-side SDKs for JavaScript, iOS, Android, and React Native sources to send data to a regional host after you've updated the Ingest Region in that source's settings.
20+
21+
![ingest region](images/regional-ingest.png)
22+
23+
All regions are configured on a **per-source** basis. You'll need to configure the region for each source separately if you do not want to use the default region (Oregon). All Segment client-side SDKs read this setting and update themselves automatically to send data to new endpoints when the app is reloaded. You do not need to change code when you switch regions.
24+
25+
### Server-side and project sources
26+
27+
When you send data from a server-side or project source, you can use the `host` configuration parameter to send data to the desired region:
28+
29+
1. Oregon (Default) — `api.segment.io/v1`
30+
2. Dublin — `in.eu2.segmentapis.com/v1`
31+
3. Singapore — `in.ap1.segmentapis.com/v1`
32+
4. Sydney — `in.au1.segmentapis.com/v1`
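
The region-to-endpoint list above can be captured in a small lookup. This is a minimal sketch in Python: the `INGEST_HOSTS` table and the `ingest_host` helper are hypothetical names used for illustration only. Whether a given library's `host` setting expects a scheme or the `/v1` suffix depends on that library, so check its documentation before passing the value through.

```python
# Hypothetical lookup of the regional ingest endpoints listed above.
# The names INGEST_HOSTS and ingest_host are illustrative, not part of any Segment SDK.
INGEST_HOSTS = {
    "oregon": "api.segment.io/v1",  # default region
    "dublin": "in.eu2.segmentapis.com/v1",
    "singapore": "in.ap1.segmentapis.com/v1",
    "sydney": "in.au1.segmentapis.com/v1",
}


def ingest_host(region: str = "oregon") -> str:
    """Return the ingest endpoint for a region name (case-insensitive)."""
    key = region.strip().lower()
    if key not in INGEST_HOSTS:
        raise ValueError(
            f"unknown region {region!r}; expected one of {sorted(INGEST_HOSTS)}"
        )
    return INGEST_HOSTS[key]
```

Failing loudly on an unknown region avoids silently falling back to the default Oregon endpoint, which would defeat the purpose of regional ingest.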
33+
34+
## Local Data Storage
35+
36+
Local Data Storage allows you to preserve your raw events in Amazon S3 buckets hosted regionally. These buckets are hosted by you, in your desired region.
37+
38+
> note ""
39+
> Configure Local Data Storage on new sources rather than enabling it on an existing source, to avoid deleting historical data and changing the retention policy. Historical data that has expired under a retention policy cannot be replayed later. Historical data cannot migrate from the US to your regional buckets.
40+
41+
### Pre-requisites
42+
43+
To begin with Local Data Storage, complete the following steps in your AWS account:
44+
45+
1. Create an S3 bucket in your preferred region
46+
2. Create a folder named `segment-logs` in the new bucket
47+
3. Edit the bucket policy to allow Segment access to the S3 bucket
48+
```json
49+
{
50+
"Version": "2008-10-17",
51+
"Id": "Policy1425281770533",
52+
"Statement": [
53+
{
54+
"Sid": "AllowSegmentUser",
55+
"Effect": "Allow",
56+
"Principal": {
57+
"AWS": "arn:aws:iam::107630771604:user/s3-copy"
58+
},
59+
"Action": "s3:PutObject",
60+
"Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/segment-logs/*"
61+
}
62+
]
63+
}
64+
```
65+
**Note**: `Resource` property string must end with `/*`.
66+
67+
Segment requires this access to write raw data to your regionally hosted S3 bucket. Specifically, this allows Segment (as the Segment S3-copy user) to use `s3:PutObject`. To enable encryption at rest, use the default S3 mechanism. If you have server-side encryption enabled with AWS KMS managed keys, see the additional [required configuration step](/docs/connections/storage/catalog/amazon-s3/#encryption). To edit the bucket policy, right-click the bucket name in the AWS management console, and select **Edit policy**.
68+
69+
4. Create a new IAM role in your AWS account with a trust relationship that allows the Segment role shown below to assume it, using your Segment `workspace_id` as the `externalID`.
70+
```json
71+
{
72+
"Version": "2012-10-17",
73+
"Statement": [
74+
{
75+
"Sid": "",
76+
"Effect": "Allow",
77+
"Principal": {
78+
"AWS": [
79+
"arn:aws:iam::595280932656:role/segment-regional-archives-production-access"
80+
]
81+
},
82+
"Action": "sts:AssumeRole",
83+
"Condition": {
84+
"StringEquals": {
85+
"sts:ExternalId": [
86+
"YOUR_WORKSPACE_ID"
87+
]
88+
}
89+
}
90+
}
91+
]
92+
}
93+
```
94+
5. Attach this IAM policy to the role defined in Step 4.
95+
```json
96+
{
97+
"Version": "2012-10-17",
98+
"Statement": [
99+
{
100+
"Sid": "ListObjectsInBucket",
101+
"Effect": "Allow",
102+
"Action": "s3:ListBucket",
103+
"Resource": [
104+
"arn:aws:s3:::YOUR_BUCKET_NAME"
105+
]
106+
},
107+
{
108+
"Sid": "AllObjectActions",
109+
"Effect": "Allow",
110+
"Action": "s3:*Object*",
111+
"Resource": [
112+
"arn:aws:s3:::YOUR_BUCKET_NAME/*"
113+
]
114+
}
115+
]
116+
}
117+
```
118+
This access allows Segment to run local deletion jobs on regionally hosted data for a given user ID.
119+
120+
6. If you are using KMS encryption on your S3 bucket, add the following policy to the IAM role:
121+
```json
122+
{
123+
"Version": "2012-10-17",
124+
"Statement": [
125+
{
126+
"Sid": "AllowKMS",
127+
"Effect": "Allow",
128+
"Action": [
129+
"kms:GenerateDataKey",
130+
"kms:Decrypt"
131+
],
132+
"Resource": "$YOUR_KEY_ARN"
133+
}
134+
]
135+
}
136+
```
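
The policy documents in steps 3 through 6 differ only in the bucket name, workspace ID, and KMS key ARN, so they can be generated rather than edited by hand. The following is a sketch in Python that reuses the Segment principal ARNs shown above. The function names are hypothetical, and the output is meant to be reviewed and pasted into the AWS console, not applied automatically.

```python
import json

# Segment principals copied from the policy documents above.
SEGMENT_S3_COPY_USER = "arn:aws:iam::107630771604:user/s3-copy"
SEGMENT_REGIONAL_ROLE = (
    "arn:aws:iam::595280932656:role/segment-regional-archives-production-access"
)


def bucket_policy(bucket: str) -> dict:
    """Step 3: let the Segment s3-copy user write under segment-logs/."""
    return {
        "Version": "2008-10-17",
        "Id": "Policy1425281770533",
        "Statement": [{
            "Sid": "AllowSegmentUser",
            "Effect": "Allow",
            "Principal": {"AWS": SEGMENT_S3_COPY_USER},
            "Action": "s3:PutObject",
            # The Resource string must end with "/*".
            "Resource": f"arn:aws:s3:::{bucket}/segment-logs/*",
        }],
    }


def trust_policy(workspace_id: str) -> dict:
    """Step 4: trust relationship keyed on the workspace ID as external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "",
            "Effect": "Allow",
            "Principal": {"AWS": [SEGMENT_REGIONAL_ROLE]},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": [workspace_id]}},
        }],
    }


def role_policy(bucket: str) -> dict:
    """Step 5: list the bucket and act on its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "ListObjectsInBucket", "Effect": "Allow",
             "Action": "s3:ListBucket",
             "Resource": [f"arn:aws:s3:::{bucket}"]},
            {"Sid": "AllObjectActions", "Effect": "Allow",
             "Action": "s3:*Object*",
             "Resource": [f"arn:aws:s3:::{bucket}/*"]},
        ],
    }


def kms_policy(key_arn: str) -> dict:
    """Step 6 (optional): only needed when the bucket uses KMS encryption."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowKMS", "Effect": "Allow",
            "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
            "Resource": key_arn,
        }],
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own bucket name and workspace ID.
    print(json.dumps(bucket_policy("YOUR_BUCKET_NAME"), indent=2))
    print(json.dumps(trust_policy("YOUR_WORKSPACE_ID"), indent=2))
```

Serializing with `json.dumps` also guards against hand-editing mistakes such as trailing commas, which AWS rejects as invalid JSON.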
137+
138+
### Local Data Storage configuration
139+
140+
After you configure the policy and roles, as defined above, navigate to the Regional Settings tab of the Settings page of the source for which you want to store data regionally, and find the Local Data Storage section.
141+
142+
![local storage](images/regional-data-source.png)
143+
144+
To complete the configuration of Local Data Storage:
145+
146+
1. Enter the name of the S3 Bucket you created in the Pre-requisites section
147+
2. Add a regional label to help locate the bucket from AWS at a later date. For example, `us-west2-294048959147` or `country-REGION-AWS_ACCOUNT`
148+
3. Enter the ARN of the IAM role set up as a pre-requisite. For example, `arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME`.
149+
4. Click **Save**
150+
5. Navigate to *Privacy > Settings > Data Retention*
151+
6. Identify the Source in the source-level archive retention periods list
152+
7. Set the Retention for this source to the suggested value of `7 days` and click **Save**
153+
154+
The Local Data Storage bucket will take roughly one hour to receive data.
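
Before entering the values from steps 2 and 3 in the form, it can help to sanity-check them. The sketch below, in Python, assumes the role ARN follows the `arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME` shape shown above and that the regional label follows the `us-west2-294048959147` example. The helper names `parse_role_arn` and `regional_label` are hypothetical, and the pattern is a loose check, not a full IAM validator.

```python
import re

# Loose pattern for arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME (12-digit account ID).
_ROLE_ARN = re.compile(r"^arn:aws:iam::(\d{12}):role/([A-Za-z0-9+=,.@_/-]+)$")


def parse_role_arn(arn: str) -> tuple[str, str]:
    """Split an IAM role ARN into (account_id, role_name), or raise ValueError."""
    match = _ROLE_ARN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return match.group(1), match.group(2)


def regional_label(region: str, account_id: str) -> str:
    """Build a label in the REGION-AWS_ACCOUNT style used in the example above."""
    return f"{region}-{account_id}"
```

For example, `regional_label("us-west2", parse_role_arn(arn)[0])` derives the label from the same account that owns the role, which is an assumption that holds only when the bucket and the role live in one AWS account.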
155+
156+
### Deletion from Local Data Storage
157+
158+
Local Data Storage offers the same ability to delete end-user data when they revoke or alter their consent to have data collected, as long as they can be identified by a `userId`. For example, if an end-user invokes their Right to Object or Right to Erasure under the GDPR or CCPA, you can use these features to:
159+
- block ongoing data collection for that user
160+
- delete all historical data about them from Segment's systems, connected S3 buckets and warehouses, and supported downstream partners
161+
162+
Contact [Segment Support](https://segment.com/help/contact/) to initiate a deletion and provide a valid `userId` for each user to be deleted. Deletion requests run within the region and have a 30-day SLA, but require that the raw data remains accessible to Segment in an unaltered state. Deletion from Local Data Storage is not available from the [Privacy Portal](/docs/privacy/user-deletion-and-suppression/#deletion-requests) or through the API.
163+
164+
> note "Business Plan Customers"
165+
> If you use this feature to delete data, you cannot Replay the deleted data at the same time. For standard Replay requests, you must wait for any pending deletions to complete. You cannot submit new deletion requests while Segment replays data for you.
166+
167+
### Replays from Local Data Storage
168+
169+
Local Data Storage supports [Replays](/docs/guides/what-is-replay/#replays-for-tooling-changes) to both existing and new tools. To request a Replay, [contact Segment](https://segment.com/help/contact/) with the following:
170+
- Source ID
171+
- Date Range
172+
- Destination
173+
174+
Segment will initiate Replay jobs and update status as they progress.
175+
176+
### Debugging and support
177+
178+
The Segment Customer Success and Support Engineering teams do not have access to customer-hosted regional buckets used for Local Data Storage and cannot provide event-level debugging support.