Commit 511e472

committed
GA updates
1 parent e753c98 commit 511e472

File tree

4 files changed

+129
-19
lines changed

src/_data/sidenav/main.yml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -286,6 +286,8 @@ sections:
286286
- section_title: Profiles Sync
287287
slug: profiles/profiles-sync
288288
section:
289+
- path: /profiles/profiles-sync/overview
290+
title: Profiles Sync Overview
289291
- path: /profiles/profiles-sync
290292
title: Setup
291293
- path: /profiles/profiles-sync/sample-queries

src/profiles/profiles-sync/index.md

Lines changed: 43 additions & 8 deletions
@@ -4,13 +4,6 @@ beta: true
44
plan: profiles
55
---
66

7-
> info "Profiles Sync Beta"
8-
> Profiles Sync is in beta and Segment is actively working on this feature. Segment's [First-Access and Beta terms](https://segment.com/legal/first-access-beta-preview/) govern this feature. To learn more, reach out to your CSM, AE, or SE.
9-
10-
Profiles Sync connects identity-resolved customer profiles to a data warehouse of your choice.
11-
12-
With a continual flow of synced Profiles, teams can enrich and use these data sets as the basis for new audiences and models. Profiles Sync addresses a number of use cases, with applications for machine learning, identity graph monitoring, and attribution analysis. View [Profiles Sync Sample Queries](/docs/profiles/profiles-sync/sample-queries) for an in-depth guide to Profiles Sync applications.
13-
147
On this page, you’ll learn how to set up Profiles Sync, enable historical backfill, and adjust settings for warehouses that you’ve connected to Profiles Sync.
158

169
## Initial Profiles Sync setup
@@ -42,6 +35,27 @@ The following table shows the supported Profiles Sync warehouse Destinations and
4235

4336
Once you’ve finished the required steps for your chosen warehouse, you’re ready to connect your warehouse to Segment. Because you’ll next enter credentials from the warehouse you just created, **leave the warehouse tab open to streamline setup.**
4437

38+
#### Profiles Sync permissions
39+
40+
To allow Segment to write to the warehouse you're using for Profiles Sync, you'll need to set up specific permissions.
41+
42+
For example, if you're using BigQuery, you must [create a service account](/docs/connections/storage/catalog/bigquery/#create-a-service-account-for-segment) for Segment and assign the following roles:
43+
- `BigQuery Data Owner`
44+
- `BigQuery Job User`
45+
46+
Review the required steps for each warehouse in the table above to see which permissions you'll need.
47+
48+
#### Profiles Sync roles
49+
50+
The following Segment access [roles](/docs/segment-app/iam/roles/) apply to Profiles Sync:
51+
52+
**Profiles and Engage read-only**: Read-only access to Profiles Sync, including the sync history and configuration settings. With these roles assigned, you can't download PII or edit Profiles Sync settings.
53+
54+
**Profiles read-only and Engage user**: Read-only access to Profiles Sync, including the sync history and configuration settings. With these roles assigned, you can't download PII or edit Profiles Sync settings.
55+
56+
**Profiles and Engage Admin access**: Full edit access to Profiles Sync, including the sync history and configuration settings.
57+
58+
4559
### Step 2: Connect the warehouse and enable Profiles Sync
4660

4761
With your warehouse configured, you can now connect it to Segment.
@@ -67,7 +81,28 @@ Segment staff then receives and enables live sync for your account.
6781

6882
Profiles Sync sends Profiles to your warehouse on an hourly basis, beginning after you complete setup. You can also use backfill to sync historical Profiles to your warehouse.
6983

70-
By default, Segment includes identity graph updates, external ID mapping tables, and two months of the events table in the initial warehouse sync made during setup. Reach out to Segment support if your use case exceeds the scope of the initial setup backfill.
84+
When Segment runs historical backfills:
85+
86+
- The `id_graph_updates` and `external_id_mapping_updates` tables sync all of your historical data to your warehouse.
87+
- The `identifies`, `pages`, `screens`, and `tracks` tables sync the last two months of events to your warehouse.
88+
89+
Reach out to [Segment support](https://app.segment.com/workspaces?contact=1){:target="_blank"} if your use case exceeds the scope of the initial setup backfill.
90+
91+
> warning ""
92+
> For event tables, Segment can only backfill up to 2,000 tables for each workspace.
93+
94+
> success ""
95+
> While historical backfill is running, you can start building [materialized views](/docs/profiles/profiles-sync/tables/#tables-you-materialize) and running [sample queries](/docs/profiles/profiles-sync/sample-queries).
96+
97+
### Step 3: Materialize key views using a SQL automation tool
98+
99+
To start seeing unified profiles in your warehouse and build attribution models, you'll need to materialize the tables that Profiles Sync lands into three key views:
100+
101+
* `id_graph`: the current state of relationships between Segment IDs
102+
* `external_id_mapping`: the current-state mapping between each external identifier you’ve observed and its corresponding, fully-merged `canonical_segment_id`
103+
* `profile_traits`: the last seen value for all custom traits, computed traits, SQL traits, audiences, and journeys associated with a profile in a single row
104+
105+
Visit [Tables you materialize](/docs/profiles/profiles-sync/tables/#tables-you-materialize) for more on how to materialize these views, either on your own or with [Segment's open source dbt models](https://github.com/segmentio/profiles-sync-dbt){:target="_blank"}.
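
As a rough sketch of what materializing `id_graph` involves, the following Python script uses an in-memory SQLite database as a stand-in for a warehouse. The sample rows follow the profile merge example on the Tables page. The third row (the merge of `profile_2` into `profile_1`) and the query itself are assumptions made for illustration, not Segment's dbt models.

```python
import sqlite3

# In-memory SQLite stands in for a real warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE id_graph_updates (
    segment_id TEXT,
    canonical_segment_id TEXT,
    triggering_event_type TEXT,
    triggering_event_id TEXT,
    timestamp TEXT
);
INSERT INTO id_graph_updates VALUES
    ('profile_1', 'profile_1', 'page',     'event_1', '2022-05-02 14:01:00'),
    ('profile_2', 'profile_2', 'page',     'event_3', '2022-06-22 10:47:15'),
    -- Assumed merge row: profile_2 merges into profile_1.
    ('profile_2', 'profile_1', 'identify', 'event_4', '2022-06-22 10:48:00');
""")

# Keep only the most recent entry for each segment_id.
# If two rows share the same latest timestamp, both are returned.
ID_GRAPH_SQL = """
SELECT u.segment_id, u.canonical_segment_id, u.timestamp
FROM id_graph_updates AS u
JOIN (
    SELECT segment_id, MAX(timestamp) AS latest
    FROM id_graph_updates
    GROUP BY segment_id
) AS m
  ON m.segment_id = u.segment_id AND m.latest = u.timestamp
ORDER BY u.segment_id
"""

id_graph = conn.execute(ID_GRAPH_SQL).fetchall()
for row in id_graph:
    print(row)
```

In a real warehouse, the same latest-row-per-`segment_id` logic would typically run as a scheduled view or dbt model rather than an ad hoc script.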
71106

72107
## Working with synced warehouses
73108

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
---
2+
title: Profiles Sync Overview
3+
---
4+
5+
Profiles Sync connects identity-resolved customer profiles to a data warehouse of your choice.
6+
7+
With a continual flow of synced Profiles, teams can enrich and use these data sets as the basis for new audiences and models. Profiles Sync addresses a number of use cases, with applications for identity graph monitoring, attribution analysis, machine learning, and more. View [Profiles Sync Sample Queries](/docs/profiles/profiles-sync/sample-queries) for an in-depth guide to Profiles Sync applications.
8+
9+
10+
## Profiles Sync use cases
11+
12+
To help you get started, here are a few example use cases:
13+
14+
### Understand how Segment creates Profiles
15+
16+
Use Profiles Sync for more insight into profiles generated by Segment's [Identity Resolution](/docs/profiles/identity-resolution/). Query the Profiles Sync data set to answer questions such as:
17+
- What's causing profile merges in a given period?
18+
- How many merges occur for each profile?
19+
- How many emails are associated with a profile?
20+
21+
Understanding how Segment creates profiles helps you detect potential instrumentation errors.
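
For instance, the question of how many emails are associated with a profile could be answered with a query like the one sketched below. It assumes a materialized `external_id_mapping` view with the columns described on the Tables page, and uses an in-memory SQLite database in place of a warehouse. The email addresses and `profile_3` are hypothetical sample data.

```python
import sqlite3

# In-memory SQLite stands in for a real warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE external_id_mapping (
    canonical_segment_id TEXT,
    external_id_type TEXT,
    external_id_value TEXT,
    timestamp TEXT
);
-- Hypothetical sample rows; the email values are placeholders.
INSERT INTO external_id_mapping VALUES
    ('profile_1', 'anonymous_id', '5285bc35-05ef-4d21',   '2022-05-02 14:01:00'),
    ('profile_1', 'email',        'jane@example.com',     '2022-05-02 14:01:47'),
    ('profile_1', 'anonymous_id', 'b50e18a5-1b8d-451c',   '2022-06-22 10:47:15'),
    ('profile_1', 'email',        'jane.doe@example.com', '2022-06-22 10:48:00'),
    ('profile_3', 'email',        'sam@example.com',      '2022-07-01 09:00:00');
""")

# Count distinct email identifiers per fully-merged profile.
emails_per_profile = conn.execute("""
    SELECT canonical_segment_id,
           COUNT(DISTINCT external_id_value) AS email_count
    FROM external_id_mapping
    WHERE external_id_type = 'email'
    GROUP BY canonical_segment_id
    ORDER BY canonical_segment_id
""").fetchall()

for profile, email_count in emails_per_profile:
    print(profile, email_count)
```

A profile with an unexpectedly high email count may point to the kind of instrumentation error mentioned above.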
22+
23+
### Create golden profiles
24+
25+
Join Segment's profile data with existing object data from your warehouse to create a single view of the customer. You can then use this data set to create personalized experiences on any channel. For example, B2B companies can build a report that maps sales executives (object data synced with a source like Salesforce) with customers who are most likely to buy a certain product (Profile Traits data synced with Profiles Sync).
26+
27+
### Understand a customer's journey
28+
29+
With Profiles Sync, your data teams can better understand profile merge events. Connect anonymous IDs, User IDs, and emails to understand your customer's journey with details such as:
30+
- How often a user has paid.
31+
- If someone is a bad actor.
32+
- If a user has subscribed or not.
33+
- What products users are viewing, even when they aren't logged in.
34+
35+
### Build attribution models
36+
37+
Use Profiles Sync to build data models that marketing partners can trust. Trace prospective customer journeys before buying products, and build models that help you understand which channels provide the most value.
38+
39+
40+
### Use machine learning
41+
42+
Access profile traits and see how they change over time. This helps you better understand how your customers' behavior evolves and build models that predict LTV, churn, and propensity scores.
43+
44+
45+
## Next steps
46+
47+
To learn more about Profiles Sync, visit the following docs:
48+
49+
- [Profiles Sync Setup](/docs/profiles/profiles-sync/): Learn how to set up Profiles Sync, enable historical backfill, and adjust settings for warehouses you've connected.
50+
- [Sample Queries](/docs/profiles/profiles-sync/sample-queries/): View sample queries you can run to help you familiarize yourself with Profiles Sync.
51+
- [Tables and materialized views](/docs/profiles/profiles-sync/tables/): Learn how to use data sets and models that Segment provides to enrich customer profiles.
52+
53+
> info ""
54+
> For more on Profiles Sync logic, table mappings, and data types, download this [Profiles Sync ERD](/docs/profiles/files/ERD.png).

src/profiles/profiles-sync/tables.md

Lines changed: 30 additions & 11 deletions
@@ -93,7 +93,7 @@ Using the profile merge scenario, Segment would generate three new entries to th
9393

9494
<div style="overflow-x:auto;" markdown=1>
9595

96-
| `segment_id` | `canonical_segment_id` | `triggering_event_type` | `triggering_event_id` | `timestamp` |
96+
| `segment_id` (varchar) | `canonical_segment_id` (varchar) | `triggering_event_type` (varchar) | `triggering_event_id` (varchar) | `timestamp` (datetime) |
9797
| ------------ | ---------------------- | ----------------------- | --------------------- | ------------------- |
9898
| `profile_1` | `profile_1` | `page` | `event_1` | 2022-05-02 14:01:00 |
9999
| `profile_2` | `profile_2` | `page` | `event_3` | 2022-06-22 10:47:15 |
@@ -103,6 +103,7 @@ Using the profile merge scenario, Segment would generate three new entries to th
103103

104104
In this example, the table shows `profile_2` mapping to two places: first to itself, then, later, to `profile_1` after the merge occurs.
105105

106+
106107
#### Recursive entries
107108

108109
Segment shows the complete history of every profile. If, later, `profile_1` merges into a different `profile_0`, Segment adds recursive entries to show that `profile_1` and `profile_2` both map to `profile_0`. These entries give you a comprehensive history of all profiles that ever existed.
@@ -117,7 +118,7 @@ The anonymous site visits sample used earlier would generate the following event
117118

118119
<div style="overflow-x:auto;" markdown=1>
119120

120-
| `segment_id` | `external_id_type` | `external_id_value` | `triggering_event_type` | `triggering_event_id` | `timestamp` |
121+
| `segment_id` (varchar) | `external_id_type` (varchar) | `external_id_value` (varchar) | `triggering_event_type` (varchar) | `triggering_event_id` (varchar) | `timestamp` (datetime) |
121122
| ------------ | -------------------| ------------------------| ----------------------- |-----------------------| ------------------- |
122123
| `profile_1` | `anonymous_id` | `5285bc35-05ef-4d21` | `page` | `event_1` | 2022-05-02 14:01:00 |
123124
| `profile_1` | `email` | `[email protected]` | `identify` | `event_2` | 2022-05-02 14:01:47 |
@@ -142,7 +143,7 @@ The previous result would generate two entries in the `pages` table:
142143

143144
<div style="overflow-x:auto;" markdown=1>
144145

145-
| `segment_id` | `context_url` | `anonymous_id` | `event_source_id` | `event_id` | `timestamp` |
146+
| `segment_id` (varchar) | `context_url` (array) | `anonymous_id` (varchar) | `event_source_id` (varchar) | `event_id` (varchar) | `timestamp` (datetime) |
146147
| ------------ | ---------------------- | -------------------- | ----------------- | ---------- | ------------------- |
147148
| `profile_1` | `twilio.com` | `5285bc35-05ef-4d21` | `source_1` | `event_1` | 2022-05-02 14:01:00 |
148149
| `profile_2` | `twilio.com/education` | `b50e18a5-1b8d-451c` | `source_1` | `event_3` | 2022-06-22 10:47:15 |
@@ -153,7 +154,7 @@ And two entries in the `identifies` table:
153154

154155
<div style="overflow-x:auto;" markdown=1>
155156

156-
| `segment_id` | `context_url` | `anonymous_id` | `email` | `event_source_id` | `event_id` | `timestamp` |
157+
| `segment_id` (varchar) | `context_url` (array) | `anonymous_id` (varchar) | `email` (varchar) | `event_source_id` (varchar) | `event_id` (varchar) | `timestamp` (datetime) |
157158
| ------------ | ---------------------------- | -------------------- | ---------------------- | ----------------- | ---------- | ------------------- |
158159
| `profile_1` | `twilio.com/try_twilio` | `5285bc35-05ef-4d21` | `[email protected]` | `source_1` | `event_2` | 2022-05-02 14:01:47 |
159160
| `profile_2` | `twilio.com/events/webinars` | `b50e18a5-1b8d-451c` | `[email protected]` | `source_2` | `event_4` | 2022-06-22 10:48:00 |
@@ -162,13 +163,29 @@ And two entries in the `identifies` table:
162163

163164
All these events were performed by the same person. If you use these tables to assemble your data models, though, always join them against `id_graph` to resolve each event’s `canonical_segment_id`.
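
A minimal sketch of that join, using an in-memory SQLite database as a stand-in for a warehouse, might look like the following. The rows come from the `pages` and `id_graph` examples on this page. Storing `context_url` as plain text and the exact query shape are simplifying assumptions.

```python
import sqlite3

# In-memory SQLite stands in for a real warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (
    segment_id TEXT,
    context_url TEXT,  -- stored as text here for simplicity
    anonymous_id TEXT,
    event_source_id TEXT,
    event_id TEXT,
    timestamp TEXT
);
CREATE TABLE id_graph (
    segment_id TEXT,
    canonical_segment_id TEXT,
    timestamp TEXT
);
INSERT INTO pages VALUES
    ('profile_1', 'twilio.com',           '5285bc35-05ef-4d21', 'source_1', 'event_1', '2022-05-02 14:01:00'),
    ('profile_2', 'twilio.com/education', 'b50e18a5-1b8d-451c', 'source_1', 'event_3', '2022-06-22 10:47:15');
INSERT INTO id_graph VALUES
    ('profile_1', 'profile_1', '2022-05-02 14:01:00'),
    ('profile_2', 'profile_1', '2022-06-22 10:48:00');
""")

# Resolve each event's segment_id to its current canonical_segment_id.
resolved_pages = conn.execute("""
    SELECT g.canonical_segment_id, p.event_id, p.context_url
    FROM pages AS p
    JOIN id_graph AS g ON g.segment_id = p.segment_id
    ORDER BY p.timestamp
""").fetchall()

for row in resolved_pages:
    print(row)
```

Both page events resolve to `profile_1`, even though the second event was originally recorded under `profile_2`.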
164165

166+
### Profiles Sync schema
167+
168+
Profiles Sync uses the following schema: `<profiles_space_name>.<tableName>`.
169+
170+
> note ""
171+
> The Profiles Sync schema is different from the Connections Warehouse schema: `<source_name>.<tableName>`.
172+
173+
If your Profiles space has the same name as a source connected to your Segment Warehouse destination, Segment overwrites data in the event tables.
174+
175+
> success ""
176+
> For more on Profiles Sync logic, table mappings, and data types, download this [Profiles Sync ERD](/docs/profiles/files/ERD.png) or visit [schema evolution and compatibility](/docs/connections/storage/warehouses/schema/#schema-evolution-and-compatibility).
177+
178+
{% comment %}
179+
180+
### Update your schema name
181+
182+
Follow the steps below to change your schema name:
183+
{% endcomment %}
165184

166185
## Tables you materialize
167186

168187
> info "dbt model definitions package"
169-
> To get started with your table materializations, try Segment's [open-source dbt models](https://github.com/segmentio/profiles-sync-dbt){:target="_blank"}.
170-
171-
With Profiles Traits materialized view, you can view all custom traits, computed traits, SQL traits, audiences, and journeys associated with a profile in a single row.
188+
> To get started with your table materializations, try Segment's [open-source dbt models](https://github.com/segmentio/profiles-sync-dbt){:target="_blank"}, or materialize views with your own tools.
172189
173190
Every customer profile (or `canonical_segment_id`) will be represented in each of the following tables.
174191

@@ -178,7 +195,7 @@ This table represents the current state of your identity graph, showing only whe
178195

179196
The most recent entry for each `segment_id` from `id_graph_updates` reflects this. After the four example events, `id_graph` would show the following:
180197

181-
| `segment_id` | `canonical_segment_id` | `timestamp` |
198+
| `segment_id` (varchar) | `canonical_segment_id` (varchar) | `timestamp` (datetime) |
182199
| ------------ | ---------------------- | ------------------- |
183200
| `profile_1` | `profile_1` | 2022-05-02 14:01:00 |
184201
| `profile_2` | `profile_1` | 2022-06-22 10:48:00 |
@@ -191,7 +208,7 @@ Use this table to view the full, current-state mapping between each external ide
191208

192209
In the case study example, you’d see the following:
193210

194-
| `canonical_segment_id` | `external_id_type` | `external_id_value` | `timestamp` |
211+
| `canonical_segment_id` (varchar) | `external_id_type` (varchar) | `external_id_value` (varchar) | `timestamp` (datetime) |
195212
| ---------------------- | ------------------ | ---------------------- | --------------------- |
196213
| `profile_1` | `anonymous_id` | `5285bc35-05ef-4d21` | `2022-05-02 14:01:00` |
197214
| `profile_1` | `email` | `[email protected]` | `2022-05-02 14:01:47` |
@@ -200,13 +217,15 @@ In the case study example, you’d see the following:
200217

201218
### `profile_traits` table
202219

203-
This table contains the last seen value for any of your customer profile traits that Segment processes as an Identify call.
220+
Use the `profile_traits` table for a single view of your customer. With this table, you can view all custom traits, computed traits, SQL traits, audiences, and journeys associated with a profile in a single row.
221+
222+
The `profile_traits` table contains the last seen value for any of your customer profile traits that Segment processes as an Identify call.
204223

205224
If Segment later merges away a profile, it populates the `segment_id` it merged in the `merged_to` column.
206225

207226
In the case study example, Segment only collected email. As a result, Segment would generate the following `profile_traits` table:
208227

209-
| `canonical_segment_id` | `email` | `merged_to` |
228+
| `canonical_segment_id` (varchar) | `email` (varchar) | `merged_to` (varchar) |
210229
| ---------------------- | ---------------------- | ----------- |
211230
| `profile_1` | `[email protected]` | |
212231
| `profile_2` | | `profile_1` |
