Commit 7b2d49b

setup updates [netlify-build]
1 parent d8eead3 commit 7b2d49b

1 file changed

+212
-0
lines changed

@@ -0,0 +1,212 @@
1+
---
2+
title: Profiles Sync Setup
3+
plan: unify
4+
redirect_from:
5+
- '/unify/profiles-sync/'
6+
---
7+
8+
On this page, you’ll learn how to set up Profiles Sync, enable historical backfill, and adjust settings for warehouses that you’ve connected to Profiles Sync.
9+
10+
## Initial Profiles Sync setup
11+
12+
> info "Identity Resolution setup"
13+
> To use Profiles Sync, you must first set up [Identity Resolution](/docs/unify/identity-resolution/).
14+
15+
To set up Profiles Sync, you’ll first create a warehouse, then connect the warehouse within the Segment app.
16+
17+
Before you begin, prepare for setup with these tips:
18+
19+
- To connect your warehouse to Segment, you must have read and write permissions for the warehouse Destination you choose.
20+
- During step 2, you’ll copy credentials between Segment and your warehouse Destination. To streamline setup, open your Segment workspace in one browser tab and open another with your warehouse account.
21+
- Make sure to copy any IP addresses Segment asks you to allowlist in your warehouse destination.
22+
23+
### Step 1: Select a warehouse
24+
25+
You’ll first choose the destination warehouse to which Segment will sync profiles. Profiles Sync supports the Snowflake, Redshift, BigQuery, Azure, Postgres, and Databricks warehouse Destinations. Your initial setup will depend on the warehouse you choose.
26+
27+
The following table shows the supported Profiles Sync warehouse destinations and the corresponding required steps for each. Select a warehouse, view its Segment documentation, then carry out the warehouse’s required steps before moving to step 2 of Profiles Sync setup:
28+
29+
| Warehouse Destination | Required steps |
30+
| ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
31+
| [Snowflake](/docs/connections/storage/catalog/snowflake/#getting-started) | 1. Create virtual warehouse. <br> 2. Create a database. <br> 3. Create role for Segment. <br> 4. Create user for Segment. <br> 5. Test the user and credentials. |
32+
| [Redshift](/docs/connections/storage/catalog/redshift/#getting-started) | 1. Choose an instance. <br> 2. Provision a new Redshift cluster. |
33+
| [BigQuery](/docs/connections/storage/catalog/bigquery/) | 1. Create a project and enable BigQuery. <br> 2. Create a service account for Segment. |
34+
| [Azure](/docs/connections/storage/catalog/azuresqldw/) | 1. Sign up for an Azure subscription. <br> 2. Provision a dedicated SQL pool. |
35+
| [Postgres](/docs/connections/storage/catalog/postgres/) | 1. Follow the steps in the [Postgres getting started](/docs/connections/storage/catalog/postgres/) section. |
36+
| [Databricks](/docs/unify/profiles-sync/profiles-sync-setup/databricks-profiles-sync/) | 1. Follow the steps in the [Databricks for Profiles Sync](/docs/unify/profiles-sync/profiles-sync-setup/databricks-profiles-sync/) guide. |
37+
38+
Once you’ve finished the required steps for your chosen warehouse, you’re ready to connect your warehouse to Segment. Because you’ll next enter credentials from the warehouse you just created, **leave the warehouse tab open to streamline setup.**
39+
40+
#### Profiles Sync permissions
41+
42+
To allow Segment to write to the warehouse you're using for Profiles Sync, you'll need to set up specific permissions.
43+
44+
For example, if you're using BigQuery, you must [create a service account](/docs/connections/storage/catalog/bigquery/#create-a-service-account-for-segment) for Segment and assign the following roles:
45+
- `BigQuery Data Owner`
46+
- `BigQuery Job User`
47+
48+
Review the required steps for each warehouse in the table above to see which permissions you'll need.
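
As a rough illustration of the BigQuery example above, the following sketch checks a list of granted roles against the two required ones. The role IDs are the standard Google Cloud IAM identifiers for `BigQuery Data Owner` and `BigQuery Job User`; the helper `missing_bigquery_roles` is hypothetical and isn't part of Segment or Google Cloud tooling.

```python
# Role IDs for the two roles named above. The helper below is a
# hypothetical illustration, not a Segment or Google Cloud API.
REQUIRED_BIGQUERY_ROLES = {
    "roles/bigquery.dataOwner",  # BigQuery Data Owner
    "roles/bigquery.jobUser",    # BigQuery Job User
}


def missing_bigquery_roles(granted_roles):
    """Return the required roles absent from granted_roles, sorted by name."""
    return sorted(REQUIRED_BIGQUERY_ROLES - set(granted_roles))


if __name__ == "__main__":
    # A service account that only has the Job User role so far.
    print(missing_bigquery_roles(["roles/bigquery.jobUser"]))
```

An empty result suggests the service account has both roles assigned.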
49+
50+
#### Profiles Sync roles
51+
52+
The following Segment access [roles](/docs/segment-app/iam/roles/) apply to Profiles Sync:
53+
54+
**Unify and Engage read-only**: Read-only access to Profiles Sync, including the sync history and configuration settings. With these roles assigned, you can't download PII or edit Profiles Sync settings.
55+
56+
**Unify read-only and Engage user**: Read-only access to Profiles Sync, including the sync history and configuration settings. With these roles assigned, you can't download PII or edit Profiles Sync settings.
57+
58+
**Unify and Engage Admin access**: Full edit access to Profiles Sync, including the sync history and configuration settings.
59+
60+
61+
### Step 2: Connect the warehouse and enable Profiles Sync
62+
63+
After selecting your warehouse, you can connect it to Segment.
64+
65+
During this step, you’ll copy credentials from the warehouse you just set up and enter them into the Segment app. The specific credentials you’ll enter depend on the warehouse you chose during step 1.
66+
67+
Segment may also display IP addresses you’ll need to allowlist in your warehouse. Make sure to copy the IP addresses and enter them into your warehouse account.
68+
69+
To connect your warehouse:
70+
71+
1. Configure your database.
72+
- Be sure to log in with a user who has read and write permissions so that Segment can write to your database.
73+
- Segment shows an IP address to allowlist. Copy it to your warehouse destination.
74+
2. Enter a schema name to help you identify this space in the warehouse, or use the default name provided.
75+
- The schema name can't be changed once the warehouse is connected.
76+
3. Enter your warehouse credentials, then select **Test Connection**.
77+
4. If the connection test succeeds, Segment enables the **Next** button. Select it.
78+
- If the connection test fails, verify that you’ve correctly entered the warehouse credentials, then try again.
79+
80+
81+
### Step 3: Set up Selective Sync
82+
83+
Set up Selective Sync to control the exact tables and columns that Segment will sync to your connected data warehouse.
84+
85+
> info ""
86+
> Data will be backfilled to your warehouse based on the last two months of history.
87+
88+
You can sync the following tables:
89+
90+
| Type | Tables |
91+
| ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
92+
| [Profile raw tables](/docs/unify/profiles-sync/tables/#profile-raw-tables) | - `external_id_mapping_updates` <br> - `id_graph_updates` <br> - `profile_traits_updates` |
93+
| [Profile materialized tables](/docs/unify/profiles-sync/tables/#tables-segment-materializes) | - `user_identifier` <br> - `user_traits` <br> - `profile_merges` |
94+
| [Event type tables](/docs/unify/profiles-sync/tables/#event-type-tables) | - `Identify` <br> - `Page` <br> - `Group` <br> - `Screen` <br> - `Alias` <br> - `Track` |
95+
| [Track event tables](/docs/unify/profiles-sync/tables/#track-event-tables) | To view and select individual track tables, don't sync track tables during the initial setup. Edit your sync settings after enabling Profiles Sync and waiting for the first sync to complete. |
96+
97+
98+
#### Using Selective Sync
99+
100+
Use Selective Sync to manage the data you send to your warehouses by choosing which tables and columns (also known as properties) to sync. Syncing fewer tables and properties leads to faster and more frequent syncs, faster queries, and lower disk usage.
101+
102+
You can access Selective Sync in two ways:
103+
- From the Set Selective Sync page as you connect your warehouse to Profiles Sync.
104+
- From the Profiles Sync settings (**Profiles Sync** > **Settings** > **Selective sync**).
105+
106+
You'll see a list of event type tables, event tables, and [tables Segment materializes](/docs/unify/profiles-sync/tables/#tables-segment-materializes) available to sync. Select the tables and properties that you'd like to sync, and be sure the ones you'd like to prevent from syncing aren't selected.
107+
108+
Regardless of schema size, only the first 5,000 collections and 5,000 properties per collection can be managed using your Segment space. To edit Selective Sync settings for any collection that exceeds this limit, [contact Segment support](https://app.segment.com/workspaces?contact=1){:target="_blank"}.
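
The limit above can be sketched as a simple check. This is a minimal illustration, assuming the schema is available as an ordered mapping of collection names to property lists and that "first" follows that order. The function name and its parameters are hypothetical.

```python
MAX_COLLECTIONS = 5000
MAX_PROPERTIES_PER_COLLECTION = 5000


def collections_needing_support(schema,
                                max_collections=MAX_COLLECTIONS,
                                max_properties=MAX_PROPERTIES_PER_COLLECTION):
    """Return names of collections that can't be fully managed in the app.

    schema maps each collection name to its list of properties, in order
    (an assumed representation). A collection is flagged if it falls
    beyond the first max_collections entries or has more than
    max_properties properties.
    """
    flagged = []
    for index, (name, properties) in enumerate(schema.items()):
        if index >= max_collections or len(properties) > max_properties:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Small limits are passed in only to keep the demonstration short.
    demo = {"pages": ["url"], "orders": ["id", "total", "sku"], "clicks": ["x"]}
    print(collections_needing_support(demo, max_collections=2, max_properties=2))
```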
109+
110+
> info ""
111+
> You must be a workspace owner to change Selective Sync settings.
112+
113+
#### When to use Selective Sync
114+
115+
Use Selective Sync when you'd like to prevent specific tables and properties from syncing to your warehouse. Segment stops syncing from disabled tables or properties, but will not delete any historical data from your warehouse.
116+
117+
If you choose to re-enable a table or property to sync again, only new data generated will sync to your warehouse. Segment doesn't backfill data that was omitted with Selective Sync.
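
To make the gap concrete, here is a small sketch of which rows would reach the warehouse across one disable and re-enable cycle. The function name is hypothetical, and treating the disabled period as a half-open interval is an assumption; the source doesn't specify boundary behavior.

```python
from datetime import datetime


def rows_reaching_warehouse(row_times, disabled_at, reenabled_at):
    """Return the timestamps that sync, given one disable/re-enable cycle.

    Rows generated while the table was disabled are skipped and never
    backfilled. Rows synced before the disable stay in the warehouse.
    The half-open interval [disabled_at, reenabled_at) is an assumption.
    """
    return [t for t in row_times if not (disabled_at <= t < reenabled_at)]


if __name__ == "__main__":
    times = [datetime(2024, 1, day) for day in (1, 10, 20)]
    kept = rows_reaching_warehouse(
        times, datetime(2024, 1, 5), datetime(2024, 1, 15)
    )
    print([t.day for t in kept])  # the January 10 row is never synced
```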
118+
119+
#### Using historical backfill
120+
121+
Profiles Sync sends profiles to your warehouse on an hourly basis, beginning after you complete setup. To also sync historical profiles to your warehouse, use historical backfill.
122+
123+
> info ""
124+
> You can only use historical backfill for tables that you enable with [Selective Sync](#using-selective-sync) during setup. Segment does not backfill tables that you disable with Selective Sync.
125+
126+
When Segment runs historical backfills:
127+
128+
- Segment syncs your entire history for Profile raw and Profile materialized tables to your warehouse.
129+
- Profiles Sync gathers the last two months of all events for Event type and Track event tables and syncs them to your warehouse.
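
The two backfill rules above can be sketched as a function that returns the earliest timestamp a backfill would cover. The table-type labels, the function names, and the calendar-month interpretation of "two months" are assumptions made for illustration; Segment's exact cutoff isn't stated here.

```python
import calendar
from datetime import datetime

FULL_HISTORY_TYPES = {"profile_raw", "profile_materialized"}
TWO_MONTH_TYPES = {"event_type", "track_event"}


def months_before(moment, months):
    """Shift moment back by whole calendar months, clamping the day."""
    month_index = moment.year * 12 + (moment.month - 1) - months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


def backfill_start(table_type, now):
    """Return the earliest backfilled timestamp, or None for full history."""
    if table_type in FULL_HISTORY_TYPES:
        return None
    if table_type in TWO_MONTH_TYPES:
        return months_before(now, 2)
    raise ValueError(f"unknown table type: {table_type}")


if __name__ == "__main__":
    now = datetime(2024, 4, 30)
    print(backfill_start("profile_raw", now))
    print(backfill_start("track_event", now))
```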
130+
131+
Segment lands the data in an internal staging location, then removes the backfill banner. Segment then syncs the backfill data to your warehouse.
132+
133+
Reach out to [Segment support](https://app.segment.com/workspaces?contact=1){:target="_blank"} if your use case exceeds the scope of the initial setup backfill.
134+
135+
> success ""
136+
> While historical backfill is running, you can start building [materialized views](/docs/unify/profiles-sync/tables/#tables-you-materialize) and running [sample queries](/docs/unify/profiles-sync/sample-queries).
137+
138+
139+
### Step 4 (Optional): Materialize key views using a SQL automation tool
140+
141+
During setup, you can optionally materialize views on your own, or use Segment's open source dbt models.
142+
143+
You might want to materialize your own tables if, for example, you want to transform additional data or join Segment profile data with external data before materialization.
144+
145+
> success ""
146+
> You can alternatively use [tables that Segment materializes](/docs/unify/profiles-sync/tables/#tables-segment-materializes) and syncs to your data warehouse.
147+
148+
To start seeing unified profiles in your warehouse and build attribution models, you'll need to materialize the tables that Profiles Sync lands into three key views:
149+
150+
* `id_graph`: the current state of relationships between Segment IDs
151+
* `external_id_mapping`: the current-state mapping between each external identifier you’ve observed and its corresponding, fully-merged `canonical_segment_id`
152+
* `profile_traits`: the last seen value for all custom traits, computed traits, SQL traits, audiences, and journeys associated with a profile in a single row
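
As a conceptual sketch of what the `id_graph` view represents, the following code collapses a stream of update rows into a current-state mapping. It isn't Segment's dbt model. The row fields `segment_id`, `canonical_segment_id`, and `seq` are assumed for illustration, as is the idea that merge chains may need to be followed to reach the fully merged ID.

```python
def materialize_id_graph(updates):
    """Collapse update rows into {segment_id: fully merged canonical ID}.

    updates is an iterable of dicts with the assumed keys segment_id,
    canonical_segment_id, and seq (an ordering column).
    """
    latest = {}
    # Apply rows in order so that later updates overwrite earlier ones.
    for row in sorted(updates, key=lambda r: r["seq"]):
        latest[row["segment_id"]] = row["canonical_segment_id"]

    def resolve(segment_id):
        # Follow merge chains (a -> b -> c), guarding against cycles.
        seen = set()
        while (segment_id in latest
               and latest[segment_id] != segment_id
               and segment_id not in seen):
            seen.add(segment_id)
            segment_id = latest[segment_id]
        return segment_id

    return {segment_id: resolve(segment_id) for segment_id in latest}


if __name__ == "__main__":
    rows = [
        {"segment_id": "a", "canonical_segment_id": "a", "seq": 1},
        {"segment_id": "b", "canonical_segment_id": "b", "seq": 2},
        {"segment_id": "a", "canonical_segment_id": "b", "seq": 3},
    ]
    print(materialize_id_graph(rows))
```

In a warehouse, you'd express the same idea in SQL; the Python version only shows the shape of the transformation.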
153+
154+
See [Tables you materialize](/docs/unify/profiles-sync/tables/#tables-you-materialize) for more on how to materialize these views either on your own, or with [Segment's open source dbt models](https://github.com/segmentio/profiles-sync-dbt){:target="_blank"}.
155+
156+
> warning ""
157+
> Please note that dbt models are in beta and need modifications to run efficiently on BigQuery, Synapse, and Postgres warehouses. Segment is actively working on this feature.
158+
159+
## Profiles Sync limits
160+
161+
As you use Profiles Sync, please keep the following limits in mind:
162+
163+
- For event tables, Segment can only backfill up to 2,000 tables for each workspace.
164+
- Segment can only initiate backfills after a successful sync that contains at least one row.
165+
- For every sync, the total dataset Segment can sync is limited to 20 TB.
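
The limits above could be checked ahead of time with something like the following sketch. The function and its inputs are hypothetical, and interpreting 20 TB as decimal terabytes is an assumption.

```python
MAX_BACKFILL_EVENT_TABLES = 2000
MAX_SYNC_BYTES = 20 * 10**12  # 20 TB, assuming decimal terabytes


def limit_violations(event_table_count, last_sync_rows, sync_bytes):
    """Return a list of human-readable limit problems (empty if none)."""
    problems = []
    if event_table_count > MAX_BACKFILL_EVENT_TABLES:
        problems.append("more than 2,000 event tables to backfill")
    if last_sync_rows <= 0:
        problems.append("no successful sync with at least one row yet")
    if sync_bytes > MAX_SYNC_BYTES:
        problems.append("sync dataset larger than 20 TB")
    return problems


if __name__ == "__main__":
    print(limit_violations(event_table_count=2500,
                           last_sync_rows=0,
                           sync_bytes=5 * 10**12))
```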
166+
167+
168+
## Working with synced warehouses
169+
170+
<!-- add transition line here -->
171+
172+
### Monitor Profiles Sync
173+
174+
You can view warehouse sync information in the overview section of the Profiles Sync page. Segment displays the dates and times of the last and next syncs, as well as your sync frequency.
175+
176+
In the Syncs table, you’ll find reports on individual syncs. Segment lists your most recent syncs first. The following table shows the information Segment tracks for each sync:
177+
178+
| DATA TYPE | DEFINITION |
179+
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
180+
| Sync status | - `Success`: all rows synced correctly <br> - `Partial success`: some rows synced correctly <br> - `Failed`: no rows synced correctly |
181+
| Duration | Length of sync time, in minutes |
182+
| Start time | The date and time when the sync began |
183+
| Synced rows | The number of rows synced to the warehouse |
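
The three sync statuses in the table can be read as a simple classification over row counts. The sketch below is illustrative only: the function and its inputs are hypothetical, and treating a sync with zero attempted rows as `Success` is an assumption.

```python
def sync_status(rows_attempted, rows_synced):
    """Classify a sync using the status definitions in the table above."""
    if not 0 <= rows_synced <= rows_attempted:
        raise ValueError("rows_synced must be between 0 and rows_attempted")
    if rows_synced == rows_attempted:
        # Also covers the 0-of-0 case, which is an assumption.
        return "Success"
    if rows_synced > 0:
        return "Partial success"
    return "Failed"


if __name__ == "__main__":
    for attempted, synced in [(100, 100), (100, 40), (100, 0)]:
        print(attempted, synced, sync_status(attempted, synced))
```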
184+
185+
Selecting a row from the Syncs table opens a pane that contains granular sync information. In this view, you’ll see the sync’s status, duration, and start time. Segment also displays a detailed breakdown of the total rows synced, sorting them into identity graph tables, event type tables, and event tables.
186+
187+
If the sync failed, Segment shows any available error messages in the sync report.
188+
189+
### Settings and maintenance
190+
191+
The **Settings** tab of the Profiles Sync page contains tools that can help you monitor and maintain your synced warehouse.
192+
193+
#### Disable or delete a warehouse
194+
195+
In the **Basic settings** tab, you can disable warehouse syncs or delete your connected warehouse altogether.
196+
197+
To disable syncs, toggle **Sync status** to off. Segment retains your warehouse credentials but stops further syncs. Toggle **Sync status** back on at any point to continue syncs.
198+
199+
To delete your warehouse, toggle **Sync status** to off, then select **Delete warehouse**. Segment doesn’t retain credentials for deleted warehouses; to reconnect a deleted warehouse, you must set it up as a new warehouse.
200+
201+
#### Connection settings
202+
203+
In the **Connection settings** tab, you can verify your synced warehouse’s credentials and view IP addresses you’ll need to allowlist so that Segment can successfully sync profiles.
204+
205+
If you have write access, you can verify that your warehouse is successfully connected to Segment by entering your password and then selecting **Test Connection**.
206+
207+
> info "Changing your synced warehouse"
208+
> If you’d like to change the warehouse connected to Profiles Sync, [reach out to Segment support](https://segment.com/help/contact/){:target="_blank"}.
209+
210+
#### Sync schedule
211+
212+
Segment supports hourly syncs.
