-When Segment loads data into your warehouse, each sync goes through the following steps:
-1. **Ping:** Segment servers connect to your warehouse. For Redshift warehouses, Segment also runs a query to determine how many slices a cluster has.
-2. **Scan:** Segment finds new events in AWS S3 and updated objects in Dynamo.
-3. **Download:** Segment pulls the events and objects into a staging area.
-4. **Process:** The raw Segment event and object archive files are transformed into database-specific formats. The [warehouse schema](/docs/connections/storage/warehouses/schema/) is also defined in this step.
-5. **Load:** Segment de-duplicates the transformed data and loads it into your warehouse. If you have queries set up in your warehouse, they run after the data is loaded into your warehouse. ***This is the only step that connects to your warehouse: all other steps are internal to Segment.***
+When Segment loads data into your warehouse, each sync goes through two steps:
+1. **Ping:** Segment servers connect to your warehouse. For Redshift warehouses, Segment also runs a query to determine how many slices a cluster has. Common reasons a sync might fail at this step include a blocked VPN or IP, a warehouse that isn't set to be publicly accessible, or an issue with user permissions or credentials.
+2. **Load:** Segment de-duplicates the transformed data and loads it into your warehouse. If you have queries set up in your warehouse, they run after the data is loaded into your warehouse.
src/connections/storage/warehouses/faq.md (+25 -16: 25 additions & 16 deletions)
@@ -93,34 +93,43 @@ After a source is created, you can enable or disable a warehouse sync within the
 ## Can I be notified when warehouse syncs fail?

-If you enabled activity notifications for your storage destination, you'll receive notifications in the Segment app when warehouse syncs fail.
+If you enabled activity notifications for your storage destination, you'll receive notifications in the Segment app for the fifth and 20th consecutive warehouse failures.

 To sign up for warehouse sync notifications:
 1. Open the Segment app.
 2. Go to **Settings** > **User Preferences**.
 3. In the Activity Notifications section, select **Storage Destinations**.
 4. Enable **Storage Destination Sync Failed**.
-## How is my data formatted in my warehouse?
+## How is the data formatted in my warehouse?

-Data in your warehouse is formatted into **schemas**, which involve a detailed description of database elements (tables, views, indexes, synonyms, etc.) and the relationships that exist between elements. Segment's schemas use the following template: <br/>`<source>.<collection>.<property>`, for example, `segment-engineering.tracks.userId`, where Source refers to the source or project name (segment-engineering), collection refers to the event (tracks), and the property refers to the data being collected (userId). For more information about Warehouse Schemas, see the [Warehouse Schemas](/docs/connections/storage/warehouses/schema) page.
+Data in your warehouse is formatted into **schemas**, which involve a detailed description of database elements (tables, views, indexes, synonyms, etc.)
+and the relationships that exist between elements. Segment's schemas use the following template: <br/>`<source>.<collection>.<property>`, for example,
+`segment_engineering.tracks.user_id`, where Source refers to the source or project name (segment_engineering), collection refers to the event (tracks),
+and the property refers to the data being collected (user_id).
+
+> note " "
+> All schema data is always represented in snake case.
+
+For more information about Warehouse Schemas, see the [Warehouse Schemas](/docs/connections/storage/warehouses/schema) page.

 ## If my syncs fail and get fixed, will I need to ask for a backfill?
-Yes, if your syncs fail, you will need to reach out to [Segment Support](https://segment.com/help/) to ask for a backfill. Be sure to include the following information in your request:
-- The warehouse that requires the backfill
-- What sources you need information from
-- The date range of data that requires a backfill
+If your syncs fail, you do not need to reach out to Segment Support to request a backfill. Once a successful sync takes place,
+Segment will automatically load all of the data created since the last successful sync.
+

 ## Can I change my schema names once they've been created?
-If you'd like to change the name of your schema:
+Segment stores the name of your schema. Changing the name of your schema in the **SQL Settings** page without updating the name in your data warehouse causes the schema
+to be split into two after the name is changed.

-1. Open the Segment app.
-2. Select your warehouse from the Sources tab.
-3. On the source's overview page, select "Settings."
-4. Under the "Enable Source" section, disable your warehouse and click "Save Changes."
-5. Select the "SQL Settings" tab.
-6. Update the "Schema Name" field with your intended schema name and click "Save Changes."
-7. On the source's overview page, select "Basic."
-8. Under the "Enable Source" section, disable your warehouse and click "Save Changes."
+To change the name of your schema without disruptions:
+
+1. Open the Segment app, select your warehouse from the Sources tab, and select **Settings.**
+2. Under the "Enable Source" section, disable your warehouse and click **Save Changes.**
+3. Select the "SQL Settings" tab.
+4. Update the "Schema Name" field with the new name for your schema and click **Save Changes.**
+5. Rename the schema in your Data Warehouse to match the new name in the Segment app.
+6. Open the Segment app, select your warehouse from the Sources tab, and select **Settings.** On the source's settings page, select "Basic."
+7. Under the "Enable Source" section, enable your warehouse and click **Save Changes.**
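The notification rule in the FAQ change above (alerts on the fifth and 20th consecutive warehouse failures) can be illustrated with a minimal Python sketch. It assumes, hypothetically, that sync outcomes are available as an ordered list of booleans and that a successful sync resets the failure streak. `NOTIFY_AT` and `syncs_that_notify` are illustrative names, not part of Segment's product or API.

```python
# Streak lengths that trigger a notification (taken from the FAQ text).
NOTIFY_AT = {5, 20}


def syncs_that_notify(sync_results):
    """Return the indexes of syncs that would trigger a notification.

    sync_results is an ordered list of booleans: True for a successful
    sync, False for a failed one. A success resets the failure streak
    (an assumption implied by the word "consecutive").
    """
    streak = 0
    notified = []
    for index, succeeded in enumerate(sync_results):
        if succeeded:
            streak = 0
            continue
        streak += 1
        if streak in NOTIFY_AT:
            notified.append(index)
    return notified


if __name__ == "__main__":
    # Twenty failures in a row: the 5th and 20th failures notify.
    print(syncs_that_notify([False] * 20))
```

Under these assumptions, four failures followed by a success never notify, because the streak restarts before it reaches five.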
src/connections/storage/warehouses/schema.md (+9 -6: 9 additions & 6 deletions)
@@ -3,7 +3,7 @@ title: Warehouse Schemas
 ---

 A **schema** describes the way that the data in a warehouse is organized. Schemas of warehouse data are organized into the following template:
-`<source>.<collection>.<property>`, for example `segment-engineering.tracks.userId`, where source refers to the source or project name (segment-engineering), collection refers to the event (tracks), and the property refers to the data being collected (userId).
+`<source>.<collection>.<property>`, for example `segment_engineering.tracks.user_id`, where source refers to the source or project name (segment_engineering), collection refers to the event (tracks), and the property refers to the data being collected (user_id). All schemas convert collection and property names from `CamelCase` to `snake_case`.

 > note "Warehouse column creation"
 > **Note:** Segment creates tables for each of your custom events in your warehouse, with columns for each event's custom properties. Segment does not allow unbounded `event` or `property` spaces in your data. Instead of recording events like "Ordered Product 15", use a single property of "Product Number" or similar.
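As a rough illustration of the `CamelCase` to `snake_case` conversion described in this hunk, the Python sketch below builds a `<source>.<collection>.<property>` path. The regular expressions are an assumption made for this example and are not Segment's actual conversion rules; `to_snake_case` and `column_path` are hypothetical helper names.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase, PascalCase, or hyphenated name to snake_case.

    Illustrative only: Segment's real conversion may treat edge cases
    (digits, acronyms, punctuation) differently.
    """
    # Treat hyphens and whitespace as word separators.
    name = re.sub(r"[-\s]+", "_", name)
    # Insert an underscore between a lowercase letter or digit and a capital.
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalized word.
    name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    return name.lower()


def column_path(source: str, collection: str, prop: str) -> str:
    """Join the three parts into the <source>.<collection>.<property> form."""
    return ".".join(to_snake_case(part) for part in (source, collection, prop))


if __name__ == "__main__":
    print(column_path("segment-engineering", "tracks", "userId"))
```

With the names from the old example line, this sketch yields the path shown in the new one, `segment_engineering.tracks.user_id`.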
@@ -412,15 +412,18 @@ Data types are set up in your warehouse based on the first value that comes in f
 The data types that Segment currently supports include:

-### `timestamp`
+#### `timestamp`

-### `integer`
+#### `integer`

-### `float`
+#### `float`

-### `boolean`
+#### `boolean`

-### `varchar`
+#### `varchar`
+
+> note " "
+> To change data types after they've been determined, please reach out to [Segment Support](https://segment.com/help/contact) for assistance.
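The hunk header above notes that data types are set based on the first value that comes in. The Python sketch below shows one way such "first value wins" behavior could work for the five listed types. The mapping from Python values to warehouse types, and the names `infer_column_type` and `ColumnTypes`, are assumptions for illustration rather than Segment's implementation.

```python
from datetime import datetime


def infer_column_type(value) -> str:
    """Guess a warehouse type for a single value (illustrative mapping)."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        try:
            # Assumption: ISO 8601 strings count as timestamps.
            datetime.fromisoformat(value)
            return "timestamp"
        except ValueError:
            return "varchar"
    # Anything else falls back to text in this sketch.
    return "varchar"


class ColumnTypes:
    """Remember the type chosen for each column from its first value."""

    def __init__(self):
        self._types = {}

    def observe(self, column: str, value) -> str:
        # setdefault keeps the existing type, so later values never change it.
        return self._types.setdefault(column, infer_column_type(value))


if __name__ == "__main__":
    types = ColumnTypes()
    print(types.observe("price", 10))    # first value decides the type
    print(types.observe("price", 10.5))  # later values do not change it
```

In this sketch a later float does not change a column first seen as an integer, which is the situation the note above covers by directing users to Segment Support.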
-<!--- The Warehouse Sync process prepares the raw data coming from a source and loads it into a warehouse destination. There are two phases to the sync process:
-1. **Preparation phase**: This is where Segment prepares the data coming from a source so that it's in the right format for the loading phase.
-2. **Loading phase**: This is where Segment deduplicates data and the data loads into the warehouse destination. Any sync issues that occur in this phase can be traced back to your warehouse. -->
-
 Instead of constantly streaming data to the warehouse destination, Segment loads data to the warehouse in bulk at regular intervals. Before the data loads, Segment inserts and updates events and objects, and automatically adjusts the schema to make sure the data in the warehouse is in line with the data in Segment.

 {% include content/how-a-sync-works.md %}
@@ -17,13 +13,16 @@ Warehouses sync with all data coming from your source. However, Business plan me
 Your plan determines how frequently data is synced to your warehouse.

-| Plan | Frequency |
-| -------- | ----------- |
-| Free | Once a day |
-| Team | Twice a day |
-| Business | Up to 24 times a day. Generally, these syncs are fixed to the top of the hour (:00), but these times can vary. |
+| Business*| Up to 24 times a day. Generally, these syncs are fixed to the top of the hour (:00), but these times can vary. |
+
+*If you're a Business plan member and would like to adjust your sync frequency, you can do so using the Selective Sync feature. To enable Selective Sync, please go to **Warehouse** > **Settings** > **Sync Schedule**.

-If you're a Business plan member and would like to adjust your sync frequency, you can do so using the Sync Schedule feature. To enable Sync Schedule, please go to **Warehouse** > **Settings** > **Sync Schedule**.
+> note "Why can't I sync more than 24 times per day?"
+> We do not set syncs to happen more than once per hour (i.e., 24 times per day). The warehouse product is not designed for real-time data, so more frequent syncs would not necessarily be helpful.

 ## Sync History
 You can use the Sync History page to see the status and history of data updates in your warehouse. The Sync History page is available for every source connected to each warehouse. This page helps you answer questions like, “Has the data from a specific source been updated recently?” “Did a sync completely fail, or only partially fail?” and “Why wasn’t this sync successful?”
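The plan frequencies in the table above can be turned into a small Python sketch. It assumes, for illustration only, that syncs are spaced evenly across the day; the source states only the per-day counts and that Business syncs are generally fixed to the top of the hour. `SYNCS_PER_DAY`, `min_interval`, and `next_top_of_hour` are hypothetical names.

```python
from datetime import datetime, timedelta

# Maximum syncs per day by plan, taken from the table in the diff.
SYNCS_PER_DAY = {"free": 1, "team": 2, "business": 24}


def min_interval(plan: str) -> timedelta:
    """Smallest gap between syncs if they were spread evenly over a day."""
    return timedelta(hours=24 / SYNCS_PER_DAY[plan])


def next_top_of_hour(now: datetime) -> datetime:
    """Next :00 strictly after `now` (a time already at :00 moves forward an hour)."""
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


if __name__ == "__main__":
    print(min_interval("team"))
    print(next_top_of_hour(datetime(2024, 1, 15, 10, 30)))
```

Actual sync times can vary, as the table notes, so this is a planning aid rather than a schedule guarantee.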