Commit 468baf1

Merge pull request #268 from segmentio/repo-sync
repo sync
2 parents 1a8ecb1 + aa512f3 commit 468baf1

8 files changed

+569
-135
lines changed

src/_data/sidenav/main.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -219,6 +219,8 @@ sections:
219219
title: Redshift Cluster and Redshift Connector Limitations
220220
- path: /connections/storage/warehouses/redshift-tuning
221221
title: Speeding Up Redshift Queries
222+
- path: /connections/storage/warehouses/redshift-useful-sql
223+
title: Useful SQL Queries for Redshift
222224
- path: /connections/test-connections
223225
title: Testing Connections
224226
- path: /connections/data-export-options
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
When Segment loads data into your warehouse, each sync goes through two steps:
2+
1. **Ping:** Segment servers connect to your warehouse. For Redshift warehouses, Segment also runs a query to determine how many slices a cluster has. Common reasons a sync might fail at this step include a blocked VPN or IP, a warehouse that isn't set to be publicly accessible, or an issue with user permissions or credentials.
3+
2. **Load:** Segment de-duplicates the transformed data and loads it into your warehouse. If you have queries set up in your warehouse, they run after the data is loaded into your warehouse.

src/connections/storage/warehouses/faq.md

Lines changed: 49 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -44,11 +44,9 @@ Your warehouse id appears in the URL when you look at the [warehouse destination
4444

4545
## How fresh is the data in Segment Warehouses?
4646

47-
Data is available in Warehouses within 24-48 hours. The underlying Redshift datastore has a subtle tradeoff between data freshness, robustness, and query speed. For the best experience, Segment needs to balance all three of these.
47+
Data is available in Warehouses within 24-48 hours, depending on your tier's sync frequency. For more information about sync frequency by tier, see [Sync Frequency](/docs/connections/storage/warehouses/warehouse-syncs/#sync-frequency).
4848

49-
Real-time loading of the data into Segment Warehouses would cause significant performance degradation at query time because of the way Redshift uses large batches to optimize and compress columns. To optimize for your query speed, reliability, and robustness, Segment guarantees that your data will be available in Redshift within 24 hours.
50-
51-
As Segment improves and updates the ETL processes and optimizes for SQL query performance downstream, the actual load time will vary, but Segment ensures it's always within 24 hours.
49+
Real-time loading of the data into Segment Warehouses would cause significant performance degradation at query time. To optimize for your query speed, reliability, and robustness, Segment guarantees that your data will be available in your warehouse within 24 hours. The underlying datastore has a subtle tradeoff between data freshness, robustness, and query speed. For the best experience, Segment needs to balance all three of these.
5250

5351
## What if I want to add custom data to my warehouse?
5452

@@ -58,23 +56,23 @@ The only restriction when loading your own data into your connected warehouse is
5856

5957
If you want to insert custom data into your warehouse, create new schemas that are not associated with an existing source, since these may be deleted upon a reload of the Segment data in the cluster.
6058

61-
We highly recommend scripting any sort of additions of data you might have to warehouse, so that you aren't doing one-off tasks that can be hard to recover from in the future in the case of hardware failure.
59+
Segment recommends scripting any additions of data to your warehouse, so that you aren't doing one-off tasks that can be hard to recover from in the case of hardware failure.
6260

63-
## Which IPs should I whitelist?
61+
## Which IPs should I allowlist?
6462

65-
You can whitelist Segment's custom IP `52.25.130.38/32` while authorizing Segment to write in to your Redshift or Postgres port.
63+
You can allowlist Segment's custom IP `52.25.130.38/32` while authorizing Segment to write in to your Redshift or Postgres port.
6664

6765
If you're in the EU region, use CIDR `3.251.148.96/29`.
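A minimal sketch, not part of this commit, of how you might check whether an inbound address falls inside the ranges quoted above before opening your Redshift or Postgres port. The function name `is_segment_ip` and the `eu_region` flag are hypothetical; only the two CIDR values come from the FAQ text.

```python
import ipaddress

# CIDR ranges quoted in the FAQ text above.
SEGMENT_DEFAULT_CIDR = ipaddress.ip_network("52.25.130.38/32")
SEGMENT_EU_CIDR = ipaddress.ip_network("3.251.148.96/29")


def is_segment_ip(address: str, eu_region: bool = False) -> bool:
    """Return True if `address` is inside the range for the chosen region.

    Hypothetical helper for illustration; it is not a Segment API.
    """
    network = SEGMENT_EU_CIDR if eu_region else SEGMENT_DEFAULT_CIDR
    return ipaddress.ip_address(address) in network
```

A `/32` matches exactly one address, while the EU `/29` covers eight consecutive addresses, so the EU rule needs to be entered as a range rather than as a single IP.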
6866

6967
> info ""
7068
> EU workspace regions are currently in beta. If you would like to learn more about the beta, please contact your account manager.
7169
72-
BigQuery does not require whitelisting an IP address. To learn how to set up BigQuery, check out our [set up guide](https://segment.com/docs/connections/storage/catalog/bigquery/#getting-started)
70+
BigQuery does not require allowlisting an IP address. To learn how to set up BigQuery, check out Segment's BigQuery [setup guide](/docs/connections/storage/catalog/bigquery/#getting-started).
7371

7472

7573
## Will Segment sync my historical data?
7674

77-
We will automatically load up to 2 months of your historical data when you connect a warehouse.
75+
Segment loads up to two months of your historical data when you connect a warehouse.
7876

7977
For full historical backfills you'll need to be a Segment Business plan customer. If you'd like to learn more about our Business plan and all the features that come with it, [check out our pricing page](https://segment.com/pricing).
8078

@@ -92,3 +90,45 @@ When you create a new source, the source syncs to all warehouse(s) in the worksp
9290
- **Config API**: Send a [PATCH Connected Warehouse request](https://reference.segmentapis.com/?version=latest#ec12dae0-1a3e-4bd0-bf1c-840f43537ee2) to update the settings for the warehouse(s) you want to prevent from syncing.
9391

9492
After a source is created, you can enable or disable a warehouse sync within the Warehouse Settings page.
93+
94+
## Can I be notified when warehouse syncs fail?
95+
96+
If you enabled activity notifications for your storage destination, you'll receive notifications in the Segment app for the fifth and twentieth consecutive warehouse failures.
97+
98+
To sign up for warehouse sync notifications:
99+
1. Open the Segment app.
100+
2. Go to **Settings** > **User Preferences**.
101+
3. In the Activity Notifications section, select **Storage Destinations**.
102+
4. Enable **Storage Destination Sync Failed**.
103+
104+
## How is the data formatted in my warehouse?
105+
106+
Data in your warehouse is formatted into **schemas**, which involve a detailed description of database elements (tables, views, indexes, synonyms, etc.)
107+
and the relationships that exist between elements. Segment's schemas use the following template: <br/>`<source>.<collection>.<property>`, for example,
108+
`segment_engineering.tracks.user_id`, where source refers to the source or project name (segment_engineering), collection refers to the event (tracks),
109+
and the property refers to the data being collected (user_id).
110+
111+
> note " "
112+
> All schema data is always represented in snake case.
113+
114+
For more information about Warehouse Schemas, see the [Warehouse Schemas](/docs/connections/storage/warehouses/schema) page.
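As an illustration only, and not part of this commit, the `<source>.<collection>.<property>` template described above can be split into its three parts as follows. The names `ColumnPath` and `parse_column_path` are hypothetical and do not correspond to any Segment library.

```python
from typing import NamedTuple


class ColumnPath(NamedTuple):
    """The three parts of the <source>.<collection>.<property> template."""
    source: str         # source or project name, e.g. segment_engineering
    collection: str     # event collection, e.g. tracks
    property_name: str  # data being collected, e.g. user_id


def parse_column_path(path: str) -> ColumnPath:
    """Split a dotted path into source, collection, and property.

    Hypothetical helper for illustration; raises ValueError unless the
    path has exactly three dot-separated parts.
    """
    parts = path.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <source>.<collection>.<property>, got {path!r}")
    return ColumnPath(*parts)
```

For the example in the text, `parse_column_path("segment_engineering.tracks.user_id")` yields the source `segment_engineering`, the collection `tracks`, and the property `user_id`.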
115+
116+
## If my syncs fail and get fixed, do I need to ask for a backfill?
117+
118+
If your syncs fail, you do not need to reach out to Segment Support to request a backfill. Once a successful sync takes place,
119+
Segment automatically loads all of the data generated since the last successful sync occurred.
120+
121+
122+
## Can I change my schema names once they've been created?
123+
124+
Segment stores the name of your schema in the **SQL Settings** page. Changing the name of your schema in the app without updating the name in your data warehouse causes a new schema to form, one that doesn't contain historical data.
125+
126+
To change the name of your schema without disruptions:
127+
128+
1. Open the Segment app, select your warehouse from the Sources tab, and select **Settings.**
129+
2. Under the "Enable Source" section, disable your warehouse and click **Save Changes.**
130+
3. Select the "SQL Settings" tab.
131+
4. Update the "Schema Name" field with the new name for your schema and click **Save Changes.**
132+
5. Rename the schema in your Data Warehouse to match the new name in the Segment app.
133+
6. Open the Segment app, select your warehouse from the Sources tab, and select **Settings.** On the source's settings page, select "Basic."
134+
7. Under the "Enable Source" section, enable your warehouse and click **Save Changes.**

src/connections/storage/warehouses/index.md

Lines changed: 11 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,8 @@ Relational databases are great when you know and predefine the information colle
2020

2121
Examples of data warehouses include Amazon Redshift, Google BigQuery, and Postgres.
2222

23+
{% include content/how-a-sync-works.md %}
24+
2325
<div data-headings-anchors id="warehouse-schemas"></div>
2426
> info "Looking for the Warehouse Schemas docs?"
2527
> They've moved! Check them out [here](schema/).
@@ -32,7 +34,7 @@ Examples of data warehouses include Amazon Redshift, Google BigQuery, and Postgr
3234

3335
[How do I give users permissions to my warehouse?](/docs/connections/storage/warehouses/add-warehouse-users/)
3436

35-
Check out our [Frequently Asked Questions about Warehouses](/docs/connections/storage/warehouses/faq/) and [a list of helpful queries to get you started](https://help.segment.com/hc/en-us/articles/205577035-Common-Segment-SQL-Queries).
37+
Check out the [Frequently Asked Questions about Warehouses](/docs/connections/storage/warehouses/faq/) page and [a list of helpful SQL queries to get you started with Redshift](/docs/connections/storage/warehouses/redshift-useful-sql).
3638

3739
## FAQs
3840

@@ -42,36 +44,36 @@ Check out our [Frequently Asked Questions about Warehouses](/docs/connections/st
4244

4345
[How do I give users permissions?](/docs/connections/storage/warehouses/add-warehouse-users/)
4446

45-
[What are the limitations of Redshift clusters and our warehouses connector?](/docs/connections/storage/warehouses/redshift-faq/)
47+
[What are the limitations of Redshift clusters and warehouses connectors?](/docs/connections/storage/warehouses/redshift-faq/)
4648

4749
[Where do I find my source slug?](/docs/connections/storage/warehouses/faq/#how-do-i-find-my-source-slug)
4850

4951
### Setting up a warehouse
5052

5153
[How do I create a user, grant usage on a schema and then grant the privileges that the user will need to interact with that schema?](/docs/connections/storage/warehouses/add-warehouse-users/)
5254

53-
[Which IPs should I whitelist?](/docs/connections/storage/warehouses/faq/#which-ips-should-i-whitelist)
55+
[Which IPs should I allowlist?](/docs/connections/storage/warehouses/faq/#which-ips-should-i-allowlist)
5456

5557
[Will Segment sync my historical data?](/docs/connections/storage/warehouses/faq/#will-segment-sync-my-historical-data)
5658

5759
[Can I load in my own data into my warehouse?](/docs/connections/storage/warehouses/faq/#what-if-i-want-to-add-custom-data-to-my-warehouse)
5860

59-
[Can I control what data is sent to my warehouse?](/docs/connections/storage/warehouses/faq/)
61+
[Can I control what data is sent to my warehouse?](/docs/connections/storage/warehouses/faq/#can-i-control-what-data-is-sent-to-my-warehouse)
6062

6163
### Managing a warehouse
6264

63-
[How fresh is the data in my warehouse?](/docs/connections/storage/warehouses/faq/)
65+
[How fresh is the data in my warehouse?](/docs/connections/storage/warehouses/faq/#how-fresh-is-the-data-in-segment-warehouses)
6466

65-
[Can I add, tweak, or delete some of the tables?](/docs/connections/storage/warehouses/faq/)
67+
[Can I add, tweak, or delete some of the tables?](/docs/connections/storage/warehouses/faq/#can-we-add-tweak-or-delete-some-of-the-tables)
6668

67-
[Can I transform or clean up old data to new formats or specs?](/docs/connections/storage/warehouses/faq/)
69+
[Can I transform or clean up old data to new formats or specs?](/docs/connections/storage/warehouses/faq/#can-we-transform-or-clean-up-old-data-to-new-formats-or-specs)
6870

6971
[What are common errors and how do I debug them?](/docs/connections/storage/warehouses/warehouse-errors/)
7072

71-
[How do I speed up my queries?](/docs/connections/storage/warehouses/redshift-tuning/)
73+
[How do I speed up my Redshift queries?](/docs/connections/storage/warehouses/redshift-tuning/)
7274

7375
### Analyzing with SQL
7476

7577
[How do I forecast LTV with SQL and Excel for e-commerce businesses?](/docs/guides/how-to-guides/forecast-with-sql/)
7678

77-
[How do I measure the ROI of my Marketing Campaigns?](/docs/guides/how-to-guides/measure-marketing-roi/)
79+
[How do I measure the ROI of my Marketing Campaigns?](/docs/guides/how-to-guides/measure-marketing-roi/)
