Commit 9dc38f2

markzegarelli authored
Merge pull request #1464 from segmentio/redshift-column-sizing
Redshift column sizing
2 parents 1c50740 + 530e20d commit 9dc38f2

2 files changed: +3 -17 lines changed

src/connections/storage/warehouses/redshift-faq.md

Lines changed: 3 additions & 15 deletions
@@ -25,27 +25,15 @@ Like with most data warehouses, column data types (string, integer, float, etc.)
 
 ## VARCHAR size limits
 
-All Segment-managed schemas have a default VARCHAR size of 512 in order to keep performance high. If you wish to increase the VARCHAR size, you can run the following query to create a temp column with the VARCHAR size of your choosing. The query then copies over the data from the original column, drops the original column and finally renames the temp column back to the original column. Keep in mind that this process will not backfill any truncated data. The only way to currently backfill this truncated data is to run a backfill which requires a Business Tier Segment account. NOTE: The following query will only work if you're changing the VARCHAR size of a string column. Do not use this query to change a column type (i.e. integer to float).
+All Segment-managed schemas have a default VARCHAR size of 512 in order to keep performance high. If you wish to increase the VARCHAR size, you can run the following query.
 
 ```sql
-BEGIN;
-LOCK table_name;
-ALTER TABLE table_name ADD COLUMN column_new column_type;
-UPDATE table_name SET column_new = column_name;
-ALTER TABLE table_name DROP column_name;
-ALTER TABLE table_name RENAME column_new TO column_name;
-COMMIT;
+ALTER TABLE table_name ALTER COLUMN column_name TYPE column_type;
 ```
 
 Example:
 ```sql
-BEGIN;
-LOCK segment_prod.identifies;
-ALTER TABLE segment_prod.identifies ADD COLUMN new_account_id VARCHAR(1024);
-UPDATE segment_prod.identifies SET new_account_id = account_id;
-ALTER TABLE segment_prod.identifies DROP account_id;
-ALTER TABLE table_name RENAME new_account_id TO account_id;
-COMMIT;
+ALTER TABLE segment_prod.identifies ALTER COLUMN account_id TYPE VARCHAR(1024);
 ```
 > warning ""
 > Increasing the default size can impact query performance as it needs to process more data to accommodate the increased column size. See [Amazon's Redshift Documentation](https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-smallest-column-size.html) for more details.
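
Before widening a column with the `ALTER COLUMN` statement in this change, it may help to confirm the column's current declared size and get a rough idea of how many rows already fill it. The following is a minimal sketch, not part of the commit: it reuses the `segment_prod.identifies` table and `account_id` column from the example, assumes the column still has the default size of 512, and assumes that values at the limit are the ones likely to have been truncated.

```sql
-- Sketch only: schema, table, and column names are taken from the example above.

-- 1. Look up the declared VARCHAR length of the column.
SELECT column_name, data_type, character_maximum_length
FROM svv_columns
WHERE table_schema = 'segment_prod'
  AND table_name = 'identifies'
  AND column_name = 'account_id';

-- 2. Count rows whose value already fills the column.
--    Redshift VARCHAR sizes are measured in bytes, so OCTET_LENGTH is used
--    rather than a character count. 512 is the assumed current size.
SELECT COUNT(*) AS rows_at_limit
FROM segment_prod.identifies
WHERE OCTET_LENGTH(account_id) >= 512;

-- 3. Widen the column. Redshift's ALTER TABLE documentation states that
--    columns cannot be altered inside a transaction block (BEGIN ... END),
--    so this statement is run on its own.
ALTER TABLE segment_prod.identifies ALTER COLUMN account_id TYPE VARCHAR(1024);
```

Widening the column does not restore values that were already truncated, so a non-zero `rows_at_limit` is only a hint that some existing rows may be incomplete.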

src/connections/storage/warehouses/schema.md

Lines changed: 0 additions & 2 deletions
@@ -602,8 +602,6 @@ After analyzing the data from dozens of customers we set the string column lengt
 
 We special-case compression for some known columns like event names and timestamps. The others default to LZO. We may add look-ahead sampling down the road, but from inspecting the datasets today this would be unnecessary complexity.
 
-After a column is created, Redshift doesn't allow altering. Swapping and renaming may work down the road, but this would cause thrashing and performance issues. If you would like to change the column size, see our [docs here](/docs/connections/storage/warehouses/redshift-faq/#varchar-size-limits).
-
 ## Timestamps
 
 The Segment API associates four timestamps with every call: `timestamp`, `original_timestamp`, `sent_at` and `received_at`.
