Commit 2606cad

Merge pull request #4031 from segmentio/warehouse-schema-datatype
add details about data type mismatch
2 parents 9264e68 + 64dd30d commit 2606cad

File tree

2 files changed

+17
-7
lines changed

src/connections/storage/warehouses/faq.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -74,7 +74,7 @@ Your source slug can be found in the URL when you're looking at the source desti
7474

7575
Your warehouse id appears in the URL when you look at the [warehouse destinations page](https://app.segment.com/goto-my-workspace/warehouses/). The URL structure looks like this:
7676

77-
​​`app.segment.com/[my-workspace]/warehouses/[my-warehouse-id]/overview`
77+
`app.segment.com/[my-workspace]/warehouses/[my-warehouse-id]/overview`
7878

7979

8080
## How fresh is the data in Segment Warehouses?
@@ -170,3 +170,7 @@ To change the name of your schema without disruptions:
170170
11. Select the warehouse you disabled syncs for from the list of destinations.
171171
3. On the overview page for your source, select **Settings**.
172172
4. Enable the **Sync Data** toggle and click **Save Settings**.
173+
174+
## Can I change the data type of a column in the warehouse?
175+
176+
Yes. Data types are set up in your warehouse based on the first value that comes in from a source, but you can ask the support team to update a column's data type by reaching out to [support](https://app.segment.com/workspaces?contact=1). To learn more, see the [Data Types](/docs/connections/storage/warehouses/schema/#schema-evolution-and-compatibility) section.

src/connections/storage/warehouses/schema.md

Lines changed: 12 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -401,15 +401,15 @@ ORDER BY day
401401
| 2014-07-20 | $1,595 |
402402
| 2014-07-21 | $2,350 |
403403

404+
## Schema Evolution and Compatibility
405+
404406
### New Columns
405407

406408
New event properties and traits create columns. Segment processes the incoming data in batches, based on either data size or an interval of time. If the table doesn't exist, we lock and create the table. If the table exists but new columns need to be created, we perform a diff and alter the table to append new columns.
407409

408410
When Segment processes a new batch and discovers a new column to add, we take the most recent occurrence of a column and choose its datatype.
409411

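The diff-and-append step described above can be sketched roughly as follows. This is a minimal illustration, not Segment's implementation: the helper names (`sql_type_for`, `columns_to_add`), the type labels, and the sample batch are all assumptions made for the example.

```python
def sql_type_for(value):
    """Map a Python value to an illustrative column type label (assumed names)."""
    # bool is checked first because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "varchar"


def columns_to_add(existing_columns, batch):
    """Return {column: type} for columns in the batch that the table lacks.

    existing_columns: dict of column name -> type already in the table.
    batch: list of event dicts in arrival order.
    """
    latest_values = {}
    for event in batch:
        # Later events overwrite earlier ones, so the most recent
        # occurrence of each column decides its type.
        latest_values.update(event)
    return {
        name: sql_type_for(value)
        for name, value in latest_values.items()
        if name not in existing_columns
    }


existing = {"id": "varchar", "total": "float"}
batch = [
    {"id": "a1", "coupon": 5},
    {"id": "a2", "coupon": "SPRING", "total": 9.5},
]
to_add = columns_to_add(existing, batch)
print(to_add)
```

Here only `coupon` is new, and its type comes from the last event in the batch rather than the first.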
410-
411-
### Supported Data Types
412-
Data types are set up in your warehouse based on the first value that comes in from a source. For example, if the first value that came in from a source was a string, Segment would set the data type in the warehouse to `string`.
412+
### Data Types
413413

414414
The data types that Segment currently supports include:
415415

@@ -423,10 +423,16 @@ The data types that Segment currently supports include:
423423

424424
#### `varchar`
425425

426-
> note " "
427-
> To change data types after they've been determined, please reach out to [Segment Support](https://segment.com/help/contact) for assistance.
426+
Data types are set up in your warehouse based on the first value that comes in from a source. For example, if the first value that came in from a source was a string, Segment would set the data type in the warehouse to `string`.
427+
428+
In cases where a data type is determined incorrectly, the support team can help you update the data type. As an example, if a field can include float values as well as integers, but the first value we received was an integer, we will set the data type of the field to integer, resulting in a loss of precision.
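To make the integer-versus-float example concrete, here is a small sketch of first-value inference. It is an assumption-laden illustration: `infer_column_type` and `coerce` are hypothetical helpers, and truncation via `int()` is only one way the precision loss could show up, not a description of what Segment does internally.

```python
def infer_column_type(first_value):
    """Pick a column type from the first value seen (illustrative labels)."""
    # bool is checked first because bool is a subclass of int in Python
    if isinstance(first_value, bool):
        return "boolean"
    if isinstance(first_value, int):
        return "integer"
    if isinstance(first_value, float):
        return "float"
    return "varchar"


def coerce(value, column_type):
    """Coerce a later value to the column type that was already fixed."""
    if column_type == "integer":
        return int(value)  # drops any fractional part (assumed behavior)
    if column_type == "float":
        return float(value)
    if column_type == "boolean":
        return bool(value)
    return str(value)


values = [10, 12.75, 3.5]          # the first value happens to be an integer
column_type = infer_column_type(values[0])
stored = [coerce(v, column_type) for v in values]
print(column_type, stored)
```

Because the column was fixed as `integer` from the first value, the later values `12.75` and `3.5` lose their fractional parts in this sketch.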
429+
430+
To update the data type, reach out to the Segment support team. They will update the internal schema that Segment uses to infer your warehouse schema. Once the change is made, Segment will start syncing the data with the correct data type. However, if you want to backfill the historical data, you must drop the impacted tables on your end so that Segment can recreate them and backfill those tables.
431+
432+
To request data type changes, reach out to [Segment Support](https://segment.com/help/contact) for assistance, and provide these details for each affected column in the following format:
433+
`<schema_name>.<table_name>.<column_name>.<current_datatype>.<new_datatype>`
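If you have several columns to report, a tiny helper can build the strings in the format above. This is only a convenience sketch: the function name and the sample schema, table, and column names are hypothetical.

```python
def format_type_change(schema, table, column, current_type, new_type):
    """Build '<schema_name>.<table_name>.<column_name>.<current_datatype>.<new_datatype>'."""
    parts = [schema, table, column, current_type, new_type]
    if any(not part for part in parts):
        raise ValueError("all five fields are required")
    return ".".join(parts)


# Hypothetical example values
request_line = format_type_change("prod", "order_completed", "total", "integer", "float")
print(request_line)
```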
428434

429-
## Column Sizing
435+
### Column Sizing
430436

431437
After analyzing the data from dozens of customers, we set the string column length limit at 512 characters. Longer strings are truncated. We found this was the sweet spot for good performance and ignoring non-useful data.
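The effect of the 512-character limit can be previewed locally with a sketch like the one below. It assumes the limit counts characters the way Python string slicing does (Unicode code points); the constant and function names are made up for the example.

```python
MAX_STRING_LENGTH = 512  # limit stated in the text above


def truncate_for_warehouse(value, limit=MAX_STRING_LENGTH):
    """Return the value cut to at most `limit` characters."""
    return value[:limit]


long_value = "x" * 600
truncated = truncate_for_warehouse(long_value)
print(len(truncated))
```

Strings at or under the limit pass through unchanged.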
432438
