src/connections/storage/warehouses/schema.md
10 additions & 14 deletions
@@ -401,32 +401,28 @@ ORDER BY day
401
401
| 2014-07-20 | $1,595 |
402
402
| 2014-07-21 | $2,350 |
403
403
404
+
## Schema Evolution and Compatibility
405
+
404
406
### New Columns
405
407
406
408
New event properties and traits create columns. Segment processes the incoming data in batches, based on either data size or an interval of time. If the table doesn't exist we lock and create the table. If the table exists but new columns need to be created, we perform a diff and alter the table to append new columns.
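As a rough illustration of the create-or-alter step described above, the following sketch plans the DDL for one batch. It is not Segment's implementation: the function name `plan_schema_change`, the dictionary-based schema representation, and the exact SQL strings are assumptions made for this example, and locking is omitted.

```python
def plan_schema_change(table, existing_columns, batch_columns):
    """Return the DDL statements needed before loading a batch.

    existing_columns: dict of column name -> type, or None if the table
                      does not exist yet (hypothetical representation).
    batch_columns:    dict of column name -> type seen in the batch.
    """
    if existing_columns is None:
        # Table is missing: create it with every column from the batch.
        cols = ", ".join(f"{name} {dtype}" for name, dtype in batch_columns.items())
        return [f"CREATE TABLE {table} ({cols})"]

    # Table exists: diff the batch against it and append only new columns.
    new_columns = {
        name: dtype
        for name, dtype in batch_columns.items()
        if name not in existing_columns
    }
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {dtype}"
        for name, dtype in new_columns.items()
    ]


print(plan_schema_change("pages", {"id": "varchar"}, {"id": "varchar", "plan": "varchar"}))
```

Existing columns are never altered or dropped in this sketch; only columns missing from the table produce a statement.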
407
409
408
410
When Segment processes a new batch and discovers a new column to add, it takes the most recent occurrence of that column and chooses its data type.
409
411
412
+
### Data Types
410
413
411
-
### Supported Data Types
412
-
Data types are set up in your warehouse based on the first value that comes in from a source. For example, if the first value that came in from a source was a string, Segment would set the data type in the warehouse to `string`.
413
-
414
-
The data types that Segment currently supports include:
415
-
416
-
#### `timestamp`
417
-
418
-
#### `integer`
414
+
The data types that Segment currently supports include `timestamp`, `integer`, `float`, `boolean`, and `varchar`.
419
415
420
-
#### `float`
416
+
Data types are set up in your warehouse based on the first value that comes in from a source. For example, if the first value that came in from a source was a string, Segment would set the data type in the warehouse to `string`.
421
417
422
-
#### `boolean`
418
+
In cases where a data type is determined incorrectly, the support team can help you update the data type. As an example, if a field can include float values as well as integers, but the first value we received was an integer, we will set the data type of the field to integer, resulting in a loss of precision.
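The integer-then-float case can be sketched as follows. This is a simplified model, not Segment's code: `infer_type` and `coerce` are hypothetical helpers, and modeling the precision loss as `int()` truncation is an assumption (a real warehouse might round or reject the value instead).

```python
from datetime import datetime


def infer_type(value):
    """Map a Python value to one of the type names listed above."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, datetime):
        return "timestamp"
    return "varchar"


def coerce(value, column_type):
    """Force a later value into the column type fixed by the first value."""
    if column_type == "integer":
        return int(value)  # assumption: fractional part is dropped
    if column_type == "float":
        return float(value)
    return value


values = [3, 4.75, 10.2]              # first value happens to be an integer
column_type = infer_type(values[0])   # column type is fixed from the first value
stored = [coerce(v, column_type) for v in values]
print(column_type, stored)
```

Had the first value been `4.75`, the column would have been typed `float` and all three values would have been stored without loss.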
423
419
424
-
#### `varchar`
420
+
To update the data type, the support team will update the internal schema that Segment uses to infer your warehouse schema. Segment will start syncing data with the correct data type after the change is made. However, if you want to backfill all historical data correctly, you must drop the impacted tables on your end so that Segment can recreate them with the correct data type, and then backfill those tables.
425
421
426
-
> note " "
427
-
> To change data types after they've been determined, please reach out to [Segment Support](https://segment.com/help/contact) for assistance.
422
+
To request data type changes, reach out to [Segment Support](https://segment.com/help/contact) for assistance, and provide these details for the affected columns in the following format:
After analyzing the data from dozens of customers, we set the string column length limit at 512 characters. Longer strings are truncated. We found this was the sweet spot for good performance and ignoring non-useful data.
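The truncation rule can be pictured with a short sketch. The helper name `truncate_string` is hypothetical, and counting the limit in characters rather than bytes is an assumption of this example; the actual behavior depends on the warehouse.

```python
MAX_STRING_LENGTH = 512  # string column length limit stated above


def truncate_string(value, limit=MAX_STRING_LENGTH):
    """Keep at most `limit` characters; shorter strings pass through unchanged."""
    return value[:limit]


long_value = "a" * 600
print(len(truncate_string(long_value)))
```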