src/connections/storage/warehouses/schema.md
3 additions & 3 deletions
@@ -2,7 +2,7 @@
title: Warehouse Schemas
---
-A **schema** describes the way that the data in a warehouse is organized. Schemas of warehouse data are organized into the following template:
+A **schema** describes the way that the data in a warehouse is organized. Schemas of Segment data are organized into the following template:
`<source>.<collection>.<property>`, for example `segment_engineering.tracks.user_id`, where source refers to the source or project name (segment_engineering), collection refers to the event (tracks), and the property refers to the data being collected (user_id). All schemas convert collection and property names from `CamelCase` to `snake_case`.
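
The naming template above can be illustrated with a minimal sketch. The helper names `to_snake_case` and `column_path` are hypothetical, and the regular expressions are one common way to convert `CamelCase` to `snake_case`; the exact rules the warehouse connector applies may differ.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a CamelCase or camelCase identifier to snake_case (assumed rules)."""
    # Put an underscore between a lowercase letter or digit and an uppercase letter.
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalized word.
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", s)
    return s.lower()


def column_path(source: str, collection: str, prop: str) -> str:
    """Build a <source>.<collection>.<property> path (hypothetical helper)."""
    return f"{source}.{to_snake_case(collection)}.{to_snake_case(prop)}"


print(column_path("segment_engineering", "tracks", "userId"))
```

Under these assumptions, a property sent as `userId` on a `track` call from the `segment_engineering` source maps to `segment_engineering.tracks.user_id`.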
> note "Warehouse column creation"
@@ -16,7 +16,7 @@ Segment's libraries pass nested objects and arrays into tracking calls as **prop
- The warehouse connector stringifies all **context fields** that contain a nested **array**
- The warehouse connector stringifies all **traits** that contain a nested **array**
- The warehouse connector "flattens" all **traits** that contain a nested **object**
-- The warehouse connector optionally stringifies **arrays** when they follow our [Ecommerce spec](/docs/connections/spec/ecommerce/v2/)
+- The warehouse connector optionally stringifies **arrays** when they follow the [Ecommerce spec](/docs/connections/spec/ecommerce/v2/)
- The warehouse connector "flattens" all **context fields** that contain a nested **object** (for example, context.field.nestedA.nestedB becomes a column called context_field_nestedA_nestedB)
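
The flattening and stringifying rules listed above for context fields and traits can be sketched as follows. This is only an illustration: the function name `flatten` is hypothetical, JSON encoding is assumed for stringified arrays, and keys are joined with underscores without any case conversion so that the result matches the `context_field_nestedA_nestedB` example.

```python
import json


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested objects into underscore-joined columns; stringify arrays."""
    columns = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse so each leaf becomes its own column.
            columns.update(flatten(value, name))
        elif isinstance(value, list):
            # Nested array: store it as a single string column (JSON assumed).
            columns[name] = json.dumps(value)
        else:
            columns[name] = value
    return columns


context = {"field": {"nestedA": {"nestedB": 1}}, "tags": ["a", "b"]}
print(flatten(context, "context"))
```

Here the nested object yields one column, `context_field_nestedA_nestedB`, while the `tags` array is kept as a single string column named `context_tags`.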
<table>
@@ -126,7 +126,7 @@ The table below describes the schema in Segment Warehouses:
|`<source>.groups`| A table with your `group` method calls. This table includes the `traits` you record for groups as top-level columns, for example `<source>.groups.employee_count`. |
|`<source>.accounts`|*IN BETA* A table with unique `group` method calls. Group calls are upserted into this table (updated if an existing entry exists, appended otherwise). This table holds the latest state of a group. |
|`<source>.identifies`| A table with your `identify` method calls. This table includes the `traits` you identify users by as top-level columns, for example `<source>.identifies.email`. |
-|`<source>.users`| A table with unique `identify` calls. `identify` calls are upserted on `user_id` into this table (updated if an existing entry exists, appended otherwise). This table holds the latest state of a user. The `id` column in the users table is the same as the `user_id` column in the identifies table. Also note that this table won't have an `anonymous_id` column since a user can have multiple anonymousIds. To retrieve a user's `anonymousId`, query the identifies table. *If you observe any duplicates in the users table [contact us](https://segment.com/help/contact/) (unless you are using BigQuery, where [this is expected](/docs/connections/storage/catalog/bigquery/#schema))*. |
+|`<source>.users`| A table with unique `identify` calls. `identify` calls are upserted on `user_id` into this table (updated if an existing entry exists, appended otherwise). This table holds the latest state of a user. The `id` column in the users table is the same as the `user_id` column in the identifies table. Also note that this table won't have an `anonymous_id` column since a user can have multiple anonymousIds. To retrieve a user's `anonymousId`, query the identifies table. *If you observe any duplicates in the users table [contact Segment support](https://segment.com/help/contact/) (unless you are using BigQuery, where [this is expected](/docs/connections/storage/catalog/bigquery/#schema))*. |
|`<source>.pages`| A table with your `page` method calls. This table includes the `properties` you record for pages as top-level columns, for example `<source>.pages.title`. |
|`<source>.screens`| A table with your `screen` method calls. This table includes `properties` you record for screens as top-level columns, for example `<source>.screens.title`. |
|`<source>.tracks`| A table with your `track` method calls. This table includes only the standardized properties that are common to all events: `anonymous_id`, `context_*`, `event`, `event_text`, `received_at`, `sent_at`, and `user_id`. Custom properties are not included because every event that you send to Segment has different properties. To query by custom properties, use the `<source>.<event>` tables instead. |
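
The upsert behavior described for `<source>.users` (update the row if the `user_id` already exists, append otherwise) can be sketched with an in-memory dictionary standing in for the table. The function `upsert_user` is hypothetical, and merging new traits into the existing row is an assumption; the connector's actual handling of traits that are absent from a later call is not specified here.

```python
def upsert_user(users: dict, identify_call: dict) -> None:
    """Upsert one identify call into a users table keyed by user_id (a sketch)."""
    user_id = identify_call["user_id"]
    # Append a new row if this user_id is unseen; otherwise reuse the existing row.
    row = users.setdefault(user_id, {"id": user_id})
    # Assumption: traits from the latest call overwrite earlier values.
    # anonymous_id is deliberately not copied, mirroring the users table.
    row.update(identify_call.get("traits", {}))


users = {}
upsert_user(users, {"user_id": "u1", "anonymous_id": "a1", "traits": {"email": "old@example.com"}})
upsert_user(users, {"user_id": "u1", "anonymous_id": "a2", "traits": {"email": "new@example.com"}})
print(users)
```

After both calls the table holds a single row for `u1` with the latest email, while both original calls, including their `anonymous_id` values, would remain available in the identifies table.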