Commit d4e9d25

Merge pull request #3307 from segmentio/spec-edits
Update Spec Docs Formatting
2 parents b7b3d4a + a1fe222 commit d4e9d25

9 files changed: +531 -2148 lines changed

src/connections/spec/common.md

Lines changed: 34 additions & 176 deletions
@@ -9,7 +9,7 @@ However, not all destinations accept all fields included in the Spec. Not sure w
99
{% include components/reference-button.html href="https://university.segment.com/introduction-to-segment/324252?reg=1&referrer=docs" icon="media/academy.svg" title="Segment University: The Segment Methods" description="Check out our high-level overview of these APIs in Segment University. (Must be logged in to access.)" %}
1010

1111
## Structure
12-
Every API call has the same core structure and fields. These fields describe user identity, timestamping and mechanical aides like API version.
12+
Every API call has the same core structure and fields. These fields describe user identity, timestamping, and mechanical aides like API version.
1313

1414
Here's an example of these common fields in raw JSON:
1515

@@ -121,119 +121,29 @@ Beyond this common structure, each API call adds a few specialized top-level fie
121121

122122
Context is a dictionary of extra information that provides useful context about a datapoint, for example the user's `ip` address or `locale`. You should **only use** Context fields for their intended meaning.
123123

124-
<table>
125-
<thead>
126-
<tr>
127-
<th>Field</th>
128-
<th>Type</th>
129-
<th>Description</th>
130-
</tr>
131-
</thead>
132-
<tr>
133-
<td>`active`</td>
134-
<td>Boolean</td>
135-
<td>Whether a user is active
136-
<br><br>
137-
This is usually used to flag an `.identify()` call to just update the traits but not "last seen."
138-
</td>
139-
</tr>
140-
<tr>
141-
<td>`app`</td>
142-
<td>Object</td>
143-
<td>dictionary of information about the current application, containing `name`, `version` and `build`.
144-
<br><br>
145-
This is collected automatically from the mobile libraries when possible.
146-
</td>
147-
</tr>
148-
<tr>
149-
<td>`campaign`</td>
150-
<td>Object</td>
151-
<td>Dictionary of information about the campaign that resulted in the API call, containing `name`, `source`, `medium`, `term`, `content`, and any other custom UTM parameter.
152-
<br><br>
153-
This maps directly to the common UTM campaign parameters.
154-
</td>
155-
</tr>
156-
<tr>
157-
<td>`device` </td>
158-
<td>Object</td>
159-
<td>Dictionary of information about the device, containing `id`, `advertisingId`, `manufacturer`, `model`, `name`, `type` and `version`.</td>
160-
</tr>
161-
<tr>
162-
<td>`ip`</td>
163-
<td>String</td>
164-
<td>Current user's IP address.</td>
165-
</tr>
166-
<tr>
167-
<td>`library` </td>
168-
<td>Object</td>
169-
<td>Dictionary of information about the library making the requests to the API, containing `name` and `version`.</td>
170-
</tr>
171-
<tr>
172-
<td>`locale` </td>
173-
<td>String</td>
174-
<td>Locale string for the current user, for example `en-US`.</td>
175-
</tr>
176-
<tr>
177-
<td>`location`</td>
178-
<td>Object</td>
179-
<td>Dictionary of information about the user's current location, containing `city`, `country`, `latitude`, `longitude`, `region` and `speed`.</td>
180-
</tr>
181-
<tr>
182-
<td>`network`</td>
183-
<td>Object</td>
184-
<td>Dictionary of information about the current network connection, containing `bluetooth`, `carrier`, `cellular` and `wifi`</td>
185-
</tr>
186-
<tr>
187-
<td>`os`</td>
188-
<td>Object</td>
189-
<td>Dictionary of information about the operating system, containing `name` and `version`</td>
190-
</tr>
191-
<tr>
192-
<td>`page`</td>
193-
<td>Object</td>
194-
<td>Dictionary of information about the current page in the browser, containing `path`, `referrer`, `search`, `title` and `url`. This is automatically collected by [Analytics.js](/docs/connections/sources/catalog/libraries/website/javascript/#context--traits).
195-
</td>
196-
</tr>
197-
<tr>
198-
<td>`referrer`</td>
199-
<td>Object</td>
200-
<td>Dictionary of information about the way the user was referred to the website or app, containing `type`, `name`, `url` and `link`</td>
201-
</tr>
202-
<tr>
203-
<td>`screen`</td>
204-
<td>Object</td>
205-
<td>Dictionary of information about the device's screen, containing `density`, `height` and `width`</td>
206-
</tr>
207-
<tr>
208-
<td>`timezone`</td>
209-
<td>String</td>
210-
<td>Timezones are sent as tzdata strings to add user timezone information which might be stripped from the timestamp, for example `America/New_York`
211-
</td>
212-
</tr>
213-
<tr>
214-
<td>`groupId`</td>
215-
<td>String</td>
216-
<td>Group / Account ID.
217-
<br><br>
218-
This is useful in B2B use cases where you need to attribute your non-group calls to a company or account. It is relied on by several Customer Success and CRM tools.</td>
219-
</tr>
220-
<tr>
221-
<td>`traits`</td>
222-
<td>Object</td>
223-
<td>Dictionary of `traits` of the current user
224-
<br><br>
225-
This is useful in cases where you need to `track` an event, but also associate information from a previous `identify` call. You should fill this object the same way you would fill traits in an [identify call](/docs/connections/spec/identify/#traits).</td>
226-
</tr>
227-
<tr>
228-
<td>`userAgent`</td>
229-
<td>String</td>
230-
<td>User agent of the device making the request</td>
231-
</tr>
232-
</table>
233-
234-
## Context Fields Automatically Collected
235-
236-
Below is a chart that shows you which context variables are populated automatically by the iOS, Android and analytics.js libraries.
124+
| Field | Type | Description |
125+
|-------------|---------|--------------------------|
126+
| `active` | Boolean | Whether a user is active. <br><br> This is usually used to flag an `.identify()` call to just update the traits but not "last seen." |
127+
| `app` | Object | dictionary of information about the current application, containing `name`, `version`, and `build`. <br><br> This is collected automatically from the mobile libraries when possible. |
128+
| `campaign` | Object | Dictionary of information about the campaign that resulted in the API call, containing `name`, `source`, `medium`, `term`, `content`, and any other custom UTM parameter. <br><br> This maps directly to the common UTM campaign parameters. |
129+
| `device` | Object | Dictionary of information about the device, containing `id`, `advertisingId`, `manufacturer`, `model`, `name`, `type`, and `version`. |
130+
| `ip` | String | Current user's IP address. |
131+
| `library` | Object | Dictionary of information about the library making the requests to the API, containing `name` and `version`. |
132+
| `locale` | String | Locale string for the current user, for example `en-US`. |
133+
| `location` | Object | Dictionary of information about the user's current location, containing `city`, `country`, `latitude`, `longitude`, `region`, and `speed`. |
134+
| `network` | Object | Dictionary of information about the current network connection, containing `bluetooth`, `carrier`, `cellular`, and `wifi`. |
135+
| `os` | Object | Dictionary of information about the operating system, containing `name` and `version`. |
136+
| `page` | Object | Dictionary of information about the current page in the browser, containing `path`, `referrer`, `search`, `title` and `url`. This is automatically collected by [Analytics.js](/docs/connections/sources/catalog/libraries/website/javascript/#context--traits). |
137+
| `referrer` | Object | Dictionary of information about the way the user was referred to the website or app, containing `type`, `name`, `url`, and `link`. |
138+
| `screen` | Object | Dictionary of information about the device's screen, containing `density`, `height`, and `width`. |
139+
| `timezone` | String | Timezones are sent as tzdata strings to add user timezone information which might be stripped from the timestamp, for example `America/New_York`. |
140+
| `groupId` | String | Group / Account ID. <br><br> This is useful in B2B use cases where you need to attribute your non-group calls to a company or account. It is relied on by several Customer Success and CRM tools. |
141+
| `traits` | Object | Dictionary of `traits` of the current user. <br><br> This is useful in cases where you need to `track` an event, but also associate information from a previous `identify` call. You should fill this object the same way you would fill traits in an [identify call](/docs/connections/spec/identify/#traits). |
142+
| `userAgent` | String | User agent of the device making the request. |
143+
144+
## Context fields automatically collected
145+
146+
Below is a chart that shows you which context variables are populated automatically by the iOS, Android, and analytics.js libraries.
237147

238148
Other libraries only collect `context.library`, any other context variables must be sent manually.
239149

@@ -302,73 +212,21 @@ Sending data to the rest of Segment's destinations is opt-out so if you don't sp
302212

303213
## Timestamps
304214

305-
Every API call has four timestamps, `originalTimestamp`, `timestamp`, `sentAt` and `receivedAt.` They're used for very different purposes.
215+
Every API call has four timestamps, `originalTimestamp`, `timestamp`, `sentAt`, and `receivedAt.` They're used for very different purposes.
306216

307217
**All timestamps are [ISO-8601](http://en.wikipedia.org/wiki/ISO_8601){:target="_blank"} date strings.**
308218

309219
> note ""
310220
> **NOTE:** You must use ISO-8601 date strings that include timezones when you use timestamps with [Personas](/docs/personas/). If you send custom traits without a timezone, Segment doesn't save the timestamp value.
311221
312-
### Timestamp Overview
222+
### Timestamp overview
313223

314-
<table>
315-
<tr>
316-
<td>**Timestamp**</td>
317-
<td>**Calculated**</td>
318-
<td>**Description**</td>
319-
</tr>
320-
<tr>
321-
<td>`originalTimestamp`</td>
322-
<td>
323-
Time on the client device when call was invoked
324-
<br>
325-
**OR**
326-
<br>
327-
The `timestamp` value manually passed in through server-side libraries.
328-
</td>
329-
<td>
330-
Used by Segment to calculate `timestamp`.
331-
<br><br>
332-
**Note:** `originalTimestamp` is not useful for analysis since it's not always trustworthy as it can be easily adjusted and affected by clock skew.</td>
333-
</tr>
334-
<tr>
335-
<td>`sentAt`</td>
336-
<td>
337-
Time on client device when call was sent
338-
<br>
339-
**OR**
340-
<br>
341-
`sentAt` value manually passed in.
342-
</td>
343-
<td>
344-
Used by Segment to calculate `timestamp`.
345-
<br><br>
346-
**Note:** `sentAt` is not useful for analysis since it's not always trustworthy as it can be easily adjusted and affected by clock skew.
347-
</td>
348-
</tr>
349-
<tr>
350-
<td>`receivedAt`</td>
351-
<td>Time on Segment server clock when call was received</td>
352-
<td>
353-
Used by Segment to calculate `timestamp`, and used as sort key in Warehouses.
354-
<br><br>
355-
**Note:** For max query speed, `receivedAt` is the recommended timestamp for analysis when chronology does not matter as chronology is not ensured.
356-
</td>
357-
</tr>
358-
<tr>
359-
<td>`timestamp`</td>
360-
<td>
361-
Calculated by Segment to correct client-device clock skew using the following formula:
362-
<br>
363-
`receivedAt` - (`sentAt` - `originalTimestamp`)
364-
</td>
365-
<td>
366-
Used by Segment to send to downstream destinations, and used for historical replays.
367-
<br><br>
368-
**Note:** Recommended timestamp for analysis when chronology does matter.
369-
</td>
370-
</tr>
371-
</table>
224+
| Timestamp | Calculated | Description |
225+
|----------------|-----------------|-----------------|
226+
| `originalTimestamp` | Time on the client device when call was invoked <br> **OR** <br> The `timestamp` value manually passed in through server-side libraries. | Used by Segment to calculate `timestamp`. <br><br> **Note:** `originalTimestamp` is not useful for analysis since it's not always trustworthy as it can be easily adjusted and affected by clock skew. |
227+
| `sentAt` | Time on client device when call was sent. <br> **OR** <br> `sentAt` value manually passed in. | Used by Segment to calculate `timestamp`. <br><br> **Note:** `sentAt` is not useful for analysis since it's not always trustworthy as it can be easily adjusted and affected by clock skew. |
228+
| `receivedAt` | Time on Segment server clock when call was received | Used by Segment to calculate `timestamp`, and used as sort key in Warehouses. <br><br> **Note:** For max query speed, `receivedAt` is the recommended timestamp for analysis when chronology does not matter as chronology is not ensured. |
229+
| `timestamp` | Calculated by Segment to correct client-device clock skew using the following formula:<br> `receivedAt` - (`sentAt` - `originalTimestamp`) | Used by Segment to send to downstream destinations, and used for historical replays. <br><br>**Note:** Recommended timestamp for analysis when chronology does matter. |
372230

373231

374232
### originalTimestamp
@@ -389,7 +247,7 @@ The `sentAt` timestamp specifies the clock time for the client's device when the
389247

390248
The `receivedAt` timestamp is added to incoming messages as soon as they hit the API. It's used in combination with `sentAt` to correct clock skew, and also to aid with debugging libraries and systems that deliver events in batches.
391249

392-
The `receivedAt` timestamp is most important as the sort key in Segment's Warehouses product. Use this for max query speed when retrieving data from your Warehouse!
250+
The `receivedAt` timestamp is most important as the sort key in Segment's Warehouses product. Use this for max query speed when retrieving data from your Warehouse.
393251

394252
**Note:** Chronological order of events is not ensured with `receivedAt`.
395253

@@ -399,4 +257,4 @@ The `timestamp` timestamp specifies when the data point occurred, corrected for
399257

400258
If you are using the Segment server Source libraries, or passing calls directly to the HTTP API endpoint, you can manually set the `timestamp` field. This change updates the `originalTimestamp` field of the Segment event. If you use a Segment Source in device mode, the library generates `timestamp` and you cannot manually set one directly in the call payload.
401259

402-
Segment calculates `timestamp` as `timestamp = receivedAt - (sentAt - originalTimeStamp)`.
260+
Segment calculates `timestamp` as `timestamp = receivedAt - (sentAt - originalTimeStamp)`.
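
The clock-skew formula in the last changed line can be illustrated with a minimal Python sketch. It is not Segment's implementation. The helper names `parse_iso8601` and `corrected_timestamp` and the sample values are hypothetical. It assumes all three inputs are ISO-8601 strings that include a timezone, as the Timestamps section requires.

```python
from datetime import datetime, timezone


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO-8601 string and insist on a timezone (hypothetical helper)."""
    # Older Python versions reject a trailing "Z", so rewrite it as an explicit offset.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no timezone: {value!r}")
    return parsed


def corrected_timestamp(original_timestamp: str, sent_at: str, received_at: str) -> datetime:
    """Apply timestamp = receivedAt - (sentAt - originalTimestamp)."""
    # Time the event waited on the client, measured entirely on the client clock,
    # so any constant client clock skew cancels out of the difference.
    client_delay = parse_iso8601(sent_at) - parse_iso8601(original_timestamp)
    # Subtract that delay from the server-side receive time.
    return parse_iso8601(received_at) - client_delay


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    ts = corrected_timestamp(
        original_timestamp="2015-12-12T19:11:01.152Z",
        sent_at="2015-12-12T19:11:01.169Z",
        received_at="2015-12-12T19:11:01.249Z",
    )
    print(ts.isoformat())
```

Because only the difference between `sentAt` and `originalTimestamp` is used, a client clock that is wrong by a constant amount does not affect the result. The result is anchored to the server's `receivedAt` clock.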
