|
| 1 | +--- |
| 2 | +title: Working with Identifiers |
| 3 | +hidden: true |
| 4 | +--- |
| 5 | + |
| 6 | +> warning "Critical Segment recommendation" |
| 7 | +> Segment recommends that you use `uuidv4` for `anonymousId`. |
| 8 | + |
| 9 | +As part of your Segment implementation, you’ll come across various identifiers (IDs) that Segment’s systems may process. The three most prominent identifiers you’ll encounter are `anonymousId`, `userId`, and `groupId`. |
| 10 | + |
| 11 | +This guide explains the most common Segment IDs, why Segment recommends formats like `uuidv4`, and other ID mechanics. |
| 12 | + |
| 13 | +## Understanding the standard identifiers |
| 14 | + |
| 15 | +This section explains the purpose of the three primary IDs and introduces the other two categories that may come into play as you expand your CDP implementation. |
| 16 | + |
| 17 | +### Purpose |
| 18 | + |
| 19 | +A critical function of the Segment CDP is identifying the user over time. To do this, Segment’s default implementations use two identifiers, `anonymousId` and `userId`. |
| 20 | + |
| 21 | +The following table describes the purposes of these two IDs, as well as `groupId`: |
| 22 | + |
| 23 | +| Identifier | Purpose | |
| 24 | +| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | |
| 25 | +| `anonymousId` | `anonymousId` tracks user activity in CDPs and beyond. It lets you attach an identifier to an anonymous user and helps you ensure that all data is captured before Segment identifies the user through a `userId`. | |
| 26 | +| `userId` | `userId` comes into play once Segment has identified a user, which usually occurs through a form of authentication, like a login. | |
| 27 | +| `groupId` | `groupId` lets you capture B2B relationships between individual users and groups they may represent, serving as an identifier for these groups. | |
| 28 | + |
| 29 | +### Identifier generation |
| 30 | + |
| 31 | +Here's how Segment generates the IDs you just learned about: |
| 32 | + |
| 33 | +#### `anonymousId` generation |
| 34 | + |
| 35 | +`anonymousId` generation depends on which of the two types of libraries (or SDKs) that CDPs offer you're using. Client-side libraries, like web and mobile, automatically generate a [universally unique identifier](https://en.wikipedia.org/wiki/Universally_unique_identifier){:target="_blank"} (UUID), whereas server-side libraries, like .NET, Node.js, and Java, require you to generate these IDs yourself. You also have the option to set the `anonymousId` manually in client-side libraries and SDKs. |
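
As a minimal sketch of the server-side case, the following Python snippet generates a version 4 UUID with the standard library and attaches it to an event. The `new_anonymous_id` and `build_track_payload` helpers are hypothetical names used for illustration, not part of any Segment library, and the payload shape is an assumption.

```python
import uuid


def new_anonymous_id() -> str:
    """Return a random (version 4) UUID as a lowercase, hyphenated string."""
    return str(uuid.uuid4())


def build_track_payload(event: str, anonymous_id: str) -> dict:
    """Build an illustrative event payload (hypothetical shape) carrying the ID."""
    return {"type": "track", "event": event, "anonymousId": anonymous_id}


# Generate the ID once per anonymous visitor, then reuse it for that visitor's events.
anonymous_id = new_anonymous_id()
payload = build_track_payload("Page Viewed", anonymous_id)
```

Because a server-side library won't remember this value for you, you would need to keep it somewhere (for example, in a session) and pass it with each later event for the same visitor.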
| 36 | + |
| 37 | +#### `userId` generation |
| 38 | + |
| 39 | +`userId` is a canonical identifier that you generate on your side, no matter what library or SDK you're using. Because `userId` is woven into your service or product delivery, it has the highest fidelity. |
| 40 | + |
| 41 | +#### `groupId` generation |
| 42 | + |
| 43 | +`groupId` generation is identical to `userId` generation. You generate `groupId` and maintain it off-platform in your customer database. |
| 44 | + |
| 45 | +## Segment's guidance on identifier formats |
| 46 | + |
| 47 | +As you work with identifiers, **Segment recommends that you use `uuidv4` for `anonymousId`**. The following table lists the criteria that Segment recommends your identifiers satisfy, as well as why Segment recommends `uuidv4`: |
| 48 | + |
| 49 | +| Trait | Reasoning | |
| 50 | +| ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | |
| 51 | +| Global uniqueness | `uuidv4` generates statistically unique identifiers without needing a central authority or coordination between systems. This is ideal for distributed systems. | |
| 52 | +| Non-sequential | Unlike incremental integer IDs, `uuidv4` generates non-sequential IDs. This offers a security advantage, as it makes it harder for malicious users to guess other valid IDs. | |
| 53 | +| No information leakage | `uuidv4` doesn't reveal information about the data it's associated with, unlike other ID generation strategies that may encode information about the data, creation time, or, even worse, personal data on the individual it identifies. | |
| 54 | +| Standardized           | UUIDs are standardized, which means they are widely recognized and supported across various platforms and languages.                                                                                                                    | |
| 55 | +| No collision           | The likelihood of collision, or the generation of two identical UUIDs, is infinitesimally small, even after generating billions of UUIDs.                                                                                               | |
| 56 | +| Easy generation        | You can generate `uuidv4` values easily, and implementations exist for virtually all programming languages.                                                                                                                             | |
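
To put a rough number on the collision row, the sketch below applies the standard birthday-bound approximation. It relies on the fact that a version 4 UUID carries 122 random bits (6 of its 128 bits are fixed version and variant bits). The function name is hypothetical, and the result is an estimate that assumes a well-behaved random source.

```python
import math

# A version 4 UUID has 128 bits; 6 are fixed (version and variant), leaving 122 random bits.
RANDOM_BITS = 122


def collision_probability(n_ids: int) -> float:
    """Approximate the chance of at least one collision among n_ids random UUIDs.

    Uses the birthday-bound approximation p = 1 - exp(-n(n-1) / (2 * 2**122)).
    """
    space = 2 ** RANDOM_BITS
    # expm1 keeps precision for the extremely small exponents involved here.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


p_one_billion = collision_probability(10**9)
```

Under these assumptions, one billion generated IDs give a collision probability on the order of 10^-19.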
| 57 | + |
| 58 | +### Persistence and resetting |
| 59 | + |
| 60 | +This section explains the persistence of client-side and server-side identifiers. |
| 61 | + |
| 62 | +#### Client-side persistence |
| 63 | + |
| 64 | +Most client-side libraries and SDKs store the identifiers they use in some form of local storage, like cookies and `localStorage` on the web or in-memory databases on mobile devices. |
| 65 | + |
| 66 | +This simplifies persistence and, in most cases, allows libraries and SDKs to fetch IDs automatically from storage, so that you don't have to send every ID explicitly. Because the person using a device may change, though, CDPs let you reset these IDs. For Segment, the corresponding method is `analytics.reset()`. |
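
The persist-and-reset behavior can be modeled with a small toy class. This is a conceptual sketch only: `IdentityStore` and its methods are hypothetical, it keeps state in a Python object rather than in cookies or `localStorage`, and it assumes that a reset clears the known user and issues a fresh anonymous ID.

```python
import uuid
from typing import Optional


class IdentityStore:
    """Toy stand-in for the identifier storage a client-side library keeps."""

    def __init__(self) -> None:
        self.anonymous_id: str = str(uuid.uuid4())
        self.user_id: Optional[str] = None

    def identify(self, user_id: str) -> None:
        """Remember the known user, for example after a login."""
        self.user_id = user_id

    def reset(self) -> None:
        """Forget the known user and start over with a new anonymous ID."""
        self.user_id = None
        self.anonymous_id = str(uuid.uuid4())

    def enrich(self, event: dict) -> dict:
        """Attach the stored IDs so callers don't pass them with every event."""
        return {**event, "anonymousId": self.anonymous_id, "userId": self.user_id}


store = IdentityStore()
store.identify("user-123")
before_reset = store.enrich({"event": "Order Completed"})
store.reset()
after_reset = store.enrich({"event": "Page Viewed"})
```

After the reset, events carry a different `anonymousId` and no `userId`, so activity from the next person on the same device isn't attributed to the previous user.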
| 67 | + |
| 68 | +#### Server-side persistence |
| 69 | + |
| 70 | +Servers don't have this kind of built-in storage readily available. Because of this, you'd need to deploy ID persistence as a custom component of your infrastructure. |
| 71 | + |
| 72 | +Segment finds that this is rarely necessary, however, as most servers only process data on known users instead of anonymous users. As a result, servers will already have access to a `userId`. Because there is no ID persistence in requests to your CDP, you won't need to worry about resetting. |
| 73 | + |
| 74 | +## Going beyond the default |
| 75 | + |
| 76 | +While this guide focuses on `anonymousId`, `userId`, and `groupId`, other identifiers also exist, like IDFA and system IDs. Such identifiers vary in their origin, importance, and persistence. Often, these identifiers are system-generated and, as a result, don't require conscious design decisions as you implement your CDP. |
| 77 | + |
| 78 | +Segment recommends applying the formatting criteria discussed on this page to, at a minimum, `anonymousId`, `userId`, and `groupId`. Segment also recommends that you use these criteria for other identifiers you may work with, even beyond Segment's standard IDs. |
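
One way to check an identifier against the `uuidv4` format before sending it is sketched below with Python's standard `uuid` module. The `is_uuid_v4` helper is a hypothetical name, and the check only covers format (version and variant bits), not uniqueness or how the value was generated.

```python
import uuid


def is_uuid_v4(value: str) -> bool:
    """Return True if value parses as an RFC 4122 UUID whose version field is 4."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        # Not a string, or not a well-formed UUID.
        return False
    return parsed.version == 4 and parsed.variant == uuid.RFC_4122


candidate_ok = is_uuid_v4(str(uuid.uuid4()))
candidate_bad = is_uuid_v4("user-42")
```

Note that `uuid.UUID` also accepts a few alternative spellings, such as a value without hyphens or with a `urn:uuid:` prefix, so add a stricter string check if you require the canonical hyphenated form.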