
Commit 162dd05

edits
1 parent 9879353 commit 162dd05

File tree

2 files changed (+186, −3 lines)

docs/guides/integration-databricks.md

Lines changed: 1 addition & 1 deletion

@@ -145,7 +145,7 @@ The following table provides information about the structure of the output data,
 | :--- | :--- | :--- |
 | `UID` | string | The value is one of the following:<ul><li>**DII was successfully mapped**: The UID2 associated with the DII.</li><li>**Otherwise**: `NULL`.</li></ul> |
 | `PREV_UID` | string | The value is one of the following:<ul><li>**DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.</li><li>**Otherwise**: `NULL`.</li></ul> |
-| `REFRESH_FROM` | timestamp | The value is one of the following:<ul><li>**DII was successfully mapped**: The timestamp (in epoch seconds) indicating when this UID2 should be refreshed.</li><li>**Otherwise**: `NULL`.</li></ul> |
+| `REFRESH_FROM` | timestamp | The value is one of the following:<ul><li>**DII was successfully mapped**: The timestamp indicating when this UID2 should be refreshed.</li><li>**Otherwise**: `NULL`.</li></ul> |
 | `UNMAPPED` | string | The value is one of the following:<ul><li>**DII was successfully mapped**: `NULL`.</li><li>**Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.<br/>For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).</li></ul> |

 #### Values for the UNMAPPED Column

i18n/ja/docusaurus-plugin-content-docs/current/guides/integration-databricks.md

Lines changed: 185 additions & 2 deletions
@@ -10,6 +10,189 @@ displayed_sidebar: docs

import Link from '@docusaurus/Link';

-# UID2 Databricks Clean Rooms Integration Guide
+# Databricks Clean Rooms Integration Guide

-**COPY OF DATABRICKS DOC WILL GO HERE WHEN IT'S FINALIZED.**
+This guide is for advertisers and data providers who want to convert their user data to raw UID2s in a Databricks environment.
## Integration Overview

[Databricks Clean Rooms](https://docs.databricks.com/aws/en/clean-rooms/) is a secure Databricks collaboration environment, where you as a partner can store your data and integrate with the UID2 framework. Using Databricks Clean Rooms, UID2 enables you to securely share consumer identifier data without exposing sensitive <Link href="../ref-info/glossary-uid#gl-dii">directly identifying information (DII)</Link>.

In the context of UID2, you set up the Databricks Clean Rooms environment and place your data there. You then set up a trust relationship with the UID2 Operator and allow the Operator to convert your data to raw UID2s.

With UID2 supported in the clean room, advertisers and data partners can securely process their first-party data within Databricks.

[**GWH__EE01 is it only first-party data, or just data? If they're just sending phone numbers and emails, I don't see what the difference is... it's just data?**]

[**GWH__EE02 Please provide any additional content you want in the overview. Thx.**]

<!--
## Databricks Partner Network Listing

[**GWH__EE or MC for listing update when available. https://www.databricks.com/company/partners/technology?**]
-->
## Functionality

The following table summarizes the functionality available with the UID2 Databricks integration.

| Encrypt Raw UID2 to UID2 Token for Sharing | Decrypt UID2 Token to Raw UID2 | Generate UID2 Token from DII | Refresh UID2 Token | Map DII to Raw UID2s |
| :--- | :--- | :--- | :--- | :--- |
| &#8212; | &#8212; | &#8212; | &#8212; | &#9989; |

## Key Benefits

Here are some key benefits of integrating with Databricks for your UID2 processing:

- Native support for managing UID2 workflows within a Databricks data clean room.
- Secure identity interoperability between partner datasets.
- Direct lineage and observability for all UID2-related transformations and joins, for auditing and traceability.
- Streamlined integration between UID2 identifiers and The Trade Desk activation ecosystem.
- Self-service support for marketers and advertisers through Databricks.
## Integration Steps

At a high level, the following are the steps to set up your Databricks integration and process your data:

1. [Create a clean room for UID2 collaboration](#create-clean-room-for-uid2-collaboration).
1. [Send your Databricks sharing identifier to your UID2 contact](#send-sharing-identifier-to-uid2-contact).
1. [Add data to the clean room](#add-data-to-the-clean-room).
1. [Map DII](#map-dii) by running the clean room notebook.
### Create Clean Room for UID2 Collaboration

As a starting point, create a Databricks Clean Rooms environment&#8212;a secure environment for you to collaborate with UID2 to process your data.

Follow the steps in [Create clean rooms](https://docs.databricks.com/aws/en/clean-rooms/create-clean-room) in the Databricks documentation. Use the correct sharing identifier for the [UID2 environment](../getting-started/gs-environments) you want to connect to: see [UID2 Sharing Identifiers](#uid2-sharing-identifiers).

:::important
After you've created a clean room, you cannot change its collaborators. If you have the option to set clean room collaborator aliases&#8212;for example, if you're using the Databricks Python SDK to create the clean room&#8212;your collaborator alias must be `creator` and the UID2 collaborator alias must be `collaborator`. If you're creating the clean room using the Databricks web UI, the correct collaborator aliases are set for you.
:::
### Send Sharing Identifier to UID2 Contact

To establish a relationship with your UID2 contact, you'll need to send your Databricks sharing identifier.

The sharing identifier is a string in this format: `<cloud>:<region>:<uuid>`.

Follow these steps:

1. Find the sharing identifier for the Unity Catalog metastore that is attached to the Databricks workspace where you'll work with the clean room.

   For information on how to find this value, see [Finding a Sharing Identifier](#finding-a-sharing-identifier).
1. Send the sharing identifier to your UID2 contact.
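If you want a quick local sanity check on the identifier before sending it, a minimal sketch like the following can verify the documented `<cloud>:<region>:<uuid>` shape. This is illustrative only: the function name and the accepted cloud list are assumptions, and any segments after the UUID (the example identifiers later in this guide include extra trailing segments) are accepted as-is.

```python
import re
import uuid

def check_sharing_identifier(identifier: str) -> bool:
    """Sanity-check the <cloud>:<region>:<uuid> shape of a sharing identifier.

    Illustrative sketch only; the accepted cloud list is an assumption,
    and segments after the UUID are accepted without inspection.
    """
    parts = identifier.split(":")
    if len(parts) < 3:
        return False
    cloud, region, metastore_uuid = parts[0], parts[1], parts[2]
    if cloud not in {"aws", "azure", "gcp"}:
        return False
    if not re.fullmatch(r"[a-z0-9-]+", region):
        return False
    try:
        uuid.UUID(metastore_uuid)  # third segment must parse as a UUID
    except ValueError:
        return False
    return True
```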
### Add Data to the Clean Room

Add one or more tables or views to the clean room. You can use any names for the schema, tables, and views. Tables and views must follow the schema detailed in [Input Table](#input-table).

### Map DII

Run the `identity_map_v3` Databricks Clean Rooms [notebook](https://docs.databricks.com/aws/en/notebooks/) to map email addresses, phone numbers, or their respective hashes to raw UID2s.

A successful notebook run results in raw UID2s populated in the output table. For details, see [Output Table](#output-table).
## Running the Clean Rooms Notebook

This section provides details to help you use your Databricks Clean Rooms environment to process your DII into raw UID2s, including the following:

- [Notebook Parameters](#notebook-parameters)
- [Input Table](#input-table)
- [DII Format and Normalization](#dii-format-and-normalization)
- [Output Table](#output-table)
- [Output Table Schema](#output-table-schema)
### Notebook Parameters

You can use the `identity_map_v3` notebook to map DII in any table or view that you've added to the `creator` catalog of the clean room.

The notebook has two parameters, `input_schema` and `input_table`. Together, these two parameters identify the table or view in the clean room that contains the DII to be mapped.

For example, to map DII in the clean room table named `creator.default.emails`, set `input_schema` to `default` and `input_table` to `emails`.

| Parameter Name | Description |
| :--- | :--- |
| `input_schema` | The schema containing the table or view. |
| `input_table` | The name of the table or view containing the DII to be mapped. |
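As a concrete illustration of how the two parameters and the `creator` catalog identify the asset, here's a minimal sketch (the helper function is hypothetical, not part of the notebook):

```python
def qualified_input_name(input_schema: str, input_table: str, catalog: str = "creator") -> str:
    """Hypothetical helper: combine the clean room catalog (the "creator"
    collaborator alias) with the two notebook parameters to form the
    fully qualified name of the input table or view."""
    return f"{catalog}.{input_schema}.{input_table}"

# The example from this section: input_schema="default", input_table="emails"
full_name = qualified_input_name("default", "emails")  # "creator.default.emails"
```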
### Input Table

The input table or view must have the two columns shown in the following table. The table or view can have additional columns, but the notebook uses only these two.

| Column Name | Data Type | Description |
| :--- | :--- | :--- |
| `INPUT` | string | The DII to map. |
| `INPUT_TYPE` | string | The type of DII to map. Allowed values: `email`, `email_hash`, `phone`, and `phone_hash`. |
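Before you add a table or view, it can help to screen rows against this schema locally. The following sketch is illustrative only: rows are modeled as `(INPUT, INPUT_TYPE)` tuples, and the function name is made up, not part of the notebook.

```python
# Allowed INPUT_TYPE values, per the input table schema above.
ALLOWED_INPUT_TYPES = {"email", "email_hash", "phone", "phone_hash"}

def invalid_rows(rows):
    """Return (index, row) pairs with an empty INPUT or an unsupported INPUT_TYPE."""
    bad = []
    for i, (value, input_type) in enumerate(rows):
        if not value or input_type not in ALLOWED_INPUT_TYPES:
            bad.append((i, (value, input_type)))
    return bad
```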
### DII Format and Normalization

The normalization requirements depend on the type of DII you're processing, as follows:

- **Email address**: The notebook normalizes the data using the UID2 [Email Address Normalization](../getting-started/gs-normalization-encoding#email-address-normalization) rules.
- **Phone number**: You must normalize the phone number before mapping it with the notebook, using the UID2 [Phone Number Normalization](../getting-started/gs-normalization-encoding#phone-number-normalization) rules.
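As an illustration of what the linked email rules involve, here is a simplified sketch based on the published UID2 email normalization steps (trim whitespace, lowercase, and for `gmail.com` addresses remove dots in the local part and drop anything after `+`). It is illustrative only; the notebook applies the canonical rules, so treat any edge-case behavior here as an assumption.

```python
def normalize_email(email: str) -> str:
    """Simplified sketch of UID2-style email normalization (illustrative only)."""
    email = email.strip().lower()          # trim whitespace, lowercase
    local, _, domain = email.rpartition("@")
    if domain == "gmail.com":
        local = local.split("+", 1)[0]     # drop "+" and anything after it
        local = local.replace(".", "")     # drop dots in the local part
    return f"{local}@{domain}"
```

For phone numbers there is no equivalent shortcut: the linked Phone Number Normalization rules are based on the E.164 format (for example, `+12345678901`), and you must apply them yourself before mapping.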
### Output Table

If the clean room has an output catalog, the mapped DII is written to a table in the output catalog. Output tables are stored for 30 days.

For details, see [Overview of output tables](https://docs.databricks.com/aws/en/clean-rooms/output-tables#overview-of-output-tables) in the Databricks documentation.
### Output Table Schema

The following table provides information about the structure of the output data, including field names and values.

| Column Name | Data Type | Description |
| :--- | :--- | :--- |
| `UID` | string | The value is one of the following:<ul><li>**DII was successfully mapped**: The UID2 associated with the DII.</li><li>**Otherwise**: `NULL`.</li></ul> |
| `PREV_UID` | string | The value is one of the following:<ul><li>**DII was successfully mapped and the current raw UID2 was rotated in the last 90 days**: the previous raw UID2.</li><li>**Otherwise**: `NULL`.</li></ul> |
| `REFRESH_FROM` | timestamp | The value is one of the following:<ul><li>**DII was successfully mapped**: The timestamp indicating when this UID2 should be refreshed.</li><li>**Otherwise**: `NULL`.</li></ul> |
| `UNMAPPED` | string | The value is one of the following:<ul><li>**DII was successfully mapped**: `NULL`.</li><li>**Otherwise**: The reason why the identifier was not mapped: `OPTOUT`, `INVALID IDENTIFIER`, or `INVALID INPUT TYPE`.<br/>For details, see [Values for the UNMAPPED Column](#values-for-the-unmapped-column).</li></ul> |
#### Values for the UNMAPPED Column

The following table shows possible values for the `UNMAPPED` column in the output table schema.

| Value | Meaning |
| :--- | :--- |
| `NULL` | The DII was successfully mapped. |
| `OPTOUT` | The user has opted out. |
| `INVALID IDENTIFIER` | The email address or phone number is invalid. |
| `INVALID INPUT TYPE` | The value of `INPUT_TYPE` is invalid. Valid values for `INPUT_TYPE` are: `email`, `email_hash`, `phone`, `phone_hash`. |
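When you consume the output table, the `UNMAPPED` column drives the handling of each row. A minimal downstream sketch (the function name and the row representation, dicts keyed by column name, are assumptions) might look like:

```python
def triage(rows):
    """Split output rows into mapped, opted-out, and invalid buckets by UNMAPPED value."""
    buckets = {"mapped": [], "optout": [], "invalid": []}
    for row in rows:
        reason = row["UNMAPPED"]
        if reason is None:                 # DII was successfully mapped
            buckets["mapped"].append(row)
        elif reason == "OPTOUT":           # the user has opted out
            buckets["optout"].append(row)
        else:                              # INVALID IDENTIFIER or INVALID INPUT TYPE
            buckets["invalid"].append(row)
    return buckets
```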
## Testing in the Integ Environment

If you'd like to test the Databricks Clean Rooms implementation before signing a UID2 POC, you can ask your UID2 contact for access to the integ (integration) environment. This environment is for testing only, and has no production data.

In the request, be sure to include your sharing identifier, and use the sharing identifier for the UID2 integration environment. For details, see [UID2 Sharing Identifiers](#uid2-sharing-identifiers).

While you're waiting to hear back, you can create the clean room, invite UID2, and put your assets into the clean room. For details, see [Integration Steps](#integration-steps).

When your access is ready, your UID2 contact notifies you.
## Reference

This section includes the following reference information:

- [UID2 Sharing Identifiers](#uid2-sharing-identifiers)
- [Finding a Sharing Identifier](#finding-a-sharing-identifier)

### UID2 Sharing Identifiers

UID2 sharing identifiers can change. Be sure to check this page for the latest sharing identifiers.

| Environment | UID2 Sharing Identifier |
| :--- | :--- |
| Production | `aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d:673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329` |
| Integration | `aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4:2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018` |
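If you script your clean room setup, keeping the identifiers from the table above in one place makes it easy to switch environments. The constant and function names below are assumptions; always re-check this page for the current values.

```python
# Sharing identifiers copied from the table above. These values can
# change; re-check this page before using them.
UID2_SHARING_IDENTIFIERS = {
    "production": (
        "aws:us-east-2:21149de7-a9e9-4463-b4e0-066f4b033e5d"
        ":673872910525611:010d98a6-8cf2-4011-8bf7-ca45940bc329"
    ),
    "integration": (
        "aws:us-east-2:4651b4ea-b29c-42ec-aecb-2377de70bbd4"
        ":2366823546528067:c15e03bf-a348-4189-92e5-68b9a7fb4018"
    ),
}

def uid2_sharing_identifier(environment: str) -> str:
    """Return the UID2 sharing identifier for 'production' or 'integration'."""
    return UID2_SHARING_IDENTIFIERS[environment]
```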
### Finding a Sharing Identifier

To find your sharing identifier to send to your UID2 contact, follow these steps:

1. In your Databricks workspace, in the Catalog Explorer, click **Catalog**.
1. At the top, click the gear icon and select **Delta Sharing**.
1. On the **Shared with me** tab, in the upper right, click your Databricks sharing organization and then select **Copy sharing identifier**.

For details, see [Request the recipient's sharing identifier](https://docs.databricks.com/aws/en/delta-sharing/create-recipient#step-1-request-the-recipients-sharing-identifier) in the Databricks documentation.
