106 changes: 106 additions & 0 deletions develop-docs/backend/application-domains/pii/index.mdx
@@ -0,0 +1,106 @@
---
title: PII and Data Scrubbing
description: This document describes a configuration format that we would eventually like to hide from the user. This page still exists only because Relay currently accepts this format as an alternative to regular data scrubbing settings.
sidebar_order: 170
---

The following document explores the syntax and semantics of the configuration
for [Advanced Data Scrubbing] consumed and executed by [Relay]. Sometimes, this
is also referred to as PII scrubbing.

## A Basic Example

Say you have an exception message that, unfortunately, contains IP addresses
that are not supposed to be there. You'd write:

```json
{
  "applications": {
    "$string": ["@ip:replace"]
  }
}
```

It reads as "replace all IP addresses in all strings", or "apply `@ip:replace`
to all `$string` fields".

`@ip:replace` is called a rule, and `$string` is a <Link
to="/backend/pii/selectors/">selector</Link>.

## Built-in Rules

The following rules exist by default:

- `@ip:replace` and `@ip:hash` for replacing IP addresses
- `@imei:replace` and `@imei:hash` for replacing IMEIs
- `@mac:replace`, `@mac:mask` and `@mac:hash` for matching MAC addresses
- `@email:mask`, `@email:replace` and `@email:hash` for matching email addresses
- `@creditcard:mask`, `@creditcard:replace` and `@creditcard:hash` for matching
  credit card numbers
- `@userpath:replace` and `@userpath:hash` for matching local paths (e.g.
  `C:/Users/foo/`)
- `@password:remove` for removing passwords. This rule pattern-matches against
  the field's key, checking whether it contains `password`, `credentials` or
  similar strings.
- `@anything:remove`, `@anything:replace` and `@anything:hash` for removing,
  replacing or hashing any value. It is essentially equivalent to a wildcard
  regex, but it also matches values other than strings.
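
Several built-in rules can be listed under the same selector. The following is a minimal sketch, assuming that every rule in the list is applied to the selected fields:

```json
{
  "applications": {
    "$string": ["@ip:replace", "@email:mask", "@creditcard:mask"]
  }
}
```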

## Writing Your Own Rules

Rules generally consist of two parts:

- _Rule types_ describe what to match. See <Link to="/backend/pii/types/">PII Rule
Types</Link> for an exhaustive list.
- _Rule redaction methods_ describe what to do with the match. See <Link
to="/backend/pii/methods/">PII Redaction Methods</Link> for a list.

Each page comes with examples. Try those examples out by pasting them into the
"PII config" column of [Piinguin] and clicking on fields to get suggestions.
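
As a sketch of how the two parts fit together, the config below defines a hypothetical rule named `mask_ticket_id`. The `pattern` rule type decides what to match, and the `mask` redaction method decides what happens to the match. The rule name and the regular expression are illustrative assumptions, not part of Relay.

```json
{
  "rules": {
    "mask_ticket_id": {
      "type": "pattern",
      "pattern": "TICKET-[0-9]+",
      "redaction": {
        "method": "mask"
      }
    }
  },
  "applications": {
    "$string": ["mask_ticket_id"]
  }
}
```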

## Interactive Editing

Interactive editing is easiest if you already have a raw JSON payload from
some SDK. Go to our PII config editor [Piinguin], and:

1. Paste in a raw event.
2. Click on the data you want eliminated.
3. Paste in other payloads and check that they look fine. Return to step **2**
   if necessary.

After iterating on the config, paste it back into the project config located at
`.relay/projects/<PROJECT_ID>.json`.

For example:

```json
{
  "publicKeys": [
    {
      "publicKey": "___PUBLIC_KEY___",
      "isEnabled": true
    }
  ],
  "config": {
    "allowedDomains": ["*"],
    "piiConfig": {
      "rules": {
        "device_id": {
          "type": "pattern",
          "pattern": "d/[a-f0-9]{12}",
          "redaction": {
            "method": "hash"
          }
        }
      },
      "applications": {
        "freeform": ["device_id"]
      }
    }
  }
}
```
Member Author

@tobias-wilfert tobias-wilfert Sep 5, 2025

@jjbayer Is it ok if I remove this though?

Member

Yes!


[advanced data scrubbing]: https://docs.sentry.io/product/data-management-settings/scrubbing/advanced-datascrubbing/
[relay]: https://github.com/getsentry/relay
[piinguin]: https://getsentry.github.io/piinguin
89 changes: 89 additions & 0 deletions develop-docs/backend/application-domains/pii/methods.mdx
@@ -0,0 +1,89 @@
---
title: PII Redaction Methods
---

### `remove`

: Remove the entire field. Relay may choose to either set it to `null` or to
remove it entirely.

```json
{
  "rules": {
    "remove_ip": {
      "type": "ip",
      "redaction": {
        "method": "remove"
      }
    }
  },
  "applications": {
    "$string": ["remove_ip"]
  }
}
```

### `replace`

: Replace the matched value with a static string, given in `text`.

```json
{
  "rules": {
    "replace_ip": {
      "type": "ip",
      "redaction": {
        "method": "replace",
        "text": "[censored]"
      }
    }
  },
  "applications": {
    "$string": ["replace_ip"]
  }
}
```

### `mask`

: Replace every character of the matched string with the masking character `*`.
Compared to `replace`, this preserves the length of the original string.

```json
{
  "rules": {
    "mask_ip": {
      "type": "ip",
      "redaction": {
        "method": "mask"
      }
    }
  },
  "applications": {
    "$string": ["mask_ip"]
  }
}
```

### `hash`

: Replace the string with a hashed version of itself. Equal strings will produce
the same hash, so if you, for example, decide to hash the user ID instead of
replacing or removing it, you will still have an accurate count of users
affected.

```json
{
  "rules": {
    "hash_ip": {
      "type": "ip",
      "redaction": {
        "method": "hash"
      }
    }
  },
  "applications": {
    "$string": ["hash_ip"]
  }
}
```
125 changes: 125 additions & 0 deletions develop-docs/backend/application-domains/pii/selectors.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
---
title: PII Selectors
---

Selectors allow you to restrict rules to certain parts of the event. This is
useful to unconditionally remove certain data by variable/field name from the
event, but can also be used to conservatively test rules on real data.

Data scrubbing always works on the raw event payload. Keep in mind that some
fields in the UI may have different names in the JSON schema. When looking at
an event, there should always be a link called "JSON" that lets you see what
the data scrubber sees.

For example, what is called "Additional Data" in the UI is called `extra` in the
event payload. To remove a specific key called `foo`, you would write:

```
[Remove] [Anything] from [extra.foo]
```
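
In the JSON configuration format, the same intent could be sketched as follows. This assumes that the UI notation above corresponds to the built-in `@anything:remove` rule applied to the selector `extra.foo`:

```json
{
  "applications": {
    "extra.foo": ["@anything:remove"]
  }
}
```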

Here is another example. Sentry knows about two kinds of error messages: the
exception message and the top-level log message. This is what such an event
payload, as sent by the SDK (and downloadable from the UI), would look like:

```json
{
  "logentry": {
    "formatted": "Failed to roll out the dinglebop"
  },
  "exception": {
    "values": [
      {
        "type": "ZeroDivisionError",
        "value": "integer division or modulo by zero"
      }
    ]
  }
}
```

Since the "error message" is taken from the `exception`'s `value`, and the
"message" is taken from `logentry`, we would have to write the following to
remove both from the event:

```
[Remove] [Anything] from [exception.value]
[Remove] [Anything] from [logentry.formatted]
```
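
Expressed as a JSON config, the two rules above might look like this. It is a sketch that again assumes `[Remove] [Anything]` maps to `@anything:remove`, with one `applications` entry per selector:

```json
{
  "applications": {
    "exception.value": ["@anything:remove"],
    "logentry.formatted": ["@anything:remove"]
  }
}
```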

### Boolean Logic

You can combine selectors using boolean logic.

- Prefix with `!` to invert the selector. `foo` matches the JSON key `foo`,
while `!foo` matches everything but `foo`.
- Build the conjunction (AND) using `&&`, such as: `foo && !extra.foo` to match
the key `foo` except when inside of `extra`.
- Build the disjunction (OR) using `||`, such as: `foo || bar` to match `foo` or
`bar`.
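
A combined selector is used as an ordinary key under `applications`. The sketch below hashes every `foo` key except the one inside `extra`. The key names come from the bullets above, and the choice of `@anything:hash` is only for illustration:

```json
{
  "applications": {
    "foo && !extra.foo": ["@anything:hash"]
  }
}
```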

### Wildcards

- `**` matches all subpaths, so that `foo.**` matches all JSON keys within
`foo`.
- `*` matches a single path item, so that `foo.*` matches all JSON keys one
level below `foo`.
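
For instance, a config along these lines would replace IP addresses anywhere below `foo` and remove every key directly below `bar`. Both key names are placeholders:

```json
{
  "applications": {
    "foo.**": ["@ip:replace"],
    "bar.*": ["@anything:remove"]
  }
}
```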

### Value Types

Select subsections by JSON type using the following:

- `$string` matches any string value
- `$number` matches any integer or float value
- `$datetime` matches any field in the event that represents a timestamp
- `$array` matches any JSON array value
- `$object` matches any JSON object

Select known parts of the schema using the following:

- `$exception` matches a single exception instance in `{"exception": {"values": [...]}}`
- `$stacktrace` matches a stack trace instance
- `$frame` matches a frame
- `$request` matches the HTTP request context of an event
- `$user` matches the user context of an event
- `$logentry` matches the log entry of an event (also applies to the `message` attribute)
- `$thread` matches a single thread instance in `{"threads": {"values": [...]}}`
- `$breadcrumb` matches a single breadcrumb in `{"breadcrumbs": [...]}`
- `$span` matches a [trace span]
- `$sdk` matches the SDK context in `{"sdk": ...}`

#### Examples

- Delete `event.user`:

```
[Remove] [Anything] from [$user]
```

- Delete all frame-local variables:

```
[Remove] [Anything] from [$frame.vars]
```
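
Both examples can be sketched in a single JSON config, assuming the same mapping of `[Remove] [Anything]` to `@anything:remove`:

```json
{
  "applications": {
    "$user": ["@anything:remove"],
    "$frame.vars": ["@anything:remove"]
  }
}
```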

### Escaping Special Characters

If the object key you want to match contains whitespace or special characters,
you can use quotes to escape it:

```
[Remove] [Anything] from [extra.'my special value']
```

This matches the key `my special value` in _Additional Data_.

To escape `'` (single quote) within the quotes, replace it with `''` (two
quotes):

```
[Remove] [Anything] from [extra.'my special '' value']
```

This matches the key `my special ' value` in _Additional Data_.
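
When such selectors are written into a JSON config, the single quotes need no extra escaping, because they sit inside a double-quoted JSON string. A sketch, assuming the same `@anything:remove` mapping as in the earlier examples:

```json
{
  "applications": {
    "extra.'my special value'": ["@anything:remove"],
    "extra.'my special '' value'": ["@anything:remove"]
  }
}
```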

[trace span]: https://docs.sentry.io/product/performance/distributed-tracing/#spans