Commit d00e3a3

Re-add the developer docs
1 parent 51885b7 commit d00e3a3

6 files changed

+568
-0
lines changed

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
---
title: PII and Data Scrubbing
description: This document describes a configuration format that we would like to hide from the user eventually. This page still exists only because Relay currently accepts this format as an alternative to the regular data scrubbing settings.
sidebar_order: 170
---

The following document explores the syntax and semantics of the configuration
for [Advanced Data Scrubbing] consumed and executed by [Relay]. Sometimes, this
is also referred to as PII scrubbing.

## A Basic Example

Say you have an exception message which, unfortunately, contains IP addresses
which are not supposed to be there. You'd write:

```json
{
  "applications": {
    "$string": ["@ip:replace"]
  }
}
```

It reads as "replace all IP addresses in all strings", or "apply `@ip:replace`
to all `$string` fields".

`@ip:replace` is called a rule, and `$string` is a <Link
to="/backend/pii/selectors/">selector</Link>.

## Built-in Rules

The following rules exist by default:

- `@ip:replace` and `@ip:hash` for replacing or hashing IP addresses
- `@imei:replace` and `@imei:hash` for replacing or hashing IMEIs
- `@mac:replace`, `@mac:mask` and `@mac:hash` for matching MAC addresses
- `@email:mask`, `@email:replace` and `@email:hash` for matching email addresses
- `@creditcard:mask`, `@creditcard:replace` and `@creditcard:hash` for matching
  credit card numbers
- `@userpath:replace` and `@userpath:hash` for matching local paths (e.g.
  `C:/Users/foo/`)
- `@password:remove` for removing passwords. In this case we're pattern matching
  against the field's key, checking whether it contains `password`, `credentials`
  or similar strings.
- `@anything:remove`, `@anything:replace` and `@anything:hash` for removing,
  replacing or hashing any value. It is essentially equivalent to a
  wildcard regex, but it also matches much more than strings.
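
As a sketch of how several built-in rules might be combined, the following config reuses the `applications` format from the basic example and lists more than one rule for the `$string` selector. The particular selection of rules is only illustrative.

```json
{
  "applications": {
    "$string": ["@email:mask", "@creditcard:replace", "@ip:hash"]
  }
}
```

Read the same way as the basic example, this would mask email addresses, replace credit card numbers and hash IP addresses in all string fields.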

## Writing Your Own Rules

Rules generally consist of two parts:

- _Rule types_ describe what to match. See <Link to="/backend/pii/types/">PII Rule
  Types</Link> for an exhaustive list.
- _Rule redaction methods_ describe what to do with the match. See <Link
  to="/backend/pii/methods/">PII Redaction Methods</Link> for a list.

Each page comes with examples. Try those examples out by pasting them into the
"PII config" column of [Piinguin] and clicking on fields to get suggestions.

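
To make the two parts concrete, here is a minimal sketch of a custom rule that pairs the `pattern` rule type with the `mask` redaction method. The rule name `mask_ticket_id` and the regular expression are hypothetical and chosen only for illustration; the overall shape follows the other examples in these docs.

```json
{
  "rules": {
    "mask_ticket_id": {
      "type": "pattern",
      "pattern": "TICKET-[0-9]+",
      "redaction": {
        "method": "mask"
      }
    }
  },
  "applications": {
    "$string": ["mask_ticket_id"]
  }
}
```

The `type` (with its `pattern`) says what to match, and `redaction.method` says what to do with each match.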
## Interactive Editing

This is easiest if you already have a raw JSON payload from an SDK. Go to our
PII config editor [Piinguin], and:

1. Paste in a raw event.
2. Click on the data you want eliminated.
3. Paste in other payloads and check that they look fine. Go back to step **2**
   if necessary.

After iterating on the config, paste it back into the project config located at
`.relay/projects/<PROJECT_ID>.json`.

For example:

```json
{
  "publicKeys": [
    {
      "publicKey": "___PUBLIC_KEY___",
      "isEnabled": true
    }
  ],
  "config": {
    "allowedDomains": ["*"],
    "piiConfig": {
      "rules": {
        "device_id": {
          "type": "pattern",
          "pattern": "d/[a-f0-9]{12}",
          "redaction": {
            "method": "hash"
          }
        }
      },
      "applications": {
        "freeform": ["device_id"]
      }
    }
  }
}
```

[advanced data scrubbing]: https://docs.sentry.io/product/data-management-settings/scrubbing/advanced-datascrubbing/
[relay]: https://github.com/getsentry/relay
[piinguin]: https://getsentry.github.io/piinguin
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
---
title: PII Redaction Methods
---

### `remove`

Remove the entire field. Relay may choose to either set it to `null` or to
remove it entirely.

```json
{
  "rules": {
    "remove_ip": {
      "type": "ip",
      "redaction": {
        "method": "remove"
      }
    }
  },
  "applications": {
    "$string": ["remove_ip"]
  }
}
```

### `replace`

Replace the matched value with a static string.

```json
{
  "rules": {
    "replace_ip": {
      "type": "ip",
      "redaction": {
        "method": "replace",
        "text": "[censored]"
      }
    }
  },
  "applications": {
    "$string": ["replace_ip"]
  }
}
```

### `mask`

Replace every character of the matched string with a "masking" char `*`. Compared
to `replace` this preserves the length of the original string.

```json
{
  "rules": {
    "mask_ip": {
      "type": "ip",
      "redaction": {
        "method": "mask"
      }
    }
  },
  "applications": {
    "$string": ["mask_ip"]
  }
}
```

### `hash`

Replace the string with a hashed version of itself. Equal strings will produce
the same hash, so if you, for example, decide to hash the user ID instead of
replacing or removing it, you will still have an accurate count of users
affected.

```json
{
  "rules": {
    "hash_ip": {
      "type": "ip",
      "redaction": {
        "method": "hash"
      }
    }
  },
  "applications": {
    "$string": ["hash_ip"]
  }
}
```
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
---
title: PII Selectors
---

Selectors allow you to restrict rules to certain parts of the event. This is
useful to unconditionally remove certain data by variable/field name from the
event, but can also be used to conservatively test rules on real data.

Data scrubbing always works on the raw event payload. Keep in mind that some
fields in the UI may be named differently in the JSON schema. When looking at
an event there should always be a link called "JSON" present that allows you to
see what the data scrubber sees.

For example, what is called "Additional Data" in the UI is called `extra` in the
event payload. To remove a specific key called `foo`, you would write:

```
[Remove] [Anything] from [extra.foo]
```

Another example: Sentry knows about two kinds of error messages, the exception
message and the top-level log message. Here is what such an event payload, as
sent by the SDK (and downloadable from the UI), would look like:

```json
{
  "logentry": {
    "formatted": "Failed to roll out the dinglebop"
  },
  "exception": {
    "values": [
      {
        "type": "ZeroDivisionError",
        "value": "integer division or modulo by zero"
      }
    ]
  }
}
```

Since the "error message" is taken from the `exception`'s `value`, and the
"message" is taken from `logentry`, we would have to write the following to
remove both from the event:

```
[Remove] [Anything] from [exception.value]
[Remove] [Anything] from [logentry.formatted]
```
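
The bracketed lines use the notation of the settings UI. Assuming the same selectors are written as keys under `applications` in the PII config that Relay accepts, and that the built-in `@anything:remove` rule stands in for "[Remove] [Anything]", a roughly equivalent config could look like this sketch:

```json
{
  "applications": {
    "exception.value": ["@anything:remove"],
    "logentry.formatted": ["@anything:remove"]
  }
}
```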

### Boolean Logic

You can combine selectors using boolean logic.

- Prefix with `!` to invert the selector. `foo` matches the JSON key `foo`,
  while `!foo` matches everything but `foo`.
- Build the conjunction (AND) using `&&`, such as: `foo && !extra.foo` to match
  the key `foo` except when inside of `extra`.
- Build the disjunction (OR) using `||`, such as: `foo || bar` to match `foo` or
  `bar`.
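
As an illustration, and assuming combined selectors can be used as keys under `applications` like any other selector, a config using the two expressions above might look like the following sketch. The two entries are independent examples, and the rules attached to them are arbitrary picks from the built-in rules.

```json
{
  "applications": {
    "foo && !extra.foo": ["@anything:remove"],
    "foo || bar": ["@ip:replace"]
  }
}
```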

### Wildcards

- `**` matches all subpaths, so that `foo.**` matches all JSON keys within
  `foo`.
- `*` matches a single path item, so that `foo.*` matches all JSON keys one
  level below `foo`.

### Value Types

Select subsections by JSON type using the following:

- `$string` matches any string value
- `$number` matches any integer or float value
- `$datetime` matches any field in the event that represents a timestamp
- `$array` matches any JSON array value
- `$object` matches any JSON object

Select known parts of the schema using the following:

- `$exception` matches a single exception instance in `{"exception": {"values": [...]}}`
- `$stacktrace` matches a stack trace instance
- `$frame` matches a frame
- `$request` matches the HTTP request context of an event
- `$user` matches the user context of an event
- `$logentry` matches the log entry of an event (also applies to the `message` attribute)
- `$thread` matches a single thread instance in `{"threads": {"values": [...]}}`
- `$breadcrumb` matches a single breadcrumb in `{"breadcrumbs": [...]}`
- `$span` matches a [trace span]
- `$sdk` matches the SDK context in `{"sdk": ...}`

#### Examples

- Delete `event.user`:

  ```
  [Remove] [Anything] from [$user]
  ```

- Delete all frame-local variables:

  ```
  [Remove] [Anything] from [$frame.vars]
  ```
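
Written as a PII config rather than in UI notation, the two examples might look like the sketch below. It assumes that the built-in `@anything:remove` rule corresponds to "[Remove] [Anything]" and that the selectors are used as keys under `applications`.

```json
{
  "applications": {
    "$user": ["@anything:remove"],
    "$frame.vars": ["@anything:remove"]
  }
}
```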

### Escaping Special Characters

If the object key you want to match contains whitespace or special characters,
you can use quotes to escape it:

```
[Remove] [Anything] from [extra.'my special value']
```

This matches the key `my special value` in _Additional Data_.

To escape `'` (single quote) within the quotes, replace it with `''` (two
single quotes):

```
[Remove] [Anything] from [extra.'my special '' value']
```

This matches the key `my special ' value` in _Additional Data_.
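
In a JSON config the quoted selector would be embedded in a JSON string. The sketch below assumes selectors are written as keys under `applications`; since the selector syntax uses single quotes, no additional JSON escaping is needed here.

```json
{
  "applications": {
    "extra.'my special value'": ["@anything:remove"],
    "extra.'my special '' value'": ["@anything:remove"]
  }
}
```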

[trace span]: https://docs.sentry.io/product/performance/distributed-tracing/#spans
