Commit 0ac6c7a

maxvp authored and KianNH committed
[DLP] AI context analysis (#21055)
* Remove first match note
* Rename confidence levels --> confidence thresholds
* Remove old context analysis section
* Add additional context
* Add how to report true/false positives
* Fix broken links in changelog
* Discard changes to src/content/release-notes/dlp.yaml
* Remove redirects
* Add procedure for editing profile settings
* Add procedure for setting up AI
* Add confidence levels redirect
* Discard changes to public/__redirects
1 parent 7b977d9 commit 0ac6c7a

4 files changed: +68 -45 lines changed

src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-policies/logging-options.mdx

Lines changed: 15 additions & 2 deletions
@@ -43,12 +43,25 @@ Data Loss Prevention will now store a portion of the payload for HTTP requests t
 3. Select **Decrypt Payload Log**.
 4. Enter your private key and select **Decrypt**.
 
-You will see the [ID of the matched DLP Profile](/api/resources/zero_trust/subresources/dlp/subresources/profiles/methods/list/) followed by the decrypted payload. Note that DLP currently logs only the first match.
+You will see the [ID of the matched DLP Profile](/api/resources/zero_trust/subresources/dlp/subresources/profiles/methods/list/) followed by the decrypted payload.
 
 :::note
-Neither the key nor the decrypted payload will be stored by Cloudflare.
+Cloudflare does not store the key or the decrypted payload.
 :::
 
+### Report false and true positives to AI context analysis
+
+When you have [AI context analysis](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/#ai-context-analysis) turned on for a DLP profile, you can train the AI model to adjust its confidence threshold by reporting false and true positives.
+
+To report a DLP match payload as a false or true positive:
+
+1. [Find and decrypt](#4-view-payload-logs) the payload log you want to report.
+2. In **Log details**, choose a detected context match.
+3. In **Context**, select the redacted match data.
+4. In **Match details**, choose whether you want to report the match as a false positive or a true positive.
+
+Based on your report, DLP's machine learning will adjust its confidence in future matches for the associated profile.
+
 ### Data privacy
 
 - All Cloudflare logs are encrypted at rest. Encrypting the payload content adds a second layer of encryption for the matched values that triggered a DLP rule.

src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx

Lines changed: 46 additions & 31 deletions
@@ -7,58 +7,73 @@ sidebar:
 
 import { Badge } from "~/components";
 
-This page lists the advanced settings available when configuring a [predefined](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/) or [custom](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/#build-a-custom-profile) DLP profile.
+This page lists the profile settings available when configuring a [predefined](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/) or [custom](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/#build-a-custom-profile) DLP profile. You can configure profile settings when you create a custom profile or [edit profile settings](#edit-profile-settings) for an existing predefined or custom profile.
 
-## Match count
+## Edit profile settings
+
+To edit profile settings for an existing predefined or custom DLP profile:
+
+1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
+2. Choose a profile, then select **Edit**.
+3. In **Settings**, configure the [settings](#available-settings) for your profile.
+4. Select **Save profile**.
+
+## Available settings
+
+The following advanced detection settings are available for predefined and custom DLP profiles.
+
+### Match count
 
 Match count refers to the number of times that any enabled entry in the profile can be detected before an action is triggered, such as blocking or logging. For example, if you select a match count of 10, the scanned file or HTTP body must contain 11 or more matching strings. Detections do not have to be unique.
 
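The match count paragraph above reads as a strict greater-than comparison: an action fires only once the number of detections exceeds the configured count. A minimal Python sketch of that reading follows. The function name `should_trigger` and its parameters are hypothetical and exist only for illustration; they are not part of any Cloudflare API.

```python
def should_trigger(detections: list[str], match_count: int) -> bool:
    """Return True when the number of detections exceeds the match count.

    Duplicate strings are counted, because detections do not have to be unique.
    """
    return len(detections) > match_count


# With a match count of 10, ten detections are not enough; the eleventh triggers.
ten_hits = ["123-45-6789"] * 10
print(should_trigger(ten_hits, 10))                    # False
print(should_trigger(ten_hits + ["123-45-6789"], 10))  # True
```
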
-## Confidence levels
+### Optical Character Recognition (OCR)
 
-Confidence levels indicate how confident Cloudflare DLP is in a DLP detection. DLP determines the confidence by inspecting the content for proximity keywords around the detection.
+Optical Character Recognition (OCR) analyzes and interprets text within image files. When used with DLP profiles, OCR can detect sensitive data within images your users upload.
 
-Confidence level is set on the DLP profile. When you select a confidence level in Zero Trust, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in Zero Trust are not yet supported or are not applicable.
+OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in size. Text is encoded in UTF-8 format, including support for non-Latin characters.
 
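The OCR constraints above (file type and size range) can be expressed as a simple eligibility check. The sketch below is illustrative only: `is_ocr_eligible` is a hypothetical helper, and it assumes that 1 KB means 1024 bytes, that both size bounds are inclusive, and that extensions are matched case-insensitively. The source text does not specify any of these details.

```python
from pathlib import PurePath

OCR_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MIN_BYTES = 4 * 1024     # assumption: 4 KB interpreted as 4 * 1024 bytes
MAX_BYTES = 1024 * 1024  # assumption: 1 MB interpreted as 1024 * 1024 bytes


def is_ocr_eligible(filename: str, size_bytes: int) -> bool:
    """Check the documented file type and size range for OCR scanning."""
    suffix = PurePath(filename).suffix.lower()
    return suffix in OCR_EXTENSIONS and MIN_BYTES <= size_bytes <= MAX_BYTES


print(is_ocr_eligible("receipt.png", 50_000))  # True
print(is_ocr_eligible("receipt.gif", 50_000))  # False: unsupported file type
print(is_ocr_eligible("icon.jpg", 1_000))      # False: smaller than 4 KB
```
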
-DLP confidence detections consist of Low, Medium, and High confidence levels. DLP will default to Low confidence detections, which are based on regular expressions, require few keywords, and will trigger more often. Medium and High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy.
+### AI context analysis <Badge text="Beta" variant="caution" size="small" /> {/* ai-context-analysis */}
 
-To change the confidence level of a DLP profile:
+:::note
+AI context analysis only supports Gateway HTTP and HTTPS traffic.
+:::
 
-1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
-2. Select the profile, then select **Edit**.
-3. In **Advanced settings** > **Confidence Level**, choose a new confidence level from the dropdown menu.
+AI context analysis uses machine learning to analyze and adjust the confidence in a detection based on its surrounding context. DLP will log any matches that are above your confidence threshold.
 
-Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out lower confidence detections.
+DLP submits the context as an AI text embedding vector to [Cloudflare Workers AI](/workers-ai/). Vectors are stored in a database bucket for up to six months, along with relevant metadata from the HTTP request including the URL, HTTP method, matching DLP profile, and Gateway request ID.
 
-### Gateway detections
+To use AI context analysis:
 
-For inline detections in Gateway, to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends creating two HTTP policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action. For example:
+1. Turn on **AI context analysis** in a DLP profile.
+2. [Add the profile](/cloudflare-one/policies/data-loss-prevention/dlp-policies/#2-create-a-dlp-policy) to a DLP policy.
+3. When configuring the DLP policy, turn on [payload logging](/cloudflare-one/policies/data-loss-prevention/dlp-policies/logging-options/#log-the-payload-of-matched-rules).
 
-| Selector | Operator | Value | Action |
-| ----------- | -------- | --------------------------- | ------ |
-| DLP Profile | in | _Low Confidence Detections_ | Allow |
+AI context analysis results will appear in the payload section of your [DLP logs](/cloudflare-one/policies/data-loss-prevention/dlp-policies/#4-view-dlp-logs). To further train the machine learning model, you need to [report false and true positives](/cloudflare-one/policies/data-loss-prevention/dlp-policies/logging-options/#report-false-and-true-positives-to-ai-context-analysis).
 
-| Selector | Operator | Value | Action |
-| ----------- | -------- | ---------------------------- | ------ |
-| DLP Profile | in | _High Confidence Detections_ | Block |
+### Confidence thresholds
 
-## Optical Character Recognition (OCR) <Badge text="Beta" variant="caution" size="small" /> {/* optical-character-recognition-ocr */}
+Confidence thresholds indicate how confident Cloudflare DLP is in a DLP detection. DLP determines the confidence by inspecting the content for proximity keywords around the detection.
 
-Optical Character Recognition (OCR) analyzes and interprets text within image files. When used with DLP profiles, OCR can detect sensitive data within images your users upload.
+Confidence threshold is set on the DLP profile. When you select a confidence threshold in Zero Trust, you will see which DLP entries will be affected by the confidence threshold. Entries that do not reflect a confidence threshold in Zero Trust are not yet supported or are not applicable.
 
-OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in size. Text is encoded in UTF-8 format, including support for non-Latin characters.
+DLP confidence detections consist of Low, Medium, and High confidence thresholds. DLP will default to Low confidence detections, which are based on regular expressions, require few keywords, and will trigger more often. Medium and High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy.
 
-## Context analysis <Badge text="Deprecated" variant="caution" size="small" /> {/* context-analysis */}
+To change the confidence threshold of a DLP profile:
 
-:::caution
-Context analysis has been superseded by [confidence levels](#confidence-levels). DLP will migrate users who had context analysis turned on to confidence levels where applicable.
-:::
+1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
+2. Select the profile, then select **Edit**.
+3. In **Settings** > **Confidence threshold**, choose a new confidence threshold from the dropdown menu.
 
-When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.
+Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out lower confidence detections.
 
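The threshold rule in the added line above (Low also admits Medium and High detections, while Medium or High filter out anything lower) behaves like an ordered comparison. The following Python sketch models it that way. The `Confidence` enum and the `is_match` function are hypothetical illustrations and do not reflect how DLP is implemented internally.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Ordered confidence values; higher numbers mean higher confidence."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def is_match(detection: Confidence, threshold: Confidence) -> bool:
    """A detection counts as a match when it meets or exceeds the profile threshold."""
    return detection >= threshold


# A Low threshold accepts every detection; a High threshold accepts only High.
print([d.name for d in Confidence if is_match(d, Confidence.LOW)])   # ['LOW', 'MEDIUM', 'HIGH']
print([d.name for d in Confidence if is_match(d, Confidence.HIGH)])  # ['HIGH']
```
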
-DLP applied context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections included the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
+#### Gateway detections
 
-### Exclude files from context analysis
+For inline detections in Gateway, to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends creating two HTTP policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action. For example:
 
-You could exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you sent an email containing the string `123-45-6789`, DLP only counted a detection if the string was in proximity to keywords such as `ssn`. If you included a file in an email containing the string `123-45-6789`, DLP matched a detection regardless of keywords.
+| Selector | Operator | Value | Action |
+| ----------- | -------- | --------------------------- | ------ |
+| DLP Profile | in | _Low Confidence Detections_ | Allow |
 
-To exclude file content from context analysis, in **Exclude content type**, choose _Files_.
+| Selector | Operator | Value | Action |
+| ----------- | -------- | ---------------------------- | ------ |
+| DLP Profile | in | _High Confidence Detections_ | Block |
Lines changed: 6 additions & 11 deletions
@@ -1,35 +1,31 @@
 ---
 {}
-
 ---
 
-import { Details } from "~/components"
+import { Details } from "~/components";
 
-1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP Profiles**.
+1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
 
 2. Select **Create profile**.
 
 3. Enter a name and optional description for the profile.
 
 4. Add custom or existing detection entries.
 
-
 <Details header="Add a custom entry">
 
 1. Select **Add custom entry** and give it a name.
 
 2. In **Value**, enter a regular expression (or regex) that defines the text pattern you want to detect. For example, `test\d\d` will detect the word `test` followed by two digits.
 
-* Regular expressions are written in Rust. We recommend validating your regex with [Rustexp](https://rustexp.lpil.uk/).
-* DLP detects UTF-8 characters, which can be up to 4 bytes each. Custom text pattern detections are limited to 1024 bytes in length.
-* DLP does not support regular expressions with `+` or `*` operators because they are prone to exceeding the length limit. For example, the regex pattern `a+` can detect an infinite number of `a` characters. We recommend using `a{min,max}` instead, such as `a{1,1024}`.
+- Regular expressions are written in Rust. We recommend validating your regex with [Rustexp](https://rustexp.lpil.uk/).
+- DLP detects UTF-8 characters, which can be up to 4 bytes each. Custom text pattern detections are limited to 1024 bytes in length.
+- DLP does not support regular expressions with `+` or `*` operators because they are prone to exceeding the length limit. For example, the regex pattern `a+` can detect an infinite number of `a` characters. We recommend using `a{min,max}` instead, such as `a{1,1024}`.
 
 3. To save the detection entry, select **Done**.
 
-
 </Details>
 
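The custom entry notes above state that `+` and `*` are not supported and recommend bounded repetition such as `a{1,1024}`. A rough pre-check for a pattern might look like the Python sketch below. It is an approximation written for illustration: `find_unbounded_quantifiers` is a hypothetical helper, it only skips backslash escapes and simple `[...]` character classes, and it is no substitute for validating the pattern against a real Rust regex parser.

```python
def find_unbounded_quantifiers(pattern: str) -> list[int]:
    """Return the indexes of `+` or `*` used as operators in the pattern.

    Escaped characters (such as `\\+`) and characters inside a simple
    `[...]` class are treated as literals and ignored.
    """
    positions = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch in "+*":
            positions.append(i)
        i += 1
    return positions


print(find_unbounded_quantifiers(r"test\d\d"))   # []
print(find_unbounded_quantifiers(r"a+"))         # [1]
print(find_unbounded_quantifiers(r"a{1,1024}"))  # []
```

An empty result suggests the pattern avoids the two unsupported operators. It does not guarantee that the pattern is otherwise valid, or that its matches stay within the 1024-byte detection limit.
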
-
 <Details header="Add existing entries">
 
 Existing entries include [predefined detection entries](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/) and [DLP datasets](/cloudflare-one/policies/data-loss-prevention/datasets/).
@@ -38,9 +34,8 @@ import { Details } from "~/components"
 2. Choose which entries you want to add, then select **Confirm**.
 3. To save the detection entry, select **Done**.
 
-
 </Details>
 
-5. (Optional) Configure [**Advanced settings**](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/) for the profile.
+5. (Optional) Configure [**profile settings**](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/) for the profile.
 
 6. Select **Save profile**.

src/content/partials/cloudflare-one/data-loss-prevention/predefined-profile.mdx

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 {}
 ---
 
-1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP Profiles**.
+1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
 2. Choose a [predefined profile](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/) and select **Configure**.
 3. Enable one or more **Detection entries** according to your preferences. The DLP Profile matches using the OR logical operator — if multiple entries are enabled, your data needs to match only one of the entries.
 4. Select **Save profile**.
