Commit a7349bf

crwaters16, maxvp, and ranbel authored
[DLP] Adding confidence levels content (#18330)
* adding confidence levels info
* past tense edits
* Apply suggestions from code review

Co-authored-by: ranbel <[email protected]>

* Update src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
* Fix broken link
* Fix wording
* Add Gateway policies
* Add procedure
* Add changelog entry

---------

Co-authored-by: Max Phillips <[email protected]>
Co-authored-by: ranbel <[email protected]>
1 parent 49a6426 commit a7349bf

2 files changed

+43
-7
lines changed


src/content/changelogs/dlp.yaml

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,10 @@ productLink: "/cloudflare-one/policies/data-loss-prevention/"
55
productArea: Cloudflare One
66
productAreaLink: /cloudflare-one/changelog/
77
entries:
8+
- publish_date: "2024-11-25"
9+
title: Profile confidence levels
10+
description: |-
11+
DLP profiles now support setting a [confidence level](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/#confidence-levels) that controls how tolerant the profile's detections are of false positives, based on the context of each detection. The higher a profile's confidence level, the fewer false positives are allowed. Confidence levels are Low, Medium, and High. DLP profile confidence levels supersede [context analysis](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/#context-analysis).
812
- publish_date: "2024-11-01"
913
title: Send entire HTTP requests to a Logpush destination
1014
description: |-

src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx

Lines changed: 39 additions & 7 deletions
@@ -13,20 +13,52 @@ This page lists the advanced settings available when configuring a [predefined](
1313

1414
Match count refers to the number of times that any enabled entry in the profile can be detected before an action is triggered, such as blocking or logging. For example, if you select a match count of 10, the scanned file or HTTP body must contain 11 or more matching strings. Detections do not have to be unique.
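
The threshold described above can be illustrated with a minimal Python sketch. It is not Cloudflare's implementation: the function names and the regular expressions passed in are hypothetical stand-ins for the enabled entries of a profile. It only shows that the action triggers when the total number of detections is strictly greater than the match count, and that repeated strings each count.

```python
import re
from typing import Iterable, Pattern


def count_detections(text: str, patterns: Iterable[Pattern[str]]) -> int:
    """Count every match of every pattern; duplicate strings count each time."""
    return sum(1 for pattern in patterns for _ in pattern.finditer(text))


def action_triggers(text: str, patterns: Iterable[Pattern[str]], match_count: int) -> bool:
    """Return True only when detections exceed the configured match count."""
    return count_detections(text, patterns) > match_count
```

With a match count of 10, a body containing the same matching string 10 times does not trigger the action, while 11 occurrences do.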
1515

16-
## Context analysis
16+
## Confidence levels
1717

18-
Context analysis restricts detections based on proximity keywords to prevent false positives. Proximity keywords must be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger a context-aware detection. For example, the string `123-45-6789` will only count as a detection if in proximity to keywords such as `ssn`.
18+
Confidence levels indicate how confident Cloudflare DLP is in a DLP detection. DLP determines the confidence by inspecting the content for proximity keywords around the detection.
1919

20-
DLP will apply context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections include the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
20+
Confidence level is set on the DLP profile. When you select a confidence level in Zero Trust, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in Zero Trust are not yet supported or are not applicable.
2121

22-
### Exclude files from context analysis
22+
DLP supports Low, Medium, and High confidence levels. DLP defaults to Low confidence detections, which are based on regular expressions, require few keywords, and trigger more often. Medium and High confidence detections require more keywords, trigger less often, and are more likely to be accurate.
2323

24-
You can exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you send an email containing the string `123-45-6789`, DLP will only count a detection if the string is in proximity to keywords such as `ssn`. If you include a file in an email containing the string `123-45-6789`, DLP will match a detection regardless of keywords.
24+
To change the confidence level of a DLP profile:
2525

26-
To exclude file content from context analysis, in **Exclude content type**, choose _Files_.
26+
1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
27+
2. Select the profile, then select **Edit**.
28+
3. In **Advanced settings** > **Confidence Level**, choose a new confidence level from the dropdown menu.
29+
30+
Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out lower confidence detections.
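
The filtering rule above can be sketched in Python as a simple ordering comparison. This is an illustration only: the `Confidence` enum, the `(name, confidence)` tuple shape, and `filter_detections` are hypothetical names, not part of any Cloudflare API.

```python
from enum import IntEnum
from typing import List, Tuple


class Confidence(IntEnum):
    """Ordered so that a higher value means higher confidence."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def filter_detections(
    detections: List[Tuple[str, Confidence]],
    profile_level: Confidence,
) -> List[Tuple[str, Confidence]]:
    """Keep detections whose confidence is at or above the profile's level."""
    return [d for d in detections if d[1] >= profile_level]
```

Under this model, a Low profile keeps detections of every confidence, a Medium profile drops Low confidence detections, and a High profile keeps only High confidence detections.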
31+
32+
### Gateway detections
2733

28-
## Optical Character Recognition (OCR) <Badge text="Beta" variant="caution" size="small" />
34+
For inline detections in Gateway, to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends creating two HTTP policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action. For example:
35+
36+
| Selector | Operator | Value | Action |
37+
| ----------- | -------- | --------------------------- | ------ |
38+
| DLP Profile | in | _Low Confidence Detections_ | Allow |
39+
40+
| Selector | Operator | Value | Action |
41+
| ----------- | -------- | ---------------------------- | ------ |
42+
| DLP Profile | in | _High Confidence Detections_ | Block |
43+
44+
## Optical Character Recognition (OCR) <Badge text="Beta" variant="caution" size="small" /> {/* optical-character-recognition-ocr */}
2945

3046
Optical Character Recognition (OCR) analyzes and interprets text within image files. When used with DLP profiles, OCR can detect sensitive data within images your users upload.
3147

3248
OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in size. Text is encoded in UTF-8 format, including support for non-Latin characters.
49+
50+
## Context analysis <Badge text="Deprecated" variant="caution" size="small" /> {/* context-analysis */}
51+
52+
:::caution
53+
Context analysis has been superseded by [confidence levels](#confidence-levels). DLP will migrate users who had context analysis turned on to confidence levels where applicable.
54+
:::
55+
56+
When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger a context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.
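
As a rough illustration of the proximity rule, the following Python sketch keeps a match only when a keyword appears within 1000 bytes of it. Several details are assumptions rather than documented behavior: the pattern and keyword list are hypothetical, the distance is measured from either edge of the match, the keyword must fall entirely inside that window, and keyword matching is case-insensitive.

```python
import re
from typing import List

PROXIMITY_BYTES = 1000  # distance stated in the text above

# Hypothetical detection pattern and keyword list, for illustration only.
SSN_LIKE = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")
KEYWORDS = (b"ssn",)


def context_aware_detections(data: bytes) -> List[bytes]:
    """Return matches that have at least one keyword within PROXIMITY_BYTES."""
    lowered = data.lower()  # assumption: keywords are matched case-insensitively
    hits = []
    for match in SSN_LIKE.finditer(data):
        start = max(0, match.start() - PROXIMITY_BYTES)
        end = min(len(data), match.end() + PROXIMITY_BYTES)
        window = lowered[start:end]
        if any(keyword in window for keyword in KEYWORDS):
            hits.append(match.group())
    return hits
```

For example, `b"ssn: 123-45-6789"` yields one detection, while the bare string `b"123-45-6789"` yields none.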
57+
58+
DLP applied context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections included the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
59+
60+
### Exclude files from context analysis
61+
62+
You could exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you sent an email containing the string `123-45-6789`, DLP only counted a detection if the string was in proximity to keywords such as `ssn`. If you included a file in an email containing the string `123-45-6789`, DLP matched a detection regardless of keywords.
63+
64+
To exclude file content from context analysis, in **Exclude content type**, choose _Files_.
