From bbb4b93abbe247c30a3a5b0c83fae8cbaa64fe57 Mon Sep 17 00:00:00 2001
From: Claire Waters
Date: Thu, 21 Nov 2024 12:54:50 -0600
Subject: [PATCH 1/9] adding confidence levels info

---
 .../dlp-profiles/advanced-settings.mdx | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index eb3672ea1f2c783..b536853b75a269a 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -13,7 +13,20 @@ This page lists the advanced settings available when configuring a [predefined](
 
 Match count refers to the number of times that any enabled entry in the profile can be detected before an action is triggered, such as blocking or logging. For example, if you select a match count of 10, the scanned file or HTTP body must contain 11 or more matching strings. Detections do not have to be unique.
 
+## Confidence levels
+
+Confidence levels indicate how confident Cloudflare DLP is in a DLP detection. The confidence is determined by inspecting the content for proximity keywords around the detection.
+
+Low confidence detections are generally based on regular expressions, require few keywords, and will trigger more often. High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy. Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out the lower confidence detections.
+
+Confidence level is set on the DLP profile. When you select a confidence in the dashboard, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in the dashboard are not yet supported or are not applicable.
+
+For inline detections in Gateway, if you would like to see Low and Medium confidence detections but block High confidence detections, Cloudflare recommends using two policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action.
+
 ## Context analysis
+:::note
+Context analysis has been superseded by confidence levels. Customers who had context analysis turned on will be migrated to confidence levels where applicable.
+:::
 
 Context analysis restricts detections based on proximity keywords to prevent false positives. Proximity keywords must be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` will only count as a detection if in proximity to keywords such as `ssn`.
 

From aba9dc071f3c2c59f53cad88830641ec0b994c54 Mon Sep 17 00:00:00 2001
From: Claire Waters
Date: Thu, 21 Nov 2024 13:35:19 -0600
Subject: [PATCH 2/9] past tense edits

---
 .../dlp-profiles/advanced-settings.mdx | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index b536853b75a269a..e94688f3035070e 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -28,13 +28,13 @@ For inline detections in Gateway, if you would like to see Low and Medium confid
 Context analysis has been superseded by confidence levels. Customers who had context analysis turned on will be migrated to confidence levels where applicable.
 :::
 
-Context analysis restricts detections based on proximity keywords to prevent false positives. Proximity keywords must be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` will only count as a detection if in proximity to keywords such as `ssn`.
+When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.
 
-DLP will apply context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections include the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
+DLP applied context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections included the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
 
-### Exclude files from context analysis
+### How you excluded files from context analysis
 
-You can exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you send an email containing the string `123-45-6789`, DLP will only count a detection if the string is in proximity to keywords such as `ssn`. If you include a file in an email containing the string `123-45-6789`, DLP will match a detection regardless of keywords.
+You could exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you sent an email containing the string `123-45-6789`, DLP only counted a detection if the string was in proximity to keywords such as `ssn`. If you included a file in an email containing the string `123-45-6789`, DLP matched a detection regardless of keywords.
 
 To exclude file content from context analysis, in **Exclude content type**, choose _Files_.
 

From e5c1b9217adda90536febc3a4f6776fe7adbd2dc Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 12:24:11 -0500
Subject: [PATCH 3/9] Apply suggestions from code review

Co-authored-by: ranbel <101146722+ranbel@users.noreply.github.com>
---
 .../data-loss-prevention/dlp-profiles/advanced-settings.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index e94688f3035070e..7507e56b3a7c5f7 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -25,7 +25,7 @@ For inline detections in Gateway, if you would like to see Low and Medium confid
 
 ## Context analysis
 :::note
-Context analysis has been superseded by confidence levels. Customers who had context analysis turned on will be migrated to confidence levels where applicable.
+Context analysis has been superseded by [confidence levels](/#confidence-levels). Customers who had context analysis turned on will be migrated to confidence levels where applicable.
 :::
 
 When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.

From f2c9ca2e6d2984f07cab02c990747cf3ef11f3e4 Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 12:25:24 -0500
Subject: [PATCH 4/9] Update src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx

---
 .../data-loss-prevention/dlp-profiles/advanced-settings.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index 7507e56b3a7c5f7..1a793adef9732f3 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -25,7 +25,7 @@ For inline detections in Gateway, if you would like to see Low and Medium confid
 
 ## Context analysis
 :::note
-Context analysis has been superseded by [confidence levels](/#confidence-levels). Customers who had context analysis turned on will be migrated to confidence levels where applicable.
+Context analysis has been superseded by [confidence levels](/#confidence-levels). Users who had context analysis turned on will be migrated to confidence levels where applicable.
 :::
 
 When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.

From 45a45335108a769ac2c32cf8d817b94edcbf0938 Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 11:41:38 -0600
Subject: [PATCH 5/9] Fix broken link

---
 .../data-loss-prevention/dlp-profiles/advanced-settings.mdx | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index 1a793adef9732f3..afc117265c511c6 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -24,8 +24,9 @@ Confidence level is set on the DLP profile. When you select a confidence in the
 For inline detections in Gateway, if you would like to see Low and Medium confidence detections but block High confidence detections, Cloudflare recommends using two policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action.
 
 ## Context analysis
-:::note
-Context analysis has been superseded by [confidence levels](/#confidence-levels). Users who had context analysis turned on will be migrated to confidence levels where applicable.
+
+:::caution
+Context analysis has been superseded by [confidence levels](#confidence-levels). Users who had context analysis turned on will be migrated to confidence levels where applicable.
 :::
 
 When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.

From d85969ab35d0353f6c3d79afd3e8c9d5afa48d75 Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 13:08:54 -0600
Subject: [PATCH 6/9] Fix wording

---
 .../dlp-profiles/advanced-settings.mdx | 24 +++++++++----------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index afc117265c511c6..e195d6945c0f003 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -15,32 +15,32 @@ Match count refers to the number of times that any enabled entry in the profile
 
 ## Confidence levels
 
-Confidence levels indicate how confident Cloudflare DLP is in a DLP detection. The confidence is determined by inspecting the content for proximity keywords around the detection.
+Confidence levels indicate how confident Cloudflare DLP is in a DLP detection. DLP determines the confidence by inspecting the content for proximity keywords around the detection.
 
 Low confidence detections are generally based on regular expressions, require few keywords, and will trigger more often. High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy. Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out the lower confidence detections.
 
-Confidence level is set on the DLP profile. When you select a confidence in the dashboard, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in the dashboard are not yet supported or are not applicable.
+Confidence level is set on the DLP profile. When you select a confidence in Zero Trust, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in Zero Trust are not yet supported or are not applicable.
 
-For inline detections in Gateway, if you would like to see Low and Medium confidence detections but block High confidence detections, Cloudflare recommends using two policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action.
+For inline detections in Gateway, if you would like to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends using two policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action.
 
-## Context analysis
+## Optical Character Recognition (OCR)
+
+Optical Character Recognition (OCR) analyzes and interprets text within image files. When used with DLP profiles, OCR can detect sensitive data within images your users upload.
+
+OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in size. Text is encoded in UTF-8 format, including support for non-Latin characters.
+
+## Context analysis
 
 :::caution
-Context analysis has been superseded by [confidence levels](#confidence-levels). Users who had context analysis turned on will be migrated to confidence levels where applicable.
+Context analysis has been superseded by [confidence levels](#confidence-levels). DLP will migrate users who had context analysis turned on to confidence levels where applicable.
 :::
 
 When it was available, context analysis restricted detections based on proximity keywords to prevent false positives. Proximity keywords had to be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger an context-aware detection. For example, the string `123-45-6789` only counted as a detection if in proximity to keywords such as `ssn`.
 
 DLP applied context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections included the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
 
-### How you excluded files from context analysis
+### Exclude files from context analysis
 
 You could exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you sent an email containing the string `123-45-6789`, DLP only counted a detection if the string was in proximity to keywords such as `ssn`. If you included a file in an email containing the string `123-45-6789`, DLP matched a detection regardless of keywords.
 
 To exclude file content from context analysis, in **Exclude content type**, choose _Files_.
-
-## Optical Character Recognition (OCR)
-
-Optical Character Recognition (OCR) analyzes and interprets text within image files. When used with DLP profiles, OCR can detect sensitive data within images your users upload.
-
-OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in size. Text is encoded in UTF-8 format, including support for non-Latin characters.

From 63f8126116dffb1cccaab265ebeca90fd02496d6 Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 15:47:45 -0600
Subject: [PATCH 7/9] Add Gateway policies

---
 .../dlp-profiles/advanced-settings.mdx | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index e195d6945c0f003..1a63570466448dd 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -17,11 +17,23 @@ Match count refers to the number of times that any enabled entry in the profile
 
 Confidence levels indicate how confident Cloudflare DLP is in a DLP detection. DLP determines the confidence by inspecting the content for proximity keywords around the detection.
 
-Low confidence detections are generally based on regular expressions, require few keywords, and will trigger more often. High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy. Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out the lower confidence detections.
+Confidence level is set on the DLP profile. When you select a confidence level in Zero Trust, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in Zero Trust are not yet supported or are not applicable.
 
-Confidence level is set on the DLP profile. When you select a confidence in Zero Trust, you will see which DLP entries will be affected by the confidence level. Entries that do not reflect a confidence level in Zero Trust are not yet supported or are not applicable.
+DLP confidence detections consist of Low, Medium, and High confidence levels. DLP will default to Low confidence detections, which are based on regular expressions, require few keywords, and will trigger more often. Medium and High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy.
 
-For inline detections in Gateway, if you would like to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends using two policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action.
+Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out lower confidence detections.
+
+### Gateway detections
+
+For inline detections in Gateway, to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends creating two HTTP policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action. For example:
+
+| Selector    | Operator | Value                       | Action |
+| ----------- | -------- | --------------------------- | ------ |
+| DLP Profile | in       | _Low Confidence Detections_ | Allow  |
+
+| Selector    | Operator | Value                        | Action |
+| ----------- | -------- | ---------------------------- | ------ |
+| DLP Profile | in       | _High Confidence Detections_ | Block  |
 
 ## Optical Character Recognition (OCR)
 

From cba5cd33a660cd9e888d2fc0fb5020b6182c0b7c Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 16:03:18 -0600
Subject: [PATCH 8/9] Add procedure

---
 .../data-loss-prevention/dlp-profiles/advanced-settings.mdx | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index 1a63570466448dd..07f1bcdaab14d07 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -21,6 +21,12 @@ Confidence level is set on the DLP profile. When you select a confidence level i
 
 DLP confidence detections consist of Low, Medium, and High confidence levels. DLP will default to Low confidence detections, which are based on regular expressions, require few keywords, and will trigger more often. Medium and High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy.
 
+To change the confidence level of a DLP profile:
+
+1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **DLP** > **DLP profiles**.
+2. Select the profile, then select **Edit**.
+3. In **Advanced settings** > **Confidence Level**, choose a new confidence level from the dropdown menu.
+
 Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out lower confidence detections.
 
 ### Gateway detections

From 18602f746b2c12787505de2896b060a9d4e69405 Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Tue, 26 Nov 2024 16:14:37 -0600
Subject: [PATCH 9/9] Add changelog entry

---
 src/content/changelogs/dlp.yaml                             | 4 ++++
 .../data-loss-prevention/dlp-profiles/advanced-settings.mdx | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/content/changelogs/dlp.yaml b/src/content/changelogs/dlp.yaml
index a70b926586cddf9..6213accc54dd8af 100644
--- a/src/content/changelogs/dlp.yaml
+++ b/src/content/changelogs/dlp.yaml
@@ -5,6 +5,10 @@ productLink: "/cloudflare-one/policies/data-loss-prevention/"
 productArea: Cloudflare One
 productAreaLink: /cloudflare-one/changelog/
 entries:
+  - publish_date: "2024-11-25"
+    title: Profile confidence levels
+    description: |-
+      DLP profiles now support setting a [confidence level](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/#confidence-levels) to choose how tolerant its detections are to false positives based on the context of the detection. The higher a profile's confidence level is, the less false positives will be allowed. Confidence levels include Low, Medium, or High. DLP profile confidence levels supersede [context analysis](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings/#context-analysis).
   - publish_date: "2024-11-01"
     title: Send entire HTTP requests to a Logpush destination
     description: |-
diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index 07f1bcdaab14d07..6e94e9e5dbd5346 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -41,13 +41,13 @@ For inline detections in Gateway, to display Low and Medium confidence detection
 | ----------- | -------- | ---------------------------- | ------ |
 | DLP Profile | in       | _High Confidence Detections_ | Block  |
 
-## Optical Character Recognition (OCR)
+## Optical Character Recognition (OCR) {/* optical-character-recognition-ocr */}
 
 Optical Character Recognition (OCR) analyzes and interprets text within image files. When used with DLP profiles, OCR can detect sensitive data within images your users upload.
 
 OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in size. Text is encoded in UTF-8 format, including support for non-Latin characters.
 
-## Context analysis
+## Context analysis {/* context-analysis */}
 
 :::caution
 Context analysis has been superseded by [confidence levels](#confidence-levels). DLP will migrate users who had context analysis turned on to confidence levels where applicable.
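
A minimal sketch, outside the patch series itself, of the threshold behavior that the new "Confidence levels" section describes: a profile set to Low also matches Medium and High detections, while a profile set to Medium or High filters out lower confidence detections. The `Confidence` enum and the `profile_matches` and `filter_detections` helpers are hypothetical names used only to illustrate the documented semantics; they are not Cloudflare's API or implementation.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical ordering of the three documented confidence levels."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def profile_matches(profile_level: Confidence, detection_level: Confidence) -> bool:
    """A detection matches when it is at least as confident as the profile setting."""
    return detection_level >= profile_level


def filter_detections(profile_level, detections):
    """Keep only the (label, level) pairs that the profile setting accepts."""
    return [
        (label, level)
        for label, level in detections
        if profile_matches(profile_level, level)
    ]


# Illustrative detections; the labels are made up for this sketch.
detections = [
    ("low-example", Confidence.LOW),
    ("medium-example", Confidence.MEDIUM),
    ("high-example", Confidence.HIGH),
]

low_profile = filter_detections(Confidence.LOW, detections)
high_profile = filter_detections(Confidence.HIGH, detections)
```

Under these assumptions, the Low profile keeps all three sample detections and the High profile keeps only the High one, which is the split the two-policy Gateway example relies on (a Low confidence profile with Allow, a High confidence profile with Block).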