Update filters documentation #198
Merged
17 commits
0afb29f Improve anchor in title and refItem (Ndpnt)
a0beaa7 Improve code block style (Ndpnt)
6a1fb28 Add reference for built in filters (Ndpnt)
65d3606 Update declaration ref (Ndpnt)
796a4e7 Add how to guide to apply filters (Ndpnt)
1664c0d Remove obsolete reference (Ndpnt)
829ce0f Improve filters doc (Ndpnt)
05292b1 Minor improvement (Ndpnt)
1e346d8 Improve filter docs (Ndpnt)
be7bb36 Improve writing (Ndpnt)
d19bad9 Improve filter docs (Ndpnt)
f97d959 Improve examples (Ndpnt)
75acc97 Improve filter doc (Ndpnt)
1ef4725 Improve doc (Ndpnt)
4bd20bc Fix title (Ndpnt)
f825bf8 Improve filters documentation (MattiSG)
f5afdf7 Document how to add third-party libs in filters (MattiSG)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
--- | ||
title: "Filters" | ||
weight: 3 | ||
--- | ||
|
||
# Filters | ||
|
||
Filters solve [noise]({{< relref "/terms/guideline/declaring/#usual-noise" >}}) issues in versions that cannot be addressed by selecting or removing content directly with selectors. | ||
|
||
## When filters are needed | ||
|
||
Use filters when: | ||
|
||
- **Content selectors are insufficient**, for example when noise appears within content that can't be targeted with CSS selectors or [range selectors]({{< relref "terms/explanation/range-selectors" >}}) with the [`select`]({{< relref "terms/reference/declaration/#ref-select" >}}) and [`remove`]({{< relref "terms/reference/declaration/#ref-remove" >}}) properties. | ||
- **Content is dynamically generated**, for example when elements change on each page load with changing classes or IDs that cannot be targeted with [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors). | ||
- **Complex tasks are needed**, for example when content transformation is required such as converting images to base64 to store them in the terms version or converting date-based content to a stable format (like “Updated X days ago” to “Last updated on YYYY-MM-DD”). | ||
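
The date conversion mentioned in the last point can be sketched as follows. This is only an illustration: the helper name `toAbsoluteDate`, the exact wording of the relative date, and the `now` parameter are assumptions, and the function is shown without `export` so that it can run on its own.

```js
// Hypothetical helper: converts "Updated X days ago" into a stable absolute date.
// `now` is injectable so the result can be checked against a fixed reference date.
function toAbsoluteDate(text, now = new Date()) {
  const millisecondsPerDay = 24 * 60 * 60 * 1000;

  return text.replace(/Updated (\d+) days? ago/g, (match, days) => {
    const date = new Date(now.getTime() - Number(days) * millisecondsPerDay);

    // toISOString() is expressed in UTC; keep only the YYYY-MM-DD part
    return `Last updated on ${date.toISOString().slice(0, 10)}`;
  });
}

console.log(toAbsoluteDate('Updated 3 days ago', new Date('2023-12-10T12:00:00Z')));
// → Last updated on 2023-12-07
```

Run one day later, the page would read “Updated 4 days ago” and the helper would still yield the same date, which is what keeps the version stable. Results may differ by one day when the page is fetched close to midnight UTC.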
|
||
## How filters work | ||
|
||
Filters are JavaScript functions that can manipulate the DOM structure directly. They modify the document structure and content in-place. | ||
|
||
## Filter design principles | ||
|
||
Filters should follow these core principles: | ||
|
||
- **Specific**: target only the noise to remove. Avoid broad selectors that might accidentally remove important content. | ||
|
||
> For example, if a filter converts relative dates to absolute dates, make sure to scope the targeted dates. This might translate to selecting with `.metadata time`, not `time`, which might also affect important effective dates within the terms content. | ||
|
||
- **Idempotent**: filters should produce the same result even if run multiple times on their own output. This ensures that re-running a filter never introduces new changes in versions. | ||
|
||
> For example, if a filter adds section numbers like "1." to headings, it should check if the numbers already exist, to prevent "1. Privacy Policy" from becoming "1. 1. Privacy Policy" on repeated runs. | ||
|
||
- **Efficient**: DOM queries should be optimised and filters should avoid unnecessary operations, processing only the elements needed. | ||
|
||
> For example, if a filter updates timestamp elements with a specific class, using `document.querySelectorAll('.timestamp')` is more efficient than `document.querySelectorAll('*')` followed by filtering for timestamp elements. | ||
|
||
- **Safe**: filters must not accidentally remove important content. The generated version should always be checked after adding a filter to ensure it still contains the whole terms content. |
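
As an illustration of the idempotence principle, here is a minimal sketch of the heading numbering example. The filter name `numberHeadings` and the `h2` selector are assumptions chosen for the example, and the function is shown without `export` so that it can run on its own.

```js
// Hypothetical filter: prefixes each heading with its position ("1. ", "2. ", …)
function numberHeadings(document) {
  document.querySelectorAll('h2').forEach((heading, index) => {
    // Skip headings that already start with a number, so repeated runs change nothing
    const alreadyNumbered = /^\d+\.\s/.test(heading.textContent);

    if (!alreadyNumbered) {
      heading.textContent = `${index + 1}. ${heading.textContent}`;
    }
  });
}
```

Without the `alreadyNumbered` guard, a second run would turn "1. Privacy Policy" into "1. 1. Privacy Policy".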
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
--- | ||
title: Apply filters | ||
weight: 7 | ||
--- | ||
|
||
# How to apply filters | ||
|
||
This guide explains how to add filters to an existing declaration in order to remove noise from versions when that noise cannot be removed with CSS selectors. | ||
|
||
## Prerequisites | ||
|
||
- An existing terms declaration file. | ||
- The noise to remove has been identified, and you have double-checked that it cannot be removed with CSS selectors through the [`remove`]({{< relref "terms/reference/declaration/#ref-remove" >}}) property. | ||
|
||
## Step 1: Check for built-in filters | ||
|
||
Built-in filters are pre-defined functions that handle common noise patterns. They are the easiest way to clean up content. | ||
|
||
Review the available [built-in filters]({{< relref "/terms/reference/built-in-filters" >}}) to find if one matches your needs. | ||
|
||
If you find a suitable built-in filter, proceed to [Step 3](#step-3-declare-the-filter). Otherwise, create a custom filter as described in Step 2. | ||
|
||
## Step 2: Create a custom filter _(optional)_ | ||
|
||
This step is only needed if no built-in filter matches your needs. Creating a custom filter requires JavaScript knowledge and familiarity with DOM manipulation. | ||
|
||
### Create the filter file | ||
|
||
Create a JavaScript file in the same folder as your service declaration, with the same name but with the `.filters.js` extension. | ||
|
||
> For example, if your declaration is `declarations/MyService.json`, create `declarations/MyService.filters.js`. | ||
|
||
### Write the filter function | ||
|
||
Define your filter function with the following signature: | ||
|
||
```js | ||
export function myCustomFilter(document, parameters) { | ||
  // `parameters` is optional: omit it if the filter takes none | ||
  // Your filter logic here | ||
} | ||
``` | ||
|
||
#### Parameters | ||
|
||
- `document`: JSDOM document instance representing the web page | ||
- `parameters`: values passed from the declaration _(optional)_ | ||
|
||
#### Example: Remove session IDs from text content | ||
|
||
Suppose the page contains a session ID within its text content: | ||
|
||
```html | ||
<p>We collect your data for the following purposes:</p> | ||
<ul> | ||
<li>To provide our services</li> | ||
<li>To improve user experience</li> | ||
</ul> | ||
<p class="session-id">Last updated on 2023-12-07 (Session: abc123def456)</p> | ||
``` | ||
|
||
You can implement this filter as follows: | ||
|
||
```js | ||
export function removeSessionIds(document) { | ||
  // Find all paragraphs that might contain session IDs | ||
  const paragraphs = document.querySelectorAll('.session-id'); | ||
|
||
  paragraphs.forEach(paragraph => { | ||
    let text = paragraph.textContent; | ||
    // Remove session ID patterns like "Session: abc123" or "(Session: def456)" | ||
    text = text.replace(/\s*\(?Session:\s*[a-zA-Z0-9]+\)?/g, ''); | ||
    paragraph.textContent = text.trim(); | ||
  }); | ||
} | ||
``` | ||
|
||
Result after applying the filter: | ||
|
||
```diff | ||
<p>We collect your data for the following purposes:</p> | ||
<ul> | ||
<li>To provide our services</li> | ||
<li>To improve user experience</li> | ||
</ul> | ||
- <p class="session-id">Last updated on 2023-12-07 (Session: abc123def456)</p> | ||
+ <p class="session-id">Last updated on 2023-12-07</p> | ||
``` | ||
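
To check that this filter is idempotent, it can be run twice on the same content and the two results compared. The sketch below repeats the filter without `export` so that it runs on its own, and uses a hypothetical `fakeDocument` stand-in rather than a real JSDOM document; it only mimics the `querySelectorAll` call the filter relies on.

```js
function removeSessionIds(document) {
  document.querySelectorAll('.session-id').forEach(paragraph => {
    const text = paragraph.textContent.replace(/\s*\(?Session:\s*[a-zA-Z0-9]+\)?/g, '');

    paragraph.textContent = text.trim();
  });
}

// Stand-in for a document: returns the same elements whatever the selector (assumption for this sketch)
function fakeDocument(elements) {
  return { querySelectorAll: () => elements };
}

const paragraph = { textContent: 'Last updated on 2023-12-07 (Session: abc123def456)' };
const doc = fakeDocument([paragraph]);

removeSessionIds(doc);
const firstPass = paragraph.textContent;

removeSessionIds(doc);
const secondPass = paragraph.textContent;

console.log(firstPass === secondPass);
// → true
```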
|
||
## Step 3: Declare the filter | ||
|
||
Open your service declaration file (e.g. `declarations/MyService.json`) and locate the `filter` property of the specific terms you want to apply the filter to. If it doesn't exist, add it as an array. | ||
|
||
### Filter without parameters | ||
|
||
For filters that don’t require parameters, add the filter name as a string: | ||
|
||
```json | ||
{ | ||
  "name": "MyService", | ||
  "terms": { | ||
    "Privacy Policy": { | ||
      "fetch": "https://my.service.example/en/privacy-policy", | ||
      "select": ".textcontent", | ||
      "filter": [ | ||
        "removeSessionIds" | ||
      ] | ||
    } | ||
  } | ||
} | ||
``` | ||
|
||
### Filter with parameters | ||
|
||
For filters that take parameters, use an object whose key is the filter name and whose value holds the parameters. For example, the built-in filter `removeQueryParams` removes the given query parameters from URLs: | ||
|
||
```json | ||
{ | ||
  "name": "MyService", | ||
  "terms": { | ||
    "Privacy Policy": { | ||
      "fetch": "https://my.service.example/en/privacy-policy", | ||
      "select": ".textcontent", | ||
      "filter": [ | ||
        { | ||
          "removeQueryParams": ["utm_source", "utm_medium", "utm_campaign"] | ||
        } | ||
      ] | ||
    } | ||
  } | ||
} | ||
``` | ||
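
A custom filter can take parameters in the same way. The sketch below assumes that the value declared for the filter is passed as its second argument, as the `parameters` description in Step 2 suggests. The filter name `removeLabelledIds` and the `.session-id` selector are hypothetical, and the function is shown without `export` so that it can run on its own.

```js
// Hypothetical filter: removes identifiers introduced by any of the given labels,
// for example "(Session: abc123)" or "Trace: 9f8e"
function removeLabelledIds(document, labels = []) {
  if (labels.length === 0) {
    return; // Nothing to remove
  }

  // Escape regular expression metacharacters so that labels are matched literally
  const escaped = labels.map(label => label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  const pattern = new RegExp(`\\s*\\(?(?:${escaped.join('|')}):\\s*[a-zA-Z0-9]+\\)?`, 'g');

  document.querySelectorAll('.session-id').forEach(element => {
    element.textContent = element.textContent.replace(pattern, '').trim();
  });
}
```

Under that assumption, it would be declared as `{ "removeLabelledIds": ["Session", "Trace"] }` in the `filter` array.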
|
||
### Multiple filters | ||
|
||
You can combine multiple filters in the same declaration: | ||
|
||
```json | ||
{ | ||
  "name": "MyService", | ||
  "terms": { | ||
    "Privacy Policy": { | ||
      "fetch": "https://my.service.example/en/privacy-policy", | ||
      "select": ".textcontent", | ||
      "filter": [ | ||
        { | ||
          "removeQueryParams": ["utm_source", "utm_medium"] | ||
        }, | ||
        "removeSessionIds" | ||
      ] | ||
    } | ||
  } | ||
} | ||
``` | ||
|
||
## Step 4: Test the filter | ||
|
||
After adding the filter, test your declaration to ensure it works correctly: | ||
|
||
1. Start the terms tracking process | ||
2. Check that the noise has been removed | ||
3. Verify that important content is preserved |
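
To support the last two checks, one option is to compare the text before and after filtering and list what disappeared. The helper below is a hypothetical sketch, not part of any existing tooling: it works on plain strings, so the text has to be extracted from the versions by other means.

```js
// Hypothetical helper: lists the whitespace-separated words present before filtering but absent after
function wordsRemoved(before, after) {
  const remaining = new Set(after.split(/\s+/).filter(Boolean));

  return before.split(/\s+/).filter(word => word && !remaining.has(word));
}

console.log(wordsRemoved(
  'Last updated on 2023-12-07 (Session: abc123def456)',
  'Last updated on 2023-12-07',
));
// → [ '(Session:', 'abc123def456)' ]
```

If the list contains anything other than the noise targeted by the filter, the filter is probably too broad.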
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
--- | ||
title: "Built-in filters" | ||
--- | ||
|
||
# Built-in filters | ||
|
||
This reference details all available built-in [filters]({{< relref "terms/explanation/filters" >}}) that can be applied to avoid noise in versions. | ||
|
||
{{< refItem | ||
name="removeQueryParams" | ||
description="Removes specified query parameters from URLs in links and images." | ||
>}} | ||
|
||
```json | ||
"filter": [ | ||
{ | ||
"removeQueryParams": ["utm_source", "utm_medium"] | ||
} | ||
] | ||
``` | ||
|
||
```diff | ||
- <p>Read the <a href="https://example.com/example-page?utm_source=OGB&utm_medium=website&lang=en">list of our affiliates</a>.</p> | ||
+ <p>Read the <a href="https://example.com/example-page?lang=en">list of our affiliates</a>.</p> | ||
``` | ||
|
||
{{< /refItem >}} |
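
To make the behaviour concrete, here is a rough sketch of how such a filter could work. It is not the actual built-in implementation: the names `stripQueryParams` and `removeQueryParamsSketch` are hypothetical, and the sketch relies only on the standard `URL` API.

```js
// Removes the given query parameters from an absolute URL
function stripQueryParams(href, paramsToRemove) {
  const url = new URL(href);

  paramsToRemove.forEach(param => url.searchParams.delete(param));

  return url.toString();
}

// Applies the removal to every link and image of the document
function removeQueryParamsSketch(document, paramsToRemove) {
  document.querySelectorAll('a[href], img[src]').forEach(element => {
    const attribute = element.hasAttribute('href') ? 'href' : 'src';

    try {
      element.setAttribute(attribute, stripQueryParams(element.getAttribute(attribute), paramsToRemove));
    } catch (error) {
      // Relative or malformed URLs cannot be parsed without a base: leave them untouched
    }
  });
}

console.log(stripQueryParams(
  'https://example.com/example-page?utm_source=OGB&utm_medium=website&lang=en',
  ['utm_source', 'utm_medium'],
));
// → https://example.com/example-page?lang=en
```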
I think I'll use a structure based on the principle that we most often use a built-in filter and optionally a custom filter, so I would put everything related to creating a custom filter on a dedicated how-to page.
I think we currently do not have enough builtin filters to justify splitting into two pages