diff --git a/crowdsec-docs/docs/contributing/bouncers.md b/crowdsec-docs/docs/contributing/bouncers.md index a3e5035c2..3246c8bc2 100644 --- a/crowdsec-docs/docs/contributing/bouncers.md +++ b/crowdsec-docs/docs/contributing/bouncers.md @@ -14,6 +14,32 @@ Sharing on the hub allows other users to find and use it. While increasing your ### How ? +#### Specs + +Remediation components have mandatory and optional features, which are described in the following subpages: +- [Specifications for Remediation Component and AppSec Capabilities](/contributing/specs/bouncer_appsec_specs) +- [Remediation Component Metrics](/contributing/specs/bouncer_metrics_specs) + +*Don't hesitate to get in touch with us via Discord if anything is unclear to you.* + +Those specs describe how the Remediation Component interacts with the Security Engine Local API as well as how each feature should behave. + +Main features are: +- **Mode**: How the bouncer retrieves decisions + - **Stream**: Pulls them periodically and stores them locally (preferred for low-latency remediation) + - **Live**: Queries the LAPI upon request reception (easier to implement) + - Ideally both are available, but **Stream** is preferred in most cases +- **AppSec**: Ability to forward requests to the Security Engine to evaluate AppSec rules + - Optional, but if the remediation component has access to the request, this feature is a big plus +- **Metrics**: Keep track of what was remediated + - Optional, but very useful for users to evaluate the efficiency of the protection + - Ideally with details on the source of the decision (blocklist, manual block, a scenario triggering a decision 'crowdsec'...) + +Other optional features are: +- **mTLS** support +- Exposing metrics to **Prometheus** + +#### Publish on GitHub To have it published on the hub, please simply [open a new issue on the hub](https://github.com/crowdsecurity/hub/issues/new), requesting "remediation component inclusion". 
The remediation component will then be reviewed by the team, and published directly on the hub, for everyone to find & use it! diff --git a/crowdsec-docs/docs/contributing/specs/bouncer_appsec_specs.mdx b/crowdsec-docs/docs/contributing/specs/bouncer_appsec_specs.mdx new file mode 100644 index 000000000..6d8d79109 --- /dev/null +++ b/crowdsec-docs/docs/contributing/specs/bouncer_appsec_specs.mdx @@ -0,0 +1,547 @@ +--- +id: bouncer_appsec_specs +title: Specifications for Remediation Component and AppSec Capabilities +--- + +import useBaseUrl from "@docusaurus/useBaseUrl" + +## Context + +A **Remediation Component** *(aka **Bouncer**)* enforces **decisions** made by the **CrowdSec Security Engine** based on detected malicious behaviors [\[See figure 1\]](#crowdsec-security-engine-diagram), or works directly with the CrowdSec SaaS endpoint, which channels public, CrowdSec or user-made blocklists. + +A **decision** dictates what action should be applied to incoming traffic from a specific **IP** or **IP-Range**. *(It could also be on the user scope or any other, but these specifications will focus on the IP and Range scopes)* + +The Bouncer communicates with the Security Engine to retrieve the decisions. +The Bouncer applies the appropriate remediation *(we’ll only focus on ban/block and captcha)*. + +**The following specifications cover** + +* [Basic Bouncer features](#basic-bouncer-features) + * Communication with the Local API (aka LAPI) or its SaaS counterpart + * Decisions retrieval and storage + * Remediation +* [Request Forwarding](#appsec-capability-request-forwarding) (for AppSec capabilities) + * Communication with the AppSec endpoint + * Forwarding protocol +* [Config and support requirements](#extra-details-and-requirements) + +Here is an existing remediation component *(bouncer)* for Nginx and its Lua dependency. +It's one of the most complete bouncers, with AppSec capabilities and Metrics, and a good example to follow for your implementation. 
+[*cs-nginx-bouncer*](https://github.com/crowdsecurity/cs-nginx-bouncer) *\+ [lua-cs-bouncer](https://github.com/crowdsecurity/lua-cs-bouncer/)* (dependency) + +And a more recent, soon-to-be-finalized [Node JS bouncer](https://github.com/crowdsecurity/nodejs-cs-bouncer) (for a different implementation, to be used in code) + +⚠️ **Your bouncer must always delete/clean its resources on shutdown** + +## Basic Bouncer features + +The bouncer connects to LAPI to retrieve the decisions. +It applies a remediation to incoming requests if the source IP can be found in the decisions list. +The remediation can be blocking or displaying a captcha. + +Fields in purple and/or with the mention (configurable) must appear in the config file. The case of the parameter names can be UPPER or LOWER depending on the type of config file; match the appropriate standard for the bouncer you’re implementing. Try to group them in a logical way in the config file template. + +Details about the config file are in the [Installation chapter](#installation--documentation) + +### Connecting to the Local API (LAPI) + +You can find the swagger here [https://crowdsecurity.github.io/api_doc/lapi/](https://crowdsecurity.github.io/api_doc/lapi/) +Details about the endpoint parameters can be found [in the appendix](#appendix) + +* URL to Local API endpoint: configurable field **api_url** + * Default value likely to be: `http://127.0.0.1:8080` + * Security Engine Config : /etc/crowdsec/config.yaml // api.server.listen_url + * For now we only have a v1 of LAPI; the bouncer states the version it’s using +* Authentication + * Either by API key passed in the header **X-Api-Key:** configurable field **api_key** + * Or via certificate configurable fields **tls.cert_file \+ tls.key_file** + +### Retrieving decisions + +There are 2 ways of retrieving decisions: + +* **Live Mode**: “Each time” a request is handled we call CrowdSec Security Engine +* **Stream Mode**: We store all decisions in memory and periodically call
for delta update + +*We’ll prefer Stream Mode as it’s better for latency for a memory cost that is very acceptable.* +The Stream mode will be the default one in config: configurable field mode + +#### Live Mode + +The live mode endpoint is **/decisions** + +* Parameters + *Only the following fields are to be considered for a basic bouncer implementation* + * **scope:** value **IP** (forced for live mode) + * **value:** the source IP making the request + * **origins:** empty by default mean all origins are considered + * editable by the user to look in a specific origin: configurable field origins + * Origins are comma separated strings (e.g. crowdsec,capi,cscli) + * **contains**: empty by default mean *true* + * Indicates if it should check range decisions + * No need to make this configurable +* Caching + * To avoid consecutive calls for decisions about an IP we’ll cache the decisions per IP + * default **1s** configurable field cache_expiration +* Timeout + * If LAPI doesn’t respond + * Default **200ms** configurable field lapi_timeout + * **Fallback**: + * Fallback in case of timeout + * By default passthrough : let him pass + * Possible values: passthrough, ban, captcha + * configurable field lapi_failure_action + +#### Stream Mode (by default) + +The stream mode endpoint is **/decisions/stream** +Allows to pull all decisions from LAPI and then periodically get a delta + +* Get Decisions + * ⚠️ To retrieve the initial full list, use the **startup \= true** parameter + * This is necessary if you don’t have the decision list in memory + * Following calls need to have startup \= false + * Recommended pull period **10s** configurable field stream_update_frequency + * Parameters + + *Only the following fields are to be considered for a basic bouncer implementation* + + * **scopes**: default to “ip,range” for stream mode + * **origins**: + * empty by default mean all origins are considered + * editable by the user to look in a specific origin: configurable field 
origins (same field as for live) + * Origins are comma separated strings (e.g. crowdsec,capi,cscli) + * **scenarios_containing** and **scenarios_not_containing** + * Means that the decisions are linked (or not) to alerts triggered by such or such scenario + * The check done by LAPI is a string.contains(...) + * Default as empty configurable fields scenarios_containing && scenarios_not_containing +* Storing decisions + * ℹ️ The number of decisions you can expect is: + * 30-70k ips from Fire (nominal case) + * Can vary a lot depending on the BL subscription of the user + * Have the code be able to handle 100k to be safe for the nominal case + * Storing in memory is ideal, we recommend to convert IPs to integers + * The decisions format is the following: + * See [decisions example in appendix](#decision-example) + * There can be multiple decisions per IP + * Store each decisions independently as they have their own remediation action and TTL + * Ranges are stored too + * ⚠️ do not transform the range into its containing IPS + * Pruning + * When you GET you’ll receive “deleted” decisions + * Also Clean after a GET or periodically for decisions with expired TTL + +### Apply remediation + +If a remediation is found and for the LAPI timeout fallback here are the remediations that should be supported + +* Remediation type + * Remediation property will be “ban”, “captcha” or potentially any custom string + * **ban** (block) + * Return a 403: configurable field ban_return_code + * Accompanied by an HTML body + * Default page model [provided (single HTML file)](#ban-template-page) + * Page path configurable: configurable field ban_template_path + * **captcha** + * Various type of captcha must be supported + * configurable fields: + * captcha _provider + * captcha_secret_key + * captcha_site_key + * captcha_template_path + * Type to support + * RE-captcha + * Turnstile + * Hcaptcha + * onFails + * Re-present the captcha + * ⚠️ Cache: in order not to repeat the captcha too 
often + * **1h** cache **per IP** after successful captcha configurable field cache_expiration + * **Custom remediation** + * Defaults to ignore/ban/captcha configurable field **remediation_fallback** + * If ignored, you don’t even need to store the decision +* Remediation priority + * There is a priority between remediations to take into account if an IP has multiple decisions + * Default priority order **Ban** then **Captcha** +* Metrics see below and in the [detailed metrics specs](/contributing/specs/bouncer_metrics_specs) + +### Logging + +* When a remediation occurs, log something containing timestamp,sourceIP,remediationType + +### Metrics + +Remediation components can push information and internal metrics to LAPI about their configuration and the number of requests/packets/bytes/… that have been blocked or allowed. + +The data is pushed on the `/usage-metrics` endpoint of LAPI. +The metrics push interval should be configurable, with a default value of 30 minutes, and must not allow intervals smaller than 10 minutes. Setting the interval to 0 disables the push. + +The body will contain information about: + +- The remediation component type and version +- Name and version of the operating system the RC is running on +- Enabled feature flags (should be empty for the vast majority of RCs) +- Meta information about the payload itself: the push interval in seconds, the startup timestamp in UTC, the push timestamp in UTC +- A list of metrics: + - Each metric must have a name, value, unit, and, optionally, one or more labels + +The metrics track the number of blocked requests per decision origin, so the RC must track internally the origin of every decision (based on the `origin` field from the decision stream). +Each push must reset the internal counter for the metrics (i.e., we only send the number of blocked requests since the last push). 
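As a rough illustration of the per-origin counting and reset-on-push behaviour described above, here is a minimal Python sketch. The class `BlockedCounters` and its method names are hypothetical and not part of any CrowdSec library; only the idea of counting blocked requests per decision `origin` and resetting after each push comes from this spec.

```python
import threading
from collections import Counter
from typing import Dict


class BlockedCounters:
    """Hypothetical in-memory counters of blocked requests, keyed by decision origin."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._blocked = Counter()

    def record_blocked(self, origin: str) -> None:
        # Called each time a request is blocked because of a decision.
        with self._lock:
            self._blocked[origin] += 1

    def snapshot_and_reset(self) -> Dict[str, int]:
        # Return the counts accumulated since the last push and start a new window.
        with self._lock:
            snapshot = dict(self._blocked)
            self._blocked.clear()
        return snapshot


counters = BlockedCounters()
counters.record_blocked("CAPI")
counters.record_blocked("CAPI")
counters.record_blocked("cscli")

first_push = counters.snapshot_and_reset()   # counts since startup
second_push = counters.snapshot_and_reset()  # nothing blocked since the first push
```

Taking the snapshot and clearing the counters under the same lock is one way to make sure a request blocked during a push is counted in exactly one window.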
+Each metric about blocked requests must have an `origin` label whose value is the origin of the decision and a `remediation_type` label whose value is the type of remediation that was applied (e.g., `ban` or `captcha`). +A `processed` metric must also be present that counts the number of requests that were processed by the RC (regardless of whether they were blocked or not). This metric has no label. + +A full sample payload can be found in the [appendix](#metrics-payload). + +## AppSec Capability (request forwarding) + +An additional capability of the bouncer, which can be activated, is to forward the request to the Security Engine, allowing more advanced behavior detection. + +Request forwarding is a blocking process: when the AppSec capability is activated, the bouncer should wait for a response to each forwarded request before proceeding with the request handling. + +AppSec is disabled by default and is activated when a URL is set: configurable field appsec_url + +* Connect to AppSec endpoint + * The Security Engine should have AppSec activated, and a listen address should be present in the Security Engine acquisition + * Default endpoint `http://127.0.0.1:7422` + * Auth by API key passed in the header **X-Api-Key:** same param as LAPI api_key +* Request forwarding + * You can find information about the forwarding protocol on this doc page: [https://docs.crowdsec.net/docs/next/appsec/protocol/](https://docs.crowdsec.net/docs/next/appsec/protocol/) + * When forwarding the query to the AppSec endpoint, the Security Engine will evaluate the actions to take and return the appropriate response code that the remediation component should return. + * ⚠️ With the exception of codes **500** and **401**, which mean that the forwarding or authentication to the endpoint failed. For those response codes you should trigger the fallback described hereafter. 
+ * ⚠️ As stated earlier this is a blocking process + * **Timeout** 200ms configurable field appsec_timeout + * **Fallback**: + * Fallback in case of timeout or response failure (500,401…) + * By default passthrough : let the request pass + * Possible values: passthrough, ban, captcha + * configurable field appsec_failure_action + +## Extra Details and Requirements + +* The name and version of the bouncer are specified via its **user-agent** when communicating with LAPI + * The format is the following : *crowdsec-\-bouncer/v\* + * E.g. *crowdsec-firewall-bouncer/v1.42* +* Ideally the bouncer would work on Windows versions (if any) and OpenBSD (if any) +* The bouncer should be able to handle **HTTP 1 & 2 requests**, or mention the limitations + +## Installation / Documentation + +Usually we (at CrowdSec) will deal with **documentation**, **install scripts** and **packaging**. But any pointers from the bouncer’s developer that can help those processes are welcome on the following: + +Let us know what minimum version of the service is required to run the bouncer. + +Provide a brief description of the steps necessary to install and configure the bouncer. +⚠️ Note that the bouncer configuration files must be located in ***/etc/crowdsec/bouncers/*** + +* The bouncer config file name pattern is the following: *crowdsec-\-bouncer.conf* +* Example of config file */etc/crowdsec/bouncers/crowdsec-apache-bouncer.conf* + +Ideally, at install or warmup of the bouncer, a check is made that the *crowdsec service* is running, and the bouncer key is automatically created and added to the bouncer config. Provide advice about the best way and phase to perform those actions for this bouncer. + +## Developing / Testing + +Here are some pointers and docs to help you test/mock actions for the bouncer during development. + +### Init & Decisions management + +First you must create a bouncer key for your bouncer to communicate with LAPI. +Actions on bouncers can be done via the *cscli bouncers …* commands. 
+Example: + +``` +$ sudo cscli bouncer add myTestBouncer + +API key for 'myTestBouncer': + + 26WsbH6MLaKUaRilA1zQ4LyYbMz3LvOsDel9bEZXv+U + +Please keep this key since you will not be able to retrieve it! + +$ sudo cscli bouncers list +──────────────────────────────────────────────────────────────────────────────────────── + Name IP Address Valid Last API pull Type Version Auth Type +──────────────────────────────────────────────────────────────────────────────────────── + myTestBouncer ✔️ 2024-01-29T09:24:24Z api-key +──────────────────────────────────────────────────────────────────────────────────────── +``` + +Note that the IP address, type and version will appear after the first connection of the bouncer. + +### Populating decisions + +You can have decisions with various origins; here are a few ways to populate them. + +#### Local decisions & Community blocklist + +If you installed CrowdSec on a server with internet access, and it’s able to communicate with our Central API, it will periodically retrieve the community blocklist. If your CrowdSec shares signals, you’ll get between 10k and 50k decisions from the community blocklist (decisions origin will be CAPI); if not, you’ll receive a fraction of that. + +#### Manually populating decisions + +You can add and remove decisions manually: +Public documentation [available here](https://doc.crowdsec.net/u/user_guides/decisions_mgmt/) + +* Via **cscli decisions add/delete.** + * E.g. sudo cscli decisions add \-i 1.2.3.4 + * Those decisions origin will be “*cscli*” +* Via **cscli decisions import**. + * E.g. sudo cscli decisions import \-i ./myBl.txt \--format values + * Those decisions origin will be “*cscli-import*” + +#### Testing failures + +Shut down the *crowdsec service* to test the failure cases. + +#### Testing AppSec + +You can refer to the AppSec documentation to test request forwarding. 
+ +* AppSec [quickstart guide here](https://doc.crowdsec.net/docs/next/appsec/quickstart), [Testing example here](https://doc.crowdsec.net/docs/next/appsec/installation#making-sure-everything-works) +* E.g. Install virtual patching and try to query a */rpc2* or a *.env* file + +## Appendix + +### CrowdSec Security Engine diagram +**Figure 1** : Interactions around **CrowdSec Security Engine** +
+ +### Details about LAPI endpoints parameters + +**GET /decisions/stream** + +* **startup:** set it to **TRUE** for the **initial call** to get all decisions *(when False you’ll get the delta from your last call)* +* **scopes: “**ip,range” is the only relevant values when remediating on IPs +* **origins:** Leave blank to allow all origins, test your configurable origins with [those tests](#decision-example) +* **scenarios_containing:** leave blank by default, allow change in config +* **scenarios_not_containing:** leave blank by default, allow change in config + +**GET /decisions** + +* **scope: “**ip” is the only relevant values when remediating on IPs +* **value:** the ip itself as a string +* **type:** filtering on type of decisions, leave blank by default to get any decisions +* **ip:** ignore/leave blank: shortcut for scope:ip \+ value +* **range:** ignore/leave blank: shortcut for scope:range \+ value +* **contains:** leave blank by default, configurable by user +* **origins:** Leave blank to allow all origins, test your configurable origins with [those tests](#decision-example) +* **scenarios_containing:** leave blank by default, allow change in config +* **scenarios_not_containing:** leave blank by default, allow change in config + +### Decision example + +```javascript +{ + "deleted": [ + { + "duration": "-75h34m54.509128301s", + "id": 55873846, + "origin": "CAPI", + "scenario": "crowdsecurity/ssh-bf", + "scope": "Ip", + "type": "ban", + "value": "61.155.106.101" + }, +], + "new": [ + { + "duration": "167h59m20.890999684s", + "id": 55898280, + "origin": "CAPI", + "scenario": "crowdsecurity/CVE-2022-35914", + "scope": "Ip", + "type": "ban", + "value": "45.95.147.236" + }, +] +} + +``` + +### Ban template page + +```javascript + + + + CrowdSec Ban + + + + + + +
+CrowdSec Access Forbidden
+You are unable to visit the website.
+Your IP seems to have been blocked, check our CTI info about it or contact this website admin
+This security check has been powered by CrowdSec
+``` + +### Metrics payload + +More details about metrics in [Metrics specs](/contributing/specs/bouncer_metrics_specs/) + +```json +{ + "remediation_components": [ { + "type": "my-bouncer-stat", + "version": "1.0.0", + "os": { + "name": "ubuntu", + "version": "22.04" + }, + "features": [], //Always empty / invalid / ignored for bouncers + "meta": { + "window_size_seconds": 1800, + "utc_startup_timestamp": 123123, + "utc_now_timestamp": 123123123123 + }, + "metrics": [ + { + "name": "blocked", + "value": 100, + "labels": { + "origin": "fire", + "remediation_type": "ban" + }, + "unit": "request" + }, + { + "name": "blocked", + "value": 40, + "labels": { + "origin": "crowdsec", + "remediation_type": "ban" + }, + "unit": "request" + }, + { + "name": "blocked", + "value": 60, + "labels": { + "origin": "crowdsec", + "remediation_type": "captcha" + }, + "unit": "request" + }, + { + "name": "blocked", + "value": 100, + "labels": { + "origin": "lists:tor", + "remediation_type": "ban" + }, + "unit": "request" + }, + { + "name": "processed", + "value": 500, + "unit": "request" + } + ] +}]} +``` diff --git a/crowdsec-docs/docs/contributing/specs/bouncer_metrics_specs.mdx b/crowdsec-docs/docs/contributing/specs/bouncer_metrics_specs.mdx new file mode 100644 index 000000000..91c45eb17 --- /dev/null +++ b/crowdsec-docs/docs/contributing/specs/bouncer_metrics_specs.mdx @@ -0,0 +1,369 @@ +--- +id: bouncer_metrics_specs +title: Remediation Component Metrics +--- + +## Overview + +This document provides a comprehensive guide for developers to implement the "[Remediation Metrics](https://docs.crowdsec.net/docs/next/observability/usage_metrics)" feature in a remediation component. The remediation metrics feature allows remediation components to report [raw metrics](https://docs.crowdsec.net/u/service_api/quickstart/metrics/#raw-metrics) about their activity to the Local API (LAPI), which can then be forwarded to the Central API (CAPI) for monitoring and analytics purposes. 
+ +The remediation component should send the following data: + +- "**dropped**" metrics: the total number of units (`byte`, `packet` or `request`) for which a remediation (`ban`, `captcha`, etc.) has been applied. + For these metrics, data should be split into origin/remediation pairs. +- "**processed**" metrics: the total number of units that have been processed by the remediation component. + It must also include the number of "bypass" (i.e. when no decision was applied). +- "**active_decisions**" metrics: the number of decisions currently known by the remediation component. + +Additionally, some relevant time values must be sent: + +- "**window_size_seconds**": The time interval between metric reports (typically 1800 seconds / 30 minutes). + We recommend a minimum delay of 15 minutes between each transmission. +- "**utc_startup_timestamp**": When the remediation component started. This can vary depending on implementation: + - For daemon bouncers: timestamp when the daemon process started + - For "on-demand" bouncers like the PHP one: timestamp of the first LAPI call/pull + + +As an example, here is the kind of expected payload that you will have to build and send: + +### Metrics Payload example + +```json +{ + "remediation_components": [{ + "name": "my-bouncer", + "type": "crowdsec-custom-bouncer", + "version": "1.0.0", + "feature_flags": [], + "utc_startup_timestamp": 1704067200, + "os": { + "name": "linux", + "version": "5.4.0" + }, + "metrics": { + "meta": { + "window_size_seconds": 1800, + "utc_now_timestamp": 1704069000 + }, + "items": [ + { + "name": "dropped", + "value": 150, + "unit": "request", + "labels": { + "origin": "CAPI", + "remediation": "ban" + } + }, + { + "name": "dropped", + "value": 25, + "unit": "request", + "labels": { + "origin": "cscli", + "remediation": "ban" + } + }, + { + "name": "dropped", + "value": 12, + "unit": "request", + "labels": { + "origin": "cscli", + "remediation": "captcha" + } + }, + { + "name": "processed", 
+ "value": 1175, + "unit": "request" + }, + { + "name": "active_decisions", + "value": 342010 + } + ] + } + }] +} +``` + + +For more details on valid payloads, please refer to the [API specification](https://crowdsecurity.github.io/api_doc/index.html?urls.primaryName=LAPI#/Remediation%20component/usage-metrics). + + + +## Architecture Overview + +### Key Features + +Implementing remediation metrics involves several capabilities: + +1. **Metrics Storage**: + - Store "remediation by origin" counters and relevant time values in a persistent storage. + - Update or delete stored values +2. **Metrics Building**: + - Retrieve metrics in storage + - Format metrics according to the API specification +3. **Metrics Transmission**: + - Send metrics to LAPI `usage-metrics` endpoint + - Update metrics items so that next push will only send fresh metrics + +### Core Concepts + +- **Origins**: The source of a remediation (e.g., `CAPI`, `lists:***`, `cscli`, etc). + + As we want to track the total number of processed items, we also need to be able to count the number of "bypass". That's why you may use a `clean` and `clean_appsec` origins to track bypass remediations for regular and AppSec traffic respectively. + +- **Remediations**: The final action effectively applied by the remediation component (e.g., "ban", "captcha", "bypass") + + The remediation stored in metrics **must be the final remediation effectively applied by the bouncer**, not the original decision from CrowdSec. Examples: + + - **Captcha Resolution**: If the original decision was "captcha" but the user has already solved the captcha and can access the page, store "bypass" as the final remediation. + + - **Remediation Transformation**: If the original decision was "ban" but the bouncer configuration transforms it to "captcha" (and the user hasn't solved it yet), store "captcha" as the final remediation. 
+ + - **Fallback Scenarios**: If a timeout occurs and the bouncer applies a fallback remediation, store the fallback remediation, not the original intended one. + + +## Implementation Guide + +### 1. Storage + +#### 1.1 Cached Items + +Every time the remediation component is involved, storage should be used to persist data: + +- origin and remediation +- time values + +For example, you could have the following cached items: + +``` +TIME_VALUES = { + "utc_startup_timestamp": <timestamp>, // When the bouncer was started or used for the first time + "last_metrics_sent": <timestamp>, // Last successful metrics transmission +} + +ORIGINS_COUNT = { + "<origin>": { + "<remediation>": <count> + } +} +``` + +Storing a `last_metrics_sent` value makes it easy to compute the `window_size_seconds` value. + +#### 1.2 Metrics Tracking + +Once you know the final remediation that has been applied, you should increment the count of the related "origin/remediation" pair. + +Below are a few lines of pseudo-code to help you visualize what the final implementation might look like. + +```pseudocode +function updateMetricsOriginsCount(origin: string, remediation: string, delta: int = 1): int + // Get current count from cache + currentCount = getFromCache("ORIGINS_COUNT[origin][remediation]") ?? 0 + + // Update count (delta can be negative for decrementing) + newCount = max(0, currentCount + delta) + + // Store updated count in cache + storeInCache("ORIGINS_COUNT[origin][remediation]", newCount) + + return newCount +``` + +### 2. Metrics Building Process + +In order to send metrics, you will have to retrieve cached values and build the required payload. 
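As a minimal Python sketch of that retrieval and building step, assume the cached counters are available as a plain nested dictionary shaped like `ORIGINS_COUNT` (origin, then remediation, then count). The function name `build_metrics_items` and the `unit` parameter are hypothetical choices for this illustration; the item fields follow the payload example in this document.

```python
from typing import Dict, List, Tuple

OriginsCount = Dict[str, Dict[str, int]]


def build_metrics_items(origins_count: OriginsCount, unit: str = "request") -> Tuple[List[dict], int]:
    """Turn cached origin/remediation counters into metrics items.

    Returns the list of items and the total number of processed units.
    """
    items: List[dict] = []
    processed_total = 0
    for origin, remediations in origins_count.items():
        for remediation, count in remediations.items():
            if count <= 0:
                continue
            # Every counted unit, blocked or not, was processed.
            processed_total += count
            # "bypass" means nothing was dropped, so it only feeds "processed".
            if remediation == "bypass":
                continue
            items.append({
                "name": "dropped",
                "value": count,
                "unit": unit,
                "labels": {"origin": origin, "remediation": remediation},
            })
    if processed_total > 0:
        items.append({"name": "processed", "value": processed_total, "unit": unit})
    return items, processed_total


cached = {
    "CAPI": {"ban": 150},
    "cscli": {"ban": 25, "captcha": 12},
    "clean": {"bypass": 988},
}
items, total = build_metrics_items(cached)
```

With these sample counters the three "dropped" items and the "processed" total line up with the payload example shown earlier (150, 25 and 12 dropped, 1175 processed).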
+ +#### 2.1 Build Metrics Items + +The main information belongs to the metrics items: + +```pseudocode +function buildMetricsItems(originsCount: object): object + metricsItems = [] + processedTotal = 0 + originsToDecrement = {} + + for each origin in originsCount: + for each remediation, count in origin: + if count <= 0: + continue + + // Track total processed requests + processedTotal += count + + // Prepare for decrementing after successful send + originsToDecrement[origin][remediation] = -count + + // Skip bypass remediations in "dropped" metrics + if remediation == "bypass": + continue + + // Create "dropped" metric for blocked requests + metricsItems.append({ + "name": "dropped", + "value": count, + "unit": getMetricUnit(), // "request", "packet", or other relevant unit + "labels": { + "origin": origin, + "remediation": remediation + } + }) + + // Add total processed metric + if processedTotal > 0: + metricsItems.append({ + "name": "processed", + "value": processedTotal, + "unit": getMetricUnit() // "request", "packet", or other relevant unit + }) + + // Add active_decisions metric (if supported) + activeDecisions = getActiveDecisionsCount() + if activeDecisions > 0: + metricsItems.append({ + "name": "active_decisions", + "value": activeDecisions, + }) + + return { + "items": metricsItems, + "originsToDecrement": originsToDecrement + } +``` + +Note that it's important to record the number sent for each origin/remediation in order to reset the respective counter after the push. + +#### 2.2 Build Complete Metrics Payload + +In addition to the metrics items, payload requires properties and meta attributes: + + +```pseudocode +function buildUsageMetrics(properties: object, meta: object, items: array): object + // Prepare bouncer properties + bouncerProperties = { + "name": properties.name, + "type": properties.type, + "version": properties.version, + "feature_flags": properties.feature_flags ?? 
[], + "utc_startup_timestamp": properties.utc_startup_timestamp + } + + // Add optional OS information + if properties.os: + bouncerProperties["os"] = { + "name": properties.os.name, + "version": properties.os.version + } + + // Prepare metadata + metricsMetadata = { + "window_size_seconds": meta.window_size_seconds, + "utc_now_timestamp": meta.utc_now_timestamp + } + + // Build final payload + return { + "remediation_components": [{ + ...bouncerProperties, + "metrics": { + "meta": metricsMetadata, + "items": items + } + }] + } +``` + +### 3. Complete Push Metrics Implementation + +```pseudocode +function pushUsageMetrics(bouncerName: string, bouncerVersion: string, bouncerType: string): array + // Get timing information + startupTime = getStartUp() + currentTime = getCurrentTimestamp() + lastSent = getFromCache("TIME_VALUES.last_metrics_sent") ?? startupTime + + // Get current metrics + originsCount = getOriginsCount() + metricsData = buildMetricsItems(originsCount) + + // Return early if no metrics to send + if metricsData.items.isEmpty(): + log("No metrics to send") + return [] + + // Prepare properties and metadata + properties = { + "name": bouncerName, + "type": bouncerType, + "version": bouncerVersion, + "utc_startup_timestamp": startupTime, + "os": getOsInformation() + } + + meta = { + "window_size_seconds": max(0, currentTime - lastSent), + "utc_now_timestamp": currentTime + } + + // Build and send metrics + metricsPayload = buildUsageMetrics(properties, meta, metricsData.items) + + // Send to the LAPI `usage-metrics` endpoint + sendMetricsToAPI(metricsPayload) + + // Decrement counters after successful send + for origin, remediationCounts in metricsData.originsToDecrement: + for remediation, deltaCount in remediationCounts: + updateMetricsOriginsCount(origin, remediation, deltaCount) + + // Update last sent timestamp + storeMetricsLastSent(currentTime) + + return metricsPayload +``` + +## Useful Tips + +### When to Update Metrics + +Call `updateMetricsOriginsCount()` after each
remediation decision is **effectively applied**: + +```pseudocode +// After determining and applying the final remediation +initialRemediation = getRemediationForIP(clientIP) +origin = initialRemediation.origin +finalAction = applyBouncerLogic(initialRemediation.action) + +// Increment the counter with the final action +updateMetricsOriginsCount(origin, finalAction, 1) +``` + +### When to Push Metrics + +Metrics are typically pushed on a scheduled interval (e.g., every 30 minutes): + +```pseudocode +// In your scheduled metrics push job +try: + sentMetrics = pushUsageMetrics("my-bouncer", "1.0.0", "crowdsec-custom-bouncer") + if sentMetrics.isEmpty(): + log("No metrics were sent") + else: + log("Successfully sent metrics", sentMetrics) +catch Exception as e: + log("Failed to send metrics", e) +``` + +### Existing Implementations + +Remediation metrics have already been implemented in various languages and frameworks. You can use them as inspiration for your own implementation: + +- The [Lua library](https://github.com/crowdsecurity/lua-cs-bouncer/) used by the [NGINX remediation component](https://docs.crowdsec.net/u/bouncers/nginx/). +- The [PHP library](https://github.com/crowdsecurity/php-remediation-engine) used by the [WordPress remediation component](https://docs.crowdsec.net/u/bouncers/wordpress). +- The [Firewall Bouncer](https://github.com/crowdsecurity/cs-firewall-bouncer), written in Go and used for nftables/iptables. 
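To tie the pieces of this guide together, here is a minimal Python sketch of assembling the payload and preparing the push. It is an illustration under stated assumptions, not a reference implementation: the function names `build_usage_metrics` and `prepare_push` are hypothetical, the `/v1/usage-metrics` path assumes the v1 LAPI prefix, and the payload mirrors the example shown in this document, so check it against the API specification before relying on it.

```python
import json
import time
import urllib.request
from typing import List, Optional


def build_usage_metrics(name: str, bouncer_type: str, version: str,
                        startup_ts: int, last_sent_ts: Optional[int],
                        items: List[dict], now_ts: Optional[int] = None) -> dict:
    """Assemble a payload shaped like the example in this document."""
    now = int(time.time()) if now_ts is None else now_ts
    # Without a previous push, the reporting window starts at startup.
    window_start = startup_ts if last_sent_ts is None else last_sent_ts
    return {
        "remediation_components": [{
            "name": name,
            "type": bouncer_type,
            "version": version,
            "feature_flags": [],
            "utc_startup_timestamp": startup_ts,
            "metrics": {
                "meta": {
                    "window_size_seconds": max(0, now - window_start),
                    "utc_now_timestamp": now,
                },
                "items": items,
            },
        }]
    }


def prepare_push(lapi_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request carrying the payload."""
    return urllib.request.Request(
        url=lapi_url.rstrip("/") + "/v1/usage-metrics",  # assumed v1 path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


payload = build_usage_metrics(
    "my-bouncer", "crowdsec-custom-bouncer", "1.0.0",
    startup_ts=1704067200, last_sent_ts=None,
    items=[{"name": "processed", "value": 1175, "unit": "request"}],
    now_ts=1704069000,
)
request = prepare_push("http://127.0.0.1:8080", "my-api-key", payload)
```

Sending is then a matter of passing `request` to `urllib.request.urlopen` with a timeout, and resetting the counters and updating `last_metrics_sent` only once the call has succeeded.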
diff --git a/crowdsec-docs/sidebars.ts b/crowdsec-docs/sidebars.ts index 22bec8a08..7c3f6cd44 100644 --- a/crowdsec-docs/sidebars.ts +++ b/crowdsec-docs/sidebars.ts @@ -280,7 +280,26 @@ const sidebarsConfig: SidebarConfig = { items: [ "contributing/contributing_doc", "contributing/contributing_hub", - "contributing/contributing_bouncers", + { + type: "category", + label: "Remediation Components", + link: { + type: "doc", + id: "contributing/contributing_bouncers", + }, + items: [ + { + type: "doc", + id: "contributing/specs/bouncer_appsec_specs", + label: "Bouncer&AppSec", + }, + { + type: "doc", + id: "contributing/specs/bouncer_metrics_specs", + label: "Metrics", + }, + ], + }, "contributing/contributing_test_env", "contributing/contributing_crowdsec", ],