Commit 894ed95

add cardinality action plugin (#926)
* add cardinality action plugin
* fix after review
* cardinality: dig sub-fields
* fix after review
* optimizations for cardinality action
* cardinality: more examples
* cardinality: metric for unique values
* cardinality: metric of limit unique values
* fix after review
* fix after review: don't use ordered map in cardinality
* fix after review: remove unused variables && fix metric name
* cardinality: reuse buf without pool
* cardinality: comment when copy buf to prefixKey
* cardinality: use metric.Gauge
1 parent a842f3e commit 894ed95

14 files changed, +912 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ TBD: throughput on production servers.

 **Input**: [dmesg](plugin/input/dmesg/README.md), [fake](plugin/input/fake/README.md), [file](plugin/input/file/README.md), [http](plugin/input/http/README.md), [journalctl](plugin/input/journalctl/README.md), [k8s](plugin/input/k8s/README.md), [kafka](plugin/input/kafka/README.md), [socket](plugin/input/socket/README.md)

-**Action**: [add_file_name](plugin/action/add_file_name/README.md), [add_host](plugin/action/add_host/README.md), [convert_date](plugin/action/convert_date/README.md), [convert_log_level](plugin/action/convert_log_level/README.md), [convert_utf8_bytes](plugin/action/convert_utf8_bytes/README.md), [debug](plugin/action/debug/README.md), [decode](plugin/action/decode/README.md), [discard](plugin/action/discard/README.md), [flatten](plugin/action/flatten/README.md), [hash](plugin/action/hash/README.md), [join](plugin/action/join/README.md), [join_template](plugin/action/join_template/README.md), [json_decode](plugin/action/json_decode/README.md), [json_encode](plugin/action/json_encode/README.md), [json_extract](plugin/action/json_extract/README.md), [keep_fields](plugin/action/keep_fields/README.md), [mask](plugin/action/mask/README.md), [modify](plugin/action/modify/README.md), [move](plugin/action/move/README.md), [parse_es](plugin/action/parse_es/README.md), [parse_re2](plugin/action/parse_re2/README.md), [remove_fields](plugin/action/remove_fields/README.md), [rename](plugin/action/rename/README.md), [set_time](plugin/action/set_time/README.md), [split](plugin/action/split/README.md), [throttle](plugin/action/throttle/README.md)
+**Action**: [add_file_name](plugin/action/add_file_name/README.md), [add_host](plugin/action/add_host/README.md), [cardinality](plugin/action/cardinality/README.md), [convert_date](plugin/action/convert_date/README.md), [convert_log_level](plugin/action/convert_log_level/README.md), [convert_utf8_bytes](plugin/action/convert_utf8_bytes/README.md), [debug](plugin/action/debug/README.md), [decode](plugin/action/decode/README.md), [discard](plugin/action/discard/README.md), [flatten](plugin/action/flatten/README.md), [hash](plugin/action/hash/README.md), [join](plugin/action/join/README.md), [join_template](plugin/action/join_template/README.md), [json_decode](plugin/action/json_decode/README.md), [json_encode](plugin/action/json_encode/README.md), [json_extract](plugin/action/json_extract/README.md), [keep_fields](plugin/action/keep_fields/README.md), [mask](plugin/action/mask/README.md), [modify](plugin/action/modify/README.md), [move](plugin/action/move/README.md), [parse_es](plugin/action/parse_es/README.md), [parse_re2](plugin/action/parse_re2/README.md), [remove_fields](plugin/action/remove_fields/README.md), [rename](plugin/action/rename/README.md), [set_time](plugin/action/set_time/README.md), [split](plugin/action/split/README.md), [throttle](plugin/action/throttle/README.md)

 **Output**: [clickhouse](plugin/output/clickhouse/README.md), [devnull](plugin/output/devnull/README.md), [elasticsearch](plugin/output/elasticsearch/README.md), [file](plugin/output/file/README.md), [gelf](plugin/output/gelf/README.md), [http](plugin/output/http/README.md), [kafka](plugin/output/kafka/README.md), [loki](plugin/output/loki/README.md), [postgres](plugin/output/postgres/README.md), [s3](plugin/output/s3/README.md), [splunk](plugin/output/splunk/README.md), [stdout](plugin/output/stdout/README.md)


_sidebar.md

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@
 - Action
 - [add_file_name](plugin/action/add_file_name/README.md)
 - [add_host](plugin/action/add_host/README.md)
+- [cardinality](plugin/action/cardinality/README.md)
 - [convert_date](plugin/action/convert_date/README.md)
 - [convert_log_level](plugin/action/convert_log_level/README.md)
 - [convert_utf8_bytes](plugin/action/convert_utf8_bytes/README.md)

cmd/file.d/file.d.go

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ import (
 	"github.com/ozontech/file.d/pipeline"
 	_ "github.com/ozontech/file.d/plugin/action/add_file_name"
 	_ "github.com/ozontech/file.d/plugin/action/add_host"
+	_ "github.com/ozontech/file.d/plugin/action/cardinality"
 	_ "github.com/ozontech/file.d/plugin/action/convert_date"
 	_ "github.com/ozontech/file.d/plugin/action/convert_log_level"
 	_ "github.com/ozontech/file.d/plugin/action/convert_utf8_bytes"

e2e/start_work_test.go

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ import (
 	"github.com/ozontech/file.d/fd"
 	_ "github.com/ozontech/file.d/plugin/action/add_file_name"
 	_ "github.com/ozontech/file.d/plugin/action/add_host"
+	_ "github.com/ozontech/file.d/plugin/action/cardinality"
 	_ "github.com/ozontech/file.d/plugin/action/convert_date"
 	_ "github.com/ozontech/file.d/plugin/action/convert_log_level"
 	_ "github.com/ozontech/file.d/plugin/action/convert_utf8_bytes"

go.mod

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ require (
 	github.com/alecthomas/kingpin v2.2.6+incompatible
 	github.com/alecthomas/units v0.0.0-20211218093645-b94a6e3cc137
 	github.com/alicebob/miniredis/v2 v2.35.0
+	github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310
 	github.com/bitly/go-simplejson v0.5.1
 	github.com/bmatcuk/doublestar/v4 v4.8.1
 	github.com/bufbuild/protocompile v0.13.0

go.sum

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@ github.com/alicebob/miniredis/v2 v2.35.0 h1:QwLphYqCEAo1eu1TqPRN2jgVMPBweeQcR21j
 github.com/alicebob/miniredis/v2 v2.35.0/go.mod h1:TcL7YfarKPGDAthEtl5NBeHZfeUQj6OXMm/+iu5cLMM=
 github.com/andybalholm/brotli v1.0.5 h1:8uQZIdzKmjc/iuPu7O2ioW48L81FgatrcpfFmiq/cCs=
 github.com/andybalholm/brotli v1.0.5/go.mod h1:fO7iG3H7G2nSZ7m0zPUDn85XEX2GTukHGRSepvi9Eig=
+github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310 h1:BUAU3CGlLvorLI26FmByPp2eC2qla6E1Tw+scpcg/to=
+github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310/go.mod h1:ufUuZ+zHj4x4TnLV4JWEpy2hxWSpsRywHrMgIH9cCH8=
 github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
 github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
 github.com/bitly/go-simplejson v0.5.1 h1:xgwPbetQScXt1gh9BmoJ6j9JMr3TElvuIyjR8pgdoow=

plugin/README.md

Lines changed: 4 additions & 0 deletions
@@ -188,6 +188,10 @@ It is only applicable for input plugins k8s and file.
 It adds field containing hostname to an event.

 [More details...](plugin/action/add_host/README.md)
+## cardinality
+Limits the cardinality of fields on events: when the limit is exceeded, it discards the event, removes the fields, or does nothing.
+
+[More details...](plugin/action/cardinality/README.md)
 ## convert_date
 It converts field date/time data to different format.


plugin/action/README.md

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ It is only applicable for input plugins k8s and file.
 It adds field containing hostname to an event.

 [More details...](plugin/action/add_host/README.md)
+## cardinality
+Limits the cardinality of fields on events: when the limit is exceeded, it discards the event, removes the fields, or does nothing.
+
+[More details...](plugin/action/cardinality/README.md)
 ## convert_date
 It converts field date/time data to different format.


Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
# Cardinality limit plugin
@introduction

## Examples
@examples

## Config params
@config-params|description
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
# Cardinality limit plugin
Limits the cardinality of fields on events: when the limit is exceeded, it discards the event, removes the fields, or does nothing.

## Examples
Discarding events with a high-cardinality field:
```yaml
pipelines:
  example_pipeline:
    ...
    - type: cardinality
      limit: 2
      action: discard
      ttl: 1m
      metric_prefix: service_client
      key:
        - service
      fields:
        - client_id
    ...
```
Events:
```json
{"service": "registration", "client_id": "1"}
{"service": "registration", "client_id": "1"}
{"service": "registration", "client_id": "2"}
{"service": "registration", "client_id": "3"} // will be discarded
```
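
The counting behaviour in this example can be sketched as follows. This is a minimal illustration, not the plugin's code: the `limiter` type and its `allow` method are hypothetical names, and the sketch assumes that a value already seen for a key group is always accepted while a new value is accepted only if the group holds fewer than `limit` distinct values.

```go
package main

import "fmt"

// limiter counts distinct field values per key group.
// Hypothetical type for illustration; not taken from the plugin.
type limiter struct {
	limit int
	seen  map[string]map[string]struct{} // key group -> set of seen values
}

func newLimiter(limit int) *limiter {
	return &limiter{limit: limit, seen: make(map[string]map[string]struct{})}
}

// allow reports whether value fits within the limit of its key group.
// Known values always pass; a new value is recorded only while the
// group holds fewer than limit distinct values.
func (l *limiter) allow(key, value string) bool {
	values, ok := l.seen[key]
	if !ok {
		values = make(map[string]struct{})
		l.seen[key] = values
	}
	if _, known := values[value]; known {
		return true
	}
	if len(values) >= l.limit {
		return false
	}
	values[value] = struct{}{}
	return true
}

func main() {
	l := newLimiter(2) // limit: 2
	for _, id := range []string{"1", "1", "2", "3"} {
		fmt.Printf("client_id=%s allowed=%v\n", id, l.allow("registration", id))
	}
}
```

Run against the four events above, only the last one (`client_id` = `"3"`) is rejected, which matches the event marked as discarded.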
---

Removing high-cardinality fields:
```yaml
pipelines:
  example_pipeline:
    ...
    - type: cardinality
      limit: 2
      action: remove_fields
      ttl: 1m
      metric_prefix: service_client
      key:
        - service
      fields:
        - client_id
    ...
```
The original events:
```json
{"service": "registration", "client_id": "1"}
{"service": "registration", "client_id": "2"}
{"service": "registration", "client_id": "3"}
```
The resulting events:
```json
{"service": "registration", "client_id": "1"}
{"service": "registration", "client_id": "2"}
{"service": "registration"}
```
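
What happens to an event once the limit is exceeded can be sketched like this. The function `applyAction` and the flat `map[string]string` event are assumptions made for illustration; the plugin itself works on file.d events and may differ in detail.

```go
package main

import "fmt"

// applyAction is a hypothetical helper: it applies the configured action
// to an event that exceeded the limit and reports whether the event
// should be kept in the pipeline.
func applyAction(event map[string]string, fields []string, action string) bool {
	switch action {
	case "discard":
		// Drop the whole event.
		return false
	case "remove_fields":
		// Keep the event but strip the over-limit fields.
		for _, f := range fields {
			delete(event, f)
		}
		return true
	default:
		// "nothing": leave the event untouched, only observe.
		return true
	}
}

func main() {
	event := map[string]string{"service": "registration", "client_id": "3"}
	keep := applyAction(event, []string{"client_id"}, "remove_fields")
	fmt.Println(keep, event)
}
```

With `remove_fields`, the third event keeps `service` and loses `client_id`, as in the resulting events above.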

## Config params
**`key`** *`[]cfg.FieldSelector`* *`required`*

Fields used to group events before calculating cardinality.
Events with the same key values are aggregated together.
Required for proper cardinality tracking per logical group.

<br>

**`fields`** *`[]cfg.FieldSelector`* *`required`*

Target fields whose unique values are counted within each key group.
The plugin monitors how many distinct values these fields contain.
Required to define what constitutes high cardinality.

<br>

**`action`** *`string`* *`default=nothing`* *`options=discard|remove_fields|nothing`*

Action to perform when cardinality limit is exceeded.
Determines whether to discard events, remove fields, or just monitor.
Choose based on whether you need to preserve other event data.

<br>

**`metric_prefix`** *`string`*

Prefix added to metric names for better organization.
Useful when running multiple instances to avoid metric name collisions.
Leave empty for default metric naming.

<br>

**`limit`** *`int`* *`default=10000`*

Maximum allowed number of unique values for monitored fields.
When exceeded within a key group, the configured action triggers.
Set based on expected diversity and system capacity.

<br>

**`ttl`** *`cfg.Duration`* *`default=1h`*

Time-to-live for cardinality tracking cache entries.
Prevents unbounded memory growth by forgetting old unique values.
Should align with typical patterns of field value changes.

<br>
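
The effect of `ttl` can be pictured with a small sketch. Everything here is hypothetical (`ttlSet`, `touch`, `count`, `at`, `demoTTL`), and it assumes that a value expires once it has not been seen for longer than the TTL; whether the plugin refreshes an entry on every sighting is not stated above.

```go
package main

import (
	"fmt"
	"time"
)

// demoTTL mirrors `ttl: 1m` from the examples.
const demoTTL = time.Minute

// at returns a fixed start time plus sec seconds, to keep the demo deterministic.
func at(sec int) time.Time {
	start := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return start.Add(time.Duration(sec) * time.Second)
}

// ttlSet remembers values together with the time they were last seen.
type ttlSet struct {
	ttl      time.Duration
	lastSeen map[string]time.Time
}

func newTTLSet(ttl time.Duration) *ttlSet {
	return &ttlSet{ttl: ttl, lastSeen: make(map[string]time.Time)}
}

// touch records value as seen at now.
func (s *ttlSet) touch(value string, now time.Time) {
	s.lastSeen[value] = now
}

// count forgets values not seen within ttl and returns how many remain.
func (s *ttlSet) count(now time.Time) int {
	for v, t := range s.lastSeen {
		if now.Sub(t) > s.ttl {
			delete(s.lastSeen, v) // deleting during range is safe in Go
		}
	}
	return len(s.lastSeen)
}

func main() {
	s := newTTLSet(demoTTL)
	s.touch("1", at(0))
	s.touch("2", at(45))
	fmt.Println(s.count(at(50))) // both values are still within the TTL
	fmt.Println(s.count(at(90))) // "1" was last seen 90s ago and is forgotten
}
```

Once an old value is forgotten, it no longer counts toward `limit`, so a key group that stops producing new values regains room over time.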


<br>*Generated using [__insane-doc__](https://github.com/vitkovskii/insane-doc)*
