Add anomaly detection transform stage to flowlogs-pipeline #1143
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: … The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing …
Hi @vatankh. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with … Once the patch is verified, the new status will be reflected by … I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
```go
func (a *Anomaly) Transform(entry config.GenericMap) (config.GenericMap, bool) {
	value, err := utils.ConvertToFloat64(entry[a.config.ValueField])
	if err != nil {
		anomalyLog.Errorf("unable to convert %s to float: %v", a.config.ValueField, err)
```
to avoid flooding logs with errors in the data path, we tend to use an error metric rather than logs, like you can see here: https://github.com/netobserv/flowlogs-pipeline/blob/main/pkg/pipeline/encode/metrics_common.go#L192
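The pattern the reviewer points to reports conversion failures through a metric rather than a per-record log line. A minimal sketch of the idea, using a plain atomic counter as a stand-in for the pipeline's actual Prometheus error metric (the names `conversionErrors` and `toFloat64` are hypothetical, not the repo's API):

```go
package main

import (
	"fmt"
	"strconv"
	"sync/atomic"
)

// conversionErrors stands in for an error counter metric, similar in spirit
// to the one in pkg/pipeline/encode/metrics_common.go (hypothetical here).
var conversionErrors atomic.Uint64

// toFloat64 converts a raw field value; on failure it increments the
// error metric instead of emitting a log line on the hot path.
func toFloat64(raw any) (float64, bool) {
	switch v := raw.(type) {
	case float64:
		return v, true
	case int:
		return float64(v), true
	case string:
		f, err := strconv.ParseFloat(v, 64)
		if err != nil {
			conversionErrors.Add(1) // metric, not a log
			return 0, false
		}
		return f, true
	default:
		conversionErrors.Add(1)
		return 0, false
	}
}

func main() {
	v, ok := toFloat64("12.5")
	fmt.Println(v, ok, conversionErrors.Load()) // 12.5 true 0
	_, ok = toFloat64("not-a-number")
	fmt.Println(ok, conversionErrors.Load()) // false 1
}
```

The counter can then be scraped or alerted on, so a burst of malformed records shows up as a rate change instead of flooding the logs.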
```go
parts := make([]string, 0, len(a.config.KeyFields))
for _, key := range a.config.KeyFields {
	if val, ok := entry[key]; ok {
		parts = append(parts, fmt.Sprint(val))
```
we use utils.ConvertToString for this kind of conversion - it should be more performant than fmt-package conversions
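The reason a type switch beats `fmt.Sprint` is that it avoids reflection-based formatting on the hot path. A hedged sketch of the idea (the stand-in `convertToString` here is illustrative; the repo's `utils.ConvertToString` may cover different types):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// convertToString is a hypothetical stand-in for utils.ConvertToString:
// a type switch handles the common flow-field types directly, falling
// back to fmt only for rare types.
func convertToString(v any) string {
	switch t := v.(type) {
	case string:
		return t
	case int:
		return strconv.Itoa(t)
	case int64:
		return strconv.FormatInt(t, 10)
	case uint64:
		return strconv.FormatUint(t, 10)
	case float64:
		return strconv.FormatFloat(t, 'f', -1, 64)
	case bool:
		return strconv.FormatBool(t)
	default:
		return fmt.Sprint(t) // fallback for uncommon types
	}
}

// buildKey joins key-field values the way the PR's loop does.
func buildKey(entry map[string]any, keyFields []string) string {
	parts := make([]string, 0, len(keyFields))
	for _, key := range keyFields {
		if val, ok := entry[key]; ok {
			parts = append(parts, convertToString(val))
		}
	}
	return strings.Join(parts, "|")
}

func main() {
	entry := map[string]any{"SrcAddr": "10.0.0.1", "DstAddr": "10.0.0.2", "Proto": 6}
	fmt.Println(buildKey(entry, []string{"SrcAddr", "DstAddr", "Proto"}))
	// prints "10.0.0.1|10.0.0.2|6"
}
```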
|
Thanks @vatankh! This is looking pretty good already. I have a few more comments; let's start with the nitpicking one :-) : could you remove …

Then a comment on the API design: as it is, it doesn't allow running several anomaly detections (e.g. on several valueFields, or with different keys). A single Anomaly stage runs for a single value field, and if several stages are defined, they would conflict when writing to the same output fields. A simple way to fix this would be to add a … Another approach would be to allow multiple value fields in a single stage.
```go
	stddev = math.Max(math.Abs(state.baseline)*1e-6, 1e-9)
}
score := math.Abs(deviation) / stddev
state.baseline = state.baseline + a.alpha*(value-state.baseline)
```
Suggested change:

```diff
-state.baseline = state.baseline + a.alpha*(value-state.baseline)
+state.baseline += a.alpha * (value - state.baseline)
```
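To make the EWMA update concrete, here is a small self-contained sketch of the recurrence in the diff above (the `ewmaScore` helper and its inputs are made up for illustration; it simplifies the PR by always applying the stddev floor, whereas the PR only applies it when the spread is zero):

```go
package main

import (
	"fmt"
	"math"
)

// ewmaScore applies the EWMA recurrence baseline += alpha*(value-baseline)
// and returns a deviation score relative to a floored spread.
func ewmaScore(baseline *float64, value, alpha float64) float64 {
	deviation := value - *baseline
	// floor the spread to avoid division by zero, as in the PR
	stddev := math.Max(math.Abs(*baseline)*1e-6, 1e-9)
	score := math.Abs(deviation) / stddev
	*baseline += alpha * deviation
	return score
}

func main() {
	baseline := 100.0
	// steady traffic keeps the baseline on the signal...
	for i := 0; i < 5; i++ {
		ewmaScore(&baseline, 100, 0.3)
	}
	fmt.Printf("baseline after steady input: %.1f\n", baseline) // 100.0
	// ...while a sudden spike produces a very large score
	spike := ewmaScore(&baseline, 1000, 0.3)
	fmt.Println("spike scored high:", spike > 1) // true
}
```

With `alpha = 0.3` the baseline then moves 30% of the way toward each new value, so a one-off spike decays out of the baseline within a few samples.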
```go
anomalyType := "normal"
if score >= a.sensitivity {
	if value > mean {
		anomalyType = "zscore_high"
	} else {
		anomalyType = "zscore_low"
	}
}
```
Should we create an enum for that? You could then use it as the return type instead of string.
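A sketch of what the suggested typed enum could look like (the type name `AnomalyType`, the constants, and the `classify` helper are hypothetical; the string values match the PR's output fields):

```go
package main

import "fmt"

// AnomalyType is a typed alternative to the raw strings in the diff above.
type AnomalyType string

const (
	TypeNormal     AnomalyType = "normal"
	TypeZScoreHigh AnomalyType = "zscore_high"
	TypeZScoreLow  AnomalyType = "zscore_low"
)

// classify returns the enum instead of a bare string, so callers can
// switch exhaustively on the result and typos are caught at compile time.
func classify(score, sensitivity, value, mean float64) AnomalyType {
	if score < sensitivity {
		return TypeNormal
	}
	if value > mean {
		return TypeZScoreHigh
	}
	return TypeZScoreLow
}

func main() {
	fmt.Println(classify(0.5, 3, 10, 8)) // normal
	fmt.Println(classify(4, 3, 10, 8))   // zscore_high
	fmt.Println(classify(4, 3, 2, 8))    // zscore_low
}
```

Since the underlying type is still `string`, writing the value into the output `GenericMap` stays a no-op conversion.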
```yaml
  - name: write
    write:
      type: stdout
```
That's good enough as an example, but could you explain what the final goal is for your usage? Do you want to expose that in a Prometheus metric, or somewhere else?
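If the goal were Prometheus, the new output fields could in principle feed FLP's `prom` encode stage. A rough, unverified config sketch (metric names, labels, and the exact schema are assumptions to be checked against the FLP metrics documentation, not a tested configuration):

```yaml
  - name: encode
    encode:
      type: prom
      prom:
        prefix: netobserv_
        metrics:
          - name: anomaly_score
            type: gauge
            valueKey: anomaly_score
            labels:
              - SrcAddr
              - DstAddr
              - anomaly_type
```

One caveat with this route is label cardinality: per-address labels on a gauge can produce a very large number of series, so the key fields exposed as labels would need to be chosen carefully.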
Description
This PR introduces a new `anomaly` transform stage to `flowlogs-pipeline` as a first step toward anomaly detection for Kubernetes network flows (see issue #).

Key points:

- A new transform `type: anomaly` that computes streaming anomaly scores per key.
- Two algorithms:
  - `zscore`: rolling z-score over a sliding window.
  - `ewma`: exponentially weighted moving average baseline.
- Configuration:
  - `algorithm` (`ewma` | `zscore`)
  - `valueField` (numeric field, e.g. `Bytes`)
  - `keyFields` (used to group flows per entity, e.g. `[SrcAddr, DstAddr, Proto]`)
  - `windowSize`, `baselineWindow`, `sensitivity`, `ewmaAlpha`
- Output fields:
  - `anomaly_score`
  - `anomaly_type` (e.g. `warming_up`, `normal`, `zscore_high`, `zscore_low`, `ewma_high`, `ewma_low`)
  - `baseline_window` (current number of samples in the baseline window)
- Example pipeline config (`hack/examples/pipeline-anomaly.yaml`).

This is intentionally a local, per-instance anomaly stage that works on the existing pipeline input only; it does not consume Loki/Kafka yet, as discussed in the issue conversation.
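For reference, the rolling z-score idea behind the `zscore` algorithm can be sketched in a few lines (this is an illustration of the statistic, not the PR's actual implementation; the `zScore` helper and its window handling are made up):

```go
package main

import (
	"fmt"
	"math"
)

// zScore computes the z-score of value against a sliding window of
// recent samples: (value - mean) / stddev.
func zScore(window []float64, value float64) float64 {
	var sum, sumSq float64
	for _, v := range window {
		sum += v
		sumSq += v * v
	}
	n := float64(len(window))
	mean := sum / n
	variance := sumSq/n - mean*mean
	stddev := math.Sqrt(math.Max(variance, 0))
	if stddev == 0 {
		stddev = 1e-9 // avoid division by zero on a flat window
	}
	return (value - mean) / stddev
}

func main() {
	window := []float64{100, 102, 98, 101, 99}
	fmt.Printf("steady value: %.2f\n", zScore(window, 100))
	fmt.Println("spike exceeds sensitivity 3:", zScore(window, 1000) > 3)
}
```

In the PR's terms, `window` would hold the last `windowSize` samples for one key built from `keyFields`, and `sensitivity` would be the threshold the score is compared against.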
Dependencies
n/a
Testing
- `go test ./pkg/pipeline/transform -run TestTransformAnomaly`
- `go test ./...`
- `go build ./cmd/flowlogs-pipeline`
- `./flowlogs-pipeline --log-level debug --config hack/examples/pipeline-anomaly.yaml`

Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
No, this change only adds a new optional transform in flowlogs-pipeline and is not yet wired into the operator.
- Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- Does this PR require product documentation?
- Does this PR require a product release notes entry?
- Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
QE requirements (check 1 from the list):
To run a perfscale test, comment with:
/test flp-node-density-heavy-25nodes