feat: add JSON Stream (NDJSON) body processor by fzipi · Pull Request #1481 · corazawaf/coraza

fzipi · 2026-01-20T17:14:16Z

what

Implements a new body processor for handling streaming JSON formats with per-record rule evaluation:

NDJSON (Newline Delimited JSON)
JSON Lines
JSON Sequence (RFC 7464)

Streaming body processor

Line-by-line processing for memory efficiency
Each JSON object indexed by record number (json.0.field, json.1.field)
Built-in DoS protection with 1024 recursion limit
Support for nested objects and arrays
Auto-detection of format (NDJSON vs RFC 7464) by peeking at the first 4KB
Registered under JSONSTREAM, NDJSON, and JSONLINES aliases
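
To make the record-indexed naming concrete, here is a minimal sketch of line-by-line parsing that produces keys such as json.0.id and json.1.tags.0. It is not the PR's implementation: `parseStream` and `flatten` are hypothetical names, format detection looks only at the first byte rather than peeking at 4KB, and every record is assumed to fit on one line.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// recordSeparator is the RS byte that prefixes each record in RFC 7464.
const recordSeparator = 0x1E

// flatten stores every leaf of a decoded JSON value under a dotted key.
func flatten(prefix string, v any, out map[string]string) {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			flatten(prefix+"."+k, child, out)
		}
	case []any:
		for i, child := range t {
			flatten(prefix+"."+strconv.Itoa(i), child, out)
		}
	case nil:
		out[prefix] = ""
	default:
		out[prefix] = fmt.Sprint(t)
	}
}

// parseStream indexes the fields of each record by record number.
// Assumption: one record per line, in either NDJSON or RFC 7464 framing.
func parseStream(body []byte) (map[string]string, error) {
	// Simplified detection: look at the first byte only.
	isSequence := len(body) > 0 && body[0] == recordSeparator

	out := map[string]string{}
	sc := bufio.NewScanner(bytes.NewReader(body))
	record := 0
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if isSequence {
			line = bytes.TrimPrefix(line, []byte{recordSeparator})
		}
		if len(line) == 0 {
			continue // skip blank lines
		}
		dec := json.NewDecoder(bytes.NewReader(line))
		dec.UseNumber() // keep numbers as written instead of converting to float64
		var v any
		if err := dec.Decode(&v); err != nil {
			return nil, fmt.Errorf("record %d: %w", record, err)
		}
		flatten("json."+strconv.Itoa(record), v, out)
		record++
	}
	return out, sc.Err()
}

func main() {
	body := []byte(`{"id":1,"name":"Alice"}
{"id":2,"tags":["a","b"]}
`)
	fields, err := parseStream(body)
	if err != nil {
		panic(err)
	}
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, fields[k])
	}
}
```

The sketch omits the recursion limit; a real processor would bound the depth of `flatten` before descending.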

Per-record rule evaluation

Instead of parsing all records into ArgsPost and evaluating Phase 2 rules once, rules are now evaluated after each complete JSON record:

Malicious records are caught immediately without processing the rest of the stream
ArgsPost is cleared between records so each record is evaluated in isolation
TX variables (e.g. anomaly scores) persist across records, enabling cross-record correlation: a stream where 3 records each score below the threshold individually can still trigger a block when the accumulated score exceeds it
Eval() is safe to call multiple times per phase — AllowTypePhase, Skip, and the transformation cache all reset correctly between calls
Non-streaming body processors are completely unaffected
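
The interaction between per-record isolation and persistent TX variables can be illustrated with a toy model. This is only a sketch under stated assumptions: `transaction`, `evaluateStream`, the "../" rule and its 2-point score are all made up for the example and are not Coraza types or rules.

```go
package main

import (
	"fmt"
	"strings"
)

// transaction is a toy stand-in for a WAF transaction.
type transaction struct {
	argsPost map[string]string // replaced for every record
	score    int               // persists across records, like a TX variable
}

// evaluateStream returns the index of the record that triggered a block
// (or -1 if none did) together with the accumulated score.
func evaluateStream(records []map[string]string, threshold int) (int, int) {
	tx := &transaction{}
	for i, rec := range records {
		tx.argsPost = rec // fields of the previous record are no longer visible
		for _, v := range tx.argsPost {
			// Hypothetical rule: a path traversal marker adds 2 points.
			if strings.Contains(v, "../") {
				tx.score += 2
			}
		}
		if tx.score >= threshold {
			return i, tx.score // stop here: later records are never processed
		}
	}
	return -1, tx.score
}

func main() {
	records := []map[string]string{
		{"json.0.file": "../a"},
		{"json.1.file": "../b"},
		{"json.2.file": "../c"},
		{"json.3.file": "never evaluated"},
	}
	blockedAt, score := evaluateStream(records, 5)
	fmt.Println(blockedAt, score) // prints "2 6"
}
```

Each record scores 2, below the threshold of 5, yet the third record pushes the accumulated score to 6 and the stream is blocked before the fourth record is read.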

New `StreamingBodyProcessor` interface

Extends BodyProcessor with ProcessRequestRecords/ProcessResponseRecords methods that yield parsed records one at a time via callback. The transaction detects this interface via type assertion and switches to the per-record evaluation path automatically.
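
The detection step might look roughly like the following. The method name ProcessRequestRecords comes from the description above, but the signatures, the simplified `BodyProcessor`, and the `process` helper are assumptions made for this sketch, not the interfaces defined in the PR.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// BodyProcessor is a simplified stand-in for the buffered interface.
type BodyProcessor interface {
	ProcessRequest(r io.Reader) error
}

// StreamingBodyProcessor adds a per-record entry point. The callback
// signature is an assumption for this sketch.
type StreamingBodyProcessor interface {
	BodyProcessor
	ProcessRequestRecords(r io.Reader, fn func(recordNum int, raw string) error) error
}

type bufferedProcessor struct{}

func (bufferedProcessor) ProcessRequest(r io.Reader) error {
	_, err := io.Copy(io.Discard, r)
	return err
}

// ndjsonProcessor inherits ProcessRequest and adds the streaming method.
type ndjsonProcessor struct{ bufferedProcessor }

func (ndjsonProcessor) ProcessRequestRecords(r io.Reader, fn func(int, string) error) error {
	sc := bufio.NewScanner(r)
	for n := 0; sc.Scan(); n++ {
		if err := fn(n, sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

// process picks the evaluation path with a type assertion.
func process(bp BodyProcessor, body string) (path string, records int, err error) {
	if sbp, ok := bp.(StreamingBodyProcessor); ok {
		err = sbp.ProcessRequestRecords(strings.NewReader(body), func(int, string) error {
			records++ // rules would be evaluated here, once per record
			return nil
		})
		return "per-record", records, err
	}
	return "buffered", 0, bp.ProcessRequest(strings.NewReader(body))
}

func main() {
	body := "{\"id\":1}\n{\"id\":2}\n"
	path, n, err := process(ndjsonProcessor{}, body)
	fmt.Println(path, n, err)
	path, n, err = process(bufferedProcessor{}, body)
	fmt.Println(path, n, err)
}
```

Because the streaming path is selected only when the assertion succeeds, processors that implement just the buffered interface keep their existing behavior.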

Streaming relay support

Added ProcessRequestBodyFromStream(input, output) / ProcessResponseBodyFromStream(input, output) methods on the transaction for integrators building custom streaming middleware. These read records from input, evaluate rules per record, and write clean records to output. Exposed via an experimental StreamingTransaction interface.
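
The relay idea can be sketched independently of Coraza. In the example below, `relay`, `relayString`, `errBlocked` and the substring-based check are hypothetical; the sketch also assumes that the relay stops at the first blocked record, which may differ from how the transaction methods report an interruption.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

var errBlocked = errors.New("record blocked")

// relay copies records from in to out until isMalicious reports a hit.
// It returns the number of records forwarded.
func relay(in io.Reader, out io.Writer, isMalicious func(record []byte) bool) (int, error) {
	sc := bufio.NewScanner(in)
	forwarded := 0
	for sc.Scan() {
		record := sc.Bytes()
		if isMalicious(record) {
			return forwarded, errBlocked // nothing after this record is written
		}
		if _, err := fmt.Fprintf(out, "%s\n", record); err != nil {
			return forwarded, err
		}
		forwarded++
	}
	return forwarded, sc.Err()
}

// relayString wires relay to in-memory buffers; deny is a hypothetical
// substring that marks a record as malicious.
func relayString(body, deny string) (string, int, error) {
	var out bytes.Buffer
	n, err := relay(strings.NewReader(body), &out, func(rec []byte) bool {
		return bytes.Contains(rec, []byte(deny))
	})
	return out.String(), n, err
}

func main() {
	body := "{\"q\":\"ok\"}\n{\"q\":\"<script>\"}\n{\"q\":\"late\"}\n"
	out, n, err := relayString(body, "<script>")
	fmt.Printf("forwarded=%d err=%v\n%s", n, err, out)
}
```

Only the first record reaches the output here; the second is rejected and the third is never read.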

Usage example

SecRule REQUEST_HEADERS:Content-Type "^application/x-ndjson" \
    "id:'200007',phase:1,pass,nolog,ctl:requestBodyProcessor=JSONSTREAM"

Testing

24 unit tests for the body processor (single/multiple lines, nested objects, arrays, error cases, recursion limits, TX variable storage, large tokens, format auto-detection)
7 unit tests for callback-based record processing (interruption stops processing, field prefixes, RFC 7464, backward compatibility)
4 integration tests with real WAF rules (interruption at bad record, clean passthrough, TX variable accumulation across records, below-threshold no-block)
Benchmark: ~5,000 ops/sec for 100-object streams

Benchmark Results (Apple M2)

ProcessRequest (buffered) vs Callback (streaming)

Scenario	Records	ProcessRequest	Callback	Throughput Speedup	Alloc Reduction
small	1	2.00 MB/s, 146 allocs	2.66 MB/s, 17 allocs	1.3x	88%
small	10	11.30 MB/s, 283 allocs	16.70 MB/s, 134 allocs	1.5x	53%
small	100	22.38 MB/s, 1,644 allocs	33.01 MB/s, 1,304 allocs	1.5x	21%
small	1,000	22.84 MB/s, 16,671 allocs	36.20 MB/s, 14,493 allocs	1.6x	13%
medium	1	8.50 MB/s, 176 allocs	11.92 MB/s, 39 allocs	1.4x	78%
medium	10	26.20 MB/s, 579 allocs	41.03 MB/s, 354 allocs	1.6x	39%
medium	100	31.71 MB/s, 4,563 allocs	52.34 MB/s, 3,505 allocs	1.7x	23%
medium	1,000	31.54 MB/s, 51,097 allocs	53.93 MB/s, 41,712 allocs	1.7x	18%
nested	1	8.53 MB/s, 178 allocs	12.04 MB/s, 41 allocs	1.4x	77%
nested	10	25.22 MB/s, 599 allocs	38.13 MB/s, 374 allocs	1.5x	38%
nested	100	30.71 MB/s, 4,763 allocs	48.66 MB/s, 3,705 allocs	1.6x	22%
nested	1,000	30.49 MB/s, 53,101 allocs	48.92 MB/s, 43,713 allocs	1.6x	18%

RFC 7464 (JSON Sequence) via Callback

Scenario	Records	Throughput	Allocs/op
small	10	16.61 MB/s	134
small	100	34.37 MB/s	1,304
medium	100	47.02 MB/s	3,505
nested	100	48.06 MB/s	3,705

Key Takeaways

The callback-based streaming path is faster in every scenario: 1.3–1.4x for single-record streams and 1.5–1.7x at 10 or more records.
Allocation counts are 13–88% lower (most dramatic at low record counts where per-collection overhead dominates).
RFC 7464 throughput is within about 10% of NDJSON at the same record counts, and allocation counts are identical, which suggests that format auto-detection adds little overhead.
The callback path avoids populating TransactionVariables collections (ArgsPost, TX vars, raw body via TeeReader), which accounts for the reduced allocations.
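
The speedup and allocation-reduction columns follow directly from the raw figures in the table. A small sketch of the arithmetic, using two rows from the table (the helper names are made up):

```go
package main

import "fmt"

// speedup is the ratio of streaming throughput to buffered throughput.
func speedup(bufferedMBs, streamingMBs float64) float64 {
	return streamingMBs / bufferedMBs
}

// allocReduction is the percentage of allocations saved by the streaming path.
func allocReduction(bufferedAllocs, streamingAllocs int) float64 {
	return 100 * float64(bufferedAllocs-streamingAllocs) / float64(bufferedAllocs)
}

func main() {
	// small, 1 record: 2.00 MB/s and 146 allocs vs 2.66 MB/s and 17 allocs
	fmt.Printf("%.1fx %.0f%%\n", speedup(2.00, 2.66), allocReduction(146, 17)) // prints "1.3x 88%"
	// medium, 1,000 records: 31.54 MB/s and 51,097 allocs vs 53.93 MB/s and 41,712 allocs
	fmt.Printf("%.1fx %.0f%%\n", speedup(31.54, 53.93), allocReduction(51097, 41712)) // prints "1.7x 18%"
}
```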

Record Templates

small: {"id":1,"name":"Alice"} (24 bytes)
medium: {"user_id":1234567890,"name":"User Name","email":"user@example.com","role":"admin","active":true,"tags":["tag1","tag2","tag3"]} (128 bytes)
nested: {"user":{"name":"Alice","address":{"city":"NYC","zip":"10001"}},"scores":[95,87,92],"meta":{"created":"2026-01-01","active":true}} (131 bytes)

Implements a new body processor for handling streaming JSON formats:
- NDJSON (Newline Delimited JSON)
- JSON Lines
- JSON Sequence (RFC 7464)

Features:
- Line-by-line processing for memory efficiency
- Each JSON object indexed by line number (json.0.field, json.1.field)
- Built-in DoS protection with 1024 recursion limit
- TX variables for raw body and line count
- Support for nested objects and arrays
- Comprehensive error handling

Configuration:
- Added rules to coraza.conf-recommended for NDJSON content types
- Optional line count limiting rule
- Registered under JSONSTREAM, NDJSON, and JSONLINES aliases

Testing:
- 13 comprehensive test cases covering:
  - Single/multiple lines
  - Nested objects and arrays
  - Error cases (invalid JSON, empty stream)
  - Recursion limit enforcement
  - TX variable storage
- Benchmark: ~5,000 ops/sec for 100-object streams

Usage example:

SecRule REQUEST_HEADERS:Content-Type "^application/x-ndjson" \
    "id:'200007',phase:1,pass,nolog,ctl:requestBodyProcessor=JSONSTREAM"

Closes: Related to streaming JSON support discussion

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

codecov · 2026-01-20T17:17:02Z

Codecov Report

❌ Patch coverage is 52.69461% with 158 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.07%. Comparing base (537abf9) to head (d4502d1).
⚠️ Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
internal/corazawaf/transaction.go	10.17%	148 Missing and 2 partials ⚠️
experimental/bodyprocessors/jsonstream.go	95.15%	4 Missing and 4 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1481      +/-   ##
==========================================
- Coverage   85.30%   84.07%   -1.24%     
==========================================
  Files         174      175       +1     
  Lines        8461     8811     +350     
==========================================
+ Hits         7218     7408     +190     
- Misses        994     1146     +152     
- Partials      249      257       +8

Flag	Coverage Δ
coraza.rule.case_sensitive_args_keys	`84.04% <52.69%> (-1.24%)`	⬇️
coraza.rule.mandatory_rule_id_check	`84.06% <52.69%> (-1.24%)`	⬇️
coraza.rule.multiphase_evaluation	`83.80% <52.69%> (-1.23%)`	⬇️
coraza.rule.no_regex_multiline	`84.06% <52.69%> (-1.24%)`	⬇️
default	`84.07% <52.69%> (-1.24%)`	⬇️
examples+	`21.03% <24.55%> (+4.54%)`	⬆️
examples+coraza.rule.case_sensitive_args_keys	`83.97% <52.69%> (-1.23%)`	⬇️
examples+coraza.rule.mandatory_rule_id_check	`84.06% <52.69%> (-1.24%)`	⬇️
examples+coraza.rule.multiphase_evaluation	`83.60% <52.69%> (-1.22%)`	⬇️
examples+coraza.rule.no_regex_multiline	`83.91% <52.69%> (-1.23%)`	⬇️
examples+memoize_builders	`84.01% <52.69%> (-1.24%)`	⬇️
examples+no_fs_access	`81.75% <52.69%> (-1.14%)`	⬇️
ftw	`84.07% <52.69%> (-1.24%)`	⬇️
memoize_builders	`84.18% <52.69%> (-1.25%)`	⬇️
no_fs_access	`83.61% <52.69%> (-1.22%)`	⬇️
tinygo	`84.05% <52.69%> (-1.24%)`	⬇️


Copilot

Pull request overview

This PR adds support for JSON Stream (NDJSON) body processing to Coraza WAF, enabling line-by-line processing of streaming JSON formats. The implementation includes a new body processor that handles NDJSON, JSON Lines, and claims support for JSON Sequence (RFC 7464).

Changes:

New jsonStreamBodyProcessor that processes JSON objects line-by-line with memory-efficient streaming
Built-in DoS protection via configurable recursion limits (default 1024)
TX variable storage for raw body and line count to enable custom validation rules
Configuration rules in coraza.conf-recommended for NDJSON content types

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File	Description
internal/bodyprocessors/jsonstream.go	Core implementation of NDJSON body processor with line-by-line parsing, recursion limits, and TX variable storage
internal/bodyprocessors/jsonstream_test.go	Comprehensive test suite with 13 test cases covering single/multiple lines, nested objects, arrays, error cases, and benchmarks
coraza.conf-recommended	Configuration rules for enabling NDJSON processing based on Content-Type headers, with optional line count limiting

Memory Documentation:
- Add explicit documentation about 2x memory usage from TeeReader
- Clarify that this is necessary for TX variables (like regular JSON processor)
- Note memory implications: 2x body size (buffer + parsed variables)

Line Numbering:
- Use 1-based line numbers in error messages instead of 0-based
- More user-friendly: "line 1" instead of "line 0"
- Applied to both invalid JSON and parsing errors

Scanner Buffer Limit:
- Increase max scan token size from default 64KB to 1MB
- Prevents failure on large JSON objects per line
- Set initial buffer to 64KB, max to 1MB for memory efficiency

Configuration Consistency:
- Fix rule 200008 to use JSONSTREAM (was NDJSON)
- Now consistent with rule 200007
- Both rules use the same processor name

Test Code Quality:
- Replace string concatenation with fmt.Sprintf for line numbers
- Fix issue where rune('0'+tt.line) only works for single digits
- Add fmt import to test file

Documentation Accuracy:
- Remove RFC 7464 JSON Sequence from "supported formats"
- Add note that RS separator (0x1E) is not yet implemented
- Avoid misleading users about unsupported features

All tests passing: 13/13

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
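
The scanner buffer change described in the commit above relies on standard `bufio.Scanner` behavior: a line longer than the default 64KB token limit fails with `bufio.ErrTooLong` unless `Scanner.Buffer` raises the limit. A minimal sketch (the `countLines` helper is hypothetical, not code from the PR):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countLines scans body line by line. maxToken <= 0 keeps the scanner default.
func countLines(body string, maxToken int) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(body))
	if maxToken > 0 {
		// Start with a 64KB buffer and let it grow up to maxToken.
		sc.Buffer(make([]byte, 0, 64*1024), maxToken)
	}
	lines := 0
	for sc.Scan() {
		lines++
	}
	if err := sc.Err(); err != nil {
		// Report a 1-based line number, as the commit does for its errors.
		return lines, fmt.Errorf("line %d: %w", lines+1, err)
	}
	return lines, nil
}

func main() {
	// One JSON object of roughly 100KB on a single line.
	big := "{\"data\":\"" + strings.Repeat("x", 100*1024) + "\"}\n"

	_, err := countLines(big, 0)
	fmt.Println("default limit:", err)

	n, err := countLines(big, 1024*1024)
	fmt.Println("1MB limit:", n, err)
}
```

Starting small and growing on demand keeps memory low for typical records while still accepting the occasional large one.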

jcchavezs · 2026-01-22T22:46:54Z

Could we add this to experimental?

fzipi · 2026-01-22T22:49:32Z

I thought I mentioned this. Yes, that was my idea.

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

- Extract inline interface to named indexedCollection type (jcchavezs)
- Preserve original stream format in relay by including format-specific delimiters in rawRecord (NDJSON uses \n, RFC 7464 uses RS prefix + \n)
- Update readItemsWithLimit TODO comments to reference #1110
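
The delimiter rule from this commit can be shown in a few lines. This is a sketch only: the `format` type and `frame` function are hypothetical names, not the ones used in the PR.

```go
package main

import "fmt"

type format int

const (
	formatNDJSON format = iota
	formatJSONSequence
)

// frame wraps one JSON record in the delimiters of the given stream format.
func frame(f format, record []byte) []byte {
	out := make([]byte, 0, len(record)+2)
	if f == formatJSONSequence {
		out = append(out, 0x1E) // RS prefix required by RFC 7464
	}
	out = append(out, record...)
	return append(out, '\n')
}

func main() {
	record := []byte(`{"id":1}`)
	fmt.Printf("%q\n", frame(formatNDJSON, record))       // prints "{\"id\":1}\n"
	fmt.Printf("%q\n", frame(formatJSONSequence, record)) // prints "\x1e{\"id\":1}\n"
}
```

Re-emitting the same framing that was read means a downstream consumer sees the stream in the format the client sent.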

fzipi · 2026-03-21T17:54:03Z

@copilot Fix the conflicts first. Then add e2e tests to tests for this new body processor.

Copilot · 2026-03-21T17:54:10Z

@fzipi I've opened a new pull request, #1563, to work on those changes. Once the pull request is ready, I'll request review from you.

@fzipi

* fix(deps): update module golang.org/x/net to v0.45.0 [security] (#1487)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* fix(deps): update go modules in go.mod (#1433)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* docs(actions): update format and add package (#1475)
* docs(actions): update format and add package
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: update documentation for package
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: go fmt
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: add A-Z to auditlog (#1479)
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: SecRuleUpdateActionById should replace disruptive actions (#1471)
* fix: SecRuleUpdateActionById should replace disruptive actions
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: multiphase test with bad expectations
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* tests: improve coverage on engine
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* refactor: address SecRuleUpdateActionById review comments (#1484)
* Initial plan
* Address code review comments: improve documentation, fix double parsing, and fix range logic
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* Refactor: Extract hasDisruptiveActions helper to avoid code duplication
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* docs: Improve applyParsedActions documentation
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* docs: Clarify body parsing logic in SetRawRequest
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* refactor: address review comments on SecRuleUpdateActionById
  - Rename ClearActionsOfType to ClearDisruptiveActions
  - Add comments explaining quote trimming in action parsing
  - Remove empty line after function brace in updateActionBySingleID
  - Split engine_test.go: move output/helper tests to engine_output_test.go
* Apply suggestions from code review
  Co-authored-by: Matteo Pace <pace.matteo96@gmail.com>
* fix: use index-based iteration for SecRuleUpdateActionById range updates
  The range loop variable copied each Rule, so modifications to disruptive actions were lost. Use index-based iteration to modify rules in place. Also adds a test case exercising the range update path.
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
  Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: Matteo Pace <pace.matteo96@gmail.com>
* refactor: remove root package dependency on experimental (#1494)
* refactor: remove root package dependency on experimental
  Replace experimental.Options with corazawaf.Options in waf.go, breaking the import cycle that prevented the experimental package from importing the root coraza package. This unblocks PR #1478 and lets experimental helpers use coraza.WAFConfig with proper type safety instead of any.
* Update waf.go
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* chore: min go version to 1.25 (#1497)
* No content wants no body
* Update .github/workflows/regression.yml
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
* one more place
---------
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
* feat: add optional rule observer callback to WAF config (#1478)
* feat: add optional rule observer callback to WAF config
  Introduce an optional rule observer callback that is invoked for each rule successfully added to the WAF during initialization. The observer receives rule metadata via the existing RuleMetadata interface.
* Move to the experimental package
* Do not use reflection to keep the compatibility with older Go versions
* Use coraza.WAFConfig, move the test to where it belongs.
---------
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
  Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com>
* feat: add WAFWithRules interface with RulesCount() (#1492)
  Add WAFWithRules interface with RulesCount()
* fix(deps): update module golang.org/x/net to v0.51.0 [security] (#1502)
* fix(deps): update module golang.org/x/net to v0.51.0 [security]
* chore: update go.work to 1.25.0
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* chore: update golang to 1.25.0
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
  Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* chore(deps): update module golang.org/x/net to v0.51.0 [security] (#1506)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* fix: lowercase regex patterns for case-insensitive variable collections (#1505)
* fix: lowercase regex patterns for case-insensitive variable collections
  When a rule uses regex-based variable selection (e.g. TX:/PATTERN/), the regex pattern was compiled from the raw uppercase string before any case normalization. Since TX collection keys are stored lowercase, the uppercase regex would never match, causing rules like CRS 922110 (which uses TX:/MULTIPART_HEADERS_CONTENT_TYPES_*/) to silently fail. Now AddVariable and AddVariableNegation lowercase the regex pattern before compilation for case-insensitive variables, matching the existing behavior for string keys in newRuleVariableParams.
* chore: update coreruleset to v4.24.0
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* chore: update libinjection-go and deps (#1496)
* chore: update libinjection-go and deps
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* chore: update coreruleset v4.24.0
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: ctl:ruleRemoveTargetById to support whole-collection exclusion (#1495)
* Initial plan
* Fix ruleRemoveTargetById to support removing entire collection (empty key)
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* feat: add SecRequestBodyJsonDepthLimit directive (#1110)
* feat: add SecRequestBodyJsonDepthLimit directive
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* Apply suggestions from code review
* fix: mage format
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* Update internal/bodyprocessors/json_test.go
* Update internal/bodyprocessors/json_test.go
* fix: bad char
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: gofmt
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* docs: add clarifying comments for JSON recursion limit behavior
  - Explain why ResponseBodyRecursionLimit = -1 (unlimited for responses)
  - Document dual purpose of body reading (TX vars + ARGS_POST)
  - Clarify DoS protection mechanism in readItems()
  - Note how negative values bypass recursion check
* fix: address PR review comments for JSON depth limit
  - Always enforce a positive recursion limit: change ResponseBodyRecursionLimit from -1 (unlimited) to 1024, matching the request body default
  - Rename test case "broken1" to "unbalanced_brackets" for clarity
  - Extract error check from the key
    iteration loop in TestReadJSON
* test: add benchmarks for gjson.Valid pre-validation overhead
  Measures the cost of gjson.Valid() in the full readJSON pipeline. gjson.Parse is lazy (~9ns), so the real overhead is Valid vs the readItems traversal. Results show ~10-16% overhead for validation, which is acceptable for WAF safety. No single-pass alternative exists in the gjson API.
* Apply suggestions from code review
* Apply suggestion from @fzipi
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
  Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com>
* fix: update constants for recursion limit (#1512)
* fix: conflate the constants for recursion limit
* fix: value setting
* chore: remove panic from seclang compiler (#1514)
* Initial plan
* fix: replace panic with error return in parser.go evaluateLine
  Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com>
* fix: revert go.sum changes - do not modify go.sum files in this PR
  Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com>
* ci: reduce regression matrix from 128 to 15 jobs (#1522)
  Replace dynamic 64-permutation tag matrix with a curated static list of 13 build-flag combinations. Run all combos on Go 1.25.x and only baseline + kitchen-sink on Go 1.26.x. Add concurrency groups to regression, lint, tinygo, and codeql workflows so stale PR runs are auto-cancelled on new pushes.
* feat: ignore unexpected EOF in MIME multipart request body processor (#1453)
* Ignore unexpected EOF in MIME multipart request body processor
  We need this behavior since we need to process an incomplete MIME multipart request body when SecRequestBodyLimitAction is set to ProcessPartial.
* fix: add copilot code review comments
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
---------
  Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
  Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com>
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
  Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org>
* fix: set changed flag in removeComments and escapeSeqDecode (#1532)
  Fix two bugs where transformation functions modified the input string but did not report changed=true:
  - removeComments: entering a C-style (/* */) or HTML (<!-- -->) comment block did not set changed=true, causing the multi-match optimization to skip the transformed result.
  - escapeSeqDecode: unrecognized escape sequences (e.g. \z) dropped the backslash but did not set changed=true.
  Add test coverage for both fixes including a new remove_comments_test.go and an additional unrecognized-escape test case for escape_seq_decode.
* perf: use map for ruleRemoveByID for O(1) lookup (#1524)
* perf: use map for ruleRemoveByID for O(1) lookup
  Replace []int slice with map[int]struct{} for the per-transaction rule exclusion list. The rule evaluation loop checks this list for every rule in every phase, making O(1) map lookup significantly faster than O(n) linear scan when rules are excluded via ctl actions.
* test: add TestRemoveRuleByID for map-based rule exclusion
* bench: add BenchmarkRuleEvalWithRemovedRules
* refactor: use real unconditionalMatch operator from registry in tests
* Fix HTTP middleware to process all Transfer-Encoding values (#1518)
* Fix HTTP middleware to process all Transfer-Encoding values
  Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com>
  Co-authored-by: Matteo Pace <pace.matteo96@gmail.com>
* fix(deps): update module golang.org/x/sync to v0.20.0 in go.mod (#1543)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* feat: optimize ruleRemoveById range handling store ranges instead of expanding to int slices (#1538)
* Initial plan
* Optimize ruleRemoveById range handling to avoid generating massive int slices
  - Replace rangeToInts (which allocated []int of all matching rule IDs) with parseRange and parseIDOrRange helpers that return start/end integers
  - For ctlRuleRemoveByID with ranges: store the range in Transaction.ruleRemoveByIDRanges ([][2]int) and check it in the rule evaluation loop, avoiding both the intermediate []int and potentially large map expansions
  - For ctlRuleRemoveTargetByID: iterate rules once directly, eliminating the intermediate []int allocation
  - Add RemoveRuleByIDRange method to Transaction
  - Reset ruleRemoveByIDRanges on transaction pool reuse
  - Replace TestCtlParseRange with TestCtlParseIDOrRange to test the new helpers
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* Improve test coverage for range-based rule removal
  - Add TestRemoveRuleByIDRange in transaction_test.go:
    - range is stored in ruleRemoveByIDRanges
    - rules in range are skipped during Eval
    - multiple ranges work correctly
    - ruleRemoveByIDRanges is reset on transaction pool reuse
  - Add TestCtlParseRange in ctl_test.go to cover parseRange
    directly (including the no-separator and start>end error paths)
  - Add GetRuleRemoveByIDRanges() accessor on Transaction for cross-package test assertions
  - Enhance "ruleRemoveById range" TestCtl case to verify the range is stored
  - Add "ruleRemoveTargetById range" TestCtl case to verify range path works
  Coverage changes:
    parseRange: 83.3% → 100%
    parseIDOrRange: 100% (unchanged)
    RemoveRuleByIDRange: 0% → 100%
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* fix(testing): Correct use of ProcessURI in Benchmarks (#1546)
* perf: prefix-based transformation cache with inline values (#1544)
  Redesign the transformation cache to share intermediate results across rules with common transformation prefixes (e.g. rules using t:lowercase,t:urlDecodeUni reuse the t:lowercase result cached by an earlier rule using just t:lowercase).
  Key changes:
  - Add transformationPrefixIDs to Rule for backward prefix search
  - Cache every intermediate transformation step, not just the final result
  - Store cache values inline (not pointers) to avoid heap allocations
  - Fix ClearTransformations (t:none) to reset transformationsID
  Benchmarked against full CRS v4 ruleset (8 runs, benchstat):
    Allocations: -2% (small) to -19% (30 params)
    Memory: -2% (small) to -12% (30 params)
    Timing: -5% (small/large), neutral (medium)
  No regressions on any metric.
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* perf: bulk-allocate MatchData in collection Find methods (#1530)
* perf: bulk-allocate MatchData in collection Find methods
  Pre-allocate a contiguous []corazarules.MatchData buffer and take pointers into it instead of individually heap-allocating each MatchData. This reduces per-result allocations from N to 2 (one buf slice + one result slice), improving GC pressure for large result sets.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* perf: avoid double regex evaluation in FindRegex
  Collect matching data slices during the counting pass so the second pass only iterates over already-matched entries, eliminating redundant MatchString calls.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* bench: add FindAll/FindRegex/FindString benchmarks
---------
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
* perf: use FindStringSubmatchIndex to avoid capture allocations (#1547)
* perf: use FindStringSubmatchIndex to avoid capture allocations
  Replace FindStringSubmatch (allocates a []string slice per match) with FindStringSubmatchIndex (returns index pairs). Substrings passed to CaptureField become slices of the original input (zero allocation).
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: add BenchmarkRxCapture for submatch allocation comparison
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(DetectionOnly): fixed RelevantOnly audit logs, improved matchedRules (#1549)
* add detectedInterruption var for DetectionOnly mode
* IsDetectionOnly, refactor, populate matchedRules
* nit
* Apply suggestions from code review
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
---------
  Co-authored-by: Romain SERVIERES <romain@madeformed.com>
  Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
* fix(deps): update module golang.org/x/net to v0.52.0 in go.mod (#1553)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* ci: increase fuzztime (#1554)
* more fuzztime
* go mod
* chore(ci): harden GHA workflows with least-privilege permissions (#1559)
  - Add top-level `permissions: {}` (deny-all) to every workflow
  - Add scoped per-job permissions granting only what each job needs
  - Fix expression injection in
    regression.yml by using env instead of inline shell interpolation for BUILD_TAGS
  - Restrict regression.yml pull_request trigger to main branch only
  - Add explicit permissions to fuzz.yml (issues: write for failure reports)
  - Add security-events: write to CodeQL workflow
* feat: enable regex memoize by default (#1540)
* feat: enable regex memoize by default
  Memoization of regex and aho-corasick builders was previously opt-in via the `memoize_builders` build tag. Most users didn't know to enable it, missing a critical performance optimization. This commit:
  - Enables memoization by default (opt-out via `coraza.no_memoize` tag)
  - Refactors internal/memoize from package-level Do() to Memoizer struct
  - Adds Memoizer interface to plugintypes.OperatorOptions
  - Wires WAF's Memoizer through to all operator and rule consumers
  - Replaces `memoize_builders` build tag with `coraza.no_memoize` opt-out
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: document cache tradeoffs and add noop memoize test
  - Update README and memoize README to document global cache behavior and point to WAF.Close() for live-reload scenarios.
  - Add test file for coraza.no_memoize build variant to verify no-op behavior.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add WAF.Close() with per-owner memoize cache tracking and scale benchmarks (#1541)
* feat: add WAF.Close() with per-owner memoize cache tracking
  Add WAFCloser interface and per-owner tracking to the memoize cache so that long-lived processes can release compiled regex entries when a WAF instance is destroyed. Each WAF gets a uint64 ID; Release() removes the owner and tombstones entries with no remaining owners.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: add memoize scale benchmarks and CRS integration tests
  Add benchmarks demonstrating memoize value at scale (1-100 WAFs × 300 patterns) and CRS integration tests verifying Close() releases memory.
  Results show ~27x speedup for 100 WAFs and 27MiB released on Close().
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: add WAF.Close() calls to e2e and CRS tests
  Demonstrate proper WAFCloser usage in integration tests: e2e test, CRS FTW test, CRS benchmarks, and crsWAF helper.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* test: extend coraza.no_memoize coverage in noop_test.go (#1555)
* Initial plan
* test: extend noop_test.go coverage for coraza.no_memoize build tag
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* fix: check error return of m.Do in benchmark to resolve errcheck lint failure (#1556)
* Initial plan
* fix: check error return of m.Do in benchmark test to fix errcheck lint
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
* fix: skip memoize scale tests in short mode
  The scale tests (TestMemoizeScaleMultipleOwners, TestCacheGrowthWithoutClose, TestCacheBoundedWithClose) compile hundreds of regexes across many owners/cycles. Under TinyGo's slower regex engine these take hours when run in CI with -short. Gate all three scale tests behind testing.Short() in both sync_test.go and nosync_test.go so TinyGo CI (which passes -short) completes in reasonable time.
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(memoize): avoid deadlock in TinyGo's sync.Map during Release and Reset
  TinyGo's sync.Map.Range() holds its internal lock for the entire iteration. Calling cache.Delete() inside the Range callback tries to re-acquire the same non-reentrant lock, causing a deadlock.
Defer all cache.Delete() calls until after Range returns by collecting keys first. This also fixes t.Skip() in tests which does not halt execution in TinyGo due to unimplemented runtime.Goexit(). On standard Go this is a net performance win for Release (up to 60% faster at 100 owners) with negligible temporary memory (~9KB slice). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * feat: implement SecUploadKeepFiles directive (#1557) * feat: implement SecUploadKeepFiles with RelevantOnly support Add UploadKeepFilesStatus type supporting On, Off, and RelevantOnly values for the SecUploadKeepFiles directive. When set to On, uploaded files are preserved after transaction close. When set to RelevantOnly, files are kept only if rules matched during the transaction. Closes #1550 * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @M4tteoP Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * docs: update SecUploadKeepFiles in coraza.conf-recommended Remove the "not supported" note and document the RelevantOnly option. 
* fix: filter nolog rules in RelevantOnly upload keep files check RelevantOnly now only considers rules with Log enabled, matching the same filtering used for audit log part K. This prevents CRS initialization rules (nolog) from making RelevantOnly behave like On. * fix: require SecUploadDir when SecUploadKeepFiles is enabled Add validation in WAF.Validate() to ensure SecUploadDir is configured when SecUploadKeepFiles is set to On or RelevantOnly, matching the ModSecurity requirement. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: directive docs Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * fix: correct two compile errors in SecUploadKeepFiles implementation (#1560) * Initial plan * fix: correct lint errors - HasAccessToFS is a bool not a function, fix wrong constant name Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * fix: gofmt Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: skip SecUploadKeepFiles tests when no_fs_access build tag is set The upload keep files tests expected success for On/RelevantOnly modes, but the implementation correctly rejects these when filesystem access is disabled. Guard these test cases behind environment.HasAccessToFS. 
--------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> * feat: add regex support to ctl:ruleRemoveTargetById, ruleRemoveTargetByTag, and ruleRemoveTargetByMsg collection keys (#1561) * Initial plan * Add regex support to ctl:ruleRemoveTargetById for URI-scoped exclusions Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Use memoization for regex compilation in parseCtl Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Add benchmarks for short and medium regex exceptions in GetField Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * refactor: add HasRegex shared utility and use it in rule.go and ctl.go Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * test: add POST JSON body test for ruleRemoveTargetById regex key exclusion Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs: update RemoveRuleTargetByID comment to document keyRx parameter Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs: update ctl action doc comment to describe regex key syntax with example Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * test: add ruleRemoveTargetByTag and ruleRemoveTargetByMsg regex key integration tests Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * style: apply gofmt to internal/actions/ctl.go Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * test: add memoizer coverage to TestParseCtl for ctl regex path Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Initial plan * test: add e2e tests for JSONSTREAM body processor Co-authored-by: fzipi 
<3012076+fzipi@users.noreply.github.com> Agent-Logs-Url: https://github.com/corazawaf/coraza/sessions/bebca76e-344f-4966-8675-8bf4e5fda0cb --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Alexander S. <126732+heaven@users.noreply.github.com> Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com> Co-authored-by: Pierre POMES <pierre.pomes@gmail.com> Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> Co-authored-by: Juan Pablo Tosso <jptosso@gmail.com> Co-authored-by: Hiroaki Nakamura <hnakamur@gmail.com> Co-authored-by: Marc W. <113890636+MarcWort@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Romain SERVIERES <romain@madeformed.com>

fzipi requested a review from Copilot January 20, 2026 17:14

fzipi requested a review from a team as a code owner January 20, 2026 17:14

Copilot started reviewing on behalf of fzipi January 20, 2026 17:14

Copilot AI reviewed Jan 20, 2026

fzipi marked this pull request as draft January 20, 2026 17:52

feat: add RFC 7464 support and true streaming to JSON processor

d5f68c3

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

refactor: move bodyprocessor as experimental

78b0c45

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

fzipi requested a review from Copilot January 23, 2026 02:57

Copilot started reviewing on behalf of fzipi January 23, 2026 02:59

Copilot AI reviewed Jan 23, 2026

fzipi added 2 commits January 23, 2026 00:14

feat: add getter to plugins.bodyprocessors

43fb993

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

tests: add more coverage

ce3dca4

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

fzipi requested a review from Copilot January 23, 2026 03:30

Copilot started reviewing on behalf of fzipi January 23, 2026 03:31

Copilot AI reviewed Jan 23, 2026

fzipi and others added 5 commits February 14, 2026 18:21

feat: add new streaming records processing for phase 2

34f103a

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

Merge branch 'main' into feat/add-streaming-json-processor

8c8b160

test: add benchmarks

f2045ff

Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>

Merge branch 'main' into feat/add-streaming-json-processor

493bce1

Merge branch 'main' into feat/add-streaming-json-processor

222fbec

fzipi marked this pull request as ready for review February 15, 2026 12:50

fzipi requested a review from Copilot February 15, 2026 12:51

Copilot started reviewing on behalf of fzipi February 15, 2026 12:52

Copilot AI reviewed Feb 15, 2026

internal/corazawaf/transaction.go (outdated, resolved)

internal/corazawaf/transaction.go (outdated, resolved)

jcchavezs reviewed Feb 15, 2026

experimental/bodyprocessors/jsonstream.go (outdated, resolved)

Copilot AI mentioned this pull request Mar 21, 2026

feat: add JSON Stream (NDJSON) body processor #1563

Merged

fzipi commented Jan 20, 2026 • edited

what

Streaming body processor

Per-record rule evaluation

New StreamingBodyProcessor interface

Streaming relay support

Usage example

Testing

Benchmark Results (Apple M2)

ProcessRequest (buffered) vs Callback (streaming)

RFC 7464 (JSON Sequence) via Callback

Key Takeaways

Record Templates

codecov bot commented Jan 20, 2026 • edited

Codecov Report

Copilot AI left a comment

Pull request overview

Reviewed changes

jcchavezs commented Jan 22, 2026

fzipi commented Jan 22, 2026

Copilot AI left a comment

Pull request overview

Copilot AI left a comment

Pull request overview

Copilot AI left a comment

Pull request overview

fzipi commented Mar 21, 2026

Copilot AI commented Mar 21, 2026

4 participants
