Implements a new body processor for handling streaming JSON formats:
- NDJSON (Newline Delimited JSON)
- JSON Lines
- JSON Sequence (RFC 7464)
Features:
- Line-by-line processing for memory efficiency
- Each JSON object indexed by line number (json.0.field, json.1.field)
- Built-in DoS protection with 1024 recursion limit
- TX variables for raw body and line count
- Support for nested objects and arrays
- Comprehensive error handling
Configuration:
- Added rules to coraza.conf-recommended for NDJSON content types
- Optional line count limiting rule
- Registered under JSONSTREAM, NDJSON, and JSONLINES aliases
Testing:
- 13 comprehensive test cases covering:
  - Single/multiple lines
  - Nested objects and arrays
  - Error cases (invalid JSON, empty stream)
  - Recursion limit enforcement
  - TX variable storage
- Benchmark: ~5,000 ops/sec for 100-object streams
Usage example:
SecRule REQUEST_HEADERS:Content-Type "^application/x-ndjson" \
"id:'200007',phase:1,pass,nolog,ctl:requestBodyProcessor=JSONSTREAM"
Closes: Related to streaming JSON support discussion
Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1481      +/-   ##
==========================================
- Coverage   85.30%   84.07%   -1.24%
==========================================
  Files         174      175       +1
  Lines        8461     8811     +350
==========================================
+ Hits         7218     7408     +190
- Misses        994     1146     +152
- Partials      249      257       +8
|
Pull request overview
This PR adds support for JSON Stream (NDJSON) body processing to Coraza WAF, enabling line-by-line processing of streaming JSON formats. The implementation includes a new body processor that handles NDJSON and JSON Lines, and that claims support for JSON Sequence (RFC 7464).
Changes:
- New jsonStreamBodyProcessor that processes JSON objects line-by-line with memory-efficient streaming
- Built-in DoS protection via configurable recursion limits (default 1024)
- TX variable storage for raw body and line count to enable custom validation rules
- Configuration rules in coraza.conf-recommended for NDJSON content types
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| internal/bodyprocessors/jsonstream.go | Core implementation of NDJSON body processor with line-by-line parsing, recursion limits, and TX variable storage |
| internal/bodyprocessors/jsonstream_test.go | Comprehensive test suite with 13 test cases covering single/multiple lines, nested objects, arrays, error cases, and benchmarks |
| coraza.conf-recommended | Configuration rules for enabling NDJSON processing based on Content-Type headers, with optional line count limiting |
Memory Documentation:
- Add explicit documentation about 2x memory usage from TeeReader
- Clarify that this is necessary for TX variables (like regular JSON processor)
- Note memory implications: 2x body size (buffer + parsed variables)
Line Numbering:
- Use 1-based line numbers in error messages instead of 0-based
- More user-friendly: "line 1" instead of "line 0"
- Applied to both invalid JSON and parsing errors
Scanner Buffer Limit:
- Increase max scan token size from default 64KB to 1MB
- Prevents failure on large JSON objects per line
- Set initial buffer to 64KB, max to 1MB for memory efficiency
Configuration Consistency:
- Fix rule 200008 to use JSONSTREAM (was NDJSON)
- Now consistent with rule 200007
- Both rules use the same processor name
Test Code Quality:
- Replace string concatenation with fmt.Sprintf for line numbers
- Fix issue where rune('0'+tt.line) only works for single digits
- Add fmt import to test file
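The single-digit problem is easy to see in isolation. In this sketch the helper names are hypothetical; it contrasts the rune arithmetic with fmt.Sprintf for a two-digit line number.

```go
package main

import "fmt"

// nameWithRune builds the suffix with rune arithmetic, which is only
// correct for values 0 through 9.
func nameWithRune(line int) string {
	return "line_" + string(rune('0'+line))
}

// nameWithSprintf formats the number as decimal text, for any value.
func nameWithSprintf(line int) string {
	return fmt.Sprintf("line_%d", line)
}

func main() {
	fmt.Println(nameWithRune(7), nameWithSprintf(7))   // line_7 line_7
	fmt.Println(nameWithRune(12), nameWithSprintf(12)) // line_< line_12
}
```

For 12 the rune version yields '<' because '0' is code point 48 and 48+12 is 60.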
Documentation Accuracy:
- Remove RFC 7464 JSON Sequence from "supported formats"
- Add note that RS separator (0x1E) is not yet implemented
- Avoid misleading users about unsupported features
All tests passing: 13/13
Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
|
Could we add this to experimental? |
|
I thought I mentioned this. Yes, that was my idea. |
Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
- Extract inline interface to named indexedCollection type (jcchavezs)
- Preserve original stream format in relay by including format-specific delimiters in rawRecord (NDJSON uses \n, RFC 7464 uses RS prefix + \n)
- Update readItemsWithLimit TODO comments to reference #1110
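A rough sketch of the two framings mentioned in this commit: NDJSON terminates each record with a line feed, while RFC 7464 prefixes each record with the RS byte (0x1E) and also ends it with a line feed. The type and function names (format, frame, splitJSONSeq) are hypothetical and are not taken from the PR.

```go
package main

import (
	"bytes"
	"fmt"
)

// rs is the ASCII Record Separator used by RFC 7464.
const rs = 0x1E

type format int

const (
	formatNDJSON format = iota
	formatJSONSeq
)

// frame returns record wrapped in the delimiters of the given stream format.
func frame(f format, record []byte) []byte {
	out := make([]byte, 0, len(record)+2)
	if f == formatJSONSeq {
		out = append(out, rs)
	}
	out = append(out, record...)
	return append(out, '\n')
}

// splitJSONSeq splits an RFC 7464 stream on RS and trims surrounding whitespace.
func splitJSONSeq(stream []byte) [][]byte {
	var records [][]byte
	for _, chunk := range bytes.Split(stream, []byte{rs}) {
		chunk = bytes.TrimSpace(chunk)
		if len(chunk) > 0 {
			records = append(records, chunk)
		}
	}
	return records
}

func main() {
	rec := []byte(`{"a":1}`)
	fmt.Printf("%q\n", frame(formatNDJSON, rec))
	fmt.Printf("%q\n", frame(formatJSONSeq, rec))

	stream := append(frame(formatJSONSeq, rec), frame(formatJSONSeq, []byte(`{"a":2}`))...)
	fmt.Println(len(splitJSONSeq(stream)))
}
```

Re-framing each record with its original delimiters means a downstream consumer sees the same wire format the client sent.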
|
@copilot Fix the conflicts first. Then add e2e tests to tests for this new body processor. |
* fix(deps): update module golang.org/x/net to v0.45.0 [security] (#1487) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> * fix(deps): update go modules in go.mod (#1433) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> * docs(actions): update format and add package (#1475) * docs(actions): update format and add package Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: update documentation for package Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: go fmt Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: add A-Z to auditlog (#1479) Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: SecRuleUpdateActionById should replace disruptive actions (#1471) * fix: SecRuleUpdateActionById should replace disruptive actions Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: multiphase test with bad expectations Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * tests: improve coverage on engine Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * refactor: address SecRuleUpdateActionById review comments (#1484) * Initial plan * Address code review comments: improve documentation, fix double parsing, and fix range logic Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Refactor: Extract hasDisruptiveActions helper to avoid code duplication Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs: Improve applyParsedActions documentation Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs: Clarify body parsing logic in SetRawRequest Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * refactor: address review comments on SecRuleUpdateActionById 
- Rename ClearActionsOfType to ClearDisruptiveActions - Add comments explaining quote trimming in action parsing - Remove empty line after function brace in updateActionBySingleID - Split engine_test.go: move output/helper tests to engine_output_test.go * Apply suggestions from code review Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * fix: use index-based iteration for SecRuleUpdateActionById range updates The range loop variable copied each Rule, so modifications to disruptive actions were lost. Use index-based iteration to modify rules in place. Also adds a test case exercising the range update path. --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * refactor: remove root package dependency on experimental (#1494) * refactor: remove root package dependency on experimental Replace experimental.Options with corazawaf.Options in waf.go, breaking the import cycle that prevented the experimental package from importing the root coraza package. This unblocks PR #1478 and lets experimental helpers use coraza.WAFConfig with proper type safety instead of any. * Update waf.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * chore: min go version to 1.25 (#1497) * No content wants no body * Update .github/workflows/regression.yml Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> * one more place --------- Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> * feat: add optional rule observer callback to WAF config (#1478) * feat: add optional rule observer callback to WAF config Introduce an optional rule observer callback that is invoked for each rule successfully added to the WAF during initialization. The observer receives rule metadata via the existing RuleMetadata interface. 
* Move to the experimental package * Do not use reflection to keep the compatibility with older Go versions * Use coraza.WAFConfig, move the test to where it belongs. --------- Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com> * feat: add WAFWithRules interface with RulesCount() (#1492) Add WAFWithRules interface with RulesCount() * fix(deps): update module golang.org/x/net to v0.51.0 [security] (#1502) * fix(deps): update module golang.org/x/net to v0.51.0 [security] * chore: update go.work to 1.25.0 Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * chore: update golang to 1.25.0 Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org> * chore(deps): update module golang.org/x/net to v0.51.0 [security] (#1506) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> * fix: lowercase regex patterns for case-insensitive variable collections (#1505) * fix: lowercase regex patterns for case-insensitive variable collections When a rule uses regex-based variable selection (e.g. TX:/PATTERN/), the regex pattern was compiled from the raw uppercase string before any case normalization. Since TX collection keys are stored lowercase, the uppercase regex would never match, causing rules like CRS 922110 (which uses TX:/MULTIPART_HEADERS_CONTENT_TYPES_*/) to silently fail. Now AddVariable and AddVariableNegation lowercase the regex pattern before compilation for case-insensitive variables, matching the existing behavior for string keys in newRuleVariableParams. 
* chore: update coreruleset to v4.24.0 Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * chore: update libinjection-go and deps (#1496) * chore: update libinjection-go and deps Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * chore: update coreruleset v4.24.0 Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: ctl:ruleRemoveTargetById to support whole-collection exclusion (#1495) * Initial plan * Fix ruleRemoveTargetById to support removing entire collection (empty key) Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * feat: add SecRequestBodyJsonDepthLimit directive (#1110) * feat: add SecRequestBodyJsonDepthLimit directive Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * Apply suggestions from code review * fix: mage format Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * Update internal/bodyprocessors/json_test.go * Update internal/bodyprocessors/json_test.go * fix: bad char Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: gofmt Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * docs: add clarifying comments for JSON recursion limit behavior - Explain why ResponseBodyRecursionLimit = -1 (unlimited for responses) - Document dual purpose of body reading (TX vars + ARGS_POST) - Clarify DoS protection mechanism in readItems() - Note how negative values bypass recursion check * fix: address PR review comments for JSON depth limit - Always enforce a positive recursion limit: change ResponseBodyRecursionLimit from -1 (unlimited) to 1024, matching the request body default - Rename test case "broken1" to "unbalanced_brackets" for clarity - Extract error check from the key 
iteration loop in TestReadJSON * test: add benchmarks for gjson.Valid pre-validation overhead Measures the cost of gjson.Valid() in the full readJSON pipeline. gjson.Parse is lazy (~9ns), so the real overhead is Valid vs the readItems traversal. Results show ~10-16% overhead for validation, which is acceptable for WAF safety. No single-pass alternative exists in the gjson API. * Apply suggestions from code review * Apply suggestion from @fzipi --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com> * fix: update constants for recursion limit (#1512) * fix: conflate the constants for recursion limit * fix: value setting * chore: remove panic from seclang compiler (#1514) * Initial plan * fix: replace panic with error return in parser.go evaluateLine Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> * fix: revert go.sum changes - do not modify go.sum files in this PR Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> * ci: reduce regression matrix from 128 to 15 jobs (#1522) Replace dynamic 64-permutation tag matrix with a curated static list of 13 build-flag combinations. Run all combos on Go 1.25.x and only baseline + kitchen-sink on Go 1.26.x. Add concurrency groups to regression, lint, tinygo, and codeql workflows so stale PR runs are auto-cancelled on new pushes. * feat: ignore unexpected EOF in MIME multipart request body processor (#1453) * Ignore unexpected EOF in MIME multipart request body processor We need this behavior since we need to process an incomplete MIME multipart request body when SecRequestBodyLimitAction is set to ProcessPartial. 
* fix: add copilot code review comments Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: set changed flag in removeComments and escapeSeqDecode (#1532) Fix two bugs where transformation functions modified the input string but did not report changed=true: - removeComments: entering a C-style (/* */) or HTML (<!-- -->) comment block did not set changed=true, causing the multi-match optimization to skip the transformed result. - escapeSeqDecode: unrecognized escape sequences (e.g. \z) dropped the backslash but did not set changed=true. Add test coverage for both fixes including a new remove_comments_test.go and an additional unrecognized-escape test case for escape_seq_decode. * perf: use map for ruleRemoveByID for O(1) lookup (#1524) * perf: use map for ruleRemoveByID for O(1) lookup Replace []int slice with map[int]struct{} for the per-transaction rule exclusion list. The rule evaluation loop checks this list for every rule in every phase, making O(1) map lookup significantly faster than O(n) linear scan when rules are excluded via ctl actions. 
* test: add TestRemoveRuleByID for map-based rule exclusion * bench: add BenchmarkRuleEvalWithRemovedRules * refactor: use real unconditionalMatch operator from registry in tests * Fix HTTP middleware to process all Transfer-Encoding values (#1518) * Fix HTTP middleware to process all Transfer-Encoding values Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * fix(deps): update module golang.org/x/sync to v0.20.0 in go.mod (#1543) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> * feat: optimize ruleRemoveById range handling store ranges instead of expanding to int slices (#1538) * Initial plan * Optimize ruleRemoveById range handling to avoid generating massive int slices - Replace rangeToInts (which allocated []int of all matching rule IDs) with parseRange and parseIDOrRange helpers that return start/end integers - For ctlRuleRemoveByID with ranges: store the range in Transaction.ruleRemoveByIDRanges ([][2]int) and check it in the rule evaluation loop, avoiding both the intermediate []int and potentially large map expansions - For ctlRuleRemoveTargetByID: iterate rules once directly, eliminating the intermediate []int allocation - Add RemoveRuleByIDRange method to Transaction - Reset ruleRemoveByIDRanges on transaction pool reuse - Replace TestCtlParseRange with TestCtlParseIDOrRange to test the new helpers Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Improve test coverage for range-based rule removal - Add TestRemoveRuleByIDRange in transaction_test.go: - range is stored in ruleRemoveByIDRanges - rules in range are skipped during Eval - multiple ranges work correctly - ruleRemoveByIDRanges is reset on transaction pool reuse - Add TestCtlParseRange in ctl_test.go to cover parseRange 
directly (including the no-separator and start>end error paths) - Add GetRuleRemoveByIDRanges() accessor on Transaction for cross-package test assertions - Enhance "ruleRemoveById range" TestCtl case to verify the range is stored - Add "ruleRemoveTargetById range" TestCtl case to verify range path works Coverage changes: parseRange: 83.3% → 100% parseIDOrRange: 100% (unchanged) RemoveRuleByIDRange: 0% → 100% Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * fix(testing): Correct use of ProcessURI in Benchmarks (#1546) * perf: prefix-based transformation cache with inline values (#1544) Redesign the transformation cache to share intermediate results across rules with common transformation prefixes (e.g. rules using t:lowercase,t:urlDecodeUni reuse the t:lowercase result cached by an earlier rule using just t:lowercase). Key changes: - Add transformationPrefixIDs to Rule for backward prefix search - Cache every intermediate transformation step, not just the final result - Store cache values inline (not pointers) to avoid heap allocations - Fix ClearTransformations (t:none) to reset transformationsID Benchmarked against full CRS v4 ruleset (8 runs, benchstat): Allocations: -2% (small) to -19% (30 params) Memory: -2% (small) to -12% (30 params) Timing: -5% (small/large), neutral (medium) No regressions on any metric. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * perf: bulk-allocate MatchData in collection Find methods (#1530) * perf: bulk-allocate MatchData in collection Find methods Pre-allocate a contiguous []corazarules.MatchData buffer and take pointers into it instead of individually heap-allocating each MatchData. This reduces per-result allocations from N to 2 (one buf slice + one result slice), improving GC pressure for large result sets. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * perf: avoid double regex evaluation in FindRegex Collect matching data slices during the counting pass so the second pass only iterates over already-matched entries, eliminating redundant MatchString calls. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * bench: add FindAll/FindRegex/FindString benchmarks --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> * perf: use FindStringSubmatchIndex to avoid capture allocations (#1547) * perf: use FindStringSubmatchIndex to avoid capture allocations Replace FindStringSubmatch (allocates a []string slice per match) with FindStringSubmatchIndex (returns index pairs). Substrings passed to CaptureField become slices of the original input — zero allocation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add BenchmarkRxCapture for submatch allocation comparison Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(DetectionOnly): fixed RelevantOnly audit logs, improved matchedRules (#1549) * add detectedInterruption var for DetectionOnly mode * IsDetectionOnly, refactor, populate matchedRules * nit * Apply suggestions from code review Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: Romain SERVIERES <romain@madeformed.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> * fix(deps): update module golang.org/x/net to v0.52.0 in go.mod (#1553) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> * ci: increase fuzztime (#1554) * more fuzztime * go mod * chore(ci): harden GHA workflows with least-privilege permissions (#1559) - Add top-level `permissions: {}` (deny-all) to every workflow - Add scoped per-job permissions granting only what each job needs - Fix expression injection in 
regression.yml by using env instead of inline shell interpolation for BUILD_TAGS - Restrict regression.yml pull_request trigger to main branch only - Add explicit permissions to fuzz.yml (issues: write for failure reports) - Add security-events: write to CodeQL workflow * feat: enable regex memoize by default (#1540) * feat: enable regex memoize by default Memoization of regex and aho-corasick builders was previously opt-in via the `memoize_builders` build tag. Most users didn't know to enable it, missing a critical performance optimization. This commit: - Enables memoization by default (opt-out via `coraza.no_memoize` tag) - Refactors internal/memoize from package-level Do() to Memoizer struct - Adds Memoizer interface to plugintypes.OperatorOptions - Wires WAF's Memoizer through to all operator and rule consumers - Replaces `memoize_builders` build tag with `coraza.no_memoize` opt-out Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: document cache tradeoffs and add noop memoize test - Update README and memoize README to document global cache behavior and point to WAF.Close() for live-reload scenarios. - Add test file for coraza.no_memoize build variant to verify no-op behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add WAF.Close() with per-owner memoize cache tracking and scale benchmarks (#1541) * feat: add WAF.Close() with per-owner memoize cache tracking Add WAFCloser interface and per-owner tracking to the memoize cache so that long-lived processes can release compiled regex entries when a WAF instance is destroyed. Each WAF gets a uint64 ID; Release() removes the owner and tombstones entries with no remaining owners. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add memoize scale benchmarks and CRS integration tests Add benchmarks demonstrating memoize value at scale (1-100 WAFs × 300 patterns) and CRS integration tests verifying Close() releases memory. 
Results show ~27x speedup for 100 WAFs and 27MiB released on Close(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add WAF.Close() calls to e2e and CRS tests Demonstrate proper WAFCloser usage in integration tests: e2e test, CRS FTW test, CRS benchmarks, and crsWAF helper. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * test: extend coraza.no_memoize coverage in noop_test.go (#1555) * Initial plan * test: extend noop_test.go coverage for coraza.no_memoize build tag Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * fix: check error return of m.Do in benchmark to resolve errcheck lint failure (#1556) * Initial plan * fix: check error return of m.Do in benchmark test to fix errcheck lint Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * fix: skip memoize scale tests in short mode The scale tests (TestMemoizeScaleMultipleOwners, TestCacheGrowthWithoutClose, TestCacheBoundedWithClose) compile hundreds of regexes across many owners/cycles. Under TinyGo's slower regex engine these take hours when run in CI with -short. Gate all three scale tests behind testing.Short() in both sync_test.go and nosync_test.go so TinyGo CI (which passes -short) completes in reasonable time. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(memoize): avoid deadlock in TinyGo's sync.Map during Release and Reset TinyGo's sync.Map.Range() holds its internal lock for the entire iteration. Calling cache.Delete() inside the Range callback tries to re-acquire the same non-reentrant lock, causing a deadlock. 
Defer all cache.Delete() calls until after Range returns by collecting keys first. This also fixes t.Skip() in tests which does not halt execution in TinyGo due to unimplemented runtime.Goexit(). On standard Go this is a net performance win for Release (up to 60% faster at 100 owners) with negligible temporary memory (~9KB slice). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * feat: implement SecUploadKeepFiles directive (#1557) * feat: implement SecUploadKeepFiles with RelevantOnly support Add UploadKeepFilesStatus type supporting On, Off, and RelevantOnly values for the SecUploadKeepFiles directive. When set to On, uploaded files are preserved after transaction close. When set to RelevantOnly, files are kept only if rules matched during the transaction. Closes #1550 * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @M4tteoP Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * docs: update SecUploadKeepFiles in coraza.conf-recommended Remove the "not supported" note and document the RelevantOnly option. 
* fix: filter nolog rules in RelevantOnly upload keep files check RelevantOnly now only considers rules with Log enabled, matching the same filtering used for audit log part K. This prevents CRS initialization rules (nolog) from making RelevantOnly behave like On. * fix: require SecUploadDir when SecUploadKeepFiles is enabled Add validation in WAF.Validate() to ensure SecUploadDir is configured when SecUploadKeepFiles is set to On or RelevantOnly, matching the ModSecurity requirement. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: directive docs Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> * fix: correct two compile errors in SecUploadKeepFiles implementation (#1560) * Initial plan * fix: correct lint errors - HasAccessToFS is a bool not a function, fix wrong constant name Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * fix: gofmt Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> * fix: skip SecUploadKeepFiles tests when no_fs_access build tag is set The upload keep files tests expected success for On/RelevantOnly modes, but the implementation correctly rejects these when filesystem access is disabled. Guard these test cases behind environment.HasAccessToFS. 
--------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> * feat: add regex support to ctl:ruleRemoveTargetById, ruleRemoveTargetByTag, and ruleRemoveTargetByMsg collection keys (#1561) * Initial plan * Add regex support to ctl:ruleRemoveTargetById for URI-scoped exclusions Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Use memoization for regex compilation in parseCtl Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Add benchmarks for short and medium regex exceptions in GetField Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * refactor: add HasRegex shared utility and use it in rule.go and ctl.go Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * test: add POST JSON body test for ruleRemoveTargetById regex key exclusion Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs: update RemoveRuleTargetByID comment to document keyRx parameter Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs: update ctl action doc comment to describe regex key syntax with example Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * test: add ruleRemoveTargetByTag and ruleRemoveTargetByMsg regex key integration tests Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * style: apply gofmt to internal/actions/ctl.go Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * test: add memoizer coverage to TestParseCtl for ctl regex path Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * Initial plan * test: add e2e tests for JSONSTREAM body processor Co-authored-by: fzipi 
<3012076+fzipi@users.noreply.github.com> Agent-Logs-Url: https://github.com/corazawaf/coraza/sessions/bebca76e-344f-4966-8675-8bf4e5fda0cb --------- Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Matteo Pace <pace.matteo96@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Alexander S. <126732+heaven@users.noreply.github.com> Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com> Co-authored-by: Pierre POMES <pierre.pomes@gmail.com> Co-authored-by: Felipe Zipitria <felipe.zipitria@owasp.org> Co-authored-by: jptosso <1236942+jptosso@users.noreply.github.com> Co-authored-by: Juan Pablo Tosso <jptosso@gmail.com> Co-authored-by: Hiroaki Nakamura <hnakamur@gmail.com> Co-authored-by: Marc W. <113890636+MarcWort@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Romain SERVIERES <romain@madeformed.com>
what
Implements a new body processor for handling streaming JSON formats with per-record rule evaluation:
Streaming body processor
- Each JSON object indexed by line number (json.0.field, json.1.field)
- Registered under JSONSTREAM, NDJSON, and JSONLINES aliases
Per-record rule evaluation
Instead of parsing all records into ArgsPost and evaluating Phase 2 rules once, rules are now evaluated after each complete JSON record:
- ArgsPost is cleared between records so each record is evaluated in isolation
- [...] still trigger a block when the accumulated score exceeds it
- Eval() is safe to call multiple times per phase: AllowTypePhase, Skip, and the transformation cache all reset correctly between calls
New StreamingBodyProcessor interface
Extends BodyProcessor with ProcessRequestRecords/ProcessResponseRecords methods that yield parsed records one at a time via callback. The transaction detects this interface via type assertion and switches to the per-record evaluation path automatically.
Streaming relay support
Added ProcessRequestBodyFromStream(input, output)/ProcessResponseBodyFromStream(input, output) methods on the transaction for integrators building custom streaming middleware. These read records from input, evaluate rules per record, and write clean records to output. Exposed via an experimental StreamingTransaction interface.
Usage example
Testing
auto-detection)
Benchmark Results (Apple M2)
ProcessRequest (buffered) vs Callback (streaming)
RFC 7464 (JSON Sequence) via Callback
Key Takeaways
[...] TransactionVariables collections (ArgsPost, TX vars, raw body via TeeReader), which accounts for the reduced allocations.
Record Templates
- {"id":1,"name":"Alice"} (24 bytes)
- {"user_id":1234567890,"name":"User Name","email":"user@example.com","role":"admin","active":true,"tags":["tag1","tag2","tag3"]} (128 bytes)
- {"user":{"name":"Alice","address":{"city":"NYC","zip":"10001"}},"scores":[95,87,92],"meta":{"created":"2026-01-01","active":true}} (131 bytes)