7 changes: 4 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ The Coraza Project maintains implementations and plugins for the following serve

## Prerequisites

* Go v1.22+ or tinygo compiler
* Recent Go version (see [go.mod](./go.mod)) or tinygo compiler.
* Linux distribution (Debian or Centos recommended), Windows or Mac.

## Coraza Core Usage
Expand Down Expand Up @@ -107,8 +107,9 @@ live reloads, use `WAF.Close()` (via `experimental.WAFCloser`) to release cached
WAF is destroyed, or use this tag to opt out of memoization entirely.
* `no_fs_access` - indicates that the target environment has no access to FS in order to not leverage OS' filesystem related functionality e.g. file body buffers.
* `coraza.rule.case_sensitive_args_keys` - enables case-sensitive matching for ARGS keys, aligning Coraza behavior with RFC 3986 specification. It will be enabled by default in the next major version.
* `coraza.rule.no_regex_multiline` - disables enabling by default regexes multiline modifiers in `@rx` operator. It aligns with CRS expected behavior, reduces false positives and might improve performances. No multiline regexes by default will be enabled in the next major version. For more context check [this PR](https://github.com/corazawaf/coraza/pull/876)
* `coraza.rule.no_regex_multiline` - disables enabling by default regexes multiline modifiers in `@rx` operator. It aligns with CRS expected behavior, reduces false positives and might improve performances. No multiline regexes by default will be enabled in the next major version. For more context check [this PR](https://github.com/corazawaf/coraza/pull/876).
* `coraza.rule.mandatory_rule_id_check` - enables strict rule id check where `id` action is required for all SecRule/SecAction.
* `coraza.rule.rx_prefilter` - sets the default value of the `SecRxPreFilter` directive to `On`. The prefilter optimizes the `@rx` operator by skipping the full regex when an input cannot match. The build tag is meant for testing; rely on the `SecRxPreFilter` directive for runtime configuration and broader documentation.

## E2E Testing

Expand Down Expand Up @@ -180,7 +181,7 @@ Our vulnerability management team will respond within 3 working days of your rep

## Donations

For donations, see [Donations site](https://owasp.org/donate/?reponame=www-project-coraza-web-application-firewall&title=OWASP+Coraza+Web+Application+Firewall)
For donations, see [Donations site](https://owasp.org/donate/?reponame=www-project-coraza-web-application-firewall&title=OWASP+Coraza+Web+Application+Firewall).

## Thanks to all the people who have contributed

Expand Down
9 changes: 9 additions & 0 deletions coraza.conf-recommended
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,15 @@
SecRuleEngine DetectionOnly


# -- Performance optimizations -----------------------------------------------

# Enable compile-time literal pre-filtering for the @rx operator.
# When enabled, Coraza analyses each regex pattern at rule-load time and
# builds cheap pre-checks that run before the full regex evaluation,
# letting Coraza skip the regex when an input clearly cannot match.
#
SecRxPreFilter Off

# -- Request body handling ---------------------------------------------------

# Allows Coraza to access request bodies. Without this, Coraza
Expand Down
8 changes: 6 additions & 2 deletions experimental/plugins/plugintypes/operator.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,10 @@ type Memoizer interface {

// OperatorOptions is used to store the options for a rule operator
type OperatorOptions struct {
// Arguments is used to store the operator args
// Arguments stores the operator args.
Arguments string

// Path is used to store a list of possible data paths
// Path stores a list of possible data paths.
Path []string

// Root is the root to resolve Path from.
Expand All @@ -27,6 +27,10 @@ type OperatorOptions struct {

// Memoizer caches expensive compilations (regex, aho-corasick).
Memoizer Memoizer

// RxPreFilterEnabled controls whether the @rx operator's compile-time
// literal pre-filtering is enabled.
RxPreFilterEnabled bool
}

// Operator interface is used to define rule @operators
Expand Down
11 changes: 11 additions & 0 deletions internal/corazawaf/rxprefilter_default.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
// Copyright 2022 Juan Pablo Tosso and the OWASP Coraza contributors
// SPDX-License-Identifier: Apache-2.0

//go:build !coraza.rule.rx_prefilter

package corazawaf

// defaultRxPreFilterEnabled is false when the coraza.rule.rx_prefilter build tag
// is not set. The prefilter code is always compiled and is disabled by default;
// it can be enabled per WAF instance via the SecRxPreFilter directive. The build
// tag only flips the default so that the whole test suite can run with the
// feature enabled.
const defaultRxPreFilterEnabled = false
12 changes: 12 additions & 0 deletions internal/corazawaf/rxprefilter_on.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
// Copyright 2022 Juan Pablo Tosso and the OWASP Coraza contributors
// SPDX-License-Identifier: Apache-2.0

//go:build coraza.rule.rx_prefilter

package corazawaf

// defaultRxPreFilterEnabled is true when the coraza.rule.rx_prefilter build tag
// is set so that the entire test suite (and any deployment built with the tag)
// exercises the prefilter path without requiring an explicit SecRxPreFilter On
// directive. The directive can still override this per WAF instance.
const defaultRxPreFilterEnabled = true
11 changes: 8 additions & 3 deletions internal/corazawaf/waf.go
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,10 @@ type WAF struct {
// Configures the maximum number of ARGS that will be accepted for processing.
ArgumentLimit int

// RxPreFilterEnabled controls whether the @rx operator uses compile-time
// literal pre-filtering. Set by the SecRxPreFilter directive.
RxPreFilterEnabled bool

memoizerID uint64
memoizer *memoize.Memoizer
closeOnce gosync.Once
Expand Down Expand Up @@ -331,9 +335,10 @@ func NewWAF() *WAF {
types.AuditLogPartResponseHeaders,
types.AuditLogPartAuditLogTrailer,
},
AuditLogFormat: "Native",
Logger: logger,
ArgumentLimit: 1000,
AuditLogFormat: "Native",
Logger: logger,
ArgumentLimit: 1000,
RxPreFilterEnabled: defaultRxPreFilterEnabled,
}

if environment.HasAccessToFS {
Expand Down
20 changes: 14 additions & 6 deletions internal/operators/rx.go
Original file line number Diff line number Diff line change
Expand Up @@ -82,16 +82,22 @@ func newRX(options plugintypes.OperatorOptions) (plugintypes.Operator, error) {
// Compile regex + prefilter together so memoize caches all artifacts as one
// unit. This avoids re-parsing the AST for minMatchLength/prefilterFunc when
// the same pattern appears in multiple rules.
compiled, err := memoizeDo(options.Memoizer, data, func() (any, error) {
//
// The prefilter flag is part of the key because the global cache is shared
// across all WAF instances: two WAFs with different SecRxPreFilter settings
// must not share a compiled artifact.
cacheKey := fmt.Sprintf("rx:%v:%s", options.RxPreFilterEnabled, data)
compiled, err := memoizeDo(options.Memoizer, cacheKey, func() (any, error) {
re, err := regexp.Compile(data)
if err != nil {
return nil, err
}
return &rxCompiled{
re: re,
minLen: minMatchLength(data),
prefilter: prefilterFunc(data),
}, nil
c := &rxCompiled{re: re}
if options.RxPreFilterEnabled {
c.minLen = minMatchLength(data)
c.prefilter = prefilterFunc(data)
}
return c, nil
})
if err != nil {
return nil, err
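Reviewer sketch of why the prefilter flag belongs in the cache key. This is a minimal, self-contained illustration and not the project's memoizer: `sketchCache` and `rxCacheKey` are hypothetical names, and only the key format `rx:%v:%s` is taken from the hunk above.

```go
package main

import (
	"fmt"
	"sync"
)

// sketchCache is a hypothetical stand-in for a cache shared by all WAF instances.
type sketchCache struct {
	mu      sync.Mutex
	entries map[string]any
}

// do returns the cached value for key, calling build only on a cache miss.
func (c *sketchCache) do(key string, build func() (any, error)) (any, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[key]; ok {
		return v, nil
	}
	v, err := build()
	if err != nil {
		return nil, err
	}
	c.entries[key] = v
	return v, nil
}

// rxCacheKey mirrors the key format from the diff: the flag is part of the key.
func rxCacheKey(prefilterEnabled bool, pattern string) string {
	return fmt.Sprintf("rx:%v:%s", prefilterEnabled, pattern)
}

func main() {
	cache := &sketchCache{entries: map[string]any{}}
	builds := 0
	build := func() (any, error) { builds++; return builds, nil }

	// Same pattern, requested with the prefilter on, off, then on again.
	for _, enabled := range []bool{true, false, true} {
		if _, err := cache.do(rxCacheKey(enabled, "a+b"), build); err != nil {
			panic(err)
		}
	}
	fmt.Println(builds) // two distinct keys, so the artifact is built twice
}
```

If the key were the bare pattern, the first WAF to compile a pattern would decide whether every later WAF gets a prefilter for it, regardless of its own `SecRxPreFilter` setting.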
Expand All @@ -105,12 +111,14 @@ func newRX(options plugintypes.OperatorOptions) (plugintypes.Operator, error) {
}

func (o *rx) Evaluate(tx plugintypes.TransactionState, value string) bool {
// Run the cheap prefilter checks first so that clearly non-matching inputs skip the regex.
// When the prefilter is disabled, minLen is 0 and prefilter is nil, so both checks pass through.
if len(value) < o.minLen {
return false
}
if o.prefilter != nil && !o.prefilter(value) {
return false
}

if tx.Capturing() {
// FindStringSubmatchIndex returns a slice of index pairs [start0, end0, start1, end1, ...]
// instead of allocating new strings for each capture group. We then slice the original
Expand Down
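Reviewer sketch of the evaluation order in `Evaluate`. It assumes a single hand-picked required literal instead of the AST analysis done by `minMatchLength` and `prefilterFunc`; `sketchRx` and `newSketchRx` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sketchRx is a hypothetical, simplified stand-in for the rx operator.
type sketchRx struct {
	re        *regexp.Regexp
	minLen    int               // stays 0 when the prefilter is disabled
	prefilter func(string) bool // stays nil when the prefilter is disabled
}

// newSketchRx compiles pattern. literal must be a substring that every match
// contains; here the caller supplies it instead of deriving it from the regex.
func newSketchRx(pattern, literal string, enabled bool) (*sketchRx, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	o := &sketchRx{re: re}
	if enabled {
		// len(literal) is a conservative lower bound on the match length.
		o.minLen = len(literal)
		o.prefilter = func(s string) bool { return strings.Contains(s, literal) }
	}
	return o, nil
}

// evaluate runs the cheap checks first and falls back to the full regex.
func (o *sketchRx) evaluate(value string) bool {
	if len(value) < o.minLen {
		return false
	}
	if o.prefilter != nil && !o.prefilter(value) {
		return false
	}
	return o.re.MatchString(value)
}

func main() {
	op, err := newSketchRx(`select\s+\w+`, "select", true)
	if err != nil {
		panic(err)
	}
	for _, in := range []string{"union select name", "sel", "hello world"} {
		fmt.Printf("%q -> %v\n", in, op.evaluate(in))
	}
}
```

The prefilter may only reject inputs that the regex would also reject. Answering "maybe" and running the regex is always safe, whereas a false rejection would be a missed detection.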
16 changes: 8 additions & 8 deletions internal/operators/rxprefilter.go
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
// Copyright 2022 Juan Pablo Tosso and the OWASP Coraza contributors
// SPDX-License-Identifier: Apache-2.0

//go:build coraza.rule.rx_prefilter

// rxprefilter implements compile-time analysis of regex patterns to build cheap
// pre-checks that can skip expensive regexp.Regexp evaluation when the input
// clearly cannot match.
Expand Down Expand Up @@ -180,7 +178,8 @@ func prefilterFunc(pattern string) func(string) bool {
if len(filtered) == 0 {
return nil
}
if len(filtered) == 1 {
switch {
case len(filtered) == 1:
needle := filtered[0]
if caseInsensitive {
pf = func(s string) bool {
Expand All @@ -191,7 +190,7 @@ func prefilterFunc(pattern string) func(string) bool {
return strings.Contains(s, needle)
}
}
} else if caseInsensitive {
case caseInsensitive:
pf = func(s string) bool {
for _, needle := range filtered {
if !containsFoldASCII(s, needle) {
Expand All @@ -200,7 +199,7 @@ func prefilterFunc(pattern string) func(string) bool {
}
return true
}
} else {
default:
pf = func(s string) bool {
for _, needle := range filtered {
if !strings.Contains(s, needle) {
Expand All @@ -223,7 +222,8 @@ func prefilterFunc(pattern string) func(string) bool {
return nil
}
filtered := v
if len(filtered) == 1 {
switch {
case len(filtered) == 1:
needle := filtered[0]
if caseInsensitive {
pf = func(s string) bool {
Expand All @@ -234,14 +234,14 @@ func prefilterFunc(pattern string) func(string) bool {
return strings.Contains(s, needle)
}
}
} else if caseInsensitive && !allASCIIStrings([]string(filtered)) {
case caseInsensitive && !allASCIIStrings([]string(filtered)):
// When case-insensitive, Aho-Corasick uses ASCII-only folding. If any
// needle is non-ASCII (e.g. "ſelect" lowercased from "Select"), it could
// fold to an ASCII equivalent under Go's Unicode case rules — meaning a
// pure-ASCII input like "select" would match (?i)ſelect but the automaton
// wouldn't find "ſelect" in "select". To avoid false negatives, bail out.
return nil
} else {
default:
// Build an Aho-Corasick automaton for multi-pattern matching in O(n).
// Same library already used by the @pm operator.
builder := ahocorasick.NewAhoCorasickBuilder(ahocorasick.Opts{
Expand Down
14 changes: 0 additions & 14 deletions internal/operators/rxprefilter_noop.go

This file was deleted.

41 changes: 19 additions & 22 deletions internal/operators/rxprefilter_test.go
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
// Copyright 2022 Juan Pablo Tosso and the OWASP Coraza contributors
// SPDX-License-Identifier: Apache-2.0

//go:build coraza.rule.rx_prefilter

package operators

import (
Expand Down Expand Up @@ -64,8 +62,8 @@ func TestMinMatchLength(t *testing.T) {
{"\\bhello\\b", 5},

// Unicode
{"ハロー", 9}, // 3 runes × 3 bytes each
{"café", 5}, // é is 2 bytes
{"ハロー", 9}, // 3 runes × 3 bytes each
{"café", 5}, // é is 2 bytes
}
for _, tc := range tests {
t.Run(tc.pattern, func(t *testing.T) {
Expand All @@ -83,11 +81,11 @@ func TestMinMatchLength(t *testing.T) {
// accepts known matching inputs and rejects known non-matching inputs.
func TestPrefilterFuncBuildability(t *testing.T) {
tests := []struct {
pattern string
wantNil bool
desc string
match string // input that the regex matches (checked when prefilter is non-nil)
noMatch string // input that the regex does not match (checked when prefilter is non-nil)
pattern string
wantNil bool
desc string
match string // input that the regex matches (checked when prefilter is non-nil)
noMatch string // input that the regex does not match (checked when prefilter is non-nil)
}{
{"hello", false, "plain literal", "say hello", "goodbye"},
{"[a-z]+", true, "char class only", "", ""},
Expand Down Expand Up @@ -128,9 +126,8 @@ func TestPrefilterFuncBuildability(t *testing.T) {
t.Fatalf("test bug: noMatch %q actually matches %q", tc.noMatch, tc.pattern)
}
// Prefilter may accept (conservative) or reject — but if it rejects, it's correct
if pf(tc.noMatch) {
// Conservative pass-through: prefilter said "maybe", that's OK
}
// Conservative pass-through: prefilter said "maybe", that's OK
_ = pf(tc.noMatch)
}
})
}
Expand Down Expand Up @@ -500,7 +497,7 @@ func TestPrefilterIntegrationViaNewRX(t *testing.T) {

for _, tc := range tests {
t.Run(fmt.Sprintf("%s/%s", tc.pattern, tc.input), func(t *testing.T) {
opts := plugintypes.OperatorOptions{Arguments: tc.pattern}
opts := plugintypes.OperatorOptions{Arguments: tc.pattern, RxPreFilterEnabled: true}
op, err := newRX(opts)
if err != nil {
t.Fatal(err)
Expand Down Expand Up @@ -554,7 +551,7 @@ func TestPrefilterCapturingCorrectness(t *testing.T) {

for _, tc := range tests {
t.Run(tc.pattern, func(t *testing.T) {
opts := plugintypes.OperatorOptions{Arguments: tc.pattern}
opts := plugintypes.OperatorOptions{Arguments: tc.pattern, RxPreFilterEnabled: true}
op, err := newRX(opts)
if err != nil {
t.Fatal(err)
Expand Down Expand Up @@ -593,11 +590,11 @@ func TestContainsFoldASCII(t *testing.T) {
{"", "hello", false},
{"hi", "hello", false},
{"xhellox", "hello", true},
{"HÉLLO", "hello", false}, // non-ASCII É in haystack, ASCII needle
{"Straße", "straße", true}, // non-ASCII needle: conservative true to avoid false negatives
{"STRASSE", "straße", true}, // non-ASCII needle: conservative true (Unicode folding is tricky)
{"HÉLLO", "hello", false}, // non-ASCII É in haystack, ASCII needle
{"Straße", "straße", true}, // non-ASCII needle: conservative true to avoid false negatives
{"STRASSE", "straße", true}, // non-ASCII needle: conservative true (Unicode folding is tricky)
{"totally different", "straße", true}, // non-ASCII needle: conservative true even when absent
{"", "", true}, // empty needle always matches
{"", "", true}, // empty needle always matches
{"abc", "", true},
{"SELECT", "select", true},
{"sElEcT", "select", true},
Expand Down Expand Up @@ -755,7 +752,7 @@ func TestPrefilterWithSMPrefix(t *testing.T) {
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
opts := plugintypes.OperatorOptions{Arguments: tc.pattern}
opts := plugintypes.OperatorOptions{Arguments: tc.pattern, RxPreFilterEnabled: true}
op, err := newRX(opts)
if err != nil {
t.Fatal(err)
Expand Down Expand Up @@ -829,8 +826,8 @@ func TestMemoizeSharesPrefilter(t *testing.T) {
// TestPrefilterConcurrentSafety verifies the prefilter closure and Aho-Corasick
// automaton can be safely called from multiple goroutines concurrently.
func TestPrefilterConcurrentSafety(t *testing.T) {
pattern := "(?i)(?:union\\s+select|insert\\s+into|delete\\s+from)"
opts := plugintypes.OperatorOptions{Arguments: pattern}
RxPattern := `(?i)(?:union\s+select|insert\s+into|delete\s+from)`
opts := plugintypes.OperatorOptions{Arguments: RxPattern, RxPreFilterEnabled: true}
op, err := newRX(opts)
if err != nil {
t.Fatal(err)
Expand All @@ -852,7 +849,7 @@ func TestPrefilterConcurrentSafety(t *testing.T) {
done := make(chan struct{})

// Compile the reference regex once, outside the goroutines.
re := regexp.MustCompile("(?i)(?:union\\s+select|insert\\s+into|delete\\s+from)")
re := regexp.MustCompile(RxPattern)

for g := 0; g < goroutines; g++ {
go func() {
Expand Down
26 changes: 26 additions & 0 deletions internal/seclang/directives.go
Original file line number Diff line number Diff line change
Expand Up @@ -1393,6 +1393,32 @@ func directiveSecArgumentsLimit(options *DirectiveOptions) error {
return nil
}

// Description: Enables or disables pre-filtering for the @rx operator.
// Syntax: SecRxPreFilter On|Off
// Default: Off
// ---
// When enabled, Coraza analyses each regex pattern at rule-load time to extract required
// literal substrings and compute the minimum match length. At request time these cheap
// checks run before the full regex, allowing the engine to skip the regex entirely when
// an input clearly cannot match.
//
// Example:
// ```seclang
// SecRxPreFilter On
// ```
func directiveSecRxPreFilter(options *DirectiveOptions) error {
if len(options.Opts) == 0 {
return errEmptyOptions
}

b, err := parseBoolean(options.Opts)
if err != nil {
return err
}
options.WAF.RxPreFilterEnabled = b
return nil
}

func parseBoolean(data string) (bool, error) {
data = strings.ToLower(data)
switch data {
Expand Down
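Reviewer sketch of the directive flow described above: reject an empty argument, parse `On`/`Off` case-insensitively, then store the result on the WAF. The body of `parseBoolean` is not visible in this hunk, so `parseOnOff`, `sketchWAF`, and the error values below are assumptions made for illustration rather than the project's code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// sketchWAF is a hypothetical stand-in carrying only the field set by the directive.
type sketchWAF struct {
	RxPreFilterEnabled bool
}

// errSketchEmptyOptions is a hypothetical counterpart of errEmptyOptions.
var errSketchEmptyOptions = errors.New("expected options")

// parseOnOff is an assumed implementation: it accepts "on" and "off" in any case.
func parseOnOff(data string) (bool, error) {
	switch strings.ToLower(data) {
	case "on":
		return true, nil
	case "off":
		return false, nil
	}
	return false, fmt.Errorf("invalid boolean value %q, expected On or Off", data)
}

// applySecRxPreFilter follows the same steps as directiveSecRxPreFilter.
func applySecRxPreFilter(waf *sketchWAF, opts string) error {
	if len(opts) == 0 {
		return errSketchEmptyOptions
	}
	b, err := parseOnOff(opts)
	if err != nil {
		return err
	}
	waf.RxPreFilterEnabled = b
	return nil
}

func main() {
	waf := &sketchWAF{}
	for _, v := range []string{"On", "off", "maybe", ""} {
		err := applySecRxPreFilter(waf, v)
		fmt.Printf("%q -> enabled=%v err=%v\n", v, waf.RxPreFilterEnabled, err)
	}
}
```

In this sketch an invalid or empty value leaves the previous setting untouched, so a typo in the configuration surfaces as an error rather than silently changing behavior.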