feat(spoe): Migrate implementation to dropmorepackets/haproxy-go#138
feat(spoe): Migrate implementation to dropmorepackets/haproxy-go#138LaurenceJJones merged 17 commits intomainfrom
Conversation
- Replace github.com/negasus/haproxy-spoe-go with github.com/dropmorepackets/haproxy-go - Implement zero-allocation message parsing with single-pass KV extraction - Add struct pooling for HTTPMessageData and IPMessageData with embedded buffers - Use bytes.Equal for key matching to avoid string allocations - Update all action setting to use encoding.ActionWriter - Optimize buffer reuse by embedding buffers in pooled structs - Remove unused code and simplify pooling strategy
fe9c247 to
c43c867
Compare
- Remove ctx field from Spoa struct (containedctx) - Use context parameter instead of context.Background() (contextcheck) - Add type assertion checks with ok for all type assertions (revive) - Prefix unused ctx parameters with _ (unparam) - Remove unused readKeyFromMessage function (unused)
There was a problem hiding this comment.
Pull request overview
This PR migrates the SPOE (Stream Processing Offload Engine) implementation from github.com/negasus/haproxy-spoe-go to github.com/dropmorepackets/haproxy-go, introducing significant performance optimizations through zero-allocation message parsing, struct pooling with embedded buffers, and efficient key matching using byte comparisons.
Key Changes:
- Replace SPOE library dependency and update all integration points to use the new API (encoding.ActionWriter, encoding.Message)
- Implement struct pooling for HTTPMessageData and IPMessageData with embedded buffer reuse to reduce GC pressure
- Optimize message parsing with single-pass KV extraction using bytes.Equal for key matching instead of string conversions
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/spoa/root.go | Core migration: replaces handler wrapper with HandleSPOE interface, implements struct pooling, adds zero-allocation message parsing with extractHTTPMessageData and extractIPMessageData functions |
| internal/remediation/captcha/root.go | Updates InjectKeyValues to use encoding.ActionWriter instead of action.Actions |
| internal/remediation/ban/root.go | Updates InjectKeyValues to use encoding.ActionWriter instead of action.Actions |
| go.mod | Replaces negasus/haproxy-spoe-go v1.0.7 with dropmorepackets/haproxy-go v0.0.7 |
| go.sum | Updates dependency checksums for library migration |
Comments suppressed due to low confidence (2)
pkg/spoa/root.go:723
- Missing panic check on type assertion. If
storedURLValis not astring, this will panic. Add a check:storedURL, ok := storedURLVal.(string); if !ok { return remediation.Allow, httpData }or use a two-value type assertion to handle unexpected types gracefully.
"session": uuid,
pkg/spoa/root.go:905
- The
readKeyFromMessagefunction (lines 869-905) and its associated error variables (lines 865-867) appear to be unused. The new implementation usesextractHTTPMessageDataandextractIPMessageDatafor extracting message data. Per the PR description "Remove unused code and simplify pooling strategy", this function should be removed.
ipTypeLabel := "ipv4"
if ipAddr.Is6() {
ipTypeLabel = "ipv6"
}
// Count processed requests - use WithLabelValues to avoid map allocation on hot path
metrics.TotalProcessedRequests.WithLabelValues(ipTypeLabel).Inc()
// Check IP directly against dataset
r, origin := s.getIPRemediation(ctx, writer, ipAddr)
// Count blocked requests
if r > remediation.Unknown {
// Label order: origin, ip_type, remediation (as defined in metrics.go)
metrics.TotalBlockedRequests.WithLabelValues(origin, ipTypeLabel, r.String()).Inc()
}
_ = writer.SetString(encoding.VarScopeTransaction, "remediation", r.String())
}
var (
ErrMessageKeyNotFound = errors.New("message key not found")
ErrMessageKeyTypeMismatch = errors.New("message key type mismatch")
)
func readHeaders(headers []byte) (http.Header, error) {
h := http.Header{}
if len(headers) == 0 {
return nil, fmt.Errorf("no headers found")
}
// Split by \r\n
hs := strings.Split(string(headers), "\r\n")
for _, header := range hs {
if header == "" {
continue
}
kv := strings.SplitN(header, ":", 2)
if len(kv) != 2 {
return nil, fmt.Errorf("invalid header: %q", header)
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Fix double-free bug: remove Put(nil) call when extractIPMessageData returns error - Add max buffer size limits (64KB headers, 512KB body) to prevent unbounded memory growth - Remove redundant clear() calls before copy() operations - All type assertions already have proper ok checks - Context usage already correct (using ctx parameter, not context.Background()) - Unused ctx field already removed
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Remove ErrMessageKeyNotFound and ErrMessageKeyTypeMismatch as they are no longer used after migration - These were left over from the old readKeyFromMessage function that was removed
- Add reset() method to IPMessageData struct, matching HTTPMessageData pattern - Update extractIPMessageData to use reset() instead of manual field clearing - Update handleIPRequest defer to use reset() method - Improves code consistency and maintainability
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…onversion - Use bytes.Split() instead of converting entire headers byte slice to string - Use bytes.IndexByte() to find colon separator directly in byte slice - Only convert individual key/value parts to strings when needed - Reduces allocations, especially for requests with large header blocks
- Replace bytes.Split with bytes.SplitSeq to avoid allocating intermediate slice - SplitSeq uses iterator pattern, yielding lines one at a time - Reduces memory allocations when parsing headers - Addresses linting suggestion for more efficient iteration
- Remove HTTPRequestData struct to fix memory safety issue with pooled buffers - Eliminates risk of holding pointers to pooled slices that could be reused - Simplifies code since all values are extracted upfront in msgData - handleCaptchaRemediation now only returns remediation (not HTTPRequestData) - Fix misleading comment about URL nil check (line 666) - Clarify that msgData.URL is guaranteed non-nil at that point - Fix buffer reallocation edge case for headers and body buffers - Change > to >= to properly handle capacity exactly equal to max size - Prevents wasting pooled buffers for common boundary cases
* Refactor SPOE to use groups with dropmorepackets package - Convert crowdsec-ip to crowdsec-tcp message for clarity - Create separate groups for crowdsec-tcp, crowdsec-http-body, and crowdsec-http-no-body - Use event on-client-session for TCP message triggering - Add conditional HTTP group sends based on body size using ACL - Update HTTP handler to always check IP and only count metrics when TCP handler didn't run - Add debug logging for all received messages - Use req.body_size with ACL for body size checking (works with http-buffer-request) - Update both standard and upstream proxy configs with IP extraction and conditional groups * Clean up SPOE groups implementation - Remove crowdsec-tcp group definition (uses event directive, not group) - Standardize body size limit to 100KB in both configs with consistent notes - Remove unnecessary string conversion in HandleSPOE (use bytes directly) - Ensure code only handles message names defined in crowdsec.cfg - Update comments for clarity * Update configs and code cleanup * Address PR review comments - Fix ACL logic to prevent both groups being sent when body_size not found - Update comments: clarify TCP handler behavior in upstream proxy mode - Fix grammar: 'uses' -> 'use' for plural subject - Clarify TCP remediation logic comment - Fix missing apostrophe: 'dont' -> 'don't' - Update handleTCPRequest function comment to explain TCP-level processing - Fix misleading comment about TCP handler using groups (it uses event directive)
- Integrated AppSec functionality from main - Adapted AppSec functions to use encoding.Message and HTTPMessageData - Resolved conflicts while maintaining migration to dropmorepackets/haproxy-go - Updated readHeaders to handle both \r\n and \n line endings
- Extract buildAppSecRequest method from validateWithAppSec - Refactor handleCaptchaRemediation into smaller focused functions - Add getSessionString helper to eliminate repeated type assertions - Simplify createNewSessionAndCookie by removing unused parameters - Improve code readability and maintainability
Remove unnecessary reset() calls immediately after Get() from sync.Pool. The pool's New function returns zeroed instances, and we already reset before Put() in defer blocks, so resetting after Get() is redundant.
- Replace custom pointer helpers (stringPtr, boolPtr, etc.) with ptr.Of from go-cs-lib - Extract Host header and captcha cookie from req.hdrs instead of separate KV pairs - Add extractCookieValue helper using strings.SplitSeq and CutPrefix - Remove unused keyHost and keyCaptchaCookie variables
- Add tune.bufsize 65536 (64KB) to support larger body inspection - Add timeout connect 2s and timeout server 60s to crowdsec-spoa backend - Increases max body size that can be inspected by AppSec WAF
- Add tune.bufsize 65536 to haproxy.cfg and haproxy-upstreamproxy.cfg for 64KB body support - Add stats socket for runtime debugging (/tmp/haproxy.sock) - Lower body_within_limit to 50KB to stay safely under SPOE frame limit - Remove redundant max-frame-size from crowdsec.cfg (tune.bufsize handles it) - Add docker-compose.dev.yaml with debug container (tcpdump, socat, curl, etc.) - Move per-message logging from Debug to Trace for quieter Debug mode - Add Debug log for AppSec validation results
- Remove duplicate copy of body bytes when building AppSecRequest - BodyCopied remains valid until handler returns (pool return in defer) - Also remove debug log for body length
- Add getAppSecConfig() helper to centralize host/global AppSec resolution - Add shouldRunAppSec() predicate for cleaner conditional logic - Remove redundant early return in captcha switch case - Fix blocked metrics: don't count as blocked if user solved captcha - Reduce ~30 lines of scattered logic to ~4 lines per call site
* Migrate SPOE implementation to dropmorepackets/haproxy-go - Replace github.com/negasus/haproxy-spoe-go with github.com/dropmorepackets/haproxy-go - Implement zero-allocation message parsing with single-pass KV extraction - Add struct pooling for HTTPMessageData and IPMessageData with embedded buffers - Use bytes.Equal for key matching to avoid string allocations - Update all action setting to use encoding.ActionWriter - Optimize buffer reuse by embedding buffers in pooled structs - Remove unused code and simplify pooling strategy * Fix linter issues - Remove ctx field from Spoa struct (containedctx) - Use context parameter instead of context.Background() (contextcheck) - Add type assertion checks with ok for all type assertions (revive) - Prefix unused ctx parameters with _ (unparam) - Remove unused readKeyFromMessage function (unused) * fix(spoe): address Copilot PR review comments - Fix double-free bug: remove Put(nil) call when extractIPMessageData returns error - Add max buffer size limits (64KB headers, 512KB body) to prevent unbounded memory growth - Remove redundant clear() calls before copy() operations - All type assertions already have proper ok checks - Context usage already correct (using ctx parameter, not context.Background()) - Unused ctx field already removed * fix(spoe): remove unused error variables - Remove ErrMessageKeyNotFound and ErrMessageKeyTypeMismatch as they are no longer used after migration - These were left over from the old readKeyFromMessage function that was removed * refactor(spoe): add reset() method to IPMessageData for consistency - Add reset() method to IPMessageData struct, matching HTTPMessageData pattern - Update extractIPMessageData to use reset() instead of manual field clearing - Update handleIPRequest defer to use reset() method - Improves code consistency and maintainability * perf(spoe): optimize readHeaders to avoid full byte slice to string conversion - Use bytes.Split() instead of converting entire headers byte slice to string - Use bytes.IndexByte() to find colon separator directly in byte slice - Only convert individual key/value parts to strings when needed - Reduces allocations, especially for requests with large header blocks * perf(spoe): use bytes.SplitSeq for more efficient header parsing - Replace bytes.Split with bytes.SplitSeq to avoid allocating intermediate slice - SplitSeq uses iterator pattern, yielding lines one at a time - Reduces memory allocations when parsing headers - Addresses linting suggestion for more efficient iteration * fix(spoe): address memory safety and code quality issues - Remove HTTPRequestData struct to fix memory safety issue with pooled buffers - Eliminates risk of holding pointers to pooled slices that could be reused - Simplifies code since all values are extracted upfront in msgData - handleCaptchaRemediation now only returns remediation (not HTTPRequestData) - Fix misleading comment about URL nil check (line 666) - Clarify that msgData.URL is guaranteed non-nil at that point - Fix buffer reallocation edge case for headers and body buffers - Change > to >= to properly handle capacity exactly equal to max size - Prevents wasting pooled buffers for common boundary cases * refactor(SPOE): use message groups (#141) * Refactor SPOE to use groups with dropmorepackets package - Convert crowdsec-ip to crowdsec-tcp message for clarity - Create separate groups for crowdsec-tcp, crowdsec-http-body, and crowdsec-http-no-body - Use event on-client-session for TCP message triggering - Add conditional HTTP group sends based on body size using ACL - Update HTTP handler to always check IP and only count metrics when TCP handler didn't run - Add debug logging for all received messages - Use req.body_size with ACL for body size checking (works with http-buffer-request) - Update both standard and upstream proxy configs with IP extraction and conditional groups * Clean up SPOE groups implementation - Remove crowdsec-tcp group definition (uses event directive, not group) - Standardize body size limit to 100KB in both configs with consistent notes - Remove unnecessary string conversion in HandleSPOE (use bytes directly) - Ensure code only handles message names defined in crowdsec.cfg - Update comments for clarity * Update configs and code cleanup * Address PR review comments - Fix ACL logic to prevent both groups being sent when body_size not found - Update comments: clarify TCP handler behavior in upstream proxy mode - Fix grammar: 'uses' -> 'use' for plural subject - Clarify TCP remediation logic comment - Fix missing apostrophe: 'dont' -> 'don't' - Update handleTCPRequest function comment to explain TCP-level processing - Fix misleading comment about TCP handler using groups (it uses event directive) * refactor(spoa): clean up and simplify SPOA functions - Extract buildAppSecRequest method from validateWithAppSec - Refactor handleCaptchaRemediation into smaller focused functions - Add getSessionString helper to eliminate repeated type assertions - Simplify createNewSessionAndCookie by removing unused parameters - Improve code readability and maintainability * perf(spoa): remove redundant reset() calls after pool Get() Remove unnecessary reset() calls immediately after Get() from sync.Pool. The pool's New function returns zeroed instances, and we already reset before Put() in defer blocks, so resetting after Get() is redundant. * refactor: use ptr.Of and extract host/cookie from headers - Replace custom pointer helpers (stringPtr, boolPtr, etc.) with ptr.Of from go-cs-lib - Extract Host header and captcha cookie from req.hdrs instead of separate KV pairs - Add extractCookieValue helper using strings.SplitSeq and CutPrefix - Remove unused keyHost and keyCaptchaCookie variables * config: increase buffer size and timeouts for WAF body inspection - Add tune.bufsize 65536 (64KB) to support larger body inspection - Add timeout connect 2s and timeout server 60s to crowdsec-spoa backend - Increases max body size that can be inspected by AppSec WAF * feat: improve HAProxy buffer config and add debug tooling - Add tune.bufsize 65536 to haproxy.cfg and haproxy-upstreamproxy.cfg for 64KB body support - Add stats socket for runtime debugging (/tmp/haproxy.sock) - Lower body_within_limit to 50KB to stay safely under SPOE frame limit - Remove redundant max-frame-size from crowdsec.cfg (tune.bufsize handles it) - Add docker-compose.dev.yaml with debug container (tcpdump, socat, curl, etc.) - Move per-message logging from Debug to Trace for quieter Debug mode - Add Debug log for AppSec validation results * perf: remove unnecessary body copy for AppSec requests - Remove duplicate copy of body bytes when building AppSecRequest - BodyCopied remains valid until handler returns (pool return in defer) - Also remove debug log for body length * refactor: simplify AppSec config logic and fix metrics counting - Add getAppSecConfig() helper to centralize host/global AppSec resolution - Add shouldRunAppSec() predicate for cleaner conditional logic - Remove redundant early return in captcha switch case - Fix blocked metrics: don't count as blocked if user solved captcha - Reduce ~30 lines of scattered logic to ~4 lines per call site
Running the stress test and capturing pprof profiles for comparison:
Test: 50,000 requests, 100 concurrent connections