@yanivagman

Introduces Common Expression Language (CEL) for dynamic conditions and field extraction in YAML detectors, enabling powerful declarative detection without writing Go code.

Features:

  1. CEL Expressions

    • Dynamic conditions with CEL syntax
    • Helper functions: getData(), hasData() with macros
    • Cost limits (1M units) & timeout enforcement (5ms)
    • Breaking: extract_fields → fields, source → expression
  2. Shared Lists

    • Reusable lists across detectors (e.g., SHELL_BINARIES)
    • Zero runtime overhead (compiled at load time)
    • Defined in {detector-dir}/lists/ subdirectory
  3. Datastore Access

    • 12 CEL functions: process.get(), container.get(), system.info(),
      kernel.resolveSymbol(), dns.getResponse(), syscall.getName(), etc.
    • Graceful validation-mode fallback (nil registry)
  4. String Utilities

    • 8 helpers: split(), join(), trim(), replace(), upper(), lower(),
      basename(), dirname()
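
The 5ms timeout listed under CEL Expressions can be pictured as a context deadline wrapped around each evaluation. The following is a minimal sketch using only the Go standard library. It does not use cel-go and does not model the cost limit, and the names `evalWithTimeout`, `condition`, `fastCondition`, `slowCondition`, and `errEvalTimeout` are hypothetical, not taken from this PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errEvalTimeout is a hypothetical sentinel error for this sketch.
var errEvalTimeout = errors.New("condition evaluation timed out")

// evalTimeout mirrors the 5ms budget quoted in the PR description.
const evalTimeout = 5 * time.Millisecond

// condition stands in for a compiled expression. A real evaluator is
// assumed to honor cancellation of ctx.
type condition func(ctx context.Context) (bool, error)

// evalWithTimeout runs cond and gives up once the timeout has elapsed.
func evalWithTimeout(timeout time.Duration, cond condition) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	type result struct {
		ok  bool
		err error
	}
	done := make(chan result, 1) // buffered so the goroutine never blocks
	go func() {
		ok, err := cond(ctx)
		done <- result{ok, err}
	}()

	select {
	case r := <-done:
		// A condition that noticed the deadline itself is reported the same way.
		if errors.Is(r.err, context.DeadlineExceeded) {
			return false, errEvalTimeout
		}
		return r.ok, r.err
	case <-ctx.Done():
		return false, errEvalTimeout
	}
}

// fastCondition returns immediately.
func fastCondition(ctx context.Context) (bool, error) { return true, nil }

// slowCondition blocks until the deadline fires.
func slowCondition(ctx context.Context) (bool, error) {
	<-ctx.Done()
	return false, ctx.Err()
}

func main() {
	ok, err := evalWithTimeout(evalTimeout, fastCondition)
	fmt.Println(ok, err)
	ok, err = evalWithTimeout(evalTimeout, slowCondition)
	fmt.Println(ok, err)
}
```

Treating a timeout as an error rather than as "condition is false" is a design choice of this sketch. How the detector reacts to a timed-out expression is not stated in the description above.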

Example:
conditions:
  - getData("pathname") in SHELL_BINARIES
  - workload.container.id != ""

output:
  fields:
    - name: shell_path
      expression: getData("pathname")

@yanivagman yanivagman self-assigned this Dec 29, 2025
Copilot AI review requested due to automatic review settings December 29, 2025 12:51
Copilot AI left a comment

Pull request overview

This PR introduces Common Expression Language (CEL) support to YAML detectors, enabling dynamic runtime conditions and field extraction without requiring Go code. The major changes include:

  • CEL integration for conditions and field extraction
  • Shared lists feature for reusable value collections
  • Datastore access functions for querying system state
  • String utility functions for path and text manipulation

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| pkg/detectors/yaml/cel_env.go | Core CEL environment setup with macros for simplified syntax (getData/hasData) |
| pkg/detectors/yaml/cel_datastores.go | Datastore functions for accessing process, container, system, kernel, DNS, and syscall information |
| pkg/detectors/yaml/cel_strings.go | String utility functions (split, join, trim, replace, upper, lower, basename, dirname) |
| pkg/detectors/yaml/detector.go | Updated detector to compile and evaluate CEL expressions with timeout and cost limits |
| pkg/detectors/yaml/schema.go | Added Conditions field and renamed ExtractFieldSpec to FieldSpec |
| pkg/detectors/yaml/validator.go | Enhanced validation to compile CEL expressions and removed legacy extraction path validation |
| pkg/detectors/yaml/list_loader.go | List loading from {dir}/lists/ subdirectory with uppercase snake_case naming |
| pkg/detectors/yaml/loader.go | Integrated list loading into detector directory loading |
| pkg/detectors/yaml/extractor.go | Deleted legacy field extractor (replaced by CEL) |
| tests/integration/yaml_detector_test.go | Updated tests to use getData() syntax instead of data.field |
| examples/detectors/yaml/*.yaml | Updated example detectors to use CEL expressions |
| docs/docs/detectors/yaml-detectors.md | Comprehensive documentation for CEL features, shared lists, and datastore functions |

@yanivagman yanivagman force-pushed the yaml_detectors_cel_support branch 2 times, most recently from 8cf7067 to 3007f1f Compare December 29, 2025 16:16
@codecov
codecov bot commented Dec 29, 2025

Codecov Report

❌ Patch coverage is 53.90476% with 484 lines in your changes missing coverage. Please review.
✅ Project coverage is 36.65%. Comparing base (235daa0) to head (caa3c91).
⚠️ Report is 62 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/detectors/yaml/cel_datastores.go | 62.60% | 90 Missing and 39 partials ⚠️ |
| pkg/detectors/yaml/detector.go | 22.43% | 117 Missing and 4 partials ⚠️ |
| pkg/detectors/yaml/cel_env.go | 67.74% | 55 Missing and 15 partials ⚠️ |
| pkg/detectors/yaml/cel_strings.go | 63.29% | 56 Missing and 13 partials ⚠️ |
| pkg/detectors/yaml/validator.go | 18.75% | 31 Missing and 8 partials ⚠️ |
| pkg/detectors/yaml/loader.go | 50.79% | 28 Missing and 3 partials ⚠️ |
| pkg/detectors/yaml/list_loader.go | 24.24% | 25 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5147      +/-   ##
==========================================
+ Coverage   33.51%   36.65%   +3.13%     
==========================================
  Files         250      262      +12     
  Lines       28908    31401    +2493     
==========================================
+ Hits         9688    11509    +1821     
- Misses      18609    19096     +487     
- Partials      611      796     +185     

| Flag | Coverage Δ | |
| --- | --- | --- |
| unit | 36.65% <53.90%> (+3.13%) | ⬆️ |

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| pkg/detectors/yaml/list_loader.go | 24.24% <24.24%> (ø) | |
| pkg/detectors/yaml/loader.go | 56.25% <50.79%> (ø) | |
| pkg/detectors/yaml/validator.go | 20.14% <18.75%> (ø) | |
| pkg/detectors/yaml/cel_strings.go | 63.29% <63.29%> (ø) | |
| pkg/detectors/yaml/cel_env.go | 67.74% <67.74%> (ø) | |
| pkg/detectors/yaml/detector.go | 32.35% <22.43%> (ø) | |
| pkg/detectors/yaml/cel_datastores.go | 62.60% <62.60%> (ø) | |

... and 25 files with indirect coverage changes

@yanivagman yanivagman force-pushed the yaml_detectors_cel_support branch 2 times, most recently from 8a61428 to 0413c0b Compare January 7, 2026 23:09
@josedonizetti josedonizetti left a comment

first pass, still testing

@yanivagman yanivagman force-pushed the yaml_detectors_cel_support branch 2 times, most recently from 2bc10a1 to b89f3ca Compare January 8, 2026 12:27
- Add Common Expression Language (CEL) for dynamic conditions and field extraction
- Support short form (field names) and long form (CEL expressions) for output fields
- Add helper functions: getData(), getDataInt(), getDataUInt(), hasData()
- Rename extract_fields -> fields, source -> expression for clarity
- Update all existing YAML detectors to use CEL syntax
- Add comprehensive documentation for CEL features
- All unit tests pass (38.3% coverage)
- YAML detectors load successfully with CEL compilation

Breaking changes:
- YAML schema: extract_fields -> fields, source -> expression
- Field extraction now uses CEL expressions instead of JSONPath-like paths

Implements CEL global list variables to share common lists (e.g., shell
binaries, sensitive paths) across YAML detectors. Lists are defined in
{detector-dir}/lists/ subdirectory and exposed as CEL variables.

- Add list schema and loader with validation (uppercase snake_case names)
- Register lists as CEL variables (list<string>) in environment
- Pass lists to CEL evaluation context at runtime
- Add comprehensive unit and integration tests (16 new tests)
- Add example shell_binaries list and detector
- Update documentation with usage examples

Lists are compiled into CEL at load time for zero runtime overhead and
compile-time type safety.
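
The list-name rule mentioned above (uppercase snake_case) can be sketched with a regular expression. This is only an illustration under stated assumptions: the exact pattern, the duplicate-name check, and the helper names `validateListName`, `registerList`, and `contains` are hypothetical, and the real loader registers lists as CEL variables rather than Go maps.

```go
package main

import (
	"fmt"
	"regexp"
)

// listNameRE is an assumed reading of "uppercase snake_case": it must start
// with a letter, and single underscores separate alphanumeric words.
var listNameRE = regexp.MustCompile(`^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$`)

// validateListName rejects names that do not match the assumed pattern.
func validateListName(name string) error {
	if !listNameRE.MatchString(name) {
		return fmt.Errorf("invalid list name %q: want uppercase snake_case", name)
	}
	return nil
}

// registerList validates the name and stores the values under it.
// Rejecting duplicate names is an assumption of this sketch.
func registerList(lists map[string][]string, name string, values []string) error {
	if err := validateListName(name); err != nil {
		return err
	}
	if _, exists := lists[name]; exists {
		return fmt.Errorf("list %q already defined", name)
	}
	lists[name] = values
	return nil
}

// contains plays the role of the `value in LIST` check in a condition.
func contains(list []string, v string) bool {
	for _, item := range list {
		if item == v {
			return true
		}
	}
	return false
}

func main() {
	lists := map[string][]string{}
	if err := registerList(lists, "SHELL_BINARIES", []string{"/bin/sh", "/bin/bash"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(contains(lists["SHELL_BINARIES"], "/bin/bash"))
	fmt.Println(registerList(lists, "shell-binaries", nil))
}
```

Validating names when the directory is loaded means a misnamed list is reported before any detector that references it is compiled.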

Expose all core datastores (Process, Container, System, Kernel, DNS, Syscall)
as namespaced CEL functions for use in YAML detector conditions and outputs.

Key changes:
- Add datastores.Registry field to YAMLDetector, rebuild CEL env in Init()
- Implement 12 datastore functions: process.get/getAncestry/getChildren,
  container.get/getByName, system.info, kernel.resolveSymbol/getSymbolAddress,
  dns.getResponse, syscall.getName/getId
- Add comprehensive test coverage with mock datastores
- Update documentation with examples and usage patterns

Functions return null for not-found entities, handle time.Time conversion,
and integrate seamlessly with existing CEL expressions.
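
The "return null for not-found entities" behaviour, together with the nil-registry fallback from the PR description, can be sketched as follows. All types here (`processInfo`, `processStore`, `registry`, `mapStore`) and the function `processGet` are hypothetical stand-ins and do not reflect Tracee's actual datastore interfaces.

```go
package main

import "fmt"

// processInfo is a hypothetical, trimmed-down process record.
type processInfo struct {
	PID  uint32
	Name string
}

// processStore is a hypothetical lookup interface.
type processStore interface {
	Get(pid uint32) (processInfo, bool)
}

// registry is a hypothetical stand-in for the datastore registry.
type registry struct {
	Processes processStore
}

// mapStore is an in-memory store used only for this example.
type mapStore map[uint32]processInfo

func (m mapStore) Get(pid uint32) (processInfo, bool) {
	p, ok := m[pid]
	return p, ok
}

// processGet models a process.get()-style function. It returns nil both
// when no registry is attached (for example, during validation) and when
// the process is unknown, so expressions can test the result against null.
func processGet(reg *registry, pid uint32) map[string]any {
	if reg == nil || reg.Processes == nil {
		return nil
	}
	p, ok := reg.Processes.Get(pid)
	if !ok {
		return nil
	}
	return map[string]any{"pid": p.PID, "name": p.Name}
}

func main() {
	reg := &registry{Processes: mapStore{1: {PID: 1, Name: "init"}}}
	fmt.Println(processGet(reg, 1))
	fmt.Println(processGet(reg, 42) == nil)
	fmt.Println(processGet(nil, 1) == nil)
}
```

Returning nil instead of an error keeps a missing entity from aborting the whole expression. The caller is then expected to guard against null before reading fields.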

Add 8 string utility functions for YAML detectors:
- split(str, delimiter) - Split string into list
- join(list, delimiter) - Join list into string
- trim(str) - Remove leading/trailing whitespace
- replace(str, old, new) - Replace all occurrences
- upper(str) - Convert to uppercase
- lower(str) - Convert to lowercase
- basename(path) - Get filename from path
- dirname(path) - Get directory from path

Functions are available in both conditions and output expressions.
All functions handle CEL's various list representations ([]string, []interface{}, []ref.Val).

Includes comprehensive unit tests and documentation updates.
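
For orientation, here is a rough Go sketch of how helpers such as basename(), dirname(), and join() could behave, including the list normalization mentioned above. It is an assumption-laden illustration: it relies on Go's `path` and `strings` packages, it handles only `[]string` and `[]any` (cel-go's `[]ref.Val` is left out to stay within the standard library), and edge cases such as empty paths may differ from the PR's implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// basename returns the last element of a slash-separated path.
func basename(p string) string { return path.Base(p) }

// dirname returns everything but the last element of the path.
func dirname(p string) string { return path.Dir(p) }

// toStrings normalizes the list shapes an expression runtime may hand over.
// Only []string and []any are covered in this sketch.
func toStrings(list any) ([]string, error) {
	switch v := list.(type) {
	case []string:
		return v, nil
	case []any:
		out := make([]string, 0, len(v))
		for i, e := range v {
			s, ok := e.(string)
			if !ok {
				return nil, fmt.Errorf("element %d is %T, not string", i, e)
			}
			out = append(out, s)
		}
		return out, nil
	default:
		return nil, fmt.Errorf("unsupported list type %T", list)
	}
}

// join concatenates the list elements with the given delimiter.
func join(list any, delim string) (string, error) {
	items, err := toStrings(list)
	if err != nil {
		return "", err
	}
	return strings.Join(items, delim), nil
}

func main() {
	fmt.Println(basename("/usr/bin/bash"))
	fmt.Println(dirname("/usr/bin/bash"))
	fmt.Println(join([]any{"usr", "bin"}, "/"))
	fmt.Println(join([]any{"usr", 1}, "/"))
}
```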
…ory structure

- Add required 'type' field to all YAML detector and list files
- Support flat directory structure (detectors and lists in same dir)
- Remove lists/ subdirectory requirement for K8s ConfigMap deployment
- Add type field validation in loader and validator
- Rewrite LoadFromDirectory with clean three-pass design
- Update all examples, tests, and documentation

BREAKING CHANGE: All YAML detector files must include 'type: detector' at the top.
All list files must include 'type: string_list'. Lists are no longer in a
subdirectory but in the same directory as detectors.
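
With detectors and lists sharing one directory, the required 'type' field is what tells them apart. The sketch below illustrates only that classification step. The commit message does not spell out what the three passes of LoadFromDirectory do, so nothing here should be read as the actual loader. The `document` struct, `partitionByType`, and the choice to fail on unknown types are assumptions.

```go
package main

import "fmt"

// Type values quoted in the commit message.
const (
	typeDetector   = "detector"
	typeStringList = "string_list"
)

// document is a hypothetical view of an already-parsed YAML file.
type document struct {
	File string
	Type string // value of the required top-level 'type' field
}

// partitionByType splits documents into detectors and lists. Failing on a
// missing or unknown type is an assumption of this sketch.
func partitionByType(docs []document) (detectors, lists []document, err error) {
	for _, d := range docs {
		switch d.Type {
		case typeDetector:
			detectors = append(detectors, d)
		case typeStringList:
			lists = append(lists, d)
		case "":
			return nil, nil, fmt.Errorf("%s: missing required 'type' field", d.File)
		default:
			return nil, nil, fmt.Errorf("%s: unknown type %q", d.File, d.Type)
		}
	}
	return detectors, lists, nil
}

func main() {
	docs := []document{
		{File: "shell_exec.yaml", Type: typeDetector},
		{File: "shell_binaries.yaml", Type: typeStringList},
	}
	detectors, lists, err := partitionByType(docs)
	fmt.Println(len(detectors), len(lists), err)

	_, _, err = partitionByType([]document{{File: "untyped.yaml"}})
	fmt.Println(err)
}
```

Separating lists from detectors before compiling anything lets every list be known by the time a detector expression that references it is compiled, regardless of file order.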
@yanivagman yanivagman force-pushed the yaml_detectors_cel_support branch from b89f3ca to b0dfd12 Compare January 8, 2026 14:06
Replace /usr/bin/cat with /usr/bin/true for positive test case and
/usr/bin/id with /usr/bin/false for negative test case to avoid
interference from background processes that may execute common utilities.

Add 200ms delay before buffer clear to ensure all events from test 1
have arrived, preventing race conditions where late events cause false
positives in the negative assertion.
@yanivagman yanivagman force-pushed the yaml_detectors_cel_support branch from b0dfd12 to caa3c91 Compare January 8, 2026 14:41
@yanivagman yanivagman merged commit dc45de2 into aquasecurity:main Jan 9, 2026
88 of 89 checks passed
@yanivagman yanivagman deleted the yaml_detectors_cel_support branch January 9, 2026 06:35