Skip to content

Topic/christian/pluggable conntuples#59

Open
ckreibich wants to merge 181 commits intomasterfrom
topic/christian/pluggable-conntuples
Open

Topic/christian/pluggable conntuples#59
ckreibich wants to merge 181 commits intomasterfrom
topic/christian/pluggable-conntuples

Conversation

@ckreibich
Copy link
Owner

No description provided.

@ckreibich ckreibich force-pushed the topic/christian/pluggable-conntuples branch 4 times, most recently from 39fae4b to 30cb623 Compare April 14, 2025 06:45
timwoj and others added 26 commits May 22, 2025 10:22
…ollowup'

* origin/topic/timw/sqlite-cluster-test-followup:
  SQLite: Add TODO note about possibly using sqlite3_busy_timeout
  SQLite: Fix typo in variable name causing pragmas not to retry on busy
  SQLite: Use tableval iteration instead of ToMap for pragmas
  SQLite: Fix logging/error messages around executing pragmas
* origin/topic/timw/remove-findclangtidy:
  Remove FindClangTidy.cmake, update cmake submodule
Avoid proliferation of accessors on EventMgr.
Introduce a new EventMetadata module and members on EventMgr to register
event metadata types.
This removes the ts attribute from Event and instead allocates a vector for
storing metadata. By default, adds the network time as a TimeVal. Later
patches will make the allocation of the vector optional by introducing a
different constructor so that users that are not interested in network
timestamp metadata do not take the allocation hit.

Moving the explicit ``ts`` out of the event is done in order to treat it
just as generic metadata, too. However, the Time() accessor is adapted to
lookup the value from the metadata vector instead.
This deprecates the Event constructor and the ``ts`` parameter of Enqueue()
Instead, versions are introduced that take a detail::MetadataVectorPtr which
can hold the network timestamp metadata and is meant to be allocated by the
caller instead of automatically during Enqueue() or within the Event
constructor.

This also introduces a BifConst ``EventMetadata::add_network_timestamp`` to
opt-in adding network timestamps to events globally. It's disabled by
default as there are not a lot of known use cases that need this.
This is a bit intermediary. In part 2 this will deal with any metadata,
not just timestamps.
…vent-metadata-part-1'

* origin/topic/awelzel/4177-4178-custom-event-metadata-part-1:
  Event: Move meta after args
  Event: Use IntrusivePtr to manage obj refcount
  btest/zam: Update for new EventMetadata bifs
  broker and cluster: Switch to new Enqueue() API
  Event/zeek.bif: Add EventMetadata current() and current_values() accessors
  Event: Deprecate default network timestamp metadata
  Event: Store timestamp in metadata vector
  EventRegistry/zeek.bif/init-bare: Add event metadata infrastructure
  EventMgr: Add CurrentEvent() accessor
…ure'

* origin/topic/timw/redis-connection-failure:
  Redis: bump version of hiredis required
  Redis: return proper error if connection fails
This will be cleaned up later to just pass all contained metadata from
a cluster event to the queued event, but for now do this here, otherwise
we break some internal tests.
It seems less surprising if only local events receive automatic network
timestamp metadata. For remote events the automatic value will most
likely be misleading.
…amp-metadata'

* origin/topic/awelzel/fix-no-zero-timestamp-metadata:
  btest: Add test for Cluster::hello zero-timestamp
  EventMgr/Enqueue: Add automatic timestamp metadata to local events, only
  cluster and broker: Propagate zero-timestamp as metadata, too.
zeek-bot and others added 29 commits June 3, 2025 00:29
* origin/topic/timw/4350-redis-passwords:
  Redis: Add support for sending AUTH commands during connection
  Redis: disconnect cleanly if INFO request fails
  Fix segfault if storage sync open_backend returns bad code
  Add ToStdString and ToStdStringView to ZeekString
Currently, coverage/bare-mode-errors is one of the slowest tests in the
entire test suite. This is caused by the fact that it has to repeatedly
launch Zeek for every script that we ship. This is done sequentially.

This commit changes this test to use xargs to spawn 20 parallell
processes.
…-log'

* origin/topic/etyp/fix-reenable-analyzer-log:
  Fix Spicy re-enable builtin analyzer debug message
…ge-bare-mode-errors'

* origin/topic/johanna/parallelize-coverage-bare-mode-errors:
  Parallelize coverage/bare-mode-errors
…delines-fixes'

* origin/topic/timw/clang-tidy-cppcoreguidelines-fixes:
  Add some notes about missing/disabled cppcoreguildlines clang-tidy checkers
  Fix clang-tidy cppcoreguidelines-macro-usage findings (macro functions)
  Fix clang-tidy cppcoreguidelines-macro-usage findings (macros as constants)
  script_opt: Add missing virtual destructor (cppcoreguidelines-virtual-class-destructor)
This likely simply hasn't come up in practice, but we now go the intended route
of producing the textual enum value from the type.
Builders are intermediaries that encapsulate the details of how to instantiate
connection tuples & keys. By virtualizing those data structures, builder
implementations can adapt Zeek's notion of connection tuples.
ConnKey instances encodes connection tuples for connection lookups via session
keys. They so far relied on their internal memory layout as input for session
keying: given a pointer to an instance plus its sizeof() was used as hash
data input.

This approach breaks when virtualizing ConnKeys, with compilers explicitly
warning against using this approach. To allow virtualizing ConnKeys, this commit
changes the hashing approach to explicit packing of the tuple context into a
memory chunk, allowing derived classes to provide their own/additional
implementations. One benefit of this approach is that the packed chunks can vary
in size, using space for additional fields only when actually present in the
traffic. It also reduces the packed size somewhat: on my machine full object
ConnKey instances account for 48 bytes, while the packed chunk makes up 38 --
and we no longer need to memset() out the extra space to ensure hash
correctness.

This also removes unneeded versions of the comparison operators, and bases the
remaining ones on the session key values themselves. There's some potential for
optimization here to avoid repeated creation of the packed region since it will
never change for a given tuple.
The class previously kept a full copy, preventing virtualization.

There are 3 remaining uses of the existing Connection object constructor, all in
C++ unit tests. Since Connection isn't in the detail namespace we can't
immediately change its API, but we could deprecate the existing constructor at
this time.
This allows tuple builders to complete conn_id instances in implementations that
redef that record. We could be more invasive here and shift most of GetVal() into
the five-tuple builder implementation.
Loading policy/protocols/conntuple/vlan adapts Zeek's flow hashing and the
script-layer conn_id record to show VLAN tags when present.

I'm using script-layer ints for the VLAN tag representation for consistency with
what we alrady do elsewhere, but it seems odd since they can never be negative.

I'm currently skipping protocols/conntuple/vlan in test-all-policy since it
otherwise affects the external testsuites -- could revisit if people feel it
should run on these.
@ckreibich ckreibich force-pushed the topic/christian/pluggable-conntuples branch from 30cb623 to 3c28f10 Compare June 5, 2025 23:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants