A Zeek plugin for high-performance threat intelligence matching using Matchy databases. Includes MatchyIntel, a drop-in alternative to Zeek's Intel Framework that fixes its two biggest pain points: memory consumption on clusters and updating data at runtime.
- Why Replace the Intel Framework?
- Installation
- Quick Start
- Deployment
- MatchyIntel Framework
- Low-Level API
- API Reference
- Building Matchy Databases
- Testing
- Troubleshooting
If you've run Zeek's Intel Framework at scale, you've hit these problems:
The Intel Framework loads every indicator into each worker's heap. On a 32-core cluster, that's 32 copies of your indicator set in memory. A million indicators can easily consume tens of gigabytes across workers.
Matchy databases are memory-mapped. The OS maps the .mxy file once and all workers share the same physical pages via the page cache. Zero heap allocation per worker. On that same 32-core cluster, you go from 32 copies to 1.
Replacing the loaded indicator set in the Intel Framework at runtime has been a long-standing pain point. You either restart Zeek (causing a gap in monitoring) or deal with the complexity of incremental insert/remove operations and Broker synchronization.
With Matchy, you just replace the .mxy file on disk. Auto-reload detects the change and swaps in the new database atomically—lock-free, with ~1-2ns overhead per query. No restart, no gap, no coordination between workers. Build your database offline, scp it to your sensor, done.
| Operation | Performance |
|---|---|
| IP queries | 7M+/sec |
| Pattern queries (globs) | 3M+/sec |
| Database load time | <1ms |
| Auto-reload overhead | ~1-2ns/query |
Performance is deterministic—no GC pauses, no hash table resizing during operation.
- No Broker: Database files are self-contained. Copy them with scp, distribute with Ansible, serve from S3.
- No zeekctl deploy: Just replace the file on disk. Auto-reload handles the rest.
- Debug offline: matchy query threats.mxy 1.2.3.4 works from any command line, with no need to inspect Zeek's internal state.
- Build anywhere: Generate .mxy files from CSV, JSON, or MISP feeds in CI/CD. The same binary file works on Linux, macOS, and FreeBSD.
- Zeek 5.0+ (with development headers if not installed from source)
- Rust/Cargo (install from rustup.rs)
- CMake 3.15+
- C++17 compiler
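Before building, it can help to confirm the prerequisite tools are on your PATH. The following is a minimal sketch, not part of the plugin: missing_tools is a hypothetical helper name, and it only checks that each command exists, not that it meets the minimum versions listed above.

```shell
# missing_tools TOOL...: print each named command that is not on PATH.
# Hypothetical helper for illustration; it does not check versions.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools zeek cargo cmake` prints nothing when all three are installed. The C++ compiler is left out of that call because its command name varies by platform.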
zkg install https://github.com/matchylabs/zeek-matchy-plugin
This requires Rust/Cargo to be installed on the build machine. The package manager handles the rest.
git clone https://github.com/matchylabs/zeek-matchy-plugin.git
cd zeek-matchy-plugin
mkdir build && cd build
cmake ..
make
This automatically clones and builds Matchy from source. If you already have a local Matchy checkout, point CMake at it to skip the clone:
cmake -DMATCHY_SOURCE_DIR=/path/to/matchy ..
Or if Matchy is already installed system-wide:
cmake -DBUILD_MATCHY=OFF ..
# Or specify the install prefix:
cmake -DBUILD_MATCHY=OFF -DMATCHY_ROOT=/usr/local ..
sudo make install
# If using ZEEK_PLUGIN_PATH (development)
export ZEEK_PLUGIN_PATH=/path/to/zeek-matchy-plugin/build
zeek -N Matchy::DB
Expected:
Matchy::DB - Fast IP and pattern matching using Matchy databases (dynamic, version 0.3.0)
- Install the Matchy CLI (if you don't have it already):
cargo install matchy
- Create a threat database from CSV:
cat > threats.csv << 'EOF'
entry,threat_level,category,description
1.2.3.4,high,malware,Known C2 server
10.0.0.0/8,low,internal,RFC1918 private network
*.evil.com,critical,phishing,Phishing domain pattern
malware.example.com,high,malware,Malware distribution site
EOF
matchy build threats.csv -o threats.mxy --format csv
- Use it in Zeek (add to your local.zeek or a site-specific script):
@load Matchy/DB/intel
redef MatchyIntel::db_path = "/opt/threat-intel/threats.mxy";
event MatchyIntel::match(s: MatchyIntel::Seen, metadata: string) {
    print fmt("THREAT: %s (%s) -> %s", s$indicator, s$where, metadata);
}
That's it. MatchyIntel automatically checks connection IPs, DNS queries, HTTP hosts/URLs, and SSL/TLS SNI against your database.
Add these lines to your local.zeek (or a site-specific script):
@load Matchy/DB/intel
redef MatchyIntel::db_path = "/opt/threat-intel/threats.mxy";
Then deploy as usual with zeekctl deploy.
Matchy databases are memory-mapped, which means all Zeek workers on the same host share the same physical memory pages. You don't need to worry about per-worker memory — the OS handles sharing via the page cache.
Each host in your cluster needs a copy of the .mxy file at the same path. Options:
- Shared filesystem (NFS, CIFS): Put the .mxy on a shared mount. All hosts read from the same file. Simplest option.
- Local copies: Distribute with rsync, Ansible, Salt, etc. Better I/O performance since reads don't cross the network.
- CI/CD pipeline: Build the database in CI, push to an artifact store or S3, pull from each sensor on a cron job.
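For the cron-based option, a sensor may want to skip the swap when the freshly pulled file is byte-identical to the live one, so workers are not asked to reload for nothing. A minimal sketch, assuming the new file has already been downloaded locally; db_changed is a hypothetical helper name, not part of Matchy.

```shell
# db_changed NEW LIVE: exit 0 if NEW should replace LIVE, 1 if they are
# byte-identical, 2 if NEW does not exist.
# Hypothetical helper for illustration.
db_changed() {
    new=$1
    live=$2
    [ -f "$new" ] || return 2
    # A missing live file always counts as changed.
    [ -f "$live" ] || return 0
    # cmp -s exits 0 when the files are identical, so invert it.
    ! cmp -s "$new" "$live"
}
```

A cron job could call this after each download and only replace the live database when it returns 0.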
With auto-reload enabled (the default), updating is a file replacement. Always write to a temporary file first, then mv it into place. This ensures workers never see a partially-written file — mv on the same filesystem is atomic.
# Build new database (on your build host or in CI)
matchy build updated-threats.csv -o /opt/threat-intel/threats.mxy.tmp --format csv
# Atomically replace the live file
mv /opt/threat-intel/threats.mxy.tmp /opt/threat-intel/threats.mxy
If distributing to remote sensors, copy to a temp path first:
scp threats.mxy sensor01:/opt/threat-intel/threats.mxy.tmp
ssh sensor01 'mv /opt/threat-intel/threats.mxy.tmp /opt/threat-intel/threats.mxy'
All workers detect the file change and reload automatically. No Zeek restart, no zeekctl deploy, no monitoring gap.
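The write-then-rename pattern above can be wrapped in a small function so every update path uses it. This is a sketch under stated assumptions: publish_db is a hypothetical helper name, and the temporary file is created next to the destination so that mv stays on the same filesystem.

```shell
# publish_db SRC DEST: copy SRC beside DEST under a temporary name, then
# rename it over DEST so readers never see a partial file.
# Hypothetical helper for illustration; not part of Matchy.
publish_db() {
    src=$1
    dest=$2
    tmp="$dest.tmp.$$"
    # Refuse to publish a missing or empty file.
    if [ ! -s "$src" ]; then
        echo "publish_db: $src is missing or empty" >&2
        return 1
    fi
    if cp "$src" "$tmp" && mv -f "$tmp" "$dest"; then
        return 0
    fi
    # Clean up the temporary file if the copy or rename failed.
    rm -f "$tmp"
    return 1
}
```

Usage would look like `publish_db ./threats.mxy /opt/threat-intel/threats.mxy`. The empty-file check is a cheap guard against publishing the output of a failed build; it does not validate the database contents.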
MatchyIntel is designed to feel familiar if you've used the Intel Framework, but with a fundamentally different architecture.
When you @load Matchy/DB/intel, it immediately starts observing:
| Protocol | What | Where Enum |
|---|---|---|
| Connections | Originator and responder IPs | Conn::IN_ORIG, Conn::IN_RESP |
| DNS | Query strings | DNS::IN_REQUEST |
| HTTP | Host header, full URL | HTTP::IN_HOST_HEADER, HTTP::IN_URL |
| SSL/TLS | SNI, certificate CN | SSL::IN_SERVER_NAME, X509::IN_CERT |
By default, MatchyIntel watches the database file and reloads when it changes. This is the recommended mode for production.
# Enabled by default
redef MatchyIntel::auto_reload = T;
# To disable (for manual control):
redef MatchyIntel::auto_reload = F;
To update your threat intel, simply replace the .mxy file on disk. All workers pick up the change automatically.
You can also change the database path at runtime via Zeek's Config framework:
# Switch to a different database
Config::set_value("MatchyIntel::db_path", "/opt/threat-intel/updated.mxy");
# Unload the database (stop matching)
Config::set_value("MatchyIntel::db_path", "");
If the new path is invalid, the change is rejected and the current database stays loaded.
Check arbitrary indicators programmatically:
# Check an IP
MatchyIntel::seen(MatchyIntel::Seen($host=1.2.3.4,
                                    $where=MatchyIntel::IN_ANYWHERE));
# Check a domain
MatchyIntel::seen(MatchyIntel::Seen($indicator="evil.example.com",
                                    $indicator_type=MatchyIntel::DOMAIN,
                                    $where=MatchyIntel::IN_ANYWHERE));
# Filter matches before they fire
hook MatchyIntel::seen_policy(s: MatchyIntel::Seen, found: bool) {
    # Suppress matches for local IPs
    if (s?$host && Site::is_local_addr(s$host))
        break;
}
# Customize logging
hook MatchyIntel::extend_match(info: MatchyIntel::Info, s: MatchyIntel::Seen, metadata: string) {
    # Add custom fields, modify info record, etc.
}
Matches are logged to matchy_intel.log:
| Field | Description |
|---|---|
| ts | Timestamp |
| uid | Connection UID (if applicable) |
| id | Connection 4-tuple (if applicable) |
| seen.indicator | What was matched |
| seen.indicator_type | ADDR, DOMAIN, URL, etc. |
| seen.where | Where it was observed |
| metadata | JSON blob from your database (all your custom fields) |
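To get a quick view of which indicators fire most often, the log can be summarized from the shell. This sketch assumes matchy_intel.log is written in Zeek's default tab-separated ASCII format (with a #fields header line) rather than JSON; top_indicators is a hypothetical helper name.

```shell
# top_indicators LOG: print "count<TAB>indicator", highest count first,
# for a tab-separated Zeek log containing a seen.indicator column.
# Hypothetical helper for illustration.
top_indicators() {
    awk -F '\t' '
        # The #fields header names the columns; its first token is the
        # tag itself, so data column N is header token N + 1.
        $1 == "#fields" {
            for (i = 2; i <= NF; i++)
                if ($i == "seen.indicator") col = i - 1
            next
        }
        # Skip the remaining header and footer lines.
        /^#/ { next }
        col { count[$col]++ }
        END { for (k in count) printf "%d\t%s\n", count[k], k }
    ' "$1" | sort -rn
}
```

Run it as `top_indicators matchy_intel.log | head` to see the most frequent matches. If your deployment writes JSON logs, a JSON-aware tool would be needed instead.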
For more control, use the BiF functions directly:
global threats_db: opaque of MatchyDB;
event zeek_init() {
    threats_db = Matchy::load_database("/path/to/threats.mxy");
    if (!Matchy::is_valid(threats_db)) {
        print "Failed to load database!";
        return;
    }
}
event new_connection(c: connection) {
    local result = Matchy::query_ip(threats_db, c$id$orig_h);
    if (result != "") {
        print fmt("Threat detected from %s: %s", c$id$orig_h, result);
    }
}
event dns_request(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count) {
    local result = Matchy::query_string(threats_db, query);
    if (result != "") {
        print fmt("Malicious domain queried: %s - %s", query, result);
    }
}
Query results are JSON strings. Use Zeek's from_json() to parse them into typed records:
@load base/frameworks/notice
module ThreatIntel;
export {
    redef enum Notice::Type += {
        Threat_Detected
    };
    type ThreatData: record {
        category: string &optional;
        threat_level: string &optional;
        description: string &optional;
    };
    global threats_db: opaque of MatchyDB;
}
event zeek_init() {
    threats_db = Matchy::load_database("/opt/threat-intel/threats.mxy");
}
event new_connection(c: connection) {
    local result = Matchy::query_ip(threats_db, c$id$orig_h);
    if (result != "") {
        local parsed = from_json(result, ThreatData);
        if (parsed$valid) {
            local threat: ThreatData = parsed$v;
            NOTICE([$note=Threat_Detected,
                    $conn=c,
                    $msg=fmt("Threat: %s (%s)", threat$category, threat$threat_level),
                    $sub=fmt("IP: %s", c$id$orig_h)]);
        }
    }
}
Matchy::load_database(): Load a database and return an opaque handle. The database is memory-mapped (not copied into memory). Automatically closed when the handle goes out of scope.
load_database_with_options(): Load a database with auto-reload support. When auto_reload is T, the database watches its source file and transparently reloads when changes are detected (~1-2ns overhead per query, lock-free).
Matchy::is_valid(): Check if a database handle is valid and open.
Matchy::query_ip(): Query by IP address. Returns a JSON string with match metadata, or "" if no match. Supports both exact IPs and CIDR matching (longest prefix wins).
Matchy::query_string(): Query by string. Returns a JSON string with match metadata, or "" if no match. Supports exact string matching and glob patterns (*.evil.com).
Install the CLI:
cargo install matchy
# First column must be named "entry"; it's the match key.
# All other columns become metadata fields in query results.
cat > threats.csv << 'EOF'
entry,threat_level,category,description
1.2.3.4,high,malware,Known C2 server
10.0.0.0/8,low,internal,RFC1918 private network
*.evil.com,critical,phishing,Phishing domain pattern
malware.example.com,high,malware,Malware distribution site
EOF
matchy build threats.csv -o threats.mxy --format csv
Matchy auto-detects entry types: IP addresses, CIDR ranges, glob patterns, and literal strings. You can include as many entries as you need; databases with hundreds of thousands of indicators build in about a second.
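Because the first column must be named "entry", a feed exported from another tool is worth checking before it reaches the build step. A minimal sketch, assuming a plain comma-separated header with no quoting; check_csv_header is a hypothetical helper name, not a Matchy command.

```shell
# check_csv_header FILE: succeed only if the first column of the header
# row is named "entry". Strips a trailing carriage return so files with
# Windows line endings are handled.
# Hypothetical helper for illustration.
check_csv_header() {
    first=$(head -n 1 "$1" | tr -d '\r' | cut -d, -f1)
    [ "$first" = "entry" ]
}
```

In a CI job this could gate the build, for example `check_csv_header threats.csv && matchy build threats.csv -o threats.mxy --format csv`.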
matchy build threats.json -o threats.mxy
Matchy can import directly from MISP JSON exports, preserving all metadata (tags, threat levels, categories):
matchy build misp-feed/ -o threats.mxy
This handles MISP's directory structure automatically, including manifest.json and per-event files. All indicator types are supported: IPs, domains, URLs, hashes, email addresses, etc.
You can pass multiple files of the same format to a single build:
matchy build feed1.csv feed2.csv -o combined.mxy --format csv
# Show database metadata and statistics
matchy inspect threats.mxy
# Query from the command line (useful for debugging)
matchy query threats.mxy 1.2.3.4
matchy query threats.mxy "foo.evil.com"
The plugin includes a comprehensive btest suite:
cd testing
btest
Tests cover:
- Plugin loading
- IP and string queries (exact, CIDR, glob)
- load_database_with_options() with auto-reload on/off
- MatchyIntel seen() function
- MatchyIntel auto-reload mode
- Runtime database switching via Config::set_value()
Plugin not found at runtime:
export ZEEK_PLUGIN_PATH=/path/to/zeek-matchy-plugin/build
zeek -N Matchy::DB
Database fails to load with "Unsupported version" error:
Your .mxy file was built with matchy 1.x. Rebuild it with matchy 2.x:
cargo install matchy # updates to 2.x
matchy build threats.csv -o threats.mxy --format csv
Build options:
# Use a local Matchy source checkout
cmake -DMATCHY_SOURCE_DIR=/path/to/matchy ..
# Use an existing Matchy installation
cmake -DBUILD_MATCHY=OFF -DMATCHY_ROOT=/path/to/matchy ..
# Specify Zeek location manually
cmake -DCMAKE_MODULE_PATH=/path/to/zeek/cmake ..
Apache-2.0 License. See LICENSE.
- Matchy — The matching engine
- Zeek Documentation — Zeek network security monitor
- Zeek Plugin Development — Plugin API docs