Skip to content

Fix use-after-free: call emu_env_free before emu_free#13

Merged
micheloosterhof merged 18 commits intomainfrom
claude/review-c-code-logic-gKpMn
Feb 11, 2026
Merged

Fix use-after-free: call emu_env_free before emu_free#13
micheloosterhof merged 18 commits intomainfrom
claude/review-c-code-logic-gKpMn

Conversation

@micheloosterhof
Copy link
Member

@micheloosterhof micheloosterhof commented Feb 11, 2026

In emulate_ctx_free(), emu_free(ctx->emu) was called before
emu_env_free(ctx->env). Since emu_env holds a reference to the
emu object, freeing emu first causes emu_env_free to access
already-freed memory. The correct order (as used in profile.c)
is emu_env_free first, then emu_free.

In emulate_ctx_free(), emu_free(ctx->emu) was called before
emu_env_free(ctx->env). Since emu_env holds a reference to the
emu object, freeing emu first causes emu_env_free to access
already-freed memory. The correct order (as used in profile.c)
is emu_env_free first, then emu_free.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
The offset was updated as ctx->offset += size, but since the scan
uses a 300-byte lookback (offset = MAX(ctx->offset-300, 0)), the
actual start of the scanned data can be up to 300 bytes before
ctx->offset. Adding the full size to ctx->offset would advance it
past the end of the data, causing bytes to be skipped in future
scans. The correct update is ctx->offset = offset + size, matching
the pattern used in speakeasy/detect.c.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
When the xmatch pattern matching returns 0 (no matches), the
function returned without freeing the streamdata buffer allocated
by bistream_get_stream. The error case (-1) and the match-found
path both correctly free streamdata.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
The debug log lines for netmask, broadcast address, and destination
address all incorrectly passed addr->addr (the interface address)
to inet_ntop/addr_offset instead of addr->netmask, addr->broadaddr,
and addr->dstaddr respectively. This caused all three debug lines
to print the interface address instead of the correct value.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
In hupy(), g_hash_table_lookup was called with module_name (type
gchar**, a pointer-to-pointer) instead of *module_name (type
gchar*, the actual string). Since the hash table uses g_str_hash/
g_str_equal, passing the pointer-to-pointer caused g_str_hash to
hash the raw pointer bytes rather than the string contents, so the
lookup would never match existing imports. This made module reload
(HUP) always take the "new import" path instead of reloading.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
When size <= 0, the function returned without freeing the emu
object or the streamdata buffer. Move the size check before the
emu_new() call and add the missing g_free(streamdata).

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
After trying to execute 64 instructions following a 0xE8 (CALL)
byte, the code unconditionally returned 1 even when the CALL/POP
pattern was never confirmed (ESP never returned to its pre-CALL
value). This caused every 0xE8 byte with a displacement <= 512
that didn't crash emulation to be flagged as a GetPC pattern,
leading to false positives. Return 0 when the pattern is not
confirmed by the execution check.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
emu_shellcode_test_x86() used uint16_t for its size parameter,
silently truncating buffers larger than 65535 bytes. This meant
shellcode beyond offset 65535 would never be scanned. The ARM32
and ARM64 variants already correctly used uint32_t. Changed to
uint32_t in the function signature, header declaration, and
removed the now-unnecessary truncation in the caller.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
The while loop at line 290 checked itsc->stream_offset before
reassigning itsc from the current list element. This meant the
condition was always one iteration behind: it checked the previous
chunk's offset, not the current one. When the previous chunk's
stream_offset < end but the current chunk's stream_offset >= end,
the loop would enter incorrectly. This caused end_offset to wrap
around as an unsigned value (end - stream_offset where stream_offset
> end), leading to a massive memcpy that could overflow the buffer.

Fix by restructuring the loop to check the condition after
reassigning itsc from the current list element.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
The action_str array was only defined inside #ifdef DEBUG, but the
g_debug() calls that reference it are outside the guard. When DEBUG
is not defined but g_debug still expands to g_log (i.e. G_DISABLE_DEBUG
is not set), this references an undeclared identifier. Remove the
#ifdef guard since the array is needed whenever g_debug is active.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
PyDict_New() and PyList_New() return new references (refcount=1).
PyDict_SetItemString/PyDict_SetItem increment the refcount to 2.
Without Py_DECREF after the SetItem calls, these objects leak one
reference each time a new interface or address family is encountered.

Add Py_DECREF(pyafdict) after inserting into result dict, and
Py_DECREF(pyaflist) after inserting into the per-interface dict.
The else branch correctly uses PyDict_GetItem which returns a
borrowed reference, so no DECREF is needed there.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
Every obj_value from py_config_string/py_config_string_list returns
a new reference (refcount=1). PyDict_SetItemString increments to 2.
Without Py_XDECREF after each SetItemString, every config value
leaked one reference.

Similarly, obj2 was created with PyDict_New (refcount=1), inserted
into obj via PyDict_SetItemString (refcount=2), then overwritten by
a new PyDict_New without DECREF. The old obj2 would never reach
refcount 0, leaking the dict and all its contents.

Use Py_XDECREF for obj_value (which may be NULL from Py_RETURN_NONE
path) and Py_DECREF for obj2 after insertion.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
The download path at line 345 correctly casts the expected size to
ssize_t before comparing with write()'s return value. The upload
path at line 354 compared write()'s ssize_t return with size*nmemb
(size_t) without the cast, creating a signed/unsigned mismatch.
Add the same (ssize_t) cast for consistency and correctness.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
When nfqueue_cb received packets with len <= 0 or len <= sizeof(iphdr),
it returned early without extracting the packet ID or issuing a verdict.
These packets remained queued in the kernel NFQ queue indefinitely. An
attacker could send many malformed/tiny packets to exhaust the queue,
preventing legitimate packets from being processed (DoS).

Also adds validation for ip->ihl >= 5 (minimum valid IP header length).
Crafted packets with ihl < 5 caused the TCP header pointer to overlap
with the IP header, resulting in garbage port values being parsed and
reported in incidents.

Fix by extracting packet_id immediately after the header check, and
using goto to ensure all code paths reach the verdict-sending code.
Short/malformed packets now receive NF_DROP verdicts.

Also fixes signed/unsigned comparison warnings in the length checks.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
…ction

emu_shellcode_test_x86 allocated calloc(size, sizeof(uint32_t)) where
size is the stream data length from network input. A long-lived
connection accumulating megabytes of data would trigger multi-megabyte
allocations (size * 4 bytes) on every processor invocation, even if the
data contains no shellcode. This is a memory amplification vector.

Fix by capping the GetPC candidate array at MAX_GETPC_CANDIDATES (4096).
Real shellcode has 1-2 GetPC patterns; legitimate data rarely has more
than a few dozen false matches. This bounds both memory (16KB fixed) and
CPU (max 4096 emulator invocations per scan).

Also removes the unused execution_result struct and results array, which
was accumulated but never read back (only best_offset/best_steps are
tracked).

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
strunpack8() read a length byte and sliced data without any bounds
validation. On crafted input (e.g., from a MITM'd hpfeeds server
connection), this could cause:
- IndexError on empty data (unhandled, crashes the hpfeeds handler)
- Silently truncated data when claimed length exceeds buffer
- Wrong remaining data for chained strunpack8 calls (lines 172-173)

Fix by validating the buffer has at least 1 byte for the length field,
and at least 1+length bytes for the data. Raises BadClient on
malformed data, which is caught by the existing handler at line 180.

strpack8() used len(x)%0xff (modulo 255, not 256) to compute the
length byte. This caused incorrect length fields: a 255-byte string
got length 0, a 256-byte string got length 1, etc. Fix by removing
the modulo entirely and raising BadClient if the string exceeds the
255-byte limit of a single-byte length field.

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
Replace calloc(MAX_GETPC_CANDIDATES, sizeof(uint32_t)) with a
fixed-size stack array. At 4096 * 4 = 16KB this is well within
stack limits and eliminates:
- The heap allocation that could fail
- The NULL check and early return path
- Two free() calls (one in the no-match path, one at the end)
- Any risk of forgetting to free on a new error path

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
compute_sha256_hex: Change from returning a g_malloc'd string to writing
into a caller-provided buffer. SHA256 hex is always exactly 64 chars + null
(65 bytes), making it ideal for a stack buffer.

save_shellcode: Replace g_strdup_printf with snprintf into fixed stack
buffers (SHELLCODE_PATH_MAX=512) for bin_path and txt_path. This
eliminates 3 heap allocations and simplifies cleanup on every return path
(no more g_free chains).

https://claude.ai/code/session_01Pm9yPS4qqpBHtrvVYUSz7k
@micheloosterhof micheloosterhof merged commit 8b9704b into main Feb 11, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants