
Add feature.network.incoming.raw_tcp_ports #4209

Open

fergusean wants to merge 2 commits into metalbear-co:main from zyno-io:feat/raw-tcp-ports

Conversation


fergusean commented Apr 24, 2026

Summary

Introduces a new feature.network.incoming.raw_tcp_ports config option — a list of ports that should be stolen as raw TCP, bypassing HTTP detection and TLS handling entirely.

By default, mirrord runs HTTP detection on every stolen port by reading the first bytes the client sends. For server-first protocols (SMTP, FTP, custom binary protocols, etc.) the client sends nothing after connecting — it waits for the server to speak first. Because the detection read blocks until data arrives, and data never arrives, these connections hang indefinitely and are never forwarded to the local application.

Ports listed in raw_tcp_ports skip detection and are forwarded to the local application immediately as raw byte streams.
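For illustration, opting a port out of detection from the mirrord config file would look something like `{"feature": {"network": {"incoming": {"raw_tcp_ports": [1234]}}}}` (a usage sketch based on the option path in the title; the authoritative schema is the IncomingConfig change described below).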

What changed

Protocol (mirrord-protocol → 1.27.0)

  • Added StealType::AllRawTcp(Port) — a new steal subscription variant that signals the agent to skip HTTP detection for a port (sketched after this list).
  • Added STEAL_RAW_TCP_VERSION (>=1.27.0) for capability negotiation.
  • StealType::get_port() now covers the new variant.
  • BlockedAction display updated to include AllRawTcp.
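A rough sketch of the protocol-side shape, based only on the bullets above; `AllRawTcp`, `get_port()`, and `STEAL_RAW_TCP_VERSION` are named in this PR, while everything else (the `Port` alias, the other variant, the version-requirement representation) is an illustrative stand-in:

```rust
pub type Port = u16;

/// Capability negotiation: agents matching this requirement understand the new
/// variant. (Illustrative; mirrord-protocol's actual representation may differ.)
pub const STEAL_RAW_TCP_VERSION: &str = ">=1.27.0";

pub enum StealType {
    /// Existing behavior: steal every connection on the port, with HTTP detection.
    All(Port),
    /// New in this PR: steal as raw TCP, skipping HTTP detection and TLS handling.
    AllRawTcp(Port),
    // ... existing HTTP-filtered variants elided ...
}

impl StealType {
    /// Now covers the new variant as well.
    pub fn get_port(&self) -> Port {
        match self {
            StealType::All(port) | StealType::AllRawTcp(port) => *port,
        }
    }
}
```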

Config (mirrord-config)

  • Added raw_tcp_ports: Option<Vec<u16>> to IncomingConfig.
  • Validation rejects raw_tcp_ports when an HTTP filter applies to all ports (i.e., http_filter.ports is unset), and rejects any port that appears in both raw_tcp_ports and http_filter.ports; see the sketch after this list.
  • Mirror mode warns when raw_tcp_ports is set (the field has no effect in mirror mode, so it would otherwise be silently ignored).
  • Analytics tracks raw_tcp_ports_count.
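The validation rule reduces to something like the following (a sketch; the function name and parameter shapes are illustrative, not mirrord-config's actual API):

```rust
/// Reject `raw_tcp_ports` when an HTTP filter applies to all ports, and reject
/// any port listed both in `raw_tcp_ports` and in `http_filter.ports`.
fn validate_raw_tcp_ports(
    raw_tcp_ports: &[u16],
    http_filter_active: bool,
    http_filter_ports: Option<&[u16]>,
) -> Result<(), String> {
    if raw_tcp_ports.is_empty() {
        return Ok(());
    }
    match (http_filter_active, http_filter_ports) {
        // A filter with no explicit port list applies to all ports, which
        // contradicts bypassing detection on any port.
        (true, None) => {
            Err("raw_tcp_ports cannot be combined with an HTTP filter that applies to all ports".into())
        }
        (true, Some(filter_ports)) => {
            match raw_tcp_ports.iter().copied().find(|p| filter_ports.contains(p)) {
                Some(port) => Err(format!(
                    "port {port} appears in both raw_tcp_ports and http_filter.ports"
                )),
                None => Ok(()),
            }
        }
        (false, _) => Ok(()),
    }
}
```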

Layer (mirrord-layer-lib)

  • IncomingMode carries the raw_tcp_ports set.
  • IncomingMode::subscription() emits StealType::AllRawTcp for ports in the set, taking priority over HTTP filter logic, as sketched below.
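In pseudocode terms (reusing the `StealType` sketch above; `IncomingMode`'s real fields differ):

```rust
use std::collections::HashSet;

/// Sketch of the layer-side decision: membership in `raw_tcp_ports` wins over
/// any HTTP filter logic when choosing the subscription for a port.
struct IncomingMode {
    raw_tcp_ports: HashSet<u16>,
    // ... filter configuration elided ...
}

impl IncomingMode {
    fn subscription(&self, port: u16) -> StealType {
        if self.raw_tcp_ports.contains(&port) {
            // Takes priority over HTTP filter logic.
            StealType::AllRawTcp(port)
        } else {
            // Fall back to the existing detection-based subscription
            // (filtered variants elided for brevity).
            StealType::All(port)
        }
    }
}
```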

intproxy (mirrord-intproxy)

  • PortSubscriptionExt::requests_raw_tcp() identifies raw TCP subscriptions.
  • agent_subscribe() downgrades AllRawTcp to All when talking to an older agent (with a warning), preserving backward compatibility; see the sketch after this list.
  • SubscriptionsManager::layer_subscribed() rejects a second subscription on the same port if it requests a different mode, returning PortAlreadyStolen.
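The downgrade path is roughly (a sketch; `agent_supports_raw_tcp` stands in for the actual STEAL_RAW_TCP_VERSION check):

```rust
/// If the agent predates STEAL_RAW_TCP_VERSION, fall back to `StealType::All`
/// so the port is still stolen (with detection), and warn the user.
fn downgrade_if_needed(steal_type: StealType, agent_supports_raw_tcp: bool) -> StealType {
    match steal_type {
        StealType::AllRawTcp(port) if !agent_supports_raw_tcp => {
            tracing::warn!(
                port,
                "agent does not support raw TCP steal; falling back to detection-based steal"
            );
            StealType::All(port)
        }
        other => other,
    }
}
```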

Agent (mirrord-agent)

  • Added IncomingPortMode enum (Detect | RawTcp) tracked per port in PortState.
  • MaybeHttp::accept_raw_tcp() wraps a redirected connection without running detection.
  • RedirectorTask uses port_mode to branch: RawTcp ports skip the detection select! entirely.
  • StealHandle::steal() now takes a mode argument.
  • Mode conflicts (e.g., a later subscriber requests a different mode) produce a warning but don't change the mode established by the first subscriber.
  • steal/api.rs maps StealType::AllRawTcp → IncomingPortMode::RawTcp; all other steal types → IncomingPortMode::Detect (see the sketch below).
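So on the agent side the mapping reduces to (a sketch; names per the bullets above, actual definitions in mirrord-agent may differ):

```rust
/// Per-port mode tracked in `PortState`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum IncomingPortMode {
    /// Run HTTP detection on new connections (default).
    Detect,
    /// Forward connections immediately as raw byte streams.
    RawTcp,
}

fn port_mode(steal_type: &StealType) -> IncomingPortMode {
    match steal_type {
        StealType::AllRawTcp(_) => IncomingPortMode::RawTcp,
        _ => IncomingPortMode::Detect,
    }
}
```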

Tests

  • mirror_detection_survives_later_raw_tcp_steal / raw_tcp_steal_mode_survives_later_mirror — verify that port-wide mode is fixed by the first subscriber.
  • rejects_mode_change_for_existing_subscription — intproxy correctly rejects a second layer that tries to change the mode.
  • raw_tcp_ports_subscribe_raw_tcp — layer-level integration test confirming AllRawTcp is sent to the agent for ports in raw_tcp_ports.

Quality Checklist:

  • I have documented new code sufficiently
  • I have checked and updated the relevant existing docs in code, including removing outdated material
  • I have written user-facing website docs for new features, or opened a (sub)issue to do so
  • I have checked and updated existing website docs for changed features
  • I have tested this change and know it succeeds and fails as expected
  • I have written unit tests or purposefully omitted them
  • I have written e2e tests or purposefully omitted them
  • I have explained what this PR introduces and why, and linked to relevant context (e.g. linear issues, related PRs, documentation)
  • I have introduced a short, clear and well-formatted changelog entry

aviramha (Member) commented Apr 27, 2026

Thank you for your contribution!
Before we review, I want to focus on the issue you mentioned:

By default, mirrord runs HTTP detection on every stolen port by reading the first bytes the client sends. For server-first protocols (SMTP, FTP, custom binary protocols, etc.) the client sends nothing after connecting — it waits for the server to speak first. Because the detection read blocks until data arrives, and data never arrives, these connections hang indefinitely and are never forwarded to the local application.

Can you share a minimal reproducible example? We're not aware of such bug, and that should be handled before we add more explicit config, if needed.

fergusean (Author) commented Apr 27, 2026

Base Setup

Tab 1

create a test deployment & service in the cluster (listen-test) -- a simple Node server that logs each connection and writes a message to the client:

kubectl create deployment listen-test --image=node:alpine -- node -e "require('net').createServer(c => { console.log('connection from ' + c.remoteAddress); c.write('test data from server upon connect'); }).listen(1234).on('listening', () => console.log('listening'))"
kubectl create service clusterip listen-test --tcp=1234:1234

tail logs

kubectl logs -f deployment/listen-test

Tab 2

create a pod from which to test connections (connect-test):

kubectl run connect-test --image=busybox --command -- tail -f /dev/null

connect to listen-test from connect-test:

kubectl exec -it connect-test -- nc -v listen-test 1234

expected result:

➜  ~ kubectl exec -it connect-test -- nc -v listen-test 1234
listen-test (10.43.24.28:1234) open
test data from server upon connect

plus you should see connection from ... in tab 1


Introduce mirrord

Tab 2

mirrord exec --steal --target=deployment/listen-test node -- -e "require('net').createServer(c => { console.log('local: connection from ' + c.remoteAddress); c.write('local: test data from server upon connect'); }).listen(1234).on('listening', () => console.log('local: listening'))"

expected result:

⠓ mirrord exec (cli version 3.206.1)
    ✓ running on latest!
    ✓ ready to launch process
      ✓ layer extracted
      ✓ operator not found
      ✓ agent pod default/mirrord-agent-qrc2p7cxqp-mzs2x created
      ✓ pod is ready
      ✓ arm64 layer library extracted
    ✓ config summary
    ✓ Ready!
local: listening

Tab 3

connect to listen-test from connect-test (same as before):

kubectl exec -it connect-test -- nc -v listen-test 1234

expected result:

listen-test (10.43.24.28:1234) open

However, you'll notice that neither the cluster listener in tab 1 nor the local listener in tab 2 shows a connection.

Now type something and hit enter.
The local listener (tab 2) will then show a connection, and connect-test (tab 3) suddenly shows:

local: test data from server upon connect

While the timeout in http.rs can be fixed to solve the problem of the indefinite hang, that's still not a desired outcome for server-first protocols, as it would introduce an unnecessarily long delay between connect and TTFB.
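To make the trade-off concrete, a first-byte-bounded detection read would look roughly like this (a sketch, not the actual http.rs code; `looks_like_http` is a hypothetical helper):

```rust
use std::time::Duration;
use tokio::{io::AsyncReadExt, net::TcpStream, time::timeout};

/// Detection read bounded by a first-byte timeout: if the client sends nothing
/// within the window (as in server-first protocols), give up on HTTP detection
/// and treat the connection as raw TCP.
async fn detect_with_timeout(stream: &mut TcpStream) -> std::io::Result<bool> {
    let mut buf = [0u8; 64];
    match timeout(Duration::from_secs(2), stream.read(&mut buf)).await {
        Ok(read_result) => {
            let n = read_result?;
            Ok(looks_like_http(&buf[..n]))
        }
        // Timed out waiting for the first byte: assume a server-first protocol.
        // Note that every such connection still pays the full timeout before any
        // data flows, which is the connect-to-TTFB delay mentioned above.
        Err(_elapsed) => Ok(false),
    }
}

/// Hypothetical sniffing helper; the real detection logic lives in http.rs.
fn looks_like_http(bytes: &[u8]) -> bool {
    const METHODS: [&[u8]; 4] = [b"GET ", b"POST", b"PUT ", b"HEAD"];
    METHODS.iter().any(|m| bytes.starts_with(m))
}
```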

aviramha (Member) commented

While the timeout in http.rs can be fixed to solve the problem of the indefinite hang, that's still not a desired outcome for server-first protocols, as it would introduce an unnecessarily long delay between connect and TTFB.

I agree, but the proposed change is fairly large, and it would be better to do it in steps.
Do you mind sending a PR that:

  • makes the timeout count from the first byte
  • reduces the timeout to 2s
  • makes it configurable via the agent config

Then we can discuss how to handle connections that are server-first by default, before implementing.
Thank you for your contribution!
