Network probing agent for the Retina distributed measurement system.
The agent connects to an orchestrator via TCP, receives probing directives, executes network probes, and returns forwarding information elements (FIEs).
Part of the Retina system:
- Generator: Creates probing directives
- Orchestrator: Distributes directives to agents, collects FIEs
- Agent: Executes network probes (this component)
             ┌─────────────┐
             │Orchestrator │
             └──────┬──────┘
                    │ TCP (JSON over newline-delimited stream)
                    │
┌───────────────────▼─────────────────┐
│            Retina Agent             │
│                                     │
│ ┌────────┐  ┌──────────┐  ┌──────┐  │
│ │ Reader │─▶│Processor │─▶│Writer│  │
│ └────────┘  └─────┬────┘  └──────┘  │
│                   │                 │
│             ┌─────▼─────┐           │
│             │  Prober   │           │
│             │ (caracal) │           │
│             └───────────┘           │
└─────────────────────────────────────┘
Three-stage pipeline:
- Reader: Receives ProbingDirective messages from orchestrator
- Processor: Executes two probes per directive (near TTL, far TTL) in parallel, sends FIE when both complete
- Writer: Sends ForwardingInfoElement results back to orchestrator
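The three stages can be sketched with two buffered channels. This is a minimal illustration, not the agent's actual code: the Directive and FIE structs, the runPipeline function, and the use of slices in place of the TCP connection are all assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// Directive and FIE are simplified stand-ins (assumptions) for the real
// ProbingDirective and ForwardingInfoElement messages.
type Directive struct {
	Dest    string
	NearTTL uint8
}

type FIE struct {
	Dest    string
	NearTTL uint8
	FarTTL  uint8
}

// runPipeline wires the three stages together with buffered channels.
// The network reader and writer are replaced by a slice in and a slice out.
func runPipeline(in []Directive, buffer int) []FIE {
	directives := make(chan Directive, buffer)
	fies := make(chan FIE, buffer)

	// Reader stage: feeds directives into the pipeline.
	go func() {
		defer close(directives)
		for _, d := range in {
			directives <- d
		}
	}()

	// Processor stage: one goroutine per directive, so a slow probe
	// does not hold up the directives queued behind it.
	go func() {
		var wg sync.WaitGroup
		for d := range directives {
			wg.Add(1)
			go func(d Directive) {
				defer wg.Done()
				// Real probing is omitted; the FIE is built directly.
				fies <- FIE{Dest: d.Dest, NearTTL: d.NearTTL, FarTTL: d.NearTTL + 1}
			}(d)
		}
		wg.Wait()
		close(fies)
	}()

	// Writer stage: drains results (here, into a slice).
	var out []FIE
	for f := range fies {
		out = append(out, f)
	}
	return out
}

func main() {
	out := runPipeline([]Directive{{"8.8.8.8", 10}, {"1.1.1.1", 15}}, 100)
	fmt.Println(len(out), "FIEs produced")
}
```

Because each directive is handled in its own goroutine, the order of FIEs on the output side is not guaranteed to match the order of directives on the input side.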
Key features:
- Non-blocking probe execution (thousands of concurrent probes)
- Automatic reconnection with exponential backoff
- Graceful shutdown on SIGINT/SIGTERM
- Go 1.21+
- For production: caracal and raw socket privileges
git clone https://github.com/dioptra-io/retina-agent
cd retina-agent
go build -o retina-agent ./cmd/retina-agent
./retina-agent --id agent-1 --address localhost:50050 --prober-type mock
Example output:
2026/01/21 17:02:54 Agent agent-1: Connected to orchestrator at localhost:50050
2026/01/21 17:02:54 Agent agent-1: ← Directive for 8.8.8.8 (TTL 10 → 11)
2026/01/21 17:02:54 Agent agent-1: → FIE for 8.8.8.8 | Near(TTL10) | Far(TTL11)
2026/01/21 17:02:55 Agent agent-1: ← Directive for 1.1.1.1 (TTL 15 → 16)
2026/01/21 17:02:55 Agent agent-1: → FIE for 1.1.1.1 | Near(TTL15) | Far(TTL16)
Use the mock orchestrator to test the complete pipeline:
# Terminal 1: Start mock orchestrator
go run test/mock_orchestrator.go
# Terminal 2: Start agent with mock prober
./retina-agent --id agent-1 --address localhost:50050 --prober-type mock
You should see directives flowing in and FIEs flowing out in both terminals.
| Flag | Default | Description |
|---|---|---|
| --id | agent-1 | Agent identifier |
| --address | localhost:50050 | Orchestrator address (host:port) |
| --prober-type | caracal | Prober: caracal or mock |
| --prober-path | (searches PATH) | Path to prober executable |
| --probe-timeout | 5s | Timeout for probe responses |
| --directives-buffer | 100 | Directives channel buffer |
| --fies-buffer | 100 | FIEs channel buffer |
| --max-consecutive-decode-errors | 3 | Max decode errors before reconnecting |
See --help for all options.
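For readers who want to see how flags like these are typically declared, here is a sketch using Go's standard flag package. The flag names and defaults come from the table above; the Config field names and the parseFlags helper are hypothetical and may not match the agent's real config.go.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// Config mirrors the flag table. Field names are assumptions.
type Config struct {
	ID                         string
	Address                    string
	ProberType                 string
	ProberPath                 string
	ProbeTimeout               time.Duration
	DirectivesBuffer           int
	FIEsBuffer                 int
	MaxConsecutiveDecodeErrors int
}

// parseFlags parses args (without the program name) into a Config.
func parseFlags(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("retina-agent", flag.ContinueOnError)
	fs.StringVar(&cfg.ID, "id", "agent-1", "Agent identifier")
	fs.StringVar(&cfg.Address, "address", "localhost:50050", "Orchestrator address (host:port)")
	fs.StringVar(&cfg.ProberType, "prober-type", "caracal", "Prober: caracal or mock")
	fs.StringVar(&cfg.ProberPath, "prober-path", "", "Path to prober executable (empty: search PATH)")
	fs.DurationVar(&cfg.ProbeTimeout, "probe-timeout", 5*time.Second, "Timeout for probe responses")
	fs.IntVar(&cfg.DirectivesBuffer, "directives-buffer", 100, "Directives channel buffer")
	fs.IntVar(&cfg.FIEsBuffer, "fies-buffer", 100, "FIEs channel buffer")
	fs.IntVar(&cfg.MaxConsecutiveDecodeErrors, "max-consecutive-decode-errors", 3, "Max decode errors before reconnecting")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseFlags([]string{"--id", "agent-7", "--prober-type", "mock"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

The flag package accepts both single-dash and double-dash forms, so -id and --id are equivalent.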
Caracal-specific arguments can be set programmatically via Config.ProberArgs:
cfg.ProberArgs = []string{"--n-packets", "3", "--interface", "eth0"}
Not exposed as a CLI flag in the MVP; modify the code if needed.
For each ProbingDirective:
- Launch two probes concurrently:
  - Near probe: TTL = directive.NearTTL
  - Far probe: TTL = directive.NearTTL + 1
- Correlate results by destination, protocol, header fields, TTL, and timestamp
- If both succeed: Build and send FIE
- If either times out: Discard (no FIE)
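The steps above can be sketched as follows. This is an illustration under stated assumptions: ProbeFunc is a reduced, hypothetical form of a prober call, the message structs are simplified, and the correlation step is left out because the fake probes return their results directly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Simplified stand-ins for the real message types (assumptions).
type Directive struct {
	Dest    string
	NearTTL uint8
}

type ProbeResult struct {
	TTL       uint8
	ReplyFrom string
}

type FIE struct {
	Dest      string
	Near, Far ProbeResult
}

// ProbeFunc is a hypothetical, reduced form of a prober call.
type ProbeFunc func(ctx context.Context, dest string, ttl uint8) (ProbeResult, error)

// processDirective probes the near and far TTL concurrently and returns
// an FIE only if both probes succeed.
func processDirective(ctx context.Context, probe ProbeFunc, d Directive, timeout time.Duration) (FIE, bool) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	var (
		wg              sync.WaitGroup
		near, far       ProbeResult
		nearErr, farErr error
	)
	wg.Add(2)
	go func() { defer wg.Done(); near, nearErr = probe(ctx, d.Dest, d.NearTTL) }()
	go func() { defer wg.Done(); far, farErr = probe(ctx, d.Dest, d.NearTTL+1) }()
	wg.Wait()

	if nearErr != nil || farErr != nil {
		return FIE{}, false // clean failure: no partial FIE
	}
	return FIE{Dest: d.Dest, Near: near, Far: far}, true
}

// okProbe always answers; lossyProbe fails for the far hop (TTL 11).
func okProbe(_ context.Context, _ string, ttl uint8) (ProbeResult, error) {
	return ProbeResult{TTL: ttl, ReplyFrom: fmt.Sprintf("hop%d", ttl)}, nil
}

func lossyProbe(ctx context.Context, dest string, ttl uint8) (ProbeResult, error) {
	if ttl == 11 {
		return ProbeResult{}, errors.New("probe timed out")
	}
	return okProbe(ctx, dest, ttl)
}

// tryDirective runs one fixed directive (8.8.8.8, TTL 10 and 11) through probe.
func tryDirective(probe ProbeFunc) (FIE, bool) {
	return processDirective(context.Background(), probe, Directive{"8.8.8.8", 10}, time.Second)
}

func main() {
	fie, ok := tryDirective(okProbe)
	fmt.Println(ok, fie.Near.TTL, fie.Far.TTL)
	_, ok = tryDirective(lossyProbe)
	fmt.Println(ok)
}
```

Waiting on both goroutines before deciding keeps the failure path simple: the function either returns a complete FIE or nothing.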
The caracal prober uses a high-throughput pipeline:
- Multiple goroutines queue probe requests (non-blocking)
- Single writer goroutine sends to caracal stdin (CSV format)
- Single reader goroutine receives from caracal stdout (CSV format)
- Results correlated back to waiting goroutines via shared map
- Supports thousands of concurrent probes without blocking
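One way to correlate results back to waiting goroutines through a shared map is sketched below. The correlator type, its method names, and the two-field key are assumptions for illustration; the real key would likely carry more fields (protocol, header fields, timestamp), and the CSV parsing of caracal output is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// probeKey identifies an in-flight probe (simplified, assumed key).
type probeKey struct {
	Dest string
	TTL  uint8
}

// reply is a stand-in for one parsed line of prober output.
type reply struct {
	From string
}

// correlator maps in-flight probes to the goroutines waiting on them.
type correlator struct {
	mu      sync.Mutex
	pending map[probeKey]chan reply
}

func newCorrelator() *correlator {
	return &correlator{pending: make(map[probeKey]chan reply)}
}

// register records interest in k and returns the channel on which the
// reply will arrive. The channel is buffered so deliver never blocks.
func (c *correlator) register(k probeKey) <-chan reply {
	ch := make(chan reply, 1)
	c.mu.Lock()
	c.pending[k] = ch
	c.mu.Unlock()
	return ch
}

// deliver hands a reply to the waiter for k, if there is one, and
// reports whether a waiter was found.
func (c *correlator) deliver(k probeKey, r reply) bool {
	c.mu.Lock()
	ch, ok := c.pending[k]
	delete(c.pending, k)
	c.mu.Unlock()
	if ok {
		ch <- r
	}
	return ok
}

// forget drops k when its waiter gives up, for example on timeout.
func (c *correlator) forget(k probeKey) {
	c.mu.Lock()
	delete(c.pending, k)
	c.mu.Unlock()
}

func main() {
	c := newCorrelator()
	k := probeKey{"8.8.8.8", 10}
	ch := c.register(k)
	go c.deliver(k, reply{From: "10.0.0.1"}) // plays the stdout reader goroutine
	fmt.Println("reply from", (<-ch).From)
}
```

A late reply for a key that was already forgotten is simply dropped, which matches the "no FIE on timeout" behavior described above.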
- Network errors: Trigger reconnection with exponential backoff
- Decode errors: Log and skip (reconnect after 3 consecutive)
- Probe timeouts: Expected behavior, no FIE created
- Context cancellation: Clean shutdown
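Exponential backoff for reconnection can be sketched as a delay that doubles per attempt up to a cap. The base delay of 1s and the cap of 30s are assumed values chosen for the example; the agent's real parameters are not stated here.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseDelay = 1 * time.Second  // assumed initial delay
	maxDelay  = 30 * time.Second // assumed upper bound
)

// backoff returns the wait before reconnection attempt n (0-based).
// The delay doubles with each attempt and is capped at maxDelay.
func backoff(attempt int) time.Duration {
	d := baseDelay
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

func main() {
	for attempt := 0; attempt < 7; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, backoff(attempt))
	}
}
```

In a reconnect loop, the attempt counter would be reset to zero after a successful connection so that a later outage starts again from the base delay.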
retina-agent/
├── cmd/retina-agent/ # Main entry point
├── internal/agent/
│ ├── agent.go # Core pipeline logic
│ ├── config.go # Configuration
│ ├── prober.go # Prober interface
│ ├── caracal_prober.go # Caracal implementation
│ ├── mock_prober.go # Mock for testing
│ └── agent_test.go # Tests
└── test/
└── mock_orchestrator.go # For end-to-end testing
# All tests
go test ./...
# Specific test
go test -v ./internal/agent -run TestAgentPipeline
# With race detection
go test -race ./...
- Implement the Prober interface:
type Prober interface {
	Probe(ctx context.Context, pd *api.ProbingDirective, ttl uint8) (*ProbeResult, error)
	Close() error
}
- Add to createProber() in agent.go:
case "myprober":
	return NewMyProber(cfg), nil
- Use it:
./retina-agent --prober-type myprober
Cause: Orchestrator is unreachable.
Check: nc -zv orchestrator.example.com 50050
Cause: Probes timing out (expected with mock prober's 10% timeout rate).
Check logs: You should see directives received (←) but some won't produce FIEs (→).
Fix: Use --prober-type mock or --prober-type caracal
Fix: Run as root or grant capabilities:
sudo setcap cap_net_raw+ep /path/to/caracal
Each FIE contains consecutive hop information (near and far TTL) needed for topology analysis.
Partial FIEs complicate downstream processing. Clean failure (no FIE) is simpler than partial success.
Sequential probing is too slow. Each directive spawns a goroutine that launches both probes in parallel and waits for results. This allows processing thousands of directives concurrently without blocking the main pipeline.
Allows testing with mock prober and easy addition of new implementations without changing agent code.
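The idea can be illustrated with a reduced interface and a factory that selects an implementation by name. Everything here is hypothetical: miniProber is a cut-down stand-in for the real Prober interface, and echoProber and newProber exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// miniProber is a reduced, hypothetical version of a prober interface.
type miniProber interface {
	Probe(dest string, ttl uint8) (string, error)
	Close() error
}

// echoProber is a toy implementation that fabricates a hop name.
type echoProber struct{}

func (echoProber) Probe(dest string, ttl uint8) (string, error) {
	return fmt.Sprintf("hop-%d.%s", ttl, dest), nil
}

func (echoProber) Close() error { return nil }

var errUnknownProber = errors.New("unknown prober type")

// newProber picks an implementation by name. Callers depend only on
// the interface, so adding a case here does not affect them.
func newProber(kind string) (miniProber, error) {
	switch kind {
	case "echo":
		return echoProber{}, nil
	default:
		return nil, fmt.Errorf("%w: %q", errUnknownProber, kind)
	}
}

func main() {
	p, err := newProber("echo")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer p.Close()
	hop, _ := p.Probe("8.8.8.8", 10)
	fmt.Println(hop)
}
```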
Passing TTL separately (rather than embedding in the directive) allows probing multiple hops with the same directive without duplicating the directive structure. This keeps the API simple and flexible.
MIT