
Per-host HTTP client pool for TemplateSpray: bounded keep-alive + connection reuse (LRU/TTL) #6702

@Ice3man543

Description


Nuclei’s HTTP engine historically optimized for maximum host coverage in TemplateSpray by effectively disabling connection reuse:

  • Transport.DisableKeepAlives = true in multi-host mode
  • Requests were often forced to close (e.g. req.Close = true), preventing Go’s net/http connection pooling
  • Net effect: most HTTPS requests paid full TCP + TLS handshake costs repeatedly

This is safe for scans with huge numbers of unique targets, but it is slow for workloads that hit the same host repeatedly (multi-step templates, fuzzing, sequential flows, etc.).
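For illustration, here is a minimal stand-alone snippet (not the engine's actual code) showing how these two settings interact: with DisableKeepAlives set on the transport and Close set on the request, net/http tears the connection down after every response, so each request pays the full TCP + TLS setup again.

```go
package main

import (
	"io"
	"net/http"
	"time"
)

func main() {
	// Configured the way the multi-host path effectively behaves today.
	transport := &http.Transport{
		DisableKeepAlives: true, // no idle connections are kept for reuse
	}
	client := &http.Client{Transport: transport, Timeout: 10 * time.Second}

	req, err := http.NewRequest(http.MethodGet, "https://example.com/", nil)
	if err != nil {
		return
	}
	req.Close = true // also forces "Connection: close" on this request

	resp, err := client.Do(req)
	if err != nil {
		return
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // drained, but the connection is closed anyway
}
```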

Proposed change

Introduce a bounded per-host HTTP client pool for TemplateSpray that re-enables reuse selectively while keeping resource usage bounded.

What changed (high level)

  1. Stop implicitly forcing request close

    • Remove auto req.Close = true behavior unless explicitly requested by template headers.
    • Let net/http reuse connections when possible.
  2. Add per-host pooled clients (LRU + TTL)

    • New PerHostClientPool stores retryablehttp.Client per normalized origin key:
      • scheme://host:port (ex: https://example.com:443)
    • Cache is bounded (LRU) and expires entries (TTL); a simplified sketch follows this list.
  3. Keep-alive enabled only for pooled clients

    • For TemplateSpray, GetForTarget(...) returns a pooled client where:
      • DisableKeepAlive = false
      • uses “single-host style” connection pool settings (the Threads = 1 trick selects transport limits appropriate for a single host)
    • HostSpray is excluded because Go’s transport already reuses effectively when locality is high.
  4. Bound idle sockets

    • Set Transport.IdleConnTimeout = 90s to ensure idle connections are reclaimed.
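As a rough illustration of items 2–4, here is a simplified sketch of the pooled-client idea. It is not the actual implementation: the real pool stores retryablehttp.Client values and uses a proper LRU, whereas this version uses plain net/http, a map, and linear-scan eviction; the GetForTarget signature and the specific transport limits are assumptions.

```go
package pool

import (
	"net/http"
	"sync"
	"time"
)

// entry pairs a pooled client with its last-use time for TTL/LRU bookkeeping.
type entry struct {
	client   *http.Client
	lastUsed time.Time
}

// PerHostClientPool keeps at most maxSize per-origin clients alive for ttl.
type PerHostClientPool struct {
	mu      sync.Mutex
	entries map[string]*entry
	maxSize int           // e.g. 500 hosts
	ttl     time.Duration // e.g. 5 minutes
}

func NewPerHostClientPool(maxSize int, ttl time.Duration) *PerHostClientPool {
	return &PerHostClientPool{entries: map[string]*entry{}, maxSize: maxSize, ttl: ttl}
}

// GetForTarget returns a keep-alive enabled client for a normalized origin key
// (scheme://host:port, see "Keying" below), creating one on demand and
// evicting the least recently used entry when the pool is full.
func (p *PerHostClientPool) GetForTarget(originKey string) *http.Client {
	p.mu.Lock()
	defer p.mu.Unlock()

	now := time.Now()
	if e, ok := p.entries[originKey]; ok {
		if now.Sub(e.lastUsed) < p.ttl {
			e.lastUsed = now
			return e.client
		}
		// Expired: drop idle sockets before replacing the client below.
		e.client.CloseIdleConnections()
		delete(p.entries, originKey)
	}

	// Evict the least recently used entry when the pool is full.
	if len(p.entries) >= p.maxSize {
		var oldestKey string
		oldest := now
		for k, e := range p.entries {
			if e.lastUsed.Before(oldest) {
				oldestKey, oldest = k, e.lastUsed
			}
		}
		if oldestKey != "" {
			p.entries[oldestKey].client.CloseIdleConnections()
			delete(p.entries, oldestKey)
		}
	}

	client := &http.Client{
		Transport: &http.Transport{
			DisableKeepAlives:   false,            // allow reuse for this host
			MaxIdleConnsPerHost: 2,                // single-host-style limits (illustrative)
			IdleConnTimeout:     90 * time.Second, // reclaim idle sockets
		},
	}
	p.entries[originKey] = &entry{client: client, lastUsed: now}
	return client
}
```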

Why this helps

Go’s connection reuse requires:

  • same *http.Transport instance
  • keep-alive enabled (DisableKeepAlives=false)
  • request not forcing close (Request.Close=false)
  • response body drained and closed
  • same connection key (scheme/host/port/proxy/TLS constraints)
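The last two conditions are the easiest to violate in practice; a minimal example of satisfying them:

```go
package main

import (
	"io"
	"net/http"
)

// fetchAndRelease performs a GET and releases the underlying connection for
// reuse: the request does not force close, and the body is fully drained and
// closed so the transport can park the connection in its idle pool instead of
// discarding it.
func fetchAndRelease(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}
```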

TemplateSpray breaks locality, so global keep-alive can retain too many sockets across many hosts.
The per-host pool aims to strike a balance:

  • Reuse when host locality exists
  • Bound resource footprint when locality doesn’t exist

Expected benefits

Big wins

  • Multi-request templates against the same host (login flows, chained steps)
  • zip-backup-files: up to ~12 minutes under heavy load before, ~90s after these changes (the template makes 1500+ requests to the same target)
  • Fuzzing / sequential probing against one target
  • HTTP/2 multiplexing scenarios (many requests over one TLS connection)
  • Fewer new TCP/TLS handshakes -> higher throughput and lower latency

Limited wins (but still safe)

  • TemplateSpray with very large target sets where most hosts are not revisited within TTL/idle windows:
    • pool behaves like a bounded “recent host” cache
    • limited reuse but bounded resources

Design details

Keying

  • Pool key: normalized origin scheme://host:port
    • http://example.com -> http://example.com:80
    • https://example.com -> https://example.com:443
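A hypothetical helper (name and signature assumed) showing this normalization rule:

```go
package pool

import (
	"fmt"
	"net/url"
)

// normalizeOrigin reduces a target URL to scheme://host:port, filling in the
// default port when it is omitted. The result is used as the pool key.
func normalizeOrigin(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	port := u.Port()
	if port == "" {
		switch u.Scheme {
		case "https":
			port = "443"
		default:
			port = "80"
		}
	}
	return fmt.Sprintf("%s://%s:%s", u.Scheme, u.Hostname(), port), nil
}
```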

Eviction

  • Bounded LRU size (e.g. 500 hosts)
  • TTL expiration (e.g. 5 minutes)
  • Transport-level idle socket timeout (90s)

Scope

  • Enabled for TemplateSpray (default scanning strategy)
  • Disabled for HostSpray since a single transport already provides strong reuse with host locality

Risks / correctness concerns (important)

1) Host-only keying can cause config cross-talk

The pool key is host-only, but client behavior may vary by configuration:

  • redirect policy (redirects, host-redirects, max redirects)
  • per-request timeout overrides (e.g. ResponseHeaderTimeout annotations)
  • proxy settings, TLS options, etc.
  • cookie jar identity / state

Risk: the first-created client for a host may be reused even when subsequent requests expect different semantics.

TODO: this is critical to fix before merging.

Possible follow-up fix: key by (host + config hash) and include cookie jar identity when isolation is required.
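A sketch of that follow-up, assuming illustrative config fields rather than nuclei's actual options:

```go
package pool

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// clientConfig captures the behavior-affecting settings that must not be
// shared across clients. The fields here are illustrative.
type clientConfig struct {
	FollowRedirects       bool
	FollowHostRedirects   bool
	MaxRedirects          int
	ResponseHeaderTimeout time.Duration
	Proxy                 string
	CookieJarID           string // jar identity, when isolation is required
}

// poolKey combines the normalized origin with a hash of the configuration so
// that clients with different redirect/timeout/proxy/cookie semantics never
// collide on the same cache entry.
func poolKey(origin string, cfg clientConfig) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%+v", cfg)))
	return origin + "|" + hex.EncodeToString(sum[:8])
}
```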

2) Cookie jar cloning is risky

If cloning copies cookiejar.Jar by value, it may introduce subtle race/behavior issues since the jar contains internal state/synchronization.

Possible follow-up fixes:

  • Treat jar pointer as intentionally shared (don’t shallow-copy), OR
  • Include jar identity in pool key to preserve isolation semantics.

3) Global pool lock contention

Cache misses serialize on a global mutex in GetOrCreate.
In low-reuse/high-churn workloads, the lock can become a hotspot.

Possible follow-up fix: shard locks or per-host maps, or use a striped locking strategy.
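A minimal sketch of the striped-locking idea, using per-shard mutexes keyed by a hash of the pool key (names and shard count are illustrative):

```go
package pool

import (
	"hash/fnv"
	"net/http"
	"sync"
)

const shardCount = 64

// shard holds one slice of the pool; different origins hash to different
// shards, so cache misses contend on per-shard locks instead of a global mutex.
type shard struct {
	mu      sync.Mutex
	clients map[string]*http.Client
}

type shardedPool struct {
	shards [shardCount]shard
}

func newShardedPool() *shardedPool {
	p := &shardedPool{}
	for i := range p.shards {
		p.shards[i].clients = map[string]*http.Client{}
	}
	return p
}

// shardFor hashes the pool key onto a fixed shard.
func (p *shardedPool) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &p.shards[h.Sum32()%shardCount]
}

// getOrCreate locks only the shard that owns the key.
func (p *shardedPool) getOrCreate(key string, create func() *http.Client) *http.Client {
	s := p.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	if c, ok := s.clients[key]; ok {
		return c
	}
	c := create()
	s.clients[key] = c
	return c
}
```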

How we should validate (benchmarks + regression monitoring)

Benchmarks to run (recommended)

Use a local deterministic test server (HTTP/1.1 + HTTP/2 over TLS; a minimal sketch follows this list) and run:

  1. High locality / multi-step

    • same host hit repeatedly (multi-request templates)
    • Expect: large drop in TCP/TLS handshakes and wall time
  2. Medium locality

    • ~100–1000 hosts revisited frequently
    • Expect: measurable reuse and stable resource usage
  3. Low locality / churn

    • tens/hundreds of thousands of unique hosts, few revisits
    • Expect: limited speedup, but no runaway sockets/FDs/goroutines
  4. Correctness variance test

    • same host alternating redirect settings / cookie jar / timeouts
    • Expect: behavior matches per-template settings consistently
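A minimal sketch of such a local server, using net/http/httptest with HTTP/2 over TLS enabled, so dials and handshakes can be counted without network noise:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newBenchServer starts a deterministic TLS server that also speaks HTTP/2.
func newBenchServer() *httptest.Server {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok") // fixed body keeps timings comparable across runs
	})
	srv := httptest.NewUnstartedServer(handler)
	srv.EnableHTTP2 = true // serve HTTP/2 once TLS is started
	srv.StartTLS()
	// srv.Client() returns an *http.Client already trusting this server's cert,
	// which the benchmark harness can use or mirror in its own transport setup.
	return srv
}
```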

Metrics that prove reuse

Pool hit/miss only shows “same client returned”. To prove socket reuse, add:

  • httptrace.GotConn (Reused, WasIdle, IdleTime)
  • TCP dial count (DialContext wrapper)
  • TLS handshake count/duration (TLSHandshakeStart/Done)
  • Peak FDs, goroutines, RSS/heap
  • conn_wait_time (GetConn -> GotConn duration) to detect starvation
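A minimal httptrace-based probe (log/metric names are illustrative) that captures these signals per request:

```go
package main

import (
	"context"
	"crypto/tls"
	"io"
	"log"
	"net/http"
	"net/http/httptrace"
	"time"
)

// tracedRequest issues one GET with a ClientTrace attached and logs whether
// the connection was reused, how long we waited for a connection, each new
// TCP dial, and TLS handshake durations.
func tracedRequest(ctx context.Context, client *http.Client, url string) error {
	var getConnAt, tlsStartAt time.Time

	trace := &httptrace.ClientTrace{
		GetConn: func(hostPort string) { getConnAt = time.Now() },
		GotConn: func(info httptrace.GotConnInfo) {
			log.Printf("conn_wait=%s reused=%t was_idle=%t idle_time=%s",
				time.Since(getConnAt), info.Reused, info.WasIdle, info.IdleTime)
		},
		ConnectStart: func(network, addr string) {
			log.Printf("tcp_dial network=%s addr=%s", network, addr) // counts new dials
		},
		TLSHandshakeStart: func() { tlsStartAt = time.Now() },
		TLSHandshakeDone: func(state tls.ConnectionState, err error) {
			log.Printf("tls_handshake duration=%s err=%v", time.Since(tlsStartAt), err)
		},
	}

	req, err := http.NewRequestWithContext(httptrace.WithClientTrace(ctx, trace), http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	return err
}
```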
