[yaws_trace] Break deadlock between yaws_trace and yaws_server #505

Open
QuinnWilton wants to merge 1 commit into erlyaws:master from QuinnWilton:fix/yaws-trace-call-cycle

QuinnWilton commented Feb 10, 2026

Note that this issue was found while running a static analysis tool I'm working on. It seems like a legitimate finding, but it isn't something I observed while operating yaws, and so there might be some nuance I'm missing.

yaws_trace:disable_trace/1 and enable_trace/2 call yaws_server:getconf() (a gen_server:call) to read config, then call yaws_api:setconf/2 which also does gen_server:call(yaws_server, ...). Meanwhile, yaws_server calls yaws_trace:get_filter/0 (another gen_server:call). If yaws_server is handling a request that calls get_filter while a trace operation is in flight, both processes block waiting on each other.
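The shape of the cycle can be reproduced outside yaws. The following is a minimal sketch, not yaws code: the module name `call_cycle_demo`, the registered names `trace_srv` and `conf_srv`, and the `outer`/`inner`/`ping` requests are all made up for illustration. Finite timeouts are used so the demo terminates; with `infinity` timeouts neither call would return.

```erlang
-module(call_cycle_demo).
-behaviour(gen_server).
-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Start two servers that treat each other as peers, then send one request
%% that makes trace_srv call conf_srv, which calls back into trace_srv.
run() ->
    {ok, T} = gen_server:start({local, trace_srv}, ?MODULE, conf_srv, []),
    {ok, C} = gen_server:start({local, conf_srv}, ?MODULE, trace_srv, []),
    Result = gen_server:call(trace_srv, outer, 5000),
    gen_server:stop(T),
    gen_server:stop(C),
    Result.

init(Peer) -> {ok, Peer}.

%% outer: this server blocks in a sync call to its peer while handling a call.
handle_call(outer, _From, Peer) ->
    {reply, safe_call(Peer, inner, 1000), Peer};
%% inner: the peer calls back into the server that is still blocked in `outer`,
%% so the ping sits in that server's mailbox until this call times out.
handle_call(inner, _From, Peer) ->
    {reply, {inner, safe_call(Peer, ping, 200)}, Peer};
handle_call(ping, _From, Peer) ->
    {reply, pong, Peer}.

handle_cast(_Msg, State) -> {noreply, State}.

%% Ignore late replies that may arrive after a call has already timed out.
handle_info(_Msg, State) -> {noreply, State}.

%% Turn a call timeout into a value instead of crashing the caller.
safe_call(Name, Req, Timeout) ->
    try gen_server:call(Name, Req, Timeout)
    catch exit:{timeout, _} -> timeout
    end.
```

`call_cycle_demo:run()` returns `{inner, timeout}`: the nested `ping` is never served while the outer call is in progress, which is the same wait-for-each-other pattern described above.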

Solution: cache the GC config in yaws_trace state (populated at setup time) and make all config propagation async.

disable_trace and enable_trace are now handle_call clauses that read GC from yaws_trace's local cache. When the trace field changes, they cast {update_trace, TraceVal} to yaws_server, which applies the delta to its own authoritative GC copy. Both sync calls into yaws_server (getconf and setconf) must be eliminated to break the cycle — caching removes getconf, and the cast replaces setconf.
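As a rough illustration of that flow (again a standalone sketch, not the code in this PR): one module plays both roles, `demo_trace` stands in for yaws_trace and `demo_conf` for yaws_server, and config is modelled as a map rather than the real gconf record. The module name, registered names, `get_trace` request, and the `http` value are placeholders.

```erlang
-module(trace_cast_demo).
-behaviour(gen_server).
-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2]).

run() ->
    Conf = #{trace => false},
    {ok, C} = gen_server:start({local, demo_conf}, ?MODULE, Conf, []),
    {ok, T} = gen_server:start({local, demo_trace}, ?MODULE, Conf, []),
    ok = gen_server:call(demo_trace, {enable_trace, http}),
    Seen = wait_for_trace(http, 50),
    gen_server:stop(T),
    gen_server:stop(C),
    Seen.

init(Conf) -> {ok, Conf}.

%% Trace side: answer from the local cache, then notify the config owner
%% with a cast, so this process never waits on demo_conf.
handle_call({enable_trace, Val}, _From, Cache) ->
    case maps:get(trace, Cache) of
        Val ->
            {reply, ok, Cache};                 % nothing changed, no cast
        _ ->
            gen_server:cast(demo_conf, {update_trace, Val}),
            {reply, ok, Cache#{trace => Val}}
    end;
%% Config side: read access, used here only to observe the result.
handle_call(get_trace, _From, Conf) ->
    {reply, maps:get(trace, Conf), Conf}.

%% Config side: apply only the trace delta to the authoritative copy.
handle_cast({update_trace, Val}, Conf) ->
    {noreply, Conf#{trace => Val}}.

%% The cast is asynchronous, so poll briefly instead of assuming ordering.
wait_for_trace(_Val, 0) -> timeout;
wait_for_trace(Val, N) ->
    case gen_server:call(demo_conf, get_trace) of
        Val -> Val;
        _ -> timer:sleep(10), wait_for_trace(Val, N - 1)
    end.
```

`trace_cast_demo:run()` returns `http` once the config process has applied the delta. The trace process makes no synchronous call into the config process at any point, so the cycle cannot form in that direction.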

Bypassing setconf is safe: when only gconf.trace changes, soft_setconf's meaningful work is just update_gconf — cert checks, arg validation, auth setup, and group reconfig are all no-ops.

Sending a delta ({update_trace, TraceVal}) rather than the full GC record avoids a stale-cache problem: if yaws_trace held a full GC copy and cast it back, any concurrent config changes (reload, admin ops) would be silently reverted.
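To make the stale-cache concern concrete, here is a small pure-function sketch. It assumes config is a map and uses an invented `logdir` field and `delta_demo` module purely for illustration; the real GC record and update path differ.

```erlang
-module(delta_demo).
-export([run/0]).

%% Full-copy propagation: the receiver replaces its state with the sender's
%% (possibly stale) copy.
apply_full(_Authoritative, CachedCopy) -> CachedCopy.

%% Delta propagation: the receiver changes only the trace field of its own copy.
apply_delta(Authoritative, {update_trace, Val}) -> Authoritative#{trace => Val}.

run() ->
    %% Snapshot taken by the trace process at setup time.
    Cached = #{trace => false, logdir => "/var/log/old"},
    %% Meanwhile the authoritative copy is changed by a reload.
    Authoritative = Cached#{logdir => "/var/log/new"},
    Full  = apply_full(Authoritative, Cached#{trace => http}),
    Delta = apply_delta(Authoritative, {update_trace, http}),
    {maps:get(logdir, Full), maps:get(logdir, Delta)}.
```

`delta_demo:run()` returns `{"/var/log/old", "/var/log/new"}`: the full-copy path reverts the reload, while the delta path preserves it and still sets the trace value.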

Other cleanup:

  • Remove dead `groups` field from yaws_trace state (never read)
  • Remove setup/2 (groups arg was only stored, never consumed)

vinoski self-assigned this Feb 13, 2026