[yaws_trace] Break deadlock between yaws_trace and yaws_server#505
Open
QuinnWilton wants to merge 1 commit intoerlyaws:masterfrom
Open
[yaws_trace] Break deadlock between yaws_trace and yaws_server#505QuinnWilton wants to merge 1 commit intoerlyaws:masterfrom
QuinnWilton wants to merge 1 commit intoerlyaws:masterfrom
Conversation
yaws_trace:disable_trace/1 and enable_trace/2 call
yaws_server:getconf() (a gen_server:call) to read config, then call
yaws_api:setconf/2 which also does gen_server:call(yaws_server, ...).
Meanwhile, yaws_server calls yaws_trace:get_filter/0 (another
gen_server:call). If yaws_server is handling a request that calls
get_filter while a trace operation is in flight, both processes block
waiting on each other.
Solution: cache the GC config in yaws_trace state (populated at setup
time) and make all config propagation async.
disable_trace and enable_trace are now handle_call clauses that read
GC from yaws_trace's local cache. When the trace field changes, they
cast {update_trace, TraceVal} to yaws_server, which applies the delta
to its own authoritative GC copy. Both sync calls into yaws_server
(getconf and setconf) must be eliminated to break the cycle — caching
removes getconf, and the cast replaces setconf.
Bypassing setconf is safe: when only gconf.trace changes,
soft_setconf's meaningful work is just update_gconf — cert checks, arg
validation, auth setup, and group reconfig are all no-ops.
Sending a delta ({update_trace, TraceVal}) rather than the full GC
record avoids a stale-cache problem: if yaws_trace held a full GC copy
and cast it back, any concurrent config changes (reload, admin ops)
would be silently reverted.
Other cleanup:
- Remove dead `groups` field from yaws_trace state (never read)
- Remove setup/2 (groups arg was only stored, never consumed)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note that this issue was found while running a static analysis tool I'm working on. It seems like a legitimate finding, but it isn't something I observed while operating
yaws, and so there might be some nuance I'm missing.yaws_trace:disable_trace/1 and enable_trace/2 call yaws_server:getconf() (a gen_server:call) to read config, then call yaws_api:setconf/2 which also does gen_server:call(yaws_server, ...). Meanwhile, yaws_server calls yaws_trace:get_filter/0 (another gen_server:call). If yaws_server is handling a request that calls get_filter while a trace operation is in flight, both processes block waiting on each other.
Solution: cache the GC config in yaws_trace state (populated at setup time) and make all config propagation async.
disable_trace and enable_trace are now handle_call clauses that read GC from yaws_trace's local cache. When the trace field changes, they cast {update_trace, TraceVal} to yaws_server, which applies the delta to its own authoritative GC copy. Both sync calls into yaws_server (getconf and setconf) must be eliminated to break the cycle — caching removes getconf, and the cast replaces setconf.
Bypassing setconf is safe: when only gconf.trace changes, soft_setconf's meaningful work is just update_gconf — cert checks, arg validation, auth setup, and group reconfig are all no-ops.
Sending a delta ({update_trace, TraceVal}) rather than the full GC record avoids a stale-cache problem: if yaws_trace held a full GC copy and cast it back, any concurrent config changes (reload, admin ops) would be silently reverted.
Other cleanup:
groupsfield from yaws_trace state (never read)