DNS resolver mutex contention serializes concurrent webhook dispatches #2182

@IANewCool

Summary

In server/svix-server/src/core/webhook_http_client.rs, the NonLocalDnsResolver uses Arc<Mutex<DnsState>> for lazy initialization of the DNS resolver.

During the Init state, the mutex lock is held across an .await boundary (the new_resolver().await? call), which serializes all concurrent webhook dispatches behind a single lock acquisition during initialization.

Even in steady state (Ready), every DNS resolution acquires the mutex to read the resolver reference.

Impact

  • During initialization, all concurrent webhook dispatches are serialized behind the mutex
  • In steady state, every dispatch pays mutex acquisition cost on the DNS resolution hot path
  • For high-throughput webhook delivery (thousands/sec), this becomes a contention point

Suggested Fix

Replace Arc<Mutex<DnsState>> with tokio::sync::OnceCell for the lazy initialization pattern:

use tokio::sync::OnceCell;

struct NonLocalDnsResolver {
    resolver: OnceCell<TokioAsyncResolver>,
}

impl NonLocalDnsResolver {
    async fn get_resolver(&self) -> Result<&TokioAsyncResolver> {
        self.resolver.get_or_try_init(|| async {
            new_resolver().await
        }).await
    }
}

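This removes the explicit mutex: initialization runs once, and once the cell is set, subsequent accesses take a lock-free fast path. Callers that arrive while initialization is still in progress wait for it to finish instead of each building their own resolver.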
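To illustrate the initialize-once semantics without pulling in an async runtime, here is a minimal sketch that uses std::sync::OnceLock as a synchronous stand-in for tokio::sync::OnceCell. FakeResolver, LazyResolver, init_calls, and resolve_concurrently are hypothetical names made up for this example and are not part of svix-server. The real fix would still use the async cell shown above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Hypothetical stand-in for the real resolver type.
struct FakeResolver {
    id: usize,
}

/// Lazily initialized resolver holder (illustrative only).
struct LazyResolver {
    resolver: OnceLock<FakeResolver>,
    /// Counts how many times the initializer closure actually ran.
    init_calls: AtomicUsize,
}

impl LazyResolver {
    const fn new() -> Self {
        Self {
            resolver: OnceLock::new(),
            init_calls: AtomicUsize::new(0),
        }
    }

    /// Returns the shared resolver, building it on first use.
    /// Later calls return the stored value without running the closure again.
    fn get(&self) -> &FakeResolver {
        self.resolver.get_or_init(|| {
            let n = self.init_calls.fetch_add(1, Ordering::SeqCst);
            FakeResolver { id: n }
        })
    }
}

/// Spawns `workers` threads that all request the resolver at once
/// and returns the resolver id each thread observed.
fn resolve_concurrently(lazy: &LazyResolver, workers: usize) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|_| s.spawn(|| lazy.get().id))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Calling resolve_concurrently(&lazy, 16) on a fresh LazyResolver should run the initializer once, with every thread observing the same resolver instance.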

Additional

Also noticed panic!()/.expect() in 6+ code paths in queue/redis.rs for queue initialization. A transient Redis hiccup during a rolling deploy would crash the server rather than retrying gracefully.
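As a rough idea of what retrying could look like in place of a panic, here is a minimal synchronous sketch of retry with exponential backoff. retry_with_backoff and its parameters are hypothetical and do not exist in queue/redis.rs. It also blocks the thread with std::thread::sleep to stay dependency-free, whereas async code would more likely await tokio::time::sleep.

```rust
use std::thread;
use std::time::Duration;

/// Calls `connect` until it succeeds or `max_attempts` is reached,
/// doubling the delay after each failure. `connect` is always called
/// at least once. Returns the first success or the last error.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut connect: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match connect() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the error back to the caller instead of panicking.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

Returning the final error lets the caller decide whether to keep waiting, report unhealthy, or shut down cleanly.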

Great codebase overall; the #![forbid(unsafe_code)] at workspace level is a smart choice.
