Optimize DNS performance with lock-free concurrency and connection pooling #849
- Add DnsHandlingStateManager for DNS request deduplication using sync.Map
- Add UdpHealthMonitor for connection health tracking with lockless operations
- Add DnsForwarderManager with reference counting and lifecycle management

These foundational components provide the building blocks for lockless concurrency optimizations across the DNS and UDP processing pipelines. All components use sync.Map to eliminate lock contention in high-concurrency scenarios while maintaining thread safety.

Related to daeuniverse#589, daeuniverse#767 - addressing performance bottlenecks in high-concurrency DNS/UDP processing scenarios
- Replace upstream2IndexMu + upstream2Index map with sync.Map in DNS component
- Replace RWMutex + map with sync.Map in AnyfromPool for UDP connections
- Eliminate lock contention in DNS upstream resolution and UDP pool access
- Improve concurrent access performance in hot paths

These optimizations target the most frequently accessed data structures in DNS and UDP processing, providing significant performance improvements in high-concurrency scenarios.

Helps address daeuniverse#589 - performance issues in DNS resolution under high load
Helps address daeuniverse#767 - UDP connection pool contention
- Enhance UDP endpoint pool with sync.Map and health monitoring integration
- Optimize UDP task pool with increased capacity and lockless state management
- Add health monitoring integration for connection tracking and timeout handling
- Improve task scheduling, error handling, and resource cleanup

This optimization significantly improves UDP processing performance by integrating health monitoring and eliminating lock contention in task and endpoint management.

Addresses daeuniverse#589 - high-concurrency UDP performance issues
Helps resolve daeuniverse#767 - UDP task pool bottlenecks under load
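One way to track endpoint health without a mutex is to keep per-endpoint atomic counters inside a sync.Map. The sketch below assumes a simple "consecutive failures" policy; healthMonitor, its methods, and the threshold are hypothetical and are not taken from the PR's UdpHealthMonitor.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// endpointHealth holds counters that are only touched atomically.
type endpointHealth struct {
	consecutiveFailures int32
}

// healthMonitor is a hypothetical stand-in for UdpHealthMonitor.
type healthMonitor struct {
	endpoints   sync.Map // endpoint address -> *endpointHealth
	maxFailures int32    // failures in a row before an endpoint is unhealthy
}

// state returns the counters for ep, creating them on first use.
func (h *healthMonitor) state(ep string) *endpointHealth {
	if v, ok := h.endpoints.Load(ep); ok {
		return v.(*endpointHealth)
	}
	v, _ := h.endpoints.LoadOrStore(ep, &endpointHealth{})
	return v.(*endpointHealth)
}

// RecordSuccess resets the failure streak for ep.
func (h *healthMonitor) RecordSuccess(ep string) {
	atomic.StoreInt32(&h.state(ep).consecutiveFailures, 0)
}

// RecordFailure extends the failure streak for ep.
func (h *healthMonitor) RecordFailure(ep string) {
	atomic.AddInt32(&h.state(ep).consecutiveFailures, 1)
}

// Healthy reports whether ep is below the failure threshold.
// Endpoints that were never seen are treated as healthy.
func (h *healthMonitor) Healthy(ep string) bool {
	v, ok := h.endpoints.Load(ep)
	if !ok {
		return true
	}
	return atomic.LoadInt32(&v.(*endpointHealth).consecutiveFailures) < h.maxFailures
}

func main() {
	h := &healthMonitor{maxFailures: 3}
	ep := "192.0.2.53:53" // placeholder address
	for i := 0; i < 3; i++ {
		h.RecordFailure(ep)
	}
	fmt.Println(h.Healthy(ep)) // false
	h.RecordSuccess(ep)
	fmt.Println(h.Healthy(ep)) // true
}
```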
- Integrate DnsHandlingState, DnsForwarderManager, and UdpHealthMonitor in DNS control layer
- Wire up optimized components in control plane for system-wide coordination
- Improve error handling, resource cleanup, and connection management
- Simplify DNS control logic while enhancing performance

This final integration brings together all lockless optimization components to provide cohesive performance improvements across the entire DNS and UDP processing pipeline.

Closes daeuniverse#589 - resolves high-concurrency DNS/UDP performance bottlenecks
Closes daeuniverse#767 - eliminates lock contention in connection management

The lockless design using sync.Map significantly reduces mutex contention and improves throughput in high-traffic scenarios, addressing the core performance issues reported in both issues.
- Add DnsServerConfig struct to config/config.go for DNS server configuration
- Extend Dns config struct with Server field for DNS server settings
- Add dialArgument struct to control_plane.go for DNS dialing decisions
- Implement DNS server startup logic in control plane with graceful error handling
- Refactor dns_control.go to support both transparent proxy and DNS server modes
- Add dual-mode udpRequest struct supporting both proxy and server contexts
- Implement DNS forwarder manager with connection pooling and lifecycle management
- Add DNS server configuration example to example.dae with detailed comments
- Optimize DNS cache using sync.Map to reduce lock contention
- Ensure DNS server startup failure does not block main DAE functionality

This enables DAE to function as a complete DNS solution while maintaining backward compatibility and transparent proxy capabilities.
…n toggle" This reverts commit 8f2f960.
This PR is rather rough. I am trying to redesign the modules that were flagged as problematic in this PR.
0c9baa4 to b1fdf6c
- Improve AnyfromPool concurrent control with exponential backoff retry
- Add timeout protection to prevent infinite waiting in high concurrency
- Enhance error handling in sendPkt to prevent crashes on binding failures
- Replace fatal errors with warning logs for DNS response sending failures
- Maintain service availability when UDP port conflicts occur

Fixes the occasional crashes with 'bind: address already in use' errors that occurred during high concurrent DNS requests on port 53.
65cc5ef to d3f5890
I patched this PR in, and dae on ImmortalWrt now sees a large number of locally originated DNS requests resolving ...ip6.arpa. dae's CPU usage stays at 30%~50%, but this does not affect usability (it recovered after service dae restart). The log looks like this: I don't know whether it will reproduce again.
I added must_direct for rpcd and tried again:
It reproduced again; it appears to be unrelated to rpcd.
After dropping this PR and switching to main, the issue no longer reproduces.
I may have broken it yesterday. Please wait while I fix it.
d3f5890 to ee6c5b4
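I don't know why you can't reproduce it on main; I still see the same behavior on v1.0.0. These reverse lookups come from ImmortalWrt: some are IPv4 PTR queries that reverse-resolve LAN hostnames, and the rest are IPv6 PTR queries for the LAN. What I don't understand is why you don't see them on main. Perhaps it depends on how your rules are written.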

Background
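This PR mainly optimizes DAE's DNS processing performance, addressing DNS bottlenecks in high-concurrency scenarios by implementing lock-free concurrency. The changes replace traditional mutex-protected data structures with sync.Map and introduce dedicated DNS components for better resource management. Error scenarios addressed: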
Checklist
Full Changelogs
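1. DNS Foundation Components (56fb759)
- DnsHandlingStateManager: deduplicates DNS requests using sync.Map, resolving the duplicate-request problem
- UdpHealthMonitor: lock-free connection health tracking, improving the stability of UDP DNS queries
- DnsForwarderManager: reference counting and lifecycle management, optimizing DNS forwarder resource usage
2. DNS Core Optimizations (af2e2c6)
- Replace upstream2IndexMu + upstream2Index map with sync.Map, directly optimizing DNS upstream lookup performance
3. UDP DNS Processing Enhancement (d96dc26)
4. DNS System Integration (6ff101c)
Issue Reference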
Test Result