Skip to content

Commit 26fae5b

Browse files
authored
Merge pull request #4 from MaurUppi/codex/read-and-document-dns-audit-report
Fix DNS audit findings: DoUDP conn race, dialSend context/fallback, add race CI
2 parents 7512079 + 8e9111a commit 26fae5b

File tree

5 files changed

+116
-13
lines changed

5 files changed

+116
-13
lines changed

.github/workflows/dns-race.yml

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
name: DNS Race Test
2+
3+
on:
4+
pull_request:
5+
paths:
6+
- "control/**/*.go"
7+
- "go.mod"
8+
- "go.sum"
9+
- ".github/workflows/dns-race.yml"
10+
push:
11+
branches:
12+
- main
13+
paths:
14+
- "control/**/*.go"
15+
- "go.mod"
16+
- "go.sum"
17+
- ".github/workflows/dns-race.yml"
18+
19+
jobs:
20+
race-test:
21+
runs-on: ubuntu-22.04
22+
steps:
23+
- uses: actions/checkout@v4
24+
25+
- name: Set up Go
26+
uses: actions/setup-go@v5
27+
with:
28+
go-version: '^1.22'
29+
30+
- name: Run race detector for control package
31+
run: go test -race ./control/...

.plan/code_audit_report-dev.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
# dae DNS 改进代码审计报告 - 开发执行文档
2+
3+
> 来源:`.plan/code_audit_report.md`
4+
> 目标:按 `Removal/Iteration Plan` 严格串行落实变更、验证与记录。
5+
6+
## 0. 执行总则(强制)
7+
1. 所有任务严格串行执行:`Tn` 未测试通过,不开始 `Tn+1`
8+
2. 每个任务必须包含:代码实现、任务级测试、测试记录。
9+
3. 每个里程碑必须在全部任务通过后执行一次回归测试。
10+
4. 任一测试失败,先修复并重测,直至通过。
11+
12+
## 1. 任务拆解(对应 Removal/Iteration Plan)
13+
14+
### T1:删除 dead code(P1-1)
15+
- 目标:删除 `control/dns_control.go` 中无效分支 `if err != nil { return err }`
16+
- 影响:仅清理控制流,不改变业务行为。
17+
- 验收:代码扫描确认该 dead code 不存在。
18+
19+
### T2:修复 DoUDP 并发竞争(P0-1)
20+
- 目标:`DoUDP.ForwardDNS` 中 goroutine 与读取路径不再共享可变 `d.conn` 引用。
21+
- 实施:使用局部连接变量进行读写,避免复用实例时写错连接;重试计时改为 ticker。
22+
- 验收:代码扫描确认 goroutine/read 均使用局部连接变量。
23+
24+
### T3:修复 fallback 错误返回语义(P1-2)
25+
- 目标:TCP fallback forwarder 创建失败时返回 `fallbackErr` 包装错误,而非原始 UDP 超时错误。
26+
- 验收:代码扫描确认返回错误包含 fallback 创建失败信息与原始错误上下文。
27+
28+
### T4:dialSend context 传播(P1-3)
29+
- 目标:`dialSend` 不再使用 `context.TODO()`,改为接收上层 context 并在递归路径透传。
30+
- 验收:代码扫描确认 `context.WithTimeout(ctx, ...)`,且递归调用透传 `ctx`
31+
32+
### T5:CI 增加 race detector 检查
33+
- 目标:新增工作流执行 `go test -race ./control/...`
34+
- 验收:工作流文件存在且命令准确。
35+
36+
## 2. 里程碑回归
37+
- 命令:
38+
- `go test ./control -run 'Test(IsTimeoutError|TcpFallbackDialArgument|SendStreamDNSRespectsContextCancelBeforeIO|EvictDnsForwarderCacheOneLocked)' -count=1`
39+
- `go test -race ./control/...`
40+
- 说明:如受依赖拉取限制,需记录失败原因与影响范围到 `.plan/test-log.md`

.plan/test-log.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,3 +67,36 @@
6767
- 命令:`go test ./control -run 'Test(IsTimeoutError|TcpFallbackDialArgument|SendStreamDNSRespectsContextCancelBeforeIO|EvictDnsForwarderCacheOneLocked)' -count=1`
6868
- 结果:失败(环境限制),`proxy.golang.org` 拉取私有/受限依赖 `github.com/daeuniverse/outbound` 返回 403 Forbidden。
6969
- 结论:在当前环境无法完成自动化回归编译;已保留任务级静态校验记录。
70+
71+
## Code Audit Iteration - T1(移除 dead code)
72+
- 命令:`sed -n '626,650p' control/dns_control.go`
73+
- 结果:`forwarder.ForwardDNS(ctxDial, data)` 前不再存在 `if err != nil { return err }` 的残留分支。
74+
- 结论:通过(dead code 已移除)。
75+
76+
## Code Audit Iteration - T2(DoUDP 并发竞争修复)
77+
- 命令:`sed -n '312,360p' control/dns.go`
78+
- 结果:`DoUDP.ForwardDNS` 新增 `localConn := conn`,goroutine 写入与主流程读取均使用 `localConn`,重试等待改为 `retryTicker`
79+
- 结论:通过(避免 goroutine 与后续调用共享可变 `d.conn`)。
80+
81+
## Code Audit Iteration - T3(fallback 错误语义修复)
82+
- 命令:`rg -n "tcp fallback forwarder creation failed" control/dns_control.go`
83+
- 结果:命中 `return fmt.Errorf("tcp fallback forwarder creation failed: %w (original: %v)", fallbackErr, err)`
84+
- 结论:通过(fallback 创建失败不再误报为原始 UDP 错误)。
85+
86+
## Code Audit Iteration - T4(dialSend context 传播)
87+
- 命令:`rg -n "dialSend\(context.Background\(|func \(c \*DnsController\) dialSend\(ctx context.Context|context.WithTimeout\(ctx, consts.DefaultDialTimeout\)|dialSend\(ctx, invokingDepth\+1" control/dns_control.go`
88+
- 结果:命中入口传入 `context.Background()``dialSend(ctx ...)` 签名、`WithTimeout(ctx, ...)`、递归透传 `ctx`
89+
- 结论:通过(已去除 `context.TODO()`)。
90+
91+
## Code Audit Iteration - T5(CI race detector)
92+
- 命令:`rg -n "go test -race ./control/..." .github/workflows/dns-race.yml`
93+
- 结果:命中新增工作流中的 race 检测命令。
94+
- 结论:通过(CI 已补充 race 检测入口)。
95+
96+
## Code Audit Iteration - 里程碑回归
97+
- 命令:`go test ./control -run 'Test(IsTimeoutError|TcpFallbackDialArgument|SendStreamDNSRespectsContextCancelBeforeIO|EvictDnsForwarderCacheOneLocked)' -count=1`
98+
- 结果:失败,依赖 `github.com/daeuniverse/outbound``proxy.golang.org` 拉取返回 403 Forbidden。
99+
- 结论:受环境限制,无法完成自动化回归编译。
100+
- 命令:`go test -race ./control/...`
101+
- 结果:失败,除上述依赖拉取 403 外,`control/kern/tests` 还出现 `bpftestObjects/loadBpftestObjects` 未定义构建错误。
102+
- 结论:受环境限制,未能在本地完成 race 回归;已在 CI 增加对应检测工作流。

control/dns.go

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -314,16 +314,19 @@ func (d *DoUDP) ForwardDNS(ctx context.Context, data []byte) (*dnsmessage.Msg, e
314314
return nil, err
315315
}
316316
d.conn = conn
317+
localConn := conn
317318

318319
timeout := 5 * time.Second
319-
_ = d.conn.SetDeadline(time.Now().Add(timeout))
320+
_ = localConn.SetDeadline(time.Now().Add(timeout))
320321
dnsReqCtx, cancelDnsReqCtx := context.WithTimeout(ctx, timeout)
321322
defer cancelDnsReqCtx()
323+
retryTicker := time.NewTicker(1 * time.Second)
324+
defer retryTicker.Stop()
322325

323326
go func() {
324327
// Send DNS request every seconds.
325328
for {
326-
_, _ = d.conn.Write(data)
329+
_, _ = localConn.Write(data)
327330
// if err != nil {
328331
// if c.log.IsLevelEnabled(logrus.DebugLevel) {
329332
// c.log.WithFields(logrus.Fields{
@@ -341,7 +344,7 @@ func (d *DoUDP) ForwardDNS(ctx context.Context, data []byte) (*dnsmessage.Msg, e
341344
select {
342345
case <-dnsReqCtx.Done():
343346
return
344-
case <-time.After(1 * time.Second):
347+
case <-retryTicker.C:
345348
}
346349
}
347350
}()
@@ -350,7 +353,7 @@ func (d *DoUDP) ForwardDNS(ctx context.Context, data []byte) (*dnsmessage.Msg, e
350353
respBuf := pool.GetFullCap(consts.EthernetMtu)
351354
defer pool.Put(respBuf)
352355
// Wait for response.
353-
n, err := d.conn.Read(respBuf)
356+
n, err := localConn.Read(respBuf)
354357
if err != nil {
355358
return nil, err
356359
}

control/dns_control.go

Lines changed: 5 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -536,7 +536,7 @@ func (c *DnsController) handle_(
536536
if err != nil {
537537
return fmt.Errorf("pack DNS packet: %w", err)
538538
}
539-
return c.dialSend(0, req, data, dnsMessage.Id, upstream, needResp)
539+
return c.dialSend(context.Background(), 0, req, data, dnsMessage.Id, upstream, needResp)
540540
}
541541

542542
// sendReject_ send empty answer.
@@ -562,7 +562,7 @@ func (c *DnsController) sendReject_(dnsMessage *dnsmessage.Msg, req *udpRequest)
562562
return nil
563563
}
564564

565-
func (c *DnsController) dialSend(invokingDepth int, req *udpRequest, data []byte, id uint16, upstream *dns.Upstream, needResp bool) (err error) {
565+
func (c *DnsController) dialSend(ctx context.Context, invokingDepth int, req *udpRequest, data []byte, id uint16, upstream *dns.Upstream, needResp bool) (err error) {
566566
if invokingDepth >= MaxDnsLookupDepth {
567567
return fmt.Errorf("too deep DNS lookup invoking (depth: %v); there may be infinite loop in your DNS response routing", MaxDnsLookupDepth)
568568
}
@@ -607,7 +607,7 @@ func (c *DnsController) dialSend(invokingDepth int, req *udpRequest, data []byte
607607
// We should set a connClosed flag to avoid it.
608608
var connClosed bool
609609

610-
ctxDial, cancel := context.WithTimeout(context.TODO(), consts.DefaultDialTimeout)
610+
ctxDial, cancel := context.WithTimeout(ctx, consts.DefaultDialTimeout)
611611
defer cancel()
612612

613613
// get forwarder from cache
@@ -632,10 +632,6 @@ func (c *DnsController) dialSend(invokingDepth int, req *udpRequest, data []byte
632632
}
633633
}()
634634

635-
if err != nil {
636-
return err
637-
}
638-
639635
respMsg, err = forwarder.ForwardDNS(ctxDial, data)
640636
if err != nil {
641637
if c.timeoutExceedCallback != nil && isTimeoutError(err) {
@@ -644,7 +640,7 @@ func (c *DnsController) dialSend(invokingDepth int, req *udpRequest, data []byte
644640
if fallbackDialArgument := tcpFallbackDialArgument(upstream, dialArgument, err); fallbackDialArgument != nil {
645641
fallbackForwarder, fallbackErr := newDnsForwarder(upstream, *fallbackDialArgument)
646642
if fallbackErr != nil {
647-
return err
643+
return fmt.Errorf("tcp fallback forwarder creation failed: %w (original: %v)", fallbackErr, err)
648644
}
649645
defer fallbackForwarder.Close()
650646
respMsg, fallbackErr = fallbackForwarder.ForwardDNS(ctxDial, data)
@@ -702,7 +698,7 @@ func (c *DnsController) dialSend(invokingDepth int, req *udpRequest, data []byte
702698
"next_upstream": nextUpstream.String(),
703699
}).Traceln("Change DNS upstream and resend")
704700
}
705-
return c.dialSend(invokingDepth+1, req, data, id, nextUpstream, needResp)
701+
return c.dialSend(ctx, invokingDepth+1, req, data, id, nextUpstream, needResp)
706702
}
707703
if upstreamIndex.IsReserved() && c.log.IsLevelEnabled(logrus.InfoLevel) {
708704
var (

0 commit comments

Comments
 (0)