Unify BuilderNet mkosi recipe#360
Conversation
|
Claude finished @canercidam's task in 3m 10s —— View job PR #360 Review
SummaryThis PR restructures custom recipes to a Issues FoundBug —
You likely need Bug — Index out of bounds in
Behavioral change — Post-hook timing (inline comment) Post-hooks now execute in Code duplication — The standalone API inconsistency — Two
Minor Notes
|
| beacon.ReplaceArgs(map[string]string{ | ||
| "--target-peers": "1", | ||
| "--subscribe-all-subnets": "true", | ||
| }) |
There was a problem hiding this comment.
Bug: --subscribe-all-subnets was removed from LighthouseBeaconNode.Apply() in components.go (this PR), so ReplaceArgs will not find it. Even if it were still present, --subscribe-all-subnets is a boolean flag in Lighthouse (it doesn't take a separate value argument), so the ReplaceArgs approach of replacing args[i+1] would corrupt the next unrelated argument.
You likely need to use WithArgs("--subscribe-all-subnets") to add the flag back, and only use ReplaceArgs for --target-peers which does take a value.
| func applyReplaceArgs(args, replaceArgs []string) []string { | ||
| if len(replaceArgs)%2 != 0 { | ||
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | ||
| } | ||
|
|
||
| result := make([]string, len(args)) | ||
| copy(result, args) | ||
|
|
||
| for i := 0; i < len(replaceArgs); i += 2 { | ||
| flag := replaceArgs[i] | ||
| newValue := replaceArgs[i+1] | ||
| result = applyReplacePair(flag, newValue, result) | ||
| } |
There was a problem hiding this comment.
Bug: Index out of bounds panic when replaceArgs has an odd length. The warning on line 469 fires but execution continues. At i = len(replaceArgs)-1 (the last odd element), replaceArgs[i+1] will panic with an index out of range.
After the warning, you should return early or truncate the iteration bound:
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| result = applyReplacePair(flag, newValue, result) | |
| } | |
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value, ignoring last unpaired element", "count", len(replaceArgs)) | |
| replaceArgs = replaceArgs[:len(replaceArgs)-1] | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| result = applyReplacePair(flag, newValue, result) | |
| } | |
| return result | |
| } |
playground/local_runner.go
Outdated
| // run post hook operations | ||
| if err := d.manifest.ExecutePostHookActions(); err != nil { | ||
| return fmt.Errorf("failed to execute post-hook operations: %w", err) | ||
| } |
There was a problem hiding this comment.
Moving post-hook execution here changes the timing: it now runs after docker compose up -d but before services are healthy (the WaitForReady call happens in main.go after Run() returns). Previously it was called after WaitForReady.
If any post-hook depends on services being fully ready (e.g., the registerBuilder hook that calls the beacon API and reth API), this will fail because those services may not be accepting connections yet.
Is this intentional? If so, consider adding a comment explaining why the timing change is safe.
| // ReplaceArgs replaces argument values in the service's Args. | ||
| // The replacements map contains flag -> new_value pairs. | ||
| // For each flag found in Args, the following value is replaced. | ||
| func (s *Service) ReplaceArgs(replacements map[string]string) *Service { | ||
| for i := 0; i < len(s.Args); i++ { | ||
| if newValue, ok := replacements[s.Args[i]]; ok && i+1 < len(s.Args) { | ||
| s.Args[i+1] = newValue | ||
| } | ||
| } | ||
| return s | ||
| } |
There was a problem hiding this comment.
Nit: Service.ReplaceArgs (takes a map[string]string) has a different signature than the YAML-layer applyReplaceArgs (takes a []string of alternating pairs). Two different interfaces for the same concept could be confusing. Consider unifying them or clearly documenting why they differ.
Also, if a flag appears as the last element of Args (no value follows), the i+1 < len(s.Args) check silently skips it without any warning, which could mask misconfiguration.
…358) This PR changes the definitions of the custom recipes we include in the playground CLI binary. We're switching from ``` custom-recipes/ rbuilder/ bin.yaml custom.yaml rbuilder.toml ``` to ``` custom-recipes/ rbuilder/ bin/ playground.yaml rbuilder.toml custom/ playground.yaml rbuilder.toml ``` This separates extra files (e.g. `rbuilder.toml`) for each recipe variant and enables `rbuilder/container` and `buildernet/qemu` custom recipes.
- Add `rbuilder/container` custom recipe (currently runs on Mac) - Remove `rbuilder/custom` in favor of reusing `rbuilder/bin` with a suggestion --------- Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
This allows definining custom services like:
```yaml
recipe:
my-component:
services:
my-service:
lifecycle_hooks: true
init:
- cmd1
- cmd2
start: cmd3
stop:
- cmd4
- cmd5
```
This is useful in bringing in external VM handling for the buildernet
recipe.
|
Claude finished @canercidam's task in 3m 25s —— View job PR #360 Review
SummaryThis PR adds lifecycle hooks for custom services (init/start/stop commands), restructures custom recipes to The lifecycle hooks feature is well-designed with good test coverage. The directory restructuring is clean. Several issues from the previous review remain unaddressed. IssuesBug —
Bug — Index out of bounds in
Race condition — lifecycle start command (inline comment)
Behavioral change — Post-hook timing (inline comment)
Code duplication — The standalone API inconsistency — Two
Minor Notes
|
| } | ||
|
|
||
| result := make([]string, len(args)) | ||
| copy(result, args) | ||
|
|
||
| for i := 0; i < len(replaceArgs); i += 2 { | ||
| flag := replaceArgs[i] | ||
| newValue := replaceArgs[i+1] |
There was a problem hiding this comment.
Bug (still present from previous review): Index out of bounds panic when replaceArgs has an odd length. The warning on line 491 fires but execution continues. At i = len(replaceArgs)-1, replaceArgs[i+1] panics.
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| replaceArgs = replaceArgs[:len(replaceArgs)-1] | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] |
| beacon.ReplaceArgs(map[string]string{ | ||
| "--target-peers": "1", | ||
| "--subscribe-all-subnets": "true", | ||
| }) |
There was a problem hiding this comment.
Bug (still present from previous review): --subscribe-all-subnets was removed from LighthouseBeaconNode.Apply() in this PR, so ReplaceArgs won't find it. Even if it were present, it's a boolean flag (no value argument), so ReplaceArgs replacing args[i+1] is semantically wrong.
This should be:
beacon.ReplaceArgs(map[string]string{
"--target-peers": "1",
})
beacon.WithArgs("--subscribe-all-subnets")| startCmd := lc.newCmd(ctx, svc.Start) | ||
| go func() { | ||
| if err := startCmd.Run(); err != nil { | ||
| if mainctx.IsExiting() { | ||
| return | ||
| } | ||
| slog.Error("Lifecycle service failed", "service", svc.Name, "error", err) | ||
| d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err))) | ||
| } | ||
| }() | ||
|
|
||
| d.handlesMu.Lock() | ||
| defer d.handlesMu.Unlock() | ||
| d.handles = append(d.handles, startCmd) |
There was a problem hiding this comment.
Race condition: The goroutine on line 106 calls startCmd.Run() (which internally calls Start() then Wait()). The command is appended to d.handles here, after the goroutine is launched. If the process exits very quickly (or fails to start), sendExitError could fire before the handle is tracked.
Consider starting the command explicitly before the goroutine so you have a guaranteed running process:
if err := startCmd.Start(); err != nil {
return fmt.Errorf("%s", lc.formatError("start", svc.Start, err))
}
d.handlesMu.Lock()
d.handles = append(d.handles, startCmd)
d.handlesMu.Unlock()
go func() {
if err := startCmd.Wait(); err != nil {
...
}
}()This also avoids the window where stopAllProcessesWithSignal could find a handle whose Process is still nil.
playground/local_runner.go
Outdated
| // run post hook operations | ||
| if err := d.manifest.ExecutePostHookActions(); err != nil { | ||
| return fmt.Errorf("failed to execute post-hook operations: %w", err) |
There was a problem hiding this comment.
Behavioral change (still present from previous review): Post-hooks now execute after docker compose up -d but before WaitForReady (which is called in main.go after Run() returns). The registerBuilder hook calls the beacon API and reth API — these may not be accepting connections yet.
If lifecycle host services (like the VM) need the post-hook to have run before they start, this ordering makes sense. But it's worth adding a comment explaining the intentional timing, since the previous code ran post-hooks after services were healthy.
|
Claude finished @canercidam's task in 3m 13s —— View job PR #360 Review
SummaryThis PR restructures custom recipes to a The latest commit (
Remaining IssuesBug — Index out of bounds in
Race condition — lifecycle start command (inline comment)
Code duplication — The standalone Bug —
func (r *RunnerConfig) AddCallback(c Callback) {
r.Callbacks = append(r.Callbacks, c)
}API inconsistency — Two
Minor Notes
|
| func applyReplaceArgs(args, replaceArgs []string) []string { | ||
| if len(replaceArgs)%2 != 0 { | ||
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | ||
| } | ||
|
|
||
| result := make([]string, len(args)) | ||
| copy(result, args) | ||
|
|
||
| for i := 0; i < len(replaceArgs); i += 2 { | ||
| flag := replaceArgs[i] | ||
| newValue := replaceArgs[i+1] | ||
| result = applyReplacePair(flag, newValue, result) | ||
| } |
There was a problem hiding this comment.
Bug (persists from previous reviews): Index out of bounds panic when replaceArgs has an odd length. The warning on line 492 fires but execution continues. At the last odd index, replaceArgs[i+1] on line 500 panics.
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| result = applyReplacePair(flag, newValue, result) | |
| } | |
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value, ignoring last unpaired element", "count", len(replaceArgs)) | |
| replaceArgs = replaceArgs[:len(replaceArgs)-1] | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| result = applyReplacePair(flag, newValue, result) | |
| } | |
| return result | |
| } |
| startCmd := lc.newCmd(ctx, svc.Start) | ||
| go func() { | ||
| if err := startCmd.Run(); err != nil { | ||
| if mainctx.IsExiting() { | ||
| return | ||
| } | ||
| slog.Error("Lifecycle service failed", "service", svc.Name, "error", err) | ||
| d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err))) | ||
| } | ||
| }() | ||
|
|
||
| d.handlesMu.Lock() | ||
| defer d.handlesMu.Unlock() | ||
| d.handles = append(d.handles, startCmd) | ||
| return nil |
There was a problem hiding this comment.
Race condition (persists from previous reviews): startCmd.Run() (which internally calls Start() then Wait()) is launched in a goroutine. The handle is appended to d.handles only afterward on line 118. If the process exits very quickly, sendExitError could fire before the handle is tracked, and stopAllProcessesWithSignal could find a handle whose Process is still nil.
Consider using Start() explicitly, appending the handle, then Wait() in the goroutine:
if err := startCmd.Start(); err != nil {
return fmt.Errorf("%s", lc.formatError("start", svc.Start, err))
}
d.handlesMu.Lock()
d.handles = append(d.handles, startCmd)
d.handlesMu.Unlock()
go func() {
if err := startCmd.Wait(); err != nil {
if mainctx.IsExiting() {
return
}
slog.Error("Lifecycle service failed", "service", svc.Name, "error", err)
d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err)))
}
}()
return nil|
Claude finished @canercidam's task in 3m 41s —— View job PR #360 Review
SummaryThis PR restructures custom recipes to a Compared to previous reviews, several issues have been fixed:
Remaining IssuesBug — Index out of bounds in
Race condition — lifecycle start command (inline comment)
Code duplication — The standalone Bug —
API inconsistency — Two
Duplicate validation block in
Minor Notes
|
This avoids a chicken-egg situation when there are cross-dependencies.
This allows replacing flags like: ```yaml - "--foo" - "bar" ``` with: ```yaml replace_args: - "--foo" - "baz" ``` without redefining the entire flag set. It is not used in case the custom recipe prefers definining the full set of args with `args`. (This was considered for supporting the buildernet recipe but is no longer necessary for that. It is still a potentially useful feature.)
|
Claude finished @canercidam's task in 2m 38s —— View job PR #360 Review
SummaryThis PR restructures custom recipes to a Several issues from previous reviews have been fixed:
Remaining IssuesBug — Index out of bounds in
Race condition — lifecycle start command (inline comment)
Code duplication — The standalone Bug —
Bug — Duplicate validation block in
API inconsistency — Two
Minor Notes
|
|
Claude finished @canercidam's task in 2m 16s —— View job PR #360 Review
SummaryThis PR restructures custom recipes to a Several issues from previous reviews have been fixed:
Remaining IssuesBug — Index out of bounds in
Race condition — lifecycle start command (inline comment)
Code duplication — The standalone Bug —
Duplicate validation block in
API inconsistency — Two
Minor Notes
|
| func applyReplaceArgs(args, replaceArgs []string) []string { | ||
| if len(replaceArgs)%2 != 0 { | ||
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | ||
| } | ||
|
|
||
| result := make([]string, len(args)) | ||
| copy(result, args) | ||
|
|
||
| for i := 0; i < len(replaceArgs); i += 2 { | ||
| flag := replaceArgs[i] | ||
| newValue := replaceArgs[i+1] |
There was a problem hiding this comment.
Bug (persists from previous reviews): Index out of bounds panic when replaceArgs has an odd length. The warning on line 493 fires but execution continues. At the last odd index, replaceArgs[i+1] panics.
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value, ignoring last unpaired element", "count", len(replaceArgs)) | |
| replaceArgs = replaceArgs[:len(replaceArgs)-1] | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] |
| startCmd := lc.newCmd(ctx, svc.Start) | ||
| go func() { | ||
| if err := startCmd.Run(); err != nil { | ||
| if mainctx.IsExiting() { | ||
| return | ||
| } | ||
| slog.Error("Lifecycle service failed", "service", svc.Name, "error", err) | ||
| d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err))) | ||
| } | ||
| }() | ||
|
|
||
| d.handlesMu.Lock() | ||
| defer d.handlesMu.Unlock() | ||
| d.handles = append(d.handles, startCmd) | ||
| return nil |
There was a problem hiding this comment.
Race condition (persists from previous reviews): startCmd.Run() (which internally calls Start() then Wait()) is launched in a goroutine. The handle is appended to d.handles only on line 124, after the goroutine is launched. If the process exits very quickly, sendExitError fires before the handle is tracked, and stopAllProcessesWithSignal could encounter a handle whose Process is still nil.
The same pattern exists in local_runner.go:959-980 for the regular runOnHost path.
Use Start() explicitly, append the handle, then Wait() in the goroutine:
| startCmd := lc.newCmd(ctx, svc.Start) | |
| go func() { | |
| if err := startCmd.Run(); err != nil { | |
| if mainctx.IsExiting() { | |
| return | |
| } | |
| slog.Error("Lifecycle service failed", "service", svc.Name, "error", err) | |
| d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err))) | |
| } | |
| }() | |
| d.handlesMu.Lock() | |
| defer d.handlesMu.Unlock() | |
| d.handles = append(d.handles, startCmd) | |
| return nil | |
| startCmd := lc.newCmd(ctx, svc.Start) | |
| if err := startCmd.Start(); err != nil { | |
| return fmt.Errorf("%s", lc.formatError("start", svc.Start, err)) | |
| } | |
| d.handlesMu.Lock() | |
| d.handles = append(d.handles, startCmd) | |
| d.handlesMu.Unlock() | |
| go func() { | |
| if err := startCmd.Wait(); err != nil { | |
| if mainctx.IsExiting() { | |
| return | |
| } | |
| slog.Error("Lifecycle service failed", "service", svc.Name, "error", err) | |
| d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err))) | |
| } | |
| }() | |
| return nil |
| func applyReplaceArgs(args, replaceArgs []string) []string { | ||
| if len(replaceArgs)%2 != 0 { | ||
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | ||
| } | ||
|
|
||
| result := make([]string, len(args)) | ||
| copy(result, args) | ||
|
|
||
| for i := 0; i < len(replaceArgs); i += 2 { | ||
| flag := replaceArgs[i] | ||
| newValue := replaceArgs[i+1] | ||
| result = applyReplacePair(flag, newValue, result) | ||
| } | ||
|
|
||
| return result |
There was a problem hiding this comment.
Bug (persists from previous reviews): Index out of bounds panic when replaceArgs has an odd length. The warning on line 493 fires but execution continues. At the last odd index, replaceArgs[i+1] on line 501 panics.
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value", "count", len(replaceArgs)) | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| result = applyReplacePair(flag, newValue, result) | |
| } | |
| return result | |
| func applyReplaceArgs(args, replaceArgs []string) []string { | |
| if len(replaceArgs)%2 != 0 { | |
| slog.Warn("replace_args should contain pairs of flag and value, ignoring last unpaired element", "count", len(replaceArgs)) | |
| replaceArgs = replaceArgs[:len(replaceArgs)-1] | |
| } | |
| result := make([]string, len(args)) | |
| copy(result, args) | |
| for i := 0; i < len(replaceArgs); i += 2 { | |
| flag := replaceArgs[i] | |
| newValue := replaceArgs[i+1] | |
| result = applyReplacePair(flag, newValue, result) | |
| } | |
| return result | |
| } |
| go func() { | ||
| if err := startCmd.Run(); err != nil { | ||
| if mainctx.IsExiting() { | ||
| return | ||
| } | ||
| slog.Error("Lifecycle service failed", "service", svc.Name, "error", err) | ||
| d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err))) | ||
| } | ||
| }() | ||
|
|
||
| d.handlesMu.Lock() | ||
| defer d.handlesMu.Unlock() | ||
| d.handles = append(d.handles, startCmd) | ||
| return nil |
There was a problem hiding this comment.
Race condition (persists from previous reviews): startCmd.Run() (which internally calls Start() then Wait()) is launched in a goroutine on line 113. The handle is appended to d.handles only on line 124, after the goroutine is launched. If the process exits very quickly, sendExitError fires before the handle is tracked. Additionally, stopAllProcessesWithSignal could encounter a handle whose Process is still nil (the stopProcessWithSignal helper polls for non-nil Process, but this still creates an unnecessary window).
The same pattern exists in local_runner.go:959-980 for the regular runOnHost path.
Consider using Start() explicitly, appending the handle, then Wait() in the goroutine:
if err := startCmd.Start(); err != nil {
return fmt.Errorf("%s", lc.formatError("start", svc.Start, err))
}
d.handlesMu.Lock()
d.handles = append(d.handles, startCmd)
d.handlesMu.Unlock()
go func() {
if err := startCmd.Wait(); err != nil {
if mainctx.IsExiting() {
return
}
slog.Error("Lifecycle service failed", "service", svc.Name, "error", err)
d.sendExitError(fmt.Errorf("%s", lc.formatError("start", svc.Start, err)))
}
}()
return nil
This PR tries to unify the BuilderNet recipe by basing itself on Fryd's PR (#302). It may contain changes from PRs awaiting to be merged and will shrink in size as they are merged.