-
Notifications
You must be signed in to change notification settings - Fork 193
feat: add custom module implementing flight recorder #2334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds a new FlightRecorder module that records request traces to timestamped files when request latency exceeds a configured threshold, a dedicated Go executable entrypoint that imports the module for side effects, and accompanying README documentation. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Comment |
Router image scan passed✅ No security vulnerabilities found in image: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (4)
router/cmd/flightrecorder/README.md (1)
5-17: Replace hard tabs with spaces in code example.The code example uses hard tabs on lines 9, 10, 11, and 15, which should be replaced with spaces to align with standard Go formatting conventions.
Apply this diff:
package main import ( - routercmd "github.com/wundergraph/cosmo/router/cmd" - // Import your modules here - _ "github.com/wundergraph/cosmo/router/cmd/flightrecorder/module" + routercmd "github.com/wundergraph/cosmo/router/cmd" + // Import your modules here + _ "github.com/wundergraph/cosmo/router/cmd/flightrecorder/module" ) func main() { - routercmd.Main() + routercmd.Main() }router/cmd/flightrecorder/module/module.go (3)
29-29: Remove template comment.This comment appears to be leftover from scaffolding and should be removed.
Apply this diff:
requestLatencyRecordThresholdDuration time.Duration - // Add a new property here fl *trace.FlightRecorder
100-100: Fix error message format.The error message uses a format string with
%wbut should rely onzap.Errorfor structured logging.Apply this diff:
- m.Logger.Error("failed to create trace file: %w", zap.Error(err)) + m.Logger.Error("failed to create trace file", zap.Error(err))
92-113: Add file cleanup mechanism and sanitize operation names.Two operational concerns:
No file cleanup/rotation: Trace files accumulate indefinitely in the output directory, which could eventually exhaust disk space. Consider implementing a retention policy (e.g., delete files older than N days, or keep only the last N files).
Unsanitized operation names in filenames: Operation names are used directly in filenames (line 94), but may contain characters that are invalid for filenames (
/,\,:,*,?,",<,>,|). This could causeos.Createto fail.Example sanitization approach:
import ( "regexp" // ... other imports ) var invalidFilenameChars = regexp.MustCompile(`[<>:"/\\|?*]`) func sanitizeOperationName(name string) string { // Replace invalid characters with underscores sanitized := invalidFilenameChars.ReplaceAllString(name, "_") // Limit length to avoid filesystem limits if len(sanitized) > 100 { sanitized = sanitized[:100] } return sanitized } func (m *FlightRecorder) RecordTrace(operationName string) { sanitized := sanitizeOperationName(operationName) filename := fmt.Sprintf("trace-%s-%s.out", sanitized, time.Now().Format("2006-01-02-15-04-05")) // ... rest of the method }For file cleanup, consider adding a cleanup routine in the Provision method or as a background goroutine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
router/cmd/flightrecorder/README.md(1 hunks)router/cmd/flightrecorder/main.go(1 hunks)router/cmd/flightrecorder/module/module.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
router/cmd/flightrecorder/main.go (1)
router/cmd/main.go (1)
Main(42-277)
router/cmd/flightrecorder/module/module.go (2)
router/core/modules.go (7)
RegisterModule(56-73)ModuleContext(160-164)Module(52-54)ModuleInfo(44-50)RouterOnRequestHandler(113-115)Provisioner(149-152)Cleaner(154-157)router/core/context.go (1)
RequestContext(61-142)
🪛 markdownlint-cli2 (0.18.1)
router/cmd/flightrecorder/README.md
9-9: Hard tabs
Column: 1
(MD010, no-hard-tabs)
10-10: Hard tabs
Column: 1
(MD010, no-hard-tabs)
11-11: Hard tabs
Column: 1
(MD010, no-hard-tabs)
15-15: Hard tabs
Column: 1
(MD010, no-hard-tabs)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: build-router
- GitHub Check: build_push_image
- GitHub Check: build_push_image (nonroot)
- GitHub Check: image_scan
- GitHub Check: image_scan (nonroot)
- GitHub Check: build_test
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: integration_test (./events)
- GitHub Check: integration_test (./telemetry)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
🔇 Additional comments (1)
router/cmd/flightrecorder/main.go (1)
1-10: LGTM!The entrypoint correctly delegates to the router's main function and imports the module for side-effect registration. This is the standard pattern for module initialization.
67237dd to
e0770d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
router/cmd/flightrecorder/README.md (2)
21-21: Clarify which configuration file contains the modules section.Line 21 references "the main router configuration file," but doesn't specify the filename (e.g.,
config.yaml,router.yaml). Provide the explicit filename to reduce ambiguity for users.
45-45: Provide more context for.env.examplesetup.The note mentions copying
.env.examplebut doesn't explain what environment variables are needed or what the router should connect to. Either expand this section with required env vars or reference the main router documentation where this is covered.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
router/cmd/flightrecorder/README.md(1 hunks)router/cmd/flightrecorder/main.go(1 hunks)router/cmd/flightrecorder/module/module.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- router/cmd/flightrecorder/main.go
- router/cmd/flightrecorder/module/module.go
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
router/cmd/flightrecorder/README.md
9-9: Hard tabs
Column: 1
(MD010, no-hard-tabs)
10-10: Hard tabs
Column: 1
(MD010, no-hard-tabs)
11-11: Hard tabs
Column: 1
(MD010, no-hard-tabs)
15-15: Hard tabs
Column: 1
(MD010, no-hard-tabs)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-router
- GitHub Check: integration_test (./telemetry)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: integration_test (./events)
- GitHub Check: build_test
- GitHub Check: build_push_image (nonroot)
- GitHub Check: build_push_image
- GitHub Check: Analyze (go)
e0770d1 to
de3f5eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
router/cmd/flightrecorder/README.md (2)
9-16: Replace hard tabs with spaces in the Go code block.The fenced Go example still uses hard tabs on the import and
routercmd.Main()lines, which violates the repo’s markdownlint rule (MD010) and can render inconsistently. Please replace those tabs with the project’s standard indentation (e.g., 4 spaces).
3-3: Clarify wording: reference theruntime/traceFlightRecorder API, not a standalone module.The text calls this the "new
flightrecordermodule from Go 1.25". To match Go’s terminology, this should point to the FlightRecorder API in theruntime/tracepackage (the blog link is fine), e.g., “…uses the newFlightRecorderAPI in theruntime/tracepackage (Go 1.25)…”.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
router/cmd/flightrecorder/README.md(1 hunks)router/cmd/flightrecorder/main.go(1 hunks)router/cmd/flightrecorder/module/module.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- router/cmd/flightrecorder/main.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router/cmd/flightrecorder/module/module.gorouter/cmd/flightrecorder/README.md
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.
Applied to files:
router/cmd/flightrecorder/README.md
🧬 Code graph analysis (1)
router/cmd/flightrecorder/module/module.go (2)
router/core/modules.go (7)
RegisterModule(56-73)ModuleContext(160-164)Module(52-54)ModuleInfo(44-50)RouterOnRequestHandler(113-115)Provisioner(149-152)Cleaner(154-157)router/core/context.go (1)
RequestContext(61-142)
🪛 markdownlint-cli2 (0.18.1)
router/cmd/flightrecorder/README.md
9-9: Hard tabs
Column: 1
(MD010, no-hard-tabs)
10-10: Hard tabs
Column: 1
(MD010, no-hard-tabs)
11-11: Hard tabs
Column: 1
(MD010, no-hard-tabs)
15-15: Hard tabs
Column: 1
(MD010, no-hard-tabs)
🔇 Additional comments (1)
router/cmd/flightrecorder/module/module.go (1)
83-97: Router hook logic looks sound.The request hook wraps
next.ServeHTTP, measures duration, and only records whenEnabled()is true and the latency threshold is exceeded. This is a clean, low-overhead integration point and aligns with the module’s purpose.
StarpTech
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
a12a5b4 to
c438034
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
router/cmd/flightrecorder/README.md (1)
5-17: Replace hard tabs with spaces in the code example.The code block contains hard tab characters that violate Markdown linting standards and may render inconsistently across viewers.
Apply this diff to replace tabs with spaces:
```go package main import ( - routercmd "github.com/wundergraph/cosmo/router/cmd" - // Import your modules here - _ "github.com/wundergraph/cosmo/router/cmd/flightrecorder/module" + routercmd "github.com/wundergraph/cosmo/router/cmd" + // Import your modules here + _ "github.com/wundergraph/cosmo/router/cmd/flightrecorder/module" ) func main() { - routercmd.Main() + routercmd.Main() }</blockquote></details> </blockquote></details> <details> <summary>🧹 Nitpick comments (2)</summary><blockquote> <details> <summary>router/cmd/flightrecorder/module/module.go (2)</summary><blockquote> `54-68`: **Clarify MaxBytes calculation logic.** The comment states "10MB/s of MinAge" but the implementation only scales MaxBytes when the threshold exceeds 512ms. For thresholds ≤512ms, a 2-second MinAge gets only the 10MB minimum, which is 5MB/s—not the stated 10MB/s. Consider either: 1. Updating the comment to reflect the actual behavior (10MB minimum with scaling for larger thresholds) 2. Adjusting the formula to consistently provide ~10MB per second of MinAge Current behavior: - Threshold 100ms → MinAge 200ms → MaxBytes 10MB (50MB/s, way above target) - Threshold 500ms → MinAge 1000ms → MaxBytes 10MB (10MB/s, matches target) - Threshold 1000ms → MinAge 2000ms → MaxBytes ~19.5MB (~9.75MB/s, close to target) Example for consistent 10MB/s: ```diff - // 10MB minimum - var maxBytes uint64 = 10 * 1024 * 1024 - - // We actually want ~10MB/s of MinAge - // 1000ms = 1 second, 1000 is close enough to 1024 - // sub in the uint milliseconds count for one of the factors - // if it would result in a value greater than default maxBytes - if m.RequestLatencyRecordThreshold*2 > 1024 { - maxBytes = (m.RequestLatencyRecordThreshold * 2) * 1024 * 10 - } + // Target: 10MB per second of MinAge (doubled threshold) + // MinAge is in milliseconds: (threshold * 2) ms + // 10MB/s = 10MB per 1000ms + // maxBytes = (threshold * 2) * 10MB / 1000 + maxBytes := (m.RequestLatencyRecordThreshold * 2 * 10 * 1024 * 1024) / 1000 + // Enforce 10MB minimum + if maxBytes < 10*1024*1024 { + maxBytes = 10 * 1024 * 1024 + }
99-119: Verify RecordMultiple behavior with concurrent requests.When
RecordMultipleis false, line 117 stops the recorder after writing a trace. However, if multiple slow requests trigger concurrently before the first Stop() completes, multiple RecordTrace calls may execute, potentially recording multiple traces despite the intent to record only once.Combined with the concurrency issue flagged in RouterOnRequest, consider:
- Using the mutex suggested in the previous comment to serialize all RecordTrace calls
- Adding a flag to track whether a trace has been recorded when
RecordMultipleis falseExample:
type FlightRecorder struct { OutputPath string `mapstructure:"outputPath"` RecordMultiple bool `mapstructure:"recordMultiple"` RequestLatencyRecordThreshold uint64 `mapstructure:"requestLatencyRecordThreshold"` requestLatencyRecordThresholdDuration time.Duration fl *trace.FlightRecorder + recordMu sync.Mutex + recorded bool Logger *zap.Logger }Then in RecordTrace:
func (m *FlightRecorder) RecordTrace(operationName string) { + if !m.RecordMultiple && m.recorded { + return + } + // Generate timestamped filename filename := fmt.Sprintf("trace-%s-%s.out", operationName, time.Now().Format("2006-01-02-15-04-05")) // Create the file file, err := os.Create(filepath.Join(m.OutputPath, filename)) if err != nil { m.Logger.Error("failed to create trace file", zap.Error(err)) return } defer file.Close() _, err = m.fl.WriteTo(file) if err != nil { m.Logger.Error("Failed to record request", zap.Error(err)) + return } + m.recorded = true + if !m.RecordMultiple { m.fl.Stop() } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
router/cmd/flightrecorder/README.md(1 hunks)router/cmd/flightrecorder/main.go(1 hunks)router/cmd/flightrecorder/module/module.go(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.
Applied to files:
router/cmd/flightrecorder/README.md
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.
Applied to files:
router/cmd/flightrecorder/README.mdrouter/cmd/flightrecorder/module/module.go
🧬 Code graph analysis (2)
router/cmd/flightrecorder/main.go (1)
router/cmd/main.go (1)
Main(42-277)
router/cmd/flightrecorder/module/module.go (1)
router/core/modules.go (7)
RegisterModule(56-73)ModuleContext(160-164)Module(52-54)ModuleInfo(44-50)RouterOnRequestHandler(113-115)Provisioner(149-152)Cleaner(154-157)
🪛 markdownlint-cli2 (0.18.1)
router/cmd/flightrecorder/README.md
9-9: Hard tabs
Column: 1
(MD010, no-hard-tabs)
10-10: Hard tabs
Column: 1
(MD010, no-hard-tabs)
11-11: Hard tabs
Column: 1
(MD010, no-hard-tabs)
15-15: Hard tabs
Column: 1
(MD010, no-hard-tabs)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build-router
- GitHub Check: build_test
- GitHub Check: integration_test (./telemetry)
- GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
- GitHub Check: Analyze (go)
🔇 Additional comments (4)
router/cmd/flightrecorder/main.go (1)
1-10: LGTM! Clean entrypoint implementation.The implementation correctly uses a side-effect import to register the FlightRecorder module and delegates to the existing router main function. This is a clean pattern for extending the router with custom modules.
router/cmd/flightrecorder/module/module.go (3)
1-31: LGTM! Clean module structure and registration.The module structure follows the core module system conventions correctly with proper registration in init() and appropriate field definitions.
77-81: LGTM! Proper cleanup implementation.The Cleanup method correctly stops the flight recorder when the module shuts down.
121-136: LGTM! Proper module metadata and interface guards.The Module() method correctly returns the module metadata, and the interface guards ensure compile-time verification of interface implementation.
Summary by CodeRabbit
New Features
Configuration
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.
Checklist