Conversation

@gregns1 gregns1 commented Nov 10, 2025

CBG-4714

  • Collects a stack trace of all running goroutines when the process receives SIGUSR1
  • Adds a REST API endpoint that collects a stack trace of all running goroutines
  • Updates sgcollect to collect the resulting stack trace files
  • sgcollect also calls the endpoint to capture a stack trace while it runs
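
For readers unfamiliar with the pattern, the signal-triggered dump in the first bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `allGoroutineStacks` and `dumpOnSIGUSR1` are hypothetical names, and the sketch writes to an `io.Writer` rather than to a rotated file.

```go
//go:build !windows

package main

import (
	"fmt"
	"io"
	"os"
	"os/signal"
	"runtime"
	"syscall"
)

// allGoroutineStacks (hypothetical helper) returns the stack traces of every
// running goroutine. runtime.Stack truncates its output to the buffer size,
// so the buffer is doubled until the whole dump fits.
func allGoroutineStacks() []byte {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true)
		if n < len(buf) {
			return buf[:n]
		}
		buf = make([]byte, 2*len(buf))
	}
}

// dumpOnSIGUSR1 (hypothetical helper) writes all goroutine stacks to w each
// time the process receives SIGUSR1, then notifies on dumped if it is non-nil.
func dumpOnSIGUSR1(w io.Writer, dumped chan<- struct{}) {
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGUSR1)
	go func() {
		for range sigCh {
			_, _ = w.Write(allGoroutineStacks())
			if dumped != nil {
				dumped <- struct{}{}
			}
		}
	}()
}

func main() {
	dumped := make(chan struct{}, 1)
	dumpOnSIGUSR1(os.Stderr, dumped)

	// Send the signal to ourselves so the sketch terminates on its own;
	// in practice an operator would run `kill -USR1 <pid>`.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR1); err != nil {
		panic(err)
	}
	<-dumped
	fmt.Println("stack trace written to stderr")
}
```

The build constraint mirrors the `!windows` restriction that appears later in this conversation, since `SIGUSR1` is not available on Windows.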

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Dependencies (if applicable)

  • Link upstream PRs
  • Update Go module dependencies when merged

Integration Tests

@gregns1 gregns1 self-assigned this Nov 10, 2025
Copilot AI review requested due to automatic review settings November 10, 2025 11:37

Copilot AI left a comment

Pull Request Overview

This PR adds goroutine stack trace collection capabilities to Sync Gateway, triggered both by SIGUSR1 signal and via a REST API endpoint. The implementation includes automatic rotation of stack trace files (keeping the 10 most recent), integration with sgcollect for diagnostics, and refactoring of profile rotation logic into shared utility functions.

Key changes:

  • Introduces /_debug/stacktrace REST endpoint and SIGUSR1 signal handler for on-demand stack trace collection
  • Refactors profile rotation logic into reusable base.RotateProfilesIfNeeded() function
  • Updates sgcollect.py to collect stack trace files via the new endpoint
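
As a rough illustration of what an on-demand endpoint of this kind can look like, the sketch below serves a plain-text goroutine dump over HTTP. Only the path `/_debug/stacktrace` comes from this PR; `stackTraceHandler` and `fetchStackTrace` are hypothetical names, and the real handler may differ (for example in authentication, HTTP method, or writing to a file).

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"runtime/pprof"
)

// stackTraceHandler (hypothetical) writes a plain-text dump of all goroutine
// stacks to the response body.
func stackTraceHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	// debug=2 prints every goroutine's stack in the format Go uses when a
	// program dies from an unrecovered panic.
	if err := pprof.Lookup("goroutine").WriteTo(w, 2); err != nil {
		// Best effort only: the status line may already have been sent.
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// fetchStackTrace (hypothetical) starts a throwaway test server, requests the
// endpoint once, and returns the status code and body.
func fetchStackTrace() (int, string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/_debug/stacktrace", stackTraceHandler)
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/_debug/stacktrace")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	status, body, err := fetchStackTrace()
	if err != nil {
		panic(err)
	}
	fmt.Printf("status=%d, %d bytes of stack trace\n", status, len(body))
}
```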

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| rest/server_context.go | Implements signal handler registration and stack trace logging with file rotation |
| rest/api.go | Adds REST endpoint handler for stack trace collection |
| rest/routing.go | Registers the new /_debug/stacktrace endpoint |
| rest/stats_context.go | Refactors memory profile collection to use shared rotation utility |
| base/util.go | Adds utility functions for stack trace capture, profile rotation, and file creation |
| tools/sgcollect.py | Adds stack trace collection task and includes it in the collection workflow |
| rest/adminapitest/admin_api_test.go | Adds test coverage for the stack trace endpoint |
| rest/server_context_test.go | Adds test coverage for stack trace file collection and rotation |
| tools-tests/sgcollect_info_test.py | Adds test coverage for sgcollect stack trace file collection |

github-actions bot commented Nov 10, 2025

Redocly previews

// stack trace signal received
currentTime := time.Now()
timestamp := currentTime.Format(time.RFC3339)
sc.logStackTraces(ctx, timestamp)
Collaborator

I think it would be good to log that we are writing the stack trace to stderr, and also to log the stack trace through the usual InfofCtx logging so that it gets picked up if someone isn't capturing stderr output.

Contributor Author

I have logging to indicate that a stack trace is being collected, but I feel that logging to stderr, logging at Info level, and writing to the file is overkill. I have left out the Info-level logging, given that we already write the trace to a file for sgcollect to collect.

@gregns1 gregns1 assigned torcolvin and unassigned gregns1 Nov 11, 2025
name="Collect stack trace via http client",
auth_headers=auth_headers,
url=stack_trace_url,
log_file="sg_stack_trace.log",
Collaborator

Suggested change
- log_file="sg_stack_trace.log",
+ log_file="goroutines.log",

I don't care what we call this but sg_stack_trace.log will mean that this output overwrites the output from the signal handler, so it needs to be a different name.

Basically this is just a plain text representation of pprof_goroutines.pb.gz so I think that would be a reasonable name.

Contributor Author

I will rename all of the files to make them goroutine-specific, but this doesn't overwrite the signal files: the signal files have a timestamp in the file name, whereas the file from the sgcollect HTTP call does not.

Collaborator

Oh, that's fine then.

torcolvin previously approved these changes Nov 13, 2025
@@ -0,0 +1,60 @@
//go:build !windows
// +build !windows
Collaborator

I think this line is actually unnecessary and will be flagged by go vet in Go 1.25, but I'm not sure, so we can leave it.

Contributor Author

I use Go 1.25 locally and my linter didn't pick that up, so it should be okay for now (I think).
