This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
CockroachDB is a distributed SQL database written in Go. We use Bazel as a build
system but most operations are wrapped through the ./dev tool, which should be
preferred to direct go (build|test) or bazel invocations.
Building a package / package tests
This is useful as a compilation check.
# Invoke (but skip) all tests in that package, which
# implies that both package and its tests compile.
./dev test ./pkg/util/log -f -
# Build package ./pkg/util/log
./dev build ./pkg/util/log
Testing:
./dev test pkg/sql # run unit tests for SQL package (slow!)
./dev test pkg/sql -f=TestParse -v # run specific test pattern
Note that when filtering tests via -f, you should include the -v flag, which
will warn you in the output if your filter didn't match anything. Look
for "testing: warning: no tests to run" in the output.
See ./dev test --help for all options.
Building:
./dev build cockroach # build full cockroach binary
./dev build short # build cockroach without UI (faster)
Building a CockroachDB binary (even in short mode) should be considered slow. Avoid doing this unless necessary.
Use ./dev build --help for the entire list of artifacts that can
be built.
Code Generation and Linting:
Protocol buffers, SQL parser, SQL Optimizer rules and others rely on Go code
generated by ./dev generate <args>. This should be considered a slow command.
Rebuild only what is actually needed. ./dev (test|build) commands
automatically generate their dependencies, but do not lift them into the
worktree, i.e. if they need to be visible to you, you need to invoke the
appropriate ./dev generate command yourself.
./dev generate # generate all code (protobuf, parsers, etc.) - SLOW
./dev generate go # generate Go code only
./dev generate bazel # update BUILD.bazel files when dependencies change
./dev generate protobuf # generate protobuf files - relatively fast
See ./dev generate --help.
CockroachDB consists of many components and subsystems. The file .github/CODEOWNERS is a good starting point if the overall architecture is relevant to the task.
CockroachDB implements redactability to ensure sensitive information (PII, confidential data) is automatically removed or marked in log messages and error outputs. This enables customers to safely share logs with support teams.
Safe vs Unsafe Data:
- Safe data: Information certainly known to NOT be PII-laden (node IDs, range IDs, error codes)
- Unsafe data: Information potentially PII-laden or confidential (user data, SQL statements, keys)
Redactable Strings:
- Unsafe data is enclosed in Unicode markers: ‹unsafe_data›
- Safe data appears without markers
- Log entries show a special indicator: ⋮ (vertical ellipsis)
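To make the marker convention concrete, here is a toy sketch in plain Go. It does not use the real redact package: markUnsafe and stripUnsafe are hypothetical helpers invented for this illustration, and the ‹×› placeholder is an assumption about what a stripped span could look like. The sketch also ignores values that themselves contain marker characters.

```go
package main

import (
	"fmt"
	"regexp"
)

// markUnsafe is a hypothetical helper that encloses a value in the
// ‹…› markers, flagging it as potentially sensitive.
func markUnsafe(v string) string {
	return "‹" + v + "›"
}

// unsafeSpan matches one marked span, including the markers themselves.
var unsafeSpan = regexp.MustCompile(`‹[^›]*›`)

// stripUnsafe is a hypothetical helper that replaces every marked span
// with a fixed placeholder (assumed here to be ‹×›), leaving the safe
// parts of the line untouched.
func stripUnsafe(line string) string {
	return unsafeSpan.ReplaceAllString(line, "‹×›")
}

func main() {
	// The node and range IDs are safe; the SQL statement is not.
	line := "n1 executed " + markUnsafe("SELECT * FROM users") + " on r42"
	fmt.Println(line)
	fmt.Println(stripUnsafe(line))
}
```

The point of the convention is that the safe parts (n1, r42) survive stripping without anyone having to inspect the log line by hand.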
SafeValue Interface - For types that are always safe:
type NodeID int32
func (n NodeID) SafeValue() {} // always safe to log
// Interface verification pattern.
var _ redact.SafeValue = NodeID(0)
SafeFormatter Interface - For complex types mixing safe/unsafe data:
func (s *ComponentStats) SafeFormat(w redact.SafePrinter, _ rune) {
	w.Printf("ComponentStats{ID: %v", s.Component)
	// Use w.Printf(), w.SafeString(), w.SafeRune() to mark safe parts.
}
- redact.Safe(value) - Mark a value as safe
- redact.SafeString(s) - Mark string literal as safe
Prefer using SafeFormatter, which does not require the below check. If implementing SafeValue instead:
The linter in /pkg/testutils/lint/passes/redactcheck/redactcheck.go:
- Maintains allowlist of types permitted to implement SafeValue
- Validates RegisterSafeType calls
- Prevents accidental marking of sensitive types as safe
To add a new safe type:
- Implement SafeValue() method
- Add interface verification: var _ redact.SafeValue = TypeName{}
- Update redactcheck allowlist if needed
- /pkg/util/log/redact.go - Core redaction logic
- /docs/RFCS/20200427_log_file_redaction.md - Design RFC
- /pkg/testutils/lint/passes/redactcheck/redactcheck.go - Linter implementation
- /pkg/util/log/redact_test.go - Test examples and patterns
CockroachDB uses a custom code formatter called crlfmt that goes beyond standard Go formatting tools to enforce project-specific style guidelines.
crlfmt is CockroachDB's custom Go code formatter (not a wrapper around
standard tools) that enforces specific coding standards beyond what gofmt and
goimports provide. It's an external tool developed specifically for
CockroachDB's needs. Editor-integrated agents often don't need to invoke this,
since the editor does it automatically. Otherwise, the agent should invoke the
tool after each round of edits to ensure correct formatting.
Repository: github.com/cockroachdb/crlfmt
Line Length and Wrapping:
- Code: 100 columns, Comments: 80 columns
- Tab width: 2 characters
Function Signatures:
func (s *someType) myFunctionName(
	arg1 somepackage.SomeArgType, arg2 int, arg3 somepackage.SomeOtherType,
) (somepackage.SomeReturnType, error) {
	// ...
}
// One argument per line for long lists.
func (s *someType) myFunctionName(
	arg1 somepackage.SomeArgType,
	arg2 int,
	arg3 somepackage.SomeOtherType,
) (somepackage.SomeReturnType, error) {
	// ...
}
Basic Formatting:
# Format (in-place) with CockroachDB standard settings
crlfmt -w -tab 2 filename.go
# Check formatting without writing changes
crlfmt -diff filename.go
Note: crlfmt only accepts one filename at a time. To format multiple files, use xargs or a loop:
# Format multiple files with xargs
find pkg/sql -name "*.go" | xargs -n1 crlfmt -w
# Or use a loop
for f in pkg/sql/*.go; do crlfmt -w "$f"; done
CockroachDB follows specific Go coding conventions inspired by the Uber Go style guide with CockroachDB-specific modifications.
Pointers to Interfaces:
- Almost never need a pointer to an interface
- Pass interfaces as values—underlying data can still be a pointer
- Interface contains: type-specific information pointer + data pointer
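As a small illustration of the points above, the following sketch passes an interface by value while the underlying data is a pointer, so the callee's mutation is still visible to the caller. Counter, memCounter, and bump are hypothetical names made up for this example.

```go
package main

import "fmt"

// Counter is a hypothetical interface used only for this illustration.
type Counter interface {
	Inc()
	Value() int
}

// memCounter implements Counter with pointer receivers, so only
// *memCounter satisfies the interface.
type memCounter struct{ n int }

func (c *memCounter) Inc()       { c.n++ }
func (c *memCounter) Value() int { return c.n }

// bump takes the interface by value. The interface value carries a
// *memCounter, so the increment is applied to the caller's counter.
// A *Counter parameter would add nothing here.
func bump(c Counter) {
	c.Inc()
}

func main() {
	c := &memCounter{}
	bump(c)
	bump(c)
	fmt.Println(c.Value()) // 2
}
```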
Receivers and Interfaces:
type S struct {
	data string
}
// Value receiver - can be called on pointers and values.
func (s S) Read() string {
	return s.data
}
// Pointer receiver - needed to modify data.
func (s *S) Write(str string) {
	s.data = str
}
// Interface verification pattern.
var _ redact.SafeValue = NodeID(0)
Mutexes:
// Zero-value mutex is valid.
var mu sync.Mutex
mu.Lock()
// Embed mutex in struct (preferred for private types).
type smap struct {
	sync.Mutex
	data map[string]string
}
// Named field for exported types.
type SMap struct {
	mu   sync.Mutex
	data map[string]string
}
Defer for Cleanup:
// Always use defer for locks and cleanup.
func (m *SMap) Get(k string) string {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.data[k]
}
// For panic protection, use separate function.
func myFunc() error {
	doRiskyWork := func() error {
		p.Lock()
		defer p.Unlock()
		return somethingThatCanPanic()
	}
	return doRiskyWork()
}
Slice and Map Copying:
Comments should clarify ownership.
// SetTrips sets the driver's trips.
// Note that the slice is captured by reference, the
// caller should take care of preventing unwanted aliasing.
func (d *Driver) SetTrips(trips []Trip) { d.trips = trips }
or
// SetTrips sets the driver's trips. It does not hold on
// to the provided slice.
func (d *Driver) SetTrips(trips []Trip) {
	d.trips = make([]Trip, len(trips))
	copy(d.trips, trips)
}
// Snapshot returns a copy of the internal state.
func (s *Stats) Snapshot() map[string]int {
	s.Lock()
	defer s.Unlock()
	result := make(map[string]int, len(s.counters))
	for k, v := range s.counters {
		result[k] = v
	}
	return result
}
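To show why the ownership comment matters, here is a minimal runnable sketch contrasting the two SetTrips variants. The method names SetTripsShared and SetTripsCopied, and the Trip and Driver definitions, are placeholders invented so that both variants can coexist in one program.

```go
package main

import "fmt"

// Trip and Driver are placeholder types for this illustration.
type Trip struct{ ID int }

type Driver struct{ trips []Trip }

// SetTripsShared sets the driver's trips. The slice is captured by
// reference, so later writes by the caller are visible to the driver.
func (d *Driver) SetTripsShared(trips []Trip) { d.trips = trips }

// SetTripsCopied sets the driver's trips. It does not hold on to the
// provided slice.
func (d *Driver) SetTripsCopied(trips []Trip) {
	d.trips = make([]Trip, len(trips))
	copy(d.trips, trips)
}

func main() {
	trips := []Trip{{ID: 1}, {ID: 2}}
	var shared, copied Driver
	shared.SetTripsShared(trips)
	copied.SetTripsCopied(trips)

	// The caller mutates its slice after handing it over.
	trips[0].ID = 99

	fmt.Println(shared.trips[0].ID) // 99: aliased with the caller's slice
	fmt.Println(copied.trips[0].ID) // 1: isolated by the copy
}
```

Either behavior can be correct; the copy costs an allocation, which is why the comment on the method should say which one the caller gets.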
Many code paths are performance-sensitive and in particular heap allocations should be avoided. In all code paths, reasonably performant code should be written, as long as complexity does not significantly increase as a result. Some examples of this follow.
String Conversion:
// strconv is faster than fmt for primitives.
s := strconv.Itoa(rand.Int()) // good
s := fmt.Sprint(rand.Int()) // slower
String-to-Byte Conversion:
// Avoid repeated conversion.
data := []byte("Hello world") // good - once
for i := 0; i < b.N; i++ {
	w.Write(data)
}
// Bad - repeated allocation.
for i := 0; i < b.N; i++ {
	w.Write([]byte("Hello world"))
}
Channels:
- Size should be one or unbuffered (zero)
- Any other size requires high scrutiny
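The two accepted sizes tend to serve different purposes. The sketch below is one possible illustration, not a prescribed pattern: an unbuffered channel for a synchronous hand-off, and a size-one channel as a wake-up signal where repeated notifications coalesce. The notify helper is a hypothetical name.

```go
package main

import "fmt"

// notify performs a non-blocking send on a size-one channel. It reports
// whether the signal was queued; false means a signal was already pending.
func notify(ch chan struct{}) bool {
	select {
	case ch <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	// Unbuffered: the send completes only once a receiver takes the value.
	results := make(chan int)
	go func() { results <- 42 }()
	fmt.Println(<-results) // 42

	// Size one: at most one wake-up is pending, extra signals are dropped.
	wake := make(chan struct{}, 1)
	fmt.Println(notify(wake)) // true
	fmt.Println(notify(wake)) // false
	<-wake
	fmt.Println(notify(wake)) // true
}
```

Larger buffers only postpone blocking until the buffer fills, which is one reason they deserve the extra scrutiny mentioned above.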
Enums:
// Start enums at one unless zero value is meaningful.
type Operation int
const (
	Add Operation = iota + 1
	Subtract
	Multiply
)
Import Grouping:
import (
	// Standard library
	"fmt"
	"os"
	// Everything else
	"go.uber.org/atomic"
	"golang.org/x/sync/errgroup"
)
Variable Declarations:
// Top-level: omit type if clear from function return.
var _s = F()
// Local: use short declaration.
s := "foo"
// Empty slices: prefer var declaration.
var filtered []int
// Over: filtered := []int{}.
// nil is a valid slice.
return nil // not return []int{}
// Check empty with len(), not nil comparison.
func isEmpty(s []string) bool {
	return len(s) == 0 // not s == nil
}
Struct Initialization:
// Always specify field names.
k := User{
	FirstName: "John",
	LastName:  "Doe",
	Admin:     true,
}
// Use &T{} instead of new(T).
sptr := &T{Name: "bar"}
Bool Parameters:
// Avoid naked bools - use comments or enums.
printInfo("foo", true /* isLocal */, true /* done */)
// Better: custom types.
type EndTxnAction bool
const (
	Commit EndTxnAction = false
	Abort  EndTxnAction = true
)
func endTxn(action EndTxnAction) {}
Error Handling:
// Reduce variable scope.
if err := f.Close(); err != nil {
	return err
}
// Reduce nesting - handle errors early.
for _, v := range data {
	if v.F1 != 1 {
		log.Printf("Invalid v: %v", v)
		continue
	}
	v = process(v)
	if err := v.Call(); err != nil {
		return err
	}
	v.Send()
}
Printf and Formatting:
// Format strings should be const for go vet.
const msg = "unexpected values %v, %v\n"
fmt.Printf(msg, 1, 2)
// Printf-style function names should end with 'f'.
func Wrapf(format string, args ...interface{}) error
// Use raw strings to avoid escaping.
wantError := `unknown error:"test"`
Table-Driven Tests:
tests := []struct {
	give     string
	wantHost string
	wantPort string
}{{
	give:     "192.0.2.0:8000",
	wantHost: "192.0.2.0",
	wantPort: "8000",
}, {
	give:     ":8000",
	wantHost: "",
	wantPort: "8000",
}}
for _, tt := range tests {
	t.Run(tt.give, func(t *testing.T) {
		host, port, err := net.SplitHostPort(tt.give)
		require.NoError(t, err)
		assert.Equal(t, tt.wantHost, host)
		assert.Equal(t, tt.wantPort, port)
	})
}
Functional Options Pattern:
type Option interface {
	apply(*options)
}
type optionFunc func(*options)
func (f optionFunc) apply(o *options) { f(o) }
func WithTimeout(t time.Duration) Option {
	return optionFunc(func(o *options) {
		o.timeout = t
	})
}
func Connect(addr string, opts ...Option) (*Connection, error) {
	options := options{
		timeout: defaultTimeout,
		caching: defaultCaching,
	}
	for _, o := range opts {
		o.apply(&options)
	}
	// ...
}
Block comments (standalone line) use full sentences with capitalization and punctuation:
// Bad - panics on wrong type.
t := i.(string)
// Good - handles gracefully.
t, ok := i.(string)
Inline comments (end of code line) are lowercase without terminal punctuation:
s := strconv.Itoa(rand.Int()) // good
s := fmt.Sprint(rand.Int()) // slower
func (n NodeID) SafeValue() {} // always safe to log
Comments should be placed where they provide the most value and avoid duplication:
Data Structure Comments:
- Belong at the data structure declaration
- Explain the purpose, lifecycle, and invariants of the struct/type
- Document which code initializes fields, which code accesses them, and when they become obsolete
- Do not repeat this information in function comments that use these structures
Algorithmic Comments:
- Belong inside function bodies
- Explain the logic, phases, and non-obvious implementation details
- Separate different processing phases with summary comments
- Focus on "why" rather than "what" the code does
Function Declaration Comments:
- Focus on inputs and outputs - what the function does, not how
- Describe the contract, preconditions, postconditions, and behavior
- Do NOT explain the intricacies of input/output types if those are data structures already documented at their declaration
- Readers should refer to the data structure definition for detailed field explanations
Overview and Design Comments:
- Must be completely understandable with zero knowledge of the code
- If a reader cannot understand the overview without reading code, either improve the comment or remove it
- Reading an incomprehensible comment followed by code is double work
- To make overview comments understandable:
  - Minimize new terminology - use plain language where possible
  - Define new terms immediately when you must introduce them
  - Illustrate with examples to clarify abstract concepts
  - Limit term introduction - avoid introducing many new terms at once; consider using variants of existing terms instead
  - If you introduce terms like "satisfiable", "strict", "coverage" in quick succession, step back and see if fewer, clearer terms would suffice
CockroachDB is a complex system and you should write code under the assumption that it will have to be understood and modified in the future by readers who have basic familiarity with CockroachDB, but are not experts on the respective subsystem.
Key concepts and abstractions should be explained clearly, and lifecycles and ownership clearly stated. Whenever possible, you should use examples to make the code accessible to the reader. Comments should always add depth to the code (rather than repeating the code).
When reviewing, focus on the above aspects in addition to technical correctness. Do not over-emphasize grammar and comment typos; prefix such remarks with "nit:" in reviews.
CockroachDB is a distributed system that allows for rolling upgrades. This means
that any shared state or inter-process communication needs to be mindful of
compatibility issues. See pkg/clusterversion for more on this.
Top-Level Design Comments: Explain concepts/abstractions, show how pieces fit together, connect to use cases. These must be understandable without reading any code - define terms clearly and use examples to illustrate abstract concepts.
Example from concurrency/concurrency_control.go:
// Package concurrency provides a concurrency manager that coordinates
// access to keys and key ranges. The concurrency manager sequences
// concurrent txns that access overlapping keys and ensures that locks
// are respected and txn isolation guarantees are upheld.
//
// The concurrency manager is structured as a two-level hierarchy...
API and Interface Comments:
// AuthConn is the interface used by the authenticator for interacting with the
// pgwire connection.
type AuthConn interface {
	// SendAuthRequest sends a request for authentication information. After
	// calling this, the authenticator needs to call GetPwdData() quickly, as the
	// connection's goroutine will be blocked on providing us the requested data.
	SendAuthRequest(authType int32, data []byte) error
	// GetPwdData returns authentication info that was previously requested with
	// SendAuthRequest. The call blocks until such data is available.
	// An error is returned if the client connection dropped or if the client
	// didn't respect the protocol.
	GetPwdData() ([]byte, error)
}
Function Comments: Focus on inputs, outputs, and behavior. Avoid re-documenting data structure details that are explained at their declaration.
// Append appends the provided string and any number of query parameters.
// Instead of using normal placeholders (e.g. $1, $2), use meta-placeholder $.
// This method rewrites the query so that it uses proper placeholders.
//
// For example, suppose we have the following calls:
//
// query.Append("SELECT * FROM foo WHERE a > $ AND a < $ ", arg1, arg2)
// query.Append("LIMIT $", limit)
//
// The query is rewritten into:
//
// SELECT * FROM foo WHERE a > $1 AND a < $2 LIMIT $3
// /* $1 = arg1, $2 = arg2, $3 = limit */
//
// Note that this method does NOT return any errors. Instead, we queue up
// errors, which can later be accessed.
func (q *sqlQuery) Append(s string, params ...interface{}) { /* ... */ }
Struct Field Comments: Document the purpose, lifecycle, and usage of each field. This is where data structure details belong - function comments should not repeat this information.
// cliState defines the current state of the CLI during command-line processing.
//
// Note: options customizable via \set and \unset should be defined in
// sqlCtx or cliCtx instead, so that the configuration remains global
// across multiple instances of cliState.
type cliState struct {
	// forwardLines is the array of lookahead lines. This gets
	// populated when there is more than one line of input
	// in the data read by ReadLine(), which can happen
	// when copy-pasting.
	forwardLines []string
	// partialStmtsLen represents the number of entries in partialLines
	// parsed successfully so far. It grows larger than zero whenever 1)
	// syntax checking is enabled and 2) multi-statement entry starts.
	partialStmtsLen int
}
Phase Comments in Function Bodies: Algorithmic comments that separate different processing phases and explain non-obvious logic. These belong inside functions, not at function declarations.
func (r *Replica) executeAdminCommandWithDescriptor(
	ctx context.Context, updateDesc func(*roachpb.RangeDescriptor) error,
) *roachpb.Error {
	// Retry forever as long as we see errors we know will resolve.
	retryOpts := base.DefaultRetryOptions()
	for retryable := retry.StartWithCtx(ctx, retryOpts); retryable.Next(); {
		// The replica may have been destroyed since the start of the retry loop.
		// We need to explicitly check this condition.
		if _, err := r.IsDestroyed(); err != nil {
			return roachpb.NewError(err)
		}
		// Admin commands always require the range lease to begin, but we may
		// have lost it while in this retry loop. Without the lease, a replica's
		// local descriptor can be arbitrarily stale.
		if _, pErr := r.redirectOnOrAcquireLease(ctx); pErr != nil {
			return pErr
		}
	}
}
Protobuf Message Comments:
// ReplicaType identifies which raft activities a replica participates in. In
// normal operation, VOTER_FULL and LEARNER are the only used states. However,
// atomic replication changes require a transition through a "joint config"; in
// this joint config, the VOTER_DEMOTING and VOTER_INCOMING types are used as
// well to denote voters which are being downgraded to learners and newly added
// by the change, respectively.
enum ReplicaType {
  // VOTER_FULL indicates a replica that is a voter both in the
  // incoming and outgoing set.
  VOTER_FULL = 0;
  // VOTER_INCOMING indicates a voting replica that will be a
  // VOTER_FULL once the ongoing atomic replication change is finalized;
  // that is, it is in the process of being added.
  VOTER_INCOMING = 2;
}
When to Update Comments:
- Add explanations when you discover valuable missing knowledge
- Fix factually incorrect comments immediately (treat as bugs)
- Fix grammar/spelling that significantly impairs reading
- Avoid cosmetic changes that don't improve understanding
Review Guidelines:
- Point out when missing comments would help understanding
- Add comments explaining review discussion outcomes
- Prefix grammar suggestions with "nit:" to indicate low priority
CockroachDB uses the
github.com/cockroachdb/errors library,
a superset of Go's standard errors package and pkg/errors. Importantly,
cockroachdb/errors interoperates with cockroachdb/redact. Ensure that
information that is passed to error constructors has proper redaction, as
unredacted information might be stripped before reaching our support team.
// Simple static string errors.
errors.New("connection failed")
// Formatted error strings.
errors.Newf("invalid value: %d", val)
// Assertion failures for implementation bugs (generates louder alerts).
errors.AssertionFailedf("expected non-nil pointer")
It can be helpful to add context when propagating errors up the call stack:
// Wrap with context (preferred over fmt.Errorf with %w).
return errors.Wrap(err, "opening file")
// Wrap with formatted context.
return errors.Wrapf(err, "connecting to %s", addr)
Keep context succinct; avoid phrases like "failed to" which pile up:
// Bad: "failed to x: failed to y: failed to create store: the error"
return errors.Newf("failed to create new store: %s", err)
// Good: "x: y: new store: the error"
return errors.Wrap(err, "new store")
// Add hints for end-users (actionable guidance, excluded from Sentry).
errors.WithHint(err, "check your network connection")
// Add details for developers (contextual info, excluded from Sentry).
errors.WithDetail(err, "request payload was 2.5MB")
For errors that clients need to detect, use sentinel errors or custom types:
// Sentinel error pattern.
var ErrNotFound = errors.New("not found")
func Find(id string) error {
	return ErrNotFound
}
// Caller detection with errors.Is (works across network boundaries!).
if errors.Is(err, ErrNotFound) {
	// Handle not found.
}
For errors with additional information, use custom types:
type NotFoundError struct {
	Resource string
}
func (e *NotFoundError) Error() string {
	return fmt.Sprintf("%s not found", e.Resource)
}
// Caller detection with errors.As.
var nfErr *NotFoundError
if errors.As(err, &nfErr) {
	log.Printf("missing: %s", nfErr.Resource)
}
| Scenario | Approach |
|---|---|
| No additional context needed | Return original error |
| Adding context | Use errors.Wrap or errors.Wrapf |
| Passing through goroutine channel | Use errors.WithStack on both ends |
| Callers don't need to detect this error | Use errors.Newf |
| Hide original cause | Use errors.Handled or errors.Opaque |
Error messages are redacted by default in Sentry reports. Mark data as safe explicitly:
// Mark specific values as safe for reporting.
errors.WithSafeDetails(err, "node_id=%d", nodeID)
// The Safe() wrapper for known-safe values.
errors.Newf("processing %s", errors.Safe(operationName))
Always use the "comma ok" idiom to avoid panics:
// Bad - panics on wrong type.
t := i.(string)
// Good - handles gracefully.
t, ok := i.(string)
if !ok {
	return errors.New("expected string type")
}
- /docs/RFCS/20190318_error_handling.md - Error handling RFC
- cockroachdb/errors README - Full API documentation
- Main Documentation: https://cockroachlabs.com/docs/stable/
- Architecture Guide: https://www.cockroachlabs.com/docs/stable/architecture/overview.html
- Contributing: See /CONTRIBUTING.md and https://wiki.crdb.io/
- Design Documents: /docs/design.md and /docs/tech-notes/
Use the commit-helper skill (invoked via /commit-helper) when creating commits and PRs.
- For multi-commit PRs, summarize each commit in the PR record.
- Do not include a test plan unless explicitly asked by the user.
- Be direct and honest.
- Skip unnecessary acknowledgments.
- Correct me when I'm wrong and explain why.
- Suggest better alternatives if my ideas can be improved.
- Focus on accuracy and efficiency.
- Challenge my assumptions when needed.
- Prioritize quality information and directness.