perf: eliminate per-cell CGO calls in getEnum via pre-built reverse dictionary#122
perf: eliminate per-cell CGO calls in getEnum via pre-built reverse dictionary#122hellower wants to merge 5 commits intoduckdb:mainfrom
Conversation
…ictionary Problem: getEnum() was calling 3 CGO functions on every cell read: 1. mapping.VectorGetColumnType(vec.vec) — creates a LogicalType from C 2. mapping.EnumDictionaryValue(lt, idx) — looks up the enum string via C API 3. mapping.DestroyLogicalType(<) — frees the LogicalType back to C For a 100K-row query with an ENUM column, this means 300,000 CGO round-trips just to convert enum indices to strings — even though the entire dictionary is already available in Go memory. Root cause: initEnum() already builds namesDict (map[string]uint32) for the setter path (string → index), but the getter path had no reverse mapping and fell back to CGO on every cell. Fix: 1. Added enumDict (map[uint32]string) field to vectorTypeInfo — the reverse of namesDict (index → name). 2. In initEnum(), both maps are now built in a single pass over the C dictionary. Also added capacity hints to avoid rehashing. 3. In getEnum(), replaced the 3 CGO calls with a single Go map lookup. The ENUM dictionary is immutable for the lifetime of the type (CREATE TYPE ... AS ENUM), so caching is safe. This is consistent with the setter path which already uses the cached namesDict. Benchmark results (enum_bench_test.go): goos: linux, goarch: amd64 BenchmarkEnumGetValue/Dict/N=10000 184 6,586,813 ns/op 164,574 B/op 10,092 allocs/op BenchmarkEnumGetValue/CGO/N=10000 60 19,522,868 ns/op 388,492 B/op 40,092 allocs/op BenchmarkEnumGetValue/Dict/N=100000 18 62,224,891 ns/op 1,635,777 B/op 100,664 allocs/op BenchmarkEnumGetValue/CGO/N=100000 6 188,925,219 ns/op 3,876,269 B/op 400,664 allocs/op - Throughput: ~3x faster - Allocations: 75% fewer (3 allocs/cell eliminated) - Memory: 2.4x less heap usage Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mlafeldt
left a comment
There was a problem hiding this comment.
Looks good in general.
Three open issues:
-
Tidy checks are failing
-
Reviewer: verify ENUM read correctness with edge cases (NULL values, large dictionaries)
Since you're already using AI, it would make more sense for you to add those tests. In particular, there should be one for >255 enum values to exercise USMALLINT/UINTEGER branches.
- See my remark on using []string.
type_info.go
Outdated
| tagDict map[uint32]string | ||
| // enumDict is the reverse of namesDict for ENUM types (index → name). | ||
| // Built once during initEnum to avoid per-cell CGO calls in getEnum. | ||
| enumDict map[uint32]string |
There was a problem hiding this comment.
I know that this mirrors tagDict, but wouldn't a simple string slice be faster since we don't really need hashing etc?
There was a problem hiding this comment.
Switched enumDict to []string as suggested. Also added tests for NULL values and >255 entries (USMALLINT branch).
…stion Enum indices are dense integers starting at 0, so a slice provides O(1) access without hash overhead. Also adds edge-case tests for NULL enum values and large dictionaries (>255 entries, USMALLINT path).
|
@hellower Thanks for the changes. Tidy checks are still failing though. |
…ation
Go's defer statement evaluates function arguments immediately at the
defer site, not when the deferred function executes. This means:
var types []mapping.LogicalType // types is nil
defer destroyLogicalTypes(types) // captures nil — cleanup is a no-op
types = append(types, ...) // types now points to new backing array
When the function returns, destroyLogicalTypes receives the nil slice
that was captured at the defer site, iterates over zero elements, and
never calls DestroyLogicalType on any of the C objects. The LogicalType
objects allocated via the C API are permanently leaked.
Affected functions:
- logicalStructType (type_info.go) — leaks one LogicalType per STRUCT field
- logicalUnionType (type_info.go) — leaks one LogicalType per UNION member
- inferSliceLogicalTypeAndValue (value.go) — leaks one LogicalType per
LIST/ARRAY element type inferred during value binding
These functions are called from Appender (STRUCT/UNION columns),
scalar/table UDFs, and any code path that constructs composite
LogicalTypes from Go-side TypeInfo.
Fix: wrap the defer call in a closure so that it captures the variable
by reference, not by value:
defer func() { destroyLogicalTypes(types) }()
The closure captures the `types` variable itself (not its current value),
so when it executes at function return, it sees the fully-populated
slice and correctly destroys all LogicalType objects.
Verified with a standalone program:
var s []int
defer fmt.Println("direct:", len(s)) // prints 0
defer func() { fmt.Println("closure:", len(s)) }() // prints 3
s = append(s, 1, 2, 3)
The same bug pattern exists for destroyValueSlice in value.go: - inferSliceLogicalTypeAndValue (line 398): values := make([]Value, 0, length) - createSliceValue (line 459): var values []Value - createStructValue (line 487): var values []Value All three use `defer destroyValueSlice(values)` where `values` is nil or len=0 at the defer site. The deferred cleanup iterates over zero elements and never calls DestroyValue on the C-allocated duckdb_value objects. These functions are called when binding STRUCT, LIST, or ARRAY values via Appender or prepared statements. Each call leaks one C Value object per element/field. Not a double-free risk: DuckDB's Create*Value functions (CreateListValue, CreateArrayValue, CreateStructValue) deep-copy the input values via allocValues → duckdb_create_*_value. The original Value objects remain owned by the caller and must be destroyed separately.
- Remove redundant uint32 cast in vector_getters.go (unconvert) - Wrap defer Close() calls in require.NoError for errcheck compliance in enum_bench_test.go (conn, dkRows, Stmt) and types_test.go (rows x2)
b6d76cc to
7e0d908
Compare
Summary
getEnum()was calling 3 CGO functions per cell (VectorGetColumnType→EnumDictionaryValue→DestroyLogicalType) to convert an enum index to its string nameinitEnum()already buildsnamesDict(map[string]uint32) for the setter path, but the getter had no reverse mapping and fell back to CGO every timeenumDict(map[uint32]string) — the reverse ofnamesDict— built once duringinitEnum(), replacing 3 CGO round-trips with a single Go map lookupCREATE TYPE ... AS ENUM), so caching is safeChanges
type_info.goenumDict map[uint32]stringfield tovectorTypeInfovector.goinitEnum()builds bothnamesDictandenumDictin one pass; addedmake()capacity hintsvector_getters.gogetEnum()now doesvec.enumDict[idx]instead of 3 CGO callsenum_bench_test.goBenchmark
Test plan
go test ./...)go test -bench=BenchmarkEnumGetValue -benchmem)🤖 Generated with Claude Code