Commit 2acb399

Post about schema coders and benchmarks.

1 parent 6e9995d commit 2acb399
File tree

2 files changed: +127 -1 lines changed

content/posts/2025-01-22-schema.md

Lines changed: 127 additions & 0 deletions

@@ -0,0 +1,127 @@
---
title: "Schemas and Coders and Benchmarks"
date: 2025-01-22T06:24:24-08:00
tags:
- beam
- go
- hobby sdk
- dev
categories:
- Dev
---
This weekend I got nerd sniped into working on a part of my hobby SDK that
I didn't have much motivation for: Beam Schema Row coders.

But they do need to be done eventually, so I implemented them.

In particular, what I needed was a way to take the Beam Schema proto and turn
it into a coder that could produce a dynamic row value. The existing Go SDK
could do it in principle, but it's not easy, due to a choice I made years and years
ago. I'm not convinced it's the wrong choice though. Besides, adding the new handling
to the existing API would take longer than adding it to the hobby SDK, or than
building what I needed from scratch.

Perhaps that's not entirely true. I'm being paranoid about continuing to broaden
the API surface of the existing SDK. It's already quite large and complex, and
the interactions are already quite subtle.

Anyway, I wrote a quick naive implementation of the code, added tests, made them all
pass, and [wrote a benchmark](https://github.com/lostluck/beam-go/blob/9429632fa47a6752671c4a4d6ad0325742485599/internal/schema/schema_test.go#L143).
```go
func BenchmarkRoundtrip(b *testing.B) {
	for _, test := range suite {
		b.Run(test.name, func(b *testing.B) {
			c := ToCoder(test.schema)
			b.ReportAllocs()
			b.ResetTimer()
			for range b.N {
				r := coders.Decode(c, test.data)
				if got, want := coders.Encode(c, r), test.data; !cmp.Equal(got, want) {
					b.Errorf("round trip decode-encode not equal: want %v, got %v", want, got)
				}
			}
		})
	}
}
```
It's a pretty straightforward thing. I took the implementation straight from the
above tests, set it to report allocations and reset the benchmark timer, and
then got to iterating.

This gives bad results though. The problem is here:

```go
if got, want := coders.Encode(c, r), test.data; !cmp.Equal(got, want) {
	b.Errorf("round trip decode-encode not equal: want %v, got %v", want, got)
}
```
I kept the comparison in to validate that everything continues to work as
desired through the thousands of runs the coders would be put through. Or I was
lazy about it. `cmp` is not intended to be high performance; it's intended to be
convenient and correct.

Had I used `bytes.Equal` instead of the general `cmp.Equal`,
I'd have spent less CPU, and probably wouldn't have noticed.

For this task, I didn't much care about CPU. I cared about allocations, which do
directly affect CPU and memory usage. And `cmp` is allocation heavy by
comparison to a straight byte slice equality check.
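The gap is easy to see with `testing.AllocsPerRun`. A minimal sketch, using the
standard library's `reflect.DeepEqual` as a stand-in for `cmp.Equal` (both pay
for reflection-style generality; `compareAllocs` and the slice contents here
are hypothetical, not from the hobby SDK):

```go
package main

import (
	"bytes"
	"fmt"
	"reflect"
	"testing"
)

// compareAllocs measures allocations per comparison for a direct
// bytes.Equal check versus a reflection-driven reflect.DeepEqual check.
func compareAllocs() (direct, reflective float64) {
	a := bytes.Repeat([]byte{1, 2, 3, 4}, 64)
	b := bytes.Repeat([]byte{1, 2, 3, 4}, 64)

	// bytes.Equal compares the slices in place and never allocates.
	direct = testing.AllocsPerRun(1000, func() {
		if !bytes.Equal(a, b) {
			panic("not equal")
		}
	})
	// Just boxing the slices into interface{} arguments costs heap
	// allocations on every call, before any comparison work happens.
	reflective = testing.AllocsPerRun(1000, func() {
		if !reflect.DeepEqual(a, b) {
			panic("not equal")
		}
	})
	return direct, reflective
}

func main() {
	direct, reflective := compareAllocs()
	fmt.Printf("bytes.Equal: %v allocs/op, reflect.DeepEqual: %v allocs/op\n", direct, reflective)
}
```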
That's not all though. I had used my convenience functions in the benchmark too,
to trivially get through the decode and encode cycle.
`coders.Decode` and `coders.Encode` are just small wrappers to simplify quick one-off
encodings, such as for testing.

But like `cmp`, they are convenient, not high performance.
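For context, such convenience wrappers typically have this shape. Everything
below is a hypothetical sketch, not the hobby SDK's actual API: each call
constructs a fresh `Encoder` or `Decoder`, which is fine for one-off use in
tests but allocates every time:

```go
package main

import "fmt"

// Minimal hypothetical stand-ins for a coders package.
type Encoder struct{ buf []byte }

func NewEncoder() *Encoder      { return &Encoder{} }
func (e *Encoder) Data() []byte { return e.buf }
func (e *Encoder) Uint32(v uint32) {
	e.buf = append(e.buf, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

type Decoder struct {
	data []byte
	pos  int
}

func NewDecoder(data []byte) *Decoder { return &Decoder{data: data} }
func (d *Decoder) Uint32() uint32 {
	v := uint32(d.data[d.pos])<<24 | uint32(d.data[d.pos+1])<<16 |
		uint32(d.data[d.pos+2])<<8 | uint32(d.data[d.pos+3])
	d.pos += 4
	return v
}

type Coder[T any] interface {
	Encode(*Encoder, T)
	Decode(*Decoder) T
}

// Encode and Decode are the convenience-wrapper shape described above:
// each call builds a fresh Encoder or Decoder rather than reusing one.
func Encode[T any](c Coder[T], v T) []byte {
	enc := NewEncoder()
	c.Encode(enc, v)
	return enc.Data()
}

func Decode[T any](c Coder[T], data []byte) T {
	return c.Decode(NewDecoder(data))
}

// uint32Coder is a toy coder to exercise the wrappers.
type uint32Coder struct{}

func (uint32Coder) Encode(e *Encoder, v uint32) { e.Uint32(v) }
func (uint32Coder) Decode(d *Decoder) uint32    { return d.Uint32() }

func main() {
	data := Encode[uint32](uint32Coder{}, 0xCAFE)
	fmt.Println(Decode[uint32](uint32Coder{}, data) == 0xCAFE)
}
```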
The test body [now looks like this](https://github.com/lostluck/beam-go/blob/53c2c2b073dce1eebbd4090b2d642362f277c895/internal/schema/schema_test.go#L225):
```go
c := ToCoder(test.schema)
// Mild shenanigans to prevent unnecessary allocations.
enc := coders.NewEncoder()
dec := *coders.NewDecoder(test.data)
n := len(test.data)
want := test.data

b.ReportAllocs()
b.ResetTimer()
for range b.N {
	enc.Reset(n)
	dec = *coders.NewDecoder(test.data)
	r := c.Decode(&dec)
	c.Encode(enc, r)
	if got := enc.Data(); !bytes.Equal(got, want) {
		b.Errorf("encoding not equal: want %v, got %v", want, got)
	}
}
```
Moving the `Encoder` and `Decoder` allocations out of the hot loop removed them
from the profile graph. There's a bit of "fun" with the Decoder to avoid it
getting heap allocated anew for each loop when the test data is being reset.
Having the comparison back in adds a few nanoseconds per run, but helps keep the
code robust.
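The decoder trick can be sketched in isolation. This is a guess at the shape,
with a hypothetical `Decoder` and `decodeAllocs` standing in for the real code:
because the constructor's result is immediately dereferenced and copied into a
reused variable, escape analysis can keep the temporary off the heap once
`NewDecoder` is inlined:

```go
package main

import (
	"fmt"
	"testing"
)

// Decoder is a hypothetical stand-in: a small struct holding the bytes
// to decode plus a read position.
type Decoder struct {
	data []byte
	pos  int
}

// NewDecoder mirrors the constructor shape used in the benchmark body.
func NewDecoder(data []byte) *Decoder { return &Decoder{data: data} }

func (d *Decoder) Byte() byte {
	b := d.data[d.pos]
	d.pos++
	return b
}

// decodeAllocs counts per-iteration heap allocations when the Decoder is
// reset by copying the constructor's result into a reused value.
func decodeAllocs(data []byte) float64 {
	var dec Decoder
	var sink byte
	allocs := testing.AllocsPerRun(1000, func() {
		// With NewDecoder inlined, the dereference-and-copy lets escape
		// analysis keep the temporary *Decoder off the heap entirely.
		dec = *NewDecoder(data)
		sink = dec.Byte()
	})
	_ = sink
	return allocs
}

func main() {
	fmt.Println(decodeAllocs([]byte{1, 2, 3, 4}))
}
```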
All of this was a problem largely because I wanted to collect nice and clean
"before" and "after" versions of the metrics as I cleaned up and changed the
implementation. I like seeing performance improvements. But now the results are
incomparable, not apples to apples, so I've cleared them away.
I am happy with where this has ended up though. I incorporated the `unsafe` tricks
protocol buffers use to minimize allocations and space in their protoreflect
package: https://github.com/protocolbuffers/protobuf-go/blob/master/reflect/protoreflect/value_unsafe_go121.go.

The idea is the same, really: be able to refer to and mutate values and fields
efficiently, against a known schema. I'd use their implementation directly if I
were able to, but they have it quite locked down, for the same reasons I've
put this stuff in an internal package for the time being. I want it to be able
to change.
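Very loosely, the trick in that protoreflect file is to pack any value into one
fixed-size struct of a pointer word plus an integer word, instead of boxing it
in an `interface{}`. A hedged sketch with hypothetical names (the real
implementation also carries type information and handles many more cases):

```go
package main

import (
	"fmt"
	"unsafe"
)

// value is a sketch of the protoreflect-style tagged union: one
// fixed-size struct that can hold a scalar or a pointer-shaped value
// without heap-allocating an interface box.
type value struct {
	ptr unsafe.Pointer // set for pointer-shaped values (strings, slices, messages)
	num uint64         // set for scalar values (ints, floats, bools, lengths)
}

func valueOfInt64(v int64) value { return value{num: uint64(v)} }

func valueOfString(v string) value {
	// Store the string's data pointer and length separately.
	return value{ptr: unsafe.Pointer(unsafe.StringData(v)), num: uint64(len(v))}
}

func (v value) Int64() int64 { return int64(v.num) }

func (v value) String() string {
	// Reassemble the string from its data pointer and length.
	return unsafe.String((*byte)(v.ptr), int(v.num))
}

func main() {
	a := valueOfInt64(42)
	b := valueOfString("hello")
	fmt.Println(a.Int64(), b.String())
}
```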
This is basically half of a useful article, other than the lesson about
being certain you know what you're measuring in your benchmarks. We'll see if
I turn back the clock and collect clean measurements of the implementations.

themes/hugo-coder/assets/scss/_content.scss

Lines changed: 0 additions & 1 deletion

@@ -59,7 +59,6 @@
      // hyphens: auto;

      white-space: normal;
-     max-width: 60rem;
    }
  }

0 commit comments
