Commit de07f19

Merge pull request #45358 from thockin/master
Go workspaces blog post
2 parents 487ef00 + 281b42f commit de07f19

1 file changed, 210 additions and 0 deletions
@@ -0,0 +1,210 @@
---
layout: blog
title: 'Using Go workspaces in Kubernetes'
date: 2024-03-19T08:30:00-08:00
slug: go-workspaces-in-kubernetes
canonicalUrl: https://www.kubernetes.dev/blog/2024/03/19/go-workspaces-in-kubernetes/
---

**Author:** Tim Hockin (Google)

The [Go programming language](https://go.dev/) has played a huge role in the
success of Kubernetes. As Kubernetes has grown, matured, and pushed the bounds
of what "regular" projects do, the Go project team has also grown and evolved
the language and tools. In recent releases, Go introduced a feature called
"workspaces" which was aimed at making projects like Kubernetes easier to
manage.

We've just completed a major effort to adopt workspaces in Kubernetes, and the
results are great. Our codebase is simpler and less error-prone, and we're no
longer off on our own technology island.

## GOPATH and Go modules

Kubernetes is one of the most visible open source projects written in Go. The
earliest versions of Kubernetes, dating back to 2014, were built with Go 1.3.
Today, 10 years later, Go is up to version 1.22, and let's just say that a
_whole lot_ has changed.

In 2014, Go development was entirely based on
[`GOPATH`](https://go.dev/wiki/GOPATH). As a Go project, Kubernetes lived by the
rules of `GOPATH`. In the buildup to Kubernetes 1.4 (mid 2016), we introduced a
directory tree called `staging`. This allowed us to pretend to be multiple
projects, but still exist within one git repository (which had advantages for
development velocity). The magic of `GOPATH` allowed this to work.

Kubernetes depends on several code-generation tools which have to find, read,
and write Go code packages. Unsurprisingly, those tools grew to rely on
`GOPATH`. This all worked pretty well until Go introduced modules in Go 1.11
(mid 2018).

Modules were an answer to many issues around `GOPATH`. They gave more control to
projects on how to track and manage dependencies, and were overall a great step
forward. Kubernetes adopted them. However, modules had one major drawback:
most Go tools could not work on multiple modules at once. This was a problem
for our code-generation tools and scripts.

Thankfully, Go offered a way to temporarily disable modules (`GO111MODULE` to
the rescue). We could get the dependency tracking benefits of modules, but the
flexibility of `GOPATH` for our tools. We even wrote helper tools to create fake
`GOPATH` trees and played tricks with symlinks in our vendor directory (which
holds a snapshot of our external dependencies), and we made it all work.

And for the last 5 years it _has_ worked pretty well. That is, it worked well
unless you looked too closely at what was happening. Woe be upon you if you
had the misfortune to work on one of the code-generation tools, or the build
system, or the ever-expanding suite of bespoke shell scripts we use to glue
everything together.

## The problems

Like developers on any large software project, we Kubernetes developers have
all learned to deal with a certain amount of constant low-grade pain. Our
custom `staging` mechanism let us bend the rules of Go; it was a little clunky,
but when it worked (which was most of the time) it worked pretty well. When it
failed, the errors were inscrutable and un-Googleable, because nobody else was
doing the silly things we were doing. Usually the fix was to re-run one or more
of the `update-*` shell scripts in our aptly named `hack` directory.

As time went on we drifted farther and farther from "regular" Go projects. At
the same time, Kubernetes got more and more popular. For many people,
Kubernetes was their first experience with Go, and it wasn't always a good
experience.

Our eccentricities also impacted people who consumed some of our code, such as
our client library and the code-generation tools (which turned out to be useful
in the growing ecosystem of custom resources). The tools only worked if you
stored your code in a particular `GOPATH`-compatible directory structure, even
though `GOPATH` had been replaced by modules more than four years prior.

This state persisted because of the confluence of three factors:
1. Most of the time it only hurt a little (punctuated with short moments of
more acute pain).
1. Kubernetes was still growing in popularity - we all had other, more urgent
things to work on.
1. The fix was not obvious, and whatever we came up with was going to be both
hard and tedious.

As a Kubernetes maintainer and long-timer, my fingerprints were all over the
build system, the code-generation tools, and the `hack` scripts. While the pain
of our mess may have been low _on average_, I was one of the people who felt it
regularly.

## Enter workspaces

Along the way, the Go language team saw what we (and others) were doing and
didn't love it. They designed a new way of stitching multiple modules together
into a new _workspace_ concept. Once enrolled in a workspace, Go tools had
enough information to work in any directory structure and across modules,
without `GOPATH` or symlinks or other dirty tricks.

When I first saw this proposal I knew that this was the way out. This was how
to break the logjam. If workspaces was the technical solution, then I would
put in the work to make it happen.
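
For readers who have not used the feature: a workspace is described by a
`go.work` file at the root of the tree, which lists the module directories to
stitch together. The fragment below is a minimal sketch, not a copy of the
Kubernetes file. The Go version and the two `staging` module paths are
assumptions chosen for illustration.

```text
// go.work (illustrative sketch; the version and paths are assumptions)
go 1.22.0

use (
	.
	./staging/src/k8s.io/api
	./staging/src/k8s.io/client-go
)
```

With a file like this in place, commands such as `go build` or `go test` treat
every listed directory as a main module, so imports between them resolve to the
code on disk. The `go work init` and `go work use` commands can create and
update the file.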

## The work

Adopting workspaces was deceptively easy. I very quickly had the codebase
compiling and running tests with workspaces enabled. I set out to purge the
repository of anything `GOPATH` related. That's when I hit the first real bump -
the code-generation tools.

We had about a dozen tools, totalling several thousand lines of code. All of
them were built using an internal framework called
[gengo](https://github.com/kubernetes/gengo), which was built on Go's own
parsing libraries. There were two main problems:

1. Those parsing libraries didn't understand modules or workspaces.
1. `GOPATH` allowed us to pretend that Go _package paths_ and directories on
disk were interchangeable in trivial ways. They are not.

Switching to a
[modules- and workspaces-aware parsing](https://pkg.go.dev/golang.org/x/tools/go/packages)
library was the first step. Then I had to make a long series of changes to
each of the code-generation tools. Critically, I had to find a way to do it
that was possible for some other person to review! I knew that I needed
reviewers who could cover the breadth of changes and reviewers who could go
into great depth on specific topics like gengo and Go's module semantics.
Looking at the history for the areas I was touching, I asked Joe Betz and Alex
Zielenski (SIG API Machinery) to go deep on gengo and code-generation, Jordan
Liggitt (SIG Architecture and all-around wizard) to cover Go modules and
vendoring and the `hack` scripts, and Antonio Ojea (wearing his SIG Testing
hat) to make sure the whole thing made sense. We agreed that a series of small
commits would be easiest to review, even if the codebase might not actually
work at each commit.

Sadly, these were not mechanical changes. I had to dig into each tool to
figure out where it was processing disk paths versus where it was
processing package names, and where those were being conflated. I made
extensive use of the [delve](https://github.com/go-delve/delve) debugger, which
I just can't say enough good things about.

One unfortunate result of this work was that I had to break compatibility. The
gengo library simply did not have enough information to process packages
outside of `GOPATH`. After discussion with gengo and Kubernetes maintainers, we
agreed to make [gengo/v2](https://github.com/kubernetes/gengo/tree/master/v2).
I also used this as an opportunity to clean up some of the gengo APIs and the
tools' CLIs to be more understandable and not conflate packages and
directories. For example, you can't just string-join directory names and
assume the result is a valid package name.
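
To make the distinction between directories and package paths concrete, here is
a small sketch using only the standard library. It is not how gengo or
`golang.org/x/tools/go/packages` work internally. The `importPath` helper and
the example paths are hypothetical, and the sketch assumes a simple case with no
nested modules, vendoring, or `replace` directives. Under those assumptions, a
package's import path is the module path from `go.mod` plus the directory's
location relative to the module root, regardless of where the module sits on
disk.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// importPath maps a directory on disk to a Go import path, given the
// directory holding the module's go.mod (modRoot) and the module path
// declared in that go.mod (modPath). This helper is hypothetical and
// ignores nested modules, vendoring, and replace directives.
func importPath(modRoot, modPath, dir string) (string, error) {
	rel, err := filepath.Rel(modRoot, dir)
	if err != nil {
		return "", err
	}
	// Import paths always use forward slashes, whatever the OS separator is.
	rel = filepath.ToSlash(rel)
	if rel == ".." || strings.HasPrefix(rel, "../") {
		return "", fmt.Errorf("%s is outside module root %s", dir, modRoot)
	}
	// path.Join cleans the result, so rel == "." yields modPath itself.
	return path.Join(modPath, rel), nil
}

func main() {
	// The module lives deep inside the repository tree, but its import
	// path comes from go.mod, not from its location on disk.
	root := filepath.FromSlash("/src/kubernetes/staging/src/k8s.io/api")
	dir := filepath.Join(root, "core", "v1")

	p, err := importPath(root, "k8s.io/api", dir)
	if err != nil {
		panic(err)
	}
	fmt.Println(p) // k8s.io/api/core/v1
}
```

Joining the directory names instead would produce something like
`src/kubernetes/staging/src/k8s.io/api/core/v1`, which is not a path that any
Go tool would recognize as a package.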

Once I had the code-generation tools converted, I shifted attention to the
dozens of scripts in the `hack` directory. One by one I had to run them, debug,
and fix failures. Some of them needed minor changes and some needed to be
rewritten.

Along the way we hit some cases that Go did not support, like workspace
vendoring. Kubernetes depends on vendoring to ensure that our dependencies are
always available, even if their source code is removed from the internet (it
has happened more than once!). After we discussed this with the Go team and
looked at possible workarounds, they decided the right path was to
[implement workspace vendoring](https://github.com/golang/go/issues/60056).

The eventual Pull Request contained over 200 individual commits.

## Results

Now that this work has been merged, what does this mean for Kubernetes users?
Pretty much nothing. No features were added or changed. This work was not
about fixing bugs (and hopefully none were introduced).

This work was mainly for the benefit of the Kubernetes project, to help and
simplify the lives of the core maintainers. In fact, it would not be a lie to
say that it was rather self-serving - my own life is a little bit better now.

This effort, while unusually large, is just a tiny fraction of the overall
maintenance work that needs to be done. Like any large project, we have lots of
"technical debt": tools that made point-in-time assumptions and need
revisiting, internal APIs whose organization doesn't make sense, code which
doesn't follow conventions which didn't exist at the time, and tests which
aren't as rigorous as they could be, just to throw out a few examples. This
work is often called "grungy" or "dirty", but in reality it's just an
indication that the project has grown and evolved. I love this stuff, but
there's far more than I can ever tackle on my own, which makes it an
interesting way for people to get involved. As our unofficial motto goes:
"chop wood and carry water".

Kubernetes used to be a case study of how _not_ to do large-scale Go
development, but now our codebase is simpler (and in some cases faster!) and
more consistent. Things that previously seemed like they _should_ work, but
didn't, now behave as expected.

Our project is now a little more "regular". Not completely so, but we're
getting closer.

## Thanks

This effort would not have been possible without tons of support.

First, thanks to the Go team for hearing our pain, taking feedback, and solving
the problems for us.

Special mega-thanks goes to Michael Matloob, on the Go team at Google, who
designed and implemented workspaces. He guided me every step of the way, and
was very generous with his time, answering all my questions, no matter how
dumb.

Writing code is just half of the work, so another special thanks to my
reviewers: Jordan Liggitt, Joe Betz, Alexander Zielenski, and Antonio Ojea.
These folks brought a wealth of expertise and attention to detail, and made
this work smarter and safer.