Conversation

Contributor

@zhming0 zhming0 commented Jan 8, 2026

Fixes #797.

Viper has a known limitation: it does not support case-sensitive keys in config files.

In the past, when we ran into this issue, we applied a number of band-aid fixes: #732, #738. This is clearly not sustainable, as the K8s spec is fundamentally case sensitive.

This PR replaces Viper with Kong plus the k8s-native unmarshaller, solving this problem categorically and, at the same time, getting rid of many Viper-specific patches from the past.
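To make the case-sensitivity problem concrete, here is a minimal sketch using only the Go standard library. It is not the code from this PR: `lowercaseKeys` merely imitates a loader that folds keys to lower case, `parseAttributes` uses `encoding/json` as a stand-in for a case-preserving unmarshaller, and the attribute name in `main` is only an illustrative value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lowercaseKeys imitates, for illustration only, a config loader that
// folds every map key to lower case before the application sees it.
func lowercaseKeys(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[strings.ToLower(k)] = v
	}
	return out
}

// parseAttributes decodes a JSON object into a map. Map keys keep the
// exact case they had in the input.
func parseAttributes(raw string) (map[string]string, error) {
	attrs := map[string]string{}
	if err := json.Unmarshal([]byte(raw), &attrs); err != nil {
		return nil, err
	}
	return attrs, nil
}

func main() {
	// Illustrative volume attributes; the key is an assumption for the demo.
	raw := `{"secretProviderClass": "my-provider"}`

	attrs, err := parseAttributes(raw)
	if err != nil {
		panic(err)
	}
	_, ok := attrs["secretProviderClass"]
	fmt.Println("case-preserving lookup found key:", ok)

	folded := lowercaseKeys(attrs)
	_, ok = folded["secretProviderClass"]
	fmt.Println("lookup after key folding found key:", ok)
}
```

A consumer that expects the original camelCase key cannot find it once the keys have been folded, which is the kind of breakage reported in #797.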

The diff of this PR is unfortunately larger than I wanted. Although we have test coverage on various config code paths, running this controller locally to check things would be a better approach.

Hopefully it will be worth it.

@zhming0 zhming0 requested a review from a team as a code owner January 8, 2026 03:16
@zhming0 zhming0 force-pushed the ming/ps-1530 branch 2 times, most recently from d4e5a91 to 4fcd103 Compare January 8, 2026 04:31

@trvrnrth trvrnrth left a comment

Seeing this PR this morning was a pleasant surprise and resolves the volume attribute issues I was experiencing in #797 🎉. Thanks for the quick response!

I haven't carried out in-depth testing of all the option parsing, but I have noticed a subtle change in behaviour: it is no longer possible to set a job TTL of 0.

@zhming0 zhming0 force-pushed the ming/ps-1530 branch 2 times, most recently from 9fabc87 to 9ed156d Compare January 8, 2026 23:36
Comment on lines +34 to +48
// Build args for config parsing from os.Args
// Accepts -f, --config, -f=, --config= flags
var configArgs []string
for i, arg := range os.Args {
	if arg == "-f" || arg == "--config" {
		if i+1 < len(os.Args) {
			configArgs = append(configArgs, "--config="+os.Args[i+1])
		}
	}
	if strings.HasPrefix(arg, "-f=") {
		configArgs = append(configArgs, "--config="+strings.TrimPrefix(arg, "-f="))
	}
	if strings.HasPrefix(arg, "--config=") {
		configArgs = append(configArgs, arg)
	}
Contributor Author

I am not sure if we ever use --config to affect our test environment config, so maybe I can remove this whole bit. But there isn't much harm in keeping it either.

@zhming0 zhming0 force-pushed the ming/ps-1530 branch 5 times, most recently from 8a5ab3d to dd826f6 Compare January 9, 2026 05:43
Contributor

@moskyb moskyb left a comment

very much LGTM in concept, viper really annoys me. i'm sure Kong will grow to annoy me in different ways, but for now it's very exciting.

i have a couple of notes on the config parsing logic, but nothing major.

Debug *bool `kong:"name='debug',help='Sets log level to debug',env='DEBUG'"`

// Job / Pod settings
JobTTL *time.Duration `kong:"name='job-ttl',help='Time to retain kubernetes jobs after completion (default: 10m)',env='JOB_TTL'"`
Contributor

were these envars previously namespaced into BUILDKITE_KUBERNETES_STACK_* or something? if so, will this change change that?

Contributor Author

We don't use an env var prefix on agent-k8s-stack. I think we should, but this isn't something we can or should change at this point. So no behavior change 👍🏿

Contributor

sweet, good to know

cfgField.Set(cliField)
}
case reflect.Slice:
// Slice types (empty = not set)
Contributor

empty = not set

i just want to double check that this is the right semantic to check — in the agent, there are slice fields with defaults (eg vars to redact) that have a sensible set of default values, but that can manually be set to [] if the user desires. Using array emptiness as the metric for a field not being set essentially gates us out of having behaviour like that.

i think if we're making that decision then that's all good, but i also think it's probably good to think about whether checking nil-ness might give us better semantics.

Contributor Author

This is a great catch! I've changed it to check IsNil to determine whether a field is unset, and added a test case. This matches the old viper behavior, and I also think it's generally better 👍🏿 .
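The difference between the two semantics can be sketched with `reflect` as follows. This is an illustration under assumptions, not the merged code: the `options` struct, its `RedactedVars` field, and `overlaySlices` are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
)

// options is a hypothetical config struct with one slice-typed field.
type options struct {
	RedactedVars []string
}

// overlaySlices copies every non-nil slice field from src onto dst.
// A nil slice means "not set"; an empty, non-nil slice is an explicit
// request for "no values" and therefore overrides the destination.
func overlaySlices(dst, src *options) {
	dstVal := reflect.ValueOf(dst).Elem()
	srcVal := reflect.ValueOf(src).Elem()
	for i := 0; i < srcVal.NumField(); i++ {
		srcField := srcVal.Field(i)
		if srcField.Kind() != reflect.Slice || srcField.IsNil() {
			continue
		}
		dstVal.Field(i).Set(srcField)
	}
}

func main() {
	cfg := options{RedactedVars: []string{"PASSWORD", "TOKEN"}}

	// Nil slice: the defaults are kept.
	overlaySlices(&cfg, &options{})
	fmt.Println("after nil overlay:", len(cfg.RedactedVars))

	// Empty but non-nil slice: the defaults are cleared on purpose.
	overlaySlices(&cfg, &options{RedactedVars: []string{}})
	fmt.Println("after empty overlay:", len(cfg.RedactedVars))
}
```

Checking `Len() == 0` instead of `IsNil()` would skip the second overlay and make it impossible to clear a default list.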

}

// newConfigWithDefaults returns a config with default values.
func newConfigWithDefaults() *config.Config {
Contributor

it's probably worth calling out that kong does have a default directive in its struct tag schema, but i actually think that this is significantly more readable, so no action necessary here.

Contributor Author

Yeah I thought about using this default directive. But decided against that for the reason you mentioned.

Also, after this refactor, unlike with viper/cobra, the kong bit is only one part of the config loading process: config file parsing and CLI + env var parsing are separate. So it no longer makes sense to use the default directive in the struct tag schema.
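A rough sketch of that layering, assuming the order is defaults, then config file, then CLI/env overrides. Only the name `newConfigWithDefaults` comes from the diff; the struct fields, the default values, and `loadConfig` are hypothetical, and `encoding/json` stands in for the real config file unmarshaller so the example needs nothing outside the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical, cut-down config struct.
type Config struct {
	Namespace   string `json:"namespace"`
	MaxInFlight int    `json:"max-in-flight"`
}

// cliOverrides holds values parsed from flags or env vars. Pointers are
// used so that "not provided" (nil) differs from an explicit zero value.
type cliOverrides struct {
	Namespace   *string
	MaxInFlight *int
}

// newConfigWithDefaults returns a config with made-up default values.
func newConfigWithDefaults() *Config {
	return &Config{Namespace: "default", MaxInFlight: 25}
}

// loadConfig layers the three sources: defaults, then the file contents,
// then CLI/env overrides. Later layers win.
func loadConfig(fileContents []byte, cli cliOverrides) (*Config, error) {
	cfg := newConfigWithDefaults()
	if len(fileContents) > 0 {
		// Unmarshalling into a populated struct only touches fields
		// that are present in the input.
		if err := json.Unmarshal(fileContents, cfg); err != nil {
			return nil, fmt.Errorf("parsing config file: %w", err)
		}
	}
	if cli.Namespace != nil {
		cfg.Namespace = *cli.Namespace
	}
	if cli.MaxInFlight != nil {
		cfg.MaxInFlight = *cli.MaxInFlight
	}
	return cfg, nil
}

func main() {
	inFlight := 5
	cfg, err := loadConfig([]byte(`{"namespace": "buildkite"}`), cliOverrides{MaxInFlight: &inFlight})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *cfg)
}
```

Because the defaults live in one plain function, they apply regardless of which layer ends up supplying a value, which is harder to express with per-flag default tags.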

Contributor

@moskyb moskyb left a comment

LGTM!

@zhming0 zhming0 merged commit 46d30de into main Jan 12, 2026
1 check passed
@zhming0 zhming0 deleted the ming/ps-1530 branch January 12, 2026 04:34
Development

Successfully merging this pull request may close these issues.

[BUG] CSI volume attributes are incorrectly lowercased

4 participants