Skip to content

Conversation

@joemiller
Copy link
Member

When investigating potential issues with node labels syncing to EC2 was no visibility into the controller's decision-making. The predicates silently filter events, and reconcile doesn't log which labels it finds vs expects. Without this, debugging requires guessing whether the predicate rejected the event or the label simply wasn't on the node.

Adds V(1) debug logs throughout the sync pipeline. The predicate now logs every Create/Update event with shouldProcess=true/false so you can see if events are being filtered. Reconcile logs which monitored labels are missing from the node and the final set of tags being synced.

Replaces the custom --json flag with controller-runtime's standard zap flags (--zap-log-level, --zap-encoder, etc). To see debug logs, run with --zap-log-level=debug or --zap-log-level=1.

Additionally, introduce a new 4 hour resync period and update the predicate on Update events to allow a full resync to the cloud provider to catch any potentially missed label sync events. This is a trade-off between eventual consistency and the cost of the extra DescribeTags calls.

NOTE: Due to the dropping of --json flag and replacing with zap flags,
the minor version is being bumped to v0.1.0.

When investigating potential issues with node labels syncing to EC2
was no visibility into the controller's decision-making. The predicates
silently filter events, and reconcile doesn't log which labels it finds vs
expects. Without this, debugging requires guessing whether the predicate
rejected the event or the label simply wasn't on the node.

Adds V(1) debug logs throughout the sync pipeline. The predicate now logs
every Create/Update event with `shouldProcess=true/false` so you can see
if events are being filtered. Reconcile logs which monitored labels are
missing from the node and the final set of tags being synced.

Replaces the custom `--json` flag with controller-runtime's standard zap
flags (`--zap-log-level`, `--zap-encoder`, etc). To see debug logs, run with
`--zap-log-level=debug` or `--zap-log-level=1`.

Additionally, introduce a new 4 hour resync period and update the predicate
on Update events to allow a full resync to the cloud provider to catch
any potentially missed label sync events. This is a trade-off between
eventual consistency and the cost of the extra DescribeTags calls.

> NOTE: Due to the dropping of `--json` flag and replacing with zap flags,
> the minor version is being bumped to v0.1.0.
@joemiller joemiller requested a review from a team as a code owner December 9, 2025 17:23
controller.go Outdated
if value, exists := node.Labels[k]; exists {
tagsToSync[k] = value
} else {
logger.V(1).Info("Monitored label not found on node", "label", k)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't that write N line, one for each label?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does, but it turns out to be quite readable due to the fixed size of the preceding text in the log line. I will try to accumulate and do one log to see how it looks instead tho. The goal here is to print much more info than the current logging to try to chase down some elusive edge cases

ex:

{"level":"debug","ts":"2025-12-09T17:28:22Z","logger":"controller","msg":"Monitored label not found on node","node":{"name":"ip-10-0-126-121.ec2.internal"},"label":"foo"}
{"level":"debug","ts":"2025-12-09T17:28:22Z","logger":"controller","msg":"Monitored label not found on node","node":{"name":"ip-10-0-126-121.ec2.internal"},"label":"foo-bar"}
{"level":"debug","ts":"2025-12-09T17:28:22Z","logger":"controller","msg":"Monitored label not found on node","node":{"name":"ip-10-0-126-121.ec2.internal"},"label":"foo-bar-baz"}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to accumulating with one log line. slightly less readable in terminal but might be better from a log viewer (ie: loki, vlogs) view. will try it out.

@joemiller joemiller enabled auto-merge December 9, 2025 19:42
@joemiller joemiller merged commit 5d54569 into main Dec 9, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants