Add support for MSK clusters with kraft metadata nodes #19
Merged
joshm91 merged 4 commits into statsbomb:main on Feb 7, 2025
Since Kafka version 3.7.0, MSK has supported using KRaft metadata nodes instead of ZooKeeper. https://aws.amazon.com/about-aws/whats-new/2024/05/amazon-msk-kraft-mode-apache-kafka-clusters/

When running prometheus-msk-discovery against a cluster in KRaft mode, there is a panic like this:

```
2025/01/30 12:03:36 http: panic serving 10.3.57.169:45616: runtime error: invalid memory address or nil pointer dereference
goroutine 18 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1868 +0xb9
panic({0x7675e0?, 0xa8f9c0?})
	/usr/local/go/src/runtime/panic.go:920 +0x270
main.getBrokers({0x86e250?, 0xc00013a140}, {0xc00002bf20, 0x5b})
	/src/main.go:131 +0x24d
main.buildClusterDetails({0x86e250?, 0xc00013a140?}, {0x0, 0xc000268730, 0xc000232b40, 0xc000232a80, 0xc000232a70, 0xc000226930, 0xc000226948, 0xc000232b20, ...})
	/src/main.go:140 +0x56
main.GetStaticConfigs({0x86e250, 0xc00013a140}, {0xc00025ba18, 0x1, 0xc0000ef988?})
	/src/main.go:199 +0x26b
main.httpSD.func1({0x86feb0, 0xc00015a000}, 0xc0000efb18?)
	/src/main.go:248 +0xbb
net/http.HandlerFunc.ServeHTTP(0x440480?, {0x86feb0?, 0xc00015a000?}, 0x6457fa?)
	/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.(*ServeMux).ServeHTTP(0xace040?, {0x86feb0, 0xc00015a000}, 0xc000154000)
	/usr/local/go/src/net/http/server.go:2514 +0x142
net/http.serverHandler.ServeHTTP({0xc000150090?}, {0x86feb0?, 0xc00015a000?}, 0x6?)
	/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc00011a1b0, {0x8704c0, 0xc0000a3f80})
	/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 1
	/usr/local/go/src/net/http/server.go:3086 +0x5cb
```

This happens because the list nodes API returns records like this:

```
{
  "AddedToClusterTime": null,
  "BrokerNodeInfo": null,
  "ControllerNodeInfo": {
    "Endpoints": [
      "c-10002.foo.xxxxxx.c7.kafka.us-east-1.amazonaws.com"
    ]
  },
  "InstanceType": null,
  "NodeARN": null,
  "NodeType": "CONTROLLER",
  "ZookeeperNodeInfo": null
}
```

which have a nil BrokerNodeInfo.

This PR fixes this bug and also adds these controller nodes to the target endpoints. When JMX Exporter and Node Exporter are enabled on the cluster, these nodes only seem to be running JMX Exporter, so we are only adding this endpoint for these nodes.
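For readers who want to see the shape of the fix, here is a minimal sketch of the nil guard and the controller handling described above. It is not the code from this PR: the struct types are hypothetical, trimmed-down stand-ins for the list nodes records (only the field names come from the JSON above), `buildTargets` is an illustrative name, and ports 11001 and 11002 are assumed to be the JMX Exporter and Node Exporter ports exposed by MSK open monitoring.

```go
package main

import "fmt"

// Hypothetical stand-ins for the list nodes record; only the field names
// are taken from the JSON in the description.
type brokerNodeInfo struct {
	Endpoints []string
}

type controllerNodeInfo struct {
	Endpoints []string
}

type nodeInfo struct {
	NodeType           string
	BrokerNodeInfo     *brokerNodeInfo     // nil for CONTROLLER records
	ControllerNodeInfo *controllerNodeInfo // nil for BROKER records
}

// Assumed exporter ports; adjust if your cluster differs.
const (
	jmxExporterPort  = 11001
	nodeExporterPort = 11002
)

// buildTargets returns host:port targets for every node. It checks each
// pointer before dereferencing it, so a record with a nil BrokerNodeInfo
// no longer panics. Controller nodes only get the JMX Exporter target.
func buildTargets(nodes []nodeInfo, jmx, nodeExporter bool) []string {
	var targets []string
	for _, n := range nodes {
		switch {
		case n.BrokerNodeInfo != nil:
			for _, ep := range n.BrokerNodeInfo.Endpoints {
				if jmx {
					targets = append(targets, fmt.Sprintf("%s:%d", ep, jmxExporterPort))
				}
				if nodeExporter {
					targets = append(targets, fmt.Sprintf("%s:%d", ep, nodeExporterPort))
				}
			}
		case n.ControllerNodeInfo != nil:
			if !jmx {
				continue
			}
			for _, ep := range n.ControllerNodeInfo.Endpoints {
				targets = append(targets, fmt.Sprintf("%s:%d", ep, jmxExporterPort))
			}
		}
	}
	return targets
}

func main() {
	nodes := []nodeInfo{
		{NodeType: "BROKER", BrokerNodeInfo: &brokerNodeInfo{Endpoints: []string{"b-1.example"}}},
		{NodeType: "CONTROLLER", ControllerNodeInfo: &controllerNodeInfo{Endpoints: []string{"c-10002.example"}}},
	}
	for _, t := range buildTargets(nodes, true, true) {
		fmt.Println(t)
	}
}
```

Switching on which info pointer is non-nil, rather than on `NodeType`, keeps the guard next to the dereference it protects. A record with neither field set is simply skipped.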
Contributor
Author
Ping @joshm91
Collaborator
Thanks for this, @errm. I've set some time aside next week to give this repo a bit of attention so I'll get your change reviewed and merged then.
Contributor
Author
Awesome! Thanks 🙇
Collaborator
Thanks again for this! Merged and pushed to