Skip to content

Conversation

@mustafaabdullahk
Copy link
Contributor

@mustafaabdullahk mustafaabdullahk commented May 15, 2024

Hey Zephyr Community! 🚀

I'm excited to announce the launch of our latest addition to the Zephyr ecosystem: the Zephyr Prometheus Library! 🎉

What is Zephyr Prometheus Library?

Zephyr Prometheus is a lightweight library designed to seamlessly integrate Prometheus metrics into your Zephyr projects. Whether you're developing for IoT, edge computing, or any other embedded application with Zephyr, monitoring and collecting metrics is now easier than ever.

Features:

📊 Effortless Integration: With just a few lines of code, you can start exposing custom metrics from your Zephyr applications.

🔍 Flexible Metrics Collection: Monitor crucial aspects of your application's performance, such as memory usage, task execution times, and custom-defined metrics.

🔒 Secure Communication: Ensure the integrity and confidentiality of your metrics data with secure communication protocols supported by Zephyr.

we'r talk about that pr following discussion: #72127

Metric Server

prom-metrics-screen

Grafana

grafana-screen

Signed-off-by: Mustafa Abdullah Kus [email protected]

@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch 4 times, most recently from cf94baa to b806b91 Compare May 15, 2024 15:33
Copy link
Contributor

@bjarki-andreasen bjarki-andreasen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some notes:

  • The pm_ namespace is used for "power management", maybe update it to net_pm, or pmts_? or simply prometheus_
  • I think prometheous may be better placed in subsys/mgmt alongside integrations like mcumgr, rather than net, where we have MQTT, IPV4 etc

@mustafaabdullahk
Copy link
Contributor Author

Looks good, some notes:

* The `pm_` namespace is used for "power management", maybe update it to `net_pm`, or `pmts_`? or simply `prometheus_`

* I think prometheous may be better placed in `subsys/mgmt` alongside integrations like mcumgr, rather than `net`, where we have MQTT, IPV4 etc
  • i will change with prometheus instead of others, first of all i did think that name but give up that name because its too long. if that namespace conflict with power management, should be change. i guess best way change with prometheus namespace.
  • i dont sure about place. mr @jukkar (zephyr prometheus library #72127) did suggest to me that place. if best place, i can change with gladly :)

Copy link
Member

@jukkar jukkar May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We normally use typedefs only for function pointers in Zephyr and avoid using typedefs for structs. So with this in mind you could just have struct pm_collector_node { ... } here. Similar comment to all the other typedefs you have in the header files.

Also, all the structs and their fields need to be documented if they are considered to be "public" i.e., used by the application. For "private" structs, you should use /** @cond INTERNAL_HIDDEN */ ... /** @endcond */ to prevent the struct from showing in the generated documentation. See examples in other header files. It would be good to document all structs regardless of their public/private status so that the usage of them can be understood by casual reader / reviewer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the pm_ prefix, I think we could use prometheus_ as suggested by @bjarki-trackunit. It is a bit longish but we avoid confusion with power mgmt APIs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better Prometheus ...

@jukkar
Copy link
Member

jukkar commented May 16, 2024

About the location, I initially thought the net for this but as @bjarki-trackunit mentioned, the mgmt is also possible. I do not have strong opinion about this, what people in the review list think?

@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch from b806b91 to 25ffd4a Compare May 20, 2024 15:13
@mustafaabdullahk
Copy link
Contributor Author

About the location, I initially thought the net for this but as @bjarki-trackunit mentioned, the mgmt is also possible. I do not have strong opinion about this, what people in the review list think?

I waiting to opinion for re-place library components. For now i did solve @jukkar reviewed points.

@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch 2 times, most recently from 4eb172e to 1d174ce Compare May 22, 2024 13:19
Copy link
Member

@jukkar jukkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not finish the review yet, but some preliminary comments added.
As a general note I am not a big fan of the dynamic memory we are using all over the place. We have been trying to minimize the usage of malloc in the network stack, but this is not possible always but we should try to minimize malloc use (because if malloc is not used, then there is no possibility to memory leaks).

Comment on lines 20 to 33
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are part of the API that applications will use, then these needs to be documented so that the generated documentation contains useful information. Same comment to all structs in the header files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add detailed comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of our own list pointers, would it make sense to use the slist API here, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to use slist apis at first. I will update with slist instead of my linked list implementation but I should be try. We should will measure overhead between implementations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use strncmp() to make sure we do not read past the end of buffer

Copy link
Member

@jukkar jukkar May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite large and will use lot of stack, do we need this long tmp buffer?

@jukkar
Copy link
Member

jukkar commented May 22, 2024

Also we will need some tests for these, could be placed for example to tests/net/lib/prometheus

@mustafaabdullahk
Copy link
Contributor Author

Also we will need some tests for these, could be placed for example to tests/net/lib/prometheus

We did decide about library place? Can i integrate to under net/lib for now ?

@jukkar
Copy link
Member

jukkar commented May 23, 2024

We did decide about library place? Can i integrate to under net/lib for now ?

I think there is no consensus yet, it should be trivial change if the location is changed, just some git mv

@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch 2 times, most recently from fe4fe4b to fa57e53 Compare May 29, 2024 12:52
@mustafaabdullahk
Copy link
Contributor Author

Did not finish the review yet, but some preliminary comments added. As a general note I am not a big fan of the dynamic memory we are using all over the place. We have been trying to minimize the usage of malloc in the network stack, but this is not possible always but we should try to minimize malloc use (because if malloc is not used, then there is no possibility to memory leaks).

Hi @jukkar. I did completely remove k_malloc and various allocation so library evolve to compile time definitions. As far as i see try to implement iterable sections. If the implementation good, i will start to write test.

@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch from fa57e53 to 123da84 Compare May 30, 2024 15:20
@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch 2 times, most recently from b1b5248 to 306d73f Compare September 30, 2024 20:42
@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch 2 times, most recently from 5918a49 to f9fc954 Compare October 3, 2024 20:27
@jukkar
Copy link
Member

jukkar commented Oct 4, 2024

There is one compliance error that we could fix

 OPEN_BRACE: that open brace { should be on the previous line
File:samples/net/prometheus/src/main.c
Line:76
Compliance error, check for error messages in the "Run Compliance Tests" step
You can run this step locally with the ./scripts/ci/check_compliance.py script.
Error: Process completed with exit code 1.

@mustafaabdullahk
Copy link
Contributor Author

There is one compliance error that we could fix

 OPEN_BRACE: that open brace { should be on the previous line
File:samples/net/prometheus/src/main.c
Line:76
Compliance error, check for error messages in the "Run Compliance Tests" step
You can run this step locally with the ./scripts/ci/check_compliance.py script.
Error: Process completed with exit code 1.

I just apply clang-format. Check complience pipeline give error following type.

struct http_resource_detail_dynamic dyn_resource_detail = {
	.common =
		{
			.type = HTTP_RESOURCE_TYPE_DYNAMIC,
			.bitmask_of_supported_http_methods = BIT(HTTP_GET),
			.content_type = "text/plain",
		},
	.cb = dyn_handler,
	.user_data = NULL,
};

but if i move to bracket previous line like following i give warning (https://github.com/zephyrproject-rtos/zephyr/actions/runs/11113829987). Which one is true?:

struct http_resource_detail_dynamic dyn_resource_detail = {
	.common = {
			.type = HTTP_RESOURCE_TYPE_DYNAMIC,
			.bitmask_of_supported_http_methods = BIT(HTTP_GET),
			.content_type = "text/plain",
		},
	.cb = dyn_handler,
	.user_data = NULL,
};

@pdgendt
Copy link
Contributor

pdgendt commented Oct 4, 2024

Clang format annotations are non-blocking, ideally this is fixed in the .clang-format file. But as this is outside the scope of the PR, it's safe to ignore.

@jukkar
Copy link
Member

jukkar commented Oct 4, 2024

As Pieter said, just ignore the clang- format suggestions for now.

@mustafaabdullahk
Copy link
Contributor Author

@josuah gentle ping :)

Copy link
Contributor

@josuah josuah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience, I was traveling but I am back now.
Some very minor changes proposed, but it looks very close now.
Thank you for all the iterations!

The library provides Prometheus metrics
types, collector and exposion formatter.
The library isn't thread-safe for now.
The next first pull request will support
that. Can be use exposion formatted
output with Zephyr Http server.

Signed-off-by: Mustafa Abdullah Kus <[email protected]>
The sample uses the Zephyr HTTP server library
and demonstrates the Prometheus metric
server node. Prometheus client library
runs as a pull method. The sample contains
the HTTP request counter and increases
when refresh path of '/metrics' from the browser.

Signed-off-by: Mustafa Abdullah Kus <[email protected]>
Tests for prometheus library support.
It contains integration and unit test
each metric types.

Signed-off-by: Mustafa Abdullah Kus <[email protected]>
@mustafaabdullahk mustafaabdullahk force-pushed the mustafaabdullahk/prometheus-library branch from f9fc954 to 0f1ccd3 Compare October 12, 2024 20:17
Copy link
Contributor

@josuah josuah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, everything got fixed and addressed!
One minor suggestion, but does not hurt in any way.
Thank you for all the changes!

@jukkar jukkar requested a review from rlubos October 14, 2024 13:08
@jukkar
Copy link
Member

jukkar commented Oct 14, 2024

@nashif You had -1 to this, could you take a look again and approve. Also we need help to force push this because the PR contains binary certificate files and the compliance script does not accept them.

@nashif nashif dismissed their stale review October 14, 2024 16:46

addressed

@carlescufi carlescufi merged commit abea54e into zephyrproject-rtos:main Oct 18, 2024
24 of 25 checks passed
@bjarki-andreasen
Copy link
Contributor

congratulations! 🍾

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.