Add option to negotiate UTF-8 names #240

Merged: beorn7 merged 1 commit into master from beorn7/utf-8 on Oct 14, 2025
Conversation

beorn7 (Member) commented Oct 2, 2025

Due to its simplistic design, prom2json has always tolerated any characters in metric and label names. (It does not validate anything, just puts strings into JSON objects…)

However, a compliant scrape target will only serve unescaped UTF-8 in names if asked to do so. This commit adds a CLI option to make prom2json ask for it.

We cannot set this option by default without a major release because users might depend on the escaping.

I implemented this in a slightly clunky way to not change the library API (although we use semver for the user API, not the library API here – I still thought we can be nice at a relatively small price).

@beorn7 beorn7 requested review from krajorama and ywwg October 2, 2025 13:54
krajorama (Member) left a comment

There are more escaping schemes: https://prometheus.io/docs/instrumenting/escaping_schemes/

If, as you say, prom2json doesn't actually care about the names, why single out UTF-8? Might as well enumerate the possible values in the option?

beorn7 (Member, Author) commented Oct 7, 2025

True. It didn't occur to me that people would like to pick other schemes (just that we cannot just make utf-8 the default). But you are right, maybe some people really want a particular escaping in their JSON. Will update.

beorn7 (Member, Author) commented Oct 7, 2025

All addressed. PHAL.

Due to its simplistic design, prom2json has always tolerated any
characters in metric and label names. (It does not validate anything,
just puts strings into JSON objects…)

However, a compliant scrape target will only serve unescaped UTF-8 in
names if asked to do so. This commit adds a CLI option to add the
'escaping=SCHEME' parameter in content negotiation.

We cannot set this option to 'allow-utf-8' by default without a major
release because users might depend on the default (underscore)
escaping.

I implemented this in a slightly clunky way to not change the library
API (although we use semver for the user API, not the library API here
– I still thought we can be nice at a relatively small price).

Signed-off-by: beorn7 <beorn@grafana.com>
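
For readers unfamiliar with the 'escaping=SCHEME' parameter mentioned in the commit message above, here is a minimal sketch of how such a parameter could be attached to an Accept header. It is not the code from this PR. The function name withEscaping and the base header in main are made up for illustration, and the scheme names are the ones listed on the Prometheus escaping-schemes page linked earlier in the thread. An empty scheme leaves the header untouched, so the target falls back to its default escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// validSchemes holds the scheme names from the Prometheus escaping-schemes
// documentation. Treating this list as exhaustive is an assumption.
var validSchemes = map[string]bool{
	"allow-utf-8": true,
	"underscores": true,
	"dots":        true,
	"values":      true,
}

// withEscaping (hypothetical helper) adds ";escaping=SCHEME" to every media
// range in an Accept header value. An empty scheme returns the header as-is.
func withEscaping(accept, scheme string) (string, error) {
	if scheme == "" {
		return accept, nil
	}
	if !validSchemes[scheme] {
		return "", fmt.Errorf("unknown escaping scheme %q", scheme)
	}
	parts := strings.Split(accept, ",")
	for k, p := range parts {
		p = strings.TrimSpace(p)
		// Insert before the q-value so that q stays the last parameter.
		if i := strings.Index(p, ";q="); i >= 0 {
			parts[k] = p[:i] + ";escaping=" + scheme + p[i:]
		} else {
			parts[k] = p + ";escaping=" + scheme
		}
	}
	return strings.Join(parts, ","), nil
}

func main() {
	// Illustrative base header, not necessarily what prom2json sends.
	base := "application/openmetrics-text;version=1.0.0;q=0.5,text/plain;version=1.0.0"
	header, err := withEscaping(base, "allow-utf-8")
	if err != nil {
		panic(err)
	}
	fmt.Println(header)
}
```

Validating against a fixed list mirrors the review suggestion to enumerate the possible values. A tool that simply passes the string through would also work, at the cost of less helpful errors for typos.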
krajorama (Member) left a comment

Approved with comment.

@beorn7 beorn7 merged commit 3b6ccc6 into master Oct 14, 2025
5 checks passed
@beorn7 beorn7 deleted the beorn7/utf-8 branch October 14, 2025 14:18