enhancement: use frontend MaxExemplars config as single source of truth for exemplar limits#6515

Merged
zhxiaogg merged 6 commits into grafana:main from
zhxiaogg:zhxiaogg/use-max-exemplar-limit-in-frontend
Feb 25, 2026

Conversation

@zhxiaogg
Contributor

@zhxiaogg zhxiaogg commented Feb 19, 2026

What this PR does:

Removes the hardcoded maxExemplars = 100 from engine_metrics.go and uses the configurable limit from the frontend for metrics range queries instead.

Without this fix, clients can never receive more than 100 exemplars, regardless of server configuration.

Test:

  • Set MaxExemplars = 300 in frontend configuration
  • Verified that the previous logic returns ~90 exemplars while the new logic returns ~270 exemplars; see the screenshots below for details.

Before:
Screenshot 2026-02-19 at 8 46 04 AM

After:
Screenshot 2026-02-19 at 8 50 29 AM

Which issue(s) this PR fixes:
Fixes #5166

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@CLAassistant

CLAassistant commented Feb 19, 2026

CLA assistant check
All committers have signed the CLA.

@zhxiaogg zhxiaogg enabled auto-merge (squash) February 19, 2026 21:20
// newBucketSet creates a new bucket set for the given time range
// start and end are in nanoseconds
func newBucketSet(exemplars uint32, start, end uint64) *limitedBucketSet {
	if exemplars > maxExemplars || exemplars == 0 {
Contributor

Removing this can be dangerous. This code can be reached from the queriers without passing through the clamping code in the frontend.

Contributor Author

I had assumed that all requests sent to the queriers come from the frontend. Which other requests are you referring to here?

My thinking is to cap the exemplars at the earliest stage, e.g. the frontend, before the value is used by downstream code.

Contributor Author

Discussed in person. Some queries are sent to the querier directly.

Updated the PR to cap req.Exemplars at the entry points of the TraceQL engine, which are CompileMetricsQueryRange and CompileMetricsQueryRangeNonRaw.
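
For readers following the thread, the capping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: clampExemplars is a hypothetical helper, and treating a request of 0 the same as an over-limit request is an assumption based on the condition quoted from newBucketSet.

```go
package main

import "fmt"

// clampExemplars is a hypothetical helper (not a Tempo function).
// It bounds the requested exemplar count by the configured limit.
// Assumption: a request of 0 falls back to the limit, mirroring the
// "exemplars > maxExemplars || exemplars == 0" condition quoted above.
func clampExemplars(requested, limit uint32) uint32 {
	if requested == 0 || requested > limit {
		return limit
	}
	return requested
}

func main() {
	// 300 matches the MaxExemplars value used in the PR's manual test.
	const configuredMax uint32 = 300
	for _, req := range []uint32{0, 90, 300, 5000} {
		fmt.Printf("requested=%d effective=%d\n", req, clampExemplars(req, configuredMax))
	}
}
```

Applying such a clamp at the engine entry points, rather than only in the frontend, would also cover requests that reach the querier directly.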

@zhxiaogg zhxiaogg force-pushed the zhxiaogg/use-max-exemplar-limit-in-frontend branch 4 times, most recently from 737a05b to 97011d8 Compare February 23, 2026 21:12
Contributor

This can now lead to a division-by-zero panic.

With a very small interval and a large number of buckets, bucketWidth can be zero.

Later, in the bucket() method, bucketWidth is used as the divisor.

Example:

interval = 1000ms
buckets = 1500
bucketWidth = 1000/1500 = 0, because this is integer division.

We need to handle that case.
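
The arithmetic in this example can be reproduced with a small sketch. The helpers bucketWidth and bucketIndex below are hypothetical stand-ins, not the functions in engine_metrics.go, and flooring the width at 1 is only one possible guard; it is not necessarily the fix that was merged.

```go
package main

import "fmt"

// bucketWidth is a hypothetical helper that divides an interval into
// equally sized buckets using integer arithmetic.
// Assumption: a width that would truncate to 0 is floored to 1 so that
// it can safely be used as a divisor later on.
func bucketWidth(start, end uint64, buckets uint32) uint64 {
	if buckets == 0 || end <= start {
		return 1
	}
	w := (end - start) / uint64(buckets) // integer division truncates
	if w == 0 {
		return 1
	}
	return w
}

// bucketIndex maps a timestamp to a bucket. It assumes ts >= start and
// width > 0; a zero width would panic with "integer divide by zero".
func bucketIndex(ts, start, width uint64) uint64 {
	return (ts - start) / width
}

func main() {
	// Numbers from the review comment: interval 1000, 1500 buckets.
	fmt.Println("raw width:", uint64(1000)/uint64(1500))
	w := bucketWidth(0, 1000, 1500)
	fmt.Println("guarded width:", w)
	fmt.Println("bucket for ts=500:", bucketIndex(500, 0, w))
}
```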

Contributor Author

Wow, thanks for catching this!

// its not present then we look to see if the user provided `with(exemplars=false)`
exemplars := int(req.Exemplars)
if v, ok := expr.Hints.GetInt(HintExemplars, allowUnsafeQueryHints); ok {
	exemplars = v
Contributor

This overwrites the maxExemplars cap when the hint is used.

Contributor Author

This is a separate known issue; I will fix it in a follow-up PR if possible.
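
To illustrate the concern in this thread, here is a rough sketch of how a hint value could be bounded after it is read. It is not the follow-up fix: resolveExemplars is a hypothetical function, and the hint and hasHint parameters merely model the result of the hint lookup quoted above.

```go
package main

import "fmt"

// resolveExemplars is a hypothetical function (not Tempo code).
// It starts from the requested count, lets a query hint override it,
// and only then applies the cap, so the hint cannot exceed the limit.
// Assumption: negative hint values are treated as 0.
func resolveExemplars(requested uint32, hint int, hasHint bool, limit uint32) uint32 {
	exemplars := int64(requested)
	if hasHint {
		exemplars = int64(hint)
	}
	if exemplars < 0 {
		exemplars = 0
	}
	if exemplars > int64(limit) {
		exemplars = int64(limit)
	}
	return uint32(exemplars)
}

func main() {
	// Without a hint the request passes through; with a large hint the
	// cap still applies because it runs after the override.
	fmt.Println(resolveExemplars(100, 0, false, 300))
	fmt.Println(resolveExemplars(100, 100000, true, 300))
}
```

The design point is ordering: if the cap runs before the hint is read, the hint wins; if it runs afterwards, the configured limit wins.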

@zhxiaogg zhxiaogg force-pushed the zhxiaogg/use-max-exemplar-limit-in-frontend branch from 97011d8 to 9977c3e Compare February 24, 2026 17:59
Contributor

@javiermolinar javiermolinar left a comment

LGTM!

@zhxiaogg zhxiaogg merged commit 401698d into grafana:main Feb 25, 2026
24 checks passed
@zhxiaogg zhxiaogg deleted the zhxiaogg/use-max-exemplar-limit-in-frontend branch February 25, 2026 15:19
zalegrala pushed a commit to zalegrala/tempo that referenced this pull request Feb 27, 2026
@tempo-ci-app

tempo-ci-app bot commented Mar 10, 2026

The backport to release-v2.10 failed:

error cherry-picking: error running git cherry-pick: error running command 'git cherry-pick -x 401698df93a3f6337f8aa959bf1868494aca30c9'
error: exit status 1
stdout: Auto-merging CHANGELOG.md
Auto-merging modules/frontend/metrics_query_range_handler.go
CONFLICT (content): Merge conflict in modules/frontend/metrics_query_range_handler.go
Auto-merging modules/frontend/metrics_query_range_sharder.go
Auto-merging pkg/traceql/engine_metrics.go
CONFLICT (content): Merge conflict in pkg/traceql/engine_metrics.go
Auto-merging pkg/traceql/engine_metrics_test.go

stderr: error: could not apply 401698df9... enhancement: use frontend MaxExemplars config as single source of truth for exemplar limits (#6515)
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
hint: Disable this message with "git config set advice.mergeConflict false"

To backport manually, run these commands in your terminal:

git fetch
git switch --create backport-6515-to-release-v2.10 origin/release-v2.10
git cherry-pick -x 401698df93a3f6337f8aa959bf1868494aca30c9

Resolve the conflicts, then add the changes and run git cherry-pick --continue:

git add . && git cherry-pick --continue

If you have the GitHub CLI installed:

git push --set-upstream origin backport-6515-to-release-v2.10
PR_BODY=$(gh pr view 6515 --json body --template 'Backport 401698df93a3f6337f8aa959bf1868494aca30c9 from #6515{{ "\n\n---\n\n" }}{{ index . "body" }}')
echo "${PR_BODY}" | gh pr create --title '[release-v2.10] enhancement: use frontend MaxExemplars config as single source of truth for exemplar limits' --body-file - --label 'backport' --label '' --base release-v2.10 --web

Or, if you don't have the GitHub CLI installed (we recommend you install it!):

git push --set-upstream origin backport-6515-to-release-v2.10

And open a pull request where the base branch is release-v2.10 and the compare/head branch is backport-6515-to-release-v2.10

zhxiaogg added a commit to zhxiaogg/tempo that referenced this pull request Mar 10, 2026
Development

Successfully merging this pull request may close these issues.

Use query_frontend.metrics.max_exemplars instead of hardcoded limit

3 participants