ref(explorer): update issue tags and timeseries by time range#106316
Conversation
```python
return get_all_tags_overview(group, tag_keys=tag_keys)
```
Does this replicate the "other" value that the tagstore endpoint produces? We should make sure the output formats are the same in general.
Yeah, "other" is just the difference between `TagKey.count` and `sum(topValues.count)`. By using the total event count of the time range, I'm leaving room for "other" (a possible overestimate if not all events contain the key); if I used `sum(topValues.count)` instead, there would never be an "other".
@roaga the current autofix tag percentages use `sum(count)` over `top_values` as the denominator. The overall key count isn't used for anything except a display in Seer-formatted output. In most cases this sum is close enough to the key's count, so we can use it as an approximation or just omit it; does that make sense?
Another approximation would be the total event count for the time range.
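As a sketch of the approximation being discussed (the function name and dict field names are illustrative, not the actual response shape), the "other" bucket falls out of the total count minus the top values:

```python
def compute_other_count(total_count: int, top_values: list[dict]) -> int:
    """Estimate the "other" bucket as the total event count for the time
    range minus the sum of the top values' counts. May overestimate when
    not every event carries the tag key; using sum(topValues.count) as
    the denominator instead would always make "other" zero."""
    top_sum = sum(v.get("count", 0) for v in top_values)
    return max(total_count - top_sum, 0)

# 100 total events, top values cover 80 of them
print(compute_other_count(100, [{"count": 50}, {"count": 30}]))  # 20
```

The `max(..., 0)` clamp guards the case where the top values sum past the total, which can happen when the two numbers come from different queries.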
```python
), f"Expected total count {expected_total}, got {total_count}"

class TestGetGroupTagsOverview(APITestCase, SnubaTestCase):
```
Would like to see tests for cases where there are many values and we would get "other", and for cases with "(empty)" values; also generally verifying that both approaches return the same thing when the time range is the same.
```python
def get_retention_boundary(organization: Organization, has_timezone: bool) -> datetime:
    """Get the minimum datetime within retention, based on current time."""
    retention_days = quotas.backend.get_event_retention(organization=organization) or 90
    now = datetime.now(UTC) if has_timezone else datetime.now(UTC).replace(tzinfo=None)
    return now - timedelta(days=retention_days)
```
The event retention actually changes based on the event type now. Is this specifically for errors?
From the `get_event_retention` function, it looks like this is the org-level policy, not for any specific event type. By event type do you mean the data category, or errors vs. performance?
```
Returns the retention for events in the given organization in days.
Returns ``None`` if events are to be stored indefinitely.

:param organization: The organization model.
:param category: Return the retention policy for this data category.
    If this is not given, return the org-level policy.
```
src/sentry/seer/explorer/tools.py (Outdated)
```python
start_dt = datetime.fromisoformat(start) if start else None
end_dt = datetime.fromisoformat(end) if end else None
start_dt, end_dt = get_group_date_range(group, organization, start_dt, end_dt)
if start_dt >= end_dt:
```
Is this possible in practice? If so, when does it happen? Does it make sense to just swap the two and add some buffer to force them into a valid range?
Logging a warning is fine too; just trying to understand why this exists.
It happens when the passed start/end are both out of the retention range, or when the first and last seen are both out of range. I think I'd rather have the util raise in these cases; updating...
```python
        selected_period, selected_delta, interval = p, d, i
        break
stats_period = stats_period or "90d"
selected_period = selected_period or "90d"
```
Defaulting to 90d seems very expensive. Is this necessary?
This line is mainly for the type checker and shouldn't happen in practice with retention clamping. `selected_period` is the minimum period (24h, 7d, 14d, 30d, 90d) that is >= `end_dt - start_dt`, so it would only be 90d if we pass a really long date range.
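The selection logic described above reads roughly like this (the candidate list comes from the comment; interval values are omitted for brevity, and names are illustrative):

```python
from datetime import datetime, timedelta

# Candidate periods, smallest first.
PERIODS = [
    ("24h", timedelta(hours=24)),
    ("7d", timedelta(days=7)),
    ("14d", timedelta(days=14)),
    ("30d", timedelta(days=30)),
    ("90d", timedelta(days=90)),
]

def select_stats_period(start_dt: datetime, end_dt: datetime) -> str:
    """Pick the smallest predefined period covering end_dt - start_dt.
    The trailing "90d" fallback exists mainly for the type checker and
    for ranges longer than 90 days, which retention clamping should
    rule out in practice."""
    span = end_dt - start_dt
    for period, delta in PERIODS:
        if delta >= span:
            return period
    return "90d"
```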
```python
# Aggregate query for total events with the tag key in the time range.
# TODO: assess performance, parallelize or find an alternative
# As an estimate, could make a single query and no has filter
count_result = execute_table_query(
    org_id=organization.id,
    dataset=dataset,
    fields=["count()"],
    query=f"issue:{group.qualified_short_id} has:{key}",
    project_ids=[group.project_id],
    start=start,
    end=end,
    per_page=1,
)
```
This is a very expensive way to query tag frequencies, as you'll be reading the tags column (which can be huge) repeatedly. Looks like the max is 100? We should consider other ways of doing this.
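One cheaper alternative, hinted at by the code's own TODO ("could make a single query and no has filter"): run one issue-wide total-count query and approximate each key's count from its top values, capped by that total. Names here are illustrative, not the actual implementation:

```python
def estimate_key_counts(
    total_events: int, top_values_by_key: dict[str, list[dict]]
) -> dict[str, int]:
    """Approximate each tag key's event count as the sum of its top
    values' counts, capped by a single issue-wide total, instead of
    issuing one has:<key> count query per key (up to ~100 scans of the
    tags column)."""
    return {
        key: min(sum(v.get("count", 0) for v in values), total_events)
        for key, values in top_values_by_key.items()
    }
```

This trades accuracy for a single aggregate query: keys whose long tail isn't in the top values get undercounted, but the denominator matches what the autofix percentages already use.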
```python
    end=end,
    per_page=1,
)
total_count = (count_result or {}).get("data", [{}])[0].get("count()", 0)
```
Bug: the code may raise an `IndexError` when `execute_table_query` returns a response with an empty data list, as `[0]` is accessed on an empty list.
Severity: MEDIUM
Suggested fix: retrieve the list first, then check that it is non-empty before accessing its first element, e.g. `data = (count_result or {}).get("data")` followed by `total_count = data[0].get("count()", 0) if data else 0`.
Location: src/sentry/seer/explorer/tools.py#L972
Cursor Bugbot has reviewed your changes and found 1 potential issue.
```python
    end=end,
    per_page=1,
)
total_count = (count_result or {}).get("data", [{}])[0].get("count()", 0)
```
Empty data array causes IndexError crash (low severity). The expression `(count_result or {}).get("data", [{}])[0]` will raise `IndexError` if `count_result` contains `{"data": []}`: the default `[{}]` is only used when the `"data"` key is missing, not when it exists but is empty. If `execute_table_query` ever returns an empty data array, this line crashes.
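The suggested guard, as a small sketch (the function name is illustrative; the logic is the fix both bots propose):

```python
def extract_total_count(count_result) -> int:
    """Read count() from a table-query result, treating a missing
    result, a missing "data" key, or an empty data list as zero
    instead of letting [0] raise IndexError on an empty list."""
    data = (count_result or {}).get("data") or []
    return data[0].get("count()", 0) if data else 0

print(extract_total_count({"data": []}))               # 0
print(extract_total_count(None))                       # 0
print(extract_total_count({"data": [{"count()": 42}]}))  # 42
```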
Updates the issue tag distribution and timeseries to respect the optional start/end params passed by the agent. It does this by hitting the events-facets endpoint for topValues and running an aggregate table query for the total event count; the response is reshaped and passed to the same autofix tags-overview util. If start/end isn't passed, we still do the original tagstore query, as it's more performant.
The number of facets (tags) returned is limited to 1000, which matches tagstore.
todo: update tool code or desc?