feat(Jira Integration): use grouplabels to calculate hash #4677
…roupKey function Signed-off-by: Holger Waschke <[email protected]>
36a4f47 to fbd25c2
I'm not sure this is true when you use |
Good point — but that shouldn’t be an issue, since the JQL search is always scoped to a single Jira project when looking for existing issues.
If two routes with different label matchers target the same Jira project and produce the same hash, that means they share the same group labels, so reusing the existing issue is the expected and desired behavior.
The hash is derived from the GroupKey, I understand? What happens to the Jira issue when there are two groups with the same GroupKey but different alerts? Won't they overwrite each other on the same issue? That's why the RouteKey is important, iirc.
Just to clarify: Alertmanager still uses the groupKey to deliver alerts to its integrations. Once the Jira integration receives the alert, it only uses the hash to search for existing issues.
Sorry I'm getting very confused. In this PR, you replaced the GroupKey with GroupLabels? But GroupLabels are not globally unique, so I believe with this change you will have hash collisions.
There are two different layers here. The first is the global Alertmanager dispatch pipeline to its integrations. This is unchanged: it still uses the GroupKey to create the aggrGroup (route matchers + grouping labels). The second layer is the Jira integration itself. Here I changed the GroupKey to GroupLabels for calculating its hash, but the hash is only used to search for existing issues over the Jira API, so that existing issues can be reused for deduplication (and this API search is scoped to the Jira project configured in the route config anyway). This means that, right now, no issues are updated outside the Jira project configured in the route config itself. IMHO the hash shouldn't differ just because the alerts arrived via different routes. If the group labels are the same, it is logically the same issue.
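To make the two layers concrete, here is a minimal Go sketch of the difference between hashing a full group key and hashing only the group labels. It is an illustration under stated assumptions, not the integration's code: `labelsString` and `hashOf` are hypothetical helpers, SHA-256 with hex output is an arbitrary choice, and the group-key strings only imitate the `route key:group labels` shape of the example result quoted in the PR description.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// labelsString renders labels in sorted key order so the output is
// deterministic. Hypothetical helper, not an Alertmanager function.
func labelsString(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return "{" + strings.Join(parts, ", ") + "}"
}

// hashOf returns a hex-encoded SHA-256 digest. The hash function and
// encoding used by the real integration are assumptions here.
func hashOf(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

func main() {
	groupLabels := map[string]string{"service": "api"}

	// Two hypothetical routes with different matchers but identical group labels.
	keyA := `{}/{severity="warning"}:` + labelsString(groupLabels)
	keyB := `{}/{team="ops"}:` + labelsString(groupLabels)

	// Hashing the full group key separates the two routes.
	fmt.Println("group-key hashes equal:  ", hashOf(keyA) == hashOf(keyB))

	// Hashing only the group labels ignores the route each group came from.
	labelHash := hashOf(labelsString(groupLabels))
	fmt.Println("group-label hashes equal:", labelHash == hashOf(labelsString(groupLabels)))
}
```

With the group key as input, the two routes produce different hashes; with only the group labels as input, they produce the same one. That difference is exactly what the thread is debating.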
Yep agree.
Here I think is the potential issue. Suppose you have the following configuration:

    route:
      - matchers:
          - severity=warning
        continue: true
        receiver: jira
        group_by:
          - service
      - matchers:
          - team=ops
        continue: true
        receiver: jira
        group_by:
          - service

You can have different aggregation groups with the same Group 1 Group 2 However because they have the same
At first glance, it might seem like they should map to different issues, but this behavior is actually intentional. Deduplication between alerts and Jira issues is a desired feature. You don't want a new Jira issue every time the repeat_interval triggers; instead, you want related alerts to update or reopen the same Jira issue.

That's why the choice of group_by labels is critical. In real-world configurations, you'll almost always have multiple grouping labels to ensure alerts are grouped meaningfully.

Taking your example: When an alert fires through the first route, it generates a Jira issue with a summary like: [FIRING:1] High CPU Usage production myhost01.com The issue hash is derived from the group_by labels. This is the desired and logical behavior, since both alerts refer to the same underlying problem.

Also note: each Jira receiver is tied to one Jira project and one issue type, which helps maintain consistency across alert routes. But maybe it's a good idea to enhance the documentation with these points.
Agree!
This is where I am concerned. Alertmanager doesn't work like that and doesn't force that assumption onto users. It doesn't assume that two different alert groups, from two different routes, are the same problem, which is why all Alertmanager integrations respect GroupKey as the logical separator between notifications. I don't think this change can be the default because 1.) now Jira does something different from every other integration and 2.) it becomes inconsistent when used together with another integration in the same receiver. I would much prefer allowing templating of the Jira issue labels so users can opt in to this behavior if they want it.
Got you. Can you give a bit more detail on what your desired solution would look like?
There is something I didn't understand about the original issue:
Can you help me understand what makes this true? As far as I can see there is no information in this GroupKey that identifies a Jira project, so what makes it unique to the project? |
My point here is that when you use the GroupKey to calculate the hash, you'll end up with two different hashes for the same alert in two different Jira projects. To give an example from our use case: one main Jira project with 24/7 issue handling. This is where ALL issues are created, no matter what. Before this PR: the issue is created in both Jira projects, but the hash differs. You can still build your own JQL to filter for identical issues, but you can't use the hash to search for identical issues independently of the project. After this PR: two separate issues are still created, one in each project, but they share the common hash as a Jira tag. In a future PR this could be tweaked further to configure which Jira projects should be used for deduplication.
Ah OK I think I am following. In this example there are two distinct routes creating distinct issues in two different jira projects, and because there are two distinct routes, any aggregation groups created from these routes have distinct GroupKeys.
So from Alertmanager's PoV, this is very much intentional as these are two completely distinct routes with distinct matchers. This is how Alertmanager is supposed to work as notifications are delivered based on the route and the resulting aggregation group (i.e. It sounds to me like allowing an optional custom template that replaces the |
Yes, my point is: do you really want to force this logic down to each individual integration? Would this make sense? The dispatch logic, i.e. how AM sends each individual alert to its integrations, is untouched; see this earlier comment
The hash I changed is only used within the Jira integration to generate the JQL that identifies existing issues. Implementing this with a custom template will make things more complicated for end users in the end. I'm not quite sure this is the way to go. Deduplicating Jira issues based on their logical alert grouping would make the most sense for most users, IMHO.
Yes as a default behavior.
It will still be the default behavior for most integrations. Those who have good reasons to do so may opt for the shared function HashGroupLabels. From my POV this gives the integrations more flexibility without leaving standards behind. If that's not possible, I can have a look at the templated solution.
So as I said before, I don't think it's reasonable to have a default behavior where two different aggregation groups from two different matchers re-use the same Jira issue. It's both an unexpected and a breaking change and no other Jira users have demanded it. There is also no way for users to use the old behaviour in this PR either right? For all of those reasons I think it has to be an opt-in behavior. |
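One way to picture the opt-in shape being asked for is a setting that selects the hash input and defaults to the existing GroupKey behavior. The Go sketch below is purely illustrative: the `hashSource` type, its values, and `dedupInput` are hypothetical names that exist neither in this PR nor in Alertmanager, and the reviewer's actual suggestion (templating the Jira issue labels) could look quite different.

```go
package main

import "fmt"

// hashSource is a hypothetical setting that selects what the issue hash is built from.
type hashSource string

const (
	hashFromGroupKey    hashSource = "group_key"    // existing behavior, the default
	hashFromGroupLabels hashSource = "group_labels" // opt-in: ignore the route
)

// dedupInput returns the string that would be hashed. Any value other than
// the explicit opt-in, including the empty string, falls back to the group key.
func dedupInput(src hashSource, groupKey, groupLabels string) string {
	if src == hashFromGroupLabels {
		return groupLabels
	}
	return groupKey
}

func main() {
	groupKey := `{}/{severity="warning"}:{service="api"}`
	groupLabels := `{service="api"}`

	// Unset: existing users keep the old behavior.
	fmt.Println(dedupInput("", groupKey, groupLabels))

	// Explicit opt-in: issues are matched by group labels only.
	fmt.Println(dedupInput(hashFromGroupLabels, groupKey, groupLabels))
}
```

Because an unset value falls back to the group key, existing configurations would be unaffected, which addresses the breaking-change concern raised above.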
Problem
The hash used to identify existing Jira issues is not unique across multiple Jira projects.
Root Cause
Currently, the hash is generated using the ExtractGroupKey function, which relies on both the route matchers and group labels, see here.
Example result:
"{}/{env=\"prod\"}:{alertname=\"HighErrorRate\", cluster=\"bb\", service=\"api\"}"
This makes the hash unique only within a single Jira project.
In larger environments, where the same alert may be mirrored or transferred across multiple Jira projects, you can't identify the same issue by its hash.
Change Summary
This update modifies the hash calculation to use only the group labels via the notify.GroupLabels function.
This ensures that the resulting hash uniquely identifies an issue based solely on alert labels, independent of the Jira project.
Impact
Jira issue lookup remains scoped to a single project, as defined by the JQL query:
This change does not cause cross-project updates or searches.
It lays the groundwork for a future enhancement to allow configurable multi-project issue lookups.
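As a rough illustration of what "scoped to a single project" means for the lookup, the Go sketch below builds a JQL string that always carries a project clause next to the hash label. It is not the query used by the integration: `buildJQL` and `quoteJQL` are hypothetical helpers, matching the hash through the `labels` field is an assumption based on the discussion of the hash being stored as a Jira tag, and the backslash escaping is a simplification.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteJQL wraps a value in double quotes and backslash-escapes embedded
// backslashes and quotes. Simplified, hypothetical helper.
func quoteJQL(s string) string {
	s = strings.ReplaceAll(s, `\`, `\\`)
	s = strings.ReplaceAll(s, `"`, `\"`)
	return `"` + s + `"`
}

// buildJQL returns a query restricted to one project and one hash label.
// The project clause is always present, so a hash shared between projects
// cannot match issues outside the configured project.
func buildJQL(project, hashLabel string) string {
	return fmt.Sprintf("project = %s AND labels = %s", quoteJQL(project), quoteJQL(hashLabel))
}

func main() {
	// "OPS" and the label value are placeholder inputs.
	fmt.Println(buildJQL("OPS", "ALERT{3f2a}"))
}
```

In a sketch like this, the future multi-project lookup would amount to widening the project clause, for example to a `project in (...)` list, while leaving the label clause unchanged.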
Future Considerations
A potential next step could be introducing a parameter that defines which Jira projects should be included in the update/search scope.