Commit 4bdd109

Correlating Anomalies via Temporal Overlap Similarity
OpenSearch anomalies such as service degradation, job delays, and incident bursts are represented as time intervals, not isolated points. If two detectors fire on the same incident, their anomaly intervals will overlap substantially in time, possibly with slight timestamp jitter caused by differing detector intervals, detector start times, or causal relationships between the signals. Our similarity therefore measures:
* how much the time windows overlap (after a small tolerance δ to account for jitter),
* optionally, whether the durations are consistent.

This PR implements a threshold graph plus connected components based on that similarity.

Major algorithm:
- De-dupe input anomalies by id (stable insertion order).
- For every pair (i,j):
  - Dilate both time intervals by ±delta to tolerate bucket alignment drift.
  - Require dilated overlap >= minOverlap (cheap early filter).
  - Compute temporal overlap:
    - IoU (Jaccard over time) on dilated intervals
    - Overlap coefficient (overlap / min(lenA,lenB)) for containment cases
  - Detect strong containment (ovl >= tauContain and duration ratio <= rhoMax).
  - Pick temporal term by mode:
    - IOU: use IoU
    - OVL: use overlap coefficient
    - HYBRID: if strong containment, blend ((1-lam)*IoU + lam*OVL); else use IoU
  - Compute duration penalty exp(-|durA-durB|/kappa).
  - If strong containment, relax the penalty via pow(basePen, containmentRelax) (or disable the penalty entirely when containmentRelax == 0).
  - Similarity = temporalTerm * penalty; add an undirected edge if similarity >= alpha.
- Run DFS connected-components on the threshold graph to form clusters.
- Output deterministically: sort members in each cluster by anomaly id.
- Attach an event window per cluster as [min(start), max(end)] across its members.

Testing done:
1. Unit tests
2. Tests on real-world data

Signed-off-by: Kaituo Li <kaituo@amazon.com>
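For readers who want to see the pairwise similarity from the commit message in one place, here is a minimal sketch. It is not the code from this PR: the class name `TemporalSimilaritySketch`, the constant values, and the choice to compute the duration ratio and penalty on the undilated durations are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch; names and parameter values are illustrative assumptions. */
final class TemporalSimilaritySketch {
    enum Mode { IOU, OVL, HYBRID }

    // Assumed tuning values, not the PR's defaults.
    static final long DELTA_MS = Duration.ofMinutes(1).toMillis();       // jitter tolerance (delta)
    static final long MIN_OVERLAP_MS = Duration.ofSeconds(1).toMillis(); // early filter (minOverlap)
    static final double KAPPA_MS = Duration.ofMinutes(30).toMillis();    // penalty scale (kappa)
    static final double TAU_CONTAIN = 0.8;       // tauContain
    static final double RHO_MAX = 0.5;           // rhoMax
    static final double LAM = 0.6;               // lam
    static final double CONTAINMENT_RELAX = 0.5; // containmentRelax

    static double similarity(Instant aStart, Instant aEnd, Instant bStart, Instant bEnd, Mode mode) {
        // Dilate both intervals by +/- delta.
        long as = aStart.toEpochMilli() - DELTA_MS;
        long ae = aEnd.toEpochMilli() + DELTA_MS;
        long bs = bStart.toEpochMilli() - DELTA_MS;
        long be = bEnd.toEpochMilli() + DELTA_MS;

        long overlap = Math.min(ae, be) - Math.max(as, bs);
        if (overlap < MIN_OVERLAP_MS) {
            return 0.0; // cheap early filter
        }

        long lenA = ae - as;
        long lenB = be - bs;
        double iou = (double) overlap / (lenA + lenB - overlap);
        double ovl = (double) overlap / Math.min(lenA, lenB);

        // Assumption: ratio and penalty use the original (undilated) durations,
        // which are assumed to be strictly positive.
        double durA = aEnd.toEpochMilli() - aStart.toEpochMilli();
        double durB = bEnd.toEpochMilli() - bStart.toEpochMilli();
        double ratio = Math.min(durA, durB) / Math.max(durA, durB);
        boolean strongContainment = ovl >= TAU_CONTAIN && ratio <= RHO_MAX;

        double temporal;
        switch (mode) {
            case OVL:
                temporal = ovl;
                break;
            case HYBRID:
                temporal = strongContainment ? (1 - LAM) * iou + LAM * ovl : iou;
                break;
            default:
                temporal = iou;
        }

        double penalty = Math.exp(-Math.abs(durA - durB) / KAPPA_MS);
        if (strongContainment) {
            // containmentRelax == 0 disables the penalty entirely.
            penalty = CONTAINMENT_RELAX == 0 ? 1.0 : Math.pow(penalty, CONTAINMENT_RELAX);
        }
        return temporal * penalty;
    }
}
```

With these assumed values, two identical intervals score 1.0, and a short interval fully inside a long one scores higher in HYBRID mode than in IOU mode because the overlap coefficient dominates the blend.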
1 parent 7115d64 commit 4bdd109
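The clustering step in the commit message (threshold graph, then DFS connected components, then members sorted by id) can be sketched roughly as follows. The class `ThresholdClusteringSketch` and its precomputed similarity matrix input are hypothetical; the PR may structure this differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of threshold-graph clustering; not the PR's implementation. */
final class ThresholdClusteringSketch {

    /**
     * @param ids   anomaly ids, already de-duplicated
     * @param sim   symmetric pairwise similarity matrix (assumed precomputed)
     * @param alpha edge threshold
     * @return clusters, each with its member ids sorted
     */
    static List<List<String>> cluster(List<String> ids, double[][] sim, double alpha) {
        int n = ids.size();

        // Build the undirected threshold graph as adjacency lists.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (sim[i][j] >= alpha) {
                    adj.get(i).add(j);
                    adj.get(j).add(i);
                }
            }
        }

        // Iterative DFS from every unvisited node yields one component each.
        boolean[] seen = new boolean[n];
        List<List<String>> clusters = new ArrayList<>();
        for (int start = 0; start < n; start++) {
            if (seen[start]) {
                continue;
            }
            List<String> members = new ArrayList<>();
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(start);
            seen[start] = true;
            while (!stack.isEmpty()) {
                int u = stack.pop();
                members.add(ids.get(u));
                for (int v : adj.get(u)) {
                    if (!seen[v]) {
                        seen[v] = true;
                        stack.push(v);
                    }
                }
            }
            Collections.sort(members); // deterministic output order within a cluster
            clusters.add(members);
        }
        return clusters;
    }
}
```

Note that connected components are transitive: two anomalies can land in the same cluster without being directly similar, as long as a chain of edges at or above alpha links them.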

4 files changed: +1390 -0 lines changed
CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -4,7 +4,11 @@ All notable changes to this project are documented in this file.
 Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 
 ## [Unreleased 3.x](https://github.com/opensearch-project/anomaly-detection/compare/3.4...HEAD)
+
+
 ### Features
+- Correlating Anomalies via Temporal Overlap Similarity ([#1641](https://github.com/opensearch-project/anomaly-detection/pull/1641))
+
 ### Enhancements
 ### Bug Fixes
 ### Infrastructure
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
/*
 * SPDX-License-Identifier: Apache-2.0
 *
 * The OpenSearch Contributors require contributions made to
 * this file be licensed under the Apache-2.0 license or a
 * compatible open source license.
 *
 * Modifications Copyright OpenSearch Contributors. See
 * GitHub history for details.
 *
 */
package org.opensearch.ad.correlation;

import java.time.Duration;
import java.time.Instant;
import java.util.Objects;

/**
 * Anomaly class for anomaly correlation.
 * @param id This uniquely identifies the emitting source of the anomaly (e.g., model id).
 * @param configId The id of the detector.
 * @param dataStartTime The start time of the anomaly.
 * @param dataEndTime The end time of the anomaly.
 */
public final class Anomaly {
    // e.g., model id
    private final String id;
    private final String configId;
    private final Instant dataStartTime;
    private final Instant dataEndTime;

    public Anomaly(String id, String configId, Instant dataStartTime, Instant dataEndTime) {
        this.id = Objects.requireNonNull(id, "id");
        this.configId = Objects.requireNonNull(configId, "configId");
        this.dataStartTime = Objects.requireNonNull(dataStartTime, "dataStartTime");
        this.dataEndTime = Objects.requireNonNull(dataEndTime, "dataEndTime");

        if (!dataEndTime.isAfter(dataStartTime)) {
            throw new IllegalArgumentException("dataEndTime must be after dataStartTime");
        }
    }

    public String getId() {
        return id;
    }

    public Instant getDataStartTime() {
        return dataStartTime;
    }

    public Instant getDataEndTime() {
        return dataEndTime;
    }

    public Duration getDuration() {
        return Duration.between(dataStartTime, dataEndTime);
    }

    public Instant getMidpoint() {
        long halfNanos = getDuration().toNanos() / 2;
        return dataStartTime.plusNanos(halfNanos);
    }

    public String getConfigId() {
        return configId;
    }

    @Override
    public String toString() {
        return "Anomaly{"
            + "id='"
            + id
            + '\''
            + ", configId='"
            + configId
            + '\''
            + ", dataStartTime="
            + dataStartTime
            + ", dataEndTime="
            + dataEndTime
            + '}';
    }
}
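To illustrate how objects of this shape feed the first and last steps of the commit message (de-dupe by id in stable insertion order, then an event window of [min(start), max(end)] per cluster), here is a small self-contained sketch. `EventWindowSketch` and its nested `Span` type are hypothetical stand-ins so the example compiles on its own, and keeping the first occurrence of a duplicated id is an assumption.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of de-duplication and event-window computation. */
final class EventWindowSketch {

    /** Minimal stand-in for an anomaly: an id plus a time span (illustration only). */
    static final class Span {
        final String id;
        final Instant start;
        final Instant end;

        Span(String id, Instant start, Instant end) {
            this.id = id;
            this.start = start;
            this.end = end;
        }
    }

    /** Keeps the first occurrence of each id (an assumption), preserving insertion order. */
    static List<Span> dedupeById(List<Span> input) {
        Map<String, Span> byId = new LinkedHashMap<>();
        for (Span s : input) {
            byId.putIfAbsent(s.id, s);
        }
        return new ArrayList<>(byId.values());
    }

    /** Returns {min(start), max(end)} across the members of one cluster. */
    static Instant[] eventWindow(List<Span> cluster) {
        if (cluster.isEmpty()) {
            throw new IllegalArgumentException("cluster must not be empty");
        }
        Instant min = cluster.get(0).start;
        Instant max = cluster.get(0).end;
        for (Span s : cluster) {
            if (s.start.isBefore(min)) {
                min = s.start;
            }
            if (s.end.isAfter(max)) {
                max = s.end;
            }
        }
        return new Instant[] { min, max };
    }
}
```

A `LinkedHashMap` is used because it iterates in insertion order, which keeps the de-duplicated list stable across runs.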
