Commit 88a7ff0

Correlating Anomalies via Temporal Overlap Similarity
OpenSearch anomalies such as service degradation, job delays, and incident bursts are represented as time intervals, not isolated points. If two detectors fire on the same incident, their anomaly intervals will overlap substantially in time, possibly with a little timestamp jitter caused by different detector intervals, detector start times, and causal relationships. Our similarity therefore measures:

* how much the time windows overlap (after a small tolerance δ to account for jitter),
* optionally, whether the durations are consistent.

This PR implements a threshold graph plus connected components based on that similarity.

Major algorithm:

- De-dupe input anomalies by id (stable insertion order).
- For every pair (i, j):
  - Dilate both time intervals by ±delta to tolerate bucket alignment drift.
  - Require dilated overlap >= minOverlap (cheap early filter).
  - Compute temporal overlap:
    - IoU (Jaccard over time) on dilated intervals
    - Overlap coefficient (overlap / min(lenA, lenB)) for containment cases
  - Detect strong containment (ovl >= tauContain and duration ratio <= rhoMax).
  - Pick temporal term by mode:
    - IOU: use IoU
    - OVL: use overlap coefficient
    - HYBRID: if strong containment, blend ((1-lam)*IoU + lam*OVL); else use IoU
  - Compute duration penalty exp(-|durA-durB|/kappa).
  - If strong containment, relax the penalty via pow(basePen, containmentRelax) (or disable the penalty entirely when containmentRelax == 0).
  - Similarity = temporalTerm * penalty; add an undirected edge if similarity >= alpha.
- Run DFS connected-components on the threshold graph to form clusters.
- Output deterministically: sort members in each cluster by anomaly id.
- Attach an event window per cluster as [min(start), max(end)] across its members.

Testing done:

1. UT
2. Tests on real world data

Signed-off-by: Kaituo Li <kaituo@amazon.com>
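
For readers who want to see the pairwise step concretely, here is a minimal sketch of the similarity described in the commit message. It is not the code from this PR: the class name `TemporalSimilaritySketch`, the constant names, and every parameter value are hypothetical. Two points are assumptions because the message does not spell them out: the "duration ratio" is taken as shorter duration over longer duration, and the durations used for the ratio and the penalty are the raw (undilated) ones. Both intervals are assumed to be at least one millisecond long.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative sketch only; names and values are assumptions, not the PR's API. */
final class TemporalSimilaritySketch {
    enum Mode { IOU, OVL, HYBRID }

    // Hypothetical parameter values chosen purely for illustration.
    static final Duration DELTA = Duration.ofMinutes(1);        // delta: jitter tolerance
    static final Duration MIN_OVERLAP = Duration.ofSeconds(1);  // minOverlap: early filter
    static final double TAU_CONTAIN = 0.8;                      // tauContain
    static final double RHO_MAX = 0.25;                         // rhoMax
    static final double LAM = 0.6;                              // lam: blend weight
    static final double KAPPA_MS = Duration.ofMinutes(30).toMillis(); // kappa
    static final double CONTAINMENT_RELAX = 0.5;                // containmentRelax

    static double similarity(Instant aStart, Instant aEnd, Instant bStart, Instant bEnd, Mode mode) {
        // Dilate both intervals by +/- delta.
        long d = DELTA.toMillis();
        long aLo = aStart.toEpochMilli() - d, aHi = aEnd.toEpochMilli() + d;
        long bLo = bStart.toEpochMilli() - d, bHi = bEnd.toEpochMilli() + d;

        // Cheap early filter on the dilated overlap.
        long overlap = Math.min(aHi, bHi) - Math.max(aLo, bLo);
        if (overlap < MIN_OVERLAP.toMillis()) {
            return 0.0;
        }

        // IoU and overlap coefficient on the dilated intervals.
        long lenA = aHi - aLo, lenB = bHi - bLo;
        double iou = (double) overlap / (lenA + lenB - overlap);
        double ovl = (double) overlap / Math.min(lenA, lenB);

        // Assumption: duration ratio = shorter / longer, on raw durations.
        long durA = Duration.between(aStart, aEnd).toMillis();
        long durB = Duration.between(bStart, bEnd).toMillis();
        double ratio = (double) Math.min(durA, durB) / Math.max(durA, durB);
        boolean strongContainment = ovl >= TAU_CONTAIN && ratio <= RHO_MAX;

        double temporal;
        switch (mode) {
            case OVL:
                temporal = ovl;
                break;
            case HYBRID:
                temporal = strongContainment ? (1 - LAM) * iou + LAM * ovl : iou;
                break;
            default:
                temporal = iou;
        }

        // Duration penalty, relaxed (or disabled) under strong containment.
        double penalty = Math.exp(-Math.abs(durA - durB) / KAPPA_MS);
        if (strongContainment) {
            penalty = CONTAINMENT_RELAX == 0.0 ? 1.0 : Math.pow(penalty, CONTAINMENT_RELAX);
        }
        return temporal * penalty;
    }
}
```

With this sketch, two identical intervals score 1.0, intervals separated by more than twice the tolerance score 0.0, and a short anomaly fully inside a long one scores higher in `HYBRID` mode than in `IOU` mode because the overlap coefficient is blended in.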
1 parent 7115d64 commit 88a7ff0

4 files changed, +1389 -0 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -4,7 +4,11 @@ All notable changes to this project are documented in this file.
 Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 
 ## [Unreleased 3.x](https://github.com/opensearch-project/anomaly-detection/compare/3.4...HEAD)
+
+
 ### Features
+- Correlating Anomalies via Temporal Overlap Similarity ([#1641](https://github.com/opensearch-project/anomaly-detection/pull/1641))
+
 ### Enhancements
 ### Bug Fixes
 ### Infrastructure
Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
/*
 * SPDX-License-Identifier: Apache-2.0
 *
 * The OpenSearch Contributors require contributions made to
 * this file be licensed under the Apache-2.0 license or a
 * compatible open source license.
 *
 * Modifications Copyright OpenSearch Contributors. See
 * GitHub history for details.
 *
 */
package org.opensearch.ad.correlation;

import java.time.Duration;
import java.time.Instant;
import java.util.Objects;

/**
 * Anomaly class for anomaly correlation.
 */
public final class Anomaly {
    // This uniquely identifies the emitting source of the anomaly (e.g., model id).
    private final String id;
    // The id of the detector.
    private final String configId;
    // The start time of the anomaly.
    private final Instant dataStartTime;
    // The end time of the anomaly.
    private final Instant dataEndTime;

    public Anomaly(String id, String configId, Instant dataStartTime, Instant dataEndTime) {
        this.id = Objects.requireNonNull(id, "id");
        this.configId = Objects.requireNonNull(configId, "configId");
        this.dataStartTime = Objects.requireNonNull(dataStartTime, "dataStartTime");
        this.dataEndTime = Objects.requireNonNull(dataEndTime, "dataEndTime");

        if (!dataEndTime.isAfter(dataStartTime)) {
            throw new IllegalArgumentException("dataEndTime must be after dataStartTime");
        }
    }

    public String getId() {
        return id;
    }

    public Instant getDataStartTime() {
        return dataStartTime;
    }

    public Instant getDataEndTime() {
        return dataEndTime;
    }

    public Duration getDuration() {
        return Duration.between(dataStartTime, dataEndTime);
    }

    public Instant getMidpoint() {
        long halfNanos = getDuration().toNanos() / 2;
        return dataStartTime.plusNanos(halfNanos);
    }

    public String getConfigId() {
        return configId;
    }

    @Override
    public String toString() {
        return "Anomaly{"
            + "id='"
            + id
            + '\''
            + ", configId='"
            + configId
            + '\''
            + ", dataStartTime="
            + dataStartTime
            + ", dataEndTime="
            + dataEndTime
            + '}';
    }
}
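
The clustering files from this commit are not shown on this page, so the following is only a sketch of the last steps listed in the commit message: de-duplicating by id, building the threshold graph, running DFS connected components, sorting members by id, and attaching an event window. The names `ThresholdClusteringSketch`, `Item`, and `Cluster` are hypothetical; `Item` is a stripped-down stand-in for `Anomaly` that uses epoch milliseconds, and the similarity function is passed in by the caller rather than fixed here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

/** Illustrative sketch only; not the clustering class added by this commit. */
final class ThresholdClusteringSketch {
    /** Hypothetical stand-in for Anomaly: id plus start/end in epoch millis. */
    static final class Item {
        final String id;
        final long start;
        final long end;

        Item(String id, long start, long end) {
            this.id = id;
            this.start = start;
            this.end = end;
        }
    }

    /** A cluster with its members (sorted by id) and event window. */
    static final class Cluster {
        final List<Item> members;
        final long windowStart;
        final long windowEnd;

        Cluster(List<Item> members, long windowStart, long windowEnd) {
            this.members = members;
            this.windowStart = windowStart;
            this.windowEnd = windowEnd;
        }
    }

    static List<Cluster> cluster(List<Item> input, ToDoubleBiFunction<Item, Item> similarity, double alpha) {
        // De-dupe by id, keeping the first occurrence in insertion order.
        Map<String, Item> byId = new LinkedHashMap<>();
        for (Item it : input) {
            byId.putIfAbsent(it.id, it);
        }
        List<Item> items = new ArrayList<>(byId.values());
        int n = items.size();

        // Threshold graph: undirected edge when similarity >= alpha.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (similarity.applyAsDouble(items.get(i), items.get(j)) >= alpha) {
                    adj.get(i).add(j);
                    adj.get(j).add(i);
                }
            }
        }

        // Iterative DFS over each unvisited node yields one connected component.
        boolean[] seen = new boolean[n];
        List<Cluster> clusters = new ArrayList<>();
        for (int s = 0; s < n; s++) {
            if (seen[s]) {
                continue;
            }
            List<Item> members = new ArrayList<>();
            long lo = Long.MAX_VALUE;
            long hi = Long.MIN_VALUE;
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(s);
            seen[s] = true;
            while (!stack.isEmpty()) {
                Item it = items.get(stack.peek());
                for (int v : adj.get(stack.pop())) {
                    if (!seen[v]) {
                        seen[v] = true;
                        stack.push(v);
                    }
                }
                members.add(it);
                lo = Math.min(lo, it.start); // event window = [min(start), max(end)]
                hi = Math.max(hi, it.end);
            }
            members.sort((x, y) -> x.id.compareTo(y.id)); // deterministic output
            clusters.add(new Cluster(members, lo, hi));
        }
        return clusters;
    }
}
```

Because the graph is built from all pairs, this sketch is quadratic in the number of anomalies. Connected components also mean that two anomalies can share a cluster without being directly similar, as long as a chain of edges links them.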
