ICMP Fallback (Better Status Classification) #168

## **Description**

ICMP fallback improves outage classification accuracy by automatically running an ICMP ping test whenever a TCP or HTTP probe fails. The ping result distinguishes service-level failures (application down) from network connectivity issues (host unreachable), which enables more precise root cause analysis and reduces false-positive outage alerts.

Currently, ThingConnect Pulse treats all probe failures equally: a failed HTTP probe could indicate either a web service crash or complete network isolation. ICMP fallback adds a secondary test that classifies the outage type, helping manufacturing teams understand whether an issue is a network infrastructure problem or a specific service failure.

Instead of showing a generic "TCP connection timeout" message, the system will display one of:

- 🟠 **Service Outage**: TCP port unavailable, host reachable via ICMP
- 🔴 **Network Outage**: Host completely unreachable (both TCP and ICMP fail)

## **Scope of Work**

**Core Implementation:**

- Add automatic ICMP fallback logic when TCP/HTTP probes fail
- Implement outage classification system (Network, Service, Mixed)
- Extend **`CheckResult`** model to include fallback probe results
- Update outage detection logic to consider fallback test outcomes

**Probe Enhancement:**

- Modify **`ProbeService`** to perform secondary ICMP tests on TCP/HTTP failures
- Add configurable fallback timeout (shorter than primary probe timeout)
- Ensure fallback probes respect existing concurrency limits and jitter scheduling
- Preserve original error messages while adding fallback context
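
The fallback flow described above can be sketched as follows. This is a minimal illustration in Python, not the actual `ProbeService` code: `ProbeOutcome`, `check_with_fallback`, and the fallback field names on `CheckResult` are hypothetical, and the probes are injected as plain callables so the logic can be shown without real network calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeOutcome:
    """Hypothetical result of one probe attempt (TCP, HTTP, or ICMP)."""
    success: bool
    error: Optional[str] = None


@dataclass
class CheckResult:
    """Hypothetical shape; the real CheckResult model may differ."""
    success: bool
    error: Optional[str] = None              # original primary-probe error, unchanged
    fallback_attempted: bool = False
    fallback_success: Optional[bool] = None  # None when no fallback was run
    fallback_error: Optional[str] = None


def check_with_fallback(
    primary: Callable[[], ProbeOutcome],
    icmp: Callable[[int], ProbeOutcome],
    fallback_timeout_ms: int = 500,
) -> CheckResult:
    """Run the primary TCP/HTTP probe; ping only if it fails."""
    first = primary()
    if first.success:
        # Successful probes never trigger the fallback.
        return CheckResult(success=True)

    second = icmp(fallback_timeout_ms)
    return CheckResult(
        success=False,
        error=first.error,  # preserve the original message
        fallback_attempted=True,
        fallback_success=second.success,
        fallback_error=second.error,
    )
```

Passing the probes in as callables keeps the sketch independent of how the real ICMP and TCP probes are implemented.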

**Database Schema:**

- Extend **`CheckResultRaw`** table with fallback probe fields
- Add **`OutageClassification`** enum (Network, Service, Mixed, Unknown)
- Update **`Outage`** table to store classification results
- Maintain backwards compatibility with existing data
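
One way to keep existing data valid is to add the fallback fields as nullable columns, so that rows written before the change simply read back as "no fallback recorded". The sketch below demonstrates the idea with Python's built-in `sqlite3` module and an in-memory database. The existing columns (`Status`, `Error`) and the new column names are assumptions made for illustration; the real schema and migration tooling are not specified in this issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the existing table; the real column set is an assumption.
conn.execute(
    "CREATE TABLE CheckResultRaw ("
    "Id INTEGER PRIMARY KEY, Status TEXT NOT NULL, Error TEXT)"
)
conn.execute(
    "INSERT INTO CheckResultRaw (Status, Error) "
    "VALUES ('down', 'TCP connection timeout')"
)


def add_fallback_columns(db: sqlite3.Connection) -> None:
    """Add nullable fallback columns (hypothetical names) to CheckResultRaw."""
    for ddl in (
        "ALTER TABLE CheckResultRaw ADD COLUMN FallbackAttempted INTEGER",
        "ALTER TABLE CheckResultRaw ADD COLUMN FallbackSuccess INTEGER",
        "ALTER TABLE CheckResultRaw ADD COLUMN FallbackError TEXT",
    ):
        db.execute(ddl)


add_fallback_columns(conn)

# Rows written before the migration keep their data; new columns are NULL.
old_row = conn.execute(
    "SELECT Error, FallbackAttempted, FallbackSuccess FROM CheckResultRaw"
).fetchone()
```

Because the new columns carry no `NOT NULL` constraint or default, readers can treat `NULL` as "recorded before ICMP fallback existed".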

**UI Integration:**

- Display outage classification in dashboard status indicators
- Show fallback probe results in endpoint detail views
- Add filtering by outage type in history views
- Include classification information in CSV exports

## **Acceptance Criteria**

**Functional Requirements:**

- TCP/HTTP probe failures automatically trigger ICMP fallback within 2 seconds
- ICMP fallback timeout defaults to 500ms (configurable via settings)
- Outage classification correctly identifies Network vs Service vs Mixed failures
- Fallback probes respect existing ±10% jitter and concurrency limits
- Original probe error messages preserved alongside fallback results
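
The jitter and concurrency requirements can be illustrated with a short Python sketch. It assumes that fallback probes share the same limiter as primary probes; `jittered_interval`, `ProbeLimiter`, and `DEFAULT_FALLBACK_TIMEOUT_MS` are hypothetical names, and the existing scheduler may implement both mechanisms differently.

```python
import random
import threading
from typing import Callable, TypeVar

T = TypeVar("T")

DEFAULT_FALLBACK_TIMEOUT_MS = 500  # default stated in the criteria above


def jittered_interval(
    base_seconds: float, rng: random.Random, spread: float = 0.10
) -> float:
    """Scale the base interval by a random factor in [1 - spread, 1 + spread]."""
    return base_seconds * rng.uniform(1.0 - spread, 1.0 + spread)


class ProbeLimiter:
    """Caps how many probes (primary or fallback) run at the same time."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, probe: Callable[[], T]) -> T:
        # Blocks until a slot is free, so a burst of fallback pings
        # cannot push the total number of in-flight probes past the cap.
        with self._slots:
            return probe()
```

Routing the fallback through the same `run` call as the primary probe is what makes the shared cap apply to both.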

**Classification Logic:**

- **Network Outage**: Primary probe fails, ICMP fallback fails → Host unreachable
- **Service Outage**: Primary probe fails, ICMP fallback succeeds → Service down
- **Mixed Outage**: Inconsistent results over multiple checks → Network instability
- **Unknown**: Fallback probe times out or encounters errors → Default classification
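
The rules above map onto two small functions: one that classifies a single failed check and one that combines several checks into an outage-level result. This is a sketch under stated assumptions. The function names are hypothetical, the enum uses the four names from the schema section with string values chosen for illustration, and `fallback_ok=None` stands for a fallback that could not produce a usable answer. The issue does not say how a fallback timeout is told apart from a plain ICMP failure, so that decision is left to the caller.

```python
from enum import Enum
from typing import Optional, Sequence


class OutageClassification(Enum):
    UNKNOWN = "Unknown"
    NETWORK = "Network"
    SERVICE = "Service"
    MIXED = "Mixed"


def classify_check(
    primary_ok: bool, fallback_ok: Optional[bool]
) -> Optional[OutageClassification]:
    """Classify one check; returns None when the primary probe succeeded."""
    if primary_ok:
        return None
    if fallback_ok is None:
        # Fallback gave no usable answer: default classification.
        return OutageClassification.UNKNOWN
    if fallback_ok:
        return OutageClassification.SERVICE  # host reachable, service down
    return OutageClassification.NETWORK      # host unreachable


def classify_outage(
    per_check: Sequence[OutageClassification],
) -> OutageClassification:
    """Combine per-check results; disagreement between checks means Mixed."""
    decided = {c for c in per_check if c is not OutageClassification.UNKNOWN}
    if not decided:
        return OutageClassification.UNKNOWN
    if len(decided) == 1:
        return next(iter(decided))
    return OutageClassification.MIXED
```

For example, `classify_outage([NETWORK, SERVICE])` yields `MIXED`, while a run of `SERVICE` results interrupted by a single `UNKNOWN` still yields `SERVICE`, because inconclusive checks are ignored when the remaining checks agree.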

**Performance Requirements:**

- Fallback probes add <1 second overhead to failed probe cycles
- No impact on successful probe execution (no fallback needed)
- Memory usage increases by <20 bytes per endpoint for fallback state tracking
- Database storage increases by ~30% for failed check records (fallback data)

**User Experience:**

- Dashboard shows color-coded outage types (red=Network, orange=Service, yellow=Mixed)
- Endpoint detail pages display "Last seen via ICMP" timestamps for service outages
- History charts distinguish between outage classification types
- CSV exports include OutageType and FallbackResult columns
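
As an illustration of the export and color requirements, the Python sketch below writes a CSV with the two new columns and keeps the dashboard colors in a lookup table. Only `OutageType` and `FallbackResult` come from this issue; the other column names, the sample row, and the `FallbackResult` value format are assumptions.

```python
import csv
import io
from typing import Dict, Iterable

# Dashboard colors from the criteria above.
OUTAGE_COLORS = {"Network": "red", "Service": "orange", "Mixed": "yellow"}

# Endpoint, Timestamp, and Error are hypothetical existing columns.
CSV_COLUMNS = ["Endpoint", "Timestamp", "Error", "OutageType", "FallbackResult"]


def export_csv(rows: Iterable[Dict[str, str]]) -> str:
    """Serialize check rows, including the classification columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = [
    {
        "Endpoint": "plc-01",
        "Timestamp": "2025-01-01T00:00:00Z",
        "Error": "TCP connection timeout",
        "OutageType": "Service",
        "FallbackResult": "Success",
    }
]
csv_text = export_csv(sample)
```

Appending the new columns after the existing ones keeps older consumers that read columns by position working.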

### Classification

    None = -1,             // Explicitly healthy, no outage detected
    Unknown = 0,           // Not enough information to classify
    Network = 1,           // Host completely unreachable
    Service = 2,           // Service down, host reachable
    Intermittent = 3,      // Flapping/unstable connection
    Performance = 4,       // Slow response times
    PartialService = 5,    // HTTP errors but TCP works
    DnsResolution = 6,     // DNS fails but IP works
    Congestion = 7,        // Network performance issues
    Maintenance = 8        // Planned downtime classification
