Add SNMP ping (switch health, port up/down)


### Title
Add SNMP ping (switch health, port up/down)[2]

### Background
Pulse currently supports ICMP, TCP, and HTTP reachability probes. Adding SNMP supports the plant-aware objective while staying within the availability-only scope for v1.0.[2]
The new SNMP probe must plug into the existing ProbeService → OutageDetectionService pipeline, reuse the existing timeout semantics, record RTT, and surface consistently in the API, the CSV export, and the live board.[3][1]

### Objective
Implement a new probe type, snmp, that reports UP when an SNMP GET (for sysUpTime.0 by default) succeeds within the timeout, and records the RTT. Optionally, the probe also fetches a configured OID for port status and annotates the result with it, while the primary outcome stays availability-focused.[1][2]
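
To make the "minimal GET" concrete, the following Python sketch hand-encodes an SNMP GetRequest for sysUpTime.0 (OID 1.3.6.1.2.1.1.3.0). It is illustrative only: the helper names are hypothetical, Python is not necessarily the project's language, and a real implementation would more likely rely on an established SNMP library than on hand-rolled BER encoding.

```python
# Illustrative sketch: BER-encode a v1/v2c GetRequest for a single OID.
# All function names here are hypothetical, not part of Pulse.

SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"
VERSION_CODES = {"v1": 0, "v2c": 1}  # on-the-wire version numbers


def ber_length(n: int) -> bytes:
    """Short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def tlv(tag: int, payload: bytes) -> bytes:
    return bytes([tag]) + ber_length(len(payload)) + payload


def ber_integer(value: int) -> bytes:
    """Non-negative INTEGER; the extra bit keeps the sign bit clear."""
    length = max(1, (value.bit_length() + 8) // 8)
    return tlv(0x02, value.to_bytes(length, "big"))


def ber_oid(dotted: str) -> bytes:
    arcs = [int(part) for part in dotted.split(".")]
    body = bytearray([arcs[0] * 40 + arcs[1]])  # first two arcs share a byte
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))  # continuation bit on high groups
            arc >>= 7
        body.extend(reversed(chunk))
    return tlv(0x06, bytes(body))


def build_get_request(community: str, oid: str, request_id: int,
                      version: str = "v2c") -> bytes:
    varbind = tlv(0x30, ber_oid(oid) + tlv(0x05, b""))  # OID with NULL value
    pdu = tlv(0xA0,  # GetRequest-PDU
              ber_integer(request_id)
              + ber_integer(0)   # error-status
              + ber_integer(0)   # error-index
              + tlv(0x30, varbind))
    return tlv(0x30,
               ber_integer(VERSION_CODES[version])
               + tlv(0x04, community.encode("ascii"))
               + pdu)
```

With community public and request ID 1, the encoded request is 40 bytes, which illustrates how lightweight a single-OID reachability check is compared with a walk or bulk request.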

### Scope
- Backend probe: SNMP reachability via UDP/161 using a minimal GET on a default OID (sysUpTime.0), with configurable version (v1/v2c), community, timeout, and retries. Success maps to UP and failure to DOWN, with RTT measured as the GET round trip.[1]
- Optional port check: allow an OID parameter (e.g., ifOperStatus for a specific interface index) to be fetched after reachability; availability remains based on SNMP reachability while port status is recorded in the result metadata for UI display.[1]
- YAML schema: add type: snmp with host, port (default 161), version, community, timeout, retries, and optional oid/expectedValue for port-state hints; validate and surface in Apply/diff/versioning.[4]
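
The schema fields listed above could be validated roughly as follows. This is a sketch under stated assumptions: the timeout unit (milliseconds), the timeout and retry defaults, the choice to make version required, and all Python names are hypothetical; only the field names and the default port of 161 come from this issue.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

ALLOWED_VERSIONS = ("v1", "v2c")  # SNMPv3 is a non-goal in this phase
DEFAULT_PORT = 161                # stated in this issue
ASSUMED_TIMEOUT_MS = 2000         # assumption: no default is specified
ASSUMED_RETRIES = 1               # assumption: no default is specified


@dataclass(frozen=True)
class SnmpEndpoint:
    host: str
    community: str
    version: str
    port: int
    timeout_ms: int
    retries: int
    oid: Optional[str] = None             # optional port-status OID
    expected_value: Optional[str] = None  # optional hint for that OID


def parse_snmp_endpoint(raw: Mapping[str, Any]) -> SnmpEndpoint:
    """Validate one already-parsed YAML endpoint mapping."""
    if raw.get("type") != "snmp":
        raise ValueError("not an snmp endpoint")
    for key in ("host", "community", "version"):
        if not raw.get(key):
            raise ValueError(f"missing required field: {key}")
    if raw["version"] not in ALLOWED_VERSIONS:
        raise ValueError(f"unsupported version: {raw['version']}")

    port = int(raw.get("port", DEFAULT_PORT))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")

    timeout_ms = int(raw.get("timeout", ASSUMED_TIMEOUT_MS))
    retries = int(raw.get("retries", ASSUMED_RETRIES))
    if timeout_ms <= 0 or retries < 0:
        raise ValueError("timeout must be positive and retries non-negative")

    oid = raw.get("oid")
    expected = raw.get("expectedValue")
    if expected is not None and oid is None:
        # An invalid parameter combination: nothing to compare against.
        raise ValueError("expectedValue requires oid")

    return SnmpEndpoint(
        host=str(raw["host"]),
        community=str(raw["community"]),
        version=raw["version"],
        port=port,
        timeout_ms=timeout_ms,
        retries=retries,
        oid=oid,
        expected_value=None if expected is None else str(expected),
    )
```

For a port check, oid would typically be an ifOperStatus instance such as 1.3.6.1.2.1.2.2.1.8.3 (interface index 3) with expectedValue 1, since IF-MIB defines up as 1.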

### Non-goals
- No deep device inventory, traps, or bulk walks; this is a lightweight reachability/health check appropriate for v1 availability scope.[2]
- No SNMPv3 security profiles in this phase; start with v1/v2c to minimize complexity and configuration burden.[2]

### Design Notes
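- Result mapping: Success = UP with RTT measured as the GET round trip; Failure = DOWN on timeout, no response, or auth error, with granular error categories for observability. Note that v1/v2c agents typically discard requests carrying a wrong community without replying, so a bad community usually surfaces as a timeout rather than as a distinct auth error.[1]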
- Timeouts/retries: Adopt existing per-probe timeout and retry semantics; wire through cancellation tokens and error paths consistent with other probes.[1]
- Outage flow: Feed CheckResult into OutageDetectionService unchanged to honor 2/2 flap damping and transactional outage open/close behavior.[3]
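
The result mapping and retry behaviour described in these notes can be sketched as below. The transport is injected as a plain callable so the logic stays testable; the CheckResult fields, the exception-to-category mapping, and every name other than the three error categories are assumptions for illustration, not the existing Pulse types.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CheckResult:          # hypothetical shape, not the real Pulse type
    status: str             # "UP" or "DOWN"
    rtt_ms: Optional[float]
    error: Optional[str]    # "timeout", "noResponse", "authError", or None


class SnmpAuthError(Exception):
    """Hypothetical: raised by the transport when it can detect an auth failure."""


def probe_once(send_get: Callable[[float], bytes],
               timeout_s: float,
               retries: int,
               clock: Callable[[], float] = time.monotonic) -> CheckResult:
    """Run one probe cycle: up to retries + 1 GET attempts."""
    last_error = "timeout"
    for _ in range(retries + 1):
        started = clock()
        try:
            send_get(timeout_s)
        except TimeoutError:
            last_error = "timeout"
            continue
        except ConnectionError:
            # Assumption: e.g. an ICMP port-unreachable surfaced by the socket.
            last_error = "noResponse"
            continue
        except SnmpAuthError:
            # Retrying with the same community would not help.
            return CheckResult("DOWN", None, "authError")
        # RTT covers only the attempt that succeeded, not earlier retries.
        return CheckResult("UP", (clock() - started) * 1000.0, None)
    return CheckResult("DOWN", None, last_error)
```

Measuring RTT per attempt rather than per cycle keeps the recorded value comparable with single-shot probes; whether that matches the existing probes' convention is something to confirm against probes-spec.md.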

### Tasks
- Backend
  - Implement SnmpPingProbe with reachability GET to default OID and RTT capture, plus optional fetch of a configured OID for port status annotation.[1]
  - Extend ProbeService.ProbeAsync to route type: snmp and produce standardized CheckResult with error categorization (timeout, noResponse, authError).[3]
  - Add unit tests for success, timeout, no response, bad community, and optional port OID resolution; add an integration test using a mock SNMP agent.[1]
- Configuration & Apply
  - Update config.schema.json to include enum value snmp with properties: host, port, version (v1/v2c), community, timeout, retries, and optional oid/expectedValue.[4]
  - Extend ConfigurationParser validation and the Apply diff to show additions and changes, preserve version snapshots, and warn on invalid parameter combinations.[4]
- API/UI
  - Ensure API DTOs and CSV export include probe type snmp, RTT, and optional portStatus metadata without changing outage semantics.[5]
  - Update the Configuration editor to add SNMP fields with inline validation and help text, and label SNMP endpoints distinctly in the live board and detail pages.[4]
- Docs
  - Add docs examples for snmp endpoints, defaults, version/community notes, optional port status OID, and firewall/UDP considerations.[5]
  - Note performance expectations for SNMP RTT and error categorization in probes-spec.md.[1]
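
The optional port-status annotation from the backend and API tasks above might look like the following sketch. The status names are the ifOperStatus values defined in IF-MIB; the function name and the portMatchesExpected key are hypothetical, while portStatus is the metadata name used in this issue.

```python
from typing import Any, Dict, Optional

# ifOperStatus values as defined in IF-MIB (RFC 2863).
IF_OPER_STATUS = {
    1: "up",
    2: "down",
    3: "testing",
    4: "unknown",
    5: "dormant",
    6: "notPresent",
    7: "lowerLayerDown",
}


def annotate_port_status(status: str,
                         fetched: Optional[int],
                         expected: Optional[int] = None) -> Dict[str, Any]:
    """Attach port metadata without ever changing the availability status."""
    result: Dict[str, Any] = {"status": status, "portStatus": None}
    if status != "UP" or fetched is None:
        # Device unreachable or no port OID configured: nothing to annotate.
        return result
    result["portStatus"] = IF_OPER_STATUS.get(fetched, f"other({fetched})")
    if expected is not None:
        result["portMatchesExpected"] = fetched == expected
    return result
```

The key design point is that a port reported as down still leaves the endpoint UP, so outage semantics are untouched and the port state is purely informational.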

### Acceptance Criteria
- A YAML endpoint with type: snmp applies cleanly, appears in diff/versioning, and is visible/editable in the UI with sensible defaults and validations.[4]
- SNMP endpoints report UP when a GET completes within timeout and DOWN on timeout/no response/auth error, with RTT populated and errors categorized.[1]
- Outage transitions for SNMP respect 2/2 flap damping and persist open/close events as with other probe types.[3]
- The API and CSV export show probe type snmp and RTT, and the UI clearly distinguishes SNMP endpoints and displays port status when one is configured.[5]
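
For readers less familiar with the flap-damping criterion, the sketch below shows one plausible reading of "2/2": two consecutive DOWN results open an outage and two consecutive UP results close it. That interpretation, and the class and method names, are assumptions; the authoritative behaviour is whatever OutageDetectionService already implements.

```python
from typing import Optional


class OutageTracker:
    """Hypothetical model of 2/2 damping for a single endpoint."""

    def __init__(self, open_after: int = 2, close_after: int = 2) -> None:
        self.open_after = open_after
        self.close_after = close_after
        self.outage_open = False
        self._streak_status: Optional[str] = None
        self._streak = 0

    def observe(self, status: str) -> Optional[str]:
        """Feed one result; return "opened", "closed", or None."""
        if status == self._streak_status:
            self._streak += 1
        else:
            self._streak_status = status
            self._streak = 1

        if (not self.outage_open and status == "DOWN"
                and self._streak >= self.open_after):
            self.outage_open = True
            return "opened"
        if (self.outage_open and status == "UP"
                and self._streak >= self.close_after):
            self.outage_open = False
            return "closed"
        return None
```

Because SNMP results enter the same pipeline as other probe types, a single dropped UDP reply should produce no outage event under this model, which is the behaviour the third criterion asks tests to confirm.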

### Risks & Mitigations
- Variability across vendors and MIBs for port OIDs; default to sysUpTime.0 for reachability and document port-OID as optional.[1]
- UDP filtering or rate-limiting in OT networks; provide clear error categorization and operator guidance in docs and UI.[1]

### Testing Plan
- Unit tests for SnmpPingProbe covering success/failure modes with deterministic timings and cancellations.[1]
- Integration test against a mock agent to validate port OID flow and RTT reporting, plus E2E from YAML apply → probe execution → outage transitions → API/CSV verification.

### References
- [1] Probe semantics and budgets to mirror: probes-spec.md.
- [2] Scope guardrails and availability-only emphasis: scope-v1.md.
- [3] Flow integration and execution boundaries: outage-probe-flow-analysis.md.
- [4] Configuration/Apply/versioning architecture and file layout: Configuration.md.
- [5] API/doc surfacing and examples: README.md.

Add SNMP ping (switch health, port up/down) #172