AI-driven extractor log analysis. Deploys CDF Functions and a data model that segments extractor logs into per-run chunks and generates structured root-cause analyses using Atlas AI.

## System Overview

Argus AI is an end-to-end log monitoring and analysis system built on Cognite Data Fusion (CDF). It turns raw extractor log files into structured, AI-generated root-cause analyses surfaced through a web dashboard. The full system has three layers:

### Layer 1 — Data Ingestion: Cognite File Extractor

The starting point is an on-premises server where an extractor is running. A **Cognite File Extractor** is installed on the same VM. It watches a designated folder on the filesystem and continuously uploads log files to CDF as CDM `CogniteFile` nodes. No processing happens at this stage — the raw log file simply lands in CDF as a file resource.

### Layer 2 — Data Modeling & Processing (this repository)

This repository is packaged as a **CDF Toolkit library module** (`dp:argus_ai`) that clients deploy into their own CDF Toolkit project. It provisions a data model and two CDF Functions that process the uploaded log file:

1. **`argus_ai_log_chopper`** reads only new lines from the source log file (tracking state in CDF RAW), segments them into discrete extractor runs based on start/end markers, and uploads each run as a separate `ArgusAILogChunkFile` node — a file containing just that run's log lines, together with metadata (timestamps, error/warning counts, pipeline ID, PI host).

2. **`argus_ai_log_analyzer`** processes unanalyzed chunks for a given pipeline. Clean runs (zero errors/warnings) receive a `no_issues` result directly. For all others, the chunk is sent to an **Atlas AI agent**, and the structured JSON response is validated against allowed enum values fetched from the DMS container schema, then written as an `ArgusAILogAnalysis` node.

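The chopper's segmentation step can be sketched in plain Python. The start/end marker strings, field names, and error/warning detection below are illustrative assumptions, not the module's actual implementation:

```python
# Hypothetical sketch of segmenting a log stream into per-run chunks.
# The markers, field names, and level detection are assumptions.
import re

RUN_START = re.compile(r"Starting extractor")   # assumed start marker
RUN_END = re.compile(r"Extractor stopped")      # assumed end marker

def chop_runs(lines):
    """Split a stream of log lines into per-run chunks with basic counts."""
    runs, current = [], None
    for line in lines:
        if RUN_START.search(line):
            # A start marker opens a fresh chunk.
            current = {"lines": [], "errors": 0, "warnings": 0}
        if current is not None:
            current["lines"].append(line)
            if " ERROR " in line:
                current["errors"] += 1
            elif " WARNING " in line:
                current["warnings"] += 1
            if RUN_END.search(line):
                # An end marker closes the chunk; lines between runs are ignored.
                runs.append(current)
                current = None
    return runs

runs = chop_runs([
    "2024-01-01 00:00:00 INFO Starting extractor",
    "2024-01-01 00:00:05 WARNING transient timeout",
    "2024-01-01 00:10:00 INFO Extractor stopped",
    "2024-01-01 01:00:00 INFO Starting extractor",
    "2024-01-01 01:10:00 INFO Extractor stopped",
])
```

In the real function, each resulting chunk would be uploaded as an `ArgusAILogChunkFile` node with its counts attached as metadata.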
### Layer 3 — Monitoring UI

A Dune application queries the `ArgusAILogAnalysis` nodes from CDF and presents them as a health timeline and detected-issues list, with severity badges, category and root-cause labels, and an embedded AI chat for troubleshooting guidance.
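A health timeline of this kind can be derived by rolling analyses up per day and keeping the worst severity seen. This is only a sketch of the idea; the severity levels beyond `no_issues` and the `timestamp`/`severity` field names are assumptions:

```python
# Hypothetical roll-up of analysis records into a per-day health timeline.
# Severity labels other than "no_issues" are assumed, not from the schema.
from collections import defaultdict

SEVERITY_RANK = {"no_issues": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

def health_timeline(analyses):
    """Return the worst severity observed per day, keyed by ISO date."""
    worst = defaultdict(lambda: "no_issues")
    for a in analyses:
        day = a["timestamp"][:10]  # ISO date prefix, e.g. "2024-01-01"
        if SEVERITY_RANK[a["severity"]] > SEVERITY_RANK[worst[day]]:
            worst[day] = a["severity"]
    return dict(sorted(worst.items()))

timeline = health_timeline([
    {"timestamp": "2024-01-01T02:00:00Z", "severity": "no_issues"},
    {"timestamp": "2024-01-01T08:00:00Z", "severity": "high"},
    {"timestamp": "2024-01-02T02:00:00Z", "severity": "low"},
])
```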
### End-to-End Flow

```
Extractor VM
 └── Cognite File Extractor
      └── uploads log file ──► CDF (CogniteFile node)
                                │
                     argus_ai_log_chopper
                 (segments into per-run chunks)
                                │
                     ArgusAILogChunkFile nodes
                  (one file per extractor run)
                                │
                     argus_ai_log_analyzer
               (Atlas AI → structured root cause)
                                │
                     ArgusAILogAnalysis nodes
                  (CogniteActivity with severity,
                   category, root cause code, …)
                                │
                      ArgusAI Web Dashboard
                 (health timeline, issue list,
                  AI-assisted troubleshooting)
```
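The analyzer's validation step in the flow above — checking the agent's JSON against enum values from the DMS container schema — can be sketched as follows. The field names and enum values here are illustrative; in the deployed function they are fetched from the container schema at runtime:

```python
# Hypothetical validation of an Atlas AI JSON response against allowed
# enum values. These field names and enums are illustrative assumptions.
ALLOWED = {  # in practice, fetched from the DMS container schema
    "severity": {"low", "medium", "high", "critical"},
    "category": {"connectivity", "configuration", "authentication", "data_quality"},
}

def validate_analysis(response: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the response is usable."""
    errors = []
    for field, allowed in ALLOWED.items():
        value = response.get(field)
        if value not in allowed:
            errors.append(f"{field}={value!r} not in {sorted(allowed)}")
    return errors

ok = validate_analysis({"severity": "high", "category": "connectivity"})
bad = validate_analysis({"severity": "urgent", "category": "connectivity"})
```

Only responses that validate cleanly would then be written as `ArgusAILogAnalysis` nodes.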

## What gets deployed

| Resource | Description |