| 1 | +# Contributing to raglogs |
| 2 | + |
| 3 | +raglogs is an incident analysis CLI tool. Contributions are welcome — bug fixes, new log adapters, normalization improvements, and documentation are all useful. |
| 4 | + |
| 5 | +This document covers how to get set up, what the codebase expects, and how to submit changes. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Getting started |
| 10 | + |
| 11 | +```bash |
| 12 | +git clone https://github.com/leo-aa88/raglogs |
| 13 | +cd raglogs |
| 14 | +python -m venv .venv && source .venv/bin/activate |
| 15 | +pip install -r requirements.txt && pip install -e . |
| 16 | +cp .env.example .env |
| 17 | +# Edit .env — set RAGLOGS_DB_URL at minimum |
| 18 | +docker compose up postgres -d |
| 19 | +raglogs init |
| 20 | +``` |
| 21 | + |
| 22 | +Run the demo to verify everything works: |
| 23 | + |
| 24 | +```bash |
| 25 | +raglogs demo |
| 26 | +raglogs timeline --since 2h |
| 27 | +raglogs compare --since 2h --baseline 24h |
| 28 | +``` |
| 29 | + |
| 30 | +--- |
| 31 | + |
| 32 | +## Project structure |
| 33 | + |
| 34 | +``` |
| 35 | +src/core/ |
| 36 | + clustering/ Fingerprint grouping, importance scoring, baseline comparison |
| 37 | + compare/ Window diffing — new, disappeared, increased, decreased |
| 38 | + explain/ Evidence assembly, confidence, templates, summarizer |
| 39 | + ingestion/ Ingestion orchestration and batch persistence |
| 40 | + llm/ Provider abstraction (OpenAI, Ollama, noop) |
| 41 | + normalization/ Message normalization, fingerprinting, trigger patterns |
| 42 | + parsing/ JSON and text parsers, field alias resolution |
| 43 | + retrieval/ Keyword-based question answering |
| 44 | + timeline/ Causal timeline reconstruction |
| 45 | +src/cli/commands/ One file per CLI command |
| 46 | +src/api/routes/ FastAPI route handlers |
| 47 | +src/db/ SQLAlchemy models and session management |
| 48 | +src/utils/ Time window parsing, hashing helpers |
| 49 | +tests/unit/ Pure unit tests — no database required |
| 50 | +tests/integration/ Full pipeline tests — require running Postgres |
| 51 | +``` |
| 52 | + |
| 53 | +The pipeline flows: ingest → normalize → fingerprint → cluster → baseline compare → evidence assembly → explain / timeline / compare. |
| 54 | + |
| 55 | +--- |
| 56 | + |
| 57 | +## Running tests |
| 58 | + |
| 59 | +Unit tests require no database: |
| 60 | + |
| 61 | +```bash |
| 62 | +make test-unit |
| 63 | +# or |
| 64 | +pytest tests/unit/ |
| 65 | +``` |
| 66 | + |
| 67 | +Integration tests require a running Postgres instance: |
| 68 | + |
| 69 | +```bash |
| 70 | +docker compose up postgres -d |
| 71 | +make test-int |
| 72 | +# or |
| 73 | +pytest tests/integration/ |
| 74 | +``` |
| 75 | + |
| 76 | +All tests must pass before a pull request will be merged. If you are adding new functionality, add tests for it. |
| 77 | + |
| 78 | +--- |
| 79 | + |
| 80 | +## Code style |
| 81 | + |
| 82 | +- Python 3.11+ |
| 83 | +- Type annotations on all function signatures |
| 84 | +- `ruff` for linting, `black` for formatting |
| 85 | + |
| 86 | +```bash |
| 87 | +make lint |
| 88 | +make format |
| 89 | +``` |
| 90 | + |
| 91 | +No hard rules on line length beyond what `black` enforces. Prefer explicit over clever. Avoid abstractions that exist only to save lines. |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## Areas where contributions are most useful |
| 96 | + |
| 97 | +**Normalization improvements** |
| 98 | + |
| 99 | +The normalization step (`src/core/normalization/patterns.py`) determines clustering quality. If you have real-world log formats that normalize poorly — producing too many clusters for the same error, or collapsing unrelated errors — a fix there has high leverage. Include before/after examples and a test in `tests/unit/test_normalization.py`. |
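As a rough illustration of the kind of before/after example and test that helps here, the sketch below replaces variable tokens with placeholders so that messages differing only in their parameters share a fingerprint. The rule list and the names `normalize_message` and `fingerprint` are assumptions made for this example; the actual rules in `patterns.py` may look quite different.

```python
import hashlib
import re

# Hypothetical substitution rules, applied in order. The project's real rules
# live in src/core/normalization/patterns.py and may differ.
_RULES: list[tuple[re.Pattern[str], str]] = [
    (
        re.compile(
            r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
            re.IGNORECASE,
        ),
        "<uuid>",
    ),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def normalize_message(message: str) -> str:
    """Replace variable tokens with placeholders (illustrative only)."""
    for pattern, placeholder in _RULES:
        message = pattern.sub(placeholder, message)
    return message


def fingerprint(message: str) -> str:
    """Hash the normalized message so equivalent errors share a key."""
    normalized = normalize_message(message)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


# Before/after: two variants of the same error collapse to one fingerprint,
# while an unrelated error keeps its own.
first = "Connection to 10.0.0.12 timed out after 3000 ms"
second = "Connection to 10.0.3.7 timed out after 1500 ms"
unrelated = "Disk quota exceeded for user 42"
```

A test built on this would assert that `first` and `second` produce the same fingerprint and that `unrelated` does not, which covers both failure modes described above.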
| 100 | + |
| 101 | +**Log source adapters** |
| 102 | + |
| 103 | +New adapters go in `src/adapters/`. Each adapter yields `ParsedLogLine` objects. The rest of the pipeline is fully source-agnostic. Useful adapters: Datadog, Loki, Kubernetes pod logs, CloudWatch. |
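A minimal sketch of what an adapter might look like follows. The `ParsedLogLine` dataclass here is a stand-in defined only for this example, and its fields, the input keys `ts`, `level`, and `msg`, and the function name `iter_json_log_lines` are all assumptions; check the project's actual `ParsedLogLine` definition before writing a real adapter.

```python
import json
from collections.abc import Iterable, Iterator
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ParsedLogLine:
    """Stand-in for the project's ParsedLogLine; the real fields may differ."""

    timestamp: datetime
    level: str
    service: str
    message: str


def iter_json_log_lines(raw_lines: Iterable[str], service: str) -> Iterator[ParsedLogLine]:
    """Hypothetical adapter: yield one ParsedLogLine per JSON-encoded record."""
    for raw in raw_lines:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict) or "ts" not in record:
            continue  # skip records without the assumed timestamp key
        yield ParsedLogLine(
            timestamp=datetime.fromtimestamp(float(record["ts"]), tz=timezone.utc),
            level=str(record.get("level", "info")).lower(),
            service=service,
            message=str(record.get("msg", "")),
        )


sample = [
    '{"ts": 1700000000, "level": "ERROR", "msg": "db timeout"}',
    "",
    "not json at all",
]
parsed = list(iter_json_log_lines(sample, service="api"))
```

Writing the adapter as a generator keeps it free of pipeline concerns: it only translates a source's records into the shared type and leaves normalization and clustering to the existing stages.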
| 104 | + |
| 105 | +**Trigger patterns** |
| 106 | + |
| 107 | +New trigger event patterns go in `TRIGGER_PATTERNS` in `src/core/normalization/patterns.py`. A good trigger pattern is specific enough to avoid false positives and general enough to match common variants across log formats. |
| 108 | + |
| 109 | +**Bug fixes** |
| 110 | + |
| 111 | +Check the issue tracker. Bugs with a reproducing log sample are easiest to fix. |
| 112 | + |
| 113 | +--- |
| 114 | + |
| 115 | +## Submitting a pull request |
| 116 | + |
| 117 | +1. Fork the repository and create a branch from `main` |
| 118 | +2. Make your changes with tests |
| 119 | +3. Run `make lint` and `make test-unit`; both must pass. If your change touches ingestion or the database layer, also run `make test-int` |
| 120 | +4. Open a pull request with a clear description of what changed and why |
| 121 | +5. Reference any related issue |
| 122 | + |
| 123 | +Pull requests that are purely cosmetic (reformatting with no functional change) will not be merged. |
| 124 | + |
| 125 | +Keep pull requests focused. A PR that fixes a bug and adds an unrelated feature is harder to review and slower to merge than two separate PRs. |
| 126 | + |
| 127 | +--- |
| 128 | + |
| 129 | +## Commit messages |
| 130 | + |
| 131 | +No strict format required. Be clear about what changed and why. One-line messages are fine for small changes. For anything non-trivial: |
| 132 | + |
| 133 | +``` |
| 134 | +Short summary (under 72 chars) |
| 135 | + |
| 136 | +Longer explanation of what changed and why, if not obvious from the |
| 137 | +diff. Reference the issue number if applicable. |
| 138 | +``` |
| 139 | + |
| 140 | +--- |
| 141 | + |
| 142 | +## Questions |
| 143 | + |
| 144 | +Open a GitHub issue with the `question` label. Asking whether a contribution would be accepted before you invest time in it is a good use of an issue. |