**AGENTS.md** (+18 −38)
```diff
@@ -8,25 +8,20 @@ Agents must distinguish between the two primary orchestration tiers to avoid "ci
 ### 🌌 Hybrid Orchestration Layers
 
-- **Host Tier (Systemd)**: Reserved for hardware-level telemetry, security gates, and GitOps reconciliation. Reliability here is critical for cluster recovery. Core logic is extracted into `pkg/` libraries to ensure reusability and consistency across different execution triggers (CLI, API, and future AI tools).
+- **Host Tier (Systemd)**: Reserved for hardware-level telemetry, security gates, and GitOps reconciliation. Reliability here is critical for cluster recovery. Core logic is strictly encapsulated in `internal/` to ensure reusability and enforce project boundaries.
 - **Cluster Tier (K3s)**: Handles scalable data services (Postgres, Loki, Prometheus, Grafana, Tempo, MinIO). Orchestrated via **OpenTofu (IaC)** in `tofu/`.
 
-### 📦 Distribution Pattern
+### 🏗️ Directory Map (Consolidated Monorepo)
 
-To maintain a clean repository and ensure operational stability, all compiled binaries must be output to the root `dist/` directory. Systemd unit files and automated scripts should reference artifacts from this location.
 | **K3s Ops** | `make build-collectors` | Builds and imports the collectors Docker image into K3s. |
+| **Host Ops** | `make proxy-build` | Builds proxy server to `bin/` and restarts the service. |
 
 ## 3. Engineering Standards
 
 ### 🐹 Go (Backend)
 
-- **Library-First**: Move core domain logic to `pkg/` before implementing the service entry point. Services should be thin wrappers around library capabilities.
-- **Environment Loading**: Always use `pkg/env` for standardized `.env` discovery. Do not use `godotenv` directly in services.
-- **Dependency Management**: Delegate driver registration (e.g., `lib/pq`) to `pkg/db` to avoid redundant blank imports in services.
-- **Failure Modes**: Never swallow errors. Use explicit wrapping: `fmt.Errorf("context: %w", err)`.
-- **Observability**: Every service must emit JSON-formatted logs to `stdout` using `pkg/logger`.
-- **Telemetry**: All instrumentation must be handled through the centralized `pkg/telemetry` library.
-- **Testing**: Table-driven tests are the standard. Run `make go-cov` to verify coverage. Maintain a minimum of 80% coverage for `pkg/` libraries.
-
-### 🎨 HTML/CSS (Frontend)
-
-- **Zero Frameworks**: Use native HTML5 and CSS3 only.
-- **Styling**: Leverage CSS variables in `:root` for dark-theme consistency.
```
```diff
+- **Thin Main**: Entry points in `cmd/` must be minimal. Move all core domain logic to `internal/`.
+- **Internal-First**: Shared libraries reside in `internal/` to prevent external logic leakage.
+- **Environment Loading**: Always use `internal/env` for standardized `.env` discovery.
+- **Observability**: Every service must emit structured JSON logs using `internal/telemetry`.
+- **Telemetry**: All instrumentation must be handled through the centralized `internal/telemetry` library.
+- **Testing**: Table-driven tests are the standard. Run `make go-cov` to verify coverage.
```
```diff
 ### 📝 Institutional Memory (Documentation)
 
@@ -71,15 +60,6 @@ The project uses a unified automation layer. **Always prefer `make` commands** a
 ## 4. Operational Excellence & Safety
 
-- **Secrets**: NEVER commit secrets. Use `.env` for local dev and OpenBao for production secrets.
+- **Secrets**: NEVER commit secrets. Use `.env` for local dev and OpenBao for production.
 - **GitOps**: Host-tier changes are applied via `gitops_sync.sh` (triggered by Proxy webhooks).
-- **Observability**: Any new service must be integrated into the telemetry pipeline (Logs to Loki, Metrics to Postgres/Prometheus).
 - **Security**: All Kubernetes manifests must pass `kube-lint`. All Go code must pass `go-vuln-scan`.
-
 ## 5. Failure Mode Analysis (FMA)
-
 Before proposing a change, agents should ask:
-
 1. "Does this create a circular dependency between the host and the cluster?"
-
 2. "How will this be debugged in production if the network is down?"
-
 3. "Is this change recorded in an ADR to preserve the 'Why'?"
```
**README.md** (+7 −10)
```diff
@@ -12,11 +12,12 @@ Built using Go and orchestrated on Kubernetes (K3s), the platform unifies system
 This project highlights significant accomplishments in building a modern observability and platform engineering solution:
 
-* **Full OpenTelemetry (LMT) Implementation:** Achieved end-to-end observability with a unified OTel Collector, Tempo (Traces), Prometheus (Metrics), Loki (Logs), and Go SDK for instrumentation. Includes Service Graphs, synthetic transaction monitoring, and comprehensive host-level telemetry.
+* **Unified Go Monorepo:** Consolidated fragmented modules into a single root module, eliminating 17 `replace` directives and standardizing dependency management across all services.
+* **Encapsulated Architecture:** Transitioned to an `internal/` and `cmd/` layout, enforcing Go's package visibility rules and adopting the "Thin Main" pattern for better testability and system integrity.
+* **Full OpenTelemetry (LMT) Implementation:** Achieved end-to-end observability with a unified OTel Collector, Tempo (Traces), Prometheus (Metrics), Loki (Logs), and Go SDK for instrumentation.
 * **GitOps Reconciliation Engine:** Implemented a secure, templated GitOps reconciliation engine for automated state enforcement via webhooks, scaled to support multi-tenant synchronization.
 * **Kubernetes Migration & Cloud-Native Operations:** All core observability stack components (Loki, Grafana, Tempo, Prometheus, Postgres) are running natively in Kubernetes with persistent storage.
-* **Library-First Architecture:** Structural transition into `pkg/` and `services/` layout, decoupling core business logic into transport-agnostic modules for improved reusability and testability.
-* **Centralized Secrets Management:** Transitioned to OpenBao for secure secrets storage and retrieval, replacing insecure static configurations.
+* **Centralized Secrets Management:** Integrated OpenBao for secure, dynamic credential retrieval across all services, replacing insecure static configurations.
 * **Hybrid Cloud Architecture (Store-and-Forward Bridge):** Designed and implemented a secure bridge for ingesting external telemetry without exposing local ports, ensuring reliable data flow from diverse sources.
 * **Reproducible Local Development:** Ensures consistent and reproducible developer environments via `shell.nix` and `docker-compose`.
 * **Formalized Decision-Making & Incident Response:** Established an Architectural Decision Record (ADR) process and an Incident Response/RCA framework for structured decision-making and operational excellence.
@@ -72,7 +73,7 @@ flowchart TB
     Tailscale[Tailscale]
   end
 
-  GoApps["Go Services (Proxy, Reading Sync, Second Brain)"]
```
**docs/architecture/core-concepts/automation.md** (+2 −3)
```diff
@@ -16,15 +16,14 @@ The system consists of several main service families, each with a `.service` uni
 | :--- | :--- | :--- | :--- |
 | **`tailscale-gate`** | `simple` | Continuous | **Security**: Monitors Proxy health and toggles Tailscale Funnel access. |
 | **`proxy`** | `simple` | Continuous | **API Gateway**: Core listener for data pipelines and GitOps webhooks. |
-| **`gitops-sync`** | `oneshot` | **Webhook** | **Reconciliation**: Triggered by Proxy to pull latest code and apply changes. |
-| **`reading-sync`** | `oneshot` | Twice Daily (00:00, 12:00) | **Data Pipeline Trigger**: Calls Proxy API to sync MongoDB data to Postgres. |
+| **`ingestion`** | `oneshot` | Daily (00:00) | **Data Ingestion**: Unified engine for Reading Analytics and Second Brain sync. |
 
 ## Operational Excellence
 
 Our systemd configurations employ several production-grade patterns:
 
 - **Security Gating**: The `tailscale-gate` service implements a loop that ensures the public entry point (Funnel) is automatically closed if the underlying `proxy` service stops, preventing "dead" endpoints from being exposed.
-- **Persistence (`Persistent=true`)**: Used in `reading-sync`. If the host is powered off during the scheduled time, systemd will trigger the service immediately upon the next boot.
+- **Persistence (`Persistent=true`)**: Used in `ingestion`. If the host is powered off during the scheduled time, systemd will trigger the service immediately upon the next boot.
 - **Unified Observability**: All units emit logs, metrics, and traces, which are captured, enriched, and forwarded by the host-level OpenTelemetry Collector.
```
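A minimal sketch of the timer pair this `Persistent=true` pattern implies; the unit name and schedule shown here are assumptions based on the table above, not the repository's actual unit files:

```ini
# ingestion.timer — hypothetical daily trigger with catch-up semantics
[Unit]
Description=Daily ingestion run

[Timer]
OnCalendar=*-*-* 00:00:00
# Persistent=true: if the host was powered off at 00:00, systemd runs the
# missed activation once, immediately after the next boot.
Persistent=true

[Install]
WantedBy=timers.target
```

The matching `ingestion.service` would be `Type=oneshot`, so each activation runs the sync to completion and exits.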
**docs/architecture/core-concepts/observability.md** (+1 −1)
```diff
@@ -75,7 +75,7 @@ The platform aggregates infrastructure metrics through Prometheus scraping, appl
 Distributed tracing is powered by OpenTelemetry for correlation and performance profiling across high-throughput pipelines.
 
 - **Collection Pipeline**:
-- **Instrumentation**: Services use the **OpenTelemetry SDK** to generate spans in OTLP format. We follow a **Pure Wrapper** philosophy where shared libraries (`pkg/db`) provide standardized infrastructure spans (e.g., `db.postgres.record_metric`), while services own the root spans (`job.*` or `handler.*`).
+- **Instrumentation**: Services use the **OpenTelemetry SDK** to generate spans in OTLP format. We follow a **Pure Wrapper** philosophy where shared libraries (`internal/db`) provide standardized infrastructure spans (e.g., `db.postgres.record_metric`), while services own the root spans (`job.*` or `handler.*`).
 - **Ingestion**: Spans are sent to the **OpenTelemetry Collector** via gRPC (NodePort `30317`) or HTTP (NodePort `30318`), which batches and exports them to **Grafana Tempo**.
 - **Processing**: Tempo analyzes raw spans to generate derived **Service Graphs** and **Span Metrics**, which are pushed to Prometheus via `remote_write` for operational correlation.
```
The Ingestion Service (`cmd/ingestion/`) is a unified data orchestration engine responsible for synchronizing external data sources into the platform's local analytical store (PostgreSQL). It operates as a periodic task runner managed by a Systemd timer.

## Component Details

### Task Overview

The service follows a **Task-Oriented Design**: each data synchronization concern is encapsulated in an independent, testable task managed by a centralized engine.

| Task | Source | Purpose |
| :--- | :--- | :--- |
| `reading` | MongoDB Atlas | **Reading Analytics**: Syncs article metadata and engagement metrics from cloud to local store. |

Synchronizes cloud-based MongoDB data with the local PostgreSQL environment:

1. **Fetch**: Retrieves unprocessed documents from MongoDB Atlas in configurable batches.
2. **Transform**: Maps MongoDB BSON/JSON metadata to the structured PostgreSQL `reading_analytics` schema.
3. **Persist**: Executes UPSERT operations in PostgreSQL to ensure data consistency.
4. **Acknowledge**: Marks documents as "processed" in MongoDB to prevent duplicate ingestion.
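The Persist step's UPSERT might take the following shape. Only the table name `reading_analytics` comes from the documentation; the columns and conflict key are illustrative assumptions about the schema.

```sql
-- Hypothetical shape of the Persist step's UPSERT; columns are illustrative.
INSERT INTO reading_analytics (doc_id, title, read_at, engagement)
VALUES ($1, $2, $3, $4)
ON CONFLICT (doc_id) DO UPDATE
SET title      = EXCLUDED.title,
    read_at    = EXCLUDED.read_at,
    engagement = EXCLUDED.engagement;
```

`ON CONFLICT ... DO UPDATE` makes the write idempotent, so re-ingesting a document that was already synced updates it in place instead of failing on a duplicate key.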
#### Second Brain Task (`brain`)

Transforms GitHub-based journaling entries into atomic, searchable database records.

1. **Check**: Queries PostgreSQL for the most recent entry date to determine the sync delta.
2. **Ingest**: Fetches new journal entries from the configured GitHub repository via the GitHub API.
3. **Atomize**: Decomposes long-form markdown logs into granular "thought atoms," including metadata like tags and categories.
4. **Quantify**: Calculates token counts for each atom to support future LLM-based analytical workloads.
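A toy sketch of the Atomize and Quantify steps above. The heading-based split and the one-token-per-word heuristic are assumptions for illustration, not the service's actual rules:

```go
// Hypothetical sketch of Atomize/Quantify: split a markdown journal entry on
// "## " headings into thought atoms and attach a rough token estimate.
package main

import (
	"fmt"
	"strings"
)

type Atom struct {
	Title  string
	Body   string
	Tokens int // rough estimate; the real pipeline may use a proper tokenizer
}

func atomize(markdown string) []Atom {
	var atoms []Atom
	// Prepend a newline so a leading "## " heading also matches the separator.
	for _, block := range strings.Split("\n"+markdown, "\n## ") {
		block = strings.TrimSpace(block)
		if block == "" {
			continue
		}
		title, body, _ := strings.Cut(block, "\n")
		atoms = append(atoms, Atom{
			Title:  title,
			Body:   strings.TrimSpace(body),
			Tokens: len(strings.Fields(block)), // ~1 token per word heuristic
		})
	}
	return atoms
}

func main() {
	entry := "## Idea\nCache the GitHub sync delta locally.\n## Note\nTag: infra"
	for _, a := range atomize(entry) {
		fmt.Printf("%s (%d tokens)\n", a.Title, a.Tokens)
	}
}
```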
## Distributed Tracing

The Ingestion Service is instrumented with the **OpenTelemetry SDK** to provide visibility into data pipeline performance and task status.

### Configuration

The service initializes a global TracerProvider during startup, controlled by environment variables:

| Variable | Description |
| :--- | :--- |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | The gRPC endpoint of the OpenTelemetry Collector (e.g., `localhost:30317`). |
| `OTEL_SERVICE_NAME` | The service identifier used in traces (defaults to `ingestion`). |
### Trace Coverage

Spans are created for the entire job lifecycle:

- **Job Lifecycle**: Root span `job.ingestion` tracks the overall synchronization run.
- **Task Execution**: Child spans (`task.reading`, `task.brain`) provide granular visibility into individual task performance.
- **API/DB Operations**: Sub-spans for GitHub API requests, MongoDB fetches, and PostgreSQL transactions.

Traces are exported to the central **OpenTelemetry Collector** via gRPC and stored in **Grafana Tempo**.

### Instrumentation Strategy

1. **Task Engine Wrapper**: The service uses a centralized `RunTask` engine that automatically wraps every registered task in a named OpenTelemetry span, capturing task-specific attributes and success/failure status.
2. **Manual Spans**: High-latency operations (external API calls and complex database syncs) are manually instrumented to provide precise timing and error context for pipeline optimization.
**docs/architecture/services/proxy.md** (+1 −1)
```diff
@@ -1,6 +1,6 @@
 # Proxy Service Architecture
 
-The Proxy Service (`services/proxy/`) is a custom Go application that acts as the API gateway, Data Pipeline engine, and **GitOps automation trigger** for the platform. It runs as a native host process managed by Systemd.
+The Proxy Service (`cmd/proxy/`) is a custom Go application that acts as the API gateway, Data Pipeline engine, and **GitOps automation trigger** for the platform. It runs as a native host process managed by Systemd.
```