| 1 | +# Logs and Metrics in RAM |
| 2 | + |
| 3 | +The SAS Retrieval Agent Manager (RAM) system collects and stores logs and metrics using [Vector](https://vector.dev/), a high-performance observability data pipeline. Vector aggregates telemetry data from Kubernetes clusters and routes it to PostgreSQL via PostgREST for persistent storage and querying. |
| 4 | + |
| 5 | +## Architecture Overview |
| 6 | + |
| 7 | +```text |
| 8 | +Kubernetes Logs / RAM API Metrics → Vector → PostgREST → PostgreSQL |
| 9 | +``` |
| 10 | + |
| 11 | +Vector runs as a DaemonSet in the cluster, collecting: |
| 12 | + |
| 13 | +- **Logs**: Container logs from all pods via Kubernetes log files |
| 14 | + |
| 15 | +- **Metrics**: Performance metrics, resource usage, and custom application metrics |
| 16 | + |
| 17 | +## Configuration |
| 18 | + |
| 19 | +### Vector Pipeline Components |
| 20 | + |
| 21 | +The Vector configuration consists of three main components: |
| 22 | + |
| 23 | +1. **Sources**: Data collection from Kubernetes |
| 24 | +2. **Transforms**: Data processing and enrichment using VRL (Vector Remap Language) |
| 25 | +3. **Sinks**: Delivery to PostgREST endpoints |
| 26 | + |
| 27 | +### Logs Pipeline |
| 28 | + |
| 29 | +Vector collects Kubernetes pod logs and enriches them with metadata: |
| 30 | + |
| 31 | +```yaml |
| 32 | +sources: |
| 33 | + kube_logs: |
| 34 | + type: kubernetes_logs |
| 35 | + auto_partial_merge: true |
| 36 | + |
| 37 | +transforms: |
| 38 | + logs_transform: |
| 39 | + type: remap |
| 40 | + inputs: |
| 41 | + - kube_logs |
| 42 | + source: | |
| 43 | + # Remove fields not in database schema |
| 44 | + del(.source_type) |
| 45 | + del(.stream) |
| 46 | + |
| 47 | +sinks: |
| 48 | + logs_postgrest: |
| 49 | + type: http |
| 50 | + inputs: |
| 51 | + - logs_transform |
| 52 | + uri: "http://sas-retrieval-agent-manager-postgrest.retagentmgr.svc.cluster.local:3002/logs" |
| 53 | + encoding: |
| 54 | + codec: json |
| 55 | + method: post |
| 56 | +``` |
| 57 | + |
| 58 | +> Note: See the full [example Vector values file](../../examples/vector.yaml). |
| 59 | + |
| 60 | +#### Log Schema |
| 61 | + |
| 62 | +Logs are stored in PostgreSQL with the following schema: |
| 63 | + |
| 64 | +| Column | Type | Description | |
| 65 | +|--------|------|-------------| |
| 66 | +| `file` | TEXT | Path to the log file in Kubernetes | |
| 67 | +| `kubernetes` | JSONB | Kubernetes metadata (pod, namespace, labels, etc.) | |
| 68 | +| `message` | TEXT | The actual log message | |
| 69 | +| `timestamp` | TIMESTAMPTZ | When the log entry was created | |
| 70 | + |
| 71 | +#### Kubernetes Metadata |
| 72 | + |
| 73 | +The `kubernetes` JSONB column includes the following context: |
| 74 | + |
| 75 | +- `pod_name`, `pod_namespace`, `pod_uid` |
| 76 | + |
| 77 | +- `container_name`, `container_image` |
| 78 | + |
| 79 | +- `node_labels` |
| 80 | + |
| 81 | +- `pod_labels` |
| 82 | + |
| 83 | +- `pod_ip`, `pod_owner` |
| 84 | + |
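Because `kubernetes` is stored as JSONB, PostgREST's JSON arrow operators can filter on its keys. The following is a minimal sketch, assuming the `logs` endpoint also permits reads for the role in use; the `logs_query_url` helper and the `POSTGREST_URL` variable are hypothetical names introduced only for illustration.

```bash
#!/usr/bin/env bash
# Base URL of the PostgREST service (host and port taken from the Vector sink URI).
POSTGREST_URL="${POSTGREST_URL:-http://sas-retrieval-agent-manager-postgrest.retagentmgr.svc.cluster.local:3002}"

# Hypothetical helper: prints a query URL for the newest logs in one namespace.
# `kubernetes->>pod_namespace` extracts the JSONB key as text for the filter.
logs_query_url() {
  local namespace="$1"
  local limit="${2:-10}"
  printf '%s/logs?kubernetes->>pod_namespace=eq.%s&order=timestamp.desc&limit=%s\n' \
    "$POSTGREST_URL" "$namespace" "$limit"
}

# Example (run from a pod inside the cluster; assumes read access is granted):
#   curl -sS "$(logs_query_url default 5)"
```

Keeping the URL construction in a function makes it easy to inspect the exact filter before sending a request.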
| 85 | +### Metrics Pipeline |
| 86 | + |
| 87 | +Metrics collection follows a similar pattern but does not need transformations: |
| 88 | + |
| 89 | +```yaml |
| 90 | +sources: |
| 91 | + otel: |
| 92 | + type: opentelemetry |
| 93 | + grpc: |
| 94 | + address: 0.0.0.0:4317 |
| 95 | + http: |
| 96 | + address: 0.0.0.0:4318 |
| 97 | + |
| 98 | +sinks: |
| 99 | + metrics_postgrest: |
| 100 | + type: http |
| 101 | + inputs: |
| 102 | + - otel.metrics |
| 103 | + uri: "http://sas-retrieval-agent-manager-postgrest.retagentmgr.svc.cluster.local:3002/metrics" |
| 104 | + headers: |
| 105 | + Content-Type: "application/json" |
| 106 | + encoding: |
| 107 | + codec: json |
| 108 | +``` |
| 109 | + |
| 110 | +## Installation |
| 111 | + |
| 112 | +To install Vector, edit the [example Vector values file](../../examples/vector.yaml) to your desired settings and run the following commands: |
| 113 | + |
| 114 | +```sh |
| 115 | +helm repo add vector https://helm.vector.dev |
| 116 | +helm repo update |
| 117 | + |
| 118 | +helm install vector vector/vector \ |
| 119 | + -n vector -f ./values.yaml \ |
| 120 | + --create-namespace --version 0.46.0 |
| 121 | +``` |
| 122 | + |
| 123 | +## PostgREST Integration |
| 124 | + |
| 125 | +Vector sends data directly to PostgREST HTTP endpoints. PostgREST provides: |
| 126 | + |
| 127 | +- Automatic API generation from PostgreSQL schema |
| 128 | + |
| 129 | +- Role-based access control via PostgreSQL roles |
| 130 | + |
| 131 | +- JSON validation and type safety |
| 132 | + |
| 133 | +## Testing |
| 134 | + |
| 135 | +### Manual Log Injection |
| 136 | + |
| 137 | +Test the PostgREST endpoint with a `curl` request from within the cluster: |
| 138 | + |
| 139 | +```bash |
| 140 | +curl -X POST \ |
| 141 | + "http://sas-retrieval-agent-manager-postgrest.retagentmgr.svc.cluster.local:3002/logs" \ |
| 142 | + -H "Content-Type: application/json" \ |
| 143 | + -H "Prefer: return=representation" \ |
| 144 | + -d '{ |
| 145 | + "file": "/var/log/pods/test_pod/container/0.log", |
| 146 | + "kubernetes": { |
| 147 | + "container_name": "test-container", |
| 148 | + "pod_name": "test-pod", |
| 149 | + "pod_namespace": "default", |
| 150 | + "pod_uid": "test-uid-12345" |
| 151 | + }, |
| 152 | + "message": "Test log message", |
| 153 | + "timestamp": "2025-11-10T18:00:00.000000Z" |
| 154 | + }' |
| 155 | +``` |
| 156 | + |
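The manual injection above can also be scripted so that each test row carries the current time and the result is checked by status code. This is a hedged sketch rather than part of the RAM tooling: `build_test_log` and `is_created` are hypothetical helper names, the payload is a shortened version of the one above, and the check assumes PostgREST answers a successful insert with HTTP 201.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a test log payload stamped with the current UTC time.
# Note: the message is inserted as-is, so it must not contain quotes or backslashes.
build_test_log() {
  local message="${1:-Test log message}"
  local ts
  ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf '{"file":"/var/log/pods/test_pod/container/0.log","kubernetes":{"pod_name":"test-pod","pod_namespace":"default"},"message":"%s","timestamp":"%s"}\n' \
    "$message" "$ts"
}

# Hypothetical helper: succeeds when the HTTP status is 201 (Created).
is_created() {
  [ "$1" = "201" ]
}

# Example (run from a pod inside the cluster):
#   url="http://sas-retrieval-agent-manager-postgrest.retagentmgr.svc.cluster.local:3002/logs"
#   status="$(build_test_log "hello" | curl -sS -o /dev/null -w '%{http_code}' \
#     -X POST "$url" -H 'Content-Type: application/json' -d @-)"
#   is_created "$status" && echo "inserted" || echo "failed with HTTP $status"
```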
| 157 | +### Verify Vector is Running |
| 158 | + |
| 159 | +```bash |
| 160 | +# Check Vector pods |
| 161 | +kubectl get pods -n vector |
| 162 | + |
| 163 | +# View Vector logs |
| 164 | +kubectl logs -n vector -l app.kubernetes.io/name=vector --tail=100 |
| 165 | + |
| 166 | +# Check for errors |
| 167 | +kubectl logs -n vector -l app.kubernetes.io/name=vector | grep ERROR |
| 168 | +``` |
| 169 | + |
| 170 | +## Troubleshooting |
| 171 | + |
| 172 | +### Common Issues |
| 173 | + |
| 174 | +#### 1. Schema Mismatch Errors |
| 175 | + |
| 176 | +**Error**: `Could not find the 'source_type' column` |
| 177 | + |
| 178 | +**Solution**: Add a VRL transform to remove fields not in your database schema: |
| 179 | + |
| 180 | +```yaml |
| 181 | +transforms: |
| 182 | + remove_extra_fields: |
| 183 | + type: remap |
| 184 | + inputs: |
| 185 | + - kube_logs |
| 186 | + source: | |
| 187 | + del(.source_type) |
| 188 | + del(.stream) |
| 189 | +``` |
| 190 | + |
| 191 | +#### 2. PostgREST Connection Failures |
| 192 | + |
| 193 | +**Error**: `Service call failed. No retries or retries exhausted` |
| 194 | + |
| 195 | +Check that PostgREST is accessible: |
| 196 | + |
| 197 | +```bash |
| 198 | +
|
| 199 | +kubectl get svc -n retagentmgr sas-retrieval-agent-manager-postgrest |
| 200 | +kubectl get pods -n retagentmgr -l app.kubernetes.io/name=postgrest |
| 201 | +
|
| 202 | +``` |
| 203 | + |
| 204 | +## Related Documentation |
| 205 | + |
| 206 | +- [Vector Documentation](https://vector.dev/docs/) |
| 207 | +- [PostgREST API Reference](https://postgrest.org/en/stable/api.html) |
| 208 | +- [OpenTelemetry Specification](https://opentelemetry.io/docs/) |
| 209 | +- [VRL Language Reference](https://vector.dev/docs/reference/vrl/) |