
Commit 2f91b59

committed
up
1 parent 434ad7a commit 2f91b59


2 files changed

+241
-0
lines changed

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
1+
---
2+
title: Log Processor Offline
3+
id: log_processor_offline
4+
---
5+
6+
When the Console or a notification rule reports **Log Processor Offline**, the local agent has not checked in with the Local API (LAPI) for more than 24 hours. This alert differs from **Log Processor No Alert**, which only means that logs were parsed but no scenario fired. Use the sections below to identify why the heartbeat stopped and how to bring the agent back online.
7+
8+
## Common Root Causes & Diagnostics
9+
10+
### Service stopped or stuck
11+
12+
- Confirm the service state on the host:
13+
14+
```bash
sudo systemctl status crowdsec
sudo journalctl -u crowdsec -n 50
```
18+
19+
- For containerised deployments, verify the workload is still running:
20+
21+
```bash
# Docker
docker ps --filter name=crowdsec
# Kubernetes
kubectl get pods -n crowdsec
```
25+
26+
- On the LAPI node, run `sudo cscli machines list` and check whether the `Last Update` column is older than 24 hours for the affected machine.
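
The 24-hour comparison on `Last Update` can be scripted. The following is a minimal sketch, assuming GNU `date` is available and that the timestamp copied from `cscli machines list` is in a format `date -d` can parse (for example RFC 3339). The function name `is_stale` is hypothetical and not part of `cscli`.

```bash
#!/usr/bin/env bash
# Sketch: decide whether a heartbeat timestamp is older than a threshold.
# Assumes GNU date (for `date -d`); `is_stale` is an illustrative helper.

# is_stale <timestamp> [now_epoch] [max_age_seconds]
# Exit status 0 when the timestamp is older than max_age_seconds (default 24 h),
# 1 when it is recent enough, 2 when the timestamp cannot be parsed.
is_stale() {
  local ts="$1"
  local now="${2:-$(date -u +%s)}"
  local max_age="${3:-86400}"
  local ts_epoch
  ts_epoch=$(date -u -d "$ts" +%s) || return 2
  (( now - ts_epoch > max_age ))
}
```

For example, `is_stale "<last_update_value>" && echo "heartbeat older than 24 hours"` flags a machine whose last update is too old.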
27+
28+
### Machine not validated or credentials revoked
29+
30+
- `sudo cscli machines list` on the LAPI shows the machine in `PENDING` state or missing entirely.
31+
- On the agent host, ensure `/etc/crowdsec/local_api_credentials.yaml` exists and contains the expected login and password.
32+
- If you recently reinstalled or renamed the machine, it must be re-validated. See [Machines management](/u/user_guides/machines_mgmt) for details.
33+
34+
### Local API unreachable
35+
36+
- From the agent, run:
37+
38+
```bash
sudo cscli lapi status
```
41+
42+
Errors such as `401 Unauthorized`, TLS failures, or connection timeouts indicate an authentication or network issue.
43+
44+
- Verify that the agent's API settings match your LAPI setup: the credentials file referenced by `api.client.credentials_path` in `/etc/crowdsec/config.yaml` (which holds the `url`), plus any TLS options in use (`ca_cert`, `insecure_skip_verify`). Refer to [Local API configuration](/docs/local_api/configuration) and [TLS authentication](/docs/local_api/tls_auth) if certificates changed.
45+
- Confirm the network path between the agent and the LAPI host is open (default port `8080/TCP`). Firewalls or reverse proxies introduced after installation commonly block the heartbeat.
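
To triage the error text returned by `sudo cscli lapi status`, a rough classifier can help decide which of the checks above to start with. This is only a sketch: the function name `classify_lapi_error` is hypothetical, and the matched substrings are assumptions about typical error wording, not an exhaustive or official list.

```bash
#!/usr/bin/env bash
# Sketch: map an error message to a coarse category.
# The patterns below are assumptions about common wording; adjust as needed.

# classify_lapi_error <message>
# Prints one of: auth, tls, network, unknown.
classify_lapi_error() {
  local msg
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *401*|*unauthorized*|*forbidden*)
      echo "auth" ;;
    *x509*|*tls*|*certificate*)
      echo "tls" ;;
    *timeout*|*"connection refused"*|*"no route to host"*|*"no such host"*)
      echo "network" ;;
    *)
      echo "unknown" ;;
  esac
}
```

On the agent, it could be used as `classify_lapi_error "$(sudo cscli lapi status 2>&1)"`. An `auth` result points to the credentials checks, `tls` to the certificate settings, and `network` to the firewall or proxy path.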
46+
47+
### Local API unavailable
48+
49+
- If several agents show as offline simultaneously, the LAPI service might be down. Check its status on the LAPI machine:
50+
51+
```bash
sudo systemctl status crowdsec
sudo journalctl -u crowdsec -n 50
```
55+
56+
- Inspect `/var/log/crowdsec/` (or container logs) for database or authentication errors that prevent the LAPI from responding.
57+
- Use `sudo cscli metrics show engine` on the LAPI to confirm it is still ingesting events from other agents. See the [Health Check guide](/u/getting_started/health_check) for additional diagnostics.
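
When the log directory is large, counting suspicious lines is a quick first pass before reading the files in full. The sketch below assumes plain-text log files; the helper name `count_lapi_errors` and the search keywords are illustrative assumptions, not strings guaranteed to appear in CrowdSec logs.

```bash
#!/usr/bin/env bash
# Sketch: count log lines that mention database or authentication problems.
# Keywords are assumptions; extend the pattern to match what your logs contain.

# count_lapi_errors <log_file>
# Prints the number of matching lines; returns 1 if the file is unreadable.
count_lapi_errors() {
  local log_file="$1"
  [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
  # grep -c exits non-zero when the count is 0, so neutralise that status.
  grep -Eic 'database|authentication|unauthorized' "$log_file" || true
}
```

Pass it the path of a log file under `/var/log/crowdsec/`. A non-zero count suggests opening that file and reading the matching entries in context.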
58+
59+
## Recovery Actions
60+
61+
### Restart the Log Processor service
62+
63+
- Systemd:
64+
65+
```bash
sudo systemctl restart crowdsec
```
68+
69+
- Docker:
70+
71+
```bash
docker restart crowdsec
```
74+
75+
- Kubernetes:
76+
77+
```bash
# Adjust the resource kind and name to match your deployment.
kubectl rollout restart deployment/crowdsec -n crowdsec
```
80+
81+
After the restart, re-run `sudo cscli machines list` on the LAPI to confirm the `Last Update` timestamp is refreshed.
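
Because the service can take a moment to reconnect after a restart, a small retry loop avoids re-running the check by hand. This is a generic sketch; `retry_until` is a hypothetical helper, and it only works with commands that exit non-zero on failure.

```bash
#!/usr/bin/env bash
# Sketch: re-run a command until it succeeds or the attempts run out.

# retry_until <attempts> <delay_seconds> <command...>
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    # Sleep between attempts, but not after the last one.
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}
```

On the agent, something like `retry_until 6 10 sudo cscli lapi status` would poll for about a minute, assuming the command returns a non-zero exit status while the LAPI is unreachable.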
82+
83+
### Validate or re-register the machine
84+
85+
#### Using credentials
86+
87+
:::info
88+
Credentials are more suitable for single-machine setups.
89+
:::
90+
91+
- To regenerate credentials directly on the LAPI host when the agent runs locally, run:
92+
93+
```bash
sudo cscli machines add -a
```
96+
97+
#### Using the registration system
98+
99+
:::info
100+
The registration system is more suitable for distributed setups.
101+
:::
102+
103+
104+
105+
- Approve pending machines on the LAPI:
106+
107+
```bash
sudo cscli machines validate <machine_name>
```
110+
111+
- If credentials were removed or the agent was rebuilt, re-register it against the LAPI:
112+
113+
```bash
sudo cscli lapi register --url http://<lapi_host>:8080 --machine <machine_name>
sudo systemctl restart crowdsec
```
117+
118+
Update the `--url` value to match your deployment. Auto-registration tokens are covered in [Machines management](/u/user_guides/machines_mgmt#machine-auto-validation).
119+
120+
### Restore connectivity to the Local API
121+
122+
- Open the required port on firewalls or security groups and verify with:
123+
124+
```bash
nc -zv <lapi_host> 8080
```
127+
128+
- If TLS certificates were renewed, update the agent trust store (`ca_cert`) or temporarily enable `insecure_skip_verify: true` for testing. Follow the hardening recommendations in [TLS authentication](/docs/local_api/tls_auth).
129+
- When using proxies or load balancers, ensure they forward HTTP headers and TLS material expected by the LAPI.
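
On hosts where `nc` is not installed, the port check above can be approximated with bash alone. The sketch below assumes bash with `/dev/tcp` support and the coreutils `timeout` command; `port_open` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: test whether a TCP port accepts connections, without nc.
# Relies on bash's /dev/tcp pseudo-device and coreutils `timeout`.

# port_open <host> <port> [timeout_seconds]
# Returns 0 if a TCP connection can be opened within the timeout (default 3 s).
port_open() {
  local host="$1" port="$2" wait="${3:-3}"
  # The child shell exits non-zero if the connection cannot be established.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, `port_open <lapi_host> 8080 && echo reachable || echo blocked`. A successful connection only shows that the port is open, not that the LAPI accepts the agent's credentials.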
130+
131+
### Stabilise the Local API
132+
133+
- Restart the LAPI service or pod if it was unresponsive:
134+
135+
```bash
# systemd
sudo systemctl restart crowdsec
# Kubernetes (adjust the deployment name to match your setup)
kubectl rollout restart deployment/crowdsec-lapi -n crowdsec
```
139+
140+
- Run `sudo cscli support dump` to collect diagnostics if the LAPI repeatedly crashes or loses database access. Review the resulting archive for database connectivity errors and consult the [Security Engine troubleshooting guide](/u/troubleshooting/security_engine) when escalation is required.
141+
142+
Once the heartbeat is restored, the Console alert clears automatically during the next polling cycle. Consider adding a [notification rule](/u/console/notification_integrations/rule) for **Log Processor Offline** so you are alerted promptly when it happens again.
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
1+
---
2+
title: Security Engine Offline
3+
id: security_engine_offline
4+
---
5+
6+
The **Security Engine Offline** alert appears in the Console and notification integrations when an enrolled engine has not reported or logged in to CrowdSec for more than 48 hours. This usually means the core `crowdsec` service (Log Processor + Local API) has stopped working or communicating with our infrastructure.
7+
8+
## Common Root Causes & Diagnostics
9+
10+
### Host or service down
11+
12+
- Check that the `crowdsec` service is running:
13+
14+
```bash
sudo systemctl status crowdsec
sudo journalctl -u crowdsec -n 50
```
18+
19+
- For container or Kubernetes deployments, confirm the workload is still healthy:
20+
21+
```bash
# Docker
docker ps --filter name=crowdsec
# Kubernetes
kubectl get pods -n crowdsec
```
25+
26+
- If the host itself is unreachable (hypervisor, VM, or cloud instance down), the Console cannot receive a heartbeat and marks the engine offline.
27+
28+
### Enrollment revoked or pending
29+
30+
- On the engine, run `sudo cscli console status` to verify it is still enrolled and accepted.
31+
- In the Console, visit **Security Engines** and confirm the engine is not archived or removed. Follow [Pending Security Engines](/u/console/security_engines/pending_security_engines) if it shows as waiting for approval.
32+
- Review `/etc/crowdsec/console.yaml` for disabled options (`console_management`, `custom`, `tainted`, `context`) that may prevent expected data from being sent.
33+
34+
### Console connectivity issues
35+
36+
- `sudo cscli console status` may show errors such as `permission denied`, `unable to reach console`, or TLS failures. Inspect `/var/log/crowdsec/crowdsec.log` (or container stdout) for more details.
37+
- Ensure outbound access to the CrowdSec Console endpoints listed in [Network management](/docs/configuration/network_management). Firewalls or proxy changes often block the HTTPS calls required for heartbeats.
38+
- Verify system time is synced (via NTP). Large clock drifts can invalidate console tokens.
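
On systemd hosts, the clock synchronisation check can be scripted. This is a minimal sketch, assuming `timedatectl` is available (systemd 239 or later for the `show` subcommand); the function name `ntp_synced` is hypothetical, and hosts using another time daemon need a different check.

```bash
#!/usr/bin/env bash
# Sketch: report whether systemd considers the clock NTP-synchronised.

# ntp_synced [state]
# With no argument, reads the NTPSynchronized property from timedatectl.
# An explicit state ("yes"/"no") can be passed, which is handy for testing.
ntp_synced() {
  local state="${1:-$(timedatectl show -p NTPSynchronized --value 2>/dev/null)}"
  [ "$state" = "yes" ]
}
```

Running `ntp_synced || echo "clock is not NTP-synchronised"` on the engine host gives a quick yes/no before digging into token errors.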
39+
40+
### Local API unavailable
41+
42+
- If the Local API is stopped, the Security Engine cannot gather or forward alerts. Check its status on the same host:
43+
44+
```bash
sudo cscli machines list
sudo cscli metrics show engine
```
48+
49+
- Errors in `/var/log/crowdsec/crowdsec_api.log` regarding database connectivity or TLS indicate the Local API is not processing alerts, which in turn stops Console updates. Refer to [Security Engine troubleshooting](/u/troubleshooting/security_engine) and [Log Processor Offline](/u/troubleshooting/log_processor_offline) if needed.
50+
51+
## Recovery Actions
52+
53+
### Restart the Security Engine service
54+
55+
- Systemd:
56+
57+
```bash
sudo systemctl restart crowdsec
```
60+
61+
- Docker:
62+
63+
```bash
docker restart crowdsec
```
66+
67+
- Kubernetes:
68+
69+
```bash
# Adjust the resource kind and name to match your deployment.
kubectl rollout restart deployment/crowdsec -n crowdsec
```
72+
73+
After restarting, re-run `sudo cscli console status` to ensure the heartbeat is restored.
74+
75+
### Re-enroll the engine in the Console
76+
77+
- If the engine was removed or enrollment expired, obtain a fresh key from **Settings > Enrollment** in the Console and run:
78+
79+
```bash
sudo cscli console enroll <ENROLLMENT_KEY>
sudo systemctl restart crowdsec
```
83+
84+
- When replacing an existing enrollment, append `--overwrite` so the Console updates the existing record.
85+
- Confirm the engine appears as **Healthy** in the Console after the restart.
86+
87+
### Restore connectivity to the Console
88+
89+
- Check that the host can reach the CrowdSec services and APIs listed in [Network management](https://doc.crowdsec.net/docs/next/configuration/network_management/).
90+
- If a proxy is required, configure it in `/etc/crowdsec/config.yaml` under `common.http_proxies` and reload the service.
91+
- Renew TLS trust stores if the host cannot validate the Console certificate chain.
92+
93+
### Stabilise the Local API
94+
95+
- Restart the Local API component (same `crowdsec` service or the dedicated LAPI pod) and confirm it responds to local commands such as `sudo cscli machines list`.
96+
97+
- Investigate persistent database or authentication errors using `sudo cscli support dump`, then consult the [Security Engine troubleshooting guide](/u/troubleshooting/security_engine) if issues remain.
98+
99+
Once the engine resumes contact, the Console clears the **Security Engine Offline** alert during the next poll. Consider enabling the **Security Engine Offline** notification in your preferred integration so future outages are caught quickly.
