-
Version: 2.4.60
Installation Method: Security Onion ISO image
Description: configuration
Installation Type: Distributed
Location: on-prem with Internet access
Hardware Specs: Exceeds minimum requirements
CPU: varies depending on the node; minimum of 4
RAM: varies depending on the node; minimum of 16 GB
Storage for /: varies; minimum of 128 GB
Storage for /nsm: varies heavily, between 500 GB and 70 TB
Network Traffic Collection: SPAN port
Network Traffic Speed: more than 10 Gbps
Status: No, one or more services are failed (please provide detail below)
Salt Status: No, there are no failures
Logs: Yes, there are additional clues in /opt/so/log/ (please provide detail below)

Detail: We have been building out a new distributed 2.4 install: VMs for the manager node and two search nodes, plus physical servers for two forward sensor nodes and one more search node. We connected the first forward node to capture traffic between our datacenter and the rest of campus, and everything seemed to work fine for about a week. Then we started getting faults on the manager node and the error "The search query encountered a failure within the Elasticsearch cluster." when trying to view alerts. Looking at the logs and at Kibana under Stack Management, we found that shards were failing and that some Zeek indices from the forward node were showing red health. Deleting the problem indices fixed the issue temporarily, but shortly afterward the indices would turn red and the shards would fail again. Now, after rebooting the entire deployment, the forward node is also showing a fault, reporting that elasticsearch and logstash are missing, and attempts to restart elasticsearch and logstash keep failing.
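In case it helps, this is how we have been confirming which indices have gone red, a minimal sketch run on the manager, assuming the so-elasticsearch-query wrapper that ships with Security Onion 2.4 (the _cat and _cluster endpoints themselves are standard Elasticsearch):

# list index health (green/yellow/red) and filter for red ones
sudo so-elasticsearch-query '_cat/indices?v' | grep red
# overall cluster health, including counts of unassigned shards
sudo so-elasticsearch-query '_cluster/health?pretty'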
-
From the manager, can you give us the output from something like …
-
Thanks!
On sostorep1, can you run the command
test -d /nsm && echo "correct" || echo "incorrect"
/nsm should be a directory, and it looks like it might currently exist as a file. If the output says "incorrect", try removing the file (there is a sketch at the end of this reply).

On sostorev1, it looks like it might be having problems communicating with the manager; check that it still has access to the manager. You can try running
nc -zv <managerip> 4505
nc -zv <managerip> 4506
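If nc is not installed on that node, a rough equivalent using bash's built-in /dev/tcp redirection (a sketch; <managerip> is a placeholder as above, and 4505/4506 are the Salt ports from the docs link below):

for p in 4505 4506; do
  # attempt a TCP connect with a 3-second cap; /dev/tcp is a bash feature
  timeout 3 bash -c "echo > /dev/tcp/<managerip>/$p" && echo "port $p open" || echo "port $p unreachable"
done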
Here is a list of ports that all nodes need to be able to reach on the manager: https://docs.securityonion.net/en/2.4/firewall.html#node-communication
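And for the /nsm check above: if it prints "incorrect", a minimal repair sketch (this assumes the stray file holds nothing important, hence the backup step rather than an outright delete):

# move the stray file out of the way instead of deleting it
sudo mv /nsm /nsm.file.bak
# recreate /nsm as a directory and confirm
sudo mkdir /nsm
ls -ld /nsm
# a standard Salt highstate run should then lay down the expected contents
sudo salt-call state.highstate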