sudo salt-call state.highstate output:

Summary for local
---------------
Succeeded: 1603 (changed=38)
Failed: 0
---------------
Total states run: 1603
Total run time: 284.988 s
Some rules have proven untunable for me; the common denominator appears to be the network.data.decoded field (details below):
- ET INFO Observed UA-CPU Header
- ET DNS Excessive NXDOMAIN responses - Possible DNS Backscatter or Domain Generation Algorithm Lookups
- ET INFO exe download via HTTP - Informational
- ET MALWARE Suspected BPFDoor UDP Magic Packet (Inbound)
- GPL INFO MISC Tunneling IP over DNS with NSTX
Guidelines
I have read the discussion guidelines at Read before posting! #1720 and assert that I have followed the guidelines.
Version
2.4.160
Installation Method
Security Onion ISO image
Description
other (please provide detail below)
Installation Type
Standalone
Location
on-prem with Internet access
Hardware Specs
Exceeds minimum requirements
CPU
12
RAM
40 GB
Storage for /
163 GB
Storage for /nsm
327 GB
Network Traffic Collection
tap
Network Traffic Speeds
1Gbps to 10Gbps
Status
Yes, all services on all nodes are running OK
Salt Status
No, there are no failures
Logs
No, there are no additional clues
Detail
sudo salt-call state.highstate output: see the summary at the top of this post.
So I reinstalled SO and figured out how to keep an SO deployment healthy: daily reboots via a scheduled task/cron job. This has kept SO stable since the last discussion. It appears that over time an Elasticsearch shard issue crops up and the whole thing falls apart without a reboot (excessive swap usage, etc.). My theory is that it simply has a hard time keeping up with the TAP NICs and the events-per-second rate. I understand this may not be standard practice or covered in the documentation, but it has worked for me. I'm very happy to have a home lab running 24/7 without the previous faults (the one exception: Elasticsearch currently always shows "pending").

I've set up a custom API alert integration that notifies me of certain alerts across Sigma/Suricata/Elastic Defend, so I'm kept in the loop while away from the SOC console. I've also enabled all rules, including all Elastic Defend alerts/rules (plus prevention in Elastic Defend), for good coverage. I'm getting a LOT out of it now and have leveled up considerably over the several years I've been using SO: basic nmap and recon scans light up, and I'm catching previously unnoticed scanning activity on the perimeter, blocking entire ASNs/geos, and reducing noise!
I do see at least 5% packet loss right now, which then suddenly drops back to 1%... that may be a side issue, but it's definitely annoying considering the uplinks are 1 Gbps symmetric fiber, and I have increased the MTU, ring buffer sizes, and everything else to ensure the sensor can accommodate a modern fiber network. (I did this on all monitoring NICs:
sudo ethtool -G ens161 rx 4096 tx 4096 rx-jumbo 4096
)

So, finally getting to tuning to reduce noise, I keep hitting obstacles with certain rules. It's annoying, to say the least, when you've taken a lot of stabs at a rule and it still triggers. But I'm happy to have finally reached this point with an SO deployment.
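Since the loss figure fluctuates, it helps to compute the drop rate directly from interface counters (e.g. from `ethtool -S ens161` or `ip -s link show ens161`) rather than eyeballing the dashboard. A minimal sketch; the helper name `drop_pct` and the sample numbers are mine, not from SO:

```shell
# Hypothetical helper: percentage of packets dropped, given two counters
# read from the monitoring NIC (dropped packets, total packets received).
drop_pct() {
  awk -v d="$1" -v t="$2" 'BEGIN { printf "%.2f\n", ((t > 0) ? 100 * d / t : 0) }'
}

# Example: 5,000 drops out of 100,000 packets.
drop_pct 5000 100000   # prints 5.00
```

Running this before and after a ring-size change gives a quick sanity check that the change actually moved the needle.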
My routine after tuning rules in the UI has worked for certain rules. However, I'm facing some rules that seem untunable, and the common denominator appears to be the network.data.decoded field. No matter how much I regex or try to sofilter it out, the alerts keep coming.

Focusing on this rule as an example:
GPL INFO MISC Tunneling IP over DNS with NSTX
Example network.data.decoded output:

`<...........(f38b6553438ee053a9ff902c92d6a3d80f7dceb3.malware.hash.cymru.com.......)........`
Here are my attempts at tuning this rule:
| Enabled | Type | Override content | Created (-05:00) |
|---|---|---|---|
| true | Modify | `network.data.decoded: \.?[a-zA-Z0-9_-]+\.(cymru\.com)\b` | 2025-08-03 10:30:40 |
| true | Modify | `network.data.decoded \.?[a-zA-Z0-9_-]+\.(cymru\.com)\b` | 2025-08-03 11:51:08 |
| true | Modify | `startswith; content: !"cymru.com"; nocase;` | 2025-08-03 13:31:16 |
| true | Modify | `startswith; startswith; content:"cymru.com"; nocase;` | 2025-08-04 00:40:56 |
| true | Modify | `startswith; startswith; content:!"cymru.com"; nocase;` | 2025-08-10 23:24:46 |
| true | Modify | `contains; sofilter: network.data.decoded\|re: "cymru\.com"` | 2025-08-11 23:47:27 |
| true | Modify | `sofilter: network.data.decoded\|re: "cymru\.com"` | 2025-08-11 23:47:27 |
| true | Modify | `sofilter: regex: "cymru"` | 2025-08-15 21:49:25 |
| true | Modify | `sofilter: .*\|regex: "(?i)cymru"` | 2025-08-16 10:10:51 |
| true | Modify | `sofilter: sofilter: .*\|regex: "(?i)cymru"` | 2025-08-16 10:11:18 |
| true | Modify | `sofilter: sofilter: network.data.decoded\|regex: "(?i)cymru"` | 2025-08-16 10:11:52 |
| true | Modify | `sofilter: network.data.decoded\|re: "(?i)\bcymru\.com\b"` | 2025-08-16 18:46:35 |
| true | Suppress | `by_src [REDACTED_IP]/32` | 2025-08-17 12:50:31 |
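For what it's worth, a negated content match in Suricata is only meaningful when it is attached to a buffer the rule actually inspects. Assuming the NSTX rule inspects DNS traffic, a modify that appends something like the following would at least be syntactically valid Suricata (`dns.query` is a real Suricata sticky buffer; whether the override engine accepts this particular modify is an assumption on my part):

```
# Hypothetical addition to the rule body (not the SO override syntax itself):
dns.query; content:!"cymru.com"; nocase;
```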
Obviously this is a benign Team Cymru threat-intel hash-lookup DNS request that keeps triggering the rule. No matter how many attempts I've made to suppress it, it won't go away. I'm curious whether this is a known issue, what workarounds exist, and what the best tuning approach is for SO rules.
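For reference, the underlying Suricata mechanism for silencing a signature for specific traffic is a suppress entry in threshold.conf; in SO 2.4 this is managed through the Detections UI rather than edited directly, so the following is only a sketch of the raw Suricata syntax. The `<SID>` and `<RESOLVER_IP>` are placeholders to be taken from the actual alert:

```
# Suppress alerts for one signature when the source is the local resolver.
# gen_id 1 is the standard Suricata generator; <SID> = the rule's sid.
suppress gen_id 1, sig_id <SID>, track by_src, ip <RESOLVER_IP>
```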
Another thing I've noticed: Suricata rules that have been disabled continue to show up in the UI (despite apparently not triggering API notifications) and require a 0.0.0.0/0 "either" suppression to fully disappear. I'm not too upset, since that seems to work, but I am concerned there may be a fault somewhere in the tuning process causing this.

Finally, I'm wondering whether there is any global modify/tuning capability, such as filtering a given field (or any field) across all Suricata/ET rules, since sometimes multiple rules trigger on the same thing. Also, is there a way to quickly validate submitted tuning modifications against rules/events (e.g., a quick command to see whether a rule would still trigger, or whether a filter would catch/exclude a given event)?
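In lieu of a built-in validator, one crude local check is to run the regex from a filter against a captured network.data.decoded value and see whether it matches at all. The pattern below is just an example; note that `grep -E` uses POSIX ERE, which is close to, but not identical to, whatever regex dialect the override engine uses:

```shell
# Sample decoded payload copied from the alert above.
sample='<...........(f38b6553438ee053a9ff902c92d6a3d80f7dceb3.malware.hash.cymru.com.......)........'

# Candidate filter regex: does it actually match the payload?
if printf '%s' "$sample" | grep -Eq '[a-z0-9]+\.malware\.hash\.cymru\.com'; then
  echo "match: the filter would catch this event"
else
  echo "no match: the filter would NOT catch this event"
fi
```

A regex that fails this quick check has no chance of working as a filter, which at least rules out one class of tuning mistakes before submitting an override.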
TIA!