Unintended log deletions - "Used disk space exceeds LOG_SIZE_LIMIT (0 GB)" #15617

S6T0Sa0B1v · 2026-03-16T14:42:05Z

Version

2.4.201

Installation Method

Security Onion ISO image

Description

other (please provide detail below)

Installation Type

other (please provide detail below)

Location

on-prem with Internet access

Hardware Specs

Exceeds minimum requirements

CPU

32

RAM

33.2 GB

Storage for /

264.0 GB

Storage for /nsm

67068.1 GB

Network Traffic Collection

other (please provide detail below)

Network Traffic Speeds

Less than 1Gbps

Status

Yes, all services on all nodes are running OK

Salt Status

Yes, there are salt failures (please provide detail below)

Logs

Yes, there are additional clues in /opt/so/log/ (please provide detail below)

Detail

Hello Security Onion,

I am writing in as a follow-up to my previous thread re: an issue with unprompted log deletion on our organization's Security Onion deployment. To summarize again, we have two nodes in our Onion grid: a main Standalone server and a Fleet node in our DMZ which sends logs to the first. On the Standalone node the following ILM retention settings are configured:

Warm: 30d
Cold: 90d
Delete: 365d

so-elasticsearch-indices-delete: On
elasticsearch.retention.retention_pct: 50

I logged in this morning to find that the issue we reported earlier had recurred over the weekend. Once again, nearly all of our historical logs had been deleted from Dashboards long before the intended ILM retention thresholds; nearly everything dated before 15MAR26 was gone. At the suggestion of a poster in the previous thread, I read through /opt/so/log/elasticsearch/so-elasticsearch-indices-delete.log from that day and found several entries of interest, a sample of which I have provided below:

Sun Mar 15 04:41:10 AM UTC 2026 - Used disk space exceeds LOG_SIZE_LIMIT (0 GB) - There is only one backing index (.ds-logs-elasticsearch.server-default-2026.02.21-000001).  Deleting logs-elasticsearch.server-default data stream...
{"acknowledged":true}
Sun Mar 15 04:41:35 AM UTC 2026 - Used disk space exceeds LOG_SIZE_LIMIT (0 GB) - There is only one backing index (.ds-logs-endpoint.events.api-default-2026.02.21-000001).  Deleting logs-endpoint.events.api-default data stream...
{"acknowledged":true}

[...]

Over the next ten minutes or so, several dozen entries like the ones above follow, cycling through our various log indices and deleting them, until the file ends for the day with the following:

Sun Mar 15 04:50:54 AM UTC 2026 - Used disk space exceeds LOG_SIZE_LIMIT (0 GB) - There is only one backing index (.ds-logs-detections.alerts-so-2026.03.14-000022).  Deleting logs-detections.alerts-so data stream...
{"acknowledged":true}
Sun Mar 15 04:50:57 AM UTC 2026 -> Maximum iteration limit reached (10). Unable to bring disk below threshold. Writing alert ([GUID]) to /opt/so/log/elasticsearch/indices-delete-alert.log
Sun Mar 15 04:50:57 AM UTC 2026,[GUID],Maximum iteration limit reached (10). Unable to bring disk below threshold.
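
Entries like the ones above can be tallied to see which data streams were removed while the limit was reported as 0 GB. The following is a minimal sketch, assuming the log format matches the sample lines shown here and that each entry sits on a single line in the file. `list_zero_limit_deletions` is a hypothetical helper name, not a Security Onion tool; it reads log text on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only; not part of Security Onion.
# Prints the unique data stream names deleted while LOG_SIZE_LIMIT was 0 GB.
# Assumes each log entry is one line in the format quoted in this post.
list_zero_limit_deletions() {
  grep -F 'LOG_SIZE_LIMIT (0 GB)' \
    | grep -oE 'Deleting [^ ]+ data stream' \
    | awk '{print $2}' \
    | sort -u
}

# Example invocation on a node (path taken from this post):
#   list_zero_limit_deletions < /opt/so/log/elasticsearch/so-elasticsearch-indices-delete.log
```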

While this does seem to confirm that elasticsearch-indices-delete is indeed the source of the log deletion, we still have several questions about how and why this is happening given the retention settings we have configured.

If our /nsm partition has several terabytes of available space, how is elasticsearch-indices-delete being triggered to delete these logs?
Which misconfigured setting (related to "LOG_SIZE_LIMIT (0 GB)") is prompting Onion to delete these logs with 0 GB of space expended?
How often is this deletion process running, and where is that schedule configured?
What would be the correct way to remediate this unintended deletion schedule and ensure that our ILM retention thresholds are followed? We were under the impression that our settings shown above would allow us to retain logs for up to a year of cold retention before deletion.

Thank you again for any and all assistance you can provide. Please let me know if you have any additional questions and I will follow up as soon as I am able.

Result of salt-call state.highstate:

local:
    Data failed to compile:
----------
    The function "state.highstate" is running as PID 846826 and was started at 2026, Mar 16 14:36:24.162181 with jid 20260316143624162181

Guidelines

I have read the discussion guidelines at Read before posting! #1720 and assert that I have followed the guidelines.

Answered by cm-ops

cm-ops · 2026-03-17T13:30:04Z

Maintainer

A couple of places to check. In your Elasticsearch log, /opt/so/log/elasticsearch/securityonion.log, look for an entry showing the standalone node disconnecting from the cluster. When the script runs from cron, LOG_SIZE_LIMIT is calculated in the so-elasticsearch-indices-delete-delete script as LOG_SIZE_LIMIT_GB=$(/usr/sbin/so-elasticsearch-cluster-space-total 50)

If this part of the script failed to yield a number:

# Iterate through the output of _cat/allocation for each node in the cluster to determine the total available space
for i in $(/usr/sbin/so-elasticsearch-query _cat/allocation | awk '{print $8}'); do
  size=$(echo $i | grep -oE '[0-9].*' | awk '{print int($1+0.5)}')
  unit=$(echo $i | grep -oE '[A-Za-z]+')
  if [ $unit = "tb" ]; then
    size=$(( size * 1024 ))
  fi
  TOTAL_AVAILABLE_SPACE=$(( TOTAL_AVAILABLE_SPACE + size ))
done

Then TOTAL_AVAILABLE_SPACE would stay at 0 (the script initializes it with TOTAL_AVAILABLE_SPACE=0), hence the 0 GB you saw.
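
The failure mode described above can be reproduced in isolation. The sketch below is not the Security Onion script: `sum_available_gb` and `limit_is_usable` are hypothetical helper names, and the sample values stand in for the column the script reads from `_cat/allocation`. It shows that an empty query result leaves the total at 0, and how a guard might skip cleanup in that case.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only; not part of Security Onion.

# Sum size strings such as "500gb" or "2tb" into whole gigabytes,
# using the same rounding and tb-to-gb conversion as the loop quoted above.
sum_available_gb() {
  local total=0 value size unit
  for value in "$@"; do
    size=$(echo "$value" | grep -oE '[0-9].*' | awk '{print int($1+0.5)}')
    unit=$(echo "$value" | grep -oE '[A-Za-z]+')
    if [ "$unit" = "tb" ]; then
      size=$(( size * 1024 ))
    fi
    total=$(( total + size ))
  done
  echo "$total"
}

# Succeed only if the computed limit is a positive integer.
limit_is_usable() {
  local limit_gb=$1
  [[ "$limit_gb" =~ ^[0-9]+$ ]] && [ "$limit_gb" -gt 0 ]
}

# Simulate a failed query (no values) versus a healthy one.
for sample in "" "500gb 2tb"; do
  # $sample is left unquoted on purpose so it splits into separate values.
  total=$(sum_available_gb $sample)
  if limit_is_usable "$total"; then
    echo "total=${total} GB: safe to compare against disk usage"
  else
    echo "total=${total} GB: skipping cleanup, space could not be determined"
  fi
done
```

With no values to iterate over, the loop body never runs and the total keeps its initial value, which is consistent with the "LOG_SIZE_LIMIT (0 GB)" entries in the log.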

To find the cron job, look in root's crontab. It runs every 5 minutes by default, as defined in this Salt state:

{%     if grains.role in ['so-eval', 'so-standalone', 'so-managersearch', 'so-heavynode', 'so-manager'] %}
so-elasticsearch-indices-delete:
  cron.{{ap}}:
    - name: /usr/sbin/so-elasticsearch-indices-delete > /opt/so/log/elasticsearch/cron-elasticsearch-indices-delete.log 2>&1
    - identifier: so-elasticsearch-indices-delete
    - user: root
    - minute: '*/5'
    - hour: '*'
    - daymonth: '*'
    - month: '*'
    - dayweek: '*'
{%     endif %}
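
To confirm the schedule on a given node, the installed entry can be read from root's crontab. This is a minimal sketch, assuming the job invokes the /usr/sbin/so-elasticsearch-indices-delete path shown in the state above. `find_indices_delete_cron` is a hypothetical helper that filters crontab text supplied on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only; not part of Security Onion.
# Prints crontab lines that invoke the cleanup script; prints nothing
# (and still succeeds) when no such line exists.
find_indices_delete_cron() {
  grep -F '/usr/sbin/so-elasticsearch-indices-delete' || true
}

# Example invocation on a node (run as root):
#   crontab -l -u root | find_indices_delete_cron
```

If the helper prints nothing, the entry is not installed for that user.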

If you are confident that your ILM settings will keep the cluster healthy, you should disable the SOC > Administration > Configuration > elasticsearch > index_clean setting.

1 reply

S6T0Sa0B1v Mar 19, 2026
Author

This is basically exactly what we see, and index_clean has been turned off. Thanks again for helping us track this down, Chris!
