add preStop hook for autoscaling down #142

prabhatsharma · 2025-07-20T21:53:46Z

add preStop hook for autoscaling down

Copilot

Pull Request Overview

This PR implements a preStop hook for ingester pods to support graceful autoscaling down by allowing data to be properly flushed before pod termination. The key changes include adding a preStop lifecycle hook that disables the node, flushes data from memory to WAL, and waits for S3 synchronization, along with extending termination grace periods to accommodate the longer shutdown process.

Adds preStop lifecycle hook to ingester pods for graceful shutdown during autoscaling
Extends terminationGracePeriodSeconds from 30 seconds to 20 minutes for data flushing
Minor documentation improvements to chart README

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
charts/openobserve/values.yaml	Increases termination grace periods to allow time for data flushing
charts/openobserve/templates/ingester-statefulset.yaml	Implements preStop hook with node disable, data flush, and wait logic
charts/openobserve/README.md	Minor formatting improvements to headings

Copilot · 2025-07-20T21:58:06Z

charts/openobserve/values.yaml

+        terminationGracePeriodSeconds: 1200 # 20 minutes for now, since we are using pre-stop hook to flush data andit takes up to 10 minutes to flush data to s3
      livenessProbe:
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 5
        successThreshold: 1
        failureThreshold: 3
-        terminationGracePeriodSeconds: 30
+        terminationGracePeriodSeconds: 1200 # 20 minutes for now, since we are using pre-stop hook to flush data andit takes up to 10 minutes to flush data to s3


There is a spelling error in the comment: 'andit' should be 'and it'.
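
The 20-minute figure in that comment can be sanity-checked against the hook's wait. Kubernetes counts time spent in a preStop hook against terminationGracePeriodSeconds, so the hook's sleep has to fit inside it with room to spare. A minimal sketch of that arithmetic, using the 1200 and 900 second values from this PR (the variable names are illustrative only and do not appear in the chart):

```shell
#!/bin/sh
# Values taken from this PR; the variable names are hypothetical.
GRACE_PERIOD_SECONDS=1200   # terminationGracePeriodSeconds in values.yaml
PRESTOP_WAIT_SECONDS=900    # sleep duration in the preStop hook

# Time left for the disable and flush calls plus the process's own shutdown.
HEADROOM_SECONDS=$((GRACE_PERIOD_SECONDS - PRESTOP_WAIT_SECONDS))
echo "headroom: ${HEADROOM_SECONDS}s"

if [ "$HEADROOM_SECONDS" -le 0 ]; then
  echo "warning: the preStop wait consumes the whole grace period" >&2
fi
```

With the values in this PR the headroom is 300 seconds, which the two API calls and the container shutdown have to share.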

Copilot · 2025-07-20T21:58:06Z

charts/openobserve/templates/ingester-statefulset.yaml

+                  # Wait for 900 seconds after flush to ensure data is moved to s3
+                  # 15 minutes for now, since file movement to s3 may take up to 10 minutes
+                  echo "Waiting 900 seconds to flush data..."
+                  sleep 900


The sleep duration of 900 seconds is a magic number that should be made configurable through values.yaml to allow for environment-specific tuning.
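
One way to act on this, sketched under assumptions: the hook could read the wait from an environment variable and fall back to 900 when it is unset or malformed. FLUSH_WAIT_SECONDS and resolve_wait_seconds are hypothetical names, not existing chart settings; the variable would need to be populated from a new values.yaml entry.

```shell
#!/bin/sh
# FLUSH_WAIT_SECONDS is a hypothetical variable, not part of the chart today.
# Prints the number of seconds to wait, defaulting to the PR's current 900.
resolve_wait_seconds() {
  wait_seconds="${FLUSH_WAIT_SECONDS:-900}"
  case "$wait_seconds" in
    *[!0-9]*)
      # Anything other than a non-negative integer falls back to the default.
      echo "Invalid FLUSH_WAIT_SECONDS '${wait_seconds}', using 900" >&2
      wait_seconds=900
      ;;
  esac
  echo "$wait_seconds"
}

echo "Waiting $(resolve_wait_seconds) seconds to flush data..."
# In the hook itself this would be followed by: sleep "$(resolve_wait_seconds)"
```

Keeping 900 as the fallback means existing deployments would behave the same until the value is overridden.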

Copilot · 2025-07-20T21:58:06Z

charts/openobserve/templates/ingester-statefulset.yaml

+                  curl -X PUT "http://localhost:{{ .Values.config.ZO_HTTP_PORT }}/node/enable?value=false" \
+                    -H "Authorization: Basic ${AUTH_HEADER}" 
+
+                  # returns 200 if successful and "true" if the node is disabled
+
+                  # Flush all data from memory to WAL. This does not flush data from ingester to s3.
+                  echo "Flushing data from ingester..."
+                  curl -X PUT "http://localhost:{{ .Values.config.ZO_HTTP_PORT }}/node/flush" \
+                    -H "Authorization: Basic ${AUTH_HEADER}" 
+
+                  # returns 200 if successful and "true" if the node is flushed


The curl commands lack error handling. If the API calls fail, the script continues without knowing if the operations succeeded, which could lead to data loss.

Suggested change

-                  curl -X PUT "http://localhost:{{ .Values.config.ZO_HTTP_PORT }}/node/enable?value=false" \
-                    -H "Authorization: Basic ${AUTH_HEADER}"
-
-                  # returns 200 if successful and "true" if the node is disabled
-
-                  # Flush all data from memory to WAL. This does not flush data from ingester to s3.
-                  echo "Flushing data from ingester..."
-                  curl -X PUT "http://localhost:{{ .Values.config.ZO_HTTP_PORT }}/node/flush" \
-                    -H "Authorization: Basic ${AUTH_HEADER}"
-
-                  # returns 200 if successful and "true" if the node is flushed
+                  RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" -X PUT "http://localhost:{{ .Values.config.ZO_HTTP_PORT }}/node/enable?value=false" \
+                    -H "Authorization: Basic ${AUTH_HEADER}")
+                  if [ "$RESPONSE" -ne 200 ]; then
+                    echo "Error: Failed to disable ingester node. HTTP response code: $RESPONSE"
+                    exit 1
+                  fi
+
+                  # Flush all data from memory to WAL. This does not flush data from ingester to s3.
+                  echo "Flushing data from ingester..."
+                  RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" -X PUT "http://localhost:{{ .Values.config.ZO_HTTP_PORT }}/node/flush" \
+                    -H "Authorization: Basic ${AUTH_HEADER}")
+                  if [ "$RESPONSE" -ne 200 ]; then
+                    echo "Error: Failed to flush data from ingester. HTTP response code: $RESPONSE"
+                    exit 1
+                  fi
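
Since the suggestion repeats the same status check for both calls, the check could be factored into small helpers. The following is only a sketch: require_http_200, put_status, and BASE_URL are hypothetical names that do not appear in the chart, and AUTH_HEADER is assumed to be set earlier in the hook as in the PR. It compares the status as a string, so an empty or non-numeric value is reported as a failure rather than causing a test error.

```shell
#!/bin/sh
# Return 0 only when the given HTTP status code is exactly 200.
# $1 is the status code, $2 a short name for the step being checked.
require_http_200() {
  status="$1"
  step="$2"
  if [ "$status" != "200" ]; then
    echo "Error: ${step} failed. HTTP response code: ${status}" >&2
    return 1
  fi
}

# Issue an authenticated PUT to the URL in $1 and print only the status code.
# curl prints 000 when no HTTP response was received at all.
put_status() {
  curl -s -o /dev/null -w "%{http_code}" -X PUT "$1" \
    -H "Authorization: Basic ${AUTH_HEADER}"
}

# Example wiring inside the hook (BASE_URL is hypothetical):
#   require_http_200 "$(put_status "${BASE_URL}/node/enable?value=false")" "disable node" || exit 1
#   require_http_200 "$(put_status "${BASE_URL}/node/flush")" "flush" || exit 1
```

Whether the hook should exit early on a failed call, and so skip the wait for S3 synchronization, is a design choice this sketch leaves to the caller.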

charts/openobserve/templates/ingester-statefulset.yaml

Co-authored-by: Copilot <[email protected]>

add preStop hook for autoscaling down

ba767e3

prabhatsharma requested a review from Copilot July 20, 2025 21:57

Copilot AI reviewed Jul 20, 2025

prabhatsharma and others added 3 commits July 20, 2025 15:15

Update charts/openobserve/templates/ingester-statefulset.yaml

b9b3eca

Co-authored-by: Copilot <[email protected]>

Update charts/openobserve/values.yaml

db4ecfe

Co-authored-by: Copilot <[email protected]>

typo fix

da26c66

hengfeiyang approved these changes Jul 21, 2025
