Skip to content

Conversation

@shaohuzhang1
Copy link
Contributor

fix: Multi process startup document modification executed multiple times

@f2c-ci-robot
Copy link

f2c-ci-robot bot commented Mar 17, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@f2c-ci-robot
Copy link

f2c-ci-robot bot commented Mar 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shaohuzhang1 shaohuzhang1 merged commit 000a397 into main Mar 17, 2025
4 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@main@fix_start branch March 17, 2025 09:00
'The download process was interrupted, please try again')})
update_execute(update_document_status_sql, [])
finally:
lock.un_lock('event_init')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few issues and optimizations that can be made to this code:

Issues:

  1. Lack of Lock Handling: The original code does not handle the lock properly if it fails to acquire the lock within the specified time. This can lead to deadlocks or data inconsistencies.
  2. SQL Injection Risk: Hardcoding SQL queries with dynamic content like ? is inherently risky. It's better to use parameterized queries or ORM methods where available.

Optimizations:

  1. Parameterized Queries: Use Django’s ORM method .filter() instead of raw SQL for safer database operations.
  2. Exception Handling: Implement proper exception handling to manage errors more gracefully, possibly logging them or providing feedback to the user.
  3. Lock Management: Enhance the lock logic to ensure robustness, such as retrying on failure after a delay before giving up entirely.

Here is an optimized version of the code incorporating these improvements:

from django.db import transaction
from django.utils.translation import gettext_lazy as _

from ..db.sql_execute import update_execute

from .models import Model  # Assuming this file has 'Model' defined somewhere
import settings
from common.lock.impl.file_lock import FileLock

lock = FileLock()

@update_document_status_sql = """
UPDATE "public"."document"
SET status =
       CASE WHEN status ~ '\\b1\\b' OR status ~ '\\b0\\b' THEN replacement
           ELSE status END
FROM (
    VALUES ('1', '3'),
           ('0', '3'),
           ('4', '3')
) AS replacements (original, replaced)
WHERE document.status LIKE CONCAT('%', replacements.original, '%');
"""

@transaction.atomic
def run():
    try:
        lock.acquire('event_init', timeout=60 * 5)  # Timeout of 5 minutes
        QuerySet(Model).filter(status=settings.Status.DOWNLOAD).update(status=settings.Status.ERROR,
                                                                    meta={'message': _('The download process was interrupted, please try again')})
        print(f"Updated {QuerySet(Model).filter(status=settings.Status.DOWNLOAD).count()} documents.")
        update_execute(update_document_status_sql, [])
        print("Document statuses updated successfully.")

    except FileLock.LockError:
        print("Failed to acquire lock within the allowed timeframe.")
        return None

    finally:
        lock.release('event_init')

Key Changes:

  • Use Transaction: Wrapped the block in a transaction.atomic to ensure atomicity during the database updates.
  • Parameterized Query: Used a subquery for updating document statuses using the LIKE clause combined with string concatenation.
  • Retry Logic: Added a timeout for acquiring the lock to avoid indefinite waits if needed.
  • Feedback Mechanism: Added print statements for debugging purposes to indicate the number of documents updated and success/failure messages.

This should address the main issues and improve the robustness of the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants