Fix SQLite concurrency issues with WAL mode and timeouts #982
Conversation
📝 Walkthrough
Reworked the SQLite connection manager to set a 30s timeout, enable WAL and concurrency/performance PRAGMAs, enforce relational-integrity PRAGMAs, and add lifecycle guards. Refactored image-related DB functions to use the unified context manager and changed error paths to raise exceptions.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
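The reworked connection manager itself is not shown in this thread. A minimal sketch of what such a manager might look like, assuming a module-level `DB_PATH` (a placeholder here, not the project's actual config) and the timeout and PRAGMA values quoted later in the review:

```python
import sqlite3
from contextlib import contextmanager

DB_PATH = "app.db"  # placeholder; the real path lives in the app's configuration

@contextmanager
def get_db_connection():
    """Sketch of the reworked manager: 30s timeout, WAL plus concurrency
    PRAGMAs, foreign-key enforcement, and guarded rollback/close."""
    conn = None
    try:
        conn = sqlite3.connect(DB_PATH, timeout=30.0)
        conn.execute("PRAGMA journal_mode = WAL;")
        conn.execute("PRAGMA synchronous = NORMAL;")
        conn.execute("PRAGMA cache_size = -64000;")    # negative = size in KiB (~64 MB)
        conn.execute("PRAGMA mmap_size = 268435456;")  # 256 MB memory-mapped I/O
        conn.execute("PRAGMA busy_timeout = 30000;")
        conn.execute("PRAGMA foreign_keys = ON;")
        yield conn
        conn.commit()
    except Exception:
        if conn:  # lifecycle guard: connect() itself may have failed
            conn.rollback()
        raise
    finally:
        if conn:
            conn.close()
```

The `if conn:` guards are what the review later calls lifecycle guards: without them, a failed `sqlite3.connect()` would leave `conn` unset and the cleanup paths would raise on their own.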
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @backend/app/database/images.py:
- Around line 408-426: The file mixes the new get_db_connection() context
manager (used in db_toggle_image_favourite_status) and legacy _connect() calls
causing inconsistent concurrency/locking behavior; migrate the remaining
image-related DB functions (e.g., db_create_images_table, db_bulk_insert_images,
db_get_all_images, db_get_untagged_images, db_update_image_tagged_status,
db_insert_image_classes_batch, db_get_images_by_folder_ids,
db_delete_images_by_ids and any others still calling _connect) to use
get_db_connection() as a context manager, replace direct _connect() usages with
"with get_db_connection() as conn: cursor = conn.cursor()", ensure WAL
mode/timeout are applied by get_db_connection, call conn.commit() after writes
where needed, preserve existing SQL and return semantics, and align exception
logging/raising to the pattern used in db_toggle_image_favourite_status
(logger.error and re-raise).
- Around line 47-53: The file mixes two connection strategies: the legacy
_connect() and the context-managed get_db_connection(); migrate all functions
that currently call _connect() (the eight functions referenced in the review) to
use the get_db_connection() context manager like the existing
db_toggle_image_favourite_status does. Replace patterns like conn = _connect();
cur = conn.cursor() ... conn.commit(); conn.close() with a with
get_db_connection() as conn: block, use conn.cursor() inside it, remove explicit
conn.close(), and keep commit/rollback semantics consistent (commit when
mutating, let exceptions propagate for rollback). Ensure you do not re-apply
PRAGMA settings already handled by get_db_connection() and update any
docstrings/comments to reflect the unified connection strategy.
🧹 Nitpick comments (3)
backend/app/database/connection.py (3)
20-28: Both the `timeout` parameter and the `busy_timeout` PRAGMA are redundant. Setting both `timeout=30.0` (line 20) and `PRAGMA busy_timeout = 30000` (line 28) configures the same 30-second wait period for locked databases twice. The `timeout` parameter is sufficient and slightly more efficient.

♻️ Simplify by removing the redundant PRAGMA:

```diff
- conn.execute("PRAGMA busy_timeout = 30000;")  # 30 second timeout for locks
```

The connection-level `timeout` parameter already handles this.
23-23: Setting WAL mode on every connection is inefficient. WAL mode is a database-level persistent setting, not a connection-level setting. Once enabled, it remains active for all subsequent connections until explicitly changed. Executing `PRAGMA journal_mode = WAL` on every connection adds unnecessary overhead. Consider setting WAL mode once during database initialization (e.g., in a migration script or startup routine) rather than on every connection.
22-28: Consider the performance overhead of setting multiple PRAGMAs on every connection.Executing 7 PRAGMA statements on every connection has measurable overhead, especially for short-lived connections or high-frequency operations. While these settings optimize concurrency, the setup cost may outweigh the benefits for quick queries.
For frequently-used settings like WAL mode, synchronous, and foreign keys, consider:
- Setting persistent PRAGMAs (WAL, synchronous) once at database initialization
- Applying connection-specific PRAGMAs (cache_size, mmap_size) only for long-lived or high-load connections
- Profiling whether all 7 PRAGMAs provide measurable benefit in your workload
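The persistence argument can be checked directly: `journal_mode = WAL` is written into the database file and survives new connections, so a sketch like the following (using a throwaway file) demonstrates why it only needs to be set once:

```python
import os
import sqlite3
import tempfile

path = tempfile.NamedTemporaryFile(suffix=".db", delete=False).name

# Set WAL once, e.g. in an initialization or migration step.
init = sqlite3.connect(path)
init.execute("PRAGMA journal_mode = WAL;")
init.close()

# A later, plain connection inherits the persisted journal mode.
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode;").fetchone()[0]
print(mode)  # wal
conn.close()
os.unlink(path)
```

By contrast, per-connection settings such as `busy_timeout`, `cache_size`, and `mmap_size` reset with each new connection, which is why they belong in the connection manager if they are wanted at all.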
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/app/database/connection.py
backend/app/database/images.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Hemil36
Repo: AOSSIE-Org/PictoPy PR: 570
File: backend/app/database/connection.py:16-24
Timestamp: 2025-10-31T17:00:50.132Z
Learning: In PictoPy backend, the user prefers not to use database connection retry logic or extended busy timeouts in the centralized get_db_connection() context manager, even though the app has concurrent access patterns via ProcessPoolExecutor and FastAPI.
Applied to files:
backend/app/database/connection.py
🧬 Code graph analysis (1)
backend/app/database/images.py (1)
backend/app/database/connection.py (1)
get_db_connection(8-43)
🔇 Additional comments (4)
backend/app/database/connection.py (3)
20-20: Verify: this contradicts prior team preferences on timeout configuration. Based on learnings, user Hemil36 previously preferred not to use extended busy timeouts in `get_db_connection()`, even with concurrent access patterns. The addition of `timeout=30.0` here contradicts that preference. Please confirm whether the team's position on timeout configuration has changed, or whether this should be adjusted to align with the previous decision.
25-27: Verify that large cache and mmap sizes are appropriate for your deployment environment. The settings allocate significant memory per connection:
- `cache_size = -64000` → 64 MB cache
- `mmap_size = 268435456` → 256 MB memory-mapped I/O
For applications with many concurrent connections or memory-constrained environments, these values could lead to excessive memory usage or OOM errors.
Please verify:
- Whether these sizes align with your deployment environment's memory capacity
- Whether connection pooling limits are in place to prevent excessive memory consumption
- Whether these values have been tested under your expected peak concurrency
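One way to sanity-check these sizes before committing to them is to set the PRAGMAs on a scratch database and read the effective values back; note that a negative `cache_size` is interpreted as KiB, and some SQLite builds cap or disable mmap entirely:

```python
import os
import sqlite3
import tempfile

path = tempfile.NamedTemporaryFile(suffix=".db", delete=False).name
conn = sqlite3.connect(path)
conn.execute("PRAGMA cache_size = -64000;")    # negative value = size in KiB
conn.execute("PRAGMA mmap_size = 268435456;")  # request 256 MiB of mmap I/O

cache_kib = -conn.execute("PRAGMA cache_size;").fetchone()[0]
mmap_bytes = conn.execute("PRAGMA mmap_size;").fetchone()[0]
print(f"cache: {cache_kib} KiB")
print(f"mmap:  {mmap_bytes} bytes")  # may be 0 if the build disables mmap
conn.close()
os.unlink(path)
```

Reading the values back on the actual deployment target answers the reviewer's question empirically rather than by assumption.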
38-39: LGTM: connection lifecycle guards prevent errors. The addition of `if conn:` guards before `rollback()` and `close()` properly handles cases where connection initialization fails, preventing potential `AttributeError` exceptions.
Also applies to: 42-43
backend/app/database/images.py (1)
60-96: Improved exception handling in table creation. The refactored `db_create_images_table()` now properly wraps operations in try/except/finally blocks with explicit commit and rollback, which is good defensive practice.
However, note that this function still uses `_connect()` instead of `get_db_connection()`, so it lacks the full concurrency optimizations described in the PR objectives. See the related comment on lines 47-53.
Actionable comments posted: 0
🧹 Nitpick comments (2)
backend/app/database/images.py (2)
110-196: Consider raising exceptions for consistency with write operations. This read operation returns an empty list on error (line 196), while write operations (`db_bulk_insert_images`, `db_update_image_tagged_status`, `db_delete_images_by_ids`) raise exceptions. This inconsistency makes it difficult for callers to distinguish between "no results found" and "a database error occurred". For better error visibility and debugging, consider propagating exceptions from read operations as well. If graceful degradation is intentional for reads, document this design decision in the docstring.

♻️ Proposed refactor for consistent error handling:

```diff
 except Exception as e:
     logger.error(f"Error getting all images: {e}")
-    return []
+    raise
```
294-325: Consider raising exceptions for consistency with write operations. Similar to `db_get_all_images`, this read operation returns an empty list on error (line 325) while write operations raise exceptions. This masks database errors from callers, who cannot distinguish between "no images in these folders" and "a database error occurred". For consistency and better error visibility, consider propagating exceptions.

♻️ Proposed refactor for consistent error handling:

```diff
 except Exception as e:
     logger.error(f"Error getting images by folder IDs: {e}")
-    return []
+    raise
```
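A tiny self-contained illustration of the distinction the reviewer is drawing, using a hypothetical `get_all_images` helper rather than the project's function:

```python
import sqlite3

def get_all_images(conn):
    """Read helper that lets database errors propagate instead of masking them."""
    # No blanket except-return-[]: an empty list now always means "no rows".
    return [row[0] for row in conn.execute("SELECT id FROM images")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY)")
print(get_all_images(conn))  # [] -> genuinely no rows

conn.execute("DROP TABLE images")
try:
    get_all_images(conn)
except sqlite3.OperationalError as e:
    print(f"database error surfaced to caller: {e}")
```

With the log-and-return-`[]` pattern, both branches above would look identical to the caller; letting the exception propagate keeps them distinguishable.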
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/app/database/images.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Hemil36
Repo: AOSSIE-Org/PictoPy PR: 570
File: backend/app/database/connection.py:16-24
Timestamp: 2025-10-31T17:00:50.132Z
Learning: In PictoPy backend, the user prefers not to use database connection retry logic or extended busy timeouts in the centralized get_db_connection() context manager, even though the app has concurrent access patterns via ProcessPoolExecutor and FastAPI.
Applied to files:
backend/app/database/images.py
🧬 Code graph analysis (1)
backend/app/database/images.py (2)
backend/app/database/connection.py (1)
get_db_connection (8-43)
backend/app/utils/images.py (1)
image_util_parse_metadata (496-513)
🔇 Additional comments (8)
backend/app/database/images.py (8)
9-9: Verify that the 30-second timeout aligns with current architectural preferences. Based on learnings, a previous team member (Hemil36, PR 570) preferred not to use extended busy timeouts in the centralized connection manager. The current PR introduces a 30-second timeout to address issue #913. Please confirm this approach change has been reviewed and aligns with the team's updated concurrency strategy.
47-79: LGTM! Clean table creation with proper foreign key constraints.The function correctly uses the context manager for both table creations, with appropriate IF NOT EXISTS guards and CASCADE deletes for referential integrity. The lack of explicit error handling is acceptable here since table creation errors should propagate to the caller for proper initialization flow handling.
81-107: LGTM! Proper bulk upsert with conflict resolution. The function correctly uses the context manager and implements appropriate upsert logic. The CASE statement for `isTagged` ensures that once an image is tagged, it remains tagged even during re-insertion. Error handling (log and raise) is consistent with write operation patterns.
199-238: LGTM! Clean query with proper error propagation. The function correctly uses the context manager, filters images needing AI tagging with an appropriate JOIN, and lets exceptions propagate naturally. The metadata parsing and None-handling for `folder_id` are both correct.
241-262: LGTM! Proper update with rowcount validation. The function correctly uses the context manager and returns a boolean based on whether any rows were affected. Exception handling (log and raise) is consistent with other write operations.
Note: the function returns `False` when `image_id` doesn't exist (rowcount = 0) rather than raising. This is a valid design choice, but callers should be aware that `False` indicates "no rows affected" rather than an error.
265-291: LGTM! Efficient batch insert with proper conflict handling. The function correctly uses the context manager and `INSERT OR IGNORE` to handle duplicate pairs gracefully. The early return for empty lists and consistent error handling (log and raise) align with other write operations.
328-355: LGTM! Proper bulk delete with CASCADE awareness. The function correctly uses the context manager and documents the CASCADE behavior for `image_classes` records. The deletion count logging is valuable for audit trails, and error handling (log and raise) is consistent with other write operations.
357-376: LGTM! Proper toggle implementation with existence check.The function correctly uses the context manager, checks for image existence before attempting the toggle, and uses the CASE expression to flip the boolean value. Error handling is consistent with other write operations.
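The toggle pattern this comment describes can be sketched against a hypothetical `images(id, isFavourite)` schema (column names assumed for illustration, not taken from the project):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, isFavourite INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO images (id) VALUES (1)")

def toggle_favourite(conn, image_id):
    """Existence check first, then flip 0 <-> 1 with a single CASE expression."""
    if conn.execute("SELECT 1 FROM images WHERE id = ?", (image_id,)).fetchone() is None:
        raise ValueError(f"image {image_id} not found")
    conn.execute(
        "UPDATE images SET isFavourite = CASE isFavourite WHEN 0 THEN 1 ELSE 0 END "
        "WHERE id = ?",
        (image_id,),
    )
    return conn.execute(
        "SELECT isFavourite FROM images WHERE id = ?", (image_id,)
    ).fetchone()[0]

print(toggle_favourite(conn, 1))  # 1
print(toggle_favourite(conn, 1))  # 0
```

Doing the flip inside the UPDATE avoids a read-modify-write race between concurrent connections, which is exactly the class of problem this PR targets.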
Problem
Intermittent SQLite concurrency issues (e.g. locked database errors) could occur under concurrent access and reload scenarios due to unsafe direct connections and missing timeout configuration.
Fixes #913
Changes
Testing
Summary by CodeRabbit
Bug Fixes
Refactor