Feat/thread safe sqlite connection #948
Conversation
…SSIE-Org#946

- Add PKGBUILD for binary package installation from releases
- Add PKGBUILD-git for building from source
- Add pictopy.install for post-install hooks
- Add .SRCINFO metadata file
- Add publish-aur.yml workflow for automatic AUR publishing
- Update build-and-release.yml to generate tar.gz for AUR
- Update tauri.conf.json to build all Linux targets (deb, rpm, appimage)
- Update README.md with AUR installation instructions

This enables Arch Linux users to install PictoPy via `yay -S pictopy` or `paru -S pictopy`.

Closes AOSSIE-Org#946
- Add DatabaseConnectionManager class with per-thread connection storage
- Implement threading.local() for thread-safe connection isolation
- Add WAL mode for better concurrent read performance
- Add write serialization with global lock for safe write operations
- Provide context managers: get_db_connection, get_db_transaction, get_db_write_transaction
- Refactor all database modules to use the new connection manager:
  - images.py: thread-safe image CRUD operations
  - folders.py: thread-safe folder management
  - faces.py: thread-safe face embedding storage
  - face_clusters.py: thread-safe cluster operations
  - albums.py: thread-safe album management
  - metadata.py: thread-safe metadata storage
  - yolo_mapping.py: thread-safe YOLO class mappings
- Remove direct sqlite3.connect() calls and check_same_thread=False usage
- Add proper transaction management with automatic commit/rollback
- Configure SQLite pragmas for integrity and performance: foreign_keys, busy_timeout, journal_mode=WAL, synchronous=NORMAL

Benefits:
- Prevents race conditions under concurrent FastAPI requests
- Eliminates data corruption from shared connections across threads
- Improves stability during parallel image processing workloads
- Aligns with SQLite's recommended concurrency model

Fixes AOSSIE-Org#943
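The commit above maps to a fairly standard pattern: one SQLite connection per thread, a global lock to serialize writes, and context managers for the three access modes. A minimal sketch of that pattern, with names taken from the commit message but details (path handling, exact pragma set) illustrative rather than copied from the PR:

```python
import sqlite3
import threading
from contextlib import contextmanager

_write_lock = threading.Lock()  # serializes all writes across threads


class DatabaseConnectionManager:
    """One SQLite connection per thread, stored in threading.local()."""

    def __init__(self, database_path: str):
        self._database_path = database_path
        self._local = threading.local()

    def _get_connection(self) -> sqlite3.Connection:
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = sqlite3.connect(
                self._database_path,
                timeout=30.0,
                isolation_level=None,  # autocommit; BEGIN/COMMIT managed explicitly
            )
            conn.execute("PRAGMA journal_mode=WAL;")   # readers don't block writers
            conn.execute("PRAGMA foreign_keys=ON;")
            conn.execute("PRAGMA busy_timeout=30000;")
            conn.execute("PRAGMA synchronous=NORMAL;")
            self._local.conn = conn
        return conn

    @contextmanager
    def connection(self):
        """Read-only access; no commit on exit."""
        yield self._get_connection()

    @contextmanager
    def transaction(self):
        """Auto commit on success, rollback on exception."""
        conn = self._get_connection()
        conn.execute("BEGIN IMMEDIATE")
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise

    @contextmanager
    def write_transaction(self):
        """Writes additionally serialized behind the global lock."""
        with _write_lock:
            with self.transaction() as conn:
                yield conn
```

The global lock is coarse but matches SQLite's single-writer model, while WAL mode lets readers proceed concurrently.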
📝 Walkthrough

This pull request introduces thread-safe SQLite connection management for the Python backend via a new DatabaseConnectionManager class and transaction helpers, refactors all database modules to use context-managed transactions instead of manual connection handling, adds Arch Linux User Repository (AUR) packaging infrastructure with workflows and build scripts, updates the frontend Tauri build configuration, and documents installation instructions.
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant FastAPI as FastAPI Request<br/>(Thread N)
participant DBMgr as DatabaseConnectionManager
participant ThreadLocal as Thread-Local<br/>Storage
participant SQLite as SQLite<br/>Database
participant TxnCtx as Transaction Context<br/>Manager
FastAPI->>DBMgr: get_connection() / get_db_transaction()
DBMgr->>ThreadLocal: check thread ID for existing conn
alt Connection exists
ThreadLocal-->>DBMgr: return cached conn
else New thread
ThreadLocal->>SQLite: create new connection<br/>(WAL, FK constraints)
SQLite-->>ThreadLocal: configured conn
end
DBMgr->>TxnCtx: enter context (begin transaction)
FastAPI->>SQLite: execute query
SQLite-->>FastAPI: result
FastAPI->>TxnCtx: exit context
alt Write transaction
TxnCtx->>SQLite: commit (with write-lock)
else Read transaction
TxnCtx->>SQLite: rollback / release
end
TxnCtx-->>FastAPI: success
```

```mermaid
sequenceDiagram
participant Release as GitHub Release Event
participant CI as GitHub Actions
participant Deb as Linux .deb Artifact
participant Tar as create-aur-tarball Job
participant AUR as AUR Git Repo
participant Publish as publish-aur Workflow
Release->>CI: trigger (publish or workflow_dispatch)
CI->>Tar: run create-aur-tarball after build
Tar->>CI: extract version from tag
Tar->>Deb: download .deb from release
Tar->>Tar: extract .deb → tar.gz + sha256sum
Tar->>Release: upload tarball + checksum
Release-->>Publish: trigger publish-aur workflow
Publish->>Release: download amd64/arm64 tarballs
Publish->>Publish: compute sha256 checksums
Publish->>AUR: update PKGBUILD<br/>version, sources, sha256sums
Publish->>AUR: regenerate .SRCINFO<br/>metadata + dependencies
Publish->>AUR: commit + push via SSH
AUR-->>Publish: package published
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Rationale: The changes span heterogeneous concerns: extensive database refactoring across 8 modules introducing a new connection management pattern with transaction helpers, concurrent AUR packaging infrastructure with 5 new files and 2 modified workflows, and Tauri configuration updates. The database refactoring requires careful verification of transaction context correctness and resource cleanup across many functions, while the AUR packaging requires understanding shell scripting, GitHub Actions mechanics, and Arch packaging standards. The changes are interconnected but demand separate reasoning per domain.
Pre-merge checks: ❌ 1 warning, ✅ 4 passed.
Actionable comments posted: 10
🤖 Prompts for AI agents
In @.github/workflows/build-and-release.yml:
- Line 408: Replace the outdated action reference softprops/action-gh-release@v1
with a current compatible release (e.g., softprops/action-gh-release@v2) by
updating the uses entry for the GitHub Actions step that currently lists
softprops/action-gh-release@v1; after updating the uses value, run or dry-run
the workflow to verify runner compatibility and adjust any input names if the
newer major version changed them.
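A sketch of the bumped step; the step name and inputs are placeholders for whatever the workflow already defines, and only the `uses:` version changes:

```yaml
# Hypothetical step name/inputs; the fix is the @v1 -> @v2 bump on `uses:`
- name: Create Release
  uses: softprops/action-gh-release@v2
  with:
    files: |
      release-assets/*
```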
- Around line 379-389: The workflow step "Download Linux deb from release" uses
a fixed sleep 30 which is flaky; replace it with a retry loop that polls the
asset URL built from TAG and VERSION (the same URLs used in the curl commands)
using a lightweight HEAD/GET check (e.g., curl -sfI) with exponential backoff
and a max retry count, exiting with error if the asset never appears; after a
successful availability check proceed to download the deb with the existing curl
fallback logic for both "picto-py_${VERSION}_amd64.deb" and
"PictoPy_${VERSION}_amd64.deb".
In @aur/.SRCINFO:
- Around line 40-42: Update the .SRCINFO to point to an existing release: change
the package version from 1.1.0 to 1.0.0 and update the source entries (e.g., the
fields source_aarch64, source_x86_64 or their equivalents) to the correct URLs
for v1.0.0 artifacts that actually exist on GitHub (or remove platform tarball
entries if only .deb/.exe/.app are available), and replace the placeholder
sha256sums_x86_64 and sha256sums_aarch64 values (currently SKIP) with the real
SHA256 checksums for the actual artifacts once you have valid downloads;
alternatively, if you intend to target v1.1.0, revert these changes and wait
until v1.1.0 publishes Linux tar.gz assets before committing, ensuring the
source_* and sha256sums_* symbols match real release files.
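For illustration, the relevant .SRCINFO fields would end up looking like this; the tarball filename is an assumption following the naming pattern discussed in this PR, and the digest placeholder must be replaced with real sha256sum output:

```
pkgver = 1.0.0
source_x86_64 = https://github.com/AOSSIE-Org/PictoPy/releases/download/v1.0.0/pictopy-1.0.0-x86_64.tar.gz
sha256sums_x86_64 = <sha256 of the x86_64 tarball, not SKIP>
```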
In @aur/PKGBUILD:
- Around line 48-49: Replace the placeholder SKIP values in the PKGBUILD by
computing and inserting the actual SHA256 checksums for the release artifacts
referenced in this PKGBUILD: update the arrays sha256sums_x86_64 and
sha256sums_aarch64 with the real SHA256 hex strings for each source entry;
generate the checksums using the same script/process referenced in the previous
comment on aur/.SRCINFO (or run sha256sum on each downloaded source artifact)
and then paste the resulting checksum values into the corresponding array
elements, ensuring ordering matches the sources array.
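One way to generate the digests, assuming the PKGBUILD's source URLs resolve to real artifacts (updpkgsums is provided by pacman-contrib):

```bash
# Manually: download each artifact listed in the sources array, then
sha256sum pictopy-*.tar.gz   # paste each digest into the matching array slot

# Or, on Arch with pacman-contrib installed, from the aur/ directory:
updpkgsums   # downloads sources and rewrites the sha256sums_* arrays in place
```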
- Line 84: The PKGBUILD installs a non-existent LICENSE via the line containing
install -Dm644 "${srcdir}/../LICENSE" ... || true which masks failures; either
add a LICENSE file to the repository and include it in the release tarball
(update sources so the file is available in ${srcdir} and change the install
path to "${srcdir}/LICENSE" and remove the "|| true" so failures surface), or
remove that install line entirely until a LICENSE is provided; ensure the
declared license=('MIT') matches the presence of the LICENSE file in the
distribution.
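A sketch of the first option, assuming LICENSE ends up at the tarball root so makepkg extracts it into ${srcdir}:

```bash
package() {
  # ...existing packaging steps...

  # No "|| true": a missing LICENSE should fail the build, not be masked
  install -Dm644 "${srcdir}/LICENSE" \
    "${pkgdir}/usr/share/licenses/${pkgname}/LICENSE"
}
```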
In @aur/PKGBUILD-git:
- Around line 112-113: The current recursive chmod call (chmod -R 755
"${pkgdir}/usr/lib/pictopy/resources") makes all files executable; replace it by
setting directory perms to 755 and regular file perms to 644, then explicitly
mark only actual executables/scripts as 755. Concretely, use find on
"${pkgdir}/usr/lib/pictopy/resources" to: 1) chmod directories (-type d -exec
chmod 755 {} +), 2) chmod regular files (-type f -exec chmod 644 {} +), and 3)
chmod script files with a shebang or known executable extensions (e.g., find ...
-type f -exec grep -q '^#!' {} \; -exec chmod 755 {} + or target specific script
locations) so only true executables become executable.
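A sketch of that find-based approach; note the shebang test only catches scripts, so any compiled binaries under resources would still need to be marked 755 individually:

```bash
res="${pkgdir}/usr/lib/pictopy/resources"

find "$res" -type d -exec chmod 755 {} +   # directories traversable
find "$res" -type f -exec chmod 644 {} +   # regular files non-executable by default

# Re-mark only files that start with a shebang as executable
find "$res" -type f \
  -exec sh -c 'head -c 2 "$1" | grep -q "^#!"' _ {} \; \
  -exec chmod 755 {} +
```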
In @backend/app/database/connection.py:
- Around line 112-134: The transaction() context manager currently always issues
"BEGIN IMMEDIATE" which fails on nested use; change it to detect nesting (use
sqlite3.Connection.in_transaction or a thread-local nesting counter on the same
connection) and when already in a transaction use SAVEPOINTs instead of
BEGIN/COMMIT: create a unique SAVEPOINT name on entry, RELEASE the savepoint on
success, and ROLLBACK TO the savepoint on exception (and then RELEASE or let it
be cleaned up), while keeping the original BEGIN IMMEDIATE / COMMIT / ROLLBACK
behavior for top-level transactions; update the error logging (logger.error in
the except block) to indicate whether a top-level rollback or savepoint rollback
occurred and ensure _get_connection() and transaction() handle the nesting state
consistently.
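A sketch of a savepoint-aware transaction(), assuming the manager's existing _get_connection() method and module-level logger; everything beyond the names in the prompt is illustrative:

```python
import uuid
from contextlib import contextmanager

# Method body intended to slot into DatabaseConnectionManager.
@contextmanager
def transaction(self):
    conn = self._get_connection()
    if conn.in_transaction:
        # Nested use: scope this level to a SAVEPOINT instead of BEGIN/COMMIT
        sp = f"sp_{uuid.uuid4().hex}"
        conn.execute(f"SAVEPOINT {sp}")
        try:
            yield conn
            conn.execute(f"RELEASE SAVEPOINT {sp}")
        except Exception:
            conn.execute(f"ROLLBACK TO SAVEPOINT {sp}")
            conn.execute(f"RELEASE SAVEPOINT {sp}")
            logger.error("Nested transaction rolled back to savepoint")
            raise
    else:
        conn.execute("BEGIN IMMEDIATE")
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            logger.error("Top-level transaction rolled back")
            raise
```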
In @backend/app/utils/face_clusters.py:
- Around line 610-635: The error handling is inconsistent: when a cursor is
passed the code logs and re-raises the exception, but in the no-cursor branch it
logs and returns False; standardize by re-raising the exception in the no-cursor
path so callers can handle errors uniformly. In the else branch that uses
get_db_write_transaction(), after logger.error(...) re-raise the caught
exception (or simply remove the try/except to let it bubble) instead of
returning False; apply this change around the block that obtains conn, cur, and
executes the UPDATE to mirror the cursor != None behavior.
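A sketch of the unified behavior; the function signature and SQL here are placeholders for the actual ones in face_clusters.py:

```python
def _update_cluster_face_image(cluster_id, image_id, cursor=None):  # hypothetical signature
    sql = "UPDATE face_clusters SET face_image_id = ? WHERE id = ?"  # placeholder SQL
    if cursor is not None:
        # Caller manages the transaction; errors already propagate from here
        cursor.execute(sql, (image_id, cluster_id))
        return True
    try:
        with get_db_write_transaction() as conn:
            conn.cursor().execute(sql, (image_id, cluster_id))
            return True
    except Exception as e:
        logger.error(f"Failed to update cluster face image: {e}")
        raise  # re-raise instead of returning False, mirroring the cursor path
```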
In @frontend/src-tauri/tauri.conf.json:
- Line 10: You changed the Tauri config key "targets" in tauri.conf.json from
the explicit array ["nsis","deb","app"] to the string "all", which expands
packaging to every supported bundle type and will add platform-specific build
requirements (WiX, macOS toolchain), CI complexity, and more artifacts; confirm
this is intentional, and if not, revert "targets" to the previous explicit array
or explicitly set the minimal target list you want, otherwise update CI/build
docs and ensure required platform toolchains (e.g., WiX for MSI, macOS for DMG)
are installed and storage/artifact retention policies accommodate the extra
bundles.
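If the expansion is intentional, pinning an explicit list keeps the build scope visible in review. A sketch, with nesting per the project's tauri.conf.json bundle section:

```json
{
  "bundle": {
    "targets": ["nsis", "deb", "rpm", "appimage", "app"]
  }
}
```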
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/app/utils/face_clusters.py (2)
128-156: Critical: Using read-only connection for write operations. Line 128 uses get_db_connection() but performs multiple write operations:
- db_delete_all_clusters(cursor)
- db_insert_clusters_batch(cluster_list, cursor)
- db_update_face_cluster_ids_batch(results, cursor)
- _update_cluster_face_image(...)
- db_update_metadata(current_metadata, cursor)

According to the connection module documentation, get_db_connection() is for read-only operations. Write operations require get_db_transaction() (with automatic commit/rollback) or get_db_write_transaction() (serialized writes). Using the wrong connection type means these writes won't be automatically committed, breaking the transaction semantics and potentially causing data loss.

🔎 Proposed fix

```diff
-    with get_db_connection() as conn:
+    with get_db_transaction() as conn:
         cursor = conn.cursor()

         # Clear old clusters first
         db_delete_all_clusters(cursor)
```
159-162: Critical: Using read-only connection for write operations. Line 159 uses get_db_connection() for write operations (db_update_face_cluster_ids_batch). This should use get_db_transaction() or get_db_write_transaction() to ensure the changes are committed.

🔎 Proposed fix

```diff
-    with get_db_connection() as conn:
+    with get_db_transaction() as conn:
         cursor = conn.cursor()
         db_update_face_cluster_ids_batch(face_cluster_mappings, cursor)
```

backend/app/database/faces.py (1)
117-143: Batch inserts lack atomicity due to per-item transactions. When inserting multiple embeddings, each call to db_insert_face_embeddings opens its own write transaction. This means:
- N separate lock acquisitions instead of 1
- If the 3rd insert fails, the first 2 are already committed
- Performance overhead from repeated transaction setup/teardown

Consider wrapping the batch in a single transaction for atomicity and performance.

🔎 Proposed fix for atomic batch inserts

```diff
     # Handle multiple faces in one image
     if (
         isinstance(embeddings, list)
         and len(embeddings) > 0
         and isinstance(embeddings[0], np.ndarray)
     ):
+        with get_db_write_transaction() as conn:
+            cursor = conn.cursor()
+            face_ids = []
+            for i, emb in enumerate(embeddings):
+                conf = (
+                    confidence[i]
+                    if isinstance(confidence, list) and i < len(confidence)
+                    else confidence
+                )
+                bb = bbox[i] if isinstance(bbox, list) and i < len(bbox) else bbox
+                cid = (
+                    cluster_id[i]
+                    if isinstance(cluster_id, list) and i < len(cluster_id)
+                    else cluster_id
+                )
+                embeddings_json = json.dumps([e.tolist() for e in emb])
+                bbox_json = json.dumps(bb) if bb is not None else None
+                cursor.execute(
+                    """
+                    INSERT INTO faces (image_id, cluster_id, embeddings, confidence, bbox)
+                    VALUES (?, ?, ?, ?, ?)
+                    """,
+                    (image_id, cid, embeddings_json, conf, bbox_json),
+                )
+                face_ids.append(cursor.lastrowid)
+            return face_ids
-        face_ids = []
-        for i, emb in enumerate(embeddings):
-            conf = (
-                confidence[i]
-                if isinstance(confidence, list) and i < len(confidence)
-                else confidence
-            )
-            bb = bbox[i] if isinstance(bbox, list) and i < len(bbox) else bbox
-            cid = (
-                cluster_id[i]
-                if isinstance(cluster_id, list) and i < len(cluster_id)
-                else cluster_id
-            )
-            face_id = db_insert_face_embeddings(image_id, emb, conf, bb, cid)
-            face_ids.append(face_id)
-        return face_ids
```
🧹 Nitpick comments (9)
backend/app/utils/face_clusters.py (1)
623-623: Local import inside function may indicate circular dependency. The import of get_db_write_transaction is placed inside the function rather than at module level. While this works, it typically indicates either:
- A circular import that should be resolved by restructuring
- An unnecessary local import that could be moved to the top
If there's no circular dependency, prefer moving this to the module-level imports (line 16) for clarity and consistency.
backend/app/database/metadata.py (1)
72-83: Rowcount check after DELETE + INSERT may be unclear. Lines 76 and 83 return cursor.rowcount > 0 after executing both DELETE and INSERT. The rowcount attribute reflects the last operation (INSERT), which should always insert exactly 1 row. While this works, the logic is indirect. Consider either:
- Returning True unconditionally after successful execution (since INSERT should always succeed)
- Adding a comment clarifying that rowcount reflects the INSERT operation

Alternative implementation

```diff
     if cursor is not None:
         # Use provided cursor (external transaction management)
         cursor.execute("DELETE FROM metadata")
         cursor.execute("INSERT INTO metadata (metadata) VALUES (?)", (metadata_json,))
-        return cursor.rowcount > 0
+        return True  # INSERT always succeeds or raises
     else:
         # Use our own transaction
         with get_db_write_transaction() as conn:
             cur = conn.cursor()
             cur.execute("DELETE FROM metadata")
             cur.execute("INSERT INTO metadata (metadata) VALUES (?)", (metadata_json,))
-            return cur.rowcount > 0
+            return True  # INSERT always succeeds or raises
```

backend/app/database/images.py (1)
201-241: Inconsistent error handling pattern. Unlike db_get_all_images, which wraps the operation in try/except and returns an empty list on error, this function lets exceptions propagate. Consider adding consistent error handling.

🔎 Proposed fix for consistent error handling

```diff
 def db_get_untagged_images() -> List[UntaggedImageRecord]:
     """
     Find all images that need AI tagging.
     ...
     """
+    try:
         with get_db_connection() as conn:
             cursor = conn.cursor()
             # ... existing code ...
             return untagged_images
+    except Exception as e:
+        logger.error(f"Error getting untagged images: {e}")
+        return []
```

backend/app/database/connection.py (2)
23-24: Unused module-level thread-local storage. _thread_local at line 24 is defined but never used. The DatabaseConnectionManager class uses its own self._local instead. Consider removing this dead code.

🔎 Proposed fix

```diff
-# Thread-local storage for database connections
-_thread_local = threading.local()
-
 # Lock for write operations to serialize database writes
 _write_lock = threading.Lock()
```

228-247: Inconsistent PRAGMA configuration between connections. get_new_connection() configures only 3 PRAGMAs while _create_connection() configures 8. This could lead to inconsistent behavior (e.g., ignore_check_constraints, recursive_triggers, synchronous) between thread-local and standalone connections.

🔎 Proposed fix for consistent PRAGMA configuration

```diff
 def get_new_connection() -> sqlite3.Connection:
     """
     Create a new standalone database connection.
     ...
     """
     conn = sqlite3.connect(
         DATABASE_PATH,
         timeout=30.0,
     )
+    # Enable WAL mode for better concurrent read performance
+    conn.execute("PRAGMA journal_mode=WAL;")
+
+    # Enforce all integrity constraints
     conn.execute("PRAGMA foreign_keys = ON;")
-    conn.execute("PRAGMA journal_mode=WAL;")
+    conn.execute("PRAGMA ignore_check_constraints = OFF;")
+    conn.execute("PRAGMA recursive_triggers = ON;")
+    conn.execute("PRAGMA defer_foreign_keys = OFF;")
+    conn.execute("PRAGMA case_sensitive_like = ON;")
+
+    # Optimize for concurrent access
     conn.execute("PRAGMA busy_timeout = 30000;")
+    conn.execute("PRAGMA synchronous = NORMAL;")
+
     return conn
```

.github/workflows/publish-aur.yml (2)
33-36: Hardcoded sleep is fragile for asset availability. A fixed 60-second wait may be insufficient if release asset uploads are delayed, causing checksum downloads to fail and result in SKIP values. Consider polling the GitHub API for asset availability instead.

🔎 Proposed polling approach

```diff
 - name: Wait for release assets
   run: |
     echo "Waiting for release assets to be available..."
-    sleep 60
+    VERSION="${{ steps.get_version.outputs.version }}"
+    MAX_ATTEMPTS=30
+    ATTEMPT=0
+    while [ $ATTEMPT -lt $MAX_ATTEMPTS ]; do
+      ASSETS=$(gh api repos/${{ github.repository }}/releases/tags/v${VERSION} --jq '.assets | length' 2>/dev/null || echo "0")
+      if [ "$ASSETS" -ge 2 ]; then
+        echo "Release assets are available ($ASSETS assets found)"
+        break
+      fi
+      echo "Waiting for assets... attempt $((ATTEMPT+1))/$MAX_ATTEMPTS"
+      sleep 30
+      ATTEMPT=$((ATTEMPT+1))
+    done
+  env:
+    GH_TOKEN: ${{ github.token }}
```

89-147: Consider generating .SRCINFO from PKGBUILD instead of duplicating. The .SRCINFO is generated inline with hardcoded values that duplicate the PKGBUILD content. This creates a maintenance burden, as changes to PKGBUILD must be manually synced. The standard approach is to use makepkg --printsrcinfo within an Arch container.

🔎 Alternative approach using makepkg

```yaml
- name: Generate .SRCINFO
  uses: docker://archlinux:base
  with:
    args: bash -c "cd aur && makepkg --printsrcinfo > .SRCINFO"
```

If this isn't feasible due to build dependencies, ensure the hardcoded values stay synchronized with PKGBUILD.
aur/PKGBUILD-git (2)
57-95: Build steps lack proper error handling and directory validation. The build function has several reliability concerns:
- cd commands don't verify success before proceeding
- source venv/bin/activate may not persist across commands in some shell contexts
- No explicit error checking after critical commands like pip install or cargo build

🔎 Proposed improvements

```diff
 build() {
     cd "${srcdir}/${pkgname}"
+
+    set -e  # Exit on first error

     # Build backend server
     echo "Building backend server..."
-    cd backend
+    cd backend || exit 1
     python -m venv venv
     source venv/bin/activate
     pip install --upgrade pip
     pip install -r requirements.txt
     pip install pyinstaller
     pyinstaller main.py --name PictoPy_Server --onedir --distpath dist
     mkdir -p dist/PictoPy_Server/images
     cp -r app dist/PictoPy_Server/
     deactivate
-    cd ..
+    cd "${srcdir}/${pkgname}"

     # Build sync microservice
     echo "Building sync microservice..."
-    cd sync-microservice
+    cd sync-microservice || exit 1
     # ... similar pattern
```
139-163: Consider adding a LICENSE file to the repository root. The LICENSE file is missing from the repository, making the embedded fallback license necessary. However, the conventional approach is to include a LICENSE file in the repository root so that tools can discover it automatically. Adding the file to the repository eliminates the need for this embedding workaround and follows standard open-source practices.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- .github/workflows/build-and-release.yml
- .github/workflows/publish-aur.yml
- README.md
- aur/.SRCINFO
- aur/PKGBUILD
- aur/PKGBUILD-git
- aur/README.md
- aur/pictopy.install
- backend/app/database/albums.py
- backend/app/database/connection.py
- backend/app/database/face_clusters.py
- backend/app/database/faces.py
- backend/app/database/folders.py
- backend/app/database/images.py
- backend/app/database/metadata.py
- backend/app/database/yolo_mapping.py
- backend/app/utils/face_clusters.py
- backend/test.py
- frontend/src-tauri/tauri.conf.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-31T17:00:50.132Z
Learnt from: Hemil36
Repo: AOSSIE-Org/PictoPy PR: 570
File: backend/app/database/connection.py:16-24
Timestamp: 2025-10-31T17:00:50.132Z
Learning: In PictoPy backend, the user prefers not to use database connection retry logic or extended busy timeouts in the centralized get_db_connection() context manager, even though the app has concurrent access patterns via ProcessPoolExecutor and FastAPI.
Applied to files:
- backend/app/database/connection.py
- backend/app/database/images.py
🧬 Code graph analysis (8)
backend/app/utils/face_clusters.py (1)
- backend/app/database/connection.py (1): get_db_write_transaction (199-216)
backend/app/database/yolo_mapping.py (1)
- backend/app/database/connection.py (1): get_db_transaction (178-195)
backend/app/database/metadata.py (1)
- backend/app/database/connection.py (3): get_db_connection (157-174), get_db_transaction (178-195), get_db_write_transaction (199-216)
backend/app/database/folders.py (2)
- backend/app/database/connection.py (3): get_db_connection (157-174), get_db_transaction (178-195), get_db_write_transaction (199-216)
- backend/app/logging/setup_logging.py (2): setup_logging (134-196), get_logger (199-209)
backend/app/database/faces.py (2)
- backend/app/database/connection.py (3): get_db_connection (157-174), get_db_transaction (178-195), get_db_write_transaction (199-216)
- backend/app/logging/setup_logging.py (2): setup_logging (134-196), get_logger (199-209)
backend/app/database/images.py (2)
- backend/app/database/connection.py (3): get_db_connection (157-174), get_db_transaction (178-195), get_db_write_transaction (199-216)
- backend/app/utils/images.py (1): image_util_parse_metadata (496-513)
backend/test.py (2)
- backend/app/database/connection.py (1): get_db_connection (157-174)
- backend/app/database/faces.py (1): get_all_face_embeddings (146-216)
backend/app/database/albums.py (1)
- backend/app/database/connection.py (3): get_db_connection (157-174), get_db_transaction (178-195), get_db_write_transaction (199-216)
🪛 actionlint (1.7.9)
.github/workflows/build-and-release.yml
408-408: the runner of "softprops/action-gh-release@v1" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🔇 Additional comments (56)
frontend/src-tauri/tauri.conf.json (1)
16-21: The appimage.bundleMediaFramework: true and rpm.release: "1" configurations are correctly implemented per Tauri's official documentation. The appimage setting includes GStreamer for audio/video support, and the rpm release value uses the standard default. These configurations are appropriate and properly formatted; no changes needed.

backend/app/database/face_clusters.py (1)
1-319: LGTM - Clean refactoring to thread-safe connection management. The file correctly uses the new connection management pattern:
- Read operations (db_get_*) use get_db_connection()
- Write operations use get_db_write_transaction()
- DDL operations use get_db_transaction()
- Optional cursor parameters enable transaction composition while providing sensible defaults
The implementation aligns with SQLite's concurrency best practices and the PR's thread-safety objectives.
backend/test.py (1)
4-30: LGTM - Correct usage of connection context manager. The refactoring properly wraps the database query in the get_db_connection() context manager. The connection will be correctly closed after the return statement at line 30, as the context manager handles cleanup after the block exits.

backend/app/database/yolo_mapping.py (1)
1-28: LGTM - Clean transaction-based refactoring. The function correctly uses get_db_transaction() to wrap both table creation and data insertion in a single transaction with automatic commit/rollback. This ensures atomicity and eliminates manual connection management.

backend/app/database/metadata.py (1)
17-54: LGTM - Correct connection usage for DDL and reads.
- db_create_metadata_table correctly uses get_db_transaction() for DDL with automatic commit
- db_get_metadata correctly uses get_db_connection() for read-only operations

backend/app/database/faces.py (7)
1-14: LGTM! The imports are correctly set up with the new centralized connection helpers and logger initialization.
38-55: LGTM! Table creation correctly uses get_db_transaction for the DDL operation with proper schema definition.
58-94: LGTM! Correct use of get_db_write_transaction for the insert operation with proper return of the new face_id.
146-216: LGTM! Read operation correctly uses get_db_connection. The in-function import at line 171 is acceptable to avoid circular imports.
219-280: LGTM! Read functions correctly use get_db_connection with proper JSON parsing of embeddings.
282-332: LGTM! The dual-path design supporting both external cursor and internal transaction is well-implemented. This allows callers to include this operation in larger transactions when needed.
335-382: LGTM! db_get_cluster_mean_embeddings correctly uses read connection and properly calculates mean embeddings per cluster.

backend/app/database/images.py (9)
1-13: LGTM! Imports and logger initialization are correctly set up.
46-78: LGTM! Table creation correctly uses get_db_transaction to create both tables atomically.
81-107: LGTM! Bulk insert correctly uses get_db_write_transaction with proper error handling and upsert logic.
110-198: LGTM! Read operation correctly uses get_db_connection with proper result aggregation and error handling.
244-265: LGTM! Write operation correctly uses get_db_write_transaction with proper error handling.
268-294: LGTM! Batch insert with INSERT OR IGNORE correctly uses write transaction.
297-329: LGTM! Read operation correctly uses get_db_connection.
332-360: LGTM! Delete operation correctly uses get_db_write_transaction with proper logging.
363-392: LGTM! Toggle operation correctly uses get_db_write_transaction with existence check before update.

backend/app/database/connection.py (5)
1-28: LGTM! Clear module documentation, proper imports, and thread-local storage setup. The _write_lock for serializing writes is a good approach for SQLite.
136-149: LGTM! The write_transaction() correctly serializes writes using the global lock before delegating to transaction().
156-175: LGTM! get_db_connection correctly delegates to the connection manager with proper documentation.
177-216: LGTM! get_db_transaction and get_db_write_transaction properly delegate to the manager methods.
219-226: LGTM! close_thread_connection provides a clean way to release thread-local resources.

backend/app/database/folders.py (12)
1-13: LGTM! Imports and logger initialization are correctly set up.
23-39: LGTM! Table creation correctly uses get_db_transaction.
42-53: LGTM! Batch insert correctly uses get_db_write_transaction.
56-110: LGTM! db_insert_folder correctly validates the path before the transaction and handles both insert and lookup of existing folders within a single transaction.
113-144: LGTM! Read functions correctly use get_db_connection.
147-153: LGTM! db_get_all_folder_ids correctly uses read connection.
156-176: LGTM! Batch delete correctly uses get_db_write_transaction.
179-200: LGTM! db_delete_folder correctly uses write transaction with existence check.
203-222: LGTM! db_update_parent_ids_for_subtree correctly uses write transaction.
225-255: LGTM! db_folder_exists and db_find_parent_folder_id correctly use read connections.
258-299: LGTM! AI tagging batch update functions correctly use write transactions.
302-393: LGTM! Remaining read functions correctly use get_db_connection with proper query patterns.

backend/app/database/albums.py (11)
1-12: LGTM! Imports and logger initialization are correctly set up.
15-46: LGTM! Table creation functions correctly use get_db_transaction.
49-99: LGTM! Read functions correctly use get_db_connection.
102-132: LGTM! Album insert correctly uses get_db_write_transaction with proper bcrypt password hashing.
135-176: LGTM! Album update correctly handles both password change and no-password-change scenarios.
179-188: LGTM! Album delete correctly uses get_db_write_transaction.
191-207: LGTM! db_get_album_images correctly uses read connection.
210-236: LGTM! db_add_images_to_album correctly validates image IDs before insertion with proper error handling.
239-265: LGTM! db_remove_image_from_album correctly validates existence before deletion.
268-281: LGTM! db_remove_images_from_album correctly uses write transaction for batch deletion.
284-303: LGTM! Password verification correctly uses read connection with proper bcrypt password checking.
README.md (1)
5-34: LGTM! Clear installation documentation. The installation section is well-organized and covers multiple platforms appropriately. The AUR installation instructions align with the new packaging infrastructure introduced in this PR.
aur/pictopy.install (4)
1-22: LGTM! The post_install function follows Arch Linux packaging conventions correctly. The conditional checks for gtk-update-icon-cache and update-desktop-database with -x before execution are appropriate, and the user messaging is helpful.
24-27: LGTM! Correctly reuses post_install and accesses the new version via $1 per Arch Linux install script conventions.
29-31: LGTM! Simple informational message before removal is appropriate.
33-47: Verify the user data directory path before approving. The review assumes ~/.local/share/pictopy is correct, but this path is not explicitly referenced in the visible codebase. The database path is defined as a relative "app/database/PictoPy.db" in settings. Without examining the PyInstaller configuration or the compiled desktop application's code, the actual user data storage location cannot be definitively confirmed. Verify whether the application or documentation specifies where user data is actually stored.

aur/PKGBUILD-git (1)
52-55: LGTM! The pkgver() function correctly formats git describe output for Arch versioning, with a sensible fallback to 1.1.0 if tags are unavailable.

.github/workflows/publish-aur.yml (1)
149-162: Version pinning is good practice, but the action's public release history cannot be verified. The action KSXGitHub/[email protected] is not found in public GitHub repositories or release registries, so the currency of v3.0.1 cannot be confirmed through public sources. However, using a pinned version is the correct approach for reproducibility and stability; verify that this specific version works as intended for your deployment needs.
Description
This PR implements thread-safe SQLite connection management for the Python backend (FastAPI), addressing race conditions and data corruption issues that can occur during concurrent image processing workloads.
Changes Made
- Add DatabaseConnectionManager class with per-thread connection storage using threading.local()
- get_db_connection() - basic thread-safe connection
- get_db_transaction() - auto commit/rollback transaction
- get_db_write_transaction() - serialized write operations with global lock
- Configure SQLite pragmas (foreign_keys, busy_timeout, synchronous)
- Refactor all database modules to use the new connection manager:
  - images.py - thread-safe image CRUD operations
  - folders.py - thread-safe folder management
  - faces.py - thread-safe face embedding storage
  - face_clusters.py - thread-safe cluster operations
  - albums.py - thread-safe album management
  - metadata.py - thread-safe metadata storage
  - yolo_mapping.py - thread-safe YOLO class mappings
- Remove direct sqlite3.connect() calls and check_same_thread=False usage

Benefits

- Prevents race conditions under concurrent FastAPI requests
- Eliminates data corruption from shared connections across threads
- Improves stability during parallel image processing workloads
- Aligns with SQLite's recommended concurrency model
Technical Details
The implementation uses:
- threading.local() for per-thread connection isolation
- WAL journal mode plus a global lock that serializes writes (see Changes Made above)
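For reference, a usage sketch of the three helpers described above; the import path follows the module location shown in the review, and the table/column names are illustrative, not taken from the actual schema:

```python
from app.database.connection import (
    get_db_connection,
    get_db_transaction,
    get_db_write_transaction,
)

# Read-only query: thread-local connection, no commit on exit
with get_db_connection() as conn:
    rows = conn.execute("SELECT id, path FROM images").fetchall()

# General transaction: auto commit on success, rollback on exception
with get_db_transaction() as conn:
    conn.execute("CREATE TABLE IF NOT EXISTS tags (id INTEGER PRIMARY KEY)")

# Write transaction: additionally serialized behind the global write lock
with get_db_write_transaction() as conn:
    conn.execute("UPDATE images SET isTagged = 1 WHERE id = ?", (42,))
```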
Related Issue

Fixes #943