
hharshhsaini commented Jan 5, 2026

Description

This PR implements thread-safe SQLite connection management for the Python backend (FastAPI), addressing race conditions and data corruption issues that can occur during concurrent image processing workloads.

Changes Made

  • Add DatabaseConnectionManager class with per-thread connection storage using threading.local()
  • Implement three context managers for different use cases (see the usage sketch after this list):
    • get_db_connection() - Basic thread-safe connection
    • get_db_transaction() - Auto commit/rollback transaction
    • get_db_write_transaction() - Serialized write operations with global lock
  • Enable WAL mode for better concurrent read performance
  • Configure SQLite pragmas for integrity and performance (foreign_keys, busy_timeout, synchronous)
  • Refactor all database modules to use the new connection manager:
    • images.py - Thread-safe image CRUD operations
    • folders.py - Thread-safe folder management
    • faces.py - Thread-safe face embedding storage
    • face_clusters.py - Thread-safe cluster operations
    • albums.py - Thread-safe album management
    • metadata.py - Thread-safe metadata storage
    • yolo_mapping.py - Thread-safe YOLO class mappings
  • Remove all direct sqlite3.connect() calls and check_same_thread=False usage
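
A minimal usage sketch of the three context managers, assuming the import path and call signatures described above (table and column names are illustrative):

from app.database.connection import (
    get_db_connection,
    get_db_transaction,
    get_db_write_transaction,
)

def count_images() -> int:
    # Read-only query: plain thread-safe connection, no commit needed.
    with get_db_connection() as conn:
        return conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]

def create_tables() -> None:
    # DDL and general writes: auto commit on success, rollback on error.
    with get_db_transaction() as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS images (id INTEGER PRIMARY KEY, path TEXT)"
        )

def delete_image(image_id: int) -> None:
    # Serialized write: a global lock admits one writer at a time.
    with get_db_write_transaction() as conn:
        conn.execute("DELETE FROM images WHERE id = ?", (image_id,))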

Benefits

  • Prevents race conditions under concurrent FastAPI requests
  • Eliminates data corruption from shared connections across threads
  • Improves stability during parallel image processing workloads
  • Aligns with SQLite's recommended concurrency model
  • Prepares backend for future scalability

Technical Details

The implementation uses the following (sketched in code after this list):

  • threading.local() for per-thread connection isolation
  • Global write lock for serializing write operations
  • WAL journal mode for concurrent reads
  • 30-second busy timeout for lock contention handling
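
A condensed sketch of how these pieces can fit together. This is not the PR's exact implementation; DATABASE_PATH and the precise pragma set are assumptions based on the details above:

import sqlite3
import threading
from contextlib import contextmanager

DATABASE_PATH = "app/database/PictoPy.db"  # assumed path
_write_lock = threading.Lock()             # serializes all writes

class DatabaseConnectionManager:
    def __init__(self) -> None:
        self._local = threading.local()    # one connection per thread

    def _get_connection(self) -> sqlite3.Connection:
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = sqlite3.connect(DATABASE_PATH, timeout=30.0)
            conn.execute("PRAGMA journal_mode=WAL;")      # concurrent reads
            conn.execute("PRAGMA foreign_keys = ON;")     # integrity
            conn.execute("PRAGMA busy_timeout = 30000;")  # 30 s lock wait
            conn.execute("PRAGMA synchronous = NORMAL;")  # WAL-safe speed
            self._local.conn = conn
        return conn

    @contextmanager
    def transaction(self):
        conn = self._get_connection()
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise

    @contextmanager
    def write_transaction(self):
        with _write_lock:  # one writer at a time
            with self.transaction() as conn:
                yield conn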

Related Issue

Fixes #943

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Arch Linux AUR package support with automatic releases to the AUR repository.
    • Expanded Linux packaging to include AppImage and RPM formats alongside existing DEB packages.
  • Documentation

    • Added comprehensive installation instructions for Arch Linux (AUR), other Linux distributions, Windows, and macOS.
    • Included AUR package documentation and troubleshooting guides.


…SSIE-Org#946

- Add PKGBUILD for binary package installation from releases
- Add PKGBUILD-git for building from source
- Add pictopy.install for post-install hooks
- Add .SRCINFO metadata file
- Add publish-aur.yml workflow for automatic AUR publishing
- Update build-and-release.yml to generate tar.gz for AUR
- Update tauri.conf.json to build all Linux targets (deb, rpm, appimage)
- Update README.md with AUR installation instructions

This enables Arch Linux users to install PictoPy via:
  yay -S pictopy
  paru -S pictopy

Closes AOSSIE-Org#946

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

This pull request introduces thread-safe SQLite connection management for the Python backend via a new DatabaseConnectionManager class and transaction helpers, and refactors all database modules to use context-managed transactions instead of manual connection handling. It also adds Arch Linux User Repository (AUR) packaging infrastructure with workflows and build scripts, updates the frontend Tauri build configuration, and documents installation instructions.
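
Illustrative before/after of the refactoring pattern applied across the database modules (function and schema names here are hypothetical):

# Before: each function managed its own connection, often with
# check_same_thread=False shared across threads.
import sqlite3

def db_delete_image_old(image_id: int) -> None:
    conn = sqlite3.connect("PictoPy.db", check_same_thread=False)
    try:
        conn.execute("DELETE FROM images WHERE id = ?", (image_id,))
        conn.commit()
    finally:
        conn.close()

# After: the shared manager owns the per-thread connection and the
# context manager handles commit/rollback.
from app.database.connection import get_db_write_transaction

def db_delete_image(image_id: int) -> None:
    with get_db_write_transaction() as conn:
        conn.execute("DELETE FROM images WHERE id = ?", (image_id,))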

Changes

  • Thread-safe database connection layer (backend/app/database/connection.py): Introduces the DatabaseConnectionManager class for per-thread connection management; adds get_db_transaction() and get_db_write_transaction() context managers with serialized write-lock support, a new get_new_connection() utility, and a close_thread_connection() lifecycle helper.
  • Database module refactoring, read/write operations (backend/app/database/albums.py, backend/app/database/face_clusters.py, backend/app/database/faces.py): Replaced the manual sqlite3 connection lifecycle with context-managed transactions via get_db_connection/get_db_transaction/get_db_write_transaction; added type hints and docstrings; improved error handling; albums.py adds a new db_get_album_by_name() function.
  • Database module refactoring, persistence and metadata (backend/app/database/folders.py, backend/app/database/images.py, backend/app/database/metadata.py, backend/app/database/yolo_mapping.py): Eliminated manual connection management (connect/commit/close patterns); wrapped all operations in transaction contexts; preserved existing API signatures while improving transaction safety and resource cleanup.
  • Database utilities and tests (backend/app/utils/face_clusters.py, backend/test.py): Updated to use context-managed connections; support external cursor parameters for nested transaction scenarios; removed direct sqlite3 imports in favor of the new connection helpers.
  • AUR packaging infrastructure (.github/workflows/build-and-release.yml, .github/workflows/publish-aur.yml): New and extended GitHub Actions workflows: the create-aur-tarball job extracts the .deb and creates sha256-checksummed tarballs; publish-aur automates AUR package updates with version derivation, asset staging, PKGBUILD/.SRCINFO generation, and deployment.
  • AUR package definitions (aur/PKGBUILD, aur/PKGBUILD-git, aur/.SRCINFO, aur/pictopy.install, aur/README.md): A complete Arch Linux packaging suite: PKGBUILD for binary releases with dependencies and installation hooks; PKGBUILD-git for development builds with dynamic versioning and full build() stages (backend, sync-microservice, frontend); an install script with post_install/post_upgrade/pre_remove/post_remove lifecycle hooks; comprehensive documentation.
  • Documentation and configuration (README.md, frontend/src-tauri/tauri.conf.json): Added a platform-specific installation section (Arch AUR, Linux AppImage/.deb, Windows/macOS installers); updated Tauri bundle targets to "all" and added appimage/rpm Linux-specific configurations.

Sequence Diagram(s)

sequenceDiagram
    participant FastAPI as FastAPI Request<br/>(Thread N)
    participant DBMgr as DatabaseConnectionManager
    participant ThreadLocal as Thread-Local<br/>Storage
    participant SQLite as SQLite<br/>Database
    participant TxnCtx as Transaction Context<br/>Manager

    FastAPI->>DBMgr: get_connection() / get_db_transaction()
    DBMgr->>ThreadLocal: check thread ID for existing conn
    alt Connection exists
        ThreadLocal-->>DBMgr: return cached conn
    else New thread
        ThreadLocal->>SQLite: create new connection<br/>(WAL, FK constraints)
        SQLite-->>ThreadLocal: configured conn
    end
    DBMgr->>TxnCtx: enter context (begin transaction)
    FastAPI->>SQLite: execute query
    SQLite-->>FastAPI: result
    FastAPI->>TxnCtx: exit context
    alt Write transaction
        TxnCtx->>SQLite: commit (with write-lock)
    else Read transaction
        TxnCtx->>SQLite: rollback / release
    end
    TxnCtx-->>FastAPI: success
sequenceDiagram
    participant Release as GitHub Release Event
    participant CI as GitHub Actions
    participant Deb as Linux .deb Artifact
    participant Tar as create-aur-tarball Job
    participant AUR as AUR Git Repo
    participant Publish as publish-aur Workflow

    Release->>CI: trigger (publish or workflow_dispatch)
    CI->>Tar: run create-aur-tarball after build
    Tar->>CI: extract version from tag
    Tar->>Deb: download .deb from release
    Tar->>Tar: extract .deb → tar.gz + sha256sum
    Tar->>Release: upload tarball + checksum
    Release-->>Publish: trigger publish-aur workflow
    Publish->>Release: download amd64/arm64 tarballs
    Publish->>Publish: compute sha256 checksums
    Publish->>AUR: update PKGBUILD<br/>version, sources, sha256sums
    Publish->>AUR: regenerate .SRCINFO<br/>metadata + dependencies
    Publish->>AUR: commit + push via SSH
    AUR-->>Publish: package published

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Rationale: The changes span heterogeneous concerns: extensive database refactoring across 8 modules that introduces a new connection management pattern with transaction helpers, new AUR packaging infrastructure with 5 new files and 2 modified workflows, and Tauri configuration updates. The database refactoring requires careful verification of transaction-context correctness and resource cleanup across many functions, while the AUR packaging requires understanding of shell scripting, GitHub Actions mechanics, and Arch packaging standards. The changes are interconnected but demand separate reasoning per domain.

Suggested labels

backend, enhancement, packaging

Suggested reviewers

  • rahulharpal1603

Poem

🐰 Thread-safe hops through SQLite's door,
No tangled locks upon the floor,
AUR packages bloom in Arch so fine,
Each transaction's path a careful line.
From database burrows to Linux wings,
This rabbit's work does greater things!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Out of Scope Changes check (⚠️ Warning): The PR includes scope expansion: AUR packaging (aur/ directory files), workflow automation (.github/workflows/), and frontend configuration changes (tauri.conf.json) are beyond the #943 issue scope of thread-safe SQLite backend management. Resolution: separate AUR packaging, GitHub Actions workflows, and frontend config changes into distinct PRs focused on their respective scopes; keep this PR focused on backend SQLite thread-safety refactoring.

✅ Passed checks (4 passed)
  • Description check (✅ Passed): Check skipped; CodeRabbit’s high-level summary is enabled.
  • Linked Issues check (✅ Passed): All coding requirements from issue #943 are met: thread-safe DatabaseConnectionManager using threading.local(), context managers for read/write transactions, per-thread connections, removal of check_same_thread=False, refactored database modules, and WAL mode with proper pragmas.
  • Docstring Coverage (✅ Passed): Docstring coverage is 96.05%, which is sufficient; the required threshold is 80.00%.
  • Title check (✅ Passed): The PR title 'Feat/thread safe sqlite connection' directly and clearly describes the main objective and primary change: implementing thread-safe SQLite connection management for the backend.


coderabbitai bot left a comment


Actionable comments posted: 10

Fix all issues with AI Agents 🤖
In @.github/workflows/build-and-release.yml:
- Line 408: Replace the outdated action reference softprops/action-gh-release@v1
with a current compatible release (e.g., softprops/action-gh-release@v2) by
updating the uses entry for the GitHub Actions step that currently lists
softprops/action-gh-release@v1; after updating the uses value, run or dry-run
the workflow to verify runner compatibility and adjust any input names if the
newer major version changed them.
- Around line 379-389: The workflow step "Download Linux deb from release" uses
a fixed sleep 30 which is flaky; replace it with a retry loop that polls the
asset URL built from TAG and VERSION (the same URLs used in the curl commands)
using a lightweight HEAD/GET check (e.g., curl -sfI) with exponential backoff
and a max retry count, exiting with error if the asset never appears; after a
successful availability check proceed to download the deb with the existing curl
fallback logic for both "picto-py_${VERSION}_amd64.deb" and
"PictoPy_${VERSION}_amd64.deb".

In @aur/.SRCINFO:
- Around line 40-42: Update the .SRCINFO to point to an existing release: change
the package version from 1.1.0 to 1.0.0 and update the source entries (e.g., the
fields source_aarch64, source_x86_64 or their equivalents) to the correct URLs
for v1.0.0 artifacts that actually exist on GitHub (or remove platform tarball
entries if only .deb/.exe/.app are available), and replace the placeholder
sha256sums_x86_64 and sha256sums_aarch64 values (currently SKIP) with the real
SHA256 checksums for the actual artifacts once you have valid downloads;
alternatively, if you intend to target v1.1.0, revert these changes and wait
until v1.1.0 publishes Linux tar.gz assets before committing, ensuring the
source_* and sha256sums_* symbols match real release files.

In @aur/PKGBUILD:
- Around line 48-49: Replace the placeholder SKIP values in the PKGBUILD by
computing and inserting the actual SHA256 checksums for the release artifacts
referenced in this PKGBUILD: update the arrays sha256sums_x86_64 and
sha256sums_aarch64 with the real SHA256 hex strings for each source entry;
generate the checksums using the same script/process referenced in the previous
comment on aur/.SRCINFO (or run sha256sum on each downloaded source artifact)
and then paste the resulting checksum values into the corresponding array
elements, ensuring ordering matches the sources array.
- Line 84: The PKGBUILD installs a non-existent LICENSE via the line containing
install -Dm644 "${srcdir}/../LICENSE" ... || true which masks failures; either
add a LICENSE file to the repository and include it in the release tarball
(update sources so the file is available in ${srcdir} and change the install
path to "${srcdir}/LICENSE" and remove the "|| true" so failures surface), or
remove that install line entirely until a LICENSE is provided; ensure the
declared license=('MIT') matches the presence of the LICENSE file in the
distribution.

In @aur/PKGBUILD-git:
- Around line 112-113: The current recursive chmod call (chmod -R 755
"${pkgdir}/usr/lib/pictopy/resources") makes all files executable; replace it by
setting directory perms to 755 and regular file perms to 644, then explicitly
mark only actual executables/scripts as 755. Concretely, use find on
"${pkgdir}/usr/lib/pictopy/resources" to: 1) chmod directories (-type d -exec
chmod 755 {} +), 2) chmod regular files (-type f -exec chmod 644 {} +), and 3)
chmod script files with a shebang or known executable extensions (e.g., find ...
-type f -exec grep -q '^#!' {} \; -exec chmod 755 {} + or target specific script
locations) so only true executables become executable.

In @backend/app/database/connection.py:
- Around line 112-134: The transaction() context manager currently always issues
"BEGIN IMMEDIATE" which fails on nested use; change it to detect nesting (use
sqlite3.Connection.in_transaction or a thread-local nesting counter on the same
connection) and when already in a transaction use SAVEPOINTs instead of
BEGIN/COMMIT: create a unique SAVEPOINT name on entry, RELEASE the savepoint on
success, and ROLLBACK TO the savepoint on exception (and then RELEASE or let it
be cleaned up), while keeping the original BEGIN IMMEDIATE / COMMIT / ROLLBACK
behavior for top-level transactions; update the error logging (logger.error in
the except block) to indicate whether a top-level rollback or savepoint rollback
occurred and ensure _get_connection() and transaction() handle the nesting state
consistently.
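
One possible shape for that savepoint fallback (a sketch, not the PR's code; it assumes connections are opened with isolation_level=None so BEGIN and SAVEPOINT statements are issued explicitly):

import itertools
import logging
import sqlite3
import threading
from contextlib import contextmanager

logger = logging.getLogger(__name__)
_sp_ids = itertools.count()  # unique SAVEPOINT names

class DatabaseConnectionManager:
    def __init__(self, path: str = "PictoPy.db") -> None:
        self._local = threading.local()
        self._path = path

    def _get_connection(self) -> sqlite3.Connection:
        conn = getattr(self._local, "conn", None)
        if conn is None:
            # isolation_level=None: transactions are controlled manually.
            conn = sqlite3.connect(self._path, isolation_level=None)
            self._local.conn = conn
        return conn

    @contextmanager
    def transaction(self):
        conn = self._get_connection()
        if conn.in_transaction:
            # Nested call on the same connection: use a SAVEPOINT.
            name = f"sp_{next(_sp_ids)}"
            conn.execute(f"SAVEPOINT {name}")
            try:
                yield conn
                conn.execute(f"RELEASE SAVEPOINT {name}")
            except Exception:
                conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
                conn.execute(f"RELEASE SAVEPOINT {name}")
                logger.error("Rolled back to savepoint %s", name)
                raise
        else:
            # Top-level transaction: keep the original semantics.
            conn.execute("BEGIN IMMEDIATE")
            try:
                yield conn
                conn.execute("COMMIT")
            except Exception:
                conn.execute("ROLLBACK")
                logger.error("Rolled back top-level transaction")
                raise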

In @backend/app/utils/face_clusters.py:
- Around line 610-635: The error handling is inconsistent: when a cursor is
passed the code logs and re-raises the exception, but in the no-cursor branch it
logs and returns False; standardize by re-raising the exception in the no-cursor
path so callers can handle errors uniformly. In the else branch that uses
get_db_write_transaction(), after logger.error(...) re-raise the caught
exception (or simply remove the try/except to let it bubble) instead of
returning False; apply this change around the block that obtains conn, cur, and
executes the UPDATE to mirror the cursor != None behavior.

In @frontend/src-tauri/tauri.conf.json:
- Line 10: You changed the Tauri config key "targets" in tauri.conf.json from
the explicit array ["nsis","deb","app"] to the string "all", which expands
packaging to every supported bundle type and will add platform-specific build
requirements (WiX, macOS toolchain), CI complexity, and more artifacts; confirm
this is intentional, and if not, revert "targets" to the previous explicit array
or explicitly set the minimal target list you want, otherwise update CI/build
docs and ensure required platform toolchains (e.g., WiX for MSI, macOS for DMG)
are installed and storage/artifact retention policies accommodate the extra
bundles.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
backend/app/utils/face_clusters.py (2)

128-156: Critical: Using read-only connection for write operations.

Line 128 uses get_db_connection() but performs multiple write operations:

  • db_delete_all_clusters(cursor)
  • db_insert_clusters_batch(cluster_list, cursor)
  • db_update_face_cluster_ids_batch(results, cursor)
  • _update_cluster_face_image(...)
  • db_update_metadata(current_metadata, cursor)

According to the connection module documentation, get_db_connection() is for read-only operations. Write operations require get_db_transaction() (with automatic commit/rollback) or get_db_write_transaction() (serialized writes).

Using the wrong connection type means these writes won't be automatically committed, breaking the transaction semantics and potentially causing data loss.

🔎 Proposed fix
-        with get_db_connection() as conn:
+        with get_db_transaction() as conn:
             cursor = conn.cursor()
             # Clear old clusters first
             db_delete_all_clusters(cursor)

159-162: Critical: Using read-only connection for write operations.

Line 159 uses get_db_connection() for write operations (db_update_face_cluster_ids_batch). This should use get_db_transaction() or get_db_write_transaction() to ensure the changes are committed.

🔎 Proposed fix
-        with get_db_connection() as conn:
+        with get_db_transaction() as conn:
             cursor = conn.cursor()
             db_update_face_cluster_ids_batch(face_cluster_mappings, cursor)
backend/app/database/faces.py (1)

117-143: Batch inserts lack atomicity due to per-item transactions.

When inserting multiple embeddings, each call to db_insert_face_embeddings opens its own write transaction. This means:

  1. N separate lock acquisitions instead of 1
  2. If the 3rd insert fails, the first 2 are already committed
  3. Performance overhead from repeated transaction setup/teardown

Consider wrapping the batch in a single transaction for atomicity and performance.

🔎 Proposed fix for atomic batch inserts
     # Handle multiple faces in one image
     if (
         isinstance(embeddings, list)
         and len(embeddings) > 0
         and isinstance(embeddings[0], np.ndarray)
     ):
+        with get_db_write_transaction() as conn:
+            cursor = conn.cursor()
+            face_ids = []
+            for i, emb in enumerate(embeddings):
+                conf = (
+                    confidence[i]
+                    if isinstance(confidence, list) and i < len(confidence)
+                    else confidence
+                )
+                bb = bbox[i] if isinstance(bbox, list) and i < len(bbox) else bbox
+                cid = (
+                    cluster_id[i]
+                    if isinstance(cluster_id, list) and i < len(cluster_id)
+                    else cluster_id
+                )
+                embeddings_json = json.dumps([e.tolist() for e in emb])
+                bbox_json = json.dumps(bb) if bb is not None else None
+                cursor.execute(
+                    """
+                    INSERT INTO faces (image_id, cluster_id, embeddings, confidence, bbox)
+                    VALUES (?, ?, ?, ?, ?)
+                    """,
+                    (image_id, cid, embeddings_json, conf, bbox_json),
+                )
+                face_ids.append(cursor.lastrowid)
+            return face_ids
-        face_ids = []
-        for i, emb in enumerate(embeddings):
-            conf = (
-                confidence[i]
-                if isinstance(confidence, list) and i < len(confidence)
-                else confidence
-            )
-            bb = bbox[i] if isinstance(bbox, list) and i < len(bbox) else bbox
-            cid = (
-                cluster_id[i]
-                if isinstance(cluster_id, list) and i < len(cluster_id)
-                else cluster_id
-            )
-            face_id = db_insert_face_embeddings(image_id, emb, conf, bb, cid)
-            face_ids.append(face_id)
-        return face_ids
🧹 Nitpick comments (9)
backend/app/utils/face_clusters.py (1)

623-623: Local import inside function may indicate circular dependency.

The import of get_db_write_transaction is placed inside the function rather than at module level. While this works, it typically indicates either:

  1. A circular import that should be resolved by restructuring
  2. An unnecessary local import that could be moved to the top

If there's no circular dependency, prefer moving this to the module-level imports (line 16) for clarity and consistency.

backend/app/database/metadata.py (1)

72-83: Rowcount check after DELETE + INSERT may be unclear.

Lines 76 and 83 return cursor.rowcount > 0 after executing both DELETE and INSERT. The rowcount attribute reflects the last operation (INSERT), which should always insert exactly 1 row.

While this works, the logic is indirect. Consider either:

  1. Returning True unconditionally after successful execution (since INSERT should always succeed)
  2. Adding a comment clarifying that rowcount reflects the INSERT operation
Alternative implementation
     if cursor is not None:
         # Use provided cursor (external transaction management)
         cursor.execute("DELETE FROM metadata")
         cursor.execute("INSERT INTO metadata (metadata) VALUES (?)", (metadata_json,))
-        return cursor.rowcount > 0
+        return True  # INSERT always succeeds or raises
     else:
         # Use our own transaction
         with get_db_write_transaction() as conn:
             cur = conn.cursor()
             cur.execute("DELETE FROM metadata")
             cur.execute("INSERT INTO metadata (metadata) VALUES (?)", (metadata_json,))
-            return cur.rowcount > 0
+            return True  # INSERT always succeeds or raises
backend/app/database/images.py (1)

201-241: Inconsistent error handling pattern.

Unlike db_get_all_images which wraps the operation in try/except and returns an empty list on error, this function lets exceptions propagate. Consider adding consistent error handling.

🔎 Proposed fix for consistent error handling
 def db_get_untagged_images() -> List[UntaggedImageRecord]:
     """
     Find all images that need AI tagging.
     ...
     """
+    try:
         with get_db_connection() as conn:
             cursor = conn.cursor()
             # ... existing code ...
             return untagged_images
+    except Exception as e:
+        logger.error(f"Error getting untagged images: {e}")
+        return []
backend/app/database/connection.py (2)

23-24: Unused module-level thread-local storage.

_thread_local at line 24 is defined but never used. The DatabaseConnectionManager class uses its own self._local instead. Consider removing this dead code.

🔎 Proposed fix
-# Thread-local storage for database connections
-_thread_local = threading.local()
-
 # Lock for write operations to serialize database writes
 _write_lock = threading.Lock()

228-247: Inconsistent PRAGMA configuration between connections.

get_new_connection() configures only 3 PRAGMAs while _create_connection() configures 8. This could lead to inconsistent behavior (e.g., ignore_check_constraints, recursive_triggers, synchronous) between thread-local and standalone connections.

🔎 Proposed fix for consistent PRAGMA configuration
 def get_new_connection() -> sqlite3.Connection:
     """
     Create a new standalone database connection.
     ...
     """
     conn = sqlite3.connect(
         DATABASE_PATH,
         timeout=30.0,
     )
+    # Enable WAL mode for better concurrent read performance
+    conn.execute("PRAGMA journal_mode=WAL;")
+    
+    # Enforce all integrity constraints
     conn.execute("PRAGMA foreign_keys = ON;")
-    conn.execute("PRAGMA journal_mode=WAL;")
+    conn.execute("PRAGMA ignore_check_constraints = OFF;")
+    conn.execute("PRAGMA recursive_triggers = ON;")
+    conn.execute("PRAGMA defer_foreign_keys = OFF;")
+    conn.execute("PRAGMA case_sensitive_like = ON;")
+    
+    # Optimize for concurrent access
     conn.execute("PRAGMA busy_timeout = 30000;")
+    conn.execute("PRAGMA synchronous = NORMAL;")
+    
     return conn
.github/workflows/publish-aur.yml (2)

33-36: Hardcoded sleep is fragile for asset availability.

A fixed 60-second wait may be insufficient if release asset uploads are delayed, causing checksum downloads to fail and result in SKIP values. Consider polling the GitHub API for asset availability instead.

🔎 Proposed polling approach
      - name: Wait for release assets
        run: |
          echo "Waiting for release assets to be available..."
-          sleep 60
+          VERSION="${{ steps.get_version.outputs.version }}"
+          MAX_ATTEMPTS=30
+          ATTEMPT=0
+          while [ $ATTEMPT -lt $MAX_ATTEMPTS ]; do
+            ASSETS=$(gh api repos/${{ github.repository }}/releases/tags/v${VERSION} --jq '.assets | length' 2>/dev/null || echo "0")
+            if [ "$ASSETS" -ge 2 ]; then
+              echo "Release assets are available ($ASSETS assets found)"
+              break
+            fi
+            echo "Waiting for assets... attempt $((ATTEMPT+1))/$MAX_ATTEMPTS"
+            sleep 30
+            ATTEMPT=$((ATTEMPT+1))
+          done
+        env:
+          GH_TOKEN: ${{ github.token }}

89-147: Consider generating .SRCINFO from PKGBUILD instead of duplicating.

The .SRCINFO is generated inline with hardcoded values that duplicate the PKGBUILD content. This creates a maintenance burden as changes to PKGBUILD must be manually synced. The standard approach is to use makepkg --printsrcinfo within an Arch container.

🔎 Alternative approach using makepkg
      - name: Generate .SRCINFO
        uses: docker://archlinux:base
        with:
          args: bash -c "cd aur && makepkg --printsrcinfo > .SRCINFO"

If this isn't feasible due to build dependencies, ensure the hardcoded values stay synchronized with PKGBUILD.

aur/PKGBUILD-git (2)

57-95: Build steps lack proper error handling and directory validation.

The build function has several reliability concerns:

  1. cd commands don't verify success before proceeding
  2. source venv/bin/activate may not persist across commands in some shell contexts
  3. No explicit error checking after critical commands like pip install or cargo build
🔎 Proposed improvements
 build() {
     cd "${srcdir}/${pkgname}"
+    
+    set -e  # Exit on first error

     # Build backend server
     echo "Building backend server..."
-    cd backend
+    cd backend || exit 1
     python -m venv venv
     source venv/bin/activate
     pip install --upgrade pip
     pip install -r requirements.txt
     pip install pyinstaller
     pyinstaller main.py --name PictoPy_Server --onedir --distpath dist
     mkdir -p dist/PictoPy_Server/images
     cp -r app dist/PictoPy_Server/
     deactivate
-    cd ..
+    cd "${srcdir}/${pkgname}"

     # Build sync microservice
     echo "Building sync microservice..."
-    cd sync-microservice
+    cd sync-microservice || exit 1
     # ... similar pattern

139-163: Consider adding a LICENSE file to the repository root.

The LICENSE file is missing from the repository, making the embedded fallback license necessary. However, the conventional approach is to include a LICENSE file in the repository root so that tools can discover it automatically. Adding the file to the repository eliminates the need for this embedding workaround and follows standard open-source practices.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe45e2f and cf14002.

📒 Files selected for processing (19)
  • .github/workflows/build-and-release.yml
  • .github/workflows/publish-aur.yml
  • README.md
  • aur/.SRCINFO
  • aur/PKGBUILD
  • aur/PKGBUILD-git
  • aur/README.md
  • aur/pictopy.install
  • backend/app/database/albums.py
  • backend/app/database/connection.py
  • backend/app/database/face_clusters.py
  • backend/app/database/faces.py
  • backend/app/database/folders.py
  • backend/app/database/images.py
  • backend/app/database/metadata.py
  • backend/app/database/yolo_mapping.py
  • backend/app/utils/face_clusters.py
  • backend/test.py
  • frontend/src-tauri/tauri.conf.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-31T17:00:50.132Z
Learnt from: Hemil36
Repo: AOSSIE-Org/PictoPy PR: 570
File: backend/app/database/connection.py:16-24
Timestamp: 2025-10-31T17:00:50.132Z
Learning: In PictoPy backend, the user prefers not to use database connection retry logic or extended busy timeouts in the centralized get_db_connection() context manager, even though the app has concurrent access patterns via ProcessPoolExecutor and FastAPI.

Applied to files:

  • backend/app/database/connection.py
  • backend/app/database/images.py
🧬 Code graph analysis (8)
backend/app/utils/face_clusters.py (1)
backend/app/database/connection.py (1)
  • get_db_write_transaction (199-216)
backend/app/database/yolo_mapping.py (1)
backend/app/database/connection.py (1)
  • get_db_transaction (178-195)
backend/app/database/metadata.py (1)
backend/app/database/connection.py (3)
  • get_db_connection (157-174)
  • get_db_transaction (178-195)
  • get_db_write_transaction (199-216)
backend/app/database/folders.py (2)
backend/app/database/connection.py (3)
  • get_db_connection (157-174)
  • get_db_transaction (178-195)
  • get_db_write_transaction (199-216)
backend/app/logging/setup_logging.py (2)
  • setup_logging (134-196)
  • get_logger (199-209)
backend/app/database/faces.py (2)
backend/app/database/connection.py (3)
  • get_db_connection (157-174)
  • get_db_transaction (178-195)
  • get_db_write_transaction (199-216)
backend/app/logging/setup_logging.py (2)
  • setup_logging (134-196)
  • get_logger (199-209)
backend/app/database/images.py (2)
backend/app/database/connection.py (3)
  • get_db_connection (157-174)
  • get_db_transaction (178-195)
  • get_db_write_transaction (199-216)
backend/app/utils/images.py (1)
  • image_util_parse_metadata (496-513)
backend/test.py (2)
backend/app/database/connection.py (1)
  • get_db_connection (157-174)
backend/app/database/faces.py (1)
  • get_all_face_embeddings (146-216)
backend/app/database/albums.py (1)
backend/app/database/connection.py (3)
  • get_db_connection (157-174)
  • get_db_transaction (178-195)
  • get_db_write_transaction (199-216)
🪛 actionlint (1.7.9)
.github/workflows/build-and-release.yml

408-408: the runner of "softprops/action-gh-release@v1" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🔇 Additional comments (56)
frontend/src-tauri/tauri.conf.json (1)

16-21: The appimage.bundleMediaFramework: true and rpm.release: "1" configurations are correctly implemented per Tauri's official documentation. The appimage setting includes GStreamer for audio/video support, and the rpm release value uses the standard default. These configurations are appropriate and properly formatted—no changes needed.

backend/app/database/face_clusters.py (1)

1-319: LGTM - Clean refactoring to thread-safe connection management.

The file correctly uses the new connection management pattern:

  • Read operations (db_get_*) use get_db_connection()
  • Write operations use get_db_write_transaction()
  • DDL operations use get_db_transaction()
  • Optional cursor parameters enable transaction composition while providing sensible defaults

The implementation aligns with SQLite's concurrency best practices and the PR's thread-safety objectives.
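
The composition that the optional cursor parameters enable, sketched below; the function names follow the ones used in this review, but the import paths and exact signatures are assumptions:

# Several db_* helpers join one atomic write transaction by sharing a cursor.
from app.database.connection import get_db_write_transaction
from app.database.face_clusters import (
    db_delete_all_clusters,
    db_insert_clusters_batch,
)
from app.database.faces import db_update_face_cluster_ids_batch

def rebuild_clusters(cluster_list, results) -> None:
    with get_db_write_transaction() as conn:
        cursor = conn.cursor()
        db_delete_all_clusters(cursor)                      # step 1
        db_insert_clusters_batch(cluster_list, cursor)      # step 2
        db_update_face_cluster_ids_batch(results, cursor)   # step 3
    # One commit covers all three steps; any failure rolls back all of them.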

backend/test.py (1)

4-30: LGTM - Correct usage of connection context manager.

The refactoring properly wraps the database query in the get_db_connection() context manager. The connection will be correctly closed after the return statement at line 30, as the context manager handles cleanup after the block exits.

backend/app/database/yolo_mapping.py (1)

1-28: LGTM - Clean transaction-based refactoring.

The function correctly uses get_db_transaction() to wrap both table creation and data insertion in a single transaction with automatic commit/rollback. This ensures atomicity and eliminates manual connection management.

backend/app/database/metadata.py (1)

17-54: LGTM - Correct connection usage for DDL and reads.

  • db_create_metadata_table correctly uses get_db_transaction() for DDL with automatic commit
  • db_get_metadata correctly uses get_db_connection() for read-only operations
backend/app/database/faces.py (7)

1-14: LGTM!

The imports are correctly set up with the new centralized connection helpers and logger initialization.


38-55: LGTM!

Table creation correctly uses get_db_transaction for the DDL operation with proper schema definition.


58-94: LGTM!

Correct use of get_db_write_transaction for the insert operation with proper return of the new face_id.


146-216: LGTM!

Read operation correctly uses get_db_connection. The in-function import at line 171 is acceptable to avoid circular imports.


219-280: LGTM!

Read functions correctly use get_db_connection with proper JSON parsing of embeddings.


282-332: LGTM!

The dual-path design supporting both external cursor and internal transaction is well-implemented. This allows callers to include this operation in larger transactions when needed.


335-382: LGTM!

db_get_cluster_mean_embeddings correctly uses read connection and properly calculates mean embeddings per cluster.

backend/app/database/images.py (9)

1-13: LGTM!

Imports and logger initialization are correctly set up.


46-78: LGTM!

Table creation correctly uses get_db_transaction to create both tables atomically.


81-107: LGTM!

Bulk insert correctly uses get_db_write_transaction with proper error handling and upsert logic.


110-198: LGTM!

Read operation correctly uses get_db_connection with proper result aggregation and error handling.


244-265: LGTM!

Write operation correctly uses get_db_write_transaction with proper error handling.


268-294: LGTM!

Batch insert with INSERT OR IGNORE correctly uses write transaction.


297-329: LGTM!

Read operation correctly uses get_db_connection.


332-360: LGTM!

Delete operation correctly uses get_db_write_transaction with proper logging.


363-392: LGTM!

Toggle operation correctly uses get_db_write_transaction with existence check before update.

backend/app/database/connection.py (5)

1-28: LGTM!

Clear module documentation, proper imports, and thread-local storage setup. The _write_lock for serializing writes is a good approach for SQLite.


136-149: LGTM!

The write_transaction() correctly serializes writes using the global lock before delegating to transaction().


156-175: LGTM!

get_db_connection correctly delegates to the connection manager with proper documentation.


177-216: LGTM!

get_db_transaction and get_db_write_transaction properly delegate to the manager methods.


219-226: LGTM!

close_thread_connection provides a clean way to release thread-local resources.

backend/app/database/folders.py (12)

1-13: LGTM!

Imports and logger initialization are correctly set up.


23-39: LGTM!

Table creation correctly uses get_db_transaction.


42-53: LGTM!

Batch insert correctly uses get_db_write_transaction.


56-110: LGTM!

db_insert_folder correctly validates the path before the transaction and handles both insert and lookup of existing folders within a single transaction.


113-144: LGTM!

Read functions correctly use get_db_connection.


147-153: LGTM!

db_get_all_folder_ids correctly uses read connection.


156-176: LGTM!

Batch delete correctly uses get_db_write_transaction.


179-200: LGTM!

db_delete_folder correctly uses write transaction with existence check.


203-222: LGTM!

db_update_parent_ids_for_subtree correctly uses write transaction.


225-255: LGTM!

db_folder_exists and db_find_parent_folder_id correctly use read connections.


258-299: LGTM!

AI tagging batch update functions correctly use write transactions.


302-393: LGTM!

Remaining read functions correctly use get_db_connection with proper query patterns.

backend/app/database/albums.py (11)

1-12: LGTM!

Imports and logger initialization are correctly set up.


15-46: LGTM!

Table creation functions correctly use get_db_transaction.


49-99: LGTM!

Read functions correctly use get_db_connection.


102-132: LGTM!

Album insert correctly uses get_db_write_transaction with proper bcrypt password hashing.


135-176: LGTM!

Album update correctly handles both password change and no-password-change scenarios.


179-188: LGTM!

Album delete correctly uses get_db_write_transaction.


191-207: LGTM!

db_get_album_images correctly uses read connection.


210-236: LGTM!

db_add_images_to_album correctly validates image IDs before insertion with proper error handling.


239-265: LGTM!

db_remove_image_from_album correctly validates existence before deletion.


268-281: LGTM!

db_remove_images_from_album correctly uses write transaction for batch deletion.


284-303: LGTM!

Password verification correctly uses read connection with proper bcrypt password checking.

README.md (1)

5-34: LGTM! Clear installation documentation.

The installation section is well-organized and covers multiple platforms appropriately. The AUR installation instructions align with the new packaging infrastructure introduced in this PR.

aur/pictopy.install (4)

1-22: LGTM!

The post_install function follows Arch Linux packaging conventions correctly. The conditional checks for gtk-update-icon-cache and update-desktop-database with -x before execution are appropriate, and the user messaging is helpful.


24-27: LGTM!

Correctly reuses post_install and accesses the new version via $1 per Arch Linux install script conventions.


29-31: LGTM!

Simple informational message before removal is appropriate.


33-47: Verify the user data directory path before approving.

The review assumes ~/.local/share/pictopy is correct, but this path is not explicitly referenced in the visible codebase. The database path is defined as a relative "app/database/PictoPy.db" in settings. Without examining the PyInstaller configuration or the compiled desktop application's code, the actual user data storage location cannot be definitively confirmed. Verify whether the application or documentation specifies where user data is actually stored.

aur/PKGBUILD-git (1)

52-55: LGTM!

The pkgver() function correctly formats git describe output for Arch versioning, with a sensible fallback to 1.1.0 if tags are unavailable.

.github/workflows/publish-aur.yml (1)

149-162: Version pinning is good practice, but the action's public release history cannot be verified.

The action KSXGitHub/github-actions-deploy-aur@v3.0.1 is not found in public GitHub repositories or release registries, so the currency of v3.0.1 cannot be confirmed through public sources. However, using a pinned version is the correct approach for reproducibility and stability; verify that this specific version works as intended for your deployment needs.

hharshhsaini changed the title from "Feat/thread safe sqlite connection #943" to "Feat/thread safe sqlite connection" on Jan 5, 2026
