Batch database commits during Gmail sync for 50-70% performance improvement#46
Draft
Batch database commits during Gmail sync for 50-70% performance improvement#46
Conversation
- Add BATCH_COMMIT_SIZE constant (20 emails per commit) - Track emails_processed counter across sync operations - Batch commits every N emails in both initial and incremental sync - Update history ID during batch commits for crash recovery - Add rollback in exception handler for failed transactions - Keep final commits at end of sync operations Co-authored-by: mrrobot47 <25586785+mrrobot47@users.noreply.github.com>
- Extract _batch_commit_if_needed() helper to reduce code duplication - Optimize to single commit per batch (removed duplicate commit) - Update both initial and incremental sync to use helper - Improve code maintainability and readability Co-authored-by: mrrobot47 <25586785+mrrobot47@users.noreply.github.com>
- Add check for emails_processed > 0 to prevent commit when counter is zero - Ensures batch commits only happen after processing actual emails Co-authored-by: mrrobot47 <25586785+mrrobot47@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Optimize Gmail sync to reduce database commits
Batch database commits during Gmail sync for 50-70% performance improvement
Jan 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Gmail sync commits to the database after processing each individual email. For users with 50-100 new emails, this results in 50-100 separate database commits per sync cycle, creating excessive I/O overhead and transaction locking.
Changes
Batch commit logic
BATCH_COMMIT_SIZE = 20constant for commit frequency control_batch_commit_if_needed()helper to commit every N emails with history ID update for crash recoveryemails_processedcounter across entire sync operation (all labels)Applied to both sync modes
Transaction safety
frappe.db.rollback()in exception handlerBefore/After
Performance: 100 emails now requires ~6 commits instead of 100 (94% reduction). With 10 accounts, reduces ~1000 commits to ~50 per sync cycle.
Original prompt
This section details on the original issue you should resolve
<issue_title>Commit Per Email During Gmail Sync</issue_title>
<issue_description>
Metadata
frappe_gmail_thread/frappe_gmail_thread/doctype/gmail_thread/gmail_thread.py:222,249,272,372Problem Description
The Gmail sync function (
sync()) callsfrappe.db.commit()after processing each individual email message. For users with active inboxes receiving 50-100 new emails, a single sync operation will execute 50-100 separate database commits.Each commit:
This significantly slows down the sync process and increases database load.
Code Location
Initial sync - commit per email (line 222):
End of initial sync label loop (line 249):
History not found error handling (line 272):
End of incremental sync (line 372):
Root Cause
The developer likely added per-email commits to ensure data persistence in case of mid-sync failures. However, this approach:
Proposed Solution
Batch commits - either commit once at the end of processing all emails for a label, or commit every N emails (e.g., 10-20) to balance durability with performance:
Implementation Steps
last_historyidduring batch commits for crash recoveryfrappe.db.rollback()in exception handler to clean up failed transactionsAdditional Notes
# nosemgrepcomment suggests the developer was aware this is a pattern to avoid but added it intentionallyfrappe.db.savepoint()for finer-grained transaction control if needed✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.