Fix cell duplication when YStore is out-of-sync with file#526
Fix cell duplication when YStore is out-of-sync with file#526samihamine wants to merge 1 commit intojupyterlab:mainfrom
Conversation
This commit addresses the cell duplication bug reported in jupyterlab/jupyterlab#14031. Root cause analysis from production logs: - When initialize() detects an out-of-sync between YStore and file - The YDoc already contains content from YStore (potentially corrupted) - Calling set() tries to MERGE instead of REPLACE - This causes cell duplication Fixes: 1. handlers.py: Reorder initialize() BEFORE serve() to prevent race condition 2. rooms.py: Clear YDoc cells when out-of-sync is detected before loading from file Tested with production logs showing 100% correlation between out-of-sync and duplication.
krassowski
left a comment
There was a problem hiding this comment.
Thank you for the PR!
Looking at jupyter_ydoc/ynotebook.py, the set() method (lines 233-321):
The code which you highlight was only introduced 2 weeks ago, but issue that you are addressing was opened over two years ago.
If that code is really the culprit, can you confirm:
- it started two weeks ago for you
- downgrading to jupyter-ydoc v3.1.0 resolves the issue for you
But the Y.js internal state (clientID, clock) differs, causing the merge to create duplicates.
This is kind of irrelevant to set() implementation because it operates on serialized value.
| # FIX: Clear the document content BEFORE loading from file to prevent | ||
| # cell duplication. When out-of-sync is detected, the YDoc contains | ||
| # content from the YStore that may have corrupted state (e.g., from | ||
| # previous duplications). If we don't clear it, the set() method will | ||
| # try to MERGE the YStore content with the file content instead of | ||
| # REPLACING it, which causes cell duplication. | ||
| # See: https://github.com/jupyterlab/jupyterlab/issues/14031 | ||
| self._clear_document() |
There was a problem hiding this comment.
Doing it here would undo the benefits of work done to address #509 and jupyterlab/jupyterlab#18100, e.g.
- Fix cell modifications causing cell output reload and shift to active cell index jupyter-server/jupyter_ydoc#360
- Improve reloading of documents to avoid spurious side-effects jupyter-server/jupyter_ydoc#355
set() method will try to MERGE the YStore content with the file content instead of REPLACING it
This should be:
- tested (there should be a test showing the problem, failing now, passing after a test)
- addressed in
jupyter-ydocset()method for notebook, not in this package - hard reload should be only performed IF required to prevent UI side-effects
There was a problem hiding this comment.
It might be easier to test this by adding a UI test here, even if the fix ultimately goes to the jupyter-ydoc package because this repo has UI tests setup.
|
We've experienced this bug on both versions of jupyter-ydoc:
So the Looking more carefully at the log sequence:
In all 3 cases, the duplication is detected AFTER The actual mechanism:
About the About Some questions i have in mind if you can help with please :
|
That's something we've been thinking about, and it would also help with managing the size of the YStore (see here).
What do you mean by out-of-sync? Technically client and server are always out-of-sync. I think we could introduce some kind of session ID into the shared document, and when the room is cleared and the shared document recreated in the backend, change this ID. When the client sees that its session ID doesn't match, it recreates a fresh shared document which syncs with the backend. But this would require cooperation from the frontend. |
Fixes cell duplication and DuplicateCellId warnings that occur when YStore state diverges from the notebook file on disk, typically after disconnect/reconnect beyond document_cleanup_delay where the server
loads from YStore then detects out-of-sync with disk.
Root cause: YNotebook.set() would insert new cells when encountering retained cells in the wrong position, creating duplicates instead of reordering existing cells. This happens when:
- Server loads doc from YStore after room cleanup
- Compares with model["content"] (file on disk)
- Detects out-of-sync and calls set() to reload from file
- File has same cell IDs but different order than YStore
Solution: Use pycrdt Array.move() to reorder retained cells instead of recreating them. This:
- Prevents duplication when cell order differs between YStore and disk
- Preserves CRDT structure (observers, undo scope, granular updates)
- Fixes the root cause in the document model layer (not server layer)
- Handles already-corrupted state with duplicate IDs
- O(n) complexity for pure reordering of retained cells
Implementation:
- Deduplicate cell IDs when building old_ycells_by_id mapping
- Remove duplicate IDs during pruning phase
- Build id_to_index map for O(1) retained cell lookups
- Use Array.move(cur, index) to reorder retained cells
- Update id_to_index only for affected range after moves (not entire map)
- Rebuild id_to_index after inserts (rare, so O(n) rebuild acceptable)
- Use .get("id") instead of ["id"] for safer access
- Append new_cell (not old_cell snapshot) to new_cell_list
- Add defensive fallthrough if retained cell not found
Complexity:
- Pure reordering: O(n) with range updates
- With inserts: O(n) per insert for map rebuild
- Worst case: O(n²) if many inserts, but inserts are uncommon in
out-of-sync recovery (most cells are retained)
Tests:
- test_set_reorder_does_not_duplicate_cells
- test_set_removes_preexisting_duplicate_ids
- test_set_reorder_with_mixed_operations
- test_set_simple_adjacent_swap
Fixes [jupyter-server/jupyter-collaboration#526](jupyterlab/jupyter-collaboration#526)
Fixes [jupyterlab/jupyterlab#14031](jupyterlab/jupyterlab#14031)
* Fix cell duplication when YStore diverges from disk
Fixes cell duplication and DuplicateCellId warnings that occur when YStore state diverges from the notebook file on disk, typically after disconnect/reconnect beyond document_cleanup_delay where the server
loads from YStore then detects out-of-sync with disk.
Root cause: YNotebook.set() would insert new cells when encountering retained cells in the wrong position, creating duplicates instead of reordering existing cells. This happens when:
- Server loads doc from YStore after room cleanup
- Compares with model["content"] (file on disk)
- Detects out-of-sync and calls set() to reload from file
- File has same cell IDs but different order than YStore
Solution: Use pycrdt Array.move() to reorder retained cells instead of recreating them. This:
- Prevents duplication when cell order differs between YStore and disk
- Preserves CRDT structure (observers, undo scope, granular updates)
- Fixes the root cause in the document model layer (not server layer)
- Handles already-corrupted state with duplicate IDs
- O(n) complexity for pure reordering of retained cells
Implementation:
- Deduplicate cell IDs when building old_ycells_by_id mapping
- Remove duplicate IDs during pruning phase
- Build id_to_index map for O(1) retained cell lookups
- Use Array.move(cur, index) to reorder retained cells
- Update id_to_index only for affected range after moves (not entire map)
- Rebuild id_to_index after inserts (rare, so O(n) rebuild acceptable)
- Use .get("id") instead of ["id"] for safer access
- Append new_cell (not old_cell snapshot) to new_cell_list
- Add defensive fallthrough if retained cell not found
Complexity:
- Pure reordering: O(n) with range updates
- With inserts: O(n) per insert for map rebuild
- Worst case: O(n²) if many inserts, but inserts are uncommon in
out-of-sync recovery (most cells are retained)
Tests:
- test_set_reorder_does_not_duplicate_cells
- test_set_removes_preexisting_duplicate_ids
- test_set_reorder_with_mixed_operations
- test_set_simple_adjacent_swap
Fixes [jupyter-server/jupyter-collaboration#526](jupyterlab/jupyter-collaboration#526)
Fixes [jupyterlab/jupyterlab#14031](jupyterlab/jupyterlab#14031)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update jupyter_ydoc/ynotebook.py
Co-authored-by: David Brochart <david.brochart@gmail.com>
* Update jupyter_ydoc/ynotebook.py
Co-authored-by: David Brochart <david.brochart@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update tests/test_ynotebook.py
Co-authored-by: David Brochart <david.brochart@gmail.com>
* Update tests/test_ynotebook.py
Co-authored-by: David Brochart <david.brochart@gmail.com>
* remove YArray.move in favor of del+insert for yjs 13.x compat
* dont need complexity of building a map since we now have O(n^2) complexity without move
* Apply suggestions from code review
Co-authored-by: David Brochart <david.brochart@gmail.com>
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: David Brochart <david.brochart@gmail.com>
|
@samihamine would you mind confirming that upgrading to |
|
Awesome, thanks for the update, that’s great news. Yes, we’ll test by upgrading to v3.3.4 on our side and see if it fixes the issue in our repro setup. I’ll report back here with the results once we’ve validated it. |
Fix cell duplication when YStore is out-of-sync with file
Fixes jupyterlab/jupyterlab#14031
Summary
This PR fixes a cell duplication bug that occurs when the SQLite YStore content differs from the file on disk (out-of-sync condition). The bug manifests after laptop sleep/wake cycles or prolonged disconnections.
Root Cause Analysis
Why is this bug INTERMITTENT (random)?
The bug only occurs when ALL of these conditions are met:
document_cleanup_delay(default: 60s)This explains why:
Evidence from Production Logs
We analyzed 15 reconnection events across two production incidents.
Statistical Correlation
Correlation: 100% - Every out-of-sync event led to duplication.
Log Extract: Case WITHOUT duplication (first connection)
Log Extract: Case WITHOUT duplication (first connection, no YStore)
Content loaded from file(direct load, no YStore) → NO DuplicateCellIdLog Extract: Case WITH duplication (reconnection after sleep, YStore out-of-sync)
Content loaded from the ystore SQLiteYStore→ YStore was loaded firstContent in file ... is out-of-sync with the ystore→ OUT-OF-SYNC DETECTED (TRIGGER)Content loaded from file→ File content applied viaset()which does MERGETechnical Deep Dive
The Feedback Loop (Why duplications accumulate)
This explains why duplications accumulate over multiple reconnections:
Why set() causes MERGE instead of REPLACE
Looking at
jupyter_ydoc/ynotebook.py, theset()method (lines 233-321):The problem: When YStore has cells
[A, B, C]and file has cells[A, B, C]with the same IDs,set()tries to retain them. But the Y.js internal state (clientID, clock) differs, causing the merge to create duplicates.The Fix
Fix 1: Clear YDoc on out-of-sync (
rooms.py)When out-of-sync is detected, we now clear the YDoc cells before calling
set():The
_clear_document()method:This ensures
set()receives an empty YDoc, so all cells come fresh from the file with no merge conflicts.Fix 2: Reorder initialize/serve (
handlers.py) - (DEFENSE IN DEPTH)The original code had a race condition:
This allowed SYNC_STEP1 to be sent before initialization completed. We now ensure initialization completes first:
Why Both Fixes Are Needed
rooms.py(clear on out-of-sync)handlers.py(reorder init/serve)The
rooms.pyfix is critical as it directly prevents the merge that causes duplication. Thehandlers.pyfix is defense in depth - good practice but won't help if YStore is already out-of-sync.Testing
ycells.clear()is the same method used internally byset()when no cells are retained