@ellisonbg
Collaborator

@ellisonbg ellisonbg commented Jul 13, 2025

Summary

  • Fixes #137 (LLMs can add broken cells that never show output)
  • Fix cell ID regex pattern to properly handle nbformat cell IDs with alphanumeric characters, underscores, and hyphens
  • Ensure all notebook cells have IDs by upgrading notebooks to nbformat 4.5+ and generating IDs for cells that lack them
  • Improve regex pattern specificity for file IDs (inline UUID pattern) and cell IDs (nbformat-compatible pattern)
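To illustrate the difference between the two kinds of pattern, here is a minimal sketch in Python. The names `CELL_ID_RE`, `FILE_ID_RE`, `is_valid_cell_id`, and `is_valid_file_id` are hypothetical and are not taken from this PR's diff, and the UUID expression is one common way to write a hyphenated UUID that may differ from the pattern actually inlined in the code. The nbformat schema also limits cell IDs to 64 characters, which this sketch does not enforce.

```python
import re
import uuid

# Hypothetical names for illustration; the constants in the PR may differ.
# Cell IDs: alphanumerics, underscores, and hyphens, as allowed by nbformat.
CELL_ID_RE = re.compile(r"[a-zA-Z0-9_-]+")

# File IDs: a canonical hyphenated UUID (assumed form, not copied from the PR).
FILE_ID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_cell_id(cell_id: str) -> bool:
    """Return True if the whole string is an nbformat-style cell ID."""
    return CELL_ID_RE.fullmatch(cell_id) is not None


def is_valid_file_id(file_id: str) -> bool:
    """Return True if the whole string is a hyphenated UUID."""
    return FILE_ID_RE.fullmatch(file_id) is not None


# A UUID is also a valid cell ID, but a short cell ID is not a valid file ID.
print(is_valid_cell_id("my_cell-01"))
print(is_valid_cell_id(str(uuid.uuid4())))
print(is_valid_file_id("my_cell-01"))
```

Because a UUID only contains hexadecimal digits and hyphens, the looser cell ID pattern still accepts every ID that the strict UUID pattern accepted.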

Changes

  • Updated cell ID regex from strict UUID pattern to [a-zA-Z0-9_-]+ to match nbformat specification
  • Added automatic notebook upgrade to nbformat 4.5+ to ensure cell IDs are available
  • Generate UUIDs for cells missing IDs in both read and write operations
  • Simplified regex patterns by inlining UUID pattern for file IDs
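The upgrade and ID generation steps can be sketched roughly as follows. This is a simplified stand-in that works on a plain notebook dictionary, not the code from this PR and not nbformat's own upgrade routine; the function name `ensure_cell_ids` is hypothetical.

```python
import uuid
from typing import Any, Dict


def ensure_cell_ids(nb: Dict[str, Any]) -> bool:
    """Mark a v4 notebook dict as 4.5+ and fill in any missing cell IDs.

    Hypothetical helper for illustration only. Returns True if the
    notebook dict was modified.
    """
    changed = False

    # Cell IDs are only part of the format from nbformat 4.5 onward.
    if nb.get("nbformat") == 4 and nb.get("nbformat_minor", 0) < 5:
        nb["nbformat_minor"] = 5
        changed = True

    # Assign a fresh UUID to every cell that has no ID; keep existing IDs.
    for cell in nb.get("cells", []):
        if not cell.get("id"):
            cell["id"] = str(uuid.uuid4())
            changed = True

    return changed


notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "source": "print('hi')"},
        {"cell_type": "markdown", "id": "keep-me", "source": "# Title"},
    ],
}
ensure_cell_ids(notebook)
print(notebook["nbformat_minor"], [c["id"] for c in notebook["cells"]])
```

Running the helper a second time on the same notebook changes nothing, which is the property that matters when the same logic runs on both read and write.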

Test plan

  • Test API endpoints with various cell ID formats (UUID and alphanumeric)
  • Verify notebooks without cell IDs get properly upgraded and assigned IDs
  • Confirm outputs are correctly saved and retrieved with new cell ID handling

🤖 Generated with Claude Code

@ellisonbg
Collaborator Author

A quick additional note: the part of this PR that needs discussion is that, because we require cell IDs, we need to upgrade all notebooks to nbformat 4.5. Both jupyter_ydoc and nbformat delete cell IDs if present, so we have to do this. An alternative would be to change nbformat and jupyter_ydoc to allow cell IDs in older formats, even though they would need to be optional in the spec.

Collaborator

@dlqqq dlqqq left a comment

@ellisonbg Thank you for opening this PR! I was able to verify that this fixes #137 with Claude Code.

@ellisonbg
Collaborator Author

Feel free to merge. I have another PR on the way that fixes kernel restarts, so we might want to wait for that one before we cut another release.

@dlqqq
Collaborator

dlqqq commented Jul 15, 2025

> A quick additional note: the part of this PR that needs discussion is that, because we require cell IDs, we need to upgrade all notebooks to nbformat 4.5. Both jupyter_ydoc and nbformat delete cell IDs if present, so we have to do this.

Is this change invisible to end users (i.e. the upgrade has no intended side effects)? If so, then I think this is fine. If this causes any issues, the proper solution is to fix the migration logic in nbformat.

@dlqqq dlqqq changed the title from "Fix cell ID handling and regex patterns in outputs API" to "Loosen cell ID regex to match nbformat spec" Jul 15, 2025
@dlqqq dlqqq merged commit 389f2af into jupyter-ai-contrib:main Jul 15, 2025
4 of 8 checks passed
@ellisonbg
Collaborator Author

The only visible change is that the user's notebook on disk will be in the 4.5 format. Users who were deliberately using an older spec (almost unheard of) may be surprised. As a follow-up, we could emit a notification when this happens to let them know.
