Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
76 changes: 73 additions & 3 deletions jupyter_server_nbmodel/actions.py
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,60 @@ async def _get_ycell(
return ycell


def handle_carriage_return(s: str) -> str:
"""Handle text the same way that a terminal emulator would display it

Args:
s: The message
Returns:
Message with carriage returns handled in the text
"""
lines = s.split('\n')
processed_lines = []

for line in lines:
result = []
i = 0
while i < len(line):
if line[i] == '\r':
# Move cursor to start of the line
# Reset the result buffer and prepare to overwrite
i += 1
overwrite_chars = []
while i < len(line) and line[i] != '\r':
overwrite_chars.append(line[i])
i += 1
for j, c in enumerate(overwrite_chars):
if j < len(result):
result[j] = c
else:
result.append(c)
else:
result.append(line[i])
i += 1
processed_lines.append(''.join(result))

return '\n'.join(processed_lines)


def handle_backspace(s: str) -> str:
"""Simulate backspaces in the text

Args:
s: The message
Returns:
The message with backspaces applied
"""
new_str = []
for c in s:
if c == '\b':
if len(new_str) > 0 and new_str[-1] not in ('\n', '\r'):
new_str.pop()
else:
new_str.append(c)
return ''.join(new_str)


def _output_hook(outputs: list[NotebookNode], ycell: y.Map | None, msg: dict) -> None:
"""Callback on execution request when an output is emitted.

Expand All @@ -99,29 +153,45 @@ def _output_hook(outputs: list[NotebookNode], ycell: y.Map | None, msg: dict) ->
# FIXME support for version
output = nbformat.v4.output_from_msg(msg)
outputs.append(output)

if ycell is not None:
cell_outputs = ycell["outputs"]
if msg_type == "stream":
with cell_outputs.doc.transaction():
text = output["text"]
# FIXME Logic is quite complex at https://github.com/jupyterlab/jupyterlab/blob/7ae2d436fc410b0cff51042a3350ba71f54f4445/packages/outputarea/src/model.ts#L518

if text.endswith((os.linesep, "\n")):
text = text[:-1]
Comment on lines 163 to 164
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • BTW, this was an approximation, it loses data. Regular foo\n should not be treated same as foo.
  • And it'd be nice to tie the array structure of "text": [...] to newlines?

In regular Jupyterlab, here is what I see:
screenshot
and here is the .ipynb representation:

> jq .cells[].outputs --compact-output Untitled.ipynb
[{"name":"stdout","output_type":"stream","text":["no trailing"]}]
[{"name":"stdout","output_type":"stream","text":["1 trailing\n"]}]
[{"name":"stdout","output_type":"stream","text":["2 trailing\n","\n"]}]
[{"name":"stdout","output_type":"stream","text":["foobar\n","baz"]}]
[]

=> in doc model, "text" array tends to start new item after every newline. 📜

https://nbformat.readthedocs.io/en/latest/format_description.html says it doesn't matter!

Some fields, such as code input and text output, are characteristically multi-line strings. When these fields are written to disk, they may be written as a list of strings, which should be joined with '' when reading back into memory. In programmatic APIs for working with notebooks (Python, Javascript), these are always re-joined into the original multi-line string. ...

However, keeping item-per-line is nice for .ipynb diff-ability.

=> in doc model, the last line does record whether it had a trailing newline. ✍️

=> the UI does not show whether the the last line had trailing newline 🙈

In a terminal, lack of trailing newline is evident by next prompt just appearing on same line e.g. no trailing>>> but that'd be a sin against the structuredness of notebooks. And since 1 newline is the common case, UI didn't want to spend vertical space rendering it. (the first two cells have exactly same height)

UI does reflect >1 newlines at the end by showing blank lines.

But in theory, future/alternate UI might reflect trailing newline.


For comparison, here is the structure for same cells I'm getting with this branch:

[{"name":"stdout","output_type":"stream","text":["no trailing"]}]
[{"name":"stdout","output_type":"stream","text":["1 trailing"]}]
[{"name":"stdout","output_type":"stream","text":["2 trailing\n"]}]
[{"name":"stdout","output_type":"stream","text":["foo","bar\nbaz"]}]
  • eats one newline off the 1 & 2 cases 🐞
  • the array structure — while formally not important — reflects how it got streamed, less stable for diffs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ However, it appears the current Jupyterlab is not as lenient as the spec I cited, and does NOT concat multiple items without newlines into one line?

for i in range(30):
    print(i, end=' ')
    import time; time.sleep(1)

upstream results in json:

[{"name":"stdout","output_type":"stream","text":["0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 "]}]

which renders all on one line. With this extension (with or without this branch), I get:

[{"name":"stdout","output_type":"stream","text":["0 ","1 ","2 ","3 ","4 ","5 ","6 ","7 ","8 ","9 ","10 ","11 ","12 ","13 ","14 ","15 ","16 ","17 ","18 ","19 ","20 ","21 ","22 ","23 ","24 ","25 ","26 ","27 ","28 ","29 "]}]

which is supposed to be equivalent yet shows up in browser as each number on a separate line!


if (not cell_outputs) or (cell_outputs[-1]["name"] != output["name"]):
output["text"] = [text]
output["text"] = [handle_carriage_return(handle_backspace(text))]
cell_outputs.append(output)
else:
last_output = cell_outputs[-1]
last_output["text"].append(text)
old_text = last_output["text"][-1] if len(last_output["text"]) > 0 else ""
combined_text = old_text + text
if '\r' in combined_text or '\b' in combined_text:
if combined_text[-1] == '\r':
Copy link
Contributor

@cben cben Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange to me that this needs to be special-cased here. I hope it can be folded into the handle_...() functions.

Why do I say that? A terminal has a state machine processing incoming characters, but never cares whether they arrived in one buffer vs. char-by-char, and a char's handling is independent of being first/middle/last in some buffer.
[in practice, there may be some timeouts for incomplete escape sequences? I know on input emacs distinguishes ESC press from prefix by timeout, but don't think terminals do this? Anyway \r and \b are single-char "atomic" sequences, so that's irrelevant.]

It's possible that modeling the state as old_text fed back into the functions same as arriving characters is insufficient. 🤔

block-beta
   columns 6
   space     space state[("old_text")] space space kernel
   space     space space               space space space
   handle_cr space handle_b            space space space

   kernel-- "+ incoming text" -->state
   state --> handle_b
   handle_b --> handle_cr
   handle_cr --> state
Loading

More formally, these functions could take 2 args (state, incoming text) and reduce it to updated state? Not sure whether that matters here.

  • For starters, cursor position is now modeled by cutting off/replacing some trailing characters but perhaps is more natural to model state as (text, cursor) pair?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the special case for when incoming text contains \r at the end to ensure the text is shown. Otherwise, the text may be discarded too early and never be shown. So yeah, I guess that is a weakness of the old_text/new_text approach.

The other issue that you pointed out is due to appending incoming text as new lines when neither the old text nor the new one contain a special character. I don't remember if that was part of the original behavior, but what would be the ideal way to handle that here? I'm thinking of concatenating the new output to the end of the last cell until \n is encountered or doing away with the lists and replacing them with a (text, cursor) pair as you suggested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, the text may be discarded too early and never be shown.

Ah yes that's where it gets interesting 🤔

In an ideal world, now that we've had glass TTYs for decades, there'd be some migration path to simpler semantics for apps, where \b would actually eat chars and \r would clear out everything since the last \n.

  • But what \r does in actual terminals, as well as existing Jupyter client-side output processing, is keep text in place, just move cursor to start of line. E.g. '11111\r 2' results in '2 111' (with cursor at the first 1) — if new text is shorter, remains of previous text are still visible.
    This is rarely used; it's more of a nuisance that after \r apps may need to right-pad the next line with spaces to ensure previous gets over-written. But it does give flexibility, as you say, that printing '...\r' each time is near-equivalent to printing '\r...' each time (only difference is where cursor stays most of the time).
    Anyway, I don't suppose a plugin shifting execution to server is the place to deviate from that behavior.

  • Moreover, actual terminals [soft-]wrap long lines and \r only returns to start of current line; at least Jupyter avoids that complication (behaves closer to infinite-width terminal, with horizontal scrolling if necessary).

But your suffix technique is interesting, I only now fully grokked it.
The Y model has no concept of representing or dislaying a "cursor", right?
By sometimes keeping an \r unprocessed — and storing it as such in the Y doc¹ too — it can represent states where a line is not fully overwritten yet 👍.
(¹This falls back on client-side to hide text per \r \b. Funny but OK imho.)

  • So perhaps even the handle_*() functions themselves could adopt this approach to represent all partially-overwritten states? E.g. now, if \r is in middle and doesn't hit your special code path:

    print('11111\r2 ', end='')
    time.sleep(1)
    print('b\b3', end='')  # \b simply to avoid `last_output["text"].append(text)` branch
    

    results in 2 1113, note how 3 got wrongly tacked at the end even though logical "cursor" should have stayed after the 2 . If the intermediate state was kept in 11111\r2 form, the "cursor position" could be retained.

  • With my previous suggestion of (text, cursor) pair, the cursor position can't be persisted into Y doc, only kept in-memory in server.
    However, it's not like we have any way to continue an in-progress execution when server dies, so I guess that's fine too.

suffix = '\r'
combined_text = combined_text[:-1]
else:
suffix = ''
new_text = handle_carriage_return(handle_backspace(combined_text)) + suffix
last_output["text"][-1] = new_text
else:
last_output["text"].append(text)
cell_outputs[-1] = last_output
else:
with cell_outputs.doc.transaction():
cell_outputs.append(output)

elif msg_type == "clear_output":
# FIXME msg.content.wait - if true should clear at the next message
outputs.clear()

if ycell is not None:
del ycell["outputs"][:]

elif msg_type == "update_display_data":
# FIXME
...
Expand Down
Loading