Skip to content

Conversation

chrisjsewell
Copy link
Contributor

@chrisjsewell chrisjsewell commented Sep 15, 2021

This is a hacked together Proof-of-Principle for https://discourse.jupyter.org/t/inline-variable-insertion-in-markdown/10525/87

nbclient would accept an additional configuration trait: parse_md_expressions, which is a callable that takes a Markdown string and extracts from it a list of substitution expressions.

If present, these are then parsed to the kernel, via user_expressions, to return their mime-bundles, which are then stored as attachments on the Markdown cell

@chrisjsewell
Copy link
Contributor Author

Oh wait a minute, its obviously in the exec_reply 🤦
thats ok I will carry on messing around with this

@chrisjsewell
Copy link
Contributor Author

working example done 😄
Obviously very rough around the edges, but feel free to give feedback already (I'll post about it in the discussion now)

@chrisjsewell chrisjsewell changed the title Try getting back user_expressions PoP for Inline variable insertion in markdown Sep 15, 2021
@meeseeksmachine
Copy link

This pull request has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/inline-variable-insertion-in-markdown/10525/90

finally:
raise
self._check_raise_for_error(cell, exec_reply)
attachments = {key: val["data"] if "data" in val else {"traceback": "\n".join(val["traceback"])} for key, val in exec_reply["content"]["user_expressions"].items()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over on jupyterlab-imarkdown I'm playing around with keeping the kernel-output itself as two different MIME types, i.e. the "attachment" is just one of two possible MIME types.

My thought process here is that we keep the structural information, and do the formatting as late as possible. The status field probably shouldn't be stored, as it's already encoded in the MIME type.

How do you feel about doing something like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya 👋🏼 well my feeling is that really we use have a new cell type with a new json schema, that would be a sort of hybrid between code and markdown; e.g. it should definitely have an execution number and then this data should have its own “expressions” key, where yeh we would have a more 1-to-1 mapping with the data returned by the kernel

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having these special mime types indeed works, but I feel is a bit “hacky” for core jupyter

Copy link
Contributor Author

@chrisjsewell chrisjsewell Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it would be along the lines of:

  {
   "cell_type": "imarkdown",
   "execution_count": 1,
   "metadata": {},
   "expressions": [
      {
        "input": "a",
        "output_type": "display",
        "data": {
          "text/plain": "1"
        }
      },
      {
        "input": "b",
        "output_type": "error",
        "traceback": ["..."]
      },
   ],
   "source": [
    "{{ a }} {{ b }}\n"
   ]
  }

@agoose77
Copy link
Contributor

Hey @chrisjsewell, thanks for linking me to this!

In general, I'm glad that you're tackling the other "big" Jupyter frontend that needs to implement this functionality.

One of the problems as I see it with the existing notebook schema is that we only support GFM. Because we are deviating from that to implement the concept of "inline-variable" insertion, we are defining our own "flavour". For this reason, I think it's a good idea that you've used a configurable to off-load the implementation. In the long run, I think we need to standardise this somehow. One of my ideas is to remove Markdown as a special-case from the Jupyter Notebook schema, and instead have a MIME-type associated with each non-cell. This could then be used to provide finer-grained information about the Markdown flavour(s) used. This is not really in-scope for this particular PoC, though, so I won't keep musing on it in this thread!

@meeseeksmachine
Copy link

This pull request has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/inline-variable-insertion-in-markdown/10525/113

@davidbrochart
Copy link
Member

@chrisjsewell are you still planning to work on this PR?

@chrisjsewell
Copy link
Contributor Author

chrisjsewell commented Jan 5, 2022

are you still planning to work on this PR?

yes very much so 😬

https://discourse.jupyter.org/t/inline-variable-insertion-in-markdown/10525/113 has stalled a bit, but will "rally the troops" when I find time.

Outside that, I would really like to utilise something like this in https://github.com/executablebooks/jupyter-cache (and by extension https://github.com/executablebooks/jupyter-book).
I may do some "sub-classing" in jupyter-cache, as an interim solution, but ultimately would like to have it here.

Probably the requirement of Markdown parsing etc is a bit overly complex to be in nbclient.
The main aspiration here, is just a "hook" into the client's user_expressions feature (https://jupyter-client.readthedocs.io/en/stable/messaging.html#execute)
I think a minimal implementation would be simply to modify NotebookClient.async_execute_cell something like this:

    user_expressions: bool = Bool(False, help="Process user_expressions in cell metadata")

    async def async_execute_cell(
            self,
            cell: NotebookNode,
            cell_index: int,
            execution_count: t.Optional[int] = None,
            store_history: bool = True) -> NotebookNode:
        """
        Executes a single code cell...
        """
        assert self.kc is not None
        if cell.cell_type != 'code' or not cell.source.strip():
            self.log.debug("Skipping non-executing cell %s", cell_index)
            return cell

        if self.record_timing and 'execution' not in cell['metadata']:
            cell['metadata']['execution'] = {}

        self.log.debug("Executing cell:\n%s", cell.source)

        cell_allows_errors = (not self.force_raise_errors) and (
            self.allow_errors
            or "raises-exception" in cell.metadata.get("tags", []))

        # retrieve desired user expressions from the metadata (validate type?)
        user_expressions = cell.metadata.get('user_expressions', None) if self.user_expressions else None

        parent_msg_id = await ensure_async(
            self.kc.execute(
                cell.source,
                store_history=store_history,
                stop_on_error=not cell_allows_errors,
                user_expressions=user_expressions,
            )
        )
        # We launched a code cell to execute
        self.code_cells_executed += 1
        exec_timeout = self._get_timeout(cell)

        cell.outputs = []
        self.clear_before_next_output = False

        # clear any existing user_expressions outputs
        # note, it would be ideal if these could be a top-level key in the cell (rather than in the metadata),
        # but nbformat/v4/nbformat.v4.5.schema.json does not allow additional properties on the code cell
        if self.user_expressions:
            cell.metadata.pop('user_expressions_outputs', None)

        task_poll_kernel_alive = asyncio.ensure_future(
            self._async_poll_kernel_alive()
        )
        task_poll_output_msg = asyncio.ensure_future(
            self._async_poll_output_msg(parent_msg_id, cell, cell_index)
        )
        self.task_poll_for_reply = asyncio.ensure_future(
            self._async_poll_for_reply(
                parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
            )
        )
        try:
            exec_reply = await self.task_poll_for_reply
        except asyncio.CancelledError:
            # can only be cancelled by task_poll_kernel_alive when the kernel is dead
            task_poll_output_msg.cancel()
            raise DeadKernelError("Kernel died")
        except Exception as e:
            # Best effort to cancel request if it hasn't been resolved
            try:
                # Check if the task_poll_output is doing the raising for us
                if not isinstance(e, CellControlSignal):
                    task_poll_output_msg.cancel()
            finally:
                raise

        # add user_expressions outputs, if present
        if self.user_expressions and 'user_expressions' in exec_reply['content']:
            cell.metadata['user_expressions_outputs'] = exec_reply['content']['user_expressions']

        if execution_count:
            cell['execution_count'] = execution_count
        self._check_raise_for_error(cell, exec_reply)
        self.nb['cells'][cell_index] = cell

        return cell

An example of this working is, create test.ipynb:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "user_expressions": {"key": "a"}
},
   "outputs": [],
   "source": [
    "a = 1"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

Then run:

In [1]: import nbformat, nbclient
In [2]: client = nbclient.NotebookClient(nbformat.read(open("test.ipynb"), as_version=4), user_expressions=True)
In [3]: client.execute()
Out[3]: 
{'cells': [{'cell_type': 'code',
   'execution_count': 1,
   'metadata': {'user_expressions': {'key': 'a'},
    'execution': {'iopub.status.busy': '2022-01-05T16:46:25.294446Z',
     'iopub.execute_input': '2022-01-05T16:46:25.295670Z',
     'iopub.status.idle': '2022-01-05T16:46:25.307139Z',
     'shell.execute_reply': '2022-01-05T16:46:25.307858Z'},
    'user_expressions_outputs': {'key': {'status': 'ok',
      'data': {'text/plain': '1'},
      'metadata': {}}}},
   'outputs': [],
   'source': 'a = 1'}],
 'metadata': {'language_info': {'name': 'python',
   'version': '3.7.10',
   'mimetype': 'text/x-python',
   'codemirror_mode': {'name': 'ipython', 'version': 3},
   'pygments_lexer': 'ipython3',
   'nbconvert_exporter': 'python',
   'file_extension': '.py'}},
 'nbformat': 4,
 'nbformat_minor': 2}

Any "auto-generation" of the user_expressions via Markdown cell parsing, could then be in pre-processing, outside of nbclient.

what do you think about something like this @davidbrochart?

@davidbrochart
Copy link
Member

Thanks @chrisjsewell, I like the idea of using user_expressions.
One thing I don't understand though, would it be a code cell, or a markdown cell?

@chrisjsewell
Copy link
Contributor Author

One thing I don't understand though, would it be a code cell, or a markdown cell?

On a code-cell.
Essentially, forget the initial proposal regarding any interaction with the markdown cell; at least for now that can be done in pre/post-processing steps on the notebook (outside nbclient)

Basically what we would be offering to nbclient users, is the ability to ask the kernel to snapshot a set of variable outputs, after execution of the code source code, without having to display them (and in a kernel agnostic manner)

@choldgraf
Copy link
Contributor

I was looking through the user_expressions docs (of which there are very little 🤦) and I think that supporting user_expressions in nbclient in this way is a nice step forward here. It follows the jupyter_client spec nicely, and I sorta think about nbclient like jupyter_client extended to notebooks. An added bonus here is that it gives us a foundation that we could use to play around with the markdown insertion stuff. So I'm +1 supporting user_expressions in nbclient.

@davidbrochart
Copy link
Member

that can be done in pre/post-processing steps on the notebook (outside nbclient)

@chrisjsewell you might want to have a look at #188, which adds on_cell_start and on_cell_complete hooks. Could they be used for the pre/post processing steps you're talking about?

@chrisjsewell
Copy link
Contributor Author

FYI, I went ahead and made a PoP PR in myst-nb for full inline variable evaluation 😄 https://myst-nb--382.org.readthedocs.build/en/382/use/inline_execution.html
Basically, its running a NotebookClient, as it parses the document (within a context manager); calling kernel execution whenever it meets a code cell or inline variable.

@chrisjsewell
Copy link
Contributor Author

Closing this, I actually went a different route: https://myst-nb.readthedocs.io/en/latest/render/inline.html

May be good to eventually upstream the additional code in: https://github.com/executablebooks/MyST-NB/blob/master/myst_nb/ext/eval/__init__.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants