MAINT/DOC: Designing notebook execution workflows

@dylansdaniels This is partially related to https://github.com/jonescompneurolab/textbook/pull/96, but it also concerns how we manage the process of adding new notebooks, or editing them to match changes in *development* that are not yet on master (see https://github.com/jonescompneurolab/hnn-core/pull/1064#issuecomment-3080251324). I think we need to discuss our overall approach to developing tutorials that use code (e.g. notebooks).

## Currently

We have two places where we are deploying notebooks (or "notebook-like-content"):

1. The distinct Jupyter notebooks at `textbook`, which are only built against the latest stable version.
2. The notebook-like "scripts" in our examples at `hnn-core`, which are deployed and built for every version, *including new development versions as of each PR*.

There is also no way for a script/notebook to automatically "go between" these two repos. Instead, we must manually copy Jupyter notebooks, which is problematic for a number of reasons (e.g. the notebooks that `hnn-core` builds are malformed).
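One concrete symptom, described in the proposal below, is that the notebooks `hnn-core` deploys are empty (un-executed, with no output). As a minimal sketch of how that could be detected, assuming standard nbformat v4 notebook JSON, something like the following would flag un-executed code cells. The function name and the tiny in-memory notebook are hypothetical, made up for illustration only:

```python
import json


def unexecuted_code_cells(notebook_json: str) -> list[int]:
    """Return indices of code cells with no outputs and no execution count."""
    nb = json.loads(notebook_json)
    return [
        i
        for i, cell in enumerate(nb.get("cells", []))
        if cell.get("cell_type") == "code"
        and not cell.get("outputs")
        and cell.get("execution_count") is None
    ]


# Hypothetical two-cell notebook: one markdown cell, one un-executed code cell.
example = json.dumps(
    {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
            {
                "cell_type": "code",
                "metadata": {},
                "source": ["1 + 1"],
                "outputs": [],
                "execution_count": None,
            },
        ],
    }
)

print(unexecuted_code_cells(example))  # index of the un-executed code cell
```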

## Proposal

Essentially, I propose the following:

1. We change `hnn-core`'s doc system to use *and execute* true Jupyter notebooks. Currently it uses scripts, which are deployed as both (a) empty Jupyter notebooks (un-executed, with no output) and (b) webpages that `sphinx` builds out of the script code and output. There are multiple `sphinx` extensions that support executed notebooks, including [`myst-nb`](https://myst-nb.readthedocs.io/en/latest/) and [`nbsphinx`](https://nbsphinx.readthedocs.io/en/0.9.7/), and Read the Docs has some [good documentation](https://docs.readthedocs.com/platform/stable/guides/jupyter.html) on this. Instead of the scripts getting tested via CircleCI for every PR (and release), the Jupyter notebooks would be tested. Honestly, even if we don't proceed with the rest of this proposal, this is a good idea for `hnn-core` regardless.
2. We remove the "execution" part of Jupyter notebook processing in the `textbook` repo, but we still retain the code to take a Jupyter notebook, extract it into the JSON files you've made, and use them to assemble webpages that look good on the `textbook` website.
3. Instead, the `textbook` repo will be changed to *directly use the Jupyter notebooks from the `hnn-core` repo itself*. There are at least 2 ways to do this:
    1. on deployment, `textbook` grabs notebook files directly from raw.githubusercontent and updates the hashes and its JSON files, or
    2. on deployment, `textbook` depends on and downloads `hnn-core` as a git submodule, then processes the newly-downloaded versions of the notebooks (at first glance I prefer this, and it has the added benefit that the only hash we need to track is that of the `hnn-core` submodule commit itself, rather than the hashes of multiple files).
    3. There are probably other ways to do this.
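To make the bookkeeping difference between options 1 and 2 concrete, here's a rough sketch of the per-file hash tracking that option 1 implies. This is not `textbook`'s actual hashing code: the function names, the choice of SHA-256, and the temporary files are all assumptions for illustration. With option 2, this whole comparison would collapse to checking whether the single `hnn-core` submodule commit changed.

```python
import hashlib
import tempfile
from pathlib import Path


def notebook_hashes(root: Path) -> dict[str, str]:
    """Map each notebook path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.ipynb"))
    }


def changed_notebooks(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List notebooks that were added, removed, or modified between snapshots."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )


# Hypothetical demo: two placeholder notebooks, one of which gets edited.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.ipynb").write_text("{}")
    (root / "b.ipynb").write_text("{}")
    before = notebook_hashes(root)
    (root / "b.ipynb").write_text('{"cells": []}')
    after = notebook_hashes(root)

changed = changed_notebooks(before, after)
print(changed)  # only the edited notebook needs re-processing
```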

Pros:
- We could use `textbook` to point to BOTH stable and development notebook versions, if we want! For example, most of our pages would presumably point to the `stable` version of each notebook (which will be located on the `hnn-core/gh-pages` branch), but if we want to add a page for a new feature that is still in development, that page alone could point to the new notebook on the `hnn-core/master` branch.
- `textbook` could access development versions of notebooks, even including those of *unfinished PRs*. This is the issue I ran into here https://github.com/jonescompneurolab/hnn-core/pull/1064#issuecomment-3080251324 which made me come up with this solution.
- There would no longer be any ambiguity about which version a notebook was successfully run on, or between websites. `textbook` and `hnn-core` would never have alternate versions of a notebook that the other repo could not access. There is "one and only one" version of each notebook per `hnn-core` commit.
- Notebooks would only need to be executed once for each version, rather than twice (once in `hnn-core` and again, separately, in `textbook`).
    - Similarly, `textbook` would also never have to double-check that each notebook actually works.
- Similar to our code website, if we wanted to provide separate "stable" and "development" *webpage* versions of our `textbook` website, it's just a matter of pointing to different versions of the same notebooks.
- The current method of manually copying Jupyter notebooks from `hnn-core` (which still need heavy changes, and do not necessarily work out-of-the-box!) is prone to issues, and is not a good approach if we are planning major rewrites or new notebooks, which we are:
    - It's manual.
    - The current notebooks on `textbook` are not the same as the notebooks on `hnn-core`.
    - Updates to scripts (or new ones) on `hnn-core` need to be tested on `hnn-core`, then converted to notebooks, then edited, then manually copied over to `textbook`, and this process needs to be repeated every time.
- This would save us from having to start dealing with multiple kinds of execution styles like those in #96.

Cons:
- This is obviously a non-trivial amount of work. However, I think the current method is unsustainable, and we need to streamline (and connect the two repos) somehow.
- Authors would have to edit notebooks on `hnn-core` rather than on `textbook`, even though they want the notebook content to show up on `textbook`. However, I think this is inevitable if we want `textbook` and `hnn-core` execution of every notebook to be synchronized.
- Much of the execution code here would no longer be needed (but the rest of the code would still be).
- ???

