- Create `requirements.txt` with pinned dependencies for building site.
- Update .gitignore.
- Enhance Makefile.
- Rename existing lessons to be lesson/dd_specific.py (two digits, lower case).
- Notebooks with other names are not included in the index page.
- Add SQL tutorial.
- Add scripts to regenerate SQLite databases.
- Add queueing theory tutorial.
- Rename `scripts` directory to `bin`.
- Replace old `build.py` with:
- `bin/extract.py` gets metadata from `*/index.md` lesson pages.
- `bin/build.py` builds root home page and lesson home pages.
- `bin/check_empty_cells.py`: look for empty cells in notebooks (enhanced).
- `bin/check_missing_titles.py`: look for notebooks without an H1 title.
- `bin/check_notebook_packages.py`: check consistency of package versions within lesson.
- Add `make check_packages NOTEBOOKS="*/??_*.py"` to check package consistency within lesson.
- If `NOTEBOOKS` not specified, all notebooks are checked.
- Add `make check_exec NOTEBOOKS="*/??_*.py"` to check notebook execution.
- If `NOTEBOOKS` not specified, all notebooks are executed (slow).
- Fix missing package imports in notebook headers.
- Pin package versions in notebook headers.
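For reference, pinning package versions in a marimo notebook header means editing the PEP 723 inline script metadata at the top of each `.py` notebook. A minimal sketch (the versions shown are illustrative, not the ones actually pinned in this PR):

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "marimo",
#     "polars==1.24.0",
# ]
# ///
```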
- Make content of lesson home pages uniform.
- Update GitHub workflows to launch commands from Makefile.
- Requires using `uv` in workflows.
- Extract and modify CSS.
- Put SVG icons in includeable files in `templates/icons/*.svg`.
- Make titles of notebooks more uniform.
- Build `pages/*.md` using `templates/page.html`.
- Add link checker.
- Requires local server to be running, and takes 10 minutes or more to execute.
- Fix multiple bugs in individual lessons.
- Most introduced by package version pinning.
- See notes below for outstanding issues.
Note: build [`marimo_learn`](https://github.com/gvwilson/marimo_learn) package
with utilities to localize SQLite database files.
## To Do

Add `disabled=True` to prevent execution of deliberately buggy cells in script mode (?).
### `polars/03_loading_data.py`

The code at lines 497–499 calls `lz.sink_csv(..., lazy=True)`. The `lazy=True` argument was added to return a lazy sink that could be passed to `pl.collect_all()` for parallel execution instead of immediately writing the file. However, in polars 1.24.0 the `lazy` parameter was removed from `sink_csv()` (and likely from `sink_parquet()` and `sink_ndjson()` too); the API for collecting multiple sinks in parallel has changed.
### `polars/03_loading_data.py`, `polars/05_reactive_plots.py`, and `polars/11_missing_data.py`

These notebooks use the `hf://` protocol to stream a parquet file directly from Hugging Face:

```
URL = f"hf://datasets/{repo_id}@{branch}/{file_path}"
```

Polars URL-encodes the slash in the repo name when it calls the HF API, which then rejects it as an invalid repo name. The fix is to download the file and store it locally, or make it available in some other location.
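One possible "other location" is the Hub's direct `resolve` download route, which leaves the slash in the repo name literal and which polars can read over plain HTTPS. This is an assumed workaround, not something tested in this PR:

```python
def hf_resolve_url(repo_id: str, branch: str, file_path: str) -> str:
    # Direct download route on the Hugging Face Hub; bypasses the hf://
    # protocol and so avoids the repo-name slash-encoding problem.
    return f"https://huggingface.co/datasets/{repo_id}/resolve/{branch}/{file_path}"


# e.g. pl.read_parquet(hf_resolve_url(repo_id, branch, file_path))
```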
### `polars/07_querying_with_sql.py`

Kagglehub requires Kaggle API credentials that are not available in the browser. Either remove the data-loading step or substitute a bundled sample dataset.
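A bundled-sample fallback could be gated the same way as the WASM checks elsewhere in the series. A sketch with illustrative names (`SAMPLE_ROWS`, `load_rows`), not the notebook's real loader:

```python
import sys

# Hypothetical bundled sample standing in for the Kaggle dataset.
SAMPLE_ROWS = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]


def load_rows():
    if "pyodide" in sys.modules:
        return SAMPLE_ROWS  # in the browser: no Kaggle credentials available
    try:
        import kagglehub  # noqa: F401  (the real notebook would download here)
    except ImportError:
        return SAMPLE_ROWS  # kagglehub not installed locally either
    # kagglehub is available: the actual dataset download would go here;
    # this sketch still returns the bundled sample.
    return SAMPLE_ROWS
```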
### `polars/14_user_defined_functions.py` and `polars/16_lazy_execution.py`

Replace numba with a pure-Python alternative for the WASM version, or gate the numba cells with a WASM check and adjust the prose accordingly:

```
import sys

if "pyodide" not in sys.modules:
    import numba
```
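One way to do that gating while keeping the decorated functions importable under WASM is a no-op fallback decorator. This is a sketch (`fast_sum` is an illustrative name), not the notebooks' actual code:

```python
import sys

try:
    if "pyodide" in sys.modules:
        raise ImportError("skip numba under WASM")
    from numba import njit
except ImportError:
    # Pure-Python fallback: mimic numba.njit but compile nothing.
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]            # used bare: @njit
        return lambda fn: fn          # used with options: @njit(cache=True)


@njit
def fast_sum(values):
    total = 0.0
    for v in values:
        total += v
    return total
```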
@gvwilson, it's really nice to be reviewing a contribution of yours; left some comments and suggestions below; nits mostly (+ a recurring issue across the queueing series around the (unused) run button).
Haven't reviewed the SQL series yet since I still need to run the marimo_learn setup locally. Will follow up once I do. In the meantime, I came across this intro SQL notebook recently from the notebook.link folks (might be worth checking out).
Also some other libraries / things I wanted to mention: skrub has a nice TableReport that works well as a one-line data overview at the start of a SQL or EDA notebook.
lychee is worth adding as a GitHub Action; there are a lot of URLs in this codebase that'll drift over time. I set it up in another project and it helps track stale links and generates a report on whatever interval you set. Happy to file a separate issue for this :)
The cross-filtering in the view composition nb is a nice addition!!
Reviewing PRs here reminded me of the work @etrotta and @peter-gy (and others) had done in this repo (some great high quality contribs)!! Peter's still doing great work on the marimo side too (the notebook gallery is shaping up well).
> The `column` and `row` encoding channels generate either a horizontal (columns) or vertical (rows) set of sub-plots, in which the data is partitioned according to the provided data field.
>
> Here is a trellis plot that divides the data into one column per \`cluster\` value:

Suggested change:

```suggestion
Here is a trellis plot that divides the data into one column per `cluster` value:
```
> mo.md(r"""
> _We can see that Gram-positive bacteria seem most susceptible to penicillin, whereas neomycin is more effective for Gram-negative bacteria!_
>
> The color scheme above was automatically chosen to provide perceptually-distinguishable colors for nominal (equal or not equal) comparisons. However, we might wish to customize the colors used. In this case, Gram staining results in [distinctive physical colorings: pink for Gram-negative, purple for Gram-positive](https://en.wikipedia.org/wiki/Gram_stain#/media/File:Gram_stain_01.jpg).

Suggested change:

```suggestion
The color scheme above was automatically chosen to provide _perceptually-distinguishable_ colors for nominal (equal or not equal) comparisons. However, we might wish to customize the colors used. In this case, Gram staining results in [distinctive physical colorings: pink for Gram-negative, purple for Gram-positive](https://en.wikipedia.org/wiki/Gram_stain#/media/File:Gram_stain_01.jpg).
```
> return
>
> @app.cell

Suggested change:

```suggestion
@app.cell(hide_code=True)
```
By this point in the notebook, a reader is likely tired, and a (relatively) big code block that amalgamates displays of data from earlier sections could make them lose interest. I'd recommend hiding this cell and adding a short note in the markdown cell above saying that readers can toggle the cell to view the code.
> - why *not* marimo?
> - not yet as widely known as Jupyter (i.e., your IT department may not already support it)
> - not yet integrated with auto-grading tools ([faw](https://github.com/gvwilson/faw) is a start, but we're waiting to see what you want)
> - doesn't yet support multi-notebook books
I'm curious what "multi-notebook books" means here... something along the lines of Jupyter Book?
For what it's worth, marimo-team/marimo#8056 introduced serving a gallery of notebooks from your directory.
> - more than Notebook but not as intimidating as VS Code
> - reactivity allows for (encourages) dynamic, interactive elements
> - marimo is both a notebook and a library of UI elements
> - and AnyWidget makes it relatively easy to extend [GVW: point at [faw](https://github.com/gvwilson/faw)]
Suggested change:

```suggestion
- and [AnyWidget](https://anywidget.dev/) makes it relatively easy to extend [GVW: point at [faw](https://github.com/gvwilson/faw)]
```
I'd recommend hyperlinking here.
> @app.cell(hide_code=True)
> def _(mo):
>     mo.md(r"""
>     _Yes!_
When a cell below a chart lands on a conclusion ("Yes!" or something along those lines), a `mo.callout` makes that stand out better, imo. The callout docs list the available `kind` options ("info", "success", "warn", "danger").
On a similar note, the same goes for the attribution lines at the start of the notebooks: a `mo.callout(mo.md("..."), kind="info")` is something students will actually pause on.
>     step=1,
>     label="Random seed"
> )
> run_button = mo.ui.button(label="Run simulation")
The `run_button` here is a plain `mo.ui.button`; marimo has a dedicated `mo.ui.run_button` for exactly this. Worth either switching to that or removing the button entirely across the series, since the notebooks already re-run reactively when the other UI elements change (meaning the "Run simulation" button doesn't do anything, while playing with the other elements causes changes downstream anyway).
> app = marimo.App(width="medium")
>
> @app.cell(hide_code=True)
A hidden imports cell like this is worth having across the series.