Conversation

@andreatgretel
Contributor

  • a script docs/scripts/generate_colab_notebooks.py that injects new cells and converts .py notebooks (without executing)
    • new cells are for 1) downloading deps and 2) setting up NVIDIA_API_KEY
  • a new make command, generate-colab-notebooks
  • a CI check that, whenever a .py notebook has changed, verifies that the command above was run
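
The CI check in the last bullet can be pictured as a regenerate-and-compare step. The sketch below is only an illustration under stated assumptions: `find_stale_outputs`, the `render` callable, and the directory layout are hypothetical stand-ins for whatever the real script and workflow do, and it compares plain text files rather than parsed notebooks.

```python
from pathlib import Path
from typing import Callable


def find_stale_outputs(
    source_dir: Path,
    output_dir: Path,
    render: Callable[[str], str],
) -> list[str]:
    """Return names of generated files that are missing or out of date.

    `render` is a hypothetical stand-in for the real conversion step: it takes
    the text of a .py notebook and returns the text of the generated notebook.
    """
    stale = []
    for source in sorted(source_dir.glob("*.py")):
        expected = render(source.read_text(encoding="utf-8"))
        target = output_dir / f"{source.stem}.ipynb"
        # A target is stale if it was never generated or no longer matches.
        if not target.exists() or target.read_text(encoding="utf-8") != expected:
            stale.append(target.name)
    return stale
```

In a CI job, a non-empty result would fail the run and point the author back to make generate-colab-notebooks.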

}

COLAB_INSTALL_CELL = """\
# Install data-designer and dependencies
Contributor

@johnnygreco johnnygreco Dec 11, 2025

This is one of those "parrot comments", I think. Maybe add this to the markdown note that appears above this cell?

Something like:
"Run the cells below to install Data Designer and set up the environment for Google Colab."

Also, a couple questions:

  • Do we want to use -q or put %%capture at the top of the cell? We used the latter at Gretel, but if -q does the same, that's cool.
  • Should we pin the version on the install?

Contributor Author

@andreatgretel andreatgretel Dec 12, 2025

Changed!

Re questions

  • I like -q because if there are issues they still show up, but we could use %%capture, don't have a strong opinion.
  • I've added a snippet that uses >= the current version, see what you think. I'm using whatever data_designer.__version__ outputs but doing -1 on the minor, since I think this is always a dev version? I.e. we won't do a PR on a tagged commit, we tag for releasing after the PR 🤔
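
For reference, the ">= current version with -1 on the minor" idea can be sketched roughly as below. This is a guess at the shape, not the snippet from the PR: the function names are hypothetical, the package name data-designer is taken from the install cell comment, and the use of `!pip install` is an assumption.

```python
import re


def minimum_version_spec(package: str, current_version: str) -> str:
    """Build a `package>=X.Y.0` requirement one minor release below `current_version`."""
    match = re.match(r"(\d+)\.(\d+)", current_version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {current_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: a dev build of X.Y means the latest published release is X.(Y-1).
    # Clamping at 0 is a simplification; it does not step back across a major bump.
    floor_minor = max(minor - 1, 0)
    return f"{package}>={major}.{floor_minor}.0"


def install_cell_source(package: str, current_version: str) -> str:
    """Return the source of a quiet install cell for the given package."""
    spec = minimum_version_spec(package, current_version)
    # Quote the spec so the shell does not treat `>` as an output redirect.
    return f'!pip install -q "{spec}"'
```

With `-q`, pip still prints errors, which is the behavior preferred in the reply above; `%%capture` would hide them along with everything else.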

Comment on lines 36 to 42
COLAB_API_KEY_CELL = """\
# Set up NVIDIA API key from Colab secrets
import os

from google.colab import userdata

os.environ["NVIDIA_API_KEY"] = userdata.get("NVIDIA_API_KEY")"""
Contributor

Should we do this or use getpass and ask for their API key? @kirit93, thoughts on what's the preferred flow?

Contributor Author

I added a try/except: it looks for the key in userdata first and, if it isn't there, prompts for it.
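
A minimal sketch of that secrets-first, prompt-second flow, assuming nothing about the merged code: `resolve_api_key` and `colab_secret_lookup` are hypothetical names, and the lookup is passed in as a parameter so the logic can also be exercised outside Colab.

```python
from getpass import getpass
from typing import Callable, Optional


def resolve_api_key(
    name: str,
    lookup: Callable[[str], Optional[str]],
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the key from `lookup`, or ask the user when it is unavailable."""
    try:
        value = lookup(name)
    except Exception:
        # Catching broadly is a simplification: a missing secret, denied
        # notebook access, or running outside Colab all lead to the prompt.
        value = None
    if not value:
        value = prompt(f"Enter {name}: ")
    return value


def colab_secret_lookup(name: str) -> Optional[str]:
    """Read a secret from Colab; the import only succeeds inside Colab."""
    from google.colab import userdata

    return userdata.get(name)
```

In a Colab cell this would be used as os.environ["NVIDIA_API_KEY"] = resolve_api_key("NVIDIA_API_KEY", colab_secret_lookup), after importing os.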



def find_import_section_index(cells: list[NotebookNode]) -> int:
    """Find the index of the 'Import the essentials' markdown cell."""
Contributor

@johnnygreco johnnygreco Dec 11, 2025

Does this mean we always have to have this markdown to detect the section? Wondering if from data_designer.essentials import is a more future-proof check?

Contributor Author

We can do that, but then there will likely be one or more markdown cells before it, and it's hard to know how many. We could assume it's one, but I don't know if that will always be the case.

Maybe right after the 1st markdown cell? And then we make sure that the 1st one is always title plus some description.
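
The two anchoring options discussed here can be compared with a small sketch. It assumes each cell is a mapping with "cell_type" and "source" keys and that "source" is a single string; both function names are hypothetical and this is not the code from the PR.

```python
from typing import Optional

IMPORT_MARKER = "from data_designer.essentials import"


def index_after_first_markdown(cells: list[dict]) -> int:
    """Insertion point right after the first markdown cell (title and description)."""
    for index, cell in enumerate(cells):
        if cell["cell_type"] == "markdown":
            return index + 1
    # No markdown cell at all: fall back to the top of the notebook.
    return 0


def index_of_essentials_import(cells: list[dict]) -> Optional[int]:
    """Index of the first code cell that imports from data_designer.essentials."""
    for index, cell in enumerate(cells):
        if cell["cell_type"] == "code" and IMPORT_MARKER in cell["source"]:
            return index
    return None
```

When one or more markdown cells sit between the title and the import cell, the two functions return different indices, which is the ambiguity raised in this thread.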

Contributor

@johnnygreco johnnygreco left a comment

This is really neat @andreatgretel – thank you!

Just had a few questions to think through.

Contributor

@johnnygreco johnnygreco left a comment

🛸

@andreatgretel andreatgretel force-pushed the andreatgretel/colab-notebooks branch from 701f304 to e94a5a8 Compare December 12, 2025 18:12
@andreatgretel andreatgretel merged commit 7fa9a41 into main Dec 12, 2025
14 checks passed
@andreatgretel andreatgretel deleted the andreatgretel/colab-notebooks branch December 12, 2025 18:15
3 participants