Please refer to the v1.0.1 documentation;
the code for v1.0 is identical to the code for v1.0.1.
See https://docs.cleanlab.ai/ if you want to browse the documentation (including for past versions).
In the cleanlab repository, we've configured GitHub Actions to perform the following automatically:
- When a commit is pushed to the `master` branch, a new version of the `master` docs will be built and deployed to the `cleanlab-docs` repository.
- When a release is published, a new version of the docs with the corresponding release tag will be built and deployed as a new folder in the `cleanlab-docs` repository. Redirection to the `stable` version of the docs will be changed to this newly released one, accessible via a link on the docs site's sidebar. All the older versions will remain available in the `cleanlab-docs` repo, accessible by manually entering the subdirectory in the URL.
- When a user manually runs the workflow, one of the above will happen depending on whether the user chose to run it from a `branch` or a `tag`.
If you'd like to build our docs locally or remotely yourself, or want to know more about the steps taken in the GitHub Pages workflow, read on!
pip install -r docs/requirements.txt
- Install Pandoc.
- If you don't already have it, install wget. On macOS, this can be done with `brew`: `brew install wget`
- [Optional] Create a new branch, make your code changes, and then `git commit` them. ONLY COMMITTED CHANGES WILL BE REFLECTED IN THE DOCS BUILD WITH `sphinx-multiversion`. Use `sphinx-build` instead if you want to see the docs for test changes that you don't want to commit.
- Build the docs with either of the following:
  - `sphinx-multiversion`, which only builds committed changes:
    - If you're building from a branch (usually the `master` branch):
      `sphinx-multiversion docs/source cleanlab-docs -D smv_branch_whitelist=YOUR_BRANCH_NAME -D smv_tag_whitelist=None`
    - If you're building from a tag (usually the tag of the stable release):
      `sphinx-multiversion docs/source cleanlab-docs -D smv_branch_whitelist=None -D smv_tag_whitelist=YOUR_TAG_NAME`

    Note: To also build docs for another branch or tag, run the above command again, changing only the `YOUR_BRANCH_NAME` or `YOUR_TAG_NAME` placeholder.
  - `sphinx-build`, if you want to test some changes without committing them. This builds from your current working directory tree (where any uncommitted changes are saved locally):
    `sphinx-build docs/source cleanlab-docs`
    This won't properly produce or display other versions of the docs, but that shouldn't matter if you are just testing local edits to the current version. If some notebooks are giving you trouble (e.g., due to runtime or dependencies), you can simply delete those `.ipynb` files before calling `sphinx-build`.

  Fast build: Executing the Jupyter notebooks (i.e., the `.ipynb` files) that make up part of the docs, such as the tutorials, takes a long time. To skip rendering them, set the environment variable `SKIP_NOTEBOOKS=1`, either with `export SKIP_NOTEBOOKS=1` or inline with `SKIP_NOTEBOOKS=1 sphinx-multiversion ...`.

  Skipping specific notebooks: To skip rendering a few specific notebooks during your local build, temporarily move the files outside the `cleanlab` folder (so that `nbsphinx` does not find them), build the docs, and then move the files back (so that they are not deleted when you push to GitHub).

  Example workflow for skipping notebooks, assuming the current working directory is the `cleanlab` root folder and we want to ignore the `audio.ipynb` notebook:
    - Create an empty folder outside of the `cleanlab` folder:
      `mkdir ../ignore_notebooks`
    - Move the notebook to ignore into the newly created folder:
      `mv docs/source/tutorials/audio.ipynb ../ignore_notebooks`
    - Build the docs locally, using `sphinx-build` as it does not require you to commit your changes:
      `sphinx-build docs/source cleanlab-docs`
    - Move the notebook back to its original location:
      `mv ../ignore_notebooks/audio.ipynb docs/source/tutorials`

  While building the docs with `sphinx-multiversion`, your terminal might output `unknown config value 'smv_branch_whitelist' in override, ignoring` and `unknown config value 'smv_tag_whitelist' in override, ignoring`. This is because the `smv_branch_whitelist` and `smv_tag_whitelist` config values are only used by `sphinx-multiversion`, but may also be checked by `sphinx` or other extensions that do not use them. These messages can be safely ignored as long as the docs are built correctly.
- [Optional] To show dynamic versioning and version warning banners:
  - Copy the `docs/_templates/versioning.js` file to the `cleanlab-docs/` directory.
  - In the copied `versioning.js` file:
    - find `placeholder_version_number` and replace it with the latest release tag name, and
    - find `placeholder_commit_hash` and replace it with the `master` branch commit hash.
- [Optional] To redirect site visits from `/` or `/stable` to the stable version of the docs:
  - Create a copy of the `docs/_templates/redirect-to-stable.html` file and rename it to `index.html`.
  - In this `index.html` file, find `stable_url` and replace it with `/cleanlab-docs/YOUR_LATEST_RELEASE_TAG_NAME/index.html`.
  - Copy this `index.html` to:
    - `cleanlab-docs/`, and
    - `cleanlab-docs/stable/`.
- The docs for each branch and/or tag can be found in the `cleanlab-docs/` directory. Open any of the `index.html` files in your browser to view the docs:

      cleanlab-docs
      |   index.html (redirects to stable release of the docs)
      |   versioning.js (for dynamic versioning and version warning banner)
      |
      └───YOUR_BRANCH_NAME (e.g. master)
      │       index.html
      │       ...
      │
      └───YOUR_TAG_NAME_1 (e.g. your stable release tag name)
      │       index.html
      │       ...
      │
      └───YOUR_TAG_NAME_2 (e.g. an old release tag name)
      │       index.html
      │       ...
      │
      └───stable
      │       index.html (redirects to stable release of the docs)
      │
      └───...

  Note: If you're building the docs from a working directory tree, the docs will be found at the top of the `cleanlab-docs/` directory:

      cleanlab-docs
      |   index.html (docs for the working directory tree)
      |   ...
      |
      └───...

  This may overwrite some of the files in `cleanlab-docs/`, like `index.html` from the previous step.
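The notebook-skipping workflow from the build step above (move the notebook out, build, move it back) can also be automated so that the notebook is restored even when the build fails. The following is a minimal Python sketch, not part of the cleanlab tooling: `notebooks_set_aside` and `build_docs_without` are hypothetical helper names, and the only external command involved is the `sphinx-build docs/source cleanlab-docs` call quoted above.

```python
import shutil
import subprocess
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def notebooks_set_aside(notebooks, holding_dir):
    """Temporarily move notebooks into holding_dir and restore them on exit."""
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    moved = []  # (original_path, temporary_path) pairs
    try:
        for nb in map(Path, notebooks):
            temp = holding / nb.name
            shutil.move(str(nb), str(temp))
            moved.append((nb, temp))
        yield [temp for _, temp in moved]
    finally:
        # Always move the notebooks back, even if the build raised an error.
        for original, temp in moved:
            shutil.move(str(temp), str(original))


def build_docs_without(notebooks, holding_dir="../ignore_notebooks"):
    """Run sphinx-build from the cleanlab root while the notebooks are set aside."""
    with notebooks_set_aside(notebooks, holding_dir):
        subprocess.run(["sphinx-build", "docs/source", "cleanlab-docs"], check=True)
```

Calling `build_docs_without(["docs/source/tutorials/audio.ipynb"])` from the `cleanlab` root folder would correspond to the four manual commands above. The sketch assumes the notebooks being set aside have distinct file names.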
- Fork the `cleanlab` repository.
- Create a new repository named `cleanlab-docs` and a new branch named `master`.
- In the `cleanlab-docs` repo, configure GitHub Pages: under the Source section, select the `master` branch and the `/(root)` folder. Take note of the URL where your site is published.
- Generate an SSH deploy key pair and add the keys to your repos as follows:
  - In the `cleanlab-docs` repo, go to Settings > Deploy Keys > Add deploy key and add your public key with Allow write access enabled.
  - In the `cleanlab` repo, go to Settings > Secrets > New repository secret and add your private key as a secret named `ACTIONS_DEPLOY_KEY`.
- In the `cleanlab` repo, check that the GitHub Pages workflow appears under the repo's Actions tab. It should be created automatically from `.github/workflows/gh-pages.yaml`. This workflow can be activated by any of the 3 triggers below:
  - A push to the `master` branch in the `cleanlab` repo.
  - Publication of a new release in the `cleanlab` repo.
  - A manual run from the Run workflow option, selecting either the `master` branch or one of the release tags.
-
Activate the workflow with any of the 3 triggers listed above and wait for it to complete.
-
Navigate to the URL where your GitHub Pages site is published in step 3. The default URL should have the format https://repository_owner.github.io/cleanlab-docs/.
GitHub Actions automatically builds and deploys the docs' build artifacts when triggered. If you delete and recreate a release tag, the docs for this tag will be rebuilt and redeployed, hence overwriting the existing artifacts with the new ones.
On rare occasions, you may want to update the docs without deleting and recreating the release tag, for example, when you want to fix a typo in the docs but have already deployed your tag to PyPI or Conda. This can be done by manually adding specific docs build artifacts to the cleanlab/cleanlab-docs repo. These steps are for users who have push permission to both the cleanlab/cleanlab and cleanlab/cleanlab-docs repos.
- If you haven't already done so, clone the `cleanlab/cleanlab` repo.
- Make the necessary code changes.
- Run `git add` and `git commit` for the changes.
- `git push` to the `cleanlab/cleanlab` repo. As this is pushed from a non-`master` branch, GitHub Actions will only build, but not deploy, the docs' build artifacts.
- Navigate to github.com/cleanlab/cleanlab in your browser, select the "Actions" tab, click "GitHub Pages" under "Workflow", then select the workflow run that was triggered by the previous step.
- Ensure that the workflow has finished running.
- Scroll to the bottom of the page and, under "Artifacts", click "docs-html" to download the docs' build artifacts.
- Unzip "docs-html.zip" and open the "docs-html" folder.
- Identify the files you would like to replace, i.e., the files that generate the corresponding pages on docs.cleanlab.ai.
- Replace these files in github.com/cleanlab/cleanlab-docs by uploading the new ones to the corresponding version folder in the `master` branch of the `cleanlab/cleanlab-docs` repo.
⚠️ Any build artifacts manually added to `cleanlab/cleanlab-docs` that do not also live in the `master` branch of the `cleanlab/cleanlab` repo will be lost in future versions of the cleanlab docs. So any edit made in the v2.0.0 docs that you also want in the v2.0.1, v2.0.2, etc. docs needs to be introduced as a PR to the `cleanlab/cleanlab` repo as well.
⚠️ Currently, if you update a stable or old version (say `vXXX`) of the tutorials from the latest `master` branch version, the cleanlab package install in the notebooks/Colabs will be wrong. To remedy this, update the cleanlab version in all `.ipynb` files inside the folders `cleanlab-docs/vXXX/tutorials/` and `cleanlab-docs/vXXX/_sources/`. The tutorial `.html` pages will also have wrong Colab links. For now, you have to update the `.html` files in `cleanlab-docs/vXXX/tutorials/` to replace these Colab links with the proper ones (replace `/master/` in the link with `/vXXX/` for the version you are building docs for).
We've configured GitHub Actions to run the GitHub Pages workflow (gh-pages.yaml) to build and deploy our docs' static files. Here's a breakdown of what this workflow does in the background:
- Spin up an Ubuntu server.
- Install Pandoc, a document converter required by `nbsphinx` to generate static sites from notebooks (`.ipynb`).
- Check out the `cleanlab` repository.
- Set up Python and cache dependencies.
- Install dependencies for the docs from `docs/requirements.txt`.
- Run Sphinx with the `sphinx-multiversion` wrapper to build the docs' static site files. These files are output to the `cleanlab-docs/` directory.
- Get the latest release tag name and insert it in the `versioning.js` file. The `index.html` of each docs version reads this as a variable and displays it beside the stable hyperlink.
- Insert the latest commit hash in the `versioning.js` file. The `index.html` of each docs version reads this as a variable and displays it beside the developer hyperlink.
- Copy the `versioning.js` file to the `cleanlab-docs/` folder.
- If the workflow is triggered by a new release, generate the redirecting HTML that sends site visits to the stable version:
  - Insert the relative path to the stable docs in the `redirect-to-stable.html` file (the redirecting HTML).
  - Copy the `redirect-to-stable.html` file to `cleanlab-docs/index.html` and `cleanlab-docs/stable/index.html`.
- Deploy the `cleanlab-docs/` folder to the `master` branch of the `cleanlab/cleanlab-docs` repo.
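The `versioning.js` and redirect steps in this breakdown amount to plain text substitutions, and the same substitutions apply when you carry out the optional versioning and redirect steps by hand for a local build. The sketch below illustrates them in Python. It is not the code that `gh-pages.yaml` actually runs: the function names are hypothetical, and it assumes the placeholder strings `placeholder_version_number`, `placeholder_commit_hash`, and `stable_url` appear literally in the template files.

```python
from pathlib import Path
from typing import List


def write_versioning_js(template: Path, out_dir: Path, latest_tag: str, commit_hash: str) -> Path:
    """Copy versioning.js into out_dir with both placeholders filled in."""
    text = template.read_text(encoding="utf-8")
    text = text.replace("placeholder_version_number", latest_tag)
    text = text.replace("placeholder_commit_hash", commit_hash)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "versioning.js"
    target.write_text(text, encoding="utf-8")
    return target


def write_stable_redirects(template: Path, out_dir: Path, latest_tag: str) -> List[Path]:
    """Write the redirect page to out_dir/index.html and out_dir/stable/index.html."""
    stable_url = f"/cleanlab-docs/{latest_tag}/index.html"
    html = template.read_text(encoding="utf-8").replace("stable_url", stable_url)
    targets = [out_dir / "index.html", out_dir / "stable" / "index.html"]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
    return targets
```

For a local build, the templates would be `docs/_templates/versioning.js` and `docs/_templates/redirect-to-stable.html`, with `cleanlab-docs` as `out_dir`. The tag name and commit hash are values you supply yourself.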
Each tutorial is a Jupyter notebook (unexecuted .ipynb file) that will be executed during CI for the version displayed at docs.cleanlab.ai using nbsphinx. Some basic linting is also applied to ensure proper notebook formatting such as no trailing newlines at the end of cells. Here are some tips when adding a new tutorial notebook:
- Make sure to clear all cell outputs before you `git commit` a tutorial. Cell outputs should never be tracked in git; they are generated automatically for display on docs.cleanlab.ai during CI, which executes all notebooks in the `docs/source/` folder.
- For cells containing code that should not be executed during CI, make sure the cell type is Markdown and use proper syntax to make the contents look like code.
- To hide certain Jupyter cells from the docs.cleanlab.ai web version of a tutorial, add the following to the cell's metadata:
"metadata": {
"nbsphinx": "hidden"
}
This includes cells that install dependencies and cells that run tests to verify the notebook has executed correctly. These cells will still be visible when the notebook is run in Colab or locally in Jupyter, so make sure to add a comment explaining their purpose at the top.
- If you develop a notebook in a virtualenv, make sure that, when you are done, the end of the raw `.ipynb` file contains the following:
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
}
instead of naming your own virtualenv. CI will FAIL if your own virtualenv is listed here!
- When adding dependencies to a tutorial:
  - Make sure to update `docs/requirements.txt`, which lists all extra dependencies installed during CI to build the docs.
  - Add a comment in a hidden cell (not displayed on docs.cleanlab.ai) stating which versions of the dependencies you used.
  - Think carefully about whether each dependency is really necessary and whether its future versions will be stable and compatible with future versions of existing dependencies.
- Don't forget to update `docs/source/index.rst` with a short title, as well as `docs/source/tutorials/index.rst`, to ensure your tutorial is properly linked. Otherwise it will not appear on docs.cleanlab.ai!
- Ask yourself:
  - How can I make this tutorial run faster without sacrificing educational value? Perhaps use a smaller subsample of the dataset, a smaller or pretrained model, etc.
  - What sections of this tutorial are least vital? Consider creating a separate Examples notebook that features those.

  All of our tutorials are quickstart guides that should run quite fast. Longer or more comprehensive notebooks are better added in Examples.
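Several of the tips above (clearing outputs, hiding cells, resetting the kernel metadata) are edits to the notebook's JSON, so they can be applied in one pass before committing. The following is a minimal sketch using only the Python standard library. `prepare_notebook` and `prepare_notebook_file` are hypothetical helpers, not part of cleanlab, and the sketch does not replace the linting that CI applies.

```python
import json
from pathlib import Path

# Kernel metadata quoted in the tip above.
DEFAULT_KERNELSPEC = {
    "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3",
}


def prepare_notebook(notebook: dict, hidden_cells=()) -> dict:
    """Clean a notebook dict in place for committing and return it."""
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []  # outputs should never be tracked in git
            cell["execution_count"] = None
        if index in hidden_cells:
            # Hide this cell from the rendered docs via nbsphinx cell metadata.
            cell.setdefault("metadata", {})["nbsphinx"] = "hidden"
    # Replace any virtualenv-specific kernel with the default one.
    notebook.setdefault("metadata", {})["kernelspec"] = dict(DEFAULT_KERNELSPEC)
    return notebook


def prepare_notebook_file(path, hidden_cells=()) -> None:
    """Load an .ipynb file, clean it, and write it back."""
    path = Path(path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    prepare_notebook(notebook, hidden_cells)
    # indent=1 is an assumption chosen to keep diffs small; adjust if your editor differs.
    path.write_text(json.dumps(notebook, indent=1, ensure_ascii=False) + "\n", encoding="utf-8")
```

Here `hidden_cells` is a collection of zero-based cell indices, for example `prepare_notebook_file("docs/source/tutorials/audio.ipynb", hidden_cells=[0])` to hide a leading dependency-install cell.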
- Verify that your new docstrings adhere to our documentation format guidelines.
- To ensure documentation for new source code files is linked from the main page, don't forget to update `docs/source/index.rst`.