First, clone the GitHub repository to download the latest development code:
git clone https://github.com/ImperialCollegeLondon/safedata_validator.gitThe package makes use of the python dependency manager
poetry, so you will also need to install poetry:
curl -sSL https://install.python-poetry.org | python3 -Once those two steps are complete, you can run a single command from the package root to install the package and all dependencies:
poetry installThis installed package is 'editable', i.e. it changes with the current state of the repo
directory. The package is installed as part of a virtual environment, so can only be
used when the relevant environment is active. This environment can be activated with a
single poetry command.
poetry shellTesting for this package makes use of the pytest framework. When new functions are
added to this package unit tests for them must also be added. These operate as a check
that functions still operate in the manner they were originally designed to. Either all
unit tests can be run locally.
pytestOr a specific testing file can be run.
pytest test/test_specific_module.pyThese unit tests are also run as part of our continuous integration workflow, which runs whenever commits are made to this repository and for all pull requests.
All new package releases should be from the main branch, so the changes to develop
have to be moved here. This is achieved using a release branch.
git branch release/x.y.z
git switch release/x.y.zThe version of this branch should be updated to a pre-release version using poetry.
There are multiple options here: premajor should be used for major versions (e.g.
2.0.0), preminor for minor (e.g. 2.1.0), and prepatch for patch versions (e.g.
2.1.1).
poetry version [premajor/preminor/prepatch]The change this causes to pyproject.toml should be committed, and the branch is now
ready to be pushed to the remote repo to ensure that other developers have access to it.
git push --set-upstream origin release/x.y.zThis branch ultimately needs to be merged into main (and back into develop), so once
the release branch is on GitHub, a pull request should be made against main from
the release branch. This will cause the GitHub continuous integration tests and
documentation building to run, validating the release branch. Other developers can also
look at the PR to review the changes prior to release, and to make commits to the branch
to update the PR or fix issues.
Once everything seems to be running smoothly, the next steps are to make sure that the website builds correctly on Read The Docs (details below) and then that the package publication process works correctly.
ReadTheDocs needs to be updated to build the release/x.y.z branch. The branch
needs to be 'Activated' from the Versions tab on the RTD project admin site - it should
be 'Active' but also 'Hidden'.
The package can be published to the Test PyPi site (see below for details) using:
poetry build
poetry publish -r test-pypiOnce the relevant changes have been made and checks have been performed the final commit
should bump the version using poetry so that the correct version is recorded in
pyproject.toml.
poetry version [major/minor/patch]Alternatively, edit pyproject.toml by hand if the release tag to be used is unusual in
some way (e.g. x.y.zrc1). Commit this last change!
The release branch now is ready to be merged with the main branch and the pull
request should be accepted and merged online. A tag should be added marking the package
version, and then the updated main branch should be pushed to the remote repository.
git tag x.y.z
git push origin x.y.zA PR should also be created online to merge any changes added to the release branch
back into develop.
Note that the main branch should only be used for new releases.
It can often be the case that a package that appears to build fine locally has errors
that prevent it from uploading properly to PyPi. By first uploading to test PyPi
site these kind of errors can be caught without clogging up the
real PyPi site with broken packages. Upload of new package versions occurs via poetry.
This means that poetry must be configured to have access to the test PyPi. To do this,
test PyPi must be added as a repository, and a valid API access token must be associated
with the repository.
poetry config repositories.testpypi https://test.pypi.org/legacy/
poetry config pypi-token.testpypi my_test_api_tokenThis token should be a personal API token for PyPi, these can be generated through
your test PyPi account. We are now setup to
publish to test PyPi, but before the package is published it must first be built. This
also done using poetry.
poetry buildThis produces both a sdist source distribution, and a wheel compiled package. These
can then be published to test PyPi.
poetry publish -r testpypiIt is important to note that a test upload should always be done before the package
is uploaded to PyPi. You should perform a test upload when the release branch
otherwise ready to merge to main. If the release branch changes after this point,
use poetry version prerelease to increment the version and run another test upload to
confirm that the final version can be uploaded cleanly.
The package documentation is maintained using MkDocs
and deployed to
https://safedata-validator.readthedocs.io/.
MkDocs is installed automatically to the poetry virtual environment.
In order to build and deploy the documentation.
- Edit the source files in the
docsfolder. - Some of the documentation presents the command line help for the script tools
in the package. To keep these synchronized with the codebase, the
docs/command_line_usagedirectory contains a shell script that saves these outputs to file, so they can be included in the documentation. If the script commands are updated, these inputs need to be recreated. - From the package root, run
mkdocs build. This will create the docs site in thesitefolder - note that this folder is not included in the git repo. - Changes made to the
mainbranch of the repository will automatically trigger a rebuild of the package documentation. - To build the documentation for specific branches you need to login to Read the Docs. You can then build whichever branch you require.
Once a release branch has passed all the tests and been merged into main, it
should be published to PyPi. This allows users to install the new
package version via pip. As with test PyPi, publication is handled by the poetry
package manager. PyPi is automatically configured as the default upload repository, so
in this case you only need to add an API token to the poetry configuration for PyPi.
poetry config pypi-token.pypi my_api_tokenAs with the token for test PyPi, this token should be a personal API token, these
can be generated through your PyPi account. We use
personal tokens rather than a project specific token as the standard setup method
with poetry only allows one PyPi token to be saved. If necessary this issue can be
circumvented by adding each project as a new repository
see (with PyPi
remaining the repository published to) and then configuring this duplicate repository to
use the project specific token. We are not using this approach at present as we feel it
introduces unnecessary complexity.
It should be noted that your personal token will only allow you to publish new package versions if you are a maintainer. If you wish to upload a new package version you should therefore contact the current maintainers to request maintainer status.
Once poetry has been setup to allow publication of safedata_validator to PyPi, the
new package version can then be published from the main branch:
git switch main
poetry build
poetry publish
git switch develop