Commit 2e28064

CU-8699my5eg Add install bundles to releases (#24)
* CU-8699my5eg: Add workflow jobs/steps to create release bundles
* CU-8699my5eg: Hopefully fix a path issue with release workflow
* CU-8699my5eg: Add sanity check integration tests to release bundling job
* CU-8699my5eg: Build wheel with lowest supported python version for backwards compatibility
* CU-8699my5eg: [TEMP/TEST/TO_REMOVE] Make workflow run on pull request
* CU-8699my5eg: [TEMP/TEST/TO_REMOVE] Fix/hardcode branch name
* CU-8699my5eg: Allow unsafe index strategy for python 3.9 and cpu-only torch bundle. Otherwise the dependencies are not able to be resolved. See comment in code for some more details
* CU-8699my5eg: Move to virtual environment when downloading wheels. uv pip does not support a download command (at least not yet) so that cannot be used. And uv python doesn't support using -m, so can't use that either. So now just creating the env and using that instead
* CU-8699my5eg: Make sure there's a PIP to play with during bundling
* CU-8699my5eg: Clear venv after usage
* CU-8699my5eg: Fix typo regarding venv path
* CU-8699my5eg: Fix usage of wrong extra parts or GPU-enabled bundle
* CU-8699my5eg: Allow only binaries
* CU-8699my5eg: Allow only binaries during compilation time
* CU-8699my5eg: Hopefully fix wheel artifact upload
* CU-8699my5eg: Add .tar.gz to uploaded wheel artifact
* CU-8699my5eg: Add list of downloaded artifacts as a step
* CU-8699my5eg: Update debug / ls output
* CU-8699my5eg: Update download artifact paths
* Revert "CU-8699my5eg: Update debug / ls output" This reverts commit 8c670ed.
* CU-8699my5eg: Make sure bundles get included in release. Previously the wheels download probably overwrote the bundles that were copied there.
* CU-8699my5eg: Move .tar.gz to dist as well
* CU-8699my5eg: Add debug output after moving release bundles
* CU-8699my5eg: Add debug output regarding all artifacts before moving release bundles
* CU-8699my5eg: Fix bundle upload path
* CU-8699my5eg: Remove GPU install bundle
* CU-8699my5eg: Include release version in bundle names
* CU-8699my5eg: Fix extraction of version tag
* CU-8699my5eg: Fix version tag in install bundle name
* CU-8699my5eg: Add release bundle README
* CU-8699my5eg: Add install bundle README to install bundles
* CU-8699my5eg: Rename release bundle readme to install bundle readme
* CU-8699my5eg: Rename release bundle readme to install bundle readme
* CU-8699my5eg: Add requirements file to install bundle
* Revert "CU-8699my5eg: [TEMP/TEST/TO_REMOVE] Fix/hardcode branch name" This reverts commit afb148f.
* Revert "CU-8699my5eg: [TEMP/TEST/TO_REMOVE] Make workflow run on pull request" This reverts commit b9c76af.
1 parent ffaf139 commit 2e28064

2 files changed: 229 additions and 6 deletions

.github/workflows/medcat-v2_release.yml

Lines changed: 150 additions & 6 deletions
@@ -14,14 +14,17 @@ defaults:
 
 jobs:
   build:
-    name: Build and release
+    name: Build medcat-v2 wheel
     runs-on: ubuntu-latest
-
+    outputs:
+      version_tag: ${{ steps.extract.outputs.version_tag }}
+      version_only: ${{ steps.extract.outputs.version_only }}
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
 
-      - name: Checkout release branch
+      - name: Extract version tag and checkout release branch
+        id: extract
         run: |
           # Fetch all branches to ensure we can access the one we need
           git fetch --all
@@ -31,28 +34,169 @@ jobs:
           # NOTE: branch name is in line with version tag, except for the patch version
           BRANCH_NAME="${VERSION_TAG%.*}" # This removes the patch version (everything after the second dot)
 
+          # set version tag as output for later use
+          echo "version_tag=$VERSION_TAG" >> $GITHUB_OUTPUT
+
+          # only the version (no medcat/v prefix)
+          VERSION_ONLY="${VERSION_TAG#medcat/v}"
+          echo "version_only=$VERSION_ONLY" >> $GITHUB_OUTPUT
+
           # Check out the corresponding release branch (e.g., medcat/v0.1)
           git checkout $BRANCH_NAME
 
           # Ensure the branch is up-to-date with the remote
           git pull origin $BRANCH_NAME
 
-      - name: Set up Python
+      # NOTE: building with the lowest python version supported by the package
+      - name: Set up Python 3.9
         uses: actions/setup-python@v5
         with:
-          python-version: "3.11"
+          python-version: "3.9"
 
       - name: Install build dependencies
         run: pip install --upgrade build
 
       - name: Build package
         run: python -m build
 
+      - name: Upload wheel artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: medcat-v2-wheel
+          path: |
+            medcat-v2/dist/*.whl
+            medcat-v2/dist/*.tar.gz
+
+  bundle:
+    name: Build install bundles
+    needs: build
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install uv
+        run: pip install uv
+
+      - name: Generate requirements and download (CPU)
+        run: |
+          if [[ "${{ matrix.python-version }}" == "3.9" ]]; then
+            echo "Running unsafe index strategy for Python 3.9 to avoid issues with torch / numpy compatibility"
+            # NOTE: for python 3.9 it will otherwise look for `numpy>2` in torch's index
+            #       but there's (as of writing on 2025-07-02) none there that support 3.9
+            #       (though there are ones that support 3.10+) and because of that this
+            #       step would fail without the unsafe index match
+            #       for some documentation on dependency confusion attacks, can reference:
+            #       https://docs.astral.sh/uv/reference/settings/#pip_index-strategy
+            uv pip compile pyproject.toml --only-binary=:all: \
+              --extra spacy --extra deid --extra meta-cat --extra rel-cat \
+              --extra-index-url https://download.pytorch.org/whl/cpu \
+              --index-strategy unsafe-best-match \
+              > req-cpu.txt
+          else
+            uv pip compile pyproject.toml --only-binary=:all: \
+              --extra spacy --extra deid --extra meta-cat --extra rel-cat \
+              --extra-index-url https://download.pytorch.org/whl/cpu \
+              > req-cpu.txt
+          fi
+          uv venv .venv
+          .venv/bin/python -m ensurepip
+          .venv/bin/python -m pip download --only-binary=:all: --dest bundle-cpu -r req-cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu
+
+      # - name: Generate requirements and download (GPU)
+      #   run: |
+      #     uv pip compile pyproject.toml --only-binary=:all: \
+      #       --extra spacy --extra deid --extra meta-cat --extra rel-cat \
+      #       > req-gpu.txt
+      #     .venv/bin/python -m pip download --only-binary=:all: --dest bundle-gpu -r req-gpu.txt
+
+      - name: Run sanity check / integration tests on cpu-only bundle
+        run: |
+          .venv/bin/python -m pip install --no-index --find-links=bundle-cpu -r req-cpu.txt
+          uv run bash tests/backwards_compatibility/run_current.sh
+
+      - name: Clear virtual environment
+        run: |
+          rm -rf .venv
+
+      - name: Add README to bundles
+        run: |
+          cp .release/install_bundle_readme.md bundle-cpu/README.md
+          cp req-cpu.txt bundle-cpu/requirements.txt
+          # cp .release/install_bundle_readme.md bundle-gpu/README.md
+          # cp req-gpu.txt bundle-gpu/requirements.txt
+
+      - name: Download built medcat wheel for inclusion in bundles
+        uses: actions/download-artifact@v4
+        with:
+          name: medcat-v2-wheel
+          path: medcat-v2/built-wheel
+
+      - name: List downloaded artifacts
+        run: ls -lh built-wheel
+
+      - name: Copy built wheel to CPU bundle
+        run: |
+          cp built-wheel/medcat*.whl bundle-cpu/.
+          # cp built-wheel/medcat*.whl bundle-gpu/.
+
+      - name: Archive CPU and GPU bundles
+        run: |
+          tar -czf medcat-v${{ needs.build.outputs.version_only }}-${{ matrix.python-version }}-cpu.tar.gz -C bundle-cpu .
+          # tar -czf medcat-v${{ needs.build.outputs.version_only }}-${{ matrix.python-version }}-gpu.tar.gz -C bundle-gpu .
+
+      - name: Upload bundles as artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: bundles-${{ matrix.python-version }}
+          path: |
+            medcat-v2/medcat-v${{ needs.build.outputs.version_only }}-${{ matrix.python-version }}-cpu.tar.gz
+            # medcat-v2/medcat-v${{ needs.build.outputs.version_only }}-${{ matrix.python-version }}-gpu.tar.gz
+
+  release:
+    name: Create GitHub Release
+    needs: [build, bundle]
+    runs-on: ubuntu-latest
+    steps:
+      - name: Download all artifacts
+        uses: actions/download-artifact@v4
+        with:
+          path: medcat-v2/artifacts
+
+      - name: Move all bundles to dist/
+        run: |
+          ls -l artifacts
+          mkdir -p dist
+          find artifacts -name '*.tar.gz' -exec mv {} dist/ \;
+          ls -l dist/
+
+      - name: Download built wheel
+        uses: actions/download-artifact@v4
+        with:
+          name: medcat-v2-wheel
+          path: medcat-v2/dist-wheel
+
+      - name: Move wheels to dist/
+        run: |
+          mv dist-wheel/*.whl dist/.
+          mv dist-wheel/*.tar.gz dist/.
+
+      - name: Show files in dist/ for sanity check
+        run: ls -l dist/
+
       - name: Create GitHub Release
         id: create_release
         uses: softprops/action-gh-release@v2
         with:
-          tag_name: ${{ github.ref_name }}
+          tag_name: ${{ needs.build.outputs.version_tag }}
           draft: true
           # softprops/action-gh-release v2 doesn't support the working-directory field, so put the path in files
           files: |
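
The `Extract version tag and checkout release branch` step derives both the release branch name and the bare version number from a single tag using shell parameter expansion. A minimal sketch of that behaviour, assuming a tag shaped like `medcat/v0.1.2` (an illustrative value; how `VERSION_TAG` is actually set falls in lines this diff does not show):

```shell
# Hypothetical tag value, chosen to match the "medcat/v0.1" branch example
# in the workflow comments.
VERSION_TAG="medcat/v0.1.2"

# "%.*" strips the shortest suffix that starts with a dot, i.e. the patch part.
BRANCH_NAME="${VERSION_TAG%.*}"

# "#medcat/v" strips the literal "medcat/v" prefix, leaving only the version.
VERSION_ONLY="${VERSION_TAG#medcat/v}"

echo "branch:  $BRANCH_NAME"
echo "version: $VERSION_ONLY"
```

With this tag the branch comes out as `medcat/v0.1` and the version as `0.1.2`; the latter is what ends up in the bundle file names via `needs.build.outputs.version_only`.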
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
# What are install bundles?

An install bundle (at least in this context) is a collection of the dependencies needed to use `medcat`.
This includes all direct and transitive dependencies as well as the `medcat` package itself.

The install bundle:
- Is a `.tar.gz`
- Has the naming scheme `medcat-v<MAJOR>.<MINOR>.<PATCH>-3.<PY_MINOR>-cpu.tar.gz`
  - The `<MAJOR>`, `<MINOR>`, and `<PATCH>` placeholders indicate the major, minor, and patch release numbers for `medcat`
  - The `<PY_MINOR>` placeholder indicates the minor version of the Python release it was built for
- It contains
  - A collection of `.whl` files
    - These are installation files for packages
    - There's one for `medcat` itself
    - And there's one for each direct and transitive dependency
  - A `requirements.txt` file specifying the requirements installed
  - This README file
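
The naming scheme above can be taken apart with plain shell parameter expansion, for example to check which Python version a downloaded bundle targets. This is only a sketch: the file name below is a made-up example that follows the scheme, and it assumes the `medcat` version itself contains no hyphen.

```shell
# Hypothetical bundle file name following the documented naming scheme.
BUNDLE="medcat-v2.0.0-3.11-cpu.tar.gz"

# Drop the fixed "-cpu.tar.gz" suffix: "medcat-v2.0.0-3.11"
STEM="${BUNDLE%-cpu.tar.gz}"

# Everything after the last hyphen is the Python version: "3.11"
PY_VERSION="${STEM##*-}"

# Everything before the last hyphen, minus the "medcat-v" prefix: "2.0.0"
MEDCAT_VERSION="${STEM%-*}"
MEDCAT_VERSION="${MEDCAT_VERSION#medcat-v}"

echo "medcat $MEDCAT_VERSION built for Python $PY_VERSION"
```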

# Who are install bundles for?

Most of the time, when installing Python packages, `pip` (or another similar tool) is used to install them.
It (generally) uses the Python Package Index ([PyPI](https://pypi.org)) to do those installs.
However, sometimes another index / mirror can be set up internally within an organisation instead.

An install bundle is designed to simplify the installation in air-gapped or semi air-gapped environments where:
- The installation environment does not have access to PyPI
- If there is an organisation-specific index / mirror, it does not include all the dependencies

# What are some other benefits of install bundles?

The main purpose is to help the people described in the section above.
However, there are a few other benefits:
- Using an install bundle provides a better guarantee of compatibility
  - Since we've done some (albeit limited) tests during release
  - There's a higher chance that the combination of dependencies just works
- Install bundles live forever (or at least as long as GitHub)
  - One can go back and install an older version of `medcat`
    - Even if some newer dependencies would be allowed by the requirements but have (retroactively) turned out to be incompatible
    - Even if/when some dependencies cease to exist on PyPI (are removed / deprecated)

# Who are install bundles NOT for?

Install bundles are not for:
- First-time users trying out `medcat`
  - You should use `pip install` (or similar) instead
- Users with full internet access
  - You should use `pip install` (or similar) instead
- Users building a service / docker image
  - Use other existing tooling

The main reason you would normally want to use existing tooling for installing `medcat` is so that it is compatible with the rest of your existing ecosystem.
If you rely too heavily on the install bundle, you might find yourself with incompatible dependencies.

# What install bundles do we provide as part of a release?

Currently we provide an install bundle for each supported Python version (3.9, 3.10, 3.11, and 3.12).
These target `x86_64` (think Intel and AMD CPUs) Linux (think Ubuntu, Debian) machines.
They **do not** provide GPU-enabled `torch` because the bundle would become too large to handle for a GitHub release if they did.
Users who need GPU-enabled `torch` will need to install it separately.

**The included release bundles are unlikely to work in other environments (e.g. on macOS, on Windows, or on an ARM-based CPU architecture).**
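
Whether a machine matches the platform described above can be checked before downloading anything. A rough sketch, assuming that `uname -s` reporting `Linux` and `uname -m` reporting `x86_64` is a good enough proxy for compatibility (the helper name `bundle_platform_ok` is hypothetical, and this checks neither the libc nor the Python version):

```shell
# Succeeds only for the platform the bundles are built on.
# $1: kernel name as printed by `uname -s`
# $2: machine type as printed by `uname -m`
bundle_platform_ok() {
    [ "$1" = "Linux" ] && [ "$2" = "x86_64" ]
}

if bundle_platform_ok "$(uname -s)" "$(uname -m)"; then
    echo "Platform matches the bundle target (x86_64 Linux)"
else
    echo "Platform differs from the bundle target; the bundle is unlikely to work"
fi
```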

# How to install an install bundle?

Once you've downloaded the install bundle on a computer with internet access, you need to:
- Move the archive (a `.tar.gz` file) to the target machine
- Unarchive using `tar -xvzf medcat-v2.*-cpu.tar.gz`
  - Probably best to specify your exact file path
  - This will extract the contents (the `.whl` files, `requirements.txt`, and this README) into the current folder
- Activate your virtual environment (`venv`, `conda`, etc.)
  - You generally don't want to install packages for your system `python`
- Install all the wheels
  - `pip install /path/to/unarchived/bundle/*.whl`
  - NOTE: If there are other `.whl` files in the folder, this will attempt to install those as well
- Now everything should work as expected
  - You can run this to verify:
    ```
    python -c "from medcat import __version__ as v;print(f'Installed medcat v{v}')"
    ```
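
The steps above can be sketched as two small shell functions. The function names and paths are placeholders, and passing `--no-index` (so that `pip` never tries to contact an index) is an addition of this sketch rather than something the README prescribes; it mirrors how the release workflow installs from the bundle directory during its sanity check.

```shell
# Extract a bundle archive into its own directory, so that unrelated
# .whl files lying around elsewhere are not picked up by the install step.
# $1: path to the bundle archive, $2: directory to extract into
extract_bundle() {
    mkdir -p "$2"
    tar -xzf "$1" -C "$2"
}

# Install every wheel from the extracted bundle without contacting any index.
# Run this with the target virtual environment already activated.
# $1: directory the bundle was extracted into
install_bundle() {
    python -m pip install --no-index --find-links "$1" "$1"/*.whl
}

# Usage (paths are placeholders):
#   extract_bundle /path/to/bundle-cpu.tar.gz ./medcat-bundle
#   install_bundle ./medcat-bundle
```

Extracting into a dedicated directory sidesteps the caveat in the NOTE above, since the `*.whl` glob then only matches files that came from the bundle.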
