Commit 556d179

Restore prek cache in a more robust way (apache#56796)
Apparently the prek cache mechanism has been somewhat broken for a while, ever since we split prek to monorepo. The file hash used to build the prek cache key was different in the save and restore steps (the `**/` prefix was missing in the save-cache step), which means that we always failed to restore the cache and created it from scratch.

Also, it seems that the prek cache, when prepared, refers to the uv version that is pre-installed for it in case uv is not installed in the system, and it refers to that uv version when creating the virtual environments used by prek. We first attempted to install prek and create the cache, and only afterwards installed uv, which had the side effect that in some cases the installed venvs referred to a missing python binary.

Finally, there is a bug in prek (j178/prek#918): the pygrep cache contains a reference to a non-existing python binary that should be run when pygrep runs.

It is also possible that some of the cache installed in the workspace by the GitHub worker remained, and we did not preemptively clean the cache when we attempted to restore it and failed.

This PR attempts to restore the cache usage in a more robust way:

* fixed the cache key on save, so that the cache is saved under the proper name
* added the uv version to the cache key for prek
* always install uv in the desired version before installing prek
* if we fail to get a cache hit and restore the cache, we clean up the .cache/prek folder
* we do not look at skipped hooks when installing prek and restoring or saving the cache. There is very little saving on some hooks, and since we are preparing the cache in "build-info" now, it's better to always use the same cache, no matter if some checks are skipped
* upgraded to prek 0.2.10, which fixed the issue with the pygrep cache
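The key mismatch described in the commit message can be illustrated with a small Python sketch. It assumes `hashFiles()` behaves roughly as GitHub documents it (a SHA-256 per matched file, then a SHA-256 over those digests). The helper names `hash_files`, `cache_key`, and `make_demo_repo` are hypothetical, and the two-file layout is invented for the demo; this is not the code the workflow runs.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_files(root: Path, pattern: str) -> str:
    """Approximate hashFiles(): hash each match, then hash the digests."""
    matches = sorted(p for p in root.glob(pattern) if p.is_file())
    if not matches:
        return ""
    combined = hashlib.sha256()
    for path in matches:
        combined.update(hashlib.sha256(path.read_bytes()).digest())
    return combined.hexdigest()


def cache_key(root: Path, pattern: str, python: str = "3.10", uv: str = "0.9.3") -> str:
    """Build a key shaped like the cache-prek-v8 key used in this commit."""
    return f"cache-prek-v8-linux/amd64-python{python}-uv{uv}-{hash_files(root, pattern)}"


def make_demo_repo(root: Path) -> None:
    """Create a root config plus one nested config (an assumed monorepo layout)."""
    (root / ".pre-commit-config.yaml").write_text("repos: []\n")
    (root / "dev").mkdir()
    (root / "dev" / ".pre-commit-config.yaml").write_text("repos: [manual]\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        make_demo_repo(repo)
        restore_key = cache_key(repo, "**/.pre-commit-config.yaml")
        old_save_key = cache_key(repo, ".pre-commit-config.yaml")
        new_save_key = cache_key(repo, "**/.pre-commit-config.yaml")
        print("old save key matches restore key:", old_save_key == restore_key)
        print("new save key matches restore key:", new_save_key == restore_key)
```

With more than one config file in the tree, the non-recursive pattern hashes a different set of files than the recursive one, so a key saved with one pattern can never be found by a restore step using the other.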
1 parent 6d977a9 commit 556d179

16 files changed: 81 additions and 151 deletions

.github/actions/breeze/action.yml

Lines changed: 8 additions & 0 deletions
@@ -22,6 +22,9 @@ inputs:
   python-version:
     description: 'Python version to use'
     default: "3.10"
+  uv-version:
+    description: 'uv version to use'
+    default: "0.9.3" # Keep this comment to allow automatic replacement of uv version
 outputs:
   host-python-version:
     description: Python version used in host
@@ -33,6 +36,11 @@ runs:
       uses: actions/setup-python@v5
       with:
         python-version: ${{ inputs.python-version }}
+    - name: "Install uv"
+      shell: bash
+      run: curl -LsSf https://astral.sh/uv/${UV_VERSION}/install.sh | sh
+      env:
+        UV_VERSION: ${{ inputs.uv-version }}
     # NOTE! Installing Breeze without using cache is FASTER than when using cache - uv is so fast and has
     # so low overhead, that just running upload cache/restore cache is slower than installing it from scratch
     - name: "Install Breeze"
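The `# Keep this comment to allow automatic replacement of uv version` marker on the `default:` line suggests that some tool rewrites the pinned version by looking for that comment. The commit does not show that tool, so the following is only a guess at how such a replacement could work; `bump_marked_version` and `MARKER` are hypothetical names.

```python
import re

# Assumed marker text, copied from the default: line in the diff.
MARKER = "# Keep this comment to allow automatic replacement of uv version"


def bump_marked_version(text: str, new_version: str, marker: str = MARKER) -> str:
    """Replace the quoted version on every line that carries the marker comment."""
    # Match a double-quoted value followed (on the same line) by the marker.
    pattern = re.compile(r'"[^"\n]*"([ \t]*' + re.escape(marker) + ")")
    return pattern.sub(lambda m: f'"{new_version}"{m.group(1)}', text)


if __name__ == "__main__":
    sample = (
        "  uv-version:\n"
        "    description: 'uv version to use'\n"
        f'    default: "0.9.3" {MARKER}\n'
    )
    print(bump_marked_version(sample, "0.9.4"))
```

Keying the replacement on the comment rather than on the input name means lines pinning other tools (for example the prek version, which carries its own marker) are left untouched.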

.github/actions/install-pre-commit/action.yml

Lines changed: 0 additions & 88 deletions
This file was deleted.

.github/actions/install-prek/action.yml

Lines changed: 29 additions & 13 deletions
@@ -27,10 +27,7 @@ inputs:
     default: "0.9.3" # Keep this comment to allow automatic replacement of uv version
   prek-version:
     description: 'prek version to use'
-    default: "0.2.9" # Keep this comment to allow automatic replacement of prek version
-  skip-prek-hooks:
-    description: "Skip some prek hooks from installation"
-    default: ""
+    default: "0.2.10" # Keep this comment to allow automatic replacement of prek version
   save-cache:
     description: "Whether to save prek cache"
     required: true
@@ -40,14 +37,17 @@ inputs:
 runs:
   using: "composite"
   steps:
-    - name: Install uv and prek
+    - name: "Install uv"
+      shell: bash
+      run: curl -LsSf https://astral.sh/uv/${UV_VERSION}/install.sh | sh
+      env:
+        UV_VERSION: ${{ inputs.uv-version }}
+    - name: Install prek
       shell: bash
       env:
-        UV_VERSION: ${{inputs.uv-version}}
         PREK_VERSION: ${{inputs.prek-version}}
-        SKIP: ${{ inputs.skip-prek-hooks }}
+        UV_VERSION: ${{ inputs.uv-version }}
       run: |
-        curl -LsSf https://astral.sh/uv/${UV_VERSION}/install.sh | sh
         uv tool install prek==${PREK_VERSION} --with uv==${UV_VERSION}
       working-directory: ${{ github.workspace }}
     # We need to use tar file with archive to restore all the permissions and symlinks
@@ -64,7 +64,7 @@ runs:
       uses: apache/infrastructure-actions/stash/restore@1c35b5ccf8fba5d4c3fdf25a045ca91aa0cbc468
       with:
         # yamllint disable rule:line-length
-        key: cache-prek-v6-${{ inputs.platform }}-${{ inputs.python-version }}-${{inputs.skip-prek-hooks}}-${{ hashFiles('**/.pre-commit-config.yaml') }}
+        key: cache-prek-v8-${{ inputs.platform }}-python${{ inputs.python-version }}-uv${{ inputs.uv-version }}-${{ hashFiles('**/.pre-commit-config.yaml') }}
         path: /tmp/
       id: restore-prek-cache
     - name: "Check if prek cache tarball exists"
@@ -88,12 +88,28 @@ runs:
         echo
       shell: bash
       if: steps.restore-prek-cache.outputs.stash-hit == 'true'
+    - name: "Prepare local venv for pygrep"
+      # Prek cache restore seems to have a bug where removed temporary python is
+      # used in cached pygrep installation. It seems prek can fallback to uv sync installed python
+      # So let's install it. See https://github.com/j178/prek/issues/918:
+      run: |
+        uv sync --no-dev --no-install-workspace
+      shell: bash
+      if: steps.restore-prek-cache.outputs.stash-hit == 'true'
+    - name: "Make sure cache is cleared on cache miss"
+      run: |
+        echo "Cleaning up prek cache in case of cache miss (in case of pre-installed-cache from the system)"
+        ls -la ~/.cache/prek || true
+        rm -rf ~/.cache/prek
+      shell: bash
+      if: steps.restore-prek-cache.outputs.stash-hit != 'true'
     - name: Install prek hooks
       shell: bash
-      run: prek install-hooks || (cat ~/.cache/prek/prek.log && exit 1)
+      run: prek install-hooks
       working-directory: ${{ github.workspace }}
-      env:
-        SKIP: ${{ inputs.skip-prek-hooks }}
+    - name: "Show prek log"
+      shell: bash
+      run: cat ~/.cache/prek/prek.log || true
     - name: "Prepare .tar file from prek cache"
       run: |
         tar -C ~ -czf /tmp/cache-prek.tar.gz .cache/prek
@@ -103,7 +119,7 @@ runs:
       uses: apache/infrastructure-actions/stash/save@1c35b5ccf8fba5d4c3fdf25a045ca91aa0cbc468
       with:
         # yamllint disable rule:line-length
-        key: cache-prek-v6-${{ inputs.platform }}-${{ inputs.python-version }}-${{ inputs.skip-prek-hooks }}-${{ hashFiles('.pre-commit-config.yaml') }}
+        key: cache-prek-v8-${{ inputs.platform }}-python${{ inputs.python-version }}-uv${{ inputs.uv-version }}-${{ hashFiles('**/.pre-commit-config.yaml') }}
         path: /tmp/cache-prek.tar.gz
         if-no-files-found: 'error'
         retention-days: '2'
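The restore-or-clean behaviour added to this action (unpack the tarball on a stash hit, remove any leftover `~/.cache/prek` on a miss) can be sketched in Python as follows. This is an illustration only: the real action shells out to `tar` and `rm -rf`, and the function names `save_cache` and `restore_or_clean` are hypothetical.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def save_cache(home: Path, tarball: Path) -> None:
    """Rough equivalent of: tar -C ~ -czf /tmp/cache-prek.tar.gz .cache/prek"""
    with tarfile.open(tarball, "w:gz") as archive:
        archive.add(home / ".cache" / "prek", arcname=".cache/prek")


def restore_or_clean(home: Path, tarball: Path, cache_hit: bool) -> str:
    """Unpack the cache on a hit; on a miss, wipe whatever is already there."""
    cache_dir = home / ".cache" / "prek"
    if cache_hit and tarball.is_file():
        with tarfile.open(tarball, "r:gz") as archive:
            archive.extractall(home)
        return "restored"
    # A miss must not reuse a cache left behind by the runner image or an earlier job.
    shutil.rmtree(cache_dir, ignore_errors=True)
    return "cleaned"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "home"
        (home / ".cache" / "prek").mkdir(parents=True)
        (home / ".cache" / "prek" / "stale.txt").write_text("left over")
        tarball = Path(tmp) / "cache-prek.tar.gz"
        print(restore_or_clean(home, tarball, cache_hit=False))
```

Either branch leaves the directory in a known state before `prek install-hooks` runs, which is the point of the "Make sure cache is cleared on cache miss" step.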

.github/workflows/basic-tests.yml

Lines changed: 4 additions & 6 deletions
@@ -100,8 +100,6 @@ jobs:
       matrix:
         shared-distribution: ${{ fromJSON(inputs.shared-distributions-as-json) }}
     runs-on: ${{ fromJSON(inputs.runners) }}
-    env:
-      UV_VERSION: ${{inputs.uv-version}}
     steps:
       - name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
@@ -110,6 +108,8 @@ jobs:
           persist-credentials: false
       - name: "Install uv"
         run: curl -LsSf https://astral.sh/uv/${UV_VERSION}/install.sh | sh
+        env:
+          UV_VERSION: ${{ inputs.uv-version }}
       - name: "Run shared ${{ matrix.shared-distribution }} tests"
         run: uv run --group dev pytest --color=yes -n auto
         working-directory: shared/${{ matrix.shared-distribution }}
@@ -193,7 +193,6 @@ jobs:
         id: prek
         with:
           python-version: ${{steps.breeze.outputs.host-python-version}}
-          skip-prek-hooks: ${{ inputs.skip-prek-hooks }}
           platform: ${{ inputs.platform }}
           save-cache: false
       - name: "Check translation completeness"
@@ -229,7 +228,6 @@ jobs:
         id: prek
         with:
           python-version: ${{ steps.breeze.outputs.host-python-version }}
-          skip-prek-hooks: ${{ inputs.skip-prek-hooks }}
           platform: ${{ inputs.platform }}
           save-cache: false
       - name: Fetch incoming commit ${{ github.sha }} with its parent
@@ -282,7 +280,6 @@ jobs:
         id: prek
         with:
           python-version: ${{ steps.breeze.outputs.host-python-version }}
-          skip-prek-hooks: ${{ inputs.skip-prek-hooks }}
           platform: ${{ inputs.platform }}
           save-cache: false
       - name: "Autoupdate all prek hooks"
@@ -404,7 +401,6 @@ jobs:
     name: "Test Airflow standalone commands"
     runs-on: ${{ fromJSON(inputs.runners) }}
     env:
-      UV_VERSION: ${{inputs.uv-version}}
       AIRFLOW_HOME: ~/airflow
       FORCE_COLOR: 1
     steps:
@@ -414,6 +410,8 @@ jobs:
           persist-credentials: false
       - name: "Install uv"
         run: curl -LsSf https://astral.sh/uv/${UV_VERSION}/install.sh | sh
+        env:
+          UV_VERSION: ${{ inputs.uv-version }}
       - name: "Set up Airflow home directory"
         run: |
           echo "Setting AIRFLOW_HOME to $AIRFLOW_HOME"

.github/workflows/ci-amd.yml

Lines changed: 0 additions & 2 deletions
@@ -170,7 +170,6 @@ jobs:
         id: prek
         with:
           python-version: ${{ steps.breeze.outputs.host-python-version }}
-          skip-prek-hooks: ${{ needs.build-info.outputs.skip-prek-hooks }}
           platform: "linux/amd64"
           save-cache: true
   run-pin-versions-hook:
@@ -190,7 +189,6 @@ jobs:
           python-version: "3.11"
           platform: "linux/amd64"
           save-cache: true
-          skip-prek-hooks: ""
       - name: "Run pin-versions"
         run: >
           prek -c dev/.pre-commit-config.yaml --all-files --verbose --hook-stage manual

.github/workflows/ci-arm.yml

Lines changed: 0 additions & 1 deletion
@@ -162,7 +162,6 @@ jobs:
         id: prek
         with:
           python-version: ${{ steps.breeze.outputs.host-python-version }}
-          skip-prek-hooks: ${{ needs.build-info.outputs.skip-prek-hooks }}
           platform: "linux/arm64"
           save-cache: true
   basic-tests:

.github/workflows/ci-image-checks.yml

Lines changed: 4 additions & 4 deletions
@@ -155,17 +155,17 @@ jobs:
           platform: ${{ inputs.platform }}
           save-cache: false
       - name: "Static checks"
-        # We have added cache cleaning here as otherwise it was failing on rst-backticks hook
-        # This increases the time of this step from ~9 minutes 25 seconds to ~12 minutes 37 seconds
-        # If we want to remove the first part, we need to find root cause of the problem
-        run: prek cache clean && prek --all-files --show-diff-on-failure --color always
+        run: prek --all-files --show-diff-on-failure --color always
         env:
           VERBOSE: "false"
           SKIP: ${{ inputs.skip-prek-hooks }}
           COLUMNS: "202"
           SKIP_GROUP_OUTPUT: "true"
           DEFAULT_BRANCH: ${{ inputs.branch }}
           RUFF_FORMAT: "github"
+      - name: "Show prek log on failure"
+        run: cat ~/.cache/prek/prek.log || true
+        if: failure()
 
   mypy:
     timeout-minutes: 45

.github/workflows/release_dockerhub_image.yml

Lines changed: 6 additions & 6 deletions
@@ -87,19 +87,19 @@ jobs:
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
         with:
           persist-credentials: false
-      - name: "Install uv"
-        run: curl -LsSf https://astral.sh/uv/${UV_VERSION}/install.sh | sh
-      - name: "Check airflow version"
-        id: check-airflow-version
-        shell: bash
-        run: uv run scripts/ci/airflow_version_check.py "${AIRFLOW_VERSION}" >> "${GITHUB_OUTPUT}"
       - name: "Install Breeze"
         uses: ./.github/actions/breeze
+        with:
+          uv-version: ${{ env.UV_VERSION }}
       - name: Selective checks
         id: selective-checks
         env:
           VERBOSE: "false"
         run: breeze ci selective-check 2>> ${GITHUB_OUTPUT}
+      - name: "Check airflow version"
+        id: check-airflow-version
+        shell: bash
+        run: uv run scripts/ci/airflow_version_check.py "${AIRFLOW_VERSION}" >> "${GITHUB_OUTPUT}"
       - name: "Determine build matrix"
         shell: bash
         id: determine-matrix

Dockerfile.ci

Lines changed: 1 addition & 1 deletion
@@ -1678,7 +1678,7 @@ COPY --from=scripts common.sh install_packaging_tools.sh install_additional_depe
 ARG AIRFLOW_PIP_VERSION=25.2
 # ARG AIRFLOW_PIP_VERSION="git+https://github.com/pypa/pip.git@main"
 ARG AIRFLOW_UV_VERSION=0.9.3
-ARG AIRFLOW_PREK_VERSION="0.2.9"
+ARG AIRFLOW_PREK_VERSION="0.2.10"
 
 # UV_LINK_MODE=copy is needed since we are using cache mounted from the host
 ENV AIRFLOW_PIP_VERSION=${AIRFLOW_PIP_VERSION} \

dev/breeze/README.md

Lines changed: 1 addition & 1 deletion
@@ -135,6 +135,6 @@ PLEASE DO NOT MODIFY THE HASH BELOW! IT IS AUTOMATICALLY UPDATED BY PREK.
 
 ---------------------------------------------------------------------------------------------------------
 
-Package config hash: eeecbf0df75ded82558188c801a214f66c89da758f4bf58fd6381234bcd222fe9b53154e10331c6975f5521377d93f8f3a8f39e3d76e6874510df80511096a00
+Package config hash: 809191a413ca650c171f5c64561742bf391cf170207308be2605c063afd115f5ae5ece39da6f1d193a990009bcb1b54abb8598f17dae9733cedfd9baa8aa6fbb
 
 ---------------------------------------------------------------------------------------------------------
