Commit 29cf863

Build: Skip checking Twitter links in the hydra crawler (#1231)
Twitter pages now 302-redirect to themselves for users who don't have a specific cookie set, which trips the crawler. Avoid checking Twitter links by abusing the crawler's `exclude_scheme_prefixes` option. Since the project only accepts options in the form of a configuration file, we also need to clone the API repo to provide such a file.
1 parent 0cb2745 commit 29cf863

2 files changed, +16 -2 lines changed

.github/configs/hydra-config.json

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+  "exclude_scheme_prefixes": [
+    "https://twitter.com/"
+  ]
+}
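
The commit message describes the exclusion as prefix-based, so a minimal sketch of how such a check could behave is given here. This is not hydra-link-checker's actual code; `should_check` and `filter_links` are hypothetical names used only for illustration, and the only value taken from the commit is the `https://twitter.com/` prefix.

```python
from typing import Iterable, List

# Mirrors the single value added in hydra-config.json.
EXCLUDE_PREFIXES = ["https://twitter.com/"]


def should_check(url: str, exclude_prefixes: Iterable[str]) -> bool:
    """Return False when the URL starts with any excluded prefix (hypothetical helper)."""
    # str.startswith accepts a tuple and matches if any element is a prefix.
    return not url.startswith(tuple(exclude_prefixes))


def filter_links(urls: Iterable[str], exclude_prefixes: Iterable[str]) -> List[str]:
    """Keep only the URLs that the crawler would still request."""
    prefixes = tuple(exclude_prefixes)
    return [u for u in urls if should_check(u, prefixes)]
```

Under this assumption the match is purely textual, so variants such as `http://twitter.com/` or `https://www.twitter.com/` would not be skipped unless they were listed as well.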

.github/workflows/spider-check.yaml

Lines changed: 11 additions & 2 deletions
@@ -20,9 +20,18 @@ jobs:
     runs-on: ubuntu-latest
     if: ${{ github.repository_owner == 'jquery' }} # skip on forks
     steps:
-      - uses: actions/checkout@v2
+      - name: Checkout hydra-link-checker
+        uses: actions/checkout@v3
         with:
           repository: jquery/hydra-link-checker
           ref: v2.0.0
+          path: hydra
+
+      # Checkout the API repo as well to provide the config for hydra-link-checker
+      - name: Checkout API repo
+        uses: actions/checkout@v3
+        with:
+          path: api
+
       - name: Run hydra-link-checker
-        run: python3 hydra.py "$MY_SITE"
+        run: python3 hydra/hydra.py "$MY_SITE" --config api/.github/configs/hydra-config.json
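
Because the workflow now passes the options through a file, a rough sketch of how a script might accept a `--config` path and overlay the JSON onto its defaults is given here. It is an assumption-laden illustration and not the crawler's real implementation: `DEFAULTS`, `load_config`, and `parse_args` are hypothetical, and the real tool's defaults and merge behaviour may differ.

```python
import argparse
import json
from typing import List, Optional

# Hypothetical defaults; the real crawler's defaults are not shown in this commit.
DEFAULTS = {"exclude_scheme_prefixes": []}


def load_config(path: Optional[str] = None) -> dict:
    """Return the defaults, overridden by any keys found in the JSON file at `path`."""
    config = dict(DEFAULTS)
    if path:
        with open(path, encoding="utf-8") as fh:
            config.update(json.load(fh))
    return config


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Parse a positional site URL and an optional --config path."""
    parser = argparse.ArgumentParser()
    parser.add_argument("url")
    parser.add_argument("--config", default=None)
    return parser.parse_args(argv)
```

With both checkouts placed in sibling directories (`hydra` and `api`), the relative path `api/.github/configs/hydra-config.json` resolves from the workspace root where the `run` step executes.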
