Open
32 commits
c00cfa2
chore: update lxml version
mziv Oct 13, 2025
35ea6ad
less restrictive dep update
mziv Oct 13, 2025
456f9e3
fix: reprocess cached html with crawler run config
anna-xing Nov 19, 2025
4412df1
cleanup
anna-xing Nov 19, 2025
1b99071
early return
anna-xing Nov 19, 2025
60cf0e3
restructure
anna-xing Nov 19, 2025
d6064f3
Merge pull request #1 from CoProcure/anna/sc-31444/postprocess-cached…
anna-xing Nov 19, 2025
0bd2915
fix: handle cases when redirected_url is none
ghmeier Nov 19, 2025
54132a3
Merge pull request #2 from CoProcure/ghmeier/fix-non-redirect
ghmeier Nov 19, 2025
8a847ac
fix: make base directory env variable work
anna-xing Nov 20, 2025
1a6fe72
clean up imports
anna-xing Nov 20, 2025
6cba694
cleanup
anna-xing Nov 20, 2025
62e6f39
Merge pull request #3 from CoProcure/anna/sc-31444/custom-base-dir
anna-xing Nov 20, 2025
c4b0bc4
fix: normalize url and make tests runnable
ghmeier Nov 20, 2025
064a356
fix: correct url parsing for images and test
ghmeier Nov 20, 2025
e2f21c9
chore: a letter
ghmeier Nov 20, 2025
8fae6ff
chore: add ruff
ghmeier Nov 20, 2025
20c6b18
Merge pull request #4 from CoProcure/ghmeier/fix-base-url
ghmeier Nov 20, 2025
4fa609a
chore: update comment about cache_mode default
anna-xing Nov 21, 2025
6bd611b
Merge pull request #5 from CoProcure/anna/cache-mode-comment
anna-xing Nov 21, 2025
6dfa25f
feat: use CacheClient for caching crawl results
anna-xing Nov 24, 2025
c0b66d1
fix circular imports
anna-xing Nov 24, 2025
2c650b3
Merge pull request #6 from CoProcure/anna/sc-31491/abstract-cache-client
anna-xing Nov 24, 2025
9647f09
chore: update tests for robots parser
anna-xing Nov 25, 2025
d96f8b4
further consolidation of test files
anna-xing Nov 25, 2025
8bccabe
Merge pull request #7 from CoProcure/anna/robots-parser-caching-test
anna-xing Nov 25, 2025
d98bd6d
feat: use CacheClient for URL seeder
anna-xing Nov 25, 2025
07301de
Merge pull request #8 from CoProcure/anna/sc-31491/cache-url-seeder
anna-xing Nov 25, 2025
5b85912
chore: re-raise run_urls exception (#9)
anna-xing Dec 3, 2025
069a910
chore: lower default TTL to 2 hours (#10)
anna-xing Dec 11, 2025
4921901
chore: split out unvalidated tests from validated tests (#11)
anna-xing Jan 2, 2026
02ee303
chore: enable CI test check, update PR template (#12)
anna-xing Jan 6, 2026
7 changes: 0 additions & 7 deletions .github/FUNDING.yml

This file was deleted.

20 changes: 5 additions & 15 deletions .github/pull_request_template.md
@@ -1,19 +1,9 @@
-## Summary
-Please include a summary of the change and/or which issues are fixed.
+[sc-XStoryNumberXX](https://app.shortcut.com/coprocure/story/XStoryNumberXX/helpful-shortcut-but-still-need-tag-above-for-integration)

-eg: `Fixes #123` (Tag GitHub issue numbers in this format, so it automatically links the issues with your PR)
+### Summary

-## List of files changed and why
-eg: quickstart.py - To update the example as per new changes
+XX Replace me - What is the goal of this PR? What changes were made? XX

-## How Has This Been Tested?
-Please describe the tests that you ran to verify your changes.
+### Testing

-## Checklist:
-
-- [ ] My code follows the style guidelines of this project
-- [ ] I have performed a self-review of my own code
-- [ ] I have commented my code, particularly in hard-to-understand areas
-- [ ] I have made corresponding changes to the documentation
-- [ ] I have added/updated unit tests that prove my fix is effective or that my feature works
-- [ ] New and existing unit tests pass locally with my changes
+XX Replace me - Automated tests? Manual testing? XX
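The new template depends on authors replacing the `XX Replace me ... XX` lines and the `sc-XStoryNumberXX` story tag. As a rough illustration only, a check for leftover placeholders could look like the sketch below. The helper name `find_unfilled_placeholders` and the patterns are assumptions made for this example; nothing like it is part of this PR or of the CI workflow it adds.

```python
import re

# Hypothetical helper, not part of this PR: flags template placeholders
# that an author forgot to replace in a pull request description.
PLACEHOLDER_PATTERNS = [
    re.compile(r"XX Replace me\b.*?XX"),  # e.g. "XX Replace me - ... XX"
    re.compile(r"sc-XStoryNumberXX"),     # untouched Shortcut story tag
]


def find_unfilled_placeholders(pr_body: str) -> list[str]:
    """Return every leftover template placeholder found in pr_body."""
    found = []
    for line in pr_body.splitlines():
        for pattern in PLACEHOLDER_PATTERNS:
            match = pattern.search(line)
            if match:
                found.append(match.group(0))
    return found
```

An empty list means the description no longer contains any of the template's placeholder text.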
46 changes: 0 additions & 46 deletions .github/workflows/main.yml

This file was deleted.

63 changes: 63 additions & 0 deletions .github/workflows/pr-ci.yaml
@@ -0,0 +1,63 @@
# Run this workflow when a pull request is created or activity
# happens on the PR target branch.
name: PR CI

on: [pull_request]

env:
  PYTHON_VERSION: 3.13.10

# Cancel jobs in progress when a new reference is pushed.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  # Cancel in progress CI for non-renovate PRs.
  cancel-in-progress: ${{ github.actor != 'renovate[bot]' }}

jobs:
  # Lint step disabled until we format all files.
  # lint:
  #   runs-on: blacksmith-2vcpu-ubuntu-2404

  #   steps:
  #     - name: PR Checkout
  #       uses: actions/checkout@v6
  #       with:
  #         fetch-depth: 0
  #     - name: Lint Python Files
  #       uses: astral-sh/ruff-action@v3
  #       with:
  #         version: "0.14.10" # keep in sync with pyproject.toml
  #     - name: Format Python Files
  #       run: ruff format --check

  python_test:
    runs-on: blacksmith-2vcpu-ubuntu-2404
    env:
      # Allow the Python interpreter version to be newer than PyO3's maximum supported version.
      PYO3_USE_ABI3_FORWARD_COMPATIBILITY: 1

    steps:
      - name: PR Checkout
        uses: actions/checkout@v6
      - name: Install Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install \
            libxml2-dev \
            libxslt1-dev \
            libjpeg-dev \
            libgeos-dev
      - name: Setup uv
        uses: astral-sh/setup-uv@v7
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          activate-environment: true
          enable-cache: true
          cache-local-path: ".cache/.uv-cache"
      - name: Install Playwright
        run: uv run playwright install --with-deps chromium
      - name: Python Tests
        run: uv run pytest tests/
        timeout-minutes: 10
      - name: Minimize uv cache
        run: uv cache prune --ci
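
The workflow's `concurrency` block groups runs by workflow name and ref, and cancels an in-progress run unless the actor is `renovate[bot]`. The sketch below models those two expressions in Python to make the behavior concrete. It is a simplified illustration, not GitHub's implementation: `resolve_concurrency` and `ConcurrencySettings` are names invented for this example, and the ref `refs/pull/12/merge` in the usage note is only a sample value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcurrencySettings:
    group: str
    cancel_in_progress: bool


def resolve_concurrency(workflow: str, ref: str, actor: str) -> ConcurrencySettings:
    """Mirror the two `concurrency` expressions from pr-ci.yaml for one run.

    Illustrative model only: GitHub evaluates these expressions itself.
    """
    # group: ${{ github.workflow }}-${{ github.ref }}
    group = f"{workflow}-{ref}"
    # cancel-in-progress: ${{ github.actor != 'renovate[bot]' }}
    # GitHub expressions ignore case when comparing strings, hence lower().
    cancel = actor.lower() != "renovate[bot]"
    return ConcurrencySettings(group=group, cancel_in_progress=cancel)
```

Under this model, two pushes to the same pull request (for example ref `refs/pull/12/merge`) resolve to the same group, so the earlier run is cancelled when a contributor pushes again, while runs triggered by Renovate are left to finish.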
142 changes: 0 additions & 142 deletions .github/workflows/release.yml

This file was deleted.
