Open
Changes from 1 commit
Commits
33 commits
c00cfa2
chore: update lxml version
mziv Oct 13, 2025
35ea6ad
less restrictive dep update
mziv Oct 13, 2025
456f9e3
fix: reprocess cached html with crawler run config
anna-xing Nov 19, 2025
4412df1
cleanup
anna-xing Nov 19, 2025
1b99071
early return
anna-xing Nov 19, 2025
60cf0e3
restructure
anna-xing Nov 19, 2025
d6064f3
Merge pull request #1 from CoProcure/anna/sc-31444/postprocess-cached…
anna-xing Nov 19, 2025
0bd2915
fix: handle cases when redirected_url is none
ghmeier Nov 19, 2025
54132a3
Merge pull request #2 from CoProcure/ghmeier/fix-non-redirect
ghmeier Nov 19, 2025
8a847ac
fix: make base directory env variable work
anna-xing Nov 20, 2025
1a6fe72
clean up imports
anna-xing Nov 20, 2025
6cba694
cleanup
anna-xing Nov 20, 2025
62e6f39
Merge pull request #3 from CoProcure/anna/sc-31444/custom-base-dir
anna-xing Nov 20, 2025
c4b0bc4
fix: normalize url and make tests runnable
ghmeier Nov 20, 2025
064a356
fix: correct url parsing for images and test
ghmeier Nov 20, 2025
e2f21c9
chore: a letter
ghmeier Nov 20, 2025
8fae6ff
chore: add ruff
ghmeier Nov 20, 2025
20c6b18
Merge pull request #4 from CoProcure/ghmeier/fix-base-url
ghmeier Nov 20, 2025
4fa609a
chore: update comment about cache_mode default
anna-xing Nov 21, 2025
6bd611b
Merge pull request #5 from CoProcure/anna/cache-mode-comment
anna-xing Nov 21, 2025
6dfa25f
feat: use CacheClient for caching crawl results
anna-xing Nov 24, 2025
c0b66d1
fix circular imports
anna-xing Nov 24, 2025
2c650b3
Merge pull request #6 from CoProcure/anna/sc-31491/abstract-cache-client
anna-xing Nov 24, 2025
9647f09
chore: update tests for robots parser
anna-xing Nov 25, 2025
d96f8b4
further consolidation of test files
anna-xing Nov 25, 2025
8bccabe
Merge pull request #7 from CoProcure/anna/robots-parser-caching-test
anna-xing Nov 25, 2025
d98bd6d
feat: use CacheClient for URL seeder
anna-xing Nov 25, 2025
07301de
Merge pull request #8 from CoProcure/anna/sc-31491/cache-url-seeder
anna-xing Nov 25, 2025
5b85912
chore: re-raise run_urls exception (#9)
anna-xing Dec 3, 2025
069a910
chore: lower default TTL to 2 hours (#10)
anna-xing Dec 11, 2025
4921901
chore: split out unvalidated tests from validated tests (#11)
anna-xing Jan 2, 2026
02ee303
chore: enable CI test check, update PR template (#12)
anna-xing Jan 6, 2026
2fb7288
chore: bump numpy (#13)
anna-xing Jan 20, 2026
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -17,7 +17,7 @@ dependencies = [
 "aiohttp>=3.11.11",
 "aiosqlite~=0.20",
 "anyio>=4.0.0",
-"lxml~=5.3",
+"lxml==6.0",
 "litellm>=1.53.1",
 "numpy>=1.26.0,<3",
 "pillow>=10.4",
@@ -45,7 +45,7 @@ dependencies = [
 "humanize>=4.10.0",
 "lark>=1.2.2",
 "alphashape>=1.3.1",
-"shapely>=2.0.0"
+"shapely>=2.0.0",
 ]
 classifiers = [
 "Development Status :: 4 - Beta",
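For reviewers weighing the lxml change: the old specifier `~=5.3` is a compatible-release constraint (any 5.x release at or above 5.3), while the new `==6.0` is an exact pin. The sketch below illustrates that difference for plain dotted release numbers only. The helper names (`parse_release`, `matches_compatible`, `matches_exact`) are hypothetical and are not part of this repository or of any packaging tool; real tooling should rely on an established library such as `packaging` rather than this simplified logic.

```python
def parse_release(version: str) -> tuple:
    """Parse a plain dotted release such as '5.3.1' into a tuple of ints.

    Assumption: no pre-release, post-release, or local version suffixes.
    """
    return tuple(int(part) for part in version.split("."))


def strip_trailing_zeros(release: tuple) -> tuple:
    """Drop trailing zero components so that 6.0 and 6.0.0 compare equal."""
    parts = list(release)
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def matches_compatible(version: str, spec: str) -> bool:
    """Approximate '~=spec': version >= spec and same leading components."""
    v, s = parse_release(version), parse_release(spec)
    prefix = s[:-1]  # for '5.3' the required prefix is (5,)
    return v[: len(prefix)] == prefix and v >= s


def matches_exact(version: str, spec: str) -> bool:
    """Approximate '==spec' with zero padding, for plain releases only."""
    return strip_trailing_zeros(parse_release(version)) == strip_trailing_zeros(
        parse_release(spec)
    )


if __name__ == "__main__":
    for candidate in ["5.3", "5.4.0", "6.0", "6.0.0", "6.0.1"]:
        print(
            candidate,
            "~=5.3:", matches_compatible(candidate, "5.3"),
            "==6.0:", matches_exact(candidate, "6.0"),
        )
```

Under these simplified rules, the exact pin rejects later patch releases such as 6.0.1, which is worth keeping in mind alongside the follow-up commit titled "less restrictive dep update".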