Skip to content

Commit 4322b3e

Browse files
committed
feat(fetch): add distill parameter for token optimization
Add distill parameter to aggressively clean HTML before processing: - Remove scripts, styles, navigation, headers, footers - Remove ads, sidebars, popups, cookie banners - Remove social widgets and non-content elements - Normalize whitespace Typical token reduction: 60-85% This is an opt-in feature (distill=false by default) to maintain backward compatibility. Removes security-related code that belongs in a separate PR.
1 parent 3f4ae21 commit 4322b3e

File tree

6 files changed

+128
-1093
lines changed

6 files changed

+128
-1093
lines changed

src/fetch/pyproject.toml

Lines changed: 1 addition & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,6 @@ classifiers = [
1616
"Programming Language :: Python :: 3.10",
1717
]
1818
dependencies = [
19-
"beautifulsoup4>=4.12.0",
2019
"httpx<0.28",
2120
"markdownify>=0.13.1",
2221
"mcp>=1.1.3",
@@ -34,13 +33,4 @@ requires = ["hatchling"]
3433
build-backend = "hatchling.build"
3534

3635
[tool.uv]
37-
dev-dependencies = [
38-
"pyright>=1.1.389",
39-
"ruff>=0.7.3",
40-
"pytest>=7.0.0",
41-
"pytest-asyncio>=0.21.0",
42-
]
43-
44-
[tool.pytest.ini_options]
45-
asyncio_mode = "auto"
46-
testpaths = ["tests"]
36+
dev-dependencies = ["pyright>=1.1.389", "ruff>=0.7.3"]

0 commit comments

Comments
 (0)