web2PDFbook

🧠 What it does

web2PDFbook crawls a website and compiles its pages into a single PDF. It is useful for archiving or offline reading.

⚙️ How to install

Install from PyPI:

pip install web2pdfbook

To test a pre-release from TestPyPI:

pip install -i https://test.pypi.org/simple web2pdfbook

If you cloned the repository and want to run web2pdfbook from the checkout, install the dependencies first:

pip install -r requirements.txt

🔄 How it works

Link crawling – crawler.extract_links() retrieves all internal HTML links starting from the base URL.
PDF rendering – renderer.render_to_pdf() uses Playwright to save each page as a PDF.
Merging – merger.merge_documents() merges the PDFs into a single document.
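The link-crawling step can be illustrated with a small standard-library sketch. This is not the code behind crawler.extract_links(); the helper name internal_links and the same-host rule for "internal" are assumptions made for illustration. It collects anchor hrefs, resolves them against the base URL, drops fragments, and keeps only http(s) links on the same host.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _AnchorCollector(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(base_url, html):
    """Hypothetical helper: same-host links in html, absolute, de-duplicated, in document order."""
    collector = _AnchorCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links, then strip any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        same_host = parsed.scheme in ("http", "https") and parsed.netloc == base_host
        if same_host and absolute not in links:
            links.append(absolute)
    return links


sample = """
<a href="/docs/intro.html">Intro</a>
<a href="guide.html#setup">Guide</a>
<a href="https://other.example.org/">External</a>
<a href="mailto:me@example.com">Mail</a>
<a href="/docs/intro.html">Intro again</a>
"""
print(internal_links("https://example.com/docs/", sample))
```

A real crawler would repeat this for every discovered page and keep a visited set; the sketch covers only the per-page filtering.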

Generate a book via the CLI:

web2pdfbook --help
web2pdfbook https://example.com output.pdf --timeout 20000 --use-index

--timeout – render timeout in milliseconds.
--use-index – only crawl links from index pages.
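To make the argument shapes concrete, here is a hypothetical argparse sketch that mirrors the command line above. The real entry point may be built differently; the destination names and the default timeout of 30000 are assumptions, not values taken from the project.

```python
import argparse


def build_parser():
    # Hypothetical parser: argument names mirror the documented command line,
    # but the real web2pdfbook entry point may be structured differently.
    parser = argparse.ArgumentParser(prog="web2pdfbook")
    parser.add_argument("url", help="base URL to start crawling from")
    parser.add_argument("output", help="path of the merged PDF to write")
    parser.add_argument("--timeout", type=int, default=30000,
                        help="render timeout in milliseconds (default here is an assumption)")
    parser.add_argument("--use-index", action="store_true",
                        help="only crawl links from index pages")
    return parser


args = build_parser().parse_args(
    ["https://example.com", "output.pdf", "--timeout", "20000", "--use-index"]
)
# The timeout is given in milliseconds, so divide by 1000 to get seconds.
print(args.timeout / 1000, "seconds;", "use_index =", args.use_index)
```

Because --timeout is in milliseconds, the value 20000 in the example corresponds to a 20-second limit per page.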

✅ How to test

Install dependencies first:

pip install -r requirements.txt

Then run the test suite under coverage and print the report:

python -m coverage run -m pytest -q
python -m coverage report

📦 How to release

Install packaging dependencies:

pip install -r dev-requirements.txt

Build and upload the distribution (defaults to TestPyPI):

./release/publish.sh

This script runs python -m build and uploads with twine. Set REPOSITORY_URL to publish elsewhere.

The repository contains a .pypirc template with placeholder credentials for TestPyPI and PyPI. Fill in your tokens, or copy the template to ~/.pypirc and fill them in there, so that twine can authenticate during the upload.
