All HTML documents are structured as trees. This project provides code to parse and visualise HTML documents as graphs.
Features of the project are:
- View any HTML page (from its page source) as a graph, i.e. a collection of connected nodes.
- Search the graph by HTML tags, tag attributes, strings, etc.
- Find the shortest path between an HTML node and any other node in the graph, for easier web scraping.
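
The features above can be illustrated with a minimal sketch that uses only the Python standard library. This is not webtree's own API: the names `GraphBuilder`, `build_graph`, `find_nodes` and `shortest_path` are hypothetical and chosen for illustration, and the sketch assumes reasonably well-formed HTML. It builds an undirected graph of elements, searches it by tag and attributes, and finds a shortest path with breadth-first search.

```python
from collections import deque
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class GraphBuilder(HTMLParser):
    """Hypothetical helper: turns HTML elements into an undirected graph."""

    def __init__(self):
        super().__init__()
        self.nodes = {}   # node id -> (tag, attribute dict)
        self.edges = {}   # node id -> set of neighbouring node ids
        self.stack = []   # ids of the elements that are currently open

    def handle_starttag(self, tag, attrs):
        node_id = len(self.nodes)
        self.nodes[node_id] = (tag, dict(attrs))
        self.edges[node_id] = set()
        if self.stack:  # connect the new element to its parent
            parent = self.stack[-1]
            self.edges[parent].add(node_id)
            self.edges[node_id].add(parent)
        if tag not in VOID_TAGS:
            self.stack.append(node_id)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.nodes[self.stack[i]][0] == tag:
                del self.stack[i:]
                break


def build_graph(html):
    builder = GraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges


def find_nodes(nodes, tag=None, attrs=None):
    """Return ids of nodes matching the tag and every given attribute."""
    attrs = attrs or {}
    return [
        node_id
        for node_id, (node_tag, node_attrs) in nodes.items()
        if (tag is None or node_tag == tag)
        and all(node_attrs.get(k) == v for k, v in attrs.items())
    ]


def shortest_path(edges, start, goal):
    """Breadth-first search; returns a list of node ids or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in edges[current]:
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None


page = ("<html><body><div id='a'><p>Hi</p></div>"
        "<ul><li><a href='#'>x</a></li></ul></body></html>")
nodes, edges = build_graph(page)
start = find_nodes(nodes, tag="p")[0]
goal = find_nodes(nodes, tag="a", attrs={"href": "#"})[0]
path = shortest_path(edges, start, goal)
print(" -> ".join(nodes[i][0] for i in path))
# prints: p -> div -> body -> ul -> li -> a
```

Because every edge has the same weight, breadth-first search is enough to find a shortest path; the resulting chain of tags shows how to navigate from one element to another when writing a scraper.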

Install with pip:

    pip install webtree

Then scrape a page, for example:

    webtree scrape --site=https://www.google.com