A tool to extract text from HTML from your terminal.
# pipe in
curl -s 'https://en.wikipedia.org/wiki/Wiki' | node2text '#siteSub'
# Outputs: From Wikipedia, the free encyclopedia
# extract from path
node2text '#app.title' /path/to/file.html
# May or may not produce output, depending on whether the selector matches

When I reinstall my machine, I want to automate my install process. That usually involves quickly grabbing a snippet from the internet and writing it to a file; this tool aims to help script that step.
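As a rough sketch of that workflow, the helper below fetches a page, extracts one element's text, and writes it to a file. `save_snippet` is a hypothetical name, not something node2text ships, and the check for empty output assumes node2text prints nothing at all when the selector does not match.

```shell
# Hypothetical helper (not part of node2text): fetch a page, extract the
# text of one element, and save it to a file.
save_snippet() {
    url=$1 selector=$2 dest=$3
    tmp=$(mktemp) || return 1

    # Same pipe-in form as the example above: HTML on stdin, selector as argument.
    curl -s "$url" | node2text "$selector" > "$tmp"

    # Assumption: an unmatched selector yields no output at all,
    # so only keep the result when the temporary file is non-empty.
    if [ -s "$tmp" ]; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        echo "no match for '$selector' at $url" >&2
        return 1
    fi
}
```

It could then be called from an install script as `save_snippet 'https://en.wikipedia.org/wiki/Wiki' '#siteSub' ./tagline.txt`, where the output path is only a placeholder.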
Hugely inspired by pup.
If you have the Rust toolchain installed, node2text is available on crates.io. If you don't, please install Rust first by following the instructions on the official website.
Run
cargo install node2text

Piping will always take precedence even if <path> is provided.
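One way to see the precedence rule for yourself is to supply both inputs with different content. The function below is only an illustrative sketch: `check_precedence` and the `#src` selector are made-up names, and it assumes node2text is on your PATH.

```shell
# Hypothetical check: give node2text both a piped document and a <path>
# whose contents differ, then see which text comes back.
check_precedence() {
    file=$(mktemp) || return 1
    printf '<p id="src">from file</p>' > "$file"

    # stdin and <path> are both supplied; per the note above,
    # the piped document is the one that should be queried.
    printf '<p id="src">from stdin</p>' | node2text '#src' "$file"

    rm -f "$file"
}
```

If the rule holds, running `check_precedence` should print the text from the piped document rather than the text from the temporary file.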
Comparison with pup:
node2text
- Selectors are purely CSS selectors, no DSL
- Takes HTML, spits out text
- Written in Rust
- Fewer features than pup
- Outputs are not escaped
pup
- Selectors are CSS selectors plus a DSL
- Takes HTML, spits out text, JSON, or HTML
- Written in Go
- Has many features; visit its GitHub page to learn more
- Outputs are escaped