sukhoy94/habr2md

habr2md

habr2md is a simple CLI tool that downloads articles from Habr, a popular Russian tech blog and knowledge-sharing platform, and converts them into clean Markdown.

It:

  • extracts only the article content
  • removes images, galleries, and author blocks
  • ignores comments
  • saves the result as a .md file
  • generates the file name from the article title

Requirements

  • Python 3.10+
  • pip

Installation

Create and activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate

Install dependencies:

pip install requests beautifulsoup4 markdownify
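
To confirm that the environment is ready, a quick check like the one below can help. This is a minimal sketch and not part of habr2md itself; the helper name `missing_modules` is hypothetical. Note that the beautifulsoup4 package is imported under the name `bs4`.

```python
from importlib.util import find_spec

# Import names of the three dependencies; beautifulsoup4 is imported as "bs4".
REQUIRED_MODULES = ["requests", "bs4", "markdownify"]


def missing_modules(names):
    """Return the names from `names` that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```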

Usage

Run:

python habr2md.py

Paste the article URL when prompted, for example:

https://habr.com/en/companies/postgrespro/articles/988066/
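
Before downloading, a script could check that the pasted URL looks like a Habr article. The sketch below shows how such a check might look; it is not habr2md's actual code. The function name `article_id_from_url` is hypothetical, and the `/articles/<id>/` path pattern is inferred only from the example URL above, so other Habr URL forms are not handled.

```python
import re
from urllib.parse import urlparse


def article_id_from_url(url):
    """Return the numeric article ID from a Habr URL, or None if it does not match.

    Hypothetical helper: the path pattern is inferred from the example URL.
    """
    parsed = urlparse(url.strip())
    # Only accept web URLs that point at habr.com.
    if parsed.scheme not in ("http", "https"):
        return None
    if parsed.hostname not in ("habr.com", "www.habr.com"):
        return None
    # The article ID is the trailing numeric path segment after "articles".
    match = re.search(r"/articles/(\d+)/?$", parsed.path)
    return match.group(1) if match else None
```

With the example URL, `article_id_from_url` returns the string `"988066"`; for a URL on another host it returns `None`.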

The result is saved to:

results/<article-title>.md
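
The README does not say how the title is turned into a file name, so the following is only one possible approach. The names `filename_from_title` and `output_path` are hypothetical, and so are the rules: replace characters that are unsafe in file names, collapse whitespace into hyphens, and cap the length.

```python
import re
from pathlib import Path


def filename_from_title(title, max_length=100):
    """Turn an article title into a safe .md file name (hypothetical rules)."""
    # Replace characters that are invalid on common file systems.
    cleaned = re.sub(r'[\\/:*?"<>|]', " ", title)
    # Collapse runs of whitespace into single hyphens.
    cleaned = re.sub(r"\s+", "-", cleaned.strip())
    # Cap the length and fall back to a generic name if nothing usable remains.
    cleaned = cleaned[:max_length].strip("-") or "article"
    return f"{cleaned}.md"


def output_path(title, results_dir="results"):
    """Build the path results/<article-title>.md."""
    return Path(results_dir) / filename_from_title(title)
```

For example, the title `Indexes in PostgreSQL: part 1/2` becomes `Indexes-in-PostgreSQL-part-1-2.md`. Non-ASCII characters such as Cyrillic letters are kept as they are.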

Project structure

habr2md/
├── habr2md.py
├── results/
├── README.md
└── .gitignore

Notes

  • The parser targets the current Habr layout (article-formatted-body) and may stop working if Habr changes its markup
  • Images and galleries are removed
  • The output format is Markdown

License

MIT
