This project provides a web scraper that automatically extracts and formats the complete documentation from Google's NotebookLM Help Center. The scraper generates a comprehensive Markdown file containing all documentation sections, making it easy to read and reference offline.
The script performs the following operations:
- Navigates through the NotebookLM Help Center
- Expands all collapsible sections
- Extracts content from each documentation page
- Formats the content into a clean Markdown structure
- Saves everything to a single notebook_lm_documentation.md file
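
The navigate-and-expand steps above might look roughly like the following sketch. It is not taken from `app.py`: the function names, the `aria-expanded` selector, and the `--headless=new` flag are assumptions chosen for illustration, and the real Help Center markup may need a different selector.

```python
import time

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR, spelled
# out so the helpers below can be imported without Selenium installed.
CSS_SELECTOR = "css selector"

# Hypothetical selector: assumes collapsed sections expose aria-expanded="false".
COLLAPSED = '[aria-expanded="false"]'


def build_driver():
    """Start Chrome in headless mode (needs Selenium 4 and a local Chrome)."""
    from selenium import webdriver  # lazy import: only needed for a real run

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumed flag for recent Chrome versions
    return webdriver.Chrome(options=options)


def expand_all(driver, selector=COLLAPSED, max_rounds=5, pause=0.0):
    """Click collapsed elements until none remain or max_rounds is reached.

    Returns the number of clicks performed.
    """
    clicks = 0
    for _ in range(max_rounds):
        collapsed = driver.find_elements(CSS_SELECTOR, selector)
        if not collapsed:
            break
        for element in collapsed:
            element.click()
            clicks += 1
        time.sleep(pause)  # give the page time to render revealed content
    return clicks


def fetch_expanded_html(driver, url, pause=0.5):
    """Load a page, expand its collapsible sections, and return the HTML."""
    driver.get(url)
    expand_all(driver, pause=pause)
    return driver.page_source
```

A real run would call `fetch_expanded_html(build_driver(), url)` for each Help Center page and quit the driver afterwards. On pages that re-render after a click, the loop may also need to handle stale element references.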
The script requires:
- selenium
- beautifulsoup4
- Chrome/Chromium browser

To run it:
- Clone this repository
- Install the required dependencies: `pip install selenium beautifulsoup4`
- Run the script: `python app.py`

The scraper uses:
- Selenium for browser automation and handling dynamic content
- BeautifulSoup for parsing HTML and extracting text content
- Headless Chrome mode for efficient scraping
- Automatic handling of expandable sections and nested lists
- Smart content formatting with proper Markdown hierarchy
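
To illustrate how headers, paragraphs, and nested lists can be mapped to Markdown, here is a minimal sketch. It deliberately swaps BeautifulSoup for the standard library's `html.parser` so that it runs without extra dependencies. The class and function names are hypothetical, and the project's actual formatting rules may differ.

```python
from html.parser import HTMLParser

HEADINGS = {"h1": "#", "h2": "##", "h3": "###"}


class MarkdownExtractor(HTMLParser):
    """Collect h1-h3, paragraphs, and nested list items as Markdown lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.list_depth = 0  # number of <ul>/<ol> elements currently open
        self.current = None  # tag whose text is being collected
        self.buffer = []

    def flush(self):
        """Emit the collected text in the form that matches its tag."""
        text = " ".join("".join(self.buffer).split())
        if text and self.current in HEADINGS:
            self.lines += [f"{HEADINGS[self.current]} {text}", ""]
        elif text and self.current == "li":
            indent = "  " * max(self.list_depth - 1, 0)
            self.lines.append(f"{indent}- {text}")  # ordered lists also become bullets
        elif text:
            self.lines += [text, ""]
        self.buffer = []
        self.current = None

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self.flush()  # emit the parent item's text before its sub-list
            self.list_depth += 1
        elif tag == "p" and self.current == "li":
            pass  # keep a paragraph inside a list item as part of that item
        elif tag in HEADINGS or tag in ("p", "li"):
            self.flush()
            self.current = tag

    def handle_endtag(self, tag):
        if tag in ("ul", "ol"):
            self.flush()
            self.list_depth = max(self.list_depth - 1, 0)
            if self.list_depth == 0:
                self.lines.append("")  # blank line after a top-level list
        elif tag == self.current:
            self.flush()

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)


def html_to_markdown(html):
    """Convert an HTML fragment to Markdown text."""
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return "\n".join(parser.lines).strip()
```

Tracking the depth of open lists is what produces the indentation for nested items. The same idea carries over to BeautifulSoup by counting a list item's `ul`/`ol` ancestors.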
The script generates a notebook_lm_documentation.md file that includes:
- Complete NotebookLM documentation
- Properly formatted headers (H1-H3)
- Nested lists and paragraphs
- All sections from the Help Center organized hierarchically
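
One way to picture the hierarchical layout of the output file is the sketch below. It assumes that H1 is the document title, H2 a Help Center section, and H3 an individual page. That mapping, the function names, and the sample section titles are illustrative guesses rather than details confirmed by the script.

```python
from pathlib import Path

OUTPUT_FILE = "notebook_lm_documentation.md"


def build_document(title, sections):
    """Render {section_name: [(page_title, body), ...]} as one Markdown string."""
    parts = [f"# {title}"]
    for section_name, pages in sections.items():
        parts.append(f"## {section_name}")
        for page_title, body in pages:
            parts.append(f"### {page_title}")
            parts.append(body.strip())
    # Blank lines between blocks keep headers and paragraphs separate.
    return "\n\n".join(parts) + "\n"


def save_document(markdown, directory="."):
    """Write the Markdown text to the output file and return its path."""
    path = Path(directory) / OUTPUT_FILE
    path.write_text(markdown, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder content, not real Help Center text.
    sample = {"Get started": [("Create a notebook", "- Open NotebookLM\n- Add a source")]}
    print(save_document(build_document("NotebookLM Documentation", sample)))
```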
This is an unofficial tool for educational purposes. The generated documentation is sourced from Google's public Help Center for NotebookLM.