Fetch Markdown

Purpose

This script fetches the content of one or more URLs, with special handling for Reddit, and converts it to Markdown. It is useful for extracting web content for LLM processing.

Approach

The script uses Playwright to launch a headless Chromium browser instance and opens the target URLs concurrently, with the number of simultaneous pages capped by p-limit.
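The concurrency cap can be sketched without any dependencies. The block below is a hand-rolled stand-in for p-limit, not the script's actual code, and `createLimiter`, `fetchPage`, and `fetchAll` are hypothetical names chosen for illustration. `fetchPage` is a placeholder for the real per-URL work.

```javascript
// Minimal stand-in for p-limit: run at most `max` async tasks at once.
// All names here are hypothetical; the real script uses the p-limit package.
function createLimiter(max) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= max || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next queued task, if any
      });
  };

  // Returns a promise that settles with the task's result.
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}

// Placeholder for the real per-URL work (open page, scroll, convert).
async function fetchPage(url) {
  await new Promise((r) => setTimeout(r, 10));
  return `# ${url}`;
}

// Results come back in the same order as `urls`, whatever the finish order.
async function fetchAll(urls, parallel = 5) {
  const limit = createLimiter(parallel);
  return Promise.all(urls.map((url) => limit(() => fetchPage(url))));
}
```

Wrapping each task in the limiter while still using `Promise.all` keeps the output order stable, which matters when several pages are written into one Markdown file.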

For each page, it scrolls to trigger lazy-loaded content, then cleans the DOM by removing scripts, styles, and similar non-content elements. It then uses html-to-md to convert the cleaned HTML body into Markdown.
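A minimal sketch of the scroll-and-clean steps, assuming a Playwright `page` object is passed in. The function names, the stop condition (page height no longer growing), the round limit, and the selector list are all assumptions; the source only says "scripts, styles, etc.".

```javascript
// Scroll to the bottom repeatedly until the page height stops growing
// or `maxRounds` is reached. Returns the last observed height.
async function autoScroll(page, { maxRounds = 20, pauseMs = 500 } = {}) {
  let previousHeight = 0;
  for (let round = 0; round < maxRounds; round++) {
    // Runs inside the browser: scroll, then report the document height.
    const height = await page.evaluate(() => {
      window.scrollTo(0, document.body.scrollHeight);
      return document.body.scrollHeight;
    });
    if (height === previousHeight) break; // nothing new was loaded
    previousHeight = height;
    await page.waitForTimeout(pauseMs); // give lazy content time to load
  }
  return previousHeight;
}

// Remove non-content elements in the browser, then return the body HTML.
// The selector list is an assumption, not the script's actual list.
async function cleanedBodyHtml(
  page,
  selectors = ['script', 'style', 'noscript', 'iframe', 'svg']
) {
  return page.evaluate((sels) => {
    document.querySelectorAll(sels.join(',')).forEach((el) => el.remove());
    return document.body.innerHTML;
  }, selectors);
}

// The returned HTML string is what would be handed to html-to-md.
```

Taking `page` as a parameter keeps both helpers independent of how the browser was launched and makes them easy to exercise with a stub object.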

For Reddit URLs, it additionally expands "View more comments" buttons and nested replies so that the full discussion is captured.
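One way the Reddit handling could look, again assuming a Playwright `page` object. `isRedditUrl` and `expandComments` are hypothetical helpers, and the button label pattern is a guess: Reddit's markup changes often, so the real script may locate these buttons differently.

```javascript
// Hypothetical helper: true for reddit.com and any of its subdomains.
function isRedditUrl(url) {
  try {
    const { hostname } = new URL(url);
    return hostname === 'reddit.com' || hostname.endsWith('.reddit.com');
  } catch {
    return false; // not a parseable URL
  }
}

// Click matching buttons until none remain or `maxClicks` is reached.
// Returns the number of clicks performed.
async function expandComments(page, { maxClicks = 50, pauseMs = 300 } = {}) {
  // Assumed label pattern for "View more comments" and nested-reply buttons.
  const buttons = page.getByRole('button', {
    name: /view more comments|more repl/i,
  });
  let clicks = 0;
  while (clicks < maxClicks && (await buttons.count()) > 0) {
    await buttons.first().click();
    clicks++;
    await page.waitForTimeout(pauseMs); // let the new comments render
  }
  return clicks;
}
```

The `maxClicks` cap is a safety valve for very large threads, where expanding everything could otherwise loop for a long time.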

Dependency Installation

To run this script, install the required Node.js dependencies with npm: playwright, commander, clipboardy, html-to-md, and p-limit.

General Usage

CLI Arguments

You can pass one or more URLs directly as arguments:

./fetch_markdown.mjs https://example.com https://google.com

Interactive Mode

If no URLs are provided, the script runs in interactive mode. Enter URLs one per line, then press Ctrl+D (end of input) when finished.
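Reading URLs until end of input can be sketched as below. `readAll` and `parseUrls` are hypothetical names, and restricting input to http(s) URLs is an assumption about how the script validates lines.

```javascript
// Collect all text from an async-iterable stream such as process.stdin.
// For stdin, calling process.stdin.setEncoding('utf8') first yields strings.
async function readAll(stream) {
  let text = '';
  for await (const chunk of stream) text += chunk;
  return text;
}

// Keep only lines that parse as http(s) URLs; blank lines are dropped.
function parseUrls(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => {
      try {
        return /^https?:$/.test(new URL(line).protocol);
      } catch {
        return false; // empty or malformed line
      }
    });
}

// Usage (inside an async function or an ES module):
//   const urls = parseUrls(await readAll(process.stdin));
```

Ctrl+D closes standard input, which ends the `for await` loop and lets the collected lines be parsed in one pass.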

Options

-o, --output <file>: Specify output file (default: ~/Downloads/markdown_TIMESTAMP.md)
-c, --clipboard: Copy results to clipboard automatically
-p, --parallel <number>: Max parallel pages (default: 5)
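How the documented defaults might be applied once the flags are parsed can be sketched as follows. `resolveOptions` is a hypothetical helper that takes the raw option values plus the home directory. The timestamp format is an assumption, since the source only says TIMESTAMP, and so is rejecting non-positive `--parallel` values.

```javascript
// Turn raw option values (strings, as a CLI parser would hand them over)
// into validated settings with the documented defaults applied.
function resolveOptions(raw, { homeDir, now = new Date() }) {
  // --parallel defaults to 5 and must be a positive integer (assumed rule).
  const parallel =
    raw.parallel === undefined ? 5 : Number.parseInt(raw.parallel, 10);
  if (!Number.isInteger(parallel) || parallel < 1) {
    throw new Error(
      `--parallel must be a positive integer, got "${raw.parallel}"`
    );
  }

  // Assumed timestamp format: ISO 8601 with ':' and '.' replaced by '-'
  // so the name is safe on common file systems.
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  const output = raw.output ?? `${homeDir}/Downloads/markdown_${stamp}.md`;

  return { output, parallel, clipboard: Boolean(raw.clipboard) };
}
```

For example, `resolveOptions({ parallel: '3' }, { homeDir: '/home/me' })` yields a `parallel` of 3 and an output path under `/home/me/Downloads`.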

Raycast Script Command Setup

You can run this script directly from Raycast to quickly fetch Markdown for URLs that are in your clipboard or that you enter manually.

Instructions

Prerequisites: Ensure you have Node.js installed.
Script Location: Place fetch_markdown.mjs in your script commands directory.
Permissions: Ensure the script is executable:
```
chmod +x fetch_markdown.mjs
```
Raycast Configuration:
- The script includes Raycast metadata headers.
- Argument: URLs, space separated (optional).
- If no argument is provided, it will attempt to read URLs from your clipboard.
- Output: The script is set to @raycast.mode fullOutput, so standard output will be displayed in a Raycast window.
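For orientation, Raycast metadata headers for a Node.js script command are comment lines placed directly under the shebang. The values below are reconstructed from the description above and may differ from the ones in fetch_markdown.mjs. `collectUrls` is a hypothetical helper showing the argument-then-clipboard fallback, with the clipboard reader passed in so the sketch needs no dependencies.

```javascript
// Required parameters:
// @raycast.schemaVersion 1
// @raycast.title Fetch Markdown
// @raycast.mode fullOutput

// Optional parameters:
// @raycast.argument1 { "type": "text", "placeholder": "URLs (space separated)", "optional": true }

// Split the command-line arguments into URLs; if none were given,
// fall back to whatever text the clipboard holds.
// `readClipboard` is an injected async function (an assumption made so
// this sketch has no dependencies; the script lists clipboardy for this).
async function collectUrls(args, readClipboard) {
  const fromArgs = args.join(' ').split(/\s+/).filter(Boolean);
  if (fromArgs.length > 0) return fromArgs;
  const text = await readClipboard();
  return text.split(/\s+/).filter(Boolean);
}

// Usage (inside an async function or an ES module):
//   const urls = await collectUrls(process.argv.slice(2), readClipboard);
```

Because Raycast passes the whole text field as one argument, the split on whitespace is what turns "space separated" input into individual URLs.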

Usage in Raycast

From Clipboard: Copy one or more URLs, open Raycast, and run "Fetch Markdown".
Manual Input: Open Raycast, run "Fetch Markdown", and type the URLs separated by spaces.
