This is a Node.js server that takes a PDF URL, takes a screenshot of the first page, and uploads it to Cloudflare R2 storage.

timelessco/pdf-url-screenshot

PDF Screenshot Service

A high-performance Fastify server that generates thumbnail images from PDF documents and uploads them to Cloudflare R2 storage.

Features

  • 🚀 Fast PDF rendering using pdfjs-dist and node-canvas
  • 📸 Generates PNG thumbnails from the first page of PDFs
  • ☁️ Automatic upload to Cloudflare R2 storage
  • 🔐 Bearer token authentication for API security
  • 📚 Interactive Swagger/OpenAPI documentation
  • 🛡️ Built-in rate limiting to prevent abuse
  • 🔄 Process management with PM2
  • 📝 Comprehensive logging and error handling
  • 💎 Production-ready TypeScript codebase

Prerequisites

  • Node.js (v16 or higher)
  • npm or yarn
  • PM2 (for production deployment)
  • Cloudflare R2 credentials

Installation

  1. Clone the repository:
git clone <repository-url>
cd pdf-screenshot
  2. Install dependencies:
npm install
  3. Configure environment variables:
# Copy the example file
cp env.example .env

# Edit .env and add your actual credentials
nano .env

Required environment variables:

  • R2_ACCOUNT_ID - Your Cloudflare R2 account ID
  • R2_ACCESS_KEY_ID - Your R2 access key
  • R2_SECRET_ACCESS_KEY - Your R2 secret key
  • R2_PUBLIC_BUCKET_URL - Your public bucket URL
  • R2_MAIN_BUCKET_NAME - Your R2 bucket name
  • PORT - Server port (default: 3000)
  • SERVER_URL - Full server URL for API documentation (default: http://localhost:3000)
  • API_KEYS - Comma-separated list of API keys for authentication (required)
  4. Build the TypeScript code:
npm run build
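
The environment variables listed above could be validated at startup along the following lines. This is only a sketch: the repository's actual schema lives in env.schema.ts and may look quite different, and the names `AppConfig`, `required`, and `loadConfig` are hypothetical. Only the variable names and defaults are taken from the list above.

```typescript
// Hypothetical startup validation; not the contents of env.schema.ts.
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  serverUrl: string;
  apiKeys: string[];
  r2: {
    accountId: string;
    accessKeyId: string;
    secretAccessKey: string;
    publicBucketUrl: string;
    bucketName: string;
  };
}

// Read a variable that has no default; fail fast if it is absent or blank.
function required(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

function loadConfig(env: Env): AppConfig {
  // API_KEYS is comma-separated; tolerate spaces and trailing commas.
  const apiKeys = required(env, "API_KEYS")
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
  if (apiKeys.length === 0) throw new Error("API_KEYS must contain at least one key");

  // PORT and SERVER_URL fall back to the documented defaults.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) throw new Error("PORT must be a positive integer");

  return {
    port,
    serverUrl: env["SERVER_URL"] ?? "http://localhost:3000",
    apiKeys,
    r2: {
      accountId: required(env, "R2_ACCOUNT_ID"),
      accessKeyId: required(env, "R2_ACCESS_KEY_ID"),
      secretAccessKey: required(env, "R2_SECRET_ACCESS_KEY"),
      publicBucketUrl: required(env, "R2_PUBLIC_BUCKET_URL"),
      bucketName: required(env, "R2_MAIN_BUCKET_NAME"),
    },
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)` before the server starts listening, so that a missing credential surfaces immediately rather than on the first upload.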

Development

Run the server in development mode:

npm run dev

The server will start on http://localhost:3000

Production Deployment with PM2

First Time Setup

  1. Install PM2 globally:
npm install -g pm2
  2. Install PM2 log rotation:
pm2 install pm2-logrotate
  3. Build and start the server:
npm run build
npm run pm2:start
  4. Save PM2 configuration (optional, for auto-restart on system reboot):
pm2 save
pm2 startup

PM2 Commands

Command                Description
npm run pm2:start      Start the server with PM2
npm run pm2:stop       Stop the server
npm run pm2:restart    Restart the server
npm run pm2:delete     Remove from PM2
npm run pm2:logs       View real-time logs
npm run pm2:status     Check server status

Updating the Server

After making code changes:

npm run build
npm run pm2:restart

Authentication

The API uses Bearer Token authentication. All endpoints require a valid API key.

Getting Your API Key

API keys are managed via the API_KEYS environment variable in .env.

Generate a New API Key

node -e "console.log('sk_live_' + require('crypto').randomBytes(32).toString('hex'))"

Add the generated key to your .env file:

API_KEYS=sk_live_your_generated_key_here

For multiple keys (comma-separated):

API_KEYS=sk_live_key1...,sk_live_key2...,sk_live_key3...

Using Your API Key

Include the API key in the Authorization header:

curl -X POST http://localhost:3000/upload/pdf-screenshot \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/document.pdf"}'
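
The same request can be assembled from TypeScript. The sketch below only builds the URL, headers, and body shown in the curl command; `buildScreenshotRequest` and `ScreenshotRequest` are hypothetical names, not part of the service.

```typescript
// Hypothetical client-side helper mirroring the curl example.
interface ScreenshotRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

function buildScreenshotRequest(baseUrl: string, apiKey: string, pdfUrl: string): ScreenshotRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/upload/pdf-screenshot`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: pdfUrl }),
    },
  };
}
```

With a running server and a runtime that provides `fetch`, the result can be passed straight through: `const { url, init } = buildScreenshotRequest("http://localhost:3000", "YOUR_API_KEY", "https://example.com/document.pdf"); const response = await fetch(url, init);`.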

Unauthorized Response

Requests without a valid API key will receive:

HTTP 401 Unauthorized
{
  "statusCode": 401,
  "error": "Unauthorized",
  "message": "Invalid or missing API key. Include \"Authorization: Bearer YOUR_API_KEY\" header."
}
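
A Bearer token check of this kind could be sketched as follows. This is an illustration under stated assumptions, not the server's actual hook: `extractBearerToken`, `isAuthorized`, and `unauthorizedBody` are hypothetical names, and the plain set lookup stands in for whatever comparison the service really uses (a production check might prefer a constant-time comparison).

```typescript
// Pull the token out of an "Authorization: Bearer <token>" header value.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  // The scheme name is matched case-insensitively; the token must be a single word.
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// validKeys would be built from the comma-separated API_KEYS variable.
function isAuthorized(header: string | undefined, validKeys: ReadonlySet<string>): boolean {
  const token = extractBearerToken(header);
  return token !== null && validKeys.has(token);
}

// Body matching the 401 response documented above.
function unauthorizedBody() {
  return {
    statusCode: 401,
    error: "Unauthorized",
    message: 'Invalid or missing API key. Include "Authorization: Bearer YOUR_API_KEY" header.',
  };
}
```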

API Documentation

Interactive API Docs

Once the server is running, visit the interactive Swagger UI documentation:

http://localhost:3000/docs

Click the 🔒 Authorize button and enter your API key to test endpoints.

The documentation provides:

  • 📚 Complete API reference for all endpoints
  • 🧪 Interactive "Try it out" feature to test endpoints
  • 📋 Request/response schemas with examples
  • 🏷️ Organized by tags (health, upload)
  • 🔐 Built-in authentication testing

API Endpoints

Health Check

GET /

Headers:

Authorization: Bearer YOUR_API_KEY

Response:

{
  "message": "Hello from Hetzner Node server fastify"
}

Generate PDF Screenshot

POST /upload/pdf-screenshot

Authentication: Required

Headers:

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Rate Limit: 10 requests per hour per IP address

Request Body:

{
  "url": "https://example.com/document.pdf"
}

Success Response (200):

{
  "success": true,
  "path": "test/thumb-document.png",
  "publicUrl": "https://your-r2-domain.com/test/thumb-document.png"
}

Unauthorized Response (401):

{
  "statusCode": 401,
  "error": "Unauthorized",
  "message": "Invalid or missing API key. Include \"Authorization: Bearer YOUR_API_KEY\" header."
}

Error Response (400/500):

{
  "success": false,
  "error": "Error message",
  "details": "Detailed error information"
}
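
On the client side, the success and error bodies shown above lend themselves to a discriminated union keyed on `success`. The sketch below is a hypothetical client helper (the type names and `parseScreenshotResult` are not from the repository's types.ts), and it treats `details` as optional, which is an assumption.

```typescript
interface ScreenshotSuccess {
  success: true;
  path: string;
  publicUrl: string;
}

interface ScreenshotFailure {
  success: false;
  error: string;
  details?: string; // assumed optional
}

type ScreenshotResult = ScreenshotSuccess | ScreenshotFailure;

// Parse a response body and check that it has one of the two documented shapes.
function parseScreenshotResult(json: string): ScreenshotResult {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Response is not a JSON object");
  }
  const { success, path, publicUrl, error, details } = data as Record<string, unknown>;

  if (success === true && typeof path === "string" && typeof publicUrl === "string") {
    return { success: true, path, publicUrl };
  }
  if (success === false && typeof error === "string") {
    return typeof details === "string"
      ? { success: false, error, details }
      : { success: false, error };
  }
  throw new Error("Unexpected response shape");
}
```

Checking `result.success` then narrows the type, so `result.publicUrl` is only accessible on the success branch. The 401 and 429 bodies use a different shape (`statusCode`, `error`, `message`) and would be rejected by this parser, so callers should branch on the HTTP status first.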

Rate Limit Response (429):

{
  "statusCode": 429,
  "error": "Too Many Requests",
  "message": "PDF processing rate limit exceeded. Maximum 10 requests per hour. Try again in 45 minutes.",
  "retryAfter": 2700000
}

Rate Limit Headers: All responses include rate limit information:

x-ratelimit-limit: 10
x-ratelimit-remaining: 7
x-ratelimit-reset: 1640000000000

Rate Limiting

The API implements rate limiting to prevent abuse and ensure fair usage:

Global Rate Limits

  • 100 requests per minute per IP address across all endpoints
  • Applies to all API calls

Endpoint-Specific Limits

  • Health Check (GET /): 100 requests per minute
  • PDF Processing (POST /upload/pdf-screenshot): 10 requests per hour
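
To make the per-IP limits concrete, here is a minimal in-memory fixed-window counter. It is a sketch of one common way to enforce "N requests per window per key", not the algorithm the server or its rate-limiting plugin actually uses. `FixedWindowLimiter` and `LimitDecision` are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to follow.

```typescript
interface LimitDecision {
  allowed: boolean;
  remaining: number;
  resetAt: number; // timestamp (ms) at which the current window ends
}

interface WindowState {
  count: number;
  resetAt: number;
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly max: number;
  private readonly windowMs: number;

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // key is typically the client IP; now is the current time in milliseconds.
  check(key: string, now: number): LimitDecision {
    let entry = this.windows.get(key);
    // Start a fresh window on first sight or once the previous one has expired.
    if (!entry || now >= entry.resetAt) {
      entry = { count: 0, resetAt: now + this.windowMs };
      this.windows.set(key, entry);
    }
    if (entry.count >= this.max) {
      return { allowed: false, remaining: 0, resetAt: entry.resetAt };
    }
    entry.count += 1;
    return { allowed: true, remaining: this.max - entry.count, resetAt: entry.resetAt };
  }
}

// Limits from the list above: 10 per hour for PDF processing, 100 per minute globally.
const pdfLimiter = new FixedWindowLimiter(10, 60 * 60 * 1000);
const globalLimiter = new FixedWindowLimiter(100, 60 * 1000);
```

The `remaining` and `resetAt` fields correspond to the kind of information exposed in the x-ratelimit-remaining and x-ratelimit-reset headers.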

Rate Limit Headers

Every response includes these headers:

  • x-ratelimit-limit - Maximum requests allowed in time window
  • x-ratelimit-remaining - Requests remaining in current window
  • x-ratelimit-reset - Unix timestamp when the limit resets (in milliseconds, judging by the example value above)

Rate Limit Exceeded

When you exceed the limit, you'll receive:

HTTP 429 Too Many Requests
{
  "statusCode": 429,
  "error": "Too Many Requests", 
  "message": "Rate limit exceeded. Try again later.",
  "retryAfter": 3600000
}
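
A client can use the `retryAfter` field to decide how long to wait. The sketch below assumes the value is in milliseconds, which matches the earlier example (2700000 alongside "45 minutes"). `retryDelayMs` and `describeDelay` are hypothetical helper names.

```typescript
// Return the suggested wait in milliseconds for a 429 response, or null otherwise.
function retryDelayMs(status: number, body: unknown): number | null {
  if (status !== 429 || typeof body !== "object" || body === null) return null;
  const retryAfter = (body as Record<string, unknown>)["retryAfter"];
  return typeof retryAfter === "number" && retryAfter >= 0 ? retryAfter : null;
}

// Render a delay as whole minutes, rounding up so callers never retry early.
function describeDelay(ms: number): string {
  const minutes = Math.ceil(ms / 60000);
  return `${minutes} minute${minutes === 1 ? "" : "s"}`;
}
```

For the response above, `retryDelayMs(429, body)` yields 3600000, which `describeDelay` renders as "60 minutes".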

Whitelisted IPs

  • Localhost (127.0.0.1) is whitelisted for development

Configuration

PM2 Configuration (ecosystem.config.js)

  • instances: Number of instances to run (default: 1)
  • max_memory_restart: Auto-restart if memory exceeds limit (default: 1G)
  • autorestart: Automatically restart on crashes
  • logs: Stored in ./logs/ directory
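
For orientation, the settings above map onto PM2 app options roughly as in the sketch below. The option names (`instances`, `max_memory_restart`, `autorestart`, `out_file`, `error_file`) are standard PM2 fields, but the process name, script path, and log file names are assumptions for illustration and are not copied from the repository's ecosystem.config.js. The object is written in TypeScript for consistency with the rest of the codebase.

```typescript
// Illustrative only: values marked "assumed" are not taken from the repository.
const app = {
  name: "pdf-screenshot",         // assumed process name
  script: "dist/index.js",        // assumed build output of index.ts
  instances: 1,                   // documented default
  autorestart: true,              // restart automatically on crashes
  max_memory_restart: "1G",       // documented default
  out_file: "./logs/out.log",     // assumed file name inside ./logs/
  error_file: "./logs/error.log", // assumed file name inside ./logs/
};

// In ecosystem.config.js the equivalent would be: module.exports = { apps: [app] };
const ecosystem = { apps: [app] };
```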

Log Rotation

The pm2-logrotate module (installed during first-time setup) rotates logs with these settings:

  • Max log size: 10MB
  • Retained files: 30 old logs
  • Compression: Enabled for old logs
  • Check interval: Every 30 seconds

Project Structure

pdf-screenshot/
├── index.ts              # Main server file with Fastify initialization
├── env.schema.ts         # Environment variable schema and types
├── swagger.config.ts     # Swagger/OpenAPI configuration
├── rate-limit.config.ts  # Rate limiting configuration
├── types.ts              # TypeScript type definitions
├── r2Client.ts           # R2 storage client
├── schemas/              # Reusable schemas
│   └── common-responses.ts  # Common response schemas (400, 401, 429, 500)
├── routes/               # Route handlers
│   ├── root.ts           # GET / endpoint
│   └── upload/
│       └── pdf-screenshot.ts  # POST /upload/pdf-screenshot endpoint
├── ecosystem.config.js   # PM2 configuration
├── tsconfig.json         # TypeScript configuration
├── package.json          # Dependencies and scripts
├── dist/                 # Compiled JavaScript (generated)
└── logs/                 # PM2 logs (generated)

Technical Details

PDF Rendering

  • Uses pdfjs-dist (version 3.4.120) for PDF parsing
  • Renders to canvas with node-canvas (version 3.2.0)
  • Default scale: 1.5x for better quality
  • Output format: PNG
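
The effect of the 1.5x scale on output size can be worked out directly. pdfjs-dist reports page dimensions in PDF points (1/72 inch) at scale 1, so the canvas size is the page size multiplied by the scale. The helper below is a hypothetical sketch of that arithmetic; rounding up to whole pixels is an assumption about how the canvas would be sized.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Compute the canvas size for a page measured in PDF points at a given render scale.
function thumbnailSize(widthPt: number, heightPt: number, scale = 1.5): PixelSize {
  if (!(widthPt > 0 && heightPt > 0 && scale > 0)) {
    throw new RangeError("Page dimensions and scale must be positive");
  }
  return {
    width: Math.ceil(widthPt * scale),
    height: Math.ceil(heightPt * scale),
  };
}

// A US Letter page is 612 x 792 points, so at the default scale the PNG is 918 x 1188 pixels.
const letter = thumbnailSize(612, 792);
```

Raising the scale sharpens the thumbnail at the cost of rendering time and file size, since the pixel count grows with the square of the scale.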

Storage

  • Uploads to Cloudflare R2 using the AWS S3 SDK (R2 exposes an S3-compatible API)
  • Automatic filename generation from PDF URL
  • Returns public URL for immediate access
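
The naming and URL steps above can be sketched as small pure functions. This is a guess at the scheme based on the example response ("test/thumb-document.png" for "document.pdf"): the `test` prefix, the sanitising rule, and the helper names are all assumptions. The endpoint format `https://<account-id>.r2.cloudflarestorage.com` is the one Cloudflare documents for R2's S3-compatible API.

```typescript
// Decode percent-escapes, falling back to the raw text if they are malformed.
function safeDecode(text: string): string {
  try {
    return decodeURIComponent(text);
  } catch {
    return text;
  }
}

// Derive an object key such as "test/thumb-document.png" from a PDF URL.
function thumbnailKey(pdfUrl: string, prefix = "test"): string {
  // Drop scheme and host, then any query string or fragment.
  const path = pdfUrl.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/?#]*/i, "").split(/[?#]/)[0] ?? "";
  const lastSegment = path.split("/").filter(Boolean).pop() ?? "document";
  const base = safeDecode(lastSegment).replace(/\.pdf$/i, "");
  // Keep the key URL-friendly; fall back to a fixed name if nothing is left.
  const safe = base.replace(/[^A-Za-z0-9._-]+/g, "-") || "document";
  return `${prefix}/thumb-${safe}.png`;
}

// S3 API endpoint for an R2 account (built from R2_ACCOUNT_ID).
function r2Endpoint(accountId: string): string {
  return `https://${accountId}.r2.cloudflarestorage.com`;
}

// Public URL returned to the caller (built from R2_PUBLIC_BUCKET_URL).
function publicUrl(publicBucketUrl: string, key: string): string {
  return `${publicBucketUrl.replace(/\/+$/, "")}/${key}`;
}
```

One consequence of deriving the key only from the file name is that two PDFs with the same name would map to the same object; whether the service guards against that is not stated here.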

Troubleshooting

Check PM2 Status

npm run pm2:status

View Logs

npm run pm2:logs

Restart Server

npm run pm2:restart

Clear and Restart

npm run pm2:delete
npm run build
npm run pm2:start
