NewsSift is an intelligent news curation service that uses AI to deliver personalized news digests based on your interests. It automatically fetches articles from your chosen news sources, evaluates them using large language models, and delivers curated reports straight to your inbox.
- 🎯 Customizable News Sources: Add any news website with CSS selectors to specify where to find articles
- 🤖 AI-Powered Curation: Uses any LLM compatible with the OpenAI API to evaluate articles based on your preferences
- 📅 Flexible Scheduling: Receive daily reports generated at your preferred time
- ✨ Instant Testing: Generate on-demand reports to test and refine your preferences
- 📧 Email Delivery: Receive curated articles directly in your inbox
NewsSift is built with a modern, scalable architecture. The code is divided into a processing server, which handles all LLM requests, and a webapp. The processing server starts report generation in one of two ways: on a schedule, based on the scheduled report settings in the database, or on demand, when the webapp backend calls an API endpoint.
In addition to promoting separation of concerns, splitting the code into two services means that each service can be deployed on the platform best suited to it. The webapp, which is used to manage settings and view reports, can be run on a serverless platform such as Vercel or Netlify since its requests are fast to handle. Meanwhile, calls to LLM endpoints are slow, so they are more cost-effective to run on a shared, long-running Node.js process than under a serverless billing model. I suggest deploying the webapp to Vercel and the processing server to Railway.
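To make the scheduled trigger described above more concrete, here is a minimal sketch of how a processing server could decide which users are due for a daily report. The `ScheduleSettings` shape, its field names, and the use of UTC are assumptions for illustration and are not taken from the NewsSift codebase.

```typescript
// Hypothetical shape of a user's schedule settings (field names are assumptions).
interface ScheduleSettings {
  userId: string;
  /** Preferred daily report time as "HH:MM", assumed to be in UTC. */
  reportTime: string;
  /** When the last scheduled report was generated, or null if never. */
  lastReportAt: Date | null;
}

/** True if today's scheduled time has passed and no report was generated since then. */
function isReportDue(settings: ScheduleSettings, now: Date): boolean {
  const [hours, minutes] = settings.reportTime.split(":").map(Number);
  const scheduledToday = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
    hours,
    minutes,
  );
  if (now.getTime() < scheduledToday) return false;
  return (
    settings.lastReportAt === null ||
    settings.lastReportAt.getTime() < scheduledToday
  );
}

/** Returns the IDs of all users whose scheduled report should be generated now. */
function selectDueUsers(all: ScheduleSettings[], now: Date): string[] {
  return all.filter((s) => isReportDue(s, now)).map((s) => s.userId);
}
```

A periodic timer could call `selectDueUsers` with the rows loaded from the database, while the on-demand endpoint would skip this check and generate a report for the requesting user directly.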
The web interface handles:
- Authentication
- News source configuration
- Preferences (evaluation prompt and report scheduled time)
- Report history and triggering of on-demand generation
It uses:
- Framework: Next.js 15 with App Router, using Server Components and Server Actions
- The Data Access Layer pattern (described in the article "How to Think About Security in Next.js") is used to separate code that accesses the database from other concerns
- The combination of Server Actions and useActionState allows forms (e.g., login) to be submitted even with JavaScript disabled
- UI Components: shadcn/ui + Tailwind CSS
- Authentication: Better Auth
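As a rough illustration of the Data Access Layer pattern mentioned above, the sketch below keeps the authorization check and the choice of returned fields inside a single data access function. All names here (`Session`, `SourceRow`, `SourceDTO`, `getSourcesForUser`) and the in-memory table are hypothetical stand-ins; in the real webapp the session would come from Better Auth and the query would go through Drizzle asynchronously.

```typescript
// Hypothetical types for illustration only.
interface Session {
  userId: string;
}
interface SourceRow {
  id: string;
  userId: string;
  url: string;
  selector: string;
  internalNotes: string; // a field that should never reach the client
}
interface SourceDTO {
  id: string;
  url: string;
  selector: string;
}

// In-memory stand-in for the database table so the sketch runs on its own.
const sourceTable: SourceRow[] = [
  { id: "src_1", userId: "user_1", url: "https://example.com/news", selector: "a.headline", internalNotes: "x" },
  { id: "src_2", userId: "user_2", url: "https://example.org/blog", selector: "h2 a", internalNotes: "y" },
];

/**
 * Data access function: the only place that reads the table.
 * It enforces authorization and returns a minimal DTO, so pages and
 * Server Actions never handle raw rows. (A real query would be async.)
 */
function getSourcesForUser(session: Session | null): SourceDTO[] {
  if (!session) throw new Error("Not authenticated");
  return sourceTable
    .filter((row) => row.userId === session.userId)
    .map(({ id, url, selector }) => ({ id, url, selector }));
}
```

Because components only ever call functions like `getSourcesForUser`, an audit of who can read what is limited to the data access files.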
The processing server is a Node.js server that handles:
- Scheduled report generation
- On-demand report creation
- Article fetching and parsing
- LLM-based content evaluation
- Email sending
It uses:
- Email Service: Resend and React Email
- LLM Integration: OpenAI-compatible API (I suggest using OpenRouter for easy access to many models)
- Note that the service places a rate limit on the
- Web Scraping: Cheerio for HTML parsing and Readability for extracting article contents
- HTTP Server: Fastify
The following steps are used to generate a report:
- For each news source, visit the provided page and apply the configured CSS selector to retrieve a list of article links
- For each gathered article link, visit the page and use Readability to extract the main article text
- Put the gathered articles in batches and, for each batch, submit a request to the LLM that includes a system prompt, the user's evaluation prompt and each article's text
  - The LLM is asked to respond either yes or no to whether each article should be selected, and to give a brief explanation of its answer to facilitate tuning the prompt
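The batching and evaluation step above can be sketched as follows. This is a simplified illustration, not the project's actual code: the `Article` and `Verdict` types, the function names, and especially the assumed reply format (one line per article, such as `1: yes - reason`) are assumptions made for the example.

```typescript
interface Article {
  url: string;
  title: string;
  text: string;
}
interface Verdict {
  index: number; // zero-based position of the article within its batch
  selected: boolean;
  reason: string;
}
type ChatMessage = { role: "system" | "user"; content: string };

/** Splits items into consecutive batches of at most batchSize elements. */
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

/** Builds the chat messages for one batch: system prompt, then user prompt plus numbered articles. */
function buildMessages(
  systemPrompt: string,
  evaluationPrompt: string,
  batch: Article[],
): ChatMessage[] {
  const articles = batch
    .map((a, i) => `Article ${i + 1}: ${a.title}\n${a.text}`)
    .join("\n\n");
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: `${evaluationPrompt}\n\n${articles}` },
  ];
}

/** Parses lines in the assumed format "<number>: yes|no - <reason>"; other lines are ignored. */
function parseVerdicts(reply: string): Verdict[] {
  const verdicts: Verdict[] = [];
  for (const line of reply.split("\n")) {
    const match = /^\s*(\d+)\s*:\s*(yes|no)\b\s*[-:]?\s*(.*)$/i.exec(line);
    if (!match) continue;
    verdicts.push({
      index: Number(match[1]) - 1,
      selected: match[2].toLowerCase() === "yes",
      reason: match[3].trim(),
    });
  }
  return verdicts;
}
```

The messages returned by `buildMessages` have the shape expected by an OpenAI-compatible chat completions request, and keeping the explanation next to each yes/no answer is what makes it possible to see why an article was dropped when tuning the evaluation prompt.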
Across both the webapp and the processing server, the codebase uses:
- Database: Drizzle ORM with Neon Postgres
- KSUIDs with Stripe-style prefixes as primary keys
- Error Tracking: Sentry
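For readers unfamiliar with the ID scheme mentioned above, here is a small sketch of generating a KSUID and adding a Stripe-style prefix. It follows the public KSUID layout (a 4-byte timestamp counted from the KSUID epoch, followed by 16 random bytes, base62-encoded to 27 characters). The function names and the `src` prefix are hypothetical, and the project may well use a library for this instead.

```typescript
import { randomBytes } from "node:crypto";

const KSUID_EPOCH_SECONDS = 1_400_000_000; // KSUID timestamps count from this offset
const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/** Encodes a big-endian byte array as base62 using repeated long division by 62. */
function base62Encode(bytes: Uint8Array): string {
  const digits = Array.from(bytes);
  let out = "";
  while (digits.some((d) => d !== 0)) {
    let remainder = 0;
    for (let i = 0; i < digits.length; i++) {
      const acc = remainder * 256 + digits[i];
      digits[i] = Math.floor(acc / 62);
      remainder = acc % 62;
    }
    out = BASE62[remainder] + out;
  }
  return out;
}

/** Generates a 27-character KSUID: 4-byte timestamp plus 16 random bytes. */
function generateKsuid(now: Date = new Date()): string {
  const seconds = Math.floor(now.getTime() / 1000) - KSUID_EPOCH_SECONDS;
  const bytes = new Uint8Array(20);
  bytes[0] = (seconds >>> 24) & 0xff;
  bytes[1] = (seconds >>> 16) & 0xff;
  bytes[2] = (seconds >>> 8) & 0xff;
  bytes[3] = seconds & 0xff;
  bytes.set(randomBytes(16), 4);
  return base62Encode(bytes).padStart(27, "0");
}

/** Builds a prefixed ID such as "src_<ksuid>"; the prefix names the entity type. */
function newId(prefix: string): string {
  return `${prefix}_${generateKsuid()}`;
}
```

Because the timestamp occupies the most significant bytes, IDs created in different seconds sort by creation time, and the prefix makes it obvious at a glance which table an ID belongs to.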
Possible future additions:
- RSS feed support (both for inputting news items and for outputting the filtered feed)
- Multiple evaluation prompts per user (e.g., per source)
- Generating article summaries
To run locally:
- Create `./.env` and `./web/.env` based on the corresponding `.env.example` files
  - A Neon database URL is expected, but other drivers for Postgres can easily be used by tweaking `src/lib/db/index.ts`
- Run `pnpm install` to install top-level dependencies
- Run `pnpm run drizzle-kit push` to push the schema to your Postgres database
- Run `pnpm run build && pnpm run start` to start the processing server (`pnpm run dev` is also available for watching dev changes)
- Run `cd web && pnpm install && pnpm run build && pnpm run start` to start the webapp (`pnpm run dev` is also available)
To run on the cloud:
- Run `pnpm install` and `POSTGRES_URL=YOUR_URL_HERE pnpm run drizzle-kit push` locally to push the schema to your Postgres database
- Deploy the repo as a Next.js project to a host such as Vercel, using `./web` as the base directory and using `./web/.env.example` as a template for the environment variables to configure
- Deploy the processing server as a Docker image to a host such as Railway, Render or GCP, using `Dockerfile` to automatically build and run the server and using `./.env.example` as a template for the environment variables to configure

