NEXUS Research is a full-stack multi-agent AI application that transforms a single research question into a structured executive brief. Four specialized agents (SCOUT, ANALYST, CRITIC, and SCRIBE) collaborate through a live pipeline powered by Groq LLaMA 3.3 / 3.1 models, streaming their progress over WebSockets while rendering a real-time agent topology graph, source evidence, adversarial critique, and an exportable PDF report.
## Table of Contents

- Overview
- Application Preview
- Features
- Architecture
- Tech Stack
- Project Structure
- Installation
- Usage
- The Four Agents
- Token Optimization
- API Reference
- Configuration
- Running Modes
- Testing
- Security Notes
- Contributing
## Overview

NEXUS Research is a demonstration of how a small, transparent pipeline of specialized LLM agents can outperform a single monolithic prompt, and do it on a Groq free-tier budget. Instead of hiding reasoning behind one opaque call, NEXUS decomposes the research task into four visible stages and streams every intermediate state to the browser.
Users can:
- Enter a plain-English research question, pick a depth (standard / deep) and a mode (auto / live / demo)
- Watch a live ReactFlow agent graph animate as each agent takes over, powered by a FastAPI WebSocket
- See sources, findings, synthesis, critique, and the final brief populate the UI in real time
- Export the completed brief as a branded, searchable PDF with one click
- Run fully offline in demo mode (deterministic content, no API key required)
The backend is built with FastAPI and uses the Groq Python SDK against llama-3.3-70b-versatile and llama-3.1-8b-instant, with DuckDuckGo (AsyncDDGS + httpx HTML fallback) for web search and BeautifulSoup for page scraping. Every agent has its own model, temperature, and token cap, routed to keep a full run under a ~12k token budget on the free tier.
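The per-agent routing described above can be sketched as a small, env-driven config table. This is a minimal sketch, not the project's actual `settings.py`: the `AgentConfig` name is hypothetical and the temperature values are illustrative picks from the 0.1–0.3 range mentioned later, while the model names and token caps mirror the defaults listed under Configuration.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentConfig:
    """Illustrative per-agent settings: model, sampling temperature, output cap."""
    model: str
    temperature: float
    max_tokens: int

# Routing table: cheap 8B model for SCOUT planning, the 70B model for
# the heavier reasoning agents. Temperatures here are assumptions.
AGENT_CONFIGS = {
    "SCOUT":   AgentConfig(os.getenv("GROQ_MODEL_SCOUT", "llama-3.1-8b-instant"), 0.2, 320),
    "ANALYST": AgentConfig(os.getenv("GROQ_MODEL_ANALYST", "llama-3.3-70b-versatile"), 0.2, 700),
    "CRITIC":  AgentConfig(os.getenv("GROQ_MODEL_CRITIC", "llama-3.3-70b-versatile"), 0.3, 600),
    "SCRIBE":  AgentConfig(os.getenv("GROQ_MODEL_SCRIBE", "llama-3.3-70b-versatile"), 0.1, 1000),
}
```

Keeping the caps in one table makes the whole token budget auditable at a glance.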
## Features

| Feature | Description |
|---|---|
| SCOUT Agent | Expands a single question into 3–12 targeted search queries, then hits DuckDuckGo (async) and dedupes / de-ads the results |
| ANALYST Agent | Fetches each source, extracts the top findings with Groq, and synthesizes themes, consensus, conflicts, and knowledge gaps |
| CRITIC Agent | Adversarially reviews the synthesis for logical flaws, missing perspectives, bias risk, overstatements, and reliability concerns |
| SCRIBE Agent | Produces the final executive brief: summary, background, analysis, critical perspectives, key findings, recommendations, conclusion |
| Live WebSocket Stream | Normalized agent events (running / searching / reading / complete / error) push to the browser as they happen |
| ReactFlow Topology Graph | Animated live pipeline visualization: edges flow when agents are active, colors match agent roles |
| Semantic Prompt Cache | Hash-based response cache keyed on (prompt + model + schema); repeated identical prompts cost zero tokens |
| Per-Run Token Budget | Hard guardrail (default 12k tokens/run) stops the pipeline gracefully before the Groq free tier runs dry |
| Model Routing | Cheap llama-3.1-8b-instant for SCOUT planning, llama-3.3-70b-versatile for ANALYST / CRITIC / SCRIBE |
| Graceful Fallback | If DuckDuckGo rate-limits or scraping fails, SCOUT falls back to deterministic demo sources so the run never hangs |
| Native PDF Export | Text-based jsPDF export: selectable, searchable, and not broken by Tailwind v4 oklch() colors |
| Fully Tested | Pytest on the backend, Vitest + Testing Library on the frontend, both green |
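The semantic prompt cache row above can be illustrated with a minimal sketch: hash the normalized prompt together with the model and response schema, and reuse the stored response on an exact key match. The `PromptCache` class and its method names are hypothetical, not the actual `groq_service.py` API.

```python
import hashlib
import json

class PromptCache:
    """Exact-match response cache keyed on (normalized prompt, model, schema)."""

    def __init__(self):
        self._store = {}

    def _key(self, prompt: str, model: str, schema: dict) -> str:
        # Normalize whitespace and case so trivially different prompts hit the same entry
        normalized = " ".join(prompt.split()).lower()
        payload = json.dumps([normalized, model, schema], sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def get(self, prompt: str, model: str, schema: dict):
        return self._store.get(self._key(prompt, model, schema))

    def put(self, prompt: str, model: str, schema: dict, response) -> None:
        self._store[self._key(prompt, model, schema)] = response

cache = PromptCache()
cache.put("Plan search queries", "llama-3.1-8b-instant", {"type": "array"}, ["q1", "q2"])

# Whitespace/case differences normalize to the same key, so this lookup is a hit:
hit = cache.get("  plan   search QUERIES ", "llama-3.1-8b-instant", {"type": "array"})
```

On a hit, the cached response is returned without calling Groq, so the repeat costs zero tokens.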
## Architecture

```
+----------------------------------------------------------------------+
|                      Browser / React 19 + Vite                       |
|                                                                      |
|  +-------------+  +-------------+  +-------------+  +-------------+  |
|  | LandingPage |  | AgentGraph  |  | LiveFeed    |  | Research    |  |
|  | (Question + |  | (ReactFlow  |  | (WebSocket  |  | Report +    |  |
|  |  Depth +    |  |  live edges |  |  events     |  | PDF Export  |  |
|  |  Mode)      |  |  animated)  |  |  feed)      |  |             |  |
|  +------+------+  +------+------+  +------+------+  +------+------+  |
|         |                |                |                |         |
|   POST /research  GET /research/:id    WS /ws              |         |
+---------+----------------+----------------+----------------+--------+
          |                |                |                |
          v                v                v                v
+----------------------------------------------------------------------+
|                      FastAPI Backend (main.py)                       |
|                                                                      |
|  +----------------------------------------------------------------+  |
|  |                  Agent Pipeline (sequential)                   |  |
|  |                                                                |  |
|  |  SCOUT ----> ANALYST ----> CRITIC ----> SCRIBE                 |  |
|  |  (plan +     (fetch +      (review +    (final brief)          |  |
|  |   search)     extract +     flag bias)                         |  |
|  |               synthesize)                                      |  |
|  +----------------------------------------------------------------+  |
|                                                                      |
|  +------------------+   +------------------+   +------------------+  |
|  | groq_service.py  |   | search_service   |   | scraper_service  |  |
|  | - model routing  |   | - AsyncDDGS +    |   | - httpx + bs4    |  |
|  | - prompt cache   |   |   httpx fallback |   | - concurrent     |  |
|  | - token budget   |   | - ad/redirect    |   |   fetch          |  |
|  | - JSON retries   |   |   filter         |   |                  |  |
|  +------------------+   +------------------+   +------------------+  |
|                                                                      |
|  +------------------+   +------------------+   +------------------+  |
|  | run_store.py     |   | demo_service.py  |   | settings.py      |  |
|  | In-memory run    |   | Deterministic    |   | env-driven       |  |
|  | state + events   |   | fallback path    |   | config dataclass |  |
|  +------------------+   +------------------+   +------------------+  |
+----------------------------------------------------------------------+
                                   |
                                   v
                     +------------------------+
                     |     Groq Cloud API     |
                     | llama-3.3-70b + 3.1-8b |
                     +------------------------+
```
## Tech Stack

| Layer | Technology |
|---|---|
| Frontend | React 19, Vite 8, Tailwind CSS 4, Framer Motion, @xyflow/react, Axios, jsPDF |
| Backend | FastAPI, Uvicorn, Pydantic 2, Python 3.11+ |
| LLM Provider | Groq Cloud: llama-3.3-70b-versatile (ANALYST / CRITIC / SCRIBE), llama-3.1-8b-instant (SCOUT) |
| Web Search | duckduckgo-search (AsyncDDGS) with httpx HTML fallback |
| Scraping | httpx + BeautifulSoup4 |
| Streaming | Native FastAPI WebSockets with JSON-encoded AgentEvent broadcasts |
| Testing | Pytest (backend), Vitest + Testing Library (frontend) |
| Export | jsPDF (native text PDF: no screenshot step, no oklch() issues) |
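The streaming row can be sketched as a tiny in-process broadcaster that fans JSON-encoded events out to subscriber queues, roughly what a connection manager around FastAPI WebSockets does. The `Broadcaster` class below is illustrative, not the actual backend code; the event fields follow the `AgentEvent` example shown in the API reference.

```python
import asyncio
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentEvent:
    run_id: str
    event: str
    agent: str
    status: str
    message: str

class Broadcaster:
    """Fan out JSON-encoded events to every subscribed client queue."""

    def __init__(self):
        self._subscribers: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        q = asyncio.Queue()
        self._subscribers.append(q)
        return q

    async def publish(self, ev: AgentEvent) -> None:
        payload = json.dumps(asdict(ev))  # one JSON string per event
        for q in self._subscribers:
            await q.put(payload)

async def demo():
    bus = Broadcaster()
    client = bus.subscribe()  # in the real app, one queue per WebSocket connection
    await bus.publish(AgentEvent("run-1", "agent_status", "SCOUT", "searching", "Running queries..."))
    return json.loads(await client.get())

event = asyncio.run(demo())
```

In the real backend, each WebSocket handler would drain its queue and forward payloads to the browser.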
## Project Structure

```
nexus-research/
│
├── backend/
│   ├── main.py                # FastAPI app: /research, /research/:id, /latest, /status, /ws
│   ├── requirements.txt       # Python deps
│   ├── .env                   # GROQ_API_KEY + token caps + model routing
│   │
│   ├── agents/
│   │   ├── scout.py           # Query planning + web search
│   │   ├── analyst.py         # Source reading + extraction + synthesis
│   │   ├── critic.py          # Adversarial review + confidence rating
│   │   └── scribe.py          # Final executive brief synthesis
│   │
│   ├── services/
│   │   ├── groq_service.py    # Groq SDK wrapper, prompt cache, token budget
│   │   ├── search_service.py  # AsyncDDGS + httpx fallback + ad filter
│   │   ├── scraper_service.py # httpx + BeautifulSoup page fetcher
│   │   ├── demo_service.py    # Deterministic offline pipeline
│   │   ├── run_store.py       # In-memory run state + event log
│   │   └── settings.py        # Env-driven dataclass config
│   │
│   ├── models/
│   │   └── schemas.py         # Pydantic models: coercive validators for robust LLM JSON
│   │
│   └── tests/
│       └── test_api.py        # Pytest suite: demo-mode end-to-end
│
├── frontend/
│   ├── src/
│   │   ├── App.jsx            # Root: landing -> research workspace
│   │   ├── main.jsx           # Vite entry
│   │   ├── pages/
│   │   │   ├── LandingPage.jsx    # Hero + preview cards + start form
│   │   │   └── ResearchPage.jsx   # Live workspace: graph, feed, sources, report
│   │   ├── components/
│   │   │   ├── AgentGraph.jsx     # ReactFlow topology with animated edges
│   │   │   ├── AgentCard.jsx      # Per-agent status card with live pulse
│   │   │   ├── LiveFeed.jsx       # Scrollable WebSocket event log
│   │   │   ├── ProgressRing.jsx   # Conic-gradient pipeline progress
│   │   │   ├── SearchBar.jsx      # Question + depth + mode form
│   │   │   ├── SourceCard.jsx     # Favicon + domain + snippet + numbered badge
│   │   │   ├── CritiquePanel.jsx  # Adversarial findings + confidence chip
│   │   │   ├── ResearchReport.jsx # Final report sections
│   │   │   └── ExportButton.jsx   # Native jsPDF export
│   │   ├── hooks/
│   │   │   ├── useResearch.js     # Run lifecycle, polling, 404 recovery
│   │   │   └── useWebSocket.js    # Reconnecting WS client
│   │   ├── services/
│   │   │   ├── api.js             # Axios wrappers
│   │   │   └── agentMeta.js       # Agent colors, tones, status enums
│   │   └── styles/
│   │       └── globals.css        # Tailwind + custom panel / graph styles
│   ├── package.json
│   └── vite.config.js
│
├── sample_outputs/
│   └── sample_report.md       # Example completed research brief
│
├── docs/
│   └── screenshots/           # README preview images
│
├── DECISIONS.md               # Architecture decisions + trade-offs
└── README.md
```
## Installation

### Prerequisites

- Python 3.11+
- Node.js 18+
- A Groq API key (the free tier works great); optional if you only want demo mode

### Clone the repository

```bash
git clone https://github.com/your-username/nexus-research.git
cd nexus-research
```

### Backend setup

```bash
cd backend
python -m venv .venv

# Activate the virtual environment
source .venv/bin/activate       # Linux / macOS
.venv\Scripts\Activate.ps1      # Windows PowerShell

pip install -r requirements.txt
```

Create a `.env` file inside `backend/`:

```env
# Groq (optional; leave empty to force demo mode)
GROQ_API_KEY=your_groq_api_key_here

# Model routing (optional; these are the defaults)
GROQ_MODEL_SCOUT=llama-3.1-8b-instant
GROQ_MODEL_ANALYST=llama-3.3-70b-versatile
GROQ_MODEL_CRITIC=llama-3.3-70b-versatile
GROQ_MODEL_SCRIBE=llama-3.3-70b-versatile

# Token caps per agent call (optional; defaults shown)
NEXUS_TOKEN_CAP_SCOUT=320
NEXUS_TOKEN_CAP_ANALYST_EXTRACT=700
NEXUS_TOKEN_CAP_ANALYST_SYNTHESIS=600
NEXUS_TOKEN_CAP_CRITIC=600
NEXUS_TOKEN_CAP_SCRIBE=1000

# Per-run hard budget; the pipeline stops gracefully if exceeded
NEXUS_TOKEN_BUDGET_PER_RUN=12000

# Search timeout (seconds)
NEXUS_SEARCH_TIMEOUT_SECONDS=12.0
```

Start the API:

```bash
uvicorn main:app --reload --host 127.0.0.1 --port 8000
```

The API runs at http://localhost:8000 · interactive docs at http://localhost:8000/docs.

### Frontend setup

```bash
cd ../frontend
npm install
```

Create a `.env` file in `frontend/` (optional; these are the defaults):

```env
VITE_API_URL=http://localhost:8000
VITE_WS_URL=ws://localhost:8000/ws
```

Start the dev server:

```bash
npm run dev
```

The frontend runs at http://localhost:5173.
## Usage

### Start a research run

1. Open the app at http://localhost:5173
2. Type a research question, e.g. "How are research teams using AI in due diligence?"
3. Choose a **Depth**: `standard` (3 sources) or `deep` (5 sources)
4. Choose a **Mode**:
   - `auto`: use Groq if the key is set, otherwise fall back to demo
   - `live`: require Groq (errors if no key)
   - `demo`: deterministic local content, no API calls
5. Click **Start Research**

### Watch the pipeline

The research workspace shows everything that happens in real time:

- **Run topology graph**: animated ReactFlow edges flow between SCOUT → ANALYST → CRITIC → SCRIBE as each agent takes over
- **Progress ring**: a conic gradient shows how many of the four agents are complete
- **Live feed**: every WebSocket event (`searching`, `reading`, `challenging`, `writing`, ...) streams in
- **Agent cards**: per-agent status, current message, and a live pulsing indicator

### Review the results

When SCRIBE finishes, three panels populate:

- **Discovered Evidence**: every source kept, with favicon, domain chip, snippet, and numbered badge
- **Critic Review**: adversarial findings grouped by category, with a color-coded confidence badge (`HIGH` / `MEDIUM` / `LOW`) and justification
- **Final Report**: Executive Summary, Background, Analysis, Critical Perspectives, Conclusion, Key Findings, Recommendations, Cited Sources

### Export the brief

Click **Export PDF** at the top of the final report. NEXUS generates a native text PDF (selectable, searchable, branded with a cover header and page footers): no screenshot step, no color-parsing issues.
## The Four Agents

| Agent | Role | Default Model | Output |
|---|---|---|---|
| SCOUT | Web reconnaissance | `llama-3.1-8b-instant` | 3–12 search queries, deduped ranked sources |
| ANALYST | Evidence extraction | `llama-3.3-70b-versatile` | Per-source findings + synthesis (themes, consensus, conflicts, gaps) |
| CRITIC | Adversarial review | `llama-3.3-70b-versatile` | Logical flaws, missing perspectives, bias risk, overstatements, confidence rating |
| SCRIBE | Report synthesis | `llama-3.3-70b-versatile` | Full executive brief: 5 prose sections + key findings + recommendations |

The agents run sequentially; each emits structured WebSocket events (running → working status → complete) and pipes its output into the next agent's context.
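The sequential hand-off can be sketched in a few lines: each agent receives the accumulated context, emits a running and a complete event, and writes its output back for the next stage. The agent callables and event shapes here are simplified stand-ins, not the real agent modules.

```python
def run_pipeline(question, agents, emit):
    """Run agents in order, emitting running -> complete events and
    piping each agent's output into the shared context for the next one."""
    context = {"question": question}
    for name, agent_fn in agents:
        emit({"agent": name, "status": "running"})
        context[name.lower()] = agent_fn(context)   # output feeds the next agent
        emit({"agent": name, "status": "complete"})
    return context

# Toy two-stage run: a fake SCOUT produces sources, a fake ANALYST counts them
events = []
result = run_pipeline(
    "What is X?",
    [("SCOUT", lambda ctx: ["src1", "src2"]),
     ("ANALYST", lambda ctx: {"findings": len(ctx["scout"])})],
    events.append,
)
```

The real pipeline follows the same shape with four stages and a WebSocket broadcaster in place of `events.append`.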
## Token Optimization

NEXUS is engineered to run a full 4-agent pipeline inside the Groq free-tier budget. Key techniques applied in `groq_service.py` and `settings.py`:

| Technique | Impact |
|---|---|
| Model routing | SCOUT (planning) uses the cheap 8B model; heavy reasoning agents use 70B |
| Strict per-agent `max_tokens` | Hard caps stop verbose outputs before they bloat the next agent's prompt |
| Low temperatures (0.1–0.3) | Deterministic outputs mean fewer retries |
| Semantic prompt cache | Hash of (normalized_prompt + model + schema); identical repeats cost 0 tokens |
| Field pruning in prompts | Only the top-N findings pass to CRITIC / SCRIBE instead of the full set |
| Per-run token budget | `NEXUS_TOKEN_BUDGET_PER_RUN`; the pipeline returns a partial report rather than failing hard |
| JSON retry guardrail | Smart retries on malformed JSON avoid silent token waste on invalid completions |
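A minimal sketch of the per-run budget guardrail: accumulate token usage and refuse any call that would cross the cap, so the caller can stop and return a partial report instead of failing hard. The `TokenBudget` and `BudgetExceeded` names are illustrative, not the actual `groq_service.py` internals.

```python
class BudgetExceeded(Exception):
    """Raised when a call would push the run past its token cap."""

class TokenBudget:
    """Per-run guardrail: stop the pipeline before the free tier runs dry."""

    def __init__(self, limit: int = 12_000):
        self.limit = limit
        self.used = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.limit:
            raise BudgetExceeded(f"{self.used + tokens} > {self.limit}")
        self.used += tokens   # only commit usage when the call is allowed

budget = TokenBudget(limit=1000)
budget.charge(700)            # first call fits
try:
    budget.charge(400)        # would exceed: caller returns a partial report
    exceeded = False
except BudgetExceeded:
    exceeded = True
```

Charging before each Groq call (using the response's reported usage afterwards to true up) keeps the guardrail cheap and conservative.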
## API Reference

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/research` | Create a new research run. Body: `{ question, depth, mode }` |
| `GET` | `/research/{run_id}` | Fetch current run status + result (if complete) |
| `GET` | `/latest` | Fetch the most recent completed run |
| `GET` | `/status` | API health + whether Groq is configured |
| `WS` | `/ws` | Stream normalized `AgentEvent` objects for all active runs |

Example: start a run:

```bash
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"question":"How are research teams using AI in due diligence?","depth":"standard","mode":"auto"}'
```

Example `AgentEvent` broadcast:

```json
{
  "run_id": "a1b2c3d4-...",
  "event": "agent_status",
  "agent": "SCOUT",
  "status": "searching",
  "message": "Running 4 search queries..."
}
```

## Configuration

### Backend (`backend/.env`)

```env
GROQ_API_KEY=...                            # leave empty to force demo

GROQ_MODEL_SCOUT=llama-3.1-8b-instant
GROQ_MODEL_ANALYST=llama-3.3-70b-versatile
GROQ_MODEL_CRITIC=llama-3.3-70b-versatile
GROQ_MODEL_SCRIBE=llama-3.3-70b-versatile

NEXUS_TOKEN_CAP_SCOUT=320
NEXUS_TOKEN_CAP_ANALYST_EXTRACT=700
NEXUS_TOKEN_CAP_ANALYST_SYNTHESIS=600
NEXUS_TOKEN_CAP_CRITIC=600
NEXUS_TOKEN_CAP_SCRIBE=1000

NEXUS_TOKEN_BUDGET_PER_RUN=12000
NEXUS_SEARCH_TIMEOUT_SECONDS=12.0

CORS_ORIGINS=*                              # restrict in production
```

### Frontend (`frontend/.env`)

```env
VITE_API_URL=http://localhost:8000
VITE_WS_URL=ws://localhost:8000/ws
```

## Running Modes

| Mode | Behavior | When to use |
|---|---|---|
| `auto` | Use Groq if `GROQ_API_KEY` is set, otherwise fall back to demo | Default; safe for any environment |
| `live` | Require Groq; errors out if no key is configured | Production / real research |
| `demo` | Deterministic offline content, no external calls | Screenshots, demos, CI, offline dev |
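The table above can be expressed as a small resolution function. This is illustrative logic mirroring the documented behavior, not the actual backend code:

```python
def resolve_mode(requested: str, has_groq_key: bool) -> str:
    """Resolve the effective run mode from the requested mode and key presence."""
    if requested == "auto":
        # auto degrades gracefully: live when a key exists, demo otherwise
        return "live" if has_groq_key else "demo"
    if requested == "live" and not has_groq_key:
        raise RuntimeError("live mode requires GROQ_API_KEY")
    return requested
```

`demo` always resolves to itself, so screenshots and CI runs never touch the network.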
## Testing

```bash
# Backend: Pytest
cd backend
pytest

# Frontend: Vitest (single run)
cd ../frontend
npm test -- --run

# Frontend: lint + production build
npm run lint
npm run build
```

## Security Notes

This project is built for local development and demo deployments. Before any public deployment:
- The backend defaults to `CORS_ORIGINS=*`; restrict this to your actual frontend domain
- `run_store.py` is an in-memory dict; runs don't survive restarts and aren't safe for multi-tenant production use, so swap in Redis or a database
- Never commit your `.env` or expose `GROQ_API_KEY` publicly
- The scraper follows redirects and fetches arbitrary URLs; consider adding a domain allowlist and hard timeouts in production (both are already in place as sane defaults)
- `demo` mode is a safe fallback that makes no external network calls
## Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/your-feature`
3. Commit your changes: `git commit -m 'Add your feature'`
4. Push: `git push origin feature/your-feature`
5. Open a Pull Request
Ideas for improvement:

- Persistent run storage (Postgres / Redis)
- Multi-turn follow-up questions
- Streaming Groq responses directly into the report UI
- Docker Compose for one-command setup
- User authentication and per-user run history
- Alternative LLM providers (Cerebras, Together, Gemini)
- Adaptive pipeline depth based on confidence
- Export to DOCX / Notion / Markdown
- Embedding-based semantic cache
This project is licensed under the MIT License. See LICENSE.
Made with ❤️ for anyone who's ever wanted to watch AI agents think out loud.
⭐ Star this repo if you find it useful!





