Update README and demo documentation for clarity and consistency
- Revised README.md to enhance clarity on the docproc CLI functionality and its integration with the demo application.
- Updated demo/docker-compose.yml to reflect the new naming convention for the demo ecosystem.
- Adjusted demo/README.md to align with the new branding of "docproc // edu".
- Modified frontend components to consistently use the updated branding in headers and PDF exports.
This commit aims to improve the overall documentation and user experience by ensuring consistent terminology and clearer instructions.
docproc turns documents into markdown. Give it a PDF, DOCX, PPTX, or XLSX; you get clean text and every image (equations, diagrams, labels) explained by a vision model. It's CLI only. Works with OpenAI, Azure, Anthropic, Ollama, or LiteLLM.
The **docproc // edu** demo in [demo/](demo/) is a full study workspace: upload docs, chat over them, generate notes and flashcards, create and take assessments. That app is written in Go and calls this CLI when a document is uploaded; it does grading itself.
---
## What the CLI does
**Extract.** `docproc --file input.pdf -o output.md` — pulls text from the native layer and runs vision on every embedded image (equations, diagrams, labels). Optional extra pass: tidy markdown, LaTeX math, strip boilerplate (see `ingest.use_llm_refine` in config).

**Config.** `docproc.yaml` holds AI providers and ingest options. No database or server needed for extract. Use `docproc init-config --env .env` once to generate a starter config from your `.env`.

## Quick Start (CLI only)

```bash
# One-time: write ~/.config/docproc/docproc.yml from .env
uv run docproc init-config --env .env

# Extract a document to markdown
uv run docproc --file input.pdf -o output.md
```
## Demo (docproc // edu)
See [demo/README.md](demo/README.md). From `demo/`, run `docker compose up -d` (stack name: **docproc-edu**). Then start the Go API and worker from `demo/go/`, and the React app from `demo/web/`. The worker runs the docproc CLI on each uploaded document.
## Configuration
Create `docproc.yaml` or generate one from `.env` with `init-config`. For both the CLI and the demo, the parts that matter are AI providers and ingest:

```yaml
ai_providers:
  - provider: openai   # or azure, anthropic, ollama, litellm
primary_ai: openai

ingest:
  use_vision: true   # PDF: extract text + vision for images
```
| [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) | Pipeline, CLI vs API |
| [docs/AZURE_SETUP.md](docs/AZURE_SETUP.md) | Azure OpenAI and Vision setup |
| [docs/ASSESSMENTS_AI.md](docs/ASSESSMENTS_AI.md) | Assessments and grading in the demo |
**Environment:** `DOCPROC_CONFIG` for config path (default: `docproc.yaml`). Provider keys: `OPENAI_API_KEY`, `AZURE_OPENAI_*`, `ANTHROPIC_API_KEY`, etc. See [.env.example](.env.example).
## Contributing
Pull requests welcome. Run the tests before sending.
## License
MIT. See [LICENSE.md](LICENSE.md).
---

## Why I built this
I learn by asking questions. Not surface-level ones, but the deep "why"s that most materials never answer. When my peers studied from slides and PDFs, I got stuck. I couldn't absorb content I wasn't allowed to interrogate. Documents don't talk back. They don't explain the intuition or the connections. Tools like NotebookLM didn't help: they don't understand images in the source, so those parts showed up blank. Most of my slides were visual or screenshots. I had nothing to work with.
So I built something for myself. A way to pull content out of any document (slides, papers, textbooks) and ask AI the questions I needed. *Why does this work? What's the reasoning here? How does this connect to what we did last week?* It grew from "extract and query" into a full study environment: chat over the corpus, generate notes and flashcards, create and take assessments with automatic grading. For the first time I could learn from static documents by *conversing*, *noting*, and *testing*, not just re-reading.
I'm open-sourcing it because I'm probably not the only one who learns this way.
---

`demo/README.md`:

# docproc // edu
Full-stack demo: Go API + React frontend. Document processing is done by the **docproc** CLI (Python). This app handles uploads, storage (LocalStack S3), message queue (RabbitMQ), RAG (PgVector), and **assessment grading** (single-select, formula, conceptual, derivation) in Go.
---

`demo/web/WEB_APP_SPEC.md`:

This document describes **every feature** of the `demo/web/` frontend in detail.
## 1. Product Overview
- **Name / branding:** “docproc // edu” (shown in header and PDF exports).
- **Core value:** One workspace per **project**; each project has **documents** that are processed and indexed. All study features (chat, notes, flashcards, tests) are grounded in those documents.
- **Primary user flow:** Create/select project → Add documents → Wait for processing → Use Converse (chat), Notes, Flashcards, or create/take Assessments. Sources canvas manages documents.
- **No auth in app:** Assumes backend is configured (API base URL, optional namespace). Settings view shows API status only.
- “+ Add section”, “Download PDF”.
- List of note sections: each is a textarea (auto-save debounced 600 ms via `updateNote`); metadata: source filename, updated time; Saving…/Saved indicator.
- “Add section” creates a note with optional “Section for: (unknown)” if a doc is selected.
- **Download PDF:** jsPDF; header “docproc // edu”, “Project Notes”, project id and date; generated summary (if any), then each section with optional “Section N — (unknown)”; filename `docproc-notes-{projectId}-{timestamp}.pdf`.
**NotesModule** (used in StudyDock): Same concepts in a more compact layout for the dock; sections in a scrollable area with max-height.