README.md: 12 additions & 13 deletions
@@ -6,11 +6,11 @@ File Wizard is a self-hosted, browser-based utility for file conversion, OCR, an
 ## Features

-* **Versatile File Conversion:** Convert between various file formats. The system is designed to be extended with any command-line tool (like FFmpeg, ImageMagick, etc.) via a simple `settings.yml` configuration file.
-* **OCR:** Perform Optical Character Recognition (OCR) on PDFs and images to extract text.
-* **Accurate Audio Transcription:** Transcribe audio files into text using Whisper models.
-* **Modern UI & UX:**
-* Clean, responsive, dark-themed interface.
+* Convert between various file formats. The system is designed to be extended with any command-line tool (like FFmpeg, ImageMagick, etc.) via a simple `settings.yml` configuration file.
+* **OCR:** Perform Optical Character Recognition on PDFs and images to extract text.
+* **Audio Transcription:** Transcribe audio files into text using Whisper models.
 * Drag-and-drop support for single or multiple files anywhere on the screen.
 * Traditional multi-file selection buttons.
 * A dialog to choose the desired action (Convert, OCR, Transcribe) for dropped files.
@@ -19,19 +19,17 @@ File Wizard is a self-hosted, browser-based utility for file conversion, OCR, an
 * A persistent job history table displays file names, tasks, submission times, file sizes (input → output), and final status.

 * **Configuration:**
-* A dedicated `/settings` page allows for viewing and editing the configuration directly from the UI.
+* A dedicated `/settings` page.
 * OAuth needs to be configured in the `config/settings.yml` file, you can see the default for a reference. By default, it runs without auth in local mode.
 * Currently it only supports cpu operations, but a future image will include the cuda drivers for running whisper on gpu (torch and cuda is large and I didn't want to inflate the image even more)

 -----

-## Tech Stack
+## NOTE
+Run at your own risk! This app is highly vulnerable to arbitrary code executing if left public and without auth. I'm no security expert this app is intended for local use or usage with an OAuth oidc provider!

-* **Backend:** FastAPI (Python)
-* **Frontend:** Vanilla HTML, CSS, JavaScript (no framework)
-* **Task Queue:** Huey (with a SQLite backend might do redis soon)
-* **Database:** SQLAlchemy (with a SQLite database might go postgres soon)
-* **Configuration:** YAML
+## Tech Stack:
+FastAPI for the Server, Vanilla html, js, css for the frontend (might switch to svelte in the future, kept it light for this), Huey for task queuing and SQlite for any Databse.

 -----

@@ -107,10 +105,11 @@ chmod +x run.sh
 -----

-## Usage 🖱️
+## Usage

 1. Open your browser to `http://127.0.0.1:8000`.
 2. **Drag and drop** any file or multiple files onto the page.
 3. A dialog will appear asking you to choose an action: **Convert**, **OCR**, or **Transcribe**.
 4. Alternatively, use the "Choose File" buttons in any of the three sections.
 5. Your job will appear in the "History" table, and its status will update automatically.
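
The Features list in this diff says conversion can be extended with any command-line tool through `settings.yml`, and the new NOTE warns about arbitrary code execution. As a rough illustration of how those two points relate, here is a minimal sketch of expanding a configured command into an argument list without going through a shell. The template string, the `{input}` and `{output}` placeholders, and the `build_command` helper are all assumptions made for this example; they are not taken from File Wizard's actual configuration schema or code.

```python
import re
import shlex

# Hypothetical placeholder syntax; File Wizard's real settings.yml may differ.
PLACEHOLDER = re.compile(r"\{(input|output)\}")


def build_command(template: str, input_path: str, output_path: str) -> list[str]:
    """Expand a command template into an argv list (illustrative only)."""
    values = {"input": input_path, "output": output_path}
    # Split the template first, then substitute inside each token, so that
    # spaces or quotes in a file path cannot create extra arguments.
    tokens = shlex.split(template)
    # A single regex pass means substituted text is never re-scanned
    # for placeholders.
    return [PLACEHOLDER.sub(lambda m: values[m.group(1)], tok) for tok in tokens]


if __name__ == "__main__":
    argv = build_command("ffmpeg -i {input} {output}", "my song.mp3", "out.wav")
    print(argv)
    # The list form can be handed to subprocess.run(argv, check=True)
    # without shell=True.
```

Passing a list to `subprocess.run` keeps uploaded file names out of shell parsing, but it does not make an unauthenticated public deployment safe, since whoever can edit the configuration still chooses which program runs.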