README.md
Lines changed: 77 additions & 10 deletions
@@ -66,18 +66,23 @@ This project provides a powerful and flexible PDF analysis microservice built wi

 ### 1. Start the Service

-**With GPU support (recommended for better performance):**
+**Standard PDF Analysis (recommended for most users):**
 ```bash
 make start
 ```

-**Without GPU support:**
+**With Translation Features (includes Ollama container):**
 ```bash
-make start_no_gpu
+make start_translation
 ```

 The service will be available at `http://localhost:5060`

+**See all available commands:**
+```bash
+make help
+```
+
 **Check service status:**

 ```bash
@@ -170,8 +175,8 @@ The service provides a comprehensive RESTful API with the following endpoints:

 | Endpoint | Method | Description | Parameters |
 |----------|--------|-------------|------------|
-| `/markdown` | POST | Convert PDF to Markdown (includes segmentation data in zip) | `file`, `fast`, `extract_toc`, `dpi`, `output_file` |
-| `/html` | POST | Convert PDF to HTML (includes segmentation data in zip) | `file`, `fast`, `extract_toc`, `dpi`, `output_file` |
+| `/markdown` | POST | Convert PDF to Markdown (includes segmentation data in zip) | `file`, `fast`, `extract_toc`, `dpi`, `output_file`, `target_languages`, `translation_model` |
+| `/html` | POST | Convert PDF to HTML (includes segmentation data in zip) | `file`, `fast`, `extract_toc`, `dpi`, `output_file`, `target_languages`, `translation_model` |
 | `/visualize` | POST | Visualize segmentation results on the PDF | `file`, `fast` |

 ### OCR & Utility Endpoints
@@ -192,6 +197,8 @@ The service provides a comprehensive RESTful API with the following endpoints:
 - **`types`**: Comma-separated content types to extract (string, default: "all")
 - **`extract_toc`**: Include table of contents at the beginning of the output (boolean, default: false)
 - **`dpi`**: Image resolution for conversion (integer, default: 120)
+- **`target_languages`**: Comma-separated list of target languages for translation (e.g. "Turkish, Spanish, French")
+- **`translation_model`**: Ollama model to use for translation (string, default: "gpt-oss")

 ## 💡 Usage Examples

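As a rough illustration of the two new parameters, the sketch below assembles the form fields for a translated `/markdown` request in Python. It assumes the service accepts these parameters as multipart form fields alongside `file`, and that the third-party `requests` package is installed. The helper names, the `document.md` output name, and the timeout are illustrative and not part of the project.

```python
from typing import Dict, Iterable

SERVICE_URL = "http://localhost:5060"  # address stated in the README


def build_translation_fields(
    output_file: str,
    target_languages: Iterable[str],
    translation_model: str = "gpt-oss",  # documented default model
) -> Dict[str, str]:
    """Assemble the form fields for a /markdown or /html request with translation."""
    if not output_file:
        # Per the README, translations are only included in zip responses,
        # which require an output_file.
        raise ValueError("output_file is required when requesting translations")
    languages = [lang.strip() for lang in target_languages if lang.strip()]
    if not languages:
        raise ValueError("at least one target language is required")
    return {
        "output_file": output_file,
        # The parameter is documented as a comma-separated list.
        "target_languages": ", ".join(languages),
        "translation_model": translation_model,
    }


def convert_with_translation(pdf_path: str, fields: Dict[str, str]) -> bytes:
    """POST a PDF to /markdown and return the response body (assumed to be a zip)."""
    import requests  # third-party; imported here so the helper above needs only the stdlib

    with open(pdf_path, "rb") as pdf:
        response = requests.post(
            f"{SERVICE_URL}/markdown",
            files={"file": pdf},  # assumption: multipart upload under the name "file"
            data=fields,
            timeout=600,  # arbitrary; translation with a local model can be slow
        )
    response.raise_for_status()
    return response.content


# Build the fields only; no request is sent here.
fields = build_translation_fields("document.md", ["Turkish", "Spanish ", "French"])
print(fields)
```

With the service running, `convert_with_translation("document.pdf", fields)` would return the archive bytes, which can then be written to disk and unpacked.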
@@ -254,15 +261,75 @@ curl -X POST http://localhost:5060/markdown \
-> **📋 Segmentation Data**: Format conversion endpoints automatically include detailed segmentation data in the zip output. The resulting zip file contains a `{filename}_segmentation.json` file with information about each detected document segment including:
+> **📋 Segmentation Data & Translations**: Format conversion endpoints automatically include detailed segmentation data in the zip output. The resulting zip file contains:
+> - **Original file**: The converted document in the requested format
+> - **Segmentation data**: `{filename}_segmentation.json` file with information about each detected document segment:
+The `/markdown` and `/html` endpoints support automatic translation of the converted content into multiple languages using Ollama models.

+**Translation Requirements:**
+- The specified translation model must be available in Ollama
+- An `output_file` must be specified (translations are only included in zip responses)
+
+**Supported Translation Models:**
+- Any Ollama-compatible model (e.g., `gpt-oss`, `llama2`, `mistral`, etc.)
+- Models are automatically downloaded if not present locally
+
+**Translation Process:**
+1. The service checks if the specified model is available in Ollama
+2. If not available, it attempts to download the model using `ollama pull`
+3. For each target language, the content is translated while preserving:
+   - Original formatting and structure
+   - Markdown/HTML syntax
+   - Links and references
+   - Image references and tables
+4. Translated files are named: `{filename}_{language}.{extension}`
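Steps 1, 2, and 4 above can be sketched in Python as follows. This is not the service's implementation: it assumes the `ollama` CLI is on the PATH, that `ollama list` prints a header row followed by one `name:tag` entry per line, and that a model given without a tag means `:latest`. `ensure_model`, `translated_filename`, and the stand-in runner are hypothetical names, and the translation call itself (step 3) is left out.

```python
import subprocess
from pathlib import PurePath
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], str]


def run_ollama(args: Sequence[str]) -> str:
    """Run the `ollama` CLI and return its stdout (needs Ollama installed)."""
    done = subprocess.run(["ollama", *args], capture_output=True, text=True, check=True)
    return done.stdout


def ensure_model(model: str, run: Runner = run_ollama) -> bool:
    """Steps 1-2: pull `model` if `ollama list` does not show it. Returns True if pulled."""
    wanted = model if ":" in model else f"{model}:latest"  # assumption: no tag means latest
    rows = run(["list"]).splitlines()[1:]  # skip the header row
    installed = {row.split()[0] for row in rows if row.strip()}
    if wanted in installed:
        return False
    run(["pull", model])
    return True


def translated_filename(output_file: str, language: str) -> str:
    """Step 4: build `{filename}_{language}.{extension}`."""
    path = PurePath(output_file)
    return f"{path.stem}_{language}{path.suffix}"


# Demo with a stand-in runner so the sketch runs without Ollama installed.
calls: List[List[str]] = []


def fake_run(args: Sequence[str]) -> str:
    calls.append(list(args))
    return "NAME            ID      SIZE    MODIFIED\nmistral:latest  abc123  4.1 GB  2 days ago\n"


already_there = ensure_model("mistral", fake_run)  # found in the listing, no pull
pulled = ensure_model("gpt-oss", fake_run)  # missing, so a pull is issued
names = [translated_filename("document.md", lang) for lang in ("Spanish", "French", "Turkish")]
print(already_there, pulled, names)
```

Passing the runner as a parameter keeps the check-then-pull logic testable without a local Ollama install; calling `ensure_model("gpt-oss")` with the default runner would invoke the real CLI.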
+
+_**Note: translation quality depends mostly on the model used. Smaller models may produce unexpected or unwanted output. To balance performance and quality for typical users, we tested several reasonably sized models; `gpt-oss` gave satisfactory results, so it is the default. If you need something smaller, you can also try `huihui_ai/hunyuan-mt-abliterated`, which gave decent results in our tests, especially on text without much styling.**_
+
+**Example Translation Output:**
+```
+document.zip
+├── document.md                 # Source text with markdown/html styling
+├── document_Spanish.md         # Spanish translation
+├── document_French.md          # French translation
+├── document_Turkish.md         # Turkish translation
+├── document_segmentation.json  # Segmentation information