Commit caaacae

fix: update API to match actual supported operations
- Remove unsupported methods (convert-to-pdf, export-to-images, etc)
- Fix watermark to require width/height parameters
- Add OCR language code mapping (en -> english)
- Update merge_pdfs to work with Build API
- Add comprehensive documentation of supported operations
- Update README to reflect only supported features

Based on API testing:
- Only 6 operations are currently supported
- All operations go through the Build API
- Watermark requires width/height parameters
- OCR supports english/eng/deu languages
1 parent 11800c1 commit caaacae

5 files changed

+352
-215
lines changed

README.md

Lines changed: 32 additions & 52 deletions
@@ -26,10 +26,10 @@ from nutrient import NutrientClient
2626
# Initialize the client
2727
client = NutrientClient(api_key="your-api-key")
2828

29-
# Direct API - Convert Office document to PDF
30-
pdf = client.convert_to_pdf(
31-
input_file="document.docx",
32-
output_path="converted.pdf"
29+
# Direct API - Flatten PDF annotations
30+
client.flatten_annotations(
31+
input_file="document.pdf",
32+
output_path="flattened.pdf"
3333
)
3434

3535
# Builder API - Chain multiple operations
@@ -59,20 +59,13 @@ with NutrientClient(api_key="your-api-key") as client:
5959

6060
## Direct API Examples
6161

62-
### Convert to PDF
62+
### Flatten Annotations
6363

6464
```python
65-
# Convert Office document to PDF
66-
client.convert_to_pdf(
67-
input_file="presentation.pptx",
68-
output_path="presentation.pdf"
69-
)
70-
71-
# Convert with options
72-
client.convert_to_pdf(
73-
input_file="spreadsheet.xlsx",
74-
output_path="spreadsheet.pdf",
75-
page_range="1-3"
65+
# Flatten all annotations and form fields
66+
client.flatten_annotations(
67+
input_file="form.pdf",
68+
output_path="flattened.pdf"
7669
)
7770
```
7871

@@ -119,19 +112,14 @@ client.rotate_pages(
119112
### Watermark PDF
120113

121114
```python
122-
# Add text watermark
115+
# Add text watermark (width/height required)
123116
client.watermark_pdf(
124117
input_file="document.pdf",
125118
output_path="watermarked.pdf",
126119
text="DRAFT",
127-
opacity=0.5
128-
)
129-
130-
# Add image watermark
131-
client.watermark_pdf(
132-
input_file="document.pdf",
133-
output_path="watermarked.pdf",
134-
image_url="https://example.com/logo.png",
120+
width=200,
121+
height=100,
122+
opacity=0.5,
135123
position="center"
136124
)
137125
```
@@ -223,42 +211,34 @@ Files larger than 10MB are automatically streamed to avoid memory issues:
223211

224212
```python
225213
# This will stream the file instead of loading it into memory
226-
client.convert_to_pdf("large-presentation.pptx")
214+
client.flatten_annotations("large-document.pdf")
227215
```
228216

229-
## Available Tools
230-
231-
### Document Conversion
232-
- `convert_to_pdf` - Convert Office documents to PDF
233-
- `convert_from_pdf` - Convert PDF to Office formats
234-
- `convert_pdf_page_to_image` - Convert PDF pages to images
235-
- `import_from_url` - Import documents from URLs
217+
## Available Operations
236218

237219
### PDF Manipulation
238-
- `merge_pdfs` - Merge multiple PDFs
239-
- `split_pdf` - Split PDF into multiple files
240-
- `rotate_pages` - Rotate PDF pages
241-
- `delete_pages` - Remove pages from PDF
242-
- `duplicate_pages` - Duplicate pages in PDF
243-
- `move_pages` - Reorder pages in PDF
220+
- `merge_pdfs` - Merge multiple PDFs into one
221+
- `rotate_pages` - Rotate PDF pages (all or specific pages)
222+
- `flatten_annotations` - Flatten form fields and annotations
244223

245224
### PDF Enhancement
246-
- `ocr_pdf` - Add searchable text layer
225+
- `ocr_pdf` - Add searchable text layer (English and German)
247226
- `watermark_pdf` - Add text or image watermarks
248-
- `flatten_annotations` - Flatten form fields and annotations
249-
- `linearize_pdf` - Optimize for web viewing
250227

251228
### PDF Security
252-
- `apply_redactions` - Permanently remove sensitive content
253-
- `create_redactions` - Mark content for redaction
254-
- `sanitize_pdf` - Remove potentially harmful content
255-
256-
### Annotations and Forms
257-
- `apply_instant_json` - Apply Nutrient Instant JSON annotations
258-
- `export_instant_json` - Export annotations as Instant JSON
259-
- `apply_xfdf` - Apply XFDF annotations
260-
- `export_xfdf` - Export annotations as XFDF
261-
- `export_pdf_info` - Extract PDF metadata and structure
229+
- `apply_redactions` - Apply existing redaction annotations
230+
231+
### Builder API
232+
The Builder API allows chaining multiple operations:
233+
```python
234+
client.build(input_file="document.pdf") \
235+
.add_step("rotate-pages", {"degrees": 90}) \
236+
.add_step("ocr-pdf", {"language": "english"}) \
237+
.add_step("watermark-pdf", {"text": "DRAFT", "width": 200, "height": 100}) \
238+
.execute(output_path="processed.pdf")
239+
```
240+
241+
Note: See [SUPPORTED_OPERATIONS.md](SUPPORTED_OPERATIONS.md) for detailed documentation of all supported operations and their parameters.
262242

263243
## Development
264244

SUPPORTED_OPERATIONS.md

Lines changed: 170 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,170 @@
1+
# Supported Operations
2+
3+
This document lists all operations currently supported by the Nutrient DWS API through this Python client.
4+
5+
## Direct API Methods
6+
7+
The following methods are available on the `NutrientClient` instance:
8+
9+
### 1. `flatten_annotations(input_file, output_path=None)`
10+
Flattens all annotations and form fields in a PDF, converting them to static page content.
11+
12+
**Parameters:**
13+
- `input_file`: PDF file (path, bytes, or file-like object)
14+
- `output_path`: Optional path to save output
15+
16+
**Example:**
17+
```python
18+
client.flatten_annotations("document.pdf", "flattened.pdf")
19+
```
20+
21+
### 2. `rotate_pages(input_file, output_path=None, degrees=0, page_indexes=None)`
22+
Rotates pages in a PDF.
23+
24+
**Parameters:**
25+
- `input_file`: PDF file
26+
- `output_path`: Optional output path
27+
- `degrees`: Rotation angle (90, 180, 270, or -90)
28+
- `page_indexes`: Optional list of page indexes to rotate (0-based)
29+
30+
**Example:**
31+
```python
32+
# Rotate all pages 90 degrees
33+
client.rotate_pages("document.pdf", "rotated.pdf", degrees=90)
34+
35+
# Rotate specific pages
36+
client.rotate_pages("document.pdf", "rotated.pdf", degrees=180, page_indexes=[0, 2])
37+
```
38+
39+
### 3. `ocr_pdf(input_file, output_path=None, language="english")`
40+
Applies OCR to make a PDF searchable.
41+
42+
**Parameters:**
43+
- `input_file`: PDF file
44+
- `output_path`: Optional output path
45+
- `language`: OCR language - supported values:
46+
- `"english"` or `"eng"` - English
47+
- `"deu"` or `"german"` - German
48+
49+
**Example:**
50+
```python
51+
client.ocr_pdf("scanned.pdf", "searchable.pdf", language="english")
52+
```
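
The commit message mentions a language code mapping (`en` -> `english`). A minimal sketch of how such a mapping could work is shown here. The helper name `normalize_ocr_language` and both constants are hypothetical and are not taken from the client's source; only the alias `en` and the four supported values come from this document.

```python
# Hypothetical helper: the client's actual mapping may differ.
SUPPORTED_OCR_LANGUAGES = {"english", "eng", "deu", "german"}

# The only alias named in the commit message is en -> english.
OCR_LANGUAGE_ALIASES = {"en": "english"}


def normalize_ocr_language(language: str) -> str:
    """Resolve known aliases and reject values outside the supported list."""
    code = language.strip().lower()
    code = OCR_LANGUAGE_ALIASES.get(code, code)
    if code not in SUPPORTED_OCR_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_OCR_LANGUAGES))
        raise ValueError(
            f"Unsupported OCR language {language!r}; expected one of: {supported}"
        )
    return code
```

Rejecting unknown values before the request is sent is a design choice in this sketch; the client might instead pass the value through and let the API respond.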
53+
54+
### 4. `watermark_pdf(input_file, output_path=None, text=None, image_url=None, width=200, height=100, opacity=1.0, position="center")`
55+
Adds a watermark to all pages of a PDF.
56+
57+
**Parameters:**
58+
- `input_file`: PDF file
59+
- `output_path`: Optional output path
60+
- `text`: Watermark text (either `text` or `image_url` must be provided)
61+
- `image_url`: URL of the watermark image
62+
- `width`: Width in points (required)
63+
- `height`: Height in points (required)
64+
- `opacity`: Opacity from 0.0 to 1.0
65+
- `position`: One of: "top-left", "top-center", "top-right", "center", "bottom-left", "bottom-center", "bottom-right"
66+
67+
**Example:**
68+
```python
69+
# Text watermark
70+
client.watermark_pdf(
71+
"document.pdf",
72+
"watermarked.pdf",
73+
text="CONFIDENTIAL",
74+
width=300,
75+
height=150,
76+
opacity=0.5,
77+
position="center"
78+
)
79+
```
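
The parameter rules above can be checked before a request is made. The following is a rough sketch only: `build_watermark_options` and `VALID_POSITIONS` are hypothetical names, and the requirement that `width` and `height` be positive is an assumption (the document only says they are required).

```python
from typing import Any, Dict, Optional

# Position values copied from the parameter list above.
VALID_POSITIONS = {
    "top-left", "top-center", "top-right", "center",
    "bottom-left", "bottom-center", "bottom-right",
}


def build_watermark_options(
    text: Optional[str] = None,
    image_url: Optional[str] = None,
    width: int = 200,
    height: int = 100,
    opacity: float = 1.0,
    position: str = "center",
) -> Dict[str, Any]:
    """Validate watermark parameters and return them as a plain dict."""
    if text is None and image_url is None:
        raise ValueError("Either text or image_url is required")
    # Assumption: non-positive dimensions are treated as invalid.
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0.0 and 1.0")
    if position not in VALID_POSITIONS:
        raise ValueError(f"Unknown position: {position!r}")

    options: Dict[str, Any] = {
        "width": width,
        "height": height,
        "opacity": opacity,
        "position": position,
    }
    if text is not None:
        options["text"] = text
    if image_url is not None:
        options["image_url"] = image_url
    return options
```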
80+
81+
### 5. `apply_redactions(input_file, output_path=None)`
82+
Applies redaction annotations to permanently remove content.
83+
84+
**Parameters:**
85+
- `input_file`: PDF file with redaction annotations
86+
- `output_path`: Optional output path
87+
88+
**Example:**
89+
```python
90+
client.apply_redactions("document_with_redactions.pdf", "redacted.pdf")
91+
```
92+
93+
### 6. `merge_pdfs(input_files, output_path=None)`
94+
Merges multiple PDF files into one.
95+
96+
**Parameters:**
97+
- `input_files`: List of PDF files to merge
98+
- `output_path`: Optional output path
99+
100+
**Example:**
101+
```python
102+
client.merge_pdfs(
103+
["document1.pdf", "document2.pdf", "document3.pdf"],
104+
"merged.pdf"
105+
)
106+
```
107+
108+
## Builder API
109+
110+
The Builder API allows chaining multiple operations:
111+
112+
```python
113+
client.build(input_file="document.pdf") \
114+
.add_step("rotate-pages", {"degrees": 90}) \
115+
.add_step("ocr-pdf", {"language": "english"}) \
116+
.add_step("watermark-pdf", {
117+
"text": "DRAFT",
118+
"width": 200,
119+
"height": 100,
120+
"opacity": 0.3
121+
}) \
122+
.add_step("flatten-annotations") \
123+
.execute(output_path="processed.pdf")
124+
```
125+
126+
### Supported Builder Actions
127+
128+
1. **flatten-annotations** - No parameters required
129+
2. **rotate-pages** - Parameters: `degrees`, `page_indexes` (optional)
130+
3. **ocr-pdf** - Parameters: `language`
131+
4. **watermark-pdf** - Parameters: `text` or `image_url`, `width`, `height`, `opacity`, `position`
132+
5. **apply-redactions** - No parameters required
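
As an illustration, the action list above can be expressed as a lookup table and used to check a step before it is added to a chain. `BUILDER_ACTIONS` and `check_step` are hypothetical names for this sketch; the client may validate steps differently, or not at all.

```python
from typing import Any, Dict, Optional, Set

# Parameter names per action, copied from the list above.
BUILDER_ACTIONS: Dict[str, Set[str]] = {
    "flatten-annotations": set(),
    "rotate-pages": {"degrees", "page_indexes"},
    "ocr-pdf": {"language"},
    "watermark-pdf": {"text", "image_url", "width", "height", "opacity", "position"},
    "apply-redactions": set(),
}


def check_step(action: str, options: Optional[Dict[str, Any]] = None) -> None:
    """Raise ValueError for an unknown action or an unknown parameter name."""
    if action not in BUILDER_ACTIONS:
        raise ValueError(f"Unsupported builder action: {action!r}")
    unknown = set(options or {}) - BUILDER_ACTIONS[action]
    if unknown:
        raise ValueError(
            f"Unknown parameters for {action!r}: {', '.join(sorted(unknown))}"
        )
```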
133+
134+
## API Limitations
135+
136+
The following operations are **NOT** currently supported by the API:
137+
138+
- Document conversion (Office to PDF, HTML to PDF)
139+
- PDF to image export
140+
- PDF splitting
141+
- Form filling
142+
- Digital signatures
143+
- Compression/optimization
144+
- Linearization
145+
- Creating redactions (only applying existing redaction annotations is supported)
146+
- Instant JSON annotations
147+
- XFDF annotations
148+
149+
## Language Support
150+
151+
OCR currently supports:
152+
- English (`"english"` or `"eng"`)
153+
- German (`"deu"` or `"german"`)
154+
155+
## File Input Types
156+
157+
All methods accept files as:
158+
- String paths: `"document.pdf"`
159+
- Path objects: `Path("document.pdf")`
160+
- Bytes: `b"...pdf content..."`
161+
- File-like objects: `open("document.pdf", "rb")`
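
One way to handle the four input types listed above is to normalize them to bytes. This is a simplified sketch with a hypothetical helper, `read_input`; it reads everything into memory and so does not reflect the streaming behavior the README describes for files larger than 10MB.

```python
from pathlib import Path
from typing import BinaryIO, Union

FileInput = Union[str, Path, bytes, BinaryIO]


def read_input(file_input: FileInput) -> bytes:
    """Return the raw bytes for any of the accepted input types."""
    if isinstance(file_input, bytes):
        return file_input
    if isinstance(file_input, (str, Path)):
        # Raises FileNotFoundError if the path does not exist.
        return Path(file_input).read_bytes()
    if hasattr(file_input, "read"):
        data = file_input.read()
        if not isinstance(data, bytes):
            raise ValueError("File-like objects must be opened in binary mode")
        return data
    raise ValueError(f"Unsupported input type: {type(file_input).__name__}")
```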
162+
163+
## Error Handling
164+
165+
Common exceptions:
166+
- `AuthenticationError` - Invalid or missing API key
167+
- `APIError` - General API errors with status code
168+
- `ValidationError` - Invalid parameters
169+
- `FileNotFoundError` - File not found
170+
- `ValueError` - Invalid input values

src/nutrient/__init__.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,11 +15,11 @@
1515

1616
__version__ = "0.1.0"
1717
__all__ = [
18+
"NutrientClient",
19+
"NutrientError",
1820
"APIError",
1921
"AuthenticationError",
2022
"FileProcessingError",
21-
"NutrientClient",
22-
"NutrientError",
2323
"TimeoutError",
2424
"ValidationError",
2525
]
