# Nutrient DWS Python Client

- A Python client library for the Nutrient Document Web Services (DWS) API.
+ A Python client library for the [Nutrient Document Web Services (DWS) API](https://www.nutrient.io/). This library provides a Pythonic interface to interact with Nutrient's document processing services, supporting both Direct API calls and Builder API workflows.
+
+ ## Features
+
+ - 🚀 **Two API styles**: Direct API for single operations, Builder API for complex workflows
+ - 📄 **Comprehensive document tools**: Convert, merge, rotate, OCR, watermark, and more
+ - 🔄 **Automatic retries**: Built-in retry logic for transient failures
+ - 📁 **Flexible file handling**: Support for file paths, bytes, and file-like objects
+ - 🔒 **Type-safe**: Full type hints for better IDE support
+ - ⚡ **Streaming support**: Memory-efficient processing of large files
+ - 🧪 **Well-tested**: Comprehensive test suite with high coverage

## Installation

```bash
- pip install nutrient
+ pip install nutrient-dws
```

## Quick Start
@@ -14,22 +24,293 @@ pip install nutrient
from nutrient import NutrientClient

# Initialize the client
- client = NutrientClient(api_key="YOUR_API_KEY")
+ client = NutrientClient(api_key="your-api-key")
+
+ # Direct API - Convert Office document to PDF
+ pdf = client.convert_to_pdf(
+     input_file="document.docx",
+     output_path="converted.pdf"
+ )
+
+ # Builder API - Chain multiple operations
+ client.build(input_file="document.pdf") \
+     .add_step("rotate-pages", {"degrees": 90}) \
+     .add_step("ocr-pdf", {"language": "en"}) \
+     .add_step("watermark-pdf", {"text": "CONFIDENTIAL"}) \
+     .execute(output_path="processed.pdf")
+ ```
+
+ ## Authentication
+
+ The client supports API key authentication through multiple methods:
+
+ ```python
+ # 1. Pass directly to client
+ client = NutrientClient(api_key="your-api-key")
+
+ # 2. Set environment variable
+ # export NUTRIENT_API_KEY=your-api-key
+ client = NutrientClient()  # Will use env variable
+
+ # 3. Use context manager for automatic cleanup
+ with NutrientClient(api_key="your-api-key") as client:
+     client.convert_to_pdf("document.docx")
+ ```
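
The lookup order described above (an explicit argument first, then the `NUTRIENT_API_KEY` environment variable) can be sketched roughly as follows. `resolve_api_key` is a hypothetical helper written only for illustration; it is not part of the library, and the client's actual lookup may differ.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else fall back to NUTRIENT_API_KEY.

    Hypothetical helper: it mirrors the order shown in the README examples,
    not the library's real implementation.
    """
    key = explicit_key or os.environ.get("NUTRIENT_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set NUTRIENT_API_KEY")
    return key
```

Failing early with a clear message when neither source provides a key tends to be easier to debug than an authentication error returned later by the API.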
+
+ ## Direct API Examples
+
+ ### Convert to PDF
+
+ ```python
+ # Convert Office document to PDF
+ client.convert_to_pdf(
+     input_file="presentation.pptx",
+     output_path="presentation.pdf"
+ )
+
+ # Convert with options
+ client.convert_to_pdf(
+     input_file="spreadsheet.xlsx",
+     output_path="spreadsheet.pdf",
+     page_range="1-3"
+ )
+ ```
+
+ ### Merge PDFs
+
+ ```python
+ # Merge multiple PDFs
+ client.merge_pdfs(
+     input_files=["doc1.pdf", "doc2.pdf", "doc3.pdf"],
+     output_path="merged.pdf"
+ )
+ ```
+
+ ### OCR PDF
+
+ ```python
+ # Add OCR layer to scanned PDF
+ client.ocr_pdf(
+     input_file="scanned.pdf",
+     output_path="searchable.pdf",
+     language="en"
+ )
+ ```
+
+ ### Rotate Pages
+
+ ```python
+ # Rotate all pages
+ client.rotate_pages(
+     input_file="document.pdf",
+     output_path="rotated.pdf",
+     degrees=180
+ )
+
+ # Rotate specific pages
+ client.rotate_pages(
+     input_file="document.pdf",
+     output_path="rotated.pdf",
+     degrees=90,
+     page_indexes=[0, 2, 4]  # Pages 1, 3, and 5
+ )
+ ```
+
+ ### Watermark PDF
+
+ ```python
+ # Add text watermark
+ client.watermark_pdf(
+     input_file="document.pdf",
+     output_path="watermarked.pdf",
+     text="DRAFT",
+     opacity=0.5
+ )
+
+ # Add image watermark
+ client.watermark_pdf(
+     input_file="document.pdf",
+     output_path="watermarked.pdf",
+     image_url="https://example.com/logo.png",
+     position="center"
+ )
+ ```
+
+ ## Builder API Examples
+
+ The Builder API allows you to chain multiple operations in a single workflow:
+
+ ```python
+ # Complex document processing pipeline
+ result = client.build(input_file="raw-scan.pdf") \
+     .add_step("ocr-pdf", {"language": "en"}) \
+     .add_step("rotate-pages", {"degrees": -90, "page_indexes": [0]}) \
+     .add_step("watermark-pdf", {
+         "text": "PROCESSED",
+         "opacity": 0.3,
+         "position": "top-right"
+     }) \
+     .add_step("flatten-annotations") \
+     .set_output_options(
+         metadata={"title": "Processed Document", "author": "DWS Client"},
+         optimize=True
+     ) \
+     .execute(output_path="final.pdf")
+ ```
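
For readers unfamiliar with the chaining style used above, here is a minimal sketch of the general fluent-builder pattern. `WorkflowSketch` and its `describe` method are hypothetical and exist only for this illustration; the class records steps and never calls the API, so it says nothing about how the library's builder is implemented.

```python
from typing import Any, Dict, List, Optional, Tuple


class WorkflowSketch:
    """Hypothetical stand-in for a builder; it only records steps."""

    def __init__(self, input_file: str) -> None:
        self.input_file = input_file
        self.steps: List[Tuple[str, Dict[str, Any]]] = []

    def add_step(
        self, tool: str, options: Optional[Dict[str, Any]] = None
    ) -> "WorkflowSketch":
        # Store the tool name with its options (empty dict if none given).
        self.steps.append((tool, options or {}))
        # Returning self is what makes the calls chainable.
        return self

    def describe(self) -> List[str]:
        """List the recorded tool names in the order they were added."""
        return [tool for tool, _ in self.steps]


workflow = (
    WorkflowSketch("raw-scan.pdf")
    .add_step("ocr-pdf", {"language": "en"})
    .add_step("flatten-annotations")
)
print(workflow.describe())
```

Wrapping the chain in parentheses, as done here, is an alternative to trailing backslashes and avoids errors caused by stray whitespace after a `\`.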
+
+ ## File Input Options
+
+ The library supports multiple ways to provide input files:

- # Convert a document to PDF
- pdf_bytes = client.convert_to_pdf(input_file="document.docx")
+ ```python
+ from pathlib import Path
+
+ # File path (string or Path object)
+ client.convert_to_pdf("document.docx")
+ client.convert_to_pdf(Path("document.docx"))
+
+ # Bytes
+ with open("document.docx", "rb") as f:
+     file_bytes = f.read()
+ client.convert_to_pdf(file_bytes)
+
+ # File-like object
+ with open("document.docx", "rb") as f:
+     client.convert_to_pdf(f)
+
+ # URL (for supported operations)
+ client.import_from_url("https://example.com/document.pdf")
+ ```
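
One way a function can accept all three local input kinds listed above (path, bytes, file-like object) is to normalize them to bytes up front. The sketch below assumes nothing about the library's internals; `FileInput` and `read_input` are hypothetical names introduced for this example.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union

# Hypothetical alias covering the input kinds described above.
FileInput = Union[str, Path, bytes, BinaryIO]


def read_input(file_input: FileInput) -> bytes:
    """Normalize a path, raw bytes, or binary file-like object to bytes."""
    if isinstance(file_input, bytes):
        return file_input
    if isinstance(file_input, (str, Path)):
        # Strings are treated as filesystem paths, not as file content.
        return Path(file_input).read_bytes()
    # Anything else is assumed to be a file-like object opened in binary mode.
    return file_input.read()


print(read_input(io.BytesIO(b"%PDF-1.7")))
```

Reading everything into memory is only reasonable for small files; large inputs are better handled by streaming.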
+
+ ## Error Handling
+
+ The library provides specific exceptions for different error scenarios:
+
+ ```python
+ from nutrient import (
+     NutrientError,
+     AuthenticationError,
+     APIError,
+     ValidationError,
+     TimeoutError,
+     FileProcessingError
+ )
+
+ try:
+     client.convert_to_pdf("document.docx")
+ except AuthenticationError:
+     print("Invalid API key")
+ except ValidationError as e:
+     print(f"Invalid parameters: {e.errors}")
+ except APIError as e:
+     print(f"API error: {e.status_code} - {e.message}")
+ except TimeoutError:
+     print("Request timed out")
+ except FileProcessingError as e:
+     print(f"File processing failed: {e}")
+ ```

- # Use the Builder API for complex workflows
- client.build(input_file="document.docx") \
-     .add_step(tool="convert-to-pdf") \
-     .add_step(tool="rotate-pages", options={"degrees": 90}) \
-     .execute(output_path="output.pdf")
+ ## Advanced Configuration
+
+ ### Custom Timeout
+
+ ```python
+ # Set timeout to 10 minutes for large files
+ client = NutrientClient(api_key="your-api-key", timeout=600)
```

- ## Documentation
+ ### Streaming Large Files
+
+ Files larger than 10 MB are automatically streamed to avoid memory issues:
+
+ ```python
+ # This will stream the file instead of loading it into memory
+ client.convert_to_pdf("large-presentation.pptx")
+ ```
+
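
To illustrate what streaming means in practice, the sketch below reads a file in fixed-size chunks instead of loading it whole. It is not the library's implementation: `should_stream` and `iter_chunks` are hypothetical helpers, the chunk size is arbitrary, and treating 10 MB as 10 × 1024 × 1024 bytes is an assumption.

```python
import io
from typing import BinaryIO, Iterator

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
STREAM_THRESHOLD = 10 * 1024 * 1024
# Arbitrary chunk size chosen for this sketch.
CHUNK_SIZE = 64 * 1024


def should_stream(size_in_bytes: int) -> bool:
    """Return True when a file is strictly larger than the threshold."""
    return size_in_bytes > STREAM_THRESHOLD


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks so only one chunk is held in memory at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Usage with an in-memory stream and a tiny chunk size for demonstration.
print(list(iter_chunks(io.BytesIO(b"abcdef"), chunk_size=4)))
```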
+ ## Available Tools
+
+ ### Document Conversion
+ - `convert_to_pdf` - Convert Office documents to PDF
+ - `convert_from_pdf` - Convert PDF to Office formats
+ - `convert_pdf_page_to_image` - Convert PDF pages to images
+ - `import_from_url` - Import documents from URLs
+
+ ### PDF Manipulation
+ - `merge_pdfs` - Merge multiple PDFs
+ - `split_pdf` - Split PDF into multiple files
+ - `rotate_pages` - Rotate PDF pages
+ - `delete_pages` - Remove pages from PDF
+ - `duplicate_pages` - Duplicate pages in PDF
+ - `move_pages` - Reorder pages in PDF
+
+ ### PDF Enhancement
+ - `ocr_pdf` - Add searchable text layer
+ - `watermark_pdf` - Add text or image watermarks
+ - `flatten_annotations` - Flatten form fields and annotations
+ - `linearize_pdf` - Optimize for web viewing

- Full documentation is available at [https://nutrient-dws-client-python.readthedocs.io](https://nutrient-dws-client-python.readthedocs.io)
+ ### PDF Security
+ - `apply_redactions` - Permanently remove sensitive content
+ - `create_redactions` - Mark content for redaction
+ - `sanitize_pdf` - Remove potentially harmful content
+
+ ### Annotations and Forms
+ - `apply_instant_json` - Apply Nutrient Instant JSON annotations
+ - `export_instant_json` - Export annotations as Instant JSON
+ - `apply_xfdf` - Apply XFDF annotations
+ - `export_xfdf` - Export annotations as XFDF
+ - `export_pdf_info` - Extract PDF metadata and structure
+
+ ## Development
+
+ ### Setup
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/jdrhyne/nutrient-dws-client-python.git
+ cd nutrient-dws-client-python
+
+ # Install in development mode
+ pip install -e ".[dev]"
+
+ # Run tests
+ pytest
+
+ # Run linting
+ ruff check .
+
+ # Run type checking
+ mypy src tests
+ ```
+
+ ### Running Tests
+
+ ```bash
+ # Run all tests
+ pytest
+
+ # Run with coverage
+ pytest --cov=nutrient --cov-report=html
+
+ # Run specific test file
+ pytest tests/unit/test_client.py
+ ```
+
+ ## Contributing
+
+ Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
+
+ 1. Fork the repository
+ 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
+ 3. Commit your changes (`git commit -m 'Add some amazing feature'`)
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
+ 5. Open a Pull Request

## License

- MIT License - see LICENSE file for details.
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ## Support
+
+
+ - 📚 Documentation: https://www.nutrient.io/docs/
+ - 🐛 Issues: https://github.com/jdrhyne/nutrient-dws-client-python/issues