  This document lists all operations currently supported by the Nutrient DWS API through this Python client.

+ ## 🎯 Important Discovery: Implicit Document Conversion
+
+ The Nutrient DWS API automatically converts Office documents (DOCX, XLSX, PPTX) to PDF when processing them. This means:
+
+ - **No explicit conversion needed** - Just pass your Office documents to any method
+ - **All methods accept Office documents** - `rotate_pages()`, `ocr_pdf()`, etc. work with DOCX files
+ - **Seamless operation chaining** - Convert and process in one API call
+
+ ### Example:
+ ```python
+ # This automatically converts DOCX to PDF and rotates it!
+ client.rotate_pages("document.docx", degrees=90)
+
+ # Merge PDFs and Office documents together
+ client.merge_pdfs(["file1.pdf", "file2.docx", "spreadsheet.xlsx"])
+ ```
+
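Since the conversion is implicit, it can help to know ahead of time which inputs will trigger it. The following is a minimal sketch, not part of the client: `classify_input` and `OFFICE_SUFFIXES` are hypothetical names, and the check relies only on the file extension, assuming the three Office formats named above are the complete list.

```python
from pathlib import Path

# Assumption: only the formats named in this document (DOCX, XLSX, PPTX)
# are converted implicitly. Hypothetical helper, not part of the client.
OFFICE_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def classify_input(path: str) -> str:
    """Return "pdf", "office", or "unsupported" based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in OFFICE_SUFFIXES:
        return "office"
    return "unsupported"
```

An extension check says nothing about the actual file contents, so it is best treated as an early sanity check before the API call.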
  ## Direct API Methods

  The following methods are available on the `NutrientClient` instance:

- ### 1. `flatten_annotations(input_file, output_path=None)`
+ ### 1. `convert_to_pdf(input_file, output_path=None)`
+ Converts Office documents to PDF format using implicit conversion.
+
+ **Parameters:**
+ - `input_file`: Office document (DOCX, XLSX, PPTX)
+ - `output_path`: Optional path to save output
+
+ **Example:**
+ ```python
+ # Convert DOCX to PDF
+ client.convert_to_pdf("document.docx", "document.pdf")
+
+ # Convert and get bytes
+ pdf_bytes = client.convert_to_pdf("spreadsheet.xlsx")
+ ```
+
+ **Note:** HTML files are not currently supported.
+
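A small wrapper can derive the output name and reject HTML input before any request is made. This is a sketch under assumptions: `convert_alongside` is a hypothetical helper, and `client` is taken to be any object exposing `convert_to_pdf(input_file, output_path)` as documented above.

```python
from pathlib import Path


def convert_alongside(client, input_path: str) -> Path:
    """Convert an Office document and save the PDF next to the source file.

    Hypothetical helper: `client` only needs a `convert_to_pdf` method
    with the signature documented above.
    """
    source = Path(input_path)
    if source.suffix.lower() in {".html", ".htm"}:
        # The note above says HTML input is not currently supported.
        raise ValueError(f"HTML input is not supported: {input_path}")
    target = source.with_suffix(".pdf")  # e.g. document.docx -> document.pdf
    client.convert_to_pdf(str(source), str(target))
    return target
```

For example, `convert_alongside(client, "document.docx")` would request the same conversion as the first example above and return the path `document.pdf`.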
+ ### 2. `flatten_annotations(input_file, output_path=None)`
  Flattens all annotations and form fields in a PDF, converting them to static page content.

  **Parameters:**
- - `input_file`: PDF file (path, bytes, or file-like object)
+ - `input_file`: PDF or Office document
  - `output_path`: Optional path to save output

  **Example:**
  ```python
  client.flatten_annotations("document.pdf", "flattened.pdf")
+ # Works with Office docs too!
+ client.flatten_annotations("form.docx", "flattened.pdf")
  ```

- ### 2. `rotate_pages(input_file, output_path=None, degrees=0, page_indexes=None)`
- Rotates pages in a PDF.
+ ### 3. `rotate_pages(input_file, output_path=None, degrees=0, page_indexes=None)`
+ Rotates pages in a PDF, or converts an Office document to PDF and then rotates it.

  **Parameters:**
- - `input_file`: PDF file
+ - `input_file`: PDF or Office document
  - `output_path`: Optional output path
  - `degrees`: Rotation angle (90, 180, 270, or -90)
  - `page_indexes`: Optional list of page indexes to rotate (0-based)
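The parameter list above can be turned into a client-side check that runs before a request is sent. This is only a sketch: `check_rotation` is a hypothetical helper, and it assumes the four listed angles are the only accepted values and that "0-based" implies non-negative integer indexes. Whether the API itself accepts the default `degrees=0` is not stated here, so this check rejects it.

```python
# Assumption: the angles listed in the parameter description are exhaustive.
ALLOWED_DEGREES = {90, 180, 270, -90}


def check_rotation(degrees, page_indexes=None):
    """Validate rotate_pages arguments; hypothetical helper, not in the client."""
    if degrees not in ALLOWED_DEGREES:
        raise ValueError(f"degrees must be one of {sorted(ALLOWED_DEGREES)}, got {degrees!r}")
    if page_indexes is not None:
        for index in page_indexes:
            # Assumption: 0-based indexes are non-negative integers.
            if not isinstance(index, int) or index < 0:
                raise ValueError(f"page index must be a non-negative integer, got {index!r}")
    return degrees, page_indexes
```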
@@ -32,15 +69,18 @@ Rotates pages in a PDF.
  # Rotate all pages 90 degrees
  client.rotate_pages("document.pdf", "rotated.pdf", degrees=90)

+ # Works with Office documents too!
+ client.rotate_pages("presentation.pptx", "rotated.pdf", degrees=180)
+
  # Rotate specific pages
  client.rotate_pages("document.pdf", "rotated.pdf", degrees=180, page_indexes=[0, 2])
  ```

- ### 3. `ocr_pdf(input_file, output_path=None, language="english")`
- Applies OCR to make a PDF searchable.
+ ### 4. `ocr_pdf(input_file, output_path=None, language="english")`
+ Applies OCR to make a PDF searchable. Converts Office documents to PDF first if needed.

  **Parameters:**
- - `input_file`: PDF file
+ - `input_file`: PDF or Office document
  - `output_path`: Optional output path
  - `language`: OCR language - supported values:
    - `"english"` or `"eng"` - English
@@ -49,13 +89,15 @@ Applies OCR to make a PDF searchable.
  **Example:**
  ```python
  client.ocr_pdf("scanned.pdf", "searchable.pdf", language="english")
+ # Convert DOCX to searchable PDF
+ client.ocr_pdf("document.docx", "searchable.pdf", language="eng")
  ```

- ### 4. `watermark_pdf(input_file, output_path=None, text=None, image_url=None, width=200, height=100, opacity=1.0, position="center")`
- Adds a watermark to all pages of a PDF.
+ ### 5. `watermark_pdf(input_file, output_path=None, text=None, image_url=None, width=200, height=100, opacity=1.0, position="center")`
+ Adds a watermark to all pages of a PDF. Converts Office documents to PDF first if needed.

  **Parameters:**
- - `input_file`: PDF file
+ - `input_file`: PDF or Office document
  - `output_path`: Optional output path
  - `text`: Text for watermark (either `text` or `image_url` is required)
  - `image_url`: URL of image for watermark
@@ -78,38 +120,46 @@ client.watermark_pdf(
  )
  ```

- ### 5. `apply_redactions(input_file, output_path=None)`
- Applies redaction annotations to permanently remove content.
+ ### 6. `apply_redactions(input_file, output_path=None)`
+ Applies redaction annotations to permanently remove content. Converts Office documents to PDF first if needed.

  **Parameters:**
- - `input_file`: PDF file with redaction annotations
+ - `input_file`: PDF or Office document with redaction annotations
  - `output_path`: Optional output path

  **Example:**
  ```python
  client.apply_redactions("document_with_redactions.pdf", "redacted.pdf")
  ```

- ### 6. `merge_pdfs(input_files, output_path=None)`
- Merges multiple PDF files into one.
+ ### 7. `merge_pdfs(input_files, output_path=None)`
+ Merges multiple files into one PDF. Automatically converts Office documents to PDF before merging.

  **Parameters:**
- - `input_files`: List of PDF files to merge
+ - `input_files`: List of files to merge (PDFs and/or Office documents)
  - `output_path`: Optional output path

  **Example:**
  ```python
+ # Merge PDFs only
  client.merge_pdfs(
      ["document1.pdf", "document2.pdf", "document3.pdf"],
      "merged.pdf"
  )
+
+ # Mix PDFs and Office documents - they'll be converted automatically!
+ client.merge_pdfs(
+     ["report.pdf", "spreadsheet.xlsx", "presentation.pptx"],
+     "combined.pdf"
+ )
  ```

  ## Builder API

- The Builder API allows chaining multiple operations:
+ The Builder API allows chaining multiple operations. Like the Direct API, it automatically converts Office documents to PDF when needed:

  ```python
+ # Works with PDFs
  client.build(input_file="document.pdf") \
      .add_step("rotate-pages", {"degrees": 90}) \
      .add_step("ocr-pdf", {"language": "english"}) \
@@ -121,6 +171,12 @@ client.build(input_file="document.pdf") \
      }) \
      .add_step("flatten-annotations") \
      .execute(output_path="processed.pdf")
+
+ # Also works with Office documents!
+ client.build(input_file="report.docx") \
+     .add_step("watermark-pdf", {"text": "CONFIDENTIAL", "width": 300, "height": 150}) \
+     .add_step("flatten-annotations") \
+     .execute(output_path="watermarked_report.pdf")
  ```
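When the list of steps is only known at run time, the chained calls above can be driven from data instead of being written out by hand. The sketch below uses only the calls shown in this document (`build`, `add_step`, `execute`); `run_pipeline` is a hypothetical helper, and it assumes `add_step` returns a builder object, as the chaining syntax implies.

```python
def run_pipeline(client, input_file, steps, output_path):
    """Apply (action, options) pairs in order through the Builder API.

    Hypothetical helper. `steps` is a list such as
    [("rotate-pages", {"degrees": 90}), ("flatten-annotations", None)].
    """
    builder = client.build(input_file=input_file)
    for action, options in steps:
        if options is None:
            # Actions without options, e.g. "flatten-annotations".
            builder = builder.add_step(action)
        else:
            builder = builder.add_step(action, options)
    return builder.execute(output_path=output_path)
```

With this helper, the Office example above would become `run_pipeline(client, "report.docx", [("watermark-pdf", {"text": "CONFIDENTIAL", "width": 300, "height": 150}), ("flatten-annotations", None)], "watermarked_report.pdf")`.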

  ### Supported Builder Actions
@@ -135,7 +191,7 @@ client.build(input_file="document.pdf") \

  The following operations are **NOT** currently supported by the API:

- - Document conversion (Office to PDF, HTML to PDF)
+ - HTML to PDF conversion (only Office documents are supported)
  - PDF to image export
  - PDF splitting
  - Form filling