Commit 4abd0e1

Add comprehensive integration tests for set_page_label method (#12)
* Add comprehensive integration tests for set_page_label method

  - Add 6 new integration tests covering various scenarios:
    * Basic page labeling with output file
    * Returning bytes without output file
    * Multiple page ranges with different labels
    * Single page labeling
    * Error handling for empty labels list
    * Error handling for invalid label configurations
  - Fix page range normalization to ensure 'end' field is always present
  - Update page range handling to avoid overlapping ranges in tests
  - Add comprehensive unit tests for validation logic
  - Add Builder API support for set_page_labels method with chaining
  - Update documentation with examples and usage patterns

  🤖 Generated with [Claude Code](https://claude.ai/code)

  Co-Authored-By: Claude <[email protected]>

* Fix unit test to expect normalized labels with 'end' field

  The set_page_label implementation normalizes label configurations to ensure
  the 'end' field is always present (defaulting to -1 for end of document).
  Updated the unit test to expect the normalized format rather than the
  original input format.

* Fix mypy type errors in tests

  - Add type ignore comment for intentionally invalid test data
  - Add explicit type annotations for module-level variables that can be None

* Add future annotations import for compatibility with Python versions before 3.10

  The union syntax (str | None) in annotations requires
  from __future__ import annotations on Python versions before 3.10;
  Python 3.10 and later support the syntax natively.

---------

Co-authored-by: Claude <[email protected]>
1 parent 854c63b commit 4abd0e1

6 files changed

+409
-3
lines changed

SUPPORTED_OPERATIONS.md

Lines changed: 59 additions & 0 deletions
@@ -256,6 +256,40 @@ client.delete_pdf_pages(
 )
 ```

+### 11. `set_page_label(input_file, labels, output_path=None)`
+Sets custom labels/numbering for specific page ranges in a PDF.
+
+**Parameters:**
+- `input_file`: PDF file to process
+- `labels`: List of label configurations. Each dict must contain:
+  - `pages`: Page range dict with `start` (required) and optionally `end`
+  - `label`: String label to apply to those pages
+  - Page ranges use 0-based indexing where `end` is exclusive.
+- `output_path`: Optional path to save the output file
+
+**Returns:**
+- Processed PDF as bytes, or None if `output_path` provided
+
+**Example:**
+```python
+# Set labels for different page ranges
+client.set_page_label(
+    "document.pdf",
+    labels=[
+        {"pages": {"start": 0, "end": 3}, "label": "Introduction"},
+        {"pages": {"start": 3, "end": 10}, "label": "Chapter 1"},
+        {"pages": {"start": 10}, "label": "Appendix"}
+    ],
+    output_path="labeled_document.pdf"
+)
+
+# Set label for single page
+client.set_page_label(
+    "document.pdf",
+    labels=[{"pages": {"start": 0, "end": 1}, "label": "Cover Page"}]
+)
+```
+
 ## Builder API

 The Builder API allows chaining multiple operations. Like the Direct API, it automatically converts Office documents to PDF when needed:
@@ -279,6 +313,15 @@ client.build(input_file="report.docx") \
     .add_step("watermark-pdf", {"text": "CONFIDENTIAL", "width": 300, "height": 150}) \
     .add_step("flatten-annotations") \
     .execute(output_path="watermarked_report.pdf")
+
+# Setting page labels with Builder API
+client.build(input_file="document.pdf") \
+    .add_step("rotate-pages", {"degrees": 90}) \
+    .set_page_labels([
+        {"pages": {"start": 0, "end": 3}, "label": "Introduction"},
+        {"pages": {"start": 3}, "label": "Content"}
+    ]) \
+    .execute(output_path="labeled_document.pdf")
 ```

 ### Supported Builder Actions
@@ -289,6 +332,22 @@ client.build(input_file="report.docx") \
 4. **watermark-pdf** - Parameters: `text` or `image_url`, `width`, `height`, `opacity`, `position`
 5. **apply-redactions** - No parameters required

+### Builder Output Options
+
+The Builder API also supports setting output options:
+
+- **set_output_options()** - General output configuration (metadata, optimization, etc.)
+- **set_page_labels()** - Set page labels for specific page ranges
+
+Example:
+```python
+client.build("document.pdf") \
+    .add_step("rotate-pages", {"degrees": 90}) \
+    .set_output_options(metadata={"title": "My Document"}) \
+    .set_page_labels([{"pages": {"start": 0}, "label": "Chapter 1"}]) \
+    .execute("output.pdf")
+```
+
 ## API Limitations

 The following operations are **NOT** currently supported by the API:
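
The `set_page_label` documentation added in this file describes page ranges as 0-based with an exclusive `end`. A minimal sketch of what that means for the three-range example, assuming a 12-page document: `pages_covered` is a hypothetical helper, not part of the library, and clamping to the page count is an assumption made only for illustration. Treating a missing `end` (or `-1`) as "to the end of the document" follows the normalization comment in `direct.py`.

```python
from __future__ import annotations


def pages_covered(pages: dict[str, int], page_count: int) -> list[int]:
    """Expand a {'start', 'end'} dict into the 0-based page indices it covers.

    Hypothetical helper: a missing 'end' or an 'end' of -1 is read as
    "through the last page"; clamping to page_count is an assumption.
    """
    start = pages["start"]
    end = pages.get("end", -1)
    if end == -1:
        end = page_count
    # 'end' is exclusive, the same convention as Python's range().
    return list(range(start, min(end, page_count)))


labels = [
    {"pages": {"start": 0, "end": 3}, "label": "Introduction"},
    {"pages": {"start": 3, "end": 10}, "label": "Chapter 1"},
    {"pages": {"start": 10}, "label": "Appendix"},
]

for config in labels:
    print(config["label"], pages_covered(config["pages"], page_count=12))
```

Because `end` is exclusive, adjacent ranges such as `0-3` and `3-10` share a boundary value without sharing a page, which is why the documented example does not overlap.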

src/nutrient_dws/api/direct.py

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -723,3 +723,106 @@ def add_page(
723723
return None
724724
else:
725725
return result # type: ignore[no-any-return]
726+
727+
def set_page_label(
728+
self,
729+
input_file: FileInput,
730+
labels: list[dict[str, Any]],
731+
output_path: str | None = None,
732+
) -> bytes | None:
733+
"""Set labels for specific pages in a PDF.
734+
735+
Assigns custom labels/numbering to specific page ranges in a PDF document.
736+
Each label configuration specifies a page range and the label text to apply.
737+
738+
Args:
739+
input_file: Input PDF file.
740+
labels: List of label configurations. Each dict must contain:
741+
- 'pages': Page range dict with 'start' (required) and optionally 'end'
742+
- 'label': String label to apply to those pages
743+
Page ranges use 0-based indexing where 'end' is exclusive.
744+
output_path: Optional path to save the output file.
745+
746+
Returns:
747+
Processed PDF as bytes, or None if output_path is provided.
748+
749+
Raises:
750+
AuthenticationError: If API key is missing or invalid.
751+
APIError: For other API errors.
752+
ValueError: If labels list is empty or contains invalid configurations.
753+
754+
Examples:
755+
# Set labels for different page ranges
756+
client.set_page_label(
757+
"document.pdf",
758+
labels=[
759+
{"pages": {"start": 0, "end": 3}, "label": "Introduction"},
760+
{"pages": {"start": 3, "end": 10}, "label": "Chapter 1"},
761+
{"pages": {"start": 10}, "label": "Appendix"}
762+
],
763+
output_path="labeled_document.pdf"
764+
)
765+
766+
# Set label for single page
767+
client.set_page_label(
768+
"document.pdf",
769+
labels=[{"pages": {"start": 0, "end": 1}, "label": "Cover Page"}]
770+
)
771+
"""
772+
from nutrient_dws.file_handler import prepare_file_for_upload, save_file_output
773+
774+
# Validate inputs
775+
if not labels:
776+
raise ValueError("labels list cannot be empty")
777+
778+
# Normalize labels to ensure proper format
779+
normalized_labels = []
780+
for i, label_config in enumerate(labels):
781+
if not isinstance(label_config, dict):
782+
raise ValueError(f"Label configuration {i} must be a dictionary")
783+
784+
if "pages" not in label_config:
785+
raise ValueError(f"Label configuration {i} missing required 'pages' key")
786+
787+
if "label" not in label_config:
788+
raise ValueError(f"Label configuration {i} missing required 'label' key")
789+
790+
pages = label_config["pages"]
791+
if not isinstance(pages, dict) or "start" not in pages:
792+
raise ValueError(f"Label configuration {i} 'pages' must be a dict with 'start' key")
793+
794+
# Normalize pages to ensure 'end' is present
795+
normalized_pages = {"start": pages["start"]}
796+
if "end" in pages:
797+
normalized_pages["end"] = pages["end"]
798+
else:
799+
# If no end is specified, use -1 to indicate "to end of document"
800+
normalized_pages["end"] = -1
801+
802+
normalized_labels.append({"pages": normalized_pages, "label": label_config["label"]})
803+
804+
# Prepare file for upload
805+
file_field, file_data = prepare_file_for_upload(input_file, "file")
806+
files = {file_field: file_data}
807+
808+
# Build instructions with page labels in output configuration
809+
instructions = {
810+
"parts": [{"file": "file"}],
811+
"actions": [],
812+
"output": {"labels": normalized_labels},
813+
}
814+
815+
# Make API request
816+
# Type checking: at runtime, self is NutrientClient which has _http_client
817+
result = self._http_client.post( # type: ignore[attr-defined]
818+
"/build",
819+
files=files,
820+
json_data=instructions,
821+
)
822+
823+
# Handle output
824+
if output_path:
825+
save_file_output(result, output_path)
826+
return None
827+
else:
828+
return result # type: ignore[no-any-return]
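
The validation and normalization steps in `set_page_label` can be exercised without an API key. The following is a rough standalone sketch, not the library's code: `normalize_labels` is a hypothetical name, and the two "missing key" checks are folded into one loop for brevity. The error messages and the `-1` default for a missing `end` are taken from the diff above.

```python
from __future__ import annotations

import json
from typing import Any


def normalize_labels(labels: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical standalone version of the checks shown in the diff."""
    if not labels:
        raise ValueError("labels list cannot be empty")

    normalized = []
    for i, config in enumerate(labels):
        if not isinstance(config, dict):
            raise ValueError(f"Label configuration {i} must be a dictionary")
        for key in ("pages", "label"):
            if key not in config:
                raise ValueError(f"Label configuration {i} missing required '{key}' key")
        pages = config["pages"]
        if not isinstance(pages, dict) or "start" not in pages:
            raise ValueError(f"Label configuration {i} 'pages' must be a dict with 'start' key")
        # A missing 'end' becomes -1, which the diff documents as "to end of document".
        normalized.append(
            {
                "pages": {"start": pages["start"], "end": pages.get("end", -1)},
                "label": config["label"],
            }
        )
    return normalized


# The instructions payload, with the same shape as in the diff.
instructions = {
    "parts": [{"file": "file"}],
    "actions": [],
    "output": {"labels": normalize_labels([{"pages": {"start": 10}, "label": "Appendix"}])},
}
print(json.dumps(instructions, indent=2))
```

Keeping the checks in a pure function like this would let the error paths be unit-tested without uploading a file, though the commit itself keeps them inline in the method.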

src/nutrient_dws/builder.py

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -78,6 +78,30 @@ def set_output_options(self, **options: Any) -> "BuildAPIWrapper":
7878
self._output_options.update(options)
7979
return self
8080

81+
def set_page_labels(self, labels: list[dict[str, Any]]) -> "BuildAPIWrapper":
82+
"""Set page labels for the final document.
83+
84+
Assigns custom labels/numbering to specific page ranges in the output PDF.
85+
86+
Args:
87+
labels: List of label configurations. Each dict must contain:
88+
- 'pages': Page range dict with 'start' (required) and optionally 'end'
89+
- 'label': String label to apply to those pages
90+
Page ranges use 0-based indexing where 'end' is exclusive.
91+
92+
Returns:
93+
Self for method chaining.
94+
95+
Example:
96+
>>> builder.set_page_labels([
97+
... {"pages": {"start": 0, "end": 3}, "label": "Introduction"},
98+
... {"pages": {"start": 3, "end": 10}, "label": "Chapter 1"},
99+
... {"pages": {"start": 10}, "label": "Appendix"}
100+
... ])
101+
"""
102+
self._output_options["labels"] = labels
103+
return self
104+
81105
def execute(self, output_path: str | None = None) -> bytes | None:
82106
"""Execute the workflow.
83107
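
To illustrate the chaining pattern that `set_page_labels` joins, here is a minimal stand-in that makes no API calls. `MiniBuilder` is hypothetical and only mirrors the two attributes visible in the diff and unit tests (`_actions`, `_output_options`); the shape of each stored action is an assumption. As far as this diff shows, the builder method stores the labels exactly as given and does not fill in a missing `end` the way `set_page_label` does.

```python
from __future__ import annotations

from typing import Any


class MiniBuilder:
    """Hypothetical stand-in for BuildAPIWrapper; it never contacts the API."""

    def __init__(self) -> None:
        self._actions: list[dict[str, Any]] = []
        self._output_options: dict[str, Any] = {}

    def add_step(self, tool: str, options: dict[str, Any] | None = None) -> MiniBuilder:
        # The stored action shape is an assumption made for illustration.
        self._actions.append({"type": tool, **(options or {})})
        return self

    def set_output_options(self, **options: Any) -> MiniBuilder:
        self._output_options.update(options)
        return self

    def set_page_labels(self, labels: list[dict[str, Any]]) -> MiniBuilder:
        # Stored verbatim under the "labels" key, as in the diff.
        self._output_options["labels"] = labels
        return self


labels = [{"pages": {"start": 0}, "label": "Chapter 1"}]
builder = (
    MiniBuilder()
    .add_step("rotate-pages", {"degrees": 90})
    .set_output_options(metadata={"title": "My Document"})
    .set_page_labels(labels)
)
print(builder._output_options)
```

Since both setters write into the same dictionary, a later `set_output_options(labels=...)` call in this sketch would replace labels set earlier; whether that matters in practice depends on how the real builder is used.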

tests/integration/test_live_api.py

Lines changed: 73 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,16 +3,18 @@
33
These tests require a valid API key configured in integration_config.py.
44
"""
55

6+
from __future__ import annotations
7+
68
import pytest
79

810
from nutrient_dws import NutrientClient
911

1012
try:
1113
from . import integration_config # type: ignore[attr-defined]
1214

13-
API_KEY = integration_config.API_KEY
14-
BASE_URL = getattr(integration_config, "BASE_URL", None)
15-
TIMEOUT = getattr(integration_config, "TIMEOUT", 60)
15+
API_KEY: str | None = integration_config.API_KEY
16+
BASE_URL: str | None = getattr(integration_config, "BASE_URL", None)
17+
TIMEOUT: int = getattr(integration_config, "TIMEOUT", 60)
1618
except ImportError:
1719
API_KEY = None
1820
BASE_URL = None
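
The `from __future__ import annotations` line matters because of the new `str | None` annotations. A small sketch of the effect, using a hypothetical `describe` function: with the future import, annotations are stored as strings and never evaluated at definition time, so the union syntax is accepted on interpreters older than 3.10.

```python
from __future__ import annotations

# Module-level annotation in the same style as the test file.
API_KEY: str | None = None


def describe(value: str | None = None) -> str | None:
    """Hypothetical function used only to inspect its annotations."""
    return value


# The annotations are kept as plain strings rather than evaluated objects.
print(describe.__annotations__)
```

The import only defers annotations. An expression evaluated at runtime, such as `isinstance(x, str | None)`, would still fail before Python 3.10.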
@@ -158,6 +160,74 @@ def test_split_pdf_single_page_default(self, client, sample_pdf_path):
         # Verify result is a valid PDF
         assert_is_pdf(result[0])

+    def test_set_page_label_integration(self, client, sample_pdf_path, tmp_path):
+        """Test set_page_label method with live API."""
+        labels = [{"pages": {"start": 0, "end": 1}, "label": "Cover"}]
+
+        output_path = str(tmp_path / "labeled.pdf")
+
+        # Try to set page labels
+        result = client.set_page_label(sample_pdf_path, labels, output_path=output_path)
+
+        # If successful, verify output
+        assert result is None  # Should return None when output_path provided
+        assert (tmp_path / "labeled.pdf").exists()
+        assert_is_pdf(output_path)
+
+    def test_set_page_label_return_bytes(self, client, sample_pdf_path):
+        """Test set_page_label method returning bytes."""
+        labels = [{"pages": {"start": 0, "end": 1}, "label": "i"}]
+
+        # Test getting bytes back
+        result = client.set_page_label(sample_pdf_path, labels)
+
+        assert isinstance(result, bytes)
+        assert len(result) > 0
+        assert_is_pdf(result)
+
+    def test_set_page_label_multiple_ranges(self, client, sample_pdf_path):
+        """Test set_page_label method with multiple page ranges."""
+        labels = [
+            {"pages": {"start": 0, "end": 1}, "label": "i"},
+            {"pages": {"start": 1, "end": 2}, "label": "intro"},
+            {"pages": {"start": 2, "end": 3}, "label": "final"},
+        ]
+
+        result = client.set_page_label(sample_pdf_path, labels)
+
+        assert isinstance(result, bytes)
+        assert len(result) > 0
+        assert_is_pdf(result)
+
+    def test_set_page_label_single_page(self, client, sample_pdf_path):
+        """Test set_page_label method with single page label."""
+        labels = [{"pages": {"start": 0, "end": 1}, "label": "Cover Page"}]
+
+        result = client.set_page_label(sample_pdf_path, labels)
+
+        assert isinstance(result, bytes)
+        assert len(result) > 0
+        assert_is_pdf(result)
+
+    def test_set_page_label_empty_labels_error(self, client, sample_pdf_path):
+        """Test set_page_label method with empty labels raises error."""
+        with pytest.raises(ValueError, match="labels list cannot be empty"):
+            client.set_page_label(sample_pdf_path, labels=[])
+
+    def test_set_page_label_invalid_label_config_error(self, client, sample_pdf_path):
+        """Test set_page_label method with invalid label configuration raises error."""
+        # Missing 'pages' key
+        with pytest.raises(ValueError, match="missing required 'pages' key"):
+            client.set_page_label(sample_pdf_path, labels=[{"label": "test"}])
+
+        # Missing 'label' key
+        with pytest.raises(ValueError, match="missing required 'label' key"):
+            client.set_page_label(sample_pdf_path, labels=[{"pages": {"start": 0}}])
+
+        # Invalid pages format
+        with pytest.raises(ValueError, match="'pages' must be a dict with 'start' key"):
+            client.set_page_label(sample_pdf_path, labels=[{"pages": "invalid", "label": "test"}])
+
     def test_duplicate_pdf_pages_basic(self, client, sample_pdf_path):
         """Test duplicate_pdf_pages method with basic duplication."""
         # Test duplicating first page twice
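
The tests above call `assert_is_pdf` with both a file path and raw bytes, but its implementation is not part of this diff. One plausible shape for such a helper, offered purely as an assumption, is a check for the `%PDF-` header that PDF files begin with. The name `assert_is_pdf_sketch` is hypothetical and deliberately different from the real helper.

```python
from __future__ import annotations


def assert_is_pdf_sketch(file_path_or_bytes: str | bytes) -> None:
    """Hypothetical check: accept a path or bytes and verify the PDF header.

    This is a guess at what a helper like assert_is_pdf might do; the real
    helper in the test suite may perform different or stricter checks.
    """
    if isinstance(file_path_or_bytes, bytes):
        header = file_path_or_bytes[:5]
    else:
        with open(file_path_or_bytes, "rb") as handle:
            header = handle.read(5)
    assert header == b"%PDF-", f"expected a PDF header, got {header!r}"


# Bytes input, as in the tests that omit output_path.
assert_is_pdf_sketch(b"%PDF-1.7\n% minimal stand-in content")
```

A header check confirms only that the response looks like a PDF. It does not confirm that the labels were applied, which none of the integration tests in this commit verify either.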

tests/unit/test_builder.py

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,40 @@ def test_builder_set_output_options():
4848
assert builder._output_options["optimize"] is True
4949

5050

51+
def test_builder_set_page_labels():
52+
"""Test setting page labels."""
53+
builder = BuildAPIWrapper(None, "test.pdf")
54+
55+
labels = [
56+
{"pages": {"start": 0, "end": 3}, "label": "Introduction"},
57+
{"pages": {"start": 3, "end": 10}, "label": "Chapter 1"},
58+
{"pages": {"start": 10}, "label": "Appendix"},
59+
]
60+
61+
result = builder.set_page_labels(labels)
62+
63+
assert result is builder # Should return self for chaining
64+
assert builder._output_options["labels"] == labels
65+
66+
67+
def test_builder_set_page_labels_chaining():
68+
"""Test page labels can be chained with other operations."""
69+
builder = BuildAPIWrapper(None, "test.pdf")
70+
71+
labels = [{"pages": {"start": 0, "end": 1}, "label": "Cover"}]
72+
73+
result = (
74+
builder.add_step("rotate-pages", options={"degrees": 90})
75+
.set_page_labels(labels)
76+
.set_output_options(metadata={"title": "Test"})
77+
)
78+
79+
assert result is builder
80+
assert len(builder._actions) == 1
81+
assert builder._output_options["labels"] == labels
82+
assert builder._output_options["metadata"]["title"] == "Test"
83+
84+
5185
def test_builder_execute_requires_client():
5286
"""Test that execute requires a client."""
5387
builder = BuildAPIWrapper(None, "test.pdf")
