PSPDFKit
diff --git a/FUTURE_ENHANCEMENTS_PLAN.md b/FUTURE_ENHANCEMENTS_PLAN.md
Lines changed: 528 additions & 0 deletions
diff --git a/github_issues/00_enhancement_roadmap.md b/github_issues/00_enhancement_roadmap.md
Lines changed: 103 additions & 0 deletions
diff --git a/github_issues/01_multi_language_ocr.md b/github_issues/01_multi_language_ocr.md
Lines changed: 52 additions & 0 deletions
diff --git a/github_issues/02_image_watermark.md b/github_issues/02_image_watermark.md
Lines changed: 63 additions & 0 deletions
diff --git a/github_issues/03_selective_flatten.md b/github_issues/03_selective_flatten.md
Lines changed: 51 additions & 0 deletions
diff --git a/github_issues/04_create_redactions.md b/github_issues/04_create_redactions.md
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,103 @@
+# Enhancement Roadmap: Nutrient DWS Python Client
+
+## Overview
+This issue tracks the enhancement plan for the Nutrient DWS Python Client, based on an analysis of OpenAPI specification v1.9.0. The goal is to expand API coverage from ~30% to ~80% while maintaining our standards for code quality and backward compatibility.
+
+## Enhancement Categories
+
+### 🔵 Priority 1: Enhanced Existing Methods
+*Improve current methods with additional OpenAPI capabilities*
+
+- [ ] #1 **Multi-Language OCR Support** - Support multiple languages in `ocr_pdf()`
+- [ ] #2 **Image Watermark Support** - Add image watermarks to `watermark_pdf()`
+- [ ] #3 **Selective Annotation Flattening** - Add annotation ID filtering to `flatten_annotations()`
+
+### 🟢 Priority 2: Core Missing Methods
+*Add commonly requested document operations*
+
+- [ ] #4 **Create Redactions** - Implement `create_redactions()` with text/regex/preset strategies
+- [ ] #5 **Import Annotations** - Implement `import_annotations()` for Instant JSON/XFDF
+- [ ] #6 **Extract Page Range** - Implement `extract_pages()` as a simpler alternative to `split_pdf()`
+
+### 🟡 Priority 3: Format Conversion Methods
+*Enable output format flexibility*
+
+- [ ] #7 **Convert to PDF/A** - Implement `convert_to_pdfa()` for archival compliance
+- [ ] #8 **Convert to Images** - Implement `convert_to_images()` for PNG/JPEG/WebP
+- [ ] #9 **Extract Content as JSON** - Implement `extract_content()` for structured data
+- [ ] #10 **Convert to Office Formats** - Implement `convert_to_office()` for DOCX/XLSX/PPTX
+
+### 🟠 Priority 4: Advanced Features
+*Sophisticated document processing capabilities*
+
+- [ ] #11 **AI-Powered Redaction** - Implement `ai_redact()` using AI entity detection
+- [ ] #12 **Digital Signatures** - Implement `sign_pdf()` with visual signatures
+- [ ] #13 **Batch Processing** - Client-side `batch_process()` for bulk operations
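As a rough illustration of what a client-side `batch_process()` could look like, the sketch below fans a single operation out over several inputs with a thread pool and collects either the result or the raised exception per input. The signature, the `max_workers` default, and the choice to return exceptions instead of raising them are all assumptions made for illustration, not part of the roadmap.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, TypeVar, Union

T = TypeVar("T")
R = TypeVar("R")


def batch_process(
    operation: Callable[[T], R],
    inputs: Iterable[T],
    max_workers: int = 4,  # assumed default, tune for the API's rate limits
) -> Dict[T, Union[R, Exception]]:
    """Run `operation` on every input concurrently.

    Returns a mapping from each (hashable) input to its result, or to the
    exception it raised, so one failure does not abort the whole batch.
    """
    results: Dict[T, Union[R, Exception]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(operation, item): item for item in inputs}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # keep going and record the failure
                results[item] = exc
    return results
```

With a real client, `operation` might be something like `lambda path: client.ocr_pdf(path)`; threads fit here because each call is expected to be I/O-bound.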
+
+## Implementation Timeline
+
+### Phase 1 (Weeks 1-4)
+Focus on Priority 1 enhancements that improve existing methods:
+- Multi-language OCR
+- Image watermarks
+- Selective flattening
+
+### Phase 2 (Weeks 5-8)
+Add Priority 2 core methods:
+- Create redactions
+- Import annotations
+- Page range extraction
+
+### Phase 3 (Weeks 9-12)
+Implement Priority 3 format conversions:
+- PDF/A conversion
+- Image conversion
+- Content extraction
+- Office format export
+
+### Phase 4 (Weeks 13-16)
+Advanced features for Priority 4:
+- AI redaction
+- Digital signatures
+- Batch processing
+
+## Success Metrics
+
+- **API Coverage**: Increase from ~30% to ~80%
+- **Test Coverage**: Maintain 95%+ coverage
+- **Documentation**: 100% method documentation with examples
+- **Performance**: Sub-second operations for common tasks
+- **Backward Compatibility**: Zero breaking changes
+
+## Implementation Guidelines
+
+For each enhancement:
+1. Review OpenAPI specification for exact requirements
+2. Implement with backward compatibility in mind
+3. Add comprehensive unit and integration tests
+4. Include detailed docstrings with examples
+5. Update documentation and changelog
+6. Consider performance implications
+
+## Related Documents
+
+- [FUTURE_ENHANCEMENTS_PLAN.md](../FUTURE_ENHANCEMENTS_PLAN.md) - Detailed enhancement specifications
+- [OPENAPI_COMPLIANCE_REVIEW.md](../OPENAPI_COMPLIANCE_REVIEW.md) - Current compliance status
+- [openapi_spec.yml](../openapi_spec.yml) - Official API specification v1.9.0
+
+## Contributing
+
+We welcome contributions! Please:
+1. Comment on the specific issue you'd like to work on
+2. Follow the implementation template in each issue
+3. Ensure all tests pass
+4. Update documentation
+5. Submit PR referencing the issue number
+
+## Questions?
+
+Feel free to ask questions in the comments or open a discussion for broader topics.
+
+---
+
+**Labels**: roadmap, enhancement, meta-issue
+**Milestone**: v2.0.0
@@ -0,0 +1,52 @@
+# Enhancement: Multi-Language OCR Support
+
+## Summary
+Enhance the `ocr_pdf()` method to support multiple languages simultaneously, as supported by the OpenAPI specification.
+
+## Current Behavior
+- `ocr_pdf()` accepts only a single language string
+- Limited to one language per document
+
+## Proposed Enhancement
+```python
+def ocr_pdf(
+    self,
+    input_file: FileInput,
+    output_path: Optional[str] = None,
+    language: Union[str, List[str]] = "english",  # Now accepts list
+    enable_structure: bool = False,  # New parameter
+) -> Optional[bytes]:
+    ...  # body omitted; signature proposal only
+```
+
+## Benefits
+- Process multi-lingual documents accurately
+- Better OCR accuracy with proper language hints
+- Optional structured text extraction
+- Backward compatible with existing single-language usage
+
+## Implementation Details
+- Modify `_map_tool_to_action()` in `builder.py` to handle language arrays
+- Update parameter validation to accept both string and list
+- Add `enable_structure` parameter for structured output
+- Extend language mapping to support all 30+ OpenAPI languages
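The string-or-list handling described above could be sketched as follows. `normalize_ocr_languages` and `build_ocr_action` are hypothetical helper names, and `SUPPORTED_LANGUAGES` is a deliberately incomplete sample of the languages named in this issue; only the `ocr` action type and the `language` parameter come from the OpenAPI reference. The `enable_structure` option is left out because its payload key is not stated here.

```python
from typing import Dict, List, Union

# Illustrative subset only; the full list should be taken from the OpenAPI spec.
SUPPORTED_LANGUAGES = {
    "english", "spanish", "french", "german", "italian", "portuguese",
}


def normalize_ocr_languages(language: Union[str, List[str]]) -> Union[str, List[str]]:
    """Validate and lowercase a language argument, keeping its str-or-list shape."""
    if isinstance(language, str):
        candidates = [language]
    elif isinstance(language, list):
        candidates = language
    else:
        raise TypeError("language must be a str or a list of str")
    if not candidates:
        raise ValueError("at least one language is required")
    normalized = []
    for item in candidates:
        if not isinstance(item, str) or item.lower() not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported OCR language: {item!r}")
        normalized.append(item.lower())
    # A plain string stays a plain string so existing callers see no change.
    return normalized[0] if isinstance(language, str) else normalized


def build_ocr_action(language: Union[str, List[str]] = "english") -> Dict[str, object]:
    """Build the `ocr` BuildAction payload for a single language or a list."""
    return {"type": "ocr", "language": normalize_ocr_languages(language)}
```

Keeping the single-string shape intact on output is one way to guarantee that existing single-language requests serialize exactly as before.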
+
+## Testing Requirements
+- [ ] Test single language string (backward compatibility)
+- [ ] Test multiple languages as list
+- [ ] Test structured output option
+- [ ] Test all supported language codes
+- [ ] Update integration tests
+
+## OpenAPI Reference
+- BuildAction type: `ocr`
+- Parameter: `language` - can be single OcrLanguage or array
+- Supports: english, spanish, french, german, italian, portuguese, chinese, japanese, korean, russian, arabic, hindi, and more
+
+## Priority
+🔵 Priority 1 - Enhancement to existing method
+
+## Labels
+- enhancement
+- ocr
+- openapi-compliance
+- backward-compatible
@@ -0,0 +1,63 @@
+# Enhancement: Image Watermark Support
+
+## Summary
+Extend `watermark_pdf()` to support image watermarks in addition to text watermarks, as specified in the OpenAPI ImageWatermarkAction.
+
+## Current Behavior
+- Only supports text watermarks
+- No image watermark capability
+
+## Proposed Enhancement
+```python
+def watermark_pdf(
+    self,
+    input_file: FileInput,
+    output_path: Optional[str] = None,
+    # Text watermark parameters (existing)
+    text: Optional[str] = None,
+    # Image watermark parameters (new)
+    image_file: Optional[FileInput] = None,
+    image_url: Optional[str] = None,
+    # Common parameters
+    width: int = 200,
+    height: int = 100,
+    opacity: float = 1.0,
+    position: str = "center",
+    rotation: int = 0,  # New parameter
+) -> Optional[bytes]:
+    ...  # body omitted; signature proposal only
+```
+
+## Benefits
+- Logo and branding watermarks
+- Complex visual watermarks
+- Rotation support for both text and image watermarks
+- Maintains backward compatibility
+
+## Implementation Details
+- Extend `_map_tool_to_action()` to handle image watermarks
+- Add validation for the `image_file` and `image_url` parameters
+- Support the `rotation` parameter for all watermark types
+- Handle image file upload in multipart request
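One possible shape for the validation and payload mapping listed above is sketched here. `build_watermark_action` is a hypothetical helper, `image_part` stands in for the name of an already-uploaded multipart part, and the `width`, `height`, `opacity`, and `{"url": ...}` payload shapes are assumptions to be checked against the OpenAPI specification; only the `watermark` type, the `image` key, and `rotation` come from the reference section of this issue.

```python
from typing import Any, Dict, Optional


def build_watermark_action(
    text: Optional[str] = None,
    image_part: Optional[str] = None,  # hypothetical: name of the uploaded multipart part
    image_url: Optional[str] = None,
    width: int = 200,
    height: int = 100,
    opacity: float = 1.0,
    rotation: int = 0,
) -> Dict[str, Any]:
    """Build a watermark action from exactly one text or image source."""
    sources = {"text": text, "image_part": image_part, "image_url": image_url}
    provided = [name for name, value in sources.items() if value is not None]
    if len(provided) != 1:
        raise ValueError("provide exactly one of text, image_part, or image_url")
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0.0 and 1.0")

    action: Dict[str, Any] = {
        "type": "watermark",
        "width": width,
        "height": height,
        "opacity": opacity,
        "rotation": rotation,
    }
    if text is not None:
        action["text"] = text
    elif image_part is not None:
        action["image"] = image_part
    else:
        # Assumed shape for a remote file handle.
        action["image"] = {"url": image_url}
    return action
```

Rejecting calls that pass both `text` and an image keeps existing text-only calls working while making ambiguous requests fail early on the client.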
+
+## Testing Requirements
+- [ ] Test with image file input (PNG, JPEG)
+- [ ] Test with image URL
+- [ ] Test rotation parameter (0, 90, 180, 270)
+- [ ] Test opacity with images
+- [ ] Test all position options
+- [ ] Verify backward compatibility with text watermarks
+
+## OpenAPI Reference
+- BuildAction type: `watermark`
+- Subtypes: TextWatermarkAction, ImageWatermarkAction
+- Image parameter: `image` (FileHandle)
+- New parameter: `rotation`
+
+## Priority
+🔵 Priority 1 - Enhancement to existing method
+
+## Labels
+- enhancement
+- watermark
+- openapi-compliance
+- backward-compatible
@@ -0,0 +1,51 @@
+# Enhancement: Selective Annotation Flattening
+
+## Summary
+Enhance `flatten_annotations()` to support selective flattening by annotation IDs, as supported by the OpenAPI FlattenAction.
+
+## Current Behavior
+- Flattens all annotations and form fields
+- No selective control
+
+## Proposed Enhancement
+```python
+def flatten_annotations(
+    self,
+    input_file: FileInput,
+    output_path: Optional[str] = None,
+    annotation_ids: Optional[List[Union[str, int]]] = None,  # New parameter
+) -> Optional[bytes]:
+    ...  # body omitted; signature proposal only
+```
+
+## Benefits
+- Preserve specific annotations while flattening others
+- More granular control over document processing
+- Better support for complex form workflows
+- Backward compatible (None = flatten all)
+
+## Implementation Details
+- Modify BuildAction to include `annotationIds` when provided
+- Support both string and integer IDs
+- Handle empty list (flatten none) vs None (flatten all)
+- Update parameter documentation
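The `None` versus empty-list distinction above can be made explicit in a small payload builder. `build_flatten_action` is a hypothetical helper name; the `flatten` type and `annotationIds` key come from the OpenAPI reference in this issue. How the API treats an empty array is an assumption here, so a client might instead skip the request entirely when the list is empty.

```python
from typing import Any, Dict, List, Optional, Union


def build_flatten_action(
    annotation_ids: Optional[List[Union[str, int]]] = None,
) -> Dict[str, Any]:
    """Build a `flatten` action; None means flatten everything."""
    action: Dict[str, Any] = {"type": "flatten"}
    if annotation_ids is None:
        # Current behavior: omit the key so all annotations are flattened.
        return action
    for annotation_id in annotation_ids:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(annotation_id, bool) or not isinstance(annotation_id, (str, int)):
            raise TypeError(f"annotation IDs must be str or int, got {annotation_id!r}")
    # An empty list is passed through unchanged (assumed to mean "flatten none").
    action["annotationIds"] = list(annotation_ids)
    return action
```

Testing `is None` instead of truthiness is what keeps `[]` from silently falling back to the flatten-all behavior.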
+
+## Testing Requirements
+- [ ] Test with None (flatten all - current behavior)
+- [ ] Test with specific annotation IDs
+- [ ] Test with mix of valid and invalid IDs
+- [ ] Test with empty list
+- [ ] Test with different annotation types
+
+## OpenAPI Reference
+- BuildAction type: `flatten`
+- Parameter: `annotationIds` (optional array of string/integer)
+- Behavior: If not specified, flattens all annotations
+
+## Priority
+🔵 Priority 1 - Enhancement to existing method
+
+## Labels
+- enhancement
+- annotations
+- openapi-compliance
+- backward-compatible
@@ -0,0 +1,76 @@
+# Feature: Create Redactions Method
+
+## Summary
+Implement `create_redactions()` method to programmatically create redaction annotations using text search, regex patterns, or presets.
+
+## Proposed Implementation
+```python
+def create_redactions(
+    self,
+    input_file: FileInput,
+    output_path: Optional[str] = None,
+    strategy: Literal["text", "regex", "preset"] = "text",
+    search_text: Optional[str] = None,  # For text strategy
+    regex_pattern: Optional[str] = None,  # For regex strategy
+    preset_type: Optional[str] = None,  # For preset strategy
+    case_sensitive: bool = False,
+    whole_words_only: bool = False,
+    # Redaction appearance
+    fill_color: Optional[str] = "#000000",
+    outline_color: Optional[str] = "#000000",
+    overlay_text: Optional[str] = None,
+) -> Optional[bytes]:
+    ...  # body omitted; signature proposal only
+```
+
+## Benefits
+- Automated redaction creation for compliance workflows
+- Multiple search strategies (text, regex, presets)
+- Customizable redaction appearance
+- Preview redactions before permanently applying
+- Works with existing `apply_redactions()` method
+
+## Implementation Details
+- Use BuildAction type: `createRedactions`
+- Support three strategies:
+  - `text`: Simple text search
+  - `regex`: Regular expression patterns
+  - `preset`: Common patterns (SSN, email, phone, etc.)
+- Include appearance customization options
+- Return PDF with redaction annotations (not yet applied)
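A minimal sketch of mapping the three strategies onto a single action payload follows. `build_create_redactions_action` is a hypothetical helper, and the nested key names (`strategyOptions`, `content`, `caseSensitive`, `fillColor`, `outlineColor`, `overlayText`) are assumptions that should be verified against the OpenAPI specification; this issue only confirms the `createRedactions` type and the three strategy names.

```python
from typing import Any, Dict, Optional

VALID_STRATEGIES = ("text", "regex", "preset")


def build_create_redactions_action(
    strategy: str,
    value: str,
    case_sensitive: bool = False,
    fill_color: str = "#000000",
    outline_color: str = "#000000",
    overlay_text: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a `createRedactions` action for one search strategy.

    `value` is the search text, regex pattern, or preset name,
    depending on `strategy`.
    """
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"strategy must be one of {VALID_STRATEGIES}, got {strategy!r}")
    if not value:
        raise ValueError(f"a non-empty value is required for the {strategy!r} strategy")

    # Assumed: the option key matches the strategy name.
    options: Dict[str, Any] = {strategy: value}
    if strategy != "preset":
        # Case sensitivity only makes sense for free-form searches.
        options["caseSensitive"] = case_sensitive

    content: Dict[str, Any] = {"fillColor": fill_color, "outlineColor": outline_color}
    if overlay_text is not None:
        content["overlayText"] = overlay_text

    return {
        "type": "createRedactions",
        "strategy": strategy,
        "strategyOptions": options,
        "content": content,
    }
```

Collapsing `search_text`, `regex_pattern`, and `preset_type` into one `value` argument is a simplification for this sketch; the public method can keep the three separate parameters and pick the one that matches `strategy`.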
+
+## Testing Requirements
+- [ ] Test text search strategy
+- [ ] Test regex patterns (email, SSN, phone)
+- [ ] Test preset types
+- [ ] Test case sensitivity options
+- [ ] Test appearance customization
+- [ ] Integration test with `apply_redactions()`
+
+## OpenAPI Reference
+- BuildAction type: `createRedactions`
+- Strategies: text, regex, preset
+- Strategy options vary by type
+- Includes content appearance configuration
+
+## Use Case Example
+```python
+# Create redactions for all SSNs
+pdf_with_redactions = client.create_redactions(
+    "document.pdf",
+    strategy="regex",
+    regex_pattern=r"\b\d{3}-\d{2}-\d{4}\b",
+    overlay_text="[REDACTED]"
+)
+
+# Review and then apply
+final_pdf = client.apply_redactions(pdf_with_redactions)
+```
+
+## Priority
+🟢 Priority 2 - Core missing method
+
+## Labels
+- feature
+- redaction
+- security
+- openapi-compliance