26 changes: 22 additions & 4 deletions .github/workflows/ci.yml
@@ -47,7 +47,7 @@ jobs:
run: python -m pytest tests/unit/ -v --cov=nutrient_dws --cov-report=xml --cov-report=term

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
file: ./coverage.xml
@@ -58,14 +58,17 @@ jobs:
integration-test:
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
strategy:
matrix:
python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']

steps:
- uses: actions/checkout@v4

- name: Set up Python 3.12
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: '3.12'
python-version: ${{ matrix.python-version }}

- name: Cache pip dependencies
uses: actions/cache@v4
@@ -80,7 +83,17 @@ jobs:
python -m pip install --upgrade pip
pip install -e ".[dev]"

- name: Check for API key availability
run: |
if [ -z "${{ secrets.NUTRIENT_DWS_API_KEY }}" ]; then
echo "::warning::NUTRIENT_DWS_API_KEY secret not found, skipping integration tests"
echo "skip_tests=true" >> $GITHUB_ENV
else
echo "skip_tests=false" >> $GITHUB_ENV
fi

- name: Create integration config with API key
if: env.skip_tests != 'true'
run: |
python -c "
import os
@@ -91,8 +104,13 @@ jobs:
NUTRIENT_DWS_API_KEY: ${{ secrets.NUTRIENT_DWS_API_KEY }}

- name: Run integration tests
if: env.skip_tests != 'true'
run: python -m pytest tests/integration/ -v

- name: Cleanup integration config
if: always()
run: rm -f tests/integration/integration_config.py

build:
runs-on: ubuntu-latest
needs: test
@@ -120,4 +138,4 @@ jobs:
uses: actions/upload-artifact@v4
with:
name: dist
path: dist/
path: dist/
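
The API-key gate added to the integration-test job can be sketched in Python to make its logic explicit. This is only an illustration of what the shell step does, not code from the PR: the function name `write_skip_flag` is hypothetical, and the file path stands in for the file that the `GITHUB_ENV` variable points at on a runner.

```python
import os
import tempfile


def write_skip_flag(api_key, github_env_path):
    """Append skip_tests=<true|false> to the env file, mirroring the shell step."""
    skip = not api_key  # an empty string or None means the secret is unavailable
    if skip:
        # Same workflow command the shell step echoes to raise a warning annotation.
        print("::warning::NUTRIENT_DWS_API_KEY secret not found, skipping integration tests")
    with open(github_env_path, "a", encoding="utf-8") as env_file:
        env_file.write(f"skip_tests={'true' if skip else 'false'}\n")
    return skip


if __name__ == "__main__":
    # Local demo: a temporary file plays the role of $GITHUB_ENV.
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    write_skip_flag(os.environ.get("NUTRIENT_DWS_API_KEY", ""), demo_path)
    with open(demo_path, encoding="utf-8") as handle:
        print(handle.read(), end="")
    os.remove(demo_path)
```

Later steps then only need to compare `env.skip_tests` against `'true'`, which is why the cleanup step can run unconditionally with `if: always()`.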
107 changes: 41 additions & 66 deletions CHANGELOG.md
@@ -1,79 +1,54 @@
# Changelog

All notable changes to the nutrient-dws Python client library will be documented in this file.
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.0] - 2024-06-17
## [1.0.1] - 2024-06-20

### Added
- 🎉 First stable release on PyPI
- Comprehensive test suite with 94% coverage (154 tests)
- Full support for Python 3.8 through 3.12
- Type hints for all public APIs
- PyPI package publication

#### Core Features
- **NutrientClient**: Main client class with support for both Direct API and Builder API patterns
- **Direct API Methods**: Convenient methods for single operations:
- `convert_to_pdf()` - Convert Office documents to PDF (uses implicit conversion)
- `flatten_annotations()` - Flatten PDF annotations and form fields
- `rotate_pages()` - Rotate specific or all pages
- `ocr_pdf()` - Apply OCR to make PDFs searchable
- `watermark_pdf()` - Add text or image watermarks
- `apply_redactions()` - Apply existing redaction annotations
- `merge_pdfs()` - Merge multiple PDFs and Office documents

- **Builder API**: Fluent interface for chaining multiple operations:
```python
client.build(input_file="document.docx") \
.add_step("rotate-pages", {"degrees": 90}) \
.add_step("ocr-pdf", {"language": "english"}) \
.execute(output_path="processed.pdf")
```

#### Infrastructure
- **HTTP Client**:
- Connection pooling for performance
- Automatic retry logic with exponential backoff
- Bearer token authentication
- Comprehensive error handling

- **File Handling**:
- Support for multiple input types (paths, Path objects, bytes, file-like objects)
- Automatic streaming for large files (>10MB)
- Memory-efficient processing

- **Exception Hierarchy**:
- `NutrientError` - Base exception
- `AuthenticationError` - API key issues
- `APIError` - General API errors with status codes
- `ValidationError` - Request validation failures
- `TimeoutError` - Request timeouts
- `FileProcessingError` - File operation failures

#### Development Tools
- **Testing**: 82 unit tests with 92.46% code coverage
- **Type Safety**: Full mypy type checking support
- **Linting**: Configured with ruff
- **Pre-commit Hooks**: Automated code quality checks
- **CI/CD**: GitHub Actions for testing, linting, and releases
- **Documentation**: Comprehensive README with examples
### Fixed
- CI pipeline compatibility for all Python versions
- Package metadata format for older setuptools versions
- Type checking errors with mypy strict mode
- File handler edge cases with BytesIO objects

### Changed
- Package name updated from `nutrient` to `nutrient-dws` for PyPI
- Source directory renamed from `src/nutrient` to `src/nutrient_dws`
- API endpoint updated to https://api.pspdfkit.com
- Authentication changed from X-Api-Key header to Bearer token

### Discovered
- **Implicit Document Conversion**: The API automatically converts Office documents (DOCX, XLSX, PPTX) to PDF when processing, eliminating the need for explicit conversion steps
- Improved error messages for better debugging
- Enhanced file handling with proper position restoration
- Updated coverage from 92% to 94%
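
"Position restoration" for file-like inputs can be illustrated with a small sketch. The changelog does not show how the library implements it, so the helper names `preserve_position` and `read_all` below are assumptions made for illustration only: the idea is to remember a stream's offset, read what is needed, and seek back so the caller's object is left where it was.

```python
import io
from contextlib import contextmanager


@contextmanager
def preserve_position(stream):
    """Restore a seekable stream's offset once the block has finished with it."""
    start = stream.tell()
    try:
        yield stream
    finally:
        stream.seek(start)


def read_all(stream):
    """Read the whole stream from the beginning without moving the caller's offset."""
    with preserve_position(stream):
        stream.seek(0)
        return stream.read()


if __name__ == "__main__":
    buffer = io.BytesIO(b"%PDF-1.7 data")
    buffer.seek(5)
    payload = read_all(buffer)
    print(len(payload), buffer.tell())
```

This pattern matters most for `BytesIO` objects that a caller has already partially read or intends to reuse after the upload.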

### Fixed
- Watermark operation now correctly requires width/height parameters
- OCR language codes properly mapped (e.g., "en" → "english")
- All API operations updated to use the Build API endpoint
- Type annotations corrected throughout the codebase
## [1.0.0] - 2024-06-19

### Security
- API keys are never logged or exposed
- Support for environment variable configuration
- Secure handling of authentication tokens

[1.0.0]: https://github.com/jdrhyne/nutrient-dws-client-python/releases/tag/v1.0.0
### Added
- Initial implementation of Direct API with 7 methods:
- `convert_to_pdf` - Convert documents to PDF
- `convert_from_pdf` - Convert PDFs to other formats
- `ocr_pdf` - Perform OCR on PDFs
- `watermark_pdf` - Add watermarks to PDFs
- `flatten_annotations` - Flatten PDF annotations
- `rotate_pages` - Rotate PDF pages
- `merge_pdfs` - Merge multiple PDFs
- Builder API for complex document workflows
- Comprehensive error handling with custom exceptions
- Automatic retry logic with exponential backoff
- File streaming support for large documents
- Full type hints and py.typed marker
- Extensive documentation and examples
- MIT License
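
For readers unfamiliar with the "retry logic with exponential backoff" entry, here is a minimal, dependency-free sketch of the technique. It is not the library's implementation (the changelog does not show it); the function name, the default values, and the choice to retry only on `ConnectionError` are all assumptions for illustration.

```python
import time


def retry_with_backoff(call, retries=3, base_delay=0.5, factor=2.0, sleep=time.sleep):
    """Run call() until it succeeds, waiting base_delay * factor**attempt after each failure."""
    for attempt in range(retries + 1):
        try:
            return call()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay * factor ** attempt)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Fails twice, then succeeds, to exercise the retry path.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("transient failure")
        return "ok"

    print(retry_with_backoff(flaky, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the function easy to test, since a test can record the delays instead of actually waiting.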

### Technical Details
- Built on `requests` library (only dependency)
- Supports file inputs as paths, bytes, or file-like objects
- Memory-efficient processing with streaming
- Connection pooling for better performance

[1.0.1]: https://github.com/PSPDFKit/nutrient-dws-client-python/compare/v1.0.0...v1.0.1
[1.0.0]: https://github.com/PSPDFKit/nutrient-dws-client-python/releases/tag/v1.0.0
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -1,5 +1,6 @@
# Claude Development Guide for Nutrient DWS Python Client


## Critical Reference
**ALWAYS** refer to `SPECIFICATION.md` before implementing any features. This document contains the complete design specification for the Nutrient DWS Python Client library.

83 changes: 83 additions & 0 deletions CREATE_GITHUB_ISSUES_MANUALLY.md
@@ -0,0 +1,83 @@
# Manual GitHub Issue Creation Guide

Since automatic issue creation requires PSPDFKit organization permissions, please follow these steps to manually create the issues:

## Prerequisites
1. Ensure you have write access to the PSPDFKit/nutrient-dws-client-python repository, or
2. Ask someone with the appropriate permissions to create these issues for you

## Issue Templates Location
All issue templates are in the `github_issues/` directory with the following structure:
- `00_roadmap.md` - Overall enhancement roadmap (create this first)
- `01_multi_language_ocr.md` - Multi-language OCR support
- `02_image_watermark.md` - Image watermark support
- `03_selective_flattening.md` - Selective annotation flattening
- `04_create_redactions.md` - Create redactions method
- `05_import_annotations.md` - Import annotations feature
- `06_extract_pages.md` - Extract page range method
- `07_convert_to_pdfa.md` - PDF/A conversion
- `08_convert_to_images.md` - Image extraction
- `09_extract_content_json.md` - JSON content extraction
- `10_convert_to_office.md` - Office format conversion
- `11_ai_redaction.md` - AI-powered redaction
- `12_digital_signature.md` - Digital signature support
- `13_batch_processing.md` - Batch processing method

## Steps to Create Issues

### Option 1: Using GitHub Web Interface
1. Go to https://github.com/PSPDFKit/nutrient-dws-client-python/issues
2. Click "New issue"
3. For each template file:
- Copy the title from the first line (after the #)
- Copy the entire content into the issue body
- Add the labels listed at the bottom of each template
- Click "Submit new issue"

### Option 2: Using GitHub CLI (if you have permissions)
Once you have the appropriate permissions, you can run:

```bash
cd /Users/admin/Projects/nutrient-dws-client-python

# Create the roadmap issue first
gh issue create \
  --title "Enhancement Roadmap: Comprehensive Feature Plan" \
  --body-file github_issues/00_roadmap.md \
  --label "roadmap,enhancement,documentation"

# Then create individual feature issues (01 through 13).
# Plain globs are used instead of {01..13}, which is not zero-padded in older bash.
for file in github_issues/0[1-9]_*.md github_issues/1[0-3]_*.md; do
  # Title: first line with the leading "# " removed
  title=$(head -n 1 "$file" | sed 's/^# //')
  # Labels: last line with the leading "- " removed
  labels=$(tail -n 1 "$file" | sed 's/^- //')
  gh issue create \
    --title "$title" \
    --body-file "$file" \
    --label "$labels"
done
```
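
The title and label extraction that the loop performs with `head`, `tail`, and `sed` can also be sketched in Python, which makes it easier to check before any issue is created. The template layout assumed here (a `# Title` first line and a final `- label1,label2` line) is inferred from the shell commands, not confirmed by the templates themselves, and `parse_issue_template` is a hypothetical helper name.

```python
def parse_issue_template(text):
    """Return (title, labels) from a template whose first line is '# Title'
    and whose last non-empty line is '- label1,label2'."""
    lines = [line for line in text.splitlines() if line.strip()]
    title = lines[0].lstrip("#").strip()
    raw_labels = lines[-1].lstrip("- ")
    labels = [label.strip() for label in raw_labels.split(",") if label.strip()]
    return title, labels


if __name__ == "__main__":
    # Made-up sample content, only to show the expected shape of a template.
    sample = "# Example feature title\n\nBody text.\n\n- enhancement,priority-1\n"
    print(parse_issue_template(sample))
```

Running a check like this over every file in `github_issues/` before invoking `gh issue create` would catch a template with a missing title or label line.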

### Option 3: Request Organization Access
1. Contact the PSPDFKit organization administrators
2. Request contributor access to the nutrient-dws-client-python repository
3. Once granted, use the GitHub CLI commands above

## Issue Organization

### Priority Labels
- 🔵 `priority-1`: Enhanced existing methods
- 🟢 `priority-2`: Core missing methods
- 🟡 `priority-3`: Format conversion methods
- 🟠 `priority-4`: Advanced features

### Implementation Phases
- **Phase 1** (1-2 months): Issues 01, 02, 04
- **Phase 2** (2-3 months): Issues 07, 08, 05
- **Phase 3** (3-4 months): Issues 09, 10, 11
- **Phase 4** (4-6 months): Issues 12, 13

## Notes
- Create the roadmap issue (00) first as it provides context for all others
- Each issue is self-contained with implementation details, testing requirements, and examples
- Issues are numbered in suggested implementation order within their priority groups
- All issues follow the same format for consistency
31 changes: 31 additions & 0 deletions CREATE_GITHUB_RELEASE.md
@@ -0,0 +1,31 @@
# Steps to Create GitHub Release for v1.0.1

## 1. Go to Releases Page
Navigate to: https://github.com/PSPDFKit/nutrient-dws-client-python/releases

## 2. Click "Create a new release"

## 3. Fill in the Release Details

**Choose a tag**: Select `v1.0.1` from the dropdown

**Release title**: `v1.0.1 - First Stable Release`

**Release notes**: Copy and paste the content from `RELEASE_NOTES_v1.0.1.md`

**Set as latest release**: ✅ Check this box

## 4. Publish Release
Click "Publish release"

## Note
Since the repository has branch protection rules, we cannot push the README updates directly to main. You may want to:

1. Create a PR for the README badge updates
2. Or update the README badges after the release

The updated README includes:
- PyPI version badge
- Python versions badge
- Downloads counter badge
- Updated coverage badge (94%)
74 changes: 74 additions & 0 deletions FIX_GITHUB_TOKEN_PERMISSIONS.md
@@ -0,0 +1,74 @@
# Fix GitHub Token Permissions for Issue Creation

## Current Problem
Your token can:
- ✅ Push to branches
- ✅ Read issues
- ❌ Create issues (missing scope)

## Quick Fix Options

### Option 1: Use Fine-grained Personal Access Token (Recommended)
1. Go to: https://github.com/settings/personal-access-tokens/new
2. Token name: `nutrient-dws-development`
3. Expiration: 90 days
4. Repository access: Selected repositories
- Add: `PSPDFKit/nutrient-dws-client-python`
5. Permissions:
- **Repository permissions:**
- Contents: Read/Write
- Issues: Read/Write
- Pull requests: Read/Write
- Actions: Read (optional)
- Metadata: Read (required)
6. Click "Generate token"
7. Copy the token (starts with `github_pat_`)

### Option 2: Use Classic Personal Access Token
1. Go to: https://github.com/settings/tokens/new
2. Note: `nutrient-dws-development`
3. Expiration: 90 days
4. Select scopes:
- ✅ `repo` (Full control - includes private repos)
- OR just ✅ `public_repo` (if the repo is public)
5. Generate and copy token

## Apply the New Token

### Method 1: GitHub CLI (Recommended)
```bash
# Re-authenticate with new token
gh auth login

# When prompted:
# - Choose: GitHub.com
# - Choose: Paste an authentication token
# - Paste your new token
```

### Method 2: Environment Variable
```bash
# In your terminal
export GITHUB_TOKEN='your_new_token_here'

# Or add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
echo "export GITHUB_TOKEN='your_new_token_here'" >> ~/.zshrc
source ~/.zshrc
```

## Verify Token Works
```bash
# Test creating a simple issue
gh issue create --repo PSPDFKit/nutrient-dws-client-python \
--title "Test Issue (Delete Me)" \
--body "Testing token permissions"

# If successful, close it:
gh issue close <issue-number> --repo PSPDFKit/nutrient-dws-client-python
```

## Security Notes
- Never commit tokens to git
- Use environment variables or gh auth
- Rotate tokens regularly
- Use minimum required scopes
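
As a small illustration of the notes above, a script that needs the token can read it from the environment rather than embedding it, and mask it before printing anything. This is a sketch under assumptions: the helper names `load_github_token` and `mask_token` are hypothetical, and only the `GITHUB_TOKEN` variable name comes from this guide.

```python
import os


def load_github_token(env=None):
    """Return the token from GITHUB_TOKEN, failing loudly if it is missing."""
    env = os.environ if env is None else env
    token = env.get("GITHUB_TOKEN", "").strip()
    if not token:
        raise RuntimeError("GITHUB_TOKEN is not set; export it or run `gh auth login`")
    return token


def mask_token(token, visible=4):
    """Hide all but the last few characters so the token is safe to log."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


if __name__ == "__main__":
    try:
        print("Using token:", mask_token(load_github_token()))
    except RuntimeError as error:
        print(error)
```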