This document provides a detailed overview of the technical architecture of the Watermark Remover tool.
The system is composed of three main components:
- Batch Processor (batch_processor.py): The main entry point and orchestrator.
- Complete Watermark Detector (complete_watermark_detector.py): The V3 high-precision detection module.
- LaMa Inpainter (lama_inpainter.py): The deep learning inpainting module.
graph TD
A[User Input] --> B(Batch Processor);
B --> C{Detection};
B --> D{Inpainting};
C --> E[Complete Detector V3];
D --> F[LaMa Inpainter];
E --> G[Mask];
F -- uses --> G;
F --> H[Clean Image];
B --> I[Report];
Batch Processor responsibilities:
- Command-Line Interface: Parses user arguments (--input, --output, --batch, etc.).
- File Discovery: Scans input directories for image files.
- Orchestration: For each image, it calls the detector and then the inpainter.
- Reporting: Aggregates results and generates a JSON report and a file list.
- Error Handling: Catches and logs errors for individual files without halting the entire batch.
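The command-line and file-discovery responsibilities might look roughly like the sketch below. The flag names --input, --output, and --batch come from the list above; their types, the treatment of --batch as a boolean switch, the suffix set, and the recursive scan are all assumptions made for illustration, not the actual contents of batch_processor.py.

```python
import argparse
from pathlib import Path

# Assumed set of image extensions; the real tool may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def build_parser():
    """Build a parser for the flags named in the list above (semantics assumed)."""
    parser = argparse.ArgumentParser(description="Batch watermark removal (sketch)")
    parser.add_argument("--input", required=True, type=Path,
                        help="image file or directory to process")
    parser.add_argument("--output", required=True, type=Path,
                        help="directory for cleaned images")
    # Assumption: --batch is a simple on/off switch.
    parser.add_argument("--batch", action="store_true",
                        help="treat --input as a directory of images")
    return parser


def discover_images(root):
    """Return image files under root (recursively), sorted for a stable order."""
    root = Path(root)
    if root.is_file():
        return [root] if root.suffix.lower() in IMAGE_SUFFIXES else []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Matching on the lowercased suffix keeps files such as photo.PNG from being skipped on case-sensitive filesystems.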
The Complete Watermark Detector is the core innovation of the project. It is designed so that the mask covers every watermark pixel (a 100% coverage target).
Detection Pipeline:
- Search Region Identification: Isolates the bottom-right corner of the image to reduce the search space and prevent false positives.
- Color-Based Thresholding: Creates an initial binary mask of all pixels matching the watermark's dark gray color profile.
- Contour Analysis: Finds all distinct shapes (contours) in the binary mask.
- Shape & Size Filtering: Filters contours based on expected width, height, and aspect ratio to identify the primary watermark component.
- Three-Stage Coverage Expansion:
  - Method 1 (Direct): Re-analyzes the area within the detected bounding box to ensure all color-matching pixels are included.
  - Method 2 (Expanded): Searches a slightly larger region to capture faint edges.
  - Method 3 (Global): Validates that all detected components are part of the same watermark structure.
- Mask Merging: Combines the results of all three methods into a single, comprehensive mask.
- Final Refinement: Applies a slight dilation and Gaussian blur to smooth edges and close any microscopic gaps, preparing the mask for the inpainting model.
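The pipeline above can be approximated with a NumPy-only sketch. It is a simplified stand-in, not the V3 detector itself: the gray band (lo, hi), the tolerance, the region fraction, and the padding are hypothetical values. A single bounding box replaces contour analysis and shape filtering, and Method 3 and the Gaussian blur are omitted.

```python
import numpy as np


def in_band(pixels, lo, hi):
    """Boolean map of pixels whose gray level lies in [lo, hi]."""
    return (pixels >= lo) & (pixels <= hi)


def detect_mask_sketch(gray, region_frac=0.25, lo=60, hi=110, pad=2, tol=10):
    """gray: 2-D uint8 image. Returns a uint8 mask (255 = watermark, 0 = background).

    All numeric defaults are illustrative assumptions.
    """
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=np.uint8)

    # Search region: only the bottom-right corner is examined.
    y0, x0 = int(h * (1 - region_frac)), int(w * (1 - region_frac))
    hits = in_band(gray[y0:, x0:], lo, hi)
    if not hits.any():
        return mask  # no candidate pixels found

    # Stand-in for contour analysis and filtering: one box around all hits.
    ys, xs = np.nonzero(hits)
    top, bottom = y0 + ys.min(), y0 + ys.max() + 1
    left, right = x0 + xs.min(), x0 + xs.max() + 1

    # Method 1 (direct): re-threshold inside the bounding box.
    direct = np.zeros_like(mask)
    direct[top:bottom, left:right][in_band(gray[top:bottom, left:right], lo, hi)] = 255

    # Method 2 (expanded): padded box with a looser band to catch faint edges.
    t, b = max(top - pad, 0), min(bottom + pad, h)
    l, r = max(left - pad, 0), min(right + pad, w)
    expanded = np.zeros_like(mask)
    expanded[t:b, l:r][in_band(gray[t:b, l:r], lo - tol, hi + tol)] = 255

    # Mask merging: pixel-wise union of the partial masks.
    return np.maximum(direct, expanded)


def dilate3x3(mask):
    """Grow the mask by one pixel in every direction (3x3 maximum filter)."""
    h, w = mask.shape
    padded = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out
```

Taking the union of the partial masks means a pixel found by any method stays in the result, which favors coverage over tightness.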
LaMa Inpainter responsibilities:
- Model Loading: Initializes the LaMa model and handles device placement (CPU, CUDA, MPS).
- Image Preprocessing: Prepares the input image and mask for the model.
- Inpainting: Executes the LaMa model to fill the masked region.
- Image Postprocessing: Converts the output tensor back into a standard image format (Pillow Image).
Underlying Library: simple-lama-inpainting
This library provides a high-level wrapper around the official LaMa implementation, simplifying the model download, caching, and execution process.
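A thin wrapper over the library could be sketched as follows. The call shape (a SimpleLama instance called with a Pillow image and an "L"-mode mask) follows the library's published usage example. The helper names load_simple_lama, prepare_inputs, and inpaint are hypothetical, and device selection is left to the library's defaults here rather than reproducing whatever lama_inpainter.py does.

```python
def load_simple_lama():
    """Build the model lazily; the import is deferred because it pulls in PyTorch."""
    from simple_lama_inpainting import SimpleLama  # third-party package
    return SimpleLama()


def prepare_inputs(image_path, mask_array):
    """Open the image as RGB and turn a uint8 NumPy mask into an 'L'-mode image."""
    from PIL import Image  # Pillow, imported lazily for the same reason
    image = Image.open(image_path).convert("RGB")
    mask = Image.fromarray(mask_array).convert("L")
    return image, mask


def inpaint(image, mask, model=None):
    """Fill the masked region. `model` is any callable taking (image, mask)."""
    if model is None:
        model = load_simple_lama()
    return model(image, mask)
```

Passing the model in as a parameter lets the wrapper be exercised with a stub, without downloading weights.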
The data flows through the system as follows:
- Input: The user provides an input directory of images.
- Processing Loop: The Batch Processor iterates through each image.
- Detection: The CompleteWatermarkDetector takes an image path and returns a NumPy array representing the binary mask (255 for watermark, 0 for background).
- Inpainting: The LamaInpainter takes the original image path and the generated mask, and returns a Pillow Image object of the clean image.
- Output: The Batch Processor saves the clean image to the output directory and records the results.
- Reporting: After the loop, the Batch Processor compiles all results into a JSON report.
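The flow above can be sketched as a single loop in which the detector and inpainter are passed in as callables. The function name run_batch, the report fields, and the output naming are hypothetical. The only contracts taken from the description are that detection maps a path to a mask, that inpainting returns an object with a Pillow-style save method, and that a failure on one file is recorded without stopping the batch.

```python
import json
from pathlib import Path


def run_batch(image_paths, detect, inpaint, output_dir, report_path):
    """Process each image independently and write a JSON report (schema assumed)."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    results = []

    for path in map(Path, image_paths):
        try:
            mask = detect(path)            # e.g. NumPy array, 255 = watermark
            clean = inpaint(path, mask)    # e.g. Pillow Image
            out_path = output_dir / path.name
            clean.save(out_path)
            results.append({"file": str(path), "status": "ok",
                            "output": str(out_path)})
        except Exception as exc:
            # Record the failure and continue with the next file.
            results.append({"file": str(path), "status": "error",
                            "error": str(exc)})

    report = {
        "total": len(results),
        "succeeded": sum(r["status"] == "ok" for r in results),
        "results": results,
    }
    Path(report_path).write_text(json.dumps(report, indent=2))
    return report
```

Catching exceptions inside the loop body, rather than around the whole loop, is what keeps one unreadable file from halting the run.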
- Modularity: Each component is self-contained and can be tested or replaced independently.
- Robustness: The batch processor is designed to handle errors gracefully, ensuring that one failed image does not stop the entire process.
- Precision: The V3 detector was built with the single goal of achieving 100% mask coverage to eliminate any post-processing requirements.
- Usability: The command-line interface is designed to be intuitive, with clear arguments and helpful examples.