Self-contained code for robust image synchronization through watermark embedding, developed to improve the performance and robustness of Watermarking Autoregressive Image Generation (WMAR). This folder is standalone and does not depend on the WMAR repository. It was built upon the Meta Video Seal codebase.
See paper: arXiv
We provide a scripted model so you can run the model out-of-the-box, without heavy setup.
Download it here: syncmodel.jit.pt, or through:
```bash
wget -O syncmodel.jit.pt https://dl.fbaipublicfiles.com/wmar/syncseal/paper/syncmodel.jit.pt
```

Minimal usage:
```python
import torch
from PIL import Image
from torchvision.transforms.functional import to_tensor

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
scripted = torch.jit.load("syncmodel.jit.pt").to(device).eval()

# Load an RGB image in [0, 1]
img = Image.open("/path/to/image.jpg").convert("RGB")
img_pt = to_tensor(img).unsqueeze(0).to(device)

with torch.no_grad():
    emb = scripted.embed(img_pt)  # {'preds_w', 'imgs_w'}
    det = scripted.detect(emb["imgs_w"])  # {'preds', 'preds_pts'} where preds_pts is Bx8 corners in [-1,1]

# Optional: rectify the image using the detected corners
pred_pts = det["preds_pts"]
imgs_unwarped = scripted.unwarp(emb["imgs_w"], pred_pts, original_size=img_pt.shape[-2:])
```

For an end-to-end example (including simple augmentations and visualization), see `notebooks/standalone.ipynb`.
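The Bx8 `preds_pts` tensor packs four (x, y) corners in normalized [-1, 1] coordinates. If you want to overlay the detected quadrilateral on the image, you need them in pixel coordinates. A minimal sketch of that conversion follows; the corner ordering and the align-corners-style mapping are assumptions here (check the notebook for the authoritative convention), and `corners_to_pixels` is an illustrative helper, not part of the codebase:

```python
import torch

def corners_to_pixels(preds_pts: torch.Tensor, height: int, width: int) -> torch.Tensor:
    """Map Bx8 corner predictions from [-1, 1] to pixel coordinates.

    Assumes the 8 values are four (x, y) pairs, with -1 mapping to 0
    and +1 mapping to width-1 / height-1.
    """
    pts = preds_pts.reshape(-1, 4, 2).clone()
    pts[..., 0] = (pts[..., 0] + 1) * 0.5 * (width - 1)   # x -> columns
    pts[..., 1] = (pts[..., 1] + 1) * 0.5 * (height - 1)  # y -> rows
    return pts

# Sanity check: corners of the full image map to the image boundary
pts = torch.tensor([[-1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, 1.0]])
print(corners_to_pixels(pts, height=256, width=256))
```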
First clone the root repository:
```bash
git clone https://github.com/facebookresearch/wmar.git
cd wmar/syncseal/
```

Python 3.10 is recommended. PyTorch should be installed to match your system (CPU or CUDA 12.1). We provide both pip and uv instructions.
PyTorch (choose one):
- CUDA 12.1 wheels via pip:

  ```bash
  pip install --index-url https://download.pytorch.org/whl/cu121 \
    torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0
  ```

- CPU-only wheels via pip:

  ```bash
  pip install --index-url https://download.pytorch.org/whl/cpu \
    torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0
  ```
Other dependencies with pip:

```bash
pip install -r requirements.txt
```

Copy the WAM synchronization code used as a baseline:

```bash
cp -r ../deps/watermark_anything deps/watermark_anything
```

This repo ships a `pyproject.toml` compatible with uv.
- Create and activate a virtualenv (examples):

  ```bash
  uv venv --python 3.10
  source .venv/bin/activate
  ```

- Install deps with uv:

  ```bash
  uv sync
  ```

Datasets are configured via YAML under `configs/datasets/`.
For instance, in `configs/datasets/your-dataset.yaml`:

```yaml
train_dir: /path/to/your-dataset/sa-1b/train/
val_dir: /path/to/your-dataset/sa-1b/val/
```

where `train` and `val` are directories containing images.
In the paper, we train with SA-1B, which you can download at https://segment-anything.com/dataset/index.html. We sort the filenames and use the first 1k images for validation and the following 1k for testing, then resize all remaining images to 256x256 for training.
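The split described above can be sketched as a small helper. This is an illustration, not code from the repository; `split_sa1b` and its extension filter are assumptions, and the 256x256 resize of the training images is left to your preprocessing pipeline:

```python
from pathlib import Path

def split_sa1b(image_dir: str, n_val: int = 1000, n_test: int = 1000):
    """Split a directory of images as described in the paper:
    sort filenames, take the first n_val for validation, the next
    n_test for testing, and keep the rest for training."""
    files = sorted(
        p.name for p in Path(image_dir).iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
    )
    val = files[:n_val]
    test = files[n_val:n_val + n_test]
    train = files[n_val + n_test:]
    return train, val, test
```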
We release the following:
| Model Type | Checkpoint | TorchScript | Parameters | Training Logs | Console Outputs |
|---|---|---|---|---|---|
| Paper model (full) | checkpoint.pth | syncmodel.jit.pt | expe.json | log.txt | log.stdout |
| Codebase model (faster reproduction) | checkpoint.pth | syncmodel.jit.pt | expe.json | log.txt | log.stdout |
The main difference between the two models is how geometric augmentations are applied during training:
- The paper's model first selected crop or identity (as done in `syncseal/augmentation/geometricunified.py`), but could also pick crops again in subsequent augmentations. This led to non-uniform sampling and to some extreme crops in terms of area.
- The new code also first selects crop or identity, but then removes crops from the pool for subsequent augmentations. This yields a more uniform sampling of geometric transformations and more stable training, which was found to reach similar results in 4x fewer steps per epoch.
- Overall, the way augmentations are applied could be further tuned if you want to improve performance.
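The new sampling scheme can be sketched as follows. This is a simplified illustration of the idea, not the actual codebase API; the function name and the contents of the augmentation pool are assumptions:

```python
import random

def sample_geometric_augs(n_augs: int, rng: random.Random) -> list:
    """Sketch of the crop-first sampling scheme: first choose crop or
    identity, then draw the remaining augmentations from a pool that
    no longer contains crop, so crops are sampled at most once."""
    pool = ["rotation", "perspective", "horizontal_flip", "crop"]
    chosen = []
    if rng.choice(["crop", "identity"]) == "crop":
        chosen.append("crop")
    pool.remove("crop")  # crops are never re-sampled in later draws
    for _ in range(n_augs - len(chosen)):
        chosen.append(rng.choice(pool))
    return chosen
```

Under this scheme a training sample sees at most one crop, always applied first, which is what keeps the sampled crop areas from compounding into extreme crops.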
Use train_sync.py to train a synchronization model.
Below is an example:
```bash
# Example (2 GPUs)
OMP_NUM_THREADS=40 torchrun --nproc_per_node=2 train_sync.py --local_rank 0
```

Notes:
- Datasets are configured via YAML under `configs/datasets/`. Example: `configs/datasets/your-dataset.yaml`.
- To see the specific parameters used in the models released above, see the `expe.json` files linked in the Models section.
If you want to load a training checkpoint and run Python inference with the native modules, use the convenience function in `syncseal/utils/cfg.py`:
```python
from syncseal.utils.cfg import setup_model_from_checkpoint

model, cfg = setup_model_from_checkpoint("/path/to/checkpoint.pth")
model.eval().to("cuda")
```

After training, script your own checkpoint to a single `.pt` file with:
```bash
python -m syncseal.models.scripted \
  --checkpoint /path/to/checkpoint.pth
```

This creates `syncmodel.jit.pt` in the current directory, which can be loaded as shown in the Quickstart section.
Evaluate synchronization accuracy and image quality under geometric and value-metric augmentations:
```bash
python -m syncseal.evals.eval_sync \
  --checkpoint /path/to/checkpoint.pth \
  --dataset your-dataset \
  --num_samples 100 \
  --short_edge_size 512 \
  --square_images true \
  --output_dir outputs/sync_eval
```

Baseline options:
- `--checkpoint baseline/sift` uses a SIFT+Lowe matching baseline.
- `--checkpoint baseline/wam` runs a WAM-based baseline.
The script writes two CSV files in the output directory:
- `sync_metrics.csv`: sync error per augmentation setup (includes detection and unwarp timing)
- `image_quality_metrics.csv`: PSNR/SSIM/LPIPS between original and watermarked images
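To compare runs, it is convenient to average the sync error per augmentation from `sync_metrics.csv`. A minimal sketch with the standard library follows; the column names `augmentation` and `sync_error` are assumptions here, so adapt them to the actual CSV header:

```python
import csv
from collections import defaultdict

def mean_sync_error_per_aug(csv_path: str) -> dict:
    """Average a numeric error column per augmentation setup.

    Column names ('augmentation', 'sync_error') are assumed; check the
    header of the generated sync_metrics.csv and adjust accordingly.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            aug = row["augmentation"]
            sums[aug] += float(row["sync_error"])
            counts[aug] += 1
    return {aug: sums[aug] / counts[aug] for aug in sums}
```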
- Standalone quickstart and visualization: `notebooks/standalone.ipynb`
Please see the LICENSE file in the root of the main repository.
If you find this repository useful, please consider giving a star ⭐ and please cite as:
```bibtex
@article{fernandez2025geometric,
  title={Geometric Image Synchronization with Deep Watermarking},
  author={Fernandez, Pierre and Sou\v{c}ek, Tom\'{a}\v{s} and Jovanovi\'{c}, Nikola and Elsahar, Hady and Rebuffi, Sylvestre-Alvise and Lacatusu, Valeriu and Tran, Tuan and Mourachko, Alexandre},
  journal={arXiv preprint arXiv:2509.15208},
  year={2025}
}
```