fix deprecations, onnx conversion, inference example

clibdev · clibdev · commit 8e4bc85e2e3b · 2025-01-11T13:59:49.000+02:00
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,8 @@
+/__pycache__
+*/__pycache__
+/.idea
+*.pt
+*.pth
+*.onnx
+/results
+result.png
diff --git a/README.md b/README.md
@@ -1,127 +1,44 @@
-# ShadowFormer (AAAI'23)
-This is the official implementation of the AAAI 2023 paper [ShadowFormer: Global Context Helps Image Shadow Removal](https://arxiv.org/pdf/2302.01650.pdf).
+# Fork of [GuoLanqing/ShadowFormer](https://github.com/GuoLanqing/ShadowFormer)
 
-[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/shadowformer-global-context-helps-image/shadow-removal-on-istd)](https://paperswithcode.com/sota/shadow-removal-on-istd?p=shadowformer-global-context-helps-image)
-[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/shadowformer-global-context-helps-image/shadow-removal-on-adjusted-istd)](https://paperswithcode.com/sota/shadow-removal-on-adjusted-istd?p=shadowformer-global-context-helps-image)
-[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/shadowformer-global-context-helps-image/shadow-removal-on-srd)](https://paperswithcode.com/sota/shadow-removal-on-srd?p=shadowformer-global-context-helps-image)
+Differences between the original repository and this fork:
 
-#### News
-* **Feb 24, 2023**: Release the pretrained models for ISTD and ISTD+.
-* **Feb 18, 2023**: Release the training and testing codes.
-* **Feb 17, 2023**: Add the testing results and the description of our work.
+* Compatibility with PyTorch >=2.5. (🔥)
+* Original pretrained models and converted ONNX models are available from the GitHub [releases page](https://github.com/clibdev/ShadowFormer/releases). (🔥)
+* Model conversion to ONNX format using the [export.py](export.py) file. (🔥)
+* Sample script [inference.py](inference.py) for inference on a single image.
+* The following deprecations have been fixed:
+  * UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument.
+  * FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers.
+  * FutureWarning: You are using 'torch.load' with 'weights_only=False'.
 
-## Introduction
-To tackle image shadow removal problem, we propose a novel transformer-based method, dubbed ShadowFormer, for exploiting non-shadow
-regions to help shadow region restoration. A multi-scale channel attention framework is employed to hierarchically
-capture the global information. Based on that, we propose a Shadow-Interaction Module (SIM) with Shadow-Interaction Attention (SIA) in the bottleneck stage to effectively model the context correlation between shadow and non-shadow regions. 
-For more details, please refer to our [original paper](https://arxiv.org/pdf/2302.01650.pdf)
+# Installation
 
-<p align=center><img width="80%" src="doc/pipeline.jpg"/></p>
-
-<p align=center><img width="80%" src="doc/details.jpg"/></p>
-
-## Requirement
-* Python 3.7
-* Pytorch 1.7
-* CUDA 11.1
-```bash
+```shell
 pip install -r requirements.txt
 ```
 
-## Datasets
-* ISTD [[link]](https://github.com/DeepInsight-PCALab/ST-CGAN)  
-* ISTD+ [[link]](https://github.com/cvlab-stonybrook/SID)
-* SRD [[Training]](https://drive.google.com/file/d/1W8vBRJYDG9imMgr9I2XaA13tlFIEHOjS/view)[[Testing]](https://drive.google.com/file/d/1GTi4BmQ0SJ7diDMmf-b7x2VismmXtfTo/view)
+# Pretrained models
 
-## Pretrained models
-[ISTD](https://drive.google.com/file/d/1bHbkHxY5D5905BMw2jzvkzgXsFPKzSq4/view?usp=share_link) | [ISTD+](https://drive.google.com/file/d/10pBsJenoWGriZ9kjWOcE4l4Kzg-F1TFd/view?usp=share_link) | [SRD]()
+* Download links:
 
-Please download the corresponding pretrained model and modify the `weights` in `test.py`.
+| Name                 | Model Size (MB) | Link                                                                                                                                                                                                          | SHA-256                                                                                                                              |
+|----------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|
+| ShadowFormer (ISTD)  | 130.9<br>83.0   | [PyTorch](https://github.com/clibdev/ShadowFormer/releases/latest/download/shadowformer-istd.pt)<br>[ONNX](https://github.com/clibdev/ShadowFormer/releases/latest/download/shadowformer-istd.onnx)           | 4700ae374b965253734dbcac0b63c9cac9af5895ff19655710042a988751fc98<br>96b90f5f1d11b67e3c7835cae3ccacaaa78ac4fadbf03a04fd36769e21f619a6 |
+| ShadowFormer (ISTD+) | 130.9<br>83.0   | [PyTorch](https://github.com/clibdev/ShadowFormer/releases/latest/download/shadowformer-istd-plus.pt)<br>[ONNX](https://github.com/clibdev/ShadowFormer/releases/latest/download/shadowformer-istd-plus.onnx) | 2748060149908df37cc65f0695ef61d64cd25847aba0c35af36823f9b780f5b2<br>077128017e7400c0e7c22210d6afb83748bfb068a6e02037156ea4ab8a8592a9 |
 
-## Test
-You can directly test the performance of the pre-trained model as follows
-1. Modify the paths to dataset and pre-trained model. You need to modify the following path in the `test.py` 
-```python
-input_dir # shadow image input path -- Line 27
-weights # pretrained model path -- Line 31
-```
-2. Test the model
-```python
-python test.py --save_images
-```
-You can check the output in `./results`.
+# Inference
 
-## Train
-1. Download datasets and set the following structure
-```
-|-- ISTD_Dataset
-    |-- train
-        |-- train_A # shadow image
-        |-- train_B # shadow mask
-        |-- train_C # shadow-free GT
-    |-- test
-        |-- test_A # shadow image
-        |-- test_B # shadow mask
-        |-- test_C # shadow-free GT
-```
-2. You need to modify the following terms in `option.py`
-```python
-train_dir  # training set path
-val_dir   # testing set path
-gpu: 0 # Our model can be trained using a single RTX A5000 GPU. You can also train the model using multiple GPUs by adding more GPU ids in it.
-```
-3. Train the network
-If you want to train the network on 256X256 images:
-```python
-python train.py --warmup --win_size 8 --train_ps 256
-```
-or you want to train on original resolution, e.g., 480X640 for ISTD:
-```python
-python train.py --warmup --win_size 10 --train_ps 320
+```shell
+python inference.py --weights shadowformer-istd.pt --input_path img/noisy_image.png --mask_path img/mask.png
+python inference.py --weights shadowformer-istd-plus.pt --input_path img/noisy_image.png --mask_path img/mask.png
 ```
 
-## Evaluation
-The results reported in the paper are calculated by the `matlab` script used in [previous method](https://github.com/zhuyr97/AAAI2022_Unfolding_Network_Shadow_Removal/tree/master/codes). Details refer to `evaluation/measure_shadow.m`.
-We also provide the `python` code for calculating the metrics in `test.py`, using `python test.py --cal_metrics` to print.
+# Export to ONNX format
 
-## Results
-#### Evaluation on ISTD
-The evaluation results on ISTD are as follows
-| Method | PSNR | SSIM | RMSE |
-| :-- | :--: | :--: | :--: |
-| ST-CGAN | 27.44 | 0.929 | 6.65 |
-| DSC | 29.00 | 0.944 | 5.59 |
-| DHAN | 29.11 | 0.954 | 5.66 |
-| Fu et al. | 27.19 | 0.945 | 5.88 |
-| Zhu et al. | 29.85 | 0.960 | 4.27 |
-| **ShadowFormer (Ours)** | **32.21** | **0.968** | **4.09** |
-
-#### Visual Results
-<p align=center><img width="80%" src="doc/res.jpg"/></p>
-
-#### Testing results
-The testing results on dataset ISTD, ISTD+, SRD are: [results](https://drive.google.com/file/d/1zcv7KBCIKgk-CGQJCWnM2YAKcSAj8Sc4/view?usp=share_link)
-
-## References
-Our implementation is based on [Uformer](https://github.com/ZhendongWang6/Uformer) and [Restormer](https://github.com/swz30/Restormer). We would like to thank them.
-
-Citation
------
-Preprint available [here](https://arxiv.org/pdf/2302.01650.pdf). 
-
-In case of use, please cite our publication:
-
-L. Guo, S. Huang, D. Liu, H. Cheng and B. Wen, "ShadowFormer: Global Context Helps Image Shadow Removal," AAAI 2023.
-
-Bibtex:
+```shell
+pip install onnx
 ```
-@article{guo2023shadowformer,
-  title={ShadowFormer: Global Context Helps Image Shadow Removal},
-  author={Guo, Lanqing and Huang, Siyu and Liu, Ding and Cheng, Hao and Wen, Bihan},
-  journal={arXiv preprint arXiv:2302.01650},
-  year={2023}
-}
+```shell
+python export.py --weights shadowformer-istd.pt
+python export.py --weights shadowformer-istd-plus.pt
 ```
-
-## Contact
-If you have any questions, please contact lanqing001@e.ntu.edu.sg
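Note on the SHA-256 column in the new README table: a downloaded file can be checked against it with a few lines of standard-library Python. The sketch below is only an illustration and is not part of this commit. The helper names `sha256_of_file` and `verify` are hypothetical, and it assumes the first hash in each table cell belongs to the PyTorch file and the second to the ONNX file, matching the order of the links.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumption: this is the hash listed for the ISTD PyTorch weights.
    weights = Path("shadowformer-istd.pt")
    expected = "4700ae374b965253734dbcac0b63c9cac9af5895ff19655710042a988751fc98"
    if weights.exists():
        print("OK" if verify(weights, expected) else "MISMATCH")
    else:
        print(f"{weights} not found; download it from the releases page first")
```

Running the script next to the downloaded weights prints `OK` when the digest matches the table entry.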
diff --git a/export.py b/export.py
@@ -0,0 +1,41 @@
+from model import ShadowFormer
+import torch
+import utils
+import argparse
+import os
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--weights', default='./shadowformer-istd.pt')
+parser.add_argument('--device', default='cpu', type=str, help='cuda or cpu')
+parser.add_argument('--dynamic', action='store_true', default=False, help='enable dynamic axis in onnx model')
+opt = parser.parse_args()
+
+device = torch.device(opt.device)
+
+win_size = 10
+img_multiple_of = 8 * win_size
+
+model_restoration = ShadowFormer(img_size=320, embed_dim=32, win_size=win_size, token_projection='linear', token_mlp='leff')
+
+utils.load_checkpoint(model_restoration, opt.weights)
+
+model_restoration.to(device)
+model_restoration.eval()
+
+model_path = os.path.splitext(opt.weights)[0] + '.onnx'
+
+dummy_input_1 = torch.randn(1, 3, 480, 640).to(opt.device)
+dummy_input_2 = torch.randn(1, 1, 480, 640).to(opt.device)
+
+dynamic_axes = {'input': {2: '?', 3: '?'}, 'mask': {2: '?', 3: '?'}, 'output': {2: '?', 3: '?'}} if opt.dynamic else None
+
+torch.onnx.export(
+    model_restoration,
+    (dummy_input_1, dummy_input_2),
+    model_path,
+    verbose=False,
+    input_names=['input', 'mask'],
+    output_names=['output'],
+    dynamic_axes=dynamic_axes,
+    opset_version=17
+)
diff --git a/img/mask.png b/img/mask.png
diff --git a/img/noisy_image.png b/img/noisy_image.png
diff --git a/inference.py b/inference.py
@@ -0,0 +1,53 @@
+from model import ShadowFormer
+import torch
+import utils
+import numpy as np
+import torch.nn.functional as F
+import argparse
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--input_path', default='./img/noisy_image.png')
+parser.add_argument('--mask_path', default='./img/mask.png')
+parser.add_argument('--output_path', default='./result.png')
+parser.add_argument('--weights', default='./shadowformer-istd.pt')
+parser.add_argument('--device', default='cuda', help='cuda or cpu')
+opt = parser.parse_args()
+
+device = torch.device(opt.device)
+
+win_size = 10
+img_multiple_of = 8 * win_size
+
+model_restoration = ShadowFormer(img_size=320, embed_dim=32, win_size=win_size, token_projection='linear', token_mlp='leff')
+
+utils.load_checkpoint(model_restoration, opt.weights)
+
+model_restoration.to(device)
+model_restoration.eval()
+
+with torch.no_grad():
+    rgb_noisy = torch.from_numpy(np.float32(utils.load_img(opt.input_path)))
+    rgb_noisy = rgb_noisy.permute(2,0,1)
+    rgb_noisy = rgb_noisy.unsqueeze(0).to(device)
+
+    mask = utils.load_mask(opt.mask_path)
+    mask = torch.from_numpy(np.float32(mask))
+    mask = torch.unsqueeze(mask, dim=0)
+    mask = mask.unsqueeze(0).to(device)
+
+    # Pad the input if its height or width is not a multiple of win_size * 8
+    height, width = rgb_noisy.shape[2], rgb_noisy.shape[3]
+    H, W = ((height + img_multiple_of) // img_multiple_of) * img_multiple_of, (
+            (width + img_multiple_of) // img_multiple_of) * img_multiple_of
+    padh = H - height if height % img_multiple_of != 0 else 0
+    padw = W - width if width % img_multiple_of != 0 else 0
+    rgb_noisy = F.pad(rgb_noisy, (0, padw, 0, padh), 'reflect')
+    mask = F.pad(mask, (0, padw, 0, padh), 'reflect')
+
+    rgb_restored = model_restoration(rgb_noisy, mask)
+    rgb_restored = torch.clamp(rgb_restored, 0, 1).cpu().numpy().squeeze().transpose((1, 2, 0))
+
+    # Unpad the output
+    rgb_restored = rgb_restored[:height, :width, :]
+
+    utils.save_img(rgb_restored*255.0, opt.output_path)
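Note on the padding step in `inference.py` above: it rounds the height and width up to the next multiple of `8 * win_size` (80 with `win_size = 10`) and pads nothing when a dimension is already aligned. The arithmetic can be sketched in isolation as follows. The helper names `padded_size` and `pad_amounts` are hypothetical and not part of the commit; the sketch uses the usual ceiling-division form, which gives the same amounts as the two-step expression in the script.

```python
def padded_size(size: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is greater than or equal to `size`."""
    return ((size + multiple - 1) // multiple) * multiple


def pad_amounts(height: int, width: int, win_size: int = 10):
    """Return (padh, padw): rows and columns to add at the bottom and right."""
    multiple = 8 * win_size  # same constant as img_multiple_of in inference.py
    padh = padded_size(height, multiple) - height
    padw = padded_size(width, multiple) - width
    return padh, padw


if __name__ == "__main__":
    print(pad_amounts(480, 640))  # already aligned: (0, 0)
    print(pad_amounts(500, 700))  # padded to 560 x 720: (60, 20)
```

The script later crops the output back to the original `height` and `width`, so the padding only affects what the network sees.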
diff --git a/model.py b/model.py
@@ -1,7 +1,7 @@
 import torch
 import torch.nn as nn
 import torch.utils.checkpoint as checkpoint
-from timm.models.layers import DropPath, to_2tuple, trunc_normal_
+from timm.layers import DropPath, to_2tuple, trunc_normal_
 import torch.nn.functional as F
 from einops import rearrange, repeat
 from einops.layers.torch import Rearrange
@@ -362,7 +362,7 @@ def __init__(self, dim, win_size, num_heads, token_projection='linear', qkv_bias
         # get pair-wise relative position index for each token inside the window
         coords_h = torch.arange(self.win_size[0])  # [0,...,Wh-1]
         coords_w = torch.arange(self.win_size[1])  # [0,...,Ww-1]
-        coords = torch.stack(torch.meshgrid([coords_h, coords_w]))  # 2, Wh, Ww
+        coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing='ij'))  # 2, Wh, Ww
         coords_flatten = torch.flatten(coords, 1)  # 2, Wh*Ww
         relative_coords = coords_flatten[:, :, None] - coords_flatten[:, None, :]  # 2, Wh*Ww, Wh*Ww
         relative_coords = relative_coords.permute(1, 2, 0).contiguous()  # Wh*Ww, Wh*Ww, 2
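Note on `indexing='ij'` in the hunk above: this is the layout `torch.meshgrid` has used by default, so passing it explicitly silences the warning without changing the relative position index. The plain-Python sketch below mimics what that layout contains and how the pairwise offsets are formed. It is a torch-free illustration only; `meshgrid_ij` and `relative_coords` are hypothetical helpers, not functions from the repository.

```python
def meshgrid_ij(rows, cols):
    """Mimic the 'ij' layout: both grids have shape (len(rows), len(cols))."""
    rows, cols = list(rows), list(cols)
    grid_h = [[r for _ in cols] for r in rows]  # row index varies along axis 0
    grid_w = [[c for c in cols] for _ in rows]  # column index varies along axis 1
    return grid_h, grid_w


def relative_coords(win_h, win_w):
    """Pairwise (dh, dw) offsets between all tokens of a win_h x win_w window."""
    grid_h, grid_w = meshgrid_ij(range(win_h), range(win_w))
    # Flatten in row-major order, like torch.flatten(coords, 1).
    flat_h = [v for row in grid_h for v in row]
    flat_w = [v for row in grid_w for v in row]
    n = win_h * win_w
    # Entry [a][b] is coordinate(a) - coordinate(b), as in the model code.
    return [
        [(flat_h[a] - flat_h[b], flat_w[a] - flat_w[b]) for b in range(n)]
        for a in range(n)
    ]


if __name__ == "__main__":
    gh, gw = meshgrid_ij(range(2), range(3))
    print(gh)  # [[0, 0, 0], [1, 1, 1]]
    print(gw)  # [[0, 1, 2], [0, 1, 2]]
    print(relative_coords(2, 2)[0][3])  # (-1, -1)
```

With `'xy'` indexing the two grids would instead have shape `(Ww, Wh)`, which would change the flattened token order for non-square windows.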
diff --git a/requirements.txt b/requirements.txt
@@ -1,8 +1,8 @@
-torch==1.7.1
-torchvision==0.8.2
+torch>=2.5.0
+torchvision>=0.20.0
 matplotlib
 scikit-image
-opencv-python
+opencv-python>=4.10.0
 yacs
 joblib 
 natsort 
@@ -12,4 +12,7 @@ einops
 linformer
 timm
 ptflops
-dataclasses
+dataclasses
+
+### Optional - ONNX format
+# onnx>=1.17.0
diff --git a/test.py b/test.py
@@ -17,17 +17,15 @@
 import cv2
 from model import UNet
 
-from skimage import img_as_float32, img_as_ubyte
 from skimage.metrics import peak_signal_noise_ratio as psnr_loss
 from skimage.metrics import structural_similarity as ssim_loss
-from sklearn.metrics import mean_squared_error as mse_loss
 
 parser = argparse.ArgumentParser(description='RGB denoising evaluation on the validation set of SIDD')
-parser.add_argument('--input_dir', default='../ISTD_Dataset/test/',
+parser.add_argument('--input_dir', default='./ISTD_Dataset/test',
     type=str, help='Directory of validation images')
 parser.add_argument('--result_dir', default='./results/',
     type=str, help='Directory for results')
-parser.add_argument('--weights', default='./log/ShadowFormer_istd/models/model_best.pth',
+parser.add_argument('--weights', default='shadowformer-istd.pt',
     type=str, help='Path to weights')
 parser.add_argument('--gpus', default='0', type=str, help='CUDA_VISIBLE_DEVICES')
 parser.add_argument('--arch', default='ShadowFormer', type=str, help='arch')
diff --git a/utils/model_utils.py b/utils/model_utils.py
@@ -21,7 +21,7 @@ def save_checkpoint(model_dir, state, session):
     torch.save(state, model_out_path)
 
 def load_checkpoint(model, weights):
-    checkpoint = torch.load(weights)
+    checkpoint = torch.load(weights, weights_only=True)
     try:
         model.load_state_dict(checkpoint["state_dict"])
     except: