|
1 | | -<!-- PROJECT SHIELDS --> |
2 | | -<!-- |
3 | | -*** I'm using markdown "reference style" links for readability. |
4 | | -*** Reference links are enclosed in brackets [ ] instead of parentheses ( ). |
5 | | -*** See the bottom of this document for the declaration of the reference variables |
6 | | -*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use. |
7 | | -*** https://www.markdownguide.org/basic-syntax/#reference-style-links |
8 | | ---> |
9 | | -[![Stargazers][stars-shield]][stars-url] |
10 | | -[![Forks][forks-shield]][forks-url] |
11 | | -[![Contributors][contributors-shield]][contributors-url] |
12 | | -[![Issues][issues-shield]][issues-url] |
13 | | -[![MIT License][license-shield]][license-url] |
14 | | - |
15 | | -<!-- PROJECT LOGO --> |
16 | | -<br /> |
17 | | -<div align="center"> |
18 | | - <h3 align="center">Spiking-FullSubNet</h3> |
19 | | - |
20 | | - <p align="center"> |
21 | | - Intel N-DNS Challenge Algorithmic Track Winner |
22 | | - <br /> |
23 | | - <a href="https://haoxiangsnr.github.io/spiking-fullsubnet/"><strong>Explore the docs »</strong></a> |
24 | | - <br /> |
25 | | - <br /> |
26 | | - <a href="https://github.com/haoxiangsnr/spiking-fullsubnet/">View Demo</a> |
27 | | - · |
28 | | - <a href="https://github.com/haoxiangsnr/spiking-fullsubnet/issues">Report Bug</a> |
29 | | - · |
30 | | - <a href="https://github.com/haoxiangsnr/spiking-fullsubnet/issues">Request Feature</a> |
31 | | - </p> |
32 | | -</div> |
33 | | - |
34 | | - |
35 | | -<!-- ABOUT THE PROJECT --> |
36 | | -## About The Project |
37 | | - |
38 | | - |
39 | | - |
40 | | -We are proud to announce that Spiking-FullSubNet has emerged as the winner of Intel N-DNS Challenge Track 1 (Algorithmic). Please refer to our [brief write-up here](./Spiking-FullSubNet.pdf) for more details. This repository serves as the official home of the Spiking-FullSubNet implementation. Here, you will find: |
41 | | - |
42 | | -- A PyTorch-based implementation of the Spiking-FullSubNet model. |
| 1 | +# Spiking-FullSubNet |
| 2 | + |
| 3 | +Spiking-FullSubNet is the winning solution of the Intel N-DNS Challenge Track 1 (Algorithmic). This repository is the official home of the Spiking-FullSubNet implementation. Here you will find:
| 4 | + |
| 5 | +- A PyTorch-based implementation of the Spiking-FullSubNet model described in our paper "Toward Ultralow-Power Neuromorphic Speech Enhancement With Spiking-FullSubNet". |
43 | 6 | - Scripts for training the model and evaluating its performance. |
44 | 7 | - Pre-trained models in the `model_zoo` directory, ready to be fine-tuned on other datasets.
45 | | - |
46 | | -<!--- |
47 | | -We are actively working on improving the documentation, fixing bugs and removing redundancies. Please feel free to raise an issue or submit a pull request if you have suggestions for enhancements. |
48 | | -Our team is diligently working on a comprehensive paper that will delve into the intricate details of Spiking-FullSuNet's architecture, its operational excellence, and the broad spectrum of its potential applications. Please stay tuned! |
49 | | ---> |
| 8 | +- The frozen version of the solution submitted to the Intel N-DNS Challenge, preserved at commit `38fe020`.
50 | 9 |
|
51 | 10 | ## Updates |
52 | 11 |
|
53 | | -[2024-02-26] Currently, our repo contains two versions of the code: |
| 12 | +- 2026-01-27: The `main` branch contains the implementation of the Spiking-FullSubNet model described in our published paper "Toward Ultralow-Power Neuromorphic Speech Enhancement With Spiking-FullSubNet", which includes several improvements and optimizations over the challenge version. We recommend using this branch for citation and further research.
| 13 | +- 2024-02-26: The **frozen version** is a backup of the solution submitted to the Intel N-DNS Challenge, which was checked and verified by Intel during the challenge. To reproduce the experimental results from that time, refer to commit [38fe020](https://github.com/haoxiangsnr/spiking-fullsubnet/tree/38fe020cdb803d2fdc76a0df4b06311879c8e370). After switching to this commit, place the checkpoints from `model_zoo` into the `exp` directory and use `-M test` for inference or `-M train` to retrain the model. After the challenge, we made several improvements and optimizations and published a paper (IEEE TNNLS) based on them; see the `main` branch for the published version.
| 14 | + |
| 15 | +## Quick Start |
| 16 | + |
| 17 | +You can either clone the repository, set up an environment, and run the scripts, or open the project directly in Colab (under construction).
| 18 | + |
| 19 | +## Environment Setup |
| 20 | + |
| 21 | +We recommend [uv](https://docs.astral.sh/uv/) as your package manager, but feel free to use whichever one you prefer.
54 | 22 |
|
55 | | -1. The **frozen version**, which serves as a backup for the code used in a previous competition. However, due to a restructuring in the `audiozen` directory, this version can no longer be directly used for inference. If you need to verify the experimental results from that time, please refer to this specific commit: [38fe020](https://github.com/haoxiangsnr/spiking-fullsubnet/tree/38fe020cdb803d2fdc76a0df4b06311879c8e370). There you will find everything you need. After switching to this commit, you can place the checkpoints from the `model_zoo` into the `exp` directory and use `-M test` for inference or `-M train` to retrain the model. |
| 23 | +> [!TIP] |
| 24 | +> uv is significantly faster (10~100x) than pip and handles dependency resolution more reliably. |
| 25 | +> The `uv.lock` file ensures reproducible installations across different machines. |
56 | 26 |
|
57 | | -2. The **latest version** of the code has undergone some restructuring and optimization to make it more understandable for readers. We've also introduced `acceleate` to assist with better training practices. We believe you can follow the instructions in the help documentation to run the training code directly. The pre-trained model checkpoints and a more detailed paper will be released by next weekend, so please stay tuned for that. |
58 | 27 |
|
| 28 | +```bash |
| 29 | +# Clone the repository |
| 30 | +git clone git@github.com:haoxiangsnr/spiking-fullsubnet.git && cd spiking-fullsubnet |
59 | 31 |
|
| 32 | +# [Optional] Install uv |
| 33 | +# Check https://docs.astral.sh/uv/ for other installation methods |
| 34 | +curl -LsSf https://astral.sh/uv/install.sh | sh |
60 | 35 |
|
61 | | -## Documentation |
| 36 | +# Install all dependencies (creates .venv automatically) |
| 37 | +# This will: |
| 38 | +# - Create a virtual environment in `.venv` |
| 39 | +# - Install all dependencies from `uv.lock` |
| 40 | +# - Install `audiozen` folder in editable mode so you can import it everywhere |
| 41 | +uv sync --all-extras |
62 | 42 |
|
63 | | -See the [Documentation](https://haoxiangsnr.github.io/spiking-fullsubnet/) for installation and usage. Our team is actively working to improve the documentation. Please feel free to raise an issue or submit a pull request if you have suggestions for enhancements. |
| 43 | +# Activate the virtual environment |
| 44 | +source .venv/bin/activate |
| 45 | +``` |
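After activating the environment, a quick sanity check can confirm that the key packages resolve. This is an illustrative sketch, not part of the repository; the package names `torch` and `audiozen` are taken from the install steps above:

```python
# Quick sanity check that key packages resolve inside the activated venv.
from importlib.util import find_spec

for pkg in ("torch", "audiozen"):
    status = "ok" if find_spec(pkg) is not None else "MISSING"
    print(f"{pkg}: {status}")
```

If `audiozen` shows as MISSING, the editable install (`uv sync` or `pip install -e .`) did not complete.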
| 46 | + |
| 47 | +If you prefer Conda/pip, you can still use the traditional approach: |
| 48 | + |
| 49 | +```bash |
| 50 | +git clone git@github.com:haoxiangsnr/spiking-fullsubnet.git && cd spiking-fullsubnet |
| 51 | + |
| 52 | +conda create --name spiking-fullsubnet python=3.10 |
| 53 | +conda activate spiking-fullsubnet |
| 54 | + |
| 55 | +# torch==2.1.1 and torch==2.10 have been tested to work well with this codebase |
| 56 | +conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia |
| 57 | + |
| 58 | +# Install other dependencies and the `audiozen` folder in editable mode |
| 59 | +# Please check the `pyproject.toml` for the full list of dependencies |
| 60 | +pip install -e . |
| 61 | +``` |
| 62 | + |
| 63 | +## Inference on Validation Set |
| 64 | + |
| 65 | +Since inference on the official test set takes a long time (10+ hours) on a single GPU,
| 66 | +we provide a mini-validation set for quick verification. It contains 341 noisy-clean pairs generated in the same way as the official test set. In our experience, performance gains on this set correlate strongly with gains on the official test set.
| 67 | + |
| 68 | +```bash |
| 69 | +# Download the validation set from GitHub Releases
| 70 | +cd <your_project_root> |
| 71 | +mkdir data && cd data |
| 72 | + |
| 73 | +# Download and extract validation set |
| 74 | +wget https://github.com/haoxiangsnr/spiking-fullsubnet/releases/download/data/validation_set.tar.gz |
| 75 | +tar -xzvf validation_set.tar.gz |
| 76 | + |
| 77 | +# folder structure: |
| 78 | +. |
| 79 | +└── data |
| 80 | + ├── validation_set |
| 81 | + │ ├── clean |
| 82 | + │ │ ├── clean_fileid_119.wav |
| 83 | + │ │ ├── clean_fileid_165.wav |
| 84 | + │ │ └── clean_fileid_7.wav |
| 85 | + │ ├── noise |
| 86 | + │ │ ├── noise_fileid_27.wav |
| 87 | + │ │ ├── noise_fileid_312.wav |
| 88 | + │ │ └── noise_fileid_4.wav |
| 89 | + │ └── noisy |
| 90 | + │ ├── book_00588_chp_0003_..._fileid_115.wav |
| 91 | + │ ├── book_09739_chp_0003_..._fileid_275.wav |
| 92 | + │ └── German_Wikiped_..._fileid_246.wav |
| 93 | + └── validation_set.tar.gz |
| 94 | +``` |
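To confirm the archive extracted correctly, a small script can match noisy and clean files by their trailing `fileid`. The `count_pairs` helper below is an illustrative sketch, not part of the repository:

```python
from pathlib import Path

def count_pairs(root: str) -> int:
    """Count noisy/clean wav pairs matched by their trailing fileid."""
    base = Path(root)
    fid = lambda p: p.stem.rsplit("_fileid_", 1)[-1]
    clean_ids = {fid(p) for p in (base / "clean").glob("*.wav")}
    noisy_ids = {fid(p) for p in (base / "noisy").glob("*.wav")}
    return len(clean_ids & noisy_ids)

# Expect 341 once validation_set.tar.gz is extracted as shown above
print(count_pairs("data/validation_set"))
```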
| 95 | + |
| 96 | +To run inference on the validation set using the pre-trained model, use the following command: |
| 97 | + |
| 98 | +```bash |
| 99 | +cd <your_project_root> |
| 100 | + |
| 101 | +# Download the pre-trained model from GitHub Releases
| 102 | +wget https://github.com/haoxiangsnr/spiking-fullsubnet/releases/download/ckpt-epoch-188/epoch_0188.zip |
| 103 | + |
| 104 | +# Unzip the pre-trained model to the correct directory |
| 105 | +mkdir -p recipes/intel_ndns/spiking_fullsubnet_v2/exp/middle-model__partition-0-32-128-256__grouping-8-32-64__deep-filtering-5-3-1__synops-1e-8/checkpoints/ |
| 106 | + |
| 107 | +unzip epoch_0188.zip -d recipes/intel_ndns/spiking_fullsubnet_v2/exp/middle-model__partition-0-32-128-256__grouping-8-32-64__deep-filtering-5-3-1__synops-1e-8/checkpoints/epoch_0188 |
| 108 | + |
| 109 | +# Inference on validation set |
| 110 | +accelerate launch --multi_gpu \ |
| 111 | + --num_processes=4 \ |
| 112 | + --gpu_ids 0,1,2,3 \ |
| 113 | + --main_process_port 46601 \ |
| 114 | + run.py \ |
| 115 | + --config_path conf/middle-model__partition-0-32-128-256__grouping-8-32-64__deep-filtering-5-3-1__synops-1e-8.yaml \ |
| 116 | + --eval_batch_size 4 \ |
| 117 | +  --resume_from_checkpoint recipes/intel_ndns/spiking_fullsubnet_v2/exp/middle-model__partition-0-32-128-256__grouping-8-32-64__deep-filtering-5-3-1__synops-1e-8/checkpoints/epoch_0188 \
| 118 | + --do_eval true |
| 119 | +``` |
| 120 | + |
| 121 | +Depending on your software environment, you should obtain results similar to the following:
| 122 | + |
| 123 | +| set | si_sdr | P808 | OVRL | SIG | BAK | |
| 124 | +| ---------: | ------: | ------: | ------: | ------: | ------: | |
| 125 | +| validation | 15.0127 | 3.61135 | 3.01281 | 3.33227 | 3.93021 | |
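The `si_sdr` column above is the scale-invariant signal-to-distortion ratio in dB. As a quick reference, here is a minimal NumPy sketch of the metric; it is an illustrative implementation, not the evaluation code this repository uses:

```python
import numpy as np

def si_sdr(estimate: np.ndarray, target: np.ndarray) -> float:
    """Scale-invariant signal-to-distortion ratio (dB); higher is better."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to get the ideally scaled reference
    alpha = np.dot(estimate, target) / np.dot(target, target)
    projection = alpha * target
    noise = estimate - projection
    return float(10 * np.log10(np.sum(projection**2) / np.sum(noise**2)))

# A lightly corrupted copy of a test signal scores high; more noise lowers it
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 100, 16000))
noisy = clean + 0.01 * rng.standard_normal(16000)
print(round(si_sdr(noisy, clean), 1))
```

Because the target is rescaled by the optimal `alpha` before comparison, the score is unchanged when the estimate is multiplied by a constant, which is why SI-SDR is preferred over plain SNR for enhancement models.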
| 126 | + |
| 127 | +## Citation |
| 128 | + |
| 129 | +If you find this repository useful for your research, please consider citing the following papers: |
| 130 | + |
| 131 | +```bibtex |
| 132 | +@ARTICLE{hao2025toward, |
| 133 | + author={Hao, Xiang and Ma, Chenxiang and Yang, Qu and Wu, Jibin and Tan, Kay Chen}, |
| 134 | + journal={IEEE Transactions on Neural Networks and Learning Systems}, |
| 135 | + title={Toward Ultralow-Power Neuromorphic Speech Enhancement With Spiking-FullSubNet}, |
| 136 | + year={2025}, |
| 137 | + volume={36}, |
| 138 | + number={9}, |
| 139 | + pages={17350-17364}, |
| 140 | + doi={10.1109/TNNLS.2025.3566021}} |
| 141 | +
|
| 142 | +@INPROCEEDINGS{hao2024when, |
| 143 | + author={Hao, Xiang and Ma, Chenxiang and Yang, Qu and Tan, Kay Chen and Wu, Jibin}, |
| 144 | + booktitle={2024 IEEE Conference on Artificial Intelligence (CAI)}, |
| 145 | + title={When Audio Denoising Meets Spiking Neural Network}, |
| 146 | + year={2024}, |
| 147 | + volume={}, |
| 148 | + number={}, |
| 149 | + pages={1524-1527}, |
| 150 | + doi={10.1109/CAI59869.2024.00275}} |
| 151 | +``` |
64 | 152 |
|
65 | 153 | ## License |
66 | 154 |
|
67 | | -All the code in this repository is released under the [MIT License](https://opensource.org/licenses/MIT), for more details see the [LICENSE](LICENSE) file. |
68 | | - |
69 | | - |
70 | | -<!-- MARKDOWN LINKS & IMAGES --> |
71 | | -<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links --> |
72 | | -[contributors-shield]: https://img.shields.io/github/contributors/haoxiangsnr/spiking-fullsubnet.svg?style=for-the-badge |
73 | | -[contributors-url]: https://github.com/haoxiangsnr/spiking-fullsubnet/graphs/contributors |
74 | | -[forks-shield]: https://img.shields.io/github/forks/haoxiangsnr/spiking-fullsubnet.svg?style=for-the-badge |
75 | | -[forks-url]: https://github.com/haoxiangsnr/spiking-fullsubnet/network/members |
76 | | -[stars-shield]: https://img.shields.io/github/stars/haoxiangsnr/spiking-fullsubnet.svg?style=for-the-badge |
77 | | -[stars-url]: https://github.com/haoxiangsnr/spiking-fullsubnet/stargazers |
78 | | -[issues-shield]: https://img.shields.io/github/issues/haoxiangsnr/spiking-fullsubnet.svg?style=for-the-badge |
79 | | -[issues-url]: https://github.com/haoxiangsnr/spiking-fullsubnet/issues |
80 | | -[license-shield]: https://img.shields.io/github/license/haoxiangsnr/spiking-fullsubnet.svg?style=for-the-badge |
81 | | -[license-url]: https://github.com/haoxiangsnr/spiking-fullsubnet/blob/master/LICENSE.txt |
| 155 | +This project is licensed under the [MIT License](LICENSE). |