
[Pattern Recognition 2025] Official implementation of paper: "TSLDSeg: A Texture-aware and Semantic-enhanced Latent Diffusion Model for Medical Image Segmentation"

Saury997/TSLDSeg


TSLDSeg

[Pattern Recognition 2025] Official implementation of paper: "TSLDSeg: A Texture-aware and Semantic-enhanced Latent Diffusion Model for Medical Image Segmentation".

This repository is based on SDSeg, a latent diffusion model for medical image segmentation.
Specifically, our work focuses on alleviating the information loss of perceptual compression in conditioning by:

  • Enhancing fine-grained representations to preserve high-frequency details (edges, fine textures, orientations).
  • Leveraging hypergraph modeling to capture semantic relationships and spatial/topological constraints.

Requirements

A suitable conda environment named TSLDSeg can be created and activated with:

conda env create -f environment.yaml
conda activate TSLDSeg

Then install the remaining dependencies with:

pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip
pip install -e .

Solving GitHub connection issues when downloading taming-transformers or clip

If pip cannot reach GitHub, install both packages from local archives after creating and activating the TSLDSeg environment:

  1. Create a src folder and enter it:
mkdir src
cd src
  2. Download the taming-transformers and CLIP codebases as *.zip files and upload them to src/.
  3. Unzip and install taming-transformers:
unzip taming-transformers-master.zip
cd taming-transformers-master
pip install -e .
cd ..
  4. Unzip and install clip:
unzip CLIP-main.zip
cd CLIP-main
pip install -e .
cd ..
  5. Return to the repository root and install TSLDSeg:
cd ..
pip install -e .

Then you're good to go!
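
To confirm that the installation steps above succeeded, a quick import check can help. The following is a minimal sketch, not part of this repository: `taming` and `clip` are the module names installed by the two upstream packages, while `ldm` is only an assumption about what `pip install -e .` registers for TSLDSeg.

```python
import importlib.util

# "taming" and "clip" come from the two upstream repositories.
# "ldm" is an ASSUMPTION about the module name installed by this repository.
EXPECTED_MODULES = ["taming", "clip", "ldm"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected packages are importable.")
```

Run it inside the activated TSLDSeg environment; any name it reports points to the install step that needs to be repeated.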

Model Weights

Pretrained Models

TSLDSeg is initialized from pre-trained LDM weights before training.

For pre-trained weights of the autoencoder and conditioning model, run

bash scripts/download_first_stages_f8.sh

For pre-trained weights of the denoising UNet, run

bash scripts/download_models_lsun_churches.sh
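
After running both download scripts, it can be worth checking that the weight files actually arrived before starting a long training run. The sketch below is hypothetical: the two paths follow the layout used by the upstream LDM download scripts and are an assumption here, so adjust them to wherever the scripts in this repository place the files.

```python
from pathlib import Path

# ASSUMED locations, based on the upstream LDM layout; this repository's
# download scripts may store the weights elsewhere.
EXPECTED_WEIGHTS = [
    "models/first_stage_models/kl-f8/model.ckpt",
    "models/ldm/lsun_churches256/model.ckpt",
]


def check_weights(paths, root="."):
    """Map each relative path to its size in bytes, or None if the file is absent."""
    report = {}
    for rel in paths:
        file = Path(root) / rel
        report[rel] = file.stat().st_size if file.is_file() else None
    return report


if __name__ == "__main__":
    for rel, size in check_weights(EXPECTED_WEIGHTS).items():
        status = "missing" if size is None else f"{size / 1e6:.1f} MB"
        print(f"{rel}: {status}")
```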

Scripts

Training Scripts

Taking the CVC dataset as an example, run

nohup python -u main.py --base configs/latent-diffusion/cvc-ldm-kl-8.yaml -t --gpus 0, --name experiment_name > nohup/experiment_name.log 2>&1 &

You can follow the training log with

tail -f nohup/experiment_name.log
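
Note that the shell redirect in the training command fails if the nohup/ directory does not exist yet. As a minimal sketch, the same launch can be scripted in Python so that the log directory is created first. The helper names `build_train_command` and `launch_training` are hypothetical and not part of this repository; the arguments mirror the command shown above.

```python
import subprocess
from pathlib import Path


def build_train_command(name, config="configs/latent-diffusion/cvc-ldm-kl-8.yaml", gpus="0,"):
    """Assemble the argument list for the training command shown above."""
    return ["python", "-u", "main.py", "--base", config, "-t",
            "--gpus", gpus, "--name", name]


def launch_training(name, log_dir="nohup", **kwargs):
    """Start training in the background, sending stdout and stderr to a log file."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)  # the log directory must exist
    log_path = Path(log_dir) / f"{name}.log"
    with open(log_path, "w") as log:
        proc = subprocess.Popen(
            build_train_command(name, **kwargs),
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: detach from the terminal session
        )
    return proc.pid, log_path
```

Calling `launch_training("experiment_name")` from the repository root returns the process id and the log path, which can then be followed with `tail -f` as before.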

TensorBoard logging is enabled automatically. You can start a TensorBoard session with --logdir=./logs/, for example:

tensorboard --logdir=./logs/

Note

To train on multiple GPUs, change trainer_config["accelerator"] = "gpu" in main.py to trainer_config["accelerator"] = "ddp". However, parallel training is not recommended, since in my experience it brings no performance gain.
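
The accelerator choice described in this note could also be derived from the `--gpus` argument instead of being edited by hand. This is only a sketch with hypothetical helper names; main.py does not necessarily work this way, and whether "ddp" is accepted as an accelerator value depends on the installed PyTorch Lightning version.

```python
def count_gpus(gpus):
    """Count device ids in a --gpus string such as "0," or "0,1"."""
    return len([g for g in gpus.split(",") if g.strip()])


def pick_accelerator(gpus):
    """Return "ddp" for multi-GPU runs and "gpu" otherwise (the two values from the note)."""
    return "ddp" if count_gpus(gpus) > 1 else "gpu"


if __name__ == "__main__":
    # Hypothetical use inside main.py:
    # trainer_config["accelerator"] = pick_accelerator(opt.gpus)
    for example in ("0,", "0,1"):
        print(example, "->", pick_accelerator(example))
```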

Warning

A single TSLDSeg model checkpoint is around 5GB. By default, only the last model and the model with the highest Dice score are saved. If you have plenty of storage space, feel free to save more models by increasing the save_top_k parameter in main.py.
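
Before raising save_top_k, it helps to estimate how much disk space the checkpoints will take. The sketch below uses parameter names from PyTorch Lightning's ModelCheckpoint callback (`save_last`, `save_top_k`, `monitor`, `mode`); the monitor key `val/dice` is a hypothetical name, and the 5 GB figure is the approximate size quoted above.

```python
CKPT_SIZE_GB = 5.0  # approximate size of one checkpoint, from the warning above


def checkpoint_params(save_top_k=1, monitor="val/dice"):
    """ModelCheckpoint-style parameters; the monitor key is a HYPOTHETICAL name."""
    if save_top_k < 0:
        raise ValueError("save_top_k=-1 keeps every checkpoint; no fixed bound exists")
    return {"save_last": True, "save_top_k": save_top_k,
            "monitor": monitor, "mode": "max"}


def estimated_storage_gb(params, ckpt_size_gb=CKPT_SIZE_GB):
    """Upper bound on disk use: the top-k checkpoints plus the separate last checkpoint."""
    count = params["save_top_k"] + (1 if params["save_last"] else 0)
    return count * ckpt_size_gb


if __name__ == "__main__":
    for k in (1, 3, 5):
        print(f"save_top_k={k}: about {estimated_storage_gb(checkpoint_params(k)):.0f} GB")
```

With the default of one best model plus the last model, this gives roughly 10 GB per experiment.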

Testing Scripts

After training a TSLDSeg model, manually update the run paths in scripts/slice2seg.py, then start inference with

python -u scripts/slice2seg.py --dataset cvc
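
Since the run paths have to be edited by hand, a small helper for locating trained checkpoints can save some searching. This is a sketch under the assumption that checkpoints are written as *.ckpt files somewhere below ./logs/ (the directory used for TensorBoard above); `list_checkpoints` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def list_checkpoints(log_root="logs"):
    """Return all *.ckpt files below log_root, newest first."""
    ckpts = Path(log_root).glob("**/*.ckpt")
    return sorted(ckpts, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for ckpt in list_checkpoints():
        print(ckpt)
```

The printed paths can then be copied into scripts/slice2seg.py.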

Citation

If you find this repository useful in your research, please consider citing:

@article{YANG2026112795,
title = {TSLDSeg: A texture-aware and semantic-enhanced latent diffusion model for medical image segmentation},
journal = {Pattern Recognition},
volume = {173},
pages = {112795},
year = {2026},
issn = {0031-3203},
doi = {https://doi.org/10.1016/j.patcog.2025.112795},
author = {Zongjian Yang and Chunquan Li and Jiquan Ma},
}

or

Z. Yang, C. Li and J. Ma, “TSLDSeg: A texture-aware and semantic-enhanced latent diffusion model for medical image segmentation,” Pattern Recognition, vol. 173, p. 112795, 2026, doi: 10.1016/j.patcog.2025.112795.

Acknowledgement

This work is built upon the following open-source projects. We sincerely thank the authors for their excellent contributions:
