ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation

Yuan Zhou · Shilong Jin · Litao Hua · Wanjun Lv · Haoran Duan · Jungong Han


🎬 Gallery


🔍 Comparisons with Baselines

[Figure: side-by-side comparisons, baseline vs. ours]

📝 Abstract

Here, we propose ConsDreamer, an innovative method designed to address the Janus problem in text-to-3D generation by introducing:

  1. View Disentanglement Module (VDM)
  2. Novel similarity-based partial order loss

Recent advances in zero-shot text-to-3D generation have revolutionised 3D content creation by enabling direct synthesis from textual descriptions. While state-of-the-art methods leverage 3D Gaussian Splatting with score distillation to enhance multi-view rendering through pre-trained text-to-image (T2I) models, they suffer from the prior view biases inherent in these T2I models. These biases lead to inconsistent 3D generation, most visibly the multi-face Janus problem, where objects exhibit conflicting features across views.

To address this fundamental challenge, we propose ConsDreamer, a novel method that mitigates view bias by refining both the conditional and unconditional terms in the score distillation process:

  • View Disentanglement Module (VDM): Eliminates viewpoint biases in conditional prompts by decoupling irrelevant view components and injecting precise view control.

  • Similarity-based partial order loss: Enforces geometric consistency in the unconditional term by aligning cosine similarities with azimuth relationships.
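To make the idea of aligning cosine similarities with azimuth relationships concrete, here is a minimal sketch of one way such a partial order constraint could be written. It is an illustration only, not the loss used in ConsDreamer: the hinge form, the `margin` parameter, the use of a single reference view, and all function names are assumptions. The sketch asks that a view closer in azimuth to the reference be at least as similar to it (in feature space) as a view farther away. Views at equal angular distance are left unordered, which is what makes the order partial.

```python
import math
from itertools import permutations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angular_distance(az_a, az_b):
    """Shortest distance between two azimuths in degrees, in [0, 180]."""
    diff = abs(az_a - az_b) % 360.0
    return min(diff, 360.0 - diff)


def partial_order_loss(ref_feat, ref_az, feats, azimuths, margin=0.0):
    """Hypothetical hinge-style partial order loss (an assumption, not the paper's).

    For every pair of views (i, j) where view i is strictly closer in azimuth
    to the reference than view j, penalise the case where view i is not more
    similar to the reference than view j by at least `margin`.
    """
    sims = [cosine_similarity(ref_feat, f) for f in feats]
    dists = [angular_distance(ref_az, a) for a in azimuths]
    penalties = []
    for i, j in permutations(range(len(feats)), 2):
        if dists[i] < dists[j]:  # pairs at equal distance stay unordered
            penalties.append(max(0.0, margin - (sims[i] - sims[j])))
    return sum(penalties) / len(penalties) if penalties else 0.0
```

When the similarities already follow the azimuth ordering, the loss is zero. It grows as nearby views become less similar to the reference than distant ones, which is the kind of inconsistency associated with the Janus problem.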

Extensive experiments demonstrate that ConsDreamer can be seamlessly integrated into various 3D representations and score distillation paradigms, effectively mitigating the multi-face Janus problem.


🏗️ Pipeline



🚀 Getting Started

Prerequisites

The implementation of ConsDreamer is based on:

  • Python: 3.9.16
  • CUDA: 11.7
  • PyTorch: 2.0.1
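Once the environment is set up, a small check like the following can help compare what is installed against the versions listed above. This is a convenience sketch rather than part of the repository; the function name `check_environment` is hypothetical. It relies only on `platform.python_version()`, `torch.__version__`, and `torch.version.cuda`, and it tolerates PyTorch not being installed.

```python
import platform


def check_environment():
    """Collect installed versions to compare against the listed prerequisites."""
    info = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed in this environment
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only PyTorch builds
    return info


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"{name}: {version}")
```

Note that the reported CUDA version is the one PyTorch was built against, which may differ from the system-wide toolkit.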

📥 Cloning the Repository

The repository contains submodules, so clone it recursively:

HTTPS

git clone --recursive https://github.com/GAInuist/ConsDreamer.git
cd ConsDreamer

SSH

git clone --recursive git@github.com:GAInuist/ConsDreamer.git
cd ConsDreamer

🛠️ Installation

Step 1: Create Conda Environment

conda create -n ConsDreamer python=3.9.16 cudatoolkit=11.8
conda activate ConsDreamer

Step 2: Install Dependencies

pip install -r requirements.txt
pip install submodules/diff-gaussian-rasterization/
pip install submodules/simple-knn/

Step 3: Install CLIP

cd CLIP_vit
pip install -e .
cd ..

🎯 Running

Basic Usage

python ConsDreamer_train.py --opt <path to config file>
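To train on several configurations in sequence, the command above can be wrapped in a short driver. The following is a sketch under assumptions: the helper names `build_command` and `run_all` are hypothetical, and the config paths in the usage comment are placeholders, not files known to ship with the repository. Only the `--opt` flag shown above is used.

```python
import subprocess
import sys


def build_command(config_path):
    """Build the training command for one config file, as an argument list."""
    return [sys.executable, "ConsDreamer_train.py", "--opt", str(config_path)]


def run_all(config_paths, dry_run=True):
    """Return the commands for all configs; execute them unless dry_run is set."""
    commands = [build_command(path) for path in config_paths]
    if not dry_run:
        for cmd in commands:
            # check=True stops the sequence at the first failing run
            subprocess.run(cmd, check=True)
    return commands


# Example (placeholder paths; replace with real config files):
# run_all(["configs/first.yaml", "configs/second.yaml"], dry_run=False)
```

With `dry_run=True` (the default) nothing is executed, which makes it easy to inspect the commands before starting long training runs.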

Using the Provided Script

bash Run.sh

🙏 Acknowledgements

Parts of our code are based on many amazing research works and open-source projects.

Thanks for their excellent work and great contribution to the 3D generation area! 🌟


📚 Citation

If you find this work useful for your research, please consider citing:

@misc{zhou2025consdreameradvancingmultiviewconsistency,
      title={ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation}, 
      author={Yuan Zhou and Shilong Jin and Litao Hua and Wanjun Lv and Haoran Duan and Jungong Han},
      year={2025},
      eprint={2504.02316},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2504.02316}, 
}

⭐ If you find this project helpful, please consider giving it a star! ⭐
