We present One-Shot Refiner, a novel framework for high-fidelity novel view synthesis (NVS) from sparse input views. Our method overcomes key limitations of recent feed-forward 3D Gaussian Splatting (3DGS) pipelines built on Vision Transformer (ViT) backbones by introducing a Dual-Domain Detail Perception Module and a feature-guided diffusion refiner, enabling consistent, high-resolution, and geometrically coherent view synthesis—even in unseen regions.
```shell
pip install -r requirements.txt
pip install submodules/latent-gaussian-rasterization
pip install git+https://github.com/rmurai0610/diff-gaussian-rasterization-w-pose.git
```

We follow NoPoSplat to process the DL3DV dataset and modify `config/experiment/dl3dv.yaml` to set the dataset root paths.
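For reference, the dataset section of `config/experiment/dl3dv.yaml` would look roughly like the fragment below. The key names and nesting here are an assumption based on the description above; check the shipped config for the actual schema.

```yaml
dataset:
  name: dl3dv
  # Set this to the root of your NoPoSplat-processed DL3DV data.
  roots: [/path/to/dl3dv/processed]
```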
Our training follows a three-stage strategy as described in the paper:
This stage learns a stable 3D Gaussian representation from sparse views.
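For intuition, a 3D Gaussian primitive in a 3DGS-style representation carries a mean, an anisotropic covariance factored into per-axis scale and a rotation, an opacity, and appearance features. A minimal NumPy sketch of the covariance construction follows; all names and shapes are illustrative, not this repository's API.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def covariance(scale, quat):
    """Sigma = R S S^T R^T, as in 3D Gaussian Splatting."""
    R = quat_to_rotmat(quat)
    S = np.diag(scale)
    return R @ S @ S.T @ R.T

# One toy Gaussian: mean, per-axis scale, rotation, opacity would sit alongside.
mean = np.zeros(3)
scale = np.array([0.1, 0.2, 0.05])
quat = np.array([1.0, 0.0, 0.0, 0.0])  # identity rotation
sigma = covariance(scale, quat)
print(sigma.shape)  # (3, 3)
```

With the identity rotation, the covariance reduces to a diagonal of squared scales; the training stage optimizes these parameters per Gaussian.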
```shell
bash train_pipeline.sh
```

The dataset preparation script and training code for this stage will be released in a separate repository soon.
This stage fuses the 3D Gaussian features into the diffusion process for geometrically consistent refinement: the diffusion model is guided by rendered Gaussian features during denoising, which enables end-to-end consistency across views while preserving fine details.
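Conceptually, each denoising step is conditioned on features rasterized from the Gaussians into the target view. The toy NumPy sketch below illustrates that guidance pattern only; the function names, shapes, and the simple blending rule are our own stand-ins, not the actual UNet-based model.

```python
import numpy as np

def rendered_gaussian_features(h, w, c):
    """Stand-in for rasterizing 3D Gaussian features into the target view."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((h, w, c))

def guided_denoise_step(noisy, guide, w_guide=0.3):
    """Toy 'denoiser': pull the noisy latent toward the geometry-consistent
    guide features. The real refiner uses a learned diffusion UNet."""
    return (1 - w_guide) * noisy + w_guide * guide

h, w, c = 32, 32, 4
noisy_latent = np.random.default_rng(1).standard_normal((h, w, c))
guide = rendered_gaussian_features(h, w, c)
refined = guided_denoise_step(noisy_latent, guide)
print(refined.shape)  # (32, 32, 4)
```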
⚠ Important: Before running the joint optimization, update `config/main.yaml` to set `train.pretrain_model_dir` and `train.unet_model_dir`.
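The corresponding entries in `config/main.yaml` would look roughly like the fragment below. The nesting is an assumption inferred from the option names, and the paths are placeholders.

```yaml
train:
  pretrain_model_dir: /path/to/stage1_gaussian_checkpoint
  unet_model_dir: /path/to/pretrained_unet
```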
```shell
bash train.sh
```

After obtaining the jointly trained model, high-fidelity novel views can be generated.
```shell
bash test.sh
```

If you find this work useful, please cite our paper:
```bibtex
@misc{dong2026oneshotrefinerboostingfeedforward,
  title={One-Shot Refiner: Boosting Feed-forward Novel View Synthesis via One-Step Diffusion},
  author={Yitong Dong and Qi Zhang and Minchao Jiang and Zhiqiang Wu and Qingnan Fan and Ying Feng and Huaqi Zhang and Hujun Bao and Guofeng Zhang},
  year={2026},
  eprint={2601.14161},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.14161},
}
```