Novel View Synthesis using DDIM Inversion

Methodology

Given a single reference image $\mathbf{x}_{\text{ref}}$, we first apply DDIM inversion up to timestep $t=600$ to obtain the mean latent $\mathbf{z}_{\text{ref},\mu}^{\text{inv}}$. This latent, together with the camera intrinsics and extrinsics, class embeddings, and ray information, is fed into our translation network, TUNet. TUNet predicts the target-view mean latent $\tilde{\mathbf{z}}_{\text{tar},\mu}^{\text{inv}}$, which we combine with the corresponding noise component using one of our fusion strategies to form the initial DDIM latent $\tilde{\mathbf{z}}_{\text{tar}}^{\text{inv}}$. Finally, a pre-trained diffusion model runs DDIM sampling from this latent to synthesize the novel-view image.
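The two ends of this pipeline, DDIM inversion up to $t=600$ and DDIM sampling back from that timestep, can be illustrated with a minimal sketch of the standard deterministic DDIM update ($\eta = 0$). This is not the repository's code: the function names, the linear beta schedule, the stride of 50, and the `toy_eps` noise predictor are all assumptions made for illustration. TUNet and the fusion strategies are deliberately left out, since their details are not given here.

```python
import math

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update moving x from noise level a_from to a_to."""
    # Estimate the clean signal implied by the predicted noise.
    x0 = [(xi - math.sqrt(1.0 - a_from) * ei) / math.sqrt(a_from)
          for xi, ei in zip(x, eps)]
    # Re-noise that estimate to the destination noise level.
    return [math.sqrt(a_to) * x0i + math.sqrt(1.0 - a_to) * ei
            for x0i, ei in zip(x0, eps)]

def ddim_invert(x0, eps_model, alpha_bars, t_end=600, stride=50):
    """Walk a clean latent forward in time: 0 -> stride -> ... -> t_end."""
    timesteps = list(range(0, t_end + 1, stride))
    x = list(x0)
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        eps = eps_model(x, t_cur)
        x = ddim_step(x, eps, alpha_bars[t_cur], alpha_bars[t_next])
    return x

def ddim_sample(x_t, eps_model, alpha_bars, t_start=600, stride=50):
    """Walk a noisy latent backward in time: t_start -> ... -> 0."""
    timesteps = list(range(t_start, -1, -stride))
    x = list(x_t)
    for t_cur, t_prev in zip(timesteps[:-1], timesteps[1:]):
        eps = eps_model(x, t_cur)
        x = ddim_step(x, eps, alpha_bars[t_cur], alpha_bars[t_prev])
    return x

def toy_eps(x, t):
    """Hypothetical stand-in for a pre-trained noise predictor."""
    return [0.1, -0.2, 0.3]

alpha_bars = make_alpha_bars()
z_inv = ddim_invert([0.5, -1.0, 2.0], toy_eps, alpha_bars)   # inverted latent at t=600
x_rec = ddim_sample(z_inv, toy_eps, alpha_bars)              # back to t=0
```

Because `toy_eps` ignores its input, the round trip here recovers the starting latent up to floating-point error. With a real noise predictor, inversion is only approximately reversible, since the noise estimate at each step is evaluated at a slightly different point than during sampling. In the pipeline described above, the latent passed to sampling would be the fused target-view latent rather than `z_inv` itself.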

Visual-Conception-Group/ddim_nvs

Folders and files

Latest commit

History

Repository files navigation

Novel View Synthesis using DDIM Inversion

Methodology

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages