@@ -79,48 +79,65 @@ <h1>Project 5: Fun with Diffusion Models</h1>
 <section id="part-0">
   <h2>Part 0 – Prompting and Sampling</h2>
 
-  <div class="subsection" id="part-0-prompts">
-    <h3>0.1 Text Prompts and Embeddings</h3>
-    <p><strong>Interesting prompts used:</strong></p>
-    <ul>
-      <li>Prompt 1: <em><!-- TODO: fill prompt --></em></li>
-      <li>Prompt 2: <em><!-- TODO: fill prompt --></em></li>
-      <li>Prompt 3: <em><!-- TODO: fill prompt --></em></li>
-      <!-- Add more as needed -->
-    </ul>
-
-    <h4>Code: generating prompt embeddings</h4>
-    <pre><code># TODO
-# Code to generate prompt embeddings
-# e.g., prompt_embeds_dict = encode(prompts)</code></pre>
-  </div>
+  To demonstrate the DeepFloyd IF diffusion model, below are samples for three prompts generated with 20 inference steps using stage 1 of the model, which produces images at 64x64 resolution:
 
   <div class="subsection" id="part-0-images">
-    <h3>0.2 Generated Images for 3 Prompts</h3>
+    <h3>Images generated with num_inference_steps=20</h3>
 
     <div class="image-row">
       <figure>
-        <img src="images/part0_prompt1_stepsA.png" alt="Prompt 1 - num_inference_steps = A" />
-        <figcaption>Prompt 1 – steps = A<br />Caption and reflection: <!-- TODO --></figcaption>
+        <img src="images/64vs256/01_64px_20.png" alt="01_64px_20.png" />
+        <figcaption>Prompt 1 – 'a photo of a hipster barista'<br /></figcaption>
+      </figure>
+      <figure>
+        <img src="images/64vs256/02_64px_20.png" alt="02_64px_20.png" />
+        <figcaption>Prompt 2 – 'a man wearing a hat'<br /></figcaption>
       </figure>
       <figure>
-        <img src="images/part0_prompt1_stepsB.png" alt="Prompt 1 - num_inference_steps = B" />
-        <figcaption>Prompt 1 – steps = B<br />Before/after comparison. <!-- TODO --></figcaption>
+        <img src="images/64vs256/03_64px_20.png" alt="03_64px_20.png" />
+        <figcaption>Prompt 3 – 'a rocket ship'<br /></figcaption>
       </figure>
     </div>
-
+  </div>
+
+  Using stage 2, we can take the outputs of stage 1 and upscale them to 256x256 resolution:
+
+  <div class="subsection" id="part-0-images">
     <div class="image-row">
       <figure>
-        <img src="images/part0_prompt2.png" alt="Prompt 2 image" />
-        <figcaption>Prompt 2 – single setting<br />Reflection: <!-- TODO --></figcaption>
+        <img src="images/64vs256/01_256px_20.png" alt="01_256px_20.png" />
+        <figcaption>Prompt 1 – 'a photo of a hipster barista'<br /></figcaption>
+      </figure>
+      <figure>
+        <img src="images/64vs256/02_256px_20.png" alt="02_256px_20.png" />
+        <figcaption>Prompt 2 – 'a man wearing a hat'<br /></figcaption>
       </figure>
       <figure>
-        <img src="images/part0_prompt3.png" alt="Prompt 3 image" />
-        <figcaption>Prompt 3 – single setting<br />Reflection: <!-- TODO --></figcaption>
+        <img src="images/64vs256/03_256px_20.png" alt="03_256px_20.png" />
+        <figcaption>Prompt 3 – 'a rocket ship'<br /></figcaption>
       </figure>
     </div>
+  </div>
+
+  Increasing the number of inference steps yields higher-quality images at the cost of more compute time. Below are the stage 2 outputs generated with 100 inference steps:
+
+  <div class="subsection" id="part-0-images">
+    <h3>Images generated with num_inference_steps=100</h3>
 
-    <p class="note"><strong>Random seed used:</strong> 180</p>
+    <div class="image-row">
+      <figure>
+        <img src="images/64vs256/01_256px_100.png" alt="01_256px_100.png" />
+        <figcaption>Prompt 1 – 'a photo of a hipster barista'<br /></figcaption>
+      </figure>
+      <figure>
+        <img src="images/64vs256/02_256px_100.png" alt="02_256px_100.png" />
+        <figcaption>Prompt 2 – 'a man wearing a hat'<br /></figcaption>
+      </figure>
+      <figure>
+        <img src="images/64vs256/03_256px_100.png" alt="03_256px_100.png" />
+        <figcaption>Prompt 3 – 'a rocket ship'<br /></figcaption>
+      </figure>
+    </div>
   </div>
 </section>
 
@@ -161,16 +178,11 @@ <h3>Campanile at Different Noise Levels</h3>
 <!-- Part 1.2: Gaussian Denoising -->
 <!-- ========================================================= -->
 <section id="part-1-2">
-  <h2>Part 1.2 – Gaussian Denoising</h2>
+  <h2>Part 1.2 – Classical Denoising</h2>
 
   <div class="subsection">
     <h3>Code: Gaussian Denoising</h3>
-    <pre><code># TODO
-# def gaussian_denoise(im_noisy, ...):
-#     ...
-#     return im_denoised</code></pre>
-  </div>
-
+
   <div class="subsection">
     <h3>Noisy vs Gaussian-Denoised Campanile</h3>
 
@@ -613,4 +625,3 @@ <h5>Own Image 2</h5>
 
 </body>
 </html>
-