|
2 | 2 | <html lang="en"> |
3 | 3 | <head> |
4 | 4 | <meta charset="UTF-8" /> |
5 | | -<title>Project 5: Fun with Diffusion Models</title> |
| 5 | +<title>CS180 Project 5</title> |
6 | 6 | <style> |
7 | 7 | body { |
8 | 8 | font-family: system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif; |
@@ -150,14 +150,14 @@ <h3>Images generated with num_inference_steps=100</h3> |
150 | 150 | <h2>Part 1.1 – The forward process</h2> |
151 | 151 |
|
152 | 152 | To start, we have the original Campanile image at 64px: |
153 | | -<div class="image-row"> |
| 153 | +<div align="center"> |
154 | 154 | <figure> |
155 | 155 | <img src="images/campanile.png" alt="campanile.png" /> |
156 | 156 | </figure> |
157 | 157 | </div> |
158 | 158 |
|
159 | 159 | To add noise to an image <code>x<sub>0</sub></code>, we can use the forward process and compute |
160 | | -<div class="image-row"> |
| 160 | +<div align="center"> |
161 | 161 | <figure> |
162 | 162 | <img src="images/forward.png" alt="forward.png" /> |
163 | 163 | </figure> |
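The forward step described in Part 1.1 can be sketched in a few lines. The actual equation lives in `images/forward.png` and is not visible in this diff, so the sketch below assumes the standard DDPM form, x_t = sqrt(ᾱ_t) · x_0 + sqrt(1 − ᾱ_t) · ε. The linear beta schedule and the names `make_alphas_cumprod` and `forward` are illustrative assumptions, not the project's code; the project most likely reads ᾱ_t from its pretrained model's scheduler.

```python
import numpy as np


def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear beta schedule; returns the cumulative products of (1 - beta)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def forward(x0, t, alphas_cumprod, rng):
    """Noise a clean image x0 to timestep t (assumed standard DDPM forward process)."""
    a_bar = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)  # Gaussian noise with the image's shape
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return x_t, eps


# Usage: noise a stand-in 64px RGB image to t = 250.
rng = np.random.default_rng(0)
alphas_cumprod = make_alphas_cumprod()
x0 = rng.random((3, 64, 64))
x_250, eps = forward(x0, 250, alphas_cumprod, rng)
```

Because ᾱ_t shrinks as `t` grows, larger timesteps scale the image down and the noise up.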
@@ -310,7 +310,7 @@ <h2>Part 1.4 – Iterative Denoising</h2> |
310 | 310 |
|
311 | 311 | Instead of denoising in a single step, we can obtain better results by iteratively denoising from step <code>t</code> down to step 0. However, this means running the diffusion model up to 1000 times, which is slow and costly. Fortunately, we can speed up the computation by first defining a series of strided timesteps, starting near 1000 and ending at 0. For the examples below, we will use <code>strided_timestamps = [990, 960, ..., 30, 0]</code>. Then, we can use the formula |
312 | 312 |
|
313 | | -<div class="image-row"> |
| 313 | +<div align="center"> |
314 | 314 | <figure> |
315 | 315 | <img src="images/equation.png" alt="equation.png" /> |
316 | 316 | </figure> |
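The strided loop from Part 1.4 can be sketched as below. The update formula is in `images/equation.png` and is not visible in this diff, so this assumes the usual DDPM posterior-mean interpolation between the current noisy image and the estimated clean image. The added-variance term is left out for brevity. The beta schedule, `denoise_step`, `iterative_denoise`, and the `estimate_noise` callback are hypothetical stand-ins for the project's model and scheduler.

```python
import numpy as np

# Hypothetical linear beta schedule (an assumption; the project's scheduler may differ).
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)

# Strided timesteps from the text: [990, 960, ..., 30, 0].
strided_timestamps = list(range(990, -1, -30))


def denoise_step(x_t, t, t_prev, eps_hat, alphas_cumprod):
    """One strided step from t to t_prev (t_prev < t), without the added-variance term."""
    ab_t = alphas_cumprod[t]
    ab_prev = alphas_cumprod[t_prev]
    alpha = ab_t / ab_prev  # effective alpha across the stride
    beta = 1.0 - alpha
    # Estimate the clean image from the current noisy image and the noise estimate.
    x0_hat = (x_t - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)
    # Interpolate between the clean estimate and the current noisy image.
    w_x0 = np.sqrt(ab_prev) * beta / (1.0 - ab_t)
    w_xt = np.sqrt(alpha) * (1.0 - ab_prev) / (1.0 - ab_t)
    return w_x0 * x0_hat + w_xt * x_t


def iterative_denoise(x_t, estimate_noise, timesteps, alphas_cumprod):
    """Run denoise_step over consecutive pairs of strided timesteps."""
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        eps_hat = estimate_noise(x_t, t)  # stand-in for the diffusion model call
        x_t = denoise_step(x_t, t, t_prev, eps_hat, alphas_cumprod)
    return x_t
```

With this stride, the model is called 33 times instead of once per timestep.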
@@ -887,15 +887,19 @@ <h2>Part 1.9 – Hybrid Images</h2> |
887 | 887 | <br>After doing so, we add the two filtered noise estimates together to get the final noise estimate at each step. This produces an image that shows <code>p<sub>1</sub></code> when viewed close up, but shows <code>p<sub>2</sub></code> when viewed from far away. Unlike the anagram images, we don't need to flip or transform the image being denoised, as both images are meant to be viewed in the same orientation. Below are several examples: |
888 | 888 |
|
889 | 889 | <div class="subsection"> |
890 | | -<div class="image-row"> |
| 890 | +<div align="center"> |
891 | 891 | <figure> |
892 | 892 | <img src="images/hybrid/hybrid1_256.png" alt="hybrid1_256.png" /> |
893 | 893 | <figcaption>Prompts: <code>'a lithograph of a skull'</code> (low-pass) & <code>'a lithograph of waterfalls'</code> (high-pass)</figcaption> |
894 | 894 | </figure> |
| 895 | +</div> |
| 896 | +<div align="center"> |
895 | 897 | <figure> |
896 | 898 | <img src="images/hybrid/hybrid2_256.png" alt="hybrid2_256.png" /> |
897 | 899 | <figcaption>Prompts: <code>'a pencil'</code> (low-pass) & <code>'a rocket ship'</code> (high-pass)</figcaption> |
898 | 900 | </figure> |
| 901 | +</div> |
| 902 | +<div align="center"> |
899 | 903 | <figure> |
900 | 904 | <img src="images/hybrid/hybrid3_256.png" alt="hybrid3_256.png" /> |
901 | 905 | <figcaption>Prompts: <code>'a lithograph of waterfalls'</code> (low-pass) & <code>'a photo of a dog'</code> (high-pass)</figcaption> |
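The noise combination described in Part 1.9 can be sketched as follows: low-pass one prompt's noise estimate, high-pass the other's, and sum them. This is a minimal NumPy sketch that assumes a Gaussian blur as the low-pass filter and "input minus its blur" as the high-pass filter. The kernel size and sigma are illustrative guesses, and `gaussian_kernel`, `low_pass`, and `hybrid_noise` are hypothetical names, not the project's code.

```python
import numpy as np


def gaussian_kernel(size=33, sigma=2.0):
    """Normalized 1D Gaussian kernel (size and sigma are assumed values)."""
    x = np.arange(size) - size // 2
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def low_pass(img, kernel):
    """Separable Gaussian blur of a 2D array with reflect padding."""
    pad = len(kernel) // 2
    padded = np.pad(img, pad, mode="reflect")
    # Blur along rows, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="same")
    both = np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")
    return both[pad:-pad, pad:-pad]  # crop back to the original size


def hybrid_noise(eps_low, eps_high, kernel):
    """Low frequencies from eps_low plus high frequencies from eps_high."""
    high = eps_high - low_pass(eps_high, kernel)
    return low_pass(eps_low, kernel) + high


# Usage with random stand-ins for the two prompts' noise estimates.
rng = np.random.default_rng(0)
eps_a = rng.standard_normal((64, 64))
eps_b = rng.standard_normal((64, 64))
eps = hybrid_noise(eps_a, eps_b, gaussian_kernel())
```

The prompt labeled low-pass in the captions supplies the coarse structure seen from a distance, while the high-pass prompt supplies the fine detail seen up close.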
|