We now have the necessary tools to generate visual anagrams: images that look like one thing normally and like something different when flipped or rotated. For a vertical-flip anagram, for example, we start with two prompt embeddings <code>p<sub>1</sub></code> and <code>p<sub>2</sub></code>. For <code>p<sub>1</sub></code>, we compute the noise estimate ε<sub>1</sub> normally at each step; for <code>p<sub>2</sub></code>, we first flip the image <code>x<sub>t</sub></code>, compute the noise estimate on the flipped image, and then flip that estimate back to obtain ε<sub>2</sub>. The average of ε<sub>1</sub> and ε<sub>2</sub> is used as the final noise estimate for the step. The variance is handled the same way: v<sub>1</sub> is computed as usual, v<sub>2</sub> is the flipped variance estimate of the flipped <code>x<sub>t</sub></code>, and the final variance estimate is (v<sub>1</sub> + v<sub>2</sub>) / 2. A few examples of this effect are shown in 1.8.2 below, with <code>p<sub>1</sub></code> being the first prompt and <code>p<sub>2</sub></code> being the second.
<div class="subsection">
<h3>1.8.1 – Code: visual_anagrams</h3>
<pre><code># TODO
# def visual_anagrams(
#     prompt_embeds_p1,
#     prompt_embeds_p2,
#     uncond_prompt_embeds,
#     timesteps,
#     scale=7,
#     num_inference_steps=...,
# ):
#     """
#     Returns:
#         image: torch.Tensor of shape (1, 3, 64, 64) in [-1, 1]
#     """
#     # TODO</code></pre>
<p class="note">
Notes: include your flipping operation (e.g., torch.flip(..., dims=[2])) and how you combine
noise / variance estimates (if applicable).
</p>
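<p class="note">
A minimal sketch of one way this function could be structured is shown below. It is not the reference
solution: <code>noise_model</code> and <code>sampler_step</code> are hypothetical stand-ins for the
noise/variance predictor and the per-step denoising update used elsewhere in the project, and
classifier-free guidance is assumed from the <code>uncond_prompt_embeds</code> / <code>scale</code>
arguments in the skeleton above.
</p>
<pre><code>import torch

def visual_anagrams_sketch(
    prompt_embeds_p1,
    prompt_embeds_p2,
    uncond_prompt_embeds,
    timesteps,
    noise_model,    # hypothetical: returns (noise_estimate, variance_estimate)
    sampler_step,   # hypothetical: one reverse-diffusion update x_t -> x_{t-1}
    scale=7,
):
    # Start from pure Gaussian noise in the model's image space.
    x_t = torch.randn(1, 3, 64, 64)

    def guided_estimate(x, t, cond_embeds):
        # Classifier-free guidance for a single prompt (an assumption here,
        # suggested by the uncond_prompt_embeds / scale arguments).
        eps_cond, var = noise_model(x, t, cond_embeds)
        eps_uncond, _ = noise_model(x, t, uncond_prompt_embeds)
        return eps_uncond + scale * (eps_cond - eps_uncond), var

    for t in timesteps:
        # Estimate 1: prompt p1 on the image as-is.
        eps_1, var_1 = guided_estimate(x_t, t, prompt_embeds_p1)

        # Estimate 2: prompt p2 on the vertically flipped image, flipped back.
        x_flip = torch.flip(x_t, dims=[2])
        eps_2, var_2 = guided_estimate(x_flip, t, prompt_embeds_p2)
        eps_2 = torch.flip(eps_2, dims=[2])
        var_2 = torch.flip(var_2, dims=[2])

        # Average the two estimates, as described in the section text.
        eps = (eps_1 + eps_2) / 2
        var = (var_1 + var_2) / 2

        # One reverse-diffusion update using the combined estimates.
        x_t = sampler_step(x_t, eps, var, t)

    return x_t.clamp(-1, 1)</code></pre>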
</div>
<div class="subsection">
<h3>1.8.2 – Two Visual Anagram Illusions</h3>
<p>
Each illusion should look like one concept normally, and another concept when flipped upside down.
</p>
</div>
<p>
With the techniques above, we can now also create hybrid images: images that appear to show different subjects depending on the viewing distance. The classical way to create a hybrid image is to low-pass filter the image you want to see from far away, keeping only its coarse structure, high-pass filter the image you want to see up close, keeping only its fine detail, and then blend the two. A minimal Gaussian-blur sketch of this construction is given after the code skeleton in 1.9.1.
</p>
<div class="subsection">
<h3>1.9.1 – Code: make_hybrids</h3>
<pre><code># TODO
# def make_hybrids(
#     image_a,
#     image_b,
#     lowpass_sigma=...,
#     highpass_sigma=...,
#     blend_weight=...,
# ):
#     """
#     Returns:
#         hybrid: torch.Tensor or np.ndarray (document your format)
#         low_freq: low-frequency component (optional)
#         high_freq: high-frequency component (optional)
#     """
#     # TODO</code></pre>
<p class="note">
Notes: describe your filter choice (Gaussian blur / FFT), the cutoff frequencies (sigmas),
and the blend weight you used.
</p>
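<p class="note">
A minimal sketch of the classical Gaussian-blur construction is shown below. It assumes float
(H, W, 3) inputs in [0, 1]; the default sigmas and blend weight are arbitrary placeholders rather
than recommended values, and <code>image_a</code> is the image meant to be seen from far away.
</p>
<pre><code>import numpy as np
from scipy.ndimage import gaussian_filter

def make_hybrids_sketch(image_a, image_b,
                        lowpass_sigma=6.0, highpass_sigma=3.0, blend_weight=0.5):
    # Low-frequency component: blur the image meant to be seen from a distance.
    low_freq = gaussian_filter(image_a, sigma=(lowpass_sigma, lowpass_sigma, 0))

    # High-frequency component: the image meant to be seen up close, minus its blur.
    high_freq = image_b - gaussian_filter(image_b, sigma=(highpass_sigma, highpass_sigma, 0))

    # Blend the two bands and clip back to the valid intensity range.
    hybrid = np.clip(low_freq + blend_weight * high_freq, 0.0, 1.0)
    return hybrid, low_freq, high_freq</code></pre>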