We now have the necessary tools to generate visual anagrams: images that show one subject in their normal orientation and a different subject when flipped or rotated. As an example, to create a vertical-flip anagram we start with two prompt embeddings <code>p<sub>1</sub></code> and <code>p<sub>2</sub></code>. For <code>p<sub>1</sub></code>, we compute the noise estimate ε<sub>1</sub> normally at each step; for <code>p<sub>2</sub></code>, we first flip the image <code>x<sub>t</sub></code>, compute the noise estimate on the flipped image, and then flip that estimate back to obtain ε<sub>2</sub>. We then use the average of ε<sub>1</sub> and ε<sub>2</sub> as the final noise estimate for the step. The variance is handled the same way: v<sub>1</sub> is computed in the usual way, v<sub>2</sub> is the flipped variance estimate of the flipped <code>x<sub>t</sub></code>, and the final variance estimate is (v<sub>1</sub> + v<sub>2</sub>) / 2. A short sketch of this step follows, and below it are a few examples of the effect, with <code>p<sub>1</sub></code> being the first prompt and <code>p<sub>2</sub></code> the second:
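The per-step computation might look roughly like the following sketch. This is a minimal illustration rather than the exact code used for this project: the hypothetical helper name, the <code>unet</code> call signature, the prompt-embedding tensors, and the assumption that the model returns 3 noise channels followed by 3 predicted-variance channels (as DeepFloyd-style UNets do) are all stand-ins for whatever the real pipeline provides.
<pre><code>import torch

def anagram_noise_estimate(unet, x_t, t, p1_emb, p2_emb):
    # Noise/variance estimate for the upright image under prompt 1.
    out_1 = unet(x_t, t, encoder_hidden_states=p1_emb).sample
    eps_1, var_1 = out_1.split(3, dim=1)

    # Flip the image vertically, estimate under prompt 2, then flip the
    # result back so it lives in the upright frame again.
    x_flipped = torch.flip(x_t, dims=[-2])
    out_2 = unet(x_flipped, t, encoder_hidden_states=p2_emb).sample
    out_2 = torch.flip(out_2, dims=[-2])
    eps_2, var_2 = out_2.split(3, dim=1)

    # Average the two estimates to get this step's final noise and variance.
    eps = (eps_1 + eps_2) / 2
    var = (var_1 + var_2) / 2
    return eps, var
</code></pre>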
<div class="subsection">
<h3>Prompts: <code>'an oil painting of an old man'</code> and <code>'an oil painting of people around a campfire'</code></h3>
With the techniques above, we can also create hybrid images: images that appear to show different subjects depending on the viewing distance. The classical way to create a hybrid image is to pass the image you want to see from far away through a low-pass filter, pass the image you want to see up close through a high-pass filter, and combine the two filtered images. We can use a similar idea in the denoising process by passing the noise estimate from <code>p<sub>1</sub></code> through a low-pass filter and the noise estimate from <code>p<sub>2</sub></code> through a high-pass filter. This produces an image that shows <code>p<sub>1</sub></code> when viewed from far away and <code>p<sub>2</sub></code> when viewed up close. Unlike the anagram images, we don't need to flip or transform the image being denoised, since both subjects are viewed under the same orientation. A sketch of this filtered update follows, and below it are several examples:
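Concretely, the filtering step might look like the sketch below, using a Gaussian blur as the low-pass filter and the residual (the estimate minus its blurred version) as the high-pass filter. As before, the hypothetical helper, the <code>unet</code> call, the embeddings, the 3+3 channel split, and the particular kernel size and sigma are illustrative assumptions rather than this project's exact settings.
<pre><code>import torchvision.transforms.functional as TF

def hybrid_noise_estimate(unet, x_t, t, p1_emb, p2_emb, kernel_size=33, sigma=2.0):
    # Noise/variance estimates under each prompt (no flipping needed here).
    out_1 = unet(x_t, t, encoder_hidden_states=p1_emb).sample
    eps_1, var_1 = out_1.split(3, dim=1)
    out_2 = unet(x_t, t, encoder_hidden_states=p2_emb).sample
    eps_2, var_2 = out_2.split(3, dim=1)

    # Gaussian blur keeps the low frequencies of the first estimate;
    # subtracting a blur from the second estimate keeps its high frequencies.
    low = TF.gaussian_blur(eps_1, kernel_size=kernel_size, sigma=sigma)
    high = eps_2 - TF.gaussian_blur(eps_2, kernel_size=kernel_size, sigma=sigma)
    eps = low + high

    # Variances are simply averaged, as in the anagram case.
    var = (var_1 + var_2) / 2
    return eps, var
</code></pre>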
<div class="subsection">
<h3>Prompts: <code>'an oil painting of an old man'</code> and <code>'an oil painting of people around a campfire'</code></h3>