README.md (2 additions, 7 deletions)
@@ -10,13 +10,7 @@ The big surprise is that the generations can reach this level of fidelity. Will

Additionally, we will try adding an extra linear attention on the main branch as well as self conditioning in the pixel-space.

-Update:
-
-<img src="./images/sample.png" width="300px"></img>
-
-*130k steps*
-
-It works but the more I think about the paper, the less excited I am. There are a number of issues with the RIN / ISAB architecture. However I think the new sigmoid noise schedule remains interesting as well as the new concept of being able to self-condition on any hidden state of the network.
+The insight of being able to self-condition on any hidden state of the network as well as the newly proposed sigmoid noise schedule are the two main findings.

## Appreciation

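The added line above names two technical takeaways: latent self-conditioning and the sigmoid noise schedule. As a point of reference, here is a minimal sketch of a sigmoid gamma (signal) schedule of the kind proposed in the RIN paper; the function name and the defaults for `start`, `end`, and `tau` are assumptions for illustration, not this repository's implementation.

```python
import torch

def sigmoid_gamma_schedule(t, start = -3.0, end = 3.0, tau = 1.0, clamp_min = 1e-5):
    # maps a normalized time t in [0, 1] to a signal level gamma(t) following a
    # shifted, rescaled sigmoid, decreasing from ~1 at t = 0 to ~0 at t = 1
    v_start = torch.sigmoid(torch.tensor(start / tau))
    v_end = torch.sigmoid(torch.tensor(end / tau))
    gamma = (v_end - torch.sigmoid((t * (end - start) + start) / tau)) / (v_end - v_start)
    return gamma.clamp(min = clamp_min, max = 1.)
```

Under the usual gamma parameterization, the noised sample would then be `x_t = sqrt(gamma) * x_0 + sqrt(1 - gamma) * noise`.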
@@ -39,6 +33,7 @@ model = RIN(
    patch_size = 8,                  # patch size
    depth = 6,                       # depth
    num_latents = 128,               # number of latents. they used 256 in the paper
+   dim_latent = 512,                # can be greater than the image dimension (dim) for greater capacity
    latent_self_attn_depth = 4,      # number of latent self attention blocks per recurrent step, K in the paper
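The "self-condition on any hidden state of the network" remark above refers to latent self-conditioning: during training, the model is occasionally run once without conditioning, and its detached latents are fed back as the previous hidden state on a second pass. The sketch below illustrates that idea; the interface (a model that returns `(prediction, latents)` and accepts a `latent_prev` argument) is a hypothetical stand-in, not this repository's API.

```python
import torch

def train_step(model, images, times, self_cond_prob = 0.5):
    # with some probability, do a first pass with no latent conditioning and
    # reuse the detached latents as the "previous hidden state" for the real pass
    latent_prev = None
    if torch.rand(()).item() < self_cond_prob:
        with torch.no_grad():
            _, latents = model(images, times, latent_prev = None)
            latent_prev = latents.detach()
    pred, _ = model(images, times, latent_prev = latent_prev)
    return pred
```

At sampling time, the latents produced at one denoising step can similarly be carried into the next step, which is what makes the recurrence in RIN inexpensive to condition.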