NoisyInterpolatingDiscreteFlow(noise; K = 1, dummy_token = nothing) - Uses default cosine schedule, where `noise` is the maximum amplitude of the uniform noise component.
NoisyInterpolatingDiscreteFlow() - Uses default cosine schedule and noise = 0.2.
A convex mixture of X0, uniform noise, and X1. Equation 10 in https://arxiv.org/pdf/2407.15595
Compared to InterpolatingDiscreteFlow, it encourages the model to make multiple switches during inference.
κ₁, κ₂ are the schedules for target token interpolation and uniform noise probability.
dκ₁, dκ₂ are the derivatives of κ₁, κ₂.
Defaults to using a cosine schedule. `K=2` will resolve the discrete states later than `K=1`.
+If K>1, things might break unless your X0 is the `dummy_token` (also called the masked token), which should be passed to NoisyInterpolatingDiscreteFlow.
"""
-
-NoisyInterpolatingDiscreteFlow(noise, K = 1) = NoisyInterpolatingDiscreteFlow(
+function NoisyInterpolatingDiscreteFlow(noise; K = 1, dummy_token::T = nothing) where T
+    if (K > 1 && isnothing(dummy_token))
+        @warn "NoisyInterpolatingDiscreteFlow: If K>1 things might break if your X0 is not the `dummy_token` (which should also be passed to NoisyInterpolatingDiscreteFlow)."
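
The docstring describes the process as a convex mixture of X0, uniform noise, and X1, weighted by the schedules κ₁ and κ₂. A minimal sketch of sampling one token from such a mixture is given below. The schedule forms, the helper name `sample_xt`, and the integer state encoding are assumptions made for illustration; they are not taken from this commit, and the effect of `K` on the schedule is left out.

```julia
using Random

# Assumed schedules (hypothetical, not the library's definitions).
# κ₁ rises from 0 to 1; κ₂ peaks at `noise` when t = 0.5 and vanishes at t = 0 and t = 1.
# With these forms, κ₁ + κ₂ ≤ 1 holds whenever noise ≤ 0.25.
κ₁(t) = sinpi(t / 2)^2
κ₂(t, noise) = noise * sinpi(t)^2

# Draw one token of Xt: X1 with probability κ₁(t), a uniformly random state
# with probability κ₂(t), and X0 with the remaining probability.
function sample_xt(rng::AbstractRNG, x0::Int, x1::Int, t::Real, n_states::Int; noise = 0.2)
    k1, k2 = κ₁(t), κ₂(t, noise)
    u = rand(rng)
    if u < k1
        return x1
    elseif u < k1 + k2
        return rand(rng, 1:n_states)
    else
        return x0
    end
end

rng = MersenneTwister(0)
# Here state 5 plays the role of the dummy (mask) token used as X0.
xt = sample_xt(rng, 5, 3, 0.5, 5)
```

At `t = 0` the sketch always returns X0 and at `t = 1` it always returns X1; in between, a token can land on a uniformly random state, so it may need to switch again before settling on X1.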