@@ -42,21 +42,21 @@
 
 """
     NoisyInterpolatingDiscreteFlow(κ₁, κ₂, dκ₁, dκ₂)
-    NoisyInterpolatingDiscreteFlow(noise) - Uses default cosine schedule, where `noise` is the maximum amplitude of the uniform noise component.
+    NoisyInterpolatingDiscreteFlow(noise, K = 1) - Uses default cosine schedule, where `noise` is the maximum amplitude of the uniform noise component.
     NoisyInterpolatingDiscreteFlow() - Uses default cosine schedule and noise = 0.2.
 
 A convex mixture of X0, uniform noise, and X1. Equation 10 in https://arxiv.org/pdf/2407.15595
 Compared to InterpolatingDiscreteFlow, it encourages the model to make multiple switches during inference.
 κ₁, κ₂ are the schedules for target token interpolation and uniform noise probability.
 dκ₁, dκ₂ are the derivatives of κ₁, κ₂.
-Defaults to using a cosine schedule.
+Defaults to using a cosine schedule. `K=2` will resolve the discrete states later than `K=1`.
 """
 
-NoisyInterpolatingDiscreteFlow(noise) = NoisyInterpolatingDiscreteFlow(
-    t -> oftype(t,(1 - cos((π/2)*t))),
-    t -> oftype(t,(noise * sin(π*t))),
-    t -> oftype(t,((π/2)*sin((π/2)*t))),
-    t -> oftype(t,(noise*π*cos(π*t))),
+NoisyInterpolatingDiscreteFlow(noise, K = 1) = NoisyInterpolatingDiscreteFlow(
+    t -> oftype(t,(1 - cos((π/2)*t))^K), #K1
+    t -> oftype(t,(noise * sin(π*t))), #K2
+    t -> oftype(t,(K * (π/2) * sin((π/2) * t) * (1 - cos((π/2) * t))^(K - 1))), #dK1
+    t -> oftype(t,(noise*π*cos(π*t))) #dK2
 )
 NoisyInterpolatingDiscreteFlow() = NoisyInterpolatingDiscreteFlow(0.2)

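The new `dκ₁` follows from the chain rule applied to `(1 - cos((π/2)*t))^K`. A quick way to check it, and to see why a larger `K` resolves states later, is to copy the four closures into a standalone script and compare them against finite differences. The sketch below does that; `cosine_schedules` and `central_diff` are hypothetical helper names introduced only for this check, not part of the package.

```julia
# Standalone copies of the default schedules from the `+` lines of the diff.
# `cosine_schedules` and `central_diff` are hypothetical helper names.
function cosine_schedules(noise, K = 1)
    κ₁  = t -> oftype(t, (1 - cos((π/2)*t))^K)
    κ₂  = t -> oftype(t, noise * sin(π*t))
    dκ₁ = t -> oftype(t, K * (π/2) * sin((π/2)*t) * (1 - cos((π/2)*t))^(K - 1))
    dκ₂ = t -> oftype(t, noise * π * cos(π*t))
    return (; κ₁, κ₂, dκ₁, dκ₂)
end

# Central finite difference, used only to cross-check the analytic derivatives.
central_diff(f, t; h = 1e-6) = (f(t + h) - f(t - h)) / (2h)

for K in (1, 2, 3)
    s = cosine_schedules(0.2, K)
    ts = 0.05:0.05:0.95
    err1 = maximum(abs(central_diff(s.κ₁, t) - s.dκ₁(t)) for t in ts)
    err2 = maximum(abs(central_diff(s.κ₂, t) - s.dκ₂(t)) for t in ts)
    println("K = $(K): dκ₁ error = $(err1), dκ₂ error = $(err2), κ₁(0.5) = $(s.κ₁(0.5))")
end
```

Because `1 - cos((π/2)*t)` lies between 0 and 1 on the unit interval, raising it to a higher power lowers `κ₁` at every intermediate `t`, so less probability mass sits on the target token until late in the trajectory. Note that `K` is a positional optional argument in the new signature, so it is passed as the second positional argument rather than as a keyword.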
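The docstring describes the interpolant as a convex mixture of X0, uniform noise, and X1. One plausible reading, sketched below, is that at time `t` a token equals X1 with probability `κ₁(t)`, is drawn uniformly from the vocabulary with probability `κ₂(t)`, and otherwise stays at X0. This follows only the docstring's wording and is an assumption, not the package's sampling code; `sample_xt`, the vocabulary size, and the token values are all hypothetical.

```julia
using Random

# Hypothetical helper: one draw from the three-way mixture at a fixed time,
# given the schedule values κ₁t = κ₁(t) and κ₂t = κ₂(t).
function sample_xt(rng, x0, x1, vocab, κ₁t, κ₂t)
    u = rand(rng)
    u < κ₁t && return x1                      # target token
    u < κ₁t + κ₂t && return rand(rng, vocab)  # uniform noise token
    return x0                                 # remaining mass stays on the source token
end

# Default cosine schedules from the diff, with assumed noise = 0.2 and K = 2.
noise, K = 0.2, 2
κ₁ = t -> (1 - cos((π/2)*t))^K
κ₂ = t -> noise * sin(π*t)

rng = MersenneTwister(0)
vocab = 1:20          # assumed vocabulary of 20 tokens
x0, x1 = 3, 17        # assumed source and target tokens
for t in (0.25, 0.5, 0.75)
    draws = [sample_xt(rng, x0, x1, vocab, κ₁(t), κ₂(t)) for _ in 1:10_000]
    frac_x1 = count(==(x1), draws) / length(draws)
    println("t = $(t): fraction at x1 ≈ $(frac_x1) (κ₁ = $(κ₁(t)), κ₂ = $(κ₂(t)))")
end
```

Since a uniform draw can also land on X1, the observed fraction at X1 should sit near `κ₁(t) + κ₂(t) / length(vocab)` rather than `κ₁(t)` alone.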