Fix ModifiedGradientDescentOptimizer to use correct projection
The UpdateVector method used an incorrect scalar heuristic: it uniformly
scaled all parameters by (1 - ||x||²), required clipping when ||x||² >= 1,
and in the clipped regime discarded the parameter term entirely.
Issue:
- Used modFactor = 1 - ||x||² as a scalar multiplier
- Clipped to zero when ||x||² >= 1, dropping currentParameters entirely
- This is not the correct vector equivalent of W * (I - x x^T)
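For contrast, the old heuristic behaved roughly like the following sketch
(a hypothetical NumPy reconstruction from the description above, not the
repository's actual code; names w, x, grad, lr are illustrative):

```python
import numpy as np

def update_vector_old(w, x, grad, lr):
    """Old, incorrect update: scalar factor instead of a projection."""
    # Incorrect: a single scalar uniformly scales every parameter.
    mod_factor = max(0.0, 1.0 - float(np.dot(x, x)))  # clipped when ||x||^2 >= 1
    # When clipped to zero, w is discarded entirely.
    return w * mod_factor - lr * grad
```

Note that for any input with ||x||² >= 1 the factor clips to zero and the
current parameters vanish from the update, which is the bug being fixed.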
Fix:
Replaced with correct projection for vector parameter w:
- w * (I - x x^T) = w - x*(x^T*w) = w - x*dot(w,x)
- Compute dot = dot(currentParameters, input)
- Projection: currentParameters - input * dot
- Then apply the gradient step: subtract η * gradient
- Final: w_{t+1} = w_t - x_t*dot(w_t,x_t) - η*gradient
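The corrected step above can be sketched as follows (a hypothetical NumPy
illustration of the commit's formula; the function name, signature, and
variable names are assumptions, not the repository's actual UpdateVector):

```python
import numpy as np

def update_vector(w, x, grad, lr):
    """Modified gradient descent step: w_{t+1} = w - x*dot(w, x) - lr*grad."""
    # Dimension validation, per the commit's added check.
    if w.shape != x.shape or w.shape != grad.shape:
        raise ValueError("parameter, input, and gradient dimensions must match")
    # Projection w (I - x x^T) = w - x * dot(w, x); no clipping needed,
    # and the parameter term survives regardless of the input norm.
    dot = float(np.dot(w, x))
    return w - x * dot - lr * grad
```

Because the projection is computed as two vector operations rather than by
materializing the matrix (I - x x^T), the step stays O(n) in the parameter
dimension.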
Benefits:
- Mathematically correct implementation of Equations 27-29
- No clipping needed: the projection is numerically stable for any input norm
- Parameters never discarded regardless of input norm
- Added validation for dimension matching
This ensures the Modified Gradient Descent optimizer correctly implements
the paper's formulation for vector parameters.