Commit 4fb572c
committed
feat(memory): Only fail partial erasure of recurrent tail
The recurrent state is always assumed to be the state as of the last update
from the final token in the sequence. When doing a partial erasure, if the
range does not include the final token, the erasure can be considered a
success since any memory used for the sequence prior to the final token
(which is no memory) has been successfully removed.
There is one potential case that this doesn't address which is the pruning
of cache to remove sensitive data from the context. This wouldn't work for
attention cache partial removal (in the middle) either since the KV state
is linearly-dependent and states in later sequence positions would still be
based on the state from the sensitive data, even if that data is no longer
cached, so I don't think this is relevant, but it is worth noting that the
semantics of this change for a partial erasure in the middle of the cache
are essentially "my context is already compressed" and not "all trace of
the removed tokens has been removed."
#16768
Branch: HybridContextShift-16768
Signed-off-by: Gabe Goodhart <[email protected]>1 parent a5c07dc commit 4fb572c
1 file changed
+4
-3
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
151 | 151 | | |
152 | 152 | | |
153 | 153 | | |
154 | | - | |
| 154 | + | |
| 155 | + | |
155 | 156 | | |
156 | 157 | | |
157 | 158 | | |
| |||
160 | 161 | | |
161 | 162 | | |
162 | 163 | | |
163 | | - | |
164 | | - | |
| 164 | + | |
| 165 | + | |
165 | 166 | | |
166 | 167 | | |
167 | 168 | | |
| |||
0 commit comments