Why is Context Shifting not kicking in for all messages even without using dynamic information (Memories)? #674
Replies: 5 comments 5 replies
-
@LostRuins I understand you might not have too much time at the moment, but when you are available, I'd be very thankful if you could chime in on this. Sorry for the ping.
-
I'd suggest running in […]
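As a quick way to check whether the cached prompt is actually being reused, here is a minimal timing probe (my own sketch, not anything from koboldcpp itself). It assumes koboldcpp's KoboldAI-compatible `/api/v1/generate` endpoint on the default port 5001; the prompt sizes and context length are illustrative only:

```python
import time
import requests

# Assumption: koboldcpp is running locally on its default port with its
# KoboldAI-compatible API reachable.
API = "http://localhost:5001/api/v1/generate"

def timed_generate(prompt: str) -> float:
    """Send one generation request and return the wall-clock seconds it took."""
    t0 = time.time()
    r = requests.post(API, json={
        "prompt": prompt,
        "max_length": 64,            # short reply, so timing is dominated by prompt processing
        "max_context_length": 4096,  # should match the context size koboldcpp was launched with
    })
    r.raise_for_status()
    return time.time() - t0

history = "A long, stable block of chat history. " * 300  # stands in for the shared prefix

# First call: full prompt processing is expected.
cold = timed_generate(history + "User: Hello\nAssistant:")
# Second call: if fast-forwarding/shifting works, only the new suffix is processed.
warm = timed_generate(history + "User: Hello\nAssistant: Hi!\nUser: How are you?\nAssistant:")

print(f"cold: {cold:.1f}s, warm: {warm:.1f}s")
# If 'warm' takes nearly as long as 'cold', the cached prefix was not reused.
```

Comparing the two timings across different Token Padding settings should show exactly where the reuse breaks.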
-
Testing in progress... SillyTavern: Token Padding set to 32. Models: […]
-
Testing in progress... SillyTavern: Token Padding set to 128/256/512/1024. Context 12288. Model: […]
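For context on why the padding value matters here, this is the budget arithmetic as I understand it (a sketch with my own variable names, not SillyTavern's actual code):

```python
# Sketch of the frontend-side prompt budget, using my own names (this is
# not SillyTavern's actual code). Token Padding reserves slack so that the
# frontend's token estimate can disagree with the backend's exact count
# without overflowing the context window.
max_context   = 12288  # context size used in this test
response_len  = 512    # illustrative value reserved for the model's reply
token_padding = 128    # the setting being varied: 128/256/512/1024

prompt_budget = max_context - response_len - token_padding
print(prompt_budget)  # 11648 tokens of history fit before old messages are trimmed

# The larger the padding, the earlier old history gets trimmed. If the trim
# point (and therefore the start of the prompt) moves differently than the
# backend expects, the prefix no longer matches and shifting cannot apply.
```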
-
See issue #681 for the conclusion and the cause of the problem reported here. It is solved as of right now, hopefully!
-
When near max context, only some messages benefit from Context Shifting, and I can't find the reason why that happens.
Real examples: you can see the huge drop in final T/s when shifting doesn't happen.
I am using the prebuilt koboldcpp 1.57.1 + SillyTavern 1.11.4 (staging, latest commits), and I made sure I don't have any dynamic information added anywhere in the context sent for processing. Context/Response Formatting: […]
I don't have any injections at @D or similar (I even disabled the modules and extensions I mention). So, I am not using any Memory or other dynamic information that might trigger additional context reprocessing, but this keeps happening.
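To make the "dynamic information" concern concrete: as I understand it, the backend can only reuse cached work up to the first token where the old and new prompts diverge (plus the special case of tokens trimmed purely from the start, which is what Context Shifting handles). A toy illustration, not koboldcpp's actual code:

```python
def reusable_prefix(old_tokens: list[int], new_tokens: list[int]) -> int:
    """Length of the shared leading token span between two prompts.
    Everything after it must be reprocessed. Toy model only; this is
    not koboldcpp's implementation."""
    n = 0
    for a, b in zip(old_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

prev = [1, 2, 3, 4, 5, 6, 7, 8]          # tokens of the previous prompt
appended = prev + [9, 10]                # same history plus a new message
print(reusable_prefix(prev, appended))   # 8 -> only the 2 new tokens are processed

edited = [1, 99, 3, 4, 5, 6, 7, 8, 9]    # one token changed near the very start
print(reusable_prefix(prev, edited))     # 1 -> nearly the whole prompt is reprocessed
```

So even a single token that differs early in the prompt (an injected date, a moving trim point, a reformatted system prompt) would be enough to force a full reprocess, which would match the T/s drops above.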
What causes this? Is it an issue with the model, due to quantization and RoPE?
For the example and this discussion, I'm using s3nh/Kunoichi-DPO-v2-7B-GGUF (Q4_K_M):
GPU: GTX 1070Ti - 8GB - Pascal
RAM: 32GB DDR4 3200 MHz
CPU: Ryzen 5 1600 AF 6C/12T 3.2-3.6 GHz (Zen 2 Arch)
OS: Windows 11 22H2