Replies: 2 comments 7 replies
-
It's odd that it would happen only after 1.20, since the implementation has been largely unchanged. At the moment there's no easy way to flush it, only to restart the program (which should be quite fast). Do you know when this happens? Threading shouldn't have any effect. Does it ever happen without smartcontext? Are you using a length beyond the recommended max context of 2048? If you have a prompt that can consistently repro this issue, I can try to debug it.
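For readers unfamiliar with the feature, here's a rough sketch of the smartcontext idea as discussed in this thread; the function name and the exact halving policy are assumptions for illustration, not KoboldCpp's actual code. When the running context would overflow, only the most recent chunk of tokens is kept, so a full (slow) reprocess happens roughly once per half-context instead of on every generation:

```python
def apply_smartcontext(tokens, max_ctx=2048, gen_budget=80):
    """Hypothetical helper: trim tokens so prompt + generation fit in max_ctx."""
    limit = max_ctx - gen_budget
    if len(tokens) <= limit:
        return tokens, False           # still fits: reuse the cached context
    keep = max_ctx // 2                # assumption: keep the newest half
    return tokens[-keep:], True        # True marks a smartcontext refresh
```

This also suggests why a corrupted state lingers: whatever went wrong stays in the cached context until the next refresh discards it.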
-
Only happens with smartcontext. I use the default max context for the model I'm using, 2048, and from what I can tell it only starts after more than 2048 tokens have been written. It happens with any prompt, any mode, and many different models, and it has persisted through patches. As soon as smartcontext is refreshed, it's back to normal. It's been happening since smartcontext was introduced, though maybe more commonly now? Or I just keep noticing it more through built-up frustration.

I have never been able to figure out what causes it, nor found a way to intentionally reproduce it, but I have found some minor conditions it occurs under. Long text, way above 2048 tokens, is the only one I'm 100% certain of. I have around 100-200 written stories, all with varied prose and style, and it happens rarely on all of them, which suggests it's not caused by the prompt. That said, I never use the input box and only edit the raw text for my generations, usually writing a paragraph or two on my own. It shouldn't matter, but maybe there's something funky going on in the background with how that's parsed.

It's model-agnostic, affecting various models and architectures: 65B, 30B, Alpaca, LLaMA, base OPT, etc. It's very noticeable when it happens: something is clearly wrong with the current context, causing an entirely fiction-focused story to start outputting weird pseudocode, spam a table of contents, or just repeat /n endlessly. Letting it run out always resolves it.

The reason I thought it might be CPU-related is that it doesn't seem to claim the allocated threads if they are already in use by other processes, nor throttle prompt processing while they are busy, which could perhaps be causing some issue at times. I would use the force-priority command, but it has some really strange performance implications, like the kernel crashing randomly, which doesn't happen when manually assigning N-1 cores (sketched below). Like I said, no clue what causes it; it's seemingly random and never happens twice in a row. Not super useful for finding the cause, maybe it's cosmic rays lol.
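As an aside, the "manually assigning N-1 cores" workaround mentioned above can be scripted. A minimal, Linux-only sketch (os.sched_getaffinity/os.sched_setaffinity aren't available on Windows or macOS, and the helper name is hypothetical):

```python
import os

def pin_to_n_minus_one_cores():
    cores = sorted(os.sched_getaffinity(0))       # cores this process may use
    if len(cores) > 1:
        os.sched_setaffinity(0, set(cores[:-1]))  # leave one core free
```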
-
Hey, I've been having issues with smartcontext occasionally becoming corrupt since 1.20. It's relatively rare, but when it happens the output is complete nonsense until the smartcontext can be flushed: it outputs gibberish, forcefully swaps to code output, or just spams random letters until it can reparse the context. Is there any way to flush smartcontext and reprocess it when this happens? Currently I either restart or just let it output the 1020 tokens while I do something else; once done, I delete the output and resubmit the previous context again (sketched after this post). Another common failure is outputting /n/n/n a total of 1020 times. Samplers don't affect it.
Being able to manually clear corrupt context would save a lot of time. I would say this happens around once every 30-40 new smartcontext refreshes, which is enough to become a real annoyance over time. Once it reprocesses the context, it's fine.
I'm not exactly sure why it becomes corrupt: going through the tokens in context shows nothing strange, and changing samplers does not alter the behavior. Maybe it's due to the allocated threads/cores being busy with other processes, causing strange behavior during processing? I've not been able to replicate it or diagnose what could be causing it. It seems pretty random; the only common denominator has been relatively long stories, way above the context cap.
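The delete-and-resubmit workaround described above can also be driven over the API. A hedged sketch, assuming a local KoboldCpp instance exposing the KoboldAI-compatible /api/v1/generate endpoint; the field names follow that API, and whether resubmission actually forces a full reprocess of a corrupted smartcontext state isn't guaranteed:

```python
import requests

def force_reprocess(story_text, url="http://localhost:5001/api/v1/generate"):
    # Request a single token so the call is dominated by prompt
    # (re)processing rather than generation.
    payload = {"prompt": story_text, "max_length": 1}
    resp = requests.post(url, json=payload, timeout=600)
    resp.raise_for_status()
    return resp.json()
```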