Replies: 2 comments 7 replies
-
It's odd that it would happen only after 1.20, since the implementation has been largely unchanged. At the moment there's no easy way to flush it, only to restart the program (which should be quite fast). Do you know when this happens? Threading shouldn't have any effect. Does it ever happen without smartcontext? Are you using a length beyond the recommended max context of 2048? If you have a prompt that can consistently repro this issue, I can try to debug it.
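For readers unfamiliar with the feature, here's a rough sketch of the smartcontext idea as discussed in this thread; the function name and the exact halving policy are assumptions for illustration, not KoboldCpp's actual code. When the running context would overflow, only the most recent chunk of tokens is kept, so a full (slow) reprocess happens roughly once per half-context instead of on every generation:

```python
def apply_smartcontext(tokens, max_ctx=2048, gen_budget=80):
    """Hypothetical helper: trim tokens so prompt + generation fit in max_ctx."""
    limit = max_ctx - gen_budget
    if len(tokens) <= limit:
        return tokens, False           # still fits: reuse the cached context
    keep = max_ctx // 2                # assumption: keep the newest half
    return tokens[-keep:], True        # True marks a smartcontext refresh
```

This also suggests why a corrupted state lingers: whatever went wrong stays in the cached context until the next refresh discards it.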
-
Only happens with smartcontext. I use the default max context for the model I'm using, 2048, and from what I can tell it only starts after more than 2048 tokens have been written. It happens with any prompt, any mode, and many different models, and it has persisted through patches. As soon as smartcontext is refreshed, it's back to normal. It's been happening since smartcontext was introduced, though maybe more commonly now? Or I just keep noticing it more through built-up frustration.

I have never been able to figure out what causes it, nor found a way to intentionally reproduce it, but I have found some minor conditions it occurs under. Long text, way above 2048 tokens, is the only one I'm 100% certain of. I have around 100-200 written stories, all with varied prose and style, and it happens rarely on all of them, which suggests it's not caused by the prompt. That said, I never use the input box and only edit the raw text for my generations, usually writing a paragraph or two on my own. It shouldn't matter, but maybe there's something funky going on in the background with how that's parsed.

It's model-agnostic, affecting various models and architectures: 65B, 30B, Alpaca, LLaMA, base OPT, etc. It's very noticeable when it happens: something is clearly wrong with the current context, causing an entirely fiction-focused story to start outputting weird pseudocode, spam a table of contents, or just repeat /n endlessly. Letting it run out always resolves it.

The reason I thought it might be CPU-related is that it doesn't seem to claim the allocated threads if they are already in use by other processes, nor throttle prompt processing while they are busy, which could perhaps be causing some issue at times. I would use the force-priority command, but it has some really strange performance implications, like the kernel crashing randomly, which doesn't happen when manually assigning N-1 cores (sketched below). Like I said, no clue what causes it; it's seemingly random and never happens twice in a row. Not super useful for finding the cause, maybe it's cosmic rays lol.
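As an aside, the "manually assigning N-1 cores" workaround mentioned above can be scripted. A minimal, Linux-only sketch (os.sched_getaffinity/os.sched_setaffinity aren't available on Windows or macOS, and the helper name is hypothetical):

```python
import os

def pin_to_n_minus_one_cores():
    cores = sorted(os.sched_getaffinity(0))       # cores this process may use
    if len(cores) > 1:
        os.sched_setaffinity(0, set(cores[:-1]))  # leave one core free
```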
-
Hey, I've been having issues with smartcontext occasionally becoming corrupt since 1.20. It's relatively rare, but when it happens the output is complete nonsense until the smartcontext can be flushed: it outputs gibberish, forcefully swaps to code output, or just spams random letters until it can reparse the context. Is there any way to flush smartcontext and reprocess it when this happens? Currently I either restart or just let it output the 1020 tokens while I do something else; once done, I delete the output and resubmit the previous context again (sketched after this post). Another common failure is outputting /n/n/n a total of 1020 times. Samplers don't affect it.
Being able to manually clear corrupt context would save a lot of time. I would say this happens around once every 30-40 new smartcontext refreshes, which is enough to become a real annoyance over time. Once it reprocesses the context, it's fine.
I'm not exactly sure why it becomes corrupt: going through the tokens in context shows nothing strange, and changing samplers does not alter the behavior. Maybe it's due to the allocated threads/cores being busy with other processes, causing strange behavior during processing? I've not been able to replicate it or diagnose what could be causing it. It seems pretty random; the only common denominator has been relatively long stories, way above the context cap.
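The delete-and-resubmit workaround described above can also be driven over the API. A hedged sketch, assuming a local KoboldCpp instance exposing the KoboldAI-compatible /api/v1/generate endpoint; the field names follow that API, and whether resubmission actually forces a full reprocess of a corrupted smartcontext state isn't guaranteed:

```python
import requests

def force_reprocess(story_text, url="http://localhost:5001/api/v1/generate"):
    # Request a single token so the call is dominated by prompt
    # (re)processing rather than generation.
    payload = {"prompt": story_text, "max_length": 1}
    resp = requests.post(url, json=payload, timeout=600)
    resp.raise_for_status()
    return resp.json()
```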