I connected localGPT to my corpus building system - To Train LLMs #481
clearsitedesigns started this conversation in Show and tell
I designed a system that builds a data store on any topic. I ingest the content with my custom overlapping-chunk embedding function, then use localGPT to run a series of chained commands, with the model acting as a stand-in between steps. The model is tuned to respond (I adjust several hyperparameters for this) to a series of questions that I pose against the content in a chain, and it outputs the results as a "new training" source for an LLM in the form of question-and-answer pairs. A secondary config adds a Madlib-style question-and-answer path to vary how the questions are phrased. From there, I save the output as question-and-answer instruct pairs, and a series of validation checks helps determine where hallucinations occur.
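To give a rough idea of the shape of the loop, here is a minimal sketch of how Madlib-style question templating and a simple hallucination check could feed instruct pairs into a JSONL file. The names below (`ask`, `QUESTION_TEMPLATES`, `build_pairs`, `instruct_pairs.jsonl`) are placeholders for illustration, not the real implementation, which is wired into the localGPT retrieval chain shown in the logs further down.

```python
import json
import random

# Placeholder for the retrieval-QA call; in practice this would go through
# the localGPT chain backed by the llama.cpp model seen in the timings below.
def ask(question: str) -> str:
    raise NotImplementedError("wire this to your local model")

# Madlib-style templates: the same underlying question is phrased several
# ways so the resulting instruct pairs are not all worded identically.
QUESTION_TEMPLATES = [
    "What does the source material say about {topic}?",
    "Summarize the key points regarding {topic}.",
    "Explain {topic} as if teaching a newcomer.",
]

def build_pairs(topics, out_path="instruct_pairs.jsonl", min_words=20):
    """Generate question/answer instruct pairs and append them as JSONL."""
    with open(out_path, "a", encoding="utf-8") as f:
        for topic in topics:
            question = random.choice(QUESTION_TEMPLATES).format(topic=topic)
            answer = ask(question)
            # Crude validation check: skip very short answers or answers
            # that never mention the topic, which often signal refusals
            # or hallucinated content.
            if len(answer.split()) < min_words or topic.lower() not in answer.lower():
                continue
            f.write(json.dumps({"instruction": question, "output": answer}) + "\n")
```

In the actual pipeline the validation step is more involved than the single keyword check above; the sketch only shows where that check sits relative to the generation loop.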
I thought I would share just a little information about this. The output below is just a sample; the process runs in the background for 2,000 cycles (the typical number needed to adjust a LoRA adapter).
llama_print_timings: load time = 2014.40 ms
llama_print_timings: sample time = 388.07 ms / 551 runs ( 0.70 ms per token, 1419.83 tokens per second)
llama_print_timings: prompt eval time = 42211.96 ms / 989 tokens ( 42.68 ms per token, 23.43 tokens per second)
llama_print_timings: eval time = 32695.94 ms / 550 runs ( 59.45 ms per token, 16.82 tokens per second)
llama_print_timings: total time = 76212.72 ms
Length of raw answer in tokens: 348
Length of query in tokens: 5
Llama.generate: prefix-match hit
llama_print_timings: load time = 2014.40 ms
llama_print_timings: sample time = 12.78 ms / 18 runs ( 0.71 ms per token, 1408.56 tokens per second)
llama_print_timings: prompt eval time = 43205.15 ms / 1000 tokens ( 43.21 ms per token, 23.15 tokens per second)
llama_print_timings: eval time = 1055.25 ms / 18 runs ( 58.63 ms per token, 17.06 tokens per second)
llama_print_timings: total time = 44436.19 ms
Length of raw answer in tokens: 5
llama_print_timings: load time = 2014.40 ms
llama_print_timings: sample time = 356.10 ms / 501 runs ( 0.71 ms per token, 1406.93 tokens per second)
llama_print_timings: prompt eval time = 33024.80 ms / 778 tokens ( 42.45 ms per token, 23.56 tokens per second)
llama_print_timings: eval time = 29268.26 ms / 500 runs ( 58.54 ms per token, 17.08 tokens per second)
llama_print_timings: total time = 63490.88 ms
Length of raw answer in tokens: 318