Installation under Python 3.11 and Ubuntu 22.04 was almost effortless, and I was hoping for the same when testing.
Unfortunately a few problems were encountered that require more knowledgeable assistance.
My original intention in testing this program was to generate models similar to those found at
https://huggingface.co/Rewritelikeme, using local models only.
If any user has an example of how to train such models I would deeply appreciate the assistance.
Problems encountered testing local models so far:
- Edited local_linux.sh for the local model:
  echo " models/Mistral7bv02_fp16_GGUF - Use custom model path"
- Ran ./local_linux.sh --tensor-parallelism 2
- The main GUI appears and everything looks good.
- Chose "basic_server.yaml" from Configs (modified as shown below).
basic_server.yaml:
pipeline: basic-server
prompt_path: system_prompts/Dickens_system_prompt.txt # System prompt path. The prompt.txt you get from the main Augmentoolkit pipeline is a good choice.
template_path: model_templates/Dickens_chat_template.jinja # Chat template path. Also look to the Augmentoolkit pipeline output dir.
gguf_model_path: models/Mistral7bv02_fp16_GGUF # A model that you have trained before. !!ATTENTION!! This should be the same directory as the original model was downloaded to -- the server needs the original tokenizer files as well. tokenizer.model, tokenizer.json, tokenizer_config.json, special_tokens_map -- you need those in the same dir as the file to which this setting points.
context_length: 9999
llama_path: './llama.cpp'
port: 8003 # what port to run on
- Loaded and ran the 'basic_server.yaml' above.
- It switches to the next window and begins running. It loads punkt etc. until:
  Unrecognized model in models. Should have a model_type key in its config.json
- config.json for the above 'Mistral7bv02_fp16_GGUF' clearly shows:
  "model_type": "mistral",
- Downloaded the most recent versions of the required model files, but the same error occurs every time.
- Is there a file I can edit so this error doesn't occur?
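For reference, this looks like the generic error the transformers library raises from AutoConfig when the path it is handed contains no usable config.json; note the message says "Unrecognized model in models", as if the bare models/ directory were being checked rather than the subdirectory named in the yaml. A minimal check that can be run outside the GUI (a sketch only, assuming transformers is installed, using the path from my basic_server.yaml):

```python
# Minimal sanity check: does the exact path from basic_server.yaml resolve to
# a loadable config? transformers raises the quoted "Unrecognized model in
# ..." error when the directory it receives lacks a config.json with a
# model_type key.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("models/Mistral7bv02_fp16_GGUF")
print(cfg.model_type)  # expect "mistral", per the config.json quoted above
```

If that prints "mistral", the model directory itself is fine and the path is presumably being mangled somewhere between the yaml and the loader.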
##################################
My second attempt to load a model met with more success and new problems.
- Modified '_LOCAL_DATAGEN_complete_factual.yaml' to load the "datagen-pretrain-v1-7b-mistralv0.2" .safetensors model:
  echo " models/datagen-pretrain-v1-7b-mistralv0.2 - Use custom model path"
- A number of provider and API lines need to be commented out in the .yaml, but it's unclear which ones apart from the most obvious. The documentation only says:
  "Simply add your deepinfra API key (or change the provider to be local/some other provider) and run the pipeline to make your very own dataset and training configs for creating a domain expert!"
- Is there an explanation in the repo of how to do this and make the .yaml run 100% on a local machine?
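To make the question concrete, below is the kind of override I was expecting to be able to write. This is purely a hypothetical sketch (every key name in it is my guess, not the actual schema of '_LOCAL_DATAGEN_complete_factual.yaml') of what "change the provider to be local" might look like, pointing the pipeline at a local OpenAI-compatible endpoint:

```yaml
# Hypothetical sketch only: these key names are guesses, not the real
# _LOCAL_DATAGEN_complete_factual.yaml schema. The intent is to route every
# model stage to a local OpenAI-compatible server instead of deepinfra.
provider: local
base_url: http://localhost:8003/v1   # e.g. the server from basic_server.yaml
api_key: placeholder                 # many local servers ignore this value
large_model: models/datagen-pretrain-v1-7b-mistralv0.2
small_model: models/datagen-pretrain-v1-7b-mistralv0.2
```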
- Replaced the remaining two models in the .yaml with the same path:
  models/datagen-pretrain-v1-7b-mistralv0.2
- Gave it my best shot, but the run ended with a TypeError about missing required arguments, which presumably shouldn't be required at all in a 100%-local run:
run_augmentoolkit.py", line 158, in run_pipeline_config
asyncio.run(function(**flattened_config))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: factual_datagen_full() missing 9 required positional arguments: 'pretrain_hub_model_id', 'pretrain_hub_strategy', 'finetune_hub_model_id', 'finetune_hub_strategy', 'wandb_project', 'runpod_api_key', 'huggingface_token', 'wandb_api_key', and 'pod_id'
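If it is useful, the failure mode can be reproduced in isolation. Judging from the traceback, run_pipeline_config flattens the yaml into a dict and splats it into the pipeline function, so any key commented out of the yaml never reaches the call, and Python reports it as a missing required argument. A minimal sketch with a hypothetical stand-in function:

```python
# Minimal illustration of the failure mode, using a hypothetical stand-in;
# the real factual_datagen_full() lists nine such required parameters.
import asyncio

async def factual_datagen_full_stub(runpod_api_key, huggingface_token):
    pass

flattened_config = {}  # keys commented out of the yaml never make it here

# Raises before the coroutine even starts:
# TypeError: factual_datagen_full_stub() missing 2 required positional
# arguments: 'runpod_api_key' and 'huggingface_token'
asyncio.run(factual_datagen_full_stub(**flattened_config))
```

If that reading is right, those yaml keys probably need to stay present with empty or placeholder values rather than being commented out, but confirmation from someone who knows the intended local setup would be welcome.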
- I would appreciate any help in revising these files so that they run as intended, entirely on a local machine.
Cheers,