I am able to run the inference worker without any issue with a model like OA_SFT_Pythia_12Bq_4, but I can't use LLaMA models.
On the OpenAssistant/oasst-sft-6-llama-30b-xor repository on Hugging Face we can read the following:
Due to the license attached to LLaMA models by Meta AI it is not possible to directly distribute LLaMA-based models. Instead we provide XOR weights for the OA models.
Is there any good documentation on how to set up the inference worker with a LLaMA model? Downloading the original weights, applying the XOR weights, etc.?
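
For context, here is my rough understanding of how the XOR weights are meant to be applied. This is only a conceptual sketch with placeholder file names, not the official conversion script from the repository:

```python
# Conceptual sketch (placeholder paths, not the official OpenAssistant tooling):
# the repo distributes delta = oa_weights XOR llama_weights, so anyone who already
# holds the original LLaMA weights can recover the OA weights by XOR-ing the two
# byte streams again.
import numpy as np

def apply_xor(xor_file: str, original_file: str, output_file: str) -> None:
    """Recover model weights by XOR-ing the distributed delta with the original bytes."""
    xor_bytes = np.fromfile(xor_file, dtype=np.uint8)
    original_bytes = np.fromfile(original_file, dtype=np.uint8)
    assert xor_bytes.shape == original_bytes.shape, "files must be the same size"
    recovered = np.bitwise_xor(xor_bytes, original_bytes)
    recovered.tofile(output_file)

# Hypothetical usage, one call per checkpoint shard:
# apply_xor("oasst-xor/pytorch_model-00001-of-00007.bin.xor",
#           "llama-30b/pytorch_model-00001-of-00007.bin",
#           "oasst-sft-6-llama-30b/pytorch_model-00001-of-00007.bin")
```

If that understanding is correct, the missing piece for me is how the resulting checkpoint is then wired into the inference worker configuration.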