This example shows how simple it is to fine-tune an LLM with the [axolotl](https://docs.axolotl.ai/) framework and OVHcloud [Machine Learning Services](https://www.ovhcloud.com/fr/public-cloud/ai-machine-learning/).

## Prerequisites

- An OVHcloud [Public Cloud project](https://help.ovhcloud.com/csm/en-ie-public-cloud-compute-essential-information?id=kb_article_view&sysparm_article=KB0050387)
- A valid OVHcloud [AI Endpoints API key](https://help.ovhcloud.com/csm/en-ie-public-cloud-ai-endpoints-getting-started?id=kb_article_view&sysparm_article=KB0065398) stored in an environment variable named `OVH_AI_ENDPOINTS_ACCESS_TOKEN`
- A valid AI Endpoints model URL stored in an environment variable named `OVH_AI_ENDPOINTS_MODEL_URL`
- A valid AI Endpoints model name stored in an environment variable named `OVH_AI_ENDPOINTS_MODEL_NAME`
- A [Hugging Face](https://huggingface.co/) account with a valid API key
- Optional:
  - a valid Python installation
  - a valid Docker installation
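
Before running any of the scripts, it can help to confirm that the three environment variables listed above are actually set. The following is a minimal sketch, not part of this repository: the helper name `missing_env_vars` is hypothetical, and only the variable names come from the prerequisites.

```python
import os

# Variable names taken from the prerequisites list.
REQUIRED_ENV_VARS = (
    "OVH_AI_ENDPOINTS_ACCESS_TOKEN",
    "OVH_AI_ENDPOINTS_MODEL_URL",
    "OVH_AI_ENDPOINTS_MODEL_NAME",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank in `env`.

    `env` defaults to the process environment; passing a plain dict makes
    the check easy to try out. This helper is illustrative only.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no argument inspects the current process environment; an empty list means all three variables are present.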

## The chatbot

To test the fine-tuned models, you can use the chatbot in the [chatbot](./chatbot) folder.

**⚠️ This is a simple chatbot for testing purposes only; it is not meant for production use. ⚠️**

The chatbot is packaged with Docker and can be built with the provided [Dockerfile](./chatbot/Dockerfile): `cd ./chatbot && docker buildx build --platform="linux/amd64" -t <id>/fine-tune-chatbot:1.0.0 .`
You can run the chatbot using:
- your local Python installation: `cd ./chatbot && pip install -r requirements.txt && python chatbot.py`
- your local Docker installation: `docker run -p 7860:7860 <id>/fine-tune-chatbot:1.0.0`
- using [OVHcloud AI Deploy](https://www.ovhcloud.com/fr/public-cloud/ai-deploy/):

You can then access the chatbot at `http://127.0.0.1:7860` or at the public URL provided by OVHcloud AI Deploy.

## The data generation

To train the model, you need data.
The data is generated from the OVHcloud AI Endpoints [official documentation](https://help.ovhcloud.com/csm/en-gb-documentation-public-cloud-ai-and-machine-learning-ai-endpoints?id=kb_browse_cat&kb_id=574a8325551974502d4c6e78b7421938&kb_category=ea1d6daa918a1a541e11d3d71f8624aa&spa=1).

Two Python scripts are provided:
- one to generate a valid dataset from the Markdown documentation: [DatasetCreation.py](./dataset/DatasetCreation.py)
- one to generate synthetic data from the previously generated data: [DatasetAugmentation.py](./dataset/DatasetAugmentation.py)

Once you have set the environment variables (see the Prerequisites section), you can run the scripts with Python: `python DatasetCreation.py`
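
To give a feel for what working from Markdown documentation can involve, here is a rough sketch that splits a Markdown document into sections based on its headings. It is an assumption about one plausible preprocessing step, not the code of `DatasetCreation.py`; the function name `split_by_headings` is hypothetical.

```python
import re

# ATX heading: 1 to 6 '#' characters, whitespace, then the title
# (optional closing '#' characters are dropped).
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")


def split_by_headings(markdown):
    """Split Markdown text into (title, body) pairs, one per heading.

    Text before the first heading is ignored, and lines inside fenced
    code blocks are never treated as headings.
    """
    sections = []
    title, body = None, []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # entering or leaving a code fence
        match = None if in_fence else HEADING.match(line)
        if match:
            if title is not None:
                sections.append((title, "\n".join(body).strip()))
            title, body = match.group(2), []
        elif title is not None:
            body.append(line)
    if title is not None:
        sections.append((title, "\n".join(body).strip()))
    return sections
```

Each `(title, body)` pair could then serve as the raw material for one question and answer, with the title rephrased into a real question.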

## Train the model

You need to create a notebook with the `ovhai` CLI:

To train the model, follow the steps in the [notebook](./notebook/axolto-llm-fine-tune-Meta-Llama-3.2-1B-instruct-ai-endpoints.ipynb) provided in the [notebook](./notebook/) folder.
You need to upload the previously generated data to the [ai-endpoints-doc](./notebook/ai-endpoints-doc/) folder.

With the markdown following, generate a JSON file composed as follows: a list named "messages" composed of tuples with a key "role" which can have the value "user" when it's the question and "assistant" when it's the response. To split the document, base it on the markdown chapter titles to create the question, seems like a good idea.
Keep the language English.
I don't need to know the code to do it but I want the JSON result file.
For the "user" field, don't just repeat the title but make a real question, for example "What are the requirements for OVHcloud AI Endpoints?"
Be sure to add OVHcloud with AI Endpoints so that it's clear that OVHcloud creates AI Endpoints.
Generate the entire JSON file.
An example of what it should look like: messages [{{"role":"user", "content":"What is AI Endpoints?"}}]
There must always be a question followed by an answer, never two questions or two answers in a row.
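
The constraints this prompt places on the generated JSON (roles limited to "user" and "assistant", and a question always followed by an answer) can be checked mechanically before training. The sketch below assumes the file's top level is an object of the form `{"messages": [...]}`, which the prompt suggests but does not state exactly; the function name `validate_messages` is hypothetical and not part of the repository.

```python
# Expected role order: a question first, then its answer, and so on.
ALLOWED_ROLES = ("user", "assistant")


def validate_messages(data):
    """Return a list of problems found in a {"messages": [...]} document.

    An empty list means the document alternates user/assistant messages,
    starts with a question, and ends with an answer.
    """
    messages = data.get("messages") if isinstance(data, dict) else None
    if not isinstance(messages, list) or not messages:
        return ['top level must be an object with a non-empty "messages" list']
    problems = []
    for index, message in enumerate(messages):
        expected = ALLOWED_ROLES[index % 2]  # user, assistant, user, ...
        if not isinstance(message, dict):
            problems.append(f"message {index}: not an object")
            continue
        role = message.get("role")
        if role != expected:
            problems.append(f"message {index}: expected role {expected!r}, got {role!r}")
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {index}: empty or missing content")
    if len(messages) % 2 != 0:
        problems.append("last question has no answer")
    return problems
```

Loading the generated file with `json.load` and passing the result to `validate_messages` gives a quick sanity check before the data is uploaded for training.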