Commit 1ed4f15

Merge pull request #250 from stochasticai/tushar/docs
Documentation revamping
2 parents c614304 + e8d948e commit 1ed4f15

56 files changed: +7785 −3129 lines

.github/stochastic_logo_dark.svg

Lines changed: 1144 additions & 12 deletions

.github/stochastic_logo_light.svg

Lines changed: 1243 additions & 12 deletions

README.md

Lines changed: 30 additions & 0 deletions
@@ -220,6 +220,36 @@ model = BaseModel.load("x/distilgpt2_lora_finetuned_alpaca")

<br>

## Supported Models
Below is a list of all the models supported via the `BaseModel` class of `xTuring` and their corresponding keys to load them.

| Model | Key |
| -- | -- |
| Bloom | bloom |
| Cerebras | cerebras |
| DistilGPT-2 | distilgpt2 |
| Falcon-7B | falcon |
| Galactica | galactica |
| GPT-J | gptj |
| GPT-2 | gpt2 |
| LLaMA | llama |
| LLaMA 2 | llama2 |
| OPT-1.3B | opt |

The keys above load the base variants of the LLMs. Use the templates below to get their `LoRA`, `INT8`, `INT8 + LoRA` and `INT4 + LoRA` versions.

| Version | Template |
| -- | -- |
| LoRA | <model_key>_lora |
| INT8 | <model_key>_int8 |
| INT8 + LoRA | <model_key>_lora_int8 |

**Note:** In order to load any model's __`INT4 + LoRA`__ version, you will need to use the `GenericLoraKbitModel` class from `xturing.models`. Below is how to use it:
```python
model = GenericLoraKbitModel('<model_path>')
```
The `model_path` can be replaced with your local directory or any HuggingFace library model like `facebook/opt-1.3b`.

## 📈 Roadmap
- [x] Support for `LLaMA`, `GPT-J`, `GPT-2`, `OPT`, `Cerebras-GPT`, `Galactica` and `Bloom` models
- [x] Dataset generation using self-instruction
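
The key templates in the README section above are purely mechanical; as an illustration, here is a hypothetical helper (not part of the `xTuring` API) that builds a variant key from a base key:

```python
# Suffixes for each variant, per the template table above.
# (INT4 + LoRA has no key template; it goes through GenericLoraKbitModel.)
VARIANT_SUFFIX = {
    "base": "",
    "lora": "_lora",
    "int8": "_int8",
    "lora_int8": "_lora_int8",
}

def model_key(base_key: str, variant: str = "base") -> str:
    """Build the key for a given model variant, e.g. 'llama' + LoRA -> 'llama_lora'."""
    return base_key + VARIANT_SUFFIX[variant]

print(model_key("llama", "lora"))       # llama_lora
print(model_key("distilgpt2", "int8"))  # distilgpt2_int8
```

The resulting string is what the table suggests passing to `BaseModel` in place of the plain model key.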

docs/docs/about.md

Lines changed: 0 additions & 17 deletions
This file was deleted.

docs/docs/advanced/_category_.json

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
{
  "label": "🧗🏻 Advanced Topics",
  "position": 3,
  "collapsed": true,
  "link": {
    "type": "doc",
    "id": "advanced"
  }
}

docs/docs/advanced/advanced.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
---
sidebar_position: 3
title: 🧗🏻 Advanced topics
description: Guide for people who want to customise xTuring even further.
---

import DocCardList from '@theme/DocCardList';


# 🧗🏻 Advanced Topics

<DocCardList />

docs/docs/advanced/anymodel.md

Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
---
title: 🌦️ Work with any model
description: Use self-instruction to generate a dataset
sidebar_position: 2
---

<!-- ## class `GenericModel` -->
<!-- ## Load Any Model via `GenericModel` wrapper -->
The `GenericModel` class makes it possible to test and fine-tune models which are not directly available via the `BaseModel` class. Apart from the base class, the classes below can be used to load models for memory-efficient computation:

| Class Name | Description |
| ---------- | ----------- |
| `GenericModel` | Loads the normal version of the model |
| `GenericInt8Model` | Loads the model ready to fine-tune in __INT8__ precision |
| `GenericLoraModel` | Loads the model ready to fine-tune using the __LoRA__ technique |
| `GenericLoraInt8Model` | Loads the model ready to fine-tune using the __LoRA__ technique in __INT8__ precision |
| `GenericLoraKbitModel` | Loads the model ready to fine-tune using the __LoRA__ technique in __INT4__ precision |
<!-- Let us circle back to the above example and see how we can replicate the results of the `BaseModel` class as shown [here](/overview/quickstart/load_save_models). -->

<!-- Start by downloading the Alpaca dataset from [here](https://d33tr4pxdm6e2j.cloudfront.net/public_content/tutorials/datasets/alpaca_data.zip) and extract it to a folder. We will load this dataset using the `InstructionDataset` class. -->

<!-- ```python
from xturing.datasets import InstructionDataset

dataset_path = './alpaca_data'

dataset = InstructionDataset(dataset_path)
``` -->


To initialize the model, simply run the following commands:
```python
from xturing.models import GenericLoraModel

model_path = 'aleksickx/llama-7b-hf'

model = GenericLoraModel(model_path)
```
The `model_path` can be a locally saved model or any model available on HuggingFace's [Model Hub](https://huggingface.co/models).

To fine-tune the model on a dataset, we will use the default fine-tuning configuration.

```python
model.finetune(dataset=dataset)
```

To see how to load a pre-defined dataset, go [here](/overview/quickstart/prepare), and to see how to generate a dataset, refer to [this](/advanced/generate) page.

Let's test our fine-tuned model and run some inference.

```python
output = model.generate(texts=["Why LLM models are becoming so important?"])
```
We can print the `output` variable to see the results.

Next, we need to save our fine-tuned model using the `.save()` method. We pass the path of a directory as a parameter to the method to save the fine-tuned model.

```python
model.save('/path/to/a/directory/')
```

We can also see our model(s) in action with a beautiful UI by launching the playground locally.

```python
from xturing.ui.playground import Playground

Playground().launch()
```

<!-- ## GenericModel classes
The `GenericModel` classes consist of:
1. `GenericModel`
2. `GenericInt8Model`
3. `GenericLoraModel`
4. `GenericLoraInt8Model`
5. `GenericLoraKbitModel`

The pieces of code below will work for all of the above classes by replacing `GenericModel` with any of the other classes. They are very similar to the code presented above, with only slight differences.

### 1. Load a pre-trained and/or fine-tuned model

To load a pre-trained (or fine-tuned) model, run the following line of code. This will load the model with the default weights in the case of a pre-trained model, and the saved weights in the case of a fine-tuned one.
```python
from xturing.models import GenericModel

model = GenericModel("<model_path>")
'''
The <model_path> can be the path to a local model, for example, "./saved_model", or a path from the HuggingFace library, for example, "facebook/opt-1.3b"

For example,
model = GenericModel('./saved_model')
OR
model = GenericModel('facebook/opt-1.3b')
'''
```

### 2. Save a fine-tuned model

After fine-tuning your model, you can save it as simply as:

```python
model.save("/path/to/a/directory")
```

Remember that the path you specify should be a directory. If the directory doesn't exist, it will be created.

The model weights are saved in 2 files. The whole model weights, including base model parameters and LoRA parameters, are stored in the `pytorch_model.bin` file, and only the LoRA parameters are stored in the `adapter_model.bin` file.


<details>
<summary> <h3> Examples to load fine-tuned and pre-trained models</h3> </summary>

1. To load a pre-trained model

```python
## Make the necessary imports
from xturing.models import GenericModel

## Loading the model
model = GenericModel("facebook/opt-1.3b")

## Saving the model
model.save("/path/to/a/directory")
```

2. To load a fine-tuned model
```python
## Make the necessary imports
from xturing.models import GenericModel

## Loading the model
model = GenericModel("./saved_model")

```

</details>

## Inference via `GenericModel`

Once you have fine-tuned your model, you can run inference as simply as follows.

### Using a local model

Start by loading your model from a checkpoint after fine-tuning it.

```python
# Make the necessary imports
from xturing.models import GenericModel
# Load the desired model
model = GenericModel("/path/to/local/model")
```

Next, we can run inference on our model using the `.generate()` method.

```python
# Make inference
output = model.generate(texts=["Why are the LLMs so important?"])
# Print the generated outputs
print("Generated output: {}".format(output))
```
### Using a pretrained model

Start by loading your model with the default weights.

```python
# Make the necessary imports
from xturing.models import GenericModel
# Load the desired model
model = GenericModel("llama_lora")
```

Next, we can run inference on our model using the `.generate()` method.

```python
# Make inference
output = model.generate(texts=["Why are the LLMs so important?"])
# Print the generated outputs
print("Generated output: {}".format(output))
``` -->

docs/docs/inference/api_server.md renamed to docs/docs/advanced/api_server.md

Lines changed: 12 additions & 5 deletions
@@ -1,19 +1,24 @@
 ---
-title: FastAPI server
+title: ⚡️ FastAPI server
 description: FastAPI inference server
 sidebar_position: 3
 ---

-Once you have fine-tuned your model, you can run the inference using a FastAPI server.
+# ⚡️ Running model inference with a FastAPI server

-### 1. Launch API server from CLI
+<!-- Once you have fine-tuned your model, you can run the inference using a FastAPI server. -->
+After successfully fine-tuning your model, you can perform inference using a FastAPI server. The following steps guide you through launching and using the API server for your fine-tuned model.
+
+### 1. Launch the API server from the Command Line Interface (CLI)
+
+To start the API server, run the following command:

 ```sh
-xturing api -m "/path/to/the/model"
+$ xturing api -m "/path/to/the/model"
 ```

 :::info
-Model path should be a directory containing a valid `xturing.json` config file.
+Ensure that the model path you provide is a directory containing a valid `xturing.json` configuration file.
 :::

 ### 2. Health check API
@@ -69,3 +74,5 @@ Model path should be a directory containing a valid `xturing.json` config file.
   "response": ["JP Morgan is multinational investment bank and financial service headquartered in New York city."]
 }
 ```
+
+By following these steps, you can run your fine-tuned model for text generation through the FastAPI server, with structured requests and responses.
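
To sketch how a client might consume the response shown above (the `response` field name comes from the sample; the endpoint path and request schema are hidden in this diff, so none are assumed here):

```python
import json

# Sample response body from the generation endpoint, as shown in the diff above.
raw = ('{"response": ["JP Morgan is multinational investment bank and '
       'financial service headquartered in New York city."]}')

# The server returns a JSON object whose "response" field is a list of
# generated strings, one entry per input prompt.
data = json.loads(raw)
generated = data["response"][0]
print(generated)
```

A real client would obtain `raw` as the body of an HTTP response from the running server instead of a hard-coded string.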

0 commit comments
