To understand how it is applied, explore the [Llama LoRA INT4 working example](examples/features/int4_finetuning/LLaMA_lora_int4.ipynb).

For deeper insight, consider examining the [GenericModel working example](examples/features/generic/generic_model.py) available in the repository.

<br>
<br>
## 🚀 Quickstart

```python
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Load the dataset
instruction_dataset = InstructionDataset("./alpaca_data")

# Initialize the model
model = BaseModel.create("llama_lora")

# Fine-tune the model
model.finetune(dataset=instruction_dataset)

# Perform inference
output = model.generate(texts=["Why LLM models are becoming so important?"])

print("Generated output by the model: {}".format(output))
```

You can find the data folder [here](examples/models/llama/alpaca_data).
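For orientation, Alpaca-style instruction datasets store records with `instruction`, `input`, and `output` fields (field names as in the original Stanford Alpaca release; the exact record below is an illustrative assumption, not taken from the data folder):

```python
import json

# A minimal Alpaca-style record with the three standard fields
record = {
    "instruction": "Translate the sentence to French.",
    "input": "Hello, world!",
    "output": "Bonjour, le monde !",
}

# Serialize and parse one record, as a .jsonl file would store it
line = json.dumps(record)
parsed = json.loads(line)
```

Each line of a `.jsonl` dataset file holds one such JSON object.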
- [Preparing your dataset](examples/datasets/preparing_your_dataset.py)
- [Cerebras-GPT fine-tuning with LoRA and INT8](examples/models/cerebras/cerebras_lora_int8.ipynb) ([Open in Colab](https://colab.research.google.com/drive/1eKq3oF7dnK8KuIfsTE70Gvvniwr1O9D0?usp=sharing))
- [Cerebras-GPT fine-tuning with LoRA](examples/models/cerebras/cerebras_lora.ipynb) ([Open in Colab](https://colab.research.google.com/drive/1VjqQhstm5pT4EjPjx4Je7b3W2X1V3vDo?usp=sharing))
- [LLaMA fine-tuning with LoRA and INT8](examples/models/llama/llama_lora_int8.py) ([Open in Colab](https://colab.research.google.com/drive/1SQUXq1AMZPSLD4mk3A3swUIc6Y2dclme?usp=sharing))
- [LLaMA fine-tuning with LoRA](examples/models/llama/llama_lora.py)
- [LLaMA fine-tuning](examples/models/llama/llama.py)
- [GPT-J fine-tuning with LoRA and INT8](examples/models/gptj/gptj_lora_int8.py) ([Open in Colab](https://colab.research.google.com/drive/1hB_8s1V9K4IzifmlmN2AovGEJzTB1c7e?usp=sharing))
- [GPT-J fine-tuning with LoRA](examples/models/gptj/gptj_lora.py)
- [GPT-2 fine-tuning with LoRA](examples/models/gpt2/gpt2_lora.py) ([Google Drive](https://drive.google.com/file/d/1Sh-ocNpKn9pS7jv6oBb_Q8DitFyj1avL/view?usp=sharing))
<br>
## 📊 Performance

Here is a comparison of the performance of different fine-tuning techniques on the LLaMA 7B model. We use the [Alpaca dataset](examples/models/llama/alpaca_data/) for fine-tuning; it contains 52K instructions.
Here is a brief guide on how to navigate the examples quickly and efficiently, and get your hands dirty with `xTuring`.

## Directory structure

```
examples/
| datasets/
| features/
    | dataset_generation/
    | evaluation/
    | generic/
    | int4_finetuning/
| models/
| playground_ui/
```

### datasets/

This directory consists of multiple ways to generate your custom dataset from a given set of examples.

### features/

This directory consists of examples highlighting specific major features of the library, which can be replicated with any LLM you want. For example, in `dataset_generation/` you will find an example of how to generate a custom dataset from a `.jsonl` file, and in `evaluation/` you will find a specific example of how to evaluate your fine-tuned model, which can then be extended to any LLM and any dataset.

### models/

This directory consists of examples specific to each model mentioned.

### playground_ui/

This directory consists of an example demonstrating how you can play around with your LLM through a web interface.

## Models

Below is a list of all the models supported via the `BaseModel` class of `xTuring` and their corresponding keys to load them.

| Model | Key |
| -- | -- |
| Bloom | bloom |
| Cerebras | cerebras |
| DistilGPT-2 | distilgpt2 |
| Falcon-7B | falcon |
| Galactica | galactica |
| GPT-J | gptj |
| GPT-2 | gpt2 |
| LLaMA | llama |
| LLaMA 2 | llama2 |
| OPT-1.3B | opt |

The above mentioned are the base variants of the LLMs. Below are the templates to get their `LoRA`, `INT8`, and `INT8 + LoRA` versions; for `INT4 + LoRA`, see the note below.

| Version | Template |
| -- | -- |
| LoRA | `<model_key>_lora` |
| INT8 | `<model_key>_int8` |
| INT8 + LoRA | `<model_key>_lora_int8` |

**Note:** To load any model's `INT4 + LoRA` version, you need to use the `GenericLoraKbitModel` class from `xturing.models`. Below is how to use it:

```python
from xturing.models import GenericLoraKbitModel

model = GenericLoraKbitModel('<model_path>')
```

The `<model_path>` can be replaced with your local directory or any Hugging Face Hub model, such as `facebook/opt-1.3b`.
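The key templates in the table above are plain string suffixes appended to a base key. As an illustration only (this helper is hypothetical and not part of `xTuring`), the mapping can be sketched as:

```python
def model_key(base_key: str, version: str = "base") -> str:
    """Build a model key from a base key and a version name,
    following the template table above."""
    suffixes = {
        "base": "",                  # e.g. "llama"
        "lora": "_lora",             # e.g. "llama_lora"
        "int8": "_int8",             # e.g. "llama_int8"
        "int8_lora": "_lora_int8",   # e.g. "llama_lora_int8"
    }
    return base_key + suffixes[version]
```

For instance, `model_key("gptj", "int8_lora")` yields `"gptj_lora_int8"`, which is the kind of key passed to `BaseModel.create`.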