23 changes: 21 additions & 2 deletions docs/how-to/llms/local.md
@@ -17,7 +17,7 @@ First, set up your environment to use a Hugging Face model.
```python

import os
-from dbally.llms.localllm import LocalLLM
+from dbally.llms.local import LocalLLM

os.environ["HUGGINGFACE_API_KEY"] = "your-api-key"

@@ -34,6 +34,8 @@ response = await my_collection.ask("Which LLM should I use?")

## Advanced Usage

### Customizing LLM options

For advanced users, you can customize your LLM using [`LocalLLMOptions`](../../reference/llms/local.md#dbally.llms.clients.local.LocalLLMOptions). Here is a list of available parameters:

- `repetition_penalty`: *float or null (optional)* - Penalizes repeated tokens to discourage repetition.
@@ -48,7 +50,7 @@ For advanced users, you can customize your LLM using [`LocalLLMOptions`](../../r

```python
import dbally
-from dbally.llms.clients.localllm import LocalLLMOptions
+from dbally.llms.clients.local import LocalLLMOptions

llm = LocalLLM("meta-llama/Meta-Llama-3-8B-Instruct", default_options=LocalLLMOptions(temperature=0.7))
my_collection = dbally.create_collection("my_collection", llm)
@@ -63,4 +65,21 @@ response = await my_collection.ask(
temperature=0.65,
),
)
```

### Using LoRA Adapters

To use a LoRA adapter with `LocalLLM`, pass the `adapter_name` parameter when creating the instance. It can be either the ID of a model repo on the Hugging Face Hub that hosts a PEFT configuration, or a path to a local directory containing a PEFT configuration file.

```python

import os
from dbally.llms.local import LocalLLM

os.environ["HUGGINGFACE_API_KEY"] = "your-api-key"

llm = LocalLLM(
model_name="meta-llama/Meta-Llama-3-8B-Instruct",
adapter_name="path/to/your/adapter"
)
```
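
If the adapter is hosted on the Hugging Face Hub, you can pass its repo ID instead; a sketch using a hypothetical repo ID:

```python
llm = LocalLLM(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    adapter_name="your-username/your-lora-adapter",  # hypothetical Hub repo ID
)
```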
80 changes: 80 additions & 0 deletions finetuning/README.md
@@ -0,0 +1,80 @@
# How-To: Fine-tune IQL LLM

This section provides a step-by-step guide to fine-tuning an LLM for IQL generation.

## Prerequisites

Before you start, install the required dependencies for fine-tuning LLMs.

```bash
pip install "dbally[finetuning]"
```

## Customizing the fine-tuning

You can customize various aspects of the fine-tuning by modifying the config files stored in `finetuning/dbally_finetuning/configs`.

Here is an example structure of the `config.yaml` file:
```yaml
name:
defaults:
- model: <model-name>
- train_params: <train-params-name>
- lora_params: <lora-params-name>
- qlora_params: <qlora-params-name>
- _self_

dataset: <dataset-name>

use_lora: <true/false>
use_qlora: <true/false>

output_dir: <output-directory>
seed: <random-seed>
env_file_path: <path-to-env-file>
overwrite_output_dir: <true/false>
neptune_enabled: <true/false>
```

The key sections you might want to adjust are `model`, `train_params`, `lora_params` and `qlora_params`.

### Training parameters (`train_params`)

The `train_params` section should correspond to [`TrainingArguments`](https://huggingface.co/docs/transformers/en/main_classes/trainer#transformers.TrainingArguments). These parameters control the training process, including learning rate, batch size, number of epochs, and more.
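
For reference, here is an excerpt from the baseline config (`configs/train_params/baseline.yaml`); any other `TrainingArguments` field can be set the same way:

```yaml
learning_rate: 2e-05
per_device_train_batch_size: 8
num_train_epochs: 2
lr_scheduler_type: "cosine"
bf16: true
gradient_checkpointing: true
evaluation_strategy: "steps"
eval_steps: 40
```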

### LoRA parameters (`lora_params`)

The `lora_params` section should correspond to [`PeftConfig`](https://huggingface.co/docs/peft/en/package_reference/config#peft.PeftConfig). These parameters control the Low-Rank Adaptation (LoRA) configuration, which helps in fine-tuning large language models efficiently.
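
For example, the baseline LoRA config (`configs/lora_params/baseline.yaml`):

```yaml
r: 64
lora_alpha: 16
lora_dropout: 0.1
bias: "none"
task_type: "CAUSAL_LM"
```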

### QLoRA parameters (`qlora_params`)

The `qlora_params` section should correspond to [`BitsAndBytesConfig`](https://huggingface.co/docs/transformers/main_classes/quantization#transformers.BitsAndBytesConfig). These parameters control the Quantized LoRA (QLoRA) configuration, which allows for training and inference with quantized weights, reducing memory usage and computational requirements.
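
And the baseline QLoRA config (`configs/qlora_params/baseline.yaml`):

```yaml
load_in_4bit: true
bnb_4bit_use_double_quant: true
bnb_4bit_quant_type: "nf4"
```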

### Model configuration (`model`)

This section defines the model architecture and related parameters; a complete example follows the list. Key elements to include are:

- `name`: The name or path of the pre-trained model to fine-tune, such as "meta-llama/Meta-Llama-3-8B-Instruct".
- `lora_target_modules`: List of model modules to which LoRA will be applied, for example, ["q_proj", "k_proj", "v_proj", "o_proj"].
- `torch_dtype`: Data type for model parameters during training, such as bfloat16.
- `context_length`: Maximum context length for the model, e.g., 2048.
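
Putting these together, the baseline Llama 3 model config looks like this:

```yaml
name: meta-llama/Meta-Llama-3-8B-Instruct
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
torch_dtype: bfloat16
context_length: 2048
```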

## Using Neptune for Experiment Tracking

[Neptune](https://neptune.ai/) helps in tracking and logging your experiment metrics, parameters, and other metadata in a centralized location. To enable experiment tracking with Neptune, you need to configure the necessary environment variables.

Ensure you have the following environment variables set:

- `NEPTUNE_API_TOKEN`: Your Neptune API token.
- `NEPTUNE_PROJECT`: The name of your Neptune project.

You can set these variables in your environment or load them from a `.env` file (the path is given by `env_file_path` in `config.yaml`).
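
For example, an env file with placeholder values (the filename must match `env_file_path`, e.g. `x.env`):

```bash
NEPTUNE_API_TOKEN=your-neptune-api-token
NEPTUNE_PROJECT=your-workspace/your-project
```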

## Running the Script

Execute the script from the root directory using the following command:

```bash
PYTHONPATH=finetuning python finetuning/dbally_finetuning/train.py
```

This command runs the fine-tuning process with the specified configuration; the output is written to a timestamped subdirectory of the configured `output_dir`.
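
Because the entry point is a Hydra app, individual config values can also be overridden from the command line; a sketch assuming standard Hydra override syntax:

```bash
PYTHONPATH=finetuning python finetuning/dbally_finetuning/train.py use_qlora=false seed=1234
```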
0 changes: 0 additions & 0 deletions finetuning/dbally_finetuning/__init__.py
Empty file.
46 changes: 46 additions & 0 deletions finetuning/dbally_finetuning/callbacks/neptune_callback.py
@@ -0,0 +1,46 @@
import os
from typing import List, Optional, Tuple

from loguru import logger
from omegaconf import DictConfig
from transformers.integrations import NeptuneCallback


def get_neptune_token_and_project() -> Tuple[Optional[str], Optional[str]]:
    """
    Loads the Neptune API token and project name from environment variables.

    Returns:
        Neptune token and project name values.
    """

neptune_token = os.getenv("NEPTUNE_API_TOKEN")
neptune_project_name = os.getenv("NEPTUNE_PROJECT")

    if neptune_token is None:
        logger.info("NEPTUNE_API_TOKEN not found in the environment.")

    if neptune_project_name is None:
        logger.info("NEPTUNE_PROJECT not found in the environment.")

return neptune_token, neptune_project_name


def create_neptune_callback(config: DictConfig, tags: Optional[List[str]] = None) -> Optional[NeptuneCallback]:
"""
Args:
config: DictConfig with experiment configuration.
tags: Optional tags to be stored in experiments metadata.

Returns:
Neptune Callback.
"""

    neptune_token, neptune_project_name = get_neptune_token_and_project()

if neptune_token is not None and neptune_project_name is not None:
neptune_callback = NeptuneCallback(project=neptune_project_name, api_token=neptune_token, tags=tags)
neptune_callback.config = config
return neptune_callback
logger.warning("Neptune environment variables not set properly. Neptune won't be used for this experiment.")
return None
18 changes: 18 additions & 0 deletions finetuning/dbally_finetuning/configs/config.yaml
@@ -0,0 +1,18 @@
name:
defaults:
- model: llama-3-8b-instruct
- train_params: baseline
- lora_params: baseline
- qlora_params: baseline
- _self_

dataset: dsai-alicja-kotyla/text-to-iql-v2

use_lora: true
use_qlora: true

output_dir: reports
seed: 1234
env_file_path: "x.env"
overwrite_output_dir: true
neptune_enabled: false
5 changes: 5 additions & 0 deletions finetuning/dbally_finetuning/configs/lora_params/baseline.yaml
@@ -0,0 +1,5 @@
r: 64
lora_alpha: 16
lora_dropout: 0.1
bias: "none"
task_type: "CAUSAL_LM"
4 changes: 4 additions & 0 deletions finetuning/dbally_finetuning/configs/model/llama-3-8b-instruct.yaml
@@ -0,0 +1,4 @@
name: meta-llama/Meta-Llama-3-8B-Instruct
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
torch_dtype: bfloat16
context_length: 2048
3 changes: 3 additions & 0 deletions finetuning/dbally_finetuning/configs/qlora_params/baseline.yaml
@@ -0,0 +1,3 @@
load_in_4bit: true
bnb_4bit_use_double_quant: true
bnb_4bit_quant_type: "nf4"
20 changes: 20 additions & 0 deletions finetuning/dbally_finetuning/configs/train_params/baseline.yaml
@@ -0,0 +1,20 @@
learning_rate: 2e-05
per_device_train_batch_size: 8
gradient_accumulation_steps: 1
num_train_epochs: 2
lr_scheduler_type: "cosine"
logging_steps: 10
bf16: true
fp16: false
gradient_checkpointing: true
logging_strategy: "steps"
max_steps: -1
output_dir: "output"
seed: 42
warmup_steps: 24
save_strategy: "epoch"
save_total_limit: -1
do_eval: true
evaluation_strategy: "steps"
eval_steps: 40
per_device_eval_batch_size: 8
23 changes: 23 additions & 0 deletions finetuning/dbally_finetuning/constants.py
@@ -0,0 +1,23 @@
import enum
from typing import Dict

import torch


class DataType(enum.Enum):
"""
    Enum representing the torch.dtype values used to load Hugging Face models.
"""

FLOAT16 = "float16"
FLOAT32 = "float32"
BFLOAT16 = "bfloat16"


DTYPES_MAPPING: Dict[DataType, torch.dtype] = {
DataType.FLOAT16: torch.float16,
DataType.FLOAT32: torch.float32,
DataType.BFLOAT16: torch.bfloat16,
}

DATASET_TEXT_FIELD = "messages"
7 changes: 7 additions & 0 deletions finetuning/dbally_finetuning/paths.py
@@ -0,0 +1,7 @@
"""Module to store useful paths."""
from pathlib import Path

import dbally_finetuning

PATH_SRC = Path(dbally_finetuning.__file__).parent
PATH_CONFIG = PATH_SRC / "configs"
49 changes: 49 additions & 0 deletions finetuning/dbally_finetuning/preprocessing/preprocessor.py
@@ -0,0 +1,49 @@
from typing import Optional

from datasets import Dataset
from dbally_finetuning.constants import DATASET_TEXT_FIELD
from dbally_finetuning.prompt import IQL_GENERATION_TEMPLATE, IQLGenerationPromptFormat
from transformers import PreTrainedTokenizer

from dbally.prompt.template import PromptTemplate


class Preprocessor:
"""Interface for preprocessor."""

def __init__(
self, tokenizer: PreTrainedTokenizer, prompt_template: Optional[PromptTemplate[IQLGenerationPromptFormat]]
):
self.tokenizer: PreTrainedTokenizer = tokenizer
self._prompt_template = prompt_template or IQL_GENERATION_TEMPLATE

def _process_example(self, example: dict):
prompt_format = IQLGenerationPromptFormat(
question=example["question"],
iql_context=example["iql_context"],
iql=example["iql"],
)
formatted_prompt = self._prompt_template.format_prompt(prompt_format)

return formatted_prompt.chat

def process(
self,
dataset: Dataset,
) -> Dataset:
"""
Returns the dataset with the tokenized input for model.

Args:
dataset: Dataset.

Returns:
Dataset.
"""

processed_input = [self._process_example(example) for example in dataset]

processed_input = self.tokenizer.apply_chat_template(
processed_input, tokenize=False, add_generation_prompt=False
)
return Dataset.from_dict({DATASET_TEXT_FIELD: processed_input})
54 changes: 54 additions & 0 deletions finetuning/dbally_finetuning/prompt.py
@@ -0,0 +1,54 @@
# pylint: disable=C0301

from typing import List

from dbally.prompt.template import PromptFormat, PromptTemplate


class IQLGenerationPromptFormat(PromptFormat):
"""
    IQL prompt format, providing a question, IQL context and target IQL for the conversation.
"""

def __init__(
self,
*,
question: str,
iql: str,
iql_context: List[str],
) -> None:
"""
Constructs a new IQLGenerationPromptFormat instance.

Args:
question: Question to be asked.
            iql: Target IQL query, used as the assistant message.
            iql_context: List of available IQL methods (e.g. filters or actions) to be used in the prompt.
"""
super().__init__()
self.question = question
        self.iql_context = "\n".join(str(item) for item in iql_context)
self.iql = iql


IQL_GENERATION_TEMPLATE = PromptTemplate[IQLGenerationPromptFormat](
[
{
"role": "system",
"content": "You have access to API that lets you query a database:\n"
"\n{iql_context}\n"
"Please suggest which one(s) to call and how they should be joined with logic operators (AND, OR, NOT).\n"
"Remember! Don't give any comments, just the function calls.\n"
"The output will look like this:\n"
'filter1("arg1") AND (NOT filter2(120) OR filter3(True))\n'
"DO NOT INCLUDE arguments names in your response. Only the values.\n"
"You MUST use only these methods:\n"
"\n{iql_context}\n"
"It is VERY IMPORTANT not to use methods other than those listed above."
"If you DON'T KNOW HOW TO ANSWER DON'T SAY \"\", SAY: `UNSUPPORTED QUERY` INSTEAD! "
"This is CRUCIAL, otherwise the system will crash.",
},
{"role": "user", "content": "{question}"},
{"role": "assistant", "content": "{iql}"},
]
)
20 changes: 20 additions & 0 deletions finetuning/dbally_finetuning/train.py
@@ -0,0 +1,20 @@
# pylint: disable=C0116
import os
from datetime import datetime

import hydra
from dbally_finetuning.paths import PATH_CONFIG
from dbally_finetuning.trainer.iql_trainer import IQLTrainer


@hydra.main(config_name="config", config_path=str(PATH_CONFIG), version_base=None)
def main(config):
output_dir = os.path.join(config.output_dir, datetime.now().strftime("%Y-%m-%d_%H-%M-%S"))
os.makedirs(output_dir, exist_ok=True)

iql_trainer = IQLTrainer(config, output_dir)
iql_trainer.finetune()


if __name__ == "__main__":
main() # pylint: disable=E1120