components aoai_finetuning

github-actions[bot] edited this page Apr 25, 2024 · 12 revisions

AOAI Finetuning Job

aoai_finetuning

Overview

Uploads data to an Azure OpenAI (AOAI) resource, finetunes a model on it, and then deletes the uploaded data.

Version: 0.0.2

View in Studio: https://ml.azure.com/registries/azureml/components/aoai_finetuning/version/0.0.2

Inputs

| Name | Description | Type | Default | Optional | Enum |
| --- | --- | --- | --- | --- | --- |
| endpoint_name | The endpoint name or AOAI resource name. | string | | False | |
| endpoint_resource_group | Resource group for the AOAI resource. | string | | True | |
| endpoint_subscription | Subscription for the AOAI resource. | string | | True | |
| training_file_path | JSONL source file/folder for the training dataset. | uri_file | | False | |
| validation_file_path | Source file/folder for the validation dataset. | uri_file | | True | |
| model | GPT model engine. | string | gpt-35-turbo-0613 | False | |
| task_type | Dataset type: chat, completion, or embedding. | string | | False | ['chat', 'completion', 'embedding'] |
| n_epochs | Number of training epochs. If not provided, it will be determined dynamically based on the input data. | integer | | True | |
| batch_size | Global batch size. If not provided, it will be determined dynamically based on the input data. | integer | | True | |
| learning_rate_multiplier | The learning rate multiplier to use for training. If not provided, it will be determined dynamically based on the input data. | number | | True | |
| suffix | A string of up to 18 characters that will be added to your fine-tuned model name. | string | | True | |
| export_merged_weights | Whether to also output the merged weights of the model. Default is false. | boolean | | True | |
| completion_override | Whether to override the task type to completion. Default is false. | boolean | | True | |
| full_finetune | Whether to perform full finetuning. Default is false. | boolean | | True | |
| lora_v2 | Whether to use LoRA V2. Default is false. | boolean | | True | |
| lora_dimensions | The size of the LoRA dimensions in the self-attention layer. If not provided, it will be determined dynamically. | integer | | True | |
| context_window | Context length of the model. If not provided, the context window will be determined dynamically. | integer | | True | |
| file_spm_rate | File spm rate. Must be in the range [0, 1]. | number | | True | |
| weight_decay_multiplier | Weight decay multiplier for training. Not applicable to embedding finetuning. | number | | True | |
| prompt_loss_weight | Loss weight defined on the prompt (i.e. user message). Note that the loss weight defined on the completion (i.e. assistant message) is always 1.0. | number | | True | |
| trim_mode | Trim method if the data is longer than the context window. | string | | True | ['left', 'right', 'discard'] |
| check_point_interval | Checkpointing frequency based on steps. Applicable only to embedding finetuning. | integer | | True | |
| num_steps | Total training steps. Applicable only to embedding finetuning. | integer | | True | |
| shuffle_type | Shuffle type for the input train dataset. Buffer means shuffle with a small buffer. | string | | True | ['none', 'full', 'buffer'] |
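
The constraints listed in the Inputs table (required inputs, enum values, the 18-character suffix limit, and the [0, 1] range for file_spm_rate) can be checked locally before a job is submitted. The following is a minimal sketch in plain Python. `validate_inputs` and the constant names are hypothetical helpers written for illustration; they are not part of the component or of any Azure SDK, and the component may apply further checks of its own.

```python
from typing import Any, Dict, List

# Constraints transcribed from the Inputs table above.
# "model" is non-optional but has a default (gpt-35-turbo-0613),
# so it is not treated as something the caller must supply.
REQUIRED_WITHOUT_DEFAULT = ("endpoint_name", "training_file_path", "task_type")
ENUMS = {
    "task_type": ("chat", "completion", "embedding"),
    "trim_mode": ("left", "right", "discard"),
    "shuffle_type": ("none", "full", "buffer"),
}
MAX_SUFFIX_LEN = 18


def validate_inputs(inputs: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []

    # Inputs marked Optional=False with no default must be present.
    for name in REQUIRED_WITHOUT_DEFAULT:
        if inputs.get(name) in (None, ""):
            problems.append(f"missing required input: {name}")

    # Enum-typed inputs must use one of the listed values when given.
    for name, allowed in ENUMS.items():
        value = inputs.get(name)
        if value is not None and value not in allowed:
            problems.append(f"{name}={value!r} is not one of {list(allowed)}")

    # suffix: a string of up to 18 characters.
    suffix = inputs.get("suffix")
    if suffix is not None and len(suffix) > MAX_SUFFIX_LEN:
        problems.append(f"suffix is longer than {MAX_SUFFIX_LEN} characters")

    # file_spm_rate: must lie in [0, 1].
    rate = inputs.get("file_spm_rate")
    if rate is not None and not 0 <= rate <= 1:
        problems.append("file_spm_rate must be between 0 and 1")

    return problems


# Example values are placeholders, not real resource names.
example = {
    "endpoint_name": "my-aoai-resource",
    "training_file_path": "train.jsonl",
    "task_type": "chat",
    "suffix": "support-bot",
}
for problem in validate_inputs(example):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.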

Outputs

| Name | Description | Type |
| --- | --- | --- |
| aoai_finetuning_output | Contains finetuned model id in output file in JSON/custom class format | uri_file |
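
A downstream step might read the output file along these lines. This is only a sketch under assumptions: it assumes the file is plain JSON (the table also mentions a custom class format, which this would not handle), and this page does not document the file's schema. The helper name `read_finetune_output` and the key `finetuned_model_id` used in the stand-in file are hypothetical, chosen purely for illustration.

```python
import json
import tempfile
from pathlib import Path


def read_finetune_output(path):
    """Hypothetical helper: load the aoai_finetuning_output file as JSON."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


# Stand-in file for demonstration only. The real schema is not documented
# on this page, so "finetuned_model_id" is an assumed key name.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "aoai_finetuning_output.json"
    demo_path.write_text(
        json.dumps({"finetuned_model_id": "placeholder"}), encoding="utf-8"
    )
    output = read_finetune_output(demo_path)
    print(sorted(output))  # inspect which keys are actually present
```

Inspecting the keys first, rather than indexing a fixed name, avoids depending on a schema that may differ between component versions.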

Environment

azureml://registries/azureml-staging/environments/aoai-data-upload-finetune/versions/3
