Closed
Description
I'm not sure whether this is an issue or not.
The training data in the CSV file contains two columns, "act" and "prompt", but before the data is tokenized and passed in for training, the two columns are merged into a single column. Why?
I thought there should be an "inputs" column, i.e. the short initial prompt such as "act as a trainer", and a separate "labels" column holding the detailed output prompt.
Could you explain the rationale for merging the two columns, as in the function below? How does training work if there is only a SINGLE input?
def concatenate_columns_prompt(dataset):
    def concatenate(example):
        example['prompt'] = "Act as a {}. Prompt: {}".format(example['act'], example['prompt'])
        return example
    dataset = dataset.map(concatenate)
    return dataset

Thanks.
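To make the question concrete, here is a minimal sketch of what the merge does to a single row, and of how one text column could still supply both inputs and targets. It assumes the model is trained as a causal language model with next-token prediction, which the repository does not confirm here. The sample row, the whitespace "tokenizer", and the `next_token_pairs` helper are hypothetical stand-ins for illustration only.

```python
def concatenate(example):
    # Same merge as in the snippet above, applied to a copy of a plain dict.
    example = dict(example)
    example["prompt"] = "Act as a {}. Prompt: {}".format(example["act"], example["prompt"])
    return example


def next_token_pairs(text):
    # Hypothetical helper: whitespace "tokens" stand in for a real tokenizer.
    tokens = text.split()
    # Under next-token prediction, the target at each position is the token that follows it.
    return list(zip(tokens[:-1], tokens[1:]))


# Hypothetical sample row with the two CSV columns.
row = {"act": "trainer", "prompt": "Design a weekly workout plan."}
merged = concatenate(row)
print(merged["prompt"])

# Show the first few (context token, target token) pairs from the single merged string.
for context_token, target_token in next_token_pairs(merged["prompt"])[:3]:
    print(context_token, "->", target_token)
```

If that assumption holds, the short "act" part and the detailed prompt sit in the same sequence, and the targets are simply that sequence shifted by one position, so a separate "labels" column would not be needed.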