
Commit a795199

Update tokenizer parameter in sfttrainer across multiple examples (#2664)
* REFAC Update tokenizer parameter to processing_class in SFTTrainer instances across multiple examples
* REFAC Replace tokenizer parameter with processing_class in Trainer instances across documentation and examples
* Refactor tokenizer parameter to processing_class in various examples
  - Updated the Trainer initialization in corda_finetuning.py to use processing_class instead of tokenizer.
  - Changed the execution_count to null in image_classification_peft_lora.ipynb.
  - Modified the tokenizer parameter to processing_class in image_classification_peft_lora.ipynb.
  - Adjusted the tokenizer parameter to processing_class in peft_bnb_whisper_large_v2_training.ipynb.
  - Updated the README.md in lorafa_finetune to reflect the change from tokenizer to processing_class in Trainer initialization.
* REFAC Update tokenizer parameter to processing_class in Seq2SeqTrainer instantiation
* REFAC Replace tokenizer parameter with processing_class in README and notebook examples
1 parent f650b08 commit a795199
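
For context, `processing_class` is the current name of the argument on `transformers.Trainer` (and trainers built on it, such as TRL's `SFTTrainer` and `Seq2SeqTrainer`) that used to be called `tokenizer`; the old name is deprecated but still accepted with a warning in recent releases. A minimal sketch of the updated call, using placeholder model and dataset names rather than any specific example from this repo:

```python
# Minimal sketch of the pattern applied throughout this commit.
# "facebook/opt-350m" and the imdb dataset are placeholders, not taken from the repo.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
dataset = load_dataset("imdb", split="train[:1%]")

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="sft-output", max_steps=10),
    train_dataset=dataset,
    processing_class=tokenizer,  # formerly: tokenizer=tokenizer (deprecated)
)
trainer.train()
```

With a sufficiently recent transformers/TRL version, only the keyword changes; everything else in these examples stays the same.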

20 files changed (+23, −23 lines)


docs/source/accelerate/deepspeed.md

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ The first thing to know is that the script uses DeepSpeed for distributed traini
 # trainer
 trainer = SFTTrainer(
     model=model,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
     args=training_args,
     train_dataset=train_dataset,
     eval_dataset=eval_dataset,

docs/source/accelerate/fsdp.md

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ The first thing to know is that the script uses FSDP for distributed training as
 # trainer
 trainer = SFTTrainer(
     model=model,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
     args=training_args,
     train_dataset=train_dataset,
     eval_dataset=eval_dataset,

docs/source/conceptual_guides/oft.md

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ trainer = SFTTrainer(
     model=model,
     train_dataset=ds['train'],
     peft_config=peft_config,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
     args=training_arguments,
     data_collator=collator,
 )

docs/source/quicktour.md

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ trainer = Trainer(
     args=training_args,
     train_dataset=tokenized_datasets["train"],
     eval_dataset=tokenized_datasets["test"],
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
     data_collator=data_collator,
     compute_metrics=compute_metrics,
 )

docs/source/task_guides/lora_based_methods.md

Lines changed: 1 addition & 1 deletion
@@ -281,7 +281,7 @@ trainer = Trainer(
     args,
     train_dataset=train_ds,
     eval_dataset=val_ds,
-    tokenizer=image_processor,
+    processing_class=image_processor,
     data_collator=collate_fn,
 )
 trainer.train()
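
As this hunk illustrates, `processing_class` is not restricted to tokenizers; image processors (and multimodal processors) can be passed as well, which is why the rename also touches the vision examples. A minimal sketch, assuming a ViT checkpoint that is not part of this commit:

```python
# Minimal sketch: passing an image processor as processing_class.
# The checkpoint name is an assumption for illustration only.
from transformers import AutoImageProcessor, AutoModelForImageClassification, Trainer, TrainingArguments

checkpoint = "google/vit-base-patch16-224-in21k"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(checkpoint, num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="vit-demo"),
    processing_class=image_processor,  # saved alongside checkpoints, just like a tokenizer
)
```

Datasets and a collator would be supplied exactly as in the task-guide diff above; only the argument name changes.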

examples/bone_finetuning/README.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ trainer = SFTTrainer(
     model=peft_model,
     args=training_args,
     train_dataset=dataset,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
 )
 trainer.train()
 peft_model.save_pretrained("bone-llama-2-7b")

examples/bone_finetuning/bone_finetuning.py

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ class ScriptArguments(SFTConfig):
     model=peft_model,
     args=script_args,
     train_dataset=dataset,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
 )
 trainer.train()
 trainer.save_state()

examples/conditional_generation/peft_prompt_tuning_seq2seq_with_generate.ipynb

Lines changed: 2 additions & 2 deletions
@@ -447,7 +447,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": null,
    "id": "6b3a4090",
    "metadata": {
     "ExecuteTime": {
@@ -567,7 +567,7 @@
    ")\n",
    "trainer = Seq2SeqTrainer(\n",
    "    model=model,\n",
-   "    tokenizer=tokenizer,\n",
+   "    processing_class=tokenizer,\n",
    "    args=training_args,\n",
    "    train_dataset=train_dataset,\n",
    "    eval_dataset=eval_dataset,\n",

examples/corda_finetuning/README.md

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ trainer = SFTTrainer(
     model=peft_model,
     args=training_args,
     train_dataset=dataset,
-    tokenizer=tokenizer,
+    processing_class=tokenizer,
 )
 trainer.train()
 peft_model.save_pretrained("corda-llama-2-7b")

examples/corda_finetuning/corda_finetuning.py

Lines changed: 1 addition & 1 deletion
@@ -266,7 +266,7 @@ def train():
         "train_dataset": train_dataset,
         "data_collator": data_collator,
     }
-    trainer = Trainer(model=model, tokenizer=tokenizer, args=script_args, **data_module)
+    trainer = Trainer(model=model, processing_class=tokenizer, args=script_args, **data_module)
     trainer.train()
     trainer.save_state()
     model.save_pretrained(os.path.join(script_args.output_dir, "ft"))
