Missing data preprocessing file and bug in the MRC training task #4

@PhongNTDo

Hi.
I am trying to fine-tune the ViDeBERTa model for an MRC task on the ViQuAD dataset. However, the file Finetuning/QA/extractive-qa-mrc/utils/preprocess.py, which the provided code relies on, is missing.

Screenshot from 2023-04-24 17-53-23

As a workaround, I loaded the data with the load_dataset function from the datasets library instead, and got the error below when training the model with this code:

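from transformers import RobertaForQuestionAnswering, Trainer, TrainingArguments

# batch_size, tokenizer, tokenized_train and tokenized_valid are defined earlier in my script.
model_checkpoint = "Fsoft-AIC/videberta-base"
model = RobertaForQuestionAnswering.from_pretrained(model_checkpoint)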

model_name = model_checkpoint.split("/")[-1]
args = TrainingArguments(
    f"{model_name}-finetuned-quad2.0",
    num_train_epochs=2.0,
    evaluation_strategy = "epoch",
    learning_rate=2e-5,
    warmup_ratio=0.05,
    weight_decay=0.01,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    load_best_model_at_end=True,
    save_strategy="epoch",
    save_total_limit=5,
    # do_train = True,
    # do_eval = False,
    #change the number of training epochs to get a better result
    #push_to_hub=True,
)

from transformers import default_data_collator
data_collator = default_data_collator

trainer = Trainer(
    model,
    args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_valid,
    data_collator=data_collator,
    tokenizer=tokenizer,
)

Screenshot from 2023-04-24 17-57-53
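
For reference, since preprocess.py is unavailable and its contents are unknown, here is a minimal sketch of the span-labelling step that extractive QA preprocessing usually performs. It is an assumption about what such a script might cover, not the repository's actual code. The function name char_span_to_token_span and the toy offsets are hypothetical; the offsets and sequence ids imitate what a fast tokenizer returns with return_offsets_mapping=True, and index 0 is assumed to be the [CLS] position used for answers that fall outside the window.

```python
from typing import List, Optional, Tuple


def char_span_to_token_span(
    offsets: List[Tuple[int, int]],
    sequence_ids: List[Optional[int]],
    answer_start: int,
    answer_text: str,
) -> Tuple[int, int]:
    """Map an answer's character span to (start_token, end_token) indices.

    Returns (0, 0) when the answer is not fully covered by the context
    tokens (for example, when the context was truncated).
    """
    answer_end = answer_start + len(answer_text)

    # Tokens with sequence id 1 belong to the context, not the question.
    context_idx = [i for i, s in enumerate(sequence_ids) if s == 1]
    if not context_idx:
        return 0, 0

    first, last = context_idx[0], context_idx[-1]
    if offsets[first][0] > answer_start or offsets[last][1] < answer_end:
        # Answer lies outside this window: label it with index 0.
        return 0, 0

    # First context token that ends after the answer begins.
    start_token = next(i for i in context_idx if offsets[i][1] > answer_start)
    # Last context token that begins before the answer ends.
    end_token = next(i for i in reversed(context_idx) if offsets[i][0] < answer_end)
    return start_token, end_token


# Toy example: question "Which city?" and context "Hanoi is the capital".
# Layout: [CLS] Which city ? [SEP] Hanoi is the capital [SEP]
toy_offsets = [(0, 0), (0, 5), (6, 10), (10, 11), (0, 0),
               (0, 5), (6, 8), (9, 12), (13, 20), (0, 0)]
toy_sequence_ids = [None, 0, 0, 0, None, 1, 1, 1, 1, None]

span = char_span_to_token_span(toy_offsets, toy_sequence_ids, 9, "the capital")
print(span)
```

If the start_positions and end_positions fed to the Trainer are built this way, every label stays within the tokenized sequence length, which is one thing worth checking when a question answering model fails during training.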

I would appreciate any help in resolving this problem.
