
T0 (p=1) replicability #35

@tuhinjubcse

Hi @VictorSanh,

Thanks for releasing the code and data. I am trying to retrain the model in PyTorch.
I have some questions. In your paper you report results for p=1 vs. p=5.7.

For p=1, as I understand it, we take one random prompt per example of a dataset. That part is perfectly clear.

I have some doubts about the following points:

1) Sampling strategy: "proportional to the number of examples in each dataset (we treated any dataset with over 500'000 examples as having 500'000/num_templates examples)".
Does this mean that for big datasets like gigaword you include 422661 examples instead of 3803957?
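
To make the two possible readings of the quoted rule concrete, here is a minimal sketch. It is not the authors' code. The function names are hypothetical, and `num_templates=9` for gigaword is only an assumption inferred from the figures above (3803957 / 9 ≈ 422661). The first function follows the quoted wording literally (500'000 / num_templates), the second reproduces the 422661 figure (num_examples / num_templates).

```python
from typing import Dict

CAP = 500_000  # threshold quoted from the paper


def effective_size_literal(num_examples: int, num_templates: int, cap: int = CAP) -> int:
    """Literal reading: a dataset above the cap counts as cap / num_templates examples."""
    if num_examples > cap:
        return cap // num_templates
    return num_examples


def effective_size_divided(num_examples: int, num_templates: int, cap: int = CAP) -> int:
    """Alternative reading: a dataset above the cap counts as num_examples / num_templates."""
    if num_examples > cap:
        return num_examples // num_templates
    return num_examples


def mixing_rates(sizes: Dict[str, int]) -> Dict[str, float]:
    """Sampling probability of each dataset, proportional to its effective size."""
    total = sum(sizes.values())
    return {name: size / total for name, size in sizes.items()}


# gigaword with an assumed 9 templates (hypothetical value, see lead-in)
gigaword_literal = effective_size_literal(3_803_957, 9)
gigaword_divided = effective_size_divided(3_803_957, 9)
```

Under these assumptions the literal reading gives 55555 examples for gigaword and the alternative reading gives 422661, so the two interpretations lead to quite different mixing rates.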



2) The Hugging Face model card for T0 says "Fine-tuning steps: 12'200", but your script says
export TRAIN_STEPS=1112200. Do you know how many epochs you trained for?
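
The arithmetic behind this question can be sketched as follows. One possible reading, which is an assumption and not confirmed in this thread, is that `TRAIN_STEPS` is a cumulative step counter that includes the steps of the checkpoint being fine-tuned. The helper `approx_epochs` and its `examples_per_step` parameter are hypothetical: with packing, the number of examples consumed per step is not simply the batch size, so that value has to be measured or estimated.

```python
TRAIN_STEPS = 1_112_200      # value from the training script
FINE_TUNING_STEPS = 12_200   # value from the model card

# If TRAIN_STEPS is cumulative (assumption), fine-tuning would start at this step.
starting_step = TRAIN_STEPS - FINE_TUNING_STEPS


def approx_epochs(steps: int, examples_per_step: float, num_examples: int) -> float:
    """Rough epoch count: examples seen during training divided by dataset size."""
    return steps * examples_per_step / num_examples
```

Here `starting_step` evaluates to 1100000, so the two numbers are at least consistent with a 12'200-step fine-tuning run on top of an existing checkpoint.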



3) Can you tell me the total number of samples included for p=1, given the tasks ['commonsense_qa', 'dream', 'quail', 'quartz', 'social_i_qa', 'wiqa', 'cosmos_qa', 'qasc', 'quarel', 'sciq', 'wiki_hop', 'adversarial_qa_dbert', 'adversarial_qa_dbidaf', 'adversarial_qa_droberta', 'quoref', 'duorc_ParaphraseRC', 'duorc_SelfRC', 'ropes', 'wiki_qa', 'common_gen', 'wiki_bio', 'app_reviews', 'amazon_polarity', 'imdb', 'rotten_tomatoes', 'gigaword', 'cnn_dailymail', 'multi_news', 'samsum', 'xsum', 'ag_news', 'dbpedia_14', 'trec', 'paws_labeled_final', 'glue_mrpc', 'glue_qqp', 'yelp_review_full', 'kilt_tasks_hotpotqa']?

I get Num examples = 3068602. I obtained this by taking p=1 from each individual dataset and, for datasets bigger than 500k, dividing the number of samples by the number of prompts. If you have the data files for T0 (p=1) or (p=5.7), would you mind sharing them?



4) Example grouping: "We use packing to combine multiple training examples into a single sequence to reach the maximum sequence length." I am not sure what this means. Is it necessary, and how can we do it?
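
For readers unfamiliar with the term, the idea of packing can be sketched in a few lines of plain Python. This is a simplified greedy illustration, not the implementation used for T0. The function name `pack_examples` is hypothetical, `eos_id=1` is an assumption (it matches the T5 tokenizer's end-of-sequence id), and only one side (inputs) is shown, whereas an encoder-decoder setup would pack inputs and targets together. The segment ids are what allows an attention mask to keep packed examples from attending to each other.

```python
from typing import List, Sequence, Tuple


def pack_examples(
    examples: Sequence[Sequence[int]], max_len: int, eos_id: int = 1
) -> Tuple[List[List[int]], List[List[int]]]:
    """Greedily concatenate tokenized examples into sequences of at most max_len tokens.

    Returns (packed_tokens, packed_segments); segment ids start at 1 within each
    packed sequence and mark which original example every token came from.
    """
    packed_tokens: List[List[int]] = []
    packed_segments: List[List[int]] = []
    cur_tokens: List[int] = []
    cur_segments: List[int] = []

    for example in examples:
        # Truncate so that the example plus its EOS token fits in one sequence.
        tokens = list(example)[: max_len - 1] + [eos_id]
        # Start a new packed sequence if this example does not fit in the current one.
        if cur_tokens and len(cur_tokens) + len(tokens) > max_len:
            packed_tokens.append(cur_tokens)
            packed_segments.append(cur_segments)
            cur_tokens, cur_segments = [], []
        segment_id = cur_segments[-1] + 1 if cur_segments else 1
        cur_tokens.extend(tokens)
        cur_segments.extend([segment_id] * len(tokens))

    if cur_tokens:
        packed_tokens.append(cur_tokens)
        packed_segments.append(cur_segments)
    return packed_tokens, packed_segments


tokens, segments = pack_examples([[5, 6, 7], [8, 9], [10, 11, 12, 13]], max_len=8)
```

In this toy run the first two examples share one packed sequence and the third starts a new one. Packing is mainly an efficiency measure that reduces padding; training without it should still be possible, but the number of examples seen per step would then differ from a packed run.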
