Concrete example (training set): python3 gen_scores.py ./model/MI1_dropout_encodings_only/ preprocessing/data/subset-1/train-subset-1.json 2000 50000 (the last two arguments are the step interval and the starting step: this evaluates the model at steps 50000, 52000, 54000, ... up to the most recent one).
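To make the interval and start arguments concrete, here is a minimal sketch of which steps the example above would visit. It is not taken from gen_scores.py: the function name steps_to_evaluate and the latest_step value are hypothetical, used only for illustration.

```python
def steps_to_evaluate(start_step, interval, latest_step):
    """Return start_step, start_step + interval, ... not exceeding latest_step.

    Hypothetical helper: gen_scores.py may discover the most recent
    step differently (for instance from the files in the model folder).
    """
    return list(range(start_step, latest_step + 1, interval))


# Assuming the most recent saved step is 57000 (made-up value):
print(steps_to_evaluate(50000, 2000, 57000))
# → [50000, 52000, 54000, 56000]
```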
Concrete example (dev set): python3 gen_scores.py ./model/MI1_dropout_encodings_only/ preprocessing/data/dev-v2.0.json
The dataset file path needs to be something.json and have a corresponding something-tokenized.json for this script to work!
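If you want to check the naming requirement before starting a long run, something like the following sketch may help. The helper names tokenized_path and check_dataset_pair are hypothetical, and the sketch assumes the tokenized file sits in the same directory as the dataset file (as it does for train-subset-1 in this repo); gen_scores.py itself may resolve the path differently.

```python
from pathlib import Path


def tokenized_path(dataset_path):
    """Map something.json to something-tokenized.json in the same folder."""
    p = Path(dataset_path)
    if p.suffix != ".json":
        raise ValueError(f"expected a .json dataset file, got: {p.name}")
    # p.stem drops only the final ".json", so "dev-v2.0.json" -> "dev-v2.0"
    return p.with_name(p.stem + "-tokenized.json")


def check_dataset_pair(dataset_path):
    """Return True only if both the dataset and its tokenized twin exist."""
    p = Path(dataset_path)
    return p.is_file() and tokenized_path(p).is_file()


print(tokenized_path("preprocessing/data/dev-v2.0.json"))
```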
The script will generate a file scores_<datasetname>.log in the model folder, as well as two plots (EM and F1).
To copy the plots to your computer, run: scp -T guest@138.19.43.95:"'Documents/no_eating_no_drinking/model/MI1_dropout_encodings_only/plot_loss_vs_em_score(train-subset-1).png'" . && scp -T guest@138.19.43.95:"'Documents/no_eating_no_drinking/model/MI1_dropout_encodings_only/plot_loss_vs_f1_score(train-subset-1).png'" . (or the same commands with dev-v2 replacing train-subset-1).
Produce answer file for evaluation
Generate predictions on SQuAD dev set: python3 produce_answers.py model/2020-04-01_01-07-06/epoch0_batch791.par
Generate predictions on a different dataset: python3 produce_answers.py model/2020-04-01_01-07-06/epoch0_batch791.par preprocessing/data/subset-1/train-subset-1-tokenized.json [optional_prediction_file_path]
Run evaluation: python3 evaluate-v2.0.py preprocessing/data/subset-1/train-subset-1.json predictions.json
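For intuition about the two numbers the evaluation reports, here is a simplified sketch of exact match (EM) and token-level F1 for a single gold answer. It is written from the general definition of these metrics and is not the code of evaluate-v2.0.py, which additionally handles multiple gold answers and unanswerable questions; the function names are illustrative.

```python
import collections
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, pred):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(gold) == normalize(pred))


def f1(gold, pred):
    """Harmonic mean of token precision and recall after normalization."""
    gold_toks = normalize(gold).split()
    pred_toks = normalize(pred).split()
    if not gold_toks or not pred_toks:
        # An empty answer only counts as correct if both sides are empty.
        return float(gold_toks == pred_toks)
    common = collections.Counter(gold_toks) & collections.Counter(pred_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower."))  # → 1
print(f1("in Paris France", "Paris"))
```

In the second call the prediction is fully contained in the gold answer (precision 1) but covers only one of its three tokens (recall 1/3), so the F1 is partial while EM would be 0.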
Plot F1 score and loss together
First generate the scores log file using gen_scores.py (see separate instructions for that).