Description
I am unable to reproduce the MSE values in Appendix O of the paper. Here is the script I ran, which includes my config:
export CUDA_VISIBLE_DEVICES=$3
all_models=("FiLM" "DLinear" "Transformer" "Reformer" "Informer" "Autoformer" "FEDformer" "Nonstationary_Transformer" "Crossformer" "PatchTST" "iTransformer")
start_index=$1
end_index=$2
models=("${all_models[@]:$start_index:$end_index-$start_index+1}")
root_paths=("./data/Public_Health")
data_paths=("US_FLURATIO_Week.csv")
pred_lengths=(12 24 36 48)
seeds=(2021)
use_fullmodel=0
length=${#root_paths[@]}
for seed in "${seeds[@]}"
do
    for model_name in "${models[@]}"
    do
        for ((i=0; i<length; i++))
        do
            for pred_len in "${pred_lengths[@]}"
            do
                root_path=${root_paths[$i]}
                data_path=${data_paths[$i]}
                model_id=$(basename "${root_path}")
                echo "Running model $model_name with root $root_path, data $data_path, and pred_len $pred_len"
                python -u run.py \
                    --task_name long_term_forecast \
                    --is_training 1 \
                    --root_path "$root_path" \
                    --data_path "$data_path" \
                    --model_id "${model_id}_${seed}_24_${pred_len}_fullLLM_${use_fullmodel}" \
                    --model "$model_name" \
                    --data custom \
                    --features M \
                    --seq_len 24 \
                    --label_len 12 \
                    --pred_len "$pred_len" \
                    --des 'Exp' \
                    --seed "$seed" \
                    --type_tag "#F#" \
                    --text_len 4 \
                    --prompt_weight 0.1 \
                    --pool_type "avg" \
                    --save_name "results/result_health_gpt2_all.txt" \
                    --llm_model GPT2 \
                    --huggingface_token 'NA' \
                    --use_fullmodel "$use_fullmodel"
            done
        done
    done
done

The paper states that GPT2 was used for the experiments in Appendix O, but the provided sample script week_health.sh, which is meant for reproducing the experiment, uses BERT. I used GPT2, and my results are below:
Here is the percentage difference between my results and those reported in the paper (Appendix O, Table 14).
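For clarity, a minimal sketch of one way such a percentage difference can be computed, assuming it is taken relative to the paper's value, i.e. (mine - paper) / paper × 100. The helper name `pct_diff` and the numbers passed to it are placeholders for illustration only; they are not values from the paper or from my runs.

```shell
# pct_diff MINE PAPER
# Prints the signed percentage difference of MINE relative to PAPER.
# Assumption: the paper's value is the reference (denominator).
pct_diff() {
    LC_ALL=C awk -v mine="$1" -v paper="$2" \
        'BEGIN { printf "%.2f\n", (mine - paper) / paper * 100 }'
}

# Placeholder inputs, not real results.
pct_diff 1.10 1.00
```

A positive value means my MSE is higher (worse) than the reported one; a negative value means it is lower.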
Do my configs match what was used to produce the results in the paper? If not, any ideas about where I might have gone wrong? Thanks.
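In case the model selection matters: the script above picks models by index range from `all_models`, so a quick way to confirm which models a given `start_index`/`end_index` pair actually runs is the sketch below. It reuses the same slice expression as the script; the wrapper function `select_models` is a hypothetical helper added only for this check, and it requires bash (arrays are not available in plain `sh`).

```shell
#!/usr/bin/env bash
# Same model list as in the script above.
all_models=("FiLM" "DLinear" "Transformer" "Reformer" "Informer" "Autoformer" \
            "FEDformer" "Nonstationary_Transformer" "Crossformer" "PatchTST" "iTransformer")

# select_models START END (hypothetical helper)
# Fills the global array "models" with all_models[START..END], both ends inclusive,
# using the same slice expression as the original script.
select_models() {
    local start_index=$1 end_index=$2
    models=("${all_models[@]:$start_index:$end_index-$start_index+1}")
}

# Example: indices 0 through 2.
select_models 0 2
printf '%s\n' "${models[@]}"
```

Indices are zero-based and both ends are inclusive, so `0 2` selects FiLM, DLinear, and Transformer, and `0 10` selects all eleven models.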