Open
Description
I reported in another issue that the most recent pytorch-lightning does not work with lag-llama. I also tried several version combinations of pytorch, pytorch-lightning, and gluonts. Eventually I got the code to run for 385 epochs with the following requirements.txt:
orjson
torch==2.0.0
gluonts==0.13.5
pytorch-lightning==1.9.5
datasets
xformers
git+https://github.com/kashif/hopfield-layers@pytorch-2
etsformer-pytorch
reformer_pytorch
einops
opt_einsum
pykeops
scipy
apex
git+https://github.com/microsoft/torchscale
But the run still failed due to a divide-by-zero error in gluonts. Before I try more combinations, I thought it would be more efficient to ask here: could you share a working requirements.txt with the version numbers specified?
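To make environments easier to compare, here is a minimal sketch that prints the installed versions of the three packages pinned above. It uses only the standard library's `importlib.metadata`; the helper name `installed_versions` and the `PACKAGES` list are illustrative and not part of lag-llama.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as pinned in the requirements.txt above.
PACKAGES = ["torch", "gluonts", "pytorch-lightning"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in the current environment.
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}=={ver}" if ver is not None else f"{name}: not installed")
```

Running this inside the conda environment and posting the output alongside the error would show exactly which combination was in use.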
BTW, the error I got with my requirements.txt is:
Epoch 385: : 110it [00:23, 4.66it/s, loss=-0.64, v_num=0, val_loss=-.690, train_loss=-1.10]
Epoch 385, global step 38600: 'val_loss' was not in top 1
Epoch 385: : 110it [00:23, 4.65it/s, loss=-0.64, v_num=0, val_loss=-.690, train_loss=-1.10]
Use checkpoint: /home/lagllama_test/test/pytorch-transformer-ts/lag-llama/model-size-scaling-logs/0/experiments/lightning_logs/version_0/checkpoints/epoch=335-step=33600.ckpt
Predict on m4_weekly
m4_weekly prediction length: 13
Running evaluation: 0%| | 0/359 [00:00<?, ?it/s]
Running evaluation: 100%|██████████| 359/359 [00:00<00:00, 81024.28it/s]
logger.log_dir : /home/lagllama_test/test/pytorch-transformer-ts/lag-llama/model-size-scaling-logs/0/experiments/lightning_logs/version_0
os.path.exists(logger.log_dir) : True
Predict on traffic
traffic prediction length: 24
Running evaluation: 0%| | 0/6034 [00:00<?, ?it/s]
Running evaluation: 100%|██████████| 6034/6034 [00:00<00:00, 1090128.80it/s]
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/pandas/core/dtypes/astype.py:134: UserWarning: Warning: converting a masked element to nan.
return arr.astype(dtype, copy=True)
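For context, the message in the log is a `RuntimeWarning`, which NumPy emits when a float scalar is divided by zero; it does not raise an exception on its own. The sketch below reproduces that behaviour on made-up data. It assumes that ND is the total absolute error divided by the sum of absolute target values (the denominator is cut off in the log above, so this is a guess), and `normalized_deviation` is a hypothetical helper, not a gluonts function.

```python
import math
import warnings

import numpy as np


def normalized_deviation(abs_error, abs_target_sum):
    """Hypothetical guarded ND: returns NaN instead of dividing by zero."""
    if abs_target_sum == 0:
        return float("nan")
    return float(abs_error) / float(abs_target_sum)


# Made-up example: a target window that is all zeros.
target = np.zeros(24)
forecast = np.full(24, 0.5)

abs_error = np.abs(target - forecast).sum()
abs_target_sum = np.abs(target).sum()  # 0.0 for an all-zero window

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # NumPy float division by zero warns and yields inf rather than raising.
    unguarded = abs_error / abs_target_sum

guarded = normalized_deviation(abs_error, abs_target_sum)

print("unguarded:", unguarded, "warnings:", [str(w.message) for w in caught])
print("guarded:", guarded)
```

If the assumption about the denominator holds, an all-zero target window in the evaluation data would be enough to trigger the warning, independent of the package versions.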
Thanks a lot.
Yan