A working software environment for lag-llama

I reported in another issue that the most recent `pytorch-lightning` does not work with `lag-llama`. I also tried several version combinations of `pytorch`, `pytorch-lightning`, and `gluonts`. Eventually I got the code to run for 385 epochs with the following `requirements.txt`:

```
orjson
torch==2.0.0
gluonts==0.13.5
pytorch-lightning==1.9.5
datasets
xformers
git+https://github.com/kashif/hopfield-layers@pytorch-2
etsformer-pytorch
reformer_pytorch
einops
opt_einsum
pykeops
scipy
apex
git+https://github.com/microsoft/torchscale
```
But the run still failed due to a divide-by-zero error in `gluonts`. Before trying more combinations, I thought it would be more efficient to ask here: could you share a working `requirements.txt` with pinned version numbers?
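
For anyone who does have a working environment, here is a minimal sketch of one way to dump the exact installed versions in `requirements.txt` form. It uses only the standard library's `importlib.metadata`. The `PACKAGES` list and the `pinned_requirements` helper are hypothetical names for illustration; the package names are copied from my list above and are assumed to match the installed distribution names.

```python
from importlib import metadata

# Distribution names copied from the requirements.txt above (an assumption:
# the installed distribution names match these spellings). Extend as needed.
PACKAGES = [
    "torch",
    "gluonts",
    "pytorch-lightning",
    "datasets",
    "xformers",
    "einops",
    "scipy",
]


def pinned_requirements(packages):
    """Return 'name==version' lines for installed packages.

    Packages that are not installed are reported as comment lines
    instead of raising, so the output stays a valid requirements file.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name}: not installed")
    return lines


if __name__ == "__main__":
    print("\n".join(pinned_requirements(PACKAGES)))
```

Running `pip freeze` in the working environment would give similar information for every installed package; the sketch above only narrows it to the packages that matter here.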

BTW, the error I got with my `requirements.txt` is:
```
Epoch 385: : 110it [00:23,  4.66it/s, loss=-0.64, v_num=0, val_loss=-.690, train_loss=-1.10]Epoch 385, global step 38600: 'val_loss' was not in top 1

Epoch 385: : 110it [00:23,  4.65it/s, loss=-0.64, v_num=0, val_loss=-.690, train_loss=-1.10]
Use checkpoint: /home/lagllama_test/test/pytorch-transformer-ts/lag-llama/model-size-scaling-logs/0/experiments/lightning_logs/version_0/checkpoints/epoch=335-step=33600.ckpt
Predict on m4_weekly
m4_weekly prediction length: 13

Running evaluation:   0%|          | 0/359 [00:00<?, ?it/s]
Running evaluation: 100%|██████████| 359/359 [00:00<00:00, 81024.28it/s]
logger.log_dir :  /home/lagllama_test/test/pytorch-transformer-ts/lag-llama/model-size-scaling-logs/0/experiments/lightning_logs/version_0
os.path.exists(logger.log_dir) :  True
Predict on traffic
traffic prediction length: 24

Running evaluation:   0%|          | 0/6034 [00:00<?, ?it/s]
Running evaluation: 100%|██████████| 6034/6034 [00:00<00:00, 1090128.80it/s]
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
  metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
  metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
  metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/gluonts/evaluation/_base.py:422: RuntimeWarning: divide by zero encountered in scalar divide
  metrics["ND"] = cast(float, metrics["abs_error"]) / cast(
/home/lagllama_test/conda/envs/lagllama/lib/python3.10/site-packages/pandas/core/dtypes/astype.py:134: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
```
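
Note that the lines in the log above are `RuntimeWarning`s raised while computing `metrics["ND"]`, not a traceback. The denominator is cut off in the log, so the following is only a sketch under the assumption that ND divides the absolute error by the sum of absolute target values, which would be zero for an all-zero evaluation window. It reproduces the same kind of warning with NumPy scalars and shows a guarded variant; `safe_nd` is a hypothetical helper, not part of `gluonts`.

```python
import math
import warnings

import numpy as np


def safe_nd(abs_error, abs_target_sum):
    """Normalized deviation that returns NaN when the target sum is zero.

    Hypothetical helper for illustration only; it is not gluonts code.
    """
    if abs_target_sum == 0:
        return float("nan")
    return float(abs_error) / float(abs_target_sum)


# Dividing NumPy scalars by zero does not raise; it warns and yields inf.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unguarded = np.float64(3.0) / np.float64(0.0)

saw_runtime_warning = any(
    issubclass(w.category, RuntimeWarning) for w in caught
)

if __name__ == "__main__":
    print("unguarded result:", unguarded)
    print("RuntimeWarning seen:", saw_runtime_warning)
    print("guarded result:", safe_nd(3.0, 0.0))
```

If that assumption holds, the warning would point at series whose target values in the evaluation window are all zero, rather than at a version mismatch.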

Thanks a lot.

Yan

A working software environment for lag-llama #29
