Update torchtune pin to 0.4.0.dev20241010 #1300
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1300
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 6c97bd9 with merge base d1ab6e0.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Please look at the list after running the install script for ET as well, and compare to ensure we don't have conflicts.
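That comparison can be scripted rather than eyeballed. A minimal sketch (the helper names and the idea of diffing two `pip list --format=freeze` dumps taken before and after the ET install are mine, not from the repo):

```python
# Sketch: compare two `pip list --format=freeze` dumps and report packages
# whose pinned version changed between the two environments.

def parse_freeze(lines):
    """Map package name -> version from `pip list --format=freeze` lines."""
    pkgs = {}
    for line in lines:
        line = line.strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pkgs[name.lower()] = version
    return pkgs

def find_conflicts(before, after):
    """Return {name: (old, new)} for packages whose version differs."""
    return {
        name: (before[name], after[name])
        for name in before.keys() & after.keys()
        if before[name] != after[name]
    }
```

Any non-empty result from `find_conflicts` flags a package that one install script silently re-pinned over the other.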
joecummings left a comment:
LGTM
tl;dr: not a problem within this diff, but we'll need to look at ET's installation logic. After running through the ET installation, pip list shows the same version for torchtune. However, this reveals a deeper issue: ET is not checking for/using the ROCm (or CUDA) wheel and is force-installing the CPU versions of torch and other libs. If torchtune were included as a dep, we'd run into the same issue. I don't know if this is an easy fix, since ExecuTorch isn't targeted at GPU systems and should install the CPU version of torch. Warrants further discussion, but I think that's outside the scope of this PR.
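The force-install symptom described above can be detected from the local version tag on the installed torch wheel. A hedged sketch (the helper is illustrative; neither torchchat nor ExecuTorch ships it):

```python
def wheel_variant(version: str) -> str:
    """Classify a torch wheel by its PEP 440 local version tag, e.g.
    '2.6.0.dev20241010+rocm6.2' -> 'rocm', '2.5.0+cu124' -> 'cuda',
    '2.5.0+cpu' -> 'cpu'. Returns 'unknown' when no tag is present."""
    _, sep, local = version.partition("+")
    if not sep:
        return "unknown"
    if local.startswith("rocm"):
        return "rocm"
    if local.startswith("cu"):
        return "cuda"
    return "cpu" if local.startswith("cpu") else local
```

Running this on `torch.__version__` before and after the ET install would show whether the CPU wheel clobbered a previously installed ROCm or CUDA build.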
Co-authored-by: vmpuri <[email protected]>
* add pp_dim, distributed, num_gpus, num_nodes as cmd line args
* add tp_dim
* add elastic_launch
* working, can now launch from cli
* Remove numpy < 2.0 pin to align with pytorch (#1301): Fix #1296; align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5
* Update torchtune pin to 0.4.0-dev20241010 (#1300), Co-authored-by: vmpuri <[email protected]>
* Unbreak gguf util CI job by fixing numpy version (#1307): setting numpy version to the range required by gguf: https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/pyproject.toml
* Remove apparently-unused import torchvision in model.py (#1305), Co-authored-by: vmpuri <[email protected]>
* remove global var for tokenizer type + patch tokenizer to allow list of sequences
* make pp tp visible in interface
* Add llama 3.1 to dist_run.py
* [WIP] Move dist inf into its own generator
* Add initial generator interface to dist inference
* Added generate method and placeholder scheduler
* use prompt parameter for dist generation
* Enforce tp>=2
* Build tokenizer from TokenizerArgs
* Disable torchchat format + constrain possible models for distributed
* disable calling dist_run.py directly for now
* Restore original dist_run.py for now
* disable _maybe_parallelize_model again
* Reenable arg.model_name in dist_run.py
* Use singleton logger instead of print in generate
* Address PR comments; try/except in launch_dist_inference; added comments

Co-authored-by: lessw2020 <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>
Update torchtune pin to 0.4.0-dev20241010. This is the newest version visible to the ROCm wheel and fixes the installation script for AMD/ROCm systems (otherwise installation would break, since 0.3.0-dev20240928 isn't visible via the ROCm 6.2 wheel).
pip list after running install_requirements.sh successfully.
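Rather than scanning the full pip list output by hand, the pin can be verified with a small standard-library check. A sketch (the function name is mine; the expected version string comes from this PR's title):

```python
from importlib import metadata

def check_pin(package: str, expected: str) -> bool:
    """Return True iff `package` is installed at exactly `expected`."""
    try:
        return metadata.version(package) == expected
    except metadata.PackageNotFoundError:
        return False
```

For example, `check_pin("torchtune", "0.4.0.dev20241010")` after running install_requirements.sh would confirm the new pin took effect.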