Adding script for processing many intermediate checkpoints at once for offline evals by IanMagnusson · Pull Request #731 · allenai/OLMo

IanMagnusson · 2024-10-08T23:31:43Z

Making a draft PR for this so we can consider merging it into main. Merging would help us avoid version issues if we later train models that are not compatible with the version of the code forked here.

dirkgr

@soldni, don't you already have a checkpoint converter script that runs in Beaker?

dirkgr · 2024-10-25T23:32:24Z

.gitignore

@@ -1,3 +1,6 @@
+# beaker yaml
+guided-trout-2f805b9.yaml


What is this?

dirkgr · 2024-10-25T23:32:42Z

log.txt

@@ -0,0 +1,10 @@
+


Don't commit temp output.

jenahwang · 2024-10-26

Sorry, removed.

dirkgr · 2024-10-25T23:33:09Z

hf_olmo/convert_olmo_to_hf.py

    upload_local_checkpoint(local_checkpoint_dir, args.destination_dir)

    print(f"Converted checkpoint saved to {args.destination_dir}")
+    if args.cleanup_local_dir:


Is there ever a reason not to do this?

jenahwang · 2024-10-26

I removed the if statement and the flag.
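As an illustration of the unconditional cleanup agreed on above, here is a minimal sketch. It is not the code from this PR: `convert_and_upload`, `convert`, and `upload` are hypothetical names standing in for the script's real conversion and upload steps, and the scratch directory is assumed to be a local temporary directory.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def convert_and_upload(
    convert: Callable[[Path], None],
    upload: Callable[[Path], None],
) -> Path:
    """Run convert, then upload, in a scratch directory that is always removed.

    `convert` and `upload` are hypothetical callables, not functions from the
    OLMo repository. The returned path no longer exists when this returns.
    """
    local_dir = Path(tempfile.mkdtemp(prefix="converted-checkpoint-"))
    try:
        convert(local_dir)  # write the converted files into the scratch dir
        upload(local_dir)   # copy them to their final destination
    finally:
        # No flag and no `if`: cleanup runs on success and on failure alike.
        shutil.rmtree(local_dir, ignore_errors=True)
    return local_dir
```

Putting the removal in a `finally` block means a failed upload does not leave converted checkpoints behind on local disk, which matters when many checkpoints are processed in one run.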

dirkgr · 2024-10-25T23:33:38Z

requirements.txt

@@ -0,0 +1,7 @@
+torch


We don't use requirements.txt in OLMo. We use pyproject.toml.

jenahwang · 2024-10-26

Removed. It was created for troubleshooting.

jenahwang · 2024-10-26T01:35:01Z

> @soldni, don't you already have a checkpoint converter script that runs in Beaker?

He does. This script is very similar, but it focuses on batch conversion and accepts wildcards. It was written for the oe-eval consistent ranking project, with the goal of speeding up that project's pipeline.
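To make "batch conversion and wildcard acceptance" concrete, here is a minimal sketch of selecting checkpoints by wildcard before converting them in a loop. It is an assumption about the general approach, not the script in this PR: `select_checkpoints` is a hypothetical helper, and the checkpoint paths are made up for illustration.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_checkpoints(available: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the paths in `available` that match at least one wildcard pattern.

    The result keeps the order of `available`. Matching is case-sensitive,
    and `*` also matches across `/` separators.
    """
    patterns = list(patterns)
    return [
        path
        for path in available
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


# Hypothetical checkpoint paths, invented for this example.
available = [
    "runs/model-a/step1000-unsharded",
    "runs/model-a/step2000-unsharded",
    "runs/model-b/step1000-unsharded",
]
selected = select_checkpoints(available, ["runs/model-a/step*-unsharded"])
# `selected` now holds the two model-a checkpoints; a batch converter would
# loop over it and convert each one in turn.
```

Matching against a listing of paths, rather than globbing the local filesystem, keeps the sketch usable when checkpoints live in remote storage, provided the caller can list them first.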

soldni · 2024-10-26T02:50:08Z

@jenahwang, would it be possible to merge your changes into the other script, or to consolidate the two?

It's confusing to have two conversion scripts, and it doubles maintenance.

jenahwang added 30 commits September 6, 2024 09:55

batch convert checkpoint

ef7a31c

batch convert checkpoint

62d9a1d

batch convert checkpoint

dba011b

batch convert checkpoint

84a4f1b

batch convert checkpoint

9a7b03b

batch convert checkpoint

d4687e9

batch convert checkpoint

24ec144

batch convert checkpoint

6187889

batch convert checkpoint

fbfda0e

tinkering

a862a0b

testing

8d79a01

testing

b4ed78d

testing

2f2a764

testing

b9601c4

testing

8cc86ee

testing

50e7090

testing

8aa450f

testing

f357320

testing

02899a3

testing

9ac3739

convert checkpoint batch

ef0b403

convert checkpoint batch

c0ff186

convert checkpoint batch

15092ae

convert checkpoint batch

c489f53

convert checkpoint batch

2ed6a50

convert checkpoint batch

dd3dc18

convert checkpoint batch

9bc11d7

convert checkpoint batch

e08895f

convert checkpoint batch

0b82c2b

convert checkpoint batch

482a487

jenahwang added 8 commits October 11, 2024 14:19

Merge remote-tracking branch 'origin' into jena/consistent-ranking

3c808a8

.

da90ae2

.

447de12

addressing errors

3920f2e

error fixes for pr

acaccdd

fixing errors for pr

a4a40e1

fixing errors for pr

3d2bd32

fixing errors for pr

d529f5a

jenahwang requested a review from dirkgr October 11, 2024 23:40

jenahwang marked this pull request as ready for review October 11, 2024 23:40

Merge branch 'main' into jena/consistent-ranking

904ae26

dirkgr requested changes Oct 25, 2024


jenahwang added 2 commits October 25, 2024 17:57

removing temp outputs

4a32be0

fixes and cleanups

2dc26a9

jenahwang added 14 commits December 13, 2024 12:57

adding beaker-gantry to dependencies

378aafe

adding beaker-gantry to dependencies

69d12f3

adding beaker-gantry to dependencies

579d612

python version apparently has to be 3.10 above for olmo/util.py to run

3b9563d

err... no

4a882a3

err... no

6acbfcc

tinkering

6e67a9c

undoing changes

21193cb

fix

1b4da65

error code updated

7b6e37f

minor change to the error log

264ce05

fixed error

2154ea8

edited conversion error output

d4e1f42

edited conversion error output

f39a522

IanMagnusson wants to merge 93 commits into main from jena/consistent-ranking


4 participants
