Made model checkpoints more informative #107
Merged
avantikalal merged 5 commits into main, Mar 4, 2025
Conversation
Collaborator
Sorry if I missed it, is there a test that checks whether the genomic intervals are stored or not?
Collaborator
Author
I will add tests. Question: do you think we should store intervals of length …
Collaborator
|
Hmm that’s tricky. Maybe |
Collaborator
Author
@suragnair new commit:
suragnair approved these changes, Mar 4, 2025
This was referenced Mar 5, 2025
- Earlier, model checkpoints contained only the chromosomes used for training and validation. We now store the full genomic intervals.
- Added an option to write a checkpoint in LightningModel.test_on_dataset. If selected, test dataset parameters and test-set performance metrics will also be written to the checkpoint.
- Stored train, val, and test dataset parameters as nested dictionaries under model.data_params for readability.
- Demonstrated these changes in Tutorial 3.
This allows much better reproducibility as all models will be distributed along with their train/val/test intervals and their per-task performance. Users can then make sure they use the model only on tasks with good performance, and don't evaluate it on regions that overlap with the train/val intervals.
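To make the reproducibility benefit concrete, here is a minimal sketch of how a user might check a candidate region against the stored intervals before evaluating. The nested dictionary layout and the `intervals` key are assumptions for illustration, not the library's actual checkpoint schema.

```python
def overlaps(query, intervals):
    """Return True if query (chrom, start, end) overlaps any stored interval."""
    chrom, start, end = query
    # Half-open interval overlap: same chromosome, and the ranges intersect.
    return any(c == chrom and start < e and end > s for c, s, e in intervals)

# Hypothetical data_params layout, mimicking the nested dicts described above.
data_params = {
    "train": {"intervals": [("chr1", 0, 100_000), ("chr2", 0, 50_000)]},
    "val":   {"intervals": [("chr3", 0, 100_000)]},
    "test":  {"intervals": [("chr4", 0, 100_000)]},
}

leaky = ("chr1", 50_000, 51_000)     # falls inside a training interval
held_out = ("chr4", 10_000, 11_000)  # only in the test set

print(overlaps(leaky, data_params["train"]["intervals"]))     # True
print(overlaps(held_out, data_params["train"]["intervals"]))  # False
```

A region flagged as overlapping the train or val intervals should be excluded from any held-out evaluation.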