Early Stop - Model Not Trained Error #6

@dankunk

The early stop functionality during model training does not seem to work as expected in the workflow suggested in the ModelTraining tutorial.

When I run the train() function with the early stop parameters, I get the following error:

# Train
epgs.train(early_stop = True, patience = 5, min_delta = 0.01, verbose = True)

Obtaining dataloders ...
Generating sliding windows ...
100%|██████████| 12/12 [00:00<00:00, 14.56it/s]
Processed 12/12 recordings.
Signal processing method: raw | Scale: False
Class distribution (label:ratio): 1: 0.07, 2: 0.08, 4: 0.04, 5: 0.73, 6: 0.01, 7: 0.06, 8: 0.01
Labels map (from:to): {1: 0, 2: 1, 4: 2, 5: 3, 6: 4, 7: 5, 8: 6}
Train, validate, test set sizes: (65024, 26011, 10115)
Input shape: (256, 1, 1024)
Training...
Training:   0%|          | 1/200 [00:10<34:44, 10.48s/it]
Epoch [1/200] | Train loss: 0.4289 | Val. loss: 0.1468 | Train acc: 0.8706 | Val. acc: 0.9487
Training:   6%|| 11/200 [01:52<32:22, 10.28s/it]
Epoch [11/200] | Train loss: 0.0199 | Val. loss: 0.0351 | Train acc: 0.9939 | Val. acc: 0.9911
Training:   7%|| 14/200 [02:23<31:46, 10.25s/it]
Early stopping occured at epoch 20 after 5 epochs of changes less than 0.01 in validation accuracy. Validation loss: 0.0274
Training:   7%|| 14/200 [02:33<33:58, 10.96s/it]
Accuracy: 99.23, Average f1: 96.5
Class accuracy: {'NP': 99.46, 'C': 98.15, 'E1': 97.97, 'E2': 99.66, 'F': 98.98, 'G': 96.71, 'pd': 88.68}
Finished testing!

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[7], line 2
      1 # Train
----> 2 epgs.train(early_stop = True, patience = 5, min_delta = 0.01, verbose = True)

File c:\Users\danie\anaconda3\envs\discoepg\lib\site-packages\DiscoEPG\models\Segmentation.py:231, in EPGSegment.train(self, early_stop, patience, min_delta, verbose)
    229 # Test and save the checkpoint at early stopped epochs
    230 self.evaluate(task = 'test')
--> 231 self.save_checkpoint(f'early_stopped_{epoch}')
    232 self.train_result_['early_stopping_epoch'] = epoch
    233 _is_early_stopped = True

File c:\Users\danie\anaconda3\envs\discoepg\lib\site-packages\DiscoEPG\models\Segmentation.py:335, in EPGSegment.save_checkpoint(self, name, save_dir)
    334 def save_checkpoint(self, name: str = '', save_dir: str = ''):
--> 335     assert self.model_is_trained == True, 'Model is not trained.'
    336     if save_dir == '':
    337         save_dir = f'{self.root_dir}/checkpoints'

AssertionError: Model is not trained.

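The function finishes its testing, but save_checkpoint() asserts that self.model_is_trained is True, and at the early-stop point that flag apparently has not been set yet. I believe the assignment:

self.model_is_trained = True

needs to happen before this checkpoint call:

self.save_checkpoint(f'early_stopped_{epoch}')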
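
To illustrate the ordering problem, here is a minimal, self-contained sketch. ToySegmenter is a hypothetical stand-in and not DiscoEPG's actual EPGSegment; only the model_is_trained flag, the save_checkpoint() guard, and the checkpoint name come from the traceback above. The early-stop rule (stop after `patience` epochs without an improvement larger than `min_delta`) and the `fix` switch are assumptions made for the demo.

```python
class ToySegmenter:
    """Hypothetical stand-in for EPGSegment; not DiscoEPG's real code."""

    def __init__(self):
        self.model_is_trained = False
        self.saved = []  # names of checkpoints "written" so far

    def save_checkpoint(self, name: str = "") -> None:
        # Same guard as the one shown in the traceback.
        assert self.model_is_trained, "Model is not trained."
        self.saved.append(name)

    def train(self, val_accs, patience=5, min_delta=0.01, fix=True):
        """Walk over precomputed validation accuracies (assumed demo input)."""
        best, stale = float("-inf"), 0
        for epoch, acc in enumerate(val_accs, start=1):
            if acc - best > min_delta:
                best, stale = acc, 0  # meaningful improvement: reset counter
            else:
                stale += 1            # no meaningful improvement this epoch
            if stale >= patience:
                if fix:
                    # Proposed ordering: set the flag before checkpointing.
                    self.model_is_trained = True
                self.save_checkpoint(f"early_stopped_{epoch}")
                return epoch
        # Normal (non-early-stopped) end of training.
        self.model_is_trained = True
        return None


accs = [0.5, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]

fixed = ToySegmenter()
stopped_at = fixed.train(accs, fix=True)  # checkpoint is saved without error

broken = ToySegmenter()
try:
    broken.train(accs, fix=False)         # flag unset, so the guard fires
except AssertionError as err:
    print(f"Reproduced: {err}")
```

With fix=False the toy class hits the same assertion as in the report; with fix=True the early-stopped checkpoint is saved. Whether the real train() should set the flag at that exact point, or whether save_checkpoint() should relax its guard instead, is a design choice for the maintainers.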

I'll experiment with this for a bit and commit anything that seems to fix it. In the meantime, I'll leave early_stop set to False.

Labels: bug (Something isn't working)
