
Commit 5707bc5

Fix early stopping in converter patching + fix lr warmup for all tasks (#4131)
* fix converter and early stopping + fix warmup epochs
* fix linter
* fix linter2
* aligned default patience=10 for all tasks
* fix linte
* fix unit tests
* revert epoch to steps back, change templates
* fix cls templates
* fix unit test
* revert rotated det back.
* change schedule for classification
* fix linter
* update changelog
1 parent c6e2952 commit 5707bc5

85 files changed: 326 additions, 376 deletions


CHANGELOG.md (4 additions, 0 deletions)

```diff
@@ -73,6 +73,8 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/4035>)
 - Bump onnx to 1.17.0 to omit CVE-2024-5187
   (<https://github.com/openvinotoolkit/training_extensions/pull/4063>)
+- Decouple DinoV2 for semantic segmentation task
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4136>)
 
 ### Bug fixes
 
@@ -126,6 +128,8 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/4107>)
 - Fix empty annotation in tiling
   (<https://github.com/openvinotoolkit/training_extensions/pull/4124>)
+- Fix patching early stopping in tools/converter.py, update headers in templates, change training schedule for classification
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4131>)
 - Fix tensor type compatibility in dynamic soft label assigner and RTMDet head
   (<https://github.com/openvinotoolkit/training_extensions/pull/4140>)
```

src/otx/algo/callbacks/adaptive_early_stopping.py (1 addition, 1 deletion)

```diff
@@ -20,7 +20,7 @@ def __init__(
         self,
         monitor: str,
         min_delta: float = 0.0,
-        patience: int = 3,
+        patience: int = 10,
         verbose: bool = False,
         mode: str = "min",
         strict: bool = True,
```
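
The only functional change here is the default: the callback now waits 10 validation rounds without improvement instead of 3 before stopping. A minimal sketch of relying on the new default, assuming EarlyStoppingWithWarmup keeps the constructor semantics of Lightning's EarlyStopping (its signature above mirrors it):

```python
# Sketch: wiring the callback with the new default patience.
# `val/accuracy` is an illustrative monitor key; recipes set it via callback_monitor.
from lightning.pytorch import Trainer
from otx.algo.callbacks.adaptive_early_stopping import EarlyStoppingWithWarmup

early_stop = EarlyStoppingWithWarmup(
    monitor="val/accuracy",
    mode="max",
    # patience is omitted: it now defaults to 10 checks, up from 3
)
trainer = Trainer(max_epochs=90, callbacks=[early_stop])
```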

src/otx/core/model/base.py (1 addition, 1 deletion)

```diff
@@ -744,7 +744,7 @@ def lr_scheduler_step(self, scheduler: LRSchedulerTypeUnion, metric: Tensor) ->
             return super().lr_scheduler_step(scheduler=scheduler, metric=metric)
 
         if len(warmup_schedulers) != 1:
-            msg = "No more than two warmup schedulers coexist."
+            msg = "No more than one warmup schedulers coexist."
             raise RuntimeError(msg)
 
         warmup_scheduler = next(iter(warmup_schedulers))
```
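
The check itself is unchanged: exactly one warmup scheduler may be attached. Only the error message is corrected, since the old text claimed "two" while the guard fires for anything other than one. A standalone sketch of the invariant, with hypothetical names rather than OTX's exact code:

```python
from torch.optim.lr_scheduler import LRScheduler

def pick_single_warmup(schedulers: list[LRScheduler], warmup_cls: type) -> LRScheduler:
    """Return the single warmup scheduler, mirroring the guard in lr_scheduler_step."""
    warmup_schedulers = [s for s in schedulers if isinstance(s, warmup_cls)]
    if len(warmup_schedulers) != 1:
        msg = "No more than one warmup schedulers coexist."
        raise RuntimeError(msg)
    return next(iter(warmup_schedulers))
```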

src/otx/core/schedulers/warmup_schedulers.py (4 additions, 3 deletions)

```diff
@@ -19,8 +19,9 @@ class LinearWarmupScheduler(LambdaLR):
     """Linear Warmup scheduler.
 
     Args:
-        num_warmup_steps: Learning rate will linearly increased during the period same as this number.
-        warmup_interval: If "epoch", count the number of steps for the warmup period.
+        optimizer (Optimizer): Optimizer to apply the scheduler.
+        num_warmup_steps (int): Learning rate will linearly increased during the period same as this number.
+        interval (Literal["step", "epoch"]): If "epoch", count the number of epochs for the warmup period.
             Otherwise, the iteration step will be the warmup period.
     """
 
@@ -55,7 +56,7 @@ class LinearWarmupSchedulerCallable:
         main_scheduler_callable: Callable to create a LR scheduler that will be mainly used.
         num_warmup_steps: Learning rate will linearly increased during the period same as this number.
             If it is less than equal to zero, do not create `LinearWarmupScheduler`.
-        warmup_interval: If "epoch", count the number of steps for the warmup period.
+        warmup_interval: If "epoch", count the number of epochs for the warmup period.
             Otherwise, the iteration step will be the warmup period.
         monitor: If given, override the main scheduler's `monitor` attribute.
     """
```

src/otx/recipe/_base_/train.yaml (2 additions, 0 deletions)

```diff
@@ -39,6 +39,8 @@ callbacks:
     init_args:
       max_interval: 5
       decay: -0.025
+      min_earlystop_patience: 5
+      min_lrschedule_patience: 3
 logger:
   - class_path: lightning.pytorch.loggers.csv_logs.CSVLogger
     init_args:
```
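
This base recipe feeds OTX's adaptive train-scheduling callback (the owner of max_interval and decay), which derives patience values from dataset size. The two new keys put a floor under those derived values. A hypothetical sketch of the clamping they imply; the real heuristic lives in the callback and is not shown in this hunk:

```python
def apply_patience_floors(
    derived_patience: int,
    *,
    min_earlystop_patience: int = 5,   # mirrors the new YAML key
    min_lrschedule_patience: int = 3,  # mirrors the new YAML key
) -> tuple[int, int]:
    """Clamp adaptively derived patience so small datasets cannot stop too eagerly."""
    return (
        max(derived_patience, min_earlystop_patience),
        max(derived_patience, min_lrschedule_patience),
    )

print(apply_patience_floors(1))  # (5, 3)
```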

src/otx/recipe/anomaly_classification/stfpm.yaml (1 addition, 1 deletion)

```diff
@@ -16,7 +16,7 @@ overrides:
   precision: 32
   max_epochs: 100
   callbacks:
-    - class_path: lightning.pytorch.callbacks.EarlyStopping
+    - class_path: otx.algo.callbacks.adaptive_early_stopping.EarlyStoppingWithWarmup
       init_args:
         patience: 5
         mode: max
```

src/otx/recipe/anomaly_detection/stfpm.yaml (1 addition, 1 deletion)

```diff
@@ -21,7 +21,7 @@ overrides:
   precision: 32
   max_epochs: 100
   callbacks:
-    - class_path: lightning.pytorch.callbacks.EarlyStopping
+    - class_path: otx.algo.callbacks.adaptive_early_stopping.EarlyStoppingWithWarmup
       init_args:
         patience: 5
         mode: max
```

src/otx/recipe/anomaly_segmentation/stfpm.yaml (1 addition, 1 deletion)

```diff
@@ -16,7 +16,7 @@ overrides:
   precision: 32
   max_epochs: 100
   callbacks:
-    - class_path: lightning.pytorch.callbacks.EarlyStopping
+    - class_path: otx.algo.callbacks.adaptive_early_stopping.EarlyStoppingWithWarmup
       init_args:
         patience: 5
         mode: max
```
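
All three anomaly STFPM recipes make the same one-line swap: Lightning's stock EarlyStopping is replaced by OTX's EarlyStoppingWithWarmup, with patience 5 and mode max untouched. In Python terms (illustrative; Lightning's CLI builds these objects from the YAML, and the monitor key comes from each recipe's callback_monitor):

```python
from lightning.pytorch.callbacks import EarlyStopping
from otx.algo.callbacks.adaptive_early_stopping import EarlyStoppingWithWarmup

monitor = "image_F1Score"  # placeholder; the recipe's callback_monitor supplies this

before = EarlyStopping(monitor=monitor, patience=5, mode="max")           # old recipes
after = EarlyStoppingWithWarmup(monitor=monitor, patience=5, mode="max")  # this commit
```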

src/otx/recipe/classification/h_label_cls/deit_tiny.yaml (14 additions, 9 deletions)

```diff
@@ -10,12 +10,16 @@ model:
         weight_decay: 0.05
 
     scheduler:
-      class_path: lightning.pytorch.cli.ReduceLROnPlateau
+      class_path: otx.core.schedulers.LinearWarmupSchedulerCallable
       init_args:
-        mode: max
-        factor: 0.5
-        patience: 1
-        monitor: val/accuracy
+        num_warmup_steps: 0
+        main_scheduler_callable:
+          class_path: lightning.pytorch.cli.ReduceLROnPlateau
+          init_args:
+            mode: max
+            factor: 0.5
+            patience: 3
+            monitor: val/accuracy
 
 engine:
   task: H_LABEL_CLS
@@ -26,11 +30,12 @@ callback_monitor: val/accuracy
 data: ../../_base_/data/classification.yaml
 overrides:
   max_epochs: 90
-  callbacks:
-    - class_path: otx.algo.callbacks.adaptive_early_stopping.EarlyStoppingWithWarmup
-      init_args:
-        patience: 3
 
   data:
     task: H_LABEL_CLS
     data_format: datumaro
+
+  callbacks:
+    - class_path: otx.algo.callbacks.adaptive_early_stopping.EarlyStoppingWithWarmup
+      init_args:
+        patience: 5
```
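
Two things change for classification: the plateau scheduler is nested inside LinearWarmupSchedulerCallable, and its patience rises from 1 to 3 (early-stopping patience from 3 to 5). With num_warmup_steps: 0, the wrapper creates no warmup scheduler at all, per the docstring earlier in this commit, so the observable effect is the longer plateau patience plus a scheduler layout now shared with the other tasks. A sketch of the resulting behaviour using torch's ReduceLROnPlateau (the Lightning CLI variant only adds the monitor field):

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

optimizer = torch.optim.SGD(torch.nn.Linear(4, 2).parameters(), lr=0.01)
plateau = ReduceLROnPlateau(optimizer, mode="max", factor=0.5, patience=3)

# Accuracy stalls after the first epoch; lr is halved once patience=3 is exhausted.
for val_accuracy in [0.80, 0.80, 0.80, 0.80, 0.80]:
    plateau.step(val_accuracy)
print(optimizer.param_groups[0]["lr"])  # 0.005
```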

src/otx/recipe/classification/h_label_cls/efficientnet_b0.yaml (12 additions, 7 deletions)

```diff
@@ -11,12 +11,16 @@ model:
         weight_decay: 0.0001
 
     scheduler:
-      class_path: lightning.pytorch.cli.ReduceLROnPlateau
+      class_path: otx.core.schedulers.LinearWarmupSchedulerCallable
       init_args:
-        mode: max
-        factor: 0.5
-        patience: 1
-        monitor: val/accuracy
+        num_warmup_steps: 0
+        main_scheduler_callable:
+          class_path: lightning.pytorch.cli.ReduceLROnPlateau
+          init_args:
+            mode: max
+            factor: 0.5
+            patience: 3
+            monitor: val/accuracy
 
 engine:
   task: H_LABEL_CLS
@@ -29,11 +33,12 @@ overrides:
   reset:
     - data.train_subset.transforms
 
-  max_epochs: 90
   callbacks:
     - class_path: otx.algo.callbacks.adaptive_early_stopping.EarlyStoppingWithWarmup
       init_args:
-        patience: 3
+        patience: 5
+
+  max_epochs: 90
 
   data:
     task: H_LABEL_CLS
```
