Commit 3ee899c

Julien Maille authored
Merge Activation classes into one, added tanh (#315)
* Merge Activation classes into one, added tanh
* update docstring
* minor typos

Co-authored-by: Julien Maille <[email protected]>
1 parent 0d997f4 commit 3ee899c

File tree

16 files changed: +41 -59 lines changed


segmentation_models_pytorch/base/modules.py

Lines changed: 3 additions & 1 deletion
@@ -89,14 +89,16 @@ def __init__(self, name, **params):
             self.activation = nn.Softmax(**params)
         elif name == 'logsoftmax':
             self.activation = nn.LogSoftmax(**params)
+        elif name == 'tanh':
+            self.activation = nn.Tanh()
         elif name == 'argmax':
             self.activation = ArgMax(**params)
         elif name == 'argmax2d':
             self.activation = ArgMax(dim=1, **params)
         elif callable(name):
             self.activation = name(**params)
         else:
-            raise ValueError('Activation should be callable/sigmoid/softmax/logsoftmax/None; got {}'.format(name))
+            raise ValueError('Activation should be callable/sigmoid/softmax/logsoftmax/tanh/None; got {}'.format(name))
 
     def forward(self, x):
         return self.activation(x)
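
The merged Activation module resolves the new 'tanh' name to nn.Tanh(). A minimal sketch of the behavior, assuming the package is installed (the class lives in segmentation_models_pytorch.base.modules, as the diff above shows):

import torch
from segmentation_models_pytorch.base.modules import Activation

act = Activation('tanh')            # resolves to nn.Tanh()
logits = torch.randn(2, 1, 64, 64)  # (N, C, H, W) mask logits
out = act(logits)                   # squashed into [-1, 1]
assert out.min() >= -1 and out.max() <= 1

# An unrecognized name raises the updated ValueError from the diff:
try:
    Activation('swish')
except ValueError as err:
    print(err)  # Activation should be callable/sigmoid/softmax/logsoftmax/tanh/None; got swish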

segmentation_models_pytorch/deeplabv3/model.py

Lines changed: 6 additions & 6 deletions
@@ -7,13 +7,13 @@
 
 
 class DeepLabV3(SegmentationModel):
-    """DeepLabV3_ implemetation from "Rethinking Atrous Convolution for Semantic Image Segmentation"
+    """DeepLabV3_ implementation from "Rethinking Atrous Convolution for Semantic Image Segmentation"
 
     Args:
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
@@ -22,7 +22,7 @@ class DeepLabV3(SegmentationModel):
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         upsampling: Final upsampling factor. Default is 8 to preserve input-output spatial shape identity
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
@@ -86,14 +86,14 @@ def __init__(
 
 
 class DeepLabV3Plus(SegmentationModel):
-    """DeepLabV3+ implemetation from "Encoder-Decoder with Atrous Separable
+    """DeepLabV3+ implementation from "Encoder-Decoder with Atrous Separable
     Convolution for Semantic Image Segmentation"
 
     Args:
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
@@ -104,7 +104,7 @@ class DeepLabV3Plus(SegmentationModel):
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         upsampling: Final upsampling factor. Default is 4 to preserve input-output spatial shape identity
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
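
With "tanh" now documented, the option passes straight through the model constructor. A usage sketch (encoder_weights=None here only to avoid a weights download; it is not part of the commit):

import torch
import segmentation_models_pytorch as smp

model = smp.DeepLabV3(
    encoder_name='resnet34',
    encoder_weights=None,  # skip the ImageNet download for this sketch
    in_channels=3,
    classes=1,
    activation='tanh',     # new option added by this commit
)
mask = model(torch.randn(1, 3, 224, 224))
print(mask.shape)  # torch.Size([1, 1, 224, 224]), values in [-1, 1]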

segmentation_models_pytorch/encoders/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ def get_encoder(name, in_channels=3, depth=5, weights=None):
         try:
             settings = encoders[name]["pretrained_settings"][weights]
         except KeyError:
-            raise KeyError("Wrong pretrained weights `{}` for encoder `{}`. Avaliable options are: {}".format(
+            raise KeyError("Wrong pretrained weights `{}` for encoder `{}`. Available options are: {}".format(
                 weights, name, list(encoders[name]["pretrained_settings"].keys()),
             ))
         encoder.load_state_dict(model_zoo.load_url(settings["url"]))
@@ -69,7 +69,7 @@ def get_preprocessing_params(encoder_name, pretrained="imagenet"):
     settings = encoders[encoder_name]["pretrained_settings"]
 
     if pretrained not in settings.keys():
-        raise ValueError("Avaliable pretrained options {}".format(settings.keys()))
+        raise ValueError("Available pretrained options {}".format(settings.keys()))
 
     formatted_settings = {}
     formatted_settings["input_space"] = settings[pretrained].get("input_space")
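
The corrected KeyError message is easy to observe; a sketch that deliberately passes a weights name resnet34 does not ship with (the exact option list printed depends on the encoder):

import segmentation_models_pytorch as smp

try:
    smp.encoders.get_encoder('resnet34', weights='imagenet21k')
except KeyError as err:
    print(err)  # Wrong pretrained weights `imagenet21k` for encoder `resnet34`. Available options are: [...]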

segmentation_models_pytorch/encoders/_base.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ def out_channels(self):
         return self._out_channels[: self._depth + 1]
 
     def set_in_channels(self, in_channels):
-        """Change first convolution chennels"""
+        """Change first convolution channels"""
         if in_channels == 3:
             return
 
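set_in_channels is the hook that lets any model accept non-RGB input: for in_channels == 3 it returns early and keeps the pretrained stem, otherwise it rebuilds the first convolution. A sketch with single-channel (grayscale) input:

import torch
import segmentation_models_pytorch as smp

model = smp.Unet('resnet34', encoder_weights=None, in_channels=1, classes=2)
out = model(torch.randn(1, 1, 256, 256))  # first conv was rebuilt for 1 channel
print(out.shape)  # torch.Size([1, 2, 256, 256])
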
segmentation_models_pytorch/fpn/model.py

Lines changed: 3 additions & 3 deletions
@@ -11,19 +11,19 @@ class FPN(SegmentationModel):
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
             other pretrained weights (see table with available weights for each encoder_name)
         decoder_pyramid_channels: A number of convolution filters in Feature Pyramid of FPN_
         decoder_segmentation_channels: A number of convolution filters in segmentation blocks of FPN_
-        decoder_merge_policy: Determines how to merge pyramid features inside FPN. Avaliable options are **add** and **cat**
+        decoder_merge_policy: Determines how to merge pyramid features inside FPN. Available options are **add** and **cat**
         decoder_dropout: Spatial dropout rate in range (0, 1) for feature pyramid in FPN_
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         upsampling: Final upsampling factor. Default is 4 to preserve input-output spatial shape identity
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
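
decoder_merge_policy controls whether pyramid levels are summed (**add**, the default) or concatenated (**cat**) before the segmentation blocks. A small sketch of the cat variant:

import segmentation_models_pytorch as smp

model = smp.FPN(
    'resnet34',
    encoder_weights=None,
    decoder_merge_policy='cat',  # concatenate pyramid features instead of adding them
    decoder_dropout=0.2,         # spatial dropout inside the feature pyramid
    classes=3,
)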

segmentation_models_pytorch/linknet/model.py

Lines changed: 3 additions & 3 deletions
@@ -17,18 +17,18 @@ class Linknet(SegmentationModel):
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
             other pretrained weights (see table with available weights for each encoder_name)
         decoder_use_batchnorm: If **True**, BatchNorm2d layer between Conv2D and Activation layers
             is used. If **"inplace"** InplaceABN will be used, allows to decrease memory consumption.
-            Avaliable options are **True, False, "inplace"**
+            Available options are **True, False, "inplace"**
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
             on top of encoder if **aux_params** is not **None** (default). Supported params:

segmentation_models_pytorch/manet/model.py

Lines changed: 4 additions & 4 deletions
@@ -16,22 +16,22 @@ class MAnet(SegmentationModel):
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
             other pretrained weights (see table with available weights for each encoder_name)
         decoder_channels: List of integers which specify **in_channels** parameter for convolutions used in decoder.
-            Lenght of the list should be the same as **encoder_depth**
+            Length of the list should be the same as **encoder_depth**
         decoder_use_batchnorm: If **True**, BatchNorm2d layer between Conv2D and Activation layers
             is used. If **"inplace"** InplaceABN will be used, allows to decrease memory consumption.
-            Avaliable options are **True, False, "inplace"**
+            Available options are **True, False, "inplace"**
         decoder_pab_channels: A number of channels for PAB module in decoder.
             Default is 64.
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
             on top of encoder if **aux_params** is not **None** (default). Supported params:
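
As with Unet, decoder_channels must supply one width per decoder stage (length equal to encoder_depth). A sketch spelling out what I take to be the documented defaults:

import segmentation_models_pytorch as smp

model = smp.MAnet(
    'resnet34',
    encoder_weights=None,
    encoder_depth=5,
    decoder_channels=(256, 128, 64, 32, 16),  # length matches encoder_depth
    decoder_pab_channels=64,                  # PAB module width, default 64
    classes=1,
)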

segmentation_models_pytorch/pan/model.py

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ class PAN(SegmentationModel):
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         upsampling: Final upsampling factor. Default is 4 to preserve input-output spatial shape identity
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build

segmentation_models_pytorch/pspnet/model.py

Lines changed: 4 additions & 4 deletions
@@ -17,20 +17,20 @@ class PSPNet(SegmentationModel):
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
             other pretrained weights (see table with available weights for each encoder_name)
-        psp_out_channels: A number of filters in Saptial Pyramid
+        psp_out_channels: A number of filters in Spatial Pyramid
         psp_use_batchnorm: If **True**, BatchNorm2d layer between Conv2D and Activation layers
             is used. If **"inplace"** InplaceABN will be used, allows to decrease memory consumption.
-            Avaliable options are **True, False, "inplace"**
+            Available options are **True, False, "inplace"**
         psp_dropout: Spatial dropout rate in [0, 1) used in Spatial Pyramid
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         upsampling: Final upsampling factor. Default is 8 to preserve input-output spatial shape identity
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
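
The psp_* arguments map directly onto the pyramid module; a sketch using what the docstring above lists as the defaults:

import segmentation_models_pytorch as smp

model = smp.PSPNet(
    'resnet34',
    encoder_weights=None,
    psp_out_channels=512,  # filters in the Spatial Pyramid
    psp_dropout=0.2,       # spatial dropout rate in [0, 1)
    upsampling=8,          # default; restores the input resolution
    classes=1,
)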

segmentation_models_pytorch/unet/model.py

Lines changed: 5 additions & 5 deletions
@@ -15,22 +15,22 @@ class Unet(SegmentationModel):
         encoder_name: Name of the classification model that will be used as an encoder (a.k.a backbone)
             to extract features of different spatial resolution
         encoder_depth: A number of stages used in encoder in range [3, 5]. Each stage generate features
-            two times smaller in spatial dimentions than previous one (e.g. for depth 0 we will have features
+            two times smaller in spatial dimensions than previous one (e.g. for depth 0 we will have features
             with shapes [(N, C, H, W),], for depth 1 - [(N, C, H, W), (N, C, H // 2, W // 2)] and so on).
             Default is 5
         encoder_weights: One of **None** (random initialization), **"imagenet"** (pre-training on ImageNet) and
             other pretrained weights (see table with available weights for each encoder_name)
         decoder_channels: List of integers which specify **in_channels** parameter for convolutions used in decoder.
-            Lenght of the list should be the same as **encoder_depth**
+            Length of the list should be the same as **encoder_depth**
         decoder_use_batchnorm: If **True**, BatchNorm2d layer between Conv2D and Activation layers
             is used. If **"inplace"** InplaceABN will be used, allows to decrease memory consumption.
-            Avaliable options are **True, False, "inplace"**
-        decoder_attention_type: Attention module used in decoder of the model. Avaliable options are **None** and **scse**.
+            Available options are **True, False, "inplace"**
+        decoder_attention_type: Attention module used in decoder of the model. Available options are **None** and **scse**.
             SCSE paper - https://arxiv.org/abs/1808.08127
         in_channels: A number of input channels for the model, default is 3 (RGB images)
         classes: A number of classes for output mask (or you can think as a number of channels of output mask)
         activation: An activation function to apply after the final convolution layer.
-            Avaliable options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"identity"**, **callable** and **None**.
+            Available options are **"sigmoid"**, **"softmax"**, **"logsoftmax"**, **"tanh"**, **"identity"**, **callable** and **None**.
             Default is **None**
         aux_params: Dictionary with parameters of the auxiliary output (classification head). Auxiliary output is build
             on top of encoder if **aux_params** is not **None** (default). Supported params:
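
A final sketch combining the Unet-specific options with the new activation (decoder_attention_type='scse' enables the SCSE blocks referenced above):

import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    'resnet34',
    encoder_weights=None,
    decoder_use_batchnorm=True,
    decoder_attention_type='scse',  # https://arxiv.org/abs/1808.08127
    activation='tanh',              # new option from this commit
    classes=1,
)
out = model(torch.randn(1, 3, 256, 256))
assert out.min() >= -1 and out.max() <= 1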
