
Commit 23b9331

Refined the docs. Simply very minor changes
1 parent ac81b25 commit 23b9331

12 files changed (+43 lines, -23 lines)

pytorch_widedeep/bayesian_models/tabular/bayesian_linear/bayesian_wide.py

Lines changed: 11 additions & 3 deletions
@@ -21,9 +21,17 @@ class BayesianWide(BaseBayesianModel):
     pred_dim: int
         size of the ouput tensor containing the predictions
     prior_sigma_1: float, default = 1.0
-        Prior of the sigma parameter for the first of the two Gaussian
-        distributions that will be mixed to produce the prior weight
-        distribution
+        The prior weight distribution is a scaled mixture of two Gaussian
+        densities:
+
+        .. math::
+           \begin{aligned}
+           P(\mathbf{w}) = \prod_{i=j} \pi N (\mathbf{w}_j | 0, \sigma_{1}^{2}) + (1 - \pi) N (\mathbf{w}_j | 0, \sigma_{2}^{2})
+           \end{aligned}
+
+        This is the prior of the sigma parameter for the first of the two
+        Gaussians that will be mixed to produce the prior weight
+        distribution.
     prior_sigma_2: float, default = 0.002
         Prior of the sigma parameter for the second of the two Gaussian
         distributions that will be mixed to produce the prior weight
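The formula added above is a scale mixture of two zero-mean Gaussians used as the weight prior. A minimal, self-contained sketch of its log-density in plain PyTorch, purely for illustration and not the library's internal code (the mixing weight `pi` and the example tensor are made up for the sketch):

    import math
    import torch

    def scale_mixture_log_prior(w, pi=0.25, sigma_1=1.0, sigma_2=0.002):
        # Element-wise log-density of a zero-mean Gaussian with std sigma.
        def log_gauss(x, sigma):
            return -0.5 * math.log(2 * math.pi) - math.log(sigma) - x.pow(2) / (2 * sigma**2)

        # log( pi * N(w|0, sigma_1^2) + (1 - pi) * N(w|0, sigma_2^2) ) per weight,
        # computed with logsumexp for numerical stability.
        log_mix = torch.logsumexp(
            torch.stack([math.log(pi) + log_gauss(w, sigma_1),
                         math.log(1 - pi) + log_gauss(w, sigma_2)]),
            dim=0,
        )
        # The product over weights in the formula becomes a sum in log space.
        return log_mix.sum()

    weights = torch.randn(10, 4)
    print(scale_mixture_log_prior(weights))

Working in log space with logsumexp keeps the narrow second component (sigma_2 = 0.002 by default) from underflowing.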

pytorch_widedeep/bayesian_models/tabular/bayesian_mlp/bayesian_tab_mlp.py

Lines changed: 11 additions & 3 deletions
@@ -57,9 +57,17 @@ class BayesianTabMlp(BaseBayesianModel):
         Activation function for the dense layers of the MLP. Currently
         `'tanh'`, `'relu'`, `'leaky_relu'` and `'gelu'` are supported
     prior_sigma_1: float, default = 1.0
-        Prior of the sigma parameter for the first of the two Gaussian
-        distributions that will be mixed to produce the prior weight
-        distribution for each Bayesian linear and embedding layer
+        The prior weight distribution is a scaled mixture of two Gaussian
+        densities:
+
+        .. math::
+           \begin{aligned}
+           P(\mathbf{w}) = \prod_{i=j} \pi N (\mathbf{w}_j | 0, \sigma_{1}^{2}) + (1 - \pi) N (\mathbf{w}_j | 0, \sigma_{2}^{2})
+           \end{aligned}
+
+        This is the prior of the sigma parameter for the first of the two
+        Gaussians that will be mixed to produce the prior weight
+        distribution.
     prior_sigma_2: float, default = 0.002
         Prior of the sigma parameter for the second of the two Gaussian
         distributions that will be mixed to produce the prior weight
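Since this class exposes the two sigmas as constructor arguments, a usage sketch may help. Every argument here other than `prior_sigma_1` and `prior_sigma_2` (`column_idx`, `continuous_cols`, `mlp_hidden_dims`) is assumed from the library's other tabular models and is not confirmed by this diff:

    from pytorch_widedeep.bayesian_models import BayesianTabMlp

    # column_idx / continuous_cols / mlp_hidden_dims are assumed arguments;
    # prior_sigma_1 / prior_sigma_2 and their defaults come from the docstring above.
    model = BayesianTabMlp(
        column_idx={"age": 0, "hours_per_week": 1},
        continuous_cols=["age", "hours_per_week"],
        mlp_hidden_dims=[64, 32],
        prior_sigma_1=1.0,    # sigma of the first (wide) mixture component
        prior_sigma_2=0.002,  # sigma of the second (narrow) mixture component
    )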

pytorch_widedeep/models/image/vision.py

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ class Vision(nn.Module):
         List of strings containing the names (or substring within the name) of
         the parameters that will be trained. For example, if we use a
         `'resnet18'` pretrainable model and we set ``trainable_params =
-        ['layer4']`` only the parameters of `'layer4'` of the network(and the
+        ['layer4']`` only the parameters of `'layer4'` of the network (and the
         head, as mentioned before) will be trained. Note that setting this or
         the previous parameter involves some knowledge of the architecture
         used.
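The change above only adds a missing space, but the behaviour the docstring describes, substring matching against parameter names, can be sketched in plain PyTorch. This is illustrative only and not the library's actual implementation:

    import torchvision

    # Freeze everything, then re-enable parameters whose names contain any of the
    # requested substrings, mirroring the docstring's ``trainable_params = ['layer4']``
    # example with a resnet18 backbone.
    trainable_params = ["layer4"]
    backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
    for name, param in backbone.named_parameters():
        param.requires_grad = any(s in name for s in trainable_params)

    trainable = [n for n, p in backbone.named_parameters() if p.requires_grad]
    print(trainable[:3])  # e.g. ['layer4.0.conv1.weight', ...]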

pytorch_widedeep/models/tabular/mlp/context_attention_mlp.py

Lines changed: 2 additions & 2 deletions
@@ -68,8 +68,8 @@ class ContextAttentionMLP(BaseTabularModelWithAttention):
         Activation function to be applied to the continuous embeddings, if
         any. `'tanh'`, `'relu'`, `'leaky_relu'` and `'gelu'` are supported.
     input_dim: int, default = 32
-        The so-called *dimension of the model*. In general is the number of
-        embeddings used to encode the categorical and/or continuous columns
+        The so-called *dimension of the model*. Is the number of embeddings
+        used to encode the categorical and/or continuous columns
     attn_dropout: float, default = 0.2
         Dropout for each attention block
     with_addnorm: bool = False,

pytorch_widedeep/models/tabular/mlp/self_attention_mlp.py

Lines changed: 1 addition & 1 deletion
@@ -67,7 +67,7 @@ class SelfAttentionMLP(BaseTabularModelWithAttention):
         Activation function to be applied to the continuous embeddings, if
         any. `'tanh'`, `'relu'`, `'leaky_relu'` and `'gelu'` are supported.
     input_dim: int, default = 32
-        The so-called *dimension of the model*. In general is the number of
+        The so-called *dimension of the model*. Is the number of
         embeddings used to encode the categorical and/or continuous columns
     attn_dropout: float, default = 0.2
         Dropout for each attention block

pytorch_widedeep/models/tabular/transformers/ft_transformer.py

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ class FTTransformer(BaseTabularModelWithAttention):
         (See `Linformer: Self-Attention with Linear Complexity
         <https://arxiv.org/abs/2006.04768>`_ ) The compression factor that
         will be used to reduce the input sequence length. If we denote the
-        resulting sequence length as :math:`k`
+        resulting sequence length as
         :math:`k = int(kv_{compression \space factor} \times s)`
         where :math:`s` is the input sequence length.
     kv_sharing: bool, default = False

pytorch_widedeep/models/tabular/transformers/saint.py

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ class SAINT(BaseTabularModelWithAttention):
         Activation function to be applied to the continuous embeddings, if
         any. `'tanh'`, `'relu'`, `'leaky_relu'` and `'gelu'` are supported.
     input_dim: int, default = 32
-        The so-called *dimension of the model*. In general is the number of
+        The so-called *dimension of the model*. Is the number of
         embeddings used to encode the categorical and/or continuous columns
     n_heads: int, default = 8
         Number of attention heads per Transformer block

pytorch_widedeep/models/tabular/transformers/tab_fastformer.py

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ class TabFastFormer(BaseTabularModelWithAttention):
         continuous embeddings, if any. `'tanh'`, `'relu'`, `'leaky_relu'` and
         `'gelu'` are supported.
     input_dim: int, default = 32
-        The so-called *dimension of the model*. In general is the number of
+        The so-called *dimension of the model*. Is the number of
         embeddings used to encode the categorical and/or continuous columns
     n_heads: int, default = 8
         Number of attention heads per FastFormer block

pytorch_widedeep/models/tabular/transformers/tab_perceiver.py

Lines changed: 2 additions & 2 deletions
@@ -68,8 +68,8 @@ class TabPerceiver(BaseTabularModelWithAttention):
         Activation function to be applied to the continuous embeddings, if
         any. `'tanh'`, `'relu'`, `'leaky_relu'` and `'gelu'` are supported.
     input_dim: int, default = 32
-        The so-called *dimension of the model*. In general, is the number of
-        embeddings used to encode the categorical and/or continuous columns.
+        The so-called *dimension of the model*. Is the number of embeddings
+        used to encode the categorical and/or continuous columns.
     n_cross_attns: int, default = 1
         Number of times each perceiver block will cross attend to the input
         data (i.e. number of cross attention components per perceiver block).

pytorch_widedeep/models/tabular/transformers/tab_transformer.py

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ class TabTransformer(BaseTabularModelWithAttention):
         Activation function to be applied to the continuous embeddings, if
         any. `'tanh'`, `'relu'`, `'leaky_relu'` and `'gelu'` are supported.
     input_dim: int, default = 32
-        The so-called *dimension of the model*. In general is the number of
+        The so-called *dimension of the model*. Is the number of
         embeddings used to encode the categorical and/or continuous columns
     n_heads: int, default = 8
         Number of attention heads per Transformer block
