|
@@ -1,9 +1,5 @@
 """
-<<<<<<< HEAD
-Pretraining VGG from scratch
-=======
-Pretraining VGG from scratch
->>>>>>> origin/master
+Pre-training VGG from scratch
 ============================
 
 
|
@@ -37,26 +33,17 @@
 Overview
 ------------
 
-<<<<<<< HEAD
-If you are running this in Google Colab, install ``albumentations`` by running the following command:
-
-.. code-block:: py
-
-
-   !pip install albumentations
-=======
 VGG is a model that attracted attention due to its ability to build deeper layers and dramatically
-shorten the training time compared to AlexNet, which was the state-of-the-art model at the time of the publishing
+shorten the training time compared to ``AlexNet``, which was the state-of-the-art model at the time of the publishing
 of the `original paper <https://arxiv.org/abs/1409.1556>`__.
->>>>>>> origin/master
 
-Unlike AlexNet's 5x5 and 9x9 filters, VGG uses only 3x3 filters. Using multiple 3x3 filters can
+Unlike ``AlexNet``'s 5x5 and 9x9 filters, VGG uses only 3x3 filters. Using multiple 3x3 filters can
 obtain the same receptive field as using a 5x5 filter, but it is effective in reducing the number
-of parameters. In addition, since it passes through multiple nonlinear functions, the
-nonlinearity increases even more.
+of parameters. In addition, since it passes through multiple non-linear functions, the
+non-linearity increases even more.
 
 VGG applies a max pooling layer after multiple convolutional layers to reduce the spatial size.
-This allows the feature map to be downsampled while preserving important information. Thanks
+This allows the feature map to be down-sampled while preserving important information. Thanks
 to this, the network can learn high-dimensional features in deeper layers and prevent overfitting.
 
 In this tutorial, we will train the VGG model from scratch using only the configuration presented
|
@@ -103,55 +90,11 @@
 
 
 ######################################################################
-<<<<<<< HEAD
-# Purpose point of this tutorial
-# ----------------------------
-#
-
-
-######################################################################
-# - We train the model from scratch using only the configuration
-# presented in the paper.
-#
-# - we do not use future method, like ``Batch normalization``,Adam , He
-# initialization.
-#
-# - You can apply to ImageNet Data.
-#
-# - If you can download the ImageNet Data(140GB), you can apply this
-# tutorial to reproduce Original VGG.
-#
-# - You can learn VGG within the training time suggested in the paper.
-#
-
-
-######################################################################
-# Background
-# -----------------------
-#
-
-
-######################################################################
-# VGG became a model that attracted attention because it succeeded in
-# building deeper layers and dramatically shortening the training time
-# compared to ``AlexNet``, which was the SOTA model at the time.
-#
-# Unlike ``AlexNet``'s 5x5 9x9 filters, VGG only uses 3x3 filters.
-# Using multiple 3x3 filters can obtain the same receptive field as using a 5x5 filter, but it is effective in reducing the number of parameters.
-# In addition, since it passes through multiple nonlinear functions, the ``nonlinearity`` increases even more.
-#
-# VGG applied a max pooling layer after multiple convolutional layers to reduce the spatial size.
-# This allowed the feature map to be ``downsampled`` while preserving important information.
-# Thanks to this, the network could learn high-dimensional features in deeper layers and prevent overfitting.
-
-######################################################################
-=======
->>>>>>> origin/master
 # VGG Configuration
 # -----------------
 #
 # In this section, we will define configurations suggested in the VGG paper.
-# We use the CIFAR100 dataset. The authors of the VGG paper scale images isotropically,
+# We use the CIFAR100 dataset. The authors of the VGG paper scale images ``isotropically``,
 # which means increasing the size of an image while maintaining its proportions,
 # preventing distortion and maintaining the consistency of the object.
 
|
@@ -207,7 +150,7 @@
 # --------------------
 #
 # As mentioned above we use the CIFAR100 dataset in this tutorial. According to the VGG paper,
-# the authors scale the images isotropically to maintain their proportions. This method, known
+# the authors scale the images ``isotropically`` to maintain their proportions. This method, known
 # as isotropic scaling, increases the size of an image while preserving its aspect ratio,
 # thus avoiding distortion and maintaining object consistency.
 #
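Isotropic scaling as described above rescales the shorter image side to a target length while preserving the aspect ratio. A dependency-free sketch of the size computation (the helper `isotropic_size` is ours; in torchvision, `transforms.Resize` with an int argument behaves this way):

```python
def isotropic_size(width, height, s):
    """Return (new_width, new_height) with the shorter side scaled to `s`
    and the aspect ratio preserved, as in the paper's training scale S."""
    if width <= height:
        scale = s / width
        return s, round(height * scale)
    scale = s / height
    return round(width * scale), s

print(isotropic_size(400, 300, 256))  # (341, 256): 4:3 aspect ratio kept
```
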
|
@@ -421,7 +364,7 @@ def _init_weights(self,m):
 # Initializing Model Weights
 # ----------------------------
 #
-# ggIn the original VGG paper, the authors trained model A first and then
+# In the original VGG paper, the authors trained model A first and then
 # used its weights as a starting point for training other variants. However,
 # this approach can be time-consuming. The authors also mentioned using Xavier
 # initialization as an alternative to initializing with model A's weights,
@@ -702,7 +645,7 @@ def __getitem__(self, index: int) :
 # Conclusion
 # ----------
 #
-# In this tutorial, we have successfully demonstrated how to pretrain the VGG model
+# In this tutorial, we have successfully demonstrated how to pre-train the VGG model
 # from scratch. The techniques and insights provided in this tutorial can serve as
 # a basis for reproducing and adapting other foundational models.
 #
|
|