|
5 | 5 | "id": "bb5eee40-7a7b-4d5c-ab73-800776e96510",
|
6 | 6 | "metadata": {},
|
7 | 7 | "source": [
|
8 |
| - "# Crross Entropy Loss\n" |
| 8 | + "# Cross Entropy Loss\n", |
| 9 | + "\n", |
| 10 | + "https://towardsdatascience.com/cross-entropy-loss-function-f38c4ec8643e" |
9 | 11 | ]
|
10 | 12 | },
|
11 | 13 | {
|
12 | 14 | "cell_type": "markdown",
|
13 | 15 | "id": "49dd39b1-221c-40fa-a34c-a548b7acca8d",
|
14 | 16 | "metadata": {},
|
15 | 17 | "source": [
|
16 |
| - "# Dropout" |
| 18 | + "# Dropout\n", |
| 19 | + "\n", |
| 20 | + "Dropout reduces overfitting by randomly deactivating a fraction of the network's units on each training step\n", |
| 21 | + "\n", |
| 22 | + "By dropping a unit out, we mean temporarily removing it from the network, along with all its incoming and outgoing connections\n", |
| 23 | + "\n", |
| 24 | + "Dilution and dropout are regularization techniques for reducing overfitting in artificial neural networks by preventing complex co-adaptations on training data. They are an efficient way of performing model averaging with neural networks. Wikipedia\n", |
| 25 | + "\n", |
| 26 | + "https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf" |
17 | 27 | ]
|
18 | 28 | },
|
19 | 29 | {
|
20 | 30 | "cell_type": "markdown",
|
21 | 31 | "id": "311e5114-dd67-4d9d-a837-f6d8a7a53e2c",
|
22 | 32 | "metadata": {},
|
23 | 33 | "source": [
|
24 |
| - "# Batch normalization" |
| 34 | + "# Batch normalization\n", |
| 35 | + "\n", |
| 36 | + "Batch normalization mainly stabilizes and speeds up training; it also has a mild regularizing effect\n", |
| 37 | + "\n", |
| 38 | + "Normalization is also used as a data preprocessing tool\n", |
| 39 | + "\n", |
| 40 | + "Batch normalization normalizes each layer's inputs over the mini-batch to zero mean and unit variance, then applies a learned scale and shift\n", |
| 41 | + "\n", |
| 42 | + "Because its mean and variance are computed across the mini-batch, each example's normalized activation depends on the other examples in the batch\n", |
| 43 | + "\n", |
| 44 | + "https://arxiv.org/pdf/1502.03167.pdf" |
25 | 45 | ]
|
26 | 46 | },
|
27 | 47 | {
|
28 | 48 | "cell_type": "code",
|
29 |
| - "execution_count": null, |
| 49 | + "execution_count": 3, |
30 | 50 | "id": "05e944ee-5b08-4ff8-9cfb-25fd73eed387",
|
31 | 51 | "metadata": {},
|
32 | 52 | "outputs": [],
|
33 | 53 | "source": []
|
| 54 | + }, |
| 55 | + { |
| 56 | + "cell_type": "code", |
| 57 | + "execution_count": 7, |
| 58 | + "id": "2ab2057f-a717-4a50-9245-d1c530ff4bf0", |
| 59 | + "metadata": {}, |
| 60 | + "outputs": [ |
| 61 | + { |
| 62 | + "data": { |
| 63 | + "text/plain": [ |
| 64 | + "(tensor([-0.7186, -0.2342, 1.0638, 1.7499, 0.0266, 0.2970, -0.0000, -0.4345,\n", |
| 65 | + " -0.0000, 0.0000, -2.3886, 0.0000, -0.2310, 2.1409, -0.9096, -0.7719]),\n", |
| 66 | + " tensor([-0.5749, -0.1873, 0.8511, 1.3999, 0.0213, 0.2376, -0.6571, -0.3476,\n", |
| 67 | + " -0.0300, 0.8420, -1.9109, 1.0163, -0.1848, 1.7127, -0.7276, -0.6175]))" |
| 68 | + ] |
| 69 | + }, |
| 70 | + "execution_count": 7, |
| 71 | + "metadata": {}, |
| 72 | + "output_type": "execute_result" |
| 73 | + } |
| 74 | + ], |
| 75 | + "source": [] |
| 76 | + }, |
| 77 | + { |
| 78 | + "cell_type": "code", |
| 79 | + "execution_count": null, |
| 80 | + "id": "0cb61562-a5c7-445c-8218-4752a52d24c1", |
| 81 | + "metadata": {}, |
| 82 | + "outputs": [], |
| 83 | + "source": [] |
34 | 84 | }
|
35 | 85 | ],
|
36 | 86 | "metadata": {
|
|
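The cell output in the diff above pairs a tensor containing zeros with a second tensor whose matching non-zero entries are smaller by a factor of 0.8 (for example, -0.7186 against -0.5749). That pattern is consistent with inverted dropout at a drop probability of 0.2, where surviving values are scaled by 1 / (1 - p) = 1.25. The cell's source is not shown in the diff, so this reading is an assumption. The sketch below illustrates the technique in plain Python; `inverted_dropout` is a hypothetical helper, not code from the notebook.

```python
import random


def inverted_dropout(values, p, rng):
    """Zero each value with probability p and scale survivors by 1 / (1 - p).

    Hypothetical helper for illustration only; it is not taken from the notebook.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    # Each unit is dropped independently; survivors are rescaled so the
    # expected value of every entry matches the input.
    return [0.0 if rng.random() < p else v * scale for v in values]


rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
x = [rng.gauss(0.0, 1.0) for _ in range(16)]
y = inverted_dropout(x, p=0.2, rng=rng)

for before, after in zip(x, y):
    print(f"{before:8.4f} -> {after:8.4f}")
```

In PyTorch, `torch.nn.Dropout(p=0.2)` applies the same scaling while the module is in training mode and passes inputs through unchanged after `.eval()` is called.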