A GPU accelerated library that creates, trains, and runs neural networks in safe Rust code.

---

### Table of contents

* [Architecture overview](#architecture-overview)
  * [Models](#models)
  * [Layers](#layers)
  * [Optimizers](#optimizers)
  * [Loss Functions](#loss-functions)
* [XoR using Intricate](#xor-using-intricate)
  * [Setting up the training data](#setting-up-the-training-data)
  * [Setting up the layers](#setting-up-the-layers)
  * [Setting up OpenCL](#setting-up-opencls-state)
  * [Fitting our Model](#fitting-our-model)
* [How to save and load models](#how-to-save-and-load-models)
  * [Saving the Model](#saving-the-model)
  * [Loading the Model](#loading-the-model)
* [Things to be done still](#things-to-be-done-still)

---

## Architecture overview

Intricate has a layout very similar to that of popular libraries out there such as Keras.

At the surface it consists of a [Model](#models), which then consists
of [Layers](#layers), which can be adjusted using a [Loss Function](#loss-functions)
with the help of an [Optimizer](#optimizers).

### Models

As said before, similarly to Keras, Intricate defines a Model as basically
a list of [Layers](#layers).

A Model does not have much logic in it; it delegates most of the work to its layers.
All it does is orchestrate how the layers should work together and how the data flows
from one layer to the next.

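To make that orchestration idea concrete, here is a minimal, self-contained sketch in plain Rust. This is **not** Intricate's actual API (the `ToyLayer` and `ToyModel` names are made up for illustration); it only shows a model feeding each layer's output into the next layer:

```rust
// Illustrative sketch only -- not Intricate's real API.
trait ToyLayer {
    fn forward(&self, inputs: Vec<f32>) -> Vec<f32>;
}

struct Double; // multiplies every input by 2
impl ToyLayer for Double {
    fn forward(&self, inputs: Vec<f32>) -> Vec<f32> {
        inputs.into_iter().map(|x| x * 2.0).collect()
    }
}

struct AddOne; // adds 1 to every input
impl ToyLayer for AddOne {
    fn forward(&self, inputs: Vec<f32>) -> Vec<f32> {
        inputs.into_iter().map(|x| x + 1.0).collect()
    }
}

struct ToyModel {
    layers: Vec<Box<dyn ToyLayer>>,
}

impl ToyModel {
    // The model only orchestrates: it feeds the output of each
    // layer into the next one.
    fn predict(&self, mut data: Vec<f32>) -> Vec<f32> {
        for layer in &self.layers {
            data = layer.forward(data);
        }
        data
    }
}

fn main() {
    let model = ToyModel {
        layers: vec![Box::new(Double), Box::new(AddOne)],
    };
    println!("{:?}", model.predict(vec![1.0, 2.0])); // [3.0, 5.0]
}
```

Conceptually, Intricate's real `Model` works the same way, but with OpenCL buffers instead of `Vec<f32>`.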
### Layers

Every layer receives **inputs** and returns **outputs**, following some rule that it must define.

They must also implement four methods that together constitute backpropagation:

- `optimize_parameters`
- `compute_gradients`
- `apply_gradients`
- `compute_loss_to_input_derivatives`

Mostly, `optimize_parameters` will rely on an `Optimizer` that will try to improve
the parameters that the Layer allows it to optimize.

These methods are called sequentially to do backpropagation in the Model: starting
from the last layer, the results of `compute_loss_to_input_derivatives` are passed
on to the previous layer, and so on.

These layers can really be any type of transformation on the inputs.
An example of this is the activation functions in Intricate, which are actual
layers instead of being fused into other layers, which simplifies
calculations tremendously and works like a charm.

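As an illustration of the activation-as-a-layer idea, here is a standalone sketch of a ReLU layer in plain Rust. The method name mirrors `compute_loss_to_input_derivatives` from the list above, but the signature is a simplified stand-in, not Intricate's real trait:

```rust
// Illustrative sketch only -- simplified signatures, not Intricate's trait.
// A ReLU activation implemented as its own layer: it remembers the
// inputs it saw so it can compute derivatives during backpropagation.
struct ReLU {
    last_inputs: Vec<f32>,
}

impl ReLU {
    fn forward(&mut self, inputs: Vec<f32>) -> Vec<f32> {
        self.last_inputs = inputs.clone();
        inputs.into_iter().map(|x| x.max(0.0)).collect()
    }

    // dE/dI = dE/dO * dO/dI, where dO/dI is 1 for positive inputs
    // and 0 otherwise.
    fn compute_loss_to_input_derivatives(&self, loss_to_output: &[f32]) -> Vec<f32> {
        self.last_inputs
            .iter()
            .zip(loss_to_output)
            .map(|(&i, &d)| if i > 0.0 { d } else { 0.0 })
            .collect()
    }
}

fn main() {
    let mut relu = ReLU { last_inputs: vec![] };
    println!("{:?}", relu.forward(vec![-1.0, 2.0])); // [0.0, 2.0]
    println!("{:?}", relu.compute_loss_to_input_derivatives(&[0.5, 0.5])); // [0.0, 0.5]
}
```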
### Optimizers

Optimizers do just what you might think: they optimize.

Specifically, they optimize both the parameters a Layer allows them to optimize, as well
as the Layer's gradients, so that the Layer can apply the optimized gradients to itself.

Any implementation of the `Optimizer` trait can be plugged in,
which allows you to have any kind of optimization you would like in the training process.

Intricate currently has only one optimizer, since it is still under heavy development and still
defining its architecture.

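The gradient-optimizing idea can be sketched in a few lines of plain Rust. Note that `BasicSketchOptimizer` is a hypothetical name, and the method below is a simplified stand-in for the real `Optimizer` trait:

```rust
// Illustrative sketch only -- a plain gradient-descent-style optimizer:
// it scales gradients by a learning rate before the layer applies them.
struct BasicSketchOptimizer {
    learning_rate: f32,
}

impl BasicSketchOptimizer {
    // Optimize the gradients a layer computed, so the layer can then
    // apply them to its parameters.
    fn optimize_gradients(&self, gradients: &[f32]) -> Vec<f32> {
        gradients.iter().map(|g| g * self.learning_rate).collect()
    }
}

fn main() {
    let optimizer = BasicSketchOptimizer { learning_rate: 0.1 };
    let mut weights = vec![1.0_f32, -2.0];
    let updates = optimizer.optimize_gradients(&[10.0, 10.0]);
    for (w, u) in weights.iter_mut().zip(&updates) {
        *w -= u; // parameters move against the gradient
    }
    println!("{:?}", weights); // [0.0, -3.0]
}
```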
### Loss Functions

Loss Functions are basically implementations of a certain trait that are used
to determine how bad a Model is doing.

Loss Functions are **NOT** used in a layer; they are used
by the Model itself. Even though a Layer will use derivatives with respect
to the loss, layers don't communicate with the Loss Function directly.

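As a concrete illustration of "how bad a Model is", here is the Mean Squared loss computed by hand. This is a standalone sketch, not the actual `MeanSquared` type used later in this README:

```rust
// Illustrative sketch only -- Mean Squared loss computed by hand.
fn mean_squared(outputs: &[f32], expected: &[f32]) -> f32 {
    outputs
        .iter()
        .zip(expected)
        .map(|(o, e)| (o - e).powi(2))
        .sum::<f32>()
        / outputs.len() as f32
}

fn main() {
    // A perfect model has zero loss; errors grow quadratically.
    println!("{}", mean_squared(&[1.0, 0.0], &[1.0, 0.0])); // 0
    println!("{}", mean_squared(&[0.5, 0.5], &[1.0, 0.0])); // 0.25
}
```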
---

## XoR using Intricate

```rust
let opencl_state = setup_opencl(DeviceType::CPU).unwrap();
```

For our Model to actually be able to do computations, we need to pass the OpenCL state
into the Model's `init` method as follows:

```rust
xor_model.init(&opencl_state).unwrap();
```

### Fitting our model

To train our Model, we just need to call the `fit`
method and pass in some parameters as follows:

```rust
use intricate::{
    loss_functions::MeanSquared,
    optimizers::BasicOptimizer,
    types::{TrainingOptions, TrainingVerbosity},
};

let mut loss = MeanSquared::new();
let mut optimizer = BasicOptimizer::new(0.1);

// Fit the model however many times we want
xor_model
    .fit(
        &training_inputs,
        &expected_outputs,
        &mut TrainingOptions {
            loss_fn: &mut loss,
            verbosity: TrainingVerbosity {
                show_current_epoch: true, // Show a current epoch message such as `epoch #5`

                show_epoch_progress: true, // Show the training steps progress for each epoch in
                                           // an indicatif progress bar

                show_epoch_elapsed: true, // Show the time elapsed in the epoch

                print_loss: true, // Show the loss after an epoch of training
            },
            compute_loss: true,
            optimizer: &mut optimizer,
            batch_size: 4, // Intricate will always use Mini-batch Gradient Descent under the hood
                           // since with it you can get all the other variants of Gradient Descent.
                           // This is basically the size of the batch used in gradient descent.
            epochs: 500,
        },
    )
    .unwrap();
```

As you can see, creating these models is extremely easy, and it is blazingly fast as well.

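To make the `batch_size` comment above concrete, here is a small standalone sketch (plain Rust, independent of Intricate) of how one dataset is split into mini-batches. With `batch_size` equal to the whole dataset (4 for XoR) this degenerates into full-batch gradient descent, and with `batch_size: 1` into stochastic gradient descent:

```rust
// Illustrative sketch only -- splitting the 4 XoR samples into batches.
fn main() {
    let samples = vec![[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]];
    let batch_size = 2;
    for (i, batch) in samples.chunks(batch_size).enumerate() {
        // each batch's gradients would be averaged before one parameter update
        println!("batch {}: {} samples", i, batch.len());
    }
}
```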
---

## How to save and load models

For saving and loading models, Intricate uses the [savefile](https://github.com/avl/savefile) crate, which makes it very simple and fast to save and load models.

### Saving the model

As an example, let's try saving and loading our XoR model.
To do that, we will first need to sync all of the relevant layer information
of the Model from the OpenCL buffers to the `host` (or just the CPU), and then
call the `save_file` function as follows:

```rust
xor_model.sync_data_from_buffers_to_host().unwrap(); // sends the weights and biases from
                                                     // OpenCL buffers to Rust Vec's
save_file("xor-model.bin", 0, &xor_model).unwrap();
```

### Loading the model

As for loading our XoR model, we just need to call the
counterpart of `save_file`: `load_file`.

```rust
let mut loaded_xor_model: Model = load_file("xor-model.bin", 0).unwrap();
```

Now of course, the savefile crate cannot load the data back into the GPU, so if you want
to use the Model after loading it, you **must** call the `init` method on `loaded_xor_model`
again (as done in examples/xor.rs).

## Things to be done still

- separate Intricate into more than one crate to make development more lightweight with rust-analyzer
- implement convolutional layers and perhaps even solve some image classification problems in an example
- add an optional feature of Intricate that would contain preloaded datasets, such as MNIST and others