| 1 | +--- |
| 2 | +title: StRUCT Model Template |
| 3 | +output: |
| 4 | + html_vignette: |
| 5 | + df_print: paged |
| 6 | + github_document: |
| 7 | + df_print: paged |
| 8 | + html_preview: false |
| 9 | +vignette: > |
| 10 | + %\VignetteIndexEntry{StRUCT Model Template} |
| 11 | + %\VignetteEngine{knitr::rmarkdown} |
| 12 | + %\VignetteEncoding{UTF-8} |
| 13 | +--- |
| 14 | + |
| 15 | +```{r,include=FALSE,purl=FALSE} |
| 16 | +library("struct") |
| 17 | +``` |
| 18 | + |
| 19 | +<span style="color:red">Please note: this vignette describes the contents of a |
| 20 | +class object broken down into sections. As such, some of the lines cannot be run |
| 21 | +on their own. Please skip to the <span style="color:black">**Complete object** |
| 22 | +</span> section for a working example. </span> |
| 23 | + |
| 24 | +# Using the template |
| 25 | + |
| 26 | +If you want to output the full model template to an editable R script you can |
| 27 | +use the `struct_template` function. The `template` parameter specifies the template |
| 28 | +to access (in this case "model") and `output` specifies the file to save it to. If no |
| 29 | +path is specified the file will be created in your working directory. |
| 30 | + |
| 31 | +```r |
| 32 | +struct_template(template='model',output='example.R') |
| 33 | +``` |
| 34 | + |
| 35 | +You can browse all the templates, which are available as vignettes in the StRUCT package, |
| 36 | +using the `browseVignettes` function. Viewing the HTML will show this page, and |
| 37 | +viewing the R code will display a working template. |
| 38 | + |
| 39 | +```r |
| 40 | +browseVignettes('struct') |
| 41 | +``` |
| 42 | + |
| 43 | +***After editing the template*** you can source the script to use your new |
| 44 | +model. You can add the script to your own package, or submit it to us for |
| 45 | +inclusion in the `structtoolbox` package. |
| 46 | + |
| 47 | +```r |
| 48 | +## example use |
| 49 | +# create an instance of your object |
| 50 | +M = model_template() |
| 51 | +# get/set parameters |
| 52 | +M$value_1 = 5 |
| 53 | +M$value_1 # 5 |
| 54 | +# train your model with some data |
| 55 | +M = model.train(M,iris_dataset()) |
| 56 | +# apply your model to some test data |
| 57 | +M = model.predict(M,iris_dataset()) |
| 58 | +``` |
| 59 | + |
| 60 | +You might find it useful to create some chart objects for your new model object. |
| 61 | +See the `chart` template for help to do this. |
| 62 | + |
| 63 | +*** |
| 64 | + |
| 65 | +# Walkthrough |
| 66 | + |
| 67 | +## Class definition |
| 68 | +The model templates are based on inheriting the `model` and (optionally) `stato` |
| 69 | +classes from the StRUCT package. This means all the functionality already created as |
| 70 | +part of struct will be available to the new class. |
| 71 | + |
| 72 | +First, a new class object is set up using the `setClass` function. The |
| 73 | +`setClass` function takes a number of arguments, which are illustrated in |
| 74 | +the following steps. |
| 75 | + |
| 76 | +```r |
| 77 | +model_template=setClass('model_template', # replace model_template (both places) |
| 78 | + # with your new model name |
| 79 | + contains = c('model','stato'), # 'stato' is optional |
| 80 | +``` |
| 81 | + |
| 82 | +The `contains` option tells R which aspects of `struct` to include in the new |
| 83 | +model. |
| 84 | + |
| 85 | +*** |
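
To see what `contains` does in isolation, here is a minimal sketch that uses only base R's `methods` package. The classes `base_model` and `my_model` are hypothetical stand-ins and are not part of `struct`; the sketch only illustrates that a child class inherits the slots of the class it contains.

```r
library(methods)

# hypothetical parent class standing in for struct's `model` (assumption)
setClass('base_model', slots = c(name = 'character', type = 'character'))

# child class: inherits `name` and `type`, and adds one slot of its own
my_model = setClass('my_model',
    contains = 'base_model',
    slots = c('params.value_1' = 'numeric')
)

# inherited and new slots can both be set when creating an instance
M = my_model(name = 'demo', params.value_1 = 5)

is(M, 'base_model')   # TRUE, because a my_model is also a base_model
slotNames(M)          # lists the new slot together with the inherited ones
```

The same mechanism is what makes the functionality of the parent classes available to a new model class.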
| 86 | + |
| 87 | +### Slots for parameters and outputs |
| 88 | + |
| 89 | +After defining the name of your new class and specifying that it contains the |
| 90 | +`model` and (optionally) the `stato` class, the *slots* are defined. Slots are |
| 91 | +variables that are associated with the object that are intended to be accessible |
| 92 | +by the user. It is here that you specify the input parameters and outputs for |
| 93 | +your class. Input parameter names should be prefixed with `params.` and output |
| 94 | +names with `outputs.`. |
| 95 | + |
| 96 | +Default slots like `name`, `description` and `type` are inherited (i.e. provided |
| 97 | +by `struct`) and do not need to be specified. |
| 98 | + |
| 99 | +A type should be indicated for each slot e.g. `numeric`. Some special types |
| 100 | +are available in `struct`, namely `entity` and `enum`, and we recommend you use |
| 101 | +these where possible because they allow you to provide additional information |
| 102 | +(`name`, `description` etc.) about the parameters/outputs. |
| 103 | + |
| 104 | +To use `entity` objects for parameters and/or outputs, specify `entity` |
| 105 | +as the type. If you want to enable STATO functionality for your parameters then |
| 106 | +you can specify `entity.stato` as the type. |
| 107 | + |
| 108 | +```r |
| 109 | + slots=c( |
| 110 | + 'params.value_0'='entity', |
| 111 | + 'params.value_1'='entity.stato', |
| 112 | + 'params.value_2'='numeric', |
| 113 | + 'outputs.result_1'='entity', |
| 114 | + 'outputs.result_2'='numeric' |
| 115 | + ), |
| 116 | +``` |
| 117 | + |
| 118 | +*** |
| 119 | + |
| 120 | +### Prototypes |
| 121 | + |
| 122 | +#### Model prototype |
| 123 | +You can also optionally include a list of prototypes. Prototypes are used to |
| 124 | +define default values for parameters. It is recommended that the `name`, |
| 125 | +`description` and `type` are defined here for your model. |
| 126 | + |
| 127 | +```r |
| 128 | + prototype = list( |
| 129 | + ## These are the default slots available for every struct object |
| 130 | + name='A test model', |
| 131 | + description='An example model object. Training adds value_1 counts to |
| 132 | + a dataset, while prediction adds value_2 counts.', |
| 133 | + type='test', |
| 134 | +``` |
| 135 | + |
| 136 | + |
| 137 | +STATO objects can also have a `stato.id` assigned in the prototype list. It |
| 138 | +must be assigned when the class is defined, as no mechanism is provided for |
| 139 | +the user to change it (other than using `@`, which is discouraged). |
| 140 | + |
| 141 | +STATO names and definitions can then be extracted from the STATO database based |
| 142 | +on the supplied id. To search for a relevant ID see http://stato-ontology.org/ |
| 143 | + |
| 144 | +```r |
| 145 | + ## This slot is only required for stato objects |
| 146 | + stato.id='OBI:0000011', |
| 147 | +``` |
| 148 | + |
| 149 | +*** |
| 150 | + |
| 151 | +#### Parameter and Output prototypes |
| 152 | + |
| 153 | +Setting an input parameter to an `entity` object allows default slots for this |
| 154 | +parameter so that you can be more descriptive about what it does without using |
| 155 | +lengthy slot names. The `type` slot allows you to specify that the entity should |
| 156 | +be of a specific type e.g. 'numeric'. Note that any slot without a prototype |
| 157 | +will be initialised as NULL or a vector of zero length of the appropriate class |
| 158 | +e.g. `numeric(0)`. |
| 159 | + |
| 160 | +```r |
| 161 | + ## parameters all start with params. |
| 162 | + # entities can be initialised with populated slots |
| 163 | + params.value_0=entity(name='Value 0',value=0,type='numeric'), |
| 164 | +``` |
| 165 | + |
| 166 | +*** |
| 167 | + |
| 168 | +A stato.id can be set for `entity.stato` objects enabling STATO integration for |
| 169 | +parameters and outputs as well as model objects. It is a good idea to set |
| 170 | +`name`, `description` and `stato.id` slots here, when the object is defined. |
| 171 | + |
| 172 | +```r |
| 173 | + # entity.stato objects can have a stato.id |
| 174 | + params.value_1=entity.stato(value=10,name='Value 1',type='numeric', |
| 175 | + description='An example entity.stato object', |
| 176 | + stato.id='STATO:0000047'), |
| 177 | +``` |
| 178 | + |
| 179 | +*** |
| 180 | + |
| 181 | +Parameters don't have to be entities, but this is not recommended as there is no |
| 182 | +mechanism for providing long names and descriptions for these parameters. |
| 183 | + |
| 184 | +```r |
| 185 | + params.value_2 = 20, |
| 186 | +``` |
| 187 | + |
| 188 | +*** |
| 189 | + |
| 190 | +Outputs can be included in the prototype, and we recommend `entity` or |
| 191 | +`entity.stato` objects for these too: |
| 192 | + |
| 193 | +```r |
| 194 | + |
| 195 | + outputs.result_1=entity(name='Result 1',type='dataset', |
| 196 | + description='An example entity object'), |
| 197 | + |
| 198 | + outputs.result_2=2 # outputs don't have to be entities |
| 199 | + ) |
| 200 | +) |
| 201 | +``` |
| 202 | + |
| 203 | +*** |
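
The effect of a prototype, and of leaving one out, can be checked with a small base R sketch. The class `proto_demo` and its slot names are hypothetical and are not part of `struct`; the sketch reads slots directly with `@` only because it uses plain S4 rather than the accessors that `struct` provides.

```r
library(methods)

# hypothetical class, not part of struct (assumption for illustration)
proto_demo = setClass('proto_demo',
    slots = c(
        'params.value_1' = 'numeric',
        'params.value_2' = 'numeric'
    ),
    prototype = list(
        'params.value_1' = 10   # default supplied via the prototype
        # params.value_2 deliberately has no prototype entry
    )
)

P = proto_demo()

P@params.value_1           # the default taken from the prototype
length(P@params.value_2)   # zero-length numeric, as no default was given
```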
| 204 | + |
| 205 | +## Defining new methods for your model object |
| 206 | + |
| 207 | +Model objects have a `model.train` and a `model.predict` function predefined by |
| 208 | +`struct`, but by default each prints a warning that the method hasn't been |
| 209 | +implemented yet. You can override the default using the `setMethod` function. |
| 210 | +There are a number of inputs for `setMethod` as described in the following |
| 211 | +steps. |
| 212 | + |
| 213 | +The `signature` option should include your new class as the first element, as |
| 214 | +this will direct R to use the correct `model.train` or `model.predict` function |
| 215 | +for your new model object. |
| 216 | + |
| 217 | +```r |
| 218 | +setMethod(f='model.train', # don't change this line |
| 219 | + signature=c('model_template','dataset'), # replace model_template with... |
| 220 | + # ...your new model name |
| 221 | +``` |
| 222 | + |
| 223 | +*** |
| 224 | + |
| 225 | +The `definition` function is where you can include your own R code to train your |
| 226 | +model. Make sure you `return(M)`, otherwise your model will not behave in the way |
| 227 | +expected by other StRUCT objects. |
| 228 | + |
| 229 | +```r |
| 230 | + definition = function(M,D) { # don't change this line |
| 231 | + # do something here # |
| 232 | + return(M) # make sure you return the model object |
| 233 | + } |
| 234 | +) |
| 235 | + |
| 236 | +``` |
| 237 | + |
| 238 | +In the definition, `M` is the model object (so that you can access parameter values) |
| 239 | +and `D` is the input dataset object containing the data to train the model with. |
| 240 | + |
| 241 | +*** |
| 242 | + |
| 243 | +Similarly, a `model.predict` method is provided so that you can apply your |
| 244 | +(trained) model to test data. |
| 245 | + |
| 246 | +```r |
| 247 | +setMethod(f='model.predict', # don't change this line |
| 248 | + signature=c('model_template','dataset'), # replace model_template with... |
| 249 | + # ...your new model name |
| 250 | + definition = function(M,D) { # don't change this line |
| 251 | + ## do something here ## |
| 252 | + # M is the model object |
| 253 | + return(M) # make sure you return the model |
| 254 | + } |
| 255 | +) |
| 256 | + |
| 257 | +``` |
| 258 | + |
| 259 | +*** |
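
How the `signature` directs R to the right method, and why the model must be returned, can be sketched with base R alone. Everything here is a hypothetical stand-in (the classes `toy_model` and `toy_data`, and the generic `toy_train`); with `struct` the generics already exist, so only the `setMethod` call would be needed.

```r
library(methods)

# hypothetical stand-ins for struct's model and dataset classes (assumption)
setClass('toy_model', slots = c(offset = 'numeric', fitted = 'numeric'))
setClass('toy_data', slots = c(values = 'numeric'))

# a toy generic; struct supplies its own generics, so this step is
# only needed to make the sketch self-contained
setGeneric('toy_train', function(M, D) standardGeneric('toy_train'))

# the method is selected when the arguments match this signature
setMethod(f = 'toy_train',
    signature = c('toy_model', 'toy_data'),
    definition = function(M, D) {
        # store a result in the model, then return the model
        M@fitted = D@values + M@offset
        return(M)
    }
)

M = new('toy_model', offset = 1)
D = new('toy_data', values = c(1, 2, 3))

# reassign the result: S4 objects are copied, not modified in place
M = toy_train(M, D)
M@fitted
```

If the method did not return `M`, the stored result would be lost, because the change is made to a copy inside the function.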
| 260 | + |
| 261 | +# Complete object |
| 262 | + |
| 263 | +The final model object will look something like this working example: |
| 264 | + |
| 265 | +```{r} |
| 266 | +model_template=setClass('model_template', # replace model_template with ... |
| 267 | + # ...your new model name |
| 268 | + contains = c('model','stato'), # stato is optional |
| 269 | + slots=c( # define your parameters and outputs here |
| 270 | + 'params.value_0'='entity', |
| 271 | + 'params.value_1'='entity.stato', |
| 272 | + 'params.value_2'='numeric', |
| 273 | + 'outputs.result_1'='entity', |
| 274 | + 'outputs.result_2'='numeric' |
| 275 | + ), |
| 276 | + prototype = list( # specify default values for your parameters etc |
| 277 | + |
| 278 | + ## These are the default slots available for every struct object |
| 279 | + name='A test model', |
| 280 | + description='An example model object. Training adds value_1 counts to |
| 281 | + a dataset, while prediction adds value_2 counts.', |
| 282 | + type='test', |
| 283 | + |
| 284 | + ## This slot is only required for model.stato objects |
| 285 | + stato.id='OBI:0000011', |
| 286 | + |
| 287 | + ## parameters all start with params. |
| 288 | + # entities can be initialised with populated slots |
| 289 | + params.value_0=entity(name='Value 0',value=0,type='numeric'), |
| 290 | + |
| 291 | + # entity.stato objects can have a stato.id |
| 292 | + params.value_1=entity.stato(value=10,name='Value 1',type='numeric', |
| 293 | + description='An example entity.stato object', |
| 294 | + stato.id='STATO:0000047'), |
| 295 | + |
| 296 | + # params don't have to be entity objects but we don't recommend this. |
| 297 | + params.value_2=20, |
| 298 | + |
| 299 | + # entities can be initialised with populated slots |
| 300 | + outputs.result_1=entity(name='Result 1',type='dataset', |
| 301 | + description='An example entity object'), |
| 302 | + |
| 303 | + # outputs don't have to be entity objects but we don't recommend this. |
| 304 | + outputs.result_2=2 |
| 305 | + ) |
| 306 | +) |
| 307 | + |
| 308 | +# create a model.train method for your object |
| 309 | +setMethod(f='model.train', # don't change this line |
| 310 | + signature=c('model_template','dataset'), # replace model_template with... |
| 311 | + # ...your new model name |
| 312 | + definition = function(M,D) { # don't change this line |
| 313 | + # do something here # |
| 314 | + return(M) # make sure you return the model |
| 315 | + } |
| 316 | +) |
| 317 | + |
| 318 | +# create a model.predict method for your object |
| 319 | +setMethod(f='model.predict', # don't change this line |
| 320 | + signature=c('model_template','dataset'), # replace model_template with... |
| 321 | + # ...your new model name |
| 322 | + definition = function(M,D) { # don't change this line |
| 323 | + ## do something here ## |
| 324 | + return(M) # make sure you return the model |
| 325 | + } |
| 326 | +) |
| 327 | +``` |
| 328 | + |
| 329 | +*** |