|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Moving from BayesFlow 1.0 to 2.0" |
| 8 | + ] |
| 9 | + }, |
| 10 | + { |
| 11 | + "cell_type": "markdown", |
| 12 | + "metadata": {}, |
| 13 | + "source": [ |
| 14 | + "Current users of BayesFlow will notice that many things have changed with the update to 2.0, and this short guide aims to clarify those changes. It follows a structure similar to the previous Quickstart guide, but it assumes that users are already familiar with BayesFlow and therefore omits most of the mathematical explanations in favor of demonstrating the differences in workflow. For a more detailed explanation of the BayesFlow framework, users should read the linear regression example notebook.\n", |
| 15 | + "\n", |
| 16 | + "To avoid confusion, similarly named objects from _bayesflow1.0_ are suffixed with 1.0 where necessary, whereas those from _bayesflow2.0_ are not. Finally, a table summarizing the changes in function calls is provided at the end of the guide." |
| 17 | + ] |
| 18 | + }, |
| 19 | + { |
| 20 | + "cell_type": "markdown", |
| 21 | + "metadata": {}, |
| 22 | + "source": [ |
| 23 | + "## Keras Framework\n", |
| 24 | + "\n", |
| 25 | + "BayesFlow 2.0 looks quite different from 1.0 because the backend has been entirely rebuilt on top of `keras`. Previously BayesFlow was only compatible with `TensorFlow`, but users can now choose their preferred machine learning framework among `TensorFlow`, `JAX`, and `PyTorch`." |
| 26 | + ] |
| 27 | + }, |
| 28 | + { |
| 29 | + "cell_type": "code", |
| 30 | + "execution_count": null, |
| 31 | + "metadata": {}, |
| 32 | + "outputs": [], |
| 33 | + "source": [ |
| 34 | + "import numpy as np\n", |
| 35 | + "\n", |
| 36 | + "# ensure the backend is set\n", |
| 37 | + "import os\n", |
| 38 | + "if \"KERAS_BACKEND\" not in os.environ:\n", |
| 39 | + " # set this to \"torch\", \"tensorflow\", or \"jax\"\n", |
| 40 | + " os.environ[\"KERAS_BACKEND\"] = \"tensorflow\"\n", |
| 41 | + "\n", |
| 42 | + "import keras\n", |
| 43 | + "import bayesflow as bf\n", |
| 44 | + "import pandas as pd" |
| 45 | + ] |
| 46 | + }, |
| 47 | + { |
| 48 | + "cell_type": "markdown", |
| 49 | + "metadata": {}, |
| 50 | + "source": [ |
| 51 | + "This version of BayesFlow also relies much more heavily on dictionaries, since variables are now referred to by name. Many objects expect dictionaries as inputs, and parameters and data are returned as dictionaries as well." |
| 52 | + ] |
| 53 | + }, |
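| 54 | + { |
| 55 | + "cell_type": "markdown", |
| 56 | + "metadata": {}, |
| 57 | + "source": [ |
| 58 | + "For example, a single batch of draws is simply a dictionary mapping variable names to arrays, as in this minimal illustration." |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "code", |
| 63 | + "execution_count": null, |
| 64 | + "metadata": {}, |
| 65 | + "outputs": [], |
| 66 | + "source": [ |
| 67 | + "# every quantity is keyed by name, with the batch dimension first\n", |
| 68 | + "batch = dict(theta=np.random.normal(size=(64, 4)))\n", |
| 69 | + "print(batch[\"theta\"].shape)  # (64, 4)" |
| 70 | + ] |
| 71 | + }, |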
| 54 | + { |
| 55 | + "cell_type": "markdown", |
| 56 | + "metadata": {}, |
| 57 | + "source": [ |
| 58 | + "## Example Workflow \n", |
| 59 | + "\n", |
| 60 | + "### 1. Priors and Likelihood Model\n", |
| 61 | + "\n", |
| 62 | + "Any BayesFlow workflow begins with simulated data, which is specified via a prior and a corresponding likelihood function. While these two core components are still present, their use and naming conventions within the workflow have changed.\n", |
| 63 | + "\n", |
| 64 | + "Previously, users would define a prior function, which would then be used by a `Prior1.0` object to sample prior values. The likelihood would likewise be specified as a function and wrapped in a `Simulator1.0` to produce observations for given prior draws. These were then combined in a `GenerativeModel1.0`.\n", |
| 65 | + "\n", |
| 66 | + "In 2.0 we no longer use the `Prior1.0`, `Simulator1.0`, or `GenerativeModel1.0` objects. Instead, the role of the `GenerativeModel1.0` is taken over by a `simulator`, which can be invoked as a single function that glues the prior and likelihood functions together to generate samples of both the parameters and the observations." |
| 67 | + ] |
| 68 | + }, |
| 69 | + { |
| 70 | + "cell_type": "code", |
| 71 | + "execution_count": null, |
| 72 | + "metadata": {}, |
| 73 | + "outputs": [], |
| 74 | + "source": [ |
| 75 | + "def theta_prior():\n", |
| 76 | + " theta = np.random.normal(size=4)\n", |
| 77 | + " # previously: \n", |
| 78 | + " # return theta \n", |
| 79 | + " return dict(theta=theta) # notice we return a dictionary\n", |
| 80 | + " \n", |
| 81 | + "\n", |
| 82 | + "def likelihood_model(theta, n_obs):\n", |
| 83 | + " x = np.random.normal(loc=theta, size=(n_obs, theta.shape[0]))\n", |
| 84 | + " return dict(x=x)\n" |
| 85 | + ] |
| 86 | + }, |
| 87 | + { |
| 88 | + "cell_type": "markdown", |
| 89 | + "metadata": {}, |
| 90 | + "source": [ |
| 91 | + "Previously, the prior and likelihood were wrapped as follows:" |
| 92 | + ] |
| 93 | + }, |
| 94 | + { |
| 95 | + "cell_type": "code", |
| 96 | + "execution_count": null, |
| 97 | + "metadata": {}, |
| 98 | + "outputs": [], |
| 99 | + "source": [ |
| 100 | + "# Do not run: this is the old 1.0 API\n", |
| 101 | + "prior_1 = bf.simulation.Prior(prior_fun=theta_prior)\n", |
| 102 | + "simulator_1 = bf.simulation.Simulator(simulator_fun=likelihood_model)\n", |
| 103 | + "model_1 = bf.simulation.GenerativeModel(prior=prior_1, simulator=simulator_1)" |
| 104 | + ] |
| 105 | + }, |
| 106 | + { |
| 107 | + "cell_type": "markdown", |
| 108 | + "metadata": {}, |
| 109 | + "source": [ |
| 110 | + "The new framework instead uses the prior and likelihood functions directly in the simulator. We also define a meta function, which lets us set quantities that are shared across a whole batch; here we fix the number of observations `n_obs` to one." |
| 111 | + ] |
| 112 | + }, |
| 113 | + { |
| 114 | + "cell_type": "code", |
| 115 | + "execution_count": null, |
| 116 | + "metadata": {}, |
| 117 | + "outputs": [], |
| 118 | + "source": [ |
| 119 | + "def meta(batch_size):\n", |
| 120 | + " return dict(n_obs=1)\n", |
| 121 | + "\n", |
| 122 | + "simulator = bf.make_simulator([theta_prior, likelihood_model], meta_fn=meta)" |
| 123 | + ] |
| 124 | + }, |
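| 125 | + { |
| 126 | + "cell_type": "markdown", |
| 127 | + "metadata": {}, |
| 128 | + "source": [ |
| 129 | + "If we instead wanted the number of observations to vary between batches, the meta function could draw it at random. The following is a sketch; `random_meta` is a hypothetical name." |
| 130 | + ] |
| 131 | + }, |
| 132 | + { |
| 133 | + "cell_type": "code", |
| 134 | + "execution_count": null, |
| 135 | + "metadata": {}, |
| 136 | + "outputs": [], |
| 137 | + "source": [ |
| 138 | + "def random_meta(batch_size):\n", |
| 139 | + "    # hypothetical: draw a new number of observations for every batch\n", |
| 140 | + "    return dict(n_obs=np.random.randint(1, 101))\n", |
| 141 | + "\n", |
| 142 | + "# it would be passed exactly as before:\n", |
| 143 | + "# simulator = bf.make_simulator([theta_prior, likelihood_model], meta_fn=random_meta)" |
| 144 | + ] |
| 145 | + }, |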
| 125 | + { |
| 126 | + "cell_type": "markdown", |
| 127 | + "metadata": {}, |
| 128 | + "source": [ |
| 129 | + "We can then generate batches of training samples as follows." |
| 130 | + ] |
| 131 | + }, |
| 132 | + { |
| 133 | + "cell_type": "code", |
| 134 | + "execution_count": null, |
| 135 | + "metadata": {}, |
| 136 | + "outputs": [], |
| 137 | + "source": [ |
| 138 | + "sim_draws = simulator.sample(500)\n", |
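| 139 | + "# with n_obs fixed at 1, we expect x: (500, 1, 4) and theta: (500, 4)\n", |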
| 139 | + "print(sim_draws[\"x\"].shape)\n", |
| 140 | + "print(sim_draws[\"theta\"].shape)" |
| 141 | + ] |
| 142 | + }, |
| 143 | + { |
| 144 | + "cell_type": "markdown", |
| 145 | + "metadata": {}, |
| 146 | + "source": [ |
| 147 | + "### 2. Adapter and Data Configuration\n", |
| 148 | + "\n", |
| 149 | + "In _bayesflow2.0_ we now need to specify the data configuration explicitly. In particular, we declare which variables are `summary_variables` (the observations that will be summarized by the summary network), which are `inference_variables` (the parameter draws whose posterior we want to learn), and which are `inference_conditions` (additional conditioning quantities, here the number of observations). Previously this configuration was inferred from the type of network used, but now it is defined explicitly with an `adapter`. This makes the mapping from simulator outputs to network inputs transparent and gives users full control over preprocessing." |
| 150 | + ] |
| 151 | + }, |
| 152 | + { |
| 153 | + "cell_type": "code", |
| 154 | + "execution_count": null, |
| 155 | + "metadata": {}, |
| 156 | + "outputs": [], |
| 157 | + "source": [ |
| 158 | + "adapter = (\n", |
| 159 | + " bf.adapters.Adapter()\n", |
| 160 | + " .to_array()\n", |
| 161 | + " .broadcast(\"n_obs\")\n", |
| 162 | + " .convert_dtype(from_dtype=\"float64\", to_dtype=\"float32\")\n", |
| 163 | + " .standardize(exclude=[\"n_obs\"])\n", |
| 164 | + " .rename(\"x\", \"summary_variables\")\n", |
| 165 | + " .rename(\"theta\", \"inference_variables\")\n", |
| 166 | + " .rename(\"n_obs\", \"inference_conditions\")\n", |
| 167 | + ")" |
| 168 | + ] |
| 169 | + }, |
| 170 | + { |
| 171 | + "cell_type": "markdown", |
| 172 | + "metadata": {}, |
| 173 | + "source": [ |
| 174 | + "In addition, the adapter has built-in functions for transforming data, such as standardization or one-hot encoding. For a full list of the adapter transforms, please see the documentation." |
| 175 | + ] |
| 176 | + }, |
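| 177 | + { |
| 178 | + "cell_type": "markdown", |
| 179 | + "metadata": {}, |
| 180 | + "source": [ |
| 181 | + "As a quick sanity check, the adapter can also be applied to a batch of raw draws directly. This is a minimal sketch assuming the `Adapter` is callable on a simulation dictionary; during training it is applied automatically." |
| 182 | + ] |
| 183 | + }, |
| 184 | + { |
| 185 | + "cell_type": "code", |
| 186 | + "execution_count": null, |
| 187 | + "metadata": {}, |
| 188 | + "outputs": [], |
| 189 | + "source": [ |
| 190 | + "# apply the configured adapter to a small batch of raw draws\n", |
| 191 | + "processed = adapter(simulator.sample(4))\n", |
| 192 | + "print(processed.keys())" |
| 193 | + ] |
| 194 | + }, |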
| 177 | + { |
| 178 | + "cell_type": "markdown", |
| 179 | + "metadata": {}, |
| 180 | + "source": [ |
| 181 | + "### 3. Summary Network and Inference Network" |
| 182 | + ] |
| 183 | + }, |
| 184 | + { |
| 185 | + "cell_type": "markdown", |
| 186 | + "metadata": {}, |
| 187 | + "source": [ |
| 188 | + "As in _bayesflow1.0_ we still use a summary network, which is still a `DeepSet` model. Nothing has changed in this step of the workflow." |
| 189 | + ] |
| 190 | + }, |
| 191 | + { |
| 192 | + "cell_type": "code", |
| 193 | + "execution_count": null, |
| 194 | + "metadata": {}, |
| 195 | + "outputs": [], |
| 196 | + "source": [ |
| 197 | + "summary_net = bf.networks.DeepSet(depth=2, summary_dim=10)" |
| 198 | + ] |
| 199 | + }, |
| 200 | + { |
| 201 | + "cell_type": "markdown", |
| 202 | + "metadata": {}, |
| 203 | + "source": [ |
| 204 | + "For the inference network there are now several implemented architectures for users to choose from: `FlowMatching`, `ConsistencyModel`, `ContinuousConsistencyModel`, and `CouplingFlow`. For this demonstration we use `FlowMatching`; for further explanation of the different models, please see the other examples and the documentation." |
| 205 | + ] |
| 206 | + }, |
| 207 | + { |
| 208 | + "cell_type": "code", |
| 209 | + "execution_count": null, |
| 210 | + "metadata": {}, |
| 211 | + "outputs": [], |
| 212 | + "source": [ |
| 213 | + "inference_net = bf.networks.FlowMatching()" |
| 214 | + ] |
| 215 | + }, |
| 216 | + { |
| 217 | + "cell_type": "markdown", |
| 218 | + "metadata": {}, |
| 219 | + "source": [ |
| 220 | + "### 4. Approximator (Amortized Posterior)" |
| 221 | + ] |
| 222 | + }, |
| 223 | + { |
| 224 | + "cell_type": "markdown", |
| 225 | + "metadata": {}, |
| 226 | + "source": [ |
| 227 | + "Previously, training and amortization were handled in two steps by two different objects, the `AmortizedPosterior1.0` and the `Trainer1.0`. First, users would create an amortizer containing the summary and inference networks." |
| 228 | + ] |
| 229 | + }, |
| 230 | + { |
| 231 | + "cell_type": "code", |
| 232 | + "execution_count": null, |
| 233 | + "metadata": {}, |
| 234 | + "outputs": [], |
| 235 | + "source": [ |
| 236 | + "# Do not run: this is the old 1.0 API\n", |
| 237 | + "\n", |
| 238 | + "# Renamed to Approximator\n", |
| 239 | + "amortizer = bf.amortizers.AmortizedPosterior(inference_net, summary_net)\n", |
| 240 | + "\n", |
| 241 | + "# Defunct\n", |
| 242 | + "trainer = bf.trainers.Trainer(amortizer=amortizer, generative_model=gen_model)" |
| 243 | + ] |
| 244 | + }, |
| 245 | + { |
| 246 | + "cell_type": "markdown", |
| 247 | + "metadata": {}, |
| 248 | + "source": [ |
| 249 | + "The amortizer has been renamed to the `Approximator`, and it takes the summary network, the inference network, and the data adapter as arguments." |
| 250 | + ] |
| 251 | + }, |
| 252 | + { |
| 253 | + "cell_type": "code", |
| 254 | + "execution_count": null, |
| 255 | + "metadata": {}, |
| 256 | + "outputs": [], |
| 257 | + "source": [ |
| 258 | + "approximator = bf.approximators.ContinuousApproximator(\n", |
| 259 | + " summary_network=summary_net,\n", |
| 260 | + " inference_network=inference_net,\n", |
| 261 | + " adapter=adapter\n", |
| 262 | + ")" |
| 263 | + ] |
| 264 | + }, |
| 265 | + { |
| 266 | + "cell_type": "markdown", |
| 267 | + "metadata": {}, |
| 268 | + "source": [ |
| 269 | + "Whereas training previously required a `Trainer1.0` object, users now call `fit` on the `approximator` directly. For additional flexibility, users can configure their own optimizer, including its learning rate; any Keras optimizer can be used." |
| 270 | + ] |
| 271 | + }, |
| 272 | + { |
| 273 | + "cell_type": "code", |
| 274 | + "execution_count": null, |
| 275 | + "metadata": {}, |
| 276 | + "outputs": [], |
| 277 | + "source": [ |
| 278 | + "learning_rate = 1e-4\n", |
| 279 | + "optimizer = keras.optimizers.AdamW(learning_rate=learning_rate, clipnorm=1.0)" |
| 280 | + ] |
| 281 | + }, |
| 282 | + { |
| 283 | + "cell_type": "markdown", |
| 284 | + "metadata": {}, |
| 285 | + "source": [ |
| 286 | + "Users must then compile the `approximator`, as with any Keras model, in order to attach the optimizer to the networks before training." |
| 287 | + ] |
| 288 | + }, |
| 289 | + { |
| 290 | + "cell_type": "code", |
| 291 | + "execution_count": null, |
| 292 | + "metadata": {}, |
| 293 | + "outputs": [], |
| 294 | + "source": [ |
| 295 | + "approximator.compile(optimizer=optimizer)" |
| 296 | + ] |
| 297 | + }, |
| 298 | + { |
| 299 | + "cell_type": "markdown", |
| 300 | + "metadata": {}, |
| 301 | + "source": [ |
| 302 | + "\n", |
| 303 | + "To train the networks and record the training history, users now only need to call `fit` on the `approximator`." |
| 304 | + ] |
| 305 | + }, |
| 306 | + { |
| 307 | + "cell_type": "code", |
| 308 | + "execution_count": null, |
| 309 | + "metadata": {}, |
| 310 | + "outputs": [], |
| 311 | + "source": [ |
| 312 | + "history = approximator.fit(\n", |
| 313 | + " epochs=50,\n", |
| 314 | + " num_batches=200,\n", |
| 315 | + " batch_size=64,\n", |
| 316 | + " simulator=simulator\n", |
| 317 | + ")" |
| 318 | + ] |
| 319 | + }, |
| 320 | + { |
| 321 | + "cell_type": "markdown", |
| 322 | + "metadata": {}, |
| 323 | + "source": [ |
| 324 | + "### 5. Diagnostics\n", |
| 325 | + "The model diagnostics have also changed. Much of the functionality remains the same, but the naming conventions differ: previously users would plot losses with\n", |
| 326 | + "`bf.diagnostics.plot_losses()`, whereas in 2.0 all plotting functions are grouped together in `bf.diagnostics.plots`, so the corresponding function is `bf.diagnostics.plots.loss()`." |
| 327 | + ] |
| 328 | + }, |
| 329 | + { |
| 330 | + "cell_type": "code", |
| 331 | + "execution_count": null, |
| 332 | + "metadata": {}, |
| 333 | + "outputs": [], |
| 334 | + "source": [ |
| 335 | + "f = bf.diagnostics.plots.loss(\n", |
| 336 | + " train_losses=history.history['loss']\n", |
| 337 | + ")" |
| 338 | + ] |
| 339 | + }, |
| 340 | + { |
| 341 | + "cell_type": "markdown", |
| 342 | + "metadata": {}, |
| 343 | + "source": [ |
| 344 | + "The plotting functions were regrouped because we have also added diagnostic metrics, such as calibration error, posterior contraction, and root mean squared error. These functions can accordingly be found in `bf.diagnostics.metrics`; for more information, please see the API documentation." |
| 345 | + ] |
| 346 | + }, |
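| 347 | + { |
| 348 | + "cell_type": "markdown", |
| 349 | + "metadata": {}, |
| 350 | + "source": [ |
| 351 | + "For instance, the new metrics can be computed on held-out simulations. The following is a minimal sketch: we assume that `approximator.sample` accepts a raw conditions dictionary and that the metrics take `estimates` and `targets` arrays, so please check the API for the exact signatures." |
| 352 | + ] |
| 353 | + }, |
| 354 | + { |
| 355 | + "cell_type": "code", |
| 356 | + "execution_count": null, |
| 357 | + "metadata": {}, |
| 358 | + "outputs": [], |
| 359 | + "source": [ |
| 360 | + "# simulate a held-out validation set\n", |
| 361 | + "val_sims = simulator.sample(100)\n", |
| 362 | + "\n", |
| 363 | + "# condition only on the data, not on the parameters we want to infer\n", |
| 364 | + "conditions = {k: v for k, v in val_sims.items() if k != \"theta\"}\n", |
| 365 | + "post_draws = approximator.sample(conditions=conditions, num_samples=500)\n", |
| 366 | + "\n", |
| 367 | + "# contraction of the learned posterior relative to the prior\n", |
| 368 | + "contraction = bf.diagnostics.metrics.posterior_contraction(\n", |
| 369 | + "    estimates=post_draws[\"theta\"], targets=val_sims[\"theta\"]\n", |
| 370 | + ")\n", |
| 371 | + "print(contraction)" |
| 372 | + ] |
| 373 | + }, |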
| 359 | + { |
| 360 | + "cell_type": "markdown", |
| 361 | + "metadata": {}, |
| 362 | + "source": [ |
| 363 | + "## Summary Change Table\n", |
| 364 | + "\n", |
| 365 | + "| 1.0 | 2.0 Usage |\n", |
| 366 | + "| :--------| :---------|\n", |
| 367 | + "| `Prior`, `Simulator` | Defunct; no longer standalone objects, their functionality is incorporated into the `simulator` |\n", |
| 368 | + "| `GenerativeModel` | Defunct; its functionality has been taken over by `simulations.make_simulator` |\n", |
| 369 | + "| `training.configurator` | Functionality taken over by the `Adapter` |\n", |
| 370 | + "| `Trainer` | Functionality taken over by the `fit` method of the `Approximator` |\n", |
| 371 | + "| `AmortizedPosterior` | Renamed to `Approximator` |" |
| 372 | + ] |
| 373 | + } |
| 381 | + ], |
| 382 | + "metadata": { |
| 383 | + "language_info": { |
| 384 | + "name": "python" |
| 385 | + } |
| 386 | + }, |
| 387 | + "nbformat": 4, |
| 388 | + "nbformat_minor": 2 |
| 389 | +} |