add: Keras integration Jupyter notebook

cosmic-cortex · cosmic-cortex · commit e4cb37a1d623 · 2018-09-29T16:44:22.000+02:00
diff --git a/docs/source/content/examples/Keras-integration.rst b/docs/source/content/examples/Keras-integration.rst
diff --git a/docs/source/content/examples/Keras_integration.ipynb b/docs/source/content/examples/Keras_integration.ipynb
@@ -0,0 +1,241 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Keras models in modAL workflows\n",
+    "=============================\n",
+    "\n",
+    "Thanks to the scikit-learn API of Keras, you can seamlessly integrate Keras models into your modAL workflow. This tutorial briefly introduces the scikit-learn API of Keras and then shows how to do active learning with it. More details on the Keras scikit-learn API [can be found here](https://keras.io/scikit-learn-api/).\n",
+    "\n",
+    "The executable script for this example can be [found here](https://github.com/cosmic-cortex/modAL/blob/master/examples/keras_integration.py)!"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Keras' scikit-learn API\n",
+    "-----------------------\n",
+    "\n",
+    "By default, a Keras model's interface differs from what is used for scikit-learn estimators. However, with the use of its scikit-learn wrapper, it is possible to adapt your model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/home/namazu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
+      "  from ._conv import register_converters as _register_converters\n",
+      "Using TensorFlow backend.\n"
+     ]
+    }
+   ],
+   "source": [
+    "import keras\n",
+    "from keras.models import Sequential\n",
+    "from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D\n",
+    "from keras.wrappers.scikit_learn import KerasClassifier\n",
+    "\n",
+    "# build function for Keras' scikit-learn API\n",
+    "def create_keras_model():\n",
+    "    \"\"\"\n",
+    "    This function compiles and returns a Keras model.\n",
+    "    Should be passed to KerasClassifier in the Keras scikit-learn API.\n",
+    "    \"\"\"\n",
+    "\n",
+    "    model = Sequential()\n",
+    "    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))\n",
+    "    model.add(Conv2D(64, (3, 3), activation='relu'))\n",
+    "    model.add(MaxPooling2D(pool_size=(2, 2)))\n",
+    "    model.add(Dropout(0.25))\n",
+    "    model.add(Flatten())\n",
+    "    model.add(Dense(128, activation='relu'))\n",
+    "    model.add(Dropout(0.5))\n",
+    "    model.add(Dense(10, activation='softmax'))\n",
+    "\n",
+    "    model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])\n",
+    "\n",
+    "    return model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For our purposes, the ``classifier`` which we will initialize now acts just like any scikit-learn estimator."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# create the classifier\n",
+    "classifier = KerasClassifier(create_keras_model)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Active learning with Keras\n",
+    "---------------------------------------\n",
+    "\n",
+    "In this example, we are going to use the famous MNIST dataset, which ships with Keras as a built-in dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "from keras.datasets import mnist\n",
+    "\n",
+    "# read training data\n",
+    "(X_train, y_train), (X_test, y_test) = mnist.load_data()\n",
+    "X_train = X_train.reshape(60000, 28, 28, 1).astype('float32') / 255\n",
+    "X_test = X_test.reshape(10000, 28, 28, 1).astype('float32') / 255\n",
+    "y_train = keras.utils.to_categorical(y_train, 10)\n",
+    "y_test = keras.utils.to_categorical(y_test, 10)\n",
+    "\n",
+    "# assemble initial data\n",
+    "n_initial = 1000\n",
+    "initial_idx = np.random.choice(range(len(X_train)), size=n_initial, replace=False)\n",
+    "X_initial = X_train[initial_idx]\n",
+    "y_initial = y_train[initial_idx]\n",
+    "\n",
+    "# generate the pool\n",
+    "# remove the initial data from the training dataset\n",
+    "X_pool = np.delete(X_train, initial_idx, axis=0)[:5000]\n",
+    "y_pool = np.delete(y_train, initial_idx, axis=0)[:5000]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "With the data and the classifier ready, active learning is as easy as always. Because training is *very* expensive in large neural networks, this time we are going to query the best 100 instances each time we measure the uncertainty of the pool."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 1/1\n",
+      "1000/1000 [==============================] - 4s 4ms/step - loss: 1.5794 - acc: 0.4790\n"
+     ]
+    }
+   ],
+   "source": [
+    "from modAL.models import ActiveLearner\n",
+    "\n",
+    "# initialize ActiveLearner\n",
+    "learner = ActiveLearner(\n",
+    "    estimator=classifier,\n",
+    "    X_training=X_initial, y_training=y_initial,\n",
+    "    verbose=1\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To make sure that you train only on newly queried labels, pass ``only_new=True`` to the ``.teach()`` method of the learner."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Query no. 1\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 10ms/step - loss: 2.0987 - acc: 0.3300\n",
+      "Query no. 2\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 7ms/step - loss: 2.1222 - acc: 0.3300\n",
+      "Query no. 3\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 8ms/step - loss: 2.0558 - acc: 0.4900\n",
+      "Query no. 4\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 9ms/step - loss: 1.6943 - acc: 0.4700\n",
+      "Query no. 5\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 12ms/step - loss: 1.5865 - acc: 0.6200\n",
+      "Query no. 6\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 14ms/step - loss: 1.8714 - acc: 0.3500\n",
+      "Query no. 7\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 14ms/step - loss: 1.3940 - acc: 0.6700\n",
+      "Query no. 8\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 14ms/step - loss: 2.1033 - acc: 0.3200\n",
+      "Query no. 9\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 11ms/step - loss: 1.5666 - acc: 0.6700\n",
+      "Query no. 10\n",
+      "Epoch 1/1\n",
+      "100/100 [==============================] - 1s 12ms/step - loss: 2.0238 - acc: 0.2700\n"
+     ]
+    }
+   ],
+   "source": [
+    "# the active learning loop\n",
+    "n_queries = 10\n",
+    "for idx in range(n_queries):\n",
+    "    print('Query no. %d' % (idx + 1))\n",
+    "    query_idx, query_instance = learner.query(X_pool, n_instances=100, verbose=0)\n",
+    "    learner.teach(\n",
+    "        X=X_pool[query_idx], y=y_pool[query_idx], only_new=True,\n",
+    "        verbose=1\n",
+    "    )\n",
+    "    # remove the queried instances from the pool\n",
+    "    X_pool = np.delete(X_pool, query_idx, axis=0)\n",
+    "    y_pool = np.delete(y_pool, query_idx, axis=0)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -42,7 +42,7 @@ modAL is an active learning framework for Python3, designed with *modularity, fl
    content/examples/bayesian_optimization
    content/examples/query_by_committee
    content/examples/bootstrapping_and_bagging
-   content/examples/Keras-integration
+   content/examples/Keras_integration
    
 .. toctree::
    :glob: