- Update the Spark version compatibility and release info for Release 1.0.0
- Add an example of hyperparameter tuning with KerasImageFileEstimator
- Update link to Databricks notebooks (for Release 1.0.0)
To work with the latest code, Spark 2.3.0 is required, and Python 3.6 and Scala 2.11 are recommended. See the [travis config](https://github.com/databricks/spark-deep-learning/blob/master/.travis.yml) for the regularly-tested combinations.
Compatibility requirements for each release are listed in the [Releases](#releases) section.
You can also post bug reports and feature requests in Github issues.
## Releases
- [Distributed hyperparameter tuning](#distributed-hyperparameter-tuning): via Spark MLlib Pipelines
- [Applying deep learning models at scale - to images](#applying-deep-learning-models-at-scale): apply your own or known popular models to make predictions or transform them into features
- [Applying deep learning models at scale - to tensors](#applying-deep-learning-models-at-scale-to-tensors): of up to 2 dimensions
- [Deploying models as SQL functions](#deploying-models-as-sql-functions): empower everyone by making deep learning available in SQL
To try running the examples below, check out the Databricks notebook in the [Databricks docs for Deep Learning Pipelines](https://docs.databricks.com/applications/deep-learning/deep-learning-pipelines.html), which works with the latest release of Deep Learning Pipelines. Here are some Databricks notebooks compatible with earlier releases:
The first step to applying deep learning on images is the ability to load the images. Spark and Deep Learning Pipelines include utility functions that can load millions of images into a Spark DataFrame and decode them automatically in a distributed fashion, allowing manipulation at scale.
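The core idea behind those utilities — list the image files, then decode them in parallel across workers — can be sketched in plain Python. This is a toy local stand-in, not the actual Spark API (Spark distributes the work over a cluster and returns a DataFrame); the `decode` function here is a hypothetical stub for real image decoding.

```python
# Toy sketch of distributed image loading: list files, then decode them in
# parallel across workers. Spark does this over a cluster into a DataFrame;
# here a local thread pool stands in, and decode() is a hypothetical stub.
from concurrent.futures import ThreadPoolExecutor

def decode(path):
    # Stand-in for real image decoding (e.g. with PIL): returns a row
    # holding the image URI and its (fake) pixel data.
    return {"uri": path, "pixels": [0] * 4}

def read_images(paths, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, like a DataFrame built from the listing.
        return list(pool.map(decode, paths))

rows = read_images(["img_%d.png" % i for i in range(3)])
```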
```python
print("Training set accuracy = " + str(evaluator.evaluate(predictionAndLabels)))
```
### Distributed hyperparameter tuning
Getting the best results in deep learning requires experimenting with different values for training parameters, an important step called hyperparameter tuning. Since Deep Learning Pipelines enables exposing deep learning training as a step in Spark’s machine learning pipelines, users can rely on the hyperparameter tuning infrastructure already built into Spark MLlib.
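The idea behind that tuning infrastructure is simple grid search: train one model per combination of hyperparameter values and keep the best. Here is a toy sketch in plain Python — not the actual MLlib API (`ParamGridBuilder`/`CrossValidator`), and the objective function is a hypothetical stand-in for "train a model and return its score".

```python
# Toy grid-search sketch: try every hyperparameter combination, keep the best.
# A plain-Python stand-in for Spark MLlib's ParamGridBuilder + CrossValidator.
from itertools import product

def tune(train_fn, param_grid):
    """Evaluate train_fn on every combination in param_grid; return the best."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = train_fn(**params)  # train + evaluate one model
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical objective standing in for "train a model, return accuracy":
# peaks at lr=0.01, batch_size=32.
best, score = tune(
    lambda lr, batch_size: -abs(lr - 0.01) - abs(batch_size - 32) / 100,
    {"lr": [0.1, 0.01], "batch_size": [16, 32]},
)
```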
#### For Keras users
To perform hyperparameter tuning with a Keras model, you can use `KerasImageFileEstimator` to build an Estimator and tune the hyperparameters with MLlib's tooling (e.g. `CrossValidator`). `KerasImageFileEstimator` works with image URI columns (not ImageSchema columns) in order to allow for the custom image loading and processing functions often used with Keras.
To build the estimator with `KerasImageFileEstimator`, we need to have a Keras model stored as a file. The model can be a Keras built-in model or a user-trained model.
```python
from keras.applications import InceptionV3

model = InceptionV3(weights="imagenet")
model.save('/tmp/model-full.h5')
```
We also need to create an image loading function that reads the image data from a URI, preprocesses it, and returns the numerical tensor in the Keras model input format. Then, we can create a `KerasImageFileEstimator` that takes our saved model file.
```python
import PIL.Image
import numpy as np
from keras.applications.imagenet_utils import preprocess_input
from sparkdl.estimators.keras_image_file_estimator import KerasImageFileEstimator

def load_image_from_uri(local_uri):
    # Load the image, resize to the InceptionV3 input size (299x299), and
    # preprocess into the numerical tensor format the Keras model expects.
    img = PIL.Image.open(local_uri).convert('RGB').resize((299, 299))
    return preprocess_input(np.array(img).astype(np.float32))
```
## Applying deep learning models at scale

Spark DataFrames are a natural construct for applying deep learning models to a large-scale dataset. Deep Learning Pipelines provides a set of Spark MLlib Transformers for applying TensorFlow Graphs and TensorFlow-backed Keras Models at scale. The Transformers, backed by the Tensorframes library, efficiently handle the distribution of models and data to Spark workers.
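The MLlib Transformer pattern these classes follow can be sketched in plain Python — a toy stand-in, not the actual sparkdl implementation: a transformer wraps a fitted model and maps it over every row of the dataset, appending an output column.

```python
# Toy illustration of the Transformer pattern used by sparkdl's classes:
# wrap a model, map it over a dataset, and append an output column.
# Plain-Python stand-in; in Spark, transform() operates on a distributed
# DataFrame rather than a list of dicts.
class ModelTransformer:
    def __init__(self, model_fn, input_col, output_col):
        self.model_fn = model_fn
        self.input_col = input_col
        self.output_col = output_col

    def transform(self, rows):
        # Apply the model to each row's input column; keep existing columns.
        return [{**row, self.output_col: self.model_fn(row[self.input_col])}
                for row in rows]

t = ModelTransformer(model_fn=lambda x: x * 2,
                     input_col="features", output_col="prediction")
out = t.transform([{"features": 1}, {"features": 3}])
```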
For applying Keras models in a distributed manner using Spark, `KerasImageFileTransformer` works on TensorFlow-backed Keras models.
The difference in the API from `TFImageTransformer` above stems from the fact that usual Keras workflows have very specific ways to load and resize images that are not part of the TensorFlow Graph.
To use the transformer, we first need to have a Keras model stored as a file. We can just save the Keras built-in InceptionV3 model instead of training one.