- # ElasticDL Model Building
+ # ElasticDL Model Contribution

  To submit an ElasticDL job, a user needs to provide a model file, such as
  [`mnist_functional_api.py`](https://github.com/sql-machine-learning/elasticdl/blob/develop/model_zoo/mnist_functional_api/mnist_functional_api.py)
@@ -71,7 +71,7 @@ model = MnistModel()
  ### dataset_fn

  ```python
- dataset_fn(dataset, training)
+ dataset_fn(dataset, mode)
  ```

  `dataset_fn` is a function that takes a RecordIO `dataset` as input,
@@ -128,23 +128,23 @@ def dataset_fn(dataset, mode):
  ### loss

  ```python
- loss(labels, output)
+ loss(labels, predictions)
  ```

  `loss` is the loss function used in ElasticDL training.

  Arguments:

  - labels: `labels` from [`dataset_fn`](#dataset_fn).
- - output: [model](#model)'s output.
+ - predictions: [model](#model)'s output.

  Example:

  ```python
- def loss(labels, output):
+ def loss(labels, predictions):
      return tf.reduce_mean(
          input_tensor=tf.nn.sparse_softmax_cross_entropy_with_logits(
-             logits=output, labels=labels.flatten()
+             logits=predictions, labels=labels.flatten()
          )
      )
  ```
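As a reading aid for the `loss` hunk above: in this example the rename from `output` to `predictions` changes only the parameter name, not the computation. Below is a minimal pure-Python sketch of what the example computes (the mean sparse softmax cross-entropy over a minibatch), assuming `predictions` is a list of per-class logit rows and `labels` is a list of integer class ids. It uses `math` rather than TensorFlow so that it runs standalone; the helper `sparse_softmax_cross_entropy` is a hypothetical stand-in for `tf.nn.sparse_softmax_cross_entropy_with_logits`, not the library function itself.

```python
import math


def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross entropy between integer labels and raw logits.

    Hypothetical stand-in for the TensorFlow op, for illustration only.
    """
    losses = []
    for row, label in zip(logits, labels):
        # Subtract the row maximum before exponentiating for numerical stability.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        # -log(softmax(row)[label]) == logsumexp(row) - row[label]
        losses.append(log_sum_exp - row[label])
    return losses


def loss(labels, predictions):
    """Mean cross entropy over the minibatch, mirroring the doc example."""
    per_example = sparse_softmax_cross_entropy(predictions, labels)
    return sum(per_example) / len(per_example)


# Two classes with equal logits: each example contributes log(2).
uniform_loss = loss([0, 1], [[0.0, 0.0], [0.0, 0.0]])
```

With equal logits across two classes the softmax assigns probability 1/2 to each class, so the mean loss equals `math.log(2)` regardless of the labels.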
@@ -179,59 +179,15 @@ TensorFlow API.
  Example:

  ```python
- def eval_metrics_fn(predictions, labels):
+ def eval_metrics_fn():
      return {
-         "accuracy": tf.reduce_mean(
-             input_tensor=tf.cast(
-                 tf.equal(
-                     tf.argmax(input=predictions, axis=1), labels.flatten()
-                 ),
-                 tf.float32,
-             )
+         "accuracy": lambda labels, predictions: tf.equal(
+             tf.argmax(predictions, 1, output_type=tf.int32),
+             tf.cast(tf.reshape(labels, [-1]), tf.int32),
          )
      }
  ```

- ### prepare_data_for_a_single_file
-
- ```python
- prepare_data_for_a_single_file(filename)
- ```
-
- `prepare_data_for_a_single_file` is to read a single file and do whatever
- user-defined logic to prepare the data (e.g, IO from the user's file system,
- feature engineering), and return the serialized data. The function can be used
- to process data for training, evaluation and prediction. The only difference
- between prediction data with training/evaluation data is that the 'label' in
- prediction data should be empty. Users should be able to determine if the data
- file contains label (e.g, via the different formats of filename) and implement
- the logic to prepare the data accordingly.
-
- Example:
-
- ```python
- def prepare_data_for_a_single_file(filename):
-     '''
-     An image classification dataset that images belonging to the same category
-     located in the same directory.
-     '''
-     label = int(filename.split('/')[-2])
-     image = PIL.Image.open(filename)
-     numpy_image = np.array(image)
-     example_dict = {
-         "image": tf.train.Feature(
-             float_list=tf.train.FloatList(value=numpy_image.flatten())
-         ),
-         "label": tf.train.Feature(
-             int64_list=tf.train.Int64List(value=[label])
-         ),
-     }
-     example = tf.train.Example(
-         features=tf.train.Features(feature=example_dict)
-     )
-     return example.SerializeToString()
- ```
-
  ## Model Building Examples

  - [MNIST model using Keras functional API](https://github.com/sql-machine-learning/elasticdl/blob/develop/model_zoo/mnist_functional_api/mnist_functional_api.py)
@@ -242,86 +198,4 @@ def prepare_data_for_a_single_file(filename):

  - [CIFAR10 model using Keras modelsubclassing](https://github.com/sql-machine-learning/elasticdl/blob/develop/model_zoo/cifar10_subclass/cifar10_subclass.py)

- ## Run and Debug Locally in VS Code
-
- It is more convenient to locally run and debug the defined model than
- submitting a job with the model to k8s cluster. The following example shows how
- to run and debug
- the DNN model using iris dataset.
-
- ### Locally Run
-
- The command to locally run the DNN model using iris dataset saved in a CSV file.
-
- ```shell
- python -m elasticdl.python.elasticdl.client train \
-   --model_zoo=/{REPO_DIR}/elasticdl/model_zoo \
-   --model_def=odps_iris_dnn_model.odps_iris_dnn_model.custom_model \
-   --training_data=/{DATA_DIR}/iris.csv \
-   --validation_data=/{DATA_DIR}/iris.csv \
-   --data_reader_params="columns=['sepal.length', 'sepal.width', \
-   'petal.length', 'petal.width', 'variety']; sep=','" \
-   --num_epochs=2 \
-   --minibatch_size=64 \
-   --num_minibatches_per_task=20 \
-   --distribution_strategy=Local \
-   --job_name=test-odps-iris \
-   --evaluation_steps=20 \
-   --output=iris_dnn_model
- ```
-
- ### Debug Model in VS Code
-
- We can add the command to the configurations in the `launch.json` file to debug
- the model in VS Code. The
- [tutorial](https://code.visualstudio.com/docs/python/debugging) show how to
- configure the `launch.json` file. For example, the configuration to debug the
- DNN model is
-
- ```json
- {
-     // Use IntelliSense to learn about possible attributes.
-     // Hover to view descriptions of existing attributes.
-     // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
-     "version": "0.2.0",
-     "configurations": [
-         {
-             "name": "Python: Current File",
-             "type": "python",
-             "request": "launch",
-             "program": "${file}",
-             "console": "integratedTerminal",
-             "module": "elasticdl.python.elasticdl.client",
-             "args": ["train",
-                 "--model_zoo",
-                 "/{REPO_DIR}/elasticdl/model_zoo",
-                 "--model_def",
-                 "odps_iris_dnn_model.odps_iris_dnn_model.custom_model",
-                 "--training_data",
-                 "/{DATA_DIR}/iris.csv",
-                 "--num_epochs",
-                 "2",
-                 "--minibatch_size",
-                 "64",
-                 "--num_minibatches_per_task",
-                 "20",
-                 "--distribution_strategy",
-                 "Local",
-                 "--num_workers",
-                 "2",
-                 "--checkpoint_steps",
-                 "10",
-                 "--evaluation_steps",
-                 "20",
-                 "--job_name",
-                 "test-odps-iris",
-                 "--data_reader_params",
-                 "columns=['sepal.length',
-                 'sepal.width',
-                 'petal.length',
-                 'petal.width',
-                 'variety']; sep=','"
-             ]
-         }
-     ]
- }
+ - [Preprocess structured data for Keras model](https://github.com/sql-machine-learning/elasticdl/blob/develop/docs/tutorials/preprocessing_tutorial.md)
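
A note on the `eval_metrics_fn` change in the diff above: the new form takes no arguments and returns a dict that maps each metric name to a callable of `(labels, predictions)`, and the `accuracy` callable yields a per-example match rather than an already reduced mean. The following is a minimal sketch of how such a dict could be consumed, written in plain Python so that it runs without TensorFlow. The `evaluate` helper and the list-based inputs are assumptions made for illustration; how ElasticDL itself aggregates these values is not shown in the diff.

```python
def eval_metrics_fn():
    # Same shape as the new doc example: metric name -> callable.
    # Plain-Python analogue of tf.equal(tf.argmax(predictions, 1), labels).
    return {
        "accuracy": lambda labels, predictions: [
            int(max(range(len(row)), key=row.__getitem__) == label)
            for label, row in zip(labels, predictions)
        ]
    }


def evaluate(metrics, batches):
    """Hypothetical aggregator: average per-example values over all batches."""
    totals = {name: 0 for name in metrics}
    count = 0
    for labels, predictions in batches:
        for name, metric_fn in metrics.items():
            totals[name] += sum(metric_fn(labels, predictions))
        count += len(labels)
    return {name: total / count for name, total in totals.items()}


# Two minibatches: the first has two correct predictions, the second has one miss.
batches = [
    ([0, 1], [[0.9, 0.1], [0.2, 0.8]]),
    ([1], [[0.7, 0.3]]),
]
result = evaluate(eval_metrics_fn(), batches)
```

Returning per-example values leaves the reduction to the caller, so results from minibatches of different sizes can be combined without weighting errors.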