Performance on a fairly simple 1D model with stacked branches, bottlenecks, and inception blocks was surprisingly poor (>300x slower).
Keras on CPU (TensorFlow Lite runs the same model in 170 µs)
Using the 1 CPU/Thread trick
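import numpy as np
import tensorflow as tf
import keras
from keras.models import model_from_json

# Restrict TensorFlow (1.x API) to a single thread
session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1)
sess = tf.Session(config=session_conf)
keras.backend.set_session(sess)
# json_blob holds the Keras model architecture as a JSON string
model = model_from_json(json_blob)
# Separate notebook cell: %%time must be the first line of its cell
%%time
model.predict(np.zeros((1, 600, 3)));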
CPU times: user 88.1 ms, sys: 35.1 ms, total: 123 ms
Wall time: 25.1 ms
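A single `%%time` run can include one-time setup cost from the first `predict` call, so a warmed-up, repeated measurement may give a fairer baseline. A minimal sketch, assuming a hypothetical helper `time_call` (not part of Keras or TensorFlow) and a stand-in workload so it runs without the model:

```python
import statistics
import time


def time_call(fn, warmup=3, repeats=20):
    """Return (median, minimum) wall time in seconds for calling fn()."""
    for _ in range(warmup):
        fn()  # discard warm-up runs (lazy initialization, caches)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples)


# Stand-in workload (an assumption, only here to make the sketch runnable).
median_s, best_s = time_call(lambda: sum(range(10_000)))
print(f"median {median_s * 1e6:.1f} us, best {best_s * 1e6:.1f} us")
```

With the model above loaded, the stand-in could be swapped for `lambda: model.predict(np.zeros((1, 600, 3)))`.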
Frugally Deep App
I realize the loop here includes loading and parsing the model, but that is a small fraction of the compute time (as can be seen by changing the length of the sequence).
!time ../build/main
Loading model
Loading json ... done. elapsed time: 0.192233 s
Running test 1 of 1 ... done. elapsed time: 1.472293 s
Loading, constructing, testing of ../model.json took 3.309670 s overall.
Running all zeros:
real 0m31.150s
user 0m30.712s
sys 0m0.245s
Model Details
Model as Keras H5, FrugallyDeep JSON, and C++ file for inference: https://gist.github.com/kmader/135db41c5ea35c0dc8cae95ed90087f4
