Poor performance with big dilation rates (was "Performance on Branching Models" before) #132

@kmader

Performance on a fairly simple 1D model with stacked branches / bottlenecks / inception blocks was surprisingly poor (>300x slower).

Keras on CPU (TensorFlow Lite for the same model is 170 µs)

Using the 1 CPU/Thread trick

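import numpy as np
import tensorflow as tf
import keras
from keras.models import model_from_json

# Restrict TensorFlow to a single thread
session_conf = tf.ConfigProto(
      intra_op_parallelism_threads=1,
      inter_op_parallelism_threads=1)
sess = tf.Session(config=session_conf)
keras.backend.set_session(sess)
# json_blob: the model architecture as a JSON string
model = model_from_json(json_blob)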
%%time
model.predict(np.zeros((1, 600, 3)));
CPU times: user 88.1 ms, sys: 35.1 ms, total: 123 ms
Wall time: 25.1 ms
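
A single `%%time` run may include one-time setup on the first `predict` call, so the figure above could overstate the steady-state cost. A minimal sketch of a repeated measurement with warm-up follows. The helper names `time_call` and `summarize` are hypothetical, not part of Keras or frugally-deep.

```python
import time

def time_call(fn, repeats=20, warmup=3):
    """Return per-call wall times (seconds) for fn(), after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up: keeps one-time setup cost out of the measurement
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings

def summarize(timings):
    """Return (minimum, median) of a list of timings."""
    ordered = sorted(timings)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[0], median

# Hypothetical usage with the model loaded above:
# best, median = summarize(
#     time_call(lambda: model.predict(np.zeros((1, 600, 3)))))
```

Reporting the minimum and median rather than a single run makes the comparison with the C++ timings less sensitive to outliers.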

Frugally Deep App

I realize the loop here includes loading and parsing the model, but that is a small fraction of the compute time (as can be seen by changing the length of the sequence).

!time ../build/main
Loading model
Loading json ... done. elapsed time: 0.192233 s
Running test 1 of 1 ... done. elapsed time: 1.472293 s
Loading, constructing, testing of ../model.json took 3.309670 s overall.
Running all zeros: 

real	0m31.150s
user	0m30.712s
sys	0m0.245s
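
One way to check that loading and parsing is a small fraction of the total is to time the binary at several sequence lengths and fit a straight line: the intercept approximates the fixed overhead, the slope the cost per time step. This is a sketch that assumes run time grows roughly linearly with sequence length. The numbers below are synthetic placeholders, not measurements from this issue, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic placeholder data, NOT measurements from this issue:
# generated from 0.2 s fixed overhead plus 0.005 s per time step.
lengths = [100, 200, 400, 600]
wall_times = [0.2 + 0.005 * n for n in lengths]

# overhead: estimated fixed cost (load + parse), in seconds
# per_step: estimated cost per time step, in seconds
overhead, per_step = fit_line(lengths, wall_times)
```

With real measurements in `wall_times`, a small `overhead` relative to `per_step * 600` would support the claim that inference dominates.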

Model Details

Model as Keras H5, FrugallyDeep JSON, and C++ file for inference: https://gist.github.com/kmader/135db41c5ea35c0dc8cae95ed90087f4

[Image: out_model]
