Measuring inference time in spaCy pipeline #8238
-
Is there anything out-of-the-box in spaCy that helps measure performance in terms of time taken by each component in the pipeline, words processed per second, or something similar? I am looking to process a large number of documents and would like to gather some processing-time metrics, both for benchmarking and for identifying potential bottleneck components. I am aware of the benchmarking mentioned here and the associated benchmarking project here, but I am looking for something that helps measure timings for each component in the pipeline, not the entire pipeline.
Replies: 1 comment
-
There's nothing out-of-the-box, no. I would recommend that you try either using the Python debugger or wrapping components in some kind of timer. Since you can get the components from the language pipeline, and since they're executed just using their `__call__` method at inference time, you should be able to wrap them in a timer function. Something like this:
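(A rough sketch, assuming the `en_core_web_sm` model is installed; the `timer` implementation below is just one possible wrapper based on `time.perf_counter`, not a spaCy API.)

```python
import time

import spacy

def timer(name, component):
    # Wrap a pipeline component so each call is timed, passing the Doc
    # through and returning the component's result unchanged.
    def wrapped(doc):
        start = time.perf_counter()
        result = component(doc)
        print(f"{name}: {time.perf_counter() - start:.4f}s")
        return result
    return wrapped

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

# Tokenize once with make_doc, then run each component's __call__ with timing.
doc = nlp.make_doc("Apple is looking at buying U.K. startup for $1 billion.")
for name, component in nlp.pipeline:
    doc = timer(name, component)(doc)
```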
Where `timer` is some kind of function that times calls while passing through arguments and return values.