Slow processing with batched dlpreproc #188

@emmadrigal

Processing time for an N-buffer batched dlpreproc is much higher than the processing time for N individual dlpreproc instances running in parallel:

IN_CAPS="video/x-raw, width=320, height=320, format=RGB"

GST_DEBUG="2,*tiovxsiso*:6,*perf*:6" gst-launch-1.0 \
videotestsrc ! $IN_CAPS ! mux. \
videotestsrc ! $IN_CAPS ! mux. \
videotestsrc ! $IN_CAPS ! mux. \
videotestsrc ! $IN_CAPS ! mux. \
videotestsrc ! $IN_CAPS ! mux. \
videotestsrc ! $IN_CAPS ! mux. \
tiovxmux name=mux ! \
tiovxdlpreproc ! "application/x-tensor-tiovx(memory:batched)" ! perf ! \
tiovxdemux name=demux \
demux. ! queue ! fakesink \
demux. ! queue ! fakesink \
demux. ! queue ! fakesink \
demux. ! queue ! fakesink \
demux. ! queue ! fakesink \
demux. ! queue ! fakesink

This first pipeline runs at around 5 fps.

IN_CAPS="video/x-raw, width=320, height=320, format=RGB"

GST_DEBUG="2,*perf*:6" gst-launch-1.0 \
videotestsrc ! $IN_CAPS ! tiovxdlpreproc ! perf ! fakesink \
videotestsrc ! $IN_CAPS ! tiovxdlpreproc ! perf ! fakesink \
videotestsrc ! $IN_CAPS ! tiovxdlpreproc ! perf ! fakesink \
videotestsrc ! $IN_CAPS ! tiovxdlpreproc ! perf ! fakesink \
videotestsrc ! $IN_CAPS ! tiovxdlpreproc ! perf ! fakesink \
videotestsrc ! $IN_CAPS ! tiovxdlpreproc ! perf ! fakesink

Each of the 6 individual pipelines runs at 40 fps.

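All the delay in the batched pipeline appears to be in the graph processing time. In the log below, about 179 ms elapse between "Processing graph" and "Dequeueing parameters", against roughly 0.05 ms between "Enqueueing parameters" and "Processing graph":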

0:00:02.118839548  1771     0x171faf70 LOG                tiovxsiso gsttiovxsiso.c:864:gst_tiovx_siso_process_graph:<tiovxdlpreproc0> Enqueueing parameters
0:00:02.118885055  1771     0x171faf70 LOG                tiovxsiso gsttiovxsiso.c:883:gst_tiovx_siso_process_graph:<tiovxdlpreproc0> Processing graph
0:00:02.298343493  1771     0x171faf70 LOG                tiovxsiso gsttiovxsiso.c:896:gst_tiovx_siso_process_graph:<tiovxdlpreproc0> Dequeueing parameters
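
To compare these gaps over a longer log, the timestamps can be converted to nanoseconds and subtracted. The following is a minimal sketch, not part of the plugin: `gst_log_timestamp_ns` is a hypothetical helper name, and it assumes every line starts with a GStreamer debug timestamp of the form `H:MM:SS.NNNNNNNNN` with exactly nine fractional digits, as in the lines above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse the leading "H:MM:SS.NNNNNNNNN" timestamp of a
 * GStreamer debug line into nanoseconds.
 * Assumes nine fractional digits, as printed in the log above.
 * Returns -1 if the line does not start with such a timestamp. */
static int64_t gst_log_timestamp_ns(const char *line)
{
  unsigned long long h, m, s, ns;

  assert(line != NULL);
  if (sscanf(line, "%llu:%2llu:%2llu.%9llu", &h, &m, &s, &ns) != 4)
    return -1;

  return (int64_t) (((h * 60ULL + m) * 60ULL + s) * 1000000000ULL + ns);
}
```

Calling it on the "Processing graph" and "Dequeueing parameters" lines and subtracting gives 179458438 ns, which is close to the 200 ms period of a 5 fps pipeline.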

This corresponds to the following code: https://github.com/TexasInstruments/edgeai-gst-plugins/blob/develop/gst-libs/gst/tiovx/gsttiovxsiso.c#L882

With the error handling removed, it can be summarized as:

  GST_LOG_OBJECT (self, "Enqueueing parameters");
  status =
      vxGraphParameterEnqueueReadyRef (priv->graph, INPUT_PARAMETER_INDEX,
      (vx_reference *) priv->input, priv->num_channels);
  status =
      vxGraphParameterEnqueueReadyRef (priv->graph, OUTPUT_PARAMETER_INDEX,
      (vx_reference *) priv->output, priv->num_channels);

  GST_LOG_OBJECT (self, "Processing graph");
  status = vxScheduleGraph (priv->graph);
  status = vxWaitGraph (priv->graph);

  GST_LOG_OBJECT (self, "Dequeueing parameters");
  status =
      vxGraphParameterDequeueDoneRef (priv->graph, INPUT_PARAMETER_INDEX,
      (vx_reference *) priv->input, priv->num_channels, &in_refs);
  status =
      vxGraphParameterDequeueDoneRef (priv->graph, OUTPUT_PARAMETER_INDEX,
      (vx_reference *) priv->output, priv->num_channels, &out_refs);
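
To narrow down whether the time is spent in `vxScheduleGraph` or in `vxWaitGraph`, each call could be bracketed with monotonic clock samples. This is only a sketch: `timespec_diff_ns` is a hypothetical helper, and the instrumented snippet in the comment is an assumption about how it would be wired into the function above, not code from the repository.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds elapsed between two clock samples. */
static int64_t timespec_diff_ns(const struct timespec *start,
    const struct timespec *end)
{
  assert(start != NULL && end != NULL);
  return (int64_t) (end->tv_sec - start->tv_sec) * 1000000000LL
      + (int64_t) (end->tv_nsec - start->tv_nsec);
}

/*
 * Intended use inside the processing function (left as a comment because it
 * needs the OpenVX and GStreamer headers and the element's private data):
 *
 *   struct timespec t0, t1, t2;
 *
 *   clock_gettime (CLOCK_MONOTONIC, &t0);
 *   status = vxScheduleGraph (priv->graph);
 *   clock_gettime (CLOCK_MONOTONIC, &t1);
 *   status = vxWaitGraph (priv->graph);
 *   clock_gettime (CLOCK_MONOTONIC, &t2);
 *
 *   GST_LOG_OBJECT (self, "schedule: %" G_GINT64_FORMAT " ns, wait: %"
 *       G_GINT64_FORMAT " ns", (gint64) timespec_diff_ns (&t0, &t1),
 *       (gint64) timespec_diff_ns (&t1, &t2));
 */
```

Logging both intervals for the batched and the non-batched pipelines would show whether the extra time scales with `priv->num_channels`.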
