toCUDA() slower than regular VideoCapture #64

@David-rn

I would like to thank you for such an amazing library!

I came across some unexpected behaviour. First, some context to verify that my ffmpeg build is compiled with CUDA support.

The following command gives me ~500 fps:

 ffmpeg -y -vsync 0  -i input.mp4 -f null /dev/null

And this one gives me ~700 fps, and I can see that the GPU is being used by the process:

 ffmpeg -y -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -f null /dev/null

I then benchmarked these two examples:

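import ffmpegcv

file = 'input.mp4'  # assumed: the same video used in the ffmpeg commands above
cap = ffmpegcv.VideoCapture(file)  # CPU decoding
while True:
    ret, frame = cap.read()
    if not ret:  # end of stream
        break
cap.release()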

########################

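import ffmpegcv

file = 'input.mp4'  # assumed: the same video used in the ffmpeg commands above
# NVIDIA decoding, with frames wrapped as CUDA arrays by toCUDA
cap = ffmpegcv.toCUDA(ffmpegcv.VideoCaptureNV(file, pix_fmt='nv12'))
while True:
    ret, frame = cap.read()
    if not ret:  # end of stream
        break
cap.release()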

Nevertheless, the first example finishes reading the video in 8 seconds, while the second one takes ~20 seconds. I was wondering why this could be happening.
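For reference, the timings above can be collected with a small helper along these lines. This is only a sketch: `benchmark_reader` and `DummyCapture` are hypothetical names introduced here for illustration and are not part of ffmpegcv. The helper assumes nothing about the reader beyond a `read()` method that returns `(ret, frame)`, so either `cap` object from the two examples could be passed in place of the dummy.

```python
import time


def benchmark_reader(cap):
    """Read frames until `ret` is falsy; return (frame_count, seconds, fps)."""
    count = 0
    start = time.perf_counter()
    while True:
        ret, _frame = cap.read()
        if not ret:
            break
        count += 1
    seconds = time.perf_counter() - start
    fps = count / seconds if seconds > 0 else float("inf")
    return count, seconds, fps


class DummyCapture:
    """Hypothetical stand-in reader so the sketch runs without a video file."""

    def __init__(self, n_frames):
        self.remaining = n_frames

    def read(self):
        if self.remaining == 0:
            return False, None
        self.remaining -= 1
        return True, b""


count, seconds, fps = benchmark_reader(DummyCapture(100))
print(f"{count} frames in {seconds:.6f} s ({fps:.0f} fps)")
```

Measuring both readers with the same loop keeps the comparison fair, since any per-iteration Python overhead is identical in both runs.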

I profiled a single read() call in each case. These are the results for the CPU reader (first block) and toCUDA (second block):

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.003    0.003 ffmpeg_reader.py:241(read)
        1    0.003    0.003    0.003    0.003 {method 'read' of '_io.BufferedReader' objects}
        1    0.000    0.000    0.000    0.000 fromnumeric.py:3328(prod)
        1    0.000    0.000    0.000    0.000 fromnumeric.py:69(_wrapreduction)

====================================================================================================

  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003    0.009    0.009 ffmpeg_reader_cuda.py:207(read)
        1    0.000    0.000    0.004    0.004 ffmpeg_reader.py:241(read)
        1    0.004    0.004    0.004    0.004 {method 'read' of '_io.BufferedReader' objects}
        1    0.000    0.000    0.001    0.001 driver.py:465(function_call)
        3    0.001    0.000    0.001    0.000 driver.py:144(pre_call)
        1    0.000    0.000    0.000    0.000 gpuarray.py:214(__init__)
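As a rough sanity check, the per-call times in the two profiles can be compared with the end-to-end timings. This is a sketch that assumes each profile covers exactly one `read()` call (ncalls is 1 in both) and that this single sample is representative of the whole video, which may not hold. The variable names are mine; the numbers are copied from the profiles and timings above.

```python
# Cumulative time of one read() call, taken from the two profiles.
cpu_read_s = 0.003    # ffmpeg_reader.py:241(read), CPU reader
cuda_read_s = 0.009   # ffmpeg_reader_cuda.py:207(read), toCUDA reader
cuda_pipe_s = 0.004   # BufferedReader.read inside the toCUDA call

# Slowdown suggested by the profiles versus the measured wall-clock slowdown.
profiled_ratio = cuda_read_s / cpu_read_s
observed_ratio = 20 / 8  # ~20 s for toCUDA vs 8 s for the CPU reader

# Time spent in the toCUDA read() outside the pipe read itself.
cuda_extra_s = cuda_read_s - cuda_pipe_s

print(f"profiled slowdown: {profiled_ratio:.1f}x")
print(f"observed slowdown: {observed_ratio:.1f}x")
print(f"toCUDA time outside the pipe read: {cuda_extra_s * 1000:.0f} ms per frame")
```

Under these assumptions the profiled slowdown is in the same range as the observed one, and more than half of the toCUDA `read()` time falls outside the pipe read.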

I noticed that ffmpeg_reader_cuda also uses ffmpeg_reader to read the frame, which in turn uses numpy to read from the buffer:

img = np.frombuffer(in_bytes, np.uint8).reshape(self.out_numpy_shape)

I was wondering if this could be the problem.

SYSTEM:

  • pycuda 2025.1
  • ffmpegcv 0.3.16

Thank you in advance!!
