Initial media support for G-API background subtraction demo (#3535)

TolyaTalamanov · Wovchena · web-flow · commit e3f46cc47640 · 2022-10-04T11:48:12.000+04:00
* Switch demo to GFrame input

* Refactor MediaCommonCapSrc

* Initial media support for G-API bgr subtraction

* Initial onevpl support
* Switch to GFrame

* Add flags to readme and help print

* Fix background_subtraction_demo_gapi documentation

* Round up frame_size by 16 for use_onevpl case.

* Update readme

* Replace "note:" in readme

* Update demos/background_subtraction_demo/cpp_gapi/main.cpp

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Fix warning and add onevpl_pool_size flag

* Update demos/background_subtraction_demo/cpp_gapi/README.md

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Update demos/background_subtraction_demo/cpp_gapi/README.md

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Update demos/background_subtraction_demo/cpp_gapi/main.cpp

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Fix comments to review

* Update demos/background_subtraction_demo/cpp_gapi/background_subtraction_demo_gapi.hpp

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Update demos/background_subtraction_demo/cpp_gapi/README.md

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Update demos/background_subtraction_demo/cpp_gapi/background_subtraction_demo_gapi.hpp

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Update demos/background_subtraction_demo/cpp_gapi/README.md

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>

* Update background_subtraction_demo_gapi.hpp

* Update background_subtraction_demo_gapi.hpp

* Update background_subtraction_demo_gapi.hpp

* Update README.md

* Fix flag message

Co-authored-by: Zlobin Vladimir <vladimir.zlobin@intel.com>
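
The "Round up frame_size by 16" item above refers to rounding each frame dimension up to the next multiple of 16 when `-use_onevpl` is set. The commit itself calls `cv::alignSize` for this. The following is only a minimal standard-library sketch of the same arithmetic; the helper name `alignUp` is hypothetical and does not appear in the change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the commit, which uses cv::alignSize).
// Rounds `size` up to the nearest multiple of `alignment`.
// `alignment` must be a non-zero power of two, which makes the bit mask valid.
inline std::size_t alignUp(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Under this sketch a 1920x1080 frame becomes 1920x1088, because 1920 is already a multiple of 16 while 1080 is not. This is why the graph is built with the aligned size rather than the size that `ImagesCapture` reports.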
diff --git a/demos/background_subtraction_demo/cpp_gapi/README.md b/demos/background_subtraction_demo/cpp_gapi/README.md
@@ -5,7 +5,7 @@ This demo shows how to perform background subtraction using G-API.
 > **NOTE**: Only batch size of 1 is supported.
 
 ## How It Works
-The demo application expects an instance-segmentation-security-???? or trimap free background matting based on pixel-level segmentation approach model in the Intermediate Representation (IR) format.
+The demo application expects an instance-segmentation-security-???? or trimap free background matting based on pixel-level segmentation approach model in the Intermediate Representation (IR) format. Note that the `OpenModelZoo` collection does not contain background matting models.
 
 1. for instance segmentation models based on `Mask RCNN` approach:
     * One input: `image` for input image.
@@ -54,6 +54,34 @@ omz_converter --list models.lst
 
 > **NOTE**: Refer to the tables [Intel's Pre-Trained Models Device Support](../../../models/intel/device_support.md) and [Public Pre-Trained Models Device Support](../../../models/public/device_support.md) for the details on models inference support at different devices.
 
+
+### OneVPL Support
+
+The demo can use [OneVPL](https://github.com/oneapi-src/oneVPL#-video-processing-library) for video decoding.
+Example:
+```sh
+./background_subtraction_demo_gapi -m <path_to_model> -i <path_to_video_file> -use_onevpl
+```
+
+To provide additional configuration parameters, use `-onevpl_params`:
+```sh
+./background_subtraction_demo_gapi -m <path_to_model> -i <path_to_raw_file> -use_onevpl -onevpl_params="mfxImplDescription.mfxDecoderDescription.decoder.CodecID:MFX_CODEC_HEVC"
+```
+>**NOTE**: Only raw formats such as `h264`, `h265`, etc. are supported on Linux.
+When working with raw formats, the user must always specify the `codec` type via `-onevpl_params`, as in the example above.
+
+To build OpenCV G-API with `oneVPL` support, follow the instructions in:
+[Building G-API with oneVPL Toolkit support](https://github.com/opencv/opencv/wiki/Graph-API#building-with-onevpl-toolkit-support)
+
+#### Troubleshooting
+During execution, `oneVPL` might report messages indicating that the source could be configured more precisely.
+
+For example:
+```
+ cv::gapi::wip::onevpl::VPLLegacyDecodeEngine::process_error [000001CED3851C70] error: cv::gapi::wip::onevpl::CachedPool::find_free - cannot get free surface from pool, size: 5
+```
+This might be fixed by increasing the pool size with the `-onevpl_pool_size` parameter.
+
 ## Running
 
 Run the application with the `-h` option to see the following usage message:
@@ -82,6 +110,9 @@ Options:
     -blur_bgr                  Optional. Blur background.
     -target_bgr                Optional. Background onto which to composite the output (by default to green field).
     -u                         Optional. List of monitors to show initially.
+    -use_onevpl                Optional. Use onevpl video decoding.
+    -onevpl_params             Optional. Parameters for onevpl video decoding. OneVPL source can be fine-tuned by providing configuration parameters. Format: <prop name>:<value>,<prop name>:<value> Several important configuration parameters: 'mfxImplDescription.mfxDecoderDescription.decoder.CodecID' values: https://spec.oneapi.io/onevpl/2.7.0/API_ref/VPL_enums.html?highlight=mfx_codec_hevc#codecformatfourcc and 'mfxImplDescription.AccelerationMode' values: https://spec.oneapi.io/onevpl/2.7.0/API_ref/VPL_disp_api_enum.html?highlight=d3d11#mfxaccelerationmode (see `MFXSetConfigFilterProperty` by https://spec.oneapi.io/versions/latest/elements/oneVPL/source/index.html)
+    -onevpl_pool_size          OneVPL source uses this parameter as the preallocated frame pool size. 0 keeps the default frame pool size for your system. This parameter doesn't have a good default value. It must be adjusted for the specific execution (video, model, system, ...).
 
 Available target devices:  <targets>
 ```
diff --git a/demos/background_subtraction_demo/cpp_gapi/background_subtraction_demo_gapi.hpp b/demos/background_subtraction_demo/cpp_gapi/background_subtraction_demo_gapi.hpp
@@ -33,6 +33,18 @@ static const char blur_bgr_message[] = "Optional. Blur background.";
 static const char target_bgr_message[] =
     "Optional. Background onto which to composite the output (by default to green field).";
 static const char utilization_monitors_message[] = "Optional. List of monitors to show initially.";
+static const char use_onevpl_message[] = "Optional. Use onevpl video decoding.";
+static const char onevpl_params_message[] = "Optional. Parameters for onevpl video decoding. "
+                                            "OneVPL source can be fine-tuned by providing configuration parameters. "
+                                            "Format: <prop name>:<value>,<prop name>:<value> "
+                                            "Several important configuration parameters: "
+                                            "-mfxImplDescription.mfxDecoderDescription.decoder.CodecID "
+                                            "values: https://spec.oneapi.io/onevpl/2.7.0/API_ref/VPL_enums.html?highlight=mfx_codec_hevc#codecformatfourcc "
+                                            "-mfxImplDescription.AccelerationMode "
+                                            "values: https://spec.oneapi.io/onevpl/2.7.0/API_ref/VPL_disp_api_enum.html?highlight=d3d11#mfxaccelerationmode "
+                                            "(see `MFXSetConfigFilterProperty` by https://spec.oneapi.io/versions/latest/elements/oneVPL/source/index.html)";
+static const char onevpl_pool_size_message[] = "OneVPL source uses this parameter as the preallocated frame pool size. 0 keeps the default frame pool size for your system. "
+                                               "This parameter doesn't have a good default value. It must be adjusted for the specific execution (video, model, system, ...).";
 
 DEFINE_bool(h, false, help_message);
 DEFINE_string(res, "1280x720", camera_resolution_message);
@@ -47,6 +59,9 @@ DEFINE_bool(no_show, false, no_show_message);
 DEFINE_string(target_bgr, "", target_bgr_message);
 DEFINE_uint32(blur_bgr, 0, blur_bgr_message);
 DEFINE_string(u, "", utilization_monitors_message);
+DEFINE_bool(use_onevpl, false, use_onevpl_message);
+DEFINE_string(onevpl_params, "", onevpl_params_message);
+DEFINE_uint32(onevpl_pool_size, 0, onevpl_pool_size_message);
 
 /**
  * \brief This function shows a help message
@@ -74,4 +89,7 @@ static void showUsage() {
     std::cout << "    -blur_bgr \"<integer>\"      " << blur_bgr_message << std::endl;
     std::cout << "    -target_bgr                " << target_bgr_message << std::endl;
     std::cout << "    -u                         " << utilization_monitors_message << std::endl;
+    std::cout << "    -use_onevpl                " << use_onevpl_message << std::endl;
+    std::cout << "    -onevpl_params             " << onevpl_params_message << std::endl;
+    std::cout << "    -onevpl_pool_size          " << onevpl_pool_size_message << std::endl;
 }
diff --git a/demos/background_subtraction_demo/cpp_gapi/include/custom_kernels.hpp b/demos/background_subtraction_demo/cpp_gapi/include/custom_kernels.hpp
@@ -43,7 +43,7 @@ class NNBGReplacer {
     NNBGReplacer() = default;
     virtual ~NNBGReplacer() = default;
     NNBGReplacer(const std::string& model_path);
-    virtual cv::GMat replace(cv::GMat, const cv::Size&, cv::GMat) = 0;
+    virtual cv::GMat replace(cv::GFrame, cv::GMat, const cv::Size&, cv::GMat) = 0;
     const std::string& getName() {
         return m_tag;
     }
@@ -58,7 +58,7 @@ class NNBGReplacer {
 class MaskRCNNBGReplacer : public NNBGReplacer {
 public:
     MaskRCNNBGReplacer(const std::string& model_path);
-    cv::GMat replace(cv::GMat, const cv::Size&, cv::GMat) override;
+    cv::GMat replace(cv::GFrame, cv::GMat, const cv::Size&, cv::GMat) override;
 
 private:
     std::string m_input_name;
@@ -70,7 +70,7 @@ class MaskRCNNBGReplacer : public NNBGReplacer {
 class BGMattingReplacer : public NNBGReplacer {
 public:
     BGMattingReplacer(const std::string& model_path);
-    cv::GMat replace(cv::GMat, const cv::Size&, cv::GMat) override;
+    cv::GMat replace(cv::GFrame, cv::GMat, const cv::Size&, cv::GMat) override;
 
 private:
     std::string m_input_name;
diff --git a/demos/background_subtraction_demo/cpp_gapi/main.cpp b/demos/background_subtraction_demo/cpp_gapi/main.cpp
@@ -35,8 +35,10 @@
 #include <opencv2/gapi/own/assert.hpp>
 #include <opencv2/gapi/streaming/source.hpp>
 #include <opencv2/gapi/util/optional.hpp>
+#include <opencv2/gapi/streaming/onevpl/source.hpp>
 #include <opencv2/highgui.hpp>
 #include <opencv2/imgproc.hpp>
+
 #include <openvino/openvino.hpp>
 
 #include <monitors/presenter.h>
@@ -82,6 +84,37 @@ static cv::gapi::GKernelPackage getKernelPackage(const std::string& type) {
     GAPI_Assert(false && "Unreachable code!");
 }
 
+cv::gapi::wip::onevpl::CfgParam createFromString(const std::string &line) {
+    using namespace cv::gapi::wip;
+
+    if (line.empty()) {
+        throw std::runtime_error("Cannot parse CfgParam from empty line");
+    }
+
+    std::string::size_type name_endline_pos = line.find(':');
+    if (name_endline_pos == std::string::npos) {
+        throw std::runtime_error("Cannot parse CfgParam from: " + line +
+                                 "\nExpected separator \":\"");
+    }
+
+    std::string name = line.substr(0, name_endline_pos);
+    std::string value = line.substr(name_endline_pos + 1);
+
+    return cv::gapi::wip::onevpl::CfgParam::create(name, value,
+                                                   /* vpp params strongly optional */
+                                                   name.find("vpp.") == std::string::npos);
+}
+
+static std::vector<cv::gapi::wip::onevpl::CfgParam> parseVPLParams(const std::string& cfg_params) {
+    std::vector<cv::gapi::wip::onevpl::CfgParam> source_cfgs;
+    std::stringstream params_list(cfg_params);
+    std::string line;
+    while (std::getline(params_list, line, ',')) {
+        source_cfgs.push_back(createFromString(line));
+    }
+    return source_cfgs;
+}
+
 }  // namespace util
 
 int main(int argc, char* argv[]) {
@@ -117,9 +150,16 @@ int main(int argc, char* argv[]) {
                                                                stringToSize(FLAGS_res));
         const auto tmp = cap->read();
         cv::Size frame_size = cv::Size{tmp.cols, tmp.rows};
+        // NB: oneVPL source rounds each frame dimension up to a multiple of 16,
+        // so the size might differ from what ImagesCapture reads.
+        if (FLAGS_use_onevpl) {
+            frame_size.width  = cv::alignSize(frame_size.width, 16);
+            frame_size.height = cv::alignSize(frame_size.height, 16);
+        }
 
         cv::GComputation comp([&] {
-            cv::GMat in;
+            cv::GFrame in;
+            cv::GMat bgr = cv::gapi::streaming::BGR(in);
             // NB: target_bgr is optional second input which implies a background
             // that will change user video background. If user don't specify
             // it and specifies --bgr_blur then second input won't be used since
@@ -128,7 +168,7 @@ int main(int argc, char* argv[]) {
 
             cv::GMat bgr_resized;
             if (is_blur && FLAGS_target_bgr.empty()) {
-                bgr_resized = in;
+                bgr_resized = bgr;
             } else {
                 target_bgr = cv::util::make_optional<cv::GMat>(cv::GMat());
                 bgr_resized = cv::gapi::resize(target_bgr.value(), frame_size);
@@ -137,7 +177,7 @@ int main(int argc, char* argv[]) {
             auto background =
                 is_blur ? cv::gapi::blur(bgr_resized, cv::Size(FLAGS_blur_bgr, FLAGS_blur_bgr)) : bgr_resized;
 
-            auto result = model->replace(in, frame_size, background);
+            auto result = model->replace(in, bgr, frame_size, background);
 
             auto graph_inputs = cv::GIn(in);
             if (target_bgr.has_value()) {
@@ -176,7 +216,19 @@ int main(int argc, char* argv[]) {
                                 0,
                                 std::numeric_limits<size_t>::max(),
                                 stringToSize(FLAGS_res));
-        auto pipeline_inputs = cv::gin(cv::gapi::wip::make_src<custom::CommonCapSrc>(cap));
+        cv::gapi::wip::IStreamSource::Ptr media_cap;
+        if (FLAGS_use_onevpl) {
+            auto onevpl_params = util::parseVPLParams(FLAGS_onevpl_params);
+            if (FLAGS_onevpl_pool_size != 0) {
+                onevpl_params.push_back(
+                    cv::gapi::wip::onevpl::CfgParam::create_frames_pool_size(FLAGS_onevpl_pool_size));
+            }
+            media_cap = cv::gapi::wip::make_onevpl_src(FLAGS_i, std::move(onevpl_params));
+        } else {
+            media_cap = cv::gapi::wip::make_src<custom::MediaCommonCapSrc>(cap);
+        }
+
+        auto pipeline_inputs = cv::gin(std::move(media_cap));
         if (!is_blur && FLAGS_target_bgr.empty()) {
             cv::Scalar default_color(155, 255, 120);
             pipeline_inputs += cv::gin(cv::Mat(frame_size, CV_8UC3, default_color));
diff --git a/demos/background_subtraction_demo/cpp_gapi/src/custom_kernels.cpp b/demos/background_subtraction_demo/cpp_gapi/src/custom_kernels.cpp
@@ -169,7 +169,7 @@ custom::MaskRCNNBGReplacer::MaskRCNNBGReplacer(const std::string& model_path) :
     }
 }
 
-cv::GMat custom::MaskRCNNBGReplacer::replace(cv::GMat in, const cv::Size& in_size, cv::GMat background) {
+cv::GMat custom::MaskRCNNBGReplacer::replace(cv::GFrame in, cv::GMat bgr, const cv::Size& in_size, cv::GMat background) {
     cv::GInferInputs inputs;
     inputs[m_input_name] = in;
     auto outputs = cv::gapi::infer<cv::gapi::Generic>(m_tag, inputs);
@@ -181,7 +181,7 @@ cv::GMat custom::MaskRCNNBGReplacer::replace(cv::GMat in, const cv::Size& in_siz
     GAPI_Assert(dims.size() == 4u);
     auto mask = custom::GCalculateMaskRCNNBGMask::on(in_size, cv::Size(dims[3], dims[2]), labels, boxes, masks);
     auto mask3ch = cv::gapi::medianBlur(cv::gapi::merge3(mask, mask, mask), 11);
-    return (mask3ch & in) + (~mask3ch & background);
+    return (mask3ch & bgr) + (~mask3ch & background);
 }
 
 custom::BGMattingReplacer::BGMattingReplacer(const std::string& model_path) : NNBGReplacer(model_path) {
@@ -196,14 +196,14 @@ custom::BGMattingReplacer::BGMattingReplacer(const std::string& model_path) : NN
     m_output_name = m_outputs.begin()->first;
 }
 
-cv::GMat custom::BGMattingReplacer::replace(cv::GMat in, const cv::Size& in_size, cv::GMat background) {
+cv::GMat custom::BGMattingReplacer::replace(cv::GFrame in, cv::GMat bgr, const cv::Size& in_size, cv::GMat background) {
     cv::GInferInputs inputs;
     inputs[m_input_name] = in;
     auto outputs = cv::gapi::infer<cv::gapi::Generic>(m_tag, inputs);
 
     auto alpha = cv::gapi::resize(custom::GTensorToImg::on(outputs.at(m_output_name)), in_size);
     auto alpha3ch = cv::gapi::merge3(alpha, alpha, alpha);
-    auto in_fp = cv::gapi::convertTo(in, CV_32F);
+    auto in_fp = cv::gapi::convertTo(bgr, CV_32F);
     auto bgr_fp = cv::gapi::convertTo(background, CV_32F);
 
     cv::GScalar one(cv::Scalar::all(1.));
diff --git a/demos/common/cpp_gapi/utils_gapi/include/utils_gapi/stream_source.hpp b/demos/common/cpp_gapi/utils_gapi/include/utils_gapi/stream_source.hpp
@@ -9,6 +9,8 @@
 #include <opencv2/core.hpp>
 #include <opencv2/gapi/gmetaarg.hpp>
 #include <opencv2/gapi/streaming/source.hpp>
+#include <opencv2/gapi/media.hpp>
+#include <opencv2/gapi/streaming/cap.hpp>
 
 class ImagesCapture;
 namespace cv {
@@ -24,7 +26,7 @@ class CommonCapSrc : public cv::gapi::wip::IStreamSource {
 public:
     explicit CommonCapSrc(std::shared_ptr<ImagesCapture>& cap);
 
-protected:
+public:
     std::shared_ptr<ImagesCapture> cap;
     cv::Mat first;
     bool first_pulled = false;
@@ -34,4 +36,25 @@ class CommonCapSrc : public cv::gapi::wip::IStreamSource {
     cv::GMetaArg descr_of() const override;
 };
 
+class MediaBGRAdapter final: public cv::MediaFrame::IAdapter {
+public:
+    using Cb = cv::MediaFrame::View::Callback;
+
+    explicit MediaBGRAdapter(cv::Mat m, Cb cb = [](){});
+
+    cv::GFrameDesc meta() const override;
+    cv::MediaFrame::View access(cv::MediaFrame::Access) override;
+
+private:
+    cv::Mat m_mat;
+    Cb m_cb;
+};
+
+class MediaCommonCapSrc : public CommonCapSrc {
+    using CommonCapSrc::CommonCapSrc;
+
+    bool pull(cv::gapi::wip::Data& data) override;
+    cv::GMetaArg descr_of() const override;
+};
+
 }  // namespace custom
diff --git a/demos/common/cpp_gapi/utils_gapi/src/stream_source.cpp b/demos/common/cpp_gapi/utils_gapi/src/stream_source.cpp
@@ -43,4 +43,31 @@ cv::GMetaArg CommonCapSrc::descr_of() const {
     GAPI_Assert(!first.empty());
     return cv::GMetaArg{cv::descr_of(first)};
 }
+
+MediaBGRAdapter::MediaBGRAdapter(cv::Mat m, MediaBGRAdapter::Cb cb)
+    : m_mat(m), m_cb(cb) {
+}
+
+cv::GFrameDesc MediaBGRAdapter::meta() const {
+    return cv::GFrameDesc{cv::MediaFormat::BGR, m_mat.size()};
+}
+
+cv::MediaFrame::View MediaBGRAdapter::access(cv::MediaFrame::Access) {
+    cv::MediaFrame::View::Ptrs pp = { m_mat.ptr(), nullptr, nullptr, nullptr };
+    cv::MediaFrame::View::Strides ss = { m_mat.step, 0u, 0u, 0u };
+    return cv::MediaFrame::View(std::move(pp), std::move(ss), MediaBGRAdapter::Cb{m_cb});
+}
+
+bool MediaCommonCapSrc::pull(cv::gapi::wip::Data& data) {
+    if (CommonCapSrc::pull(data)) {
+        data = cv::MediaFrame::Create<MediaBGRAdapter>(cv::util::get<cv::Mat>(data));
+        return true;
+    }
+    return false;
+}
+
+cv::GMetaArg MediaCommonCapSrc::descr_of() const {
+    return cv::GMetaArg{cv::GFrameDesc{cv::MediaFormat::BGR,
+                                       cv::util::get<cv::GMatDesc>(CommonCapSrc::descr_of()).size}};
+}
 }  // namespace custom
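
The `-onevpl_params` value is a comma-separated list of `<prop name>:<value>` pairs, which `createFromString` and `parseVPLParams` in `main.cpp` turn into `cv::gapi::wip::onevpl::CfgParam` objects. The splitting logic can be sketched without OpenCV as below. This is an illustration under assumptions, not the demo's code: `ParsedParam`, `parseOne` and `parseParamList` are hypothetical names, and `ParsedParam` only stands in for `CfgParam`.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for cv::gapi::wip::onevpl::CfgParam,
// used so that the sketch builds without OpenCV.
struct ParsedParam {
    std::string name;
    std::string value;
    bool is_major;  // false when the name contains "vpp.", as in the diff
};

// Splits a single "<prop name>:<value>" item at the first ':'.
ParsedParam parseOne(const std::string& item) {
    if (item.empty()) {
        throw std::runtime_error("Cannot parse parameter from empty line");
    }
    const std::string::size_type colon = item.find(':');
    if (colon == std::string::npos) {
        throw std::runtime_error("Cannot parse parameter from: " + item +
                                 "\nExpected separator \":\"");
    }
    const std::string name = item.substr(0, colon);
    const std::string value = item.substr(colon + 1);
    return ParsedParam{name, value, name.find("vpp.") == std::string::npos};
}

// Splits the whole flag value at ',' and parses every item.
std::vector<ParsedParam> parseParamList(const std::string& params) {
    std::vector<ParsedParam> result;
    std::stringstream stream(params);
    std::string item;
    while (std::getline(stream, item, ',')) {
        result.push_back(parseOne(item));
    }
    return result;
}
```

With this sketch an empty flag value yields an empty list, while an item without a `:` separator raises `std::runtime_error`. Only the first `:` splits an item, so values that themselves contain `:` stay intact.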