diff --git a/c_cxx/OpenVINO_EP/Linux/squeezenet_classification/README.md b/c_cxx/OpenVINO_EP/Linux/squeezenet_classification/README.md
index 6fdcc1555..3b306338b 100644
--- a/c_cxx/OpenVINO_EP/Linux/squeezenet_classification/README.md
+++ b/c_cxx/OpenVINO_EP/Linux/squeezenet_classification/README.md
@@ -4,7 +4,9 @@
 2. The sample involves presenting an image to the ONNX Runtime (RT), which uses the OpenVINO Execution Provider for ONNX RT to run inference on various Intel hardware devices like Intel CPU, GPU, VPU and more. The sample uses OpenCV for image processing and ONNX Runtime OpenVINO EP for inference. After the sample image is inferred, the terminal will output the predicted label classes in order of their confidence.

-The source code for this sample is available [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/OpenVINO_EP/Linux/squeezenet_classification).
+The source code for this sample is available [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/OpenVINO_EP/Linux/squeezenet_classification/squeezenet_cpp_app.cpp).
+
+3. There is one more sample [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/OpenVINO_EP/Linux/squeezenet_classification/squeezenet_cpp_app_io.cpp) with IO Buffer optimization enabled. With the IO Buffer interfaces we can avoid any memory copy overhead when plugging OpenVINO™ inference into an existing GPU pipeline. It also enables OpenCL kernels to participate in the pipeline and become native buffer consumers or producers of the OpenVINO™ inference. Refer [here](https://docs.openvino.ai/latest/openvino_docs_OV_UG_supported_plugins_GPU_RemoteTensor_API.html) for more details. This sample is for GPUs only.

 # How to build
@@ -65,6 +67,10 @@ export OPENCL_INCS=path/to/your/directory/openvino/thirdparty/ocl/clhpp_headers/
 If you are using the opencv from openvino package, below are the paths:
 * For latest version (2022.1.0), run download_opencv.sh in /path/to/openvino/extras/script and the opencv folder will be downloaded at /path/to/openvino/extras.
 * For older openvino version, opencv folder is available at openvino directory itself.
+* The current cmake files assume the opencv folders that come with the openvino packages. Please make sure you update the opencv paths according to your custom builds.
+
+For the squeezenet IO buffer sample:
+Make sure you create the OpenCL context for the right GPU device in a multi-GPU environment.

 4. Run the sample
diff --git a/c_cxx/OpenVINO_EP/Windows/CMakeLists.txt b/c_cxx/OpenVINO_EP/Windows/CMakeLists.txt
index 383d4cb23..14ba7e3c9 100644
--- a/c_cxx/OpenVINO_EP/Windows/CMakeLists.txt
+++ b/c_cxx/OpenVINO_EP/Windows/CMakeLists.txt
@@ -11,6 +11,8 @@ string(APPEND CMAKE_CXX_FLAGS " /W4")
 option(onnxruntime_USE_OPENVINO "Build with OpenVINO support" OFF)
 option(OPENCV_ROOTDIR "OpenCV root dir")
 option(ONNXRUNTIME_ROOTDIR "onnxruntime root dir")
+option(OPENCL_LIB "OpenCL lib dir")
+option(OPENCL_INCLUDE "OpenCL header dir")

 if(NOT ONNXRUNTIME_ROOTDIR)
   set(ONNXRUNTIME_ROOTDIR "C:/Program Files (x86)/onnxruntime")
@@ -27,6 +29,10 @@ if(OPENCV_ROOTDIR)
   list(FILTER OPENCV_RELEASE_LIBRARIES EXCLUDE REGEX ".*d\\.lib")
 endif()

+if(OPENCL_LIB AND OPENCL_INCLUDE)
+  set(OPENCL_FOUND true)
+endif()
+
 if(onnxruntime_USE_OPENVINO)
   add_definitions(-DUSE_OPENVINO)
 endif()
@@ -34,4 +40,9 @@ endif()
 if(OPENCV_FOUND)
   add_subdirectory(squeezenet_classification)
 endif()
+
+if(OPENCL_FOUND)
+  add_subdirectory(squeezenet_classification_io_buffer)
+endif()
+
 add_subdirectory(model-explorer)
diff --git a/c_cxx/OpenVINO_EP/Windows/README.md b/c_cxx/OpenVINO_EP/Windows/README.md
index 6c96a54fd..553f95a9a 100644
--- a/c_cxx/OpenVINO_EP/Windows/README.md
+++ b/c_cxx/OpenVINO_EP/Windows/README.md
@@ -1,12 +1,28 @@
 # Windows C++ sample with OVEP:
 1. model-explorer
-2. Squeezenet classification
+
+   This sample application demonstrates how to use components of the experimental C++ API to query for model inputs/outputs and how to run inference using the OpenVINO Execution Provider for ONNXRT on a model. The source code for this sample is available [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/OpenVINO_EP/Windows/model-explorer).
+
+2. Squeezenet classification sample
+
+   The sample involves presenting an image to the ONNX Runtime (RT), which uses the OpenVINO Execution Provider for ONNXRT to run inference on various Intel hardware devices like Intel CPU, GPU, VPU and more. The sample uses OpenCV for image processing and the ONNX Runtime OpenVINO EP for inference. After the sample image is inferred, the terminal will output the predicted label classes in order of their confidence. The source code for this sample is available [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/OpenVINO_EP/Windows/squeezenet_classification).
+
+3. Squeezenet classification sample with IO Buffer feature
+
+   This sample performs the same classification, but with IO Buffer optimization enabled. With the IO Buffer interfaces we can avoid any memory copy overhead when plugging OpenVINO™ inference into an existing GPU pipeline. It also enables OpenCL kernels to participate in the pipeline and become native buffer consumers or producers of the OpenVINO™ inference. Refer [here](https://docs.openvino.ai/latest/openvino_docs_OV_UG_supported_plugins_GPU_RemoteTensor_API.html) for more details. This sample is for GPUs only. The source code for this sample is available [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer).

 ## How to build

 #### Build ONNX Runtime
 Open x64 Native Tools Command Prompt for VS 2019.
+For running the sample with the IO Buffer optimization feature, make sure you set the OpenCL paths.
+For example, if you are setting the paths from an openvino source build folder, they will look like:
+
+```
+set OPENCL_LIBS=\path\to\openvino\folder\bin\intel64\Release\OpenCL.lib
+set OPENCL_INCS=\path\to\openvino\folder\thirdparty\ocl\clhpp_headers\include
+```
+
 ```
 build.bat --config RelWithDebInfo --use_openvino CPU_FP32 --build_shared_lib --parallel --cmake_extra_defines CMAKE_INSTALL_PREFIX=c:\dev\ort_install --skip_tests
 ```
@@ -32,10 +48,20 @@ cmake .. -A x64 -T host=x64 -Donnxruntime_USE_OPENVINO=ON -DONNXRUNTIME_ROOTDIR=
 ```
 Choose required opencv path. Skip the opencv flag if you don't want to build squeezenet sample.
+To build the squeezenet sample with the IO buffer feature enabled, pass the opencl paths as well:
+```bat
+mkdir build && cd build
+cmake .. -A x64 -T host=x64 -Donnxruntime_USE_OPENVINO=ON -DONNXRUNTIME_ROOTDIR=c:\dev\ort_install -DOPENCV_ROOTDIR="path\to\opencv" -DOPENCL_LIB=path\to\openvino\folder\bin\intel64\Release -DOPENCL_INCLUDE=path\to\openvino\folder\thirdparty\ocl\clhpp_headers\include
+```
+
 **Note:**
 If you are using the opencv from openvino package, below are the paths:
-* For latest version (2022.1.0), run download_opencv.ps1 in \path\to\openvino\extras\script and the opencv folder will be downloaded at \path\to\openvino\extras.
+* For openvino version 2022.1.0, run download_opencv.ps1 in \path\to\openvino\extras\script and the opencv folder will be downloaded at \path\to\openvino\extras.
 * For older openvino version, opencv folder is available at openvino directory itself.
+* The current cmake files assume the opencv folders that come with the openvino packages. Please make sure you update the opencv paths according to your custom builds.
+
+For the squeezenet IO buffer sample:
+Make sure you create the OpenCL context for the right GPU device in a multi-GPU environment.

 Build samples using msbuild either for Debug or Release configuration.

@@ -43,4 +69,4 @@ Build samples using msbuild either for Debug or Release configuration.
 ```
 msbuild onnxruntime_samples.sln /p:Configuration=Debug|Release
 ```
-To run the samples make sure you source openvino variables using setupvars.bat.
+To run the samples make sure you source openvino variables using setupvars.bat. Also add the opencv dll paths to your PATH.
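The multi-GPU note above asks you to create the OpenCL context for the right GPU device, but does not show how to check which GPUs are visible. Below is a minimal sketch, assuming only the CL/cl2.hpp C++ wrapper and the Intel vendor id (0x8086) that the IO-buffer sample itself uses; it is a hypothetical standalone helper, not part of this patch, that lists each platform/GPU pair so you can decide which device to build the cl::Context from before handing it to the sample's OpenCL struct (which also accepts an existing cl_context).

```cpp
// Hypothetical helper (not part of the sample sources): enumerate the OpenCL GPU
// devices so the right one can be chosen in a multi-GPU environment.
#define CL_HPP_MINIMUM_OPENCL_VERSION 120
#define CL_HPP_TARGET_OPENCL_VERSION 120
#include <CL/cl2.hpp>

#include <iostream>
#include <vector>

int main() {
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);                    // every OpenCL platform on the machine
    for (auto& p : platforms) {
        std::vector<cl::Device> devices;
        p.getDevices(CL_DEVICE_TYPE_GPU, &devices);   // GPUs exposed by this platform (may be empty)
        for (auto& d : devices) {
            std::cout << p.getInfo<CL_PLATFORM_NAME>() << " | "
                      << d.getInfo<CL_DEVICE_NAME>() << " | vendor 0x" << std::hex
                      << d.getInfo<CL_DEVICE_VENDOR_ID>() << std::dec << "\n";
            // Build the context and queue from the device you actually want, the same
            // way the sample's OpenCL struct does for the first Intel (0x8086) GPU:
            //   cl::Context ctx(d);
            //   cl::CommandQueue q(ctx, d, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE);
        }
    }
    return 0;
}
```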
diff --git a/c_cxx/OpenVINO_EP/Windows/squeezenet_classification/squeezenet_cpp_app.cpp b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification/squeezenet_cpp_app.cpp
index 3349494ca..19d4b067f 100644
--- a/c_cxx/OpenVINO_EP/Windows/squeezenet_classification/squeezenet_cpp_app.cpp
+++ b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification/squeezenet_cpp_app.cpp
@@ -21,6 +21,16 @@ Portions of this software are copyright of their respective authors and released
 #include <string>
 #include <vector>
 #include <stdexcept> // To use runtime_error
+#include <windows.h>
+#include <psapi.h>
+
+std::size_t GetPeakWorkingSetSize() {
+    PROCESS_MEMORY_COUNTERS pmc;
+    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
+        return pmc.PeakWorkingSetSize;
+    }
+    return 0;
+}

 template <typename T>
 T vectorProduct(const std::vector<T>& v)
@@ -387,5 +397,7 @@ int main(int argc, char* argv[])
     std::cout << "Minimum Inference Latency: "
               << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() / static_cast<float>(numTests)
               << " ms" << std::endl;
+    size_t mem_size = GetPeakWorkingSetSize();
+    std::cout << "Peak working set size: " << mem_size << " bytes" << std::endl;
     return 0;
 }
diff --git a/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/CMakeLists.txt b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/CMakeLists.txt
new file mode 100644
index 000000000..9adc03f0e
--- /dev/null
+++ b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/CMakeLists.txt
@@ -0,0 +1,30 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+
+add_executable(run_squeezenet_io_buffer "squeezenet_cpp_app_io_buffer.cpp")
+target_include_directories(run_squeezenet_io_buffer PRIVATE ${OPENCV_INCLUDE_DIRS} ${OPENCL_INCLUDE})
+target_link_libraries(run_squeezenet_io_buffer PRIVATE onnxruntime)
+
+if(OPENCV_LIBDIR)
+  target_link_directories(run_squeezenet_io_buffer PRIVATE ${OPENCV_LIBDIR})
+  foreach(RelLib DebLib IN ZIP_LISTS OPENCV_RELEASE_LIBRARIES OPENCV_DEBUG_LIBRARIES)
+    target_link_libraries(run_squeezenet_io_buffer PRIVATE optimized ${RelLib} debug ${DebLib})
+  endforeach()
+endif()
+
+if(OPENCL_LIB)
+  target_link_directories(run_squeezenet_io_buffer PRIVATE ${OPENCL_LIB})
+  target_link_libraries(run_squeezenet_io_buffer PRIVATE OpenCL.lib)
+endif()
+
+# In the default onnxruntime install path, the required dlls are in the lib and bin folders
+set(DLL_DIRS "${ONNXRUNTIME_ROOTDIR}/lib;${ONNXRUNTIME_ROOTDIR}/bin")
+foreach(DLL_DIR IN LISTS DLL_DIRS)
+  file(GLOB ALL_DLLS ${DLL_DIR}/*.dll)
+  foreach(ORTDll IN LISTS ALL_DLLS)
+    add_custom_command(TARGET run_squeezenet_io_buffer POST_BUILD
+                       COMMAND ${CMAKE_COMMAND} -E copy_if_different
+                       "${ORTDll}" $<TARGET_FILE_DIR:run_squeezenet_io_buffer>)
+  endforeach()
+endforeach()
\ No newline at end of file
diff --git a/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/squeezenet_cpp_app_io_buffer.cpp b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/squeezenet_cpp_app_io_buffer.cpp
new file mode 100644
index 000000000..81235cfbc
--- /dev/null
+++ b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/squeezenet_cpp_app_io_buffer.cpp
@@ -0,0 +1,436 @@
+/*
+Copyright (C) 2021, Intel Corporation
+SPDX-License-Identifier: Apache-2.0
+Portions of this software are copyright of their respective authors and released under the MIT license:
+- ONNX-Runtime-Inference, Copyright 2020 Lei Mao.
+  For licensing see https://github.com/leimao/ONNX-Runtime-Inference/blob/main/LICENSE.md
+*/
+
+#include <onnxruntime_cxx_api.h>
+#include <opencv2/dnn/dnn.hpp>
+#include <opencv2/imgcodecs.hpp>
+#include <opencv2/imgproc.hpp>
+
+#include <chrono>
+#include <cmath>
+#include <exception>
+#include <fstream>
+#include <iostream>
+#include <limits>
+#include <numeric>
+#include <string>
+#include <vector>
+#include <stdexcept> // To use runtime_error
+#include <windows.h>
+#include <psapi.h>
+#include <CL/cl2.hpp>
+
+#define CL_HPP_MINIMUM_OPENCL_VERSION 120
+#define CL_HPP_TARGET_OPENCL_VERSION 120
+
+std::size_t GetPeakWorkingSetSize() {
+    PROCESS_MEMORY_COUNTERS pmc;
+    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
+        return pmc.PeakWorkingSetSize;
+    }
+    return 0;
+}
+
+struct OpenCL {
+    cl::Context _context;
+    cl::Device _device;
+    cl::CommandQueue _queue;
+
+    explicit OpenCL(std::shared_ptr<std::vector<cl_context_properties>> media_api_context_properties = nullptr) {
+        // get Intel iGPU OCL device, create context and queue
+        {
+            const unsigned int refVendorID = 0x8086;
+            cl_uint n = 0;
+            clGetPlatformIDs(0, NULL, &n);
+
+            // Get platform list
+            std::vector<cl_platform_id> platform_ids(n);
+            clGetPlatformIDs(n, platform_ids.data(), NULL);
+
+            for (auto& id : platform_ids) {
+                cl::Platform platform = cl::Platform(id);
+                std::vector<cl::Device> devices;
+                platform.getDevices(CL_DEVICE_TYPE_GPU, &devices);
+                for (auto& d : devices) {
+                    if (refVendorID == d.getInfo<CL_DEVICE_VENDOR_ID>()) {
+                        _device = d;
+                        _context = cl::Context(_device);
+                        break;
+                    }
+                }
+            }
+            cl_command_queue_properties props = CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE;
+            _queue = cl::CommandQueue(_context, _device, props);
+        }
+    }
+
+    explicit OpenCL(cl_context context) {
+        // user-supplied context handle
+        _context = cl::Context(context);
+        _device = cl::Device(_context.getInfo<CL_CONTEXT_DEVICES>()[0]);
+
+        cl_command_queue_properties props = CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE;
+        _queue = cl::CommandQueue(_context, _device, props);
+    }
+};
+
+template <typename T>
+T vectorProduct(const std::vector<T>& v)
+{
+    return accumulate(v.begin(), v.end(), 1, std::multiplies<T>());
+}
+
+/**
+ * @brief Operator overloading for printing vectors
+ * @tparam T
+ * @param os
+ * @param v
+ * @return std::ostream&
+ */
+template <typename T>
+std::ostream& operator<<(std::ostream& os, const std::vector<T>& v)
+{
+    os << "[";
+    for (int i = 0; i < v.size(); ++i)
+    {
+        os << v[i];
+        if (i != v.size() - 1)
+        {
+            os << ", ";
+        }
+    }
+    os << "]";
+    return os;
+}
+
+// Function to validate the input image file extension.
+bool imageFileExtension(std::string str)
+{
+    // is empty throw error
+    if (str.empty())
+        throw std::runtime_error("[ ERROR ] The image File path is empty");
+
+    size_t pos = str.rfind('.');
+    if (pos == std::string::npos)
+        return false;
+
+    std::string ext = str.substr(pos+1);
+
+    if (ext == "jpg" || ext == "jpeg" || ext == "gif" || ext == "png" || ext == "jfif" ||
+        ext == "JPG" || ext == "JPEG" || ext == "GIF" || ext == "PNG" || ext == "JFIF") {
+        return true;
+    }
+
+    return false;
+}
+
+// Function to read the labels from the labelFilepath.
+std::vector<std::string> readLabels(std::string& labelFilepath)
+{
+    std::vector<std::string> labels;
+    std::string line;
+    std::ifstream fp(labelFilepath);
+    while (std::getline(fp, line))
+    {
+        labels.push_back(line);
+    }
+    return labels;
+}
+
+// Function to validate the input model file extension.
+bool checkModelExtension(const std::string& filename)
+{
+    if(filename.empty())
+    {
+        throw std::runtime_error("[ ERROR ] The Model file path is empty");
+    }
+    size_t pos = filename.rfind('.');
+    if (pos == std::string::npos)
+        return false;
+    std::string ext = filename.substr(pos+1);
+    if (ext == "onnx")
+        return true;
+    return false;
+}
+
+// Function to validate the Label file extension.
+bool checkLabelFileExtension(const std::string& filename)
+{
+    size_t pos = filename.rfind('.');
+    if (filename.empty())
+    {
+        throw std::runtime_error("[ ERROR ] The Label file path is empty");
+    }
+    if (pos == std::string::npos)
+        return false;
+    std::string ext = filename.substr(pos+1);
+    if (ext == "txt") {
+        return true;
+    } else {
+        return false;
+    }
+}
+
+//Handling divide by zero
+float division(float num, float den){
+    if (den == 0) {
+        throw std::runtime_error("[ ERROR ] Math error: Attempted to divide by Zero\n");
+    }
+    return (num / den);
+}
+
+void printHelp() {
+    std::cout << "To run the model, use the following command:\n";
+    std::cout << "Example: ./run_squeezenet <path_to_the_model> <path_to_the_image> <path_to_the_labels_file>" << std::endl;
+    std::cout << "\n Example: ./run_squeezenet squeezenet1.1-7.onnx demo.jpeg synset.txt \n" << std::endl;
+}
+
+int main(int argc, char* argv[])
+{
+
+    if(argc == 2) {
+        std::string option = argv[1];
+        if (option == "--help" || option == "-help" || option == "--h" || option == "-h") {
+            printHelp();
+        }
+        return 0;
+    } else if(argc != 4) {
+        std::cout << "[ ERROR ] you have used the wrong command to run your program." << std::endl;
+        printHelp();
+        return 0;
+    }
+
+    std::string instanceName{"image-classification-inference"};
+    #ifdef _WIN32
+    std::string str_arg1 = argv[1];
+    std::wstring wide_string_arg1 = std::wstring(str_arg1.begin(), str_arg1.end());
+    std::basic_string<ORTCHAR_T> modelFilepath = std::basic_string<ORTCHAR_T>(wide_string_arg1);
+    #else
+    std::string modelFilepath = argv[1]; // .onnx file
+
+    //validate ModelFilePath
+    checkModelExtension(modelFilepath);
+    if(!checkModelExtension(modelFilepath)) {
+        throw std::runtime_error("[ ERROR ] The ModelFilepath is not correct. Make sure you are setting the path to an onnx model file (.onnx)");
+    }
+    #endif
+    std::string imageFilepath = argv[2];
+
+    // Validate ImageFilePath
+    imageFileExtension(imageFilepath);
+    if(!imageFileExtension(imageFilepath)) {
+        throw std::runtime_error("[ ERROR ] The imageFilepath doesn't have correct image extension. Choose from jpeg, jpg, gif, png, PNG, jfif");
+    }
+    std::ifstream f(imageFilepath.c_str());
+    if(!f.good()) {
+        throw std::runtime_error("[ ERROR ] The imageFilepath is not set correctly or doesn't exist");
+    }
+
+    // Validate LabelFilePath
+    std::string labelFilepath = argv[3];
+    if(!checkLabelFileExtension(labelFilepath)) {
+        throw std::runtime_error("[ ERROR ] The LabelFilepath is not set correctly and the labels file should end with extension .txt");
+    }
+
+    std::vector<std::string> labels{readLabels(labelFilepath)};
+
+    Ort::Env env(OrtLoggingLevel::ORT_LOGGING_LEVEL_FATAL, instanceName.c_str());
+    Ort::SessionOptions sessionOptions;
+    sessionOptions.SetIntraOpNumThreads(1);
+
+    auto ocl_instance = std::make_shared<OpenCL>();
+
+    //Appending OpenVINO Execution Provider API
+    // Using OPENVINO backend
+    OrtOpenVINOProviderOptions options;
+    options.device_type = "GPU_FP32"; //Other options are: GPU_FP16, GPU.1_FP32, GPU.1_FP16, GPU.0_FP32, GPU.0_FP16
+    options.context = (void *) ocl_instance->_context.get();
+    std::cout << "OpenVINO device type is set to: " << options.device_type << std::endl;
+    sessionOptions.AppendExecutionProvider_OpenVINO(options);
+
+    // Sets graph optimization level
+    // Available levels are
+    // ORT_DISABLE_ALL -> To disable all optimizations
+    // ORT_ENABLE_BASIC -> To enable basic optimizations (Such as redundant node removals)
+    // ORT_ENABLE_EXTENDED -> To enable extended optimizations (Includes level 1 + more complex optimizations like node fusions)
+    // ORT_ENABLE_ALL -> To Enable All possible optimizations
+    sessionOptions.SetGraphOptimizationLevel(
+        GraphOptimizationLevel::ORT_DISABLE_ALL);
+
+    //Creation: The Ort::Session is created here
+    Ort::Session session(env, modelFilepath.c_str(), sessionOptions);
+
+    Ort::AllocatorWithDefaultOptions allocator;
+    Ort::MemoryInfo info_gpu("OpenVINO_GPU", OrtAllocatorType::OrtDeviceAllocator, 0, OrtMemTypeDefault);
+
+    size_t numInputNodes = session.GetInputCount();
+    size_t numOutputNodes = session.GetOutputCount();
+
+    std::cout << "Number of Input Nodes: " << numInputNodes << std::endl;
+    std::cout << "Number of Output Nodes: " << numOutputNodes << std::endl;
+
+    auto inputNodeName = session.GetInputNameAllocated(0, allocator);
+    const char* inputName = inputNodeName.get();
+    std::cout << "Input Name: " << inputName << std::endl;
+
+    Ort::TypeInfo inputTypeInfo = session.GetInputTypeInfo(0);
+    auto inputTensorInfo = inputTypeInfo.GetTensorTypeAndShapeInfo();
+
+    ONNXTensorElementDataType inputType = inputTensorInfo.GetElementType();
+    std::cout << "Input Type: " << inputType << std::endl;
+
+    std::vector<int64_t> inputDims = inputTensorInfo.GetShape();
+    std::cout << "Input Dimensions: " << inputDims << std::endl;
+
+    auto outputNodeName = session.GetOutputNameAllocated(0, allocator);
+    const char* outputName = outputNodeName.get();
+    std::cout << "Output Name: " << outputName << std::endl;
+
+    Ort::TypeInfo outputTypeInfo = session.GetOutputTypeInfo(0);
+    auto outputTensorInfo = outputTypeInfo.GetTensorTypeAndShapeInfo();
+
+    ONNXTensorElementDataType outputType = outputTensorInfo.GetElementType();
+    std::cout << "Output Type: " << outputType << std::endl;
+
+    std::vector<int64_t> outputDims = outputTensorInfo.GetShape();
+    std::cout << "Output Dimensions: " << outputDims << std::endl;
+
+    //pre-processing the Image
+    // step 1: Read an image in HWC BGR UINT8 format.
+    cv::Mat imageBGR = cv::imread(imageFilepath, cv::ImreadModes::IMREAD_COLOR);
+
+    // step 2: Resize the image.
+    cv::Mat resizedImageBGR, resizedImageRGB, resizedImage, preprocessedImage;
+    cv::resize(imageBGR, resizedImageBGR,
+               cv::Size(inputDims.at(3), inputDims.at(2)),
+               cv::InterpolationFlags::INTER_CUBIC);
+
+    // step 3: Convert the image to HWC RGB UINT8 format.
+    cv::cvtColor(resizedImageBGR, resizedImageRGB,
+                 cv::ColorConversionCodes::COLOR_BGR2RGB);
+    // step 4: Convert the image to HWC RGB float format by dividing each pixel by 255.
+    resizedImageRGB.convertTo(resizedImage, CV_32F, 1.0 / 255);
+
+    // step 5: Split the RGB channels from the image.
+    cv::Mat channels[3];
+    cv::split(resizedImage, channels);
+
+    //step 6: Normalize each channel.
+    // Normalization per channel
+    // Normalization parameters obtained from
+    // https://github.com/onnx/models/tree/master/vision/classification/squeezenet
+    channels[0] = (channels[0] - 0.485) / 0.229;
+    channels[1] = (channels[1] - 0.456) / 0.224;
+    channels[2] = (channels[2] - 0.406) / 0.225;
+
+    //step 7: Merge the RGB channels back to the image.
+    cv::merge(channels, 3, resizedImage);
+
+    // step 8: Convert the image to CHW RGB float format.
+    // HWC to CHW
+    cv::dnn::blobFromImage(resizedImage, preprocessedImage);
+
+    //Run Inference
+    std::vector<const char*> inputNames{inputName};
+    std::vector<const char*> outputNames{outputName};
+
+    /* To run inference using ONNX Runtime, the user is responsible for creating and managing the
+       input and output buffers. The buffers are IO Binding Buffers created on Remote Folders to create Remote Blob*/
+
+    size_t inputTensorSize = vectorProduct(inputDims);
+    std::vector<float> inputTensorValues(inputTensorSize);
+    inputTensorValues.assign(preprocessedImage.begin<float>(),
+                             preprocessedImage.end<float>());
+
+    size_t imgSize = inputTensorSize*4;
+    cl_int err;
+    cl::Buffer shared_buffer(ocl_instance->_context, CL_MEM_READ_WRITE, imgSize, NULL, &err);
+    {
+        void *buffer = (void *)preprocessedImage.ptr();
+        ocl_instance->_queue.enqueueWriteBuffer(shared_buffer, true, 0, imgSize, buffer);
+    }
+
+    //To pass to OrtValue wrap the buffer in shared buffer
+    void *shared_buffer_void = static_cast<void*>(&shared_buffer);
+    size_t outputTensorSize = vectorProduct(outputDims);
+    std::vector<float> outputTensorValues(outputTensorSize);
+
+    Ort::Value inputTensors = Ort::Value::CreateTensor(
+        info_gpu, shared_buffer_void, imgSize, inputDims.data(),
+        inputDims.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT);
+
+    assert(("Output tensor size should equal to the label set size.",
+            labels.size() == outputTensorSize));
+
+    size_t imgSizeO = outputTensorSize*4;
+    cl::Buffer shared_buffer_out(ocl_instance->_context, CL_MEM_READ_WRITE, imgSizeO, NULL, &err);
+
+    //To pass the ORT Value wrap the output buffer in shared buffer
+    void *shared_buffer_out_void = static_cast<void*>(&shared_buffer_out);
+    Ort::Value outputTensors = Ort::Value::CreateTensor(
+        info_gpu, shared_buffer_out_void, imgSizeO, outputDims.data(),
+        outputDims.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT);
+
+    std::cout << "Before Running\n";
+    session.Run(Ort::RunOptions{nullptr}, inputNames.data(), &inputTensors, 1, outputNames.data(), &outputTensors, 1);
+
+    int predId = 0;
+    float activation = 0;
+    float maxActivation = std::numeric_limits<float>::lowest();
+    float expSum = 0;
+
+    uint8_t *ary = (uint8_t*) malloc(imgSizeO);
+    ocl_instance->_queue.enqueueReadBuffer(shared_buffer_out, true, 0, imgSizeO, ary);
+
+    //float outputTensorArray[outputTensorSize];
+    float* outputTensorArray = new float[outputTensorSize];
+
+    std::memcpy(outputTensorArray, ary, imgSizeO);
+    /* The inference result could be found in the buffer for the output
+       tensors, which are usually the buffer from std::vector instances. */
+
+    for (int i = 0; i < labels.size(); i++) {
+        activation = outputTensorArray[i];
+        expSum += std::exp(activation);
+        if (activation > maxActivation)
+        {
+            predId = i;
+            maxActivation = activation;
+        }
+    }
+    std::cout << "Predicted Label ID: " << predId << std::endl;
+    std::cout << "Predicted Label: " << labels.at(predId) << std::endl;
+    float result;
+    try {
+        result = division(std::exp(maxActivation), expSum);
+        std::cout << "Uncalibrated Confidence: " << result << std::endl;
+    }
+    catch (std::runtime_error& e) {
+        std::cout << "Exception occurred" << std::endl << e.what();
+    }
+
+    // Measure latency
+    int numTests{10};
+    std::chrono::steady_clock::time_point begin =
+        std::chrono::steady_clock::now();
+
+    //Run: Running the session is done in the Run() method:
+    for (int i = 0; i < numTests; i++) {
+        session.Run(Ort::RunOptions{nullptr}, inputNames.data(), &inputTensors, 1, outputNames.data(), &outputTensors, 1);
+    }
+    std::chrono::steady_clock::time_point end =
+        std::chrono::steady_clock::now();
+    std::cout << "Minimum Inference Latency: "
+              << std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count() / static_cast<float>(numTests)
+              << " ms" << std::endl;
+    size_t mem_size = GetPeakWorkingSetSize();
+    std::cout << "Peak working set size: " << mem_size << " bytes" << std::endl;
+    return 0;
+}
diff --git a/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/synset.txt b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/synset.txt
new file mode 120000
index 000000000..2429a9603
--- /dev/null
+++ b/c_cxx/OpenVINO_EP/Windows/squeezenet_classification_io_buffer/synset.txt
@@ -0,0 +1 @@
+../../Linux/squeezenet_classification/synset.txt
\ No newline at end of file
diff --git a/c_sharp/OpenVINO_EP/yolov3_object_detection/README.md b/c_sharp/OpenVINO_EP/yolov3_object_detection/README.md
index a73d2d2ee..b45ca49b0 100644
--- a/c_sharp/OpenVINO_EP/yolov3_object_detection/README.md
+++ b/c_sharp/OpenVINO_EP/yolov3_object_detection/README.md
@@ -32,36 +32,38 @@ To build nuget packages of onnxruntime with openvino flavour
 ```
 dotnet new console
 ```
-Replace the sample scripts with the one [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_sharp/OpenVINO_EP/yolov3_object_detection)
-2. Install Nuget Packages of Onnxruntime and [ImageSharp](https://www.nuget.org/packages/SixLabors.ImageSharp)
-   * Using Visual Studio
-     1. Open the Visual C# Project file (.csproj) using VS19.
-     2. Right click on project, navigate to manage Nuget Packages.
-     3. Install SixLabors.ImageSharp, SixLabors.Core, SixLabors.Fonts and SixLabors.ImageSharp.Drawing Packages from nuget.org.
-     4. Install Microsoft.ML.OnnxRuntime.Managed and Microsoft.ML.OnnxRuntime.Openvino from your build directory nuget-artifacts.
-   * Using cmd
-     ```
-     mkdir [source-folder]
-     cd [console-project-folder]
-     dotnet add package SixLabors.ImageSharp
-     dotnet add package SixLabors.Core
-     dotnet add package SixLabors.Fonts
-     dotnet add package SixLabors.ImageSharp.Drawing
-     ```
-     Add Microsoft.ML.OnnxRuntime.Managed and Microsoft.ML.OnnxRuntime.Openvino packages.
-     ```
-     nuget add [path-to-nupkg] -Source [source-path]
-     dotnet add package [nuget=package-name] -v [package-version] -s [source-path]
-     ```
-3. Compile the sample
-   ```
-   dotnet build
-   ```
-
-4. Run the sample
-   ```
-   dotnet run [path-to-model] [path-to-image] [path-to-output-image]
-   ```
+2. Replace the sample scripts with the one [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_sharp/OpenVINO_EP/yolov3_object_detection)
+
+3. Install Nuget Packages of Onnxruntime and [ImageSharp](https://www.nuget.org/packages/SixLabors.ImageSharp)
+   * Using Visual Studio
+     1. Open the Visual C# Project file (.csproj) using VS19.
+     2. Right click on the project and navigate to Manage Nuget Packages.
+     3. Install the SixLabors.ImageSharp, SixLabors.Core, SixLabors.Fonts and SixLabors.ImageSharp.Drawing packages from nuget.org.
+     4. Install Microsoft.ML.OnnxRuntime.Managed and Microsoft.ML.OnnxRuntime.Openvino from your build directory nuget-artifacts.
+   * Using cmd
+     ```
+     mkdir [source-folder]
+     cd [console-project-folder]
+     dotnet add package SixLabors.ImageSharp
+     dotnet add package SixLabors.Core
+     dotnet add package SixLabors.Fonts
+     dotnet add package SixLabors.ImageSharp.Drawing
+     ```
+     Add the Microsoft.ML.OnnxRuntime.Managed and Microsoft.ML.OnnxRuntime.Openvino packages.
+     ```
+     nuget add [path-to-nupkg] -Source [source-path]
+     dotnet add package [nuget-package-name] -v [package-version] -s [source-path]
+     ```
+
+4. Compile the sample
+   ```
+   dotnet build
+   ```
+
+5. Run the sample
+   ```
+   dotnet run [path-to-model] [path-to-image] [path-to-output-image]
+   ```

 ## References: