xt::transpose() is much slower than cv::transposeND() and ndarray.transpose() #2859

@04633435

Hi there,

I am migrating a project from Python to C++. The project involves many N-D array operations, and I found that xtensor excels at these in C++.

However, the performance of xt::transpose() is not acceptable: it is much slower than cv::transposeND(). The code snippet is shown below.

int main()
{
    //! execution time comparison between xtensor and OpenCV
    // Define the dimensions
    size_t batch = 1;
    size_t channels = 3;
    size_t height = 1920;
    size_t width = 1080;
    
    // --- xtensor Operations ---
    
    // Create a random xtensor with shape {1, 3, 1920, 1080}
    xt::xarray<float> xt_array = xt::random::rand<float>({batch, channels, height, width});
    
    // Add 1 to all elements
    xt_array += 1.0f;
    
    // Perform the transpose and measure its time
    xt::xarray<float> xt_transposed;
    MEASURE_TIME(
        xt_transposed = xt::transpose(xt_array, {0, 2, 3, 1}),
        "xtensor transpose"
    );
    
    // --- OpenCV Operations ---
    
    // Create a random cv::Mat with shape {1, 3, 1920, 1080}
    std::vector<int> mat_sizes = {(int)batch, (int)channels, (int)height, (int)width};
    cv::Mat cv_mat(mat_sizes, CV_32F);
    cv::randu(cv_mat, 0.0f, 1.0f);
    
    // Add 1 to all elements
    cv_mat += 1.0f;
    
    // Perform the transpose and measure its time
    cv::Mat cv_transposed;
    std::vector<int> transpose_axes = {0, 2, 3, 1};
    MEASURE_TIME(
        cv::transposeND(cv_mat, transpose_axes, cv_transposed),
        "OpenCV transposeND"
    );
    //* OUTPUT:
    //* xtensor transpose execution time: 383.061 milliseconds
    //* OpenCV transposeND execution time: 50.024 milliseconds
    //? How about numpy transpose?

    // --- Verification (Optional) ---
    
    // Print shapes to verify the transpose was successful
    std::cout << "\nVerifying shapes:" << std::endl;
    std::cout << "Original xtensor shape: " << xt::adapt(xt_array.shape()) << std::endl;
    std::cout << "Transposed xtensor shape: " << xt::adapt(xt_transposed.shape()) << std::endl;
    std::cout << "Original cv::Mat shape: [" << cv_mat.size[0] << ", " << cv_mat.size[1] << ", " << cv_mat.size[2] << ", " << cv_mat.size[3] << "]" << std::endl;
    std::cout << "Transposed cv::Mat shape: [" << cv_transposed.size[0] << ", " << cv_transposed.size[1] << ", " << cv_transposed.size[2] << ", " << cv_transposed.size[3] << "]" << std::endl;

    return 0;
}
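
The snippet relies on a MEASURE_TIME macro whose definition is not included in the post. A minimal sketch of what such a macro might look like, assuming it simply wraps the expression in a `std::chrono::steady_clock` timer and prints the label followed by the elapsed milliseconds (the helper name `measure_ms` is hypothetical):

```cpp
#include <chrono>
#include <iostream>

// Hypothetical stand-in for the MEASURE_TIME macro used above;
// the original definition is not shown in the post.

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double measure_ms(F&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Evaluates expr once and prints "<label> execution time: <ms> milliseconds".
#define MEASURE_TIME(expr, label)                                   \
    do {                                                            \
        const double elapsed_ms_ = measure_ms([&]() { expr; });     \
        std::cout << (label) << " execution time: " << elapsed_ms_  \
                  << " milliseconds" << std::endl;                  \
    } while (0)
```

A single run like this also includes one-off costs such as allocating the destination buffer, so repeating the measurement and discarding the first iteration would likely give steadier numbers.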

The measured times are:

    //* OUTPUT:
    //* xtensor transpose execution time: 383.061 milliseconds
    //* OpenCV transposeND execution time: 50.024 milliseconds

Is this result expected? Any clue would be appreciated. Thank you in advance.
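
For a third reference point, the same {0, 2, 3, 1} permutation can be written as a plain loop over a contiguous buffer. This is only an illustrative baseline, assuming row-major float storage; it is not how xtensor or OpenCV implement their transpose, and the function name `nchw_to_nhwc` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Permutes a contiguous row-major {N, C, H, W} buffer into {N, H, W, C},
// i.e. the axis order {0, 2, 3, 1} used in the benchmark above.
std::vector<float> nchw_to_nhwc(const std::vector<float>& src,
                                std::size_t n, std::size_t c,
                                std::size_t h, std::size_t w)
{
    assert(src.size() == n * c * h * w);
    std::vector<float> dst(src.size());
    const std::size_t plane = h * w;  // number of pixels per channel
    for (std::size_t in = 0; in < n; ++in) {
        const float* s = src.data() + in * c * plane;
        float* d = dst.data() + in * plane * c;
        for (std::size_t p = 0; p < plane; ++p) {
            // Gather the c channel values of pixel p into adjacent slots.
            for (std::size_t ic = 0; ic < c; ++ic) {
                d[p * c + ic] = s[ic * plane + p];
            }
        }
    }
    return dst;
}
```

Timing this loop on a {1, 3, 1920, 1080} buffer next to the two library calls would help show whether the gap comes from the permutation itself or from how the result is evaluated and assigned.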
