
⚡️ Speed up function shear by 10%#26

Open
codeflash-ai[bot] wants to merge 2 commits into main from codeflash/optimize-shear-m8oa12js

Conversation


@codeflash-ai codeflash-ai bot commented Mar 25, 2025

📄 10% (0.10x) speedup for shear in kornia/geometry/transform/affwarp.py

⏱️ Runtime: 17.2 milliseconds → 15.6 milliseconds (best of 33 runs)

📝 Explanation and details

Optimizations Made

  1. Direct Tensor Manipulations: Avoided unnecessary tensor chunk operations in _compute_shear_matrix by directly accessing and updating elements.

  2. Dimension Methods: Replaced ndimension() with dim() for checking tensor dimensions, which is consistent with PyTorch updates and potentially more performant.

  3. Batch Size Handling: Optimized batch size handling in the affine function by retrieving tensor dimensions only once and minimizing redundant calculations.

These optimizations improve computational efficiency by minimizing unnecessary operations and restructuring logic to make better use of PyTorch operations internally. The overall logic and results of the operations remain unchanged, ensuring that the function outputs are consistent with previous versions.
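The first optimization can be sketched as follows. This is an illustrative reconstruction, not the actual diff: the chunk-based and direct constructions below are assumptions based on kornia's shear-matrix convention (`[[1, sx, 0], [sy, 1, 0]]`), and the function names are hypothetical.

```python
import torch

def compute_shear_matrix_chunked(shear: torch.Tensor) -> torch.Tensor:
    # Chunk-based construction (illustrative of the pre-optimization style):
    # split the (B, 2) shear factors, then concatenate rows back together.
    sx, sy = torch.chunk(shear, chunks=2, dim=-1)
    ones = torch.ones_like(sx)
    zeros = torch.zeros_like(sx)
    row0 = torch.cat([ones, sx, zeros], dim=-1)   # [1, sx, 0]
    row1 = torch.cat([sy, ones, zeros], dim=-1)   # [sy, 1, 0]
    return torch.stack([row0, row1], dim=-2)      # (B, 2, 3)

def compute_shear_matrix_direct(shear: torch.Tensor) -> torch.Tensor:
    # Direct construction (illustrative of the optimized style): start from an
    # identity affine matrix and write the shear factors into it in place,
    # avoiding the intermediate chunk/cat/stack tensors.
    matrix = torch.eye(2, 3, dtype=shear.dtype).repeat(shear.shape[0], 1, 1)
    matrix[:, 0, 1] = shear[:, 0]  # x-shear
    matrix[:, 1, 0] = shear[:, 1]  # y-shear
    return matrix

s = torch.tensor([[0.5, 0.2], [0.1, -0.3]])
assert torch.allclose(compute_shear_matrix_chunked(s),
                      compute_shear_matrix_direct(s))
```

Both variants produce the same `(B, 2, 3)` affine matrix; the direct version simply allocates fewer intermediates.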

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 44 Passed |
| 🌀 Generated Regression Tests | 26 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests Details
- geometry/transform/test_affine.py
🌀 Generated Regression Tests Details
from __future__ import annotations

from typing import Optional

# imports
import pytest  # used for our unit tests
import torch
import torch.nn.functional as F
from kornia.core import eye, tensor, zeros
from kornia.geometry.conversions import (convert_affinematrix_to_homography,
                                         normalize_homography)
from kornia.geometry.transform.affwarp import shear
from kornia.geometry.transform.imgwarp import warp_affine
from kornia.utils.helpers import _torch_inverse_cast
from kornia.utils.misc import eye_like
from torch import Tensor
# function to test
# LICENSE HEADER MANAGED BY add-license-header
#
# Copyright 2018 Kornia Team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#



__all__ = [
    "Affine",
    "Rescale",
    "Resize",
    "Rotate",
    "Scale",
    "Shear",
    "Translate",
    "affine",
    "affine3d",
    "rescale",
    "resize",
    "resize_to_be_divisible",
    "rotate",
    "rotate3d",
    "scale",
    "shear",
    "translate",
]
from kornia.geometry.transform.affwarp import shear




def eye_like(n: int, input: Tensor, shared_memory: bool = False) -> Tensor:
    r"""Return a 2-D tensor with ones on the diagonal and zeros elsewhere with the same batch size as the input.

    Args:
        n: the number of rows :math:`(N)`.
        input: image tensor that will determine the batch size of the output matrix.
          The expected shape is :math:`(B, *)`.
        shared_memory: when set, all samples in the batch will share the same memory.

    Returns:
       The identity matrix with the same batch size as the input :math:`(B, N, N)`.

    Notes:
        When the dimension to expand is of size 1, using torch.expand(...) yields the same tensor as torch.repeat(...)
        without using extra memory. Thus, when the tensor obtained by this method will be later assigned -
        use this method with shared_memory=False, otherwise, prefer using it with shared_memory=True.

    """
    if n <= 0:
        raise AssertionError(type(n), n)
    if len(input.shape) < 1:
        raise AssertionError(input.shape)

    identity = eye(n, device=input.device).type(input.dtype)

    return identity[None].expand(input.shape[0], n, n) if shared_memory else identity[None].repeat(input.shape[0], 1, 1)
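The expand-vs-repeat trade-off described in the Notes above can be made concrete with a small sketch using plain torch calls that mirror the body of `eye_like`:

```python
import torch

inp = torch.rand(4, 3, 8, 8)  # batch of 4 images
identity = torch.eye(3, device=inp.device).type(inp.dtype)

shared = identity[None].expand(inp.shape[0], 3, 3)   # views of one tensor; no copy
private = identity[None].repeat(inp.shape[0], 1, 1)  # 4 independent copies

private[0, 0, 1] = 0.5  # safe: only the first matrix in the batch changes
assert private[1, 0, 1].item() == 0.0
# An equivalent in-place write to `shared` would be unsafe, because all
# samples produced by expand() alias the same underlying memory.
```

This is why the docstring recommends `shared_memory=False` (repeat) whenever the returned matrices will later be written to.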



__all__ = [
    "get_affine_matrix2d",
    "get_affine_matrix3d",
    "get_perspective_transform",
    "get_perspective_transform3d",
    "get_projective_transform",
    "get_rotation_matrix2d",
    "get_shear_matrix2d",
    "get_shear_matrix3d",
    "get_translation_matrix2d",
    "homography_warp",
    "homography_warp3d",
    "invert_affine_transform",
    "projection_from_Rt",
    "remap",
    "warp_affine",
    "warp_affine3d",
    "warp_grid",
    "warp_grid3d",
    "warp_perspective",
    "warp_perspective3d",
]


def warp_affine(
    src: Tensor,
    M: Tensor,
    dsize: tuple[int, int],
    mode: str = "bilinear",
    padding_mode: str = "zeros",
    align_corners: bool = True,
    fill_value: Optional[Tensor] = None,  # needed for jit
) -> Tensor:
    r"""Apply an affine transformation to a tensor.

    .. image:: _static/img/warp_affine.png

    The function warp_affine transforms the source tensor using
    the specified matrix:

    .. math::
        \text{dst}(x, y) = \text{src} \left( M_{11} x + M_{12} y + M_{13} ,
        M_{21} x + M_{22} y + M_{23} \right )

    Args:
        src: input tensor of shape :math:`(B, C, H, W)`.
        M: affine transformation of shape :math:`(B, 2, 3)`.
        dsize: size of the output image (height, width).
        mode: interpolation mode to calculate output values ``'bilinear'`` | ``'nearest'``.
        padding_mode: padding mode for outside grid values ``'zeros'`` | ``'border'`` | ``'reflection'`` | ``'fill'``.
        align_corners : mode for grid_generation.
        fill_value: tensor of shape :math:`(3)` that fills the padding area. Only supported for RGB.

    Returns:
        the warped tensor with shape :math:`(B, C, H, W)`.

    .. note::
        This function is often used in conjunction with :func:`get_rotation_matrix2d`,
        :func:`get_shear_matrix2d`, :func:`get_affine_matrix2d`, :func:`invert_affine_transform`.

    .. note::
       See a working example `here <https://kornia.github.io/tutorials/nbs/rotate_affine.html>`__.

    Example:
       >>> img = torch.rand(1, 4, 5, 6)
       >>> A = torch.eye(2, 3)[None]
       >>> out = warp_affine(img, A, (4, 2), align_corners=True)
       >>> print(out.shape)
       torch.Size([1, 4, 4, 2])

    """
    if not isinstance(src, Tensor):
        raise TypeError(f"Input src type is not a Tensor. Got {type(src)}")

    if not isinstance(M, Tensor):
        raise TypeError(f"Input M type is not a Tensor. Got {type(M)}")

    if not len(src.shape) == 4:
        raise ValueError(f"Input src must be a BxCxHxW tensor. Got {src.shape}")

    if not (len(M.shape) == 3 and M.shape[-2:] == (2, 3)):
        raise ValueError(f"Input M must be a Bx2x3 tensor. Got {M.shape}")

    B, C, H, W = src.size()

    # we generate a 3x3 transformation matrix from 2x3 affine
    M_3x3: Tensor = convert_affinematrix_to_homography(M)
    dst_norm_trans_src_norm: Tensor = normalize_homography(M_3x3, (H, W), dsize)

    # src_norm_trans_dst_norm = torch.inverse(dst_norm_trans_src_norm)
    src_norm_trans_dst_norm = _torch_inverse_cast(dst_norm_trans_src_norm)

    grid = F.affine_grid(src_norm_trans_dst_norm[:, :2, :], [B, C, dsize[0], dsize[1]], align_corners=align_corners)

    if padding_mode == "fill":
        if fill_value is None:
            fill_value = zeros(3)
        return _fill_and_warp(src, grid, align_corners=align_corners, mode=mode, fill_value=fill_value)

    return F.grid_sample(src, grid, align_corners=align_corners, mode=mode, padding_mode=padding_mode)

# unit tests

def test_single_image_shearing():
    # Single image shearing with basic shear values
    tensor = torch.rand(3, 4, 4)  # CxHxW
    shear_factors = torch.tensor([[0.5, 0.0]])  # Batch size 1
    codeflash_output = shear(tensor, shear_factors)

def test_batch_image_shearing():
    # Batch image shearing with uniform shear values
    tensor = torch.rand(2, 3, 4, 4)  # BxCxHxW
    shear_factors = torch.tensor([[0.5, 0.0], [0.5, 0.0]])  # Batch size 2
    codeflash_output = shear(tensor, shear_factors)


def test_extreme_shear_values():
    # Test with extreme shear values
    tensor = torch.rand(1, 3, 4, 4)
    shear_factors = torch.tensor([[10.0, 10.0]])
    codeflash_output = shear(tensor, shear_factors)

def test_invalid_input_type():
    # Test with invalid input type
    with pytest.raises(TypeError):
        shear("not a tensor", torch.tensor([[0.5, 0.0]]))

def test_invalid_tensor_shape():
    # Test with invalid tensor shape
    with pytest.raises(ValueError):
        shear(torch.rand(3, 4), torch.tensor([[0.5, 0.0]]))


def test_interpolation_modes():
    # Test with different interpolation modes
    tensor = torch.rand(1, 3, 4, 4)
    shear_factors = torch.tensor([[0.5, 0.0]])
    codeflash_output = shear(tensor, shear_factors, mode='bilinear')
    codeflash_output = shear(tensor, shear_factors, mode='nearest')

def test_padding_modes():
    # Test with different padding modes
    tensor = torch.rand(1, 3, 4, 4)
    shear_factors = torch.tensor([[0.5, 0.0]])
    codeflash_output = shear(tensor, shear_factors, padding_mode='zeros')
    codeflash_output = shear(tensor, shear_factors, padding_mode='border')
    codeflash_output = shear(tensor, shear_factors, padding_mode='reflection')

def test_large_scale():
    # Test large scale input
    tensor = torch.rand(10, 3, 256, 256)  # 10 images, 3 channels, 256x256
    shear_factors = torch.rand(10, 2)  # Random shear factors for each image
    codeflash_output = shear(tensor, shear_factors)

def test_single_channel_image():
    # Test single channel image
    tensor = torch.rand(1, 1, 4, 4)  # Single channel
    shear_factors = torch.tensor([[0.5, 0.0]])
    codeflash_output = shear(tensor, shear_factors)

def test_non_square_images():
    # Test non-square images
    tensor = torch.rand(1, 3, 4, 8)  # Rectangular image
    shear_factors = torch.tensor([[0.5, 0.0]])
    codeflash_output = shear(tensor, shear_factors)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

from typing import Optional

# imports
import pytest
import torch
import torch.nn.functional as F
from kornia.core import eye, tensor, zeros
from kornia.geometry.conversions import (convert_affinematrix_to_homography,
                                         normalize_homography)
from kornia.geometry.transform.affwarp import shear
from kornia.geometry.transform.imgwarp import warp_affine
from kornia.utils.helpers import _torch_inverse_cast
from kornia.utils.misc import eye_like
from torch import Tensor

# unit tests

def test_identity_shear():
    # Test with no shear, should return the original tensor
    img = torch.rand(1, 3, 4, 4)
    shear_factor = torch.tensor([[0.0, 0.0]])
    codeflash_output = shear(img, shear_factor)

def test_simple_shear():
    # Test with a simple shear factor
    img = torch.rand(1, 3, 4, 4)
    shear_factor = torch.tensor([[0.1, 0.2]])
    codeflash_output = shear(img, shear_factor)

def test_negative_shear():
    # Test with negative shear factors
    img = torch.rand(1, 3, 4, 4)
    shear_factor = torch.tensor([[-0.1, -0.2]])
    codeflash_output = shear(img, shear_factor)

def test_invalid_tensor_type():
    # Test with invalid tensor type
    with pytest.raises(TypeError):
        shear([1, 2, 3], torch.tensor([[0.0, 0.0]]))

def test_invalid_shear_type():
    # Test with invalid shear type
    img = torch.rand(1, 3, 4, 4)
    with pytest.raises(TypeError):
        shear(img, [0.0, 0.0])

def test_invalid_tensor_shape():
    # Test with invalid tensor shape
    img = torch.rand(4, 4)
    with pytest.raises(ValueError):
        shear(img, torch.tensor([[0.0, 0.0]]))


def test_large_batch():
    # Test with a large batch size
    img = torch.rand(100, 3, 32, 32)
    shear_factor = torch.rand(100, 2)
    codeflash_output = shear(img, shear_factor)

def test_high_resolution_image():
    # Test with a high-resolution image
    img = torch.rand(1, 3, 512, 512)
    shear_factor = torch.tensor([[0.1, 0.2]])
    codeflash_output = shear(img, shear_factor)

def test_different_interpolation_modes():
    # Test with different interpolation modes
    img = torch.rand(1, 3, 4, 4)
    shear_factor = torch.tensor([[0.1, 0.2]])
    codeflash_output = shear(img, shear_factor, mode='nearest')
    codeflash_output = shear(img, shear_factor, mode='bilinear')

def test_different_padding_modes():
    # Test with different padding modes
    img = torch.rand(1, 3, 4, 4)
    shear_factor = torch.tensor([[0.1, 0.2]])
    codeflash_output = shear(img, shear_factor, padding_mode='zeros')
    codeflash_output = shear(img, shear_factor, padding_mode='border')
    codeflash_output = shear(img, shear_factor, padding_mode='reflection')
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-shear-m8oa12js` and push updates to the branch.

Codeflash

Ubuntu and others added 2 commits March 13, 2025 00:39
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Mar 25, 2025
@codeflash-ai codeflash-ai bot requested a review from dasarchan March 25, 2025 09:11
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!

@github-actions github-actions bot added the stale label Jan 14, 2026
