Conversation

@codeflash-ai codeflash-ai bot commented Jun 1, 2025

📄 71% (0.71x) speedup for retrieve_timesteps in src/diffusers/pipelines/ltx/pipeline_ltx.py

⏱️ Runtime : 842 microseconds → 493 microseconds (best of 589 runs)

📝 Explanation and details

Key Optimizations:

  • Parameter Inspection Caching: The dominant bottleneck was the repeated call to inspect.signature(...).parameters.keys() on every invocation. By caching the parameter set per scheduler class in _get_set_timesteps_param_set(cls), this cost is paid only once per scheduler class for the lifetime of the process (see the sketch below).
  • Minimal Imports: inspect is imported only when it is actually needed, inside the helper.
  • Logic flow and memory use remain identical; there is simply far less repeated, slow reflection.
  • The return value and all comments are preserved.

This greatly increases speed, especially for repeated calls and in hot loops.
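A minimal sketch of the caching pattern described above. The helper name _get_set_timesteps_param_set comes from the explanation; the use of functools.lru_cache and frozenset here is an assumption made for brevity (the PR may cache with a plain dict instead):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _get_set_timesteps_param_set(cls):
    # Reflection runs once per scheduler class for the process lifetime;
    # later calls are a cache hit. `inspect` is imported lazily inside the
    # helper, mirroring the "Minimal Imports" point above.
    import inspect

    return frozenset(inspect.signature(cls.set_timesteps).parameters.keys())


# Inside retrieve_timesteps, the per-call capability checks then reduce to
# cheap set-membership tests, e.g. (hypothetical usage):
#
#   accepts_timesteps = "timesteps" in _get_set_timesteps_param_set(scheduler.__class__)
#   accepts_sigmas = "sigmas" in _get_set_timesteps_param_set(scheduler.__class__)
```

These membership tests are what raise ValueError in the tests below when timesteps or sigmas are passed to a scheduler whose set_timesteps does not accept them.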

Correctness verification report:

Test                           Status
⚙️ Existing Unit Tests         🔘 None Found
🌀 Generated Regression Tests  22 Passed
⏪ Replay Tests                🔘 None Found
🔎 Concolic Coverage Tests     🔘 None Found
📊 Tests Coverage              100.0%
🌀 Generated Regression Tests Details
import inspect
from typing import List, Optional, Union

# imports
import pytest  # used for our unit tests
import torch
from src.diffusers.pipelines.ltx.pipeline_ltx import retrieve_timesteps

# ------------------------
# Mock Scheduler Classes
# ------------------------

class BasicScheduler:
    """
    Scheduler that supports set_timesteps(num_inference_steps, device)
    """
    def __init__(self):
        self.timesteps = None
        self.device = None
        self.set_timesteps_called_with = None

    def set_timesteps(self, num_inference_steps, device=None, **kwargs):
        self.device = device
        self.set_timesteps_called_with = num_inference_steps
        # For demonstration, create timesteps as a torch tensor from num_inference_steps-1 to 0
        self.timesteps = torch.arange(num_inference_steps-1, -1, -1, dtype=torch.long, device=device)

class TimestepsScheduler:
    """
    Scheduler that supports set_timesteps(timesteps, device)
    """
    def __init__(self):
        self.timesteps = None
        self.device = None
        self.set_timesteps_called_with = None

    def set_timesteps(self, timesteps, device=None, **kwargs):
        self.device = device
        self.set_timesteps_called_with = list(timesteps)
        # Store as torch tensor
        self.timesteps = torch.tensor(timesteps, dtype=torch.long, device=device)

class SigmasScheduler:
    """
    Scheduler that supports set_timesteps(sigmas, device)
    """
    def __init__(self):
        self.timesteps = None
        self.device = None
        self.set_timesteps_called_with = None

    def set_timesteps(self, sigmas, device=None, **kwargs):
        self.device = device
        self.set_timesteps_called_with = list(sigmas)
        # For this test, just store sigmas as float tensor (timesteps)
        self.timesteps = torch.tensor(sigmas, dtype=torch.float, device=device)

class FullScheduler:
    """
    Scheduler that supports set_timesteps(num_inference_steps, timesteps, sigmas, device)
    """
    def __init__(self):
        self.timesteps = None
        self.device = None
        self.set_timesteps_called_with = {}

    def set_timesteps(self, num_inference_steps=None, timesteps=None, sigmas=None, device=None, **kwargs):
        self.device = device
        self.set_timesteps_called_with = {
            "num_inference_steps": num_inference_steps,
            "timesteps": timesteps,
            "sigmas": sigmas,
        }
        if timesteps is not None:
            self.timesteps = torch.tensor(timesteps, dtype=torch.long, device=device)
        elif sigmas is not None:
            self.timesteps = torch.tensor(sigmas, dtype=torch.float, device=device)
        elif num_inference_steps is not None:
            self.timesteps = torch.arange(num_inference_steps-1, -1, -1, dtype=torch.long, device=device)
        else:
            self.timesteps = torch.tensor([], dtype=torch.long, device=device)

class NoCustomScheduler:
    """
    Scheduler that supports only num_inference_steps, not timesteps or sigmas
    """
    def __init__(self):
        self.timesteps = None
        self.device = None

    def set_timesteps(self, num_inference_steps, device=None, **kwargs):
        self.device = device
        self.timesteps = torch.arange(num_inference_steps-1, -1, -1, dtype=torch.long, device=device)

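# ------------------------
# Reference sketch (not part of the test suite)
# ------------------------
# The mocks above exercise roughly the following contract. This is a condensed,
# hypothetical sketch of the dispatch logic in retrieve_timesteps, included here
# only for orientation; it is not the exact diffusers source.

def _reference_retrieve_timesteps(scheduler, num_inference_steps=None, device=None,
                                  timesteps=None, sigmas=None, **kwargs):
    import inspect
    if timesteps is not None and sigmas is not None:
        # Mutual exclusion: only one custom schedule may be supplied.
        raise ValueError("Only one of `timesteps` or `sigmas` can be passed.")
    accepted = set(inspect.signature(scheduler.set_timesteps).parameters.keys())
    if timesteps is not None:
        if "timesteps" not in accepted:
            raise ValueError(f"{scheduler.__class__} does not support custom timesteps.")
        scheduler.set_timesteps(timesteps=timesteps, device=device, **kwargs)
        num_inference_steps = len(scheduler.timesteps)
    elif sigmas is not None:
        if "sigmas" not in accepted:
            raise ValueError(f"{scheduler.__class__} does not support custom sigmas.")
        scheduler.set_timesteps(sigmas=sigmas, device=device, **kwargs)
        num_inference_steps = len(scheduler.timesteps)
    else:
        scheduler.set_timesteps(num_inference_steps, device=device, **kwargs)
    return scheduler.timesteps, num_inference_steps
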
# ------------------------
# Unit Tests
# ------------------------

# 1. BASIC TEST CASES

def test_basic_num_inference_steps():
    # Test with num_inference_steps only (BasicScheduler)
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=5)

def test_basic_timesteps():
    # Test with custom timesteps (TimestepsScheduler)
    scheduler = TimestepsScheduler()
    custom_timesteps = [10, 20, 30]
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_basic_sigmas():
    # Test with custom sigmas (SigmasScheduler)
    scheduler = SigmasScheduler()
    custom_sigmas = [0.1, 0.2, 0.3]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

def test_basic_device_str():
    # Test with device as string
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=3, device="cpu")

def test_basic_device_torch_device():
    # Test with device as torch.device
    scheduler = BasicScheduler()
    device = torch.device("cpu")
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=2, device=device)

def test_basic_full_scheduler_num_inference_steps():
    # Test FullScheduler with num_inference_steps
    scheduler = FullScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=4)

def test_basic_full_scheduler_timesteps():
    # Test FullScheduler with timesteps
    scheduler = FullScheduler()
    custom_timesteps = [1, 4, 9]
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_basic_full_scheduler_sigmas():
    # Test FullScheduler with sigmas
    scheduler = FullScheduler()
    custom_sigmas = [0.5, 1.5]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

# 2. EDGE TEST CASES

def test_edge_timesteps_and_sigmas_both_given():
    # Passing both timesteps and sigmas should raise ValueError
    scheduler = FullScheduler()
    with pytest.raises(ValueError) as excinfo:
        retrieve_timesteps(scheduler, timesteps=[1,2], sigmas=[0.1,0.2])

def test_edge_timesteps_not_supported():
    # Passing timesteps to a scheduler that doesn't accept it should raise ValueError
    scheduler = BasicScheduler()
    with pytest.raises(ValueError) as excinfo:
        retrieve_timesteps(scheduler, timesteps=[1,2,3])

def test_edge_sigmas_not_supported():
    # Passing sigmas to a scheduler that doesn't accept it should raise ValueError
    scheduler = BasicScheduler()
    with pytest.raises(ValueError) as excinfo:
        retrieve_timesteps(scheduler, sigmas=[0.1,0.2])

def test_edge_timesteps_none_and_sigmas_none_and_num_inference_steps_none():
    # All three are None: should raise TypeError or fail in set_timesteps
    scheduler = BasicScheduler()
    with pytest.raises(TypeError):
        retrieve_timesteps(scheduler)

def test_edge_empty_timesteps():
    # Passing empty timesteps list should return empty tensor and 0 steps
    scheduler = TimestepsScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=[])

def test_edge_empty_sigmas():
    # Passing empty sigmas list should return empty tensor and 0 steps
    scheduler = SigmasScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=[])

def test_edge_num_inference_steps_zero():
    # Passing num_inference_steps=0 should return empty tensor and 0 steps
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=0)

def test_edge_negative_num_inference_steps():
    # Negative num_inference_steps: should return empty tensor and negative steps
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=-5)

def test_edge_non_integer_timesteps():
    # Timesteps with floats should still work but be cast to long
    scheduler = TimestepsScheduler()
    custom_timesteps = [1.1, 2.9, 3.5]
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_edge_non_integer_sigmas():
    # Sigmas with ints should be cast to float
    scheduler = SigmasScheduler()
    custom_sigmas = [1, 2, 3]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

def test_edge_kwargs_passed():
    # Ensure kwargs are passed to set_timesteps
    class KwargScheduler:
        def __init__(self):
            self.last_kwargs = None
            self.timesteps = None
        def set_timesteps(self, num_inference_steps, device=None, foo=None, **kwargs):
            self.last_kwargs = {"foo": foo, **kwargs}
            self.timesteps = torch.arange(num_inference_steps-1, -1, -1, dtype=torch.long, device=device)
    scheduler = KwargScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=3, foo="bar", extra=42)

def test_edge_scheduler_with_no_custom_support():
    # NoCustomScheduler: should work with num_inference_steps, but fail with timesteps or sigmas
    scheduler = NoCustomScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=3)
    with pytest.raises(ValueError):
        retrieve_timesteps(scheduler, timesteps=[1,2,3])
    with pytest.raises(ValueError):
        retrieve_timesteps(scheduler, sigmas=[0.1,0.2,0.3])

# 3. LARGE SCALE TEST CASES

def test_large_num_inference_steps():
    # Large num_inference_steps (but <1000 to avoid >100MB tensor)
    scheduler = BasicScheduler()
    N = 999
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=N)

def test_large_timesteps():
    # Large custom timesteps
    scheduler = TimestepsScheduler()
    N = 1000
    custom_timesteps = list(range(N))
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_large_sigmas():
    # Large custom sigmas
    scheduler = SigmasScheduler()
    N = 1000
    custom_sigmas = [float(i)/1000 for i in range(N)]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

def test_large_device_and_types():
    # Large scale with device parameter and type checks
    scheduler = FullScheduler()
    N = 500
    custom_timesteps = list(range(N))
    device = torch.device("cpu")
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps, device=device)

def test_large_multiple_calls_consistency():
    # Call retrieve_timesteps multiple times with different schedulers and large inputs
    N = 999
    schedulers = [BasicScheduler(), TimestepsScheduler(), FullScheduler()]
    for scheduler in schedulers:
        timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=N)

def test_large_sigmas_and_timesteps_mutual_exclusion():
    # Large sigmas and timesteps: should still raise ValueError
    scheduler = FullScheduler()
    N = 500
    with pytest.raises(ValueError):
        retrieve_timesteps(scheduler, timesteps=list(range(N)), sigmas=[float(i) for i in range(N)])
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import inspect
from typing import List, Optional, Union

# imports
import pytest  # used for our unit tests
import torch
from src.diffusers.pipelines.ltx.pipeline_ltx import retrieve_timesteps

# ---- Mock Schedulers for Testing ----

class BasicScheduler:
    """A simple scheduler supporting num_inference_steps and device."""
    def __init__(self):
        self.timesteps = None
        self.called_args = None

    def set_timesteps(self, num_inference_steps, device=None, **kwargs):
        # Store the arguments for verification
        self.called_args = (num_inference_steps, device)
        # Timesteps are simply a tensor from 0 to num_inference_steps-1
        self.timesteps = torch.arange(num_inference_steps, device=device, dtype=torch.long)

class TimestepsScheduler:
    """A scheduler that supports custom timesteps."""
    def __init__(self):
        self.timesteps = None
        self.called_args = None

    def set_timesteps(self, timesteps, device=None, **kwargs):
        self.called_args = (timesteps, device)
        # Accepts list of ints
        self.timesteps = torch.tensor(timesteps, device=device, dtype=torch.long)

class SigmasScheduler:
    """A scheduler that supports custom sigmas."""
    def __init__(self):
        self.timesteps = None
        self.called_args = None

    def set_timesteps(self, sigmas, device=None, **kwargs):
        self.called_args = (sigmas, device)
        # Accepts list of floats
        # For test, just convert sigmas to ints as timesteps
        self.timesteps = torch.tensor([int(s) for s in sigmas], device=device, dtype=torch.long)

class BothScheduler:
    """A scheduler that supports both timesteps and sigmas."""
    def __init__(self):
        self.timesteps = None
        self.called_args = None

    def set_timesteps(self, num_inference_steps=None, timesteps=None, sigmas=None, device=None, **kwargs):
        self.called_args = (num_inference_steps, timesteps, sigmas, device)
        if timesteps is not None:
            self.timesteps = torch.tensor(timesteps, device=device, dtype=torch.long)
        elif sigmas is not None:
            self.timesteps = torch.tensor([int(s) for s in sigmas], device=device, dtype=torch.long)
        elif num_inference_steps is not None:
            self.timesteps = torch.arange(num_inference_steps, device=device, dtype=torch.long)
        else:
            self.timesteps = torch.tensor([], device=device, dtype=torch.long)

class NoTimestepsScheduler:
    """A scheduler that does NOT support timesteps or sigmas."""
    def __init__(self):
        self.timesteps = None
        self.called_args = None

    def set_timesteps(self, num_inference_steps, device=None, **kwargs):
        self.called_args = (num_inference_steps, device)
        self.timesteps = torch.arange(num_inference_steps, device=device, dtype=torch.long)

# ------------------- Unit Tests -------------------

# 1. BASIC TEST CASES

def test_basic_num_inference_steps():
    # Test normal use with num_inference_steps
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=5)

def test_basic_timesteps_list():
    # Test with custom timesteps using TimestepsScheduler
    scheduler = TimestepsScheduler()
    custom_timesteps = [0, 2, 4, 6]
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_basic_sigmas_list():
    # Test with custom sigmas using SigmasScheduler
    scheduler = SigmasScheduler()
    custom_sigmas = [1.0, 2.0, 3.0]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

def test_basic_device_cpu():
    # Test that device is passed and tensor is on correct device (cpu)
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=3, device="cpu")


def test_basic_scheduler_with_kwargs():
    # Test that extra kwargs are passed to set_timesteps
    class KwargsScheduler(BasicScheduler):
        def set_timesteps(self, num_inference_steps, device=None, foo=None, **kwargs):
            self.foo = foo
            super().set_timesteps(num_inference_steps, device=device)
    scheduler = KwargsScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=2, foo="bar")

# 2. EDGE TEST CASES

def test_timesteps_and_sigmas_both_raise():
    # Passing both timesteps and sigmas should raise ValueError
    scheduler = BothScheduler()
    with pytest.raises(ValueError) as excinfo:
        retrieve_timesteps(scheduler, timesteps=[1,2,3], sigmas=[0.1,0.2,0.3])

def test_timesteps_not_supported_raises():
    # If scheduler does not support timesteps, should raise
    scheduler = NoTimestepsScheduler()
    with pytest.raises(ValueError) as excinfo:
        retrieve_timesteps(scheduler, timesteps=[1,2,3])

def test_sigmas_not_supported_raises():
    # If scheduler does not support sigmas, should raise
    scheduler = NoTimestepsScheduler()
    with pytest.raises(ValueError) as excinfo:
        retrieve_timesteps(scheduler, sigmas=[0.1,0.2])

def test_timesteps_empty_list():
    # Passing empty timesteps list should return empty tensor and 0 steps
    scheduler = TimestepsScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=[])

def test_sigmas_empty_list():
    # Passing empty sigmas list should return empty tensor and 0 steps
    scheduler = SigmasScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=[])

def test_num_inference_steps_zero():
    # Passing num_inference_steps=0 should return empty tensor and 0 steps
    scheduler = BasicScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=0)

def test_timesteps_negative_values():
    # Passing negative timesteps should work if scheduler accepts it
    scheduler = TimestepsScheduler()
    custom_timesteps = [-3, -1, 0, 2]
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_sigmas_float_and_negative():
    # Sigmas with negative and float values
    scheduler = SigmasScheduler()
    custom_sigmas = [-2.7, 0.0, 3.9]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

def test_device_as_torch_device_object():
    # Pass device as torch.device object
    scheduler = BasicScheduler()
    device = torch.device("cpu")
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=2, device=device)

def test_kwargs_are_passed_through():
    # Ensure kwargs are passed to set_timesteps
    class CustomScheduler(BasicScheduler):
        def set_timesteps(self, num_inference_steps, device=None, custom_arg=None, **kwargs):
            self.custom_arg = custom_arg
            super().set_timesteps(num_inference_steps, device=device)
    scheduler = CustomScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=2, custom_arg="baz")

def test_scheduler_returns_non_long_tensor():
    # If scheduler.timesteps is not long dtype, should still work
    class FloatScheduler(BasicScheduler):
        def set_timesteps(self, num_inference_steps, device=None, **kwargs):
            self.timesteps = torch.arange(num_inference_steps, device=device, dtype=torch.float32)
    scheduler = FloatScheduler()
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=3)

# 3. LARGE SCALE TEST CASES

def test_large_num_inference_steps():
    # Test with large num_inference_steps (within 1000 elements)
    scheduler = BasicScheduler()
    N = 1000
    timesteps, steps = retrieve_timesteps(scheduler, num_inference_steps=N)

def test_large_custom_timesteps():
    # Test with large custom timesteps list
    scheduler = TimestepsScheduler()
    custom_timesteps = list(range(1000))
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_large_custom_sigmas():
    # Test with large custom sigmas list
    scheduler = SigmasScheduler()
    custom_sigmas = [float(i) for i in range(1000)]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)

def test_large_negative_timesteps():
    # Large negative timesteps
    scheduler = TimestepsScheduler()
    custom_timesteps = list(range(-1000, 0))
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_large_sparse_timesteps():
    # Large, non-contiguous timesteps
    scheduler = TimestepsScheduler()
    custom_timesteps = [i*10 for i in range(1000)]
    timesteps, steps = retrieve_timesteps(scheduler, timesteps=custom_timesteps)

def test_large_float_sigmas():
    # Large sigmas with float values
    scheduler = SigmasScheduler()
    custom_sigmas = [float(i)+0.5 for i in range(1000)]
    timesteps, steps = retrieve_timesteps(scheduler, sigmas=custom_sigmas)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, git checkout codeflash/optimize-retrieve_timesteps-mbdhd3cq and push.

Codeflash

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 1, 2025
@codeflash-ai codeflash-ai bot requested a review from aseembits93 June 1, 2025 09:50