⚡️ Speed up function mask_tokens_evenly by 2,936% #2
📄 2,936% (29.36x) speedup for mask_tokens_evenly in blanc/utils.py
⏱️ Runtime: 89.6 milliseconds → 2.95 milliseconds (best of 300 runs)

📝 Explanation and details
The optimization achieves a 2936% speedup by eliminating redundant computation in the inner loops through strategic precomputation and algorithmic improvements.
Key optimizations:

- Precomputation of expensive operations: The original code called is_token_large_enough() for every token in every modulus iteration, resulting in 345,648 function calls. The optimized version precomputes a large_enough_flags list once upfront, reducing function calls to just 5,930 (a 98% reduction).
- Eliminated redundant next_token lookups: The original repeatedly computed next_token = '' if idx + 1 == len(tokens) else tokens[idx + 1] for each modulus. The optimized version precomputes a next_tokens list once, avoiding 345,648 conditional checks.
- Batch masking with efficient indexing: Instead of checking masking conditions for every token in every modulus, the optimized version precomputes all mask indices per modulus using list comprehensions and explicit range logic, then applies the masks in a single pass.
- Direct list copying: Replaced masked_input.append() calls with tokens.copy() and direct index assignment, reducing list operations from roughly 345K appends to simple copies plus targeted assignments.

A sketch illustrating these four techniques appears directly after this list.
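The sketch below shows how the four techniques fit together in a simplified version of the function. The signature, the `[MASK]` string, the `min_len` threshold, and the `is_token_large_enough` stub are illustrative assumptions, not the exact code in blanc/utils.py.

```python
# Minimal, self-contained sketch of the optimized structure described above.
# The real mask_tokens_evenly in blanc/utils.py has a richer signature; the
# helper below and the length threshold are simplifying assumptions.

from typing import List, Tuple

MASK = '[MASK]'


def is_token_large_enough(token: str, next_token: str, min_len: int = 4) -> bool:
    """Assumed eligibility check: a token qualifies for masking if it (plus any
    '##' continuation that follows) is at least min_len characters long."""
    length = len(token)
    if next_token.startswith('##'):
        length += len(next_token) - 2
    return length >= min_len


def mask_tokens_evenly(tokens: List[str], gap: int) -> List[Tuple[List[str], List[int]]]:
    """Return one masked copy of `tokens` per modulus in range(gap)."""
    n = len(tokens)
    # Precompute the lookups the naive version repeated inside every modulus
    # iteration: the "next token" for each position and the eligibility flag.
    next_tokens = [tokens[i + 1] if i + 1 < n else '' for i in range(n)]
    large_enough = [is_token_large_enough(tok, nxt)
                    for tok, nxt in zip(tokens, next_tokens)]

    results = []
    for modulus in range(gap):
        # Collect every eligible index for this modulus in one pass ...
        mask_idxs = [i for i in range(modulus, n, gap) if large_enough[i]]
        # ... then apply the masks to a single copied list instead of
        # rebuilding the sequence token-by-token with append().
        masked = tokens.copy()
        for i in mask_idxs:
            masked[i] = MASK
        results.append((masked, mask_idxs))
    return results
```

Copying the list once and assigning at precomputed indices replaces the per-token append loop, and the eligibility and next-token lookups now run exactly once per token regardless of the gap.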
Performance characteristics by test case:
The optimization is most effective for large token sequences, where the cost of precomputation is amortized across many modulus iterations, making it ideal for production NLP workloads with substantial text inputs.
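A rough way to observe this amortization is a small timing harness like the one below. This is only an illustrative sketch run against the simplified function above, not the benchmark codeflash executed.

```python
import random
import string
import timeit

# Build a large synthetic token sequence so the one-time precomputation is
# spread across many modulus iterations.
random.seed(0)
vocab = [''.join(random.choices(string.ascii_lowercase, k=random.randint(2, 10)))
         for _ in range(1000)]
tokens = random.choices(vocab, k=20_000)

elapsed = timeit.timeit(lambda: mask_tokens_evenly(tokens, gap=6), number=10)
print(f'10 calls on 20k tokens: {elapsed:.3f}s')
```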
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
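The generated tests themselves are not reproduced here; the example below is a hypothetical regression test in the same spirit, written against the simplified sketch above, checking that each modulus masks only its own eligible positions and leaves every other token untouched.

```python
def test_mask_tokens_evenly_basic():
    tokens = ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'it']
    gap = 3
    results = mask_tokens_evenly(tokens, gap)
    assert len(results) == gap                      # one masked copy per modulus
    for modulus, (masked, idxs) in enumerate(results):
        assert len(masked) == len(tokens)           # sequence length preserved
        for i in idxs:
            assert i % gap == modulus               # indices land on this modulus
            assert masked[i] == '[MASK]'            # eligible positions are masked
        untouched = set(range(len(tokens))) - set(idxs)
        assert all(masked[i] == tokens[i] for i in untouched)
```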
To edit these changes, check out the branch with `git checkout codeflash/optimize-mask_tokens_evenly-mh2kd0zc` and push.