@codeflash-ai codeflash-ai bot commented Jul 22, 2025

📄 31% (0.31x) speedup for _infer_docstring_style in pydantic_ai_slim/pydantic_ai/_griffe.py

⏱️ Runtime: 6.08 milliseconds → 4.64 milliseconds (best of 109 runs)
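The 31% headline figure is consistent with the two runtimes if speedup is defined as original runtime divided by optimized runtime, minus one. A minimal sketch, assuming that definition (the helper name `speedup` is illustrative and not part of Codeflash):

```python
def speedup(original_ms: float, optimized_ms: float) -> float:
    """Relative speedup: how much faster the optimized run is than the original."""
    return original_ms / optimized_ms - 1.0


# Runtimes taken from the report above.
print(f"{speedup(6.08, 4.64):.0%}")  # prints 31%
```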

📝 Explanation and details

Here’s how to optimize your _infer_docstring_style function for both speed and memory usage.

  • Avoid generator usage with any() for inner loop: Instead of using a generator expression (which creates a generator and then iterates in any()), a simple for loop with an early return is slightly faster and exits directly on the first match.
  • Pre-compile patterns: Formatting and compiling the regex patterns on every call wastes time; each pattern should be compiled once. Because _docstring_style_patterns is module-level data that is never mutated, the patterns are compiled on demand inside the function and stored in a plain dict for later calls (simple memoization rather than a true LRU cache, since nothing is ever evicted).
  • Minimize .format calls: Each pattern is formatted once per (pattern, replacement) pair and the result is reused whenever the function is called again.

Notes.

  • We introduced a module-level _regex_cache dict to ensure each compiled regex is re-used, speeding up repeated style checks.
  • The nested loop is now explicit and returns on the first match. any() already short-circuited, so the number of regex searches is unchanged; the gain comes from avoiding generator overhead.
  • All behaviors and types remain unchanged.

This version is faster in the measurements below, with the largest gains on repeated calls (where the caching shines).

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 81 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 1 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import re
from typing import Literal

# imports
import pytest  # used for our unit tests
from pydantic_ai._griffe import _infer_docstring_style

DocstringStyle = Literal['google', 'numpy', 'sphinx']

# See https://github.com/mkdocstrings/griffe/issues/329#issuecomment-2425017804
_docstring_style_patterns: list[tuple[str, list[str], DocstringStyle]] = [
    (
        r'\n[ \t]*:{0}([ \t]+\w+)*:([ \t]+.+)?\n',
        [
            'param',
            'parameter',
            'arg',
            'argument',
            'key',
            'keyword',
            'type',
            'var',
            'ivar',
            'cvar',
            'vartype',
            'returns',
            'return',
            'rtype',
            'raises',
            'raise',
            'except',
            'exception',
        ],
        'sphinx',
    ),
    (
        r'\n[ \t]*{0}:([ \t]+.+)?\n[ \t]+.+',
        [
            'args',
            'arguments',
            'params',
            'parameters',
            'keyword args',
            'keyword arguments',
            'other args',
            'other arguments',
            'other params',
            'other parameters',
            'raises',
            'exceptions',
            'returns',
            'yields',
            'receives',
            'examples',
            'attributes',
            'functions',
            'methods',
            'classes',
            'modules',
            'warns',
            'warnings',
        ],
        'google',
    ),
    (
        r'\n[ \t]*{0}\n[ \t]*---+\n',
        [
            'deprecated',
            'parameters',
            'other parameters',
            'returns',
            'yields',
            'receives',
            'raises',
            'warns',
            'attributes',
            'functions',
            'methods',
            'classes',
            'modules',
        ],
        'numpy',
    ),
]


# unit tests

# --------------------
# 1. BASIC TEST CASES
# --------------------

def test_google_style_simple():
    """
    Test a simple Google-style docstring with 'Args:' section.
    """
    doc = """
    Does something.

    Args:
        x: The x value.
        y: The y value.
    """
    codeflash_output = _infer_docstring_style(doc) # 17.8μs -> 5.00μs (257% faster)

def test_google_style_with_returns():
    """
    Test a Google-style docstring with 'Returns:' section.
    """
    doc = """
    Computes the sum.

    Args:
        a: First value.
        b: Second value.

    Returns:
        The sum of a and b.
    """
    codeflash_output = _infer_docstring_style(doc) # 18.8μs -> 6.08μs (209% faster)

def test_numpy_style_simple():
    """
    Test a simple Numpy-style docstring with 'Parameters' and dashes.
    """
    doc = """
    Compute the sum.

    Parameters
    ----------
    a : int
        First value.
    b : int
        Second value.

    Returns
    -------
    int
        The sum of a and b.
    """
    codeflash_output = _infer_docstring_style(doc) # 49.0μs -> 23.1μs (112% faster)

def test_numpy_style_with_warns():
    """
    Test Numpy-style with 'Warns' section.
    """
    doc = """
    Does something.

    Warns
    -----
    UserWarning
        If something goes wrong.
    """
    codeflash_output = _infer_docstring_style(doc) # 44.8μs -> 15.4μs (190% faster)

def test_sphinx_style_param():
    """
    Test a Sphinx-style docstring with ':param:' fields.
    """
    doc = """
    Does something.

    :param x: The x value.
    :param y: The y value.
    :returns: The result.
    """
    codeflash_output = _infer_docstring_style(doc) # 3.21μs -> 1.25μs (157% faster)

def test_sphinx_style_raises():
    """
    Test Sphinx-style with ':raises:' field.
    """
    doc = """
    Divide numbers.

    :param a: Numerator.
    :param b: Denominator.
    :raises ZeroDivisionError: If b is zero.
    """
    codeflash_output = _infer_docstring_style(doc) # 3.21μs -> 1.17μs (175% faster)

def test_sphinx_style_alternate_keywords():
    """
    Sphinx-style using ':argument:' and ':return:'.
    """
    doc = """
    Does something.

    :argument foo: Foo argument.
    :return: The result.
    """
    codeflash_output = _infer_docstring_style(doc) # 6.04μs -> 2.08μs (190% faster)

def test_google_style_with_examples():
    """
    Google-style with 'Examples:' section.
    """
    doc = """
    Does something.

    Examples:
        >>> foo()
        bar
    """
    codeflash_output = _infer_docstring_style(doc) # 30.9μs -> 10.0μs (208% faster)

def test_numpy_style_deprecated():
    """
    Numpy-style with 'Deprecated' section.
    """
    doc = """
    Does something.

    Deprecated
    ----------
    This function will be removed in future versions.
    """
    codeflash_output = _infer_docstring_style(doc) # 36.9μs -> 11.7μs (215% faster)

# --------------------
# 2. EDGE TEST CASES
# --------------------

def test_empty_docstring():
    """
    Empty docstring should default to 'google'.
    """
    doc = ""
    codeflash_output = _infer_docstring_style(doc) # 36.6μs -> 4.50μs (713% faster)

def test_no_sections():
    """
    Docstring with no recognizable sections should default to 'google'.
    """
    doc = "This function does something."
    codeflash_output = _infer_docstring_style(doc) # 36.6μs -> 5.00μs (632% faster)

def test_only_whitespace():
    """
    Docstring with only whitespace should default to 'google'.
    """
    doc = "   \n\t  "
    codeflash_output = _infer_docstring_style(doc) # 36.0μs -> 5.00μs (621% faster)

def test_sphinx_style_with_tabs_and_spaces():
    """
    Sphinx-style with mixed tabs and spaces.
    """
    doc = """
    Does something.

    \t:param\tfoo:\tFoo value.
    \t:returns:\tResult.
    """
    codeflash_output = _infer_docstring_style(doc) # 3.25μs -> 1.29μs (152% faster)

def test_google_style_with_indented_args():
    """
    Google-style with indented 'Args:' section.
    """
    doc = """
    Does something.

        Args:
            foo: Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 17.7μs -> 4.88μs (263% faster)

def test_numpy_style_with_short_dashes():
    """
    Numpy-style with short dashes (minimum 3 dashes).
    """
    doc = """
    Parameters
    ---
    foo : int
        Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 37.2μs -> 11.2μs (233% faster)

def test_sphinx_style_case_insensitive():
    """
    Sphinx-style with uppercase ':PARAM:'.
    """
    doc = """
    Does something.

    :PARAM foo: Foo value.
    """
    # Should match regardless of case
    codeflash_output = _infer_docstring_style(doc) # 3.21μs -> 1.17μs (175% faster)

def test_google_style_case_insensitive():
    """
    Google-style with uppercase 'ARGS:'.
    """
    doc = """
    Does something.

    ARGS:
        foo: Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 17.1μs -> 4.54μs (277% faster)

def test_numpy_style_case_insensitive():
    """
    Numpy-style with uppercase 'PARAMETERS' and dashes.
    """
    doc = """
    PARAMETERS
    ----------
    foo : int
        Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 37.8μs -> 11.4μs (233% faster)

def test_sphinx_style_with_colons_in_description():
    """
    Sphinx-style with colon in parameter description.
    """
    doc = """
    :param foo: The value: must be an int.
    """
    codeflash_output = _infer_docstring_style(doc) # 3.29μs -> 1.12μs (193% faster)

def test_google_style_with_colons_in_description():
    """
    Google-style with colon in description.
    """
    doc = """
    Args:
        foo: The value: must be an int.
    """
    codeflash_output = _infer_docstring_style(doc) # 16.8μs -> 3.96μs (323% faster)

def test_numpy_style_with_colons_in_description():
    """
    Numpy-style with colon in description.
    """
    doc = """
    Parameters
    ----------
    foo : int
        The value: must be an int.
    """
    codeflash_output = _infer_docstring_style(doc) # 37.6μs -> 11.4μs (230% faster)

def test_multiple_styles_present_prefers_first_match():
    """
    If multiple styles are present, should return the first matching style by order in _docstring_style_patterns.
    """
    doc = """
    :param foo: Foo value.

    Parameters
    ----------
    foo : int
        Foo value.
    """
    # Sphinx pattern is checked before numpy, so should return 'sphinx'
    codeflash_output = _infer_docstring_style(doc) # 3.12μs -> 1.12μs (178% faster)

def test_no_match_fallback_google():
    """
    If no pattern matches, fallback is 'google'.
    """
    doc = """
    This is a docstring with no sections.
    """
    codeflash_output = _infer_docstring_style(doc) # 40.8μs -> 8.83μs (362% faster)

def test_sphinx_style_with_multiple_param_types():
    """
    Sphinx-style with :param, :ivar, :cvar, :vartype.
    """
    doc = """
    :param foo: Foo value.
    :ivar bar: Bar value.
    :cvar baz: Baz value.
    :vartype foo: int
    """
    codeflash_output = _infer_docstring_style(doc) # 3.12μs -> 1.12μs (178% faster)

def test_google_style_with_other_sections():
    """
    Google-style with 'Other Parameters:' section.
    """
    doc = """
    Other Parameters:
        foo: Foo value.
        bar: Bar value.
    """
    codeflash_output = _infer_docstring_style(doc) # 25.0μs -> 7.12μs (250% faster)

def test_numpy_style_with_other_parameters():
    """
    Numpy-style with 'Other Parameters' section.
    """
    doc = """
    Other Parameters
    ---------------
    foo : int
        Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 38.3μs -> 11.7μs (227% faster)

def test_sphinx_style_with_colon_and_whitespace():
    """
    Sphinx-style with extra whitespace after colon.
    """
    doc = """
    :param    foo:    Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 3.17μs -> 1.17μs (171% faster)

def test_google_style_with_keyword_arguments():
    """
    Google-style with 'Keyword Arguments:' section.
    """
    doc = """
    Keyword Arguments:
        foo: Foo value.
    """
    codeflash_output = _infer_docstring_style(doc) # 20.8μs -> 5.29μs (292% faster)

def test_numpy_style_with_functions_section():
    """
    Numpy-style with 'Functions' section.
    """
    doc = """
    Functions
    ---------
    foo
        Does something.
    """
    codeflash_output = _infer_docstring_style(doc) # 44.2μs -> 13.6μs (225% faster)

# --------------------
# 3. LARGE SCALE TEST CASES
# --------------------

def test_large_google_style_docstring():
    """
    Google-style docstring with many parameters and sections.
    """
    params = "\n".join([f"    param{i}: Description of param{i}." for i in range(500)])
    doc = f"""
    Does something big.

    Args:
{params}

    Returns:
        Something.
    """
    codeflash_output = _infer_docstring_style(doc) # 233μs -> 220μs (5.87% faster)

def test_large_numpy_style_docstring():
    """
    Numpy-style docstring with many parameters.
    """
    params = "\n".join([f"param{i} : int\n    Description of param{i}." for i in range(500)])
    doc = f"""
    Does something big.

    Parameters
    ----------
{params}

    Returns
    -------
    int
        Something.
    """
    codeflash_output = _infer_docstring_style(doc) # 1.05ms -> 1.02ms (2.38% faster)

def test_large_sphinx_style_docstring():
    """
    Sphinx-style docstring with many :param: fields.
    """
    params = "\n".join([f":param param{i}: Description of param{i}." for i in range(500)])
    doc = f"""
    Does something big.

{params}

    :returns: Something.
    """
    codeflash_output = _infer_docstring_style(doc) # 3.38μs -> 1.21μs (179% faster)

def test_performance_large_mixed_docstring():
    """
    Large docstring with lots of text and a Sphinx-style section at the end.
    Should still detect the correct style efficiently.
    """
    # Create a large block of unrelated text
    unrelated = "\n".join([f"This is line {i}." for i in range(800)])
    doc = f"""
    {unrelated}

    :param foo: Foo value.
    :returns: Result.
    """
    codeflash_output = _infer_docstring_style(doc) # 13.4μs -> 10.9μs (22.5% faster)

def test_large_docstring_no_match():
    """
    Large docstring with no matching style, should fallback to 'google'.
    """
    unrelated = "\n".join([f"This is line {i}." for i in range(900)])
    doc = f"""
    {unrelated}

    This is the end.
    """
    codeflash_output = _infer_docstring_style(doc) # 755μs -> 699μs (8.04% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import re
from typing import Literal

# imports
import pytest  # used for our unit tests
from pydantic_ai._griffe import _infer_docstring_style

# function to test
DocstringStyle = Literal['google', 'numpy', 'sphinx']

# See https://github.com/mkdocstrings/griffe/issues/329#issuecomment-2425017804
_docstring_style_patterns: list[tuple[str, list[str], DocstringStyle]] = [
    (
        r'\n[ \t]*:{0}([ \t]+\w+)*:([ \t]+.+)?\n',
        [
            'param',
            'parameter',
            'arg',
            'argument',
            'key',
            'keyword',
            'type',
            'var',
            'ivar',
            'cvar',
            'vartype',
            'returns',
            'return',
            'rtype',
            'raises',
            'raise',
            'except',
            'exception',
        ],
        'sphinx',
    ),
    (
        r'\n[ \t]*{0}:([ \t]+.+)?\n[ \t]+.+',
        [
            'args',
            'arguments',
            'params',
            'parameters',
            'keyword args',
            'keyword arguments',
            'other args',
            'other arguments',
            'other params',
            'other parameters',
            'raises',
            'exceptions',
            'returns',
            'yields',
            'receives',
            'examples',
            'attributes',
            'functions',
            'methods',
            'classes',
            'modules',
            'warns',
            'warnings',
        ],
        'google',
    ),
    (
        r'\n[ \t]*{0}\n[ \t]*---+\n',
        [
            'deprecated',
            'parameters',
            'other parameters',
            'returns',
            'yields',
            'receives',
            'raises',
            'warns',
            'attributes',
            'functions',
            'methods',
            'classes',
            'modules',
        ],
        'numpy',
    ),
]

# unit tests

# -------------------- BASIC TEST CASES --------------------

def test_google_style_basic():
    """
    Test a basic Google style docstring.
    """
    doc = """
    Summary line.

    Args:
        x (int): The x value.
        y (str): The y value.

    Returns:
        bool: True if successful.
    """
    codeflash_output = _infer_docstring_style(doc) # 19.5μs -> 6.50μs (199% faster)

def test_numpy_style_basic():
    """
    Test a basic Numpy style docstring.
    """
    doc = """
    Summary line.

    Parameters
    ----------
    x : int
        The x value.
    y : str
        The y value.

    Returns
    -------
    bool
        True if successful.
    """
    codeflash_output = _infer_docstring_style(doc) # 49.6μs -> 23.4μs (112% faster)

def test_sphinx_style_basic():
    """
    Test a basic Sphinx style docstring.
    """
    doc = """
    Summary line.

    :param x: The x value.
    :type x: int
    :param y: The y value.
    :type y: str
    :returns: True if successful.
    :rtype: bool
    """
    codeflash_output = _infer_docstring_style(doc) # 3.29μs -> 1.25μs (163% faster)

def test_google_style_variants():
    """
    Test Google style with different section names and indentation.
    """
    doc = """
    Function summary.

    Arguments:
        foo (str): foo arg

    Raises:
        ValueError: if bad input
    """
    codeflash_output = _infer_docstring_style(doc) # 19.6μs -> 6.42μs (206% faster)

def test_numpy_style_variants():
    """
    Test Numpy style with "Other Parameters" and mixed-case section headers.
    """
    doc = """
    Function summary.

    Other Parameters
    ---------------
    bar : float
        bar arg
    """
    codeflash_output = _infer_docstring_style(doc) # 40.8μs -> 13.8μs (195% faster)

def test_sphinx_style_variants():
    """
    Test Sphinx style with :raises: and :rtype: fields.
    """
    doc = """
    Does something.

    :raises ValueError: If something goes wrong.
    :rtype: None
    """
    codeflash_output = _infer_docstring_style(doc) # 13.7μs -> 4.00μs (242% faster)

# -------------------- EDGE TEST CASES --------------------

def test_empty_docstring():
    """
    Test empty docstring returns fallback style (google).
    """
    doc = ""
    codeflash_output = _infer_docstring_style(doc) # 36.5μs -> 4.42μs (725% faster)

def test_no_sections():
    """
    Test docstring with no recognizable sections.
    """
    doc = "Just a summary line with no sections."
    codeflash_output = _infer_docstring_style(doc) # 37.2μs -> 4.88μs (662% faster)

def test_only_summary_and_blank_lines():
    """
    Test docstring with only summary and blank lines.
    """
    doc = "\n\nA summary.\n\n"
    codeflash_output = _infer_docstring_style(doc) # 39.8μs -> 7.79μs (411% faster)

def test_multiple_styles_present_prefers_first_match():
    """
    Test docstring with both Sphinx and Numpy sections; should match Sphinx first.
    """
    doc = """
    Summary.

    :param x: X value.
    :returns: Result.

    Parameters
    ----------
    x : int
        Description.
    """
    # Sphinx pattern is checked before Numpy, so should return 'sphinx'
    codeflash_output = _infer_docstring_style(doc) # 3.38μs -> 1.21μs (179% faster)

def test_indented_sections():
    """
    Test docstring with indented section headers.
    """
    doc = """
    Summary.

        Args:
            foo (int): foo argument
    """
    codeflash_output = _infer_docstring_style(doc) # 17.7μs -> 5.04μs (250% faster)

def test_colon_in_text_not_section():
    """
    Test docstring with colons in text but not as section headers.
    """
    doc = """
    This function: does something.
    It is very: important.
    """
    codeflash_output = _infer_docstring_style(doc) # 42.8μs -> 10.6μs (303% faster)

def test_sphinx_section_with_extra_whitespace():
    """
    Test Sphinx style with extra whitespace and tabs.
    """
    doc = """
    Summary.

    :param    x   :   The x value.
    :returns  :   True if successful.
    """
    codeflash_output = _infer_docstring_style(doc) # 46.3μs -> 13.8μs (235% faster)

def test_google_section_with_extra_whitespace():
    """
    Test Google style with extra whitespace.
    """
    doc = """
    Summary.

    Args:    x (int): x value

        y (str): y value
    """
    codeflash_output = _infer_docstring_style(doc) # 47.2μs -> 15.5μs (205% faster)

def test_numpy_section_with_short_underline():
    """
    Test Numpy style with short underline.
    """
    doc = """
    Parameters
    ----
    x : int
        Description.
    """
    codeflash_output = _infer_docstring_style(doc) # 37.5μs -> 11.4μs (228% faster)

def test_numpy_section_with_long_underline_and_spaces():
    """
    Test Numpy style with long underline and leading spaces.
    """
    doc = """
    Returns
        -------
    bool
        Result.
    """
    codeflash_output = _infer_docstring_style(doc) # 39.6μs -> 12.4μs (219% faster)

def test_sphinx_section_case_insensitive():
    """
    Test Sphinx style section headers with different casing.
    """
    doc = """
    :PARAM x: value
    :RETURNS: result
    """
    codeflash_output = _infer_docstring_style(doc) # 3.25μs -> 1.21μs (169% faster)

def test_google_section_case_insensitive():
    """
    Test Google style section headers with different casing.
    """
    doc = """
    a summary.

    ARGS:
        foo (int): foo
    """
    codeflash_output = _infer_docstring_style(doc) # 17.2μs -> 4.75μs (262% faster)

def test_numpy_section_case_insensitive():
    """
    Test Numpy style section headers with different casing.
    """
    doc = """
    PARAMETERS
    ----------
    foo : int
        foo
    """
    codeflash_output = _infer_docstring_style(doc) # 37.6μs -> 11.4μs (230% faster)

def test_section_headers_in_middle_of_text():
    """
    Test docstring with section-like words in the middle of lines.
    """
    doc = """
    This function parameters are x and y.
    It returns the result.
    """
    codeflash_output = _infer_docstring_style(doc) # 43.2μs -> 10.8μs (301% faster)

def test_sphinx_with_multiple_colons():
    """
    Test Sphinx style with multiple colons in a line.
    """
    doc = """
    :param x: foo: bar: baz
    """
    codeflash_output = _infer_docstring_style(doc) # 3.12μs -> 1.12μs (178% faster)

def test_google_with_multiple_colons():
    """
    Test Google style with multiple colons in a line.
    """
    doc = """
    Args: foo: bar: baz
        x (int): description
    """
    codeflash_output = _infer_docstring_style(doc) # 16.8μs -> 4.08μs (311% faster)

def test_numpy_with_multiple_hyphens():
    """
    Test Numpy style with more than three hyphens in the underline.
    """
    doc = """
    Parameters
    -------------
    x : int
        description
    """
    codeflash_output = _infer_docstring_style(doc) # 37.2μs -> 11.8μs (216% faster)

def test_section_headers_with_tabs():
    """
    Test section headers with tabs instead of spaces.
    """
    doc = """
    \tArgs:
    \t\tx (int): description
    """
    codeflash_output = _infer_docstring_style(doc) # 16.7μs -> 4.08μs (309% faster)

def test_section_headers_with_mixed_whitespace():
    """
    Test section headers with mixed tabs and spaces.
    """
    doc = """
        Parameters
        ----------
        x : int
            description
    """
    codeflash_output = _infer_docstring_style(doc) # 40.0μs -> 14.0μs (185% faster)

def test_sphinx_with_raise_and_raises():
    """
    Test Sphinx style with both :raise: and :raises:.
    """
    doc = """
    :raise ValueError: if bad
    :raises TypeError: if worse
    """
    codeflash_output = _infer_docstring_style(doc) # 14.3μs -> 4.08μs (250% faster)

def test_google_with_keyword_arguments():
    """
    Test Google style with 'Keyword Arguments' section.
    """
    doc = """
    Keyword Arguments:
        foo (str): description
    """
    codeflash_output = _infer_docstring_style(doc) # 20.5μs -> 5.42μs (278% faster)

def test_numpy_with_deprecated_section():
    """
    Test Numpy style with 'Deprecated' section.
    """
    doc = """
    Deprecated
    ----------
    This function will be removed.
    """
    codeflash_output = _infer_docstring_style(doc) # 35.4μs -> 9.46μs (274% faster)

def test_sphinx_with_exception_section():
    """
    Test Sphinx style with :exception: section.
    """
    doc = """
    :exception ValueError: if bad
    """
    codeflash_output = _infer_docstring_style(doc) # 15.8μs -> 3.92μs (303% faster)

def test_sphinx_with_except_section():
    """
    Test Sphinx style with :except: section.
    """
    doc = """
    :except Exception: on error
    """
    codeflash_output = _infer_docstring_style(doc) # 14.9μs -> 3.54μs (320% faster)

def test_google_with_examples_section():
    """
    Test Google style with 'Examples' section.
    """
    doc = """
    Examples:
        >>> foo()
        bar
    """
    codeflash_output = _infer_docstring_style(doc) # 29.4μs -> 8.62μs (241% faster)

def test_google_with_methods_section():
    """
    Test Google style with 'Methods' section.
    """
    doc = """
    Methods:
        foo: does foo
    """
    codeflash_output = _infer_docstring_style(doc) # 30.6μs -> 7.67μs (299% faster)

def test_numpy_with_methods_section():
    """
    Test Numpy style with 'Methods' section and underline.
    """
    doc = """
    Methods
    ------
    foo
        does foo
    """
    codeflash_output = _infer_docstring_style(doc) # 44.8μs -> 13.1μs (242% faster)

def test_sphinx_with_vartype_section():
    """
    Test Sphinx style with :vartype: section.
    """
    doc = """
    :vartype x: int
    """
    codeflash_output = _infer_docstring_style(doc) # 11.0μs -> 2.71μs (305% faster)

def test_sphinx_with_ivar_section():
    """
    Test Sphinx style with :ivar: section.
    """
    doc = """
    :ivar x: description
    """
    codeflash_output = _infer_docstring_style(doc) # 9.29μs -> 2.25μs (313% faster)

def test_sphinx_with_cvar_section():
    """
    Test Sphinx style with :cvar: section.
    """
    doc = """
    :cvar x: description
    """
    codeflash_output = _infer_docstring_style(doc) # 9.88μs -> 2.50μs (295% faster)

def test_sphinx_with_var_section():
    """
    Test Sphinx style with :var: section.
    """
    doc = """
    :var x: description
    """
    codeflash_output = _infer_docstring_style(doc) # 8.54μs -> 2.08μs (310% faster)

def test_google_with_warns_section():
    """
    Test Google style with 'Warns' section.
    """
    doc = """
    Warns:
        UserWarning: if something is odd
    """
    codeflash_output = _infer_docstring_style(doc) # 32.9μs -> 8.42μs (291% faster)

def test_numpy_with_warns_section():
    """
    Test Numpy style with 'Warns' section and underline.
    """
    doc = """
    Warns
    -----
    UserWarning
        If something is odd.
    """
    codeflash_output = _infer_docstring_style(doc) # 42.8μs -> 13.0μs (228% faster)

def test_sphinx_with_type_section():
    """
    Test Sphinx style with :type: section.
    """
    doc = """
    :type x: int
    """
    codeflash_output = _infer_docstring_style(doc) # 7.79μs -> 2.12μs (267% faster)

def test_sphinx_with_rtype_section():
    """
    Test Sphinx style with :rtype: section.
    """
    doc = """
    :rtype: int
    """
    codeflash_output = _infer_docstring_style(doc) # 12.6μs -> 2.88μs (339% faster)

# -------------------- LARGE SCALE TEST CASES --------------------

def test_large_google_docstring():
    """
    Test a large Google style docstring with many arguments.
    """
    args_section = "\n".join([f"    arg{i} (int): description" for i in range(500)])
    doc = f"""
    Summary.

    Args:
{args_section}

    Returns:
        int: result
    """
    codeflash_output = _infer_docstring_style(doc) # 215μs -> 202μs (6.41% faster)

def test_large_numpy_docstring():
    """
    Test a large Numpy style docstring with many parameters.
    """
    params_section = "\n".join([f"arg{i} : int\n    description" for i in range(500)])
    doc = f"""
    Summary.

    Parameters
    ----------
{params_section}

    Returns
    -------
    int
        result
    """
    codeflash_output = _infer_docstring_style(doc) # 974μs -> 937μs (3.91% faster)

def test_large_sphinx_docstring():
    """
    Test a large Sphinx style docstring with many :param: fields.
    """
    params_section = "\n".join([f":param arg{i}: description" for i in range(500)])
    doc = f"""
    Summary.

{params_section}

    :returns: result
    """
    codeflash_output = _infer_docstring_style(doc) # 3.38μs -> 1.21μs (179% faster)

def test_large_mixed_docstring_prefers_sphinx():
    """
    Test a large docstring with both Sphinx and Numpy sections; Sphinx is matched first.
    """
    sphinx_section = "\n".join([f":param arg{i}: description" for i in range(300)])
    numpy_section = "\n".join([f"arg{i} : int\n    description" for i in range(300)])
    doc = f"""
    Summary.

{sphinx_section}

    Parameters
    ----------
{numpy_section}

    :returns: result
    """
    # Sphinx pattern is checked before Numpy, so should return 'sphinx'
    codeflash_output = _infer_docstring_style(doc) # 3.33μs -> 1.12μs (196% faster)

def test_large_mixed_docstring_prefers_google():
    """
    Test a large docstring with only Numpy and Google sections; Google is matched first.
    """
    numpy_section = "\n".join([f"arg{i} : int\n    description" for i in range(300)])
    google_section = "\n".join([f"    arg{i} (int): description" for i in range(300)])
    doc = f"""
    Summary.

    Parameters
    ----------
{numpy_section}

    Args:
{google_section}

    Returns:
        int: result
    """
    # Google pattern precedes Numpy in _docstring_style_patterns, so should return 'google'
    codeflash_output = _infer_docstring_style(doc) # 332μs -> 317μs (4.81% faster)

def test_large_unmatched_docstring():
    """
    Test a large docstring with no recognizable sections; should fallback to 'google'.
    """
    lines = "\n".join([f"This is line {i} of text." for i in range(800)])
    doc = f"""
    Summary.

{lines}
    """
    codeflash_output = _infer_docstring_style(doc) # 739μs -> 732μs (0.933% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
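The generated tests above record `codeflash_output` but contain no assertions of their own; the comparison against the original implementation happens outside the test bodies. As a rough illustration of that kind of equivalence check, here is a sketch under stated assumptions: `PATTERNS` is a hypothetical two-entry table, `infer_original` mirrors the `any()`-over-generator shape that the explanation says was replaced, `infer_cached` uses `functools.lru_cache` as one possible caching strategy, and `find_mismatches` is an illustrative helper, not a Codeflash API.

```python
import re
from functools import lru_cache
from typing import Callable, Iterable

# Hypothetical, heavily shortened pattern table.
PATTERNS = [
    (r'\n[ \t]*:{0}([ \t]+\w+)*:([ \t]+.+)?\n', ['param', 'returns'], 'sphinx'),
    (r'\n[ \t]*{0}\n[ \t]*---+\n', ['parameters', 'returns'], 'numpy'),
]
FLAGS = re.IGNORECASE | re.MULTILINE


def infer_original(doc: str) -> str:
    """Baseline: format and search on every call via any() over a generator."""
    for pattern, keywords, style in PATTERNS:
        if any(re.search(pattern.format(k), doc, FLAGS) for k in keywords):
            return style
    return 'google'


@lru_cache(maxsize=None)
def _compiled(pattern: str, keyword: str) -> re.Pattern[str]:
    """Format and compile once per (pattern, keyword) pair."""
    return re.compile(pattern.format(keyword), FLAGS)


def infer_cached(doc: str) -> str:
    """Candidate: explicit loop with cached compiled patterns."""
    for pattern, keywords, style in PATTERNS:
        for k in keywords:
            if _compiled(pattern, k).search(doc):
                return style
    return 'google'


def find_mismatches(
    f: Callable[[str], str], g: Callable[[str], str], docs: Iterable[str]
) -> list[str]:
    """Return every input on which the two implementations disagree."""
    return [d for d in docs if f(d) != g(d)]


SAMPLES = [
    "",
    "\n    :param x: value\n    ",
    "\n    Parameters\n    ----------\n    x : int\n",
    "Plain summary.",
]
mismatches = find_mismatches(infer_original, infer_cached, SAMPLES)
```

An empty `mismatches` list on a representative set of inputs is evidence, not proof, that the two versions behave the same.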

from pydantic_ai._griffe import _infer_docstring_style

def test__infer_docstring_style():
    _infer_docstring_style('')

To edit these changes, run `git checkout codeflash/optimize-_infer_docstring_style-mdeycsff`, commit your edits, and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 22, 2025
@codeflash-ai codeflash-ai bot requested a review from aseembits93 July 22, 2025 19:53