Skip to content

Regex matching bug in code block detectionΒ #97

@mdlinville

Description

@mdlinville

Describe the bug:

Triple-backtick code fences are being collapsed into a single line in the output, like:

Basic usage: ``` import weave```

Discovered this bug when trying to regenerate the Weave reference docs, which use Lazydocs for the Python SDK.

Expected behaviour:

Multi-line output, like:
Basic usage:
```
import weave
````

Hypothesis about root cause:

_RE_TYPED_ARGSTART = re.compile(r"^([\w\[\]_]{1,}?)[ ]*?\((.*?)\):[ ]+(.{2,})", re.IGNORECASE)
_RE_ARGSTART = re.compile(r"^(.+):[ ]+(.{2,})$", re.IGNORECASE)

The _RE_ARGSTART regex matches ANY line with a colon followed by 2+ characters. When it processes "Basic usage:" followed by doctest code in an Examples section, it incorrectly treats it as an argument and applies the substitution pattern r"\1: \2" (line 588), which tries to wrap the text after the colon in backticks.
The theory is that this regex is too greedy and gets applied in section_block contexts (like Examples) where it shouldn't.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions