
Feat: adds MarkdownChunker, docs, tests #242

Open
Josephrp wants to merge 8 commits into chonkie-inc:main from Josephrp:feat/markdown-chunker

Conversation


@Josephrp Josephrp commented Aug 13, 2025

solves #241

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @Josephrp, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the MarkdownChunker to the chonkie library, a new utility designed for intelligent text splitting of Markdown content. This chunker leverages Markdown's structural elements like headings and separator lines to create meaningful chunks, with additional capabilities for cleaning inline formatting and ensuring chunks adhere to specified size limits. The changes include the core implementation, comprehensive documentation, and a dedicated test suite, enhancing the library's text processing capabilities for structured documents.

Highlights

  • New MarkdownChunker Implementation: A robust MarkdownChunker class has been added, capable of splitting Markdown text based on structural elements like headings and horizontal rules, and recursively splitting larger sections to meet chunk_size requirements.
  • Optional Text Cleaning: The MarkdownChunker includes an option to clean chunked text by stripping inline Markdown formatting (bold, italic, code), links, and URLs, making the output cleaner for downstream tasks.
  • Comprehensive Documentation: A new documentation page (markdown-chunker.mdx) has been added, detailing the MarkdownChunker's purpose, parameters, usage examples, and internal workings. The main documentation navigation and overview page have also been updated.
  • Dedicated Test Suite: A new set of unit tests (test_markdown_chunker.py) has been introduced to ensure the correct functionality, robustness, and parameter adherence of the MarkdownChunker.
  • Package Integration: The MarkdownChunker has been properly integrated into the chonkie package structure, making it accessible from the top-level chonkie import.

@Josephrp Josephrp marked this pull request as draft August 13, 2025 10:56
@gemini-code-assist
Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@Josephrp Josephrp changed the title Port MarkdownChunker, docs, tests (exclude my_contribution) Feat : adds MarkdownChunker, docs, tests Aug 13, 2025
@Josephrp
Author

Josephrp commented Aug 13, 2025

try the demo here : https://huggingface.co/spaces/Tonic/chonkie
code for demo : https://huggingface.co/spaces/Tonic/chonkie/blob/main/app.py
please dm me here if you wanna send me some swag: https://twitter.com/Josephpollack (i really want a t-shirt lmfao)

@Josephrp Josephrp marked this pull request as ready for review August 13, 2025 11:51
@Josephrp
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new MarkdownChunker, complete with documentation and tests. The implementation is well-structured and thoughtfully designed. My review focuses on a few areas for improvement: removing a piece of dead code, fixing a logic issue in a test helper function, enhancing test coverage for key features like separator splitting and combining small ranges, and improving documentation clarity. I've also flagged a file that may have been added unintentionally.

Comment on lines +54 to +66
def verify_chunk_indices(chunks: List[Chunk], original_text: str) -> None:
    """Verify that chunk indices correctly map to the original text."""
    reconstructed = ""
    for i, chunk in enumerate(chunks):
        extracted = original_text[chunk.start_index : chunk.end_index]
        assert chunk.text == extracted, (
            f"Chunk {i} text mismatch\n"
            f"Chunk text: {repr(chunk.text)}\n"
            f"Slice text: {repr(extracted)}\n"
            f"Indices: [{chunk.start_index}:{chunk.end_index}]"
        )
        reconstructed += chunk.text
    assert reconstructed == original_text
Contributor


high

The text reconstruction logic in this helper function is likely to fail. The MarkdownChunker is designed to skip separators (like =====), which means there will be gaps between chunks when they are reconstructed. Therefore, reconstructed == original_text will not hold true. The per-chunk assertion that chunk.text matches the slice from original_text is sufficient for verifying index correctness. I suggest removing the reconstruction logic.

def verify_chunk_indices(chunks: List[Chunk], original_text: str) -> None:
    """Verify that chunk indices correctly map to the original text."""
    for i, chunk in enumerate(chunks):
        extracted = original_text[chunk.start_index : chunk.end_index]
        assert chunk.text == extracted, (
            f"Chunk {i} text mismatch\n"
            f"Chunk text: {repr(chunk.text)}\n"
            f"Slice text: {repr(extracted)}\n"
            f"Indices: [{chunk.start_index}:{chunk.end_index}]"
        )

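The reviewer's point can be illustrated with a tiny self-contained sketch (indices hand-picked for illustration, not actual MarkdownChunker output): when a separator line is skipped, the per-chunk slice checks still pass, but concatenation loses the gap.

```python
# Two chunks cut around a skipped "=====" separator line (hypothetical indices).
original = "Intro text\n=====\nBody text"
chunks = [(0, 10), (17, 26)]  # (start_index, end_index); the separator line is excluded

# The per-chunk assertion still passes: each chunk matches its slice of the original.
texts = [original[s:e] for s, e in chunks]
assert texts == ["Intro text", "Body text"]

# But concatenation loses the separator, so full reconstruction fails.
reconstructed = "".join(texts)
assert reconstructed != original
```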
Author


wrong ai interpretation of this, adds metadata to the chunk based on sources, see demo

            ranges.append((start, first_heading_start))

        # Add sentinel at end
        matches_with_end = matches + [(end, 0)]
Contributor


medium

The variable matches_with_end is initialized here but is never used in the rest of the method. This appears to be dead code and can be removed to improve clarity.

Author


intentional, trust me, you'll need this later (maybe) if you want to add configurations for forward or backward looking passes ;-)

@Josephrp
Author

hello my frens & @chonk-lain, some merge conflicts, which is normal and expected after literally 3 months under review. should i take care of it, or is it possible for you to take care of it? btw let's close this as unplanned if it's not relevant anymore :-) with all my thanks!

@chonk-lain
Member

hi @Josephrp
we're still pushing this a little bit further and will try to include it in the upcoming big version. I can also help you fix the merge conflicts once we get to them, since there have been so many updates since this PR first opened.

@Josephrp
Author

it's just documentation ;-)

@chonk-lain chonk-lain self-requested a review December 22, 2025 18:19
@chonk-lain
Member

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new MarkdownChunker, complete with documentation and tests. The overall contribution is valuable, but there are a few areas that need attention. The core implementation of the MarkdownChunker contains a critical bug in its heading-splitting logic that results in overlapping chunks and duplicated content. Additionally, the new documentation has some minor inconsistencies and omissions that should be addressed for clarity. I've provided detailed comments and suggestions to fix these issues.

Comment on lines +170 to +215
    def _split_on_markdown_headings(self, text: str, start: int, end: int) -> List[Tuple[int, int]]:
        """Split a section by Markdown headings with hierarchy-aware grouping.

        For level 1 headings (#), collect content until next level 1.
        For level >= 2, collect until next heading with level <= current.
        Also emits a leading prelude if content exists before the first heading.
        """
        section = text[start:end]
        heading_re = re.compile(r"^(#{1,6})\s+.+$", flags=re.MULTILINE)

        matches = [
            (start + m.start(), len(m.group(1)))
            for m in heading_re.finditer(section)
        ]

        if not matches:
            # No headings: return whole section (will be size-corrected later)
            return [(start, end)]

        ranges: List[Tuple[int, int]] = []
        # Prelude before first heading
        first_heading_start = matches[0][0]
        if first_heading_start > start and text[start:first_heading_start].strip():
            ranges.append((start, first_heading_start))

        # Add sentinel at end
        matches_with_end = matches + [(end, 0)]

        # Walk headings
        for idx in range(len(matches)):
            h_start, h_level = matches[idx]
            # Find boundary depending on level
            boundary = end
            for j in range(idx + 1, len(matches)):
                next_start, next_level = matches[j]
                if h_level == 1 and next_level == 1:
                    boundary = next_start
                    break
                if h_level >= 2 and next_level <= h_level:
                    boundary = next_start
                    break
            # Emit heading section
            if h_start < boundary and text[h_start:boundary].strip():
                ranges.append((h_start, boundary))

        return ranges
Contributor


critical

The current implementation of _split_on_markdown_headings has a critical flaw: it produces overlapping ranges. This leads to duplicated content in the final chunks, which can negatively impact downstream applications. The logic attempts a complex hierarchical grouping but fails to create disjoint sections.

I recommend replacing this with a simpler, more robust implementation that splits the text at each heading, ensuring the resulting ranges are always disjoint. This approach is correct and avoids data duplication. While it simplifies the grouping logic, correctness should be the priority. The docstring should also be updated to reflect this simpler behavior.

    def _split_on_markdown_headings(self, text: str, start: int, end: int) -> List[Tuple[int, int]]:
        """Split a section by Markdown headings into disjoint ranges."""
        section = text[start:end]
        heading_re = re.compile(r"^(#{1,6})\s+.+$", flags=re.MULTILINE)

        ranges: List[Tuple[int, int]] = []
        last_split = start

        for match in heading_re.finditer(section):
            match_start = start + match.start()

            if match_start > last_split:
                if text[last_split:match_start].strip():
                    ranges.append((last_split, match_start))
            
            last_split = match_start

        if last_split < end:
            if text[last_split:end].strip():
                ranges.append((last_split, end))
        
        if not ranges and section.strip():
            return [(start, end)]

        return ranges
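
As a sanity check, the suggested approach can be exercised standalone. A minimal sketch (the function mirrors the suggestion; the sample document is invented for illustration):

```python
import re
from typing import List, Tuple

def split_on_headings(text: str, start: int, end: int) -> List[Tuple[int, int]]:
    """Split [start, end) at each Markdown heading, yielding disjoint ranges."""
    heading_re = re.compile(r"^(#{1,6})\s+.+$", flags=re.MULTILINE)
    ranges: List[Tuple[int, int]] = []
    last_split = start
    for match in heading_re.finditer(text[start:end]):
        match_start = start + match.start()
        if match_start > last_split and text[last_split:match_start].strip():
            ranges.append((last_split, match_start))
        last_split = match_start
    if last_split < end and text[last_split:end].strip():
        ranges.append((last_split, end))
    return ranges

doc = "prelude\n# A\nbody a\n## B\nbody b\n# C\nbody c"
ranges = split_on_headings(doc, 0, len(doc))
print(ranges)  # [(0, 8), (8, 19), (19, 31), (31, 41)]

# Disjoint: every range ends at or before the next one begins.
assert all(e <= s for (_, e), (s, _) in zip(ranges, ranges[1:]))
```

Each heading starts a new range, so no character can land in two chunks, at the cost of flattening the hierarchy-aware grouping.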

    heading_level=3,  # max heading level to consider (1..6)
    min_characters_per_chunk=50,
    max_characters_per_section=4000,
    clean_text=True  # strip inline markdown and URLs from chunks
Contributor


medium

The initialization example shows clean_text=True, while the parameter description below correctly states that the default is False. This could be confusing for users. To improve clarity, I suggest adding a comment to the example to clarify that this is not the default value.

    clean_text=True         # default: False; strip inline markdown and URLs from chunks

|-----------------|:-------:|:----------:|:-----:|:-------------:|
| TokenChunker | ✅ | ✅ | ✅ | ✅ |
| SentenceChunker | ✅ | ✅ | ✅ | ✅ |
| RecursiveChunker| ✅ | ✅ | ✅ | ✅ |
Contributor


medium

The new MarkdownChunker is missing from the 'Availability' table. Since it ships with the base library and has no extra dependencies, it should be marked as available for all installation options. I recommend adding it to the table for completeness, ideally after RecursiveChunker to maintain a logical order.

| RecursiveChunker| ✅      | ✅         | ✅    | ✅            |
| MarkdownChunker | ✅      | ✅         | ✅    | ✅            |


<Note>
Chonkie follows a special approach to dependencies, keeping the base installation lightweight while allowing you to add extra features as and when needed.
Please check the [Installation](/installation) page for more details.
Contributor


medium

There's an inconsistent link to the Installation page. Here, it's [Installation](/installation), but on line 160, a card links to /python-sdk/getting-started/installation. The latter seems more likely to be correct given the file's location in the directory structure. Please update this link for consistency.

  Please check the [Installation](/python-sdk/getting-started/installation) page for more details.

            return ""
        cleaned = chunk
        # Drop common prompt lines
        # cleaned = re.sub(r"^(?:Search Results for:|Question:|Query:).*$", "", cleaned, flags=re.MULTILINE)
Contributor


medium

This line of code is commented out. Commented-out code reduces readability and can be confusing. If it's no longer needed, it should be removed. If it's for future use, consider adding a TODO comment with an explanation.

@chonk-lain
Member

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the new MarkdownChunker, complete with implementation, tests, and documentation. The chunker itself is well-structured, following a multi-stage process to split Markdown content intelligently. My review has identified a few areas for improvement. There are some correctness issues in the Markdown cleaning logic due to problematic regular expressions, and a critical flaw in the test suite's text reconstruction logic. I've also found several inconsistencies and omissions in the new documentation files. Additionally, a significant but unrelated change to src/chonkie/__init__.py introduces an incomplete and incorrect __all__ definition, which I recommend reverting or fixing as it could break package imports. Overall, this is a great feature addition, and with these fixes, it will be a solid contribution.

Comment on lines +107 to +217
# Add basic package metadata to __all__
__all__ = [
    "__name__",
    "__version__",
    "__author__",
]

# Add all data classes to __all__
__all__ += [
    "Context",
    "Chunk",
    "RecursiveChunk",
    "RecursiveLevel",
    "RecursiveRules",
    "SentenceChunk",
    "SemanticChunk",
    "Sentence",
    "SemanticSentence",
    "LateChunk",
    "CodeChunk",
    "LanguageConfig",
    "MergeRule",
    "SplitRule",
]

# Add all tokenizer classes to __all__
__all__ += [
    "Tokenizer",
    "CharacterTokenizer",
    "WordTokenizer",
]

# Add all chunker classes to __all__
__all__ += [
    "BaseChunker",
    "TokenChunker",
    "SentenceChunker",
    "SemanticChunker",
    "RecursiveChunker",
    "LateChunker",
    "CodeChunker",
    "SlumberChunker",
    "NeuralChunker",
    "MarkdownChunker",
]

# Add all cloud classes to __all__
__all__ += [
    "auth",
    "chunker",
    "refineries",
]

# Add all embeddings classes to __all__
__all__ += [
    "BaseEmbeddings",
    "Model2VecEmbeddings",
    "SentenceTransformerEmbeddings",
    "OpenAIEmbeddings",
    "CohereEmbeddings",
    "GeminiEmbeddings",
    "AutoEmbeddings",
    "JinaEmbeddings",
    "VoyageAIEmbeddings",
]

# Add all refinery classes to __all__
__all__ += [
    "BaseRefinery",
    "OverlapRefinery",
    "EmbeddingsRefinery",
]

# Add all utils classes to __all__
__all__ += [
    "Hubbie",
    "Visualizer",
]

# Add all genie classes to __all__
__all__ += [
    "BaseGenie",
    "GeminiGenie",
    "OpenAIGenie",
]

# Add all friends classes to __all__
__all__ += [
    "BasePorter",
    "BaseHandshake",
    "JSONPorter",
    "ChromaHandshake",
    "MongoDBHandshake",
    "PgvectorHandshake",
    "PineconeHandshake",
    "QdrantHandshake",
    "WeaviateHandshake",
    "TurbopufferHandshake",
]

# Add all the chefs to __all__
__all__ += [
    "BaseChef",
    "TextChef",
]

# Add all the fetchers to __all__
__all__ += [
    "BaseFetcher",
    "FileFetcher",
]
Contributor


critical

This large addition to define __all__ is both incomplete and incorrect. It references many names that are not imported into this __init__.py file, and it omits many that are. This will break wildcard imports (from chonkie import *) for users.

Given that this change is out of scope for adding a MarkdownChunker and introduces critical issues, I recommend removing it entirely from this pull request. It can be addressed in a separate PR where it can be implemented correctly and thoroughly.
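
As an aside, this kind of `__all__` drift is cheap to catch mechanically. A minimal sketch (helper name mine, demonstrated on a stdlib module rather than chonkie):

```python
import importlib

def undefined_all_names(module_name: str) -> list[str]:
    """Return names listed in a module's __all__ that the module does not define."""
    mod = importlib.import_module(module_name)
    return [name for name in getattr(mod, "__all__", []) if not hasattr(mod, name)]

# A consistent module reports nothing; a drifted __init__.py would list the offenders.
print(undefined_all_names("json"))  # []
```

A one-line test like this in the suite would fail as soon as `__all__` references a name that was never imported.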

Comment on lines +54 to +67
def verify_chunk_indices(chunks: List[Chunk], original_text: str) -> None:
    """Verify that chunk indices correctly map to the original text."""
    reconstructed = ""
    for i, chunk in enumerate(chunks):
        extracted = original_text[chunk.start_index : chunk.end_index]
        assert chunk.text == extracted, (
            f"Chunk {i} text mismatch\n"
            f"Chunk text: {repr(chunk.text)}\n"
            f"Slice text: {repr(extracted)}\n"
            f"Indices: [{chunk.start_index}:{chunk.end_index}]"
        )
        reconstructed += chunk.text
    assert reconstructed == original_text

Contributor


critical

The assertion assert reconstructed == original_text is incorrect. The chunking process removes separators (like =====) and doesn't guarantee contiguous chunks, so concatenating them will not reproduce the original text. This will cause the test to fail.

The first assertion, which checks if chunk.text matches the slice from original_text, is a valid check for clean_text=False. The reconstruction logic should be removed.

def verify_chunk_indices(chunks: List[Chunk], original_text: str) -> None:
    """Verify that chunk indices correctly map to the original text."""
    for i, chunk in enumerate(chunks):
        extracted = original_text[chunk.start_index : chunk.end_index]
        assert chunk.text == extracted, (
            f"Chunk {i} text mismatch\n"
            f"Chunk text: {repr(chunk.text)}\n"
            f"Slice text: {repr(extracted)}\n"
            f"Indices: [{chunk.start_index}:{chunk.end_index}]"
        )

Comment on lines +295 to +297
        cleaned = re.sub(r"\*\*([^*]+)\*\*", r"\1", cleaned)
        cleaned = re.sub(r"\*([^*]+)\*", r"\1", cleaned)
        cleaned = re.sub(r"`([^`]+)`", r"\1", cleaned)
Contributor


high

The regular expressions used to strip inline Markdown are not robust. Using [^*]+ will fail for nested formatting (e.g., **bold *and italic***). Using a non-greedy match (.*?) is a more reliable approach. Also, the order of replacement matters: ** should be processed before * to avoid incorrect substitutions.

Suggested change
        cleaned = re.sub(r"\*\*([^*]+)\*\*", r"\1", cleaned)
        cleaned = re.sub(r"\*([^*]+)\*", r"\1", cleaned)
        cleaned = re.sub(r"`([^`]+)`", r"\1", cleaned)
        cleaned = re.sub(r"\*\*(.*?)\*\*", r"\1", cleaned)
        cleaned = re.sub(r"\*(.*?)\*", r"\1", cleaned)
        cleaned = re.sub(r"`(.*?)`", r"\1", cleaned)
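
The difference shows up on a small standalone example (sample string invented for illustration):

```python
import re

sample = "**bold *and italic***"

# Character-class version: the inner '*' keeps the bold pattern from matching at all,
# so only the stray single-asterisk pair gets stripped.
old = re.sub(r"\*\*([^*]+)\*\*", r"\1", sample)
old = re.sub(r"\*([^*]+)\*", r"\1", old)
print(old)  # *bold and italic***

# Non-greedy version, bold stripped before italic: both layers unwrap cleanly.
new = re.sub(r"\*\*(.*?)\*\*", r"\1", sample)
new = re.sub(r"\*(.*?)\*", r"\1", new)
print(new)  # bold and italic
```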

        cleaned = re.sub(r"\*([^*]+)\*", r"\1", cleaned)
        cleaned = re.sub(r"`([^`]+)`", r"\1", cleaned)
        # Links and URLs
        cleaned = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", cleaned)
Contributor


high

The regex for stripping links \[([^\]]+)\]\([^)]+\) will fail if a URL contains parentheses, which is valid. Using a non-greedy match (.*?) for the URL part is more robust.

Suggested change
        cleaned = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", cleaned)
        cleaned = re.sub(r"\[([^\]]+)\]\(.*?\)", r"\1", cleaned)

    heading_level=3,  # max heading level to consider (1..6)
    min_characters_per_chunk=50,
    max_characters_per_section=4000,
    clean_text=True  # strip inline markdown and URLs from chunks
Contributor


medium

The initialization example shows clean_text=True, but the parameter description on line 39 and the actual implementation in src/chonkie/chunker/markdown.py both use False as the default. This inconsistency can be misleading for users. To ensure clarity, the example should reflect the actual default value.

    clean_text=False         # strip inline markdown and URLs from chunks

Comment on lines +95 to +106
| Chunker | Default | embeddings | 'all' | Chonkie Cloud |
|-----------------|:-------:|:----------:|:-----:|:-------------:|
| TokenChunker | ✅ | ✅ | ✅ | ✅ |
| SentenceChunker | ✅ | ✅ | ✅ | ✅ |
| RecursiveChunker| ✅ | ✅ | ✅ | ✅ |
| TableChunker | ✅ | ✅ | ✅ | ✅ |
| CodeChunker | ❌ | ✅ | ✅ | ✅ |
| SemanticChunker | ❌ | ✅ | ✅ | ✅ |
| SDPMChunker | ❌ | ✅ | ✅ | ✅ |
| LateChunker | ❌ | ✅ | ✅ | ✅ |
| NeuralChunker | ❌ | ✅ | ✅ | ✅ |
| SlumberChunker | ❌ | ✅ | ✅ | ✅ |
Contributor


medium

The MarkdownChunker is missing from the availability table. Since it's part of the base library with no extra dependencies, it should be included in the table to accurately reflect its availability across different installation options.

| Chunker         | Default | embeddings | 'all' | Chonkie Cloud |
|-----------------|:-------:|:----------:|:-----:|:-------------:|
| TokenChunker    | ✅      | ✅         | ✅    | ✅            |
| SentenceChunker | ✅      | ✅         | ✅    | ✅            |
| RecursiveChunker| ✅      | ✅         | ✅    | ✅            |
| MarkdownChunker | ✅      | ✅         | ✅    | ✅            |
| TableChunker    | ✅      | ✅         | ✅    | ✅            |
| CodeChunker     | ❌      | ✅         | ✅    | ✅            |
| SemanticChunker | ❌      | ✅         | ✅    | ✅            |
| SDPMChunker     | ❌      | ✅         | ✅    | ✅            |
| LateChunker     | ❌      | ✅         | ✅    | ✅            |
| NeuralChunker   | ❌      | ✅         | ✅    | ✅            |
| SlumberChunker  | ❌      | ✅         | ✅    | ✅            |


<Note>
Chonkie follows a special approach to dependencies, keeping the base installation lightweight while allowing you to add extra features as and when needed.
Please check the [Installation](/installation) page for more details.
Contributor


medium

This link to the installation page appears to be broken. It should point to the full path of the installation page within the python-sdk section.

  Please check the [Installation](/python-sdk/getting-started/installation) page for more details.


1. Make a free account on [Chonkie Cloud](https://cloud.chonkie.ai)
2. Get your API key
3. Send your CHONK reqests!
Contributor


medium

There is a typo in the word "requests".

3. Send your CHONK requests! 

            ranges.append((start, first_heading_start))

        # Add sentinel at end
        matches_with_end = matches + [(end, 0)]
Contributor


medium

The matches_with_end variable is defined but never used. This appears to be dead code and should be removed to improve clarity.

            return ""
        cleaned = chunk
        # Drop common prompt lines
        # cleaned = re.sub(r"^(?:Search Results for:|Question:|Query:).*$", "", cleaned, flags=re.MULTILINE)
Contributor


medium

Commented-out code should be removed to keep the codebase clean and avoid confusion.

@Josephrp
Author

lmk if you want me to do anything based on the above. in my opinion this automated review is itself poor quality and nonsensical, but don't take my word for it, just check the demo. for the documentation points, i think it's out of scope here, so that's fine. you know, during christmas & new years it's not easy to really prioritize this, hope you understand, but it's cool y'all haven't completely forgotten about it or ignored it; currently we're using my fork in all the deep research agents we make.

