Conversation

@bashandbone bashandbone commented Dec 9, 2025

This pull request significantly revises the README.md and updates the feature comparison chart for CodeWeaver, focusing on clearer messaging, improved competitive positioning, and more actionable guidance for users. The most important changes include a complete rewrite of the introduction and feature sections to emphasize semantic code search for Claude, a streamlined competitive analysis, and clearer setup instructions. Additionally, the feature comparison chart now includes Serena and provides a more nuanced breakdown of strengths and tradeoffs.

Major content and documentation updates:

README overhaul and positioning

  • Rewrote the introduction and feature explanation to focus on CodeWeaver's core value: semantic code search for Claude, with clear examples and direct comparisons to traditional search. [1] [2]
  • Added a competitive quick reference matrix and linked to a detailed analysis, making it easier for users to compare CodeWeaver against other tools. [1] [2]
  • Streamlined installation and configuration instructions, including clearer notes on offline operation and updated .mcp.json examples for both stdio and HTTP setups.

Feature and capability clarification

  • Updated the feature breakdown table to clarify language support, provider flexibility, and configuration options, with improved terminology and more accurate counts.
  • Moved philosophy and rationale to a dedicated section, simplifying the README and focusing on user outcomes rather than implementation details.

Competitive analysis improvements:

  • Updated the feature comparison chart to include Serena, revised feature definitions, and added a direct CodeWeaver vs. Serena breakdown with practical tradeoffs and recommendations. [1] [2]
  • Clarified hybrid search capabilities and industry context, emphasizing CodeWeaver's strengths in semantic and hybrid search.

These changes make the documentation more user-focused, clarify CodeWeaver’s unique value, and provide actionable guidance for both setup and tool selection.

Summary by Sourcery

Revise positioning and documentation to emphasize CodeWeaver as semantic code search for Claude, and update competitive materials and terminology for more accurate feature and language-support descriptions.

Enhancements:

  • Standardize terminology from 'smart delimiter-based' to 'language-aware chunking' across code, generated docs, and build scripts for supported languages and chunking strategies.

Documentation:

  • Rewrite the README introduction, feature overview, and philosophy sections to focus on semantic code search for Claude, clarify setup (including offline profiles and MCP config), and add a competitive quick reference matrix linking to detailed analysis.
  • Update the feature comparison documentation into docs/comparison.md with an expanded matrix that now includes Serena, refined metrics (providers, languages, hybrid search, real-time updates), and practical CodeWeaver-vs-Serena guidance.

Chores:

  • Remove legacy Claude-specific config and internal claude docs/server artifacts that are no longer needed (old MCP JSON, server descriptors, and analysis notes).

Copilot AI review requested due to automatic review settings December 9, 2025 16:46
@bashandbone bashandbone added the documentation Improvements or additions to documentation label Dec 9, 2025
github-actions bot commented Dec 9, 2025

👋 Hey @bashandbone,

Thanks for your contribution to codeweaver! 🧵

You need to agree to the CLA first... 🖊️

Before we can accept your contribution, you need to agree to our Contributor License Agreement (CLA).

To agree to the CLA, please comment:

I read the contributors license agreement and I agree to it.

Those exact words are important [1], so please don't change them. 😉

You can read the full CLA here: Contributor License Agreement


@bashandbone has signed the CLA.


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

Footnotes

  1. Our bot needs those exact words to recognize that you agree to the CLA.

sourcery-ai bot commented Dec 9, 2025

Reviewer's Guide

This PR overhauls the README and competitive comparison docs to position CodeWeaver as "semantic code search for Claude," updates competitive metrics and terminology (e.g., language-aware chunking, provider counts), clarifies installation/MCP configuration and offline operation, and removes outdated Claude-specific docs/configs while keeping generator scripts and metadata in sync with the new language and capabilities.

File-Level Changes

Change: Reposition README to focus on semantic code search for Claude, add a quick competitive matrix, and clarify install, MCP setup, and philosophy.
Details:
  • Rewrite intro tagline and "What It Does" section around semantic code search for Claude with concrete usage examples and an alpha disclaimer.
  • Replace old problem/architecture/roadmap sections with a concise capabilities grid (search, language support, resilience, provider flexibility, configuration, DX).
  • Add a new "How CodeWeaver Stacks Up" quick reference matrix including Serena, with notes and a link to detailed competitive analysis.
  • Clarify quick install section, recommended profile requirements, and add an explicit offline profile option.
  • Update MCP configuration examples for stdio and HTTP, including env var names and HTTP URL shape.
  • Simplify and relocate philosophy into a dedicated "Philosophy" section that links to the detailed rationale document.
  • Add a link reference for the new competitive analysis document and adjust various copy to match updated terminology (language-aware chunking, provider counts).
Files:
  README.md

Change: Replace and expand the competitive comparison chart to include Serena, refine metrics/definitions, and emphasize hybrid search, language-aware chunking, and deployment tradeoffs.
Details:
  • Move comparison content from claude-specific docs path to a general docs location and fully rewrite the comparison matrix to add Serena and new columns (tool count, prompt overhead, symbol precision, concept search, editing capabilities).
  • Introduce a dedicated CodeWeaver vs. Serena section that explains strengths, weaknesses, and complementary usage patterns.
  • Update embedding provider counts (20+ → 17) and language counts (170+ → 166+) throughout, and adjust tables comparing CodeWeaver vs Cursor, Continue.dev, Cody, etc. to use the revised numbers.
  • Rename terminology from smart delimiter/pseudo-semantic parsing to language-aware chunking in multiple tables and narrative sections, and simplify the performance section to focus on real-time updates rather than raw indexing benchmarks.
  • Refine deployment, portability, and positioning guidance, including airgapped leaders, multi-IDE teams, and recommended tool selection by use case, plus update ASCII positioning diagram and conclusion section accordingly.
  • Refresh last-updated metadata and contextual notes (e.g., Continue.dev provider list, scale claims for Cody).
Files:
  docs/comparison.md

Change: Align generator scripts and internal language metadata with the new "language-aware chunking" terminology and current provider/language capability metadata.
Details:
  • In the supported-languages generation script, swap import order for type aliases and change docstrings/markdown fragments from smart delimiter-based parsing to language-aware chunking wording.
  • Update the generated secondary_languages literal type docstring to describe language-aware chunking rather than pseudo-semantic delimiter parsing.
  • Modify the languages partial markdown to rename the section heading and explanatory text to "Languages with Language-Aware Chunking" and updated description.
  • Adjust the Docker server YAML generator description text to mention language-aware chunking instead of intelligent chunking.
  • Update the MCP server JSON generator capabilities to advertise a "language-aware" chunking strategy instead of "semantically-aware-delimiters".
Files:
  scripts/build/generate-supported-languages.py
  src/codeweaver/core/secondary_languages.py
  overrides/partials/languages.md
  scripts/build/generate-docker-server-yaml.py
  scripts/build/generate-mcp-server-json.py

Change: Remove obsolete Claude-specific MCP configs, server descriptors, and internal Claude docs that are superseded by the new generic documentation.
Details:
  • Delete Claude-specific .claude/mcp.json and its license file, plus VS Code/Claude server JSON/YAML artifacts that are now generated or represented elsewhere.
  • Remove multiple claude-only docs under claudedocs/ that are no longer maintained (implementation plans, summaries, and the old competitive analysis file).
  • Drop legacy top-level server.json/server.yaml artifacts in favor of the generator-driven configurations.
Files:
  .claude/mcp.json
  .claude/mcp.json.license
  .vscode/mcp.json
  server.json
  server.json.license
  server.yaml
  server.yaml.license
  claudedocs/cli-ux-improvements-summary.md
  claudedocs/client-pooling-analysis.md
  claudedocs/connection-pooling-implementation-plan.md
  claudedocs/option-a-service-implementation-plan-final.md
  claudedocs/phase-1-implementation-summary.md
  claudedocs/phase-2-implementation-summary.md
  claudedocs/phase-3-implementation-summary.md
  claudedocs/refactor-phases-3-6-completion-summary.md
  claudedocs/refactor-phases-3-6-implementation-guide.md
  claudedocs/research_codeweaver_competitive_analysis_2025-11-17.md

Copilot AI left a comment

Pull request overview

This pull request updates CodeWeaver's README and feature comparison documentation to improve clarity and competitive positioning. The changes focus on repositioning CodeWeaver as a semantic code search tool for Claude, adding comprehensive competitive analysis including a new comparison with Serena, and streamlining the documentation structure.

Key Changes:

  • Rewrote the README introduction to emphasize semantic code search capabilities and concrete use cases
  • Added detailed CodeWeaver vs. Serena competitive analysis with practical tradeoffs
  • Updated feature comparison matrix to include Serena and revised metrics across all competitors
  • Streamlined README by removing extensive sections on architecture, roadmap, and project details in favor of focused value proposition

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

| File | Description |
| README.md | Complete overhaul of introduction and positioning; new tagline emphasizing "semantic code search for Claude"; simplified feature grid; removed extensive documentation sections; updated configuration examples |
| claudedocs/codeweaver_feature_comparison_chart.md | Added Serena to comparison matrix; new detailed CodeWeaver vs. Serena breakdown section; updated metrics and feature counts; revised industry context; streamlined performance section |

README.md Outdated
# CodeWeaver

### The missing abstraction layer between AI and your code
### Semantic code search for Claude — across 170+ languages
Copilot AI Dec 9, 2025

The tagline mentions "170+ languages" but the feature comparison table below shows "166+" languages. This inconsistency should be resolved - use the same number in both places.

Suggested change
### Semantic code search for Claude — across 170+ languages
### Semantic code search for Claude — across 166+ languages

README.md Outdated
- **Hierarchical merging**
- **Environment overrides**
### 🔌 Provider Flexibility
- **17+ embedding providers**
Copilot AI Dec 9, 2025

The claim "17+ embedding providers" in this section contradicts other parts of the document, which state "16+ embedding providers" (lines 72 and 178, and line 327 in the feature comparison chart). Ensure consistency across all mentions.

Suggested change
- **17+ embedding providers**
- **16+ embedding providers**

1. Embedding provider diversity (20+ vs 1-5)
1. Embedding provider diversity (16+ vs 1-5)
2. Hybrid search (dense + sparse by default)
3. Language support breadth (170+ with semantic-like chunking)
Copilot AI Dec 9, 2025

The feature comparison chart states "Language support breadth (170+ with semantic-like chunking)" but earlier in the document the numbers vary between 166+ and 170+. This should be consistent throughout the document.
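The kind of drift flagged in these comments could be caught mechanically. Below is a minimal sketch, not part of this PR, that scans document text for count claims such as "166+ languages" and reports any metric stated with more than one value. The regular expression, the function names, and the sample strings are assumptions made for illustration; claims phrased differently (for example "170+ with semantic-like chunking") would not be matched.

```python
import re
from collections import defaultdict

# Matches claims such as "166+ languages" or "16+ embedding providers".
CLAIM = re.compile(r"(\d+)\+?\s+(languages|embedding providers)", re.IGNORECASE)


def collect_claims(docs):
    """Map each metric to the counts claimed for it and the documents claiming them."""
    claims = defaultdict(lambda: defaultdict(list))
    for name, text in docs.items():
        for count, metric in CLAIM.findall(text):
            claims[metric.lower()][int(count)].append(name)
    return claims


def inconsistent(claims):
    """Keep only metrics that are stated with more than one distinct count."""
    return {
        metric: dict(counts)
        for metric, counts in claims.items()
        if len(counts) > 1
    }


# Hypothetical sample text modelled on the lines quoted in this review.
SAMPLE_DOCS = {
    "README.md": "Semantic code search for Claude across 170+ languages\n"
                 "- **17+ embedding providers**",
    "docs/comparison.md": "Supports 166+ languages and 16+ embedding providers",
}

report = inconsistent(collect_claims(SAMPLE_DOCS))
for metric, counts in report.items():
    print(metric, sorted(counts))
```

A check like this could run in CI over the real README and comparison doc, failing the build whenever `report` is non-empty.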

| **Tool Count** | **1** | **17** | N/A | N/A | N/A | N/A | N/A | N/A |
| **Prompt Overhead** | **~500 tokens** | **~17,000 tokens** | N/A | N/A | N/A | N/A | N/A | N/A |
| **Search Speed** | Moderate (embeddings) | **Very fast (LSP)** | Moderate | Server-side | Fast | Moderate | Fast | On-demand |
| **Embedding Providers** | **16+** | 0 (no embeddings) | 1-2 | 1 | 0 (deprecated) | 4-5 | 1 | 0 |
Copilot AI Dec 9, 2025

The embedding provider count in this table is "16+", which matches the Detailed Feature Breakdown section (line 318, "16+ providers") and the conclusion (line 327, "(16+ vs 1-5)"). However, the previous version said "20+ providers", and that figure was changed to "16+" in some places but not all. Ensure the count is accurate and consistent: if it is 16+, update all references; if it is actually higher, use the correct number.

README.md Outdated
### 🔌 Provider Flexibility
- **17+ embedding providers**
- **50+ embedding models**
- **Sparse & dense**embedding model support
Copilot AI Dec 9, 2025

Typo: "Sparse & dense**embedding model support" is missing a space before "embedding". Should be "Sparse & dense embedding model support".

Suggested change
- **Sparse & dense**embedding model support
- **Sparse & dense** embedding model support

README.md Outdated
[arch_find_code]: <src/codeweaver/agent_api/find_code/ARCHITECTURE.md> "find_code Architecture"
[architecture]: <ARCHITECTURE.md> "Overall Architecture"
[bashandbone]: <https://github.com/bashandbone> "Adam Poulemanos' GitHub Profile"
[competitive_analysis]: <src/codeweaver/claudedocs/competitive_feature_comparison_chart.md> "See how CodeWeaver stacks up"
Copilot AI Dec 9, 2025

The URL for the competitive analysis link appears to be incorrect. It references "src/codeweaver/claudedocs/competitive_feature_comparison_chart.md" but the actual file in the PR is "claudedocs/codeweaver_feature_comparison_chart.md" (not "competitive_feature_comparison_chart.md"). This link will be broken.

Suggested change
[competitive_analysis]: <src/codeweaver/claudedocs/competitive_feature_comparison_chart.md> "See how CodeWeaver stacks up"
[competitive_analysis]: <claudedocs/codeweaver_feature_comparison_chart.md> "See how CodeWeaver stacks up"
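A broken reference like this one can be detected before review. The sketch below is an illustration rather than anything in the repository: it parses Markdown reference-style link definitions of the form `[label]: <target> "title"` and reports local targets that do not exist under a given root. The function name and the temporary directory layout in the demo are assumptions; only the two link targets are taken from the comment above.

```python
import re
import tempfile
from pathlib import Path

# Reference-style link definitions of the form: [label]: <target> "title"
REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*<([^>]+)>", re.MULTILINE)
# Anything starting with a URI scheme (https:, mailto:, ...) is treated as remote.
HAS_SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_local_refs(markdown, root):
    """Return (label, target) pairs whose local target is missing under root."""
    broken = []
    for label, target in REF_DEF.findall(markdown):
        if HAS_SCHEME.match(target):
            continue  # remote links are out of scope for this check
        path = target.split("#", 1)[0]  # ignore any heading anchor
        if path and not (Path(root) / path).exists():
            broken.append((label, target))
    return broken


# Demo against a throwaway directory that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "claudedocs").mkdir()
    (root / "claudedocs" / "codeweaver_feature_comparison_chart.md").write_text("placeholder")
    readme = (
        '[bashandbone]: <https://github.com/bashandbone> "GitHub profile"\n'
        '[competitive_analysis]: <src/codeweaver/claudedocs/competitive_feature_comparison_chart.md> "See how CodeWeaver stacks up"\n'
        '[fixed_analysis]: <claudedocs/codeweaver_feature_comparison_chart.md> "See how CodeWeaver stacks up"\n'
    )
    demo_broken = broken_local_refs(readme, root)

print(demo_broken)
```

In the demo only the original `competitive_analysis` target is reported; the suggested path resolves because the file exists at that location.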

…sistency and fairness across docs and comparisons. Updated all docs accordingly. Cleaned up older planning docs
socket-security bot commented Dec 9, 2025

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| Added | pypi/argcomplete@3.6.3 | 97 | 100 | 100 | 100 | 100 |

View full report

@bashandbone bashandbone marked this pull request as ready for review December 9, 2025 19:23
Copilot AI review requested due to automatic review settings December 9, 2025 19:23
@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The Quick Reference Matrix and several competitive stats (e.g., provider counts, language counts, prompt overhead) are now duplicated between the README and docs/comparison.md; consider centralizing these in a single source (or generating them from a script/constant) to avoid future drift.
  • Numbers like 17 embedding providers, 166+ languages, and token counts are hard-coded throughout the comparison doc and README while some of these values are already available programmatically (e.g., len(_languages()), embedding_providers()); wiring the docs to those programmatic sources would make them more robust as the product evolves.
  • The new Serena comparison introduces specific measurements (tool counts, token overhead, language counts) that may change quickly as clients update; consider either linking to a methodology/measurement section or parameterizing these values to make it easier to keep them current.
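One way to act on the first two points is to render the numbers into the docs from a single place. The following is a minimal sketch under stated assumptions: the counts are passed in as plain integers, the placeholder names (`$language_count`, `$provider_count`) and the `render_stats` helper are hypothetical, and the snippet text is invented for the example. In the real project the integers might come from the programmatic sources named above, but their signatures are not shown in this thread, so the sketch does not call them.

```python
from string import Template


def render_stats(template_text, *, languages, providers):
    """Fill count placeholders in a doc template from one set of values."""
    return Template(template_text).substitute(
        language_count=f"{languages}+",
        provider_count=str(providers),
    )


# Hypothetical template fragment; the real README wording may differ.
README_SNIPPET = (
    "Semantic code search for Claude across $language_count languages, "
    "with $provider_count embedding providers."
)

# The values here mirror the figures quoted in this review (166+ and 17).
rendered = render_stats(README_SNIPPET, languages=166, providers=17)
print(rendered)
```

Because `Template.substitute` raises `KeyError` for a placeholder with no value, a template that references a stat the script does not supply fails loudly instead of shipping a stale number.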

Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated no new comments.


Co-authored-by: Copilot <[email protected]>
Signed-off-by: Adam Poulemanos <[email protected]>
Copilot AI review requested due to automatic review settings December 9, 2025 19:31
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated no new comments.


Copilot AI review requested due to automatic review settings December 9, 2025 20:27
Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 26 changed files in this pull request and generated 1 comment.


docs/WHY.md Outdated
- Even well-meaning MCP tools contribute: several popular MCP servers have **16,000+ tokens in prompt overhead** -- all the prompts they supply **every single message** to tell an agent about their available tools and how to use them [^1]

CodeWeaver takes a different approach:
CodeWeaver strives to approac:
Copilot AI Dec 9, 2025

Typo: "approac" should be "approach"

Suggested change
CodeWeaver strives to approac:
CodeWeaver strives to approach:

Copilot AI review requested due to automatic review settings December 9, 2025 20:56
Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 26 changed files in this pull request and generated no new comments.


@bashandbone bashandbone merged commit e45f1e8 into main Dec 9, 2025
19 of 22 checks passed
@bashandbone bashandbone deleted the feat/improve-docs-clarity-features branch December 9, 2025 20:59
@github-actions github-actions bot locked and limited conversation to collaborators Dec 9, 2025
Labels

documentation Improvements or additions to documentation

2 participants