feat: update README and feature comparison chart for clarity and accuracy #201

bashandbone · 2025-12-09T16:46:21Z

This pull request significantly revises the README.md and updates the feature comparison chart for CodeWeaver, focusing on clearer messaging, improved competitive positioning, and more actionable guidance for users. The most important changes include a complete rewrite of the introduction and feature sections to emphasize semantic code search for Claude, a streamlined competitive analysis, and clearer setup instructions. Additionally, the feature comparison chart now includes Serena and provides a more nuanced breakdown of strengths and tradeoffs.

Major content and documentation updates:

README overhaul and positioning

Rewrote the introduction and feature explanation to focus on CodeWeaver's core value: semantic code search for Claude, with clear examples and direct comparisons to traditional search. [1] [2]
Added a competitive quick reference matrix and linked to a detailed analysis, making it easier for users to compare CodeWeaver against other tools. [1] [2]
Streamlined installation and configuration instructions, including clearer notes on offline operation and updated .mcp.json examples for both stdio and HTTP setups.

Feature and capability clarification

Updated the feature breakdown table to clarify language support, provider flexibility, and configuration options, with improved terminology and more accurate counts.
Moved philosophy and rationale to a dedicated section, simplifying the README and focusing on user outcomes rather than implementation details.

Competitive analysis improvements:

Updated the feature comparison chart to include Serena, revised feature definitions, and added a direct CodeWeaver vs. Serena breakdown with practical tradeoffs and recommendations. [1] [2]
Clarified hybrid search capabilities and industry context, emphasizing CodeWeaver's strengths in semantic and hybrid search.

These changes make the documentation more user-focused, clarify CodeWeaver’s unique value, and provide actionable guidance for both setup and tool selection.

Summary by Sourcery

Revise positioning and documentation to emphasize CodeWeaver as semantic code search for Claude, and update competitive materials and terminology for more accurate feature and language-support descriptions.

Enhancements:

Standardize terminology from 'smart delimiter-based' to 'language-aware chunking' across code, generated docs, and build scripts for supported languages and chunking strategies.

Documentation:

Rewrite the README introduction, feature overview, and philosophy sections to focus on semantic code search for Claude, clarify setup (including offline profiles and MCP config), and add a competitive quick reference matrix linking to detailed analysis.
Update the feature comparison documentation into docs/comparison.md with an expanded matrix that now includes Serena, refined metrics (providers, languages, hybrid search, real-time updates), and practical CodeWeaver-vs-Serena guidance.

Chores:

Remove legacy Claude-specific config and internal claude docs/server artifacts that are no longer needed (old MCP JSON, server descriptors, and analysis notes).

github-actions · 2025-12-09T16:46:40Z

👋 Hey @bashandbone,

Thanks for your contribution to codeweaver! 🧵

You need to agree to the CLA first... 🖊️

Before we can accept your contribution, you need to agree to our Contributor License Agreement (CLA).

To agree to the CLA, please comment:

I read the contributors license agreement and I agree to it.

Those exact words are important¹, so please don't change them. 😉

You can read the full CLA here: Contributor License Agreement

✅ @bashandbone has signed the CLA.

You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

¹ Our bot needs those exact words to recognize that you agree to the CLA.

sourcery-ai · 2025-12-09T16:46:44Z

Reviewer's Guide

This PR overhauls the README and competitive comparison docs to position CodeWeaver as "semantic code search for Claude," updates competitive metrics and terminology (e.g., language-aware chunking, provider counts), clarifies installation/MCP configuration and offline operation, and removes outdated Claude-specific docs/configs while keeping generator scripts and metadata in sync with the new language and capabilities.

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Reposition README to focus on semantic code search for Claude, add a quick competitive matrix, and clarify install, MCP setup, and philosophy. | Rewrite intro tagline and "What It Does" section around semantic code search for Claude with concrete usage examples and an alpha disclaimer. Replace old problem/architecture/roadmap sections with a concise capabilities grid (search, language support, resilience, provider flexibility, configuration, DX). Add a new "How CodeWeaver Stacks Up" quick reference matrix including Serena, with notes and a link to detailed competitive analysis. Clarify quick install section, recommended profile requirements, and add an explicit offline profile option. Update MCP configuration examples for stdio and HTTP, including env var names and HTTP URL shape. Simplify and relocate philosophy into a dedicated "Philosophy" section that links to the detailed rationale document. Add a link reference for the new competitive analysis document and adjust various copy to match updated terminology (language-aware chunking, provider counts). | `README.md` |
| Replace and expand the competitive comparison chart to include Serena, refine metrics/definitions, and emphasize hybrid search, language-aware chunking, and deployment tradeoffs. | Move comparison content from claude-specific docs path to a general docs location and fully rewrite the comparison matrix to add Serena and new columns (tool count, prompt overhead, symbol precision, concept search, editing capabilities). Introduce a dedicated CodeWeaver vs. Serena section that explains strengths, weaknesses, and complementary usage patterns. Update embedding provider counts (20+ → 17) and language counts (170+ → 166+) throughout, and adjust tables comparing CodeWeaver vs Cursor, Continue.dev, Cody, etc. to use the revised numbers. Rename terminology from smart delimiter/pseudo-semantic parsing to language-aware chunking in multiple tables and narrative sections, and simplify the performance section to focus on real-time updates rather than raw indexing benchmarks. Refine deployment, portability, and positioning guidance, including airgapped leaders, multi-IDE teams, and recommended tool selection by use case, plus update ASCII positioning diagram and conclusion section accordingly. Refresh last-updated metadata and contextual notes (e.g., Continue.dev provider list, scale claims for Cody). | `docs/comparison.md` |
| Align generator scripts and internal language metadata with the new "language-aware chunking" terminology and current provider/language capability metadata. | In the supported-languages generation script, swap import order for type aliases and change docstrings/markdown fragments from smart delimiter-based parsing to language-aware chunking wording. Update the generated secondary_languages literal type docstring to describe language-aware chunking rather than pseudo-semantic delimiter parsing. Modify the languages partial markdown to rename the section heading and explanatory text to "Languages with Language-Aware Chunking" and updated description. Adjust the Docker server YAML generator description text to mention language-aware chunking instead of intelligent chunking. Update the MCP server JSON generator capabilities to advertise a "language-aware" chunking strategy instead of "semantically-aware-delimiters". | `scripts/build/generate-supported-languages.py` `src/codeweaver/core/secondary_languages.py` `overrides/partials/languages.md` `scripts/build/generate-docker-server-yaml.py` `scripts/build/generate-mcp-server-json.py` |
| Remove obsolete Claude-specific MCP configs, server descriptors, and internal Claude docs that are superseded by the new generic documentation. | Delete Claude-specific `.claude/mcp.json` and its license file, plus VS Code/Claude server JSON/YAML artifacts that are now generated or represented elsewhere. Remove multiple claude-only docs under `claudedocs/` that are no longer maintained (implementation plans, summaries, and the old competitive analysis file). Drop legacy top-level `server.json`/`server.yaml` artifacts in favor of the generator-driven configurations. | `.claude/mcp.json` `.claude/mcp.json.license` `.vscode/mcp.json` `server.json` `server.json.license` `server.yaml` `server.yaml.license` `claudedocs/cli-ux-improvements-summary.md` `claudedocs/client-pooling-analysis.md` `claudedocs/connection-pooling-implementation-plan.md` `claudedocs/option-a-service-implementation-plan-final.md` `claudedocs/phase-1-implementation-summary.md` `claudedocs/phase-2-implementation-summary.md` `claudedocs/phase-3-implementation-summary.md` `claudedocs/refactor-phases-3-6-completion-summary.md` `claudedocs/refactor-phases-3-6-implementation-guide.md` `claudedocs/research_codeweaver_competitive_analysis_2025-11-17.md` |
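
The terminology rename summarized in the file-level changes above touches scripts, generated docs, and partials, so leftover legacy wording is easy to miss. A minimal sketch of a scan for it follows. The term list is taken from the phrases named in this thread; the function name `find_legacy_terms` and the sample inputs are hypothetical and not part of the PR.

```python
import re

# Legacy phrases named in this thread; all are being replaced by
# "language-aware chunking" wording.
LEGACY_TERMS = [
    "smart delimiter",
    "pseudo-semantic",
    "semantically-aware-delimiters",
    "intelligent chunking",
]
LEGACY_PATTERN = re.compile(
    "|".join(re.escape(term) for term in LEGACY_TERMS), re.IGNORECASE
)


def find_legacy_terms(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, matched term) for every legacy phrase found.

    `files` maps a path to its text so the scan can run on in-memory data.
    """
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in LEGACY_PATTERN.finditer(line):
                hits.append((path, lineno, match.group(0).lower()))
    return hits


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = {
        "overrides/partials/languages.md": "## Languages with Smart Delimiter support\n",
        "scripts/build/generate-mcp-server-json.py": 'strategy = "semantically-aware-delimiters"\n',
    }
    for path, lineno, term in find_legacy_terms(sample):
        print(f"{path}:{lineno}: legacy term '{term}'")
```

Run against the real file contents (for example, read with `pathlib.Path.read_text`), an empty result would suggest the rename is complete for the listed phrases.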

Copilot

Pull request overview

This pull request updates CodeWeaver's README and feature comparison documentation to improve clarity and competitive positioning. The changes focus on repositioning CodeWeaver as a semantic code search tool for Claude, adding comprehensive competitive analysis including a new comparison with Serena, and streamlining the documentation structure.

Key Changes:

Rewrote the README introduction to emphasize semantic code search capabilities and concrete use cases
Added detailed CodeWeaver vs. Serena competitive analysis with practical tradeoffs
Updated feature comparison matrix to include Serena and revised metrics across all competitors
Streamlined README by removing extensive sections on architecture, roadmap, and project details in favor of focused value proposition

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| README.md | Complete overhaul of introduction and positioning; new tagline emphasizing "semantic code search for Claude"; simplified feature grid; removed extensive documentation sections; updated configuration examples |
| claudedocs/codeweaver_feature_comparison_chart.md | Added Serena to comparison matrix; new detailed CodeWeaver vs. Serena breakdown section; updated metrics and feature counts; revised industry context; streamlined performance section |

Copilot · 2025-12-09T16:52:22Z

README.md

 # CodeWeaver

-### The missing abstraction layer between AI and your code
+### Semantic code search for Claude — across 170+ languages


The tagline mentions "170+ languages" but the feature comparison table below shows "166+" languages. This inconsistency should be resolved - use the same number in both places.

Suggested change

-### Semantic code search for Claude — across 170+ languages
+### Semantic code search for Claude — across 166+ languages

Copilot · 2025-12-09T16:52:23Z

README.md

- **Hierarchical merging**
- **Environment overrides**
+### 🔌 Provider Flexibility
+- **17+ embedding providers**


The claim "17+ embedding providers" in this section contradicts other parts of the document which state "16+ embedding providers" (line 72, 178, and line 327 in the feature comparison chart). Ensure consistency across all mentions.

Suggested change

-- **17+ embedding providers**
+- **16+ embedding providers**

Copilot · 2025-12-09T16:52:23Z

claudedocs/codeweaver_feature_comparison_chart.md

-1. Embedding provider diversity (20+ vs 1-5)
+1. Embedding provider diversity (16+ vs 1-5)
 2. Hybrid search (dense + sparse by default)
 3. Language support breadth (170+ with semantic-like chunking)


The feature comparison chart states "Language support breadth (170+ with semantic-like chunking)" but earlier in the document the numbers vary between 166+ and 170+. This should be consistent throughout the document.

docs/comparison.md

Copilot · 2025-12-09T16:52:24Z

claudedocs/codeweaver_feature_comparison_chart.md

+| **Tool Count** | **1** | **17** | N/A | N/A | N/A | N/A | N/A | N/A |
+| **Prompt Overhead** | **~500 tokens** | **~17,000 tokens** | N/A | N/A | N/A | N/A | N/A | N/A |
+| **Search Speed** | Moderate (embeddings) | **Very fast (LSP)** | Moderate | Server-side | Fast | Moderate | Fast | On-demand |
+| **Embedding Providers** | **16+** | 0 (no embeddings) | 1-2 | 1 | 0 (deprecated) | 4-5 | 1 | 0 |


The embedding provider count in this table shows "16+" but in the Detailed Feature Breakdown section (line 318) it says "16+ providers" while the conclusion (line 327) mentions provider diversity as "(16+ vs 1-5)". However, the previous version mentioned "20+ providers" which was changed to "16+" in some places but not all. Ensure the count is accurate and consistent - if it's 16+, update all references; if it's actually more, use the correct number.
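
The count mismatches flagged in the comments above could be caught mechanically. The following is a minimal sketch, assuming the claims are phrased as "N+ languages" and "N+ embedding providers" as in the snippets quoted in this thread. The function names and sample texts are hypothetical, and phrasings such as "(16+ vs 1-5)" would need additional patterns.

```python
import re
from collections import defaultdict

# One pattern per claim; the phrasing is an assumption based on the quoted diffs.
CLAIMS = {
    "languages": re.compile(r"(\d+)\+?\s+languages\b", re.IGNORECASE),
    "embedding providers": re.compile(
        r"(\d+)\+?\s+embedding providers\b", re.IGNORECASE
    ),
}


def find_claims(docs: dict[str, str]) -> dict[str, dict[int, list[str]]]:
    """Map each claim to {quoted number: [path:line, ...]} across all docs."""
    found = {name: defaultdict(list) for name in CLAIMS}
    for path, text in docs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name, pattern in CLAIMS.items():
                for match in pattern.finditer(line):
                    found[name][int(match.group(1))].append(f"{path}:{lineno}")
    return found


def inconsistent(found: dict[str, dict[int, list[str]]]) -> dict[str, dict[int, list[str]]]:
    """Keep only the claims that appear with more than one distinct number."""
    return {name: dict(values) for name, values in found.items() if len(values) > 1}


if __name__ == "__main__":
    # Hypothetical excerpts, for illustration only.
    docs = {
        "README.md": "across 170+ languages\n- **17+ embedding providers**\n",
        "docs/comparison.md": "Supports 166+ languages\n",
    }
    for name, values in inconsistent(find_claims(docs)).items():
        print(name, values)
```

A check like this could run in CI and fail when any claim appears with two different numbers.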

README.md

Copilot · 2025-12-09T16:52:25Z

README.md

+### 🔌 Provider Flexibility
+- **17+ embedding providers**
+- **50+ embedding models**
+- **Sparse & dense**embedding model support


Typo: "Sparse & dense**embedding model support" is missing a space before "embedding". Should be "Sparse & dense embedding model support".

Suggested change

-- **Sparse & dense**embedding model support
+- **Sparse & dense** embedding model support

Copilot · 2025-12-09T16:52:25Z

README.md

 [arch_find_code]: <src/codeweaver/agent_api/find_code/ARCHITECTURE.md> "find_code Architecture"
 [architecture]: <ARCHITECTURE.md> "Overall Architecture"
 [bashandbone]: <https://github.com/bashandbone> "Adam Poulemanos' GitHub Profile"
+[competitive_analysis]: <src/codeweaver/claudedocs/competitive_feature_comparison_chart.md> "See how CodeWeaver stacks up"


The URL for the competitive analysis link appears to be incorrect. It references "src/codeweaver/claudedocs/competitive_feature_comparison_chart.md" but the actual file in the PR is "claudedocs/codeweaver_feature_comparison_chart.md" (not "competitive_feature_comparison_chart.md"). This link will be broken.

Suggested change

-[competitive_analysis]: <src/codeweaver/claudedocs/competitive_feature_comparison_chart.md> "See how CodeWeaver stacks up"
+[competitive_analysis]: <claudedocs/codeweaver_feature_comparison_chart.md> "See how CodeWeaver stacks up"
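
A broken reference-style link like the one above can be detected before merge. This is a minimal sketch, assuming link definitions follow the `[label]: <target> "title"` shape shown in the diff and that local targets are relative to the repository root. The function name `broken_local_refs` and the demo files are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Matches reference-style link definitions such as: [label]: <target> "title"
REF_DEF = re.compile(r"^\s*\[([^\]]+)\]:\s*<?([^\s>]+)>?", re.MULTILINE)
# Anything starting with a URL scheme (https:, mailto:, ...) is treated as external.
URL_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_local_refs(markdown: str, root: Path) -> list[tuple[str, str]]:
    """Return (label, target) pairs whose local target does not exist under root."""
    broken = []
    for label, target in REF_DEF.findall(markdown):
        if URL_SCHEME.match(target):
            continue  # external links are out of scope for this check
        path = target.split("#", 1)[0]  # drop any fragment before checking the path
        if path and not (root / path).exists():
            broken.append((label, target))
    return broken


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "ARCHITECTURE.md").write_text("# Architecture\n")
        readme = (
            '[architecture]: <ARCHITECTURE.md> "Overall Architecture"\n'
            '[bashandbone]: <https://github.com/bashandbone> "GitHub Profile"\n'
            '[competitive_analysis]: <docs/does-not-exist.md> "Comparison"\n'
        )
        print(broken_local_refs(readme, root))
```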

README.md

…sistency and fairness across docs and comparisons. Updated all docs accordingly. Cleaned up older planning docs

socket-security · 2025-12-09T19:20:16Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	pypi/argcomplete@3.6.3

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

The Quick Reference Matrix and several competitive stats (e.g., provider counts, language counts, prompt overhead) are now duplicated between the README and docs/comparison.md; consider centralizing these in a single source (or generating them from a script/constant) to avoid future drift.
Numbers like 17 embedding providers, 166+ languages, and token counts are hard-coded throughout the comparison doc and README while some of these values are already available programmatically (e.g., len(_languages()), embedding_providers()); wiring the docs to those programmatic sources would make them more robust as the product evolves.
The new Serena comparison introduces specific measurements (tool counts, token overhead, language counts) that may change quickly as clients update; consider either linking to a methodology/measurement section or parameterizing these values to make it easier to keep them current.
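
The first two points above suggest generating the quoted numbers from one source. A minimal sketch of that idea follows. `_languages()` and `embedding_providers()` are the names mentioned in the review, but the bodies here are placeholder stand-ins, since the real functions are not shown in this thread; `collect_stats`, `render`, and the `$placeholder` names are hypothetical.

```python
from string import Template


# Placeholder stand-ins for the programmatic sources named in the review.
# The real functions live in CodeWeaver; their signatures are assumed here.
def _languages() -> list[str]:
    return ["python", "rust", "typescript"]  # placeholder data


def embedding_providers() -> list[str]:
    return ["provider-a", "provider-b"]  # placeholder data


def collect_stats() -> dict[str, str]:
    """Gather every number the docs quote in a single place."""
    return {
        "language_count": str(len(_languages())),
        "provider_count": str(len(embedding_providers())),
    }


def render(template_text: str, stats: dict[str, str]) -> str:
    """Fill $placeholders; substitute() raises KeyError on an unknown name."""
    return Template(template_text).substitute(stats)


if __name__ == "__main__":
    snippet = (
        "- **$provider_count embedding providers**\n"
        "- **$language_count languages**"
    )
    print(render(snippet, collect_stats()))
```

With this approach the README and comparison doc would hold placeholders rather than literal counts, so both files pick up the same value on each build. A literal dollar sign in the template text would need to be written as `$$`.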


Copilot

Pull request overview

Copilot reviewed 24 out of 25 changed files in this pull request and generated no new comments.

Co-authored-by: Copilot <[email protected]> Signed-off-by: Adam Poulemanos <[email protected]>

…, WHY.md, and comparison chart

…ng experimental note and fixing formatting

Copilot

Pull request overview

Copilot reviewed 25 out of 26 changed files in this pull request and generated 1 comment.

Copilot · 2025-12-09T20:30:28Z

docs/WHY.md

+- Even well-meaning MCP tools contribute: several popular MCP servers have **16,000+ tokens in prompt overhead** -- all the prompts they supply **every single message** to tell an agent about their available tools and how to use them [^1]

-CodeWeaver takes a different approach:
+CodeWeaver strives to approac:


Typo: "approac" should be "approach"

Suggested change

-CodeWeaver strives to approac:
+CodeWeaver strives to approach:

… headers

…and enhancing readability

Copilot

Pull request overview

Copilot reviewed 25 out of 26 changed files in this pull request and generated no new comments.

feat: update README and feature comparison chart for clarity and accu…

f589ca5

…racy

Copilot AI review requested due to automatic review settings December 9, 2025 16:46

bashandbone added the documentation Improvements or additions to documentation label Dec 9, 2025

Copilot started reviewing on behalf of bashandbone December 9, 2025 16:46