9 changes: 9 additions & 0 deletions .env.example
@@ -90,3 +90,12 @@ SPLITTER_TYPE=ast

# Whether to use hybrid search mode. If true, it will use both dense vector and BM25; if false, it will use only dense vector search.
# HYBRID_MODE=true

# =============================================================================
# Collection Naming Configuration
# =============================================================================

# Whether to use strict collection naming that includes provider and model info
# This prevents conflicts when switching between different embedding providers/models
# If false (default), uses legacy naming for backward compatibility
# EMBEDDING_STRICT_COLLECTION_NAMES=false
2 changes: 1 addition & 1 deletion README.md
@@ -543,7 +543,7 @@ Claude Context is a monorepo containing three main packages:

### Supported Technologies

- **Embedding Providers**: [OpenAI](https://openai.com), [VoyageAI](https://voyageai.com), [Ollama](https://ollama.ai), [Gemini](https://gemini.google.com)
- **Embedding Providers**: [OpenAI](https://openai.com), [VoyageAI](https://voyageai.com), [Ollama](https://ollama.ai), [Gemini](https://gemini.google.com), [LlamaCpp](https://github.com/ggerganov/llama.cpp) (local inference on consumer hardware)
- **Vector Databases**: [Milvus](https://milvus.io) or [Zilliz Cloud](https://zilliz.com/cloud)(fully managed vector database as a service)
- **Code Splitters**: AST-based splitter (with automatic fallback), LangChain character-based splitter
- **Languages**: TypeScript, JavaScript, Python, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, Markdown
18 changes: 17 additions & 1 deletion docs/getting-started/environment-variables.md
@@ -20,7 +20,7 @@ Claude Context supports a global configuration file at `~/.context/.env` to simp
### Embedding Provider
| Variable | Description | Default |
|----------|-------------|---------|
| `EMBEDDING_PROVIDER` | Provider: `OpenAI`, `VoyageAI`, `Gemini`, `Ollama` | `OpenAI` |
| `EMBEDDING_PROVIDER` | Provider: `OpenAI`, `VoyageAI`, `Gemini`, `Ollama`, `LlamaCpp` | `OpenAI` |
| `EMBEDDING_MODEL` | Embedding model name (works for all providers) | Provider-specific default |
| `OPENAI_API_KEY` | OpenAI API key | Required for OpenAI |
| `OPENAI_BASE_URL` | OpenAI API base URL (optional, for custom endpoints) | `https://api.openai.com/v1` |
@@ -47,13 +47,29 @@ Claude Context supports a global configuration file at `~/.context/.env` to simp
|----------|-------------|---------|
| `MILVUS_TOKEN` | Milvus authentication token. Get [Zilliz Personal API Key](https://github.com/zilliztech/claude-context/blob/master/assets/signup_and_get_apikey.png) | Recommended |
| `MILVUS_ADDRESS` | Milvus server address. Optional when using Zilliz Personal API Key | Auto-resolved from token |
| `MILVUS_COLLECTION_NAME` | Custom collection name (optional, overrides automatic naming) | Auto-generated |
| `EMBEDDING_STRICT_COLLECTION_NAMES` | Use strict collection naming with provider+model info to prevent conflicts | `false` |

> **💡 Collection Naming:**
> - **Legacy mode** (default): Collections named `hybrid_code_chunks_<hash>` - same name for all providers
> - **Strict mode** (`EMBEDDING_STRICT_COLLECTION_NAMES=true`): Collections include provider and model, e.g. `hybrid_ollama_nomic_embed_text_<hash>_<unique>`
> - **Benefits of strict mode**: Prevents data conflicts when switching between different embedding providers or models
> - **Use case**: Enable strict mode when experimenting with multiple providers (Ollama, LlamaCpp, etc.) on the same codebase
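
The two naming modes described above can be illustrated with a minimal standalone sketch. It assumes the rules shown in this PR: an 8-character MD5 prefix of the resolved path, a lowercased provider, and a model name with non-alphanumeric characters replaced by `_`. The names `md5Prefix` and `sketchCollectionName` are hypothetical and exist only for this illustration; they are not part of the package API.

```typescript
import { createHash } from 'node:crypto';
import * as path from 'node:path';

// Hypothetical helper: first 8 hex characters of an MD5 digest.
function md5Prefix(input: string): string {
    return createHash('md5').update(input).digest('hex').substring(0, 8);
}

// Hypothetical sketch of the legacy and strict naming rules.
function sketchCollectionName(
    codebasePath: string,
    provider: string,
    model: string,
    isHybrid: boolean,
    strict: boolean,
): string {
    const normalizedPath = path.resolve(codebasePath);
    const pathHash = md5Prefix(normalizedPath);

    if (!strict) {
        // Legacy mode: the name depends only on the path and the hybrid flag.
        const prefix = isHybrid ? 'hybrid_code_chunks' : 'code_chunks';
        return `${prefix}_${pathHash}`;
    }

    // Strict mode: provider and sanitized model become part of the name.
    const providerPart = provider.toLowerCase();
    const modelPart = model.replace(/[^a-zA-Z0-9]/g, '_');
    const uniqueHash = md5Prefix(`${providerPart}_${modelPart}_${normalizedPath}`);
    const prefix = isHybrid ? 'hybrid' : 'code';
    return `${prefix}_${providerPart}_${modelPart}_${pathHash}_${uniqueHash}`;
}
```

With these assumptions, switching from `Ollama` to `LlamaCpp` on the same codebase keeps the legacy name unchanged, while the strict name changes in both the provider segment and the trailing hash.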

### Ollama (Optional)
| Variable | Description | Default |
|----------|-------------|---------|
| `OLLAMA_HOST` | Ollama server URL | `http://127.0.0.1:11434` |
| `OLLAMA_MODEL`(alternative to `EMBEDDING_MODEL`) | Model name | |

### LlamaCpp (Optional)
| Variable | Description | Default |
|----------|-------------|---------|
| `LLAMACPP_HOST` | LlamaCpp server URL | `http://localhost:8080` |
| `LLAMACPP_MODEL` (alternative to `EMBEDDING_MODEL`) | Model name | |
| `LLAMACPP_TIMEOUT` | Request timeout in milliseconds | `30000` |
| `LLAMACPP_CODE_PREFIX` | Enable automatic code prefix for embeddings | `true` |
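
As a rough illustration of how the LlamaCpp variables and their documented defaults fit together, here is a minimal sketch. The function `readLlamaCppSettings` and the `LlamaCppSettings` shape are hypothetical and not taken from this PR. Two behaviors are assumptions made for the example: `EMBEDDING_MODEL` takes precedence over `LLAMACPP_MODEL`, and any `LLAMACPP_CODE_PREFIX` value other than `false` counts as enabled.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface LlamaCppSettings {
    host: string;
    model?: string;
    timeoutMs: number;
    codePrefix: boolean;
}

// Hypothetical reader that applies the defaults from the table above.
function readLlamaCppSettings(env: Record<string, string | undefined>): LlamaCppSettings {
    const parsedTimeout = Number.parseInt(env.LLAMACPP_TIMEOUT ?? '', 10);
    return {
        host: env.LLAMACPP_HOST || 'http://localhost:8080',
        // Assumption: the generic variable wins over the provider-specific one.
        model: env.EMBEDDING_MODEL || env.LLAMACPP_MODEL,
        // Fall back to 30000 ms when the value is missing or not a positive integer.
        timeoutMs: Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000,
        // Assumption: only an explicit "false" (case-insensitive) disables the prefix.
        codePrefix: (env.LLAMACPP_CODE_PREFIX ?? 'true').toLowerCase() !== 'false',
    };
}
```

In a Node.js process this would typically be called as `readLlamaCppSettings(process.env)`.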


### Advanced Configuration
| Variable | Description | Default |
36 changes: 32 additions & 4 deletions packages/core/src/context.ts
@@ -94,6 +94,7 @@ export interface ContextConfig {
ignorePatterns?: string[];
customExtensions?: string[]; // New: custom extensions from MCP
customIgnorePatterns?: string[]; // New: custom ignore patterns from MCP
customCollectionName?: string; // New: custom collection name from MCP config
}

export class Context {
@@ -103,6 +104,7 @@ export class Context {
private supportedExtensions: string[];
private ignorePatterns: string[];
private synchronizers = new Map<string, FileSynchronizer>();
private customCollectionName?: string;

constructor(config: ContextConfig = {}) {
// Initialize services
@@ -145,6 +147,9 @@ export class Context {
// Remove duplicates
this.ignorePatterns = [...new Set(allIgnorePatterns)];

// Store custom collection name if provided
this.customCollectionName = config.customCollectionName;

console.log(`[Context] 🔧 Initialized with ${this.supportedExtensions.length} supported extensions and ${this.ignorePatterns.length} ignore patterns`);
if (envCustomExtensions.length > 0) {
console.log(`[Context] 📎 Loaded ${envCustomExtensions.length} custom extensions from environment: ${envCustomExtensions.join(', ')}`);
@@ -229,14 +234,37 @@ export class Context {
}

/**
* Generate collection name based on codebase path and hybrid mode
* Generate collection name based on codebase path, provider, model and hybrid mode
*/
public getCollectionName(codebasePath: string): string {
// If custom collection name is provided, use it directly
if (this.customCollectionName) {
return this.customCollectionName;
}

const isHybrid = this.getIsHybrid();
const normalizedPath = path.resolve(codebasePath);
const hash = crypto.createHash('md5').update(normalizedPath).digest('hex');
const prefix = isHybrid === true ? 'hybrid_code_chunks' : 'code_chunks';
return `${prefix}_${hash.substring(0, 8)}`;
const pathHash = crypto.createHash('md5').update(normalizedPath).digest('hex');

// Check if strict collection naming is enabled
const strictNaming = envManager.get('EMBEDDING_STRICT_COLLECTION_NAMES')?.toLowerCase() === 'true';

if (strictNaming) {
// Generate collection name including provider and model to prevent conflicts
const provider = this.embedding.getProvider().toLowerCase();
const model = this.embedding.getModel().replace(/[^a-zA-Z0-9]/g, '_'); // Sanitize model name

// Create a comprehensive hash including provider and model to ensure uniqueness
const uniqueString = `${provider}_${model}_${normalizedPath}`;
const fullHash = crypto.createHash('md5').update(uniqueString).digest('hex');

const prefix = isHybrid === true ? 'hybrid' : 'code';
return `${prefix}_${provider}_${model}_${pathHash.substring(0, 8)}_${fullHash.substring(0, 8)}`;
} else {
// Legacy collection naming (default behavior)
const prefix = isHybrid === true ? 'hybrid_code_chunks' : 'code_chunks';
return `${prefix}_${pathHash.substring(0, 8)}`;
}
}

/**
6 changes: 6 additions & 0 deletions packages/core/src/embedding/base-embedding.ts
@@ -73,4 +73,10 @@ export abstract class Embedding {
* @returns Provider name
*/
abstract getProvider(): string;

/**
* Get model name/identifier
* @returns Model name
*/
abstract getModel(): string;
}
4 changes: 4 additions & 0 deletions packages/core/src/embedding/gemini-embedding.ts
@@ -119,6 +119,10 @@ export class GeminiEmbedding extends Embedding {
return 'Gemini';
}

getModel(): string {
return this.config.model || 'gemini-embedding-001';
}

/**
* Set model type
* @param model Model name
3 changes: 2 additions & 1 deletion packages/core/src/embedding/index.ts
@@ -5,4 +5,5 @@ export * from './base-embedding';
export * from './openai-embedding';
export * from './voyageai-embedding';
export * from './ollama-embedding';
export * from './gemini-embedding';
export * from './gemini-embedding';
export * from './llamacpp-embedding';