# Configuration Reference

Complete environment variable reference for Context Engine.

**Documentation:** [README](../README.md) · [Configuration](CONFIGURATION.md) · [IDE Clients](IDE_CLIENTS.md) · [MCP API](MCP_API.md) · [ctx CLI](CTX_CLI.md) · [Memory Guide](MEMORY_GUIDE.md) · [Architecture](ARCHITECTURE.md) · [Multi-Repo](MULTI_REPO_COLLECTIONS.md) · [Kubernetes](../deploy/kubernetes/README.md) · [VS Code Extension](vscode-extension.md) · [Troubleshooting](TROUBLESHOOTING.md) · [Development](DEVELOPMENT.md)

---

**On this page:**
- [Core Settings](#core-settings)
- [Indexing & Micro-Chunks](#indexing--micro-chunks)
- [Watcher Settings](#watcher-settings)
- [Reranker](#reranker)
- [Decoder (llama.cpp / GLM)](#decoder-llamacpp--glm)
- [ReFRAG](#refrag-micro-chunking--retrieval)
- [Ports](#ports)
- [Search & Expansion](#search--expansion)
- [Memory Blending](#memory-blending)

---

## Core Settings

| Name | Description | Default |
|------|-------------|---------|
| COLLECTION_NAME | Qdrant collection name (unified across all repos) | codebase |
| REPO_NAME | Logical repo tag stored in payload for filtering | auto-detected from git/folder |
| HOST_INDEX_PATH | Host path mounted at /work in containers | current repo (.) |
| QDRANT_URL | Qdrant base URL | container: http://qdrant:6333; local: http://localhost:6333 |

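As an example, a multi-repo setup typically pins `COLLECTION_NAME` and lets `REPO_NAME` distinguish payloads per repo. The repo name and path below are illustrative:

```bash
# Share one Qdrant collection across repos; REPO_NAME tags each payload for filtering
export COLLECTION_NAME=codebase
export REPO_NAME=billing-service          # example; auto-detected from git/folder if unset
export HOST_INDEX_PATH=~/src/billing-service
export QDRANT_URL=http://localhost:6333   # use http://qdrant:6333 from inside compose
```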
## Indexing & Micro-Chunks

| Name | Description | Default |
|------|-------------|---------|
| INDEX_MICRO_CHUNKS | Enable token-based micro-chunking | 0 (off) |
| MAX_MICRO_CHUNKS_PER_FILE | Cap micro-chunks per file | 200 |
| TOKENIZER_URL | HF tokenizer.json URL (for Make download) | n/a |
| TOKENIZER_PATH | Local path where tokenizer is saved (Make) | models/tokenizer.json |
| TOKENIZER_JSON | Runtime path for tokenizer (indexer) | models/tokenizer.json |
| USE_TREE_SITTER | Enable tree-sitter parsing (py/js/ts) | 0 (off) |
| INDEX_CHUNK_LINES | Lines per chunk (non-micro mode) | 120 |
| INDEX_CHUNK_OVERLAP | Overlap lines between chunks | 20 |
| INDEX_BATCH_SIZE | Upsert batch size | 64 |
| INDEX_PROGRESS_EVERY | Log progress every N files | 200 |

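Token-based micro-chunking requires a tokenizer file at `TOKENIZER_JSON`. A minimal sketch of turning it on, assuming the tokenizer has already been downloaded to the default path:

```bash
# Enable micro-chunking and tree-sitter parsing (both off by default)
export INDEX_MICRO_CHUNKS=1
export USE_TREE_SITTER=1
export TOKENIZER_JSON=models/tokenizer.json  # must exist; see TOKENIZER_URL / TOKENIZER_PATH
```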
## Watcher Settings

| Name | Description | Default |
|------|-------------|---------|
| WATCH_DEBOUNCE_SECS | Debounce between FS events | 1.5 |
| INDEX_UPSERT_BATCH | Upsert batch size (watcher) | 128 |
| INDEX_UPSERT_RETRIES | Retry count | 5 |
| INDEX_UPSERT_BACKOFF | Seconds between retries | 0.5 |
| QDRANT_TIMEOUT | HTTP timeout seconds | watcher: 60; search: 20 |
| MCP_TOOL_TIMEOUT_SECS | Max duration for long-running MCP tools | 3600 |

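For busy working trees that generate bursts of file events, the debounce and batching knobs can be loosened. The values below are illustrative, not recommendations from the project:

```bash
export WATCH_DEBOUNCE_SECS=3    # coalesce rapid bursts of FS events into one reindex
export INDEX_UPSERT_BATCH=256   # fewer, larger upserts
export QDRANT_TIMEOUT=120       # more headroom for large batches
```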
## Reranker

| Name | Description | Default |
|------|-------------|---------|
| RERANKER_ONNX_PATH | Local ONNX cross-encoder model path | unset |
| RERANKER_TOKENIZER_PATH | Tokenizer path for reranker | unset |
| RERANKER_ENABLED | Enable reranker by default | 1 (enabled) |

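Pointing the reranker at a local cross-encoder might look like the following; both paths are illustrative, since these variables are unset by default:

```bash
export RERANKER_ONNX_PATH=models/reranker.onnx                 # example path
export RERANKER_TOKENIZER_PATH=models/reranker_tokenizer.json  # example path
export RERANKER_ENABLED=1                                      # already the default
```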
## Decoder (llama.cpp / GLM)

| Name | Description | Default |
|------|-------------|---------|
| REFRAG_DECODER | Enable decoder for context_answer | 1 (enabled) |
| REFRAG_RUNTIME | Decoder backend: llamacpp or glm | llamacpp |
| LLAMACPP_URL | llama.cpp server endpoint | http://llamacpp:8080 or http://host.docker.internal:8081 |
| LLAMACPP_TIMEOUT_SEC | Decoder request timeout | 300 |
| DECODER_MAX_TOKENS | Max tokens for decoder responses | 4000 |
| REFRAG_DECODER_MODE | prompt or soft (soft requires patched llama.cpp) | prompt |
| GLM_API_KEY | API key for GLM provider | unset |
| GLM_MODEL | GLM model name | glm-4.6 |
| USE_GPU_DECODER | Native Metal decoder (1) vs Docker (0) | 0 (docker) |
| LLAMACPP_GPU_LAYERS | Layers to offload to GPU (-1 = all) | 32 |

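Switching `context_answer` from the local llama.cpp server to the GLM provider only requires changing the runtime and supplying credentials. The key below is a placeholder:

```bash
export REFRAG_RUNTIME=glm
export GLM_API_KEY=sk-...   # placeholder; substitute your real key
export GLM_MODEL=glm-4.6    # default, shown for clarity
```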
## ReFRAG (Micro-Chunking & Retrieval)

| Name | Description | Default |
|------|-------------|---------|
| REFRAG_MODE | Enable micro-chunking and span budgeting | 1 (enabled) |
| REFRAG_GATE_FIRST | Enable mini-vector gating | 1 (enabled) |
| REFRAG_CANDIDATES | Candidates for gate-first filtering | 200 |
| MICRO_BUDGET_TOKENS | Token budget for context_answer | 512 |
| MICRO_OUT_MAX_SPANS | Max spans returned per query | 3 |
| MICRO_CHUNK_TOKENS | Tokens per micro-chunk window | 16 |
| MICRO_CHUNK_STRIDE | Stride between windows | 8 |
| MICRO_MERGE_LINES | Lines to merge adjacent spans | 4 |
| MICRO_TOKENS_PER_LINE | Estimated tokens per line | 32 |

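With the defaults above, 16-token windows advance by 8 tokens (50% overlap) and `context_answer` packs at most 3 spans into a 512-token budget. To trade latency for recall, the budget and candidate pool can be widened; these values are illustrative:

```bash
export MICRO_BUDGET_TOKENS=1024  # larger context budget for context_answer
export MICRO_OUT_MAX_SPANS=5     # return more spans per query
export REFRAG_CANDIDATES=400     # widen the gate-first candidate pool
```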
## Ports

| Name | Description | Default |
|------|-------------|---------|
| FASTMCP_PORT | Memory MCP server port (SSE) | 8000 |
| FASTMCP_INDEXER_PORT | Indexer MCP server port (SSE) | 8001 |
| FASTMCP_HTTP_PORT | Memory RMCP host port mapping | 8002 |
| FASTMCP_INDEXER_HTTP_PORT | Indexer RMCP host port mapping | 8003 |
| FASTMCP_HEALTH_PORT | Health port (memory/indexer) | memory: 18000; indexer: 18001 |

## Search & Expansion

| Name | Description | Default |
|------|-------------|---------|
| HYBRID_EXPAND | Enable heuristic multi-query expansion | 0 (off) |
| LLM_EXPAND_MAX | Max alternate queries via LLM | 0 |

## Memory Blending

| Name | Description | Default |
|------|-------------|---------|
| MEMORY_SSE_ENABLED | Enable SSE memory blending | false |
| MEMORY_MCP_URL | Memory MCP endpoint for blending | http://mcp:8000/sse |
| MEMORY_MCP_TIMEOUT | Timeout for memory queries | 6 |
| MEMORY_AUTODETECT | Auto-detect memory collection | 1 |
| MEMORY_COLLECTION_TTL_SECS | Cache TTL for collection detection | 300 |

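Enabling blending so that code search results are mixed with agent memories might look like this, using the documented defaults for endpoint and timeout:

```bash
export MEMORY_SSE_ENABLED=true
export MEMORY_MCP_URL=http://mcp:8000/sse  # compose-internal default
export MEMORY_MCP_TIMEOUT=6                # seconds
```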
---

## Exclusions (.qdrantignore)

The indexer supports a `.qdrantignore` file at the repo root (similar to `.gitignore`).

**Default exclusions** (overridable):
- `/models`, `/node_modules`, `/dist`, `/build`
- `/.venv`, `/venv`, `/__pycache__`, `/.git`
- `*.onnx`, `*.bin`, `*.safetensors`, `tokenizer.json`, `*.whl`, `*.tar.gz`

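A minimal `.qdrantignore` might look like the following, assuming gitignore-style patterns as implied by the defaults above; the entries are illustrative:

```gitignore
# build artifacts and vendored dependencies
/dist
/third_party
*.onnx
*.log
```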
**Override via env or flags:**
```bash
# Disable defaults
QDRANT_DEFAULT_EXCLUDES=0

# Custom ignore file
QDRANT_IGNORE_FILE=.myignore

# Additional excludes
QDRANT_EXCLUDES='tokenizer.json,*.onnx,/third_party'
```

**CLI examples:**
```bash
docker compose run --rm indexer --root /work --ignore-file .qdrantignore
docker compose run --rm indexer --root /work --no-default-excludes --exclude '/vendor' --exclude '*.bin'
```

---

## Scaling Recommendations

| Repo Size | Chunk Lines | Overlap | Batch Size |
|-----------|-------------|---------|------------|
| Small (<100 files) | 80-120 | 16-24 | 32-64 |
| Medium (100s-1k files) | 120-160 | ~20 | 64-128 |
| Large (1k+ files) | 120 (default) | 20 | 128+ |

For large monorepos, keep `INDEX_PROGRESS_EVERY` at its default of 200 (or lower it) so indexing progress stays visible in the logs.

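Applied to a large monorepo, the recommendations above translate to something like:

```bash
export INDEX_CHUNK_LINES=120     # default chunk size works well at scale
export INDEX_CHUNK_OVERLAP=20
export INDEX_BATCH_SIZE=128      # larger upsert batches for throughput
export INDEX_PROGRESS_EVERY=200  # periodic progress logging
```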