Commit a4ff887

docs: consolidate Phase 8.5 documentation and remove temporary files

- Updated MIGRATION_GUIDE.md with Phase 8.5 improvements:
  * Auto-bootstrap database connections (3-step → 1-step workflow)
  * Auto-detection of embeddings from environment variables
  * Optional URL parameter with auto-generation
  * Improved default chunking (None → Sentence-based)
  * Updated all workflow examples to use current API
  * Removed references to disabled tools
- Enhanced create_collection docstring in server.py:
  * Documented auto-bootstrap behavior
  * Detailed embedding auto-detection logic
  * Added environment variable requirements
  * Updated error codes (DB_BOOTSTRAP_FAILED)
- Removed temporary documentation files:
  * MCP_TOOLS_FIXES_SUMMARY.md (fix tracking - obsolete)
  * E2E_TESTS_NEED_UPDATE.md (test updates - completed)
  * TOOL5_UNTRACKED_COLLECTIONS.md (consolidated into audit)
- Preserved permanent documentation:
  * MCP_TOOLS_AUDIT.md (complete audit record)
  * MIGRATION_GUIDE.md (updated with Phase 8.5)
  * MCP_API_REFERENCE.md (already current)

All documentation now accurate, consistent, and consolidated.

Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
1 parent e837acd commit a4ff887

9 files changed: +1090 -399 lines changed


docs/MCP_API_REFERENCE.md

Lines changed: 221 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -37,11 +37,13 @@ write_documents(
3737
```python
3838
{
3939
"text": str, # Required: Document content
40-
"url": str, # Optional: Source URL (auto-generated if empty)
40+
"url": str, # Optional: Source URL or identifier (auto-generated from text hash if empty)
4141
"metadata": dict # Optional: Custom metadata
4242
}
4343
```
4444

45+
**Note**: As of Phase 8.5, the `url` field is optional. If it is omitted or empty, a value is auto-generated from a hash of the text content.
46+
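The auto-generation described in the note could look roughly like the sketch below. The SHA-256 digest, the `doc:` prefix, and the helper names `auto_url` and `with_url` are assumptions for illustration; this reference does not state which hash or identifier format the server actually uses.

```python
import hashlib


def auto_url(text: str) -> str:
    """Hypothetical helper: derive a stable identifier from document text."""
    # Assumption: SHA-256 over the UTF-8 bytes; the real hash may differ.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"doc:{digest[:16]}"


def with_url(doc: dict) -> dict:
    """Return a copy of doc whose missing or empty `url` is filled in."""
    if doc.get("url"):
        return dict(doc)
    return {**doc, "url": auto_url(doc["text"])}
```

Because the identifier depends only on the text, writing the same content twice would yield the same `url` under this scheme.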
4547
**Example:**
4648
```python
4749
write_documents(
@@ -420,33 +422,244 @@ search(
420422

421423
---
422424

423-
## Error Handling
425+
## Response Format Documentation
426+
427+
All tools return JSON responses with consistent structure.
428+
429+
### Success Response Structure
430+
431+
```json
432+
{
433+
"status": "success",
434+
"message": "Human-readable summary of the operation",
435+
"data": {
436+
// Tool-specific data (varies by operation)
437+
},
438+
"metadata": {
439+
// Optional metadata (included when relevant)
440+
"timestamp": "2025-01-13T16:00:00.000Z",
441+
"operation": "tool_name",
442+
"database": "collection_name",
443+
"collection": "collection_name"
444+
}
445+
}
446+
```
447+
448+
**Fields**:
449+
- `status`: Always "success" for successful operations
450+
- `message`: Human-readable summary (e.g., "Wrote 3 documents to collection 'docs'")
451+
- `data`: Tool-specific response data (structure varies by tool)
452+
- `metadata`: Optional metadata about the operation (timestamp, operation name, etc.)
453+
454+
### Error Response Structure
455+
456+
```json
457+
{
458+
"status": "error",
459+
"error_code": "ERROR_CODE",
460+
"message": "Human-readable error message",
461+
"details": {
462+
// Additional error context
463+
},
464+
"suggestion": "Actionable suggestion to fix the error"
465+
}
466+
```
467+
468+
**Fields**:
469+
- `status`: Always "error" for failed operations
470+
- `error_code`: Machine-readable error code (see Error Codes section)
471+
- `message`: Human-readable error description
472+
- `details`: Optional additional context (parameters, available options, etc.)
473+
- `suggestion`: Optional actionable suggestion to resolve the error
474+
475+
### Error Codes
476+
477+
Most error codes carry a category prefix (`DB_`, `COLL_`, `DOC_`, `PARAM_`, `CONFIG_`); system errors are unprefixed:
478+
479+
**Database/Collection Errors**:
480+
- `DB_NOT_FOUND`: Collection not found
481+
- `DB_NOT_INITIALIZED`: Collection not properly initialized
482+
- `COLL_NOT_FOUND`: Collection not found
483+
- `COLL_ALREADY_EXISTS`: Collection already exists
484+
- `COLL_NOT_EMPTY`: Collection contains documents (for delete operations)
485+
- `COLL_CREATION_FAILED`: Collection creation failed
486+
- `COLL_DELETE_FAILED`: Collection deletion failed
487+
- `COLL_INFO_FAILED`: Failed to retrieve collection information
424488

425-
All tools return JSON responses with consistent structure:
489+
**Document Errors**:
490+
- `DOC_WRITE_FAILED`: Document write operation failed
491+
- `DOC_DELETE_FAILED`: Document deletion failed
492+
- `DOC_DELETE_REQUIRES_FORCE`: Deletion requires force=True
493+
- `DOC_NOT_FOUND`: Document not found
494+
- `DOC_RETRIEVAL_FAILED`: Document retrieval failed
426495

427-
**Success Response:**
496+
**Parameter Errors**:
497+
- `PARAM_INVALID_VALUE`: Parameter value out of valid range
498+
- `PARAM_MISSING`: Required parameter missing
499+
500+
**Configuration Errors**:
501+
- `CONFIG_EMBEDDING_INVALID`: Invalid embedding model specified
502+
503+
**System Errors**:
504+
- `NO_DATABASES`: No collections registered
505+
- `QUERY_FAILED`: Query operation failed
506+
- `SEARCH_FAILED`: Search operation failed
507+
- `REFRESH_FAILED`: Database refresh failed
508+
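Since the groupings above are keyed by prefix, a client could route errors by category without enumerating every code. This is an illustrative sketch only; `PREFIX_CATEGORIES` and `error_category` are hypothetical names, and treating every unprefixed code as a system error is an assumption drawn from the list above.

```python
# Prefix-to-category table taken from the groupings listed above.
PREFIX_CATEGORIES = {
    "DB": "database/collection",
    "COLL": "database/collection",
    "DOC": "document",
    "PARAM": "parameter",
    "CONFIG": "configuration",
}


def error_category(error_code: str) -> str:
    """Map an error code to its category; unknown prefixes count as system."""
    prefix = error_code.split("_", 1)[0]
    return PREFIX_CATEGORIES.get(prefix, "system")
```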
509+
### Response Examples by Tool
510+
511+
**write_documents Success**:
512+
```json
513+
{
514+
"status": "success",
515+
"message": "Wrote 2 documents to collection 'docs'",
516+
"data": {
517+
"documents_written": 2,
518+
"chunks_created": 8,
519+
"collection": "docs",
520+
"embedding_model": "text-embedding-ada-002"
521+
},
522+
"metadata": {
523+
"timestamp": "2025-01-13T16:00:00.000Z",
524+
"operation": "write_documents",
525+
"collection": "docs",
526+
"collection_total_documents": 10,
527+
"sample_query": "What is Python programming"
528+
}
529+
}
530+
```
531+
532+
**query Success**:
428533
```json
429534
{
430535
"status": "success",
431-
"message": "Operation completed",
432-
"data": { ... }
536+
"message": "Query completed for 'What is Python?'",
537+
"data": {
538+
"query": "What is Python?",
539+
"summary": "Python is a high-level programming language...",
540+
"limit": 5
541+
},
542+
"metadata": {
543+
"timestamp": "2025-01-13T16:00:00.000Z",
544+
"operation": "query",
545+
"database": "docs",
546+
"collection": "docs"
547+
}
433548
}
434549
```
435550

436-
**Error Response:**
551+
**search Success**:
552+
```json
553+
{
554+
"status": "success",
555+
"message": "Found 3 results in collection 'docs'",
556+
"data": {
557+
"query": "Python programming",
558+
"results_count": 3,
559+
"results": [
560+
{
561+
"text": "Python is a programming language...",
562+
"url": "https://example.com/python",
563+
"source_citation": "[Python Guide](https://example.com/python)",
564+
"score": 0.92,
565+
"metadata": {"author": "John"},
566+
"rank": 1
567+
}
568+
]
569+
},
570+
"metadata": {
571+
"timestamp": "2025-01-13T16:00:00.000Z",
572+
"operation": "search",
573+
"collection": "docs",
574+
"limit": 10
575+
}
576+
}
577+
```
578+
579+
**Error Example**:
437580
```json
438581
{
439582
"status": "error",
440583
"error_code": "COLL_NOT_FOUND",
441584
"message": "Collection 'docs' not found",
442-
"details": { ... },
585+
"details": {
586+
"collection": "docs",
587+
"database": "docs",
588+
"available_collections": ["other_collection"]
589+
},
443590
"suggestion": "Create the collection first: create_collection(collection='docs')"
444591
}
445592
```
446593

447594
---
448595

449596
## Environment Variables
597+
---
598+
599+
## Parameter Validation
600+
601+
All tools validate their parameters and return clear error messages for invalid values.
602+
603+
### Common Parameter Constraints
604+
605+
**limit** (query, search):
606+
- Type: integer
607+
- Range: 1-100 (inclusive)
608+
- Default: 5
609+
- Error: `PARAM_INVALID_VALUE` if out of range
610+
611+
**min_score** (search):
612+
- Type: float
613+
- Range: 0.0-1.0 (inclusive)
614+
- Optional: Yes
615+
- Error: Invalid if outside range
616+
- Interpretation:
617+
- 0.0 = include all results
618+
- 0.5 = moderate similarity threshold
619+
- 0.7 = good similarity threshold
620+
- 0.8 = high similarity threshold
621+
- 1.0 = exact matches only
622+
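The `limit` and `min_score` constraints above can be checked client-side before calling a tool. The sketch below is not the server's validation code: `validate_search_params` and `_invalid` are hypothetical helpers, and reusing `PARAM_INVALID_VALUE` for `min_score` is an assumption, since the reference names that code only for `limit`.

```python
from typing import Optional


def validate_search_params(
    limit: int = 5, min_score: Optional[float] = None
) -> Optional[dict]:
    """Return an error response dict for the first invalid value, else None."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 100:
        return _invalid("limit", limit, "an integer from 1 to 100")
    if min_score is not None and not 0.0 <= min_score <= 1.0:
        # Assumption: the same error code is reused for min_score.
        return _invalid("min_score", min_score, "a float from 0.0 to 1.0")
    return None


def _invalid(name: str, value, expected: str) -> dict:
    """Build an error response in the documented error structure."""
    return {
        "status": "error",
        "error_code": "PARAM_INVALID_VALUE",
        "message": f"Invalid value for '{name}': {value!r}",
        "details": {"parameter": name, "value": value},
        "suggestion": f"Use {expected}",
    }
```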
623+
**force** (delete_documents, delete_collection):
624+
- Type: boolean
625+
- Default: False
626+
- Purpose: Safety mechanism requiring explicit confirmation for destructive operations
627+
- Error: Operation rejected if False and would delete data
628+
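The `force` safety check for document deletion might behave like the following sketch. `guard_delete` and its `would_delete` count are hypothetical; only the `DOC_DELETE_REQUIRES_FORCE` code and the reject-unless-forced rule come from this reference.

```python
from typing import Optional


def guard_delete(force: bool = False, would_delete: int = 0) -> Optional[dict]:
    """Return an error dict when a destructive call lacks force=True, else None."""
    if would_delete > 0 and not force:
        return {
            "status": "error",
            "error_code": "DOC_DELETE_REQUIRES_FORCE",
            "message": f"Refusing to delete {would_delete} document(s) without force=True",
            "details": {"would_delete": would_delete},
            "suggestion": "Re-run the call with force=True to confirm",
        }
    return None
```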
629+
**collection**:
630+
- Type: string
631+
- Required: Yes (for most operations)
632+
- Validation: Must exist (checked at runtime)
633+
- Error: `COLL_NOT_FOUND` if collection doesn't exist
634+
635+
**embedding** (create_collection):
636+
- Type: string
637+
- Default: "auto"
638+
- Valid values:
639+
- "auto" - Auto-detect from environment (recommended)
640+
- "text-embedding-ada-002" - OpenAI default
641+
- "text-embedding-3-small" - OpenAI small
642+
- "text-embedding-3-large" - OpenAI large
643+
- "custom_local" - Custom embedding (requires env vars)
644+
- Error: `CONFIG_EMBEDDING_INVALID` if unsupported
645+
646+
**documents** (write_documents):
647+
- Type: list of dicts
648+
- Required: Yes
649+
- Minimum: 1 document
650+
- Each document must have:
651+
- `text` (required): string
652+
- `url` (optional): string (auto-generated if empty)
653+
- `metadata` (optional): dict
654+
- Error: Validation error if empty or missing required fields
655+
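The `documents` constraints above translate into a short pre-flight check. This is a sketch under stated assumptions: `validate_documents` is a hypothetical helper, and the reference does not say whether an empty `text` string is rejected, so only the type is checked here.

```python
def validate_documents(documents) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    if not isinstance(documents, list) or not documents:
        return ["documents must be a list with at least 1 document"]
    problems = []
    for i, doc in enumerate(documents):
        if not isinstance(doc, dict):
            problems.append(f"documents[{i}] must be a dict")
            continue
        if not isinstance(doc.get("text"), str):
            problems.append(f"documents[{i}].text is required and must be a string")
        # url and metadata are optional, so check types only when present.
        if "url" in doc and not isinstance(doc["url"], str):
            problems.append(f"documents[{i}].url must be a string")
        if "metadata" in doc and not isinstance(doc["metadata"], dict):
            problems.append(f"documents[{i}].metadata must be a dict")
    return problems
```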
656+
**metadata_filters** (search):
657+
- Type: dict
658+
- Optional: Yes
659+
- Format: `{"field_name": "value"}`
660+
- Behavior: AND logic (all filters must match)
661+
- Example: `{"author": "John", "category": "tech"}`
662+
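The AND semantics of `metadata_filters` can be illustrated with a few lines of Python. This is a sketch, not the server's implementation: `matches_filters` is a hypothetical name, and exact equality on each field is an assumption based on the `{"field_name": "value"}` format.

```python
def matches_filters(metadata: dict, filters: dict) -> bool:
    """True only if every filter key is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filters.items()
    )


# Every filter must match for a result to be kept.
results = [
    {"text": "a", "metadata": {"author": "John", "category": "tech"}},
    {"text": "b", "metadata": {"author": "John", "category": "food"}},
]
filters = {"author": "John", "category": "tech"}
kept = [r for r in results if matches_filters(r["metadata"], filters)]
```

With the sample data, only the first result satisfies both filters; an empty `filters` dict keeps everything.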
450663

451664
### Required for OpenAI Embeddings
452665
```bash
