An Advanced AI Assistant specifically designed for Istio Service Mesh.
Unlike generic chatbots, this agent uses Advanced RAG (Retrieval-Augmented Generation) with a Hybrid Search strategy. It ingests Istio source code, documentation, and GitHub Issues locally, uses a Cross-Encoder to verify relevance, and streams answers in real-time.
⚠️ Disclaimer: This is a Proof of Concept. Do not use in critical production environments without verification.
- 🧠 Advanced Retrieval: Uses Hybrid Search (Vector Semantic Search + BM25 Keyword Search) to find both high-level concepts and specific error codes.
- 🎯 AI Reranking: A dedicated Cross-Encoder model re-scores retrieved documents, filtering out irrelevant matches that could otherwise lead to hallucinated answers.
- ⚡ Real-Time Streaming: Answers are streamed token-by-token for an instant, responsive experience.
- 📎 Context Aware: Supports file uploads (YAML, Go, logs). Drag & drop a `virtualservice.yaml` or a log file, and the AI will analyze it.
- 🔍 Precision Citations: Displays the exact source file and a Relevance Score (0-100%) for every piece of information used.
- 🛡️ 100% Local Privacy: Runs locally using Ollama and local vector embeddings. No data leaves your machine.
The pipeline implements a "Retrieve & Rerank" strategy:
- Ingestion: Code & Docs are split and embedded into ChromaDB.
- Retrieval:
  - Vector Search: Finds semantic meaning (e.g., "traffic splitting").
  - BM25: Finds exact keyword matches (e.g., "Error 503-UC").
- Reranking: The `cross-encoder/ms-marco-MiniLM-L-6-v2` model reads the top 30 candidates and selects the Top 10 most relevant chunks.
- Generation: Ollama (Llama3/GPT-OSS) generates the answer using the curated context.
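The two-stage flow above can be sketched in plain Python. This is an illustrative toy, not the project's implementation: the corpus, the `keyword_score`/`vector_score`/`rerank_score` functions, and their formulas are stand-ins for BM25, the embedding index, and the cross-encoder respectively.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    source: str
    text: str

# Toy corpus standing in for the embedded Istio docs/code chunks.
CORPUS = [
    Chunk("docs/traffic.md", "virtual services support weighted traffic splitting"),
    Chunk("docs/errors.md", "a 503 UC response flag means the upstream connection terminated"),
    Chunk("docs/install.md", "istioctl install applies the default profile"),
]

def keyword_score(query: str, chunk: Chunk) -> float:
    """BM25 stand-in: fraction of query terms found in the chunk."""
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in chunk.text.lower())
    return hits / len(terms)

def vector_score(query: str, chunk: Chunk) -> float:
    """Embedding-similarity stand-in: Jaccard overlap of word sets."""
    q, c = set(query.lower().split()), set(chunk.text.lower().split())
    return len(q & c) / len(q | c)

def rerank_score(query: str, chunk: Chunk) -> float:
    """Cross-encoder stand-in: the real model does a joint (query, chunk)
    forward pass, which is slower but far more precise than either retriever."""
    return 0.5 * keyword_score(query, chunk) + 0.5 * vector_score(query, chunk)

def retrieve_and_rerank(query: str, top_k: int = 2) -> list[Chunk]:
    # Stage 1: candidates ranked by the better of the two retriever scores
    # (the real pipeline keeps the top 30).
    candidates = sorted(
        CORPUS,
        key=lambda c: max(keyword_score(query, c), vector_score(query, c)),
        reverse=True,
    )
    # Stage 2: rerank and keep the best top_k (the real pipeline keeps 10).
    return sorted(candidates, key=lambda c: rerank_score(query, c), reverse=True)[:top_k]

print(retrieve_and_rerank("503 UC upstream connection")[0].source)  # → docs/errors.md
```

The point of the split is cost: the cheap retrievers scan the whole corpus, while the expensive reranker only ever scores a small candidate set.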
Tech Stack:
- LLM: Ollama (default: `gpt-oss:20b`)
- Orchestration: LlamaIndex
- Reranker: SentenceTransformers (`ms-marco-MiniLM-L-6-v2`)
- Vector DB: ChromaDB
- Backend: FastAPI (Async Streaming)
- Frontend: HTML5/JS (No frameworks, pure WebSocket-like streaming)
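Token-by-token streaming in this stack boils down to an async generator that the backend forwards to the browser. A minimal sketch of the pattern, with a hypothetical `llm_tokens` generator standing in for Ollama's streaming API (in a real FastAPI endpoint you would return `StreamingResponse(llm_tokens(prompt))` instead of collecting):

```python
import asyncio
from typing import AsyncIterator

async def llm_tokens(prompt: str) -> AsyncIterator[str]:
    """Stand-in for the model's streaming API: yields the answer token by token."""
    for token in ["Istio ", "routes ", "traffic ", "via ", "Envoy."]:
        await asyncio.sleep(0)  # real code would await the next model token here
        yield token

async def stream_answer(prompt: str) -> str:
    # Collect the tokens to show the flow; a FastAPI endpoint would instead
    # hand the generator to StreamingResponse so bytes reach the client early.
    answer = ""
    async for token in llm_tokens(prompt):
        answer += token
    return answer

print(asyncio.run(stream_answer("How does Istio route traffic?")))
# → Istio routes traffic via Envoy.
```

Because each token is yielded as soon as it exists, the user sees the first words immediately instead of waiting for the full generation to finish.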
- Python 3.11+ installed.
- Ollama installed and running.
- Pull the model:

  ```bash
  ollama pull gpt-oss:20b
  ```

  (Note: you can change the model in `config.py`)
- Clone the repository:

  ```bash
  git clone https://github.com/ArnauSB/istio-ai-agent.git
  cd istio-ai-agent
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure environment:

  ```bash
  GITHUB_TOKEN=your_github_token_here
  ```

Before running the chat, you need to download and index the data.
- Ingest Code & Docs:

  ```bash
  python ingest_code.py
  ```

  This clones the Istio repositories and creates the vector embeddings.

- Ingest GitHub Issues (optional):

  ```bash
  python ingest_issues.py
  ```

  Downloads solved issues from the last year to learn from real-world problems.
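Ingestion splits documents into overlapping chunks before embedding them into ChromaDB. The project presumably relies on LlamaIndex's node parsers for this; the `split_into_chunks` function below is an illustrative sliding-window sketch of the idea, not the actual ingestion code:

```python
def split_into_chunks(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Sliding-window splitter: consecutive chunks share `overlap` characters,
    so a sentence cut at a boundary still appears whole in one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "DestinationRule subsets select pods by label; VirtualService routes pick a subset."
chunks = split_into_chunks(doc)
assert all(len(c) <= 40 for c in chunks)
# consecutive chunks share a 10-character overlap
assert chunks[0][-10:] == chunks[1][:10]
```

The overlap matters for retrieval quality: without it, a fact straddling a chunk boundary would never be fully contained in any single embedded chunk.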
Start the API server (Backend + Frontend):

```bash
python -m uvicorn api:app --reload --loop asyncio
```

Open your browser at: http://localhost:8000
Licensed under the Apache License, Version 2.0. See LICENSE for details.