All notable changes to the Visionary Tool Server project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Contributing guide with comprehensive development documentation
- Changelog following Keep a Changelog format
- 331 MCP tools across 64 modules providing comprehensive AI agent capabilities
- FastMCP 2.14.2 framework for Model Context Protocol server implementation
- Python 3.14.2 runtime with modern async/await architecture
- Starlette 0.50.0 ASGI framework for high-performance HTTP handling
- OAuth 2.1 with PKCE authentication system for secure API access
- Self-test health check suite for automated tool validation and monitoring
- Agent orchestrator with multi-agent chain execution capabilities
- SSE (Server-Sent Events) transport on port 8082 for real-time communication
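The SSE transport above delivers events as `data:`-prefixed lines terminated by a blank line. A minimal sketch of parsing such a stream (the event names and payloads here are made up for illustration; the port comes from the entry above, but the exact event format this server emits is not specified):

```python
# Minimal Server-Sent Events line parser -- an illustrative sketch,
# not this project's actual client. Follows the SSE wire format:
# "event:" / "data:" lines, with a blank line ending each event.
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE lines."""
    event, data = "message", []          # "message" is the SSE default event type
    for line in lines:
        line = line.rstrip("\n")
        if not line:                     # blank line terminates the event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())

# Example: a stream as a server on port 8082 might emit it
stream = ["event: tool_result", 'data: {"ok": true}', ""]
events = list(parse_sse(stream))
```

Note that an event is only emitted once its terminating blank line arrives, matching the SSE specification's framing rules.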
- OpenAI: GPT-4, GPT-3.5, embeddings, fine-tuning, DALL-E integration
- Anthropic: Claude 3 Opus/Sonnet/Haiku, extended context windows
- Groq: Ultra-fast inference, Mixtral, Llama models
- DeepSeek: Code-specialized models, competitive pricing
- Mistral: Mixtral 8x7B, Mistral Large, European AI compliance
- Grok: X.AI's conversational models
- Perplexity: Search-augmented generation, real-time information
- Cohere: Command models, embeddings, reranking
- Together: Open-source model hosting, custom deployments
- Fireworks: Optimized inference, function calling
- OpenRouter: Unified API for 100+ models
- HuggingFace: Open-source model hub integration
- MiniMax M2.1: Chinese language optimization
- GitHub (37 tools): Repository management, issues, pull requests, actions, releases, webhooks
- GitLab (26 tools): Project management, CI/CD pipelines, merge requests, wiki integration
- Jira: Issue tracking, sprint planning, workflow automation, Jira Query Language (JQL)
- Linear: Modern issue tracking, project workflows, cycle management
- ClickUp: Task management, time tracking, goals, custom fields
- Asana: Project planning, task dependencies, portfolio management
- Monday: Visual project boards, workflow automation, team collaboration
- Vercel: Serverless deployment, preview environments, edge functions, analytics
- Railway: Container deployment, PostgreSQL/Redis provisioning, environment management
- Cloudflare: DNS management, edge workers, caching, DDoS protection, SSL/TLS
- Datadog: Full-stack monitoring, APM, log aggregation, custom metrics, dashboards
- Prometheus: Metrics collection, alerting, PromQL queries, service discovery
- Grafana: Visualization, dashboard creation, alert management, data source integration
- Sentry: Error tracking, performance monitoring, release tracking, issue grouping
- ElevenLabs: Text-to-speech, voice cloning, multilingual support
- Replicate: AI model inference, Stable Diffusion, SDXL, custom models
- FAL: Fast AI inference, image generation, video processing
- Stability AI: Stable Diffusion, image editing, upscaling, inpainting
- Firecrawl: Web scraping, content extraction, structured data
- Exa: Semantic search, knowledge graph queries
- Brave Search: Privacy-focused search, no-tracking API
- Discord (21 tools): Server management, channels, messages, roles, webhooks, embeds, reactions
- Slack: Messaging, channels, threads, file sharing, app integration
- Twilio: SMS, voice calls, WhatsApp, phone number management
- SendGrid: Transactional email, templates, tracking, analytics
- Resend: Modern email API, React Email templates, webhook handling
- Stripe: Payment intents, subscriptions, customer management, invoicing, webhooks
- Obsidian: Vault integration, note creation, linking, search, tag management, daily notes
- Nemotron: NVIDIA's agentic AI capabilities, tool chaining, context management
- Agent Orchestrator: Multi-agent workflows, task delegation, result aggregation
- Razer AIKit: Local LLM inference with RTX 4090 GPU acceleration
- Model Management: Load, unload, list models, memory optimization
- Fine-tuning: LoRA training, model adaptation, checkpoint management
- Benchmarking: Performance testing, latency measurement, throughput analysis
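The multi-agent chain execution mentioned above amounts to sequential delegation with result aggregation. A toy sketch of that control flow (the agent callables are placeholders, not the orchestrator's actual API):

```python
# Toy multi-agent chain: each agent consumes the previous agent's output,
# and every intermediate result is kept for aggregation.
# Placeholder callables -- not the server's real orchestrator interface.
from typing import Callable, Iterable

def run_chain(agents: Iterable[Callable[[str], str]], task: str) -> list[str]:
    results, current = [], task
    for agent in agents:
        current = agent(current)     # delegate to the next agent in the chain
        results.append(current)      # aggregate each step's output
    return results

# Hypothetical two-stage chain: a planner feeding a coder
planner = lambda t: f"plan({t})"
coder = lambda t: f"code({t})"
steps = run_chain([planner, coder], "build feature")
```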
- Upgraded FastMCP from 1.x to 2.14.2 with improved SSE handling
- Migrated from Python 3.11 to 3.14.2 for performance improvements
- Updated Starlette to 0.50.0 for enhanced ASGI support
- Improved error handling with specific exception types and detailed messages
- Enhanced logging with structured context and request tracing
- Optimized Docker images for faster builds and smaller size
- OAuth 2.1 with PKCE replacing basic token authentication
- Environment variable validation preventing insecure configurations
- API key rotation support for zero-downtime credential updates
- Rate limiting per tool and per API key
- Request sanitization preventing injection attacks
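OAuth 2.1 makes PKCE mandatory for authorization-code flows. The verifier/challenge pair it requires can be sketched with the standard RFC 7636 S256 derivation (this is the generic algorithm, not this project's specific auth module):

```python
# PKCE code verifier / code challenge generation (RFC 7636, S256 method).
# Generic sketch -- the server's own auth implementation may differ in detail.
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    # 32 random bytes -> 43-char base64url verifier (spec allows 43-128 chars)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # challenge = BASE64URL(SHA256(verifier)), without "=" padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = make_pkce_pair()
```

The client sends the challenge with the authorization request and the verifier with the token exchange, so an intercepted authorization code alone is useless.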
- Async/await throughout for non-blocking I/O operations
- Connection pooling for external API calls
- Response caching for frequently accessed data
- Streaming responses for large payloads
- GPU acceleration for local LLM inference
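The response-caching entry above can be illustrated with a small time-to-live cache (an illustrative structure only, not the server's actual cache layer):

```python
# Tiny TTL cache illustrating the response-caching idea listed above.
# Illustrative only -- the real cache layer is not shown in this changelog.
import time

class TTLCache:
    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock           # injectable clock, handy for testing
        self._store: dict = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:   # entry went stale: evict and miss
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)
```

Pairing a cache like this with connection pooling keeps hot, repeated external API responses off the wire entirely.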
- 22 vLLM tools for local large language model inference
- RTX 4090 GPU support with CUDA optimization
- Model management: Load, unload, switch between models dynamically
- Fine-tuning capabilities: LoRA, QLoRA training on custom datasets
- Benchmarking suite: Performance metrics, latency analysis, throughput testing
- Memory optimization: KV cache management, quantization (4-bit, 8-bit)
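The 4-bit/8-bit quantization entry above comes down to bytes per parameter. A back-of-the-envelope estimate of weight memory (ignoring KV cache, activations, and framework overhead; not a vLLM API):

```python
# Rough GPU weight-memory estimate for quantized models.
# Back-of-the-envelope only: ignores KV cache, activations, and overhead.
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 2**30

# A 7B-parameter model at different precisions:
fp16 = weight_memory_gib(7e9, 16)   # ~13.0 GiB
int8 = weight_memory_gib(7e9, 8)    # ~6.5 GiB
int4 = weight_memory_gib(7e9, 4)    # ~3.3 GiB
```

At 4-bit, a 7B model's weights fit comfortably within an RTX 4090's 24 GiB of VRAM, leaving room for the KV cache that PagedAttention manages.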
- Docker Compose configuration for production deployment
- Multi-stage builds for optimized image size
- Health checks for container orchestration
- Volume mounting for model persistence
- Environment-based configuration for dev/staging/prod
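Environment-based configuration typically pairs with the validation mentioned in the Security section. A minimal sketch (the variable names here are illustrative, not the server's actual settings):

```python
# Minimal environment-driven configuration with required-variable validation.
# MCP_PORT / OAUTH_CLIENT_ID / APP_ENV are illustrative names, not the
# project's real settings.
import os

REQUIRED = ("MCP_PORT", "OAUTH_CLIENT_ID")

def load_config(env=os.environ) -> dict:
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        # Fail fast rather than start with an insecure/incomplete config
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return {
        "port": int(env["MCP_PORT"]),
        "oauth_client_id": env["OAUTH_CLIENT_ID"],
        "environment": env.get("APP_ENV", "dev"),   # dev / staging / prod
    }
```

Failing at startup on a missing credential is what prevents the "insecure configuration" failure mode called out in the Security section.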
- `vllm_load_model`: Load models with configurable parameters
- `vllm_generate`: Text generation with streaming
- `vllm_chat`: Conversational interface
- `vllm_embeddings`: Local embedding generation
- `vllm_fine_tune`: Model fine-tuning workflows
- `vllm_benchmark`: Performance testing
- `vllm_list_models`: Available model enumeration
- `vllm_unload_model`: Memory management
- Enhanced GPU utilization with tensor parallelism
- Improved model loading with automatic format detection
- Optimized VRAM usage with PagedAttention
- 50-100ms latency for local inference on RTX 4090
- 2000+ tokens/sec throughput for Llama 2 7B
- Multi-GPU support for larger models
- MCP server with FastMCP framework
- SSE transport on port 8082
- Basic health check endpoint at `/health`
- Docker support with Dockerfile and docker-compose.yml
- `github_create_issue`: Create issues with labels and assignees
- `github_list_issues`: List and filter repository issues
- `github_get_issue`: Get detailed issue information
- `github_create_pull_request`: Create PRs with descriptions
- `github_list_pull_requests`: List and filter PRs
- `github_merge_pull_request`: Merge PRs with options
- `github_create_repository`: Create new repositories
- `github_list_repositories`: List user/org repositories
- `github_create_branch`: Create new branches
- `github_list_branches`: List repository branches
- `github_get_commit`: Get commit details
- `github_list_commits`: List commit history
- `discord_send_message`: Send messages to channels
- `discord_create_channel`: Create text/voice channels
- `discord_list_channels`: List server channels
- `discord_create_role`: Create roles with permissions
- `discord_list_roles`: List server roles
- `discord_send_dm`: Send direct messages
- `discord_create_webhook`: Create channel webhooks
- `discord_list_webhooks`: List channel webhooks
- Type-safe API: Full type hints and validation
- Error handling: Structured error responses
- Logging: Request/response logging with context
- Configuration: Environment-based settings
- Documentation: OpenAPI/Swagger auto-generated docs
```
fastmcp==1.0.0
starlette==0.45.0
uvicorn==0.30.0
pydantic==2.8.0
httpx==0.27.0
python-dotenv==1.0.0
```
| Version | Release Date | Highlights |
|---|---|---|
| 4.0.0 | 2026-02-06 | 331 tools, 13 LLM providers, OAuth 2.1, agent orchestration |
| 2.0.0 | 2026-01-09 | vLLM integration, RTX 4090 support, local inference |
| 1.0.0 | 2025-12-01 | Initial release, GitHub + Discord, basic MCP server |
- Project Repository: https://github.com/YOUR_ORG/razer-aikit-v2
- Documentation: https://docs.visionary-tools.dev
- Issue Tracker: https://github.com/YOUR_ORG/razer-aikit-v2/issues
- Discussions: https://github.com/YOUR_ORG/razer-aikit-v2/discussions
Breaking Changes:
- OAuth 2.1 Required: Basic token authentication removed
- New Environment Variables: Many `*_TOKEN` variables renamed to `*_API_KEY`
- Tool Naming: Some tools renamed for consistency
Steps:
- Update `.env` file with OAuth credentials
- Replace `GITHUB_TOKEN` with `GITHUB_API_KEY`
- Update tool calls to use new names (see documentation)
- Test thoroughly before deploying
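The `.env` rename in the steps above can be scripted. A sketch covering the one rename the guide names explicitly (`GITHUB_TOKEN` to `GITHUB_API_KEY`); other renamed variables would be added to the same mapping:

```python
# Rewrite legacy *_TOKEN entries in .env-style text to their new names.
# Only GITHUB_TOKEN -> GITHUB_API_KEY is confirmed by this migration guide;
# extend RENAMES with the other pairs from the documentation as needed.
RENAMES = {"GITHUB_TOKEN": "GITHUB_API_KEY"}

def migrate_env_text(text: str) -> str:
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:                                # a KEY=value line
            out.append(RENAMES.get(key, key) + sep + value)
        else:                                  # comment or blank line
            out.append(line)
    return "\n".join(out)
```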
Breaking Changes:
- Python 3.14+: Minimum Python version increased
- FastMCP 2.x: Updated MCP protocol
Steps:
- Update Python to 3.14.2+
- Update dependencies: `pip install -r requirements.txt`
- Rebuild Docker images
- No tool API changes required
For detailed migration instructions, see MIGRATION.md
For contributing guidelines, see CONTRIBUTING.md