multi-modal-ai

Here are 24 public repositories matching this topic...

CollabAI is an open-source, self-hosted AI operation platform for small and medium-sized businesses. It is a customizable, team-centric platform that gives you access to custom AI agents tailored to your business needs.

  • Updated Aug 26, 2025
  • JavaScript

RAPTOR (Rapid AI-Powered Text and Object Recognition) is an AI-native Content Insight Engine that transforms passive media storage into an intelligent knowledge platform through automated analysis, semantic search, and actionable insights. RAPTOR reduces manual tagging by 85% and makes content discovery 10x faster.

  • Updated Feb 2, 2026
  • Python

Sheldon AI Assistant is a powerful and versatile Discord bot and web application that enhances user interaction and automates tasks within your Discord server and beyond. With advanced AI models, seamless integrations, an intuitive web interface, and a wide range of features, Sheldon is your go-to assistant for an engaging and productive experience.

  • Updated Aug 30, 2025

Advanced AI architecture integrating multi-modal reasoning, dynamic token optimization, and self-reflective learning loops. Designed for high efficiency, deep contextual understanding, and adaptive general intelligence across vision, language, and logic tasks—pushing beyond conventional transformer limits.

  • Updated Oct 19, 2025

Sales Forge is a high-performance, real-time voice interaction platform designed to train sales representatives through adaptive AI personas. It provides a low-latency, immersive roleplay experience that simulates real-world sales challenges.

  • Updated Jan 20, 2026
  • Python

Powerful framework for building applications with Large Language Models (LLMs), enabling seamless integration with memory, agents, and external data sources.

  • Updated Feb 13, 2025
  • Jupyter Notebook

An enterprise-grade educational platform leveraging advanced AI, real-time voice processing, and dynamic code generation to deliver personalized, adaptive learning experiences with multi-modal interactions, intelligent content synthesis, and production-ready architecture.

  • Updated Jan 29, 2026
  • TypeScript

Text-Vision-Agent is an AI-powered assistant that generates images from text descriptions and provides detailed image descriptions. It combines image generation using FluxPipeline with vision-based language models like ChatOllama, enabling seamless text-to-image and image interpretation interactions.

  • Updated Feb 16, 2025
  • Python

Color-based semantic routing for Apache Kafka - Tag events with RGB hex codes for flexible consumer-side filtering. Eliminates topic proliferation and enables dynamic routing without payload deserialization. Python reference implementation with validated 5x speedup over content-based routing.

  • Updated Nov 15, 2025
  • Python
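
The routing idea in this description can be illustrated with a minimal sketch. It is not the project's actual implementation: the `Record` type, the `color` header name, and the helper functions are hypothetical stand-ins, and no Kafka client is used. It assumes only that the color tag travels in record headers (key/bytes pairs), so a consumer can decide whether to keep a record without parsing its payload.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional, Set, Tuple


class Record(NamedTuple):
    """Stand-in for a Kafka record: headers are (key, bytes) pairs, value is opaque bytes."""
    headers: List[Tuple[str, bytes]]
    value: bytes


COLOR_HEADER = "color"  # hypothetical header name, not taken from the project


def color_of(record: Record) -> Optional[str]:
    """Return the color tag as lowercase hex (e.g. 'ff0000'), or None if the record is untagged."""
    for key, raw in record.headers:
        if key == COLOR_HEADER:
            return raw.decode("ascii").lstrip("#").lower()
    return None


def filter_by_color(records: Iterable[Record], wanted: Set[str]) -> Iterator[Record]:
    """Yield records whose color tag is in `wanted`; record.value is never deserialized."""
    for record in records:
        if color_of(record) in wanted:
            yield record


# Three sample records: two tagged, one untagged. Payloads stay as raw bytes.
stream = [
    Record([("color", b"#FF0000")], b'{"type": "alert"}'),
    Record([("color", b"#00FF00")], b'{"type": "metric"}'),
    Record([], b'{"type": "untagged"}'),
]

# A consumer interested only in red-tagged events.
alerts = list(filter_by_color(stream, {"ff0000"}))
```

Because the filter reads only the header bytes, records that do not match are skipped before any JSON parsing happens, which is the property the description attributes to consumer-side filtering.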

A Multi-Modal RAG Knowledge Engine: an intelligent knowledge graph system that ingests video, audio, and PDF documents to create a connected semantic web. Features a graph-based retrieval engine (GraphRAG), multi-modal search, and an interactive React Flow visualization dashboard. Built with FastAPI, Next.js, Neo4j, and LlamaIndex.

  • Updated Feb 5, 2026
  • Python

An intelligent travel planning platform powered by GPT-4 and DALL-E 3 that generates personalized, optimized itineraries with route optimization, budget allocation, and AI-generated visual content through advanced prompt engineering and multi-modal AI integration.

  • Updated Feb 3, 2026
  • Python
