sjinnovation / CollabAI

CollabAI is an open-source, self-hosted AI operations platform for small and medium-sized businesses. It is a customizable, team-centric platform that gives you access to custom AI agents tailored to your business needs.

openai claude gemini-api ai-platform openai-api claude-ai gpt4-api claude-api gpt-4-1106-preview openai-assistant-api collaborativeai openai-assistant selfhostedai multi-modal-ai gpt4o custom-ai-agents ai-for-agency ai-for-non-profit

Updated Aug 26, 2025
JavaScript

DHT-AI-Studio / RAPTOR

RAPTOR (Rapid AI-Powered Text and Object Recognition) is an AI-native Content Insight Engine that transforms passive media storage into an intelligent knowledge platform through automated analysis, semantic search, and actionable insights. RAPTOR reduces manual tagging by 85% and makes content discovery 10x faster.

nlp machine-learning microservices ai computer-vision deep-learning artificial-intelligence semantic-search ai-framework audio-processing content-analysis digital-asset-management video-analysis vector-database ai-automation llm multi-modal-ai content-intelligence ai-orchestration

Updated Feb 2, 2026
Python

CoRAL-ASU / weaver

(EMNLP 2025) Weaver: A modular agentic pipeline that dynamically combines SQL and LLMs for advanced table-based question answering

machine-learning natural-language-processing sql database research question-answering sql-agent table-qa ai-agent llm reasoning-agent multi-modal-ai ott-qa table-reasoning finqa wikitablequestions tabfact

Updated Jul 19, 2025
HTML

tomoima525 / daily-diary

Pipecat (Daily.co) x Gemini hackathon project. Talk to your agent and make your daily life memorable!

voice-assistant gemini-ai image-generation-ai multi-modal-ai pipecat-ai

Updated Nov 14, 2025
TypeScript

developtheweb / sheldon-ai-showcase

Sheldon AI Assistant is a powerful and versatile Discord bot and web application that enhances user interaction and automates tasks within your Discord server and beyond. With advanced AI models, seamless integrations, an intuitive web interface, and a wide range of features, Sheldon is your go-to assistant for an engaging and productive experience.

nodejs webgl threejs typescript discord-bot artificial-intelligence discord-js conversational-ai ai-chatbot ai-assistant multi-modal-ai spacial-computing

Updated Aug 30, 2025

dheeraj966 / NEXT_GEN-AI-MODEL---REVOLUTION-AI

Advanced AI architecture integrating multi-modal reasoning, dynamic token optimization, and self-reflective learning loops. Designed for high efficiency, deep contextual understanding, and adaptive general intelligence across vision, language, and logic tasks, pushing beyond conventional transformer limits.

agi multi-modal-ai adaptive-ai agentic-ai next-gen-ai trending-agi relevant-agi-2025 self-reflective-architectures benchmark-outperformer

Updated Oct 19, 2025

levrex / EHR-Clustering-RA

Cast different EHR (electronic health record) layers to a shared latent space to identify patient subtypes

machine-learning-algorithms ehr cluster-analysis clustering-algorithm clinical-research clinical-data unsupervised-machine-learning ehr-phenotyping rheumatoid-arthritis multi-modal-ai

Updated Oct 28, 2025
Jupyter Notebook

Tanush1912 / sales-forge-backend

Sales Forge is a high-performance, real-time voice interaction platform designed to train sales representatives through adaptive AI personas. It provides a low-latency, immersive roleplay experience that simulates real-world sales challenges.

python docker websockets postgresql roleplay real ai-agents conversational-ai fastapi voice-ai generative-ai sales-training sales-enablement multi-modal-ai coaching-platform

Updated Jan 20, 2026
Python

Md-Emon-Hasan / LangChain

Powerful framework for building applications with Large Language Models (LLMs), enabling seamless integration with memory, agents, and external data sources.

Updated Feb 13, 2025
Jupyter Notebook

SolitudeZY / AI_chat

A cosmic-themed, multi-modal AI chat application built with FastAPI and Vue 3, featuring web browsing capabilities, intelligent document analysis, and seamless integration with Deepseek and Qwen models.

vue3 fastapi multi-modal-ai

Updated Jan 7, 2026
Vue

metacore-stack / IntelliTeach

An enterprise-grade educational platform leveraging advanced AI, real-time voice processing, and dynamic code generation to deliver personalized, adaptive learning experiences with multi-modal interactions, intelligent content synthesis, and production-ready architecture.

react machine-learning typescript nextjs adaptive-learning personalized-learning educational-technology ai-education voice-ai ai-tutoring real-time-ai multi-modal-ai dynamic-code-generation

Updated Jan 29, 2026
TypeScript

Aish-p / Text-Vision-Agent

Text-Vision-Agent is an AI-powered assistant that generates images from text descriptions and provides detailed image descriptions. It combines image generation using FluxPipeline with vision-based language models like ChatOllama, enabling seamless text-to-image and image interpretation interactions.

generative-ai multi-modal-ai nlp-and-vision-integration chatollama fluxpipeline image-generation-and-description

Updated Feb 16, 2025
Python

nickcottrell / vrgb-kafka

Color-based semantic routing for Apache Kafka: tag events with RGB hex codes for flexible consumer-side filtering. Eliminates topic proliferation and enables dynamic routing without payload deserialization. Python reference implementation with validated 5x speedup over content-based routing.

python distributed-systems machine-learning kafka stream-processing apache-kafka real-time-processing message-broker event-routing multi-modal-ai

Updated Nov 15, 2025
Python

Diluksha-Upeka / Neurospace

A Multi-Modal RAG Knowledge Engine: an intelligent knowledge graph system that ingests video, audio, and PDF documents to create a connected semantic web. Features a graph-based retrieval engine (GraphRAG), multi-modal search, and an interactive React Flow visualization dashboard. Built with FastAPI, Next.js, Neo4j, and LlamaIndex.

python docker typescript neo4j nextjs knowledge-graph openai rag fastapi react-flow vector-database llamaindex genai multi-modal-ai graphrag

Updated Feb 5, 2026
Python

aditya-ai-architect / F.R.I.D.A.Y

3D Intelligence Engine - Real-time voice, vision & gesture control powered by Gemini 3.0 Pro + React Three Fiber

react threejs real-time typescript ai computer-vision production gcp google-cloud gemini 3d react-three-fiber mediapipe voice-ai vertex-ai llm generative-ai multi-modal-ai

Updated Feb 5, 2026
TypeScript

ShivamMishra1603 / video-xplore

AI video analysis + web research in one tool. Upload videos, ask questions, get comprehensive insights with current web data.

multi-modal-ai agentic-ai google-gemini-api

Updated Aug 30, 2025
Python

mwasifanwar / ChronoPredict

Multi-modal system that analyzes social media, news, art, and music to predict emerging cultural movements and artistic trends years before they go mainstream.

creative-ai cultural-analytics social-dynamics trend-prediction multi-modal-ai multi-modal-ai-analysis

Updated Nov 11, 2025
Python

delegatexai / practical-ai-agents

A curated list of AI agents (open-source & proprietary) that solve real-world problems. Updated regularly!

open-source machine-learning automation artificial-intelligence ai-agents ai-agents-framework llm-agents multi-modal-ai practical-ai-guide ai-agents-directory

Updated Apr 28, 2025

metacore-stack / WanderLens-AI

An intelligent travel planning platform powered by GPT-4 and DALL-E 3 that generates personalized, optimized itineraries with route optimization, budget allocation, and AI-generated visual content through advanced prompt engineering and multi-modal AI integration.

machine-learning ai openai fastapi itinerary-generator streamlit gpt-4 prompt-engineering travel-tech langchain dall-e-3 multi-modal-ai travel0planning

Updated Feb 3, 2026
Python

Lipeka / Multi-modal-Recommendation-System

A multi-modal recommender system that suggests books or music based on voice input, audio song recognition, typed queries, and real-time weather in your city.

python deep-learning artificial-intelligence speech-recognition gradio rag large-language-models multi-modal-ai

Updated Jun 16, 2025
Python

multi-modal-ai

Here are 24 public repositories matching this topic...
