Integrated Multimodal Content Generation: Image, Video, and Audio Support in Chat #7528
Navdeesh-Official started this conversation in Feature Ideas
Replies: 0 comments
Description
This feature request proposes adding native, integrated support for multimodal content generation—specifically image, video, and audio generation—directly within Jan.ai's chat interface, without relying on external Model Context Protocol (MCP) integrations.
Problem Statement
Currently, Jan.ai users who want to generate images, videos, or audio files must:
This fragmented approach disrupts the workflow and makes AI-assisted content creation in Jan.ai less efficient.
Proposed Solution
Integrate multimodal content generation capabilities directly into Jan.ai's chat interface:
Image Generation
Video Generation
Audio Generation
Technical Considerations
Benefits
✅ Unified Workflow: Keep users within Jan.ai for most AI tasks
✅ Better Privacy: Local execution reduces data transmission
✅ Improved Performance: No external API latency
✅ Enhanced UX: Seamless integration within chat interface
✅ Community-Driven: Leverage open-source models and community contributions
Implementation Priority
Use Cases
Challenges & Mitigation
Community Input
Would love to hear community feedback on:
Related: This aligns with Jan.ai's mission of providing a comprehensive, self-hosted AI platform that gives users full control and privacy.