Repositories list
44 repositories
- GLM-OCR: Accurate × Fast × Comprehensive
- GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
- An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
- Official Implementation of SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
- GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation.
- Pose Extraction & Rendering for SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
- GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning
- A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using autoregressive diffusio…
- Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
- [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
- CogView4, CogView3-Plus and CogView3 (ECCV 2024)
- [AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
- GPT4V-level open-source multi-modal model based on Llama3-8B