Repositories list
45 repositories
SAFER (Public)
SAFEGROUND (Public)
Soft-Thinking (Public): Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
GRIT (Public): Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images"
EvoScene (Public)
evoscene.github.io (Public)
EvoPresent (Public)
Morph4D (Public)
3dtown.github.io (Public)
MMWorld (Public): Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"
SafeKey (Public): [EMNLP 2025] Official code for the paper "SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning"
MSSBench (Public): [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"
(repository name not captured): Official repo for the paper "Mojito: Motion Trajectory and Intensity Control for Video Generation"
iReason (Public)
MLRM-Halu (Public)
VLMbench (Public): NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"
MiniGPT-5 (Public): Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
edit-room.github.io (Public)
EditRoom (Public)
MMIR (Public)
ProbMed (Public): [ACL 2025 Findings] "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"
(repository name not captured): Codebase of ACL 2023 Findings "Aerial Vision-and-Dialog Navigation"
llm_coordination (Public): Code repository for the NAACL 2025 paper "LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models"
swap-anything (Public): Official implementation of the ECCV paper "SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing"
ComCLIP (Public): Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
Screen-Point-and-Read (Public): Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"
via-video (Public)