UCSB ERIC Lab

45 repositories

SafePro
Public
This is the source code for the SafePro paper
Python
•
Other
•0•2•0•9•Updated Mar 10, 2026
eric-ai-lab.github.io
Public
HTML
•0•0•0•0•Updated Feb 20, 2026
SAFER
Public
ICLR2026 SAFER: Risk-Constrained Sample-then-Filter in Large Language Models
Python
•
MIT License
•1•7•0•0•Updated Feb 14, 2026
SAFEGROUND
Public
SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration
Python
•
MIT License
•2•9•0•0•Updated Feb 11, 2026
Soft-Thinking
Public
Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
soft-reasoning chain-of-thought-reasoning reasoning-models soft-thinking continous-space-reasoning soft-token concept-token
Python
•
MIT License
•39•319•1•0•Updated Jan 26, 2026
GRIT
Public
Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images"
reinforcement-learning visual-reasoning visual-grounding multimodal-reasoning grounded-reasoning thinking-with-image
Python
•
MIT License
•11•180•5•0•Updated Jan 16, 2026
DMLR
Public
Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"
Python
•4•64•2•0•Updated Dec 17, 2025
EvoScene
Public
Jupyter Notebook
•
Apache License 2.0
•2•20•0•0•Updated Dec 15, 2025
evoscene.github.io
Public
JavaScript
•0•0•0•0•Updated Dec 10, 2025
EvoPresent
Public
[ICLR26] Official codebase for the paper "Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations"
Python
•22•335•2•0•Updated Oct 14, 2025
Morph4D
Public
Jupyter Notebook
•2•33•1•0•Updated Oct 13, 2025
3dtown.github.io
Public
JavaScript
•0•0•0•0•Updated Oct 4, 2025
MMWorld
Public
Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"
evaluation video-understanding video-dataset multi-disciplinary multimodal-large-language-models world-model
Python
•
MIT License
•1•28•0•0•Updated Jul 15, 2025
SafeKey
Public
[EMNLP 2025] Official code for the paper "SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning"
ai-safety llm-safety large-reasoning-models safety-reasoning
Python
•1•14•0•0•Updated Jun 30, 2025
MSSBench
Public
[ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"
safety ai-agents situational-awareness ai-assistant large-language-models multimodal-large-language-models
Python
•
MIT License
•2•30•3•0•Updated Jun 23, 2025
Mojito
Public
Official repo for the paper "Mojito: Motion Trajectory and Intensity Control for Video Generation"
motion-control video-generation diffusion-models controllable-generation text-to-video-generation
Python
•1•33•0•0•Updated Jun 11, 2025
iReason
Public
Official code for paper "Hidden in Plain Sight: Probing Implicit Reasoning in Multimodal Language Models"
Python
•1•5•0•0•Updated Jun 4, 2025
MLRM-Halu
Public
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
Python
•5•0•0•0•Updated May 31, 2025
VLMbench
Public
NeurIPS 2022 Paper "VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation"
language-grounding vision-and-language robotic-manipulation compositionality embodied-ai
Python
•
MIT License
•8•99•5•0•Updated May 8, 2025
MiniGPT-5
Public
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
transformers diffusion-models multimodal-generation multimodal-llm
Python
•
Apache License 2.0
•52•863•12•0•Updated May 8, 2025
edit-room.github.io
Public
JavaScript
•0•0•0•0•Updated Apr 1, 2025
EditRoom
Public
[ICLR 2025] EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
Python
•
MIT License
•5•23•0•1•Updated Apr 1, 2025
MMIR
Public
[ACL 2025 Findings] "Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models"
Python
•0•14•2•0•Updated Feb 25, 2025
ProbMed
Public
[ACL 2025 Findings] "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"
evaluation vision-and-language medical-vqa medical-diagnosis llms large-multimodal-models
Python
•2•25•1•0•Updated Feb 21, 2025
Aerial-Vision-and-Dialog-Navigation
Public
Codebase of ACL 2023 Findings "Aerial Vision-and-Dialog Navigation"
navigation aerial-imagery drone-navigation vision-and-language vln
Python
•7•62•5•0•Updated Nov 4, 2024
llm_coordination
Public
Code repository for the NAACL 2025 paper "LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models"
multiagent llms coordination-game agent-coordination
Python
•
MIT License
•7•44•1•0•Updated Oct 13, 2024
swap-anything
Public
Official implementation of the ECCV paper "SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing"
image-editing personalization diffusion-models subject-driven-generation photoswapping swap-anything
Python
•
MIT License
•14•262•5•0•Updated Oct 10, 2024
ComCLIP
Public
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
causality clip svo slip vision-and-language compositionality flickr8k-dataset image-text-matching flickr30k image-text-retrieval
Python
•
MIT License
•5•38•0•1•Updated Aug 18, 2024
Screen-Point-and-Read
Public
Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"
screen-reader ai-agents grounding gui-agents tree-of-lens layout-understanding
Python
•4•29•0•0•Updated Jul 31, 2024
via-video
Public
•0•26•1•0•Updated Jun 20, 2024