@mbzuai-oryx

ORYX

A Library for Large Vision-Language Models

Popular repositories

  1. Awesome-LLM-Post-training Public

    Awesome Reasoning LLM Tutorial/Survey/Guide

    Python 2.3k 154

  2. Video-ChatGPT Public

    [ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

    Python 1.5k 127

  3. groundingLMM Public

    [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

    Python 945 53

  4. LLaVA-pp Public

    🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

    Python 848 61

  5. GeoChat Public

    [CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing

    Python 691 61

  6. MobiLlama Public

    [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices

    Python 668 53

Repositories

Showing 10 of 46 repositories
  • OpenEarthAgent Public

    OpenEarthAgent is a unified framework for tool-augmented geospatial agents.

    Python 8 0 0 0 Updated Feb 20, 2026
  • ThinkGeo Public

    ThinkGeo is a comprehensive benchmark for evaluating tool-augmented agents on remote sensing tasks.

    Python 59 Apache-2.0 3 3 0 Updated Feb 20, 2026
  • CoVR-R Public

    Reasoning Aware Composed Video Retrieval

    0 Apache-2.0 0 0 0 Updated Feb 8, 2026
  • DuwatBench Public

    [EACL Accepted 🔥🔥] DuwatBench: A Benchmark for Arabic Calligraphy Understanding 🖋️📜

    Python 1 Apache-2.0 1 0 0 Updated Jan 28, 2026
  • VideoMathQA Public

    VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos

    22 1 0 0 Updated Jan 26, 2026
  • LongShOT Public

    A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

    Python 12 0 0 0 Updated Jan 24, 2026
  • Video-R2 Public

    🔥🔥 Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models

    Python 14 0 0 0 Updated Jan 21, 2026
  • UniMed-CLIP Public

    Official repository of paper titled "UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities".

    Python 157 16 4 0 Updated Jan 19, 2026
  • Video-CoM Public

    Video-CoM: Interactive Video Reasoning via Chain of Manipulations

    18 0 2 0 Updated Dec 1, 2025
  • Agent-X Public

    Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks

    Jupyter Notebook 36 4 2 0 Updated Nov 27, 2025
