    Repositories list

    • HTML
      1000 Updated Sep 14, 2025
    • A curated list of balanced multimodal learning methods.
      410010 Updated Aug 1, 2025
    • MGIPF

      Public
      The repo for "MGIPF: Multi-Granularity Interest Prediction Framework for Personalized Recommendation", SIGIR 2025
      Python
      0200 Updated Jul 26, 2025
    • WCAE

      Public
      Python
      0000 Updated Jul 1, 2025
    • MokA

      Public
      MokA: Multimodal Low-Rank Adaptation for MLLMs
      Python
      32360 Updated Jun 27, 2025
    • MS-Bot

      Public
      The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)
      Python
      31510 Updated Jun 25, 2025
    • AnyTouch

      Public
      The repo for "AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors", ICLR 2025
      Python
      56520 Updated Jun 25, 2025
    • Official repo for ICML 2025 paper "RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer"
      Python
      2610 Updated Jun 21, 2025
    • A Python implementation of Certifiable Robust Multi-modal Training
      Python
      01810 Updated Jun 21, 2025
    • JavaScript
      0000 Updated Jun 20, 2025
    • [CVPR2025] Code Release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
      Python
      01400 Updated Jun 17, 2025
    • JavaScript
      0000 Updated Jun 8, 2025
    • The official repo for "Efficient Quantification of Multimodal Interaction at Sample Level", ICML 2025
      Python
      1600 Updated Jun 5, 2025
    • Crab

      Public
      [CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
      Python
      26930 Updated Jun 4, 2025
    • Python
      01200 Updated Apr 30, 2025
    • LFAV

      Public
      Towards Long Form Audio-visual Video Understanding
      Python
      01510 Updated Apr 27, 2025
    • This is the repo for "Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition", CVPR2025.
      Python
      2800 Updated Mar 31, 2025
    • The official repo for "Can Textual Semantics Mitigate Sounding Object Segmentation Preference?", ECCV 2024
      Python
      0510 Updated Mar 1, 2025
    • Python
      02540 Updated Feb 23, 2025
    • The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)
      Python
      22283360 Updated Jan 8, 2025
    • Ref-AVS

      Public
      The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
      Python
      24600 Updated Dec 4, 2024
    • A curated list of audio-visual learning methods and datasets.
      1926600 Updated Dec 3, 2024
    • The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024
      Python
      35460 Updated Nov 5, 2024
    • TSPM

      Public
      Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
      Python
      11740 Updated Oct 25, 2024
    • The repo for "KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance", CoRL 2024
      Python
      1700 Updated Oct 17, 2024
    • The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
      Python
      11710 Updated Oct 11, 2024
    • Official repository for "Unveiling and Mitigating Bias in Audio Visual Segmentation" in ACM MM 2024
      Python
      0600 Updated Oct 10, 2024
    • The repo for "On-the-fly Modulation for Balanced Multimodal Learning", T-PAMI 2024
      Python
      11720 Updated Sep 29, 2024
    • Python
      01830 Updated Aug 21, 2024
    • A Python implementation of Geometric-Inspired Graph-based Incomplete Multi-view Clustering
      Python
      1900 Updated Aug 16, 2024