amusi/ICCV2025-Papers-with-Code

ICCV 2025 Papers and Open-Source Projects Collection (Papers with Code)

ICCV 2025 Acceptance Rate of 24% = 2699 / 11239

Note 1: Everyone is welcome to submit an issue to share ICCV 2025 papers and open-source projects!

Note 2: For papers from previous top CV conferences, other high-quality CV papers, and roundups, see: https://github.com/amusi/daily-paper-computer-vision

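Scan the QR code to join the CVer Academic Exchange Group and get the latest work from ICCV 2025 and beyond! This is the largest computer vision AI Knowledge Planet (知识星球) community. It is updated daily and shares the newest learning materials on computer vision, AIGC, diffusion models, multimodality, deep learning, autonomous driving, medical imaging, remote sensing, and more as soon as they appear. Join and start learning!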

[ICCV 2025 Papers and Open-Source Code: Table of Contents]

3DGS(Gaussian Splatting)

Agent

Avatars

Backbone

TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba

CLIP

Mamba

TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Embodied AI

GAN

OCR

NeRF

DETR

Prompt

Multimodal Large Language Models (MLLM)

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers

Large Language Models (LLM)

World Model

Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

ReID (Re-Identification)

Diffusion Models

From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Vision Transformer

Vision-Language

Object Detection

Anomaly Detection

Object Tracking

Medical Image

Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

Medical Image Segmentation

Autonomous Driving

Where, What, Why: Towards Explainable Driver Attention Prediction

ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones

DriveMM: All-in-One Large Multimodal Model for Autonomous Driving

3D Point Cloud

3D Object Detection

3D Semantic Segmentation

Low-level Vision

EAMamba: Efficient All-Around Vision State Space Model for Image Restoration

Super-Resolution

Denoising

Image Denoising

3D Human Pose Estimation

3D Visual Grounding

Image Generation

DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models

Video Generation

Image Editing

Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing

Video Editing

3D Generation

3D Reconstruction

Human Motion Generation

Video Understanding

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Embodied AI

Knowledge Distillation

Depth Estimation

Stereo Matching

Low-light Image Enhancement

Image Compression

Scene Graph Generation

Style Transfer

Image Quality Assessment

Video Quality Assessment

Compressive Sensing

Datasets

ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Others

Music Grounding by Short Video