1 change: 1 addition & 0 deletions README.md
@@ -109,6 +109,7 @@ Designed to help researchers and practitioners explore, compare, and build state
##### b-3: Learnable Query Encoding
| Name | Title | Venue | Date | Code | Demo |
| :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---- | :--------- | :----------------------------------------------------------- | :------------------------------------------------------------------- |:----|
| Puffin | [Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation](https://arxiv.org/abs/2510.08673) ![GitHub Repo stars](https://img.shields.io/github/stars/KangLiao929/Puffin?style=social) | arXiv | 2025/10/09 | [Github](https://github.com/KangLiao929/Puffin) | - |
| TBAC-UniImage | [TBAC-UniImage: Unified Understanding and Generation by Ladder-Side Diffusion Tuning](https://arxiv.org/abs/2508.08098) ![GitHub Repo stars](https://img.shields.io/github/stars/DruryXu/TBAC-UniImage?style=social) | arXiv | 2025/08/11 | [Github](https://github.com/DruryXu/TBAC-UniImage) | - |
| UniLIP | [UniLIP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing](https://arxiv.org/abs/2507.23278) | arXiv | 2025/07/31 | - | - |
| Ming-Omni | [Ming-Omni: A Unified Multimodal Model for Perception and Generation](https://arxiv.org/abs/2506.09344) | arXiv | 2025/06/09 | - | - |