Commit b9f25cd ("add")
1 parent 2765099

4 files changed: +24 −9 lines


README.md

Lines changed: 24 additions & 9 deletions
@@ -43,15 +43,15 @@ However, compared to task-specific DG models, FM offers increased task diversity
 - [Optimization Strategy](#optimization-strategy)
 - [Model Test Level](#model-test-level)
 - [Test-time Adaptation](#test-time-adaptation)
-- [Universal Foundation Model](#universal-foundation-model)
-- [Survey](#survey)
-- [Visual Prompted Models](#visual-prompted-models)
-- [Interactive](#interactive)
-- [Few-shot/One-shot](#few-shotone-shot)
-- [Textual Prompted Models](#textual-prompted-models)
-- [Contrastive](#contrastive)
-- [Generative](#generative)
-- [Conversational](#conversational)
+- [Universal Foundation Model](#universal-foundation-model)
+- [Survey](#survey)
+- [Visual Prompted Models](#visual-prompted-models)
+- [Interactive](#interactive)
+- [Few-shot/One-shot](#few-shotone-shot)
+- [Textual Prompted Models](#textual-prompted-models)
+- [Contrastive](#contrastive)
+- [Generative](#generative)
+- [Conversational](#conversational)
 - [Datasets](#datasets)
 - [Libraries](#libraries)
 - [Other Resources](#other-resources)
@@ -281,9 +281,24 @@ Self-supervised learning is a machine learning method where a model learns gener
 ### Textual Prompted Models

 #### Contrastive
+> Contrastive textually prompted models are increasingly used as foundation models for medical imaging. They learn representations that capture the semantics of, and relationships between, medical images and their corresponding textual prompts: a contrastive learning objective pulls matched image-text pairs together in the feature space while pushing mismatched pairs apart.
+
+| Diagram | Description |
+|:-----------------:|:------------|
+| <img src="images/img59.png" width="900"> | <li>Title: <a href="https://link.springer.com/chapter/10.1007/978-3-031-72378-0_67">MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise</a></li> <li>Publication: MICCAI 2024</li> <li>Proposes MM-Retinal, a multi-modal dataset of high-quality image-text pairs collected from professional fundus diagram books. Building on MM-Retinal, presents a knowledge-enhanced foundational pretraining model that incorporates fundus image-text expertise, using image similarity-guided text revision and a mixed training strategy to infuse expert knowledge.</li> <li>Code: <a href="https://github.com/lxirich/MM-Retinal">https://github.com/lxirich/MM-Retinal</a></li> |
+| <img src="images/img58.png" width="900"> | <li>Title: <a href="https://www.nature.com/articles/s41467-023-40260-7">Knowledge-enhanced Visual-Language Pre-training on Chest Radiology Images</a></li> <li>Publication: Nature Communications 2023</li> <li>Proposes Knowledge-enhanced Auto Diagnosis (KAD), which leverages existing medical domain knowledge to guide vision-language pre-training on paired chest X-rays and radiology reports.</li> |
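The objective sketched in the note above (matched image-text pairs pulled together, mismatched pairs pushed apart) is typically a symmetric InfoNCE loss, as popularized by CLIP. A minimal NumPy sketch with random stand-in embeddings; function and variable names are illustrative, not taken from any cited codebase:

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (B, B); matched pairs on diagonal
    labels = np.arange(len(logits))

    def xent(l):
        # cross-entropy of the diagonal (correct) entry in each row
        l = l - l.max(axis=1, keepdims=True)              # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the image->text and text->image directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
B, D = 8, 32
img_emb = rng.normal(size=(B, D))
txt_emb = rng.normal(size=(B, D))
print(clip_style_loss(img_emb, txt_emb))
```

Perfectly aligned embeddings drive the loss toward zero, while unrelated embeddings stay near log(B); papers in this table differ mainly in how the text branch is enriched with domain knowledge before this objective is applied.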
 
 #### Generative
 
+#### Conversational
+> Conversational textually prompted models are fine-tuned on instruction sets to support interactive dialogue between medical professionals and the model: clinicians can ask questions, give instructions, and request explanations about medical images.
+
+| Diagram | Description |
+|:-----------------:|:------------|
+| <img src="images/img57.png" width="900"> | <li>Title: <a href="https://arxiv.org/pdf/2409.17508">Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE</a></li> <li>Publication: NeurIPS 2024</li> <li>Proposes Uni-Med, a medical generalist foundation model consisting of a universal visual feature extraction module, a connector mixture-of-experts (CMoE) module, and a large language model (LLM). The CMoE, a well-designed router over a mixture of projection experts at the connector, offers an efficient solution to the tug-of-war problem in multi-task learning. Uni-Med performs six medical tasks: question answering, visual question answering, report generation, referring expression comprehension, referring expression generation, and image classification.</li> <li>Code: <a href="https://github.com/MSIIP/Uni-Med">https://github.com/MSIIP/Uni-Med</a></li> |
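The connector-MoE idea in the Uni-Med entry above — routing visual tokens through a mixture of projection experts before the LLM — can be sketched in a few lines of NumPy. This is a generic soft-routing sketch under assumed shapes and a simple softmax gate; the names, gating details, and dimensions are illustrative, not the paper's actual implementation:

```python
import numpy as np

def connector_moe(features, expert_weights, router_weights):
    """Soft mixture-of-experts connector: each token's projection is a
    router-weighted sum of per-expert linear projections.

    features:       (T, D_in)        visual tokens
    expert_weights: (E, D_in, D_out) one projection matrix per expert
    router_weights: (D_in, E)        router producing per-token expert scores
    """
    scores = features @ router_weights                    # (T, E)
    scores = scores - scores.max(axis=1, keepdims=True)   # stable softmax gate
    gates = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    # apply every expert's projection, then mix by the gate weights
    expert_out = np.einsum('td,edo->teo', features, expert_weights)  # (T, E, D_out)
    return np.einsum('te,teo->to', gates, expert_out)                # (T, D_out)

rng = np.random.default_rng(1)
T, D_in, D_out, E = 16, 64, 128, 4
out = connector_moe(rng.normal(size=(T, D_in)),
                    rng.normal(size=(E, D_in, D_out)) / np.sqrt(D_in),
                    rng.normal(size=(D_in, E)))
print(out.shape)  # (16, 128)
```

Because the gate is computed per token, different tasks can learn to favor different projection experts, which is the intuition behind using an MoE connector to ease multi-task interference.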
 # Datasets
 > We list the widely used benchmark datasets for domain generalization, including classification and segmentation.
images/img57.png (407 KB)
images/img58.png (554 KB)
images/img59.png (1.34 MB)
0 commit comments
