In this article, you learn how to deploy CXRReportGen as an online endpoint for real-time inference. You learn how to:

* Grant permissions to the endpoint.
* Send test data to the model, and receive and interpret the results.
## CXRReportGen - grounded report generation model for chest X-rays
Radiology reporting demands detailed image understanding, integration of multiple inputs (including comparisons with prior imaging), and precise language generation, making it an ideal candidate for generative multimodal models. CXRReportGen not only generates a list of findings from a chest X-ray study, but also extends the task by localizing individual findings on the image, a task we refer to as grounded report generation.
The animation below demonstrates the conceptual architecture of the CXRReportGen model, which consists of an embedding model paired with a general reasoner LLM.
For deployment to a self-hosted managed compute, you must have enough quota in your subscription.

> [!div class="nextstepaction"]
> [Deploy the model to managed compute](../../concepts/deployments-overview.md)
## Work with a grounded report generation model for chest X-ray analysis
### Using the REST API to consume the model
The response payload is a JSON-formatted string.

### Supported image formats
The deployed model API supports images encoded in PNG or JPEG formats. For optimal results, we recommend using uncompressed/lossless PNGs with 8-bit monochromatic images.
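To illustrate consuming a deployed endpoint, the following Python sketch assembles a request body with a base64-encoded image. The field names (`frontal_image`, `indication`) and the overall payload shape are illustrative assumptions, not the documented schema; check the scoring schema of your deployed endpoint.

```python
import base64
import json


def build_cxr_request(image_bytes: bytes, indication: str = "") -> str:
    """Build an illustrative JSON request body for a CXRReportGen endpoint.

    NOTE: the field names ("frontal_image", "indication") are hypothetical
    placeholders; consult your endpoint's scoring schema for the real ones.
    """
    payload = {
        "input_data": {
            # Images are sent as base64-encoded PNG or JPEG bytes.
            "frontal_image": base64.b64encode(image_bytes).decode("utf-8"),
            # Optional clinical context for the study.
            "indication": indication,
        }
    }
    return json.dumps(payload)
```

The resulting string can then be posted to the endpoint's scoring URL with the usual bearer-token authorization header.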
|`get_scaling_factor`|`boolean`| N<br/>`True`|`"True"` OR `"False"`| Whether the model should return the "temperature" scaling factor. This factor is useful when you plan to compare multiple cosine similarity values in an application like classification. It's essential for the correct implementation of "zero-shot" scenarios. For usage, refer to the zero-shot classification example linked in the samples section. |
### Request example
The `params` object contains the following fields, shown here in an abbreviated request example:

```json
{
  ...
  "params": {
    "get_scaling_factor": "true"
  }
}
```
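As a sketch of assembling such a request in Python: note that, per the request example, the flag is serialized as the string `"true"`/`"false"` rather than a JSON boolean. The `input_data` field names below are illustrative placeholders; check your endpoint's scoring schema.

```python
import json


def build_embedding_request(image_b64: str, text: str,
                            get_scaling_factor: bool = True) -> str:
    """Assemble a request that also asks the model for its "temperature"
    scaling factor. The "input_data" layout is a hypothetical placeholder."""
    payload = {
        "input_data": {
            "image": image_b64,  # hypothetical field name
            "text": text,        # hypothetical field name
        },
        "params": {
            # Serialized as a string, matching the request example above.
            "get_scaling_factor": "true" if get_scaling_factor else "false",
        },
    }
    return json.dumps(payload)
```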
The submitted text is embedded into the same latent space as the image. If you're fine-tuning the model, you can change these parameters to better suit your application needs.
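Because image and text land in the same latent space, zero-shot classification reduces to comparing an image embedding against the embeddings of candidate label texts. A minimal sketch (the embedding vectors here are stand-ins for what the model returns):

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_emb, label_embs, scaling_factor=1.0):
    """Pick the text label whose embedding is most similar to the image
    embedding. The "temperature" scaling factor returned by the model
    (see `get_scaling_factor`) can be applied when turning similarities
    into probabilities; here it simply scales the scores uniformly."""
    scores = {label: scaling_factor * cosine_similarity(image_emb, emb)
              for label, emb in label_embs.items()}
    return max(scores, key=scores.get)
```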
### Supported image formats
The deployed model API supports images encoded in PNG format.
Upon receiving the images, the model preprocesses them, compressing and resizing to `512x512` pixels.
The preferred format is lossless PNG containing either an 8-bit monochromatic or RGB image. For optimization purposes, you can perform resizing on the client side to reduce network traffic.
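Since the service downsizes to `512x512` anyway, the client can pre-shrink large images before base64-encoding them. A sketch of the size computation (pure Python; the actual resampling would be done with an imaging library such as Pillow):

```python
def client_resize_dims(width: int, height: int, target: int = 512) -> tuple[int, int]:
    """Return dimensions that shrink an image so its long side is at most
    `target` pixels, preserving aspect ratio. Images that are already small
    enough are left untouched (upscaling adds bytes without adding detail)."""
    longest = max(width, height)
    if longest <= target:
        return width, height
    scale = target / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```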
MedImageInsight is a versatile model that can be applied to a wide range of tasks and imaging modalities. For more examples, see the following interactive Python notebooks:

* [Deploying and Using MedImageInsight](https://aka.ms/healthcare-ai-examples-mi2-deploy): learn how to deploy the MedImageInsight model programmatically and issue an API call to it.

### Classification techniques

* [Building a Zero-Shot Classifier](https://aka.ms/healthcare-ai-examples-mi2-zero-shot): discover how to use MedImageInsight to create a classifier without the need for training or a large amount of labeled ground truth data.

* [Enhancing Classification with Adapter Networks](https://aka.ms/healthcare-ai-examples-mi2-adapter): improve classification performance by building a small adapter network on top of MedImageInsight.
In this article, you learn how to deploy MedImageParse as an online endpoint for real-time inference. You learn how to:

* Send test data to the model, and receive and interpret the results.

## MedImageParse - prompt-based segmentation of medical images
Biomedical image analysis is crucial for discovery in fields like cell biology, pathology, and radiology. Traditionally, tasks such as segmentation, detection, and recognition of relevant objects have been addressed separately, which can limit the overall effectiveness of image analysis. MedImageParse unifies these tasks through image parsing, jointly conducting segmentation, detection, and recognition across numerous object types and imaging modalities. By leveraging the interdependencies among these subtasks, such as the semantic labels of segmented objects, the model enhances accuracy and enables novel applications. For instance, it allows users to segment all relevant objects in an image using a simple text prompt, eliminating the need to manually specify bounding boxes for each object.
The image below shows the conceptual architecture of the MedImageParse model, where an image embedding model is augmented with a task adaptation layer to produce segmentation masks and textual descriptions.
For deployment to a self-hosted managed compute, you must have enough quota in your subscription.

> [!div class="nextstepaction"]
> [Deploy the model to managed compute](../../concepts/deployments-overview.md)
## Work with a segmentation model
### Using the REST API to consume the model
The response payload is a list of JSON-formatted strings, each corresponding to a submitted image.

### Supported image formats
The deployed model API supports images encoded in PNG format. For optimal results, we recommend using uncompressed/lossless PNGs with RGB images.
As described above in the API specification, the model only accepts images in the resolution of `1024x1024` pixels. Images need to be resized and padded (in the case of a non-square aspect ratio).
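The resize-and-pad step can be computed on the client before encoding the image. A sketch of the geometry (pure Python; the actual pixel operations would be done with an imaging library such as Pillow):

```python
def letterbox_geometry(width: int, height: int, target: int = 1024):
    """Compute the resize dimensions and padding needed to fit an image of
    (width, height) into a target x target square while preserving the
    aspect ratio: scale the long side to `target`, then pad the short side.

    Returns ((new_width, new_height), (left, top, right, bottom)) where the
    second tuple is the padding, split evenly between the two sides.
    """
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_w, pad_h = target - new_w, target - new_h
    padding = (pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2)
    return (new_w, new_h), padding
```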
See the "Generating Segmentation for a Variety of Imaging Modalities" notebook for techniques and sample code useful for submitting images of various sizes stored in a variety of biomedical imaging formats.
## Learn more from samples
MedImageParse is a versatile model that can be applied to a wide range of tasks and imaging modalities. For more examples, see the following interactive Python notebooks:
The healthcare industry is undergoing a revolutionary transformation driven by the power of artificial intelligence (AI). While existing large language models like GPT-4 show tremendous promise for clinical text-based tasks and general-purpose multimodal reasoning, they struggle to understand non-text multimodal healthcare data such as medical imaging (radiology, pathology, ophthalmology) and other specialized medical text like longitudinal electronic medical records. They also find it challenging to process non-text modalities like signal data, genomic data, and protein data, much of which isn't publicly available.
The [Azure AI model catalog](../model-catalog-overview.md) provides foundational healthcare AI models that facilitate AI-powered analysis of various medical data types and expand well beyond medical text comprehension into multimodal reasoning about medical data. These AI models can integrate and analyze data from diverse sources that come in various modalities, such as medical imaging, genomics, clinical records, and other structured and unstructured data sources. The models also span several healthcare fields like dermatology, ophthalmology, radiology, and pathology.
:::image type="content" source="../../media/how-to/healthcare-ai/connect-modalities.gif" alt-text="Models that reason about various modalities come together to support discovery, development, and delivery of healthcare.":::
In this article, you learn about Microsoft's catalog of foundational multimodal healthcare AI models. The models were developed in collaboration with Microsoft Research, strategic partners, and leading healthcare institutions. Healthcare organizations can use the models to rapidly build and deploy AI solutions tailored to their specific needs, while minimizing the extensive compute and data requirements typically associated with building multimodal models from scratch. The intention isn't for these models to serve as standalone products; rather, they're designed as a foundation for developers to build upon. With these healthcare AI models, professionals have the tools they need to harness the full potential of AI to enhance biomedical research, clinical workflows, and ultimately care delivery.
## Microsoft first-party models
The following models are Microsoft's first-party foundational multimodal healthcare AI models.
This model is an embedding model that enables sophisticated image analysis, including classification and similarity search in medical imaging. Researchers can use the model embeddings directly or build adapters for their specific tasks, thereby streamlining workflows in radiology, pathology, ophthalmology, dermatology, and other modalities. For example, the model can be used to build tools that automatically route imaging scans to specialists or flag potential abnormalities for further review. These actions can improve efficiency and patient outcomes. Furthermore, the model can be used for Responsible AI (RAI) safeguards such as out-of-distribution (OOD) detection and drift monitoring, to maintain stability and reliability of AI tools and data pipelines in dynamic medical imaging environments.
### [CXRReportGen](./deploy-cxrreportgen.md)
Chest X-rays are the most common radiology procedure globally. They're crucial because they help doctors diagnose a wide range of conditions, from lung infections to heart problems. These images are often the first step in detecting health issues that affect millions of people. This multimodal AI model incorporates current and prior images along with key patient information to generate detailed, structured reports from chest X-rays. The reports highlight AI-generated findings directly on the images to align with human-in-the-loop workflows. This capability accelerates turnaround times while enhancing the diagnostic precision of radiologists.
### [MedImageParse](./deploy-medimageparse.md)
This model is designed for precise image segmentation, and it covers various imaging modalities, including X-rays, CT scans, MRIs, ultrasounds, dermatology images, and pathology slides. The model can be fine-tuned for specific applications, such as tumor segmentation or organ delineation, allowing developers to build tools on top of this model that leverage AI for highly sophisticated medical image analysis.