# AI Remote Inference App - Spleen Segmentation

This example application demonstrates how to perform medical image segmentation using **Triton Inference Server** for remote inference calls. The app processes a DICOM CT series to segment the spleen using a deep learning model hosted on a remote Triton server.

## Overview

This application showcases:

- **Remote inference using Triton Inference Server**: Rather than loading the model locally, the app connects to a Triton server and performs inference remotely, sending and receiving input/output tensors whose shapes match the model's dimensions, including channels
- **Triton client integration**: The built-in `TritonRemoteModel` class in the [triton_model.py](https://github.com/Project-MONAI/monai-deploy-app-sdk/blob/137ac32d647843579f52060c8f72f9d9e8b51c38/monai/deploy/core/models/triton_model.py) module wraps a Triton inference client and communicates with a model network already loaded on the server. It supports the same API as the in-process model class (e.g., a loaded TorchScript model network), so the application's inference operator does not need to change when switching between in-process and remote inference
- **Model metadata parsing**: Parses Triton's model folder structure, which contains the `config.pbtxt` configuration file, to extract model specifications including the name, input/output dimensions, and other metadata. The parent folder of the Triton model folder must be used as the model path for the application

## Architecture

The application follows a pipeline architecture, sketched in code below:

1. **DICOM Data Loading**: Loads the DICOM study from the input directory
2. **Series Selection**: Selects the appropriate CT series based on configurable rules
3. **Volume Conversion**: Converts the DICOM series to a 3D volume
4. **Remote Inference**: Performs spleen segmentation via Triton Inference Server
5. **Output Generation**: Creates DICOM Segmentation and STL mesh outputs

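As a rough illustration of this flow, the sketch below wires the first three stages using MONAI Deploy SDK operator classes. The constructor arguments, port names, and `CountCondition` usage are assumptions patterned after the SDK's example apps, not the actual contents of `app.py`; see `app.py` for the working composition.

```python
# A minimal sketch, assuming SDK example conventions; see app.py for the
# real composition. Port names and constructor arguments are illustrative.
from pathlib import Path

from holoscan.conditions import CountCondition
from monai.deploy.core import Application
from monai.deploy.operators import (
    DICOMDataLoaderOperator,
    DICOMSeriesSelectorOperator,
    DICOMSeriesToVolumeOperator,
)


class AIRemoteInferApp(Application):
    def compose(self):
        # 1. Load the DICOM study; CountCondition fires the source operator once.
        loader = DICOMDataLoaderOperator(
            self, CountCondition(self, 1), input_folder=Path("inputs/spleen_ct_tcia"), name="study_loader"
        )
        # 2. Select the CT series based on configurable rules (defaults here).
        selector = DICOMSeriesSelectorOperator(self, name="series_selector")
        # 3. Convert the selected series into a 3D volume.
        to_volume = DICOMSeriesToVolumeOperator(self, name="series_to_vol")

        # 4./5. The remote-inference segmentation operator and the DICOM SEG
        # and STL writers attach downstream in the real application.
        self.add_flow(loader, selector, {("dicom_study_list", "dicom_study_list")})
        self.add_flow(selector, to_volume, {("study_selected_series_list", "study_selected_series_list")})
```
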
## Key Components

### Triton Integration

The `SpleenSegOperator` leverages the `MonaiSegInferenceOperator`, which:

- Uses the loaded model network, which in turn acts as a **Triton inference client** and connects to the remote Triton Inference Server that actually serves the named model
- Handles the preprocessing and postprocessing transforms
- Requires no explicit remote-inference logic in either operator

### Model Configuration Requirements

The application requires a Triton model folder containing a **Triton model configuration file** (`config.pbtxt`) on the application side; the parent path of the model folder is then used as the model path for the application. This example application has the following model folder structure:

```
models_client_side/spleen_ct/config.pbtxt
```

The path to `models_client_side` is the model path for the application, while `spleen_ct` is the folder of the named model, with the folder name matching the model name. The `name` field in the `config.pbtxt` file is therefore intentionally omitted.

This configuration file (`config.pbtxt`) contains essential model metadata (a hypothetical reconstruction follows the list):

- **Model name**: `spleen_ct` (used for server communication)
- **Input dimensions**: `[1, 96, 96, 96]` (channels, width, height, depth)
- **Output dimensions**: `[2, 96, 96, 96]` (2-class segmentation output)
- **Data types**: `TYPE_FP32` for both input and output
- **Batching configuration**: Dynamic batching with preferred sizes
- **Hardware requirements**: GPU-based inference

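For concreteness, here is a hypothetical reconstruction of this `config.pbtxt`. The tensor names, the `pytorch_libtorch` backend, and the batch-size values are assumptions; only the dimensions, data types, and the omitted `name` field follow what is described above.

```
# Hypothetical reconstruction of models_client_side/spleen_ct/config.pbtxt.
# The `name` field is omitted: Triton derives it from the folder name.
platform: "pytorch_libtorch"     # assumed backend for a TorchScript model
max_batch_size: 4                # illustrative; enables dynamic batching
input [
  {
    name: "INPUT_0"              # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1, 96, 96, 96 ]      # channels, width, height, depth
  }
]
output [
  {
    name: "OUTPUT_0"             # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 2, 96, 96, 96 ]      # 2-class segmentation output
  }
]
dynamic_batching {
  preferred_batch_size: [ 4, 8 ] # illustrative preferred sizes
}
instance_group [
  { kind: KIND_GPU }
]
```
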
**Important**: The `config.pbtxt` file is used **in lieu of the actual model file** (e.g., TorchScript `.ts` file) that would be present in an in-process inference scenario. For remote inference, the physical model file (`model_spleen_ct_segmentation_v1.ts`) resides on the Triton server, while the client only needs the configuration metadata to understand the model's interface.

### API Compatibility Between In-Process and Remote Inference

The `TritonRemoteModel` class in the `triton_model.py` module contains the actual Triton client instance and provides the **same API as in-process model instances**. This design ensures:

- **Application inference operators remain unchanged** whether using in-process or remote inference
- **Seamless switching** between local and remote models without code modifications
- **Unified interface** through the `__call__` method, which handles PyTorch tensors locally and Triton HTTP requests remotely
- **Transparent model loading**, where `MonaiSegInferenceOperator` uses the same `predictor` interface regardless of model location (see the sketch below)

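The following minimal sketch shows what this unified contract means in practice; `segment` is a hypothetical helper written for illustration, not SDK code.

```python
# A minimal sketch: `predictor` may be a local TorchScript module or a
# TritonRemoteModel instance; the call site is identical either way.
import torch


def segment(predictor, volume: torch.Tensor) -> torch.Tensor:
    # Both model types are simply callable on a tensor via __call__,
    # so the operator code does not branch on the model's location.
    with torch.no_grad():
        return predictor(volume)
```
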
## Setup and Configuration

### Environment Variables

Configure the following environment variables (see `env_settings_example.sh`):

```bash
export HOLOSCAN_INPUT_PATH="inputs/spleen_ct_tcia"   # Input DICOM directory
export HOLOSCAN_MODEL_PATH="examples/apps/ai_remote_infer_app/models_client_side"  # Client-side model config path
export HOLOSCAN_OUTPUT_PATH="output_spleen"          # Output directory
export HOLOSCAN_LOG_LEVEL=DEBUG                      # Logging level
export TRITON_SERVER_NETLOC="localhost:8000"         # Triton server address
```

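Assuming the example script fits your environment, the variables can be loaded into the current shell before launching the app:

```bash
# Load the settings above into the current shell (adjust paths first as needed).
source env_settings_example.sh
```
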
### Triton Server Setup

1. **Server side**: Deploy the actual model file (`model_spleen_ct_segmentation_v1.ts`) to your Triton server, e.g., as sketched below
2. **Client side**: Ensure the `config.pbtxt` file is available locally for metadata parsing
3. **Network**: Ensure connectivity between the client and the Triton server on the specified port

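A typical server-side layout and launch command are sketched below. The repository path, version subfolder, and Triton image tag are placeholders, and the standard `tritonserver` container invocation is shown as one common option, not this project's prescribed deployment.

```bash
# Illustrative server-side model repository (paths are placeholders):
#   models_server_side/
#   └── spleen_ct/
#       ├── config.pbtxt   # server-side config; default_model_filename can
#       │                  # point at the .ts file below
#       └── 1/
#           └── model_spleen_ct_segmentation_v1.ts

# Launch Triton serving the repository on port 8000 (image tag is a placeholder).
docker run --rm --gpus=1 -p 8000:8000 \
  -v "$PWD/models_server_side:/models" \
  nvcr.io/nvidia/tritonserver:<yy.mm>-py3 \
  tritonserver --model-repository=/models
```
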
### Directory Structure

```
ai_remote_infer_app/
├── app.py                    # Main application logic
├── spleen_seg_operator.py    # Custom segmentation operator
├── __main__.py               # Application entry point
├── env_settings_example.sh   # Environment configuration
├── models_client_side/       # Client-side model configurations
│   └── spleen_ct/
│       └── config.pbtxt      # Triton model configuration (no model file)
└── README.md                 # This file
```

## Usage

1. **Set up the Triton server** with the spleen segmentation model, listening at `localhost:8000` in this example
2. **Configure the environment variables** to point to your Triton server
3. **Prepare input data** in DICOM format
4. **Run the application**:

```bash
python ai_remote_infer_app
```

## Input Requirements

- **DICOM CT series** containing abdominal scans
- **Series selection criteria**: PRIMARY/ORIGINAL CT images
- **Image preprocessing**: Automatic resampling to 1.5 × 1.5 × 2.9 mm spacing, as illustrated below

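As an illustration of such preprocessing, the sketch below builds a MONAI dictionary-transform chain. The key name, transform ordering, and intensity-window values are assumptions borrowed from the common MONAI spleen example, not necessarily this app's exact transforms.

```python
# Illustrative pre-processing chain; only the pixdim values mirror the
# resampling spacing stated above.
from monai.transforms import (
    Compose,
    EnsureChannelFirstd,
    Orientationd,
    ScaleIntensityRanged,
    Spacingd,
)

pre_transforms = Compose([
    EnsureChannelFirstd(keys="image"),
    Spacingd(keys="image", pixdim=(1.5, 1.5, 2.9), mode="bilinear"),
    Orientationd(keys="image", axcodes="RAS"),
    ScaleIntensityRanged(keys="image", a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
])
```
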
## Output

The application generates:

- **DICOM Segmentation** files with spleen masks
- **STL mesh** files for 3D visualization
- **Intermediate NIfTI** files for debugging (optional)

## Model Specifications

- **Architecture**: 3D PyTorch model optimized for spleen segmentation
- **Input size**: 96×96×96 voxels
- **Output**: 2-class segmentation (background + spleen)
- **Inference method**: Sliding window with 60% overlap (see the sketch below)
- **Batch size**: Configurable (default: 4)

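The sliding-window behavior implied by these specifications can be sketched with MONAI's `sliding_window_inference`; the random input and the stand-in predictor below are placeholders for the real volume and the Triton-backed model.

```python
# Minimal sketch of the sliding-window call implied by the specs above.
import torch
from monai.inferers import sliding_window_inference

volume = torch.rand(1, 1, 96, 96, 96)  # placeholder batch, channel, W, H, D


def predictor(patch: torch.Tensor) -> torch.Tensor:
    # Stand-in for the remote Triton call; returns a 2-channel map
    # matching the model's [2, 96, 96, 96] output dims.
    return torch.rand(patch.shape[0], 2, *patch.shape[2:])


seg = sliding_window_inference(
    inputs=volume,
    roi_size=(96, 96, 96),
    sw_batch_size=4,   # default batch size from the specs
    predictor=predictor,
    overlap=0.6,       # 60% window overlap
)
```
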
## Notes

- The application demonstrates **remote inference patterns** suitable for production deployments
- **Model versioning** is handled server-side through Triton's version policies
- **Dynamic batching** optimizes throughput for multiple concurrent requests
- **GPU acceleration** is configured but can be adjusted based on available hardware

## Dependencies

- MONAI Deploy App SDK
- Triton Inference Server client libraries
- PyDICOM for DICOM handling
- MONAI transforms for preprocessing
