LMI provides backend-specific user guides that cover the following topics:
- Model Artifact Structure
  - All backends support standard HuggingFace Transformers Pretrained artifacts
  - The TensorRT-LLM user guide provides information on compiled model artifact structures
- Supported Model Architectures
  - Some Model Architectures can only be deployed using specific backends
- Quick Start Configurations
  - Starter configurations in both `serving.properties` and environment variable formats to provide an out-of-the-box solution for that backend
- Quantization Guide
  - If a backend supports quantization, we describe the different options and how to enable them
- Advanced Configurations
  - Configurations that are only available with this backend
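
To illustrate how the two Quick Start formats relate, the following is a minimal sketch that turns `serving.properties`-style lines into environment variable pairs. It assumes a naming convention in which a key such as `option.<name>` corresponds to an `OPTION_<NAME>` environment variable. The helper `properties_to_env` and the model id `my-org/my-model` are hypothetical and exist only for this example; the backend user guides remain the reference for the exact names each backend accepts.

```python
def properties_to_env(properties_text: str) -> dict:
    """Convert serving.properties-style lines into environment variable pairs.

    Assumption (not confirmed by this page): a key such as `option.<name>`
    maps to an environment variable named `OPTION_<NAME>`.
    """
    env = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        env[key.strip().replace(".", "_").upper()] = value.strip()
    return env


# Hypothetical starter configuration; the model id is a placeholder.
EXAMPLE = """
# illustrative serving.properties content
option.model_id=my-org/my-model
option.tensor_parallel_degree=1
"""

if __name__ == "__main__":
    for name, value in properties_to_env(EXAMPLE).items():
        print(f"{name}={value}")
```

A sketch like this can help when moving a starter configuration from a file into a container environment, but individual settings may have dedicated environment variable names that do not follow the assumed pattern.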
The available backends and their respective user guides are listed below: