This repository was archived by the owner on Jul 4, 2025. It is now read-only.
[Closed] TensorRT-LLM Engine stories #1742
28 Nov 2024:
Closing all open TensorRT-LLM stories, as TensorRT-LLM does not support Desktop.
- feat: Docker container for TensorRT-LLM environment #1100
- bug: TensorRT-LLM frequency_penalty Parameter in model.yml Only Functions Correctly with Value 1, Produces Gibberish for other values #1137
- feat: TensorRT-LLM better error logs for Cuda/Nvidia driver out of date #1019
- bug: run llama3:tensorrt-llm leads to "cortex.llamacpp engine not found" #1020
- bug: TensorRT-LLM error #1315
- CI to pack CUDA dependencies for cortex.tensorrt-llm #1086
- epic: Cortex TensorRT-LLM support #1152
Cortex.tensorrt-llm repo:
- TensorRT-LLM Support for logits_prob #54
- tensorrt-llm-engine README.md file #55
- TensorRT-LLM Inflight batching #29
- TensorRT-LLM InferenceRequest and stop_words_list #30
- TensorRT-LLM Request Interruption #31
- TensorRT-LLM Unload Model #32
- TensorRT-LLM load multiple models #33