Does trtllm-serve enable prefix caching automatically with DeepSeek-R1? #2932

@Bihan

Does trtllm-serve enable prefix caching automatically?

I want to serve DeepSeek-R1 with prefix caching enabled. I am deploying as follows:

trtllm-serve \
          --backend pytorch \
          --max_batch_size $MAX_BATCH_SIZE \
          --max_num_tokens $MAX_NUM_TOKENS \
          --max_seq_len $MAX_SEQ_LENGTH \
          --tp_size 8 \
          --ep_size 4 \
          --pp_size 1 \
          deepseek

Labels: triaged (issue has been triaged by maintainers)
