Labels: question (further information is requested)
Description
System Info
I tried to run the following sample code but got an error saying the PluginConfig object has no attribute _paged_kv_cache when initializing LLM.
Here is the detailed info.
tensorrt_llm version: 0.19.0
OS: WSL (Ubuntu 22.04)
Host CPU: Intel Core Ultra 9 185H
GPU: NVIDIA RTX 4070
```python
from tensorrt_llm import LLM, SamplingParams

def main():
    prompts = [
        "Hello, my name is",
        "The president of the United States is",
        "The capital of France is",
        "The future of AI is",
    ]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
    outputs = llm.generate(prompts, sampling_params)

    # Print the outputs.
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

# The entry point of the program needs to be protected for spawning processes.
if __name__ == '__main__':
    main()
```
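One quick way to narrow this down is to inspect which paged-KV-cache attributes the installed PluginConfig class actually exposes (the attribute name `_paged_kv_cache` is taken from the error message, and the module path in the usage comment is an assumption; the helper itself is a generic, hypothetical diagnostic, not part of tensorrt_llm):

```python
import importlib

def find_attrs(module_name, class_name, needle):
    """List attributes of `class_name` in `module_name` whose names contain `needle`."""
    cls = getattr(importlib.import_module(module_name), class_name)
    return sorted(a for a in dir(cls) if needle in a)

# Hypothetical usage against the installed build (module path assumed):
#   find_attrs("tensorrt_llm.plugin", "PluginConfig", "paged_kv_cache")
```

Running it against a stdlib class, e.g. `find_attrs("collections", "OrderedDict", "move")`, confirms the helper itself works before pointing it at the installed tensorrt_llm build.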
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
- Install tensorrt_llm==0.19.0 on WSL (Ubuntu 22.04) with Python 3.10.0
- Run the above code
Expected behavior
The code should run smoothly.
Actual behavior
Got an error when initializing the LLM instance.
Additional notes
Could this be an issue with WSL?