Optimized Latency Inference Models with AWS #28544
Unanswered
johnpiscani asked this question in Q&A
Replies: 2 comments · 2 replies
- Any updates on this? It would be very nice.
Checked other resources
Commit to Help
Example Code
Description
AWS just released latency-optimized inference for Haiku 3.5 (https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html).
I want to invoke this model with latency-optimized inference from within LangChain, but I cannot figure out how to pass this parameter correctly. I have provided both the code that works using boto3 and the code I cannot get to work using LangChain. Please help me get this latency-optimized configuration working in LangChain.
Thank you!
P.S. This feature was just released as part of AWS re:Invent.
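For reference, here is a minimal sketch of what the working boto3 side of this looks like, assuming the Bedrock Converse API's `performanceConfig` parameter and a cross-region inference profile for Claude 3.5 Haiku. The model ID, region, and prompt are illustrative assumptions, not the original poster's code; the open question in this thread is how to forward the same setting through langchain_aws.

```python
# Minimal sketch of latency-optimized inference with plain boto3.
# Not the original poster's code: model ID and region are assumptions.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-2")

response = client.converse(
    # Latency-optimized Haiku 3.5 is typically invoked through a
    # cross-region inference profile ID (assumed here).
    modelId="us.anthropic.claude-3-5-haiku-20241022-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
    # The Converse API exposes the latency-optimized setting via
    # the top-level performanceConfig field.
    performanceConfig={"latency": "optimized"},
)

print(response["output"]["message"]["content"][0]["text"])
```

Because `performanceConfig` is a top-level Converse parameter rather than a model-specific field, the question is essentially whether and how `langchain_aws` (e.g. `ChatBedrockConverse`) can pass it through to the underlying Converse call.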
System Info
Using the most up-to-date versions of langchain and langchain_aws.