| title | MetaLlamaChatGenerator |
|---|---|
| id | metallamachatgenerator |
| slug | /metallamachatgenerator |
| description | This component enables chat completion with any model available through the Meta Llama API. |
This component enables chat completion with any model available through the Meta Llama API.
| Most common position in a pipeline | After a ChatPromptBuilder |
| Mandatory init variables | api_key: A Meta Llama API key. Can be set with the LLAMA_API_KEY environment variable or passed to init() |
| Mandatory run variables | messages: A list of ChatMessage objects |
| Output variables | replies: A list of ChatMessage objects |
| API reference | Meta Llama API |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/meta_llama |
The MetaLlamaChatGenerator enables you to use multiple Meta Llama models by making chat completion calls to the Meta Llama API. The default model is Llama-4-Scout-17B-16E-Instruct-FP8.
Currently available models are:
| Model ID | Input context length | Output context length | Input Modalities | Output Modalities |
|---|---|---|---|---|
| Llama-4-Scout-17B-16E-Instruct-FP8 | 128k | 4028 | Text, Image | Text |
| Llama-4-Maverick-17B-128E-Instruct-FP8 | 128k | 4028 | Text, Image | Text |
| Llama-3.3-70B-Instruct | 128k | 4028 | Text | Text |
| Llama-3.3-8B-Instruct | 128k | 4028 | Text | Text |
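To use a specific model, pass its Model ID to the model init parameter, for example:
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator
# Select a non-default model by its Model ID
llm = MetaLlamaChatGenerator(model="Llama-3.3-70B-Instruct")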
MetaLlamaChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:
- A list of Tool objects: Pass individual tools as a list
- A single Toolset: Pass an entire Toolset directly
- Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list
This allows you to organize related tools into logical groups while also including standalone tools as needed.
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator
# Create individual tools
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)
# Group related tools into a toolset (add_tool, subtract_tool, and multiply_tool would be defined like the tools above)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])
# Pass mixed tools and toolsets to the generator
generator = MetaLlamaChatGenerator(
tools=[math_toolset, weather_tool, news_tool] # Mix of Toolset and Tool objects
)

For more details on working with tools, see the Tool and Toolset documentation.
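For illustration, here is a minimal sketch of running the generator with a single tool; the get_weather function and its JSON schema are placeholders, not part of the integration:
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator
# Placeholder function backing the illustrative tool
def get_weather(city: str) -> str:
    return f"Sunny in {city}"
weather_tool = Tool(
    name="weather",
    description="Get weather info for a city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    function=get_weather,
)
llm = MetaLlamaChatGenerator(tools=[weather_tool])
response = llm.run([ChatMessage.from_user("What is the weather in Berlin?")])
# If the model decides to call the tool, the reply carries tool calls instead of plain text
print(response["replies"][0].tool_calls)

To execute the returned tool calls and feed the results back to the model, you can pair the generator with Haystack's ToolInvoker component.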
To use this integration, you must have a Meta Llama API key. You can provide it with the LLAMA_API_KEY environment variable or by using a Secret.
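For example, once the package is installed (see below), you can pass the key explicitly as a Secret:
from haystack.utils import Secret
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator
# Resolves the key from the LLAMA_API_KEY environment variable (the default behavior)
llm = MetaLlamaChatGenerator(api_key=Secret.from_env_var("LLAMA_API_KEY"))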
Then, install the meta-llama-haystack integration:
pip install meta-llama-haystack

MetaLlamaChatGenerator supports streaming responses from the LLM, allowing tokens to be emitted as they are generated. To enable streaming, pass a callable to the streaming_callback parameter during initialization.

On its own:
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.meta_llama import (
MetaLlamaChatGenerator,
)
llm = MetaLlamaChatGenerator()
response = llm.run([ChatMessage.from_user("What are Agentic Pipelines? Be brief.")])
print(response["replies"][0].text)

With streaming and model routing:
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.meta_llama import (
MetaLlamaChatGenerator,
)
llm = MetaLlamaChatGenerator(
model="Llama-3.3-8B-Instruct",
streaming_callback=lambda chunk: print(chunk.content, end="", flush=True),
)
response = llm.run([ChatMessage.from_user("What are Agentic Pipelines? Be brief.")])
# Check the model used for the response
print("\n\n Model used: ", response["replies"][0].meta["model"])

With multimodal inputs:
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.meta_llama import (
MetaLlamaChatGenerator,
)
llm = MetaLlamaChatGenerator(model="Llama-4-Scout-17B-16E-Instruct-FP8")
image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(
content_parts=["What does the image show? Max 5 words.", image],
)
response = llm.run([user_message])["replies"][0].text
print(response)
# Red apple on straw.

In a pipeline:

# To run this example, you will need to set a `LLAMA_API_KEY` environment variable.
from haystack import Document, Pipeline
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.utils import print_streaming_chunk
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret
from haystack_integrations.components.generators.meta_llama import (
MetaLlamaChatGenerator,
)
# Write documents to InMemoryDocumentStore
document_store = InMemoryDocumentStore()
document_store.write_documents(
[
Document(content="My name is Jean and I live in Paris."),
Document(content="My name is Mark and I live in Berlin."),
Document(content="My name is Giorgio and I live in Rome."),
],
)
# Build a RAG pipeline
prompt_template = [
ChatMessage.from_user(
"Given these documents, answer the question.\n"
"Documents:\n{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
"Question: {{question}}\n"
"Answer:",
),
]
# Define required variables explicitly
prompt_builder = ChatPromptBuilder(
template=prompt_template,
required_variables={"question", "documents"},
)
retriever = InMemoryBM25Retriever(document_store=document_store)
llm = MetaLlamaChatGenerator(
api_key=Secret.from_env_var("LLAMA_API_KEY"),
streaming_callback=print_streaming_chunk,
)
rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm.messages")
# Ask a question
question = "Who lives in Paris?"
rag_pipeline.run(
{
"retriever": {"query": question},
"prompt_builder": {"question": question},
},
)