---
title: MetaLlamaChatGenerator
id: metallamachatgenerator
slug: /metallamachatgenerator
description: This component enables chat completion with any model available through the Meta Llama API.
---

# MetaLlamaChatGenerator

This component enables chat completion with any model available through the Meta Llama API.

| | |
| --- | --- |
| **Most common position in a pipeline** | After a `ChatPromptBuilder` |
| **Mandatory init variables** | `api_key`: A Meta Llama API key. Can be set with the `LLAMA_API_KEY` env variable or passed to the `init` method. |
| **Mandatory run variables** | `messages`: A list of `ChatMessage` objects |
| **Output variables** | `replies`: A list of `ChatMessage` objects |
| **API reference** | Meta Llama API |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/meta_llama |

## Overview

The `MetaLlamaChatGenerator` enables you to use multiple Meta Llama models by making chat completion calls to the Meta Llama API. The default model is `Llama-4-Scout-17B-16E-Instruct-FP8`.

The currently available models are:

| Model ID | Input context length | Output context length | Input modalities | Output modalities |
| --- | --- | --- | --- | --- |
| `Llama-4-Scout-17B-16E-Instruct-FP8` | 128k | 4028 | Text, Image | Text |
| `Llama-4-Maverick-17B-128E-Instruct-FP8` | 128k | 4028 | Text, Image | Text |
| `Llama-3.3-70B-Instruct` | 128k | 4028 | Text | Text |
| `Llama-3.3-8B-Instruct` | 128k | 4028 | Text | Text |
This component uses the same `ChatMessage` format as other Haystack Chat Generators for structured input and output. For more information, see the [ChatMessage documentation](../../concepts/data-classes/chatmessage.mdx).
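As a quick illustration, messages can be constructed with the `ChatMessage` class methods (both constructors are part of Haystack's core API):

```python
from haystack.dataclasses import ChatMessage

# Each ChatMessage carries a role (system, user, assistant, tool) and its content
messages = [
    ChatMessage.from_system("You are a concise assistant."),
    ChatMessage.from_user("Summarize retrieval-augmented generation in one sentence."),
]
```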

### Tool Support

`MetaLlamaChatGenerator` supports function calling through the `tools` parameter, which accepts flexible tool configurations:

- **A list of `Tool` objects**: Pass individual tools as a list
- **A single `Toolset`**: Pass an entire `Toolset` directly
- **Mixed `Tool` and `Toolset` objects**: Combine multiple `Toolset`s with standalone tools in a single list

This allows you to organize related tools into logical groups while also including standalone tools as needed.

```python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator

# Create individual tools (parameters and function omitted here for brevity)
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset (add_tool and friends are defined the same way)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass mixed tools and toolsets to the generator
generator = MetaLlamaChatGenerator(
    tools=[math_toolset, weather_tool, news_tool]  # Mix of Toolset and Tool objects
)
```

For more details on working with tools, see the Tool and Toolset documentation.
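To make the elided parts above concrete, a complete `Tool` definition might look like the following sketch; the `get_weather` function and its JSON schema are hypothetical placeholders, not part of the integration:

```python
from haystack.tools import Tool

# Hypothetical function the tool will invoke
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

weather_tool = Tool(
    name="weather",
    description="Get weather info",
    parameters={  # JSON schema describing the function's arguments
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    function=get_weather,
)
```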

## Initialization

To use this integration, you must have a Meta Llama API key. You can provide it with the `LLAMA_API_KEY` environment variable or by using a `Secret`.

Then, install the `meta-llama-haystack` integration:

```shell
pip install meta-llama-haystack
```
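Once installed, a minimal initialization looks like the sketch below; passing the `Secret` explicitly mirrors the default behavior of reading the key from `LLAMA_API_KEY`:

```python
from haystack.utils import Secret
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator

# Reads the API key from the LLAMA_API_KEY environment variable
llm = MetaLlamaChatGenerator(api_key=Secret.from_env_var("LLAMA_API_KEY"))
```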

## Streaming

`MetaLlamaChatGenerator` supports streaming responses from the LLM, allowing tokens to be emitted as they are generated. To enable streaming, pass a callable to the `streaming_callback` parameter during initialization.
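As a quick sketch, Haystack's built-in `print_streaming_chunk` utility (also used in the pipeline example below) can serve as the callback:

```python
from haystack.components.generators.utils import print_streaming_chunk
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator

# print_streaming_chunk writes each streamed chunk to stdout as it arrives
llm = MetaLlamaChatGenerator(streaming_callback=print_streaming_chunk)
```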

## Usage

### On its own

```python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.meta_llama import (
    MetaLlamaChatGenerator,
)

llm = MetaLlamaChatGenerator()
response = llm.run([ChatMessage.from_user("What are Agentic Pipelines? Be brief.")])
print(response["replies"][0].text)
```

With streaming and model routing:

```python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.meta_llama import (
    MetaLlamaChatGenerator,
)

llm = MetaLlamaChatGenerator(
    model="Llama-3.3-8B-Instruct",
    streaming_callback=lambda chunk: print(chunk.content, end="", flush=True),
)

response = llm.run([ChatMessage.from_user("What are Agentic Pipelines? Be brief.")])

# Check the model used for the response
print("\n\n Model used: ", response["replies"][0].meta["model"])
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.meta_llama import (
    MetaLlamaChatGenerator,
)

llm = MetaLlamaChatGenerator(model="Llama-4-Scout-17B-16E-Instruct-FP8")

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(
    content_parts=["What does the image show? Max 5 words.", image],
)

response = llm.run([user_message])["replies"][0].text
print(response)

# Red apple on straw.
```

### In a pipeline

```python
# To run this example, you will need to set a `LLAMA_API_KEY` environment variable.

from haystack import Document, Pipeline
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.utils import print_streaming_chunk
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret

from haystack_integrations.components.generators.meta_llama import (
    MetaLlamaChatGenerator,
)

# Write documents to InMemoryDocumentStore
document_store = InMemoryDocumentStore()
document_store.write_documents(
    [
        Document(content="My name is Jean and I live in Paris."),
        Document(content="My name is Mark and I live in Berlin."),
        Document(content="My name is Giorgio and I live in Rome."),
    ],
)

# Build a RAG pipeline
prompt_template = [
    ChatMessage.from_user(
        "Given these documents, answer the question.\n"
        "Documents:\n{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
        "Question: {{question}}\n"
        "Answer:",
    ),
]

# Define required variables explicitly
prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables=["question", "documents"],
)

retriever = InMemoryBM25Retriever(document_store=document_store)
llm = MetaLlamaChatGenerator(
    api_key=Secret.from_env_var("LLAMA_API_KEY"),
    streaming_callback=print_streaming_chunk,
)

rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm.messages")

# Ask a question
question = "Who lives in Paris?"
rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    },
)
```