- Always first write your "plan" in natural language, then transcribe it in pipelex.
- You should ALWAYS RUN validation when you are writing or editing a `.plx` file. It will ensure the pipe is runnable. If not, iterate.
  - For a specific file: `pipelex validate path_to_file.plx`
  - For all pipelines: `pipelex validate --all`
  - IMPORTANT: Ensure the Python virtual environment is activated before running `pipelex` commands. For standard installations, the venv is named `.venv` - always check that first. The commands will not work without proper venv activation.
- Please use POSIX standard for files. (empty lines, no trailing whitespaces, etc.)
- Files must be `.plx` for pipelines (always add an empty line at the end of the file, and do not add trailing whitespaces to PLX files at all)
- Files must be `.py` for code defining the data structures
- Use descriptive names in `snake_case`
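The file conventions above can be checked mechanically. The following is a minimal sketch, not part of Pipelex: the helper name `find_format_issues` is hypothetical, and it only covers the two rules stated here (final newline, no trailing whitespace).

```python
def find_format_issues(text: str) -> list[str]:
    """Return a list of human-readable formatting issues found in a file's text."""
    issues = []
    # A POSIX text file ends with a newline character.
    if text and not text.endswith("\n"):
        issues.append("missing newline at end of file")
    for number, line in enumerate(text.splitlines(), start=1):
        # Trailing spaces or tabs are not allowed in PLX files.
        if line != line.rstrip(" \t"):
            issues.append(f"line {number}: trailing whitespace")
    return issues


clean = 'domain = "invoicing"\n'
dirty = 'domain = "invoicing"  \n[concept]'
print(find_format_issues(clean))  # []
print(find_format_issues(dirty))
```

Running such a check before `pipelex validate` keeps formatting problems separate from pipeline errors.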
A pipeline file has three main sections:
- Domain statement
- Concept definitions
- Pipe definitions
domain = "domain_code"
description = "Description of the domain" # Optional
Note: The domain code usually matches the plx filename for single-file domains. For multi-file domains, use the subdirectory name.
Concepts represent ideas and semantic entities in your pipeline. They define what something is, not how it's structured.
[concept]
ConceptName = "Description of the concept"
Naming Rules:
- Use PascalCase for concept names
- Never use plurals (no "Stories", use "Story") - lists are handled implicitly by Pipelex
- Avoid circumstantial adjectives (no "LargeText", use "Text") - focus on the essence of what the concept represents
- Don't redefine native concepts (Text, Image, PDF, TextAndImages, Number, Page, JSON)
Native Concepts:
Pipelex provides built-in native concepts: Text, Image, PDF, TextAndImages, Number, Page, JSON. Use these directly or refine them when appropriate.
Refining Concepts:
To create a concept that specializes another concept without adding fields, use refines:
# Refining a native concept
[concept.Landscape]
description = "A scenic outdoor photograph"
refines = "Image"

# Refining a custom concept (must be in domain.ConceptCode format)
[concept.PremiumCustomer]
description = "A premium customer with special benefits"
refines = "myapp.Customer"
Note: When refining a custom (non-native) concept, you must use the fully qualified concept ref in domain.ConceptCode format. Pipelex automatically handles the dependency order to ensure referenced concepts are loaded first.
For details on how to structure concepts with fields, see the "Structuring Models" section below.
[pipe.your_pipe_code]
type = "PipeLLM"
description = "A description of what your pipe does"
inputs = { input_1 = "ConceptName1", input_2 = "ConceptName2" }
output = "ConceptName"
The pipes will all have at least this base definition.
inputs: Dictionary whose keys are the variable names used in the prompts and whose values are the ConceptNames. It should ALSO LIST THE INPUTS OF THE INTERMEDIATE STEPS (if PipeSequence) or of the conditional pipes (if PipeCondition). So if you have this error: `PipeValidationError: missing_input_variable • domain='expense_validator' • pipe='validate_expense' • variable='['invoice']'`, that means that the pipe `validate_expense` is missing the input `invoice` because one of its sub-pipes needs it.
NEVER WRITE THE INPUTS BY BREAKING THE LINE LIKE THIS:
inputs = {
input_1 = "ConceptName1",
input_2 = "ConceptName2"
}</NEVER DO THIS>
output: The name of the concept to output. The ConceptName should have the same name as the Python class if you want structured output.
By default, inputs expect a single item. Use bracket notation to specify multiple items:
# Single item (default)
inputs = { document = "Text" }
# Variable list - indeterminate number of items
inputs = { documents = "Text[]" }
# Fixed count - exactly N items
inputs = { comparison_items = "Image[2]" }
Key points:
- No brackets = single item (default behavior)
- Use `[]` for lists of unknown length
- Use `[N]` (where N is an integer) when the operation requires an exact count (e.g., comparing 2 items)
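To make the bracket notation concrete, here is a small sketch of how such a spec could be read. It is an illustration only: `parse_multiplicity` is a hypothetical helper, not a Pipelex function, and the accepted character set for concept names is an assumption.

```python
import re
from typing import Optional


def parse_multiplicity(spec: str) -> tuple[str, bool, Optional[int]]:
    """Split a spec such as "Text", "Text[]" or "Image[2]" into (concept, is_list, count)."""
    match = re.fullmatch(r"([A-Za-z_][\w.]*)(?:\[(\d*)\])?", spec)
    if match is None:
        raise ValueError(f"Invalid concept spec: {spec!r}")
    concept, count = match.group(1), match.group(2)
    if count is None:
        return concept, False, None  # no brackets: single item
    if count == "":
        return concept, True, None  # []: list of unknown length
    return concept, True, int(count)  # [N]: exactly N items


print(parse_multiplicity("Text"))      # ('Text', False, None)
print(parse_multiplicity("Text[]"))    # ('Text', True, None)
print(parse_multiplicity("Image[2]"))  # ('Image', True, 2)
```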
Once you've defined your concepts semantically (see "Concept Definitions" above), you need to specify their structure if they have fields.
1. No Structure Needed
If a concept only refines a native concept without adding fields, use the TOML table syntax shown in "Concept Definitions" above. No structure section is needed.
2. Inline Structure Definition (RECOMMENDED for most cases)
For concepts with structured fields, define them inline using TOML syntax:
[concept.Invoice]
description = "A commercial document issued by a seller to a buyer"
[concept.Invoice.structure]
invoice_number = "The unique invoice identifier" # This will be optional by default
issue_date = { type = "date", description = "The date the invoice was issued", required = true }
total_amount = { type = "number", description = "The total invoice amount", required = true }
vendor_name = "The name of the vendor" # This will be optional by default
line_items = { type = "list", item_type = "text", description = "List of items" }
Supported inline field types: text, integer, boolean, number, date, list, dict, concept
Field properties: type, description, required (default: false), default_value, choices, item_type (for lists), key_type and value_type (for dicts), concept_ref (for concept references), item_concept_ref (for lists of concepts)
Simple syntax (creates a text field, optional by default since required defaults to false):
field_name = "Field description"
Detailed syntax (with explicit properties):
field_name = { type = "text", description = "Field description", default_value = "default" }
Concept reference syntax (referencing another concept):
# Single concept reference
customer = { type = "concept", concept_ref = "myapp.Customer", description = "The customer" }
# List of concepts
line_items = { type = "list", item_type = "concept", item_concept_ref = "myapp.LineItem", description = "Line items" }
Example with concept references:
[concept.Customer]
description = "A customer entity"
[concept.Customer.structure]
name = { type = "text", description = "Customer name" }
email = { type = "text", description = "Customer email" }
[concept.LineItem]
description = "A line item in an invoice"
[concept.LineItem.structure]
product = { type = "text", description = "Product name" }
quantity = { type = "integer", description = "Quantity ordered" }
unit_price = { type = "number", description = "Price per unit" }
[concept.Invoice]
description = "An invoice document"
[concept.Invoice.structure]
customer = { type = "concept", concept_ref = "myapp.Customer", description = "The customer" }
items = { type = "list", item_type = "concept", item_concept_ref = "myapp.LineItem", description = "Line items" }
total = { type = "number", description = "Invoice total" }
Note: Pipelex automatically determines the correct loading order for concepts based on their dependencies (topological sort), so concepts can reference each other across domains as long as there are no circular dependencies.
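The idea of a dependency-based loading order can be illustrated with Python's standard `graphlib` module. This is only a sketch of topological sorting in general, not Pipelex's loader; the `dependencies` map and the `loading_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each concept lists the concepts it references.
dependencies = {
    "Invoice": {"Customer", "LineItem"},
    "Customer": set(),
    "LineItem": set(),
}


def loading_order(deps: dict[str, set[str]]) -> list[str]:
    """Return concepts ordered so that every dependency comes before its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = loading_order(dependencies)
print(order)  # Customer and LineItem appear before Invoice

# A circular reference cannot be ordered and is reported as an error.
try:
    loading_order({"A": {"B"}, "B": {"A"}})
except CycleError:
    print("circular dependency detected")
```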
3. Python StructuredContent Class (For Advanced Features)
Create a Python class when you need:
- Custom validation logic (@field_validator, @model_validator)
- Computed properties (@property methods)
- Custom methods or class methods
- Complex cross-field validation
- Reusable structures across multiple domains
from pipelex.core.stuffs.structured_content import StructuredContent
from pydantic import Field, field_validator

class Invoice(StructuredContent):
    """A commercial invoice with validation."""
    invoice_number: str = Field(description="The unique invoice identifier")
    total_amount: float = Field(ge=0, description="The total invoice amount")
    tax_amount: float = Field(ge=0, description="Tax amount")

    @field_validator('tax_amount')
    @classmethod
    def validate_tax(cls, v, info):
        """Ensure tax doesn't exceed total."""
        total = info.data.get('total_amount', 0)
        if v > total:
            raise ValueError('Tax amount cannot exceed total amount')
        return v
Location: Create models in my_project/some_domain/some_domain_struct.py. Classes inheriting from StructuredContent are automatically discovered.
If concept already exists:
- If it's already inline → KEEP IT INLINE unless user explicitly asks to convert or features require Python class
- If it's already a Python class → KEEP IT as Python class
If creating new concept:
- Does it only refine a native concept without adding fields? → Use concept-only declaration
- Does it need custom validation, computed properties, or methods? → Use Python class
- Otherwise → Use inline structure (fastest and simplest)
When to suggest conversion to Python class:
- User needs validation logic beyond type checking
- User needs computed properties or custom methods
- Structure needs to be reused across multiple domains
- Complex type relationships or inheritance required
Inline structures:
- ✅ Support all common field types (text, number, date, list, dict, concept, etc.)
- ✅ Support required/optional fields, defaults, choices
- ✅ Support concept-to-concept references (type = "concept" with concept_ref)
- ✅ Support lists of concepts (type = "list" with item_type = "concept")
- ✅ Support refining both native and custom concepts
- ✅ Generate full Pydantic models with validation
- ❌ Cannot have custom validators or complex validation logic
- ❌ Cannot have computed properties or custom methods
- ❌ Limited IDE autocomplete compared to explicit Python classes
Look at the available pipes and adapt them to your needs. Pipes are organized in two categories:
- Controllers - For flow control:
  - PipeSequence - For creating a sequence of multiple steps
  - PipeCondition - If the next pipe depends on the expression of a stuff in the working memory
  - PipeParallel - For parallelizing pipes
- Operators - For specific tasks:
  - PipeLLM - Generate Text and Objects (includes Vision LLM)
  - PipeExtract - Extract text and images from an image or a PDF
  - PipeCompose - For composing text using Jinja2 templates: supports html, markdown, mermaid, etc.
  - PipeImgGen - Generate Images
  - PipeFunc - For running classic Python scripts
Purpose: PipeSequence executes multiple pipes in a defined order, where each step can use results from original inputs or from previous steps.
[pipe.your_sequence_name]
type = "PipeSequence"
description = "Description of what this sequence does"
inputs = { input_name = "InputType" } # All the inputs of the sub pipes, except the ones generated by intermediate steps
output = "OutputType"
steps = [
{ pipe = "first_pipe", result = "first_result" },
{ pipe = "second_pipe", result = "second_result" },
{ pipe = "final_pipe", result = "final_result" }
]
- Steps Array: List of pipes to execute in sequence
  - pipe: Name of the pipe to execute
  - result: Name to assign to the pipe's output that will be in the working memory
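The rule for a sequence's inputs (all sub-pipe inputs, except those produced by earlier steps) can be worked out by hand or with a few lines of code. The sketch below is an illustration under assumptions: `sequence_inputs`, the pipe names, and the variable names are all hypothetical and unrelated to Pipelex internals.

```python
def sequence_inputs(steps: list[dict], pipe_inputs: dict[str, set[str]]) -> set[str]:
    """Collect the inputs a sequence must declare: sub-pipe inputs not produced by earlier steps."""
    produced: set[str] = set()
    required: set[str] = set()
    for step in steps:
        # Anything a step needs that no earlier step produced must come from outside.
        required |= pipe_inputs[step["pipe"]] - produced
        produced.add(step["result"])
    return required


# Hypothetical sub-pipes and the variables each one reads.
pipe_inputs = {
    "extract_expense": {"receipt"},
    "match_invoice": {"expense", "invoice"},
}
steps = [
    {"pipe": "extract_expense", "result": "expense"},
    {"pipe": "match_invoice", "result": "matched"},
]
print(sorted(sequence_inputs(steps, pipe_inputs)))  # ['invoice', 'receipt']
```

Here `expense` is produced by the first step, so only `receipt` and `invoice` need to appear in the sequence's `inputs`. Forgetting `invoice` is the kind of omission that a `missing_input_variable` validation error points to.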
You can use PipeBatch functionality within steps using batch_over and batch_as:
steps = [
    { pipe = "process_items", batch_over = "input_list", batch_as = "current_item", result = "processed_items" }
]
- batch_over: Specifies a ListContent field to iterate over. Each item in the list will be processed individually and IN PARALLEL by the pipe.
  - Must be a ListContent type containing the items to process
  - Can reference inputs or results from previous steps
- batch_as: Defines the name that will be used to reference the current item being processed
  - This name can be used in the pipe's input mappings
  - Makes each item from the batch available as a single element
The result of a batched step will be a ListContent containing the outputs from processing each item.
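Conceptually, a batched step behaves like mapping one coroutine over a list and collecting the results in order. The sketch below uses plain `asyncio` to show that shape; `process_item` is a hypothetical stand-in for the pipe, and nothing here reflects how Pipelex actually schedules the work.

```python
import asyncio


async def process_item(current_item: str) -> str:
    """Hypothetical stand-in for the pipe applied to each item."""
    await asyncio.sleep(0)  # yield control, as a real awaitable call would
    return current_item.upper()


async def run_batch(input_list: list[str]) -> list[str]:
    """Apply process_item to every item concurrently and keep the input order."""
    return await asyncio.gather(*(process_item(item) for item in input_list))


processed_items = asyncio.run(run_batch(["a", "b", "c"]))
print(processed_items)  # ['A', 'B', 'C']
```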
The PipeCondition controller allows you to implement conditional logic in your pipeline, choosing which pipe to execute based on an evaluated expression. It supports both direct expressions and expression templates.
[pipe.conditional_operation]
type = "PipeCondition"
description = "A conditional pipe to decide whether..."
inputs = { input_data = "CategoryInput" }
output = "native.Text"
expression = "input_data.category"
default_outcome = "process_medium"
[pipe.conditional_operation.outcomes]
small = "process_small"
medium = "process_medium"
large = "process_large"
or
[pipe.conditional_operation]
type = "PipeCondition"
description = "A conditional pipe to decide whether..."
inputs = { input_data = "CategoryInput" }
output = "native.Text"
expression_template = "{{ input_data.category }}" # Jinja2 code
default_outcome = "process_medium"
[pipe.conditional_operation.outcomes]
small = "process_small"
medium = "process_medium"
large = "process_large"
- expression: Direct boolean or string expression (mutually exclusive with expression_template)
- expression_template: Jinja2 template for more complex conditional logic (mutually exclusive with expression)
- outcomes: Dictionary mapping expression results to pipe codes:
  - The key on the left (small, medium) is the result of expression or expression_template
  - The value on the right (process_small, process_medium, etc.) is the name of the pipe to trigger
- default_outcome: Required - The pipe to execute if the expression doesn't match any key in outcomes. Use "fail" if you want the pipeline to fail when no match is found
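The way outcomes and default_outcome interact can be summarized as a dictionary lookup with a fallback. This is a sketch of that logic only: `resolve_outcome` is a hypothetical helper, and the exception type raised on "fail" is an assumption.

```python
def resolve_outcome(value: str, outcomes: dict[str, str], default_outcome: str) -> str:
    """Return the pipe code to run for an evaluated expression value."""
    # Use the matching outcome if there is one, otherwise fall back to the default.
    pipe_code = outcomes.get(value, default_outcome)
    if pipe_code == "fail":
        raise ValueError(f"No outcome matches {value!r}")
    return pipe_code


outcomes = {"small": "process_small", "medium": "process_medium", "large": "process_large"}
print(resolve_outcome("small", outcomes, "process_medium"))  # process_small
print(resolve_outcome("huge", outcomes, "process_medium"))   # process_medium
```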
Example with fail as default:
[pipe.strict_validation]
type = "PipeCondition"
description = "Validate with strict matching"
inputs = { status = "Status" }
output = "Text"
expression = "status.value"
default_outcome = "fail"
[pipe.strict_validation.outcomes]
approved = "process_approved"
rejected = "process_rejected"
PipeLLM is used to:
- Generate text or objects with LLMs
- Process images with Vision LLMs
Simple Text Generation:
[pipe.write_story]
type = "PipeLLM"
description = "Write a short story"
output = "Text"
prompt = """
Write a short story about a programmer.
"""
Structured Data Extraction:
[pipe.extract_info]
type = "PipeLLM"
description = "Extract information"
inputs = { text = "Text" }
output = "PersonInfo"
prompt = """
Extract person information from this text:
@text
"""
Supports system instructions:
[pipe.expert_analysis]
type = "PipeLLM"
description = "Expert analysis"
output = "Analysis"
system_prompt = "You are a data analysis expert"
prompt = "Analyze this data"
Generate multiple outputs (fixed number) - use bracket notation:
[pipe.generate_ideas]
type = "PipeLLM"
description = "Generate ideas"
output = "Idea[3]" # Generate exactly 3 ideas
prompt = "Generate 3 ideas"
Generate multiple outputs (variable number) - use bracket notation:
[pipe.generate_ideas]
type = "PipeLLM"
description = "Generate ideas"
output = "Idea[]" # Let the LLM decide how many to generate
prompt = "Generate ideas"
Process images with VLMs (image inputs must be tagged in the prompt):
[pipe.analyze_image]
type = "PipeLLM"
description = "Analyze image"
inputs = { image = "Image" }
output = "ImageAnalysis"
prompt = """
Describe what you see in this image:
$image
"""
You can also reference images inline in meaningful sentences to guide the Visual LLM:
[pipe.compare_images]
type = "PipeLLM"
description = "Compare two images"
inputs = { photo = "Image", painting = "Image" }
output = "Analysis"
prompt = "Analyze the colors in $photo and the shapes in $painting."
Insert stuff inside a tagged block
If the inserted text is expected to be long, made of several lines or paragraphs, you want it inserted inside a block, possibly a block tagged and delimited with proper syntax as one would do in Markdown documentation. To include stuff as a block, use the "@" prefix.
Example template:
prompt = """
Match the expense with its corresponding invoice:
@expense
@invoices
"""
In the example above, the expense data and the invoices data are obviously made of several lines each, which is why it makes sense to use the "@" prefix in order to have them delimited inside a block. Note that our preprocessor will automatically include the block's title, so it doesn't need to be explicitly written in the prompt.
DO NOT write things like "Here is the expense: @expense". DO write simply "@expense" alone in an isolated line.
Insert stuff inline
If the inserted text is short text and it makes sense to have it inserted directly into a sentence, you want it inserted inline. To insert stuff inline, use the "$" prefix. This will insert the stuff without delimiters and the content will be rendered as plain text.
Example template:
prompt = """
Your goal is to summarize everything related to $topic in the provided text:
@text
Please provide only the summary, with no additional text or explanations.
Your summary should not be longer than 2 sentences.
"""
In the example above, $topic will be inserted inline, whereas @text will be a delimited block. Be sure to make the proper choice of prefix for each insertion.
DO NOT write "$topic" alone in an isolated line. DO write things like "Write an essay about $topic" to include text into an actual sentence.
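The difference between the two prefixes can be pictured with a toy renderer. This is purely illustrative: the title line and the `---` delimiters are made up for the sketch and are not the format produced by the Pipelex preprocessor, and `render_prompt` is a hypothetical helper.

```python
import re


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Expand @name as a delimited block and $name as inline text (illustrative format only)."""

    def replace(match: re.Match) -> str:
        block_name, inline_name = match.group(1), match.group(2)
        if block_name is not None:
            # Block insertion: a title line plus made-up delimiters around the content.
            return f"{block_name}:\n---\n{values[block_name]}\n---"
        # Inline insertion: the raw text, with no delimiters.
        return values[inline_name]

    # A block reference stands alone on its line; an inline one sits inside a sentence.
    # A single pass ensures inserted content is never re-scanned for prefixes.
    return re.sub(r"^@(\w+)$|\$(\w+)", replace, template, flags=re.MULTILINE)


template = "Summarize $topic in:\n@text\n"
rendered = render_prompt(template, {"topic": "cats", "text": "Line one\nLine two"})
print(rendered)
```

The sketch shows why "@" suits multi-line content (it gets its own titled block) while "$" suits short values that read naturally inside a sentence.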
The PipeExtract operator is used to extract text and images from an image or a PDF
[pipe.extract_info]
type = "PipeExtract"
description = "extract the information"
inputs = { document = "Document" } # or { image = "Image" } if it's an image. This is the only input.
output = "Page"
Using Extract Model Settings:
[pipe.extract_with_model]
type = "PipeExtract"
description = "Extract with specific model"
inputs = { document = "Document" }
output = "Page"
model = "base_extract_mistral" # Use predefined extract preset or model alias
Only one input is allowed and it must either be an Image or a PDF. The input can be named anything.
The output concept Page is a native concept, with the structure PageContent. It corresponds to 1 page. Therefore, PipeExtract outputs a ListContent of Page:
class TextAndImagesContent(StuffContent):
    text: TextContent | None
    images: list[ImageContent] | None

class PageContent(StructuredContent): # CONCEPT IS "Page"
    text_and_images: TextAndImagesContent
    page_view: ImageContent | None = None
- text_and_images are the text, and the related images found in the input image or PDF.
- page_view is the screenshot of the whole PDF page/image.
The PipeCompose operator is used to compose text using Jinja2 templates. It supports various output formats including HTML, Markdown, Mermaid diagrams, and more.
Simple Template Composition:
[pipe.compose_report]
type = "PipeCompose"
description = "Compose a report using template"
inputs = { data = "ReportData" }
output = "Text"
template = """
## Report Summary
Based on the analysis:
$data
Generated on: {{ current_date }}
"""
Using Named Templates:
[pipe.use_template]
type = "PipeCompose"
description = "Use a predefined template"
inputs = { content = "Text" }
output = "Text"
template_name = "standard_report_template"
Using Nested Template Section (for more control):
[pipe.advanced_template]
type = "PipeCompose"
description = "Use advanced template settings"
inputs = { data = "ReportData" }
output = "Text"
[pipe.advanced_template.template]
template = "Report: $data"
category = "html"
templating_style = { tag_style = "square_brackets", text_format = "html" }
CRM Email Template:
[pipe.compose_follow_up_email]
type = "PipeCompose"
description = "Compose a personalized follow-up email for CRM"
inputs = { customer = "Customer", deal = "Deal", sales_rep = "SalesRep" }
output = "Text"
template_category = "html"
templating_style = { tag_style = "square_brackets", text_format = "html" }
template = """
Subject: Following up on our $deal.product_name discussion
Hi $customer.first_name,
I hope this email finds you well! I wanted to follow up on our conversation about $deal.product_name from $deal.last_contact_date.
Based on our discussion, I understand that your key requirements are: $deal.customer_requirements
I'm excited to let you know that we can definitely help you achieve your goals. Here's what I'd like to propose:
**Next Steps:**
- Schedule a demo tailored to your specific needs
- Provide you with a customized quote based on your requirements
- Connect you with our implementation team
Would you be available for a 30-minute call this week? I have openings on:
{% for slot in available_slots %}
- {{ slot }}
{% endfor %}
Looking forward to moving this forward together!
Best regards,
$sales_rep.name
$sales_rep.title
$sales_rep.phone | $sales_rep.email
"""
- template: Inline template string (mutually exclusive with template_name and construct)
- template_name: Name of a predefined template (mutually exclusive with template)
- template_category: Template type ("llm_prompt", "html", "markdown", "mermaid", etc.)
- templating_style: Styling options for template rendering
- extra_context: Additional context variables for template
For more control, you can use a nested template section instead of the template field:
- template.template: The template string
- template.category: Template type
- template.templating_style: Styling options
Use the same variable insertion rules as PipeLLM:
- @variable for block insertion (multi-line content)
- $variable for inline insertion (short text)
PipeCompose can also generate StructuredContent objects using the construct section. This mode composes field values from fixed values, variable references, templates, or nested structures.
When to use construct mode:
- You need to output a structured object (not just Text)
- You want to deterministically compose fields from existing data
- No LLM is needed - just data composition and templating
[concept.SalesSummary]
description = "A structured sales summary"
[concept.SalesSummary.structure]
report_title = { type = "text", description = "Title of the report" }
customer_name = { type = "text", description = "Customer name" }
deal_value = { type = "number", description = "Deal value" }
summary_text = { type = "text", description = "Generated summary text" }
[pipe.compose_summary]
type = "PipeCompose"
description = "Compose a sales summary from deal data"
inputs = { deal = "Deal" }
output = "SalesSummary"
[pipe.compose_summary.construct]
report_title = "Monthly Sales Report"
customer_name = { from = "deal.customer_name" }
deal_value = { from = "deal.amount" }
summary_text = { template = "Deal worth $deal.amount with $deal.customer_name" }
There are four ways to define field values in a construct:
1. Fixed Value (literal)
Use a literal value directly:
[pipe.compose_report.construct]
report_title = "Annual Report"
report_year = 2024
is_draft = false
2. Variable Reference (from)
Get a value from working memory using a dotted path:
[pipe.compose_report.construct]
customer_name = { from = "deal.customer_name" }
total_amount = { from = "order.total" }
street_address = { from = "customer.address.street" }
3. Template (template)
Render a Jinja2 template with variable substitution:
[pipe.compose_report.construct]
invoice_number = { template = "INV-$order.id" }
summary = { template = "Deal worth $deal.amount with $deal.customer_name on {{ current_date }}" }
4. Nested Construct
For nested structures, use a TOML subsection:
[pipe.compose_invoice.construct]
invoice_number = { template = "INV-$order.id" }
total = { from = "order.total_amount" }
[pipe.compose_invoice.construct.billing_address]
street = { from = "customer.address.street" }
city = { from = "customer.address.city" }
country = "France"

domain = "invoicing"
[concept.Address]
description = "A postal address"
[concept.Address.structure]
street = { type = "text", description = "Street address" }
city = { type = "text", description = "City name" }
country = { type = "text", description = "Country name" }
[concept.Invoice]
description = "An invoice document"
[concept.Invoice.structure]
invoice_number = { type = "text", description = "Invoice number" }
total = { type = "number", description = "Total amount" }
billing_address = { type = "concept", concept_ref = "invoicing.Address", description = "Billing address" }
[pipe.compose_invoice]
type = "PipeCompose"
description = "Compose an invoice from order and customer data"
inputs = { order = "Order", customer = "Customer" }
output = "Invoice"
[pipe.compose_invoice.construct]
invoice_number = { template = "INV-$order.id" }
total = { from = "order.total_amount" }
[pipe.compose_invoice.construct.billing_address]
street = { from = "customer.address.street" }
city = { from = "customer.address.city" }
country = "France"
- construct: Dictionary mapping field names to their composition rules
- Each field can be:
  - A literal value (string, number, boolean)
  - A dict with `from` key for variable reference
  - A dict with `template` key for template rendering
  - A nested dict for nested structures
Note: You must use either template or construct, not both. They are mutually exclusive.
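The four kinds of construct rules can be read as a small recursive resolution procedure. The sketch below is an illustration under assumptions: working memory is modeled as nested dicts, `lookup` and `build` are hypothetical helpers, and only `$dotted.path` substitution is handled, not full Jinja2.

```python
import re
from typing import Any


def lookup(memory: dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as "customer.address.street" through nested dicts."""
    value: Any = memory
    for key in path.split("."):
        value = value[key]
    return value


def build(construct: dict[str, Any], memory: dict[str, Any]) -> dict[str, Any]:
    """Resolve each construct rule: literal, {from}, {template}, or nested dict."""
    result: dict[str, Any] = {}
    for field, rule in construct.items():
        if not isinstance(rule, dict):
            result[field] = rule  # fixed value
        elif "from" in rule:
            result[field] = lookup(memory, rule["from"])  # variable reference
        elif "template" in rule:
            # Replace each $dotted.path with the string form of the looked-up value.
            result[field] = re.sub(
                r"\$(\w+(?:\.\w+)*)",
                lambda m: str(lookup(memory, m.group(1))),
                rule["template"],
            )
        else:
            result[field] = build(rule, memory)  # nested construct
    return result


memory = {
    "order": {"id": 42, "total_amount": 99.5},
    "customer": {"address": {"street": "1 Rue X", "city": "Paris"}},
}
construct = {
    "invoice_number": {"template": "INV-$order.id"},
    "total": {"from": "order.total_amount"},
    "billing_address": {
        "street": {"from": "customer.address.street"},
        "city": {"from": "customer.address.city"},
        "country": "France",
    },
}
invoice = build(construct, memory)
print(invoice["invoice_number"])  # INV-42
```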
The PipeImgGen operator is used to generate images using AI image generation models.
Simple Image Generation:
[pipe.generate_image]
type = "PipeImgGen"
description = "Generate an image from prompt"
inputs = { prompt = "ImgGenPrompt" }
output = "Image"
Using Image Generation Settings:
[pipe.generate_photo]
type = "PipeImgGen"
description = "Generate a high-quality photo"
inputs = { prompt = "ImgGenPrompt" }
output = "Photo"
model = { model = "fast-img-gen" }
aspect_ratio = "16:9"
quality = "hd"
Multiple Image Generation:
[pipe.generate_variations]
type = "PipeImgGen"
description = "Generate multiple image variations"
inputs = { prompt = "ImgGenPrompt" }
output = "Image[3]"
seed = "auto"
Advanced Configuration:
[pipe.generate_custom]
type = "PipeImgGen"
description = "Generate image with custom settings"
inputs = { prompt = "ImgGenPrompt" }
output = "Image"
model = "img_gen_preset_name" # Use predefined preset
aspect_ratio = "1:1"
quality = "hd"
background = "transparent"
output_format = "png"
is_raw = false
safety_tolerance = 3
Image Generation Settings:
- model: Model choice (preset name or inline settings with model name)
- quality: Image quality ("standard", "hd")
Output Configuration:
- aspect_ratio: Image dimensions ("1:1", "16:9", "9:16", etc.)
- output_format: File format ("png", "jpeg", "webp")
- background: Background type ("default", "transparent")
Generation Control:
- seed: Random seed (integer or "auto")
- is_raw: Whether to apply post-processing
- is_moderated: Enable content moderation
- safety_tolerance: Content safety level (1-6)
PipeImgGen requires exactly one input that must be either:
- An ImgGenPrompt concept
- A concept that refines ImgGenPrompt
The input can be named anything but must contain the prompt text for image generation.
The PipeFunc operator is used to run custom Python functions within a pipeline. This allows integration of classic Python scripts and custom logic.
Simple Function Call:
[pipe.process_data]
type = "PipeFunc"
description = "Process data using custom function"
inputs = { input_data = "DataType" }
output = "ProcessedData"
function_name = "process_data_function"
File Processing Example:
[pipe.read_file]
type = "PipeFunc"
description = "Read file content"
inputs = { file_path = "FilePath" }
output = "FileContent"
function_name = "read_file_content"
- function_name: Name of the Python function to call (must be registered in func_registry)
The Python function must:
- Be registered in the func_registry
- Accept working_memory as a parameter:
  async def my_function(working_memory: WorkingMemory) -> StuffContent | list[StuffContent] | str:
      # Function implementation
      pass
- Return appropriate types:
  - StuffContent: Single content object
  - list[StuffContent]: Multiple content objects (becomes ListContent)
  - str: Simple string (becomes TextContent)
Functions must be registered in the function registry before use:
from pipelex.system.registries.func_registry import func_registry

@func_registry.register("my_function_name")
async def my_custom_function(working_memory: WorkingMemory) -> StuffContent:
    # Access inputs from working memory
    input_data = working_memory.get_stuff("input_name")
    # Process data
    result = process_logic(input_data.content)
    # Return result
    return MyResultContent(data=result)
Inside the function, access pipeline inputs through working memory:
async def process_function(working_memory: WorkingMemory) -> TextContent:
    # Get input stuff by name
    input_stuff = working_memory.get_stuff("input_name")
    # Access the content
    input_content = input_stuff.content
    # Process and return
    processed_text = f"Processed: {input_content.text}"
    return TextContent(text=processed_text)
In order to use it in a pipe, an LLM is referenced by its llm_handle (alias) and possibly by an llm_preset.
LLM configurations are managed through the new inference backend system with files located in .pipelex/inference/:
- Model Deck: `.pipelex/inference/deck/base_deck.toml` and `.pipelex/inference/deck/overrides.toml`
- Backends: `.pipelex/inference/backends.toml` and `.pipelex/inference/backends/*.toml`
- Routing: `.pipelex/inference/routing_profiles.toml`
An llm_handle can be either:
- A direct model name (like "gpt-4o-mini", "claude-3-sonnet") - automatically available for all models loaded by the inference backend system
- An alias - user-defined shortcuts that map to model names, defined in the `[aliases]` section:
[aliases]
base-claude = "claude-4.5-sonnet"
base-gpt = "gpt-5"
base-gemini = "gemini-2.5-flash"
base-mistral = "mistral-medium"
The system first looks for direct model names, then checks aliases if no direct match is found. The system handles model routing through backends automatically.
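The lookup order described above (direct model names first, then aliases) can be sketched as follows. This is an illustration, not Pipelex code: `resolve_llm_handle` is a hypothetical helper, the set of loaded model names is made up, and the error raised for an unknown handle is an assumption.

```python
def resolve_llm_handle(handle: str, model_names: set[str], aliases: dict[str, str]) -> str:
    """Return the model name for a handle: direct names win, aliases are the fallback."""
    if handle in model_names:
        return handle
    if handle in aliases:
        return aliases[handle]
    raise KeyError(f"Unknown llm_handle: {handle!r}")


# Hypothetical set of models loaded by the backends, plus two aliases from the example above.
model_names = {"gpt-5", "claude-4.5-sonnet", "mistral-medium"}
aliases = {"base-claude": "claude-4.5-sonnet", "base-gpt": "gpt-5"}

print(resolve_llm_handle("gpt-5", model_names, aliases))        # gpt-5
print(resolve_llm_handle("base-claude", model_names, aliases))  # claude-4.5-sonnet
```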
Here is an example of using a model to specify which LLM to use in a PipeLLM:
[pipe.hello_world]
type = "PipeLLM"
description = "Write text about Hello World."
output = "Text"
model = { model = "gpt-5", temperature = 0.9 }
prompt = """
Write a haiku about Hello World.
"""
As you can see, to use the LLM, you must also indicate the temperature (float between 0 and 1) and, when needed, max_tokens (either an int or the string "auto").
Presets are meant to record the choice of an LLM with its hyperparameters (temperature and max_tokens) if it's good for a particular task. LLM Presets are skill-oriented.
Examples:
llm_to_engineer = { model = "base-claude", temperature = 1 }
llm_to_extract_invoice = { model = "claude-3-7-sonnet", temperature = 0.1, max_tokens = "auto" }
The interest is that these presets can be used to set the LLM choice in a PipeLLM, like this:
[pipe.extract_invoice]
type = "PipeLLM"
description = "Extract invoice information from an invoice text transcript"
inputs = { invoice_text = "InvoiceText" }
output = "Invoice"
model = "llm_to_extract_invoice"
prompt = """
Extract invoice information from this invoice:
The category of this invoice is: $invoice_details.category.
@invoice_text
"""
The setting here model = "llm_to_extract_invoice" works because "llm_to_extract_invoice" has been declared as an llm_preset in the deck.
A PipeLLM must not reference an LLM preset that does not exist in the deck. If needed, you can add LLM presets.
You can override the predefined llm presets by setting them in .pipelex/inference/deck/overrides.toml.
ALWAYS RUN validation when you are finished writing pipelines: This checks for errors. If there are errors, iterate until it works.
- For a specific bundle/file: `pipelex validate path_to_file.plx`
- For all pipelines: `pipelex validate --all`
- Remember: Ensure your Python virtual environment is activated (typically `.venv` for standard installations) before running `pipelex` commands.
Then, create an example file to run the pipeline in the examples folder.
But don't write documentation unless asked explicitly to.
import asyncio

from pipelex import pretty_print
from pipelex.pipelex import Pipelex
from pipelex.pipeline.execute import execute_pipeline


async def hello_world() -> str:
    """
    This function demonstrates the use of a super simple Pipelex pipeline to generate text.
    """
    # Run the pipe
    pipe_output = await execute_pipeline(
        pipe_code="hello_world",
    )
    return pipe_output.main_stuff_as_str


# start Pipelex
Pipelex.make()
# run sample using asyncio
output_text = asyncio.run(hello_world())
pretty_print(output_text, title="Your first Pipelex output")

import asyncio

from pipelex import pretty_print
from pipelex.pipelex import Pipelex
from pipelex.pipeline.execute import execute_pipeline
from pipelex.core.stuffs.image_content import ImageContent
from my_project.gantt.gantt_struct import GanttChart

SAMPLE_NAME = "extract_gantt"
IMAGE_URL = "assets/gantt/gantt_tree_house.png"


async def extract_gantt(image_url: str) -> GanttChart:
    # Run the pipe
    pipe_output = await execute_pipeline(
        pipe_code="extract_gantt_by_steps",
        inputs={
            "gantt_chart_image": {
                "concept": "gantt.GanttImage",
                "content": ImageContent(url=image_url),
            }
        },
    )
    # Output the result
    return pipe_output.main_stuff_as(content_type=GanttChart)


# start Pipelex
Pipelex.make()
# run sample using asyncio
gantt_chart = asyncio.run(extract_gantt(image_url=IMAGE_URL))
pretty_print(gantt_chart, title="Gantt Chart")

The input memory is a dictionary, where the key is the name of the input variable and the value provides details to make it a stuff object. The relevant definitions are:
StuffContentOrData = dict[str, Any] | StuffContent | list[Any] | str
PipelineInputs = dict[str, StuffContentOrData]
As you can see, the stuff for each input can be defined in several ways, using either structured content or plain data.
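To make the four accepted shapes concrete, here is a toy classifier that mirrors the branches of the type alias. It is an illustration only and does not reproduce how Pipelex resolves inputs: `FakeContent` and `describe_input` are hypothetical names, and the returned labels are descriptive strings, not library behavior.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeContent:
    """Hypothetical stand-in for a StuffContent subclass such as ImageContent."""

    url: str


def describe_input(value: Any) -> str:
    """Classify an input value by the shape used to declare it (illustration only)."""
    if isinstance(value, str):
        return "plain string, treated as text"
    if isinstance(value, dict) and value.keys() >= {"concept", "content"}:
        # Explicit form: the caller names the concept and supplies the content.
        return f"explicit concept: {value['concept']}"
    if isinstance(value, dict):
        return "plain data dict"
    if isinstance(value, list):
        return "list of items"
    if isinstance(value, FakeContent):
        return "content object, concept inferred from its type"
    return "unsupported"
```

For instance, `describe_input({"concept": "gantt.GanttImage", "content": FakeContent(url="x.png")})` takes the explicit-concept branch, while a bare `FakeContent(url="x.png")` takes the content-object branch.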
So here are a few concrete examples of calls to execute_pipeline with various ways to set up the input memory:
# Here we have a single input and it's a Text.
# If you assign a string, by default it will be considered as a TextContent.
pipe_output = await execute_pipeline(
    pipe_code="master_advisory_orchestrator",
    inputs={
        "user_input": problem_description,
    },
)

# Here we have a single input and it's a document.
# Because DocumentContent is a native concept, we can use it directly as a value,
# the system knows what content it corresponds to:
pipe_output = await execute_pipeline(
    pipe_code="power_extractor_dpe",
    inputs={
        "document": DocumentContent(url=pdf_url),
    },
)

# Here we have a single input and it's an Image.
# Because ImageContent is a native concept, we can use it directly as a value:
pipe_output = await execute_pipeline(
    pipe_code="fashion_variation_pipeline",
    inputs={
        "fashion_photo": ImageContent(url=image_url),
    },
)
# Here we have a single input. It's an image, but
# it's actually a more specific concept, gantt.GanttImage, which refines Image,
# so we must provide it using a dict with the concept and the content:
pipe_output = await execute_pipeline(
    pipe_code="extract_gantt_by_steps",
    inputs={
        "gantt_chart_image": {
            "concept": "gantt.GanttImage",
            "content": ImageContent(url=image_url),
        }
    },
)
# Here is a more complex example with multiple inputs assigned in different ways:
pipe_output = await execute_pipeline(
    pipe_code="retrieve_then_answer",
    dynamic_output_concept_code="contracts.Fees",
    inputs={
        "text": load_text_from_path(path=text_path),
        "question": {
            "concept": "answer.Question",
            "content": question,
        },
        "client_instructions": client_instructions,
    },
)

All pipe executions return a PipeOutput object.
It's a BaseModel which contains the resulting working memory at the end of the execution and the pipeline run id.
It also provides a bunch of accessor functions and properties to unwrap the main stuff, which is the last stuff added to the working memory:
class PipeOutput(BaseModel):
    working_memory: WorkingMemory = Field(default_factory=WorkingMemory)
    pipeline_run_id: str = Field(default=SpecialPipelineId.UNTITLED)

    @property
    def main_stuff(self) -> Stuff:
        ...

    def main_stuff_as_list(self, item_type: type[StuffContentType]) -> ListContent[StuffContentType]:
        ...

    def main_stuff_as_items(self, item_type: type[StuffContentType]) -> list[StuffContentType]:
        ...

    def main_stuff_as(self, content_type: type[StuffContentType]) -> StuffContentType:
        ...

    @property
    def main_stuff_as_text(self) -> TextContent:
        ...

    @property
    def main_stuff_as_str(self) -> str:
        ...

    @property
    def main_stuff_as_image(self) -> ImageContent:
        ...

    @property
    def main_stuff_as_text_and_image(self) -> TextAndImagesContent:
        ...

    @property
    def main_stuff_as_number(self) -> NumberContent:
        ...

    @property
    def main_stuff_as_html(self) -> HtmlContent:
        ...

    @property
    def main_stuff_as_mermaid(self) -> MermaidContent:
        ...

As you can see, you can extract any variable from the output working memory.
Simple text as a string:
result = pipe_output.main_stuff_as_str
Structured object (BaseModel):
result = pipe_output.main_stuff_as(content_type=GanttChart)
If it's a list, you can get a ListContent of the specific type:
result_list_content = pipe_output.main_stuff_as_list(item_type=GanttChart)
Or, if you want, you can get the actual items as a regular Python list:
result_list = pipe_output.main_stuff_as_items(item_type=GanttChart)
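The difference between a typed single result and a typed list of items can be pictured with a toy stand-in. This is a sketch under stated assumptions, not the Pipelex implementation: `FakeOutput` and `Task` are hypothetical, and raising `TypeError` on a mismatch is an assumption about how such accessors could behave.

```python
from dataclasses import dataclass
from typing import Any, TypeVar

T = TypeVar("T")


@dataclass
class Task:
    """Hypothetical structured result, standing in for a class like GanttChart."""

    name: str


@dataclass
class FakeOutput:
    """Toy stand-in for PipeOutput that holds only the main stuff's content."""

    content: Any

    def main_stuff_as(self, content_type: type[T]) -> T:
        # Return the content only if it has the requested type.
        if not isinstance(self.content, content_type):
            raise TypeError(f"expected {content_type.__name__}, got {type(self.content).__name__}")
        return self.content

    def main_stuff_as_items(self, item_type: type[T]) -> list[T]:
        # Return a plain Python list after checking every item's type.
        if not isinstance(self.content, list):
            raise TypeError("main stuff is not a list")
        for item in self.content:
            if not isinstance(item, item_type):
                raise TypeError(f"expected items of type {item_type.__name__}")
        return list(self.content)
```

With this stand-in, `FakeOutput(Task("dig")).main_stuff_as(Task)` returns the single `Task`, and `FakeOutput([Task("dig"), Task("build")]).main_stuff_as_items(Task)` returns a regular list of two tasks.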