1 change: 1 addition & 0 deletions content/integrations/frameworks/meta.json
@@ -28,6 +28,7 @@
 "pydantic-ai",
 "quarkus-langchain4j",
 "ragas",
+"restate",
 "semantic-kernel",
 "smolagents",
 "spring-ai",
151 changes: 151 additions & 0 deletions content/integrations/frameworks/restate.mdx
@@ -0,0 +1,151 @@
---
title: Trace Restate Workflows with Langfuse
sidebarTitle: Restate
logo: /images/integrations/restate_icon.png
description: Learn how to use Langfuse to monitor and evaluate resilient Restate agentic workflows via OpenTelemetry
category: Integrations
---

# Trace Restate Workflows with Langfuse

This guide shows how to integrate Langfuse into your [Restate](https://restate.dev/) agentic workflows for full observability — LLM calls, tool invocations, and durable workflow steps — all in a single unified trace.

> **What is Restate?** [Restate](https://restate.dev/) is a durable execution platform that makes agents and workflows resumable and resilient. Every non-deterministic action (LLM calls, tool API calls, MCP calls) is persisted in a durable journal. On failure, Restate replays the journal and resumes where it left off — with automatic retries, recovery, and idempotent execution.

> **What is Langfuse?** [Langfuse](https://langfuse.com/) is an open-source observability platform for AI agents. It helps you monitor LLM calls, tool usage, cost, latency, and run automated evaluations.

## Versioning

Restate's [versioning model](https://docs.restate.dev/operate/versioning) ensures that new deployments route new requests to the latest version, while ongoing executions continue on the version they started with. This means each Langfuse trace is linked to a single immutable artifact — one code version, one prompt version, one execution history — making it straightforward to compare quality across versions and spot regressions.

## 1. Install Dependencies

```bash
pip install "restate-sdk[serde]" openai-agents langfuse openinference-instrumentation-openai-agents hypercorn
```

## 2. Configure Environment

Set up your API keys. You can get Langfuse keys from [Langfuse Cloud](https://langfuse.com/cloud) or by [self-hosting Langfuse](https://langfuse.com/self-hosting).

```bash filename=".env"
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com
OPENAI_API_KEY=sk-proj-...
```
## 3. Define the Agent

Use [Restate's OpenAI Agents SDK integration](https://docs.restate.dev/ai/sdk-integrations/openai-agents-sdk) to make agent steps durable. `DurableRunner` persists each LLM call in Restate's journal, so failed executions resume where they left off instead of restarting from scratch.

```python filename="agent.py"
import restate
from agents import Agent
from restate.ext.openai import restate_context, DurableRunner, durable_function_tool

# ClaimDocument, ClaimData, ClaimAssessment and the helpers query_fraud_db,
# convert_currency, and reimburse are defined elsewhere in the example project.

# Durable tool — executed exactly once, even across retries
@durable_function_tool
async def check_fraud_database(customer_id: str) -> dict[str, str]:
    """Check the claim against the fraud database."""
    return await restate_context().run_typed(
        "Query fraud DB", query_fraud_db, claim_id=customer_id
    )

# LLM agents
parse_agent = Agent(
    name="DocumentParser",
    instructions="Extract the customer ID, claim amount, currency, category, and description.",
    output_type=ClaimData,
)

analysis_agent = Agent(
    name="ClaimsAnalyst",
    instructions="Assess whether this claim is valid and provide detailed reasoning.",
    output_type=ClaimAssessment,
    tools=[check_fraud_database],
)

# Main orchestrator
claim_service = restate.Service("InsuranceClaimAgent")

@claim_service.handler()
async def run(ctx: restate.Context, req: ClaimDocument) -> str:
    # Step 1: Parse the claim document (LLM step)
    parsed = await DurableRunner.run(parse_agent, req.text)
    claim: ClaimData = parsed.final_output

    # Step 2: Analyze the claim (LLM step)
    response = await DurableRunner.run(analysis_agent, claim.model_dump_json())
    assessment: ClaimAssessment = response.final_output

    if not assessment.valid:
        return "Claim rejected"

    # Step 3: Convert currency (regular durable step, no LLM)
    converted = await ctx.run_typed(
        "Convert currency", convert_currency, amount=claim.amount
    )

    # Step 4: Process reimbursement (regular durable step, no LLM)
    await ctx.run_typed("Reimburse", reimburse, amount=converted)

    return "Claim reimbursed"
```
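
The data models referenced above are not shown in the snippet. A minimal sketch of what they might look like, assuming Pydantic models with illustrative field names (not taken verbatim from the example repo):

```python
from pydantic import BaseModel

class ClaimDocument(BaseModel):
    text: str  # raw claim text submitted by the customer

class ClaimData(BaseModel):
    customer_id: str
    amount: float
    currency: str
    category: str
    description: str

class ClaimAssessment(BaseModel):
    valid: bool
    reasoning: str
```

Structured outputs like these are what `output_type=` hands back via `final_output`, so the handler can branch on `assessment.valid` instead of parsing free-form model text.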

## 4. Enable Langfuse Tracing

Initialize the Langfuse client and set up the tracing processor. This connects the OpenAI Agents SDK spans to Restate's execution traces, so everything appears as a single unified trace in Langfuse.

```python filename="__main__.py"
import asyncio

import hypercorn
import hypercorn.asyncio  # the asyncio submodule must be imported explicitly
import restate

from langfuse import get_client
from opentelemetry import trace as trace_api
from openinference.instrumentation import OITracer, TraceConfig
from agents import set_trace_processors

from utils.tracing import RestateTracingProcessor
from agent import claim_service

# Initialize Langfuse (sets up the global OTel tracer provider + exporter)
langfuse = get_client()
tracer = OITracer(
    trace_api.get_tracer("openinference.openai_agents"), config=TraceConfig()
)
set_trace_processors([RestateTracingProcessor(tracer)])

if __name__ == "__main__":
    app = restate.app(services=[claim_service])

    conf = hypercorn.Config()
    conf.bind = ["0.0.0.0:9080"]
    asyncio.run(hypercorn.asyncio.serve(app, conf))
```
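
With the service listening on port 9080, you can register it with a running Restate server and invoke the handler through the ingress. A sketch assuming a local `restate-server` with default ports (8080 for the ingress); the claim text is illustrative:

```shell
# Register the deployment with the Restate server
restate deployments register http://localhost:9080

# Invoke the handler via the Restate ingress (POST /<Service>/<handler>)
curl localhost:8080/InsuranceClaimAgent/run \
  --json '{"text": "Rear bumper damage after parking collision, 850 EUR"}'
```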
The `RestateTracingProcessor` (available in [Restate's example repo](https://github.com/restatedev/ai-examples/blob/main/openai-agents/examples/langfuse/utils/tracing.py)) flattens the OpenAI Agents SDK spans under Restate's parent span, so the trace hierarchy in Langfuse mirrors the actual execution flow.

Restate also exports its own execution traces (workflow steps, retries, recovery) as OpenTelemetry spans. By pointing Restate's tracing endpoint at Langfuse, both agentic and workflow spans appear in the same trace.
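
Langfuse ingests OTLP traces at `/api/public/otel`, authenticated with Basic auth over your public and secret keys. As a rough sketch using the standard OTLP exporter environment variables (how the Restate server picks up its tracing endpoint is version-dependent, so check the Restate server docs for the authoritative configuration):

```shell
# Basic auth credential: base64 of "<public-key>:<secret-key>"
AUTH=$(echo -n "pk-lf-...:sk-lf-..." | base64)

# Standard OTLP exporter settings pointing at Langfuse's OTel endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT="https://cloud.langfuse.com/api/public/otel"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic ${AUTH}"
```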

## 5. View Traces in Langfuse

After running the workflow, the trace in Langfuse shows both the agentic steps and the workflow steps. For LLM calls, you can inspect inputs, prompts, model configuration, and outputs.

## Prompt Management with Restate

You can use [Langfuse Prompt Management](/docs/prompt-management/overview) with Restate. Each prompt fetch becomes a durable step — retries reuse the same prompt, while new executions pick up updated versions.

```python
import restate
from langfuse import get_client

from agent import claim_service  # the service defined in section 3

langfuse = get_client()

def fetch_prompt() -> str:
    prompt = langfuse.get_prompt("claim-agent", type="text")
    return prompt.compile()

# ctx.run_typed must be called from inside a Restate handler
@claim_service.handler()
async def assess_claim(ctx: restate.Context, req: dict) -> str:
    # Durably journaled — same prompt is used on retries
    prompt = await ctx.run_typed("Fetch prompt", fetch_prompt)
    ...
```
Binary file added public/images/integrations/restate_icon.png