326 changes: 326 additions & 0 deletions fern/01-guide/07-observability/respan.mdx
@@ -0,0 +1,326 @@
---
title: Respan
---
Comment on lines +1 to +3 (Contributor)
⚠️ Potential issue | 🟠 Major

Add required subtitle to frontmatter.

This guide starts with frontmatter but omits subtitle, which is required for every .mdx file.

Proposed fix

```diff
 ---
 title: Respan
+subtitle: Learn how to integrate BAML with Respan observability
 ---
```

As per coding guidelines: "Every .mdx file must start with frontmatter containing title and subtitle in the specified format" and "Subtitles should be concise and short, with some starting with 'Learn how to …' for guides".



[Respan](https://respan.ai) is an open-source observability platform for LLM applications. You can use it alongside BAML to get detailed traces, token usage, cost tracking, and prompt analytics in the Respan dashboard.

There are three ways to integrate BAML with Respan, depending on how much control you need:

1. **Wrap BAML calls with Respan decorators** — easiest, gives you workflow-level traces
2. **Use `on_log_event` to forward events** — real-time event streaming to Respan
3. **Use `Collector` for rich trace data** — most detailed, includes token usage, HTTP requests, and timing

Comment on lines +7 to +12 (Contributor)
⚠️ Potential issue | 🟡 Minor

Tighten wording to a neutral technical tone.

Phrases like “easiest,” “most detailed,” and “simplest approach” read promotional. Prefer neutral, factual phrasing.

As per coding guidelines: "Use scientific research tone—professional, factual, and straightforward" and "Do not use marketing/promotional language".

Also applies to: 36-36

## Prerequisites

- A [Respan](https://platform.respan.ai) account and API key
- BAML installed and configured in your project

<CodeGroup>
```bash Python
pip install respan-tracing
```

```bash TypeScript
npm install @respan/tracing
```
</CodeGroup>

Comment on lines +18 to +27 (Contributor)
⚠️ Potential issue | 🟠 Major

Use CodeBlocks instead of CodeGroup for multi-language snippets.

The page consistently uses CodeGroup, but the docs standard requires CodeBlocks for grouped language examples.

Example migration pattern

````diff
-<CodeGroup>
-```python Python
+<CodeBlocks>
+```python
 ...
-```typescript TypeScript
+```typescript
 ...
-</CodeGroup>
+</CodeBlocks>
````

As per coding guidelines: "Use CodeBlocks component for groups of code with multiple languages, starting with Python as the default language".

Also applies to: 38-92, 97-187, 192-282

Set your Respan API key:

```bash
export RESPAN_API_KEY=your_respan_api_key
```

## Option 1: Wrap BAML Calls with Respan Decorators

The simplest approach. Use Respan's `@workflow` and `@task` decorators around your BAML function calls to get structured traces in the Respan dashboard.

<CodeGroup>
```python Python
from respan_tracing import RespanTelemetry, workflow, task
from baml_client import b

# Initialize Respan telemetry
telemetry = RespanTelemetry(app_name="my-baml-app")

@task(name="extract_resume")
async def extract_resume(text: str):
    return await b.ExtractResume(text)

@task(name="classify_sentiment")
async def classify_sentiment(text: str):
    return await b.ClassifySentiment(text)

@workflow(name="analyze_document")
async def analyze_document(text: str):
    resume = await extract_resume(text)
    sentiment = await classify_sentiment(text)
    return {"resume": resume, "sentiment": sentiment}
```

```typescript TypeScript
import { RespanTelemetry } from '@respan/tracing'
import { b } from 'baml_client'

const respan = new RespanTelemetry({
  apiKey: process.env.RESPAN_API_KEY,
  appName: 'my-baml-app',
})
await respan.initialize()

async function analyzeDocument(text: string) {
  return await respan.withWorkflow(
    { name: 'analyze_document' },
    async () => {
      const resume = await respan.withTask(
        { name: 'extract_resume' },
        () => b.ExtractResume(text)
      )
      const sentiment = await respan.withTask(
        { name: 'classify_sentiment' },
        () => b.ClassifySentiment(text)
      )
      return { resume, sentiment }
    }
  )
}

await analyzeDocument("John Doe, Software Engineer...")
await respan.shutdown()
```
</CodeGroup>

## Option 2: Forward Events with `on_log_event`

BAML fires a callback for every LLM call. You can forward these events to Respan's [trace ingestion API](https://respan.ai/docs/apis/observe/traces/traces-ingest-from-logs) in real time.

<CodeGroup>
```python Python
import os
import requests
from baml_client import b
from baml_client.tracing import trace, on_log_event, flush

RESPAN_API_KEY = os.environ["RESPAN_API_KEY"]
RESPAN_INGEST_URL = "https://api.respan.ai/v1/traces/ingest"

def forward_to_respan(event):
    """Forward BAML log events to Respan."""
    requests.post(
        RESPAN_INGEST_URL,
        headers={
            "Authorization": f"Bearer {RESPAN_API_KEY}",
            "Content-Type": "application/json",
        },
        json=[
            {
                "trace_unique_id": event.metadata.root_event_id,
                "span_unique_id": event.metadata.event_id,
                "span_parent_id": event.metadata.parent_id,
                "span_name": "baml_llm_call",
                "input": event.prompt,
                "output": event.raw_output,
                "start_time": event.start_time,
                "timestamp": event.start_time,
                "metadata": {
                    "source": "baml",
                    "parsed_output": event.parsed_output,
                },
            }
        ],
    )
Comment on lines +109 to +131 (Copilot AI, Mar 24, 2026)
The on_log_event callback runs synchronously in the tracing pipeline; making a blocking network call (requests.post) here will add latency to every LLM call and can stall the app if the ingest endpoint is slow. Consider enqueueing events to a background worker/batch sender, and at minimum set a short timeout + handle non-2xx responses/exceptions.

Suggested change

```python
    try:
        response = requests.post(
            RESPAN_INGEST_URL,
            headers={
                "Authorization": f"Bearer {RESPAN_API_KEY}",
                "Content-Type": "application/json",
            },
            json=[
                {
                    "trace_unique_id": event.metadata.root_event_id,
                    "span_unique_id": event.metadata.event_id,
                    "span_parent_id": event.metadata.parent_id,
                    "span_name": "baml_llm_call",
                    "input": event.prompt,
                    "output": event.raw_output,
                    "start_time": event.start_time,
                    "timestamp": event.start_time,
                    "metadata": {
                        "source": "baml",
                        "parsed_output": event.parsed_output,
                    },
                }
            ],
            timeout=2.0,
        )
        if not (200 <= response.status_code < 300):
            # In a real app, consider using structured logging instead of print
            print(
                f"Respan ingest returned status {response.status_code}: "
                f"{response.text}"
            )
    except requests.exceptions.RequestException as exc:
        # Swallow exceptions to avoid breaking the tracing pipeline
        print(f"Error sending event to Respan: {exc}")
```


# Register the callback — all BAML LLM calls will be forwarded
on_log_event(forward_to_respan)

@trace
async def process_document(text: str):
    result = await b.ExtractResume(text)
    return result

# Don't forget to flush before your app exits
flush()
```

```typescript TypeScript
import { b } from 'baml_client'
import { traceAsync, onLogEvent, flush } from 'baml_client/tracing'

const RESPAN_API_KEY = process.env.RESPAN_API_KEY!
const RESPAN_INGEST_URL = 'https://api.respan.ai/v1/traces/ingest'

// Register the callback — all BAML LLM calls will be forwarded
onLogEvent((event) => {
  fetch(RESPAN_INGEST_URL, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${RESPAN_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify([
      {
        trace_unique_id: event.metadata.rootEventId,
        span_unique_id: event.metadata.eventId,
        span_parent_id: event.metadata.parentId,
        span_name: 'baml_llm_call',
        input: event.prompt,
        output: event.rawOutput,
        start_time: event.startTime,
        timestamp: event.startTime,
        metadata: {
          source: 'baml',
          parsed_output: event.parsedOutput,
        },
      },
    ]),
  })
})

const processDocument = traceAsync('processDocument', async (text: string) => {
  return await b.ExtractResume(text)
})

await processDocument("John Doe, Software Engineer...")
flush()
```
</CodeGroup>
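The blocking-call concern raised in the Copilot review above (a synchronous `requests.post` inside the `on_log_event` callback) can be sketched as a queue-plus-worker pattern. This is a hypothetical, self-contained illustration: `send_to_respan` stands in for the real HTTP POST to the Respan ingest endpoint, and the plain dict stands in for a BAML log event.

```python
import queue
import threading

event_queue: queue.Queue = queue.Queue()
sent = []  # records what the worker "posted" (stand-in for network I/O)

def send_to_respan(payload):
    # Stand-in for requests.post(RESPAN_INGEST_URL, ..., timeout=2.0);
    # swap in the real HTTP call in an actual integration.
    sent.append(payload)

def worker():
    while True:
        item = event_queue.get()
        try:
            if item is None:  # sentinel: shut the worker down
                return
            send_to_respan(item)
        except Exception:
            pass  # never let ingest failures break the tracing pipeline
        finally:
            event_queue.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()

def forward_to_respan(event):
    # Register this with on_log_event: it only enqueues, so the LLM call
    # returns without waiting on the Respan API.
    event_queue.put(event)

forward_to_respan({"span_name": "baml_llm_call"})
event_queue.put(None)  # stop the worker
thread.join()
```

The callback stays cheap because the only work on the hot path is an in-memory `put`; batching and retries can then live entirely in the worker.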

## Option 3: Use Collector for Rich Traces

The [`Collector`](/guide/baml-advanced/collector-track-tokens) gives you the most detailed data — token counts, timing, HTTP request/response details, and retry information. Combine it with Respan's ingestion API for full observability.

<CodeGroup>
```python Python
import os
import requests
from baml_client import b
from baml_py import Collector

RESPAN_API_KEY = os.environ["RESPAN_API_KEY"]
RESPAN_INGEST_URL = "https://api.respan.ai/v1/traces/ingest"

collector = Collector(name="respan-collector")

result = await b.ExtractResume(
    "John Doe, Software Engineer...",
    baml_options={"collector": collector},
)
Comment on lines +204 to +207 (Copilot AI, Mar 24, 2026)
This example uses await at top-level (result = await ...). Top-level await is not valid in normal Python scripts, so readers copying this will get a syntax error. Wrap the snippet in an async def main() + asyncio.run(main()), or remove await and show the synchronous call style used elsewhere in the docs.


# Extract rich data from the collector
log = collector.last
assert log is not None

span = {
    "trace_unique_id": f"baml-{log.id}",
    "span_unique_id": log.id,
    "span_name": log.function_name,
    "input": str(log.raw_llm_response),
    "output": str(result),
Comment on lines +217 to +218 (Copilot AI, Mar 24, 2026)

raw_llm_response is the model’s raw response text, not the request input. Using it as the Respan span input will invert the data in the dashboard. Consider sending the original function args / rendered prompt as input, and raw_llm_response (and/or parsed result) as output or metadata.

    "start_time": log.timing.start_time.isoformat(),
    "latency": log.timing.duration_ms / 1000,
Comment on lines +213 to +220 (Copilot AI, Mar 24, 2026)
log.timing.start_time does not exist on the Python Timing object; the public fields are start_time_utc_ms and duration_ms (and duration_ms can be None). Update this snippet to use start_time_utc_ms (convert to ISO-8601 if needed) and handle a missing duration_ms safely.

Suggested change

```python
from datetime import datetime, timezone

start_time_iso = datetime.fromtimestamp(
    log.timing.start_time_utc_ms / 1000.0, tz=timezone.utc
).isoformat()
latency_seconds = (
    log.timing.duration_ms / 1000.0 if log.timing.duration_ms is not None else None
)
span = {
    "trace_unique_id": f"baml-{log.id}",
    "span_unique_id": log.id,
    "span_name": log.function_name,
    "input": str(log.raw_llm_response),
    "output": str(result),
    "start_time": start_time_iso,
    "latency": latency_seconds,
```

    "model": log.calls[-1].model if log.calls else None,
Comment (Copilot AI, Mar 24, 2026)

log.calls[-1].model is not a field on the Python collector call objects. Use an available field (e.g. client_name / provider info) or omit model if it can’t be derived reliably from the collector call.

Suggested change (remove the line)

```diff
-    "model": log.calls[-1].model if log.calls else None,
```

    "prompt_tokens": log.usage.input_tokens if log.usage else None,
    "completion_tokens": log.usage.output_tokens if log.usage else None,
    "metadata": {
        "source": "baml",
        "function_name": log.function_name,
        "tags": log.tags,
    },
}

requests.post(
    RESPAN_INGEST_URL,
    headers={
        "Authorization": f"Bearer {RESPAN_API_KEY}",
        "Content-Type": "application/json",
    },
    json=[span],
)
```

```typescript TypeScript
import { b } from 'baml_client'
import { Collector } from '@boundaryml/baml'

const RESPAN_API_KEY = process.env.RESPAN_API_KEY!
const RESPAN_INGEST_URL = 'https://api.respan.ai/v1/traces/ingest'

const collector = new Collector('respan-collector')

const result = await b.ExtractResume("John Doe, Software Engineer...", {
  collector,
})

const log = collector.last!

await fetch(RESPAN_INGEST_URL, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${RESPAN_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify([
    {
      trace_unique_id: `baml-${log.id}`,
      span_unique_id: log.id,
      span_name: log.functionName,
      input: log.rawLlmResponse,
      output: JSON.stringify(result),
Comment on lines +266 to +268 (Copilot AI, Mar 24, 2026)
Same issue here: log.rawLlmResponse is the model output, not the input. Mapping it to input will swap request/response in Respan. Prefer using the function args/prompt as input and the raw response / parsed result as output or metadata.

      latency: (log.timing?.durationMs ?? 0) / 1000,
      prompt_tokens: log.usage?.inputTokens,
      completion_tokens: log.usage?.outputTokens,
      metadata: {
        source: 'baml',
        function_name: log.functionName,
        tags: log.tags,
      },
    },
  ]),
})
```
</CodeGroup>
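The top-level `await` flagged in the Copilot review above can be avoided by wrapping the calls in an async entrypoint driven by `asyncio.run`. The sketch below is self-contained: `extract_resume` is a stand-in coroutine for `b.ExtractResume`; in a real project you would call the generated BAML client instead.

```python
import asyncio

async def extract_resume(text):
    # Stand-in for b.ExtractResume(text, baml_options={"collector": collector})
    await asyncio.sleep(0)
    return {"name": text.split(",")[0]}

async def main():
    # `await` is only valid inside a coroutine, so top-level calls go here
    return await extract_resume("John Doe, Software Engineer...")

result = asyncio.run(main())
```

The collector setup and the post-call span extraction shown in the docs would both live inside `main()`, after the awaited call returns.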

## Adding Metadata for Filtering

Respan supports [span attributes](https://respan.ai/docs/integrations/respan-native#attributes) like `customer_identifier` and `trace_group_identifier` for filtering traces in the dashboard. You can combine these with BAML's `set_tags`:

```python
from respan_tracing import RespanTelemetry, workflow, task, get_client
from baml_client import b
from baml_client.tracing import set_tags

telemetry = RespanTelemetry(app_name="my-baml-app")

@workflow(name="process_user_request")
async def process_user_request(user_id: str, text: str):
    # Set Respan attributes for dashboard filtering
    client = get_client()
    client.update_current_span(
        respan_params={
            "customer_identifier": user_id,
            "trace_group_identifier": "resume-extraction",
            "metadata": {"pipeline_version": "v2"},
        }
    )

    # Set BAML tags (visible in Boundary Studio if also enabled)
    set_tags(userId=user_id)

    return await b.ExtractResume(text)
```

## Which Option Should I Use?

| Approach | Effort | Detail Level | Best For |
|----------|--------|-------------|----------|
| **Respan decorators** | Low | Workflow-level spans | Quick setup, structured traces |
| **`on_log_event`** | Medium | Per-LLM-call events | Real-time streaming, custom filtering |
| **`Collector` + API** | Medium | Full LLM metadata (tokens, HTTP, timing) | Cost tracking, debugging, detailed analytics |
Comment on lines +314 to +318 (Copilot AI, Mar 24, 2026)
The comparison table is written with double leading pipes (|| ...), which renders as an extra empty column in Markdown. Use a single leading | for the header/separator/rows so the table displays correctly.


You can also combine approaches — for example, use Respan decorators for workflow structure and the `Collector` for detailed LLM metrics within each task.
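A rough sketch of that combination, using stand-ins for `respan_tracing.task` and `baml_py.Collector` so it runs on its own (the real APIs may differ):

```python
import functools

class Collector:
    """Stand-in for baml_py.Collector: accumulates per-call logs."""
    def __init__(self, name):
        self.name = name
        self.logs = []

def task(name):
    """Stand-in for respan_tracing.task: tags the function with a span name."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            # a real decorator would open a Respan span called `name` here
            return fn(*args, **kwargs)
        wrapped.span_name = name
        return wrapped
    return decorate

@task(name="extract_resume")
def extract_resume(text, collector):
    # Real code would call b.ExtractResume(text, baml_options={"collector": collector})
    collector.logs.append({"function_name": "ExtractResume"})
    return {"name": text.split(",")[0]}

collector = Collector("respan-collector")
result = extract_resume("John Doe, Software Engineer...", collector)
# The decorator provides workflow structure; the collector holds LLM metrics
```

The decorator defines the trace shape in the Respan dashboard, while the collector attached to each call supplies the token and timing detail for that span.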

## Further Reading

- [Respan Tracing SDK docs](https://respan.ai/docs/sdks/python/tracing/overview)
- [Respan trace ingestion API](https://respan.ai/docs/apis/observe/traces/traces-ingest-from-logs)
- [BAML Collector reference](/guide/baml-advanced/collector-track-tokens)
3 changes: 3 additions & 0 deletions fern/docs.yml
@@ -454,6 +454,9 @@ navigation:
      - page: Tracking Usage
        icon: fa-regular fa-bar-chart
        path: 01-guide/07-observability/studio.mdx
      - page: Respan
        icon: fa-regular fa-chart-line
        path: 01-guide/07-observability/respan.mdx
      - section: Comparisons
        contents:
          - page: BAML vs Langchain