SDialog is a modular Python library for dialogue modeling, generation, evaluation, and analysis with LLMs. It provides a standard Dialog format with rich metadata, persona-driven multi-agent simulation, orchestration for fine control, evaluation metrics, and built-in mechanistic interpretability support.
Quick links: Docs • API • Tutorials • Demo (Colab) • Issues
Synthetic dialogue generation is increasingly central to creating training data, augmenting datasets, stress-testing systems, and simulating both task-oriented and open-domain interactions. Teams need precise control over personas, contexts, tools, and orchestration to cover long-tail scenarios at scale while preserving privacy and reproducibility. Yet dialogue work is fragmented: every dataset has its own format, every project reinvents agents and prompts, and reproducibility is hard.
The purpose of this project is to make synthetic dialogue generation practical, built with and for the community, by enabling:
- Standardization and reproducibility: a well-defined schema for Dialog, Personas, Context, Agents, etc., with JSON import/export serialization for auditability, sharing, and benchmarking.
- Abstractions: simple, composable building blocks for personas, agents, orchestrators, generators, evaluation, and interpretability.
- Interoperability: the same code works with multiple LLM backends (Ollama, HuggingFace, OpenAI, Google Generative AI, AWS, etc.).
- Controllability: persona-, context-, and orchestration-driven generation for targeted scenarios and long-tail distribution exploration.
- Evaluation loop: built-in metrics and LLM-as-judge interfaces to compare synthetic against reference data and guide iteration.
- Interpretability and safety: native mechanistic interpretability to inspect and steer activations/tokens; supports debugging, bias mitigation, and safe behavior adjustments.
See the quick examples below and our demo notebook for an overview of the core workflow and basic capabilities. For task-focused guides, see the Tutorials folder.
pip install sdialog
Define personas and context, create agents, and generate a dialogue:
from sdialog import Context
from sdialog.agents import Agent
from sdialog.personas import Persona
# Personas (built-ins like Doctor/Patient/Customer are also available)
alice = Persona(name="Alice", role="friendly barista", personality="cheerful and helpful")
bob = Persona(name="Bob", role="customer", personality="curious and polite")
# Optional shared context
ctx = Context(location="Downtown cafe", topics=["coffee", "recommendations"])
# Agents
alice_agent = Agent(persona=alice)
bob_agent = Agent(persona=bob)
# Dialogue
dialog = alice_agent.dialog_with(bob_agent, context=ctx)
dialog.print() # Pretty print the dialog
# dialog.to_file("my_dialog.json") # Save it as a JSON file
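Dialogs are plain serializable objects, so saved runs can be reloaded later for analysis or evaluation. A minimal round-trip sketch (Dialog.from_file is an assumption here, as the loading counterpart of to_file; check the API reference for the exact name):

from sdialog.core import Dialog  # import path as used in the evaluation example below

dialog.to_file("my_dialog.json")               # save with metadata
restored = Dialog.from_file("my_dialog.json")  # hypothetical loader
restored.print()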
Make the same agents talk in a different context:
starship = Context(
    location="Starship",
    environment="futuristic cafeteria",
    objects=[
        "holographic menu board",
        "service droid",
        "zero-g drink dispenser",
    ],
    circumstances="Customer complains the delivered drink isn't the one ordered"
)
dialog = alice_agent.dialog_with(bob_agent, context=starship)
dialog.print()
Check out our demo notebook for a hands-on tour of the core workflow and capabilities (generation, evaluation, and interpretability).
- SDialog supports many backends (Ollama, HuggingFace, OpenAI, Google Generative AI, AWS), specified as a model string: "BACKEND:MODEL", e.g.:
- "openai:gpt-4.1"
- "ollama:gemma3:27b"
- "aws:anthropic.claude-3-5-sonnet-20240620-v1:0"
- "huggingface:meta-llama/Llama-3.2-3B-Instruct"
Set a global default LLM for all components:
import sdialog
sdialog.config.llm("ollama:qwen3:14b")
Optionally pass parameters:
sdialog.config.llm("ollama:qwen3:14b", temperature=0.9)
Any parameter supported by the selected backend is allowed, for instance:
sdialog.config.llm(
    "aws:anthropic.claude-3-5-sonnet-20240620-v1:0",
    region_name="us-east-1"
)
A persona is a structured profile defining who an agent is: role, background/expertise, goals, tone, and other attributes that condition how the agent speaks and acts. In SDialog, personas are first-class, serializable objects used by Agents, Generators, and Orchestrators; they can be built-in or custom.
Built-in personas include Persona (generic) and typed ones like Doctor, Patient, Customer, and SupportAgent. For example:
from sdialog.personas import Customer, SupportAgent
customer_persona = Customer(customer_id="12345",
                            issue="Cannot log in to my account",
                            anger_level="high")
support_persona = SupportAgent(politeness="high")
# Pretty print the personas
customer_persona.print()
support_persona.print()
Define your own persona type (simple Pydantic-style fields):
from pydantic import Field
from sdialog.personas import BasePersona
class Librarian(BasePersona):
    name: str = ""
    expertise: str = Field("", description="Primary subject area")
    personality: str = Field("patient and meticulous", description="Key traits shaping tone and behavior")

lib = Librarian(name="Morgan",
                expertise="history")
lib.print()
Agents are persona-conditioned conversational actors; each takes a persona object when created. They also support hidden thinking and tool use (if the chosen LLM supports them):
from sdialog.agents import Agent
# As an example, let's define two simple (mock) tools our support agent can call
# 1) Fake RAG-like tool
def get_product_documentation(product: str, model: str) -> dict:
    """Retrieve product documentation for a specific product and model."""
    # In a real tool, query your documentation store and return top-k snippets.
    snippets = [
        f"Overview for {product} {model}",
        f"Troubleshooting guide for {product} {model}",
        f"FAQ for {product} {model}"
    ]
    return {"snippets": snippets}

# 2) Fake account verification tool
def verify_account(customer_id: str) -> dict:
    """Verify a customer account and return minimal details."""
    return {"customer_id": customer_id, "exists": True}

support_agent = Agent(persona=support_persona,
                      think=True,  # Enable reasoning
                      tools=[get_product_documentation, verify_account],  # And tools!
                      name="AGENT")
customer = Agent(persona=customer_persona,
                 first_utterance="Hi there!",
                 name="USER")
dialog = customer.dialog_with(support_agent, max_turns=10)
dialog.print()
See the Agents with tools and thoughts tutorial for details.
Contexts and personas can be generated with an LLM (or custom functions) that populates their attributes. For instance, create doctors (specialty: cardiology) and patients (symptom: mild chest pain):
from sdialog.personas import Doctor, Patient
from sdialog.generators import PersonaGenerator
# Persona generators (by default fill all unspecified fields via LLM)
doc_gen = PersonaGenerator(Doctor(specialty="Cardiology"))
pat_gen = PersonaGenerator(Patient(symptoms="mild chest pain"))
# New doctor and patient each time `generate()` is called
doctor = doc_gen.generate()
patient = pat_gen.generate()
# Pretty print generated personas
doctor.print()
patient.print()
We can then generate a dialogue using Agents as above, or use a single LLM with PersonaDialogGenerator:
from sdialog.generators import PersonaDialogGenerator
# Full dialogue generator for the given personas (no agents)
dlg_gen = PersonaDialogGenerator(doctor, patient, dialogue_details="Keep it short and reassuring.")
# Generate a new dialogue (each call returns a new one)
dialog = dlg_gen()
dialog.print()
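Since each call returns a fresh dialogue, a small synthetic set can be batch-generated and saved with the serialization shown earlier:

# Generate and save ten dialogues (each call yields a new one)
dialogs = [dlg_gen() for _ in range(10)]
for i, d in enumerate(dialogs):
    d.to_file(f"dialog_{i}.json")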
The PersonaGenerator takes any persona as input, including user-defined ones:
# Using the Librarian class defined above
lib_gen = PersonaGenerator(Librarian()) # unspecified fields are LLM-filled by default
new_lib = lib_gen.generate()
new_lib.print()
Other utilities include ContextGenerator for generating contexts and Paraphraser for dataset augmentation.
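By analogy with PersonaGenerator, a minimal usage sketch (the constructors and call signatures below are assumptions, not the documented API; see the API reference):

from sdialog import Context
from sdialog.generators import ContextGenerator, Paraphraser

# Assumed: unspecified Context fields are LLM-filled, as with PersonaGenerator
ctx_gen = ContextGenerator(Context(location="Pharmacy"))
new_ctx = ctx_gen.generate()

# Assumed: Paraphraser rewrites an existing Dialog for augmentation
paraphraser = Paraphraser()
augmented = paraphraser(dialog)  # `dialog` from the earlier examples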
Add simple rules or constraints by composing orchestrators; the | operator attaches them to agents.
from sdialog.orchestrators import SimpleReflexOrchestrator, LengthOrchestrator
# Make Alice react if Bob mentions "cupcakes" and keep the chat between 6 and 10 turns
react = SimpleReflexOrchestrator(
    condition=lambda utt: "cupcakes" in utt.lower(),
    instruction="Politely explain cupcake policy and suggest an alternative"
)
keep_length = LengthOrchestrator(min=6, max=10)
alice_agent = alice_agent | react | keep_length
dialog = alice_agent.dialog_with(bob_agent)
dialog.print(orchestration=True) # show injected instructions/events
Define your own orchestrator:
from sdialog.orchestrators import BaseOrchestrator
class EncourageDetailOrchestrator(BaseOrchestrator):
    def instruct(self, dialog, utterance):
        # Ask for more detail when the last utterance is very short
        if utterance and len(utterance.split()) < 5:
            return "Add a bit more detail in your next reply."
        return None
alice_agent = alice_agent | EncourageDetailOrchestrator()
See the orchestration tutorial for more details.
Evaluate and compare generated dialogues against a reference set (e.g., human reference) using built-in metrics and evaluators (LLM-as-judge, linguistic features, dialog-flow), for example:
import sdialog
from sdialog.evaluation import LLMJudgeRealDialog, LinguisticFeatureScore # scores
from sdialog.evaluation import FrequencyEvaluator, MeanEvaluator # evaluators
from sdialog.evaluation import DatasetComparator # comparator
reference = [...] # list of reference Dialogs
candidate_a = [...] # list of first candidate Dialogs
candidate_b = [...] # list of second candidate Dialogs
# Instantiate scores
judge = LLMJudgeRealDialog(feedback=True)
flesch = LinguisticFeatureScore(feature="flesch-reading-ease")
gunning = LinguisticFeatureScore(feature="gunning-fog")
# Instantiate comparator with evaluators
comparator = DatasetComparator(evaluators=[
    FrequencyEvaluator(judge, name="Realistic dialog rate"),
    MeanEvaluator(flesch, name="Mean Flesch Reading Ease"),
    MeanEvaluator(gunning, name="Mean Gunning Fog"),
])
# Compare the dialog sets
comparator({
    "reference": reference,
    "candidate_a": candidate_a,
    "candidate_b": candidate_b,
})
# Plot the comparison
comparator.plot()
Create your own score by inheriting from BaseDialogScore and implementing score(dialog):
from sdialog.core import Dialog
from sdialog.evaluation import BaseDialogScore
# Simple custom metric: dialogue length (number of turns)
class DialogLength(BaseDialogScore):
    def score(self, dialog: Dialog) -> int:
        return len(dialog)
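A custom score plugs into the same evaluators and comparator as the built-ins, for example (assuming the BaseDialogScore constructor needs no arguments):

from sdialog.evaluation import DatasetComparator, MeanEvaluator

comparator = DatasetComparator(evaluators=[
    MeanEvaluator(DialogLength(), name="Mean dialog length (turns)"),
])
comparator({"reference": reference, "candidate_a": candidate_a})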
See the evaluation tutorial and the demo for more.
SDialog natively supports mechanistic interpretability. For instance, it provides an Inspector class to capture and steer internal activations at specific layers/tokens. This enables per-token inspection and controlled, ethical behavior adjustments.
Observe internal activations:
import sdialog
from sdialog.agents import Agent
from sdialog.interpretability import Inspector
sdialog.config.llm("huggingface:meta-llama/Llama-3.2-3B-Instruct")
agent = Agent(name="Bob")
# Inspect activations in the residual stream at layer 16
inspector = Inspector(target="model.layers.16.post_attention_layernorm")
agent = agent | inspector
agent("How are you?")
act = inspector[0][0].act # activations for the first generated token
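Captured activations can also be used to derive steering directions like the one loaded in the next example. A hypothetical sketch of the standard mean-difference approach, reusing the indexing pattern above (inspector[-1][0].act for the first token of the latest response is an assumed convention; the actual buffering API may differ):

import torch

def mean_first_token_act(prompts):
    """Mean activation of the first generated token over a set of prompts."""
    acts = []
    for prompt in prompts:
        agent(prompt)
        acts.append(inspector[-1][0].act)  # assumed: last response, first token
    return torch.stack(acts).mean(dim=0)

angry = mean_first_token_act(["Reply as if you are furious.", "Express outrage."])
calm = mean_first_token_act(["Reply calmly and kindly.", "Express gratitude."])

# Direction pointing from "calm" toward "angry" in activation space
anger_direction = angry - calm
torch.save(anger_direction, "anger_direction.pt")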
Steer the model with a user-provided direction, e.g., remove anger expression:
import torch
# Target all layers of a 28-layer model
targets = []
for i in range(28):
    targets.append(f"model.layers.{i}.post_attention_layernorm")
    targets.append(f"model.layers.{i}.mlp")
    targets.append(f"model.layers.{i}")
intruder = Inspector(target=targets)
anger_direction = torch.load("anger_direction.pt") # your direction vector
agent_steered = agent | intruder - anger_direction # ablate the anger direction across layers
agent_steered("You are an extremely upset assistant") # anger is no longer part of the activation space
See tutorials for worked examples: our demo notebook (Mechanistic Interpretability example) and the tutorial to remove refusal capacity from Llama 3.
Note: use these tools for research and safety improvements only; do not attempt to bypass model safety mechanisms.
- Documentation: https://sdialog.readthedocs.io
- API reference: https://sdialog.readthedocs.io/en/latest/api/index.html
- Tutorials (Jupyter): https://github.com/idiap/sdialog/tree/main/tutorials
We welcome issues, feature requests, and pull requests. If you want to add personas, agents, orchestrators, generators, evaluators, or tutorials, please open an issue or submit a PR.
This project follows the all-contributors specification. Contributions of any kind welcome!
All-contributors list:
Sergio Burdisso | Labrak Yanis | Séverin | Ricard Marxer | Thomas Schaaf | David Liu | ahassoo1 | Pawel Cyrta | ABCDEFGHIJKL
This work was supported by the EU Horizon 2020 project ELOQUENCE (grant number 101070558).
The initial development of this project began in preparation for the 2025 Jelinek Memorial Summer Workshop on Speech and Language Technologies (JSALT 2025). Further improvements and enhancements were made during the Workshop as part of the "Play your Part" research group.
MIT License
Copyright (c) 2025 Idiap Research Institute