SDialog


SDialog is a modular Python library for dialogue modeling, generation, evaluation, and analysis with LLMs. It provides a standard Dialog format with rich metadata, persona-driven multi-agent simulation, orchestration for fine control, evaluation metrics, and built-in mechanistic interpretability support.

Quick links: Docs • API • Tutorials • Demo (Colab) • Issues

🚀 Motivation

Synthetic dialogue generation is increasingly central to creating training data, augmenting datasets, stress-testing systems, and simulating both task-oriented and open-domain interactions. Teams need precise control over personas, contexts, tools, and orchestration to cover long-tail scenarios at scale while preserving privacy and reproducibility. Yet dialogue work is fragmented: every dataset has its own format, every project reinvents agents and prompts, and reproducibility is hard.

The purpose of this project is to make synthetic dialogue generation practical, built with and for the community, by enabling:

  • Standardization and reproducibility: a well-defined schema for Dialog, Persona, Context, Agent, etc., with JSON import/export for auditability, sharing, and benchmarking.
  • Abstractions: simple, composable building blocks for personas, agents, orchestrators, generators, evaluation, and interpretability.
  • Interoperability: the same code works with multiple LLM backends (Ollama, HuggingFace, OpenAI, Google Generative AI, AWS, etc.).
  • Controllability: persona-, context-, and orchestration-driven generation for targeted scenarios and long-tail distribution exploration.
  • Evaluation loop: built-in metrics and LLM-as-judge interfaces to compare synthetic against reference data and guide iteration.
  • Interpretability and safety: native mechanistic interpretability to inspect and steer activations/tokens; supports debugging, bias mitigation, and safe behavior adjustments.

See the quick examples below and our demo notebook for an overview of the core workflow and basic capabilities. For task-focused guides, see the Tutorials folder.

⚡ Installation

pip install sdialog

🏁 Quick start

Define personas and context, create agents, and generate a dialogue:

from sdialog import Context
from sdialog.agents import Agent
from sdialog.personas import Persona

# Personas (built-ins like Doctor/Patient/Customer are also available)
alice = Persona(name="Alice", role="friendly barista", personality="cheerful and helpful")
bob = Persona(name="Bob", role="customer", personality="curious and polite")

# Optional shared context
ctx = Context(location="Downtown cafe", topics=["coffee", "recommendations"])

# Agents
alice_agent = Agent(persona=alice)
bob_agent = Agent(persona=bob)

# Dialogue
dialog = alice_agent.dialog_with(bob_agent, context=ctx)
dialog.print()  # Pretty print the dialog
# dialog.to_file("my_dialog.json")  # Save it as a JSON file
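
Dialogs serialize to and from JSON, so they can be reloaded later; a minimal round-trip sketch, assuming a Dialog.from_file loader as the counterpart of to_file above (an assumption, not confirmed API):

from sdialog.core import Dialog

# Assumption: Dialog.from_file loads a dialog saved with dialog.to_file(...)
loaded = Dialog.from_file("my_dialog.json")
loaded.print()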

Make the same agents talk in a different context:

starship = Context(
  location="Starship",
  environment="futuristic cafeteria",
  objects=[
    "holographic menu board",
    "service droid",
    "zero-g drink dispenser",
  ],
  circumstances="Customer complains the delivered drink isn't the one ordered"
)

dialog = alice_agent.dialog_with(bob_agent, context=starship)
dialog.print()

Check out our demo notebook for a walkthrough of the core workflow and capabilities (generation, evaluation, and interpretability).

🔧 Interoperability

  • SDialog supports many backends (Ollama, HuggingFace, OpenAI, Google Generative AI, AWS), specified as a model string: "BACKEND:MODEL", e.g.:
    • "openai:gpt-4.1"
    • "ollama:gemma3:27b"
    • "aws:anthropic.claude-3-5-sonnet-20240620-v1:0"
    • "huggingface:meta-llama/Llama-3.2-3B-Instruct"

Set a global default LLM for all components:

import sdialog

sdialog.config.llm("ollama:qwen3:14b")

Optionally pass parameters:

sdialog.config.llm("ollama:qwen3:14b", temperature=0.9)

Any parameter supported by the selected backend is allowed, for instance:

sdialog.config.llm(
  "aws:anthropic.claude-3-5-sonnet-20240620-v1:0",
  region_name="us-east-1"
)

👤 Personas

A persona is a structured profile of attributes that defines who an agent is: role, background/expertise, goals, tone, and other metadata that condition how the agent speaks and acts. In SDialog, personas are first-class, serializable objects used by Agents, Generators, and Orchestrators; they can be built-in or custom.

Built-in personas include Persona (generic) and typed ones like Doctor, Patient, Customer, and SupportAgent. For example:

from sdialog.personas import Customer, SupportAgent

customer_persona = Customer(customer_id="12345",
                            issue="Cannot log in to my account",
                            anger_level="high")
support_persona = SupportAgent(politeness="high")

# Pretty print the personas
customer_persona.print()
support_persona.print()

Define your own persona type (simple Pydantic-style fields):

from pydantic import Field
from sdialog.personas import BasePersona

class Librarian(BasePersona):
  name: str = ""
  expertise: str = Field("", description="Primary subject area")
  personality: str = Field("patient and meticulous", description="Key traits shaping tone and behavior")

lib = Librarian(name="Morgan",
                expertise="history")
lib.print()

🤖 Agents

Agents are persona-conditioned conversational actors: they take a persona object when created. They also support hidden thinking and tool use (when the chosen LLM supports them):

from sdialog.agents import Agent

# As an example, let's define two simple (mock) tools our support agent can call
# 1) Fake RAG-like tool
def get_product_documentation(product: str, model: str) -> dict:
    """Retrieve product documentation for a specific product and model."""
    # In a real tool, query your documentation store and return top-k snippets.
    snippets = [
        f"Overview for {product} {model}",
        f"Troubleshooting guide for {product} {model}",
        f"FAQ for {product} {model}"
    ]
    return {"snippets": snippets}

# 2) Fake verification account tool
def verify_account(customer_id: str) -> dict:
    """Verify customer account and return minimal details."""
    return {"customer_id": customer_id, "exists": True}

support_agent = Agent(persona=support_persona,
                      think=True,  # Enable reasoning
                      tools=[get_product_documentation, verify_account],  # And tools!
                      name="AGENT")
customer = Agent(persona=customer_persona,
                 first_utterance="Hi there!",
                 name="USER")

dialog = customer.dialog_with(support_agent, max_turns=10)
dialog.print()

See the Agents with tools and thoughts tutorial for details.

🧪 Generators (personas, context, dialogues)

Context and personas can be generated using an LLM or custom functions to populate their attributes. For instance, create doctors (whose specialty is cardiology) and patients (whose symptom is chest pain):

from sdialog.personas import Doctor, Patient
from sdialog.generators import PersonaGenerator

# Persona generators (by default fill all unspecified fields via LLM)
doc_gen = PersonaGenerator(Doctor(specialty="Cardiology"))
pat_gen = PersonaGenerator(Patient(symptoms="mild chest pain"))

# New doctor and patient each time `generate()` is called
doctor = doc_gen.generate()
patient = pat_gen.generate()

# Pretty print generated personas
doctor.print()
patient.print()

We can then generate a dialogue using Agents as above, or use a single LLM with PersonaDialogGenerator:

from sdialog.generators import PersonaDialogGenerator

# Full dialogue generator for the given personas (no agents)
dlg_gen = PersonaDialogGenerator(doctor, patient, dialogue_details="Keep it short and reassuring.")

# Generate a new dialogue (each call returns a new one)
dialog = dlg_gen()

dialog.print()

The PersonaGenerator takes any persona as input, including user-defined ones:

# Using the Librarian class defined above
lib_gen = PersonaGenerator(Librarian())  # unspecified fields are LLM-filled by default
new_lib = lib_gen.generate()
new_lib.print()

Other utilities: ContextGenerator to generate contexts, and Paraphraser for dataset augmentation.
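
A minimal sketch of both, assuming ContextGenerator mirrors PersonaGenerator's template-then-generate() interface and that Paraphraser is callable on a Dialog (both signatures are assumptions, not confirmed API):

from sdialog import Context
from sdialog.generators import ContextGenerator, Paraphraser

# Assumption: unspecified Context fields are LLM-filled, as with PersonaGenerator
ctx_gen = ContextGenerator(Context(location="Hospital ward"))
new_ctx = ctx_gen.generate()
new_ctx.print()

# Assumption: Paraphraser rewrites a Dialog while preserving its content,
# useful for augmenting a dataset with surface-form variation
paraphraser = Paraphraser()
augmented = paraphraser(dialog)
augmented.print()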

🎛️ Orchestration in one minute

Add simple rules or constraints by composing orchestrators; the | operator attaches them to agents.

from sdialog.orchestrators import SimpleReflexOrchestrator, LengthOrchestrator

# Make Alice react if Bob mentions "cupcakes" and keep the chat between 6 and 10 turns
react = SimpleReflexOrchestrator(
  condition=lambda utt: "cupcakes" in utt.lower(),
  instruction="Politely explain cupcake policy and suggest an alternative"
)
keep_length = LengthOrchestrator(min=6, max=10)

alice_agent = alice_agent | react | keep_length

dialog = alice_agent.dialog_with(bob_agent)
dialog.print(orchestration=True)   # show injected instructions/events

Define your own orchestrator:

from sdialog.orchestrators import BaseOrchestrator

class EncourageDetailOrchestrator(BaseOrchestrator):
  def instruct(self, dialog, utterance):
    if utterance and len(utterance.split()) < 5:
      return "Add a bit more detail in your next reply."
    return None

alice_agent = alice_agent | EncourageDetailOrchestrator()

See the orchestration tutorial for more details.

📊 Evaluation and analysis

Evaluate and compare generated dialogues against a reference set (e.g., human dialogues) using built-in metrics and evaluators (LLM-as-judge, linguistic features, dialog flow), for example:

import sdialog

from sdialog.evaluation import LLMJudgeRealDialog, LinguisticFeatureScore  # scores
from sdialog.evaluation import FrequencyEvaluator, MeanEvaluator           # evaluators
from sdialog.evaluation import DatasetComparator                           # comparator

reference = [...]   # list of reference Dialogs
candidate_a = [...] # list of first candidate Dialogs
candidate_b = [...] # list of second candidate Dialogs

# Instantiate scores
judge = LLMJudgeRealDialog(feedback=True)
flesch = LinguisticFeatureScore(feature="flesch-reading-ease")
gunning = LinguisticFeatureScore(feature="gunning-fog")

# Instantiate comparator with evaluators
comparator = DatasetComparator(evaluators=[
  FrequencyEvaluator(judge, name="Realistic dialog rate"),
  MeanEvaluator(flesch, name="Mean Flesch Reading Ease"),
  MeanEvaluator(gunning, name="Mean Gunning Fog"),
])

# Compare the dialog sets
comparator({
  "reference": reference,
  "candidate_a": candidate_a,
  "candidate_b": candidate_b,
})

# Plot the comparison
comparator.plot()

Create your own score by inheriting from BaseDialogScore and implementing score(dialog):

from sdialog.core import Dialog
from sdialog.evaluation import BaseDialogScore

# Simple custom metric: dialog length in number of turns
class DialogLength(BaseDialogScore):
    def score(self, dialog: Dialog) -> int:
        return len(dialog)
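
The custom score plugs into the same comparator machinery as the built-in ones; a brief sketch, assuming MeanEvaluator accepts any BaseDialogScore and that DialogLength needs no constructor arguments (both assumptions, not confirmed API):

# Reuse the evaluator/comparator pipeline from above with the custom score
comparator = DatasetComparator(evaluators=[
  MeanEvaluator(DialogLength(), name="Mean number of turns"),
])
comparator({"reference": reference, "candidate_a": candidate_a})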

See the evaluation tutorial and the demo for more.

🧠 Mechanistic Interpretability

SDialog natively supports mechanistic interpretability. For instance, it provides an Inspector class to capture and steer internal activations at specific layers/tokens. This enables per-token inspection and controlled, ethical behavior adjustments.

Observe internal activations:

import sdialog
from sdialog.agents import Agent
from sdialog.interpretability import Inspector

sdialog.config.llm("huggingface:meta-llama/Llama-3.2-3B-Instruct")

agent = Agent(name="Bob")
# Inspect activations in the residual stream at layer 16
inspector = Inspector(target="model.layers.16.post_attention_layernorm")
agent = agent | inspector

agent("How are you?")
act = inspector[0][0].act  # activations for the first generated token
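
Per-token inspection follows the same indexing; a small sketch, assuming inspector[0] yields one record per generated token of the first response (an extrapolation of the [0][0] access above, not confirmed API):

import torch

# Assumption: each per-token record exposes .act, as in the single-token access above
acts = torch.stack([token.act for token in inspector[0]])
print(acts.shape)  # e.g., (num_generated_tokens, hidden_size)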

Steer the model with a user-provided direction, e.g., remove anger expression:

import torch

# Target all layers of a 28-layer model
targets = []
for i in range(28):
    targets.append(f"model.layers.{i}.post_attention_layernorm")
    targets.append(f"model.layers.{i}.mlp")
    targets.append(f"model.layers.{i}")

intruder = Inspector(target=targets)

anger_direction = torch.load("anger_direction.pt")  # your direction vector
agent_steered = agent | intruder - anger_direction  # ablate the anger direction across layers

agent_steered("You are an extremely upset assistant")  # anger is no longer part of the activation space

See the tutorials for worked examples: our demo notebook (Mechanistic Interpretability example) and the tutorial on removing refusal behavior from Llama 3.

Note: use these tools for research and safety improvements only; do not attempt to bypass model safety mechanisms.

📖 Documentation and tutorials

Full documentation, the API reference, and step-by-step tutorials are available via the quick links at the top of this README.

💪 Contributors 😎👍

We welcome issues, feature requests, and pull requests. If you want to add personas, agents, orchestrators, generators, evaluators, or tutorials, please open an issue or submit a PR.

This project follows the all-contributors specification. Contributions of any kind welcome!

All-contributors list:

  • Sergio Burdisso: 💻 🤔 📖 ✅
  • Labrak Yanis: 💻 🤔
  • Séverin: 💻 🤔 ✅
  • Ricard Marxer: 💻 🤔
  • Thomas Schaaf: 🤔 💻
  • David Liu: 💻
  • ahassoo1: 🤔 💻
  • Pawel Cyrta: 💻 🤔
  • ABCDEFGHIJKL: 💻

🙏 Acknowledgments

This work was supported by the EU Horizon 2020 project ELOQUENCE (grant number 101070558).

The initial development of this project began in preparation for the 2025 Jelinek Memorial Summer Workshop on Speech and Language Technologies (JSALT 2025). Further improvements and enhancements were made during the Workshop as part of the "Play your Part" research group.

📝 License

MIT License
Copyright (c) 2025 Idiap Research Institute
