
feat: Use TOON (Token-Oriented Object Notation) for LLM data serialization #1479

@Ivlad003

We should adopt Token-Oriented Object Notation (TOON) for passing structured data into and out of our LLM/agent workflows. TOON is a compact, schema-aware format that is reported to reduce token usage by 30-60% versus standard JSON on suitable data.

Motivation:

  • Our current JSON-based payloads contain a lot of syntactic overhead (braces, quotes, repeated fields) which consumes tokens when sent to LLMs.
  • By sending the same data in TOON format, we can reduce token cost, lower latency, and fit more data into the model’s input window.
  • Since our architecture already uses JSON internally, the change surface is minimal — only the boundary layer (LLM-calls) needs conversion.
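To make the overhead concrete, here is a rough sketch comparing `JSON.stringify` output with a tabular layout in the spirit of TOON for a small uniform list. `renderTabular` is a hypothetical, simplified helper written only for illustration (no quoting or escaping); it is not the real encoder, and character counts are only a crude proxy for token counts.

```typescript
// Illustrative only: a simplified tabular renderer in the spirit of TOON.
// It does NOT handle quoting, escaping, or nested values.
type Primitive = string | number | boolean | null;
type Row = Record<string, Primitive>;

function renderTabular(name: string, rows: Row[]): string {
  if (rows.length === 0) return `${name}[0]:`;
  // Field names are declared once in the header instead of once per row.
  const fields = Object.keys(rows[0]);
  const header = `${name}[${rows.length}]{${fields.join(",")}}:`;
  const lines = rows.map(
    (row) => "  " + fields.map((f) => String(row[f])).join(",")
  );
  return [header, ...lines].join("\n");
}

const users: Row[] = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
  { id: 3, name: "Carol", role: "user" },
];

const asJson = JSON.stringify({ users });
const asTabular = renderTabular("users", users);

console.log(asTabular);
console.log(`JSON chars: ${asJson.length}, tabular chars: ${asTabular.length}`);
```

The gap grows with the number of rows, because JSON repeats every key per record while the tabular form states the keys once.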

Proposed Solution:

  • Introduce a serialization layer in our Node.js/TypeScript backend that converts JSON → TOON when preparing data for LLM calls, and TOON → JSON when parsing responses.
  • Use the official NPM package @toon-format/toon for encoding/decoding.
  • Update our prompt templates to instruct the model to accept/produce TOON format (e.g., include a brief example header).
  • Benchmark token counts, latency, and cost before and after adoption to validate savings.
  • Provide rollout guidelines: use TOON for flat, uniform data structures (e.g., lists of records); for deeply nested or irregular structures, continue using JSON, since TOON may not yield a benefit there.
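One possible shape for the boundary layer, as a minimal sketch: route flat, uniform lists through a TOON codec and everything else through JSON. The names `Codec`, `isUniformFlatList`, `serializeForLlm`, and `parseFromLlm` are hypothetical. The codec is injected so the sketch runs on its own; in the real backend it would presumably wrap the encode/decode functions from @toon-format/toon, whose exact signatures should be checked against the package documentation.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical codec interface; the real TOON library would be adapted to it.
interface Codec {
  encode(value: Json): string;
  decode(text: string): Json;
}

interface LlmPayload {
  format: "toon" | "json";
  body: string;
}

function isPrimitive(v: Json): boolean {
  return v === null || typeof v !== "object";
}

// True for a non-empty array of objects that share the same keys
// and contain only primitive values.
function isUniformFlatList(value: Json): boolean {
  if (!Array.isArray(value) || value.length === 0) return false;
  let expected: string | undefined;
  for (const item of value) {
    if (item === null || typeof item !== "object" || Array.isArray(item)) return false;
    const keys = Object.keys(item).sort().join("\u0000");
    if (expected === undefined) expected = keys;
    else if (keys !== expected) return false;
    if (!Object.values(item).every(isPrimitive)) return false;
  }
  return true;
}

function serializeForLlm(value: Json, toon: Codec): LlmPayload {
  if (isUniformFlatList(value)) {
    return { format: "toon", body: toon.encode(value) };
  }
  // Nested or irregular data stays as JSON, per the rollout guideline.
  return { format: "json", body: JSON.stringify(value) };
}

function parseFromLlm(payload: LlmPayload, toon: Codec): Json {
  return payload.format === "toon"
    ? toon.decode(payload.body)
    : (JSON.parse(payload.body) as Json);
}

// Stand-in codec so the sketch is runnable without the real package.
const stubCodec: Codec = {
  encode: (v) => "TOON:" + JSON.stringify(v),
  decode: (t) => JSON.parse(t.slice(5)) as Json,
};

const flat = serializeForLlm([{ id: 1, name: "a" }, { id: 2, name: "b" }], stubCodec);
const nested = serializeForLlm({ user: { tags: ["x", "y"] } }, stubCodec);
console.log(flat.format, nested.format);
```

Carrying the `format` tag alongside the body keeps the parse side from having to guess which notation a payload uses. The uniformity check here is deliberately strict; a real rollout might loosen it once benchmarks show where TOON still pays off.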

Acceptance Criteria:

  • Helper functions/modules exist that convert to/from TOON reliably.
  • At least one real-world workflow has been migrated and demonstrates measurable token/cost savings.
  • Documentation/guidance is available for engineers on when to use TOON vs JSON.
  • No regression in model accuracy or handling of structured data.
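The "reliably" criterion could be checked with a round-trip test along these lines. This is a sketch under assumptions: `Codec`, `canonical`, and `roundTrips` are hypothetical names, and the two codecs below are stand-ins (one faithful, one deliberately lossy) rather than the real TOON library.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical codec interface; the real TOON library would be adapted to it.
interface Codec {
  encode(value: Json): string;
  decode(text: string): Json;
}

// Key-order-independent serialization, used only for comparing values.
function canonical(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map((v) => canonical(v)).join(",") + "]";
  const entries = Object.entries(value)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([k, v]) => JSON.stringify(k) + ":" + canonical(v));
  return "{" + entries.join(",") + "}";
}

// True when decode(encode(value)) yields a structurally equal value.
function roundTrips(codec: Codec, value: Json): boolean {
  try {
    return canonical(codec.decode(codec.encode(value))) === canonical(value);
  } catch {
    return false;
  }
}

// Faithful stand-in codec.
const faithful: Codec = {
  encode: (v) => JSON.stringify(v),
  decode: (t) => JSON.parse(t) as Json,
};

// Deliberately lossy stand-in: numbers come back as strings.
const lossy: Codec = {
  encode: (v) =>
    JSON.stringify(v, (_key, x) => (typeof x === "number" ? String(x) : x)),
  decode: (t) => JSON.parse(t) as Json,
};

const sample: Json = [
  { id: 1, name: "Alice", active: true },
  { id: 2, name: "Bob", active: false },
];

console.log(roundTrips(faithful, sample), roundTrips(lossy, sample));
```

Running a check like this over representative payloads from the migrated workflow would catch type-fidelity problems (numbers versus strings, nulls, special characters) before they show up as model-output parsing failures.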

Out of Scope:

  • Replacing JSON for all internal service APIs or storage – this proposal covers only the LLM/agent boundary layer.
  • Building a custom TOON parser from scratch – we leverage the existing supported library.
  • Changing deeply nested or highly relational payloads until initial rollout of simpler structures is validated.

Labels: area:ai-models, enhancement, medium-priority, refactor
