Add tokenization benchmarks comparing Links Notation with JSON, YAML, XML #210

konard · 2026-01-21T10:23:16Z

Summary

This PR implements UTF-8 character count benchmarks comparing Links Notation (lino) against JSON, YAML, and XML formats. The benchmarks are implemented in all six supported languages and include a GitHub Actions workflow for automatic report generation.

Key Changes

- New benchmark implementations in all 6 supported languages:
  - Rust (primary, used in CI/CD for auto-generating BENCHMARK_RESULTS.md)
  - JavaScript
  - Python
  - C#
  - Go
  - Java
- 5 benchmark test cases covering different data structures:
  - employees - Employee records with nested structure
  - simple_doublets - Simple doublet links (2-tuples)
  - triplets - Triplet relations (3-tuples)
  - nested_structure - Deeply nested company structure
  - config - Application configuration
- GitHub Actions workflow (benchmarks.yml) that:
  - Runs the Rust benchmark on push to main
  - Automatically commits updated BENCHMARK_RESULTS.md if results change
  - Validates all language implementations in CI
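
As a rough illustration of what a character count benchmark measures, here is a minimal Python sketch. It is not the PR's implementation: the sample data, the lino-style string, and the helper names `char_count` and `byte_count` are assumptions made for this example. It also shows that counting code points and counting UTF-8 bytes are different measurements once the text contains non-ASCII characters.

```python
import json

# Hypothetical sample data: three doublets (2-tuples). The lino-style text
# below is illustrative only and is not taken from the PR's data/ files.
doublets = [["1", "2"], ["2", "3"], ["3", "1"]]
lino_text = "\n".join(f"({source} {target})" for source, target in doublets)
json_text = json.dumps(doublets)


def char_count(text: str) -> int:
    """Count Unicode code points (what Python's len() reports for str)."""
    return len(text)


def byte_count(text: str) -> int:
    """Count bytes after UTF-8 encoding; differs from char_count for non-ASCII text."""
    return len(text.encode("utf-8"))


for name, text in (("lino", lino_text), ("json", json_text)):
    print(f"{name}: {char_count(text)} characters, {byte_count(text)} bytes")
```

For ASCII-only input the two counts agree, so the choice only matters when the test data contains characters outside ASCII.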

Benchmark Results

| Format | Total Characters | vs Lino |
|--------|------------------|---------|
| Lino   | 734              | -       |
| JSON   | 1332             | +81.5%  |
| YAML   | 920              | +25.3%  |
| XML    | 1882             | +156.4% |

Average savings with Lino:

vs JSON: 47.9% fewer characters
vs YAML: 21.5% fewer characters
vs XML: 61.5% fewer characters
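
The percentages can be cross-checked from the reported totals. The sketch below (helper names `overhead_pct` and `savings_pct` are hypothetical) reproduces the "vs Lino" column exactly. Savings computed from the totals come out slightly lower than the averages listed above (44.9%, 20.2%, 61.0%), which would be consistent with the listed averages being means of per-case savings rather than ratios of totals; that reading is an assumption, not something the PR states.

```python
# Totals as reported in the Benchmark Results table.
totals = {"Lino": 734, "JSON": 1332, "YAML": 920, "XML": 1882}


def overhead_pct(other: int, lino: int) -> float:
    """How much larger the other format is, relative to the Lino total."""
    return (other - lino) / lino * 100


def savings_pct(other: int, lino: int) -> float:
    """How much smaller Lino is, relative to the other format's total."""
    return (other - lino) / other * 100


lino = totals["Lino"]
for name in ("JSON", "YAML", "XML"):
    other = totals[name]
    print(
        f"{name}: +{overhead_pct(other, lino):.1f}% vs Lino, "
        f"{savings_pct(other, lino):.1f}% fewer characters with Lino (from totals)"
    )
# JSON: +81.5% vs Lino, 44.9% fewer characters with Lino (from totals)
# YAML: +25.3% vs Lino, 20.2% fewer characters with Lino (from totals)
# XML: +156.4% vs Lino, 61.0% fewer characters with Lino (from totals)
```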

Files Changed

- .github/workflows/benchmarks.yml - New CI workflow for benchmark automation
- benchmarks/ - New benchmark directory with:
  - BENCHMARK_RESULTS.md - Generated results (auto-updated by CI)
  - README.md - Documentation for running benchmarks
  - data/ - Test data files in all formats
  - Language-specific benchmark implementations
- rust/links-notation-benchmark/ - Rust benchmark crate
- rust/Cargo.toml - Updated workspace members

Test Plan

All 6 benchmark implementations produce consistent results
Rust benchmark tests pass (cargo test -p links-notation-benchmark)
Existing Rust library tests still pass
GitHub Actions CI passes for all benchmark validations
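
One way the first item could be checked is sketched below. This is not the validation code from the PR: the function `find_mismatches`, the shape of the `results` dictionary, and the deliberately wrong JavaScript value are all assumptions made for illustration. The Rust and Python entries simply reuse the totals from the Benchmark Results table.

```python
def find_mismatches(results, reference="rust"):
    """Return {language: {format: (expected, actual)}} for totals that differ
    from the reference implementation's totals."""
    expected = results[reference]
    mismatches = {}
    for language, totals in results.items():
        diff = {
            fmt: (expected[fmt], totals.get(fmt))
            for fmt in expected
            if totals.get(fmt) != expected[fmt]
        }
        if diff:
            mismatches[language] = diff
    return mismatches


# Hypothetical per-language output. The "javascript" XML total is wrong on
# purpose to show what a detected inconsistency looks like.
results = {
    "rust": {"Lino": 734, "JSON": 1332, "YAML": 920, "XML": 1882},
    "python": {"Lino": 734, "JSON": 1332, "YAML": 920, "XML": 1882},
    "javascript": {"Lino": 734, "JSON": 1332, "YAML": 920, "XML": 1883},
}

print(find_mismatches(results))
# {'javascript': {'XML': (1882, 1883)}}
```

Treating one implementation as the reference mirrors the PR's description of Rust as the primary benchmark; an empty result means every implementation agrees with it.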

Issue Reference

Closes #209

🤖 Generated with Claude Code

… XML

This implements UTF-8 character count benchmarks in all six supported languages:
- Rust (primary, used in CI/CD for auto-generating BENCHMARK_RESULTS.md)
- JavaScript
- Python
- C#
- Go
- Java

Features:
- Five benchmark test cases: employees, simple_doublets, triplets, nested_structure, config
- Detailed markdown report with summary and per-case results
- GitHub Actions workflow that automatically updates benchmark results on push to main
- Consistent benchmark implementation across all languages producing identical results

Results show Lino achieves on average:
- 47.9% fewer characters vs JSON
- 21.5% fewer characters vs YAML
- 61.5% fewer characters vs XML

Closes #209

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

konard · 2026-01-21T10:39:12Z

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

Public pricing estimate: $8.161039 USD
Calculated by Anthropic: $6.654238 USD
Difference: $-1.506801 (-18.46%)
📎 Log file uploaded as Gist (838KB)
🔗 View complete solution draft log

The working session has now ended. Feel free to review the solution draft and add any feedback.

Initial commit with task details

222c3f1

Adding CLAUDE.md with task information for AI processing. This file will be removed when the task is complete. Issue: #209

konard self-assigned this Jan 21, 2026

konard changed the title ~~[WIP] Add tokenization benchmarks comparing with YAML, XML, JSON~~ Add tokenization benchmarks comparing Links Notation with JSON, YAML, XML Jan 21, 2026

Fix Rust formatting and Clippy warnings in benchmark

33206fa

Apply rustfmt formatting and fix clippy warning about redundant closure. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

konard marked this pull request as ready for review January 21, 2026 10:38

Revert "Initial commit with task details"

b420835

This reverts commit 222c3f1.
