
"In the rapidly evolving landscape of large language models ( LLMs), token efficiency is becoming a serious concern. As developers and researchers keep pushing more structured data into models, the cost and latency tied to token count only grow. That's where Token-Oriented Object Notation (TOON) ( GitHub repository here) comes in. It's a serialization format built specifically for LLM prompts, aiming to cut down token usage while keeping the data structured and machine-readable."
"What is TOON? TOON is a serialization format for structured data designed with LLM inputs in mind. It is human-readable, uses minimal syntax (leaning on indentation and compact arrays), and aims to remove the repeated overhead of typical JSON when dealing with large uniform arrays of objects. Key features include: Declaring the length of arrays and the field names once (for tabular data) instead of repeating keys for each object Using indentation rather than braces/brackets in many places helps reduce token overhead"
TOON is a serialization format optimized for LLM prompts that prioritizes token efficiency and machine readability. It minimizes syntax by relying on indentation and compact arrays, and it declares array lengths and field names once for tabular data to avoid repeated keys. Benchmarks indicate 30–60% fewer tokens on large, uniform object arrays compared with formatted JSON. TOON supports alternate delimiters (tab, pipe) for very large arrays. It performs best on uniform, primitive-valued arrays and is less suitable for deeply nested, irregular, or non-uniform datasets, trading generality for token savings.
Read at LogRocket Blog
Unable to calculate read time
Collection
[
|
...
]