Semantic Compression Layer

LoreTokens

Compressed Meaning For AI

LoreTokens are an AI-native serialization format that lets models work on compact meaning instead of huge blobs of JSON. They compress schemas, behaviors, and typical values into short, LLM-readable lines that can be expanded into full knowledge on demand.

Engineered alongside SAIQL and used inside Nova, an enterprise crypto trading AI.

Why LoreTokens?

Beyond JSON, TOON, and CSV

Semantic Compression

Instead of squeezing bytes, LoreTokens compress meaning. A single line can encode the domain, concept, subject, fields, and output shape, which a modern LLM can then expand into many pages of detail.

See the medical example and discussion in the LoreTokens README.
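As a rough illustration of the single-line idea, the sketch below packs a domain, concept, subject, field list, and output shape into one delimited line and parses it back. The layout (colon-separated segments, comma-separated fields), the sample values, and the names `LoreLine`, `encode_line`, and `parse_line` are all hypothetical. This is not the actual LoreToken grammar, which is documented in docs/FORMATS.md.

```python
from dataclasses import dataclass
from typing import List

SEP = ":"        # assumed segment separator, not the real LoreToken glyph
FIELD_SEP = ","  # assumed field separator


@dataclass
class LoreLine:
    """Hypothetical container for the five parts named in the text."""
    domain: str
    concept: str
    subject: str
    fields: List[str]
    output_shape: str


def encode_line(item: LoreLine) -> str:
    """Join the five parts into one compact line."""
    parts = [
        item.domain,
        item.concept,
        item.subject,
        FIELD_SEP.join(item.fields),
        item.output_shape,
    ]
    for part in parts:
        if SEP in part:
            raise ValueError(f"segment may not contain {SEP!r}: {part!r}")
    return SEP.join(parts)


def parse_line(line: str) -> LoreLine:
    """Split a line produced by encode_line back into its parts."""
    parts = line.strip().split(SEP)
    if len(parts) != 5:
        raise ValueError(f"expected 5 segments, got {len(parts)}")
    domain, concept, subject, fields, output_shape = parts
    field_list = [f for f in fields.split(FIELD_SEP) if f]
    return LoreLine(domain, concept, subject, field_list, output_shape)


# Illustrative values only.
example = LoreLine("MEDICAL", "DIAGNOSIS", "diabetes",
                   ["symptoms", "treatment"], "summary")
line = encode_line(example)
print(line)
print(parse_line(line) == example)
```

In this toy version the expansion step is left to the model: the line only carries the labels, and an LLM prompted with it would be expected to fill in the detail.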

LLM-Native

LoreTokens are designed as a language that models already like: regular, symbolic, and explicit. Formats are documented in docs/FORMATS.md with glyphs in docs/SYMBOLS.md.

Field-Proven With SAIQL

LoreTokens power SAIQL's compressed storage paths, GPU hooks, and Nova's long-term trading memory. Together, LoreTokens and SAIQL have hit order-of-magnitude speedups over Postgres-only stacks in internal tests.

For a database-centric view, see saiql.ai/saiql-docs.html and the SAIQL repo.

Formats At A Glance

Symbolic | Standard | Ultra

Symbolic (.sym)

Machine-centric glyph stream used by SAIQL nodes and GPU hooks. Highest compression, least human-friendly - ideal for hot paths and logs.

LoreToken Standard (.lt)

Balanced format for CLI tools, diffs, and repos. Short but still readable once you know the glyphs.

LoreToken Ultra (.ltu)

Human-friendly variant for docs, audits, and exported memories where clarity matters more than every last byte.

Full tables live in docs/FORMATS.md and the glyph legend is in docs/COMPLETE_SYMBOL_LEGEND.md.
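When handling files in all three formats, a small lookup keyed on the extensions listed above can route each file to the right handler. The sketch below is a minimal example of that idea. The helper name `detect_format` is hypothetical and is not part of the LoreTokens tooling.

```python
from pathlib import Path

# Extensions and format names as listed in the section above.
FORMATS = {
    ".sym": "Symbolic",
    ".lt": "LoreToken Standard",
    ".ltu": "LoreToken Ultra",
}


def detect_format(path: str) -> str:
    """Return the format name for a file, based only on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"unrecognised LoreToken extension: {suffix!r}")
    return FORMATS[suffix]


for name in ("hot_path.sym", "schema.lt", "audit_export.ltu"):
    print(name, "->", detect_format(name))
```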

Tooling and GPU Hook

From CLI To CUDA

CLI and Converters

Translate between symbolic, standard, and ultra formats with converter scripts and C-based search tools.

See scripts/loretoken_converter.py and scripts/loretoken_translator.py.

GPU Hook

LoreToken-GPU lets you compress buffers before they hit the GPU, saving memory and bandwidth with minimal overhead.

Start with docs/GPU_HOOK.md.

Example Datasets

Ready-made LoreToken dumps you can load into any LLM or RAG pipeline.

examples/Schema-LoreTokenised.txt

Drop LoreTokens Under Your Stack

Use LoreTokens on their own, or pair them with SAIQL for a full semantic memory and query layer.

Get LoreTokens on GitHub | LoreToken Docs