LPGN - Locality-Preserving Graph Notation for LLM self-attention. Spec, parser, tools.

vkozio/lpgn


๐Ÿ•ธ๏ธ LPGN: Locality-Preserving Graph Notation

A Graph Serialization Format Natively Optimized for LLM Self-Attention

Status: Specification Domain: LLM & Graph AI

Standard graph serialization formats (JSON, DOT, Cypher) were built for algorithmic parsers, not for neural networks. When fed to Large Language Models (LLMs), these formats cause severe context fragmentation, leading to hallucinations, poor topological recall, and wasted tokens.

Locality-Preserving Graph Notation (LPGN) is a declarative, algebraically dense graph language designed from the ground up to align with the Transformer Self-Attention mechanism.


🚨 The Problem: The "Attention Crisis" in Graph AI

Recent studies (2024–2025) have highlighted a critical flaw in how LLMs process graph data: the serialization format dictates the effective algorithmic complexity of the task for the model.

When a graph is serialized into standard Edge Lists or JSON, connected nodes are scattered across the text.

  • Indirect Referencing: LLMs struggle to resolve cross-referenced IDs (e.g., "source": "node_A", "target": "node_B").
  • Attention Dispersion: The Transformer's self-attention matrix loses focus when tokens for adjacent nodes are separated by hundreds of syntax characters.
  • Token Bloat: A densely connected graph of $|V|$ nodes can require $O(|V|^2)$ tokens to describe, quickly exhausting context windows and driving up API costs.
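The quadratic blow-up can be sketched numerically. Character counts stand in as a rough proxy for tokens, and `edge_list_chars` is an illustrative helper, not part of LPGN:

```python
# Rough sketch: serialized size of a JSON edge list for a complete
# graph on n nodes. Edge count is n*(n-1)/2, so the payload grows
# roughly quadratically in |V| once edges dominate the node list.
import itertools
import json

def edge_list_chars(n: int) -> int:
    nodes = [f"node_{i}" for i in range(n)]
    edges = [{"from": a, "to": b}
             for a, b in itertools.combinations(nodes, 2)]
    return len(json.dumps({"nodes": nodes, "edges": edges}))

sizes = {n: edge_list_chars(n) for n in (10, 20, 40)}
# Doubling |V| roughly quadruples the serialized payload.
```

The same effect applies to any format that enumerates edges one object at a time; only the constant factor changes.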

While the industry attempts to work around this with computationally expensive Graph Neural Networks (GNNs) or aggressive RAG truncation, LPGN targets the root cause: the text representation itself.


💡 The LPGN Solution: Topological-Lexical Isomorphism

LPGN enforces a strict mathematical rule: Graph adjacency must equal token adjacency. If two nodes are connected in the topology, they must be structurally adjacent in the prompt. We achieve this through two core principles:

1. Algebraic Edge Compression

Instead of enumerating every edge individually, LPGN uses combinatorial primitives (Sets {}, Cliques [], and Chains <>) combined with Cartesian flow operators (->). A bipartite connection that takes multiple lines and objects in JSON is mathematically compressed into a single, highly semantic line.
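As an illustrative sketch (not the normative grammar), assuming `{}` denotes a node set and `->` a Cartesian flow operator, a tiny expander shows how a single line yields a full bipartite edge set:

```python
# Sketch only: expand a hypothetical set-flow expression into edges.
# Assumes "{A, B} -> {C, D}" denotes the Cartesian product {A,B} x {C,D}.
from itertools import product

def expand_flow(expr: str) -> set[tuple[str, str]]:
    left, right = expr.split("->")
    parse = lambda s: [x.strip() for x in s.strip().strip("{}").split(",")]
    return set(product(parse(left), parse(right)))

edges = expand_flow("{A, B} -> {C, D}")
# Four separate JSON edge objects collapse into one algebraic line.
assert edges == {("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")}
```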

2. Strict Textual-Topological Locality

LPGN eliminates arbitrary variables and "GOTO" logic. The graph is evaluated purely left-to-right.

  • Semantic labels on edges are transformed into explicit bipartite node chains to minimize token distance between core entities.
  • For cyclic graphs, a zero-logic back-reference anchor (^) safely closes loops topologically without breaking the forward semantic attention flow.
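A minimal sketch of this left-to-right evaluation, assuming `^Name` simply refers back to an already-declared node (the exact `^` semantics are defined by the spec):

```python
# Illustrative sketch: evaluate a chain strictly left-to-right.
# "A -> B -> C -> ^A" closes the cycle by pointing ^A back at the
# already-seen node A, with no re-declaration and no forward jumps.
def expand_chain(expr: str) -> list[tuple[str, str]]:
    seen: list[str] = []
    edges: list[tuple[str, str]] = []
    for tok in (t.strip() for t in expr.split("->")):
        node = tok[1:] if tok.startswith("^") else tok
        if tok.startswith("^") and node not in seen:
            raise ValueError(f"^{node} references an undeclared node")
        if seen:
            edges.append((seen[-1], node))
        seen.append(node)
    return edges

assert expand_chain("A -> B -> C -> ^A") == [("A", "B"), ("B", "C"), ("C", "A")]
```

Because `^` can only point backwards, the evaluator never needs lookahead, which is exactly the property that keeps forward attention flow intact.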

🥊 Code Comparison: JSON vs. LPGN

Let's represent a standard business topology: a user submits an order, which is processed by the Risk and Finance domains in parallel and then sent to Logistics.

โŒ Standard JSON / Edge List (High Token Cost, Low Recall):

{
  "nodes": ["User.Submit", "Risk.Check", "Finance.Hold", "Logistics.Ship"],
  "edges": [
    {"from": "User.Submit", "to": "Risk.Check"},
    {"from": "User.Submit", "to": "Finance.Hold"},
    {"from": "Risk.Check", "to": "Logistics.Ship"},
    {"from": "Finance.Hold", "to": "Logistics.Ship"}
  ]
}
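For contrast, a hypothetical LPGN-style rendering of the same topology (the line syntax is a sketch based on the operators described above, not the normative grammar), expanded in Python to check that it yields exactly the four edges from the JSON:

```python
# Hypothetical sketch: one set-flow line encodes the whole topology.
# Assumes "->" chains stages and "{}" groups parallel nodes.
from itertools import product

def expand(expr: str) -> set[tuple[str, str]]:
    stages = [[x.strip() for x in s.strip().strip("{}").split(",")]
              for s in expr.split("->")]
    return {e for a, b in zip(stages, stages[1:]) for e in product(a, b)}

line = "User.Submit -> {Risk.Check, Finance.Hold} -> Logistics.Ship"
assert expand(line) == {
    ("User.Submit", "Risk.Check"),
    ("User.Submit", "Finance.Hold"),
    ("Risk.Check", "Logistics.Ship"),
    ("Finance.Hold", "Logistics.Ship"),
}
```

Every pair of connected nodes sits within a few tokens of each other on the line, which is the locality property the format is named for.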
