Zero-Dependency Document Intelligence Substrate — Local semantic embeddings, multi-topic clustering, revision diffing, and structural signal extraction running offline at the edge in under 1 millisecond.
The semantic shape of a document exists in the geometry of its text, not in the remote weights of a billion-parameter model.
For years, we have done magnificent work with vector databases and API-based LLM embedding endpoints. We have mastered the art of shipping chunks to the cloud to retrieve context.
And yet, a ghost haunts the runtime. The developer pays a steep tax in API latency, cold-start delays, dependency bloat, and serverless compute costs just to understand if two paragraphs in the same process address the same topic.
graviton is a zero-dependency document intelligence substrate. It runs locally, offline, and at the edge, with in-process latency under 1 millisecond once its vectors are loaded. It packs quantized semantic geometry directly into your runtime to chunk, cluster, extract, compare, and compress document signals with zero network requests and zero configuration.
| Dimension | graviton | transformers.js | OpenAI / API |
|---|---|---|---|
| Cold Start | ~80ms (disk read) | 2,000–8,000ms | 50–200ms (network) |
| In-Process Latency | <0.5ms | 50–500ms (CPU) | 100–300ms (HTTP) |
| Bundle Footprint | ~13MB (vectors included) | ~80MB+ | 0 (but requires client library) |
| Edge Compatibility | Out-of-the-box | Requires WASM hacks | Network dependent |
| Financial Cost | $0.00 | $0.00 | Metered billing |
npm install @zero-intelligence/graviton

import { extractSignal } from '@zero-intelligence/graviton'
// Bundled semantic geometries are loaded automatically.
const packet = await extractSignal(documentText, 'deep')
console.log(packet.intrinsicRank) // Semantic complexity: number of independent axes
console.log(packet.spectralEntropy) // Trajectory focus: 0 (laser-focused) to 1 (scattered)
console.log(packet.documentShape) // Structural categorization: 'focused' | 'structured' | 'fragmented'
console.log(packet.contextBlock) // High-density content payload ready to inject into your LLM prompt

To avoid cold-start latency on the first lazy-load in production, pre-warm the memory store at startup:
import { preload } from '@zero-intelligence/graviton/signal'
await preload() // Reads and decompresses the 13MB binary into memory (~80ms)

Low-level operations for vector math, search, clustering, and deduplication.
import { embed, nearestVectors, kmeans, simhash, ngramSimhash } from '@zero-intelligence/graviton/embed'
// Embed in hash mode (zero-data footprint, Johnson-Lindenstrauss distance preservation guarantee):
const vector = embed('zero-dependency local embedding', null)
// Scan a corpus in-memory (O(N) scan, resolves in <1ms for up to 50,000 vectors):
const nearest = nearestVectors(queryVector, corpusVectors, 5)
// Cluster semantic vectors:
const { assignments, centroids } = kmeans(corpusVectors, 3)
// Deduplicate documents using locality-sensitive hashing:
const hash = simhash(text)
const robustHash = ngramSimhash(text) // Robust against vocabulary edits

High-level document processing that chunks, projects, decomposes, and synthesizes raw text into structured signals.
import { extractSignal } from '@zero-intelligence/graviton/signal'
const packet = await extractSignal(documentText, 'deep')
packet.contextBlock // Formatted text block optimized for LLM prompting
packet.contextLayers // Multi-resolution semantic wavelet layers
packet.topics // Cluster themes and their proportional dominance
packet.keySentences // Central sentences selected via Maximal Marginal Relevance
packet.keyTerms // Key vocabulary terms ranked by TF-IDF and TextRank
packet.facts // Extracted structured entities (SemVer, IPs, URLs, dates, numbers)
packet.sectionBreaks // Indices representing semantic discontinuity / drift
packet.intrinsicRank // SVD-derived count of independent document concepts
packet.spectralEntropy // DFT-derived structural noise index [0.0 - 1.0]
packet.documentShape // Coherence classification: 'focused' | 'structured' | 'fragmented'
packet.activationProfile // Top GloVe dimensions driving the primary document topic
packet.phaseSpectrum // DFT phase representation for structural alignment

| Property | Interface | Meaning |
|---|---|---|
| `intrinsicRank` | `number` | The minimum number of singular axes required to capture ≥95% of document variance. Represents conceptual density. |
| `spectralEntropy` | `number` | The normalized entropy of the power spectrum of the semantic trajectory. Measures narrative linearity. |
| `documentShape` | `string` | `'focused'` (single tight topic), `'structured'` (well-partitioned multi-topic), or `'fragmented'` (topic drift). |
| `reconstructionConfidence` | `number` | Score indicating the descriptive coverage of extracted key sentences over the raw document sections. |
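The definitions of `intrinsicRank` and `spectralEntropy` above can be made concrete with a small sketch. The helpers below are not graviton's implementation: `intrinsicRankFrom` and `spectralEntropyOf` are hypothetical names, and their inputs (a list of singular values, and a one-dimensional series standing in for the semantic trajectory) are assumptions chosen for illustration.

```typescript
// Illustrative only: these helpers are NOT part of graviton's API.

/** Smallest k such that the k largest singular values explain >= `threshold` of the variance. */
function intrinsicRankFrom(singularValues: number[], threshold = 0.95): number {
  // Variance along each singular axis is proportional to the squared singular value.
  const energies = singularValues.map(s => s * s).sort((a, b) => b - a)
  const total = energies.reduce((sum, e) => sum + e, 0)
  if (total === 0) return 0
  let cumulative = 0
  for (let k = 0; k < energies.length; k++) {
    cumulative += energies[k]
    if (cumulative / total >= threshold) return k + 1
  }
  return energies.length
}

/** Normalized Shannon entropy of the DFT power spectrum of a real series, in [0, 1]. */
function spectralEntropyOf(series: number[]): number {
  const n = series.length
  const half = Math.floor(n / 2)
  const power: number[] = []
  // Naive O(n^2) DFT over the non-DC frequencies 1..floor(n/2).
  for (let f = 1; f <= half; f++) {
    let re = 0
    let im = 0
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * f * t) / n
      re += series[t] * Math.cos(angle)
      im -= series[t] * Math.sin(angle)
    }
    power.push(re * re + im * im)
  }
  const total = power.reduce((sum, p) => sum + p, 0)
  if (power.length < 2 || total === 0) return 0
  let entropy = 0
  for (const p of power) {
    const q = p / total
    if (q > 0) entropy -= q * Math.log(q)
  }
  // Divide by the maximum possible entropy so the result lands in [0, 1].
  return entropy / Math.log(power.length)
}
```

In this sketch a pure sinusoid concentrates its power in a single frequency bin and scores close to 0, while a single impulse spreads power evenly across all bins and scores 1, which mirrors the "laser-focused" to "scattered" reading of the metric.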
- TextRank ≡ SVD U[0]: Power iteration over a document's dot-product similarity graph (PageRank without damping or row normalization) converges to the first left singular vector of the embedding matrix. `graviton` uses this relation to replace $O(N^2)$ iterations with a fast $O(N \cdot K)$ projection.
- DFT Phase Alignment: The phase spectrum of the semantic trajectory encodes the narrative structure of a text. `compareSignals(a, b)` uses the phase delta to classify relations (e.g., Q&A pairs align in complementary phase, shifted by $\pi$).
- Multi-Resolution Wavelets: Documents carry information at different scales. `contextLayers` models this as a semantic wavelet, exposing layers from global summaries down to raw fact ledgers.
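The relation between graph centrality and the SVD can be sketched in a few lines. The code below is an illustration under stated assumptions, not graviton's implementation: `centralityScores` is a hypothetical name, the similarity graph is assumed to be the dot-product matrix $EE^\top$ of the sentence embeddings, and damping is omitted. Under those assumptions power iteration converges to the first left singular vector, and each step costs $O(N \cdot K)$ because the $N \times N$ graph is never built.

```typescript
// Illustrative only: NOT part of graviton's API.
type Matrix = number[][] // N rows (sentences) x K columns (embedding dimensions)

function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0))
  return norm === 0 ? v : v.map(x => x / norm)
}

/**
 * Approximates the first left singular vector of E by power iteration on E * E^T,
 * applied as two matrix-vector products so the N x N matrix is never materialized.
 */
function centralityScores(E: Matrix, iterations = 50): number[] {
  const n = E.length
  const k = n > 0 ? E[0].length : 0
  let u = normalize(new Array<number>(n).fill(1))
  for (let it = 0; it < iterations; it++) {
    // w = E^T u (length K), costs O(N * K)
    const w = new Array<number>(k).fill(0)
    for (let i = 0; i < n; i++) {
      for (let j = 0; j < k; j++) w[j] += E[i][j] * u[i]
    }
    // u = E w (length N), costs O(N * K)
    u = normalize(E.map(row => row.reduce((sum, x, j) => sum + x * w[j], 0)))
  }
  return u
}

// Two rows share a direction; the third is a weak outlier on another axis.
const demoScores = centralityScores([[1, 0], [1, 0], [0, 0.1]])
console.log(demoScores.map(s => s.toFixed(3)))
```

In the demo, the two rows that point the same way receive equal, high scores and the outlier's score decays toward zero, which is the behaviour one would expect from a centrality ranking.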
SLMs and LLMs fail when flooded with raw noise. Compress your prompts dynamically based on token constraints:
const packet = await extractSignal(contractDraft, 'deep')
// 1. High-level conceptual summary only (Minimal tokens)
const summaryContext = packet.contextLayers
.filter(layer => layer.resolution <= 1)
.map(layer => `[${layer.label}]\n${layer.content}`)
.join('\n\n')
// 2. Full semantic profile (Optimal density)
const fullContext = packet.contextLayers
.map(layer => `[${layer.label}]\n${layer.content}`)
.join('\n\n')

Track documents over time, detect drift, and compare revisions structurally.
import { extractSignal, compareSignals } from '@zero-intelligence/graviton/signal'
const qPacket = await extractSignal(questionText)
const aPacket = await extractSignal(answerText)
const relation = compareSignals(qPacket, aPacket)
// relation.phaseRelationship -> 'complementary' (Q&A align on opposite phases)
// relation.phaseRelationship -> 'aligned' (same content/stance)

import { extractSignal, diffSignals } from '@zero-intelligence/graviton/signal'
const v1 = await extractSignal(draftV1)
const v2 = await extractSignal(draftV2)
const diff = diffSignals(v1, v2)
console.log(diff.cosineSimilarity) // Semantic similarity of content trajectories
console.log(diff.addedKeyTerms) // Vocabulary introduced in V2
console.log(diff.addedFacts) // Added structured elements (monetary values, IP addresses, dates)

Avoid retaining raw texts or large JSON packets to run compliance history checks. Hash the document's phase spectrum into a 64-bit signature:
import { extractSignal, temporalFingerprint } from '@zero-intelligence/graviton/signal'
import { hammingDistance } from '@zero-intelligence/graviton/embed'
const fpV1 = temporalFingerprint(v1.phaseSpectrum) // "ffff000001ffffff"
const fpV2 = temporalFingerprint(v2.phaseSpectrum) // "ffff00000fffffff"
const edits = hammingDistance(fpV1, fpV2) // Distance <= 3 indicates structural preservation

graviton compiles to a zero-dependency CLI binary. It includes an in-process regex-driven PDF text extractor to analyze files directly from the shell.
# Get a styled console summary of any text or PDF file:
npx graviton analyze legal_contract.pdf
# Dump raw JSON metadata with specialized presets:
npx graviton analyze source_code.py --format json --presets dev --dynamic-threshold 0.4

Isolate the signals that matter by controlling noise dynamically.
- `DEV_PRESETS`: Matches SemVer, Docker images, SQL statements, IP:port combinations, and CSS colors.
- `FINANCIAL_PRESETS`: Matches IBANs, credit card numbers, SWIFT codes, and Tax IDs.
- `ACADEMIC_PRESETS`: Matches DOIs, arXiv IDs, LaTeX equations, and measurement metrics.
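To give a feel for what a preset of this kind might contain, here is a minimal sketch of regex-driven fact extraction. Everything in it is hypothetical: the pattern shape (`label` plus `regex`), the constant `SKETCH_DEV_PATTERNS`, and the helper `extractFacts` are assumptions for illustration, and the real format accepted by `customFactPatterns` may differ.

```typescript
// Illustrative only: this is NOT graviton's preset format or API.
interface FactPattern {
  label: string
  regex: RegExp // must carry the global flag so every occurrence is returned
}

const SKETCH_DEV_PATTERNS: FactPattern[] = [
  // MAJOR.MINOR.PATCH with an optional leading "v" (simplified, no pre-release tags).
  { label: 'semver', regex: /\bv?\d+\.\d+\.\d+\b/g },
  // CSS hex colors in 6-digit or 3-digit form.
  { label: 'css-color', regex: /#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/g },
]

/** Runs every pattern over the text and collects labelled matches in pattern order. */
function extractFacts(text: string, patterns: FactPattern[]): { label: string; value: string }[] {
  const facts: { label: string; value: string }[] = []
  for (const { label, regex } of patterns) {
    const matches = text.match(regex) || []
    for (const value of matches) facts.push({ label, value })
  }
  return facts
}

console.log(extractFacts('Upgrade to v2.4.1 and set the accent to #ff8800.', SKETCH_DEV_PATTERNS))
```

Patterns this simple will produce false positives on real input (the SemVer pattern, for instance, would also match the first three octets of an IP address), so a production preset would need stricter boundaries or an ordering rule between patterns.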
import { extractSignal, DEV_PRESETS } from '@zero-intelligence/graviton/signal'
const packet = await extractSignal(codeText, 'deep', {
customFactPatterns: [...DEV_PRESETS],
customStopwords: ['import', 'require', 'const'],
dynamicStopwordThreshold: 0.4 // Drop any term appearing in >40% of code chunks
})

Native adapters to compress prompt sizes dynamically before invocation:
// 1. Vercel AI SDK
import { gravitonAiMiddleware } from '@zero-intelligence/graviton/integrations'
import { wrapLanguageModel } from 'ai'
import { openai } from '@ai-sdk/openai'
const model = wrapLanguageModel({
model: openai('gpt-4o'),
middleware: gravitonAiMiddleware({ minLength: 500, format: 'latent' })
})
// 2. LangChain JS
import { GravitonDocumentCompressor } from '@zero-intelligence/graviton/integrations'
const compressor = new GravitonDocumentCompressor({ preset: 'standard' })
const denseDocs = await compressor.transformDocuments(docs)
// 3. Mastra JS
import { gravitonMastraExecute } from '@zero-intelligence/graviton/integrations'
import { createTool } from '@mastra/core/tools'
import { z } from 'zod' // Required for the inputSchema below
const compressTool = createTool({
id: 'compress-document',
description: 'Mathematically compress a text document',
inputSchema: z.object({ text: z.string() }),
execute: gravitonMastraExecute({ preset: 'deep' })
})

MIT © zero-intelligence