A Multi-Layer Comparison of SAL, JSON-RPC, and Binary Serialization Formats Across Wire, Token, and Grammar Dimensions
Version 2.2 | March 2026 Octid Semantic Mesh Protocol (OSMP) v1.0 | Cloudless Sky Project Author: Clay Holberg Contact: ack@octid.io | octid.io Repository: github.com/octid-io/cloudless-sky License: Apache 2.0 with express patent grant
This paper presents a multi-layer efficiency analysis comparing Semantic Assembly Language (SAL), the instruction encoding format of the Octid Semantic Mesh Protocol (OSMP), against JSON-RPC, MessagePack, and Protocol Buffers. The primary evidence base is a 29-vector empirical benchmark drawn from five production frameworks (MCP, OpenAI, Google A2A, CrewAI, Microsoft AutoGen), measured across three dimensions: byte reduction, LLM token economics (GPT-4 cl100k_base tokenizer), and binary serialization comparison (MessagePack and compiled Protocol Buffers). Supplementary analysis includes a 1,000-point grammar-level structural overhead sweep, Shannon entropy measurement of structural token streams, and a production token economics model extrapolating per-instruction savings across cited real-world multipliers (conversation history accumulation, multi-step tool chains, retry loops, context window saturation, and aggregate system inefficiency).
Principal findings on the 29-vector benchmark: SAL achieves 76.0% mean token reduction (GPT-4 cl100k_base), 86.8% mean byte reduction versus minified JSON, 70.5% versus compiled Protocol Buffers, and 84.5% versus MessagePack. Protocol Buffers captures much of JSON's structural overhead (55.4% reduction); SAL's remaining advantage over protobuf derives primarily from semantic content compression through a shared opcode vocabulary, not from superior structural encoding alone. The paper includes an adversarial prosecution section, explicit methodological assumptions, and honest disclosure of cases where SAL underperforms protobuf on numeric-heavy payloads.
All methodologies, data, source code, and test vectors are published in the open repository. The protocol has zero production deployments, zero third-party adoption, no independent audit, and no formal verification. This paper measures encoding properties, not production readiness.
A common serialization format for agent-to-agent communication in current production frameworks is JSON, transmitted via JSON-RPC 2.0 (in MCP and Google A2A) or as structured function call objects (in OpenAI, CrewAI, and AutoGen). JSON is a self-describing key-value format designed for human developer readability. It has achieved ubiquitous adoption because developers can read, write, and debug it without specialized tools.
This design carries a cost. Every JSON message includes key names that repeat with every transmission, nesting braces, quoted string delimiters, type discriminators, and protocol envelopes. These structural elements consume bytes on the wire and tokens in LLM context windows. Schema-based binary formats (Protocol Buffers, Apache Avro, FlatBuffers) address part of this overhead by externalizing field names into a shared schema compiled at both endpoints. Self-describing binary formats (MessagePack, CBOR) address a smaller portion by using compact type headers while preserving key names. As used in the benchmarked frameworks, these formats do not compress the semantic content itself: the natural-language strings, verbose method names, and human-readable descriptions that carry much of the message payload.
OSMP's Semantic Assembly Language (SAL) operates at a different layer. It is not a compression algorithm applied to JSON or a binary encoding of JSON's structure. It is a domain-specific instruction language that replaces natural language content with opcode lookups against a shared dictionary (the ASD). This means SAL compresses both structure and content, but at the cost of a stronger dependency: the ASD must be present at both endpoints, and the instruction domain must be covered by the ASD's vocabulary.
OSMP originated from a design exercise exploring the minimum viable encoding for AI agent communication under extreme bandwidth constraints. LoRa radio, the physical layer underlying both the LoRaWAN enterprise network protocol and the Meshtastic peer-to-peer mesh protocol, imposes payload floors ranging from 11 bytes (US915 DR0 under LoRaWAN) to 51 bytes (EU868 DR0) to 255 bytes (standard Meshtastic LongFast deployment). A standard JSON-RPC 2.0 tools/call envelope is 82 bytes before content, exceeding these floors. The protocol was designed from the minimum constraint upward. This paper tests whether the same design choices yield measurable efficiency gains on the cloud-oriented benchmark set used here.
OSMP v1.0 is a working protocol with a reference implementation, three SDK implementations (Python, TypeScript, Go), 231 passing tests, an MCP server on PyPI (pip install osmp-mcp), and a static website. It has zero production deployments, zero third-party adoption metrics, and no independent audit. The analysis measures encoding properties, not production readiness. These are different questions.
SAL does not replace JSON for arbitrary data serialization, and it does not replace Protocol Buffers for general-purpose schema-driven messaging. It replaces verbose text-based encodings for agentic instructions where the semantic content maps to a defined set of operational codes. Free-form data remains in its native format. The efficiency claims apply to the instruction encoding domain.
This paper was authored by the protocol's inventor. It has been subjected to independent adversarial review, and the prosecution section (Section 10) incorporates findings from that review. However, the data, models, and analysis originate from the same source as the protocol itself. Independent replication using the published code and test vectors is encouraged.
A SAL instruction is a human-readable character sequence comprising four layers:
- Namespace prefix (1 byte): A single uppercase Latin letter identifying the semantic domain (A through Z, 26 standard namespaces).
- Opcode (2-6 bytes): An alphanumeric code representing a semantic primitive drawn from an authoritative domain vocabulary.
- Glyph operators (1-3 bytes each): Unicode formal logic symbols representing logical, sequential, or compositional relationships.
- Slot values (variable): Typed operand positions carrying parameters, targets, and domain codes.
Example: H:TRIAGE?I (10 bytes) encodes "triage classification: immediate." The equivalent JSON-RPC tool call requires approximately 140-170 bytes depending on framework.
The ASD is the shared reference enabling positional encoding without self-description. It contains 352 opcodes across 26 namespaces, drawn from authoritative domain standards (ICD-10, ISO 20022, MITRE ATT&CK, IEEE 1451, RFC 5424, FIPS 140-3, and others). Three MDR corpora extend the ASD with 124,215 resolvable domain codes.
Both sender and receiver must possess the ASD. This is architecturally analogous to protobuf's requirement for compiled .proto schemas at both endpoints, and to MCP's tools/list schema discovery step. The ASD shifts schema from per-message to per-session. Unlike protobuf schemas, the ASD also provides semantic vocabulary compression: opcode names replace natural language descriptions.
SAL decoding is performed by deterministic decode against the ASD. No neural inference, statistical model, or training history is required. Any device capable of string processing and dictionary lookup can decode SAL instructions. This enables deployment on constrained hardware (ESP32, LoRaWAN end devices, Meshtastic nodes) where language model inference is computationally infeasible.
The analysis separates encoded messages into two byte classes:
Content bytes (C): The semantic payload. A location name, an ICD-10 code, an account number.
Structural bytes (S): Everything the grammar requires that is not content. In JSON: key names, braces, quotes, colons, commas, type discriminators, protocol envelopes. In SAL: namespace prefix, assign colon, target operator, slot brackets, composition operators. In protobuf: field tags, wire type indicators, varint length prefixes.
The overhead ratio for a grammar G is: R(G, M) = S_G / (S_G + C).
Explicit assumptions and limitations of this decomposition:
-
Key names are classified as structural because SAL and protobuf both eliminate them through positional/field-number encoding. This is a modeling choice, not an objective fact. In JSON, key names carry semantic information to human readers and have real engineering value for debugging, interoperability, and evolvability. The decomposition treats this value as zero for purposes of wire/token cost measurement.
-
The ASD and protobuf schemas are treated as session-level costs amortized across all messages, not charged per-message. This is defensible for persistent connections but overstates the advantage for short-lived or ad hoc exchanges where dictionary/schema negotiation occurs frequently.
-
SAL's opcode substitution compresses content, not just structure. When SAL replaces "Transferred to flights_refunder" with
J:HANDOFF@flights_refunder, the content is shorter because the opcode carries semantic meaning that the natural language description spelled out. This is vocabulary-level compression, distinct from structural overhead reduction. Both contribute to SAL's total byte savings, but they are different mechanisms. -
This decomposition does not measure dictionary lifecycle costs (version negotiation, synchronization, cache invalidation, governance). Those are real system costs not captured by per-message encoding measurement.
Primary evidence:
| Layer | Measures | Data Points | Method |
|---|---|---|---|
| Byte reduction | SAL vs JSON vs MessagePack vs Protocol Buffers | 29 | Direct measurement + protobuf compiled schemas (protoc 3.21.12) |
| Token economics | LLM context window consumption | 29 | GPT-4 cl100k_base tokenizer (tiktoken) |
| Batch compression | All formats with best second-pass compression | 29 (batched) | gzip for JSON/MsgPack, lossless block for SAL |
Supplementary analysis:
| Layer | Measures | Data Points | Method |
|---|---|---|---|
| Grammar structural overhead | Structural bytes per grammar, content held constant | 1,000 | Analytical sweep (n, k, d) |
| Shannon entropy | Information density of structural token streams | 2 corpora (500 msg each) | Byte-level entropy |
Twenty-nine test vectors sourced from official documentation of five production frameworks:
| Framework | Version | Vectors | Source |
|---|---|---|---|
| MCP | 2025-11-25 spec | 7 | modelcontextprotocol.io |
| OpenAI | Responses API / Chat Completions | 7 | developers.openai.com |
| Google A2A | v0.2.5 spec | 5 | a2a-protocol.org |
| CrewAI | 0.86+ | 5 | docs.crewai.com |
| Microsoft AutoGen | 0.4 AgentChat | 5 | microsoft.github.io/autogen |
Every JSON payload is drawn from published documentation. Source URLs are recorded per vector and published in the repository.
Parse speed, implementation complexity, ecosystem maturity, developer experience, production reliability, dictionary lifecycle costs, failure mode behavior, or security properties. These matter and are out of scope.
JSON-RPC (MCP pattern). The minimum structural envelope for a JSON-RPC 2.0 tools/call message is 82 bytes:
{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"","arguments":{}}}
This is the exact minified string. The 82 bytes are structural: they must be present regardless of what content fills the empty name and arguments fields. Each parameter adds approximately 5 bytes of structural tokens (quotes, colon, comma) plus the key name (average 6 bytes). Each instruction in a chain requires its own complete envelope.
SAL. The minimum structural frame is 2 bytes (namespace prefix + assign colon). Target adds 1 byte. Slot list adds 3 bytes plus 1 byte per separator. Chain operators add 1-3 bytes per join. No envelope. No key names.
Protocol Buffers. Field tag (1-2 bytes per field) + varint length prefix for strings (1 byte for lengths under 128). No key names, no braces, no quotes. Protobuf's structural overhead is the lightest of the three formats for a given number of fields.
Structural byte models swept across: n = [1..20] parameters, k = [1..10] chain length, d = [1..5] nesting depth. Total: 1,000 data points.
The sweep is synthetic. It is not measured from real agent traffic distributions. It shows the grammar's behavior across the composition space, not the frequency-weighted behavior of production deployments.
Results (SAL vs JSON-RPC only; protobuf is measured empirically in Section 6):
| Metric | Uniform | Frequency-Weighted |
|---|---|---|
| Structural advantage (SAL over JSON-RPC) | 92.9% | 93.1% |
| JSON overhead ratio (structure / total) | 75.1% | 86.2% |
| SAL overhead ratio (structure / total) | 19.3% | 31.7% |
The near-identity of the uniform and weighted results indicates that the model behaves similarly under the two chosen distributions. Because the sweep is synthetic, these results should be read as descriptive of the model rather than predictive of production traffic.
When Agent A sends a tool call to Agent B in a cloud environment:
-
Wire layer: JSON serialized, optionally gzipped, transmitted over HTTP. Bandwidth is effectively free. Gzip achieves 70-85% compression on repetitive JSON structural tokens.
-
Token layer: The full, uncompressed JSON text is placed into the LLM's context window. Every key name, brace, quote, and envelope token consumes API metering and context capacity. Gzip does not operate at this layer. The LLM never sees compressed bytes.
Empirical studies of production multi-agent systems reveal significant token redundancy. The AgentTaxo study measured token duplication rates (tokens appearing in two or more LLM calls within a single task) across frameworks: 72% in MetaGPT, 86% in CAMEL, and 53% in AgentVerse [12]. The AgentPrune method demonstrated 87% cost reduction by pruning redundant inter-agent communication [13]. Optima achieved up to 90% token reduction through optimized multi-agent communication training [14].
All 29 vectors tokenized with GPT-4 cl100k_base (tiktoken). Measured counts, not estimates.
| Metric | JSON | SAL | Reduction |
|---|---|---|---|
| Total tokens (29 vectors) | 1,809 | 434 | 76.0% |
| Mean tokens per instruction | 62.4 | 15.0 | 76.0% |
The following model assumes an idealized deployment: one encoding pass per message, no retries, no accumulated conversation history, no reasoning loops. This is the provable minimum.
Representative LLM input pricing: $2.50 / 1M tokens (GPT-4o, March 2026).
| Daily messages | JSON tokens/year | SAL tokens/year | Annual input token savings |
|---|---|---|---|
| 10,000 | 227.8M | 54.8M | ~$430 |
| 50,000 | 1.14B | 274M | ~$2,150 |
These figures cover input tokens only, at a single price point, for a single tokenizer. Output tokens (priced higher) and context window capacity freed for reasoning are not quantified. This is the floor. A deployment that achieves single-pass message delivery with no retries, no context accumulation, and no multi-step reasoning chains would realize savings in this range. The baseline exists to establish a provable minimum, not to predict production costs.
Production multi-agent systems do not operate at baseline. Empirical reporting from deployed systems identifies five categories of token multiplication that compound on top of per-message encoding cost.
1. Conversation history accumulation. Multi-agent frameworks re-read the full conversation history on every LLM call. A document analysis workflow consuming 10,000 tokens with a single agent has been measured requiring 35,000 tokens across a four-agent implementation, a 3.5x cost multiplier before retries, error handling, and coordination messages [19]. This is consistent with the AgentTaxo finding that 72% of tokens in MetaGPT are duplicates appearing in two or more LLM calls [12], implying a comparable 3.6x effective multiplier. In AutoGen's GroupChat pattern, a 4-agent discussion with 5 rounds generates a minimum of 20 LLM calls, each carrying progressively longer conversation history [20]. The token cost of each call grows with the conversation length, creating superlinear cost growth within a single task.
2. Multi-step tool chains. Autonomous agents routinely chain 10-20 sequential API calls to complete a single task, including tool lookups, retrieval-augmented generation queries, multi-step reasoning, and final completions [21]. Each call in the chain carries the serialized instruction payload. A 10-call chain does not cost 10x a single call because context accumulates, but the serialization overhead of the instruction format is present in every call.
3. Retry and fallback loops. Failed tool calls trigger repeated LLM invocations [22]. If an agent encounters an unexpected error format, it may enter a retry loop consuming thousands of tokens without achieving its goal [23]. Each retry carries the full serialized instruction payload again. The serialization overhead is a fixed cost per attempt, regardless of whether the attempt succeeds or fails.
4. Context window saturation. Poor context management accounts for 60-70% of total token spend in production agent systems [24]. A production agent handling 2,000 conversations per day at an average of 1,500 tokens per conversation generates approximately 90 million tokens per month [25]. A customer service deployment processing 10,000 conversations per day at 5 turns per conversation at 200 tokens per turn reaches 10 million tokens per day [26]. At these volumes, the serialization format of inter-agent instructions is a recurring line item on every call, not a one-time cost.
5. Aggregate system inefficiency. A typical unoptimized multi-agent system processes 10-50x more tokens than strictly necessary [22]. Unoptimized production agents can cost $10-100+ per session due to long context windows and multi-step loops [24]. A single autonomous agent running a multi-step research task can consume $5-15 in API calls in minutes [22].
The 76.0% token reduction measured in Section 5.2 applies to the serialization layer of every inter-agent instruction. This reduction is a fixed component of a variable cost. Every API call that carries a serialized instruction, whether it is the initial call, a retry, a reasoning step, or a fallback attempt, includes the encoding overhead that SAL eliminates.
The baseline model (Section 5.3) assumes a 1x multiplier: one encoding per message, no repetition. Production systems operate at higher effective multipliers due to the factors documented in Section 5.4.
At 10,000 inter-agent messages per day:
| Scenario | Effective multiplier | Annual JSON input tokens | Annual SAL input tokens | Annual input token savings |
|---|---|---|---|---|
| Baseline (clean) | 1x | 227.8M | 54.8M | ~$430 |
| Moderate production | 3.5x | 797M | 192M | ~$1,515 |
| Multi-agent orchestration | 20x | 4.56B | 1.10B | ~$8,650 |
| Unoptimized system | 50x | 11.39B | 2.74B | ~$21,625 |
At 50,000 inter-agent messages per day:
| Scenario | Effective multiplier | Annual JSON input tokens | Annual SAL input tokens | Annual input token savings |
|---|---|---|---|---|
| Baseline (clean) | 1x | 1.14B | 274M | ~$2,150 |
| Moderate production | 3.5x | 3.99B | 958M | ~$7,575 |
| Multi-agent orchestration | 20x | 22.78B | 5.48B | ~$43,250 |
| Unoptimized system | 50x | 56.95B | 13.70B | ~$108,125 |
The multipliers are derived from cited production reporting, empirical research, and industry analyses, not from theoretical modeling within this paper. The 3.5x multiplier reflects measured conversation history accumulation and token duplication in multi-agent workflows, independently documented at 3.5x in a four-agent implementation [19] and implied at 3.6x by the 72% token duplication rate measured in MetaGPT [12]. The 20x multiplier reflects multi-step tool chains of 10-20 calls with moderate context growth [21]. The 50x multiplier reflects the upper bound of measured inefficiency in unoptimized multi-agent systems [22]. Actual multipliers in any given deployment will depend on the agent architecture, framework, task complexity, and optimization maturity.
What this model does not claim. SAL does not reduce the multiplier itself. An agent that retries 10 times under JSON will retry 10 times under SAL. The conversation history accumulation, the reasoning loops, and the fallback patterns are architectural behaviors of the agent framework, not properties of the serialization format. What SAL reduces is the per-instruction cost that rides on every iteration of that multiplier. The serialization tax is a fixed component of the variable cost, and the variable cost in production is 3.5-50x higher than the clean model predicts.
This analysis measures encoding efficiency, not meaning resolution fidelity. SAL decode is deterministic by design: the same instruction produces the same structured output on every decode deterministically. Whether JSON parsed through an LLM inference layer reliably resolves to the same semantic instruction at production scale is an open empirical question with no published comparative data. A controlled comparison of meaning resolution accuracy between deterministic decode and inference-based parsing under production conditions would constitute a separate analysis with different methodology, different evidence requirements, and different conclusions.
| Layer | Gzip | SAL |
|---|---|---|
| Wire (HTTP transport) | 70-85% compression | 86.8% byte reduction |
| Token (LLM context window) | No effect | 76.0% token reduction |
On unconstrained cloud channels, the wire efficiency argument alone does not justify a protocol change when gzip is available. The token economics argument is structurally independent of gzip because gzip operates below the LLM layer.
On constrained channels (LoRaWAN end devices, Meshtastic mesh nodes, BLE, ESP32), gzip is unavailable: the device lacks memory for streaming decompression, and compressed output may still exceed the payload floor.
Binary serialization formats fall into two architectural classes:
Self-describing formats (MessagePack, CBOR, BSON): Replace JSON's text syntax with binary type headers and compact encodings, but preserve key names, nesting structure, and per-message schema. The receiver needs no prior knowledge of the message structure.
Schema-based formats (Protocol Buffers, FlatBuffers, Apache Avro, Cap'n Proto): Externalize field names into a shared schema (.proto file, .avsc file) compiled at both endpoints. The wire message contains field numbers and wire types, not field names. The receiver recovers names and types from the compiled schema. This is architecturally analogous to SAL's use of the ASD.
Previous versions of this analysis compared SAL only to MessagePack. That comparison is incomplete because it understates what schema-based formats achieve. Protocol Buffers is included here as the strongest binary comparator.
All 29 JSON payloads encoded in MessagePack (msgpack.packb) and Protocol Buffers (compiled .proto schemas with protoc 3.21.12, optimal field numbering using fields 1-15 for single-byte tags). Protobuf schemas were designed generously: low field numbers, compact types, well-structured messages.
| Format | Total Bytes | Reduction vs JSON |
|---|---|---|
| JSON (minified) | 6,896 | baseline |
| MessagePack (self-describing) | 5,848 | 15.2% |
| Protocol Buffers (schema-based) | 3,075 | 55.4% |
| SAL (semantic instruction encoding) | 908 | 86.8% |
Cross-format comparisons:
| Comparison | Reduction |
|---|---|
| SAL vs Protocol Buffers | 70.5% |
| SAL vs MessagePack | 84.5% |
| Protobuf vs MessagePack | 47.3% |
On vector DOM-04 (scale service to 5 replicas with resource limits), protobuf produces 27 bytes while SAL produces 32 bytes. SAL is 18.5% larger. This occurs because the payload is dominated by short numeric and unit values (replicas: 5, cpu: "2000m", memory: "4Gi") that protobuf encodes as varints and short length-prefixed strings while SAL transmits as human-readable UTF-8 text. The SAL encoding uses a compound instruction with inline resource parameters (C:SCALE@api-gateway[5:2000m:4Gi]), eliminating the structural waste of separate opcodes and target duplication present in earlier versions. The remaining 5-byte gap is the human-readability tax on numeric fields.
The 29/29 option that was not taken. SAL's slot value architecture (Section 2 of the semantic dictionary) supports single-character codes for enumerated value sets, as used in H:TRIAGE (I/D/M/B/X) and C:STAT (A/D/E/I/O). The same architecture could encode standard Kubernetes resource tiers as single-character slot values:
| Code | CPU Value | Description |
|---|---|---|
| A | 125m | Eighth core |
| B | 250m | Quarter core |
| C | 500m | Half core |
| D | 1000m | One core |
| E | 2000m | Two cores |
| F | 4000m | Four cores |
| G | 8000m | Eight cores |
| Code | Memory Value | Description |
|---|---|---|
| A | 128Mi | Minimal |
| B | 256Mi | Small |
| C | 512Mi | Medium-small |
| D | 1Gi | Medium |
| E | 2Gi | Medium-large |
| F | 4Gi | Large |
| G | 8Gi | Extra-large |
| H | 16Gi | Double extra-large |
Applying these codes to DOM-04:
| Encoding | SAL Instruction | Bytes | vs Protobuf (27B) |
|---|---|---|---|
| Current (text values) | C:SCALE@api-gateway[5:2000m:4Gi] |
32 | +5 (protobuf wins) |
| With tier codes | C:SCALE@api-gateway[5:E:F] |
26 | -1 (SAL wins) |
The tier code encoding beats protobuf by 1 byte and would flip the scoreboard to 29 of 29. Non-standard values that fall outside the tier table (C:SCALE@api-gateway[5:1742m:3.5Gi], 34 bytes) would fall back to text representation, the same behavior as the current encoding.
This optimization was not applied. The slot value codes for resource tiers are ordinal: E is the 5th tier, not a mnemonic for "2000m." A developer inspecting C:SCALE@api-gateway[5:E:F] in a Kubernetes deployment log cannot infer that E means 2000m CPU and F means 4Gi memory without the ASD. Compare this to H:TRIAGE?I, where I means Immediate, or C:STAT?A, where A means Active. Those are mnemonics. The Kubernetes tier codes are not.
The protocol's design preference is that common operational instructions remain partially inspectable by a human operator without consulting a non-mnemonic lookup table. The triage and status codes satisfy this because their letters are the first letter of the word they represent. Ordinal tier codes that require memorization or reference violate that principle for a 6-byte improvement on one vector. The 5-byte gap on DOM-04 is accepted as the cost of maintaining human readability on the one vector class where it is most expensive. The mechanism is disclosed and available for deployments where byte efficiency at the edge outweighs field inspectability.
Protobuf and SAL share the same architectural bet: externalize schema/dictionary, amortize it across the session. Both eliminate key names from the wire. Why does SAL still beat protobuf by 70.5% on this benchmark set?
On this benchmark set, the answer is content compression, not structural encoding. Protobuf transmits the semantic content in full: the string "Transferred to flights_refunder, adopting the role of flights_refunder immediately." is 85 bytes in protobuf (tag + length prefix + 83 bytes UTF-8 content). SAL replaces this with J:HANDOFF@flights_refunder (26 bytes) because the opcode HANDOFF carries the semantic meaning that the natural language description spelled out.
This is not a fair structural comparison. It is a vocabulary comparison. Protobuf faithfully serializes whatever content the application provides. SAL replaces the content with a domain-specific code. The reduction comes from the domain vocabulary, not from superior encoding mechanics.
On pure structural overhead (tags, length prefixes, type indicators vs namespace prefix, colons, brackets), protobuf and SAL are comparable for low field counts. SAL's advantage grows with content length because opcode substitution compresses content linearly with description verbosity. Protobuf's advantage grows with numeric-heavy payloads where varint encoding is maximally compact.
Methodological note: The protobuf sizes in this analysis were verified against compiled .proto schemas using protoc 3.21.12 with Python serialization. The analytical wire format estimates (used for initial comparison) differed from compiled serializer output by 6 bytes total across all 29 vectors (3,081 analytical vs 3,075 compiled, a 0.2% delta). Three vectors produced compiled output 2 bytes smaller than estimated due to proto3 default value omission. All other vectors were byte-identical. The compiled results are used in this paper. The .proto schemas and serialization code are published in the repository.
Defensible summary: Against the benchmark set used here, compiled-schema Protocol Buffers captures much of JSON's structural inefficiency, reducing total bytes by 55.4% versus minified JSON. SAL remains smaller by a further 70.5% relative to compiled protobuf, but that remaining gap is attributable primarily to domain-specific semantic abbreviation via a shared dictionary rather than to structural encoding alone.
JSON-RPC is self-describing: the grammar is the full system. Protobuf requires grammar + schema compiler + .proto files at both endpoints. SAL requires grammar + ASD + MDR corpora at both endpoints. Protobuf and SAL are comparable in total system complexity; both trade per-message simplicity for per-session dependency. JSON trades per-session simplicity for per-message overhead.
The results in Sections 6.2 through 6.4 measure raw SAL, the human-readable UTF-8 text as it appears at encode and decode time. The OSMP specification also defines a two-tier encoding architecture in which SAL-encoded instructions undergo lossless block compression prior to wire transmission. The first tier is the semantic encoding (natural language to SAL via the ASD). The second tier is lossless compression of the SAL byte stream for transport. The human-readable form is recoverable at the receiver by reversing the second tier and resolving the first tier by dictionary lookup.
Per-message results. On the 29 benchmark vectors, the second-tier compression does not reduce byte count below the raw SAL size for any individual message. The overhead of the compression frame header exceeds the savings achievable on payloads of 10-65 bytes. The protocol's mode selection mechanism correctly selects uncompressed SAL for all 29 individual messages. This is the compression floor guarantee operating as designed: the two-tier architecture never increases byte count over the single-tier encoding.
Batch transmission results. Multi-agent workflows frequently produce compound instruction sequences transmitted as a group. When all 29 benchmark instructions are encoded as a single batch (simulating a compound workflow or Overflow Protocol sequential burst), second-pass lossless compression is applied to all formats to ensure symmetric comparison. Results:
| Format + Best Compression | Batch Size (29 instructions) |
|---|---|
| SAL + lossless block compression | 632 bytes |
| Protocol Buffers + gzip | 1,751 bytes |
| JSON (minified) + gzip | 2,623 bytes |
| MessagePack (array) + gzip | 2,698 bytes |
SAL's batch compression reduces the combined payload from 936 raw bytes to 632 bytes, a 32.5% compression over the already-compact SAL encoding. Protobuf's concatenated binary stream compresses from 3,075 raw bytes to 1,751 under gzip (43.1% compression), confirming that protobuf's binary does contain exploitable redundancy in batch. JSON + gzip achieves 62.1% compression from 6,924 raw bytes to 2,623. Compressed SAL remains 63.9% smaller than compressed protobuf, 75.9% smaller than compressed JSON, and 76.6% smaller than compressed MessagePack. All formats receive their best available second-pass lossless compression in this comparison.
The batch compression ratio improves with message count because SAL's structural tokens (namespace prefixes, colons, operators, bracket patterns) are highly repetitive across instructions in the same domain. Protobuf's binary encoding, already compact per-message, offers less redundancy for a second-pass compressor to exploit. The batch experiment is a benchmark aggregate, not a measured production traffic trace.
Per-vector scoreboard: raw SAL vs Protocol Buffers. Without any second-tier compression, SAL produces a smaller encoding than Protocol Buffers on 28 of 29 vectors. The single vector where protobuf wins (DOM-04: Kubernetes scaling with resource limits, protobuf 27 bytes, SAL 32 bytes) is a numeric-heavy payload where the 5-byte gap represents the human-readability constraint on resource limit encoding. On all other vector classes, SAL's semantic content compression produces smaller encodings than protobuf's binary serialization.
Traffic-weighted representation. The 28-of-29 scoreboard is a vector count, not a traffic-weighted result. If numeric-only payloads (where protobuf is competitive or superior) represent a large fraction of production agent traffic, the effective advantage would be lower than the per-vector count implies. Among the message types documented in production multi-agent frameworks (tool calls, task delegations, handoffs, configuration exchange), the dominant pattern is text-heavy with natural language descriptions, method names, and context strings. Numeric-only patterns are more characteristic of sensor telemetry than agent-to-agent instruction exchange. The scoreboard reflects this instruction-class distribution, but it is based on 29 selected vectors, not measured traffic distributions.
Interpretation. The two-tier architecture does not change SAL's per-message competitive position against protobuf. Raw SAL already wins 28 of 29 vectors on semantic compression alone. The second tier matters for compound instruction chains where batch compression of the SAL stream yields further reduction. When all formats receive their best second-pass compression, SAL (632 bytes) remains 63.9% smaller than compressed protobuf (1,751 bytes) and 75.9% smaller than gzipped JSON (2,623 bytes). This is the pattern that occurs in production multi-agent workflows: not single isolated instructions, but sequences of related instructions with shared namespace context, repeated targets, and structural regularity.
Two corpora of 500 messages each were generated: one of JSON-RPC structural tokens and one of SAL structural tokens. Content values replaced with fixed-length placeholders. Byte-level Shannon entropy computed.
Limitations: These corpora are not like-for-like. JSON's corpus includes repeated field names because JSON is self-describing. SAL's corpus excludes the ASD that gives its symbols meaning. The comparison measures entropy of the wire format only, not entropy of the full encoding system including external dependencies. Results should be interpreted within this constraint.
| Metric | JSON structural | SAL structural |
|---|---|---|
| Corpus size | 80,070 bytes | 16,306 bytes |
| Entropy | 4.20 bits/byte | 3.60 bits/byte |
| Total structural information | 336,558 bits | 58,640 bits |
SAL's per-byte entropy is lower than JSON's (3.60 vs 4.20 bits/byte). Each SAL structural byte is individually more redundant. But SAL produces 4.9x fewer structural bytes, yielding 5.7x lower total structural information cost.
SAL achieves this by eliminating entire categories of structural tokens (quotes at 26% of JSON's stream, key names at ~30%, envelope tokens). This elimination mechanism differs from compression: gzip can exploit redundancy within JSON's structural stream, but cannot remove the grammar's requirement for key names, quotes, and envelopes. Schema-based formats like protobuf CAN remove field names, which is why protobuf's structural overhead is substantially lower than JSON's. SAL's remaining advantage over protobuf comes from content vocabulary compression, not structural elimination.
| Framework | Vectors | Mean Byte Reduction vs JSON | Mean Token Reduction | Mean Byte Reduction vs Protobuf |
|---|---|---|---|---|
| MCP | 7 | 87.2% | 72.7% | 51.3% |
| OpenAI | 7 | 85.5% | 72.2% | 59.2% |
| Google A2A | 5 | 86.1% | 75.1% | 61.3% |
| CrewAI | 5 | 84.0% | 68.5% | 77.6% |
| AutoGen | 5 | 85.8% | 81.6% | 77.8% |
| Cross-domain | 7 | 87.5% | 74.8% | 50.5% |
CrewAI and AutoGen show the largest SAL-vs-protobuf gap because their payloads contain long natural language strings (task descriptions, agent backstories, conversation histories) that protobuf transmits verbatim while SAL replaces with opcodes. MCP and OpenAI show the smallest gap because their tool call payloads are more structured with shorter string values.
OSMP operates across four communication modes. The encoding format does not change between modes; the binding constraint and the economic argument differ.
The LoRa Layer Stack. LoRa is a physical radio modulation technique (chirp spread spectrum, Semtech). Two protocol ecosystems sit on top of it:
LoRaWAN is a MAC/network protocol managed by the LoRa Alliance (500+ member companies). Star-of-stars topology: end devices communicate through gateways to centralized network servers. Millions of devices deployed globally for metering, tracking, agriculture, and industrial monitoring.
Meshtastic is a peer-to-peer mesh protocol on the same LoRa hardware. Decentralized, no gateway or server dependency. Community-oriented with active AI integration efforts (MESH-API, DEF CON 700+ node mesh demonstration).
OSMP operates at the application layer, above both. The payload encoding goes inside either protocol's frame.
Payload floors by transport and region:
| Transport / Config | Application Payload |
|---|---|
| LoRaWAN US915 DR0 (SF10) | 11 bytes |
| LoRaWAN EU868 DR0 (SF12) | 51 bytes |
| LoRaWAN US/EU DR3+ (SF9) | 125+ bytes |
| Meshtastic LongFast (default) | 228-255 bytes |
At 11 bytes, H:TRIAGE?I (10 bytes) fits. A bare {"a":1} is 7 bytes. A JSON-RPC envelope (82 bytes) is physically impossible below DR3. A protobuf-encoded minimal tool call (approximately 35 bytes) exceeds the EU DR0 floor. Among the formats compared in this analysis, SAL produces the most compact multi-field instructions at 11-51 byte floors.
Existing application-layer encodings. The closest standardized application-layer encoding in LoRaWAN is CayenneLPP (Cayenne Low Power Payload), a binary TLV format supporting approximately 20 sensor data types. CayenneLPP is built into The Things Stack as a native payload formatter. It handles sensor telemetry compactly: a temperature reading is 4 bytes (channel + type + value).
CayenneLPP cannot express conditional logic, multi-step workflows, cross-domain composition, or any form of agentic instruction. When a LoRaWAN deployment needs to send instructions rather than readings, custom binary byte layouts with device-specific payload formatters are the norm. There is no dominant standardized, semantically composable instruction encoding for these channels comparable to what CayenneLPP provides for telemetry.
OSMP is proposed for this instruction-encoding space. A SAL payload formatter for The Things Stack would be a JavaScript function performing ASD lookup, enabling any LoRaWAN network server to decode OSMP instructions without per-device custom code. On Meshtastic, SAL instructions are UTF-8 text through the existing message channel. This does not mean no other instruction-layer solutions exist; custom TLV protocols, domain-specific command encodings, and proprietary control payloads are widely used. What does not exist is a cross-domain standard with a published dictionary and open implementations.
The constrained-channel claim should be scoped precisely: OSMP does not replace CayenneLPP for compact numeric telemetry (where CayenneLPP's binary encoding is more compact per reading). It addresses the instruction encoding space that CayenneLPP does not cover.
The numeric tax on biometric telemetry. The tradeoff between SAL's human-readable encoding and binary numeric encoding can be quantified on real clinical data. Human vital signs occupy narrow numeric ranges that rarely exceed three digits:
| Vital Sign | Clinical Range | SAL Encoding | SAL Bytes | Binary (protobuf-equivalent) | Binary Bytes | Delta |
|---|---|---|---|---|---|---|
| Heart rate | 40-200 bpm | H:HR[72] |
8 | channel + type + varint | 4 | +4 |
| SpO2 | 70-100% | H:SPO2[97] |
10 | channel + type + varint | 4 | +6 |
| Respiratory rate | 8-40 /min | H:RR[16] |
8 | channel + type + varint | 4 | +4 |
| Temperature | 90-110 F | H:TEMP[101] |
11 | channel + type + varint | 4 | +7 |
| Blood pressure | 60-250 mmHg | H:BP[120/80] |
12 | channel + type + 2x varint | 4 | +8 |
| Glucose | 40-500 mg/dL | H:GLUC[95] |
10 | channel + type + varint | 4 | +6 |
| Triage classification | I/D/M/B/X | H:TRIAGE?I |
10 | channel + type + enum | 3 | +7 |
A complete vitals panel (7 readings): SAL costs 69 bytes. Binary costs 27 bytes. The numeric tax is 42 bytes total, averaging 6 bytes per reading.
What the tax buys. The same device, in the same grammar, on the same channel, without switching protocols, can also express:
| Instruction | SAL Encoding | Bytes | Fits EU DR0 (51B) |
|---|---|---|---|
| Threshold-triggered clinical alert | H:HR<60→H:ALERT[BRADYCARDIA]∧H:ICD[R001] |
44 | Yes |
| Triage + MEDEVAC request | H:TRIAGE?I∧H:ICD[J939]→H:CASREP@coord⚠ |
44 | Yes |
| Composite vitals in one frame | H:HR[72]∧H:SPO2[97]∧H:RR[16]∧H:BP[120] |
44 | Yes |
| GPS-tagged triage | E:GPS[40.7128,-74.006]∧H:TRIAGE?I |
35 | Yes |
None of these can be expressed in CayenneLPP at any byte count. A protobuf encoding could serialize equivalent data structures, but the .proto schema defining those structures must be designed, agreed upon by sender and receiver, compiled, deployed to both endpoints, and versioned. SAL carries the semantics in the encoding itself, resolvable by any node with the ASD.
The complete sense-decide-act loop (sensor reading, threshold evaluation, alert composition, coordination request, command response) can run in one grammar with no protocol translation boundaries between stages. A deployment using CayenneLPP for telemetry plus custom binary for thresholds plus JSON for cloud coordination maintains three encoding formats with translation logic at every boundary. Each boundary is code to write, test, maintain, and debug. On a constrained device in a disconnected environment, each boundary is also a failure surface.
The protocol-level tradeoff is a 42-byte numeric tax per vitals panel in exchange for a unified encoding surface across the sense-decide-act loop.
The edge node encodes in SAL and transmits via LoRaWAN gateway, Meshtastic mesh relay, cellular, or satellite. The cloud node decodes by ASD lookup. A triage classification with ICD-10 diagnosis, GPS coordinates, and MEDEVAC request encodes in approximately 38 bytes, transmissible at any LoRaWAN data rate or over Meshtastic.
The H namespace contains 16 clinical opcodes. The ICD-10-CM MDR corpus contains 74,719 codes resolvable at the edge deterministically. Decode is deterministic: H:ICD[J939] resolves to "Pneumothorax, unspecified" by dictionary lookup. Deterministic decode is not the same as semantic adequacy: if the opcode inventory lacks a needed clinical concept, the decode is exact but incomplete. The ASD covers the ICD-10-CM code set; concepts outside that set require Frame Negotiation Protocol extension.
Bandwidth is free. Gzip handles wire compression. But the LLM reads full uncompressed text as tokens. An agent encountering OSMP via the MCP server discovers eight tools and six resources, learns the grammar through tool use, and produces instructions at 76.0% fewer tokens (measured, Section 5.2).
The tradeoff: SAL instructions are not self-describing. A developer debugging a multi-agent system must consult the ASD to read H:TRIAGE?I where JSON shows "triage_category":"immediate" directly. For deployments where API cost at scale is the binding constraint, this tradeoff is favorable. For prototyping where cost is immaterial and debuggability is paramount, JSON's self-description has higher immediate value.
Instructions carrying regulatory code references (ICD-10, CFR, NFPA, ISO 20022) must be machine-readable and unambiguous across organizational boundaries. SAL's domain code accessor pattern (H:ICD[J939], K:PAY[ISO20022:MsgId]) enables compliance logging where the instruction constitutes an auditable record. Any party with the ASD decodes to full regulatory meaning without proprietary tooling.
Previous versions of this analysis used MessagePack as a binary proxy. That was weak. Protocol Buffers achieves 55.4% reduction over JSON by externalizing field names via compiled schemas. SAL's remaining 70.5% advantage over protobuf derives primarily from semantic content compression (opcode substitution), not structural encoding superiority. On numeric-heavy payloads where there is no natural language to compress, protobuf outperforms SAL (DOM-04: protobuf 27 bytes, SAL 32 bytes). Section 6.3 reports this honestly.
Key names have real engineering value: debugging, interoperability, evolvability, auditability, and safety inspection. The content-structure decomposition treats this value as zero for wire/token measurement purposes. That is a modeling choice stated explicitly in Section 3.1, not an objective fact.
Correct at the wire layer. Inapplicable at the token layer. On constrained channels, gzip is unavailable. See Section 5.7.
The ASD is a hard dependency. If the receiver lacks it, the message is unintelligible. JSON is self-contained. Protobuf requires compiled schemas, a comparable dependency. The ASD goes further than protobuf schemas by also compressing content, which tightens the coupling: a missing opcode is not just a missing field name but a missing concept.
Dictionary lifecycle costs (versioning, synchronization, governance, backward compatibility) are real system costs not measured in this per-message encoding analysis. These costs should not be dismissed as "amortized"; they are engineering burden that scales with deployment complexity.
If sender and receiver ASD versions diverge, instructions may contain opcodes unknown to the receiver. The protocol addresses this through multiple mechanisms at different layers:
The Frame Negotiation Protocol (FNP) detects version mismatch during session handshake via fingerprint and version comparison. The ASD Distribution Protocol (ADP), implemented in v1.0, extends FNP with SAL-level instructions for version identity exchange (A:ASD[M.m]), delta request and delivery (A:ASD:DELTA), single-opcode micro-delta for task-relevant repair (A:ASD:DEF), and hash verification (A:ASD:HASH). ADP uses the existing Category 6 glyph designators (+, ←, †) for delta operations with CRDT-based conflict resolution (G-Set for additive, LWW-Register for replacement, Tombstone for deprecation). REPLACE operations enforce mandatory retransmission via FLAGS[C], preventing silent semantic drift.
The ASD version field uses u16 interpreted as u8.u8 (MAJOR.MINOR), enabling breaking-change detection from the version number alone without inspecting delta payloads. A priority hierarchy ensures dictionary synchronization traffic never blocks mission instructions: mission traffic transmits first, micro-deltas (task-relevant single opcode definitions) second, background deltas third, and trickle charge requests in idle bandwidth only.
A semantic dependency resolution buffer holds instructions referencing undefined opcodes as "semantically pending" until the defining delta unit arrives. This prevents instruction loss during synchronization without blocking the session.
A guaranteed minimum operational vocabulary floor ensures that baseline-layer instructions are executable at any node regardless of synchronization state, network connectivity, or duration of disconnection.
The governance model (who approves dictionary changes) remains unquantified and is intentionally outside the protocol specification. The synchronization mechanism (how changes propagate) is now implemented and tested.
352 opcodes across 26 namespaces cannot express every possible agent instruction. If the instruction domain falls outside the ASD vocabulary, SAL cannot encode it without sovereign extension (the Omega namespace) or MDR registration. Deterministic decode is not semantic adequacy: an exact decode of a limited vocabulary may lose nuance present in the original natural language instruction.
SAL's total byte reduction combines structural overhead reduction and vocabulary-level content compression. Section 3.1 states this explicitly. Section 6.4 separates the two mechanisms and explains why SAL's advantage over protobuf is primarily content compression.
The 29 test vectors are drawn from tool-call and task-delegation patterns, which are verbose JSON-RPC wrappers. This is the dominant inter-agent communication pattern in the five frameworks tested, but it is not the only pattern. Short status messages, numeric sensor data, and binary payloads would show smaller SAL advantages or protobuf superiority (as DOM-04 demonstrates).
Unicode composition operators (3 bytes each in UTF-8) matter near LoRaWAN payload floors and are negligible for typical cloud messages where the envelope savings (82 bytes) dominate.
SAL does not replace JSON or protobuf for arbitrary data. It replaces verbose text-based encodings where agent instructions are the payload.
JSON's structural corpus includes self-describing tokens; SAL's excludes the external ASD. The comparison measures wire format entropy, not full system entropy. Section 7.1 states this limitation. The finding that SAL eliminates structural token categories is valid within this scope; the claim should not be extended to full-system information-theoretic superiority.
SAL depends on UTF-8 preservation through the transport layer. Systems that strip, transliterate, or corrupt multi-byte UTF-8 sequences will damage glyph operators. This is a deployment constraint documented in the protocol specification (malformed fragments treated as missing under loss tolerance policy) but not stress-tested in production.
This analysis measures encoding efficiency, not decoding complexity. SAL parsing requires Unicode-aware tokenization, namespace resolution, glyph operator recognition, and ASD lookup. JSON parsing is a solved problem with mature libraries in every language. Protobuf parsing is handled by generated code from compiled schemas. The implementation burden of a new parser for a new format is a real adoption cost not measured here.
If sender and receiver have overlapping but non-identical ASD versions, an opcode may resolve to different definitions at each endpoint. The ASD Distribution Protocol mitigates this through three mechanisms: FNP fingerprint comparison detects any content divergence at session start, the MAJOR.MINOR version scheme signals breaking changes (REPLACE operations increment the major version), and the mandatory FLAGS[C] retransmission on REPLACE deltas prevents silent semantic drift from lost update packets. However, the failure mode under partial overlap where both sides have the opcode but with different semantic definitions due to a missed REPLACE delta has not been tested in production. Silent semantic divergence is potentially worse than a clean unknown-opcode error. The hash verification instruction (A:ASD:HASH) provides a post-synchronization integrity check but does not prevent mid-session divergence if a breaking delta is queued rather than applied.
The Protocol Buffers byte counts were initially computed analytically from wire format encoding rules. They have since been verified against compiled .proto schemas using protoc 3.21.12 with Python serialization. The compiled output differed from the analytical estimate by 6 bytes total across 29 vectors (0.2%). The compiled results are used in this paper. The .proto schemas are published in the repository.
| Component | Status |
|---|---|
| Specification | OSMP v1.0, SAL EBNF grammar, semantic dictionary v12 |
| Patent | Patent pending |
| Python SDK (reference) | 167 tests passing |
| TypeScript SDK | 52 tests passing |
| Go SDK | 12 tests passing |
| MCP Server | PyPI v1.0.12, 8 tools, 6 resources, 3 MDR corpora |
| Canonical benchmark | 55 vectors, 60.8% mean SAL-vs-NL reduction, CONFORMANT |
| Website | octid.io (static, Netlify) |
| License | Apache 2.0 with express patent grant |
| Production deployments | Zero |
| Third-party adoption | Zero |
| Independent audit | None |
| Formal verification | None |
SAL, JSON, Protocol Buffers, and MessagePack encode the same semantic content using different architectural models:
| Format | Model | Key Names | Content | Schema Dependency |
|---|---|---|---|---|
| JSON-RPC | Self-describing text | Per-message | Verbatim | None |
| MessagePack | Self-describing binary | Per-message | Verbatim | None |
| Protocol Buffers | Schema-based binary | Externalized | Verbatim | .proto schema |
| SAL | Semantic instruction encoding | Externalized | Opcode-compressed | ASD dictionary |
The benchmark results reported here are consistent with these architectural differences:
| Dimension | JSON | MessagePack | Protobuf | SAL |
|---|---|---|---|---|
| Byte reduction vs JSON (per-message) | -- | 15.2% | 55.4% | 86.8% |
| Batch (29 msgs, each format best-compressed) | gzip: 2,623B | gzip: 2,698B | gzip: 1,751B | 632B |
| Token reduction vs JSON | -- | not measured | not applicable (binary) | 76.0% |
| Per-vector wins vs protobuf (29 vectors) | -- | -- | 1 | 28 |
| Schema/dictionary dependency | None | None | Compiled schema | ASD + MDR |
On this benchmark set, SAL's per-message advantage over protobuf (70.5% byte reduction) derives primarily from semantic content compression: opcodes replacing longer natural-language descriptions. This is a vocabulary optimization, not a structural encoding improvement. On numeric-heavy payloads with no natural language content, protobuf outperforms SAL (1 of 29 vectors). On batch transmission, with all formats receiving their best available second-pass compression, compressed SAL (632 bytes) remains 63.9% smaller than compressed protobuf (1,751 bytes) and 75.9% smaller than gzipped JSON (2,623 bytes).
SAL's token reduction is independent of gzip and binary wire serialization at the point where an LLM consumes plain-text instructions. In deployments where inter-agent instructions are rendered into model context as text, SAL reduces context-window load relative to the JSON payloads benchmarked here. To the extent that serialized instruction payloads recur across retry loops, conversation history accumulation, and multi-step tool chains documented in production multi-agent systems (Section 5.4-5.5), the 76.0% per-instruction token reduction compounds with those effects. Cited production multipliers of 3.5-50x over clean single-pass baselines place the effective annual savings range between approximately $2,150 and $108,125 at 50,000 daily inter-agent messages, depending on system optimization maturity. These extrapolations are bounded by the cited sources and should be read as indicative ranges, not predictions for any specific deployment.
On constrained channels (LoRaWAN, Meshtastic), SAL produces multi-field instructions within payload floors (11-51 bytes) where conventional JSON-RPC envelopes do not fit and where schema-based binary encodings remain constrained by the same application content.
The protocol exists as open-source software with published code, test vectors, and benchmarks. It has no production deployments and no third-party adoption. The encoding measurements reported here can be independently reproduced using the published materials.
The results in this paper should be read as bounded comparative measurements, not as a claim that SAL is a general replacement for JSON or Protocol Buffers. The measured advantage depends on three conditions: a domain where instructions can be represented by a controlled vocabulary, endpoints willing to externalize semantics into a shared dictionary, and workflows in which instruction payload size or tokenized text volume is operationally relevant.
All code and data at github.com/octid-io/cloudless-sky/benchmarks/sal-vs-json/:
python3 benchmark.py # 29-vector empirical benchmark
python3 grammar-analysis.py # 1,000-point structural sweep
python3 protobuf-comparison.py # Four-way format comparison
Protocol Buffers schemas: benchmark.proto (compiled with protoc 3.21.12). Two-tier batch compression results in dpack-comparison.json. Dependencies: tiktoken, msgpack, protobuf, grpcio-tools (pip install). Additional dependencies for two-tier analysis are listed in the repository.
[1] Shannon, C.E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal, 27(3), 379-423.
[2] Chomsky, N. (1956). "Three models for the description of language." IRE Transactions on Information Theory, 2(3), 113-124.
[3] Koplenig, A., Wolfer, S., Rüdiger, J.O., and Meyer, P. (2025). "Human languages trade off complexity against efficiency." PLOS Complex Systems, 2(2), e0000032. DOI: 10.1371/journal.pcsy.0000032.
[4] RFC 8259: The JavaScript Object Notation (JSON) Data Interchange Format.
[5] JSON-RPC 2.0 Specification. jsonrpc.org.
[6] Model Context Protocol Specification, v2025-11-25. modelcontextprotocol.io.
[7] Agent2Agent Protocol Specification, v0.2.5. a2a-protocol.org.
[8] OpenAI Function Calling Guide. developers.openai.com.
[9] CrewAI Documentation. docs.crewai.com.
[10] Microsoft AutoGen Documentation. microsoft.github.io/autogen.
[11] Van Gassen, E. (2026). "Semantic Compression of LLM Instructions via Symbolic Metalanguages." arXiv:2601.07354.
[12] Wang, Q., Tang, Z., Jiang, Z., Chen, N., Wang, T., and He, B. (2025). "AgentTaxo: Dissecting and Benchmarking Token Distribution of LLM Multi-Agent Systems." ICLR 2025 Workshop on Foundation Models in the Wild. (Token duplication rates: 72% in MetaGPT, 86% in CAMEL, 53% in AgentVerse.)
[13] Zhang, G., Yue, Y., Li, Z., et al. (2025). "Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems." ICLR 2025. arXiv:2410.02506.
[14] Chen, W., Yuan, J., Qian, C., Yang, C., Liu, Z., and Sun, M. (2025). "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System." Findings of ACL 2025, pp. 11534-11557. arXiv:2410.08115.
[15] Protocol Buffers Encoding Guide. protobuf.dev/programming-guides/encoding.
[16] LoRa Alliance. LoRaWAN Regional Parameters v1.0.3 (2018).
[17] myDevices. Cayenne Low Power Payload (CayenneLPP). docs.mydevices.com/docs/lorawan/cayenne-lpp.
[18] Semtech. "Packet Size Considerations." lora-developers.semtech.com.
[19] Towards Data Science (2026). "The Multi-Agent Trap." towardsdatascience.com/the-multi-agent-trap/. Retrieved March 2026. (Measured 3.5x token cost multiplier in 4-agent document analysis workflow: 10,000 tokens single-agent to 35,000 tokens multi-agent, before retries and coordination overhead.)
[20] GuruSup (2026). "Best Multi-Agent Frameworks in 2026." gurusup.com/blog/best-multi-agent-frameworks-2026. Retrieved March 2026. (AutoGen 4-agent GroupChat: 20 LLM calls minimum per 5-round task.)
[21] Zuplo (2026). "Token-Based Rate Limiting: How to Manage AI Agent API Traffic in 2026." zuplo.com/learning-center/token-based-rate-limiting-ai-agents. Retrieved March 2026. (Autonomous agents chain 10-20 sequential API calls per single task.)
[22] Moltbook-AI (2026). "AI Agent Cost Optimization Guide 2026." moltbook-ai.com/posts/ai-agent-cost-optimization-2026. Retrieved March 2026. (Unoptimized multi-agent systems process 10-50x more tokens than necessary; single research task $5-15 in API calls in minutes.)
[23] InvestGlass (2026). "How to Control API Costs in an Agentic AI World." investglass.com/how-to-control-api-costs-in-an-agentic-ai-world/. Retrieved March 2026. (Agents entering retry loops from unexpected error formats, consuming thousands of tokens in seconds.)
[24] Fast.io (2026). "AI Agent Token Cost Optimization: Complete Guide for 2026." fast.io/resources/ai-agent-token-cost-optimization/. Retrieved March 2026. (Poor context management causes 60-70% of total spend; unoptimized agents $10-100+ per session.)
[25] Neontri (2026). "AI Agent Development Cost in 2026: Full Budget Guide." neontri.com/blog/ai-agent-development-cost/. Retrieved March 2026. (Production agent: 2,000 conversations/day, ~90M tokens/month at 1,500 tokens/conversation with 1:2 input-to-output ratio.)
[26] Spheron (2026). "How to Build GPU Infrastructure for AI Agents: The 2026 Compute Playbook." spheron.network/blog/gpu-infrastructure-ai-agents-2026/. Retrieved March 2026. (Customer service agent: 10,000 conversations/day x 5 turns x 200 tokens = 10M tokens/day.)
OSMP is patent pending. This paper describes measurement methodology and competitive analysis applied to the disclosed protocol. The analytical frameworks (Shannon entropy, grammar production comparison, token cost measurement) are measurement tools, not protocol extensions. The benchmark results, format comparisons, and economic projections are analytical work product not present in the patent application. Readers should not treat this paper as a definitive scope statement regarding the patent's disclosure.