#import "@preview/arkheion:0.1.0": arkheion, arkheion-appendices

#show: arkheion.with(
  title: "Peer2Prompt: A Peer-to-Peer Protocol for Anonymous AI Inference",
  authors: (
    (name: "Anonymous", email: "", affiliation: ""),
  ),
  abstract: [Current AI inference systems bind prompt content to persistent user identity. This creates a surveillance default: providers can build longitudinal behavioral profiles even when prompts are ordinary. Existing alternatives fail in practice. Local inference cannot match frontier model quality for many users. Computationally intensive cryptographic approaches (MPC, FHE, GPU TEEs for full model serving) are not deployable at consumer scale today. Centralized privacy gateways can reduce linkage but remain vulnerable to legal and operational shutdown.

Peer2Prompt is a peer-to-peer protocol that targets one narrow property: unlinking _who asked_ from _what was asked_ while preserving access to frontier APIs. Peer2Prompt introduces *Ghost Nodes*: participants who contribute their own provider API keys and earn electronic cash by serving inference for others. Routing uses layered encryption and multi-hop forwarding so no single relay observes both client network identity and cleartext prompt. Payments use Cashu-style Chaumian ecash over Lightning, allowing spend and redemption without direct payer-payee linkage in the common case. Discovery and coordination use a Kademlia-style distributed hash table (DHT), avoiding central servers.

The design does not claim perfect anonymity. Ghost nodes can read prompts they serve, global traffic observers can correlate flows, and stylometric methods can re-identify some users. Peer2Prompt's claim is weaker and practical: it raises surveillance cost from near-zero bulk profiling to targeted, expensive investigation. The protocol is built around economic incentives. Users pay an anonymity premium; Ghost and GPU nodes compete to provide inference and capture that premium.],
  keywords: ("anonymous inference", "peer-to-peer", "onion routing", "ecash", "AI privacy"),
  date: "February 24, 2026 — v0.1.0",
)

#set cite(style: "ieee")

= Introduction

AI assistants are becoming a general interface to thought and work. In most deployments, every query is logged under a stable account, payment instrument, device fingerprint, or enterprise tenant. The technical result is trivial profile construction: a provider can map time, content, and identity at full resolution.

The privacy problem is often framed as content secrecy, but the critical leak is linkage. "A question about a rash" is low-value telemetry. "A question about a rash from a specific person across years" is high-value telemetry. The goal is not to hide language itself from all observers; the goal is to break durable identity linkage at scale.

Several approaches exist:

+ Local open-source models avoid remote logging but often sacrifice capability, context length, latency, or operating cost.
+ Cryptographic private inference (MPC, FHE, TEE-based) is promising but not near-term for broad frontier use.
+ Centralized privacy brokers can mediate access but create single points of compromise.

Peer2Prompt adopts a different path: preserve existing model APIs, but route access through a censorship-resistant marketplace where identity and prompt are separated by protocol design. The protocol combines three mature ingredients:

+ Onion-routed, multi-hop message forwarding for unlinkability.
+ Chaumian blind-signature ecash for private payment.
+ DHT-based peer discovery for serverless operation.

The novel piece is economic, not cryptographic: *Ghost Nodes* monetize API keys by serving anonymous traffic. Privacy is funded directly by users who value it.

= Related Work

Several approaches have been proposed to address AI inference privacy. Each suffers from practical limitations that motivate Peer2Prompt's design.

*Private Inference Techniques.* TEEs, SMPC, and FHE provide different trust and performance trade-offs @gentry2009fhe @yao1986generate. TEEs rely on hardware isolation and side-channel resistance assumptions; SMPC and FHE provide stronger cryptographic privacy at substantially higher compute and latency cost in current deployments. For frontier-scale, low-latency consumer inference, none of these approaches are broadly practical today.

*Query Decomposition.* Splitting complex queries into generic sub-queries and routing them through different accounts degrades response quality. The requirement for a local model to reassemble responses adds user friction, and PII stripping mechanisms remain fragile against motivated adversaries.

*Pure P2P Query Routing.* A peer-to-peer network where peers route queries and exit through shared, pooled API accounts fails on two fronts. First, exit node trust: random peers see plaintext queries, trading corporate surveillance for surveillance by unknown third parties. Second, account survivability: shared or pooled accounts quickly trigger rate limits and bans from providers.

*Centralized Blind Relays.* A consortium model using blind tokens and non-colluding relays @privacypass provides good privacy properties but introduces single points of failure. A corporate entity managing relay infrastructure or holding enterprise API keys can be compelled to shut down, freeze funds, or revoke access through legal or regulatory pressure.

*Decentralized AI Networks.* Systems such as Bittensor @bittensor2021 provide decentralized model serving with token incentives but do not target query anonymity. Users are identifiable, and queries are attributed. These systems solve the compute distribution problem but not the surveillance problem.

Peer2Prompt combines onion routing with a decentralized economic marketplace. It requires no central coordinator and no provider cooperation.

= Design Goals and Non-Goals

== Goals

+ Preserve frontier quality. Prompts are forwarded verbatim; no mandatory decomposition or semantic rewriting.
+ Provide practical unlinkability. No honest single node should learn both requester identity and prompt content.
+ Remove central control points. Discovery and routing should survive company failure or jurisdictional pressure.
+ Enable open participation. Any user can relay; API key holders can become Ghost nodes; GPU owners can serve local models.
+ Align incentives. Rational participants should profit by honest service under normal market conditions.

== Non-Goals

+ Perfect anonymity against a global active adversary.
+ Confidentiality from the serving Ghost node without additional hardware trust assumptions.
+ Guaranteed immunity from provider policy or legal intervention.
+ Universal protection from stylometric re-identification.

Peer2Prompt is engineered for _economic infeasibility of mass surveillance_, not cryptographic impossibility of deanonymization.

= System Model

== Roles

+ *Client*: initiates inference requests and receives responses.
+ *Relay Node*: forwards onion-encrypted packets; does not access cleartext prompts.
+ *Ghost Node*: contributes third-party model API credentials; submits prompts to providers; returns outputs.
+ *GPU Node*: serves open-source models directly; API-free alternative supply.
+ *Mint*: issues and redeems Cashu ecash tokens using blind signatures.

All roles except the mint are peer-to-peer and discoverable via DHT advertisements.

== Trust Boundaries

+ Client trusts no relay by default.
+ Client treats Ghost as potentially curious about prompt content.
+ Client treats mint as potentially curious about withdrawal/redeem timing and amount, but unable to link specific blinded tokens under standard blind-signature assumptions.
+ Provider sees Ghost account activity and prompt content, but not originating client identity unless external correlation succeeds.

== Data Visibility by Role

#figure(
  table(
    align: center,
    columns: (auto, auto, auto, auto, auto),
    stroke: 0.5pt,
    inset: 5pt,
    [*Role*], [*Sees IP*], [*Sees Prompt*], [*Sees Payment ID*], [*Sees Response*],
    [Ingress hop], [Client IP], [No], [No], [Encrypted],
    [Mid relay], [Prev/next], [No], [No], [Encrypted],
    [Ghost node], [Prev hop], [Yes], [Token proof only], [Yes],
    [AI provider], [Ghost IP], [Yes], [Ghost billing], [Yes],
    [Mint], [No], [No], [Mint-side ledger], [No],
  ),
  caption: [Data visibility matrix by network role.]
) <visibility>

== Adversary and Network Assumptions

+ Endpoints are not assumed secure. Client or Ghost compromise defeats protocol privacy.
+ Network is asynchronous. Packets can be delayed, dropped, replayed, or reordered.
+ Public keys are long-lived enough for routing but can rotate; session keys are ephemeral.
+ At least one non-colluding boundary exists between entry-side observation and exit-side prompt observation for a protected session.
+ Mints are economically rational but not fully trusted; clients may use multiple mints.

= Protocol Overview

== Peer Discovery (DHT)

Peer2Prompt uses a Kademlia-style DHT @maymounkov2002kademlia. Each node has an identifier $"NodeID" = H("pubkey")$. Nodes maintain $k$-buckets indexed by XOR distance. Capability advertisements are stored under keys derived from capability descriptors.

Advertisements are short-lived, tied to rotating epoch keys, and expose only commitments; detailed capabilities are fetched over encrypted request/response to limit crawl-based profiling.

Lookup procedure follows standard iterative Kademlia:

+ Query $alpha$ nearest known peers for a target key.
+ Merge returned closer peers.
+ Repeat until convergence on $k$ closest peers.
+ Store advertisements on the $k$ closest nodes to the advertisement key.

This provides logarithmic expected lookup complexity and removes centralized directories.
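
The lookup steps above can be sketched in a few lines. This is a minimal, in-memory illustration rather than a wire protocol: SHA-256 as the hash $H$, the default values of `alpha` and `k`, and the `query` callback (a stand-in for a network RPC) are all assumptions of the sketch.

```python
import hashlib


def node_id(pubkey: bytes) -> int:
    """NodeID = H(pubkey); SHA-256 is an assumption of this sketch."""
    return int.from_bytes(hashlib.sha256(pubkey).digest(), "big")


def bucket_index(own_id: int, other_id: int) -> int:
    """k-bucket index: position of the highest bit in which the two IDs differ."""
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not bucket its own ID")
    return distance.bit_length() - 1


def k_closest(target: int, ids, k: int = 20) -> list:
    """The k known IDs with the smallest XOR distance to target."""
    return sorted(ids, key=lambda peer: peer ^ target)[:k]


def iterative_lookup(target: int, seeds, query, alpha: int = 3, k: int = 20) -> list:
    """Iterative lookup. query(peer, target) is a hypothetical RPC that returns
    peer IDs the queried peer considers close to target."""
    shortlist, queried = set(seeds), set()
    while True:
        pending = [p for p in k_closest(target, shortlist, k) if p not in queried]
        if not pending:  # converged: the k closest known peers have all answered
            return k_closest(target, shortlist, k)
        for peer in pending[:alpha]:
            queried.add(peer)
            shortlist.update(query(peer, target))
```

A production client would issue the `alpha` queries concurrently, handle timeouts, and verify advertisement signatures before merging results; the sketch omits all three.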

== Route Construction

For each request, client chooses:

+ $h - 1$ relays from the DHT (typically 2--3), where $h$ is the route length in hops including the exit.
+ One exit service node (Ghost or GPU).
+ A single-use reply path token (reverse onion metadata).
+ A route satisfying verifiable diversity rules (distinct node keys, IP prefixes, and ASNs by observed network data).

Client constructs layered encryption from exit to ingress. Hop $i$ receives only instructions for hop $i + 1$.

$ "Client" arrow.r R_1 arrow.r R_2 arrow.r R_3 arrow.r "Ghost" arrow.r "Provider" $

No relay on the forward path learns prompt content. No relay learns the full path.
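
The diversity rules can be illustrated with a greedy selector. The sketch below is hypothetical throughout: the `Peer` record, the IPv4 /24 prefix granularity, and the assumption that ASN data is already attached to each candidate are illustrative choices, not part of the protocol definition.

```python
import ipaddress
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    pubkey: str
    ip: str
    asn: int
    is_exit: bool = False  # True for Ghost or GPU nodes


def prefix24(ip: str) -> str:
    """IPv4 /24 prefix used as the diversity unit (an assumption of this sketch)."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def select_route(peers, relay_count: int = 3, rng=None) -> list:
    """Pick relay_count relays plus one exit such that no two hops share
    a public key, a /24 prefix, or an ASN. Returns hops in forward order."""
    rng = rng or random.SystemRandom()
    shuffled = list(peers)
    rng.shuffle(shuffled)

    exits = [p for p in shuffled if p.is_exit]
    if not exits:
        raise ValueError("no exit node available")
    exit_node = exits[0]

    keys, prefixes, asns = {exit_node.pubkey}, {prefix24(exit_node.ip)}, {exit_node.asn}
    relays = []
    for p in shuffled:
        if len(relays) == relay_count:
            break
        if p.is_exit:
            continue
        if p.pubkey in keys or prefix24(p.ip) in prefixes or p.asn in asns:
            continue  # violates a diversity rule against an already chosen hop
        relays.append(p)
        keys.add(p.pubkey)
        prefixes.add(prefix24(p.ip))
        asns.add(p.asn)

    if len(relays) < relay_count:
        raise ValueError("not enough diverse relays for this exit")
    return relays + [exit_node]
```

A greedy pass can fail even when a valid route exists with a different exit; a real client would retry with another exit or fall back to a shorter route.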

== Cryptographic Packet Sketch

Let $P K_i$ be hop long-term public keys. Client and hop $i$ run an authenticated per-session key exchange (ntor-style @dingledine2004tor) to derive session key $K_i$ and replay-protection state. Each layer uses AEAD encryption with nonce and associated data (session id, expiry, integrity tags).

At hop $i$:

+ Decrypt outer header with $K_i$.
+ Verify expiry and replay window.
+ Learn only next-hop address and next ciphertext blob.
+ Forward.
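
The expiry and replay check in step 2 can be sketched as a small cache keyed by a per-packet tag. The tag itself (for example a hash of the outer header) and the maximum accepted lifetime are assumptions of this sketch; the text above does not fix either.

```python
import time
from typing import Optional


class ReplayGuard:
    """Per-hop expiry and replay check; tags are remembered until they expire."""

    def __init__(self, max_lifetime_s: float = 120.0):
        self.max_lifetime_s = max_lifetime_s
        self._seen = {}  # packet tag -> expiry timestamp

    def accept(self, tag: bytes, expiry: float, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget tags whose packets could no longer be accepted anyway.
        self._seen = {t: e for t, e in self._seen.items() if e >= now}
        if expiry < now:
            return False  # packet has expired
        if expiry - now > self.max_lifetime_s:
            return False  # lifetime too long; would let the cache grow without bound
        if tag in self._seen:
            return False  # replayed packet
        self._seen[tag] = expiry
        return True
```

Bounding the accepted lifetime is what keeps the cache finite: a hop only has to remember a tag for as long as a copy of that packet could still pass the expiry check.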

Security note: if Peer2Prompt derives session keys only from a client ephemeral key and each hop's static long-term key, compromise of a hop's long-term private key can expose previously recorded traffic for that hop. Forward secrecy therefore requires per-session ephemeral keys on both sides and key erasure once the session ends. The overall packet design is conceptually similar to low-latency onion routing and Sphinx-style layered forwarding @danezis2009sphinx.

== Return Path

The client pre-computes a single-use reply block (SURB-like object) using the public keys of the return-path relays and embeds it in the forward request. The Ghost never learns the client address or the return route topology. When the Ghost has a response ready, it encrypts the payload with the ephemeral symmetric key established during the forward handshake, attaches it to the pre-computed SURB header, and transmits it to the first return hop. Each subsequent hop peels one layer and forwards toward the client. Clients behind NAT or firewalls receive replies via polling or subscription to rendezvous mailboxes; direct inbound delivery is optional.

== Session Semantics for Multi-Turn Chat

A conversation requires context continuity. Peer2Prompt handles this by session-scoped routing:

+ A session is pinned to one selected Ghost for its lifetime or until explicit rotation.
+ Context window size (for example 128k tokens) bounds maximum exposure per compromised Ghost.
+ New sessions should re-sample path and Ghost.

This does not remove exposure of a single conversation to one Ghost. It limits cross-conversation linkage.

== Traffic Analysis Countermeasures

Peer2Prompt is a low-latency system, so traffic analysis remains a first-class risk. Practical mitigations include:

+ Fixed-size packet cells for forwarding layers.
+ Randomized inter-packet delays in bounded ranges.
+ Optional cover packets during idle intervals.
+ Session-level route rotation policies.

These controls increase attack cost but do not provide mixnet-level anonymity guarantees.
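
Fixed-size cells, cover packets, and bounded jitter can be sketched as follows. The 512-byte cell size, the 2-byte length header, and the 50 ms delay bound are illustrative assumptions; the sketch also assumes cells are padded before the AEAD layer is applied, so padding bytes are not visible on the wire.

```python
import secrets
import struct

CELL_SIZE = 512                 # assumption: fixed cell size in bytes
HEADER = struct.Struct(">H")    # length of the real payload carried by the cell


def to_cells(payload: bytes, cell_size: int = CELL_SIZE) -> list:
    """Split payload into equal-size cells, zero-padding the final one."""
    body = cell_size - HEADER.size
    chunks = [payload[i:i + body] for i in range(0, len(payload), body)] or [b""]
    return [HEADER.pack(len(c)) + c + bytes(body - len(c)) for c in chunks]


def from_cells(cells) -> bytes:
    """Reassemble the payload, dropping padding and empty cover cells."""
    out = bytearray()
    for cell in cells:
        (length,) = HEADER.unpack_from(cell)
        out += cell[HEADER.size:HEADER.size + length]
    return bytes(out)


def cover_cell(cell_size: int = CELL_SIZE) -> bytes:
    """A cell with no payload, sent during idle intervals."""
    return HEADER.pack(0) + bytes(cell_size - HEADER.size)


def jitter_delay(max_delay_s: float = 0.05) -> float:
    """Random inter-packet delay drawn uniformly from a bounded range."""
    return secrets.SystemRandom().uniform(0.0, max_delay_s)
```

Because a cover cell decodes to an empty payload, a receiver can pass every cell through `from_cells` without distinguishing real traffic from cover traffic first.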

= Inference Marketplace and Ghost Nodes

== Ghost Node Function

A Ghost node is an API credential holder that accepts anonymous inference jobs and submits them to an upstream provider as normal API calls. The provider sees ordinary account traffic originating from the Ghost's own account.

Ghost processing pipeline:

+ Receive onion-decapsulated request and payment proof.
+ Verify ecash proofs against mint keysets and enforce settlement policy.
+ Send prompt and parameters to provider API.
+ Stream response chunks into reply onion.
+ Redeem ecash proofs at mint.

== Why Ghost Nodes Are New

Relay networks and anonymous payment systems exist separately. Peer2Prompt combines them with frontier API credential supply. This creates a market where privacy demand purchases model access indirectly from distributed credential owners rather than a central privacy company.

== GPU Node Competition

GPU nodes serve open models and quote prices similarly. Clients choose by utility:

+ Quality (model capability).
+ Latency and reliability.
+ Price.
+ Jurisdiction and policy preferences.

Over time, cheaper GPU inference can displace some Ghost demand as open models improve.

= Payment Layer: Cashu Ecash over Lightning

== Why Cashu

Peer2Prompt requires small, frequent, low-friction payments with weak identity coupling. Cashu @cashu2026 provides bearer tokens issued by a mint using Chaumian blind signatures @chaum1982blind. Lightning @poon2016lightning provides mint funding and redemption rails.

== Token Lifecycle

The token lifecycle proceeds in three phases:

+ *Minting*: Client pays a Lightning invoice to the mint and receives blinded signatures. Client unblinds to obtain spendable proofs.
+ *Spending*: Client sends proofs to Ghost/GPU node. The service node verifies proof signatures and amount, then serves in prepaid chunks to cap double-spend exposure.
+ *Settlement*: Ghost/GPU sends proofs to mint. Mint verifies signatures and unspent status, marks proofs as spent, and returns either reissued fresh proofs or Lightning payment via melt.

Blindness property: mint signs blinded messages and cannot directly link minting to later spend proofs under standard assumptions.

Cashu deployments commonly use a Blind Diffie-Hellman Key Exchange (BDHKE) @chaum1988untraceable:

+ Mint holds signing secret $k$ with public key $K = k G$.
+ Client generates random secret $x$, computes $Y = op("hash_to_curve")(x)$, selects blinding factor $r$, and computes blinded point $B' = Y + r G$.
+ Mint signs: $C' = k B'$. Returns $C'$ to client.
+ Client unblinds: $C = C' - r K$. The spend proof is the tuple $(x, C)$.
+ At redemption, mint verifies $C = k dot op("hash_to_curve")(x)$ and marks $x$ as spent.

Double spends are prevented by mint-side tracking of redeemed secrets. The mint's state is authoritative for double-spend prevention.
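
Unblinding is correct because $C' - r K = k (Y + r G) - r k G = k Y$. The sketch below runs the five steps end to end in pure Python over secp256k1. It is for illustration only: the arithmetic is not constant-time, and `hash_to_curve` is a simplified try-and-increment construction assumed for this example, not the exact function specified by Cashu.

```python
import hashlib
import secrets

# secp256k1 domain parameters (y^2 = x^3 + 7 over the field of P elements).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result, k = None, k % N
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def neg(point):
    return None if point is None else (point[0], -point[1] % P)


def hash_to_curve(secret: bytes):
    """Simplified try-and-increment mapping (assumption, see lead-in)."""
    counter = 0
    while True:
        digest = hashlib.sha256(secret + counter.to_bytes(4, "little")).digest()
        x = int.from_bytes(digest, "big") % P
        y2 = (pow(x, 3, P) + 7) % P
        y = pow(y2, (P + 1) // 4, P)  # square root candidate, valid since P % 4 == 3
        if y * y % P == y2:
            return (x, y)
        counter += 1


def blind(secret: bytes, r: int):          # client: B' = Y + rG
    return add(hash_to_curve(secret), mul(r, G))


def sign_blinded(k: int, blinded):         # mint: C' = kB'
    return mul(k, blinded)


def unblind(signed, r: int, mint_pubkey):  # client: C = C' - rK
    return add(signed, neg(mul(r, mint_pubkey)))


def verify(k: int, secret: bytes, c) -> bool:  # mint: C == k * hash_to_curve(x)
    return c == mul(k, hash_to_curve(secret))


k = secrets.randbelow(N - 1) + 1   # mint signing secret
K = mul(k, G)                      # mint public key
x = secrets.token_bytes(32)        # client secret
r = secrets.randbelow(N - 1) + 1   # client blinding factor
C = unblind(sign_blinded(k, blind(x, r)), r, K)
assert verify(k, x, C)
```

The mint only ever sees `blind(x, r)` at issuance and `(x, C)` at redemption; without `r` it cannot connect the two, which is the blindness property stated above.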

== Practical Leakage and Mitigations

Cashu's anonymity guarantees are strictly bounded by network metadata and timing correlations:

+ Unusual denomination patterns can reduce the anonymity set for specific transactions.
+ Tight timing between mint and spend enables statistical linkage when the anonymity set is small.
+ Single-mint monopolies concentrate metadata visibility.
+ Lightning Network topology can deanonymize the client to the mint: payment timing correlation between the client's Lightning node and the mint's invoice settlement reveals funding origin.

Mitigations:

+ Standard denomination sets and split/merge behavior.
+ Token batching and random delays between mint and first spend.
+ Multi-mint support and wallet-side mint rotation.
+ Lightning privacy techniques (multi-path payments, trampoline routing) for mint funding.
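
The first three mitigations can be sketched on the wallet side. The power-of-two denomination set, the delay bounds, and the uniform mint choice are assumptions made for illustration; a deployed wallet would follow the keysets its mints actually publish.

```python
import secrets

_rng = secrets.SystemRandom()


def split_amount(amount: int) -> list:
    """Decompose an amount into power-of-two denominations so that tokens
    look like everyone else's (assumes a power-of-two denomination set)."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return [1 << i for i in range(amount.bit_length()) if (amount >> i) & 1]


def hold_delay(min_s: float, max_s: float) -> float:
    """Random hold period between minting and first spend."""
    return _rng.uniform(min_s, max_s)


def pick_mint(mints: list):
    """Rotate across independent mints by choosing one uniformly at random."""
    return _rng.choice(mints)


print(split_amount(13))  # [1, 4, 8]
```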

== Mint Trust and Federation

Cashu mint operators can censor redemption, fail operationally, or correlate by timing and denomination metadata. Peer2Prompt therefore treats mint choice as a market, not a singleton.

Recommended wallet behavior:

+ Hold balances across multiple independent mints.
+ Split payments across mints when possible.
+ Prefer mints with transparent reserves, uptime history, and public key continuity.
+ Treat mint insolvency as a realistic failure mode.

= Privacy and Security Analysis

== Threat Model Tiers

Peer2Prompt defines tiers by adversary capability.

#figure(
  table(
    align: left,
    columns: (auto, auto, auto, auto),
    stroke: 0.5pt,
    inset: 5pt,
    [*Tier*], [*Adversary*], [*Primary Power*], [*Outcome*],
    [1], [Provider/account observer], [Prompt + Ghost billing], [No direct user identity],
    [2], [Single node / local observer], [One vantage point], [Partial metadata only],
    [3], [Colluding subset], [Entry-exit correlation], [Session compromise prob.],
    [4], [Wide passive observer], [Timing/volume correlation], [Weakened unlinkability],
    [5], [Global active state actor], [Coercion + infiltration], [Targeted deanonymization],
  ),
  caption: [Threat model tiers and baseline outcomes.]
) <threat>

*Tier 1: Provider/Account Linkage.* The AI provider observes prompts and Ghost account metadata but cannot control route ingress. Provider can profile Ghost account behavior but cannot directly map prompts to end users unless external data is available. Residual risk: provider can pressure or ban Ghost accounts.

*Tier 2: Single Local Observer.* One compromised relay, local ISP observer, or one malicious Ghost. A malicious ingress sees client IP but not prompt. A malicious Ghost sees prompt but not client IP. A single mid-relay sees neither. Residual risk: traffic volume and timing fingerprints.

*Tier 3: Partial Collusion.* If the malicious fraction of ingress candidates is $m$ and the malicious fraction of Ghosts is $g$, the first-order link probability per session is approximately $P("link") approx m dot g$, assuming independent route sampling and no global timing oracle. For $n$ independent sessions with re-sampled routes, the naive expected number of fully linked sessions is $n dot m dot g$. Residual risk: colluding sets can accumulate partial evidence over time.
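
A short calculation makes the Tier 3 estimate concrete. The fractions used in the example are hypothetical, and the third function adds one derived quantity under the same independence assumption: the probability that at least one of $n$ sessions is linked, $1 - (1 - m g)^n$.

```python
def link_probability(m: float, g: float) -> float:
    """First-order chance that one session has a malicious ingress and Ghost."""
    return m * g


def expected_linked_sessions(n: int, m: float, g: float) -> float:
    """Naive expected number of fully linked sessions out of n."""
    return n * link_probability(m, g)


def prob_any_link(n: int, m: float, g: float) -> float:
    """Chance that at least one of n independently routed sessions is linked."""
    return 1.0 - (1.0 - link_probability(m, g)) ** n


# Hypothetical example: 5% malicious ingress candidates, 10% malicious Ghosts.
m, g, n = 0.05, 0.10, 100
print(link_probability(m, g))             # about 0.005 per session
print(expected_linked_sessions(n, m, g))  # about 0.5 sessions out of 100
print(prob_any_link(n, m, g))             # about 0.39
```

Even a small per-session probability compounds for heavy users, which is why the residual risk is stated in terms of evidence accumulated over time.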

*Mint-Exit Collusion.* A specific collusion threat deserves separate treatment. When a client funds a Cashu mint via Lightning, the mint may learn the client's IP address from the wallet connection and may infer the client's Lightning node identity from direct channels or payment timing. When the client later spends those tokens at a Ghost node, the Ghost redeems them at the mint. If the mint and Ghost collude and share logs, they can attempt to correlate the spent token denominations and timing with the originating Lightning funding event. Chaumian blind signatures prevent the mint from linking a specific blinded issuance to a specific spend, but if the anonymity set is small, statistical correlation can narrow the candidate set. Required protocol-level mitigations: (1) token swapping through the mint before spending to increase the anonymity set; (2) batching of minting and spending operations; (3) randomized hold periods between minting and first spend; (4) multi-mint distribution so no single mint observes the full funding pattern; (5) Lightning privacy techniques (multi-path payments, trampoline routing) to reduce mint-side identity exposure during funding. Failure to implement these mitigations reduces the anonymity set to users who minted matching denominations within the correlation window.

*Tier 4: Network-Wide Passive Observer.* Low-latency onion systems are vulnerable to advanced traffic analysis. Peer2Prompt can raise cost with padding and jitter, but cannot eliminate this class without heavy latency overhead. Residual risk: high for persistent, high-resource adversaries.

*Tier 5: Global Active State Adversary.* Node infiltration, coercion, legal compulsion, endpoint malware, financial tracing, and targeted stylometry. Targeted deanonymization can succeed against specific users with enough budget. Peer2Prompt does not claim resistance to this adversary in all cases.

== Stylometry and Semantic Fingerprinting

Even with network unlinkability, writing style and unique factual references can identify users @narayanan2012feasibility @koppel2009computational. This is outside transport-layer anonymity. Optional client-side transformations may reduce stylometric leakage, but mandatory rewriting harms output quality and was intentionally avoided.

== Context-Window Compartmentalization

Session pinning creates bounded exposure:

+ A compromised Ghost can observe one session's context window.
+ Cross-session history requires additional compromise and correlation.

This is weaker than end-to-end content secrecy, but stronger than centralized account archives where all history is aggregated by default.

== Logging and Operational Hygiene

Protocol-level privacy fails if implementations log sensitive state. Reference clients and nodes should treat request plaintext, route metadata, and token proofs as memory-only with strict retention controls. This is an implementation requirement, not a cryptographic guarantee.

= Economics and Incentive Compatibility

== Market Structure

Peer2Prompt is a two-sided market:

+ Demand side: users willing to pay an anonymity premium over direct API cost.
+ Supply side: Ghost and GPU nodes offering inference capacity.

The premium funds routing overhead, risk compensation, and node operator profit.

== Ghost Node Unit Economics

Let $C = C_"api" + C_"bandwidth" + C_"ops" + C_"policy_risk"$ be expected cost per request, and $P$ be quoted price. Rational Ghost serves if $P > C$. Expected profit per request: $pi_"ghost" = P - C$.

Actual spreads vary by model, abuse rate, and account risk. Ghost operators also price variance. If output token usage is heavy-tailed, quotes should include confidence margins or hard caps to avoid negative-expectation jobs.
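
One way to avoid negative-expectation jobs is to quote against a hard output cap, so that $P > C$ holds for every request the cap allows. The sketch below assumes per-token API pricing and a fixed per-request overhead; the field names, the margin, and the figures in the example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GhostCosts:
    api_per_input_token: float
    api_per_output_token: float
    bandwidth: float     # C_bandwidth per request
    ops: float           # C_ops per request
    policy_risk: float   # C_policy_risk per request


def request_cost(c: GhostCosts, input_tokens: int, output_tokens: int) -> float:
    """C = C_api + C_bandwidth + C_ops + C_policy_risk for one request."""
    api = c.api_per_input_token * input_tokens + c.api_per_output_token * output_tokens
    return api + c.bandwidth + c.ops + c.policy_risk


def quote(c: GhostCosts, input_tokens: int, max_output_tokens: int,
          margin: float = 0.15) -> float:
    """Price P computed against the worst case allowed by the output cap."""
    return request_cost(c, input_tokens, max_output_tokens) * (1.0 + margin)


def profit(c: GhostCosts, price: float, input_tokens: int, output_tokens: int) -> float:
    """Realized profit P - C once actual output usage is known."""
    return price - request_cost(c, input_tokens, output_tokens)


costs = GhostCosts(1e-6, 4e-6, 1e-5, 2e-5, 5e-5)   # hypothetical figures
price = quote(costs, input_tokens=1000, max_output_tokens=2000)
assert profit(costs, price, 1000, 2000) > 0        # positive even at the cap
```

Pricing at the cap overcharges short responses; a Ghost that serves in prepaid chunks could instead refund or carry forward the unused balance.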

== Why Users Pay

Users purchase:

+ Identity unlinkability from prompt history.
+ Access to multiple providers through one private payment model.
+ Censorship resilience from decentralized supply.

As long as perceived privacy value exceeds premium, demand persists.

== Attack Economics

Centralized platforms make mass surveillance cheap: one subpoena or one breach can expose complete histories. Peer2Prompt changes the cost curve in two dimensions.

*Sybil capital requirements.* To achieve a meaningful ingress-exit correlation rate, an attacker must operate enough nodes to appear frequently in independently sampled routes. In a Kademlia DHT with $N$ honest nodes, an attacker operating $A$ Sybil nodes achieves an ingress selection probability of approximately $A / (A + N)$ under uniform sampling, and an independent exit selection probability of similar order. Full-link correlation requires controlling both, yielding approximately $(A / (A + N))^2$ per session. Each Sybil node must pass identity cost checks, maintain uptime for routing table insertion, and sustain forwarding bandwidth. The marginal cost per Sybil node sets a floor on the capital required for a given correlation rate.
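
Inverting the estimate gives the number of Sybil nodes needed for a target correlation rate: if $f = A / (A + N)$ and the target is $p = f^2$, then $A = N sqrt(p) / (1 - sqrt(p))$. The sketch below assumes uniform route sampling, and the network size and per-node cost in the example are hypothetical.

```python
import math


def link_rate(attacker_nodes: int, honest_nodes: int) -> float:
    """Approximate per-session full-link rate, (A / (A + N))^2."""
    f = attacker_nodes / (attacker_nodes + honest_nodes)
    return f * f


def sybils_for_target(honest_nodes: int, target_rate: float) -> int:
    """Smallest A whose link rate reaches target_rate (0 < target_rate < 1)."""
    f = math.sqrt(target_rate)
    return math.ceil(f * honest_nodes / (1.0 - f))


def attack_cost(honest_nodes: int, target_rate: float, cost_per_node: float) -> float:
    """Capital floor: number of Sybil nodes times marginal cost per node."""
    return sybils_for_target(honest_nodes, target_rate) * cost_per_node


# Hypothetical example: linking 1% of sessions in a 10,000-node network.
print(sybils_for_target(10_000, 0.01))  # 1112
```

The required node count grows with the square root of the target rate, so moving from 1% to 4% of sessions roughly doubles the attacker's share of the network rather than quadrupling it.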

*Comparison to centralized breach.* In a centralized system, a single legal order or infrastructure breach exposes complete history at near-zero marginal cost per user. In Peer2Prompt, even a well-funded Sybil attacker gains only probabilistic per-session correlation, must sustain ongoing operational costs, and cannot retroactively link sessions that used routes not under their control.

The protocol objective is not to make deanonymization impossible; it is to ensure that the cost of blanket monitoring scales with the number of targets and sessions, rather than collapsing to a single point of compromise.

= Abuse, Policy Pressure, and Adversarial Behavior

== Terms-of-Service and Account Survival

Ghost nodes are exposed to provider policy enforcement. The protocol cannot guarantee account longevity. The mitigation is distribution: many independent accounts and providers reduce single-point failure.

== Malicious Service Nodes

Malicious Ghost behavior includes overbilling, dropping responses, and prompt harvesting. Countermeasures:

+ Pre-declared signed pricing.
+ Response-integrity receipts and client challenge probes.
+ Reputation systems and short-lived route sampling.

== DHT Sybil and Eclipse Attacks

Open DHTs are vulnerable to identity flooding. Countermeasures include:

+ Identity cost (Hashcash-style proof-of-work @dwork1992pricing, proof-of-bandwidth, or other verifiable resource tests).
+ Multi-source peer bootstrapping.
+ Diversity-aware route selection across autonomous systems and jurisdictions.

No Sybil defense is free. Stronger defenses increase friction and reduce openness.
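
A Hashcash-style identity cost can be sketched as a nonce search bound to the node's public key. SHA-256, the 8-byte nonce encoding, and the difficulty value are assumptions of this example; the text above leaves the concrete resource test open.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def verify(pubkey: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check: one hash, regardless of how long the search took."""
    digest = hashlib.sha256(pubkey + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(pubkey: bytes, difficulty_bits: int) -> int:
    """Expensive search: about 2**difficulty_bits hashes on average."""
    for nonce in count():
        if verify(pubkey, nonce, difficulty_bits):
            return nonce


nonce = solve(b"example-node-pubkey", 12)
assert verify(b"example-node-pubkey", nonce, 12)
```

Binding the proof to the public key means each new identity repeats the search, while honest peers verify an advertisement with a single hash. The difficulty setting is where the friction trade-off noted above becomes a concrete parameter.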

= Limitations and Open Problems

+ *Exit plaintext problem*: Ghost sees prompt and output for served sessions.
+ *Global traffic analysis*: low-latency onion routing remains analyzable by powerful observers.
+ *Mint trust concentration*: blind signatures hide linkage, but mint custody and ledger integrity remain trust anchors.
+ *Stylometric deanonymization*: language itself can identify authors.
+ *Endpoint compromise*: malware on client or Ghost bypasses network-layer privacy.
+ *Legal compulsion asymmetry*: distributed networks are harder to suppress, not impossible to pressure.
+ *Abuse governance*: balancing resistance to censorship with abuse containment is unresolved and likely jurisdiction-dependent.
+ *Provider-side model watermarking*: output fingerprints may leak route or account metadata in downstream sharing.
+ *Measurement gap*: robust, public anonymity metrics for real deployments remain immature.

These are active research and engineering areas.

= Implementation Sketch

== Minimal Viable Protocol

+ Kademlia DHT with signed node advertisements.
+ 3-hop default routes plus Ghost exit.
+ Cashu wallet integration with multi-mint support.
+ Deterministic quote format and metered billing.
+ Session pinning and explicit session rotation controls.

== Network Heterogeneity and Hardware Transitions

The protocol must handle the transition from API-proxy nodes to local inference nodes without requiring protocol-level changes.

+ *API-dominant phase*: Ghost nodes provide the majority of supply. Protocol routing, payment, and discovery are identical regardless of backend. Clients select by model capability and price.
+ *Hybrid phase*: GPU nodes running open models compete on price while Ghost nodes retain advantage for frontier-exclusive capabilities. The quote and settlement protocol is backend-agnostic.
+ *Inference-dominant phase*: as open models approach frontier quality, GPU nodes may capture majority supply. Ghost nodes persist for models without open-weight equivalents.

No hard fork or protocol upgrade is required for this transition. Node advertisements already distinguish backend type via capability commitments, and the payment layer is indifferent to how inference is produced.

= Conclusion

Peer2Prompt addresses a specific failure mode of modern AI deployment: durable identity linkage to inference history. The protocol does not attempt to solve all confidentiality problems. It combines known primitives into a practical architecture where unlinkability is probabilistic and bounded, with materially higher attack cost than centralized account-bound systems.

The central mechanism is the Ghost node market. API key holders, GPU operators, relays, and users coordinate through anonymous payments and open discovery — no corporate authority required. If incentives hold, the network can offer frontier access with materially better privacy properties than account-bound assistants.

What remains unsolved is clear: Ghost plaintext exposure, global traffic analysis, stylometry, and governance under adversarial pressure. These are substantial. They are also narrower than the status quo, where complete centralized histories are available by default. Peer2Prompt's claim is practical and limited: replace cheap mass linkage with costly targeted pursuit.

#bibliography("references.bib")