Commit 382e01f

FUDCo and claude committed
docs: Update Ken protocol assessment - all properties now implemented
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent d11ce2d commit 382e01f

1 file changed

Lines changed: 58 additions & 81 deletions

docs/ken-protocol-assessment.md

@@ -50,6 +50,8 @@ Key aspects:
5050
| Deferred transmission || Outputs reach RemoteHandle only after originating crank commits |
5151
| Output validity || Crank buffering ensures outputs escape only after commit |
5252
| Atomic checkpoint || Database savepoints make crank state changes atomic |
53+
| Exactly-once receive || Transactional receive with dedup check (Issue #808) |
54+
| FIFO ordering || TCP guarantees in-order; dedup handles retransmits |
5355

5456
### Crank Buffering (Issue #786)
5557

@@ -78,48 +80,59 @@ This achieves Ken's property that **outputs are only externalized after successf
7880

7981
`RemoteHandle` persists messages to `remotePending` before transmitting for a different reason: to enable retransmission on recovery if the transmission or ACK is lost. This is part of the at-least-once delivery mechanism, not the output validity mechanism.
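
To make the retransmission role of `remotePending` concrete, here is a minimal sketch of sender-based logging. It is not the actual `RemoteHandle` code: the `SenderLog` class, the in-memory `Map` standing in for the persisted store, and the cumulative-ACK behavior are all assumptions made for illustration.

```typescript
// Hypothetical, simplified model of sender-based logging; not the real RemoteHandle.
type PendingMessage = { seq: number; payload: string };

class SenderLog {
  // Stands in for the persisted `remotePending` store (assumption: in-memory here).
  private pending = new Map<number, PendingMessage>();
  private nextSeq = 1;

  // Record the message before handing it to the transport, which may lose it.
  send(payload: string, transmit: (m: PendingMessage) => void): number {
    const message: PendingMessage = { seq: this.nextSeq, payload };
    this.nextSeq += 1;
    this.pending.set(message.seq, message);
    transmit(message);
    return message.seq;
  }

  // Assumed cumulative ACK: releases every message with seq <= ackSeq.
  handleAck(ackSeq: number): void {
    for (const seq of Array.from(this.pending.keys())) {
      if (seq <= ackSeq) {
        this.pending.delete(seq);
      }
    }
  }

  // On recovery, resend everything still unacknowledged, in sequence order.
  retransmitAll(transmit: (m: PendingMessage) => void): void {
    const ordered = Array.from(this.pending.values()).sort((a, b) => a.seq - b.seq);
    for (const message of ordered) {
      transmit(message);
    }
  }
}
```

Because a message stays in the log until it is acknowledged, a lost transmission or lost ACK results in a resend rather than a lost message. That is at-least-once delivery, and it is why the receiver needs its own duplicate handling.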
8082

81-
### Remaining Gaps (Receive Side)
83+
### Receive-Side Implementation (Issue #808)
8284

83-
The remaining gaps are on the **receive side** of remote messaging. Code review of `RemoteHandle.handleRemoteMessage()` revealed specific bugs:
85+
The receive side of remote messaging implements Ken's exactly-once delivery guarantee through transactional message processing with duplicate detection.
8486

85-
#### 1. No Duplicate Detection (Bug)
87+
#### Duplicate Detection
8688

87-
Ken maintains a `Done` table ensuring each message is delivered to the application **at most once**.
89+
Ken maintains a `Done` table ensuring each message is delivered to the application **at most once**. Our implementation achieves this by checking `seq <= highestReceivedSeq` before processing:
8890

89-
**Current code behavior** (`RemoteHandle.ts` lines 830-845):
9091
```typescript
91-
// Track received sequence number for piggyback ACK and persist
92-
if (seq > this.#highestReceivedSeq) {
93-
this.#highestReceivedSeq = seq;
94-
this.#kernelStore.setRemoteHighestReceivedSeq(this.remoteId, seq);
92+
// Duplicate detection: skip if we've already processed this sequence number
93+
if (seq <= this.#highestReceivedSeq) {
94+
this.#logger.log(`ignoring duplicate message seq=${seq}`);
95+
return null;
9596
}
96-
// ... then UNCONDITIONALLY:
97-
switch (method) {
98-
case 'deliver':
99-
this.#handleRemoteDeliver(params); // Always runs, even for duplicates!
10097
```
10198

102-
**Problem**: There is no deduplication check. Even when `seq <= highestReceivedSeq`, the message is processed. After a crash and retransmit, duplicate messages will be delivered to the vat.
99+
After a crash and retransmit, duplicate messages are detected and ignored.
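
The crash-and-retransmit case can be illustrated with a small standalone sketch. The `DuplicateFilter` class and `deliverAll` helper are hypothetical names introduced here, and the sketch assumes in-order arrival from the transport, as the document states for TCP-based streams.

```typescript
// Hypothetical illustration of the `seq <= highestReceivedSeq` check.
class DuplicateFilter {
  private highestReceivedSeq = 0;

  // True if this sequence number has already been processed.
  isDuplicate(seq: number): boolean {
    return seq <= this.highestReceivedSeq;
  }

  // Record that `seq` has been fully processed.
  markProcessed(seq: number): void {
    this.highestReceivedSeq = seq;
  }
}

// Feed a stream of arriving sequence numbers through the filter and
// return the ones that would actually be delivered.
function deliverAll(filter: DuplicateFilter, arrivals: number[]): number[] {
  const delivered: number[] = [];
  for (const seq of arrivals) {
    if (filter.isDuplicate(seq)) {
      continue; // retransmitted copy of an already-processed message
    }
    delivered.push(seq);
    filter.markProcessed(seq);
  }
  return delivered;
}

// Messages 1..3 arrive, then the sender recovers and retransmits 2..4.
const filter = new DuplicateFilter();
const firstPass = deliverAll(filter, [1, 2, 3]);
const afterRetransmit = deliverAll(filter, [2, 3, 4]);
```

In this scenario `firstPass` holds 1, 2, 3 and `afterRetransmit` holds only 4, so each message is delivered once even though 2 and 3 arrived twice.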
100+
101+
#### Transactional Message Processing
103102

104-
#### 2. Wrong Persistence Order (Bug)
103+
Message processing is wrapped in a database savepoint to ensure atomicity:
104+
105+
```typescript
106+
const savepointName = `receive_${this.remoteId}_${seq}`;
107+
this.#kernelStore.createSavepoint(savepointName);
108+
try {
109+
// Process message (translate refs, add to run queue, etc.)
110+
switch (method) {
111+
case 'deliver': ...
112+
case 'redeemURL': ...
113+
case 'redeemURLReply': ...
114+
}
105115

106-
**Current behavior**: `highestReceivedSeq` is persisted BEFORE the message is processed and added to the run queue.
116+
// Persist sequence tracking at the end, within the transaction
117+
this.#kernelStore.setRemoteHighestReceivedSeq(this.remoteId, seq);
107118

108-
**Crash scenario**:
109-
1. Receive message seq=5 from remote R
110-
2. Update and persist `highestReceivedSeq` to 5
111-
3. Crash before message is added to run queue
112-
4. On recovery: `highestReceivedSeq=5` suggests we received it
113-
5. Remote retransmits seq=5, we (correctly) ignore it due to dedup check (once fixed)
114-
6. **Message lost** - never reached the run queue
119+
// Commit the transaction
120+
this.#kernelStore.releaseSavepoint(savepointName);
121+
} catch (error) {
122+
// Rollback on any error - also revert in-memory state
123+
this.#highestReceivedSeq = previousHighestReceivedSeq;
124+
this.#kernelStore.rollbackSavepoint(savepointName);
125+
throw error;
126+
}
127+
```
115128

116-
**What's needed**: Process the message first (add to run queue), then persist `highestReceivedSeq`. Ideally these should be atomic.
129+
This achieves atomicity: if a crash occurs before commit, both the run queue entry and the sequence update roll back together. The remote retransmits, and we process it correctly.
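
The rollback behavior can be exercised end to end with a self-contained sketch. Everything here is an assumption for illustration: `FakeKernelStore` is an in-memory stand-in whose method names only mirror the excerpt, `Receiver` is not the real `RemoteHandle`, and the `failBeforeCommit` flag exists purely to simulate an error partway through processing.

```typescript
// Hypothetical in-memory stand-in for the kernel store (snapshot-based savepoints).
type Snapshot = { runQueue: string[]; highestReceivedSeq: number };

class FakeKernelStore {
  runQueue: string[] = [];
  highestReceivedSeq = 0;
  private savepoints = new Map<string, Snapshot>();

  createSavepoint(name: string): void {
    this.savepoints.set(name, {
      runQueue: this.runQueue.slice(),
      highestReceivedSeq: this.highestReceivedSeq,
    });
  }

  releaseSavepoint(name: string): void {
    this.savepoints.delete(name);
  }

  rollbackSavepoint(name: string): void {
    const snapshot = this.savepoints.get(name);
    if (snapshot === undefined) {
      throw new Error(`unknown savepoint ${name}`);
    }
    this.runQueue = snapshot.runQueue;
    this.highestReceivedSeq = snapshot.highestReceivedSeq;
    this.savepoints.delete(name);
  }
}

class Receiver {
  private store: FakeKernelStore;
  private remoteId: string;
  private highestReceivedSeq = 0; // in-memory copy of the persisted value

  constructor(store: FakeKernelStore, remoteId: string) {
    this.store = store;
    this.remoteId = remoteId;
  }

  // Returns true if the message was processed, false if it was a duplicate.
  handleRemoteMessage(seq: number, body: string, failBeforeCommit = false): boolean {
    if (seq <= this.highestReceivedSeq) {
      return false;
    }
    const previousHighestReceivedSeq = this.highestReceivedSeq;
    const savepointName = `receive_${this.remoteId}_${seq}`;
    this.store.createSavepoint(savepointName);
    try {
      this.store.runQueue.push(body); // "process" the message
      this.highestReceivedSeq = seq;
      this.store.highestReceivedSeq = seq;
      if (failBeforeCommit) {
        throw new Error('simulated failure before commit');
      }
      this.store.releaseSavepoint(savepointName);
      return true;
    } catch (error) {
      // Revert in-memory state along with the stored state.
      this.highestReceivedSeq = previousHighestReceivedSeq;
      this.store.rollbackSavepoint(savepointName);
      throw error;
    }
  }
}
```

With this model, a failed attempt at `seq=1` leaves the run queue empty and the sequence counter at 0, so the retransmitted `seq=1` is accepted and processed. A further copy of `seq=1` after a successful commit is rejected as a duplicate.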
117130

118-
#### 3. FIFO Enforcement on Receive (Not a Gap)
131+
#### FIFO Ordering
119132

120133
Ken enforces per-sender FIFO ordering via `next_ready()` which only delivers the next expected sequence number.
121134

122-
**Our situation**: We use TCP-based transports (libp2p streams) which guarantee in-order delivery during normal operation. Out-of-order arrival only occurs after a crash when the sender retransmits. With proper deduplication (fix #1 above), retransmitted messages for already-processed sequence numbers will be dropped, maintaining FIFO semantics.
135+
**Our situation**: We use TCP-based transports (libp2p streams) which guarantee in-order delivery during normal operation. Out-of-order arrival only occurs after a crash when the sender retransmits. With duplicate detection, retransmitted messages for already-processed sequence numbers are dropped, maintaining FIFO semantics.
123136

124137
Therefore, explicit receive-side reordering is not required given our transport guarantees.
125138

@@ -134,45 +147,12 @@ Therefore, explicit receive-side reordering is not required given our transport
134147
| Consistent frontier | **Yes** | Each kernel's checkpoint is independent |
135148
| Local recovery | **Yes** | Crashes don't affect other processes |
136149
| Sender-based logging | **Yes** | Messages persisted in remotePending until ACKed |
137-
| Exactly-once delivery | **Bug** | Needs transactional receive with dedup check |
150+
| Exactly-once delivery | **Yes** | Transactional receive with dedup check |
138151
| FIFO ordering | **Yes** | TCP guarantees in-order; dedup handles retransmits |
139152

140-
## Required Fix
141-
142-
Wrap `handleRemoteMessage()` in a database transaction with dedup check:
143-
144-
```typescript
145-
handleRemoteMessage(seq, method, params) {
146-
// Begin transaction
147-
148-
// Dedup check - must be inside transaction to read committed state
149-
if (seq <= this.#highestReceivedSeq) {
150-
// Already received, ACK but don't process
151-
return;
152-
}
153-
154-
// Process message (translate refs, add to run queue, etc.)
155-
switch (method) {
156-
case 'deliver': ...
157-
case 'resolve': ...
158-
case 'gc': ...
159-
}
160-
161-
// Update sequence tracking
162-
this.#highestReceivedSeq = seq;
163-
this.#kernelStore.setRemoteHighestReceivedSeq(this.remoteId, seq);
164-
165-
// Commit transaction
166-
}
167-
```
168-
169-
This achieves atomicity without restructuring the existing message handling code. If a crash occurs before commit, both the run queue entry and the sequence update roll back together - the remote retransmits, and we process it correctly.
170-
171-
The transaction approach is simpler than reordering because `handleRemoteMessage` handles multiple message types (`deliver`, `resolve`, `gc`) with different processing paths, and reference slots require translation before persistence.
172-
173153
## Architectural Summary
174154

175-
**Send side (achieved with crank buffering):**
155+
**Send side (crank buffering):**
176156
```
177157
Vat Crank:
178158
vat processes message → syscalls buffer outputs
@@ -186,34 +166,31 @@ Later (separate operation):
186166

187167
The key insight: by the time RemoteHandle sees a message, the originating crank has already committed. Output validity is achieved.
188168

189-
**Receive side (bugs to fix):**
169+
**Receive side (transactional processing):**
190170
```
191-
Current (buggy):
192-
receive from network
193-
persist highestReceivedSeq (WRONG: too early)
194-
process message unconditionally (WRONG: no dedup)
195-
add to run queue
196-
197-
Fixed (wrap in transaction):
198-
receive from network
199-
begin transaction
200-
check seq <= highestReceivedSeq (skip if duplicate)
201-
process message, add to run queue
202-
persist highestReceivedSeq
203-
commit transaction
171+
receive from network
172+
→ check seq <= highestReceivedSeq (skip if duplicate)
173+
→ begin transaction (savepoint)
174+
→ process message, add to run queue
175+
→ persist highestReceivedSeq
176+
→ commit transaction (release savepoint)
204177
```
205178

179+
If a crash occurs before commit, both the run queue entry and the sequence update roll back together. The remote retransmits, and we process it correctly.
180+
206181
## Progress Summary
207182

208183
| Area | Status |
209184
|------|--------|
210-
| Kernel-internal output buffering | **Achieved** |
211-
| Rollback discards uncommitted outputs | **Achieved** |
212-
| Atomic kernel state + output queue | **Achieved** |
213-
| Output validity (send side) | **Achieved** |
214-
| Deferred transmission (send side) | **Achieved** |
185+
| Kernel-internal output buffering | **Achieved** (Issue #786) |
186+
| Rollback discards uncommitted outputs | **Achieved** (Issue #786) |
187+
| Atomic kernel state + output queue | **Achieved** (Issue #786) |
188+
| Output validity (send side) | **Achieved** (Issue #786) |
189+
| Deferred transmission (send side) | **Achieved** (Issue #786) |
215190
| FIFO ordering | **Achieved** (TCP transport) |
216-
| Exactly-once receive (dedup + atomicity) | **Bug** - needs transactional fix |
191+
| Exactly-once receive (dedup + atomicity) | **Achieved** (Issue #808) |
192+
193+
All Ken protocol properties are now implemented.
217194

218195
## References
219196
