Harbor's fedimint db transactions overwrite a single SQLite blob on commit without any compare-and-swap / version check. This violates fedimint’s snapshot isolation requirement that a transaction must fail at commit if any of its modified keys changed since its snapshot. Today, concurrent writers can clobber each other (last-write-wins) and fedimint’s autocommit never sees a write-conflict to retry.
I see two fixes:
Option 1: Federation-level optimistic compare-and-swap
- Add a `version INTEGER` column alongside the federation blob
- On load, read federation blob data and version
- On commit, `UPDATE … SET data=?, version=version+1 WHERE id=? AND version=?`
- If affected rows = 0 ⇒ conflict ⇒ return an error (caller retries)
- Federation-level concurrency, which may cause extra aborts but is simple to implement
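To make the steps above concrete, here is a minimal sketch of the federation-level compare-and-swap using Python's built-in `sqlite3`. The table name `federation`, the integer `id`, and the `load` / `commit` helpers are assumptions made for illustration; Harbor's real schema and its Rust code will differ.

```python
import sqlite3


class WriteConflict(Exception):
    """Raised when the federation row changed since the transaction's snapshot."""


def load(conn, fed_id):
    # Read the blob together with the version it was stored at.
    row = conn.execute(
        "SELECT data, version FROM federation WHERE id = ?", (fed_id,)
    ).fetchone()
    return row[0], row[1]


def commit(conn, fed_id, new_data, snapshot_version):
    # Compare-and-swap: the UPDATE only matches if nobody bumped the version.
    cur = conn.execute(
        "UPDATE federation SET data = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_data, fed_id, snapshot_version),
    )
    conn.commit()
    if cur.rowcount == 0:
        raise WriteConflict(f"federation {fed_id} changed since snapshot")


# Hypothetical schema, for the sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE federation ("
    "id INTEGER PRIMARY KEY, data BLOB NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO federation VALUES (1, ?, 0)", (b"initial",))
conn.commit()

# Two transactions take their snapshot at the same version.
_, version_a = load(conn, 1)
_, version_b = load(conn, 1)

commit(conn, 1, b"from A", version_a)  # succeeds and bumps the version
try:
    commit(conn, 1, b"from B", version_b)  # stale snapshot, must not clobber A
    conflicted = False
except WriteConflict:
    conflicted = True

final_data, final_version = load(conn, 1)
```

The second writer gets an error instead of silently overwriting the first, which is the signal the caller needs in order to retry.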
Option 2: Per-key rows with version and compare-and-swap
Remove the existing federation blob column and replace it with a new table:
CREATE TABLE IF NOT EXISTS fedimint_kv (
    federation_id BLOB NOT NULL,
    key BLOB NOT NULL,
    value BLOB NOT NULL,
    version INTEGER NOT NULL, -- Monotonically increasing
    PRIMARY KEY (federation_id, key)
);
- Track keys read/modified in the tx; at commit, compare-and-swap only the modified keys
- Key-level concurrency, which doesn't cause unnecessary aborts but is more complex to implement
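A rough sketch of what the per-key commit could look like, again in Python with the built-in `sqlite3` module. It assumes the `fedimint_kv` table above, uses version 0 to mean "key was absent at snapshot time", and leaves out deletes; `read_version`, `commit`, and `WriteConflict` are hypothetical names, not part of fedimint or Harbor.

```python
import sqlite3


class WriteConflict(Exception):
    """Raised when a modified key changed since the transaction's snapshot."""


SCHEMA = """
CREATE TABLE IF NOT EXISTS fedimint_kv (
    federation_id BLOB NOT NULL,
    key BLOB NOT NULL,
    value BLOB NOT NULL,
    version INTEGER NOT NULL,
    PRIMARY KEY (federation_id, key)
)
"""


def read_version(conn, fed_id, key):
    # Assumption: version 0 stands for "key did not exist at snapshot time".
    row = conn.execute(
        "SELECT version FROM fedimint_kv WHERE federation_id = ? AND key = ?",
        (fed_id, key),
    ).fetchone()
    return row[0] if row else 0


def commit(conn, fed_id, writes, snapshot_versions):
    """Apply `writes` ({key: value}) only if every modified key is unchanged."""
    try:
        for key, value in writes.items():
            expected = snapshot_versions[key]
            if expected == 0:
                try:
                    conn.execute(
                        "INSERT INTO fedimint_kv VALUES (?, ?, ?, 1)",
                        (fed_id, key, value),
                    )
                except sqlite3.IntegrityError:
                    # Someone else created the key after our snapshot.
                    raise WriteConflict(key) from None
            else:
                cur = conn.execute(
                    "UPDATE fedimint_kv SET value = ?, version = version + 1 "
                    "WHERE federation_id = ? AND key = ? AND version = ?",
                    (value, fed_id, key, expected),
                )
                if cur.rowcount == 0:
                    raise WriteConflict(key)
        conn.commit()
    except WriteConflict:
        conn.rollback()  # discard any keys already swapped in this commit
        raise


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
fed = b"fed-1"

# Seed two keys, then let three transactions snapshot the versions they touch.
commit(conn, fed, {b"k1": b"a0", b"k2": b"b0"}, {b"k1": 0, b"k2": 0})
snap_a = {b"k1": read_version(conn, fed, b"k1")}
snap_b = {b"k2": read_version(conn, fed, b"k2")}
snap_c = {b"k1": read_version(conn, fed, b"k1")}

commit(conn, fed, {b"k1": b"a1"}, snap_a)  # ok
commit(conn, fed, {b"k2": b"b1"}, snap_b)  # ok: disjoint key, no false abort
try:
    commit(conn, fed, {b"k1": b"c1"}, snap_c)  # stale version of k1
    conflicted = False
except WriteConflict:
    conflicted = True
```

Running all the per-key checks inside one SQLite transaction and rolling back on the first mismatch keeps the commit all-or-nothing, which is where most of the extra complexity over Option 1 comes from.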
Does Option 1 Provide Correctness?
Fedimint's snapshot isolation docs state that the following must be enforced by the db implementation:
> Transactions with Snapshot Isolation level will only commit if there has been no write to the modified keys since the snapshot
If "only if" here means "if and only if", then Option 2 would be required for correctness, since federation-level concurrency could lead to a commit failing even if no writes were made to its modified keys. There may be fedimint db code that assumes the db upholds key-level locking and forgoes any write retry logic because it knows for certain that the keys it's writing to won't be concurrently modified. Might be worth clarifying with the fedimint team.
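The difference between the two readings can be shown with a small in-memory model (plain Python, no database). It is only an illustration of the granularity question, not fedimint's or Harbor's logic; `commits` and its arguments are made up for this sketch.

```python
def commits(granularity, transactions):
    """Return, per transaction, whether its commit succeeds.

    Every transaction snapshots the initial state, then they commit in order.
    `granularity` is "federation" (one shared version) or "key" (one per key).
    """
    versions = {}  # current version per version slot
    results = []
    for written_keys in transactions:
        slots = {"*"} if granularity == "federation" else set(written_keys)
        # All snapshots were taken at version 0, so a bumped slot is a conflict.
        ok = all(versions.get(slot, 0) == 0 for slot in slots)
        if ok:
            for slot in slots:
                versions[slot] = versions.get(slot, 0) + 1
        results.append(ok)
    return results


disjoint = [{"k1"}, {"k2"}]     # no common modified key
overlapping = [{"k1"}, {"k1"}]  # both modify k1

federation_disjoint = commits("federation", disjoint)    # [True, False]
key_disjoint = commits("key", disjoint)                  # [True, True]
federation_overlap = commits("federation", overlapping)  # [True, False]
key_overlap = commits("key", overlapping)                # [True, False]
```

Both granularities reject the genuinely conflicting write, so both satisfy "only if". Only the per-key variant also lets the disjoint writers through, which is what the "if and only if" reading would demand.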