Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 73 additions & 8 deletions packages/dds/matrix/src/matrix.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,12 @@ import type {
IEventThisPlaceHolder,
IEventProvider,
} from "@fluidframework/core-interfaces";
import { assert, unreachableCase } from "@fluidframework/core-utils/internal";
import {
assert,
DoublyLinkedList,
type ListNode,
unreachableCase,
} from "@fluidframework/core-utils/internal";
import type {
IChannelAttributes,
IFluidDataStoreRuntime,
Expand Down Expand Up @@ -219,14 +224,74 @@ type FirstWriterWinsPolicy =
cellLastWriteTracker: SparseArray2D<CellLastWriteTrackerItem>;
};

/**
* Encapsulates the list of pending local changes for a single cell, plus the side-index from
* `localSeq` to the corresponding `DoublyLinkedList` node enabling O(1) lookup during reSubmit
* (instead of a linear `findIndex` scan).
*
* The list and the index are kept in lock-step: every mutator updates both.
*/
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: The class JSDoc states the purpose (O(1) lookup) but not the motivation for the side-map shape. Under today's runtime contract reSubmitCore always receives a localSeq equal to the head of pendingCell.local (because sendSetCellOp assigns a fresh nextLocalSeq() and pushes to tail, and the runtime walks pending ops FIFO). A Deque + head-shift + head-match assert would deliver identical O(1) behavior with no side-map — which is exactly the question the next reader will ask, and the question this dossier converged on.

The justification — explicit removal by localSeq (a) avoids encoding an implicit FIFO dependency in a per-cell data structure, and (b) is staged for follow-up squash work that needs arbitrary-position removal — is reasonable but lives only in PR discussion, not in code. Expand the PendingLocalCellChanges class comment to state explicitly: (1) why the side-map exists (avoids relying on FIFO/head invariant at call sites; supports upcoming squash/arbitrary-position removal), and (2) the lock-step invariant the wrapper enforces between list and index. Without this, the next reader will re-derive the "why not a Deque?" critique.

class PendingLocalCellChanges<T> {
private readonly list = new DoublyLinkedList<{ localSeq: number; value: MatrixItem<T> }>();
private readonly index = new Map<
number,
ListNode<{ localSeq: number; value: MatrixItem<T> }>
>();

public get length(): number {
return this.list.length;
}

public get last(): ListNode<{ localSeq: number; value: MatrixItem<T> }> | undefined {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: PendingLocalCellChanges.last returns ListNode<{ localSeq: number; value: MatrixItem<T> }> | undefined, exposing the internal DoublyLinkedList node type as part of the wrapper's public surface. The single call site in setCellCore writes pendingCell.local.last!.data.value with // eslint-disable-next-line @typescript-eslint/no-non-null-assertion — the only such disable in this PR. Both the eslint suppression and the leaky ListNode shape come from the same accessor.

Replace get last(): ListNode<…> with peekLastValue(): MatrixItem<T> | undefined returning this.list.last?.data.value. The call site collapses to pendingCell.local.peekLastValue() ?? pendingCell.consensus, dropping the !.data.value chain and the eslint-disable, and centralising the internal-structure dependency inside the wrapper where it belongs. Josmithr's pattern on #24604 was to push back on exactly this kind of optional/ambiguity surface on the pending-cell interface family.

return this.list.last;
}

public push(localSeq: number, value: MatrixItem<T>): void {
assert(!this.index.has(localSeq), "duplicate localSeq in PendingLocalCellChanges");
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: assert(!this.index.has(localSeq), "duplicate localSeq in PendingLocalCellChanges") is the only plain-string assert in the changed file. Every other assert(...) in matrix.ts carries a hex short-code tag, including the three asserts this PR modifies (0xba40xba7) and the adjacent 0xba80xbab. The convention was established by the bulk-tagging pass in #24797 (jatgarg, 2025-06-09); the release-time assert-short-code tagger converts plain strings to numeric codes. If that policy check runs in CI for this package, it will fail.

Run the assert-short-code tagger (or confirm the intent is to let the next release-time pass tag it). Don't invent a code by hand.

const { first: node } = this.list.push({ localSeq, value });
// single-item push: first === last, both reference the newly inserted node.
this.index.set(localSeq, node);
}

public removeByLocalSeq(
localSeq: number,
): { localSeq: number; value: MatrixItem<T> } | undefined {
const node = this.index.get(localSeq);
if (node === undefined) {
return undefined;
}
this.list.remove(node);
this.index.delete(localSeq);
return node.data;
}

public shift(): { localSeq: number; value: MatrixItem<T> } | undefined {
const node = this.list.shift();
if (node === undefined) {
return undefined;
}
this.index.delete(node.data.localSeq);
return node.data;
}

public pop(): { localSeq: number; value: MatrixItem<T> } | undefined {
const node = this.list.pop();
if (node === undefined) {
return undefined;
}
this.index.delete(node.data.localSeq);
return node.data;
}
}

/**
* Tracks pending local changes for a cell.
*/
interface PendingCellChanges<T> {
/**
* The local changes including the local seq, and the value set at that local seq.
*/
local: { localSeq: number; value: MatrixItem<T> }[];
local: PendingLocalCellChanges<T>;
/**
Comment thread
anthony-murphy marked this conversation as resolved.
* The latest consensus value across all clients.
* this will either be a remote value or ack'd local
Expand Down Expand Up @@ -513,9 +578,9 @@ export class SharedMatrix<T = any>

this.submitLocalMessage(op, metadata);
const pendingCell: PendingCellChanges<T> = this.pending.getCell(rowHandle, colHandle) ?? {
local: [],
local: new PendingLocalCellChanges<T>(),
};
pendingCell.local.push({ localSeq, value });
pendingCell.local.push(localSeq, value);
this.pending.setCell(rowHandle, colHandle, pendingCell);
return pendingCell;
}
Expand Down Expand Up @@ -844,9 +909,8 @@ export class SharedMatrix<T = any>
assert(pendingCell !== undefined, 0xba4 /* local operation must have a pending array */);
const { local } = pendingCell;
assert(local !== undefined, 0xba5 /* local operation must have a pending array */);
const localSeqIndex = local.findIndex((p) => p.localSeq === localSeq);
assert(localSeqIndex >= 0, 0xba6 /* local operation must have a pending entry */);
const [change] = local.splice(localSeqIndex, 1);
const change = local.removeByLocalSeq(localSeq);
assert(change !== undefined, 0xba6 /* local operation must have a pending entry */);
assert(change.localSeq === localSeq, 0xba7 /* must match */);

if (
Expand Down Expand Up @@ -910,7 +974,8 @@ export class SharedMatrix<T = any>

const previous =
pendingCell.local.length > 0
? pendingCell.local[pendingCell.local.length - 1].value
? // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
pendingCell.local.last!.data.value
: pendingCell.consensus;

this.setCellCore(
Expand Down
50 changes: 50 additions & 0 deletions packages/dds/matrix/src/test/matrix.reconnect.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -231,4 +231,54 @@ describe("SharedMatrix reconnect", () => {
assert.deepEqual(extract(matrix1), expected);
assert.deepEqual(extract(matrix3), expected);
});

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: This PR introduces three new code paths on PendingLocalCellChanges: pop() (rollback, matrix.ts:974), last peek (FWW previous-value decision in setCellCore), and shift() (ACK). The single new test exercises only the reSubmit path. Existing tests cover single-pending rollback and single-pending FWW, but the "many pending locals to one cell → rollback / FWW conflict" combinations — exactly the paths that depend on the new last accessor and on pop's side-map cleanup — are not directly exercised.

Add two tests under the same harness:

  • A rollback test that stacks ≥3 pending writes to one cell then undoes them in sequence (exercises pop + last + side-map cleanup; pairs with the existing matrix.rollback.spec.ts patterns).
  • An FWW-conflict test where a remote winner arrives while multiple local pendings exist on the same cell (exercises the length > 0 ? local.last!.data.value : pendingCell.consensus branch).

Both fit the existing MockContainerRuntime harness.

// Regression test for the resubmit path optimized when many pending writes target the
// same (row, col). Disconnect, write the same cell N times, reconnect, and assert that
// only the final value remains and that both clients converge.
it("resubmits N writes to the same cell after reconnect", () => {
const containerRuntimeFactory = new MockContainerRuntimeFactoryForReconnection();
const dataRuntime1 = new MockFluidDataStoreRuntime();
containerRuntimeFactory.createContainerRuntime(dataRuntime1);
const dataRuntime2 = new MockFluidDataStoreRuntime();
const containerRuntime2 = containerRuntimeFactory.createContainerRuntime(dataRuntime2);

const matrix1 = matrixFactory.create(dataRuntime1, "A");
matrix1.connect({
deltaConnection: dataRuntime1.createDeltaConnection(),
objectStorage: new MockStorage(),
});

const matrix2 = matrixFactory.create(dataRuntime2, "B");
matrix2.connect({
deltaConnection: dataRuntime2.createDeltaConnection(),
objectStorage: new MockStorage(),
});

// Create a 1x1 matrix.
matrix1.insertRows(0, 1);
matrix1.insertCols(0, 1);
containerRuntimeFactory.processAllMessages();

// Disconnect matrix2, then write the same (row, col) N times. This stages a
// large run of pending setCell ops all targeting the same cell, which is the
// hot path being optimized by the pending-cell-list change.
containerRuntime2.connected = false;
const N = 200;
for (let i = 0; i < N; i++) {
matrix2.setCell(0, 0, i);
}

// While disconnected, matrix2 sees the latest local write and matrix1 sees nothing.
assert.deepEqual(extract(matrix1), [[undefined]]);
assert.deepEqual(extract(matrix2), [[N - 1]]);

// Reconnect matrix2 to force the resubmit path to re-emit all N pending writes.
containerRuntime2.connected = true;
containerRuntimeFactory.processAllMessages();

// Both clients should converge on the last-written value.
const expected = [[N - 1]];
assert.deepEqual(extract(matrix1), expected);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep Review: "resubmits N writes to the same cell after reconnect" stages N=200 pending writes, reconnects, and asserts only assert.deepEqual(extract(matrix1), [[N - 1]]) and extract(matrix2), [[N - 1]] — final-state convergence. The test's own framing ("re-emit all N pending writes") implies a per-op regression target, but no cellsChanged observation is wired up. On #18018, vladsud explicitly objected to resubmit logic preserving final state while skipping intermediate set-cell states, because matrix exposes each intermediate state through change events. The PR adds no batching/squash path so the risk is narrowed today — but the gap remains real for the path being rewritten, and a future squash follow-up could regress this silently.

Attach a cellsChanged consumer (or a counter on the open-handler) and verify both clients observe all N values in order during the reconnect drain — not just the final one.

assert.deepEqual(extract(matrix2), expected);
});
});
Loading