
Feature request: Add structured memory primitives for better context engineering#791

Draft
mchockal wants to merge 10 commits into cloudflare:main from mchockal:main

Conversation

@mchockal
Contributor

@mchockal mchockal commented Jan 20, 2026

feat(memory): Add Session, WorkingContext, and Processor Pipeline primitives

Summary

This PR introduces structured memory primitives for building context-aware agents, based on the tiered memory architecture principles from Google's ADK whitepaper, adapted to the Agents SDK.

Update 2/16: Added experimental session primitives to agents/experimental/memory — SQL-backed sessions, append-only event storage, ephemeral WorkingContext. This is a complete redesign of the original v0 memory primitives based on PR feedback, focusing just on the Session and Working Context first. A separate class SessionAgent is used to keep these changes purely experimental and independent of the Agent class.

Why?

Currently, agent developers must manually manage conversation history, handle context window limits, and implement their own compaction strategies. This leads to:

  • Duplicated effort across agent implementations
  • Inconsistent approaches to memory management
  • Tight coupling between conversation state and model-specific formats
  • Complex compaction logic scattered throughout application code

These primitives provide a model-agnostic foundation that separates the ground truth (Session) from the computed view (WorkingContext), enabling clean abstractions for context management and, later, compaction.


⚠️ Experimental — this module is unstable and may change without notice. All exports carry @experimental JSDoc tags.


What's Added

SessionAgent — Agent subclass with session/event SQL tables

Extends Agent and creates cf_agents_sessions and cf_agents_events tables in the constructor (matching the Agent pattern for cf_agents_state, cf_agents_queues, etc.).

RPC-safe methods (plain serializable objects, callable from Workflows/Workers):

  • createSession(metadata?) — Creates a session row, returns its ID
  • getSession(sessionId) — Returns StoredSession | null
  • listSessions() — Returns all sessions for this agent
  • deleteSession(sessionId) — Deletes a session and all of its events
  • loadEvents(sessionId, opts?) — Returns hydrated SessionEvent[], ordered by seq (default limit: 50)
  • appendEvents(sessionId, events) — Validates the session, assigns monotonic seq values, inserts rows
  • deleteEvents(sessionId, eventIds) — Deletes specific events by ID
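The monotonic-seq contract of appendEvents can be sketched as a pure helper. This is an illustrative stand-in, not the PR's actual implementation; the names SessionEvent and assignSeq are hypothetical here:

```typescript
// Hypothetical sketch of the monotonic-seq assignment described above.
interface SessionEvent {
  id: string;
  sessionId: string;
  seq: number;
  action: string;
  content: string | null;
}

// Given the highest seq already stored for a session, stamp each incoming
// event with the next monotonic value before insertion.
function assignSeq(events: SessionEvent[], maxStoredSeq: number): SessionEvent[] {
  let nextSeq = maxStoredSeq + 1;
  return events.map((event) => ({ ...event, seq: nextSeq++ }));
}

const stamped = assignSeq(
  [
    { id: "e1", sessionId: "s1", seq: 0, action: "user_message", content: "hi" },
    { id: "e2", sessionId: "s1", seq: 0, action: "agent_message", content: "hello" }
  ],
  41
);
console.log(stamped.map((e) => e.seq)); // [42, 43]
```

Because seq is assigned server-side from the stored maximum, callers never supply it themselves, which keeps ordering consistent across concurrent writers within one Durable Object.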

Local-only methods (class instances don't survive RPC serialization):

  • _buildWorkingContext(sessionId, opts?) — protected; loads events and builds a WorkingContext
  • persistWorkingContext(sessionId, ctx) — Extracts new messages, converts them to events, appends them

WorkingContext — Ephemeral in-memory context

Built per-request from session events. Accumulates messages during an agentic loop, tracks which are new via _initialCount. Thrown away after the request.

const ctx = this._buildWorkingContext(sessionId, {
  systemInstructions: ["You are a helpful assistant."],
  limit: 100,
});
ctx.addMessage({ role: "user", content: userMessage });
// ... LLM call, tool calls ...
ctx.addMessage({ role: "assistant", content: response });
this.persistWorkingContext(sessionId, ctx); // batch-persist new messages only

buildWorkingContext() — Pure function

For use outside the Agent (Workflows, Workers, tests). Takes SessionEvent[] from loadEvents() RPC and builds a WorkingContext locally.
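The core idea — persisted events in, ephemeral context out, with a watermark marking which messages are already stored — can be sketched like this. Types and names are simplified stand-ins for the PR's actual exports:

```typescript
// Illustrative sketch (not the PR's actual exports) of the pure
// buildWorkingContext idea.
interface ContextMessage {
  role: "user" | "assistant";
  content: string;
}

class WorkingContext {
  private messages: ContextMessage[];
  private readonly initialCount: number; // messages loaded from storage are "old"

  constructor(messages: ContextMessage[]) {
    this.messages = [...messages];
    this.initialCount = messages.length;
  }

  addMessage(msg: ContextMessage): void {
    this.messages.push(msg);
  }

  // Only messages added after construction need persisting.
  newMessages(): ContextMessage[] {
    return this.messages.slice(this.initialCount);
  }
}

function buildWorkingContext(
  events: { action: string; content: string }[]
): WorkingContext {
  return new WorkingContext(
    events.map((e) => ({
      role: e.action === "user_message" ? ("user" as const) : ("assistant" as const),
      content: e.content
    }))
  );
}

const ctx = buildWorkingContext([{ action: "user_message", content: "hi" }]);
ctx.addMessage({ role: "assistant", content: "hello" });
console.log(ctx.newMessages().length); // 1
```

The watermark is what makes persistWorkingContext a cheap batch append: it only writes the suffix of messages accumulated during the current request.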

workersAIAdapter — Model format adapter

Stateless adapter converting WorkingContext messages to Workers AI chat completions format. Handles system instructions, structured toolCalls, and tool results. Support for other frameworks will be added as new adapters.
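The adapter's core job can be sketched as below. This is a simplified illustration (tool calls and tool results omitted), and toWorkersAIInput is a hypothetical name, not the PR's export; the { messages: [{ role, content }] } shape is what Workers AI chat models accept via env.AI.run():

```typescript
// Simplified sketch of a stateless WorkingContext -> Workers AI adapter.
interface ContextMessage {
  role: "user" | "assistant";
  content: string;
}

// System instructions are prepended as role "system" messages, followed by
// the conversation messages in order.
function toWorkersAIInput(systemInstructions: string[], messages: ContextMessage[]) {
  return {
    messages: [
      ...systemInstructions.map((content) => ({ role: "system" as const, content })),
      ...messages.map(({ role, content }) => ({ role, content }))
    ]
  };
}

const input = toWorkersAIInput(["You are terse."], [{ role: "user", content: "hi" }]);
console.log(input.messages[0].role); // "system"
```

Keeping the adapter stateless means WorkingContext stays model-agnostic and a future OpenAI or Anthropic adapter is just another pure conversion function.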

Pure utility functions

  • hydrateEvent(row) — SQL row → typed SessionEvent
  • dehydrateEvent(event) — SessionEvent → SQL row
  • eventToMessage(event) — SessionEvent → ContextMessage
  • messageToEvent(sessionId, msg) — ContextMessage → SessionEvent
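The hydrate/dehydrate pair is a lossless roundtrip between typed events and flat SQL rows. The sketch below is illustrative: field names mirror the schema in the next section, but the types are simplified versions of the PR's, and the real functions also map action-specific metadata fields:

```typescript
// Minimal illustration of the hydrate/dehydrate roundtrip contract.
interface EventRow {
  id: string;
  session_id: string;
  seq: number;
  action: string;
  content: string | null;
  metadata: string | null; // JSON, or null when empty
  created_at: number;      // ms since epoch
}

interface SessionEvent {
  id: string;
  sessionId: string;
  seq: number;
  action: string;
  content: string | null;
  metadata: Record<string, unknown>;
  createdAt: number;
}

// SQL row -> typed event (snake_case -> camelCase, JSON parse).
function hydrateEvent(row: EventRow): SessionEvent {
  return {
    id: row.id,
    sessionId: row.session_id,
    seq: row.seq,
    action: row.action,
    content: row.content,
    metadata: row.metadata ? JSON.parse(row.metadata) : {},
    createdAt: row.created_at
  };
}

// Typed event -> SQL row (camelCase -> snake_case, JSON stringify).
function dehydrateEvent(event: SessionEvent): EventRow {
  return {
    id: event.id,
    session_id: event.sessionId,
    seq: event.seq,
    action: event.action,
    content: event.content,
    metadata: Object.keys(event.metadata).length ? JSON.stringify(event.metadata) : null,
    created_at: event.createdAt
  };
}
```

A property worth testing (and which the PR's utils.test.ts reportedly covers): hydrateEvent(dehydrateEvent(e)) should equal e for any valid event.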

SQL Schema

CREATE TABLE cf_agents_sessions (
  id TEXT PRIMARY KEY NOT NULL,
  agent_id TEXT NOT NULL,
  created_at INTEGER NOT NULL,  -- ms since epoch
  updated_at INTEGER NOT NULL,
  metadata TEXT                 -- JSON
);

CREATE TABLE cf_agents_events (
  id TEXT PRIMARY KEY NOT NULL,
  session_id TEXT NOT NULL,
  seq INTEGER NOT NULL,         -- monotonic per session
  action TEXT NOT NULL,
  content TEXT,
  metadata TEXT,                -- JSON
  created_at INTEGER NOT NULL
);

CREATE INDEX idx_events_session_seq ON cf_agents_events(session_id, seq);

POST /chat
  → buildContext(sessionId)           # load completed events → WorkingContext
  → ctx.addMessage(userMessage)       # in-memory only
  → LLM call + tool loop             # accumulate in-memory
  → persistWorkingContext(sessionId)  # batch-persist user + assistant atomically

Example: examples/session-agent/

A complete working example with:

  • HTTP endpoints: POST /chat, GET /session, DELETE /session
  • Agentic loop with tool calling (echo + dummy-weather tools)

Tests

✓ adapters.test.ts (8 tests)  — Workers AI format, tool calls, full agentic loop conversation
✓ context.test.ts  (11 tests) — WorkingContext construction, new message tracking, custom mappers
✓ utils.test.ts    (26 tests) — hydrate/dehydrate roundtrips, event↔message mapping, edge cases

Tests run via npm run test:memory (added to the CI test pipeline).


Known Limitations

  • Workers AI only — OpenAI/Anthropic adapters planned
  • No compaction/summarization — the event schema and current design support it, but the orchestration (processors) is not yet built
  • No token estimation — bring your own estimator. Two options: (a) a cheap approximation such as chars / 3: fast, no additional cost, and optimistic. Since the estimate only decides when to trigger compaction, ~70% of the model context window is still a good trigger threshold for v0; or (b) spend one inference call with output tokens capped at 1 to obtain the exact input-only token count.
  • Concurrent request divergence — concurrent LLM responses may be contextually divergent (each request sees only completed turns)
  • No Vercel AI SDK integration — only raw env.AI.run() via adapter
  • _buildWorkingContext is protected — subclasses need a public wrapper for helper functions
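The chars/3 estimate from the token-estimation bullet above can be sketched as follows. The divisor and the 70% threshold come from the bullet; the function names are illustrative, not SDK exports:

```typescript
// Cheap "bring your own estimator" sketch: characters / 3 as an optimistic
// token approximation, used only to decide when to trigger compaction.
function estimateTokens(messages: { content: string }[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 3);
}

// Trigger compaction once the estimate reaches ~70% of the model's context.
function shouldCompact(
  messages: { content: string }[],
  modelContextTokens: number
): boolean {
  return estimateTokens(messages) >= 0.7 * modelContextTokens;
}

console.log(shouldCompact([{ content: "x".repeat(3000) }], 1000)); // true (~1000 >= 700)
```

Because the estimate only gates compaction, an error of even 30-40% versus the real tokenizer just shifts when compaction fires, not whether the request succeeds.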

Files Changed / Added

packages/agents/src/experimental/memory/
├── index.ts              # Barrel exports + @experimental JSDoc
├── types.ts              # EventAction, SessionEvent, ContextMessage, interfaces
├── utils.ts              # hydrateEvent, dehydrateEvent, eventToMessage, messageToEvent
├── context.ts            # WorkingContext class + buildWorkingContext pure function
├── session-agent.ts      # SessionAgent extends Agent
├── adapters/
│   ├── index.ts          # Re-exports
│   └── workers-ai.ts     # Workers AI adapter
└── __tests__/
    ├── vitest.config.ts
    ├── utils.test.ts
    ├── context.test.ts
    └── adapters.test.ts

packages/agents/package.json         # exports + test:memory script
packages/agents/scripts/build.ts     # entry point for experimental/memory
docs/memory-primitives.md            # Full documentation
examples/session-agent/              # Working example with agentic loop

@changeset-bot

changeset-bot bot commented Jan 20, 2026

⚠️ No Changeset found

Latest commit: c840629

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pkg-pr-new

pkg-pr-new bot commented Jan 20, 2026

Open in StackBlitz

npm i https://pkg.pr.new/cloudflare/agents@791

commit: 8078063

Member

@deathbyknowledge deathbyknowledge left a comment


This is not a final review yet, but I'm struggling to see the bigger picture here; it seems quite opinionated and I'm not sure the benefits outweigh the negatives.

Could you update one or two of the existing examples (or make a new one) to use this approach and see how it feels?

/**
* Token limit processor - ensures context fits within token limits
*/
export const tokenLimitRequestProcessor: RequestProcessor<
Member


This relies on estimateTokenCount which just guesses tokens. Since there's quite some variation in tokenizers depending on the model at use, I'd prefer if we remove this one and let users provide their own.

Maybe we can add "character" truncation instead?

Contributor Author


True. Ideally the choice of processor implementation would be left to the user; that would be step 0. It could be something to include at a later stage, when context compaction to keep long conversations going becomes more common, and if there's a clear benefit to having it in the SDK at that point. The entire processors.ts can be part of a later PR.

*/
export class Session {
readonly metadata: SessionMetadata;
readonly events: Event[];
Member


Do we want to use the DO sqlite to handle storage here? Or perhaps storing events directly, kind of like an event store?

Contributor Author


Good call. sqlite storage would be better, and fits better with the existing pattern. A cf_agents_session table might be a better approach to store user's session. Primary key will be a session_id, and events can be stored in order for easy retrieval of last 'N' user-agent turns in a given session, which would serve as the working-context for the current request. What are your thoughts?

(Apologies about the late response. I missed the email notification thread for this set of comments, and only noticed earlier today )

Contributor


yeah do sqlite is fine. might need to work around 2mb row limitation, just consider that.

@threepointone
Contributor

I'll review further on wednesday

@threepointone
Contributor

Some thoughts: (that opus cleaned up for me)

Thanks for putting this together - the architecture works and the problem space (context management, compaction, model-agnostic event logging) is real. I think this is worth shipping as an experimental feature with some adjustments.

Overall Take

The Session/WorkingContext/ProcessorPipeline split is reasonable. For users building long-running conversations, multi-agent systems, or agents that need audit trails, this provides useful primitives that don't exist in the SDK today.

That said, most agents-sdk users are building simpler things and won't need this level of abstraction yet. So I'd recommend shipping this as explicitly experimental - opt-in for power users who want to explore these patterns. Maybe under agents/experimental/sessions?

Before Merging

1. Add clear experimental labeling

The exports and docs should make it obvious this is experimental and the API may change:

  • Consider exporting from agents/experimental/sessions and adding Experimental prefix to class names
  • Add @experimental JSDoc tags
  • Docs should lead with a warning about API instability

2. Add at least one complete example

The PR description has good code snippets, but there's no end-to-end example showing how to wire this into an actual agent. Something in examples/ that demonstrates:

  • Creating a Session, adding events
  • Running the processor pipeline
  • Calling an LLM with the resulting WorkingContext
  • (Bonus) Triggering compaction after N turns

Without this, developers won't know how to connect the pieces.

3. Document known limitations

Be explicit about what this doesn't do (yet):

  • Only supports Workers AI format (OpenAI/Anthropic adapters planned)
  • Compaction requires you to provide your own summarizer
  • No integration with Vercel AI SDK (yet)
  • Token estimation is approximate (~chars/4); we should probably fix this before landing, tbh
  • Will probably break as we iterate on it (schemas, apis, what not)

Code-Level Items

A few things to clean up:

Token counting — The chars / 4 estimation can be off significantly. Either document this limitation clearly or consider making the estimator pluggable.

truncateToFit algorithm — The current implementation breaks on the first content that doesn't fit, which may not be optimal. Consider whether this matters for the use cases you're targeting.

readonly modifiers — events, statistics, and compactionConfig are marked readonly but then mutated. This is confusing: either use actual immutability or drop the readonly keyword.

Event ID generation — Date.now() + random suffix could theoretically collide under high throughput. Consider crypto.randomUUID() for guaranteed uniqueness.

Tests — Would be good to have basic coverage for:

  • Session serialization/deserialization roundtrip
  • Processor pipeline ordering
  • Compaction filtering logic

Summary

Ship it as experimental with:

  • Clear experimental labeling in exports and docs
  • At least one working example in examples/
  • Known limitations documented
  • Basic test coverage
  • Fix the readonly confusion

We'll learn more from real usage than from more design iteration. Just make sure the "experimental" signal is loud enough that people don't accidentally depend on API stability.

@mchockal
Contributor Author

mchockal commented Feb 9, 2026

Thanks for the detailed feedback! I'll update the PR with the changes over the next 2-3 days and have it ready for review by Friday at the latest.

@whoiskatrin
Contributor

/bonk review latest changes here

Contributor

@ask-bonk ask-bonk bot left a comment


Review: Experimental Memory Primitives

The overall architecture (Session / WorkingContext / Adapters split) is well-thought-out and the implementation is clean. The separation between RPC-safe methods and local-only WorkingContext helpers is a good design choice. The test coverage for the pure utility functions is solid.

However, there are two blocking issues (a syntax error in build.ts that will break the build, and the test scripts being rewritten in a way that breaks the existing vitest workspace setup) plus several smaller items that should be addressed before merging.

Blocking

  1. build.ts syntax error — Missing comma between entry points will fail the build for the entire agents package.
  2. Test scripts rewrite breaks existing setup — The PR replaces the vitest --project workspace-based test commands with -r directory-based commands and drops test:x402 entirely. The repo has a vitest.config.ts at the package root that defines projects; the new scripts bypass it.

Important

  1. destroy() does not clean up session tables — The base Agent.destroy() drops cf_agents_state, cf_agents_queues, etc. but doesn't know about cf_agents_sessions or cf_agents_events. SessionAgent should override destroy() to drop its tables first, then call super.destroy().
  2. appendEvents is not atomic — Individual INSERT statements in a loop without an explicit transaction mean a failure mid-batch leaves partial data. The base Agent uses this.sql for single statements, but batch operations should be wrapped in a transaction.
  3. Missing changeset — The bot flagged this. Changes to packages/agents/package.json exports need a changeset.
  4. Example does not follow repo conventions — Per examples/AGENTS.md, examples should be full-stack (frontend + backend), use Vite with @cloudflare/vite-plugin, include vite.config.ts, index.html, Kumo UI, etc. The current example is backend-only with curl-based usage.
  5. compatibility_date is stale — wrangler.jsonc uses 2025-09-27; the repo standard is 2026-01-28.

Minor / Suggestions

  1. loadEvents query construction — The branching logic for since/actions/tail combinations produces 6 code paths. A query-builder approach or parameterized construction would be more maintainable.
  2. Unsafe type assertions in hydrateEvent — The metadata JSON is parsed and fields are cast with as without validation. Malformed metadata would produce runtime errors downstream.
  3. onStart in example uses sync methods — listSessions() and createSession() are synchronous (SQLite), but onStart is async. This is fine but potentially confusing to readers.
  4. Test config include path is relative to repo root — The vitest config for memory tests uses include: ["src/experimental/memory/__tests__/*.test.ts"] which works only when vitest is invoked from the package root, not the config file's directory. This may cause issues depending on how vitest resolves the path.

"src/mcp/x402.ts",
"src/observability/index.ts",
"src/codemode/ai.ts",
"src/experimental/memory/index.ts"
Contributor


Build-breaking bug: Missing comma between array elements. This will cause a syntax error when running npm run build.

Suggested change
"src/experimental/memory/index.ts"
"src/experimental/memory/index.ts",

Comment on lines 196 to 202
"test": "npm run test:workers && npm run test:react && npm run test:cli && npm run test:memory",
"test:cli": "vitest -r src/cli-tests",
"test:react": "vitest -r src/react-tests",
"test:e2e": "vitest run src/e2e/e2e.test.ts --sequence.concurrent",
"test:memory": "vitest run --config src/experimental/memory/__tests__/vitest.config.ts",
"test:workers": "vitest -r src/tests"
}
Contributor


Breaking change to existing test infrastructure. The original scripts use vitest --project <name> which references the workspace config in vitest.config.ts (defines projects: src/tests/vitest.config.ts, src/react-tests/vitest.config.ts, src/cli-tests/vitest.config.ts, src/x402-tests/vitest.config.ts).

This PR:

  • Replaces --project with -r (changes how vitest resolves configs)
  • Drops test:x402 entirely
  • Changes test from vitest --run (which uses the workspace) to a sequential chain

Please keep the existing scripts intact and only add the new test:memory script. Then add the memory vitest config to the projects array in the root vitest.config.ts:

Suggested change
"test": "npm run test:workers && npm run test:react && npm run test:cli && npm run test:memory",
"test:cli": "vitest -r src/cli-tests",
"test:react": "vitest -r src/react-tests",
"test:e2e": "vitest run src/e2e/e2e.test.ts --sequence.concurrent",
"test:memory": "vitest run --config src/experimental/memory/__tests__/vitest.config.ts",
"test:workers": "vitest -r src/tests"
}
"test": "vitest --run",
"test:cli": "vitest --project cli",
"test:react": "vitest --project react",
"test:e2e": "vitest run src/e2e/e2e.test.ts --sequence.concurrent",
"test:memory": "vitest run --config src/experimental/memory/__tests__/vitest.config.ts",
"test:workers": "vitest --project workers",
"test:x402": "vitest --project x402"

Alternatively, add the memory tests as another project in the workspace config so vitest --run picks them up automatically.

`;

nextSeq++;
}
Contributor


The loop of individual INSERT statements is not wrapped in a transaction. If the Durable Object fails mid-batch (e.g., on the 3rd of 5 events), you'll have partial data with a gap in seq values. Consider wrapping the batch in a transaction:

this.ctx.storage.sql.exec("BEGIN");
try {
  for (const event of events) { ... }
  this.ctx.storage.sql.exec("COMMIT");
} catch (e) {
  this.ctx.storage.sql.exec("ROLLBACK");
  throw e;
}

Note: The base Agent's single-row operations don't need this, but batch inserts do.

const events = newMessages.map((msg) => messageToEvent(sessionId, msg));
this.appendEvents(sessionId, events);
}
}
Contributor


SessionAgent creates cf_agents_sessions and cf_agents_events tables, but the base Agent.destroy() method only drops the tables it knows about (cf_agents_state, cf_agents_queues, cf_agents_schedules, cf_agents_mcp_servers, cf_agents_workflows).

Please override destroy() to clean up the session tables:

async destroy() {
  this.sql`DROP TABLE IF EXISTS cf_agents_events`;
  this.sql`DROP TABLE IF EXISTS cf_agents_sessions`;
  await super.destroy();
}


const meta: Record<string, unknown> = row.metadata
? JSON.parse(row.metadata)
: {};
Contributor


The metadata JSON is parsed with JSON.parse and then fields are accessed with bare as casts (e.g., meta.toolCallId as string, meta.toolCalls as ToolCall[]). If the stored metadata is malformed or from a schema migration, this will silently produce undefined values that propagate as runtime errors later.

Consider at minimum adding fallback checks, or documenting that metadata is trusted (since only dehydrateEvent writes it). For toolCallId, the downstream code assumes it's a non-empty string, so an explicit check would be safer:

toolCallId: (meta.toolCallId as string) ?? "",
toolName: (meta.toolName as string) ?? "",

Comment on lines 170 to 231
if (since !== null && actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND created_at >= ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, since, ...actions, limit)
] as unknown as StoredEvent[];
} else if (since !== null) {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq ASC LIMIT ${limit}
`;
}
} else if (actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, ...actions, limit)
] as unknown as StoredEvent[];
} else {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq ASC LIMIT ${limit}
`;
}
}

Contributor


The loadEvents method has 6 distinct code paths for the combinations of since, actions, and tail. This is hard to maintain and test — adding a new filter option would double the branches.

Consider a query-builder approach that conditionally appends WHERE clauses:

let conditions = ["session_id = ?"];
let params: unknown[] = [sessionId];

if (since !== null) {
  conditions.push("created_at >= ?");
  params.push(since);
}
if (actions !== null && actions.length > 0) {
  conditions.push(`action IN (${actions.map(() => "?").join(", ")})`);
  params.push(...actions);
}

const where = conditions.join(" AND ");
const inner = `SELECT ... FROM cf_agents_events WHERE ${where} ORDER BY seq ${innerOrder} LIMIT ?`;
params.push(limit);

const query = tail ? `SELECT * FROM (${inner}) sub ORDER BY seq ASC` : inner;
rows = [...this.ctx.storage.sql.exec(query, ...params)] as unknown as StoredEvent[];

This eliminates the combinatorial branching entirely.

"name": "agents-session-agent",
"main": "src/index.ts",
"compatibility_date": "2025-09-27",
"compatibility_flags": ["nodejs_compat"],
Contributor


The compatibility_date should be "2026-01-28" per the repo convention (see root AGENTS.md).

Suggested change
"compatibility_flags": ["nodejs_compat"],
"compatibility_date": "2026-01-28",

/**
* @experimental
* Build a WorkingContext from a list of session events. This is an example,
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.
Contributor


Nit: This comment says "This is an example" — should probably say something like "Pure function — keep outside the Agent class to allow use from Workflows/Workers/tests."

test: {
environment: "node",
clearMocks: true,
include: ["src/experimental/memory/__tests__/*.test.ts"]
Contributor


This include path is relative to the repo root, not to the config file's directory. This works when running vitest run --config src/experimental/memory/__tests__/vitest.config.ts from the packages/agents/ directory, but may be fragile.

Consider using a path relative to the config file, or use root in the config to make it explicit:

export default defineConfig({
  test: {
    environment: "node",
    clearMocks: true,
    root: "../../../..",
    include: ["src/experimental/memory/__tests__/*.test.ts"]
  }
});

Or simply:

include: ["./*.test.ts"]

if vitest resolves relative to the config file location (check vitest docs for your version).

if (msg.role === "user") {
return { ...base, action: EventAction.USER_MESSAGE, content: msg.content };
}
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check
Contributor


Nit: typo in comment — eventhougheven though.

Suggested change
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, even though it is a separate event, hence this extra check

@ask-bonk
Contributor

ask-bonk bot commented Feb 19, 2026

Review submitted successfully. Here's a summary of the feedback:

Blocking issues (2)

  1. build.ts syntax error — Missing comma on line 18 between "src/experimental/memory/index.ts" and "src/experimental/forever.ts" will break the entire package build.
  2. Test scripts rewrite — The PR replaces the vitest workspace-based --project commands with -r directory commands, dropping test:x402 and bypassing the existing vitest.config.ts workspace setup. Only the new test:memory script should be added.

Important issues (5)

  1. SessionAgent doesn't override destroy() — The new cf_agents_sessions and cf_agents_events tables won't be cleaned up when destroy() is called.
  2. appendEvents lacks transaction wrapping — The batch INSERT loop can leave partial data on failure.
  3. Missing changeset — Required for packages/ changes that affect exports.
  4. Example doesn't follow repo conventions — Should be full-stack with Vite, not backend-only with curl.
  5. Stale compatibility_date — Should be 2026-01-28 per repo standard.

Minor items (4)

  1. Query builder suggestion for loadEvents to reduce branching
  2. Unsafe as casts in hydrateEvent metadata parsing
  3. Typo fix (eventhougheven though)
  4. Vitest config include path may be fragile

github run

@deathbyknowledge
Member

/bonk review again please

Contributor

@ask-bonk ask-bonk bot left a comment


Review: Experimental Memory Primitives (v2)

The architecture is well-designed. The Session/WorkingContext/Adapter split is clean, the separation between RPC-safe methods and local-only helpers is thoughtful, and the pure utility functions are well-tested. The code addresses the previous round of feedback (from threepointone and deathbyknowledge) effectively — the ProcessorPipeline and token estimation were dropped, SQL-backed storage was adopted, and the module is properly labeled as experimental.

I have two blocking issues, several important items, and a few minor suggestions.


Blocking

  1. package-lock.json is massively out of sync — The fork's lockfile has ~25,000 lines of diff against main, downgrading nearly every dependency (e.g., @ai-sdk/anthropic from ^3.0.45 to ^3.0.8, @cloudflare/workers-types from ^4.20260219.0 to ^4.20260103.0, adding @biomejs/biome, removing @cloudflare/kumo, oxfmt, oxlint, etc.). This PR should not modify package-lock.json beyond what's needed for the new examples/session-agent workspace. Please rebase on the latest main and run npm install to regenerate.

  2. Missing changeset — This PR adds a new export path (agents/experimental/memory) to packages/agents/package.json. Per repo conventions, changes to packages/ that affect the public API need a changeset. Run npx changeset and describe the new experimental memory module.

Important

  1. SessionAgent should override destroy() — The base Agent.destroy() drops the five tables it creates (cf_agents_state, cf_agents_queues, cf_agents_schedules, cf_agents_mcp_servers, cf_agents_workflows) and then calls deleteAll(). However, it doesn't know about cf_agents_sessions or cf_agents_events. While deleteAll() clears storage, the explicit table drops are part of the pattern for clean teardown and should be followed consistently.

  2. appendEvents should use a transaction — The loop of individual INSERT statements at lines 254-265 can leave partial data if the DO fails mid-batch. A single failed insert after 3 of 5 events would leave orphaned rows with gaps in seq values. The base Agent uses individual this.sql statements but only for single-row operations. Batch inserts should be wrapped in a transaction.

  3. deleteSession should also use a transaction — Lines 138-139 execute two separate DELETE statements. If the first succeeds but the second fails, you'd have a session row with no events, but the session itself would still exist (or vice versa).

  4. Example doesn't follow repo conventions — Per examples/AGENTS.md, all examples must be full-stack (frontend + backend) with Vite, @cloudflare/vite-plugin, index.html, React client, Kumo UI, etc. The current example is backend-only with curl-based usage. It's missing vite.config.ts, index.html, src/client.tsx, and the required shared components (PoweredByAgents, ConnectionIndicator, ModeToggle, ThemeProvider). I understand this is extra work, but consistency matters for user-facing learning material.

  5. compatibility_date is stale — The example's wrangler.jsonc uses 2025-09-27; the repo standard is 2026-01-28.

Minor

  1. See inline comments below.

const events = newMessages.map((msg) => messageToEvent(sessionId, msg));
this.appendEvents(sessionId, events);
}
}

SessionAgent creates cf_agents_sessions and cf_agents_events tables, but the base Agent.destroy() only drops the tables it knows about. Override destroy() to clean up:

Suggested change
}
async destroy() {
this.sql`DROP TABLE IF EXISTS cf_agents_events`;
this.sql`DROP TABLE IF EXISTS cf_agents_sessions`;
await super.destroy();
}
}

Comment on lines 254 to 265
for (const event of events) {
// Override seq with the correct monotonic value
const withSeq: SessionEvent = { ...event, seq: nextSeq, sessionId };
const row = dehydrateEvent(withSeq);

this.sql`
INSERT INTO cf_agents_events (id, session_id, seq, action, content, metadata, created_at)
VALUES (${row.id}, ${row.session_id}, ${row.seq}, ${row.action}, ${row.content}, ${row.metadata}, ${row.created_at})
`;

nextSeq++;
}

This loop of individual INSERTs is not atomic. If the DO fails mid-batch, you'll have partial data with gaps in seq values. Wrap in a transaction:

this.ctx.storage.transactionSync(() => {
  for (const event of events) {
    const withSeq: SessionEvent = { ...event, seq: nextSeq, sessionId };
    const row = dehydrateEvent(withSeq);
    this.sql`
      INSERT INTO cf_agents_events (id, session_id, seq, action, content, metadata, created_at)
      VALUES (${row.id}, ${row.session_id}, ${row.seq}, ${row.action}, ${row.content}, ${row.metadata}, ${row.created_at})
    `;
    nextSeq++;
  }
});

Note: Durable Object SQLite does not accept explicit BEGIN/COMMIT/ROLLBACK statements through exec() — ctx.storage.transactionSync() is the supported way to run a synchronous transaction, and it rolls back automatically if the callback throws.

Comment on lines 137 to 140
deleteSession(sessionId: string): void {
this.sql`DELETE FROM cf_agents_events WHERE session_id = ${sessionId}`;
this.sql`DELETE FROM cf_agents_sessions WHERE id = ${sessionId}`;
}

Same atomicity concern here — if the first DELETE succeeds but the second fails (or vice versa), you'll have inconsistent state. Consider wrapping in a transaction:

Suggested change
deleteSession(sessionId: string): void {
this.sql`DELETE FROM cf_agents_events WHERE session_id = ${sessionId}`;
this.sql`DELETE FROM cf_agents_sessions WHERE id = ${sessionId}`;
}
deleteSession(sessionId: string): void {
  this.ctx.storage.transactionSync(() => {
    this.sql`DELETE FROM cf_agents_events WHERE session_id = ${sessionId}`;
    this.sql`DELETE FROM cf_agents_sessions WHERE id = ${sessionId}`;
  });
}

Comment on lines 170 to 231
if (since !== null && actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND created_at >= ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, since, ...actions, limit)
] as unknown as StoredEvent[];
} else if (since !== null) {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq ASC LIMIT ${limit}
`;
}
} else if (actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, ...actions, limit)
] as unknown as StoredEvent[];
} else {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq ASC LIMIT ${limit}
`;
}
}


The combinatorial branching for since/actions/tail produces 6 code paths. This is hard to maintain and will double with each new filter. Consider a query-builder approach:

const conditions = ["session_id = ?"];
const params: (string | number)[] = [sessionId];

if (since !== null) {
  conditions.push("created_at >= ?");
  params.push(since);
}
if (actions !== null && actions.length > 0) {
  conditions.push(`action IN (${actions.map(() => "?").join(", ")})`);
  params.push(...actions);
}

const where = conditions.join(" AND ");
const cols = "id, session_id, seq, action, content, metadata, created_at";
const inner = `SELECT ${cols} FROM cf_agents_events WHERE ${where} ORDER BY seq ${innerOrder} LIMIT ?`;
params.push(limit);

const query = tail
  ? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
  : inner;
rows = [...this.ctx.storage.sql.exec(query, ...params)] as unknown as StoredEvent[];

This eliminates the branching entirely and is easier to extend.
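To make the payoff concrete, here is the same builder as a standalone, runnable sketch (table and column names mirror the PR; the filter values are invented for the demo):

```typescript
// Standalone sketch of the query-builder approach, runnable outside the Durable Object.
function buildEventsQuery(
  sessionId: string,
  since: number | null,
  actions: string[] | null,
  tail: boolean,
  limit: number
): { query: string; params: (string | number)[] } {
  const conditions = ["session_id = ?"];
  const params: (string | number)[] = [sessionId];
  if (since !== null) {
    conditions.push("created_at >= ?");
    params.push(since);
  }
  if (actions !== null && actions.length > 0) {
    conditions.push(`action IN (${actions.map(() => "?").join(", ")})`);
    params.push(...actions);
  }
  const innerOrder = tail ? "DESC" : "ASC";
  const inner =
    `SELECT id, session_id, seq, action, content, metadata, created_at ` +
    `FROM cf_agents_events WHERE ${conditions.join(" AND ")} ` +
    `ORDER BY seq ${innerOrder} LIMIT ?`;
  params.push(limit);
  // tail=true reads the newest rows, then re-sorts them into chronological order
  const query = tail ? `SELECT * FROM (${inner}) sub ORDER BY seq ASC` : inner;
  return { query, params };
}

const { query, params } = buildEventsQuery("s1", 1700000000, ["USER_MESSAGE"], true, 50);
console.log(params); // -> [ 's1', 1700000000, 'USER_MESSAGE', 50 ]
```

Every since/actions/tail combination flows through the same path, so adding a new filter is one `conditions.push` rather than a doubling of branches.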

...base,
action: EventAction.TOOL_RESULT,
content: row.content ?? "",
toolCallId: meta.toolCallId as string,

The as casts on metadata fields (meta.toolCallId as string, meta.toolCalls as ToolCall[]) only satisfy the compiler — they perform no runtime check, so malformed metadata or a future schema migration would silently flow undefined into typed fields. Since only dehydrateEvent writes these fields, the risk is low in practice, but defensive defaults would be safer:

Suggested change
toolCallId: meta.toolCallId as string,
toolCallId: (meta.toolCallId as string) ?? "",

Similarly for toolName on line 66 and toolCalls on line 57.
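As an illustration of the defensive-default idea, a small runtime-checked accessor could replace the bare casts (a sketch — the getString helper is hypothetical, not part of the PR):

```typescript
// Sketch: runtime-checked metadata access instead of bare `as` casts.
// Field names (toolCallId, toolName) mirror the PR; the helper is hypothetical.
type RawMetadata = Record<string, unknown>;

function getString(meta: RawMetadata, key: string, fallback = ""): string {
  const value = meta[key];
  return typeof value === "string" ? value : fallback;
}

const meta: RawMetadata = JSON.parse('{"toolName":"search"}');
console.log(getString(meta, "toolName"));   // "search"
console.log(getString(meta, "toolCallId")); // "" (missing field, not undefined)
```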

if (msg.role === "user") {
return { ...base, action: EventAction.USER_MESSAGE, content: msg.content };
}
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check

Typo: eventhough should be even though.

Suggested change
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, even though it is a separate event, hence this extra check

/**
* @experimental
* Build a WorkingContext from a list of session events. This is an example,
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.

This comment says "This is an example" — it should describe the function's purpose, e.g.:

Suggested change
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.
* Pure function for building a WorkingContext from session events.
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.

"name": "agents-session-agent",
"main": "src/index.ts",
"compatibility_date": "2025-09-27",
"compatibility_flags": ["nodejs_compat"],

Should be 2026-01-28 per repo convention.

Suggested change
"compatibility_date": "2025-09-27",
"compatibility_date": "2026-01-28",

test: {
environment: "node",
clearMocks: true,
include: ["src/experimental/memory/__tests__/*.test.ts"]

This include path is relative to the working directory, not the config file location. It works when invoked via vitest run --config <path> from the packages/agents/ directory, but is fragile. Consider using a relative path from the config file:

Suggested change
include: ["src/experimental/memory/__tests__/*.test.ts"]
include: ["./*.test.ts"]

Or explicitly set root if vitest needs the package root:

root: "../../../..",
include: ["src/experimental/memory/__tests__/*.test.ts"]

}

// Public wrapper — _buildWorkingContext is protected,
// but we need it from our own helper methods (?)

The (?) in this comment reads like uncertainty. If this is the intended pattern for subclasses (wrapping _buildWorkingContext as a public method), document it clearly:

Suggested change
// but we need it from our own helper methods (?)
// Public wrapper — _buildWorkingContext is protected to prevent RPC misuse,
// but subclasses need to expose it for their own request handlers.

Alternatively, consider whether _buildWorkingContext should just be public on SessionAgent since every subclass will need to wrap it anyway. The RPC-safety concern is valid, but protected doesn't actually prevent RPC calls — it only restricts TypeScript-level access.
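To see why protected is only a compile-time guard, here is a minimal standalone demonstration (unrelated to the SDK's RPC machinery):

```typescript
// `protected` is erased at runtime: the compiler rejects direct access,
// but the emitted JavaScript method is still plainly reachable.
class Base {
  protected secret(): number {
    return 42;
  }
}

const b = new Base();
// b.secret() is a compile-time error, yet the call works once types are bypassed:
console.log((b as unknown as { secret(): number }).secret()); // 42
```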


ask-bonk bot commented Feb 20, 2026

Review submitted successfully. Here's a summary of my review:

Review Summary

I posted a comprehensive review on PR #791 with 2 blocking issues, 5 important items, and 8 inline comments.

Blocking

  1. package-lock.json is massively out of sync — The fork's lockfile has ~25,000 lines of changes downgrading nearly every dependency. Needs a rebase on latest main.
  2. Missing changeset — Required for the new agents/experimental/memory export path.

Important

  1. SessionAgent doesn't override destroy() — Session/event tables won't be cleaned up on teardown.
  2. appendEvents needs transaction wrapping — Batch INSERTs can leave partial data on failure.
  3. deleteSession needs transaction wrapping — Two DELETEs without atomicity can leave inconsistent state.
  4. Example doesn't follow repo conventions — Backend-only with curl; needs full-stack with Vite/React/Kumo per examples/AGENTS.md.
  5. Stale compatibility_date — 2025-09-27 should be 2026-01-28.

Minor (inline)

  • Query-builder suggestion for loadEvents to reduce 6 code paths to 1
  • Unsafe as casts in hydrateEvent metadata parsing
  • Typo fix (eventhough)
  • Misleading comment in context.ts
  • Fragile vitest config include path
  • Unclear comment on buildContext wrapper pattern

I also corrected the previous bonk review's incorrect claims about a "missing comma in build.ts" (already fixed) and "test scripts being rewritten" (only test:memory was added; existing scripts were untouched).


@mchockal mchockal marked this pull request as draft February 21, 2026 02:14