
Feature request: Add structured memory primitives for better context engineering#791

Draft
mchockal wants to merge 10 commits into cloudflare:main from mchockal:main

Conversation

@mchockal
Contributor

@mchockal mchockal commented Jan 20, 2026

feat(memory): Add Session, WorkingContext, and Processor Pipeline primitives

Summary

This PR introduces structured memory primitives for building context-aware agents, based on the tiered memory architecture principles from Google's ADK whitepaper, adapted to the Agents SDK.

Update 2/16: Added experimental session primitives to agents/experimental/memory — SQL-backed sessions, append-only event storage, ephemeral WorkingContext. This is a complete redesign of the original v0 memory primitives based on PR feedback, focusing just on the Session and Working Context first. A separate class SessionAgent is used to keep these changes purely experimental and independent of the Agent class.

Why?

Currently, agent developers must manually manage conversation history, handle context window limits, and implement their own compaction strategies. This leads to:

  • Duplicated effort across agent implementations
  • Inconsistent approaches to memory management
  • Tight coupling between conversation state and model-specific formats
  • Complex compaction logic scattered throughout application code

These primitives provide a model-agnostic foundation that separates the ground truth (Session) from the computed view (WorkingContext), enabling clean abstractions for context management and, later, compaction.


⚠️ Experimental — this module is unstable and may change without notice. All exports carry @experimental JSDoc tags.


What's Added

SessionAgent — Agent subclass with session/event SQL tables

Extends Agent and creates cf_agents_sessions and cf_agents_events tables in the constructor (matching the Agent pattern for cf_agents_state, cf_agents_queues, etc.).

RPC-safe methods (plain serializable objects, callable from Workflows/Workers):

  • createSession(metadata?) — Creates a session row, returns its ID
  • getSession(sessionId) — Returns StoredSession | null
  • listSessions() — Returns all sessions for this agent
  • deleteSession(sessionId) — Deletes a session and all of its events
  • loadEvents(sessionId, opts?) — Returns hydrated SessionEvent[], ordered by seq (default limit: 50)
  • appendEvents(sessionId, events) — Validates the session, assigns monotonic seq values, inserts rows
  • deleteEvents(sessionId, eventIds) — Deletes specific events by ID
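The monotonic-seq contract of appendEvents can be sketched as a pure helper. This is an illustrative stand-in, not the PR's actual implementation; the names SessionEvent and assignSeq are hypothetical here:

```typescript
// Hypothetical sketch of the monotonic-seq assignment described above.
interface SessionEvent {
  id: string;
  sessionId: string;
  seq: number;
  action: string;
  content: string | null;
}

// Given the highest seq already stored for a session, stamp each incoming
// event with the next monotonic value before insertion.
function assignSeq(events: SessionEvent[], maxStoredSeq: number): SessionEvent[] {
  let nextSeq = maxStoredSeq + 1;
  return events.map((event) => ({ ...event, seq: nextSeq++ }));
}

const stamped = assignSeq(
  [
    { id: "e1", sessionId: "s1", seq: 0, action: "user_message", content: "hi" },
    { id: "e2", sessionId: "s1", seq: 0, action: "agent_message", content: "hello" }
  ],
  41
);
console.log(stamped.map((e) => e.seq)); // [42, 43]
```

Because seq is assigned server-side from the stored maximum, callers never supply it themselves, which keeps ordering consistent across concurrent writers within one Durable Object.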

Local-only methods (class instances don't survive RPC serialization):

  • _buildWorkingContext(sessionId, opts?) — protected; loads events and builds a WorkingContext
  • persistWorkingContext(sessionId, ctx) — Extracts new messages, converts them to events, appends them

WorkingContext — Ephemeral in-memory context

Built per-request from session events. Accumulates messages during an agentic loop, tracks which are new via _initialCount. Thrown away after the request.

const ctx = this._buildWorkingContext(sessionId, {
  systemInstructions: ["You are a helpful assistant."],
  limit: 100,
});
ctx.addMessage({ role: "user", content: userMessage });
// ... LLM call, tool calls ...
ctx.addMessage({ role: "assistant", content: response });
this.persistWorkingContext(sessionId, ctx); // batch-persist new messages only

buildWorkingContext() — Pure function

For use outside the Agent (Workflows, Workers, tests). Takes SessionEvent[] from loadEvents() RPC and builds a WorkingContext locally.
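The core idea — persisted events in, ephemeral context out, with a watermark marking which messages are already stored — can be sketched like this. Types and names are simplified stand-ins for the PR's actual exports:

```typescript
// Illustrative sketch (not the PR's actual exports) of the pure
// buildWorkingContext idea.
interface ContextMessage {
  role: "user" | "assistant";
  content: string;
}

class WorkingContext {
  private messages: ContextMessage[];
  private readonly initialCount: number; // messages loaded from storage are "old"

  constructor(messages: ContextMessage[]) {
    this.messages = [...messages];
    this.initialCount = messages.length;
  }

  addMessage(msg: ContextMessage): void {
    this.messages.push(msg);
  }

  // Only messages added after construction need persisting.
  newMessages(): ContextMessage[] {
    return this.messages.slice(this.initialCount);
  }
}

function buildWorkingContext(
  events: { action: string; content: string }[]
): WorkingContext {
  return new WorkingContext(
    events.map((e) => ({
      role: e.action === "user_message" ? ("user" as const) : ("assistant" as const),
      content: e.content
    }))
  );
}

const ctx = buildWorkingContext([{ action: "user_message", content: "hi" }]);
ctx.addMessage({ role: "assistant", content: "hello" });
console.log(ctx.newMessages().length); // 1
```

The watermark is what makes persistWorkingContext a cheap batch append: it only writes the suffix of messages accumulated during the current request.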

workersAIAdapter — Model format adapter

Stateless adapter converting WorkingContext messages to Workers AI chat completions format. Handles system instructions, structured toolCalls, and tool results. Support for other frameworks will be added as new adapters.
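The adapter's core job can be sketched as below. This is a simplified illustration (tool calls and tool results omitted), and toWorkersAIInput is a hypothetical name, not the PR's export; the { messages: [{ role, content }] } shape is what Workers AI chat models accept via env.AI.run():

```typescript
// Simplified sketch of a stateless WorkingContext -> Workers AI adapter.
interface ContextMessage {
  role: "user" | "assistant";
  content: string;
}

// System instructions are prepended as role "system" messages, followed by
// the conversation messages in order.
function toWorkersAIInput(systemInstructions: string[], messages: ContextMessage[]) {
  return {
    messages: [
      ...systemInstructions.map((content) => ({ role: "system" as const, content })),
      ...messages.map(({ role, content }) => ({ role, content }))
    ]
  };
}

const input = toWorkersAIInput(["You are terse."], [{ role: "user", content: "hi" }]);
console.log(input.messages[0].role); // "system"
```

Keeping the adapter stateless means WorkingContext stays model-agnostic and a future OpenAI or Anthropic adapter is just another pure conversion function.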

Pure utility functions

  • hydrateEvent(row) — SQL row → typed SessionEvent
  • dehydrateEvent(event) — SessionEvent → SQL row
  • eventToMessage(event) — SessionEvent → ContextMessage
  • messageToEvent(sessionId, msg) — ContextMessage → SessionEvent
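The hydrate/dehydrate pair is a lossless roundtrip between typed events and flat SQL rows. The sketch below is illustrative: field names mirror the schema in the next section, but the types are simplified versions of the PR's, and the real functions also map action-specific metadata fields:

```typescript
// Minimal illustration of the hydrate/dehydrate roundtrip contract.
interface EventRow {
  id: string;
  session_id: string;
  seq: number;
  action: string;
  content: string | null;
  metadata: string | null; // JSON, or null when empty
  created_at: number;      // ms since epoch
}

interface SessionEvent {
  id: string;
  sessionId: string;
  seq: number;
  action: string;
  content: string | null;
  metadata: Record<string, unknown>;
  createdAt: number;
}

// SQL row -> typed event (snake_case -> camelCase, JSON parse).
function hydrateEvent(row: EventRow): SessionEvent {
  return {
    id: row.id,
    sessionId: row.session_id,
    seq: row.seq,
    action: row.action,
    content: row.content,
    metadata: row.metadata ? JSON.parse(row.metadata) : {},
    createdAt: row.created_at
  };
}

// Typed event -> SQL row (camelCase -> snake_case, JSON stringify).
function dehydrateEvent(event: SessionEvent): EventRow {
  return {
    id: event.id,
    session_id: event.sessionId,
    seq: event.seq,
    action: event.action,
    content: event.content,
    metadata: Object.keys(event.metadata).length ? JSON.stringify(event.metadata) : null,
    created_at: event.createdAt
  };
}
```

A property worth testing (and which the PR's utils.test.ts reportedly covers): hydrateEvent(dehydrateEvent(e)) should equal e for any valid event.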

SQL Schema

CREATE TABLE cf_agents_sessions (
  id TEXT PRIMARY KEY NOT NULL,
  agent_id TEXT NOT NULL,
  created_at INTEGER NOT NULL,  -- ms since epoch
  updated_at INTEGER NOT NULL,
  metadata TEXT                 -- JSON
);

CREATE TABLE cf_agents_events (
  id TEXT PRIMARY KEY NOT NULL,
  session_id TEXT NOT NULL,
  seq INTEGER NOT NULL,         -- monotonic per session
  action TEXT NOT NULL,
  content TEXT,
  metadata TEXT,                -- JSON
  created_at INTEGER NOT NULL
);

CREATE INDEX idx_events_session_seq ON cf_agents_events(session_id, seq);

POST /chat
  → buildContext(sessionId)           # load completed events → WorkingContext
  → ctx.addMessage(userMessage)       # in-memory only
  → LLM call + tool loop             # accumulate in-memory
  → persistWorkingContext(sessionId)  # batch-persist user + assistant atomically

Example: examples/session-agent/

A complete working example with:

  • HTTP endpoints: POST /chat, GET /session, DELETE /session
  • Agentic loop with tool calling (echo + dummy-weather tools)

Tests

✓ adapters.test.ts (8 tests)  — Workers AI format, tool calls, full agentic loop conversation
✓ context.test.ts  (11 tests) — WorkingContext construction, new message tracking, custom mappers
✓ utils.test.ts    (26 tests) — hydrate/dehydrate roundtrips, event↔message mapping, edge cases

Tests run via npm run test:memory (added to the CI test pipeline).


Known Limitations

  • Workers AI only — OpenAI/Anthropic adapters planned
  • No compaction/summarization — the event schema and current design support it, but the orchestration (processors) is not yet built
  • No token estimation — bring your own estimator. Two options: (a) a cheap approximation such as chars / 3: fast, no additional cost, and optimistic. Since the estimate only decides when to trigger compaction, ~70% of the model context window is still a good trigger threshold for v0; or (b) spend one inference call with output tokens capped at 1 to obtain the exact input-only token count.
  • Concurrent request divergence — concurrent LLM responses may be contextually divergent (each request sees only completed turns)
  • No Vercel AI SDK integration — only raw env.AI.run() via adapter
  • _buildWorkingContext is protected — subclasses need a public wrapper for helper functions
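The chars/3 estimate from the token-estimation bullet above can be sketched as follows. The divisor and the 70% threshold come from the bullet; the function names are illustrative, not SDK exports:

```typescript
// Cheap "bring your own estimator" sketch: characters / 3 as an optimistic
// token approximation, used only to decide when to trigger compaction.
function estimateTokens(messages: { content: string }[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 3);
}

// Trigger compaction once the estimate reaches ~70% of the model's context.
function shouldCompact(
  messages: { content: string }[],
  modelContextTokens: number
): boolean {
  return estimateTokens(messages) >= 0.7 * modelContextTokens;
}

console.log(shouldCompact([{ content: "x".repeat(3000) }], 1000)); // true (~1000 >= 700)
```

Because the estimate only gates compaction, an error of even 30-40% versus the real tokenizer just shifts when compaction fires, not whether the request succeeds.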

Files Changed / Added

packages/agents/src/experimental/memory/
├── index.ts              # Barrel exports + @experimental JSDoc
├── types.ts              # EventAction, SessionEvent, ContextMessage, interfaces
├── utils.ts              # hydrateEvent, dehydrateEvent, eventToMessage, messageToEvent
├── context.ts            # WorkingContext class + buildWorkingContext pure function
├── session-agent.ts      # SessionAgent extends Agent
├── adapters/
│   ├── index.ts          # Re-exports
│   └── workers-ai.ts     # Workers AI adapter
└── __tests__/
    ├── vitest.config.ts
    ├── utils.test.ts
    ├── context.test.ts
    └── adapters.test.ts

packages/agents/package.json         # exports + test:memory script
packages/agents/scripts/build.ts     # entry point for experimental/memory
docs/memory-primitives.md            # Full documentation
examples/session-agent/              # Working example with agentic loop

@changeset-bot

changeset-bot bot commented Jan 20, 2026

⚠️ No Changeset found

Latest commit: c840629

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pkg-pr-new

pkg-pr-new bot commented Jan 20, 2026

Open in StackBlitz

npm i https://pkg.pr.new/cloudflare/agents@791

commit: 8078063

Member

@deathbyknowledge deathbyknowledge left a comment


This is not a final review yet, but I'm struggling to see the bigger picture here; it seems quite opinionated and I'm not sure the benefits outweigh the negatives.

Could you update one or two of the existing examples (or make a new one) to use this approach and see how it feels?

/**
* Token limit processor - ensures context fits within token limits
*/
export const tokenLimitRequestProcessor: RequestProcessor<
Member


This relies on estimateTokenCount which just guesses tokens. Since there's quite some variation in tokenizers depending on the model at use, I'd prefer if we remove this one and let users provide their own.

Maybe we can add "character" truncation instead?

Contributor Author


True. Ideally the choice of processor implementation would be left to the user; that would be step 0. It could be something to include at a later stage, when context compaction to keep long conversations going becomes more common, and if there's a clear benefit to having it in the SDK at that point. The entire processors.ts can be part of a later PR.

*/
export class Session {
readonly metadata: SessionMetadata;
readonly events: Event[];
Member


Do we want to use the DO sqlite to handle storage here? Or perhaps storing events directly, kind of like an event store?

Contributor Author


Good call. sqlite storage would be better, and fits better with the existing pattern. A cf_agents_session table might be a better approach to store user's session. Primary key will be a session_id, and events can be stored in order for easy retrieval of last 'N' user-agent turns in a given session, which would serve as the working-context for the current request. What are your thoughts?

(Apologies about the late response. I missed the email notification thread for this set of comments, and only noticed earlier today )

Contributor


yeah do sqlite is fine. might need to work around 2mb row limitation, just consider that.

@threepointone
Contributor

I'll review further on wednesday

@threepointone
Contributor

Some thoughts: (that opus cleaned up for me)

Thanks for putting this together - the architecture works and the problem space (context management, compaction, model-agnostic event logging) is real. I think this is worth shipping as an experimental feature with some adjustments.

Overall Take

The Session/WorkingContext/ProcessorPipeline split is reasonable. For users building long-running conversations, multi-agent systems, or agents that need audit trails, this provides useful primitives that don't exist in the SDK today.

That said, most agents-sdk users are building simpler things and won't need this level of abstraction yet. So I'd recommend shipping this as explicitly experimental - opt-in for power users who want to explore these patterns. Maybe under agents/experimental/sessions?

Before Merging

1. Add clear experimental labeling

The exports and docs should make it obvious this is experimental and the API may change:

  • Consider exporting from agents/experimental/sessions and adding Experimental prefix to class names
  • Add @experimental JSDoc tags
  • Docs should lead with a warning about API instability

2. Add at least one complete example

The PR description has good code snippets, but there's no end-to-end example showing how to wire this into an actual agent. Something in examples/ that demonstrates:

  • Creating a Session, adding events
  • Running the processor pipeline
  • Calling an LLM with the resulting WorkingContext
  • (Bonus) Triggering compaction after N turns

Without this, developers won't know how to connect the pieces.

3. Document known limitations

Be explicit about what this doesn't do (yet):

  • Only supports Workers AI format (OpenAI/Anthropic adapters planned)
  • Compaction requires you to provide your own summarizer
  • No integration with Vercel AI SDK (yet)
  • Token estimation is approximate (~chars/4); we should probably fix this before landing, tbh
  • Will probably break as we iterate on it (schemas, apis, what not)

Code-Level Items

A few things to clean up:

Token counting — The chars / 4 estimation can be off significantly. Either document this limitation clearly or consider making the estimator pluggable.

truncateToFit algorithm — The current implementation breaks on the first content that doesn't fit, which may not be optimal. Consider whether this matters for the use cases you're targeting.

readonly modifiers — events, statistics, and compactionConfig are marked readonly but then mutated. This is confusing: either use actual immutability or drop the readonly keyword.

Event ID generation — Date.now() + random suffix could theoretically collide under high throughput. Consider crypto.randomUUID() for guaranteed uniqueness.

Tests — Would be good to have basic coverage for:

  • Session serialization/deserialization roundtrip
  • Processor pipeline ordering
  • Compaction filtering logic

Summary

Ship it as experimental with:

  • Clear experimental labeling in exports and docs
  • At least one working example in examples/
  • Known limitations documented
  • Basic test coverage
  • Fix the readonly confusion

We'll learn more from real usage than from more design iteration. Just make sure the "experimental" signal is loud enough that people don't accidentally depend on API stability.

@mchockal
Contributor Author

mchockal commented Feb 9, 2026

Thanks for the detailed feedback! I'll update the PR with the changes over the next 2-3 days and have it ready for review by Friday at the latest.

@whoiskatrin
Contributor

/bonk review latest changes here

Contributor

@ask-bonk ask-bonk bot left a comment


Review: Experimental Memory Primitives

The overall architecture (Session / WorkingContext / Adapters split) is well-thought-out and the implementation is clean. The separation between RPC-safe methods and local-only WorkingContext helpers is a good design choice. The test coverage for the pure utility functions is solid.

However, there are two blocking issues (a syntax error in build.ts that will break the build, and the test scripts being rewritten in a way that breaks the existing vitest workspace setup) plus several smaller items that should be addressed before merging.

Blocking

  1. build.ts syntax error — Missing comma between entry points will fail the build for the entire agents package.
  2. Test scripts rewrite breaks existing setup — The PR replaces the vitest --project workspace-based test commands with -r directory-based commands and drops test:x402 entirely. The repo has a vitest.config.ts at the package root that defines projects; the new scripts bypass it.

Important

  1. destroy() does not clean up session tables — The base Agent.destroy() drops cf_agents_state, cf_agents_queues, etc. but doesn't know about cf_agents_sessions or cf_agents_events. SessionAgent should override destroy() to drop its tables first, then call super.destroy().
  2. appendEvents is not atomic — Individual INSERT statements in a loop without an explicit transaction mean a failure mid-batch leaves partial data. The base Agent uses this.sql for single statements, but batch operations should be wrapped in a transaction.
  3. Missing changeset — The bot flagged this. Changes to packages/agents/package.json exports need a changeset.
  4. Example does not follow repo conventions — Per examples/AGENTS.md, examples should be full-stack (frontend + backend), use Vite with @cloudflare/vite-plugin, include vite.config.ts, index.html, Kumo UI, etc. The current example is backend-only with curl-based usage.
  5. compatibility_date is stale — wrangler.jsonc uses 2025-09-27; the repo standard is 2026-01-28.

Minor / Suggestions

  1. loadEvents query construction — The branching logic for since/actions/tail combinations produces 6 code paths. A query-builder approach or parameterized construction would be more maintainable.
  2. Unsafe type assertions in hydrateEvent — The metadata JSON is parsed and fields are cast with as without validation. Malformed metadata would produce runtime errors downstream.
  3. onStart in example uses sync methods — listSessions() and createSession() are synchronous (SQLite), but onStart is async. This is fine but potentially confusing to readers.
  4. Test config include path is relative to repo root — The vitest config for memory tests uses include: ["src/experimental/memory/__tests__/*.test.ts"] which works only when vitest is invoked from the package root, not the config file's directory. This may cause issues depending on how vitest resolves the path.

"src/mcp/x402.ts",
"src/observability/index.ts",
"src/codemode/ai.ts",
"src/experimental/memory/index.ts"
Contributor


Build-breaking bug: Missing comma between array elements. This will cause a syntax error when running npm run build.

Suggested change
"src/experimental/memory/index.ts"
"src/experimental/memory/index.ts",

Comment on lines 196 to 202
"test": "npm run test:workers && npm run test:react && npm run test:cli && npm run test:memory",
"test:cli": "vitest -r src/cli-tests",
"test:react": "vitest -r src/react-tests",
"test:e2e": "vitest run src/e2e/e2e.test.ts --sequence.concurrent",
"test:memory": "vitest run --config src/experimental/memory/__tests__/vitest.config.ts",
"test:workers": "vitest -r src/tests"
}
Contributor


Breaking change to existing test infrastructure. The original scripts use vitest --project <name> which references the workspace config in vitest.config.ts (defines projects: src/tests/vitest.config.ts, src/react-tests/vitest.config.ts, src/cli-tests/vitest.config.ts, src/x402-tests/vitest.config.ts).

This PR:

  • Replaces --project with -r (changes how vitest resolves configs)
  • Drops test:x402 entirely
  • Changes test from vitest --run (which uses the workspace) to a sequential chain

Please keep the existing scripts intact and only add the new test:memory script. Then add the memory vitest config to the projects array in the root vitest.config.ts:

Suggested change
"test": "npm run test:workers && npm run test:react && npm run test:cli && npm run test:memory",
"test:cli": "vitest -r src/cli-tests",
"test:react": "vitest -r src/react-tests",
"test:e2e": "vitest run src/e2e/e2e.test.ts --sequence.concurrent",
"test:memory": "vitest run --config src/experimental/memory/__tests__/vitest.config.ts",
"test:workers": "vitest -r src/tests"
}
"test": "vitest --run",
"test:cli": "vitest --project cli",
"test:react": "vitest --project react",
"test:e2e": "vitest run src/e2e/e2e.test.ts --sequence.concurrent",
"test:memory": "vitest run --config src/experimental/memory/__tests__/vitest.config.ts",
"test:workers": "vitest --project workers",
"test:x402": "vitest --project x402"

Alternatively, add the memory tests as another project in the workspace config so vitest --run picks them up automatically.

`;

nextSeq++;
}
Contributor


The loop of individual INSERT statements is not wrapped in a transaction. If the Durable Object fails mid-batch (e.g., on the 3rd of 5 events), you'll have partial data with a gap in seq values. Consider wrapping the batch in a transaction:

this.ctx.storage.sql.exec("BEGIN");
try {
  for (const event of events) { ... }
  this.ctx.storage.sql.exec("COMMIT");
} catch (e) {
  this.ctx.storage.sql.exec("ROLLBACK");
  throw e;
}

Note: The base Agent's single-row operations don't need this, but batch inserts do.

const events = newMessages.map((msg) => messageToEvent(sessionId, msg));
this.appendEvents(sessionId, events);
}
}
Contributor


SessionAgent creates cf_agents_sessions and cf_agents_events tables, but the base Agent.destroy() method only drops the tables it knows about (cf_agents_state, cf_agents_queues, cf_agents_schedules, cf_agents_mcp_servers, cf_agents_workflows).

Please override destroy() to clean up the session tables:

async destroy() {
  this.sql`DROP TABLE IF EXISTS cf_agents_events`;
  this.sql`DROP TABLE IF EXISTS cf_agents_sessions`;
  await super.destroy();
}


const meta: Record<string, unknown> = row.metadata
? JSON.parse(row.metadata)
: {};
Contributor


The metadata JSON is parsed with JSON.parse and then fields are accessed with bare as casts (e.g., meta.toolCallId as string, meta.toolCalls as ToolCall[]). If the stored metadata is malformed or from a schema migration, this will silently produce undefined values that propagate as runtime errors later.

Consider at minimum adding fallback checks, or documenting that metadata is trusted (since only dehydrateEvent writes it). For toolCallId, the downstream code assumes it's a non-empty string, so an explicit check would be safer:

toolCallId: (meta.toolCallId as string) ?? "",
toolName: (meta.toolName as string) ?? "",

Comment on lines 170 to 231
if (since !== null && actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND created_at >= ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, since, ...actions, limit)
] as unknown as StoredEvent[];
} else if (since !== null) {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq ASC LIMIT ${limit}
`;
}
} else if (actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, ...actions, limit)
] as unknown as StoredEvent[];
} else {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq ASC LIMIT ${limit}
`;
}
}

Contributor


The loadEvents method has 6 distinct code paths for the combinations of since, actions, and tail. This is hard to maintain and test — adding a new filter option would double the branches.

Consider a query-builder approach that conditionally appends WHERE clauses:

let conditions = ["session_id = ?"];
let params: unknown[] = [sessionId];

if (since !== null) {
  conditions.push("created_at >= ?");
  params.push(since);
}
if (actions !== null && actions.length > 0) {
  conditions.push(`action IN (${actions.map(() => "?").join(", ")})`);
  params.push(...actions);
}

const where = conditions.join(" AND ");
const inner = `SELECT ... FROM cf_agents_events WHERE ${where} ORDER BY seq ${innerOrder} LIMIT ?`;
params.push(limit);

const query = tail ? `SELECT * FROM (${inner}) sub ORDER BY seq ASC` : inner;
rows = [...this.ctx.storage.sql.exec(query, ...params)] as unknown as StoredEvent[];

This eliminates the combinatorial branching entirely.

"name": "agents-session-agent",
"main": "src/index.ts",
"compatibility_date": "2025-09-27",
"compatibility_flags": ["nodejs_compat"],
Contributor


The compatibility_date should be "2026-01-28" per the repo convention (see root AGENTS.md).

Suggested change
"compatibility_flags": ["nodejs_compat"],
"compatibility_date": "2026-01-28",

/**
* @experimental
* Build a WorkingContext from a list of session events. This is an example,
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.
Contributor


Nit: This comment says "This is an example" — should probably say something like "Pure function — keep outside the Agent class to allow use from Workflows/Workers/tests."

test: {
environment: "node",
clearMocks: true,
include: ["src/experimental/memory/__tests__/*.test.ts"]
Contributor


This include path is relative to the repo root, not to the config file's directory. This works when running vitest run --config src/experimental/memory/__tests__/vitest.config.ts from the packages/agents/ directory, but may be fragile.

Consider using a path relative to the config file, or use root in the config to make it explicit:

export default defineConfig({
  test: {
    environment: "node",
    clearMocks: true,
    root: "../../../..",
    include: ["src/experimental/memory/__tests__/*.test.ts"]
  }
});

Or simply:

include: ["./*.test.ts"]

if vitest resolves relative to the config file location (check vitest docs for your version).

if (msg.role === "user") {
return { ...base, action: EventAction.USER_MESSAGE, content: msg.content };
}
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check
Contributor


Nit: typo in comment — eventhougheven though.

Suggested change
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, even though it is a separate event, hence this extra check

@ask-bonk
Contributor

ask-bonk bot commented Feb 19, 2026

Review submitted successfully. Here's a summary of the feedback:

Blocking issues (2)

  1. build.ts syntax error — Missing comma on line 18 between "src/experimental/memory/index.ts" and "src/experimental/forever.ts" will break the entire package build.
  2. Test scripts rewrite — The PR replaces the vitest workspace-based --project commands with -r directory commands, dropping test:x402 and bypassing the existing vitest.config.ts workspace setup. Only the new test:memory script should be added.

Important issues (5)

  1. SessionAgent doesn't override destroy() — The new cf_agents_sessions and cf_agents_events tables won't be cleaned up when destroy() is called.
  2. appendEvents lacks transaction wrapping — The batch INSERT loop can leave partial data on failure.
  3. Missing changeset — Required for packages/ changes that affect exports.
  4. Example doesn't follow repo conventions — Should be full-stack with Vite, not backend-only with curl.
  5. Stale compatibility_date — Should be 2026-01-28 per repo standard.

Minor items (4)

  1. Query builder suggestion for loadEvents to reduce branching
  2. Unsafe as casts in hydrateEvent metadata parsing
  3. Typo fix (eventhougheven though)
  4. Vitest config include path may be fragile

github run

@deathbyknowledge
Member

/bonk review again please

Contributor

@ask-bonk ask-bonk bot left a comment


Review: Experimental Memory Primitives (v2)

The architecture is well-designed. The Session/WorkingContext/Adapter split is clean, the separation between RPC-safe methods and local-only helpers is thoughtful, and the pure utility functions are well-tested. The code addresses the previous round of feedback (from threepointone and deathbyknowledge) effectively — the ProcessorPipeline and token estimation were dropped, SQL-backed storage was adopted, and the module is properly labeled as experimental.

I have two blocking issues, several important items, and a few minor suggestions.


Blocking

  1. package-lock.json is massively out of sync — The fork's lockfile has ~25,000 lines of diff against main, downgrading nearly every dependency (e.g., @ai-sdk/anthropic from ^3.0.45 to ^3.0.8, @cloudflare/workers-types from ^4.20260219.0 to ^4.20260103.0, adding @biomejs/biome, removing @cloudflare/kumo, oxfmt, oxlint, etc.). This PR should not modify package-lock.json beyond what's needed for the new examples/session-agent workspace. Please rebase on the latest main and run npm install to regenerate.

  2. Missing changeset — This PR adds a new export path (agents/experimental/memory) to packages/agents/package.json. Per repo conventions, changes to packages/ that affect the public API need a changeset. Run npx changeset and describe the new experimental memory module.

Important

  1. SessionAgent should override destroy() — The base Agent.destroy() drops the five tables it creates (cf_agents_state, cf_agents_queues, cf_agents_schedules, cf_agents_mcp_servers, cf_agents_workflows) and then calls deleteAll(). However, it doesn't know about cf_agents_sessions or cf_agents_events. While deleteAll() clears storage, the explicit table drops are part of the pattern for clean teardown and should be followed consistently.

  2. appendEvents should use a transaction — The loop of individual INSERT statements at lines 254-265 can leave partial data if the DO fails mid-batch. A single failed insert after 3 of 5 events would leave orphaned rows with gaps in seq values. The base Agent uses individual this.sql statements but only for single-row operations. Batch inserts should be wrapped in a transaction.

  3. deleteSession should also use a transaction — Lines 138-139 execute two separate DELETE statements. If the first succeeds but the second fails, you'd have a session row with no events, but the session itself would still exist (or vice versa).

  4. Example doesn't follow repo conventions — Per examples/AGENTS.md, all examples must be full-stack (frontend + backend) with Vite, @cloudflare/vite-plugin, index.html, React client, Kumo UI, etc. The current example is backend-only with curl-based usage. It's missing vite.config.ts, index.html, src/client.tsx, and the required shared components (PoweredByAgents, ConnectionIndicator, ModeToggle, ThemeProvider). I understand this is extra work, but consistency matters for user-facing learning material.

  5. compatibility_date is stale — The example's wrangler.jsonc uses 2025-09-27; the repo standard is 2026-01-28.

Minor

  1. See inline comments below.

const events = newMessages.map((msg) => messageToEvent(sessionId, msg));
this.appendEvents(sessionId, events);
}
}

SessionAgent creates cf_agents_sessions and cf_agents_events tables, but the base Agent.destroy() only drops the tables it knows about. Override destroy() to clean up:

Suggested change
}
async destroy() {
this.sql`DROP TABLE IF EXISTS cf_agents_events`;
this.sql`DROP TABLE IF EXISTS cf_agents_sessions`;
await super.destroy();
}
}

Comment on lines 254 to 265
for (const event of events) {
// Override seq with the correct monotonic value
const withSeq: SessionEvent = { ...event, seq: nextSeq, sessionId };
const row = dehydrateEvent(withSeq);

this.sql`
INSERT INTO cf_agents_events (id, session_id, seq, action, content, metadata, created_at)
VALUES (${row.id}, ${row.session_id}, ${row.seq}, ${row.action}, ${row.content}, ${row.metadata}, ${row.created_at})
`;

nextSeq++;
}

This loop of individual INSERTs is not atomic. If the DO fails mid-batch, you'll have partial data with gaps in seq values. Wrap in a transaction:

this.ctx.storage.transactionSync(() => {
  for (const event of events) {
    const withSeq: SessionEvent = { ...event, seq: nextSeq, sessionId };
    const row = dehydrateEvent(withSeq);
    this.sql`
      INSERT INTO cf_agents_events (id, session_id, seq, action, content, metadata, created_at)
      VALUES (${row.id}, ${row.session_id}, ${row.seq}, ${row.action}, ${row.content}, ${row.metadata}, ${row.created_at})
    `;
    nextSeq++;
  }
});

Note: Durable Object SQLite does not accept explicit BEGIN/COMMIT/ROLLBACK statements through exec() — ctx.storage.transactionSync() is the supported way to run a synchronous transaction, and it rolls back automatically if the callback throws.

Comment on lines 137 to 140
deleteSession(sessionId: string): void {
this.sql`DELETE FROM cf_agents_events WHERE session_id = ${sessionId}`;
this.sql`DELETE FROM cf_agents_sessions WHERE id = ${sessionId}`;
}

Same atomicity concern here — if the first DELETE succeeds but the second fails (or vice versa), you'll have inconsistent state. Consider wrapping in a transaction:

Suggested change
deleteSession(sessionId: string): void {
this.sql`DELETE FROM cf_agents_events WHERE session_id = ${sessionId}`;
this.sql`DELETE FROM cf_agents_sessions WHERE id = ${sessionId}`;
}
deleteSession(sessionId: string): void {
  this.ctx.storage.transactionSync(() => {
    this.sql`DELETE FROM cf_agents_events WHERE session_id = ${sessionId}`;
    this.sql`DELETE FROM cf_agents_sessions WHERE id = ${sessionId}`;
  });
}

Comment on lines 170 to 231
if (since !== null && actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND created_at >= ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, since, ...actions, limit)
] as unknown as StoredEvent[];
} else if (since !== null) {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId} AND created_at >= ${since}
ORDER BY seq ASC LIMIT ${limit}
`;
}
} else if (actions !== null && actions.length > 0) {
const placeholders = actions.map(() => "?").join(", ");
const inner = `SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ? AND action IN (${placeholders})
ORDER BY seq ${innerOrder} LIMIT ?`;
const query = tail
? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
: inner;
rows = [
...this.ctx.storage.sql.exec(query, sessionId, ...actions, limit)
] as unknown as StoredEvent[];
} else {
if (tail) {
rows = this.sql<StoredEvent>`
SELECT * FROM (
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq DESC LIMIT ${limit}
) sub ORDER BY seq ASC
`;
} else {
rows = this.sql<StoredEvent>`
SELECT id, session_id, seq, action, content, metadata, created_at
FROM cf_agents_events
WHERE session_id = ${sessionId}
ORDER BY seq ASC LIMIT ${limit}
`;
}
}


The combinatorial branching for since/actions/tail produces 6 code paths. This is hard to maintain and will double with each new filter. Consider a query-builder approach:

const conditions = ["session_id = ?"];
const params: (string | number)[] = [sessionId];

if (since !== null) {
  conditions.push("created_at >= ?");
  params.push(since);
}
if (actions !== null && actions.length > 0) {
  conditions.push(`action IN (${actions.map(() => "?").join(", ")})`);
  params.push(...actions);
}

const where = conditions.join(" AND ");
const cols = "id, session_id, seq, action, content, metadata, created_at";
const inner = `SELECT ${cols} FROM cf_agents_events WHERE ${where} ORDER BY seq ${innerOrder} LIMIT ?`;
params.push(limit);

const query = tail
  ? `SELECT * FROM (${inner}) sub ORDER BY seq ASC`
  : inner;
rows = [...this.ctx.storage.sql.exec(query, ...params)] as unknown as StoredEvent[];

This eliminates the branching entirely and is easier to extend.
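To make the payoff concrete, here is the same builder as a standalone, runnable sketch (table and column names mirror the PR; the filter values are invented for the demo):

```typescript
// Standalone sketch of the query-builder approach, runnable outside the Durable Object.
function buildEventsQuery(
  sessionId: string,
  since: number | null,
  actions: string[] | null,
  tail: boolean,
  limit: number
): { query: string; params: (string | number)[] } {
  const conditions = ["session_id = ?"];
  const params: (string | number)[] = [sessionId];
  if (since !== null) {
    conditions.push("created_at >= ?");
    params.push(since);
  }
  if (actions !== null && actions.length > 0) {
    conditions.push(`action IN (${actions.map(() => "?").join(", ")})`);
    params.push(...actions);
  }
  const innerOrder = tail ? "DESC" : "ASC";
  const inner =
    `SELECT id, session_id, seq, action, content, metadata, created_at ` +
    `FROM cf_agents_events WHERE ${conditions.join(" AND ")} ` +
    `ORDER BY seq ${innerOrder} LIMIT ?`;
  params.push(limit);
  // tail=true reads the newest rows, then re-sorts them into chronological order
  const query = tail ? `SELECT * FROM (${inner}) sub ORDER BY seq ASC` : inner;
  return { query, params };
}

const { query, params } = buildEventsQuery("s1", 1700000000, ["USER_MESSAGE"], true, 50);
console.log(params); // -> [ 's1', 1700000000, 'USER_MESSAGE', 50 ]
```

Every since/actions/tail combination flows through the same path, so adding a new filter is one `conditions.push` rather than a doubling of branches.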

...base,
action: EventAction.TOOL_RESULT,
content: row.content ?? "",
toolCallId: meta.toolCallId as string,

The as casts on metadata fields (meta.toolCallId as string, meta.toolCalls as ToolCall[]) only satisfy the compiler — they perform no runtime check, so malformed metadata or a future schema migration would silently flow undefined into typed fields. Since only dehydrateEvent writes these fields, the risk is low in practice, but defensive defaults would be safer:

Suggested change
toolCallId: meta.toolCallId as string,
toolCallId: (meta.toolCallId as string) ?? "",

Similarly for toolName on line 66 and toolCalls on line 57.
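As an illustration of the defensive-default idea, a small runtime-checked accessor could replace the bare casts (a sketch — the getString helper is hypothetical, not part of the PR):

```typescript
// Sketch: runtime-checked metadata access instead of bare `as` casts.
// Field names (toolCallId, toolName) mirror the PR; the helper is hypothetical.
type RawMetadata = Record<string, unknown>;

function getString(meta: RawMetadata, key: string, fallback = ""): string {
  const value = meta[key];
  return typeof value === "string" ? value : fallback;
}

const meta: RawMetadata = JSON.parse('{"toolName":"search"}');
console.log(getString(meta, "toolName"));   // "search"
console.log(getString(meta, "toolCallId")); // "" (missing field, not undefined)
```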

if (msg.role === "user") {
return { ...base, action: EventAction.USER_MESSAGE, content: msg.content };
}
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check

Typo: eventhough should be even though.

Suggested change
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, eventhough it is a separate event, hence this extra check
// TOOL_CALL_REQUEST are also stored as 'assistant' messages, even though it is a separate event, hence this extra check

/**
* @experimental
* Build a WorkingContext from a list of session events. This is an example,
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.

This comment says "This is an example" — it should describe the function's purpose, e.g.:

Suggested change
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.
* Pure function for building a WorkingContext from session events.
* Keep this outside of the Agent class to allow handling multiple independent threads in parallel.

"name": "agents-session-agent",
"main": "src/index.ts",
"compatibility_date": "2025-09-27",
"compatibility_flags": ["nodejs_compat"],

Should be 2026-01-28 per repo convention.

Suggested change
"compatibility_date": "2025-09-27",
"compatibility_date": "2026-01-28",

test: {
environment: "node",
clearMocks: true,
include: ["src/experimental/memory/__tests__/*.test.ts"]

This include path is relative to the working directory, not the config file location. It works when invoked via vitest run --config <path> from the packages/agents/ directory, but is fragile. Consider using a relative path from the config file:

Suggested change
include: ["src/experimental/memory/__tests__/*.test.ts"]
include: ["./*.test.ts"]

Or explicitly set root if vitest needs the package root:

root: "../../../..",
include: ["src/experimental/memory/__tests__/*.test.ts"]

}

// Public wrapper — _buildWorkingContext is protected,
// but we need it from our own helper methods (?)

The (?) in this comment reads like uncertainty. If this is the intended pattern for subclasses (wrapping _buildWorkingContext as a public method), document it clearly:

Suggested change
// but we need it from our own helper methods (?)
// Public wrapper — _buildWorkingContext is protected to prevent RPC misuse,
// but subclasses need to expose it for their own request handlers.

Alternatively, consider whether _buildWorkingContext should just be public on SessionAgent since every subclass will need to wrap it anyway. The RPC-safety concern is valid, but protected doesn't actually prevent RPC calls — it only restricts TypeScript-level access.
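To see why protected is only a compile-time guard, here is a minimal standalone demonstration (unrelated to the SDK's RPC machinery):

```typescript
// `protected` is erased at runtime: the compiler rejects direct access,
// but the emitted JavaScript method is still plainly reachable.
class Base {
  protected secret(): number {
    return 42;
  }
}

const b = new Base();
// b.secret() is a compile-time error, yet the call works once types are bypassed:
console.log((b as unknown as { secret(): number }).secret()); // 42
```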


ask-bonk bot commented Feb 20, 2026

Review submitted successfully. Here's a summary of my review:

Review Summary

I posted a comprehensive review on PR #791 with 2 blocking issues, 5 important items, and 8 inline comments.

Blocking

  1. package-lock.json is massively out of sync — The fork's lockfile has ~25,000 lines of changes downgrading nearly every dependency. Needs a rebase on latest main.
  2. Missing changeset — Required for the new agents/experimental/memory export path.

Important

  1. SessionAgent doesn't override destroy() — Session/event tables won't be cleaned up on teardown.
  2. appendEvents needs transaction wrapping — Batch INSERTs can leave partial data on failure.
  3. deleteSession needs transaction wrapping — Two DELETEs without atomicity can leave inconsistent state.
  4. Example doesn't follow repo conventions — Backend-only with curl; needs full-stack with Vite/React/Kumo per examples/AGENTS.md.
  5. Stale compatibility_date — 2025-09-27 should be 2026-01-28.

Minor (inline)

  • Query-builder suggestion for loadEvents to reduce 6 code paths to 1
  • Unsafe as casts in hydrateEvent metadata parsing
  • Typo fix (eventhough)
  • Misleading comment in context.ts
  • Fragile vitest config include path
  • Unclear comment on buildContext wrapper pattern

I also corrected the previous bonk review's incorrect claims about a "missing comma in build.ts" (already fixed) and "test scripts being rewritten" (only test:memory was added; existing scripts were untouched).


@mchockal mchockal marked this pull request as draft February 21, 2026 02:14