Skip to content

WIP: Context policy and out-of-band message design#272

Draft
esafwan wants to merge 16 commits into
developfrom
wip/context-policy-out-of-band-messages
Draft

WIP: Context policy and out-of-band message design#272
esafwan wants to merge 16 commits into
developfrom
wip/context-policy-out-of-band-messages

Conversation

@esafwan
Copy link
Copy Markdown
Contributor

@esafwan esafwan commented May 25, 2026

Summary

Adds a WIP design spec for a HUF context-policy layer and out-of-band/reference-only message pattern.

This is intended to solve cases where large runtime payloads, such as search visible result snapshots, retrieved documents, large tool outputs, browser state, or debug traces, are useful for audit/replay but should not be blindly persisted and replayed as normal LLM conversation history.

What is included

  • Defines the problem: large retrieval/runtime payloads getting saved into Agent Message and replayed in later turns.
  • Proposes the design principle: persist for audit, inject only what is intentionally useful for the next model decision.
  • Compares the pattern to OpenAI Conversations/Responses, ChatGPT Apps/MCP-style metadata separation, Anthropic/Gemini caching, and LangGraph-style state/artifact separation.
  • Proposes future HUF fields such as record_kind, visibility, context_policy, context_summary, and references.
  • Covers result visible context as the first concrete use case.
  • Stages implementation into phases:
    • Phase 1: safe history filtering
    • Phase 2: first-class out-of-band/reference messages
    • Phase 3: Agent Context Artifact
    • Phase 4: central context assembler
    • Phase 5: provider-aware optimization
    • Phase 6: memory/learning integration

Notes

This PR is intentionally documentation-only and WIP. It should be used to discuss the direction before backend/runtime changes are implemented.

esafwan and others added 16 commits May 26, 2026 01:37
…messages

Detailed, code-grounded plan to prevent large tool/result payloads from being
replayed as conversation history. Includes:
- Data model changes (7 new fields on Agent Message)
- History loader refactor (single policy-aware entry point)
- Tool result persistence updates (sync + stream paths)
- Complete test coverage for acceptance criteria
- Risk analysis and rollout strategy

Estimated implementation time: 2.5 hours for Phase 1.

https://claude.ai/code/session_01JYBGo1Ap9ybbsyPLUEgxNa
Adds context_policy field to Agent Message to control how persisted
content re-enters model context on future turns. Large tool outputs
now stored as compact reference lines instead of being replayed
verbatim, keeping input tokens O(1) per result rather than O(payload).

Changes:
- agent_message.json: 7 new nullable fields (context_policy, record_kind,
  context_summary, reference_doctype, reference_name, visibility,
  token_estimate). NULL policy = include_full (backward compatible).
- conversation_manager.py: get_conversation_history() now filters by
  policy via _message_to_context(); add_message() accepts all new fields.
- agent_integration.py: sync tool-result persistence uses include_reference
  for payloads >2000 chars, include_full otherwise.
- tests/test_context_policy.py: 7 acceptance tests covering all policies,
  backward compat, and token growth bound.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Updated get_conversation_history to query tool_call and kind fields.
- Updated _message_to_context to process include_reference policies.
- Implemented logic to dynamically split single database Tool Result rows into strict assistant and tool API objects during history replay.
- Injected a global architectural rule into the system prompt compiler.
- Instructed all agents on how to parse and fetch reference handles using the get_result_context tool.
- Created handle_get_result_context native function to retrieve large truncated payloads from the database.
- Registered the get_result_context tool globally so all agents can fetch data regardless of memory settings.
- Programmatically generated and synced the Agent Context Artifact DocType schema.
- Established foundational database storage layer for future massive non-tool payloads (e.g., file attachments).
- Added @frappe.whitelist() to add_message in agent_chat.py.
- Enabled out-of-band context injection for frontend applications and external integrations.
- Added max_context_chars Integer field directly to the Agent DocType schema.
- Enabled precise, per-bot configuration for tool result truncation limits.
- Modified run_stream loop to automatically flag tool results > 2000 chars as references.
- Applied context policy fields directly onto the mutated message row to preserve existing UI rendering.
- Removed hardcoded 2000 character magic number in tool result length checks.
- Refactored truncation logic to dynamically fetch 'max_context_chars' from the executing Agent object.
- Preserved a robust fallback to 2000 if the agent setting is missing or empty.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants