Python: adjust checkpoint encoding handling #6579

Open
moonbox3 wants to merge 3 commits into microsoft:main from moonbox3:checkpoint-encoding-adjustments

Conversation

@moonbox3

Contributor

Motivation & Context

Checkpoint persistence should preserve ordinary workflow state values while keeping internal encoding metadata clearly separated from user data. This change refines checkpoint encoding behavior for dictionaries that contain reserved metadata-like keys and tightens reconstruction behavior without changing the public checkpoint storage API.

Description & Review Guide

  • What are the major changes?
    • Encode dictionaries containing checkpoint-reserved keys through the existing pickle envelope so they round-trip without introducing a new JSON wrapper shape.
    • Limit automatic framework/OpenAI reconstruction to concrete classes while preserving explicit allowed_checkpoint_types behavior.
    • Add regression coverage for reserved-key dictionaries, FileCheckpointStorage round-trips, and restricted decode behavior.
  • What is the impact of these changes?
    • Existing checkpoints continue to restore through the current decode path.
    • Newly saved checkpoint state with reserved-key dictionaries round-trips as user data.
    • Applications can still opt in to additional checkpoint types with allowed_checkpoint_types.
  • What do you want reviewers to focus on?
    • Compatibility of reserved-key dictionary handling and the restricted decode behavior for framework/OpenAI classes.
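The reserved-key handling described above can be sketched as follows. This is a minimal illustration, not the code in `_checkpoint_encoding.py`: the marker names `__pickled__` and `__type__` come from the review overview, but the envelope layout, the `module:qualname` type string, and the helper names (`to_envelope`, `is_envelope`, `encode`, `decode`, `round_trip`) are assumptions made for the example. It also assumes string dictionary keys and uses plain `pickle.loads`, which the real decode path replaces with a restricted unpickler.

```python
import base64
import json
import pickle
from typing import Any

# Marker key names come from the PR overview; the envelope layout is an assumption.
PICKLE_MARKER = "__pickled__"
TYPE_MARKER = "__type__"
RESERVED_KEYS = frozenset({PICKLE_MARKER, TYPE_MARKER})
JSON_PRIMITIVES = (str, int, float, bool, type(None))


def to_envelope(value: Any) -> dict[str, str]:
    """Wrap any picklable value in a JSON-safe pickle+base64 envelope."""
    payload = base64.b64encode(pickle.dumps(value)).decode("ascii")
    kind = type(value)
    return {PICKLE_MARKER: payload, TYPE_MARKER: f"{kind.__module__}:{kind.__qualname__}"}


def is_envelope(value: Any) -> bool:
    """True only for the exact two-key shape produced by to_envelope."""
    return (
        isinstance(value, dict)
        and set(value) == RESERVED_KEYS
        and isinstance(value[PICKLE_MARKER], str)
    )


def encode(value: Any) -> Any:
    if isinstance(value, JSON_PRIMITIVES):
        return value
    if isinstance(value, list):
        return [encode(item) for item in value]
    if isinstance(value, dict):
        # A user dict that contains a reserved key is pickled whole, so it can
        # never be mistaken for checkpoint metadata on the way back in.
        if not RESERVED_KEYS.isdisjoint(value):
            return to_envelope(value)
        return {key: encode(item) for key, item in value.items()}
    return to_envelope(value)


def decode(value: Any) -> Any:
    if is_envelope(value):
        # Sketch only: plain pickle.loads is unsafe on untrusted input.
        return pickle.loads(base64.b64decode(value[PICKLE_MARKER]))
    if isinstance(value, list):
        return [decode(item) for item in value]
    if isinstance(value, dict):
        return {key: decode(item) for key, item in value.items()}
    return value


def round_trip(state: Any) -> Any:
    """Encode, pass through JSON text, and decode again."""
    return decode(json.loads(json.dumps(encode(state))))


state = {"count": 3, "nested": {"__pickled__": "not-really", "note": "user data"}}
assert round_trip(state) == state
```

Because the colliding dictionary is stored inside the same envelope already used for non-JSON values, the stored JSON gains no new wrapper shape, which matches the compatibility goal stated in the description.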

Related Issue

N/A.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

moonbox3 and others added 2 commits June 18, 2026 11:44
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 18, 2026 05:46
@moonbox3 moonbox3 self-assigned this Jun 18, 2026
@moonbox3 moonbox3 added the python (Issues related to the Python codebase) and workflows (Related to Workflows in agent-framework) labels Jun 18, 2026

Copilot AI left a comment

Contributor


⚠️ Not ready to approve

Encoding reserved-key dicts currently does unnecessary recursive encoding work before switching to the pickle envelope, which is avoidable overhead and should be addressed.
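One way to read this finding: if the encoder recurses into a dictionary's values before checking for reserved keys, that work is discarded once the whole dictionary is handed to the pickle envelope. The sketch below only counts visited nodes under the two orderings; it is not the PR's encoder, and the function names are hypothetical.

```python
from typing import Any

# Key names taken from the PR overview.
RESERVED_KEYS = frozenset({"__pickled__", "__type__"})


def visits_children_first(value: Any) -> int:
    """Nodes touched when children are encoded before the reserved-key check."""
    if isinstance(value, dict):
        # Every child is walked even if the dict later goes to the envelope.
        return 1 + sum(visits_children_first(item) for item in value.values())
    if isinstance(value, list):
        return 1 + sum(visits_children_first(item) for item in value)
    return 1


def visits_check_first(value: Any) -> int:
    """Nodes touched when the reserved-key check runs before any recursion."""
    if isinstance(value, dict):
        if not RESERVED_KEYS.isdisjoint(value):
            return 1  # the whole dict is pickled; children are never visited
        return 1 + sum(visits_check_first(item) for item in value.values())
    if isinstance(value, list):
        return 1 + sum(visits_check_first(item) for item in value)
    return 1


state = {"__type__": "user", "items": [1, 2, 3], "meta": {"a": 1}}
print(visits_children_first(state), visits_check_first(state))
```

For dictionaries without reserved keys the two orderings visit the same nodes, so checking first costs only the key lookup.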

Pull request overview

This PR refines Python workflow checkpoint encoding/decoding to (a) safely round-trip user dictionaries that contain reserved checkpoint marker keys by routing them through the existing pickle envelope, and (b) tighten restricted unpickling so framework/OpenAI auto-reconstruction only permits concrete classes (and blocks helper/callable abuse), while preserving explicit allowed_checkpoint_types behavior.

Changes:

  • Encode dicts containing __pickled__ / __type__ via the existing pickle+base64 envelope so they don’t collide with checkpoint metadata shapes.
  • Tighten _RestrictedUnpickler.find_class to block framework helper callables and dotted globals under agent_framework.* / openai.types.*, allowing only concrete classes unless explicitly allowed.
  • Add regression tests covering reserved-key dict round-trips (including FileCheckpointStorage) and additional restricted-deserialization attack variants.
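The tightened `find_class` rule can be illustrated with the standard `pickle.Unpickler` hook. This is a sketch under stated assumptions, not the PR's `_RestrictedUnpickler`: it uses the standard-library `collections` module as a stand-in for the `agent_framework.*` / `openai.types.*` prefixes, treats "concrete class" as "an actual class object", and assumes allowed types are written as `module:qualname` strings. The names `RestrictedUnpickler`, `AUTO_ALLOWED_PREFIXES`, and `restricted_loads` are hypothetical.

```python
import collections
import io
import pickle
from fractions import Fraction
from typing import Any

# Stand-in prefix for illustration; the PR applies the rule to framework/OpenAI modules.
AUTO_ALLOWED_PREFIXES = ("collections",)


class RestrictedUnpickler(pickle.Unpickler):
    def __init__(self, file: io.BytesIO, allowed_types: frozenset[str] = frozenset()):
        super().__init__(file)
        self._allowed_types = allowed_types

    def find_class(self, module: str, name: str) -> Any:
        key = f"{module}:{name}"
        # Explicit opt-in always wins.
        if key in self._allowed_types:
            return super().find_class(module, name)
        in_prefix = any(
            module == prefix or module.startswith(prefix + ".")
            for prefix in AUTO_ALLOWED_PREFIXES
        )
        # Reject dotted globals outright, then require a real class object so
        # helper functions in an allowed module cannot be used as callables.
        if in_prefix and "." not in name:
            resolved = super().find_class(module, name)
            if isinstance(resolved, type):
                return resolved
        raise pickle.UnpicklingError(f"blocked global: {key}")


def restricted_loads(data: bytes, allowed_types: frozenset[str] = frozenset()) -> Any:
    return RestrictedUnpickler(io.BytesIO(data), allowed_types).load()


# A class in an allowed module is reconstructed automatically.
ordered = collections.OrderedDict(a=1)
assert restricted_loads(pickle.dumps(ordered)) == ordered

# A helper callable in the same module is refused.
try:
    restricted_loads(pickle.dumps(collections.namedtuple))
except pickle.UnpicklingError as error:
    print(error)

# Types outside the prefixes need an explicit opt-in.
third = pickle.dumps(Fraction(1, 3))
assert restricted_loads(third, frozenset({"fractions:Fraction"})) == Fraction(1, 3)
```

Resolving the global before the `isinstance(resolved, type)` check does import the module, so a stricter variant might validate the module name against a fixed list first.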

File summaries

| File | Description |
| --- | --- |
| python/packages/core/agent_framework/_workflows/_checkpoint_encoding.py | Implements reserved-key dict handling via pickle envelope and tightens restricted unpickling rules for framework/OpenAI modules. |
| python/packages/core/tests/workflow/test_checkpoint_encode.py | Updates/extends encoding tests to validate reserved-key dict round-trips and preservation of an old “escape” user-data shape. |
| python/packages/core/tests/workflow/test_checkpoint_unrestricted_pickle.py | Adds restricted-unpickling regression tests for framework helper callables/dotted globals and validates FileCheckpointStorage marker-shaped dict round-trips. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 1

@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/core/agent_framework/_workflows/_checkpoint_encoding.py | 83 | 0 | 100% | |
| TOTAL | 39869 | 4497 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 7992 | 34 💤 | 0 ❌ | 0 🔥 | 2m 4s ⏱️ |

@github-actions github-actions Bot left a comment

Contributor


Automated Code Review

Reviewers: 5 | Confidence: 88% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes, Design Approach


Automated review by moonbox3's agents

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
3 participants