Python: adjust checkpoint encoding handling #6579
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
⚠️ Not ready to approve
Encoding reserved-key dicts currently does unnecessary recursive encoding work before switching to the pickle envelope, which is avoidable overhead and should be addressed.
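One way to avoid that overhead is to test for reserved keys before recursing into the dict's values. The sketch below is illustrative only: the marker names come from the PR summary, while `to_envelope`, `encode`, the envelope layout, and the `stats` counter are hypothetical stand-ins rather than the framework's actual code.

```python
import base64
import pickle
from typing import Any

# Marker names as given in the PR summary; everything else here is assumed.
RESERVED_KEYS = frozenset({"__pickled__", "__type__"})


def to_envelope(value: Any) -> dict:
    """Wrap a value in a hypothetical pickle+base64 envelope."""
    payload = base64.b64encode(pickle.dumps(value)).decode("ascii")
    return {"__pickled__": payload, "__type__": "builtins.dict"}


def encode(value: Any, stats: dict) -> Any:
    """Encode a value, counting how many nodes are visited."""
    stats["visited"] += 1
    if isinstance(value, dict):
        # Check for reserved keys first, so no recursive work is discarded.
        if not RESERVED_KEYS.isdisjoint(value):
            return to_envelope(value)
        return {k: encode(v, stats) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v, stats) for v in value]
    return value


stats = {"visited": 0}
encoded = encode({"__type__": "user", "items": [1, 2, 3]}, stats)
```

With the check placed first, only the top-level dict is visited; the nested list is pickled as part of the envelope and never encoded on its own.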
Pull request overview
This PR refines Python workflow checkpoint encoding/decoding to (a) safely round-trip user dictionaries that contain reserved checkpoint marker keys by routing them through the existing pickle envelope, and (b) tighten restricted unpickling so framework/OpenAI auto-reconstruction only permits concrete classes (and blocks helper/callable abuse), while preserving explicit allowed_checkpoint_types behavior.
Changes:
- Encode dicts containing `__pickled__`/`__type__` via the existing pickle+base64 envelope so they don’t collide with checkpoint metadata shapes.
- Tighten `_RestrictedUnpickler.find_class` to block framework helper callables and dotted globals under `agent_framework.*`/`openai.types.*`, allowing only concrete classes unless explicitly allowed.
- Add regression tests covering reserved-key dict round-trips (including `FileCheckpointStorage`) and additional restricted-deserialization attack variants.
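The reserved-key round-trip described in the first change can be sketched as follows. This is a minimal illustration, not the framework's implementation: the envelope layout, the `encode`/`decode` helpers, and the use of plain `pickle.loads` are assumptions made to keep the example self-contained (the PR routes decoding through a restricted unpickler instead).

```python
import base64
import json
import pickle
from typing import Any

PICKLE_KEY = "__pickled__"  # marker names from the PR summary
TYPE_KEY = "__type__"
RESERVED_KEYS = frozenset({PICKLE_KEY, TYPE_KEY})


def encode(value: Any) -> Any:
    if isinstance(value, dict):
        if not RESERVED_KEYS.isdisjoint(value):
            # A user dict that looks like metadata goes into the envelope whole.
            raw = base64.b64encode(pickle.dumps(value)).decode("ascii")
            return {PICKLE_KEY: raw, TYPE_KEY: "builtins.dict"}
        return {k: encode(v) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v) for v in value]
    return value


def decode(value: Any) -> Any:
    if isinstance(value, dict):
        if set(value) == RESERVED_KEYS and isinstance(value[PICKLE_KEY], str):
            # Sketch only: plain pickle.loads is unsafe on untrusted input.
            return pickle.loads(base64.b64decode(value[PICKLE_KEY]))
        return {k: decode(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode(v) for v in value]
    return value


state = {"user": {"__type__": "invoice", "__pickled__": False}, "count": 2}
# Simulate writing the checkpoint to JSON storage and reading it back.
restored = decode(json.loads(json.dumps(encode(state))))
```

Because every dict that contains a reserved key is wrapped before it reaches storage, any marker-shaped dict the decoder sees was produced by the encoder, and the user's original dict comes back unchanged.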
File summaries
| File | Description |
|---|---|
| python/packages/core/agent_framework/_workflows/_checkpoint_encoding.py | Implements reserved-key dict handling via pickle envelope and tightens restricted unpickling rules for framework/OpenAI modules. |
| python/packages/core/tests/workflow/test_checkpoint_encode.py | Updates/extends encoding tests to validate reserved-key dict round-trips and preservation of an old “escape” user-data shape. |
| python/packages/core/tests/workflow/test_checkpoint_unrestricted_pickle.py | Adds restricted-unpickling regression tests for framework helper callables/dotted globals and validates FileCheckpointStorage marker-shaped dict round-trips. |
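The class-only rule that these restricted-unpickling tests exercise can be sketched with the standard `pickle.Unpickler.find_class` hook. The example is an assumption-laden stand-in: `collections` plays the role of the `agent_framework.*`/`openai.types.*` prefixes so it runs without those packages, and the `"module:qualname"` format for explicitly allowed entries is hypothetical, not the documented shape of `allowed_checkpoint_types`.

```python
import collections
import io
import json
import pickle

# Stand-in for the trusted framework prefixes named in the PR.
TRUSTED_PREFIXES = ("collections",)


class RestrictedUnpickler(pickle.Unpickler):
    def __init__(self, file, allowed=frozenset()):
        super().__init__(file)
        self._allowed = frozenset(allowed)  # hypothetical "module:qualname" entries

    def find_class(self, module, name):
        key = f"{module}:{name}"
        if key in self._allowed:
            # Explicitly allowed entries bypass the class-only rule.
            return super().find_class(module, name)
        trusted = any(
            module == p or module.startswith(p + ".") for p in TRUSTED_PREFIXES
        )
        # Dotted names (e.g. "Cls.method") are rejected outright.
        if trusted and "." not in name:
            resolved = super().find_class(module, name)
            if isinstance(resolved, type):  # concrete classes only
                return resolved
        raise pickle.UnpicklingError(f"blocked global: {key}")


def restricted_loads(data, allowed=frozenset()):
    return RestrictedUnpickler(io.BytesIO(data), allowed).load()


loaded_class = restricted_loads(pickle.dumps(collections.OrderedDict))
```

Under these assumptions, a class from a trusted module loads, while a helper function from the same module (such as `collections.namedtuple`) or any global from an untrusted module raises `pickle.UnpicklingError` unless it is listed explicitly.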
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 1
Motivation & Context
Checkpoint persistence should preserve ordinary workflow state values while keeping internal encoding metadata clearly separated from user data. This change refines checkpoint encoding behavior for dictionaries that contain reserved metadata-like keys and tightens reconstruction behavior without changing the public checkpoint storage API.
Description & Review Guide
- `allowed_checkpoint_types` behavior.
- `FileCheckpointStorage` round-trips, and restricted decode behavior.
- `allowed_checkpoint_types`.

Related Issue
N/A.
Contribution Checklist
`breaking change` label (or add "[BREAKING]" to the title prefix, before or after any language prefix); a workflow keeps the label and title prefix in sync automatically.