fix: resolve race condition in FuturesDict and unsafe __del__ methods#6759
Open
Suraj Sahani (surajsahani) wants to merge 4 commits intolangchain-ai:mainfrom
- Fix race condition in FuturesDict.__setitem__ where callback could fire before counter was incremented, leading to incorrect counter state
- Make on_done() idempotent to handle edge case where future completes between counter increment and callback registration
- Add proper exception handling in __del__ methods to prevent crashes during interpreter shutdown in stores and cache
- Affected files: pregel/_runner.py, sqlite/postgres stores, batch.py

Tests: test_pregel_loop_refcount, test_concurrent_emit_sends pass
…o LLM

_filter_validation_errors manually reconstructed injected arg names from only state/store/runtime, missing custom InjectedToolArg subclasses. This meant validation errors for custom injected args (e.g., InjectedAuth) would leak back to the LLM in error messages.

Fix: Use all_injected_keys (added by Sydney in 08363db) instead of manually rebuilding the set, ensuring all injected args are filtered.

Companion fix to 08363db (injected arg stripping in _inject_tool_args).
Problem
Found and fixed critical concurrency bugs that could cause deadlocks and crashes in production:
- Race condition in FuturesDict: The callback could fire before the counter was incremented, or a future could complete between counter increment and callback registration, leading to incorrect counter state and potential deadlocks.
- Unsafe __del__ methods: Multiple stores and cache implementations had __del__ methods that could crash during interpreter shutdown when logging or attributes become unavailable.

Solution

Race Condition Fix (libs/langgraph/langgraph/pregel/_runner.py)
- Made on_done() idempotent by checking whether the future was already processed

Resource Cleanup Fix
Added proper exception handling in __del__ methods:
- libs/checkpoint-sqlite/langgraph/store/sqlite/base.py
- libs/checkpoint-postgres/langgraph/store/postgres/base.py
- libs/checkpoint-sqlite/langgraph/cache/sqlite/__init__.py
- libs/checkpoint/langgraph/store/base/batch.py

Testing
- test_pregel_loop_refcount - Memory leak test passes
- test_concurrent_emit_sends - Concurrency test passes

Impact
These fixes prevent the deadlocks and interpreter-shutdown crashes described under Problem. They are particularly important for production workloads with hundreds of concurrent graph executions.
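
For reviewers who want the two patterns in isolation, here is a minimal sketch. It is not the code in this PR: CountingFutures and ClosableStore are hypothetical stand-ins for FuturesDict and the store/cache classes, and the sketch assumes that a plain pending-counter plus a threading.Event is an adequate model of the runner's bookkeeping. It shows the counter being incremented before the callback is registered, an on_done() that ignores repeat notifications, and a __del__ that swallows cleanup errors.

```python
import threading
from concurrent.futures import Future


class CountingFutures(dict):
    """Hypothetical stand-in for FuturesDict: counts unfinished futures."""

    def __init__(self):
        super().__init__()
        self.lock = threading.Lock()
        self.counter = 0            # futures registered but not yet finished
        self.done = set()           # futures already handled by on_done
        self.event = threading.Event()
        self.event.set()            # nothing is pending at the start

    def __setitem__(self, fut, task):
        super().__setitem__(fut, task)
        with self.lock:
            # Increment BEFORE registering the callback: add_done_callback
            # invokes the callback immediately if the future already finished.
            self.event.clear()
            self.counter += 1
        # Registered outside the lock, because on_done takes the same
        # (non-reentrant) lock and may run synchronously right here.
        fut.add_done_callback(self.on_done)

    def on_done(self, fut):
        with self.lock:
            if fut in self.done:    # idempotent: a repeat call changes nothing
                return
            self.done.add(fut)
            self.counter -= 1
            if self.counter == 0:
                self.event.set()


class ClosableStore:
    """Hypothetical resource holder with a shutdown-safe finalizer."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __del__(self):
        try:
            self.close()
        except Exception:
            # At interpreter shutdown, attributes or module globals may
            # already be gone; a finalizer should stay silent, not crash.
            pass


already_done = Future()
already_done.set_result("ok")
pending = Future()

futures = CountingFutures()
futures[already_done] = "task-a"    # callback fires synchronously here
futures[pending] = "task-b"
futures.on_done(already_done)       # duplicate notification is ignored
counter_before = futures.counter    # only "task-b" is still pending
pending.set_result("ok")            # last future finishes, event is set
```

With the increment placed after add_done_callback instead, the already-finished future would decrement the counter before it was ever incremented, which is the incorrect counter state described under Problem.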