Skip to content

fix: workflow incorrectly marked as completed while nodes are still executing#62

Open
tomerqodo wants to merge 4 commits intocopilot_combined_20260121_qodo_grep_cursor_copilot_1_base_fix_workflow_incorrectly_marked_as_completed_while_nodes_are_still_executing_pr441from
copilot_combined_20260121_qodo_grep_cursor_copilot_1_head_fix_workflow_incorrectly_marked_as_completed_while_nodes_are_still_executing_pr441
Open

fix: workflow incorrectly marked as completed while nodes are still executing#62
tomerqodo wants to merge 4 commits intocopilot_combined_20260121_qodo_grep_cursor_copilot_1_base_fix_workflow_incorrectly_marked_as_completed_while_nodes_are_still_executing_pr441from
copilot_combined_20260121_qodo_grep_cursor_copilot_1_head_fix_workflow_incorrectly_marked_as_completed_while_nodes_are_still_executing_pr441

Conversation

@tomerqodo
Copy link
Copy Markdown

Benchmark PR from qodo-benchmark#441

Copilot AI review requested due to automatic review settings January 21, 2026 19:01
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix an issue where workflows were incorrectly marked as completed while nodes were still executing. The fix introduces execution tracking before enqueueing nodes and adds comprehensive unit tests for the skip propagator component.

Changes:

  • Added start_execution() call before enqueue_node() to track node execution earlier
  • Introduced comprehensive unit tests for SkipPropagator class covering edge state propagation scenarios
  • Created test infrastructure directory with proper initialization

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
api/core/workflow/graph_engine/graph_traversal/skip_propagator.py Added execution tracking call and reordered edge marking operations
api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/test_skip_propagator.py New comprehensive test suite with 7 test cases covering skip propagation logic
api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/init.py Added initialization file with docstring for test directory

if edge_states["has_taken"]:
# Enqueue node
# Start execution and enqueue node
print(f"Starting execution for node: {downstream_node_id}")
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This print statement should be removed. Debug print statements should not be committed to production code. Consider using proper logging instead if debugging output is needed.

Copilot uses AI. Check for mistakes.
Comment on lines 84 to +86
# Recursively propagate skip
self.propagate_skip_from_edge(edge.id)
self._state_manager.mark_edge_skipped(edge.id)
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of these operations has been reversed, which introduces a bug. The edge must be marked as skipped BEFORE calling propagate_skip_from_edge, because propagate_skip_from_edge analyzes the state of incoming edges (line 53) to determine how to handle the downstream node. If the edge isn't marked as skipped yet, the analysis will be incorrect and may lead to improper skip propagation or execution decisions.

Copilot uses AI. Check for mistakes.
Comment on lines +61 to 64
# Start execution and enqueue node
print(f"Starting execution for node: {downstream_node_id}")
self._state_manager.start_execution(downstream_node_id)
self._state_manager.enqueue_node(downstream_node_id)
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of start_execution and enqueue_node has been reversed compared to the pattern used elsewhere in the codebase (e.g., event_handlers.py lines 196-197, 279-280, graph_engine.py lines 334-335). While this change appears intentional to fix the issue described in the PR title, it creates an inconsistent pattern across the codebase. Consider either: (1) documenting why this specific call site needs a different order, or (2) updating all call sites to use the same order if this is the correct pattern.

Suggested change
# Start execution and enqueue node
print(f"Starting execution for node: {downstream_node_id}")
self._state_manager.start_execution(downstream_node_id)
self._state_manager.enqueue_node(downstream_node_id)
# Enqueue node and start execution (order matches rest of codebase)
print(f"Starting execution for node: {downstream_node_id}")
self._state_manager.enqueue_node(downstream_node_id)
self._state_manager.start_execution(downstream_node_id)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants