step_into silently degrades to step_over after any step-triggered stop — stackTrace not requested on reason: "step" stopped events #71

@hjtrbo


Summary

After any step-triggered pause (following step_over, step_into, or step_out), calling step_into on a line containing a method call behaves identically to step_over — execution advances one line in the current method instead of entering the callee. The degradation is silent: no error is returned, no warning is emitted, and the tool reports success. The problem does not occur on the first stop if that stop was triggered by a breakpoint.

Inserting an explicit get_stack_trace call between the preceding step and the step_into call restores correct behaviour, which points to DebugMCP's internal frame state not being refreshed on step-triggered stops.


Environment

Field                 | Value
OS                    | Windows 11 Pro 10.0.26200
VS Code               | Latest stable
DebugMCP extension    | Latest
Claude Code extension | Latest
Target runtime        | .NET 9, WPF application
Debug adapter         | coreclr
Debug request type    | launch
Architecture          | x64

Reproduction Project

The reproduction uses a real .NET 9 WPF application. The relevant file is GmEcuSimulator/ViewModels/MainViewModel.cs. The constructor (lines 47–69) is the target because it contains a mix of simple assignments and a method call (Rebuild() at line 52), making it a clean test case for step_over followed by step_into.

Constructor (lines 47–69):

public MainViewModel(VirtualBus bus, BinReplayCoordinator replay)   // line 47
{                                                                     // line 48
    this.bus = bus;                                                   // line 49  ← breakpoint A
    this.replay = replay;                                             // line 50
    BinReplay = new BinReplayViewModel(replay, bus, ...);             // line 51
    Rebuild();                                                        // line 52  ← step_into target
    ...
}

Rebuild() definition (lines 93–98, same file):

public void Rebuild()         // line 93
{                             // line 94
    Ecus.Clear();             // line 95  ← expected step_into landing point
    foreach (var node in bus.Nodes) Ecus.Add(new EcuViewModel(node)); // line 96
    SelectedEcu = Ecus.FirstOrDefault();                               // line 97
}                             // line 98

launch.json Configuration

The configuration required to reliably reproduce this (and to avoid the separate empty-stack-trace failure mode described in the Notes section):

{
    "name": "Launch GmEcuSimulator (WPF)",
    "type": "coreclr",
    "request": "launch",
    "preLaunchTask": "build",
    "program": "${workspaceFolder}/GmEcuSimulator/bin/Debug/net9.0-windows/GmEcuSimulator.dll",
    "cwd": "${workspaceFolder}/GmEcuSimulator/bin/Debug/net9.0-windows",
    "console": "internalConsole",
    "stopAtEntry": true,
    "requireExactSource": false,
    "justMyCode": false,
    "suppressJitOptimizations": true,
    "enableStepFiltering": false
}

Key settings:

  • program points at the managed .dll, not the .exe shim — the coreclr adapter fully initialises its managed thread tracking at the .dll entry point; using the .exe wrapper causes the adapter to return empty stack frames everywhere (separate issue, described in Notes)
  • stopAtEntry: true — required to trigger the initial managed-thread initialisation; without it, the first breakpoint hit returns a populated stack but subsequent step-triggered stops do not
  • justMyCode: false — required to see full stack frames including framework calls
  • requireExactSource: false — required to allow the adapter to match PDB paths to source; true causes all frames to be rejected when PDB paths differ from source paths on disk
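
The key settings above can be checked mechanically before a session starts. The following is a minimal sketch only: CoreClrLaunchSettings and lintLaunchSettings are hypothetical names invented for illustration, not part of DebugMCP or the coreclr adapter, and the checks simply encode the values this report relied on.

```typescript
// Hypothetical lint helper; the interface and function names are illustrative only.
interface CoreClrLaunchSettings {
  program: string;
  stopAtEntry?: boolean;
  justMyCode?: boolean;
  requireExactSource?: boolean;
}

// Returns one warning per setting that differs from the values used in this report.
function lintLaunchSettings(cfg: CoreClrLaunchSettings): string[] {
  const warnings: string[] = [];
  if (/\.exe$/i.test(cfg.program)) {
    warnings.push("program targets the .exe shim; point it at the managed .dll");
  }
  if (cfg.stopAtEntry !== true) {
    warnings.push("stopAtEntry is not true");
  }
  if (cfg.justMyCode !== false) {
    warnings.push("justMyCode is not false");
  }
  if (cfg.requireExactSource !== false) {
    warnings.push("requireExactSource is not false");
  }
  return warnings;
}

// The configuration from this report produces no warnings.
const reportConfigWarnings = lintLaunchSettings({
  program: "${workspaceFolder}/GmEcuSimulator/bin/Debug/net9.0-windows/GmEcuSimulator.dll",
  stopAtEntry: true,
  justMyCode: false,
  requireExactSource: false,
});
```

An unset justMyCode or requireExactSource is flagged as well, since the report depends on both being explicitly false.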

Steps to Reproduce

  1. Install DebugMCP and Claude Code extensions in VS Code.
  2. Open a .NET 9 WPF project with the launch.json above.
  3. In the Claude Code panel, send the following prompt:

Clear all existing breakpoints. Set a breakpoint at GmEcuSimulator/ViewModels/MainViewModel.cs line 49. Start debugging using the "Launch GmEcuSimulator (WPF)" configuration. When the breakpoint is hit, call get_stack_trace and report the frames. Then call step_over three times (lines 49→50→51→52). At line 52, call step_into. Report where execution landed.

  4. Observe that after the three step_over calls, step_into at line 52 lands on line 54 (the next line in the constructor) instead of line 95 (inside Rebuild()).

Alternative reproduction (minimal): After any step_over, call step_into immediately without calling get_stack_trace in between. It will behave as step_over every time.


Expected Behaviour

After step_over lands on line 52 (Rebuild();), calling step_into should:

  1. Enter the Rebuild method
  2. Pause on line 95 (Ecus.Clear();) — the first executable line inside the method
  3. Return a call stack with Rebuild as the top frame and MainViewModel..ctor as the second frame

Actual Behaviour

After step_over lands on line 52, calling step_into:

  1. Advances to line 54 (the next line in the constructor) — identical to step_over
  2. Does not enter Rebuild
  3. Returns no error — the tool reports success
  4. The call stack remains unchanged (same depth, same top frame)

Observed Evidence

The following sequence was captured during testing:

Call | Result
add_breakpoint line 49 | Breakpoint set ✓
start_debugging | Session started, stopAtEntry fires, then continues to breakpoint
get_stack_trace at breakpoint hit | 27 frames returned, full WPF dispatcher chain visible ✓
step_over (49→50) | Correct ✓, NowMs value on bus updated live ✓
step_over (50→51) | Correct ✓
step_over (51→52) | Correct ✓
step_into at line 52 | Lands on line 54; Rebuild() not entered ✗
get_stack_trace after step_into | Empty stack trace returned ✗
get_stack_trace (explicit call after step_over) | Frames repopulated
step_into immediately after explicit get_stack_trace | Enters Rebuild() correctly, lands on line 95 ✓

The table confirms: get_stack_trace after a step-triggered stop repopulates the internal frame state, and the immediately following step_into works correctly. Without it, the frame is stale and step_into degrades.


Root Cause Analysis

DAP protocol background

The Debug Adapter Protocol defines a stopped event that fires on every pause, regardless of cause. The event carries a reason field with values including "breakpoint", "step", "exception", "pause", "entry", etc.
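
For reference, the stopped event body can be sketched as a TypeScript type. The field names follow the DAP specification; the example payloads and the isStepStop helper are made up for illustration. Note that the event carries no stack frames, so a client learns the current frame only by issuing a separate stackTrace request.

```typescript
// Body of a DAP "stopped" event; field names follow the DAP specification.
interface StoppedEventBody {
  reason: string; // e.g. "breakpoint", "step", "exception", "pause", "entry"
  description?: string;
  threadId?: number;
  preserveFocusHint?: boolean;
  text?: string;
  allThreadsStopped?: boolean;
  hitBreakpointIds?: number[];
}

// Illustrative payloads: the thread and breakpoint IDs are invented.
const breakpointStop: StoppedEventBody = {
  reason: "breakpoint",
  threadId: 1,
  hitBreakpointIds: [1],
};
const stepStop: StoppedEventBody = { reason: "step", threadId: 1 };

// Hypothetical helper: true when the pause was caused by a completed step.
function isStepStop(body: StoppedEventBody): boolean {
  return body.reason === "step";
}
```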

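In the DAP specification, the stepIn arguments are threadId (required) plus the optional singleThread, targetId, and granularity; frame IDs are consumed by requests such as stepInTargets, scopes, and evaluate. The observed behaviour nonetheless suggests that frame context accompanies the step: when the frameId DebugMCP holds is absent or stale, the coreclr adapter falls back to stepping from the current thread position, which in practice behaves like step_over for a method-call line because the adapter has no managed frame context to descend into.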

What DebugMCP appears to do

DebugMCP maintains an internal "current frame ID" used to populate the frameId of outgoing stepIn requests. Based on observed behaviour, this frame ID is refreshed by calling stackTrace on the debug session, but only when a stopped event arrives with reason: "breakpoint".

When a stopped event arrives with reason: "step" (i.e. after any step operation completes), DebugMCP does not call stackTrace. The internal frame ID therefore remains at whatever it was at the last breakpoint-triggered stop — which may refer to a frame that no longer exists on the call stack, or to the synthetic fallback ID 1000 that the coreclr adapter returns when no frames are available.

When step_into is subsequently called using a stale or synthetic frameId, the coreclr adapter silently treats it as a step from the current position with no frame context, producing step_over behaviour.
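
The suspected staleness can be illustrated with a toy model. This is not DebugMCP's code: RefreshPolicy, SimulatedStop, and replayStops are hypothetical names, and the frame IDs are invented. The sketch replays a sequence of stops and records which frame ID a tool would have cached after each one under the two refresh policies.

```typescript
// Toy model of the suspected behaviour. Nothing here is DebugMCP's actual code.
type RefreshPolicy = "breakpoint-only" | "every-stop";

interface SimulatedStop {
  reason: string; // DAP stop reason
  topFrameId: number; // ID the adapter would report if stackTrace were requested
}

// Replays the stops and returns the cached frame ID after each one.
function replayStops(
  stops: SimulatedStop[],
  policy: RefreshPolicy
): (number | undefined)[] {
  let cached: number | undefined;
  const cachedAfterEachStop: (number | undefined)[] = [];
  for (const stop of stops) {
    if (policy === "every-stop" || stop.reason === "breakpoint") {
      cached = stop.topFrameId; // stackTrace requested, cache refreshed
    }
    cachedAfterEachStop.push(cached);
  }
  return cachedAfterEachStop;
}

// Breakpoint at line 49, then three step_over stops (invented frame IDs).
const simulatedSession: SimulatedStop[] = [
  { reason: "breakpoint", topFrameId: 10 },
  { reason: "step", topFrameId: 11 },
  { reason: "step", topFrameId: 12 },
  { reason: "step", topFrameId: 13 },
];

const staleIds = replayStops(simulatedSession, "breakpoint-only"); // [10, 10, 10, 10]
const freshIds = replayStops(simulatedSession, "every-stop"); // [10, 11, 12, 13]
```

Under the "breakpoint-only" policy the cached ID never moves past the breakpoint stop, which matches the symptom of step_into operating on an outdated frame.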

Why the explicit get_stack_trace fixes it

Calling get_stack_trace forces DebugMCP to issue a stackTrace DAP request, receive the current frames, and update the internal frame ID to the real top-of-stack frame. The subsequent step_into then carries a valid frameId and the adapter correctly descends into the callee.


Workaround

Call get_stack_trace after every step operation and before any step_into. Example prompt pattern that works reliably:

Step over. Then call get_stack_trace to confirm the current position. Then step into.

This is effective but fragile in an AI-driven workflow — the AI must explicitly remember to insert the get_stack_trace call every time, and it has no automated way to know when the frame state is stale.
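
Where the tool calls are scripted instead of prompted, the workaround can be wrapped so the refresh cannot be forgotten. The sketch below assumes a hypothetical, simplified DebugToolClient with a synchronous callTool method; a real MCP client would be asynchronous and would take tool arguments. Only the tool names get_stack_trace and step_into come from this report.

```typescript
// Hypothetical, simplified tool client; a real MCP client would be asynchronous.
interface DebugToolClient {
  callTool(name: string): string;
}

// Workaround: refresh the frame state immediately before stepping in.
function stepIntoWithRefresh(client: DebugToolClient): string {
  client.callTool("get_stack_trace"); // forces a fresh stackTrace request
  return client.callTool("step_into");
}

// Recording stub used to show the call order.
const recordedCalls: string[] = [];
const stubClient: DebugToolClient = {
  callTool(name: string): string {
    recordedCalls.push(name);
    return "ok";
  },
};

stepIntoWithRefresh(stubClient);
// recordedCalls is now ["get_stack_trace", "step_into"]
```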


Suggested Fix

Refresh the internal frame state (call stackTrace and update the stored current frame ID) on every stopped event, regardless of reason. All stop reasons can leave the frame stale:

reason                | Currently refreshed? | Should refresh?
"breakpoint"          | ✓ Yes                | ✓ Yes
"step"                | ✗ No                 | ✓ Yes
"exception"           | ✗ No                 | ✓ Yes
"pause"               | ✗ No                 | ✓ Yes
"entry"               | ✗ No                 | ✓ Yes
"goto"                | ✗ No                 | ✓ Yes
"function breakpoint" | Unknown              | ✓ Yes

Pseudocode for the corrected handler:

on stopped(event):
    // was: if event.reason === "breakpoint": refreshFrames()
    // fix: refresh on every stop, regardless of event.reason
    frames = refreshFrames()
    if frames is not empty:
        updateInternalFrameId(frames[0].id)

This ensures that any tool call following any pause — including step_into, evaluate_expression, and get_variables — always operates against the actual current frame rather than a stale one.
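
A slightly fuller TypeScript sketch of the same handler follows. It is written under stated assumptions: FrameInfo, StopEvent, FrameSource, and FrameState are hypothetical names, DebugMCP's real internals are not known here, and stackTrace is modelled as synchronous to keep the example short, whereas a real DAP client would return a promise.

```typescript
// Hypothetical shapes; a real DAP client would return promises.
interface FrameInfo {
  id: number;
  name: string;
  line: number;
}

interface StopEvent {
  reason: string;
  threadId: number;
}

interface FrameSource {
  stackTrace(threadId: number): FrameInfo[];
}

class FrameState {
  private currentFrameId: number | undefined;

  constructor(private readonly source: FrameSource) {}

  // Runs on every stopped event; event.reason is deliberately not inspected.
  onStopped(event: StopEvent): void {
    const frames = this.source.stackTrace(event.threadId);
    const top = frames[0];
    // Record "unknown" instead of keeping a stale ID when no frames come back.
    this.currentFrameId = top ? top.id : undefined;
  }

  getFrameId(): number | undefined {
    return this.currentFrameId;
  }
}
```

Clearing the stored ID when the adapter returns no frames is a design choice of this sketch: it lets later tool calls detect missing frame context instead of silently reusing an old value.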


Notes

Separate issue: empty stack frames when using .exe as program target

During investigation, a second failure mode was found unrelated to the step_into bug: when program in launch.json points at the .exe wrapper (e.g. GmEcuSimulator.exe) instead of the managed .dll (e.g. GmEcuSimulator.dll), every stop — including breakpoint-triggered stops — returns an empty stackTrace. The coreclr adapter never fully initialises its managed thread tracking when launched via the native .exe shim. This affects step_into, get_variables, and evaluate_expression equally. The fix is to point program at the .dll and set cwd to the same directory. This is independent of the step-triggered frame-refresh bug described above.

Two competing Claude Code instances cause MCP server corruption

If both the VS Code Claude Code extension and the Claude Code desktop/CLI are connected to the same DebugMCP server simultaneously, and both issue tool calls during the same debug session, the server enters a corrupted state (Internal MCP server error, JSON-RPC code -32603) from which it does not recover without a VS Code window reload (Developer: Reload Window). Consider documenting that only one Claude Code instance should drive a given DebugMCP session at a time.

