Testing a graceful failure of fetching codeInfo to see how much this mitigates test failures.#124089
Testing a graceful failure of fetching codeInfo to see how much this mitigates test failures.#124089rcj1 wants to merge 3 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @steveisok, @thaystg, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Pull request overview
This PR changes CoreCLR’s runtime-async exception stack trace augmentation to handle cases where the native diagnosticIP cannot be resolved to valid code information without asserting/failing.
Changes:
- Replace an
_ASSERTE(codeInfo.IsValid())+ unconditional append with a conditional append only whenEECodeInfo(diagnosticIP)is valid.
|
It seems we should rather ensure that the diagnosticIP is always valid. How is it possible that the code got unloaded and we still attempt to use the ip of such code? |
|
There is a similar need to hold code alive in ExceptionDispatchInfo. The DispatchState.StackTrace is responsible for holding the code alive as long as the ExceptionDispatchInfo exists. It seems we need something similar for the continuation resume info. |
The diagnostic IP is populated here, and is not the stub IP at which we resume: runtime/src/coreclr/interpreter/compiler.cpp Line 852 in 35b1dd2 DavidWr has indicated he is not surprised there is an issue involving this as certain asyncv2 debug info is still a work in progress. |
|
Let us choose between this PR and #124076 - if we merge this, we should be able to test non-diagnostics related interpreter tests with runtime-async. |
|
I would be fine with this as a temporary workaround until we fix the real problem. I think we should leave the original issue open and add a link to it to that long comment you have added and mention that it is a temporary measure. |
Add a comment regarding the temporary measure for testing.
| methodDesc, | ||
| NULL); | ||
| // Interpreter diagnostic IP is not recognized by codeInfo, so this does not work with interpreted code. | ||
| // This is a temporary measure to enable testing and once the issue is fixed this condition should be replaced by an assert. |
There was a problem hiding this comment.
What happens if we get a random pointer here that happens to be valid and maps to some completely unrelated method? I doubt we are robust against mapping debug info onto a completely unrelated method. I think this is just going to make it crash less often, it is not a reliable workaround for the bug.
I think stable CI is more important than having a bit more of async testing enabled ASAP. I would take the other PR to disable the runtime async with interpreter instead.
There was a problem hiding this comment.
Another approach would be to omit the append if the interpreter is enabled.
No description provided.