Skip to content

[NativeAOT] Print OOM message before Abort() on Linux#125311

Open
Copilot wants to merge 33 commits into
mainfrom
copilot/fix-out-of-memory-reporting
Open

[NativeAOT] Print OOM message before Abort() on Linux#125311
Copilot wants to merge 33 commits into
mainfrom
copilot/fix-out-of-memory-reporting

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 8, 2026

On Linux, NativeAOT processes terminating due to OutOfMemoryException (e.g. with DOTNET_GCHeapHardLimit set) printed only Aborted with no diagnostic context.

Root cause

RuntimeExceptionHelpers.FailFast detects the preallocated OOM exception via minimalFailFast = (exception == PreallocatedOutOfMemoryException.Instance) and skips all stderr output to avoid heap allocations — leaving the user with no indication of why the process died.

Fix

In the minimalFailFast path, print a hardcoded OOM message to stderr before calling Abort(). The write is wrapped in try/catch {} so a secondary allocation failure silently falls through to the existing abort path.

Before: Aborted
After: Out of memory.

Test

Added a new OomHandling smoke test in src/tests/baseservices/exceptions/OutOfMemoryException/. The test spawns itself as a subprocess with DOTNET_GCHeapHardLimit=20000000 (32 MB) set, waits for the subprocess to run out of memory, and verifies that OutOfMemoryException appears in stderr. This covers both the preallocated OOM path (the fix) and the existing unhandled-exception path. The test is skipped on mobile and browser platforms that do not support process spawning.

Original prompt

This section details on the original issue you should resolve

<issue_title>[NativeAOT] Out of memory reporting on Linux</issue_title>
<issue_description>### Repro

var l = new List<object>();
for (; ; ) l.Add(new object());

Run the native aot compiled binary with with export DOTNET_GCHeapHardLimit=2000000 set

Actual result

Aborted

Expected result

Out of memory.

(Reported by partner team.)</issue_description>

Comments on the Issue (you are @copilot in this section)

@ Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in [area-owners.md](https://github.com/dotnet/runtime/blob/main/docs/area-owners.md) if you want to be subscribed.
Issue Details

Repro

var l = new List<object>();
for (; ; ) l.Add(new object());

Run the native aot compiled binary with with export DOTNET_GCHeapHardLimit=2000000 set

Actual result

Aborted

Expected result

Process is terminating due to OutOfMemoryException

(Reported by partner team.)

Author: jkotas
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: 8.0.0

🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

Co-authored-by: agocke <515774+agocke@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix out of memory reporting on Linux for NativeAOT [NativeAOT] Print OOM message before Abort() on Linux Mar 8, 2026
@agocke
Copy link
Copy Markdown
Member

agocke commented Mar 9, 2026

Please add a unit test to src/tests

@agocke
Copy link
Copy Markdown
Member

agocke commented Mar 9, 2026

@copilot

Co-authored-by: agocke <515774+agocke@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Mar 9, 2026

Please add a unit test to src/tests

Added a new OomHandling smoke test in src/tests/nativeaot/SmokeTests/OomHandling/ (commit 3ef2a6d). The test spawns itself as a subprocess with DOTNET_GCHeapHardLimit=20000000 (20 MB) set, waits for it to OOM, and verifies that OutOfMemoryException appears in stderr — covering both the preallocated OOM path (my fix) and the existing unhandled-exception path. The test is skipped on mobile and browser platforms that do not support process spawning.

@github-actions

This comment has been minimized.

Copilot AI review requested due to automatic review settings April 15, 2026 01:22
@eduardo-vp eduardo-vp marked this pull request as ready for review April 15, 2026 01:22

This comment was marked as outdated.

@github-actions

This comment has been minimized.

Copilot AI review requested due to automatic review settings June 3, 2026 06:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 3, 2026 07:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment on lines +248 to +256
// Minimal OOM fail-fast path: avoid heap allocations as much as possible, but still
// report that OOM is the reason for the crash.
try
{
// Try to print the same short message CoreCLR prints.
Internal.Console.Error.Write("Out of memory.");
Internal.Console.Error.WriteLine();
}
catch { }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to address this by changing Internal.Console.Error.Write to avoid managed allocations for small strings on Unix. Notice that Windows implementation is like that already.

I suspect that the test may be flaky otherwise given that it tries to allocate every last bit of managed memory.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test was flaky indeed but I think the problem was using a list that tried to resize from 2048 to 4096 and failed leaving a remaining size of still ~32 KB in memory, too big to start trying to do very tiny allocations. I think the GC was just thrashing so updated the test.

I also added this change so I'm expecting the test doesn't fail, will observe the CI.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test consistently times out only on osx x64. At this point I'm wondering if there's a problem with the test or it's actually the runtime the one that is thrashing when it should OOM.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not be surprised if there are runtime bugs that lead to hang on OOM like the one this test is exercising.

Copilot AI review requested due to automatic review settings June 4, 2026 03:45
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

Comment thread src/libraries/System.Private.CoreLib/src/Internal/Console.Unix.cs Outdated
Comment thread src/libraries/System.Private.CoreLib/src/Internal/Console.Unix.cs Outdated
Comment thread src/libraries/System.Private.CoreLib/src/Internal/Console.Unix.cs Outdated
Copilot AI review requested due to automatic review settings June 4, 2026 04:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread src/tests/baseservices/exceptions/OutOfMemoryException/OutOfMemoryException.cs Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 4, 2026 04:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

object[] storage = new object[8192];
int idx = 0;
// We expect ~2048 iterations in the first loop and ~64 iterations in the second.
try { while (idx < storage.Length) storage[idx++] = GC.AllocateArray<byte>(16 * 1024, pinned: true); } catch (OutOfMemoryException) { }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is OOM caused by pinned: true able to hit the minimalFailFast?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locally I'm seeing it sometimes hits the minimalFailFast, sometimes it outputs the full stack trace. But in any case the test still doesn't work on osx x64. Should we exclude the test from that config and file an issue?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we exclude the test from that config and file an issue?

Sounds good

Copilot AI review requested due to automatic review settings June 6, 2026 02:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment on lines +102 to +106
// Even in the small allocations case, the runtime might still have enough memory to construct
// an OutOfMemoryException and print the full diagnostic.
// Either token is acceptable, but at least one should be present to confirm that OOM was the reason for termination.
if (!(stderr.Contains(ExpectedOomToken) || stderr.Contains(ExpectedMinimalOomToken)))
{
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[NativeAOT] Out of memory reporting on Linux

6 participants