[26.04_linux-nvidia] NVIDIA: SAUCE: ovl: keep err zero after successful ovl_cache_get() #425

Draft
nirmoy wants to merge 1 commit into NVIDIA:26.04_linux-nvidia from nirmoy:codex/nvbug-6144764-ovl-ptrerr-26.04

@nirmoy nirmoy commented May 15, 2026

Summary

Fix NVBug 6144764 on 26.04_linux-nvidia by keeping err zero after a successful ovl_cache_get() in ovl_iterate_merged().

The installer crash is an overlayfs readdir failure that occurs while rsync reads through overlayfs during BaseOS/DGX OS installation. The failing path matches syzbot report a16fb0cce329a320661c: a valid (non-error) cache pointer is passed to PTR_ERR(), which truncates the pointer bits into a bogus int that can later be returned as a non-errno value.

Sibling BOS PR: #423

Bug Links

Validation

  • Cherry-picked cleanly onto upstream/26.04_linux-nvidia.
  • git show --check --format=short HEAD: clean.
  • scripts/checkpatch.pl --strict --ignore COMMIT_LOG_USE_LINK,COMMIT_LOG_LONG_LINE --git HEAD: 0 errors, 0 warnings.
  • Earlier validation on arm64 virtme/KVM KASAN:
    • unpatched / Amir-only controls reproduced the overlayfs crash.
    • patched v2 completed 5/5 runs clean with OVL_SYZ_DONE rc=0 and no Oops/KASAN/panic markers.

BugLink: https://bugs.launchpad.net/bugs/2150640

ovl_iterate_merged() stores PTR_ERR(cache) in err before checking
IS_ERR(cache). On success err holds the truncated cache pointer and
can be returned as a bogus non-zero error.

The syzbot reproducer reaches this through overlay-on-overlay readdir:

  getdents64
    iterate_dir(outer overlay file)
      ovl_iterate_merged()
        ovl_cache_get()
          ovl_dir_read_merged()
            ovl_dir_read()
              iterate_dir(inner overlay file)
                ovl_iterate_merged()

Only compute PTR_ERR(cache) on the error path.

Fixes: d25e4b7 ("ovl: refactor ovl_iterate() and port to cred guard")
Reported-by: syzbot+a16fb0cce329a320661c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a16fb0cce329a320661c
Cc: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
(backported from https://lore.kernel.org/r/20260514144258.3068715-1-nirmoyd@nvidia.com)
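
The ordering problem described in the commit message can be reproduced in user space. What follows is a minimal sketch, not the kernel source: IS_ERR(), PTR_ERR() and ERR_PTR() are simplified stand-ins modelled on <linux/err.h>, and fake_cache_get(), iterate_buggy(), iterate_fixed() and run_demo() are hypothetical names invented for this illustration. The fake cache pointer is a fixed small value so the result is deterministic; on a 64-bit kernel the upper pointer bits would additionally be lost when the value is stored in an int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-ins modelled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095
#define ENOMEM_SKETCH 12	/* illustrative errno value */

static bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Hypothetical stand-in for ovl_cache_get(): valid pointer or ERR_PTR(). */
static void *fake_cache_get(bool fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM_SKETCH);
	/* Fixed, never-dereferenced fake address keeps the demo deterministic. */
	return (void *)(uintptr_t)0x12345678u;
}

/* Buggy ordering: err is assigned before the IS_ERR() check. */
static int iterate_buggy(bool fail)
{
	void *cache = fake_cache_get(fail);
	int err = (int)PTR_ERR(cache);	/* holds pointer bits on success */

	if (IS_ERR(cache))
		return err;

	/* ... directory entries would be emitted here ... */
	return err;	/* stale non-zero value leaks out on success */
}

/* Fixed ordering: PTR_ERR() is evaluated only on the error path. */
static int iterate_fixed(bool fail)
{
	void *cache = fake_cache_get(fail);

	if (IS_ERR(cache))
		return (int)PTR_ERR(cache);

	/* ... directory entries would be emitted here ... */
	return 0;
}

/* Exercises both variants on the failure and success paths. */
void run_demo(void)
{
	/* Both variants agree when the lookup fails. */
	assert(iterate_buggy(true) == -ENOMEM_SKETCH);
	assert(iterate_fixed(true) == -ENOMEM_SKETCH);

	/* On success only the fixed variant returns zero. */
	assert(iterate_buggy(false) != 0);
	assert(iterate_fixed(false) == 0);
}
```

Calling run_demo() from a small main() is enough to exercise the sketch. The point it illustrates is that the two variants differ only on the success path, which is why the bug stays hidden until a caller treats the non-zero return value as an error.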
@nirmoy nirmoy marked this pull request as draft May 15, 2026 16:42
@github-actions

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ❌ Errors found

Details
Checking 1 commits...

Cherry-pick digest:
E: f3e134440913 ("NVIDIA: SAUCE: ovl: keep err zero after "): backport trailer order: MISSING: backporter SOB after (backported from)
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ f3e134440913 │ ovl: keep err zero after successful ovl_cache_get()              │ match      │ found   │ MISSING: backporter SOB a │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint: all checks passed.

nirmoy commented May 16, 2026

Boro review

Latest watcher review: open review

Head: f3e134440913

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.
