Skip to content

fix(detector): detect input float passthrough#242

Open
prasannakotyal wants to merge 1 commit intoSinatrasC:mainfrom
prasannakotyal:kg-blue-direct-input-float-red-327
Open

fix(detector): detect input float passthrough#242
prasannakotyal wants to merge 1 commit intoSinatrasC:mainfrom
prasannakotyal:kg-blue-direct-input-float-red-327

Conversation

@prasannakotyal
Copy link
Copy Markdown

Summary

  • add a hard detector for entrypoints that return an input tensor cast with .float() without real compute
  • allow imports, docstrings, and no-op contextlib.nullcontext() wrappers so the detector catches the direct and wrapped fake-output variants
  • register INPUT_PASSTHROUGH_OUTPUT as an auto-filter hard rule

Target

KernelGuard-Red-Submission: 327

Validation

  • UV_CACHE_DIR=/tmp/uvcache uv run python -m py_compile kernelguard.py
  • direct return data.float() sample: classification=hacked, should_filter=true, pattern INPUT_PASSTHROUGH_OUTPUT
  • annotated return data.float() sample: classification=hacked, should_filter=true, pattern INPUT_PASSTHROUGH_OUTPUT
  • contextlib.nullcontext() wrapped sample: classification=hacked, should_filter=true, pattern INPUT_PASSTHROUGH_OUTPUT
  • UV_CACHE_DIR=/tmp/uvcache uv run python ../../kernelguard_bypasses/eval_blue_patch.py kernelguard.py clean fixtures remain should_filter=False

@prasannakotyal prasannakotyal temporarily deployed to kernelguard-api-control-plane May 5, 2026 14:15 — with GitHub Actions Inactive
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

KernelGuard Blue Evaluation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant