fix: auto-precision for GPU/CPU should default to fp32, not fp16 by DingmaomaoBJTU · Pull Request #998 · microsoft/winml-cli

DingmaomaoBJTU · 2026-06-29T09:37:05Z

Problem

PR #872 introduced FP16 conversion as a quantization mode, but _AUTO_PRECISION mapped "gpu" and "cpu" to "fp16". This caused silent, unintended FP16 conversion on any GPU/CPU machine whenever --precision was not explicitly passed (e.g. winml eval, winml build with defaults).

On AMD machines (MIGraphX EP), this broke eval tests because MIGraphX received an FP16 model it wasn't expecting.

Fix

Changed _AUTO_PRECISION in src/winml/modelkit/config/precision.py:

"gpu": "fp16" → "gpu": "fp32"
"cpu": "fp16" → "cpu": "fp32"

FP16 conversion now only happens when the user explicitly passes --precision fp16.
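The effect of the mapping change can be sketched as follows. This is a minimal illustration, not the contents of `precision.py`: only the `_AUTO_PRECISION` name and the two changed entries come from this PR, while `resolve_precision` and `needs_fp16_conversion` are hypothetical helpers, and the real table may hold other devices that are not shown here.

```python
# Sketch only: the real table lives in src/winml/modelkit/config/precision.py
# and may contain more entries than the two this PR changes.
_AUTO_PRECISION = {
    "gpu": "fp32",  # was "fp16" before this PR
    "cpu": "fp32",  # was "fp16" before this PR
}


def resolve_precision(device: str, precision: str = "auto") -> str:
    """Hypothetical helper: return the effective precision for a device."""
    if precision != "auto":
        # An explicit --precision value always wins over the auto table.
        return precision
    return _AUTO_PRECISION[device]


def needs_fp16_conversion(device: str, precision: str = "auto") -> bool:
    """Hypothetical helper: True when the effective precision is fp16."""
    return resolve_precision(device, precision) == "fp16"
```

With the table above, `needs_fp16_conversion("gpu")` is false under the default, and only an explicit `precision="fp16"` turns conversion on.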

Testing

Updated affected unit tests in test_precision.py, test_build.py, test_build_onnx.py
All 331 unit tests pass

Fixes AMD eval failures introduced by #872.

Previously _AUTO_PRECISION mapped 'gpu' and 'cpu' to 'fp16', causing resolve_quant_compile_config to trigger an unintended FP16 model conversion whenever a user ran without --precision on a GPU/CPU machine (including AMD/MIGraphX). This broke eval tests because the model was silently converted.

Fix: change the mapping to 'fp32' (no-op) for both gpu and cpu. FP16 conversion now only happens when the user explicitly passes --precision fp16.

Fixes AMD eval failure reported against PR #872.

Add three e2e tests in TestConfigFlagVariations to guard against regression of the auto-precision GPU/CPU bug fixed in #998:

- test_cpu_auto_precision_no_quant: device=cpu + precision=auto must resolve to fp32 (no quant config), not fp16.
- test_gpu_auto_precision_no_quant: device=gpu + precision=auto must resolve to fp32 (no quant config); this is the case that broke AMD/MIGraphX.
- test_explicit_fp16_still_triggers_quant: --precision fp16 (explicit) must still produce an fp16 quant config, ensuring the fix didn't regress intentional FP16 conversion.

All 41 e2e config tests pass.
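The three regression tests could look roughly like the sketch below. The test names are taken from the commit message; everything else is an assumption. In particular, `resolve_config` is a hypothetical stand-in for the real resolver (the commit names `resolve_quant_compile_config`, but its signature is not shown), and the shape of the quant config is invented for illustration.

```python
# Stand-in for the real resolver: hypothetical name, signature, and return shape.
_AUTO_PRECISION = {"gpu": "fp32", "cpu": "fp32"}


def resolve_config(device, precision="auto"):
    """Return (effective_precision, quant_config or None)."""
    effective = _AUTO_PRECISION[device] if precision == "auto" else precision
    # fp32 is treated as a no-op: no quantization config is produced.
    quant = None if effective == "fp32" else {"mode": effective}
    return effective, quant


def test_cpu_auto_precision_no_quant():
    assert resolve_config("cpu", "auto") == ("fp32", None)


def test_gpu_auto_precision_no_quant():
    assert resolve_config("gpu", "auto") == ("fp32", None)


def test_explicit_fp16_still_triggers_quant():
    effective, quant = resolve_config("gpu", "fp16")
    assert effective == "fp16"
    assert quant == {"mode": "fp16"}
```

The third test matters as much as the first two: it checks that the default was changed without disabling the explicit FP16 path.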

## Problem

MIGraphX cannot compile FP16 models and hangs until timeout on AMD machines. Two tests that explicitly pass `--precision fp16` were triggering model compilation via MIGraphX, causing eval CI failures.

## Fix

Add `require_not_ep("migraphx")` guard to:

- `test_image_to_text_fp16`
- `test_compare_mode_image_classification`

Note: `test_precision_warning_for_prebuilt_onnx` is NOT guarded: it passes a pre-built ONNX, so `--precision fp16` is ignored and no compilation occurs.

Companion workaround for the AMD eval failures alongside #998.

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

DingmaomaoBJTU requested a review from a team as a code owner June 29, 2026 09:37

KayMKM approved these changes Jun 29, 2026

DingmaomaoBJTU merged commit f2a5464 into main Jun 29, 2026
9 checks passed

DingmaomaoBJTU deleted the dingmaomaobjtu-fix-auto-precision-gpu-cpu-default branch June 29, 2026 09:57

DingmaomaoBJTU mentioned this pull request Jun 30, 2026

fix(e2e): skip fp16 precision tests on MIGraphX EP #1002

Merged

DingmaomaoBJTU merged 2 commits into main from dingmaomaobjtu-fix-auto-precision-gpu-cpu-default
