[P3] FP8 dynamic scaling alignment (H100/SM90+) #176

Description

@Flink-ddd

Context: FP8 quantization introduces scaling drift between rollout and training.
Requirements: Align the dynamic scale calculation and truncation logic exactly across rollout and training for FP8 tensors on the Hopper architecture (H100, SM90+).
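
One way to read this requirement is that both paths should call a single shared routine for the scale and for the rounding/saturation step. The sketch below illustrates that idea in plain Python. It is not the project's implementation: the helper names (`round_to_e4m3`, `dynamic_scale`, `quantize_fp8`), the choice of the E4M3 format with a maximum of 448, per-tensor amax scaling, round-to-nearest-even, saturation on overflow, and the `eps` floor are all assumptions made for illustration.

```python
import math

# Assumption: E4M3 (finite-only variant), 3 mantissa bits, max finite value 448.
FP8_E4M3_MAX = 448.0
MIN_NORMAL_EXP = -6   # smallest normal exponent for E4M3 with bias 7
MANTISSA_BITS = 3


def round_to_e4m3(x: float) -> float:
    """Simulate casting a float to E4M3: saturate, then round to nearest even."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), FP8_E4M3_MAX)          # saturate instead of overflowing
    _, e = math.frexp(a)                   # a = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, MIN_NORMAL_EXP)       # clamp into the subnormal range
    step = 2.0 ** (exp - MANTISSA_BITS)    # spacing of representable values
    q = round(a / step) * step             # Python's round() is half-to-even
    return sign * min(q, FP8_E4M3_MAX)


def dynamic_scale(values, eps: float = 1e-12) -> float:
    """Per-tensor scale: amax / FP8 max. eps (hypothetical) guards all-zero input."""
    amax = max((abs(v) for v in values), default=0.0)
    return max(amax, eps) / FP8_E4M3_MAX


def quantize_fp8(values):
    """Shared entry point intended to be used by both rollout and training."""
    scale = dynamic_scale(values)
    return [round_to_e4m3(v / scale) for v in values], scale


def dequantize_fp8(quantized, scale):
    return [q * scale for q in quantized]


if __name__ == "__main__":
    activations = [1.0, -0.5, 0.25, 0.3]
    rollout_q, rollout_scale = quantize_fp8(activations)
    training_q, training_scale = quantize_fp8(activations)
    # Same helper on both sides, so the results match exactly.
    print(rollout_q == training_q, rollout_scale == training_scale)
```

Routing both paths through one function keeps choices such as dividing by the scale versus multiplying by its reciprocal, the zero-input floor, and the saturation rule in a single place, which are the kinds of small differences that could otherwise cause the two paths to disagree.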

Labels: feature; platform: cuda (Specific optimizations or bugs in NVIDIA graphics cards, such as FlashInfer, TMA optimizations); type: design (Issues requiring in-depth discussion of architecture design)
