Add support for new instructions #1

@keram88

Add support for newer instructions, should these become available. Currently these are:
f32x2 instructions:
add{.rnd}{.ftz}.f32x2
sub{.rnd}{.ftz}.f32x2
etc.

Mixed-precision instructions, which require sm >= 100:
https://docs.nvidia.com/cuda/parallel-thread-execution/#mixed-precision-floating-point-instructions-add

bf16 variants of the half-precision instructions, which require sm >= 90:
https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-add

OOB instructions (the .oob modifier), for example on fma:
https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-fma
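
One way to track the architecture requirements listed above is a small lookup table that gates each instruction family on the target's sm version. This is only a sketch: the family names and the `is_supported` helper are hypothetical and not part of this project. Only the two requirements stated in this issue are encoded; the f32x2 and OOB families are left out because their minimum sm is not given here.

```python
# Minimum sm version per instruction family, as stated in this issue.
# The keys are hypothetical labels chosen for illustration only.
MIN_SM = {
    "mixed_precision": 100,  # mixed-precision floating-point instructions
    "bf16_half": 90,         # bf16 variants of the half-precision instructions
}


def is_supported(family: str, sm: int) -> bool:
    """Return True if `family` is available on a target with sm version `sm`."""
    required = MIN_SM.get(family)
    if required is None:
        # Unknown families are an error rather than silently unsupported.
        raise KeyError(f"unknown instruction family: {family}")
    return sm >= required
```

For example, `is_supported("bf16_half", 90)` holds while `is_supported("mixed_precision", 90)` does not. Raising on unknown families keeps a missing table entry from being mistaken for an unsupported instruction.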
