Replace __cpuid_check for AVX512 with compiler builtin cpu features#630

Open
r-devulap wants to merge 2 commits into datastax:main from r-devulap:fix-avx512-runtimecheck

@r-devulap

Fixes #629.

This pull request simplifies the AVX-512 feature detection logic in jvector_simd_check.c by replacing manual CPUID bit checks with compiler builtins (available only on GCC and Clang/LLVM). The earlier code checked CPUID feature bits but never verified at runtime that the OS has enabled XSAVE-managed state for the AVX-512 registers. On VMs where saving 512-bit register state is disabled, this can result in #UD faults when executing AVX-512 instructions.
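For context, here is a rough sketch of the OS-level check that a manual CPUID approach would also need. It is not the code from this patch; `read_xcr0` and `os_enables_avx512_state` are hypothetical names used only for illustration. The idea is to confirm that CPUID leaf 1 reports OSXSAVE (ECX bit 27) and then read XCR0 with XGETBV to see whether the SSE, AVX, opmask, and both ZMM state components (mask 0xE6) are enabled.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Hypothetical helper: read extended control register 0 via XGETBV.
   Only safe to call once CPUID has reported OSXSAVE. */
static uint64_t read_xcr0(void) {
    uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: returns 1 if the OS saves and restores AVX-512 state. */
int os_enables_avx512_state(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    if (!(ecx & (1u << 27))) {            /* OSXSAVE not set */
        return 0;
    }
    /* XCR0 bits: 1 = SSE, 2 = AVX, 5 = opmask, 6 = ZMM_Hi256, 7 = Hi16_ZMM */
    const uint64_t required = 0xE6;
    return (read_xcr0() & required) == required;
}
#else
/* Non-x86 targets have no AVX-512 state to enable. */
int os_enables_avx512_state(void) { return 0; }
#endif
```

A complete manual check would combine this result with the individual AVX-512 CPUID feature bits, which is the bookkeeping the compiler builtins are meant to take over.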

CPU feature detection improvements:

  • Replaced manual CPUID register checks for AVX-512 features with calls to __builtin_cpu_supports for each required feature, improving code readability and portability.
  • Added a call to __builtin_cpu_init() to ensure proper initialization before feature checks.
  • Renamed check_compatibility to check_avx512_compatibility.
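As an illustration of the changes listed above, a minimal sketch of the builtin-based check might look like the following. The function name comes from this PR, but the signature and the exact feature list (avx512f, avx512cd, avx512bw, avx512dq, avx512vl) are assumptions made for the example and may not match what jvector_simd_check.c actually requires.

```c
#include <stdbool.h>

/* Sketch only: the return type and the set of required features are assumed. */
bool check_avx512_compatibility(void) {
#if defined(__x86_64__) || defined(__i386__)
    /* Populate the compiler runtime's CPU model data before querying it. */
    __builtin_cpu_init();

    /* Each argument must be a string literal naming a CPU feature. */
    return __builtin_cpu_supports("avx512f")
        && __builtin_cpu_supports("avx512cd")
        && __builtin_cpu_supports("avx512bw")
        && __builtin_cpu_supports("avx512dq")
        && __builtin_cpu_supports("avx512vl");
#else
    /* AVX-512 does not exist on non-x86 targets. */
    return false;
#endif
}
```

Both builtins are available in GCC and Clang only, which is why the description scopes the change to those compilers.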

Raghuveer Devulapalli added 2 commits February 16, 2026 04:32
Replaces manual CPUID bit parsing in jvector_simd_check.c with
__builtin_cpu_supports and adds __builtin_cpu_init(). This corrects
missing XSAVE runtime checks, preventing #UD exceptions on systems where
512‑bit register state saves are disabled.
@r-devulap
Author

Do we support building with MSVC? If so, neither __cpuid_check nor __builtin_cpu_supports works there, and both will have to be replaced with something else.

@ashkrisk
Contributor

Looks like only GCC is supported at the moment. Refer to jextract_vector_simd.h.

@r-devulap
Author

> Looks like only GCC is supported at the moment. Refer to jextract_vector_simd.h.

We seem to be testing on Windows CI, where the native AVX-512 module is not built. Do we want it to work on Windows, or is it not important enough to worry about?

@ashkrisk
Contributor

ashkrisk commented Feb 17, 2026

I don't think we should worry about it too much at the moment. If we want to support the native SIMD module on Windows there are more extensive changes needed anyway. If it seems important enough, maybe just add a comment over the sections that you feel might complicate future attempts at porting to Windows?

@r-devulap
Author

> I don't think we should worry about it too much at the moment. If we want to support the native SIMD module on Windows there are more extensive changes needed anyway.

Yeah, that's fine by me. I was just curious about MSVC support. The current approach of using __cpuid_check doesn't work on MSVC either, so not much will change with this patch.
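If MSVC support is ever attempted, one possible shape for a compiler-dispatched check is sketched below. This is not part of the patch, and `avx512f_usable` is a hypothetical name. The MSVC branch uses the `__cpuid`, `__cpuidex`, and `_xgetbv` intrinsics and has not been built or tested here; it checks only AVX512F for brevity.

```c
#include <stdbool.h>

#if defined(_MSC_VER) && !defined(__clang__)
#include <intrin.h>     /* __cpuid, __cpuidex */
#include <immintrin.h>  /* _xgetbv */

/* Untested MSVC sketch: manual CPUID plus XCR0 check. */
bool avx512f_usable(void) {
    int regs[4];                                   /* EAX, EBX, ECX, EDX */
    __cpuid(regs, 0);
    if (regs[0] < 7) return false;                 /* leaf 7 unavailable */
    __cpuid(regs, 1);
    if (!(regs[2] & (1 << 27))) return false;      /* OSXSAVE not set */
    if ((_xgetbv(0) & 0xE6) != 0xE6) return false; /* OS has not enabled ZMM state */
    __cpuidex(regs, 7, 0);
    return (regs[1] & (1 << 16)) != 0;             /* AVX512F feature bit */
}
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* GCC and Clang: let the compiler runtime do the detection. */
bool avx512f_usable(void) {
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") != 0;
}
#else
/* Unknown compiler or non-x86 target: report no AVX-512. */
bool avx512f_usable(void) { return false; }
#endif
```

As noted above, a real Windows port would need more extensive changes than this, so the sketch only marks where the detection code would diverge.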

@jshook
Contributor

jshook left a comment

I would like to see some evidence of testing here, even though this should just work. If it doesn't, then the risk goes up dramatically for downstream surprises.

@akash-shankaran

How does jvector decide today whether to load the native module? Is it based on the caller of the library (i.e. Astra or OpenSearch), or does the library decide for itself based on the machine environment?

If it is determined by the library, it would be good to include CPUID detection for ARM, as many of us use Macs for development and could take advantage of advanced vector instructions. The latter may be out of scope for this PR.

@r-devulap
Author

> If it is determined by the library, it would be good to include CPUID detection for ARM, as many of us use Macs for development and could take advantage of advanced vector instructions. The latter may be out of scope for this PR.

The vector code written using Panama API should already generate the appropriate vector instructions on an ARM CPU. The JVM handles both the runtime checks and JIT generation of NEON vector code.

AFAIK, JVector does not have any native ARM code and hence doesn't require any CPUID detection for ARM.

@akash-shankaran

> The vector code written using Panama API should already generate the appropriate vector instructions on an ARM CPU. The JVM handles both the runtime checks and JIT generation of NEON vector code.

Then what is the purpose of the JVector native module, if Panama already handles instruction generation for both ARM and x86?
My overarching question is: when is the Panama API used versus the JVector native module, and what is that decision based on?

> JVector does not have any native ARM code and hence doesn't require any CPUID detection for ARM.

If the decision to load Panama vs the JVector native module is based on JVector library logic, should we consider including ARM instructions in the native module as well?

@r-devulap
Author

> when is the Panama API vs JVector native module used, and what is it based on?

AFAIK, the routines in the native AVX-512 module can't be written using the Panama Vector API (I still need to determine the exact limitations; this is a work in progress). If that turns out to be true, then yes, we'll need to implement the ARM equivalents.


Development

Successfully merging this pull request may close these issues.

Missing OS level XSAVE check in AVX-512 runtime checks
