We use FBGEMM MoE kernels to optimise inference, but they are not natively available on Windows: they are built only for manylinux. As a workaround, we are disabling FBGEMM on Windows in #22, but we would prefer native ports of these kernels so that performance is on par across operating systems.
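
For illustration only, a platform gate of the kind described above could look roughly like the sketch below. It is not the change made in #22. The helper name `fbgemm_moe_available` is hypothetical, and the sketch assumes the kernels are provided by a package importable as `fbgemm_gpu`.

```python
import importlib.util
import sys


def fbgemm_moe_available(
    platform: str = sys.platform,
    module_name: str = "fbgemm_gpu",  # assumption: package that provides the kernels
) -> bool:
    """Hypothetical helper: decide whether to attempt the FBGEMM MoE path."""
    # Python reports Windows as "win32" (including 64-bit builds).
    if platform == "win32":
        return False
    # On other platforms, only use the kernels if the package is installed.
    return importlib.util.find_spec(module_name) is not None
```

A caller would fall back to its default (non-FBGEMM) MoE implementation whenever this returns `False`, which is where the performance gap on Windows would come from.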