Is your feature request related to a problem? Please describe.
Problem: there is no way to use all available memory (VRAM + RAM) at once.
Describe the solution you'd like
Hi, as far as I know, models can run on either the GPU or the CPU, but not both. Is there a plan to add mixed (GPU + CPU) inference? It would be great for MoE models, for example.