Please can you explain what the "FastTensors" option does or, better still, give it a name that actually corresponds to a parameter on Tabby API's model load endpoint?
Is this related to the tensor_parallel parameter or not? If not, how do I set that flag from this extension?