| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| name | str | Human-readable name for the inference endpoint. | |
| checkpoint_artifact_id | str | Artifact id of the checkpoint to serve. Must be a ``kind=checkpoint`` Artifact with ``status=ready`` belonging to the same org as the requester. | |
| instance_size | str | Saturn instance size to run the inference pod on. Must be a GPU-equipped size (gpu > 0). | |
| quantization | str | Optional vLLM quantization method. Restricted to calibration-free methods (``fp8``, ``int8``); these quantize on the fly with no calibration dataset. Calibration-requiring methods (``gptq``, ``awq``) are rejected. Omit (the default) to serve in the checkpoint's native precision (BF16). | [optional] |
| visibility | str | Route visibility enforced by ForwardAuth: ``org`` (any member of the endpoint's org may call it) or ``owner`` (only the owning identity and explicit ``viewers``). Defaults to ``org``. | [optional] [default to 'org'] |
| viewers | List[str] | Optional list of identity names (usernames or group names in the endpoint's org) granted access to the endpoint route in addition to the owner. Honored by ForwardAuth exactly like a normal deployment's viewers. | [optional] |
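
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch that uses only the standard library; `EndpointRequestSketch` is a hypothetical stand-in, not the generated `InferenceEndpointCreate` model, and the allowed values are assumptions copied from the descriptions above. The server remains the source of truth, and checks such as "GPU-equipped instance size" or "artifact is ready" cannot be done locally.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional

# Assumed from the field descriptions above; the server has the final say.
ALLOWED_QUANTIZATION = {"fp8", "int8"}
ALLOWED_VISIBILITY = {"org", "owner"}


@dataclass
class EndpointRequestSketch:
    """Hypothetical stand-in for InferenceEndpointCreate (illustration only)."""

    name: str
    checkpoint_artifact_id: str
    instance_size: str
    quantization: Optional[str] = None  # None means native precision
    visibility: str = "org"
    viewers: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject calibration-requiring methods (e.g. gptq, awq) early.
        if self.quantization is not None and self.quantization not in ALLOWED_QUANTIZATION:
            raise ValueError(f"unsupported quantization: {self.quantization!r}")
        if self.visibility not in ALLOWED_VISIBILITY:
            raise ValueError(f"unsupported visibility: {self.visibility!r}")

    def to_json(self) -> str:
        # Drop unset optional fields so the server applies its own defaults.
        payload = {k: v for k, v in asdict(self).items() if v not in (None, [])}
        return json.dumps(payload)


# All values below are placeholders, not real artifact ids or instance sizes.
request = EndpointRequestSketch(
    name="demo-endpoint",
    checkpoint_artifact_id="artifact-id-placeholder",
    instance_size="gpu-size-placeholder",
    quantization="fp8",
)
body = request.to_json()
print(body)
```

A string produced this way has the same field names as the table, so it could plausibly be passed to the generated model's `from_json`, though that round trip is not verified here.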

```python
from saturn_api.models.inference_endpoint_create import InferenceEndpointCreate

# TODO update the JSON string below
json = "{}"
# create an instance of InferenceEndpointCreate from a JSON string
inference_endpoint_create_instance = InferenceEndpointCreate.from_json(json)
# print the JSON string representation of the object
# (to_json is called on the instance, not on the class)
print(inference_endpoint_create_instance.to_json())

# convert the object into a dict
inference_endpoint_create_dict = inference_endpoint_create_instance.to_dict()
# create an instance of InferenceEndpointCreate from a dict
inference_endpoint_create_from_dict = InferenceEndpointCreate.from_dict(inference_endpoint_create_dict)
```