docs(rfc): add driver config passthrough proposal #1589
Signed-off-by: Evan Lezar <elezar@nvidia.com>
24ceb12 to 9ec6a42
> shape, but the nested Kubernetes schema should not be finalized from a single
> GPU resource example.
What do you mean by that? What does the GPU have to do with this case?
This is poorly worded and should be updated. The point is that:
- This is not intended as a mechanism to bypass resource requests that are exposed as first-class in the API (GPUs, CPU, memory).
- We should consider more use cases than just a non-standard resource request to drive the design of the API. We need to answer the question: what k8s-specific properties could a user want to set?
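The first point could be enforced with a small guard on the driver side. This is a minimal sketch, not part of the RFC: the set `FIRST_CLASS_RESOURCES`, the function name, and the choice of `nvidia.com/gpu` as the GPU resource name are all assumptions made for illustration.

```python
# Hypothetical set of resources that the API already exposes as first-class fields.
FIRST_CLASS_RESOURCES = {"cpu", "memory", "nvidia.com/gpu"}


def check_passthrough_resources(resources: dict[str, str]) -> None:
    """Raise ValueError if a passthrough resource shadows a first-class one."""
    shadowed = sorted(FIRST_CLASS_RESOURCES.intersection(resources))
    if shadowed:
        names = ", ".join(shadowed)
        raise ValueError(f"set {names} through the first-class API, not driver_config")


# A non-standard resource passes; a first-class one is rejected.
check_passthrough_resources({"example.com/fpga": "1"})
try:
    check_passthrough_resources({"cpu": "2"})
except ValueError as err:
    print(err)
```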
> This example is illustrative, not the final required schema.
>
> The Kubernetes driver should prefer raw Kubernetes resource names and
I would love to talk about the k8s implementation, but why bind it to this RFC? The k8s part is simply an implementation detail and other than a reference
I think we can get started on the k8s part in parallel to this RFC. I was initially just going to comment on or update your issue, but the content (after iterating through some design decisions) got to the point where I thought an RFC made more sense.
```json
{
  "driver_config": {
    "kubernetes": {
```
Regarding this and the later mentions, did you consider making the envelope field something more generic, e.g. compute_config, rather than the specific Kubernetes part?
The top-level "kubernetes" here maps to a concrete driver name. We are trying to add a mechanism for specifying driver-specific configs.
Could you provide an example of what you're expecting? What would you expect to be present in the compute_config?
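To make the keyed envelope concrete, here is a minimal sketch of how a sub-object could be selected by driver name. The function `select_driver_config` and the plain-dict template are hypothetical; the `runtimeClass` value is borrowed from the examples in this thread, and the real schema is not settled.

```python
from typing import Any


def select_driver_config(template: dict[str, Any], driver_name: str) -> dict[str, Any]:
    """Return the sub-object addressed to driver_name, or an empty dict."""
    envelope = template.get("driver_config", {})
    payload = envelope.get(driver_name, {})
    if not isinstance(payload, dict):
        raise TypeError(f"driver_config.{driver_name} must be an object")
    return payload


template = {"driver_config": {"kubernetes": {"runtimeClass": "foo"}}}
print(select_driver_config(template, "kubernetes"))  # {'runtimeClass': 'foo'}
print(select_driver_config(template, "podman"))      # {}
```

Only the key lookup depends on the driver name; the payload itself is returned untouched.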
I am asking if the concrete driver name is really important.
- Does the gateway need to know that it talks to a "kubernetes" driver?
- Do we expect to have multiple driver configurations nested (e.g. kubernetes and podman)?
- If not, is it not good enough to just dump the part inside `kubernetes` and skip the extra nesting? Maybe even consider a named field, e.g.

{
  "driver_config": {
    "type": "kubernetes", // just validation
    "config": {...} // driver only gets the value passed
  }
}

or

{
  "driver_config": {
    "runtimeClass": "foo" // directly passing the `driver_config`
    ...
  }
}

The compute_config thing did not help to convey the idea very well 😅
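One way to compare the shapes discussed here is to check what each one lets the receiver recover. The sketch below is illustrative only, and `from_tagged` is a hypothetical helper: it shows that the `type`/`config` form converts losslessly to the keyed form, while the flat form carries no driver name at all.

```python
def from_tagged(envelope: dict) -> dict:
    """Convert {"type": name, "config": {...}} into {name: {...}}."""
    return {envelope["type"]: envelope.get("config", {})}


tagged = {"type": "kubernetes", "config": {"runtimeClass": "foo"}}
keyed = {"kubernetes": {"runtimeClass": "foo"}}
flat = {"runtimeClass": "foo"}

# The tagged and keyed shapes are interchangeable for a single driver.
print(from_tagged(tagged) == keyed)  # True

# The flat shape has no field naming the target driver.
print("type" in flat or "kubernetes" in flat)  # False
```

With the flat shape, the receiver has to assume which driver the settings are meant for, which is the trade-off the question above is probing.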
I think the concrete driver name is important because this is what defines the spec for the allowed config. Although the content is arbitrary and opaque from the point of view of the gateway, the contents need to be understood by the driver itself. This also opens up support for a gateway being connected to multiple drivers in the future -- without REQUIRING it at this stage.
Note that although most (if not all) drivers are currently in-tree, it is reasonable to assume that third-party drivers could be written at some stage. Since these are not tied to the release cadence of the gateway itself, the config object can be used to allow users to set driver-specific options without aligning with the gateway. This also allows the OpenShell developers to further decouple the gateway from the driver if that makes sense.
Let me try to find better examples here -- possibly with a first PR for k8s.
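As a rough illustration of the split described above, the sketch below routes each payload by driver name and leaves validation to driver-owned code. Everything here is an assumption made for the example: the registry, `register_driver`, `forward`, and the single `runtimeClass` check do not come from the RFC or the OpenShell codebase.

```python
from typing import Any, Callable

Validator = Callable[[dict[str, Any]], None]

# Hypothetical registry: each driver, in-tree or third-party, registers its own validator.
DRIVER_VALIDATORS: dict[str, Validator] = {}


def register_driver(name: str, validator: Validator) -> None:
    DRIVER_VALIDATORS[name] = validator


def validate_kubernetes(config: dict[str, Any]) -> None:
    """Hypothetical driver-side check; the real schema is not settled here."""
    runtime_class = config.get("runtimeClass")
    if runtime_class is not None and not isinstance(runtime_class, str):
        raise ValueError("runtimeClass must be a string")


def forward(driver_config: dict[str, Any]) -> list[str]:
    """Gateway side: route each payload by name without inspecting its contents."""
    delivered = []
    for name, payload in driver_config.items():
        validator = DRIVER_VALIDATORS.get(name)
        if validator is None:
            raise KeyError(f"no driver registered for {name!r}")
        validator(payload)  # driver-owned logic, not gateway logic
        delivered.append(name)
    return delivered


register_driver("kubernetes", validate_kubernetes)
print(forward({"kubernetes": {"runtimeClass": "foo"}}))  # ['kubernetes']
```

Because the gateway only looks at the key, a driver could add or change its options without a matching gateway release.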
Since we are at it, would a
Summary
Add RFC 0005 proposing a generic `driver_config` passthrough for driver-owned sandbox creation settings.
Related Issue
Addresses #1492
Changes
`SandboxTemplate.driver_config` envelope and driver-side `DriverSandboxTemplate.driver_config` forwarding model.
Testing
- `mise run pre-commit` passes

`mise run pre-commit` passed locally. In the Codex shell, the default Nix-provided clang/SDK failed while compiling `aws-lc-sys`; the successful run used the Apple Command Line Tools compiler/linker environment explicitly.
This is a docs-only RFC PR, so no unit or E2E tests were added.
Checklist