-
Notifications
You must be signed in to change notification settings - Fork 125
Add Authentication RFD #330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
c598aa1
ba68e70
048526c
939ef11
3f57ae7
4bba4eb
07cf5a2
1b8c326
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,151 @@ | ||
| --- | ||
| title: "AUTHENTICATION" | ||
| --- | ||
|
|
||
| Author(s): [anna239](https://github.com/anna239) | ||
|
|
||
| ## Elevator pitch | ||
|
|
||
| > What are you proposing to change? | ||
|
|
||
| I suggest adding more information about auth methods that agent supports, which will allow clients to draw more appropriate UI. | ||
|
|
||
| ## Status quo | ||
|
|
||
| > How do things work today and what problems does this cause? Why would we change things? | ||
|
|
||
| Agents have different ways of authenticating users: env vars with api keys, running a command like `<agent_name> login`, some just open a browser and use oauth. | ||
| [AuthMethod](https://agentclientprotocol.com/protocol/schema#authmethod) does not really tell the client what should be done to authenticate. This means we can't show the user a control for entering key if an agent supports auth through env var. | ||
|
|
||
| ## What we propose to do about it | ||
|
|
||
| > What are you proposing to improve the situation? | ||
|
|
||
| We can add different types of AuthMethods, so that clients can show some UI for them. For example, this auth method | ||
|
|
||
| ```json | ||
| { | ||
| "id": "123", | ||
| "name": "OpenAI api key", | ||
| "description": "Provide your key", | ||
| "type": "env_var", | ||
| "varName": "OPEN_AI_KEY" | ||
| } | ||
| ``` | ||
|
|
||
| would allow client to prompt user to enter a key, then client can provide this key to an agent via `OPEN_AI_KEY` env variable. | ||
|
|
||
| ## Shiny future | ||
|
|
||
| > How will things will play out once this feature exists? | ||
|
|
||
| It will be easier for end-users to start using an agent from inside the IDE as auth process will be more straightforward | ||
|
|
||
| ## Implementation details and plan | ||
|
|
||
| > Tell me more about your implementation. What is your detailed implementation plan? | ||
|
|
||
| I suggest adding following auth types: | ||
|
|
||
| 1. Env variable | ||
|
|
||
| A user can enter a key and a client will pass it to the agent as an env variable | ||
|
|
||
| ```json | ||
| { | ||
| "id": "123", | ||
| "name": "OpenAI api key", | ||
| "description": "Provide your key", | ||
| "type": "env_var", | ||
| "varName": "OPEN_AI_KEY", | ||
| "link": "OPTIONAL link to a page where user can get their key", | ||
| "requiresRestart": true // default | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I understand correctly:
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My idea was that |
||
| } | ||
| ``` | ||
|
|
||
| 2. Start argument | ||
|
|
||
| A user can enter a key, and a client will pass it to the agent as a start parameter | ||
|
|
||
| ```json | ||
| { | ||
| "id": "123", | ||
| "name": "OpenAI api key", | ||
| "description": "Provide your key", | ||
| "type": "startParam", | ||
| "paramName": "OPEN_AI_KEY", | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this also be handled if we supported the full elicitation options?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it's more like env var, so this one also requires restart
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if we add |
||
| "link": "OPTIONAL link to a page where user can get their key", | ||
| "requiresRestart": true // default | ||
| } | ||
| ``` | ||
|
|
||
| 3. Agent auth | ||
|
|
||
| Same as what there is now – agent handles the auth itself, this should be a default type if no type is provided for backward compatibility | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe we should be explicit that Agent auth MUST not require a restart unless explicitly marked otherwise (due to some facet not covered by ENV like static file cold reload) |
||
|
|
||
| ```json | ||
| { | ||
| "id": "123", | ||
| "name": "Agent", | ||
| "description": "Authenticate through agent", | ||
| "type": "agent", | ||
| "requiresRestart": false // default, can be changed | ||
| } | ||
| ``` | ||
|
|
||
| 4. Terminal auth | ||
|
|
||
| What's now done with `terminal-auth` capability and \_meta field, for backward compatibility clients might want to read properties from both \_meta and the object itself. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think reading from _meta for now is just for clients who use it now. I don't think we want it in the spec
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we want to pull that behaviour from _meta to the top layer
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I just wonder if we need it in the spec that they should check the _meta field is all |
||
| This means that the client should run the provided command themselves in the terminal and let user interact with that terminal | ||
|
|
||
| ```json | ||
| { | ||
| "id": "123", | ||
| "name": "Run in terminal", | ||
| "type": "terminal", | ||
| "command": "/path/to/command", | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In zed, this is relative to the folder in which we extracted the agent... Basically it is an interesting question of what does this command refer to, and it should somehow be scoped to the agent itself likely i.e. Should we assume the command is the agent itself with different args?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do you suggest to pass a cmd option or a parameter?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure... basically there is the issue that the agent can't express an absolute path because it doesn't know where it has been downloaded or what other programs might be available (though I think they shouldn't be allowed to) So what is the right way (possibly just through documentation) of how an agent can express a command (hopefully itself) to be run with different args but if it is run via We have a bunch of caveats for all of this in zed extensions using it right now... I just wonder if there is a better way to express this that is portable (i.e. we can tell people we will make sure
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that passing a full path to the binary is okay, dancing with node/other runtime paths is error prone, I'd prefer to avoid it
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But how would the agent know the path? Would it be relative to unpacking the binary?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should remove this field and make it part of the spec that this is always the same binary as the agent |
||
| "args": ["--setup"], | ||
| "env": { "VAR1": "value1", "VAR2": "value2" }, | ||
| "label": "Setup Label", | ||
| "requiresRestart": true // default, can be changed | ||
| } | ||
| ``` | ||
|
|
||
| ### AuthErrors | ||
|
|
||
| It might be useful to include a list of AuthMethod ids to the AUTH_REQUIRED JsonRpc error. Why do we need this if they're already shared during `initialize`: | ||
| All supported auth methods are shared during `initialize`. When user starts a session, they've already selected a model, which can narrow down a list of options. | ||
|
|
||
| ```json | ||
| { | ||
| "jsonrpc": "2.0", | ||
| "id": 2, | ||
| "error": { | ||
| "code": -32000, | ||
| "message": "Authentication required", | ||
| "authMethods": [ | ||
| { | ||
| "id": "chatgpt", | ||
| "name": "Login with ChatGPT", | ||
| "description": "Use your ChatGPT login with Codex CLI (requires a paid ChatGPT subscription)" | ||
| } | ||
| ] | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ## Frequently asked questions | ||
|
|
||
| > What questions have arisen over the course of authoring this document or during subsequent discussions? | ||
|
|
||
| ### What alternative approaches did you consider, and why did you settle on this one? | ||
|
|
||
| An alternative approach would be to include this information to an agent's declaration making it more static, see [Registry RFD](https://github.com/agentclientprotocol/agent-client-protocol/pull/289) | ||
|
|
||
| There is also an alternative to adding a separate `elicitation` capability, which is to create a separate auth type for this. Then the client can decide themselves if they support it or not. | ||
|
|
||
| ## Revision history | ||
|
|
||
| <!-- If there have been major updates to this RFD, you can include the git revisions and a summary of the changes. --> | ||
|
|
||
| There was a part about elicitations https://github.com/agentclientprotocol/agent-client-protocol/blob/939ef116a1b14016e4e3808b8764237250afa253/docs/rfds/auth.mdx removed it for now, will move to a separate rfd | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: is it too late to add the env var since the process has already started?
Could we maybe just model this as a request for a text field that the user pastes the token into? and then the agent would store it somewhere like it usually does?
Or maybe you had an idea here that I am missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we'll have to restart the process indeed. If the agent is ok with just accepting a key in JsonRpc call, then it should declare 3. option —
Provided keyThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this method should rather be included in #289 ? i.e. static information known before startup?
But I guess we want to allow the user to choose? So by choosing this we'd restart the agent for them but at least they could choose if that is what they want from the options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that was my idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below may be too nuanced for the example, which could be best discussed as "ENV Only" "Env Supported" which happens.. Wherever there is support for a "restart only" config (args, env, files) it ideally is representable in the discovery/registry format in a primitive. In worst case we can just punt the credential scope as "restart" because people can always read the docs to figure out what that means.
I think this discussion ties into the #289 (registry) issue.. we need out-of-band info about auth when it implies a process restart. Also, in some cases there is a tiering. many agents have a env fallback to a file or some other process. Not to complicate this, but if something is strictly ENV only then it should be definable pre-start (registry) and if not, we shouldn't mistake the option for an ENV for only supporting ENV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I got your idea right, you suggest putting some auth options (env vars) to the registry, and some (oauth) would strill be returned by the agent during its work. I thought about it but decided against for the follwing reasons:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for responding. seems reasonable