fix: support SSH agent and passphrase-protected keys #10
CaddyGlow wants to merge 1 commit into ghostwright:main
Pull request overview
This PR improves SSH connectivity to Hetzner-provisioned hosts by supporting passphrase-protected SSH keys via ssh-agent, preserving unprotected key-file fallback, and adding per-attempt connection logging in the SSH wait loop.
Changes:
- Add SSH agent (SSH_AUTH_SOCK) support as the first authentication option in SSHConnect.
- Keep fallback to parsing unprotected ~/.ssh/id_ed25519 / ~/.ssh/id_rsa key files.
- Add attempt-by-attempt logging (including failure reasons) to WaitForSSH.
    signer, err := ssh.ParsePrivateKey(keyBytes)
    if err != nil {
        continue // passphrase-protected, skip — agent handles these
    }
The fallback key parsing currently continues on any ssh.ParsePrivateKey error, not just the expected “passphrase missing” case. This can hide real problems (corrupted key file, unsupported format, etc.) and lead to a confusing “no SSH auth available” error. Consider only skipping when the error is a passphrase-missing/encrypted-key error (e.g., via errors.As), and otherwise propagate or include the parse error in the returned error when no auth methods are usable.
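The distinction this comment asks for can be sketched as follows. To stay within the standard library, the sketch uses a local stand-in type, passphraseMissingError, in place of *ssh.PassphraseMissingError from golang.org/x/crypto/ssh. The keyFile type and the parseKey and loadSigners helpers are hypothetical and are not code from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// passphraseMissingError is a stand-in for *ssh.PassphraseMissingError
// (golang.org/x/crypto/ssh), so the sketch needs only the standard library.
type passphraseMissingError struct{}

func (*passphraseMissingError) Error() string {
	return "this private key is passphrase protected"
}

// keyFile is a hypothetical pairing of a key file name with its contents.
type keyFile struct{ name, contents string }

// parseKey is a hypothetical parser that fakes the three outcomes of interest:
// an encrypted key, an unreadable key, and a usable key.
func parseKey(contents string) (string, error) {
	switch contents {
	case "encrypted":
		return "", fmt.Errorf("parse: %w", &passphraseMissingError{})
	case "corrupt":
		return "", errors.New("ssh: no key found")
	}
	return "signer:" + contents, nil
}

// loadSigners skips encrypted keys (the agent is expected to cover those) and
// collects every other parse failure so it can be reported to the user.
func loadSigners(files []keyFile) (signers []string, parseErrs error) {
	for _, f := range files {
		signer, err := parseKey(f.contents)
		var missing *passphraseMissingError
		switch {
		case err == nil:
			signers = append(signers, signer)
		case errors.As(err, &missing):
			continue // encrypted key: leave it to the agent
		default:
			parseErrs = errors.Join(parseErrs, fmt.Errorf("%s: %w", f.name, err))
		}
	}
	return signers, parseErrs
}

func main() {
	files := []keyFile{{"id_ed25519", "encrypted"}, {"id_rsa", "corrupt"}}
	signers, err := loadSigners(files)
	fmt.Println(len(signers), err)
}
```

With this shape, an encrypted key stays silent while a corrupted or unsupported key produces an error that can be attached to the final "no SSH auth available" message.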
internal/hetzner/client.go
Outdated
    attempt := 0
    log.Printf("[ssh] waiting for SSH on %s", ip)
    for {
        select {
        case <-ctx.Done():
            log.Printf("[ssh] context cancelled after %d attempts for %s: %v", attempt, ip, ctx.Err())
            return nil, ctx.Err()
        default:
        }

        attempt++
        log.Printf("[ssh] attempt %d connecting to %s:22", attempt, ip)
        client, err := SSHConnect(ip)
        if err == nil {
            log.Printf("[ssh] connected to %s after %d attempts", ip, attempt)
            return client, nil
        }
        log.Printf("[ssh] attempt %d failed: %v", attempt, err)
WaitForSSH now logs every attempt using the standard library log package. This repository doesn’t appear to use log elsewhere, and WaitForSSH is called from Bubble Tea TUI flows (e.g., internal/tui/image_build.go / deploy_progress.go), where writing directly to stderr/stdout can corrupt the terminal UI. Consider moving these logs to the TUI layer, making logging optional via an injected logger, or gating it behind an env/config debug flag.
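One of the options named here, an injected logger, might look like the sketch below. It assumes a hypothetical waitFor function with a caller-supplied *log.Logger and connect callback; this is not the signature of WaitForSSH in the PR. A nil logger falls back to io.Discard, so nothing is written to the terminal unless the caller opts in.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"log"
	"os"
	"time"
)

// waitFor retries connect until it succeeds or ctx is done. All diagnostics go
// through logger, so the caller decides where (or whether) they are written.
func waitFor(ctx context.Context, logger *log.Logger, interval time.Duration, connect func() error) (int, error) {
	if logger == nil {
		logger = log.New(io.Discard, "", 0) // silent by default: safe inside a TUI
	}
	for attempt := 1; ; attempt++ {
		err := connect()
		if err == nil {
			logger.Printf("[ssh] connected after %d attempts", attempt)
			return attempt, nil
		}
		logger.Printf("[ssh] attempt %d failed: %v", attempt, err)
		select {
		case <-ctx.Done():
			return attempt, ctx.Err()
		case <-time.After(interval):
		}
	}
}

// flaky returns a hypothetical connect function that fails a fixed number of
// times before succeeding, standing in for SSHConnect on a booting host.
func flaky(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("connection refused")
		}
		return nil
	}
}

// demo runs the loop against a connector that fails twice.
func demo(logger *log.Logger) (int, error) {
	return waitFor(context.Background(), logger, 10*time.Millisecond, flaky(2))
}

func main() {
	// A CLI caller opts in to stderr logging; a TUI caller would pass nil.
	attempts, err := demo(log.New(os.Stderr, "", log.LstdFlags))
	fmt.Println(attempts, err)
}
```

The TUI layer could instead pass a logger backed by its own buffer and render the lines inside the Bubble Tea view.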
internal/hetzner/client.go
Outdated
    home, err := os.UserHomeDir()
    if err == nil {
        for _, name := range []string{"id_ed25519", "id_rsa"} {
            keyBytes, err := os.ReadFile(filepath.Join(home, ".ssh", name))
            if err != nil {
                continue
            }
            signer, err := ssh.ParsePrivateKey(keyBytes)
            if err != nil {
                continue // passphrase-protected, skip — agent handles these
            }
            authMethods = append(authMethods, ssh.PublicKeys(signer))
        }
    }

    signer, err := ssh.ParsePrivateKey(keyBytes)
    if err != nil {
        return nil, fmt.Errorf("error parsing SSH key: %w", err)
    if len(authMethods) == 0 {
        return nil, fmt.Errorf("no SSH auth available: set SSH_AUTH_SOCK or provide an unprotected key at ~/.ssh/id_ed25519 or ~/.ssh/id_rsa")
    }
os.UserHomeDir() errors are currently ignored (the code just skips key-file fallback). If the agent also isn’t available, the function returns the generic “no SSH auth available … ~/.ssh/…” message, which can be misleading when the real issue is that the home directory couldn’t be determined. Consider returning/wrapping the UserHomeDir error when it prevents all key-file discovery, or at least include it in the final error when authMethods is empty.
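A minimal sketch of the second suggestion, carrying the underlying causes into the final error, is shown below. The noAuthError helper and the errNoAgent value are hypothetical names introduced for illustration. The sketch relies on errors.Join (Go 1.20 or later), which drops nil arguments and returns nil when every argument is nil.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoAgent is a hypothetical sentinel for a missing agent socket.
var errNoAgent = errors.New("SSH_AUTH_SOCK is not set")

// noAuthError builds the error returned when no auth method is usable.
// Nil causes are dropped by errors.Join, so callers can pass every error they
// saw without checking each one first.
func noAuthError(causes ...error) error {
	if joined := errors.Join(causes...); joined != nil {
		return fmt.Errorf("no SSH auth available: %w", joined)
	}
	return errors.New("no SSH auth available")
}

func main() {
	var causes []error
	if os.Getenv("SSH_AUTH_SOCK") == "" {
		causes = append(causes, errNoAgent)
	}
	if _, err := os.UserHomeDir(); err != nil {
		causes = append(causes, fmt.Errorf("locating home directory: %w", err))
	}
	fmt.Println(noAuthError(causes...))
}
```

Because the causes are wrapped with %w, callers can still test for a specific one with errors.Is or errors.As.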
SSHConnect failed with "this private key is passphrase protected" for any user whose default key has a passphrase. Now tries SSH agent first (via SSH_AUTH_SOCK), then falls back to raw key files for unprotected keys.
- Only skip passphrase-protected keys (PassphraseMissingError), surface other parse errors (corrupted key, unsupported format)
- Surface agent dial errors and UserHomeDir errors in diagnostics when no auth methods are available
- Close agent socket on dial failure to prevent FD leaks in retry loop
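The last point, closing the agent socket on the failure path, can be sketched with the standard library alone. The withAgentConn helper and its use callback are hypothetical; in the PR, the callback's role would presumably be played by the SSH handshake that authenticates through the agent.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// withAgentConn dials the agent's Unix socket and passes the connection to use.
// If use fails, the socket is closed before returning so that a retry loop does
// not leak one file descriptor per attempt. On success the caller owns the
// returned connection and is responsible for closing it.
func withAgentConn(sockPath string, use func(net.Conn) error) (net.Conn, error) {
	if sockPath == "" {
		return nil, errors.New("SSH_AUTH_SOCK is not set")
	}
	conn, err := net.Dial("unix", sockPath)
	if err != nil {
		return nil, fmt.Errorf("dialing SSH agent at %s: %w", sockPath, err)
	}
	if err := use(conn); err != nil {
		conn.Close() // release the descriptor on the failure path
		return nil, err
	}
	return conn, nil
}

func main() {
	// Simulate a failed SSH dial: the agent socket (if any) is closed again.
	conn, err := withAgentConn(os.Getenv("SSH_AUTH_SOCK"), func(net.Conn) error {
		return errors.New("handshake failed")
	})
	fmt.Println(conn == nil, err)
}
```

Returning distinct errors for the unset variable and the failed dial also gives the caller something concrete to include in its diagnostics.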
e396525 to ec9a176
Fixed all suggestions from Copilot review:
Summary
- SSHConnect now tries SSH_AUTH_SOCK first, so passphrase-protected keys loaded in the agent work out of the box
- Unprotected key files (id_ed25519, id_rsa) are still used as a fallback
- WaitForSSH now logs each connection attempt and failure reason for easier troubleshooting
Problem
SSHConnect called ssh.ParsePrivateKey directly on key files, which fails immediately with "this private key is passphrase protected" for any user whose default SSH key has a passphrase. This caused WaitForSSH to retry indefinitely with the same error.