Skip to content

fix: support SSH agent and passphrase-protected keys#10

Open
CaddyGlow wants to merge 1 commit intoghostwright:mainfrom
CaddyGlow:fix/ssh-agent-passphrase
Open

fix: support SSH agent and passphrase-protected keys#10
CaddyGlow wants to merge 1 commit intoghostwright:mainfrom
CaddyGlow:fix/ssh-agent-passphrase

Conversation

@CaddyGlow
Copy link
Copy Markdown

Summary

  • SSH agent support: SSHConnect now tries SSH_AUTH_SOCK first, so passphrase-protected keys loaded in the agent work out of the box
  • Graceful fallback: Unprotected key files (id_ed25519, id_rsa) are still used as a fallback
  • Debug logging: WaitForSSH now logs each connection attempt and failure reason for easier troubleshooting

Problem

SSHConnect called ssh.ParsePrivateKey directly on key files, which fails immediately with "this private key is passphrase protected" for any user whose default SSH key has a passphrase. This caused WaitForSSH to retry indefinitely with the same error.

Copilot AI review requested due to automatic review settings March 31, 2026 13:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves SSH connectivity to Hetzner-provisioned hosts by supporting passphrase-protected SSH keys via ssh-agent, preserving unprotected key-file fallback, and adding per-attempt connection logging in the SSH wait loop.

Changes:

  • Add SSH agent (SSH_AUTH_SOCK) support as the first authentication option in SSHConnect.
  • Keep fallback to parsing unprotected ~/.ssh/id_ed25519 / ~/.ssh/id_rsa key files.
  • Add attempt-by-attempt logging (including failure reasons) to WaitForSSH.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +304 to +307
signer, err := ssh.ParsePrivateKey(keyBytes)
if err != nil {
continue // passphrase-protected, skip — agent handles these
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback key parsing currently continues on any ssh.ParsePrivateKey error, not just the expected “passphrase missing” case. This can hide real problems (corrupted key file, unsupported format, etc.) and lead to a confusing “no SSH auth available” error. Consider only skipping when the error is a passphrase-missing/encrypted-key error (e.g., via errors.As), and otherwise propagate or include the parse error in the returned error when no auth methods are usable.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Comment on lines +331 to +348
attempt := 0
log.Printf("[ssh] waiting for SSH on %s", ip)
for {
select {
case <-ctx.Done():
log.Printf("[ssh] context cancelled after %d attempts for %s: %v", attempt, ip, ctx.Err())
return nil, ctx.Err()
default:
}

attempt++
log.Printf("[ssh] attempt %d connecting to %s:22", attempt, ip)
client, err := SSHConnect(ip)
if err == nil {
log.Printf("[ssh] connected to %s after %d attempts", ip, attempt)
return client, nil
}
log.Printf("[ssh] attempt %d failed: %v", attempt, err)
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WaitForSSH now logs every attempt using the standard library log package. This repository doesn’t appear to use log elsewhere, and WaitForSSH is called from Bubble Tea TUI flows (e.g., internal/tui/image_build.go / deploy_progress.go), where writing directly to stderr/stdout can corrupt the terminal UI. Consider moving these logs to the TUI layer, making logging optional via an injected logger, or gating it behind an env/config debug flag.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Comment on lines 297 to 314
home, err := os.UserHomeDir()
if err == nil {
for _, name := range []string{"id_ed25519", "id_rsa"} {
keyBytes, err := os.ReadFile(filepath.Join(home, ".ssh", name))
if err != nil {
continue
}
signer, err := ssh.ParsePrivateKey(keyBytes)
if err != nil {
continue // passphrase-protected, skip — agent handles these
}
authMethods = append(authMethods, ssh.PublicKeys(signer))
}
}

signer, err := ssh.ParsePrivateKey(keyBytes)
if err != nil {
return nil, fmt.Errorf("error parsing SSH key: %w", err)
if len(authMethods) == 0 {
return nil, fmt.Errorf("no SSH auth available: set SSH_AUTH_SOCK or provide an unprotected key at ~/.ssh/id_ed25519 or ~/.ssh/id_rsa")
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.UserHomeDir() errors are currently ignored (the code just skips key-file fallback). If the agent also isn’t available, the function returns the generic “no SSH auth available … ~/.ssh/…” message, which can be misleading when the real issue is that the home directory couldn’t be determined. Consider returning/wrapping the UserHomeDir error when it prevents all key-file discovery, or at least include it in the final error when authMethods is empty.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

SSHConnect failed with "this private key is passphrase protected" for
any user whose default key has a passphrase. Now tries SSH agent first
(via SSH_AUTH_SOCK), then falls back to raw key files for unprotected
keys.

- Only skip passphrase-protected keys (PassphraseMissingError), surface
  other parse errors (corrupted key, unsupported format)
- Surface agent dial errors and UserHomeDir errors in diagnostics when
  no auth methods are available
- Close agent socket on dial failure to prevent FD leaks in retry loop
@CaddyGlow CaddyGlow force-pushed the fix/ssh-agent-passphrase branch from e396525 to ec9a176 Compare March 31, 2026 15:23
@CaddyGlow
Copy link
Copy Markdown
Author

Fixed all suggestions from Copilot review:

  • Agent socket leak: agent connection is now closed on dial failure to prevent FD leaks in the retry loop
  • Swallowing parse errors: only PassphraseMissingError is skipped; other parse errors (corrupted key, unsupported format) are surfaced in diagnostics
  • Silent agent dial failure: agent dial errors and UserHomeDir errors are now included in the final error message when no auth methods are available
  • Log output corrupting TUI: removed all log.Printf calls — the improved error messages in SSHConnect already surface the root cause

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants