Description
A 3008.x minion configured for multi-master fails to authenticate and exits the retry loop with SaltClientError: Failed to authenticate with the master after 7 attempts. The error surfaces wrapped in the outer "Unable to sign_in to master" SaltClientError raised by salt.channel.client.AsyncPubChannel.connect().
Affected versions
- 3008.x: emits the error and breaks out of the auth loop after auth_tries (default 7) attempts. Visible failure.
- 3006.x / 3007.x: under the same condition (sign_in() returning "retry" repeatedly), there is no outer-loop cap. The minion silently loops forever with exponential backoff up to acceptance_wait_time_max. No error log, no traceback, just a stuck minion.
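The backoff pattern described for 3006.x / 3007.x can be illustrated with a minimal sketch. It assumes the wait doubles after each failed attempt until it reaches the ceiling; the function name backoff_schedule and the values 10 and 60 are hypothetical and are not taken from Salt's source or defaults.

```python
def backoff_schedule(initial_wait, max_wait, attempts):
    """Return the wait (in seconds) used before each retry.

    Assumption for this sketch: the wait doubles after every failed
    attempt and is clamped to max_wait, which plays the role of
    acceptance_wait_time_max in the description above.
    """
    waits = []
    wait = initial_wait
    for _ in range(attempts):
        waits.append(wait)
        wait = min(wait * 2, max_wait)
    return waits


# Illustrative values only: start at 10 s, clamp at 60 s.
print(backoff_schedule(10, 60, 5))  # [10, 20, 40, 60, 60]
```

Once the ceiling is reached, the wait stays constant, so without an outer cap the loop keeps retrying at that interval indefinitely.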
The auth_tries outer cap was added on 3008.x in bcde0577d7c (originally 68c16baeb73, "Improve salt-ssh relenv/thin parity and fix various regressions"). On 3006.x/3007.x, auth_tries is still defined (default 7) but only consumed inside sign_in() as the per-network-send retry count passed to channel.send(...). It is not used to terminate the outer creds == "retry" loop in _authenticate().
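The difference between the two behaviors can be sketched as follows. This is a simplified, synchronous illustration and not the code in salt/crypt.py: the names authenticate, AuthError, cap_outer_loop, and max_iterations are hypothetical, and max_iterations exists only so that the uncapped variant terminates in the example.

```python
class AuthError(Exception):
    """Stand-in for salt.exceptions.SaltClientError in this sketch."""


def authenticate(sign_in, auth_tries=7, cap_outer_loop=True, max_iterations=1000):
    """Call sign_in() until it stops returning "retry".

    cap_outer_loop=True models the 3008.x behavior described above:
    give up after auth_tries attempts and raise.
    cap_outer_loop=False models 3006.x / 3007.x: keep looping.
    max_iterations is a safety net for this sketch only.
    """
    attempts = 0
    while attempts < max_iterations:
        creds = sign_in()
        attempts += 1
        if creds != "retry":
            return creds, attempts
        if cap_outer_loop and attempts >= auth_tries:
            raise AuthError(
                f"Failed to authenticate with the master after {auth_tries} attempts"
            )
    # Only reachable in the uncapped variant, where a real minion would keep spinning.
    return "retry", attempts
```

With a sign_in callable that always returns "retry", the capped variant raises after seven calls, while the uncapped variant only stops because of the artificial max_iterations limit.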
Symptom on 3008.x
2026-06-12T19:52:58.914Z ERROR salt-minion 3987900 [salt@4413] salt.minion: Error while bringing up minion for multi-master. Is master at vsp-instance.vcf.nimbus.internal responding? The error message was Unable to sign_in to master: Failed to authenticate with the master after 7 attempts
Traceback (most recent call last):
File "/opt/saltstack/salt/lib/python3.14/site-packages/salt/channel/client.py", line 463, in connect
await self.auth.authenticate()
salt.exceptions.SaltClientError: Failed to authenticate with the master after 7 attempts
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/saltstack/salt/lib/python3.14/site-packages/salt/minion.py", line 1341, in _connect_minion
await minion.connect_master(failed=failed)
File "/opt/saltstack/salt/lib/python3.14/site-packages/salt/minion.py", line 1680, in connect_master
master, self.pub_channel = await self.eval_master(
^^^^^^^^^^^^^^^^^^^^^^^
self.opts, self.timeout, self.safe, failed
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/opt/saltstack/salt/lib/python3.14/site-packages/salt/minion.py", line 1000, in eval_master
await pub_channel.connect()
File "/opt/saltstack/salt/lib/python3.14/site-packages/salt/channel/client.py", line 483, in connect
raise salt.exceptions.SaltClientError(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
f"Unable to sign_in to master: {exc}"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
) # TODO: better error message
Two distinct issues
- Diagnose why sign_in() keeps returning "retry" for this minion against vsp-instance.vcf.nimbus.internal (master keys, master AES rotation, key acceptance state, etc.). The above traceback alone does not identify the cause; master logs around the same timestamps are needed.
- Backport the outer-loop cap to 3006.x and 3007.x. Today those branches silently spin on the same condition with no operator-visible error. Whatever the root cause turns out to be, having a clear bail-out is the right behavior for all maintained branches.
Salt install type / version
Official package; 3008.x (Python 3.14 onedir, judging by /opt/saltstack/salt/lib/python3.14/...).
Reference
_authenticate() loop with the cap: https://github.com/saltstack/salt/blob/3008.x/salt/crypt.py