
postdeploy hook fails with exit 3 on first deploy (race with function host startup for connector_extension system key) #14

@paulyuk

Summary

On a fresh azd up, the postdeploy hook (infra/scripts/configure-trigger.sh) frequently fails with exit code 3 because it tries to read the Function App's connector_extension system key before the function host has finished starting and registering it.

Repro

  1. Fresh clone, fresh azd env new, fresh azd up into an empty resource group.
  2. Provision + deploy succeed.
  3. Postdeploy hook fires immediately and fails:
ERROR: step "cmdhook-postdeploy" failed: 'postdeploy' hook failed with exit code: '3',
Path: '/.../infra/scripts/configure-trigger.sh'.

The script emits (from lines 49–54):

ERROR: connector_extension system key not found on <function-app>.
The function host must start once with the connector-triggered agent before the key is created.

Root cause

Race between two async events:

  1. ARM marks the Function App resource Done (what azd waits on).
  2. The function host inside the app cold-starts, loads the connector-triggered binding, and registers the connector_extension system key.

The postdeploy hook runs as soon as (1) completes. On a cold first deploy, (2) hasn't happened yet, so az functionapp keys list --query systemKeys.connector_extension returns empty and the script exits 3.

Verified: ~45 seconds later the key was present, and azd hooks run postdeploy succeeded on the first retry with no other changes.
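To size the retry budget from more than one data point, the delay can be measured directly after azd up finishes. The following is a rough sketch, not something in the repo: the function name time_until_key and its default timeout and interval are hypothetical, and it relies only on the same az functionapp keys list query the hook already uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): poll until
# systemKeys.connector_extension is non-empty and report the elapsed seconds.
# Args: <resourceGroup> <functionAppName> [timeoutSeconds=300] [intervalSeconds=5]
time_until_key() {
  local resourceGroup="$1" functionAppName="$2"
  local timeout="${3:-300}" interval="${4:-5}"
  local start=$SECONDS key=""
  while (( SECONDS - start < timeout )); do
    # Empty output means the key is not registered yet (or az failed transiently).
    key=$(az functionapp keys list -g "$resourceGroup" -n "$functionAppName" \
      --query "systemKeys.connector_extension" -o tsv 2>/dev/null || true)
    if [[ -n "$key" ]]; then
      echo "key present after $(( SECONDS - start ))s"
      return 0
    fi
    sleep "$interval"
  done
  echo "key still missing after ${timeout}s" >&2
  return 1
}
```

Usage, assuming the resource group and app name are already known: time_until_key "$resourceGroup" "$functionAppName"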

Suggested fix

Add a bounded retry loop around the system-key fetch in configure-trigger.sh (and the .ps1 sibling). Something like:

echo -e "${CYAN}Fetching connector_extension system key for ${functionAppName}...${NC}"
connectorKey=""
maxAttempts=10
delaySeconds=15
for ((i = 1; i <= maxAttempts; i++)); do
  # Empty output means the host has not registered the key yet (or az failed transiently).
  connectorKey=$(az functionapp keys list -g "$resourceGroup" -n "$functionAppName" \
    --query "systemKeys.connector_extension" -o tsv 2>/dev/null || echo "")
  [[ -n "$connectorKey" ]] && break
  # Skip the wait after the final attempt so a real failure is reported without extra delay.
  (( i == maxAttempts )) && break
  echo -e "${YELLOW}  attempt $i/$maxAttempts: key not registered yet, waiting ${delaySeconds}s for host start...${NC}"
  sleep "$delaySeconds"
done

if [[ -z "$connectorKey" ]]; then
  echo -e "${RED}ERROR: connector_extension system key not found on ${functionAppName} after ${maxAttempts} attempts.${NC}" >&2
  echo -e "${RED}       Wait a minute and re-run: azd hooks run postdeploy${NC}" >&2
  exit 3
fi

A 10×15s loop (~2.5 min) leaves ample headroom over the ~45s observed above for a Flex Consumption cold start, while still failing within a bounded time if something is actually broken (e.g., a missing binding, or a deploy that didn't ship the connector-triggered function).

Optional belt-and-suspenders: nudge the host before polling — e.g., az functionapp restart followed by a short sleep — but the polling alone should be enough.
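The optional nudge could look roughly like the sketch below. The helper name nudge_host and the 10s settle default are hypothetical, and whether a restart actually speeds up key registration on Flex Consumption is an assumption that has not been verified here. A failed restart is treated as non-fatal so that polling still gets a chance to succeed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): restart the Function App so the
# host starts before polling begins, then pause briefly.
# Args: <resourceGroup> <functionAppName> [settleSeconds=10]
nudge_host() {
  local resourceGroup="$1" functionAppName="$2" settle="${3:-10}"
  if ! az functionapp restart -g "$resourceGroup" -n "$functionAppName" >/dev/null 2>&1; then
    # Do not abort the hook here; the retry loop decides success or failure.
    echo "WARN: restart of ${functionAppName} failed; continuing to poll anyway" >&2
  fi
  sleep "$settle"
}
```

It would be called once, immediately before the retry loop: nudge_host "$resourceGroup" "$functionAppName"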

Workaround today

# wait 30–60s after azd up reports done, then:
azd hooks run postdeploy

Environment

  • macOS, azd 1.25.6
  • azure-functions-core-tools v4
  • Fresh rg-m365-test1, default region

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
