FlightCheck: Check Workday SOAP/REST integration certificate validity…#127

Merged
apurvabanka merged 14 commits into main from fearture/flightcheck-manual-checklist-workday
Jun 9, 2026
@apurvabanka apurvabanka commented Jun 5, 2026

Contributor

Checkpoint: WD-CONN-102
Category: Workday
Priority: High
Status emitted:

  • MANUAL on the healthy path: the operator compares the Entra thumbprint against the matching Workday "Edit Tenant Setup - Security → SAML Identity Providers" row.
  • FAILED: no AsymmetricX509Cert keyCredentials, all certificates expired, or the active selection expired while a rollover certificate is live.
  • WARNING: active cert within CERT_EXPIRY_WARN_DAYS=30 of NotAfter, NotBefore in the future, or a 401/403 consent error.
  • NOT_CONFIGURED: no federated Workday SAML enterprise app.
  • SKIPPED: Graph unavailable.
Conditional? No — runs before the no-Workday early-return gate in run_workday_checks so the warning fires pre-install.
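
For reviewers who want the status rules in one place, here is a minimal sketch of the classification as a pure function. It is not the code in checks/workday.py: the `Cert` dataclass, the `classify_active_cert` name, and the exact boundary comparisons are assumptions made for illustration. Only the status names and CERT_EXPIRY_WARN_DAYS=30 come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

CERT_EXPIRY_WARN_DAYS = 30  # value quoted in the PR description


@dataclass(frozen=True)
class Cert:
    """Hypothetical stand-in for one logical signing certificate."""
    not_before: datetime
    not_after: datetime


def classify_active_cert(active, all_certs, now):
    """Map Entra-side certificate state to a status string (illustrative only)."""
    if not all_certs:
        return "FAILED"  # no AsymmetricX509Cert keyCredentials at all
    if all(c.not_after <= now for c in all_certs):
        return "FAILED"  # every certificate has expired
    if active.not_after <= now:
        return "FAILED"  # active selection expired while another cert is unexpired
    if active.not_before > now:
        return "WARNING"  # not yet valid
    if active.not_after - now <= timedelta(days=CERT_EXPIRY_WARN_DAYS):
        return "WARNING"  # inside the expiry warning window
    return "MANUAL"  # healthy on the Entra side; operator compares thumbprints


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    cert = Cert(now - timedelta(days=100), now + timedelta(days=10))
    print(classify_active_cert(cert, [cert], now))
```

NOT_CONFIGURED and SKIPPED are left out on purpose: per the description they are decided before any certificate is inspected.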

Why MANUAL is the right shape on the healthy path

ESS Workday connectors use Basic auth + Entra SSO; there is no certificate stored on the Power Platform connection record. The cert the spec's failure mode describes is the SAML signing cert on the federated Workday enterprise app's servicePrincipal (reachable via Microsoft Graph). The matching cert uploaded to Workday's tenant security setup has no Workday API surface (SOAP RaaS / Worker services don't expose tenant security config; WQL admin is gated by the documented chicken-and-egg blocker). AGENTS.md design principle #2 introduces MANUAL specifically for this case — canonical reference AUTH-006, commit 53f3762, fixes #84. WD-CONN-102 reuses the pattern with the addition that Entra-side data is rich enough to programmatically classify expiry/missing/all-expired states into FAILED/WARNING directly.

API tier

Microsoft Graph v1.0 — validatable tier per tests/fixtures/cassettes/INDEX.md. No cassette / no registry changes. Extends existing tests/mocks/graph.py service_principal() with key_credentials and preferred_token_signing_key_thumbprint kwargs, adds a key_credential() builder for the keyCredential complex type — both CSDL-cited. Docstrings updated to list WD-CONN-102 as a consumer.

What the check observes vs. delegates

Entra side (observed via Microsoft Graph):

  • Every federated Workday enterprise app — same SAML filter as AUTH-006: startswith(displayName,'Workday') and preferredSingleSignOnMode eq 'saml'.
  • keyCredentials filtered to type == "AsymmetricX509Cert", with Sign + Verify entries sharing a customKeyIdentifier coalesced into one logical cert.
  • preferredTokenSigningKeyThumbprint to identify the active cert during rollover windows.
  • customKeyIdentifier decoded from base64 → 20-byte SHA-1 digest → colon-uppercase-hex (matches Workday's display format).
  • startDateTime / endDateTime parsed as UTC, days-to-expiry computed against datetime.now(timezone.utc).
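
The thumbprint rendering and the days-to-expiry math listed above can be sketched as follows. This is an illustration under assumptions, not the helper code from this PR: `format_thumbprint` and `days_until` are hypothetical names, and treating any decoded value that is not exactly 20 bytes as malformed is a choice made for this sketch.

```python
import base64
import binascii
from datetime import datetime, timezone


def format_thumbprint(custom_key_identifier):
    """Decode a base64 customKeyIdentifier into colon-separated uppercase hex."""
    try:
        raw = base64.b64decode(custom_key_identifier, validate=True)
    except (binascii.Error, ValueError, TypeError):
        return "(malformed)"  # never let a bad value crash the check
    if len(raw) != 20:  # a SHA-1 digest is exactly 20 bytes
        return "(malformed)"
    return ":".join(f"{byte:02X}" for byte in raw)


def days_until(end_date_time, now):
    """Days from `now` to a Graph ISO-8601 UTC timestamp (timedelta.days, floored)."""
    end = datetime.fromisoformat(end_date_time.replace("Z", "+00:00"))
    return (end - now).days


if __name__ == "__main__":
    sample = base64.b64encode(bytes(range(20))).decode("ascii")
    print(format_thumbprint(sample))
    print(days_until("2030-01-01T00:00:00Z", datetime.now(timezone.utc)))
```

Replacing the trailing `Z` before calling `datetime.fromisoformat` keeps the sketch working on Python versions that do not accept the `Z` suffix directly.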

Workday side (MANUAL operator):

  • Open "Edit Tenant Setup - Security" → "SAML Identity Providers", match the enabled IdP row's "Service Provider ID" against the entity IDs surfaced in the result to identify the active local Entra app.
  • View that row's X509 Certificate thumbprint, compare byte-for-byte against the active Entra thumbprint. Mismatch → end-user SAML SSO silently broken (ISU runtime calls still work, so the agent appears healthy).

Pattern decisions

  • $select forced on the listing call. Graph GET /servicePrincipals omits keyCredentials unless explicitly projected — without it, every healthy tenant would falsely report "no signing certificate." New GraphClient.get_workday_saml_service_principals() wraps the call with WORKDAY_SAML_SP_SELECT; pinned by TestSelectClauseInRequestUrl.
  • Sign+Verify coalescing. Single uploaded cert appears as two keyCredentials entries sharing customKeyIdentifier; _group_workday_cert_keycredentials() groups by CKI and prefers the Sign entry's metadata.
  • Active-cert selection during rollover. preferredTokenSigningKeyThumbprint is canonical; fallback to first cert whose [start, end] contains now; last-resort fallback to latest endDateTime.
  • Active expired with rollover live → FAILED, not WARNING. Operator forgot to flip the active selection after rotating; surfaced with explicit "rollover certificate exists" hint.
  • NotBefore in future → WARNING ("not yet valid for N more days").
  • Status bucketing for multi-SP tenants (AGENTS.md principle 7). At most one CheckResult per non-empty bucket; each lists only its bucket's SPs.
  • Permission probe pattern inherited from AUTH-006. Missing Application.Read.All surfaces as WARNING, never silent NOT_CONFIGURED.
  • Placement before the no-Workday early-return gate. Cert can be unhealthy pre-install; pinned by TestWireup.
  • Helper localization with lazy import. Cert helpers are local to checks/workday.py; _saml_entity_ids imported lazily from checks/authentication.py to avoid a circular import while reusing AUTH-006's tested filter.
  • Malformed customKeyIdentifier is not fatal. Renders as (malformed) rather than crashing.
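
The Sign+Verify coalescing and the three-step active-cert fallback described above could look roughly like this. It is a sketch, not `_group_workday_cert_keycredentials()` itself: `LogicalCert`, `group_by_cki`, and `select_active` are hypothetical names, and it assumes the caller has already normalised the preferred thumbprint and each customKeyIdentifier to the same string form so they can be compared directly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogicalCert:
    """Hypothetical merged view of a Sign/Verify keyCredentials pair."""
    cki: str
    start: datetime
    end: datetime


def _parse(timestamp):
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def group_by_cki(key_credentials):
    """Collapse entries sharing customKeyIdentifier, preferring Sign metadata."""
    chosen = {}
    for kc in key_credentials:
        if kc.get("type") != "AsymmetricX509Cert":
            continue  # ignore non-certificate credentials
        cki = kc.get("customKeyIdentifier")
        if cki not in chosen or kc.get("usage") == "Sign":
            chosen[cki] = kc
    return [
        LogicalCert(cki, _parse(kc["startDateTime"]), _parse(kc["endDateTime"]))
        for cki, kc in chosen.items()
    ]


def select_active(certs, preferred_id, now):
    """Preferred thumbprint first, then first currently valid, then latest end."""
    if preferred_id is not None:
        for cert in certs:
            if cert.cki == preferred_id:
                return cert
    for cert in certs:
        if cert.start <= now <= cert.end:
            return cert
    return max(certs, key=lambda c: c.end, default=None)


if __name__ == "__main__":
    creds = [
        {"type": "AsymmetricX509Cert", "usage": "Verify", "customKeyIdentifier": "A",
         "startDateTime": "2025-01-01T00:00:00Z", "endDateTime": "2027-01-01T00:00:00Z"},
        {"type": "AsymmetricX509Cert", "usage": "Sign", "customKeyIdentifier": "A",
         "startDateTime": "2025-01-01T00:00:00Z", "endDateTime": "2027-01-01T00:00:00Z"},
    ]
    print(select_active(group_by_cki(creds), None, datetime.now(timezone.utc)))
```

Note that the preferred thumbprint wins even when that certificate has expired; that is what lets the check report "active expired with rollover live" instead of silently picking the newer cert.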

Tests

tests/flightcheck/checks/test_workday_saml_certificate.py covers:

  • TestNotConfigured, TestPermissionDenied, TestNoGraphClientSkipped — defensive paths.
  • TestHealthyCertManual — MANUAL with thumbprint, NotAfter math, entity IDs, doc_link.
  • TestSignVerifyCoalescing — pair with same CKI collapses to one logical cert.
  • TestPreferredThumbprintSelectsActive — disambiguation during rollover, rollover: label on the others.
  • TestExpiringSoonWarning, TestNotYetValidWarning — WARNING with hardening framing (principle 9).
  • TestNoCertsFailed, TestAllExpiredFailed — FAILED with explicit reason in result text.
  • TestBucketing — 3 SPs in 3 statuses → 3 CheckResults, each listing only its bucket's SPs.
  • TestSelectClauseInRequestUrl — regression guard for the $select bug.
  • TestMalformedThumbprintIsNotFatal — malformed CKI doesn't crash the Workday block.
  • TestWireup — WD-CONN-102 fires pre-gate; downstream Workday checks do NOT fire (gate still works).
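
As a rough picture of what TestBucketing asserts, the sketch below groups per-SP statuses so that each non-empty status yields exactly one result. The dict shape and the `bucket_by_status` name are hypothetical; the real CheckResult type and its fields are not shown in this description.

```python
from collections import defaultdict


def bucket_by_status(sp_statuses):
    """Group (display_name, status) pairs so each status yields one result."""
    buckets = defaultdict(list)
    for name, status in sp_statuses:
        buckets[status].append(name)
    # One result per non-empty bucket, each naming only its own SPs.
    return [
        {"checkpoint": "WD-CONN-102", "status": status, "service_principals": names}
        for status, names in buckets.items()
    ]


if __name__ == "__main__":
    results = bucket_by_status([
        ("Workday Prod", "MANUAL"),
        ("Workday Sandbox", "WARNING"),
        ("Workday Legacy", "FAILED"),
    ])
    for result in results:
        print(result["status"], result["service_principals"])
```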

remediation-guide.md and validation-matrix.md updated. Local suite: N passed, M skipped, 0 regressions (fill in after pytest). ruff check clean on touched files.

@apurvabanka

Contributor Author

@microsoft-github-policy-service agree company="Microsoft"

Comment thread solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py Outdated
Comment thread solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py
Comment thread solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py Outdated
Comment thread tests/flightcheck/checks/test_workday_saml_certificate.py
Comment thread tests/flightcheck/checks/test_workday_saml_certificate.py Outdated

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a new FlightCheck checkpoint (WD-CONN-102) to assess Workday SAML signing certificate health by reading Workday SAML enterprise app servicePrincipal certificate metadata from Microsoft Graph, emitting MANUAL/WARNING/FAILED/NOT_CONFIGURED/SKIPPED as appropriate, and documenting the operator-side Workday thumbprint comparison step.

Changes:

  • Extend the Graph client to support $select projection on /servicePrincipals and add a Workday SAML-specific listing helper that always projects certificate-related fields.
  • Add WD-CONN-102 implementation to checks/workday.py (cert grouping, active-cert selection, bucketing) and wire it to run before the “no Workday integration” early return.
  • Add Graph mocks + a comprehensive end-to-end test suite for WD-CONN-102, and update the FlightCheck validation matrix + remediation guide docs.
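
To make the $select change concrete, here is a sketch of how such a listing URL might be assembled. The filter string is the one quoted in the PR description; the field list, `SELECT_FIELDS`, and `build_service_principals_url` are assumptions for illustration, and the real WORKDAY_SAML_SP_SELECT constant may differ. The sketch only builds the URL and does not cover authentication, request headers, or paging.

```python
from urllib.parse import quote, urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"

# Hypothetical field list; the real WORKDAY_SAML_SP_SELECT may differ.
SELECT_FIELDS = (
    "id",
    "appId",
    "displayName",
    "preferredSingleSignOnMode",
    "keyCredentials",
    "preferredTokenSigningKeyThumbprint",
)

# Filter quoted in the PR description (same as AUTH-006).
WORKDAY_SAML_FILTER = (
    "startswith(displayName,'Workday') and preferredSingleSignOnMode eq 'saml'"
)


def build_service_principals_url(filter_expr=WORKDAY_SAML_FILTER, select=SELECT_FIELDS):
    """Build a /servicePrincipals listing URL that always projects cert fields."""
    query = urlencode(
        {"$filter": filter_expr, "$select": ",".join(select)},
        quote_via=quote,   # encode spaces as %20 rather than '+'
        safe="$,'()",      # keep OData punctuation readable
    )
    return f"{GRAPH_BASE}/servicePrincipals?{query}"


if __name__ == "__main__":
    print(build_service_principals_url())
```

A regression test in the spirit of TestSelectClauseInRequestUrl would then only need to assert that `$select=` and `keyCredentials` appear in the requested URL.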

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/mocks/graph.py | Extends the service principal mock with keyCredentials + preferredTokenSigningKeyThumbprint and adds a key_credential() builder. |
| tests/flightcheck/checks/test_workday_saml_certificate.py | New integration-style tests covering WD-CONN-102 statuses, rollover selection, bucketing, and request $select behavior. |
| solutions/ess-maker-skills/src/reference/ess-docs/flightcheck/validation-matrix.md | Documents the new WD-CONN-102 checkpoint in the validation matrix. |
| solutions/ess-maker-skills/src/reference/ess-docs/flightcheck/remediation-guide.md | Adds operator remediation steps for WD-CONN-102 across FAILED/WARNING/MANUAL outcomes. |
| solutions/ess-maker-skills/scripts/flightcheck/graph_client.py | Adds $select support to service principal listing and introduces get_workday_saml_service_principals() + select constant. |
| solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py | Implements WD-CONN-102 certificate logic and wires it into the Workday checks pipeline. |

Comment thread solutions/ess-maker-skills/scripts/flightcheck/graph_client.py
Comment thread solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py Outdated
Comment thread solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py
Comment thread solutions/ess-maker-skills/scripts/flightcheck/checks/workday.py
apurvabanka and others added 4 commits June 8, 2026 00:32
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@apurvabanka apurvabanka requested a review from nehaoss June 8, 2026 07:53

@srideshpande srideshpande left a comment

Contributor

The PR title/description feel a bit confusing to me. Are we actually checking a certificate used by the Workday SOAP/REST integration itself, or are we checking the SAML signing certificate on the federated Workday Enterprise App in Entra?

From the implementation, it seems like the latter. If so, could we clarify the title/description to say “Workday SAML / Enterprise App certificate health” or something along those lines, rather than “SOAP/REST integration certificate”, so the scope is clearer?

Comment thread solutions/ess-maker-skills/scripts/flightcheck/graph_client.py
Comment thread solutions/ess-maker-skills/scripts/flightcheck/graph_client.py
@apurvabanka apurvabanka merged commit 4f2a7d9 into main Jun 9, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

FlightCheck: Add check for SAML NameID alignment with Workday user identifier

4 participants