Add Traefik X.509 client certificate authentication support #61

Draft
hanstrompert wants to merge 21 commits into main from arno-x509-cmp

@hanstrompert

Summary

  • Extend AuthInterceptor to authenticate clients via Traefik's X-Forwarded-Tls-Client-Cert (PEM) and X-Forwarded-Tls-Client-Cert-Info (subject DN) headers, alongside the existing nginx subject-DN header and the Jakarta servlet X509Certificate attribute.
  • Replace AuthorizeDnType values (NO/HEADER/CERTIFICATE) with explicit per-source variants: NO, NGINX_TLS_CLIENT_SUBJECT_DN, JAKARTA_SERVLET_TLS_CLIENT_CERT, TRAEFIK_TLS_CLIENT_CERT, TRAEFIK_TLS_CLIENT_SUBJECT_DN. Rename the related config keys to authorize-dn-type / tls-client-auth-n-header.
  • Introduce ClientPrincipals so allowed-DN comparison goes through X500Principal.equals() instead of raw string match, allowing more flexible DN formatting in application.properties.
  • Add happy-path unit test for the Traefik PEM header.
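
To illustrate the ClientPrincipals point above, here is a minimal sketch of why principal comparison tolerates formatting differences that a raw string match rejects. The class and method names (DnComparisonSketch, sameDn) and the sample DNs are hypothetical and are not taken from this PR; only X500Principal is the real JDK class.

```java
import javax.security.auth.x500.X500Principal;

final class DnComparisonSketch {
    // True when both strings denote the same distinguished name,
    // regardless of attribute keyword case or spacing after commas.
    static boolean sameDn(String a, String b) {
        return new X500Principal(a).equals(new X500Principal(b));
    }

    public static void main(String[] args) {
        String configured = "CN=client1, O=Example, C=NL"; // as an operator might write it
        String presented = "cn=client1,o=Example,c=NL";    // as a proxy might forward it
        System.out.println("raw string match: " + configured.equals(presented));
        System.out.println("principal match:  " + sameDn(configured, presented));
    }
}
```

X500Principal.equals() compares the canonical forms of the two names, which is what makes the allow-list entries in application.properties less sensitive to formatting.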

Known blockers (must be addressed before un-drafting)

Critical — silent auth bypass on upgrade

  • application.properties, chart/values.yaml, and README.md still use the old property names (authorize-dn, ssl-client-subject-dn-header) and old enum values (header, certificate). After upgrade, existing deployments bind to nothing, authorize-dn-type falls back to NO, ServerConfig skips installing the interceptor, and client-DN authorization is silently bypassed. Needs migration path (alias old keys, or fail-fast when legacy keys are present) and full doc sync.
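
A rough sketch of the fail-fast option, assuming the configuration is available as a plain java.util.Properties at startup. The real application binds properties through Spring and may use a key prefix, so LegacyKeyGuard and failOnLegacyKeys are hypothetical names and the wiring would differ.

```java
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

final class LegacyKeyGuard {
    // Legacy key names as quoted in this PR description.
    static final List<String> LEGACY_KEYS =
            List.of("authorize-dn", "ssl-client-subject-dn-header");

    // Throws when a legacy key is still configured, so that authorization
    // cannot silently fall back to NO after an upgrade.
    static void failOnLegacyKeys(Properties props) {
        List<String> found = LEGACY_KEYS.stream()
                .filter(props::containsKey)
                .collect(Collectors.toList());
        if (!found.isEmpty()) {
            throw new IllegalStateException("Legacy config keys present: " + found
                    + "; rename to authorize-dn-type / tls-client-auth-n-header");
        }
    }
}
```

Calling such a guard before the interceptor decision is made would turn the silent bypass into a startup error.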

Auth/security correctness

  • Case-sensitive header match (AuthInterceptor.java): iterates getHeaderNames() with String.equals. HTTP header names are case-insensitive (RFC 7230). Use request.getHeader(name) directly.
  • No cert validation on TRAEFIK_TLS_CLIENT_CERT path: subject DN of the supplied PEM is trusted as-is. Safe only if Traefik enforces client-auth (RequireAndVerifyClientCert) AND the header is stripped from inbound external requests. Needs explicit ops docs.
  • Missing default case in switch: today the interceptor isn't installed when authorizeDnType == NO, but the switch would NPE on tlsClientSubjectPrincipal for any future enum value. Fail closed instead.
  • Cert chain handling is untested: code blindly takes the first comma-separated PEM. Per the cited source, leaf comes first, but no test asserts this. Add multi-cert test.
  • Malformed allow-list entries are silently dropped in ClientPrincipals (the per-entry try { } catch (Exception) only logs at FINE). Fail-fast at startup instead.
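
For the case-sensitivity item, a small illustration of the behaviour a lookup should have. This uses a TreeMap with String.CASE_INSENSITIVE_ORDER as a stand-in for the servlet request, whose getHeader(name) is documented as case-insensitive; HeaderLookupSketch is a hypothetical name and not code from this PR.

```java
import java.util.Map;
import java.util.TreeMap;

final class HeaderLookupSketch {
    // Copies headers into a map whose keys compare case-insensitively,
    // mimicking how HTTP header names are meant to be matched.
    static Map<String, String> caseInsensitive(Map<String, String> raw) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(raw);
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers =
                caseInsensitive(Map.of("x-forwarded-tls-client-cert", "pem-data"));
        // A proxy may send the name in lower case; the lookup still succeeds.
        System.out.println(headers.get("X-Forwarded-Tls-Client-Cert"));
    }
}
```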

Code quality

  • @Component on ClientPrincipals is incorrect — it's instantiated via new with constructor args, not autowired. Remove the annotation.
  • Allow-list is reparsed on every request inside AuthInterceptor.isAllowed(). Build once at config bind time and cache.
  • Dead if (cps == null) check after new ClientPrincipals(...).
  • System.out.println(e.getMessage()) in the PEM catch — should be LOG.warning (or rethrown as SoapFault cause).
  • ARNOTODO comment left in source.
  • JAKARTA_SERVLET_TLS_CLIENT_CERT_HEADER constant misnames a servlet attribute as a header.
  • Optional: redact header values logged at FINE (currently logs full PEM/DN).

Tests

  • Only happy-path Traefik PEM is covered. Add: invalid PEM, missing header, comma-separated chain (assert leaf is picked), unauthorized DN via Traefik, the entire TRAEFIK_TLS_CLIENT_SUBJECT_DN (Info) path, and case-insensitive header matching.

Test plan

  • mvn test passes locally
  • Address blockers above
  • Manually exercise each AuthorizeDnType against a deployed instance behind nginx and Traefik ingress
  • Verify upgrade path from existing deployments (legacy property names) — either compat aliases or release-note migration

🤖 Generated with Claude Code

@hanstrompert hanstrompert left a comment

Thanks for the follow-up commits! Lots of progress here — the case-sensitive header lookup, the missing default case, the ARNOTODO, and the stray System.out.println in the PEM catch are all sorted, and the new tests for invalid/unauthorized PEM, problematic OIDs, and the multi-cert chain (happy path, badly-separated, wrong order) are really nice additions.

A few things I think would be worth another look before this comes out of draft — most are small, but a couple feel important. (A few of the items below are about files that aren't part of this PR's diff, so I couldn't pin them as inline comments — flagging them here with file:line references instead.)

Higher priority

  1. Legacy config keys haven't been updated, which I think disables auth silently on upgrade. As far as I can tell, after deploy these still bind to nothing, authorize-dn-type falls back to NO, and the interceptor doesn't get installed → client-DN authorization is silently disabled. This was the critical item from the original review, so probably worth tackling before merge. A few possible directions: rename the keys + values everywhere to match the new schema, keep compat aliases for the legacy keys for a release, or fail fast at startup when the legacy keys are present.

    Spots I noticed (all outside the current PR diff, so couldn't pin inline):

    • src/main/resources/application.properties:70-72 — comment on line 70 ("possible values for authorize-dn are no, header, certificate") + authorize-dn=no on line 71 + ssl-client-subject-dn-header=… on line 72.
    • chart/values.yaml:158-159 (verify-ssl-client-subject-dn / ssl-client-subject-dn-header).
    • README.md:162 — example using legacy authorize-dn=no.
    • README.md:205 — example using legacy authorize-dn=certificate. Also a natural place to add a short note showing what each new AuthorizeDnType value (JAKARTA_SERVLET_TLS_CLIENT_CERT, NGINX_TLS_CLIENT_SUBJECT_DN, TRAEFIK_TLS_CLIENT_CERT, TRAEFIK_TLS_CLIENT_SUBJECT_DN) corresponds to.
    • README.md:293 — another authorize-dn=header + ssl-client-subject-dn-header example.
  2. Two System.out.println calls slipped into the production code paths (AuthInterceptor.java:121, ClientPrincipals.java:78) — they end up logging the client DN and every allow-list comparison to stdout on every request. Probably want to drop those before merge. There's also a leftover println in one of the new tests.

  3. The TRAEFIK_TLS_CLIENT_CERT path still trusts the subject DN of whatever PEM lands in the header without verifying the chain. That's fine if Traefik is configured with RequireAndVerifyClientCert and strips the inbound X-Forwarded-Tls-Client-Cert header from external requests, but it would be great to call that out explicitly in the README (probably around line 205) so this isn't deployed unsafely.

Still open from the original list (no rush, just flagging)

  • Malformed allow-list entries are still silently dropped (ClientPrincipals.java:62-63) — a typo in application.properties would only become visible when an authorized client suddenly gets rejected. Failing fast at startup would be friendlier.
  • @Component on ClientPrincipals looks like it can come off — the class is built via new in AuthInterceptor.isAllowed(), never autowired.
  • The allow-list is reparsed on every request (AuthInterceptor.java:135). Probably worth building it once at config-bind time.
  • Dead if (cps == null) after new ClientPrincipals(...) at AuthInterceptor.java:136.
  • JAKARTA_SERVLET_TLS_CLIENT_CERT_HEADER is actually a servlet attribute name, not a header — the constant name is a bit misleading.
  • The FINE-level log at AuthInterceptor.java:55 includes the full PEM/DN value — might be worth redacting since FINE can end up in shared sinks.
  • Coverage gaps: case-insensitive header lookup, the full TRAEFIK_TLS_CLIENT_SUBJECT_DN (Info) path end-to-end, and a missing-header case for Traefik.

Small nits introduced by these commits

  • The default-case message at AuthInterceptor.java:116 hardcodes "set to NO" regardless of which value triggered it.
  • Typo at AuthInterceptor.java:108: missing space before "missing.".
  • Double-negative in the comment at ClientPrincipals.java:51.
  • The invalid-PEM construction in the test (AuthInterceptorTest.java:260) is a bit fragile.
  • Stray ; at AuthInterceptorTest.java:276.

I've left inline comments on the items that are in the PR diff so they're easier to track. Happy to chat through any of them if I've misread the intent anywhere.

else LOG.fine(sslClientSubjectDn + " in list of allowed DNs");
String rfc2253Dn = tlsClientSubjectPrincipal.getName(X500Principal.RFC2253);

System.out.println("ARNO GOT DN: " + rfc2253Dn);

I think this one slipped in by accident — it ends up writing the client DN to stdout on every authenticated request. The LOG.fine(rfc2253Dn + " in list of allowed DNs") just below already covers it, so probably safe to just drop this line.

faultCode);
break;
default:
throw new SoapFault(clientCertificateProperties.getAuthorizeDnType() + " set to NO.", faultCode);

Small nit: the message says "set to NO" but default would fire for any unhandled enum value, not just NO. Maybe fold NO into the switch explicitly and let default say "unsupported AuthorizeDnType: " + at? That way a future enum value won't produce a misleading error.
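
Roughly what I have in mind, as a sketch only. The enum values are the ones from the PR summary; identitySource is a hypothetical helper, and IllegalStateException stands in for the SoapFault the interceptor actually throws.

```java
final class AuthorizeDnSwitchSketch {
    // Enum values as listed in the PR summary.
    enum AuthorizeDnType {
        NO,
        NGINX_TLS_CLIENT_SUBJECT_DN,
        JAKARTA_SERVLET_TLS_CLIENT_CERT,
        TRAEFIK_TLS_CLIENT_CERT,
        TRAEFIK_TLS_CLIENT_SUBJECT_DN
    }

    // Hypothetical helper: names where the client identity is read from.
    static String identitySource(AuthorizeDnType type) {
        switch (type) {
            case NO:
                // Handled explicitly so the message is accurate for this value.
                throw new IllegalStateException(
                        "authorize-dn-type is NO; the interceptor should not be installed");
            case NGINX_TLS_CLIENT_SUBJECT_DN:
            case TRAEFIK_TLS_CLIENT_SUBJECT_DN:
                return "subject DN header";
            case TRAEFIK_TLS_CLIENT_CERT:
                return "PEM certificate header";
            case JAKARTA_SERVLET_TLS_CLIENT_CERT:
                return "servlet request attribute";
            default:
                // Fail closed for any value added to the enum later.
                throw new IllegalStateException("unsupported AuthorizeDnType: " + type);
        }
    }
}
```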

}
}
} else {
LOG.fine("Expected HTTP header " + expectHeaderName + "missing.");

Tiny one — missing space, so the log line reads …X-Forwarded-Tls-Client-Certmissing.. Just " missing." should do it.

String expectHeaderName = clientCertificateProperties.getTlsClientAuthNHeader();
String headerValue = request.getHeader(expectHeaderName);
if (headerValue != null) {
LOG.fine("Found HTTP header " + expectHeaderName + ": " + headerValue);

This logs the full header value at FINE — for TRAEFIK_TLS_CLIENT_CERT that's the entire PEM, and for the Info path it's the full DN. Even at FINE it can leak into shared log sinks. Could we log just the header name + a length or hash, or truncate?
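
One possible shape for the length-plus-hash option, as a sketch. HeaderLogRedaction and redact are hypothetical names; the choice of SHA-256 and of a four-byte prefix is an assumption, not something the PR prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HeaderLogRedaction {
    // Returns a log-safe summary: value length plus a short hash prefix that
    // still lets two log lines be correlated without exposing the PEM or DN.
    static String redact(String headerValue) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(headerValue.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return "len=" + headerValue.length() + ", sha256=" + hex + "...";
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The log call would then become something like LOG.fine("Found HTTP header " + expectHeaderName + " (" + HeaderLogRedaction.redact(headerValue) + ")").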

return clientCertificateProperties.getDistinguishedNames().contains(sslClientSubjectDn);
// Not ideal to convert strings to Objects on each call, but otherwise conflicts with Properties-based
// config. And we must use the Principal equals() method for comparison.
ClientPrincipals cps = new ClientPrincipals(distinguishedNames);

We rebuild the allow-list (parsing each DN into an X500Principal plus the OID map) on every request, which feels avoidable given this is the auth hot path. Could we construct ClientPrincipals once at config-bind time — e.g. via a @PostConstruct on ClientCertificateProperties or a small Spring bean — and reuse the parsed list here?
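
A minimal sketch of the parse-once idea, assuming the DN strings are available as a List<String> when the configuration is bound. CachedAllowList is a hypothetical name, and the OID keyword map that ClientPrincipals passes to X500Principal is left out for brevity.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import javax.security.auth.x500.X500Principal;

final class CachedAllowList {
    private final Set<X500Principal> allowed;

    // Parses every configured DN exactly once, at construction time.
    CachedAllowList(List<String> distinguishedNames) {
        this.allowed = distinguishedNames.stream()
                .map(X500Principal::new)
                .collect(Collectors.toUnmodifiableSet());
    }

    // Per-request check: a set lookup that relies on X500Principal.equals()/hashCode().
    boolean isAllowed(X500Principal client) {
        return allowed.contains(client);
    }
}
```

An instance built once (for example from a @PostConstruct method) could then be shared by every call to isAllowed() in the interceptor.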

// "The distinguished name must be specified using the grammar defined in RFC 1779 or RFC 2253 (either
// format is acceptable)."
Map<String, String> names2oid = new HashMap<>();
// Not all x509 implementations do not know the symbolic names for all OIDs.

Tiny grammar thing — "Not all x509 implementations do not know …" is a double negative; reads as the opposite of what I think you meant. Maybe "Not all x509 implementations know the symbolic names …" or "Some x509 implementations don't know …".

X500Principal p = new X500Principal(propDistinguishedName, names2oid);
this.allowedPrincipals.add(p);
} catch (Exception e) {
LOG.fine(propDistinguishedName + " not a proper DN:" + e);

Malformed allow-list entries get silently dropped at FINE level here, so a typo in application.properties wouldn't be visible in normal logs — the symptom would only show up later when an otherwise-authorized client gets rejected. Could we fail fast at startup instead (throw, or collect the errors and throw at the end of construction)? That should be easier to diagnose.
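
The "collect the errors and throw at the end" variant could look roughly like this. StrictAllowListParser and parseAll are hypothetical names, and the OID keyword map is again omitted.

```java
import java.util.ArrayList;
import java.util.List;
import javax.security.auth.x500.X500Principal;

final class StrictAllowListParser {
    // Parses all entries and reports every malformed one in a single exception,
    // so a typo in the configuration surfaces at startup rather than at request time.
    static List<X500Principal> parseAll(List<String> entries) {
        List<X500Principal> parsed = new ArrayList<>();
        List<String> errors = new ArrayList<>();
        for (String entry : entries) {
            try {
                parsed.add(new X500Principal(entry));
            } catch (IllegalArgumentException e) {
                // X500Principal rejects improperly specified names with this exception.
                errors.add("'" + entry + "': " + e.getMessage());
            }
        }
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException("Malformed allow-list DN(s): " + errors);
        }
        return parsed;
    }
}
```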


SoapFault fault = assertThrows(SoapFault.class, () -> interceptor.handleMessage(message));

System.out.println("UNAUTH MESSAGE " + fault.getMessage());

Looks like a leftover debug print — could we drop it?

when(httpRequest.getHeaderNames())
.thenReturn(Collections.enumeration(List.of("X-Forwarded-Tls-Client-Cert")));
when(httpRequest.getHeader("X-Forwarded-Tls-Client-Cert")).thenReturn(getPemCertString(_UNAUTH_CERT_PEM));
;

Stray ; on its own line after the when(...) — harmless, but worth tidying.

@Test
void rejectsInvalidPEMHeader() {
String badPemString = getPemCertString(_AUTHORIZED_CERT_PEM);
badPemString = badPemString.replaceFirst("[MQPXY]+", "S");

Just a thought: replaceFirst("[MQPXY]+", "S") happens to corrupt enough base64 to fail decoding for this specific fixture, but it depends on the character class hitting a critical position. Something more deterministic like truncating the body, flipping a single byte, or just using "not-a-cert" would be a bit less fragile if we ever swap the fixture.
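
As a sketch of the "not-a-cert" option: a fixed string that the JDK's X.509 parser rejects no matter which fixture is in use. InvalidPemFixtureSketch and parsesAsCertificate are hypothetical names; in the real test the string would be returned by the mocked getHeader(...) instead of being parsed directly.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.cert.CertificateException;
import java.security.cert.CertificateFactory;

final class InvalidPemFixtureSketch {
    // Deterministically invalid input, independent of any real certificate fixture.
    static final String NOT_A_CERT = "not-a-cert";

    // Returns true only if the text parses as an X.509 certificate.
    static boolean parsesAsCertificate(String text) {
        try {
            CertificateFactory.getInstance("X.509").generateCertificate(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (CertificateException e) {
            return false;
        }
    }
}
```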
