Skip to content

Fix Android AEAD tag mismatch exceptions#128889

Draft
Copilot wants to merge 12 commits into
mainfrom
copilot/fix-android-chachapoly1305-test-failure
Draft

Fix Android AEAD tag mismatch exceptions#128889
Copilot wants to merge 12 commits into
mainfrom
copilot/fix-android-chachapoly1305-test-failure

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Jun 2, 2026

Android AEAD ciphers can surface authentication tag failures from the tag Cipher.update call instead of finalization. That caused tampered decryptions to throw CryptographicException instead of AuthenticationTagMismatchException.

  • Native Android crypto interop

    • Added an AEAD-aware cipher update entry point that detects Java AEADBadTagException.
    • Checks for AEAD update exceptions before copying returned output bytes into the destination buffer.
    • Reused the same tag-mismatch detection logic for AEAD finalization.
  • Managed Android AEAD paths

    • Switched the tag update step to the AEAD-aware update helper for ChaCha20-Poly1305, AES-GCM, and AES-CCM.
    • Maps detected tag mismatches to AuthenticationTagMismatchException.
if (authTagMismatch)
{
    throw new AuthenticationTagMismatchException();
}

Closes #128888

Copilot AI requested review from Copilot and removed request for Copilot June 2, 2026 09:54
Co-authored-by: simonrozsival <374616+simonrozsival@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot June 2, 2026 10:08
Copilot AI changed the title [WIP] Fix Android ChaCha20Poly1305 tag mismatch exception in tests Fix Android ChaCha20Poly1305 tag mismatch exception Jun 2, 2026
Copilot AI requested a review from simonrozsival June 2, 2026 10:09
@vcsjones
Copy link
Copy Markdown
Member

vcsjones commented Jun 2, 2026

Does AesGcm need the same fix?

@vcsjones
Copy link
Copy Markdown
Member

vcsjones commented Jun 2, 2026

@copilot let's change the Android implementation of AesGcm and AesCcm (if CCM is supported on Android) to use the new AEAD update call.

Co-authored-by: vcsjones <361677+vcsjones@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot June 2, 2026 13:03
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Jun 2, 2026

@copilot let's change the Android implementation of AesGcm and AesCcm (if CCM is supported on Android) to use the new AEAD update call.

Addressed in dc6a5dd by switching the Android AesGcm and AesCcm decrypt tag updates to the AEAD-aware update helper.

Copilot AI changed the title Fix Android ChaCha20Poly1305 tag mismatch exception Fix Android AEAD tag mismatch exceptions Jun 2, 2026
Copilot AI requested a review from vcsjones June 2, 2026 13:04
[return: MarshalAs(UnmanagedType.Bool)]
private static partial bool EvpAeadCipherUpdate(
SafeEvpCipherCtxHandle ctx,
ref byte @out,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot let's avoid using keywords as parameter names in C# so they don't need to be @-prefixed. Call them input or output, for example.

Let's not use the ref byte and ref MemoryMarshal.GetReference pattern any more. Just use ReadOnlySpan and Span - the LibraryImport marshaller understands how to marshal them. See @src/libraries/Common/src/Interop/OSX/System.Security.Cryptography.Native.Apple/Interop.X25519.cs as an example.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in b28e258 by changing the new Android AEAD update interop signature to use Span<byte> / ReadOnlySpan<byte> directly and by removing the keyword-prefixed parameter names.

Co-authored-by: vcsjones <361677+vcsjones@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot June 2, 2026 13:46
Copilot AI review requested due to automatic review settings June 3, 2026 08:41
@simonrozsival simonrozsival marked this pull request as ready for review June 3, 2026 08:43
@simonrozsival
Copy link
Copy Markdown
Member

/azp run runtime-android

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to 'arch-android': @vitek-karas, @simonrozsival, @steveisok, @akoeplinger
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts the Android crypto (Javax.Crypto Cipher) interop and managed AEAD implementations so that authentication tag failures detected during Cipher.update(...) are surfaced as AuthenticationTagMismatchException (instead of a generic CryptographicException), aligning behavior with AEAD expectations.

Changes:

  • Added a new native entry point (AndroidCryptoNative_AeadCipherUpdate) that detects AEADBadTagException during Cipher.update.
  • Refactored native cipher update to check for JNI exceptions before copying output back to managed buffers, and reused the same tag-mismatch detection pattern for finalization.
  • Updated Android managed decrypt paths (ChaCha20-Poly1305, AES-GCM, AES-CCM) to use the AEAD-aware update and throw AuthenticationTagMismatchException on detected tag mismatch.
Show a summary per file
File Description
src/native/libs/System.Security.Cryptography.Native.Android/pal_cipher.h Declares the new AEAD-aware cipher update export.
src/native/libs/System.Security.Cryptography.Native.Android/pal_cipher.c Implements AEAD-aware update and centralizes AEADBadTagException detection for update/final.
src/libraries/Common/src/Interop/Android/System.Security.Cryptography.Native.Android/Interop.Cipher.cs Adds managed interop for the new AEAD-aware update entry point.
src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/ChaCha20Poly1305.Android.cs Uses AEAD-aware update for tag processing and maps mismatches to AuthenticationTagMismatchException.
src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/AesGcm.Android.cs Uses AEAD-aware update for tag processing and maps mismatches to AuthenticationTagMismatchException.
src/libraries/System.Security.Cryptography/src/System/Security/Cryptography/AesCcm.Android.cs Uses AEAD-aware update for tag processing and maps mismatches to AuthenticationTagMismatchException.

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 1

Comment thread src/native/libs/System.Security.Cryptography.Native.Android/pal_cipher.c Outdated
@simonrozsival
Copy link
Copy Markdown
Member

/azp run runtime-android

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

simonrozsival and others added 3 commits June 3, 2026 20:31
…decrypt

On some Android devices, when an AEAD decryption fails authentication,
Conscrypt determines the thrown exception type by reading BoringSSL's
thread-local error queue, which can still contain a stale error from a
prior unrelated operation on the same thread (e.g. an RSA error). That
stale error is read first (FIFO) and mapped to the wrong exception type
(observed java.security.InvalidKeyException with an RSA MODULUS_TOO_LARGE
message during a ChaCha20Poly1305 decrypt) instead of the expected
javax.crypto.AEADBadTagException. Because detection only matched
AEADBadTagException, authTagMismatch stayed false and a generic
CryptographicException was thrown instead of AuthenticationTagMismatchException,
intermittently failing ChaCha20Poly1305Tests.EncryptTamperAADDecrypt.

For a successfully initialized AEAD cipher in decrypt mode, the key and
parameters were already validated at Cipher.init, so the only legitimate
failure at the tag-update/doFinal step is an authentication failure. Treat
any java.security.GeneralSecurityException caught there (in decrypt mode) as
an authentication tag mismatch. This also covers AES-GCM and AES-CCM, which
share the same native AEAD paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a debug breadcrumb on the GeneralSecurityException fallback path so a
genuine non-authentication failure, if ever reclassified as a tag mismatch,
can still be diagnosed from logcat. This path does not run for the common
AEADBadTagException case and is behavior-neutral for pass/fail.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Guard GetStringUTFChars with an ExceptionCheck after getMessage() and clear
any pending JNI exception before returning, so the fallback diagnostic path
never leaves a stray Java exception for the caller (which raises the managed
AuthenticationTagMismatchException next). No memory or gref leaks: only local
refs are created and all are released.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vcsjones
Copy link
Copy Markdown
Member

vcsjones commented Jun 4, 2026

I'd like to look at this a bit further on Thursday, I am not convinced this is the right change (it could be! I'm just not sure yet).

Copilot AI review requested due to automatic review settings June 4, 2026 04:48
@simonrozsival
Copy link
Copy Markdown
Member

/azp run runtime-android

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 8/8 changed files
  • Comments generated: 1

Comment on lines 320 to +333
if (!outl && !in)
// it means caller wants us to record "inl" but we don't need it.
return SUCCESS;

abort_if_invalid_pointer_argument(outl);
abort_if_invalid_pointer_argument(in);

int32_t result = FAIL;
INIT_LOCALS(loc, inDataBytes, outDataBytes);

JNIEnv* env = GetJNIEnv();
jbyteArray inDataBytes = make_java_byte_array(env, inl);
(*env)->SetByteArrayRegion(env, inDataBytes, 0, inl, (jbyte*)in);
loc[inDataBytes] = make_java_byte_array(env, inl);
(*env)->SetByteArrayRegion(env, loc[inDataBytes], 0, inl, (jbyte*)in);
ON_EXCEPTION_PRINT_AND_GOTO(cleanup);
@simonrozsival simonrozsival marked this pull request as draft June 4, 2026 04:55
The previous fix treated any GeneralSecurityException during an Android AEAD
decrypt as an authentication tag mismatch. That depended on Conscrypt
implementation details and risked false positives, so revert the three native
commits (pal_cipher.c / pal_jni.c / pal_jni.h) back to recognizing only a
genuine AEADBadTagException.

Root cause: on AEAD decrypt failure Conscrypt derives the Java exception type
by popping the oldest entry from BoringSSL's thread-local FIFO error queue. A
stale error left by an unrelated prior operation on the same pooled thread
(e.g. an RSA failure) is read first and mapped to the wrong type
(InvalidKeyException), which is not a BadPaddingException and so is never
converted to AEADBadTagException. Conscrypt clears the whole queue after
reading, so the failed attempt leaves a clean queue.

Fix: in the managed Android AEAD DecryptCore, when the first attempt throws a
CryptographicException that is not AuthenticationTagMismatchException, retry the
decrypt once on the same thread. The retry reads a clean error queue and
observes the true result (a real tag mismatch surfaces as AEADBadTagException).
A valid ciphertext authenticates regardless of the queue, so the retry only ever
runs after a genuine failure and can never turn an authentication failure into
unauthenticated plaintext.

Validated on a physical 32-bit Android device (Samsung Galaxy A16, armeabi-v7a):
5x full System.Security.Cryptography test suite, 12373 tests, 0 failures,
including all ChaCha20Poly1305 EncryptTamperAADDecrypt cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions Bot mentioned this pull request Jun 4, 2026
// failed attempt clears that queue, so an immediate same-thread retry observes
// the true result. A valid ciphertext cannot reach this path, so the retry can
// never turn an authentication failure into unauthenticated plaintext.
DecryptCoreOnce(nonce, ciphertext, tag, plaintext, associatedData);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very suspicious about this fix. Here is what we know about the failure.

  1. There is one consistent test failure: EncryptTamperAADDecrypt(dataLength: 0, additionalDataLength: 1). This is a bit of an edge-case test. All failures are for an empty ciphertext with AAD.
  2. All failures are for ARM32, not ARM64 or XArch.

"Retry" logic here generally feels very wrong to me. I'd sooner mark this one test case (not the whole test!) as ActiveIssue to better understand it first.

I am going to try to open a throw-away pull request with some better diagnostics to see what is going on since I cannot reproduce this locally.

If we want to get CI back to green, I would sooner disable that one very-specific test case for Android as an ActiveIssue.

@vcsjones
Copy link
Copy Markdown
Member

vcsjones commented Jun 4, 2026

I'm trying to do some scouting / investigation here: #129000

@vcsjones
Copy link
Copy Markdown
Member

vcsjones commented Jun 5, 2026

Okay, so this does appear to be a bug in Conscrypt. When it imports an RSA key - and fails - it does not seem to be handling the error queue correctly. The actual exception that JNI is giving us for the bad authentication tag is

java.security.InvalidKeyException: error:04000080:RSA routines:OPENSSL_internal:MODULUS_TOO_LARGE
  at org.conscrypt.NativeCrypto.EVP_AEAD_CTX_open(Native Method)
  at org.conscrypt.OpenSSLAeadCipher.doFinalInternal(OpenSSLAeadCipher.java:420)
  at org.conscrypt.OpenSSLAeadCipher.engineDoFinal(OpenSSLAeadCipher.java:272)
  at java.base/javax.crypto.Cipher.doFinal(Cipher.java:2202)

ChaCha20Poly1305 throwing an error for an RSA modulus being too large is obviously nonsense. We can "fix" ChaCha20 but there is actually nothing wrong with it. The RSA tests just left stale data in the error queue. I would say we probably shouldn't try working around this in ChaCha20Poly1305 - because RSA actually poisoned the error queue, this is very likely to crop up somewhere else - AES-GCM AEAD tag handling, or any other place that cares about what exception was thrown. We might be clearing the error queue here, but this is going to depend on the order in which tests run, if new tests are added, etc. It feels extremely fragile to me.

I put up #129033 as an alternative suggestion for a fix, for now.

@vcsjones
Copy link
Copy Markdown
Member

vcsjones commented Jun 5, 2026

#129033 ran and all cryptography tests passed. I think we should close this in favor of the other pull request.

I can reproduce this problem with AES-GCM - it isn't specific to ChaCha20/Poly1305. I don't think it is feasible for us to sprinkle error handling logic like this everywhere, we were just depending on unit test order.

If we want to put a work around in somewhere when we should do it in RSA to stop it from contaminating the BoringSSL error queue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ci-scan] Test failure: Android ChaCha20Poly1305 tag mismatch exception type

4 participants