
feat(wasm-host): JVM WASM host for TDF encrypt round-trip #338

Draft
pflynn-virtru wants to merge 2 commits into main from feat/wasm-host-jvm


@pflynn-virtru

Summary

  • Adds wasm-host Maven module to the develop profile
  • Loads the TinyGo-built TDF encrypt WASM binary via Chicory (pure-Java, zero native deps)
  • Provides 8 host crypto functions (crypto namespace) using Java SDK classes (AesGcm, AsymEncryption, AsymDecryption, CryptoUtils)
  • Validates WASM-produced TDFs decrypt correctly in 3 JUnit 5 tests:
    • HS256 round-trip: encrypt → parse ZIP → unwrap DEK → AES-GCM decrypt → assert match
    • GMAC round-trip: verify segment hash = GCM auth tag, decrypt → assert match
    • Error handling: invalid PEM → verify error returned
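The GMAC check above relies on the GCM authentication tag being recoverable from the encrypted segment. A minimal sketch of that idea using only the JDK's `javax.crypto` API is shown here. The class name `GcmRoundTripSketch`, the `iv || ciphertext || tag` segment layout, and the 12-byte IV and 16-byte tag sizes are assumptions for illustration; they are not taken from the SDK's `AesGcm` class or from `WasmTdfTest`.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper, not part of the SDK: AES-GCM round trip with tag extraction.
final class GcmRoundTripSketch {
    static final int IV_LEN = 12;   // assumed IV size in bytes
    static final int TAG_LEN = 16;  // assumed tag size in bytes
    private static final SecureRandom RNG = new SecureRandom();

    // Encrypts and returns iv || ciphertext || tag (assumed segment layout).
    static byte[] encrypt(byte[] key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_LEN];
            RNG.nextBytes(iv);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_LEN * 8, iv));
            byte[] ctAndTag = c.doFinal(plaintext); // JCA appends the tag to the ciphertext
            byte[] out = Arrays.copyOf(iv, IV_LEN + ctAndTag.length);
            System.arraycopy(ctAndTag, 0, out, IV_LEN, ctAndTag.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decrypts a segment produced by encrypt(); fails if the tag does not verify.
    static byte[] decrypt(byte[] key, byte[] segment) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_LEN * 8, segment, 0, IV_LEN));
            return c.doFinal(segment, IV_LEN, segment.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the trailing TAG_LEN bytes, which a GMAC-style check would compare to the segment hash.
    static byte[] tagOf(byte[] segment) {
        return Arrays.copyOfRange(segment, segment.length - TAG_LEN, segment.length);
    }
}
```

Under these assumptions, a test would compare `tagOf(segment)` with the segment hash recorded in the manifest and then assert that `decrypt` returns the original plaintext.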

Notes

  • The WASM binary (tdfcore.wasm) is gitignored and must be built from opentdf/platform before running tests
  • Uses ZipFile instead of ZipInputStream to parse TDF output because of an upstream zipstream bug (STORED entries carry unnecessary data descriptors); tracked in #337, "WASM zipstream uses unnecessary data descriptors for STORED entries"
  • No changes to existing SDK files; only the parent POM gets a new <module> entry

Test plan

  • mvn test -pl wasm-host -Pdevelop -am passes all 3 tests
  • CI needs tinygo or a pre-built tdfcore.wasm artifact to run

🤖 Generated with Claude Code

Add a new wasm-host Maven module that loads the TinyGo-built TDF encrypt
WASM binary via Chicory (pure-Java WASM runtime), provides host crypto
functions using Java SDK classes, and validates that WASM-produced TDFs
decrypt correctly.

Three JUnit 5 tests:
- HS256 round-trip: encrypt → parse ZIP → unwrap DEK → AES-GCM decrypt
- GMAC round-trip: verify segment hash equals GCM auth tag, then decrypt
- Error handling: invalid PEM returns error via get_error export

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist

Summary of Changes

Hello @pflynn-virtru, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement by integrating WebAssembly (WASM) capabilities into the JVM environment for TDF encryption. It establishes a robust framework for executing WASM binaries, specifically a TinyGo-built TDF encrypt module, directly within Java. This integration is crucial for validating the TDF encryption process across different platforms and ensures that the cryptographic operations performed by the WASM module are consistent and secure, utilizing the established Java SDK crypto primitives.

Highlights

  • New WASM Host Module: A new wasm-host Maven module has been introduced and added to the develop profile, enabling JVM-based WebAssembly execution.
  • TinyGo WASM Integration: The module loads a TinyGo-built TDF encrypt WASM binary using Chicory, a pure-Java WebAssembly runtime, demonstrating cross-platform compatibility.
  • Java SDK Crypto Host Functions: Eight host crypto functions are provided under a crypto namespace, leveraging existing Java SDK classes (AesGcm, AsymEncryption, AsymDecryption, CryptoUtils) to perform cryptographic operations within the WASM environment.
  • TDF Encryption Round-Trip Validation: Three JUnit 5 tests have been added to validate the end-to-end TDF encryption and decryption process, covering HS256 round-trip, GMAC integrity verification, and error handling for invalid inputs.


Changelog
  • pom.xml
    • Added wasm-host as a new Maven module to the develop profile.
  • wasm-host/.gitignore
    • Created a new .gitignore file for the wasm-host module, ignoring target/ and *.wasm build artifacts.
  • wasm-host/pom.xml
    • Defined a new Maven module for wasm-host with parent sdk-pom.
    • Added dependencies for Chicory (WASM runtime), OpenTDF SDK (for crypto), GSON (for JSON parsing), and JUnit 5 (for testing).
    • Configured Maven plugins to skip installation and deployment for this test-only module.
  • wasm-host/src/test/java/io/opentdf/platform/wasm/WasmTdfTest.java
    • Implemented WasmTdfTest class to host a TinyGo-built TDF encrypt WASM module using Chicory.
    • Defined 8 host functions under the crypto namespace for random bytes, AES-GCM encrypt/decrypt, HMAC-SHA256, RSA-OAEP-SHA1 encrypt/decrypt, RSA keypair generation, and error retrieval.
    • Defined 2 host functions under the io namespace for read_input and write_output (mocked for encryption).
    • Included helper methods for WASM memory allocation, error retrieval, WASM encryption invocation, and TDF ZIP parsing.
    • Added three JUnit 5 tests: testHS256RoundTrip to validate TDF encryption and decryption with HS256 integrity, testGMACRoundTrip to verify GMAC segment integrity, and testErrorHandlingInvalidPEM to test error propagation for invalid PEM inputs.
Activity
  • No human activity has been recorded on this pull request yet.
@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a wasm-host module for end-to-end testing of TDF encryption via a WASM binary, with a well-structured implementation using Chicory and comprehensive tests. However, a medium-severity security vulnerability was identified: the WasmTdfTest.java file uses RSA-OAEP with SHA-1, a deprecated and insecure hash function. This exposes an underlying SDK dependency (AsymEncryption, AsymDecryption) hardcoded to this weak algorithm. It is recommended to update the core SDK to use a modern, secure hashing algorithm like SHA-256 for the OAEP padding scheme. Additionally, minor code quality improvements are suggested, such as refactoring SecureRandom instantiation, reducing host function definition duplication, and replacing magic numbers with named constants.

Comment on lines +155 to +171
new HostFunction(
        "crypto", "rsa_oaep_sha1_encrypt",
        FunctionType.of(
                List.of(ValType.I32, ValType.I32, ValType.I32, ValType.I32, ValType.I32),
                List.of(ValType.I32)),
        (inst, args) -> {
            try {
                String pubPEM = inst.memory().readString((int) args[0], (int) args[1]);
                byte[] pt = inst.memory().readBytes((int) args[2], (int) args[3]);
                byte[] encrypted = new AsymEncryption(pubPEM).encrypt(pt);
                inst.memory().write((int) args[4], encrypted);
                return new long[]{encrypted.length};
            } catch (Exception e) {
                lastError = e.getMessage();
                return new long[]{ERR_SENTINEL};
            }
        }),
security-medium medium

The host functions rsa_oaep_sha1_encrypt and rsa_oaep_sha1_decrypt rely on an underlying implementation (AsymEncryption/AsymDecryption) that uses RSA-OAEP with SHA-1. SHA-1 is a deprecated cryptographic hash function with known collision vulnerabilities and should not be used in new applications. The use of SHA-1 in the padding scheme weakens the overall security of the asymmetric encryption. It is recommended to upgrade the underlying SDK classes to use a stronger hash function, such as SHA-256, in the OAEP padding (e.g., RSA/ECB/OAEPWithSHA-256AndMGF1Padding).
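As a rough illustration of the suggestion above, the following sketch wraps and unwraps a key with RSA-OAEP using SHA-256 for both the OAEP digest and MGF1, via the JDK's `javax.crypto` API. The class name `OaepSha256Sketch` and its methods are hypothetical and are not the SDK's `AsymEncryption`/`AsymDecryption` API. The explicit `OAEPParameterSpec` is there because, with some JCA providers, the transformation string alone may leave MGF1 on SHA-1.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.OAEPParameterSpec;
import javax.crypto.spec.PSource;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.MGF1ParameterSpec;

// Hypothetical helper, not part of the SDK: RSA-OAEP with SHA-256 for digest and MGF1.
final class OaepSha256Sketch {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    // Pin both hashes explicitly instead of relying on provider defaults.
    private static final OAEPParameterSpec OAEP_SHA256 = new OAEPParameterSpec(
            "SHA-256", "MGF1", MGF1ParameterSpec.SHA256, PSource.PSpecified.DEFAULT);

    // Generates a 2048-bit RSA key pair for the example.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps (encrypts) a short secret such as a data encryption key.
    static byte[] wrap(PublicKey pub, byte[] secret) {
        try {
            Cipher c = Cipher.getInstance(TRANSFORMATION);
            c.init(Cipher.ENCRYPT_MODE, pub, OAEP_SHA256);
            return c.doFinal(secret);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Unwraps (decrypts) a value produced by wrap() with the same parameters.
    static byte[] unwrap(PrivateKey priv, byte[] wrapped) {
        try {
            Cipher c = Cipher.getInstance(TRANSFORMATION);
            c.init(Cipher.DECRYPT_MODE, priv, OAEP_SHA256);
            return c.doFinal(wrapped);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both the wrapping and unwrapping sides have to use the same OAEP parameters, so a change like this would need to be coordinated with every component that wraps or unwraps the key.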

int outPtr = (int) args[0];
int n = (int) args[1];
byte[] bytes = new byte[n];
new SecureRandom().nextBytes(bytes);
medium

Creating a new SecureRandom instance on every call to this host function is inefficient, as SecureRandom instances can be expensive to initialize. It's recommended to create a single, static SecureRandom instance and reuse it, since it is thread-safe.

For example, you could add a static field to your test class:

private static final SecureRandom SECURE_RANDOM = new SecureRandom();

And then change this line to:

SECURE_RANDOM.nextBytes(bytes);

Comment on lines +106 to +116
try {
    byte[] key = inst.memory().readBytes((int) args[0], (int) args[1]);
    byte[] pt = inst.memory().readBytes((int) args[2], (int) args[3]);
    AesGcm.Encrypted encrypted = new AesGcm(key).encrypt(pt);
    byte[] result = encrypted.asBytes();
    inst.memory().write((int) args[4], result);
    return new long[]{result.length};
} catch (Exception e) {
    lastError = e.getMessage();
    return new long[]{ERR_SENTINEL};
}
medium

The try-catch block for error handling is repeated in multiple host functions (aes_gcm_encrypt, aes_gcm_decrypt, hmac_sha256, etc.). This boilerplate code could be extracted into a helper method to reduce duplication and improve readability. The helper could accept a lambda expression for the core logic of each crypto operation and wrap it with the try-catch logic.
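One possible shape for such a helper is sketched below. It is deliberately independent of Chicory so that it compiles on its own. The names `HostCallSketch`, `HostOp`, and `guarded` are hypothetical, and the `ERR_SENTINEL` value is a placeholder, since the actual constant in `WasmTdfTest` is not shown here.

```java
// Hypothetical helper, not part of the PR: wraps a host operation with shared error handling.
final class HostCallSketch {
    // Placeholder value; the real ERR_SENTINEL in WasmTdfTest may differ.
    static final long ERR_SENTINEL = 0xFFFFFFFFL;
    // Last error message, mirroring the lastError field used by the host functions.
    static String lastError;

    // Core logic of one host function; returns the number of bytes written.
    @FunctionalInterface
    interface HostOp {
        long run() throws Exception;
    }

    // Runs the operation and converts any exception into the sentinel return value.
    static long[] guarded(HostOp op) {
        try {
            return new long[]{op.run()};
        } catch (Exception e) {
            lastError = e.getMessage();
            return new long[]{ERR_SENTINEL};
        }
    }
}
```

With a helper like this, each host function body could shrink to something like `(inst, args) -> guarded(() -> { ...; return result.length; })`, keeping only the operation-specific memory reads and writes inside the lambda.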

Comment on lines +281 to +282
int outCapacity = 1024 * 1024;
long outPtr = wasmMalloc(outCapacity);
medium

The output capacity 1024 * 1024 is a magic number. This value is also used in testErrorHandlingInvalidPEM, and getWasmError uses another magic number 4096. It's recommended to define these as named constants at the top of the class to improve readability and maintainability. For example:

private static final int WASM_OUTPUT_BUFFER_CAPACITY = 1024 * 1024;
private static final int WASM_ERROR_BUFFER_SIZE = 4096;

…ix landed

Replace ZipFile (temp file + central directory) workaround with standard
ZipInputStream, since opentdf/platform@7dd876e1 fixed the STORED entries
with unnecessary data descriptors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
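
For readers unfamiliar with the streaming approach this commit switches to, here is a minimal sketch of reading every entry of an in-memory ZIP with the JDK's `ZipInputStream`. The class `TdfZipSketch` and both of its methods are hypothetical and are not the parsing code in `WasmTdfTest`; `zipOf` exists only to produce sample input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the PR: stream-parse a ZIP held in memory.
final class TdfZipSketch {
    // Reads all entries sequentially, without a temp file or the central directory.
    static Map<String, byte[]> readEntries(byte[] zipBytes) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // readAllBytes() stops at the end of the current entry.
                entries.put(entry.getName(), zin.readAllBytes());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    // Builds a single-entry ZIP so the reader can be exercised without a WASM binary.
    static byte[] zipOf(String name, byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(buf)) {
            zout.putNextEntry(new ZipEntry(name));
            zout.write(data);
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

Because `ZipInputStream` walks local file headers in order, it depends on those headers being well formed, which is why the earlier STORED-entry data descriptor issue mattered for this approach.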
@github-actions

X-Test Failure Report
