Custom loader support #109

Draft
ashu-mehra wants to merge 27 commits into openjdk:premain from ashu-mehra:custom-loader-support-v2

@ashu-mehra ashu-mehra commented Feb 7, 2026

This PR is a work in progress.
It allows user code to set the identity of a ClassLoader object. The identity is a String. Each InstanceKlass also stores the id of the class loader that loaded it. During the assembly phase, the VM stores in the AOT cache a map from class loader id to the set of classes loaded by that class loader object. These classes are loaded and linked when the class loader object is created during the production run.
This patch also extends URLClassLoader to transparently set the identity when a new URLClassLoader instance is created, provided all its URLs refer to JAR files. Currently the id is the same as the classpath (set of URLs) of the URLClassLoader, but that need not be the case; the id can be any function of the URLs or their contents. During the training run, the VM stores a mapping of URLClassLoader id to classpath in the AOT config file. This information is propagated to the final AOT cache. During the production run, when the URLClassLoader instance is created, the VM verifies that the runtime classpath of the URLClassLoader matches the classpath stored in the AOT cache. If the verification passes, all the classes associated with the URLClassLoader id are loaded and linked.
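
To make the id and verification flow concrete, here is a minimal Java sketch. It is not the code in this PR: the class `LoaderIdSketch`, its methods, the `:` separator, and the "file URL ending in .jar" test are all assumptions made for illustration, and the in-memory map only stands in for the id-to-classpath mapping kept in the AOT cache.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class LoaderIdSketch {
    // Hypothetical stand-in for the id -> classpath mapping kept in the AOT cache.
    private final Map<String, List<String>> cachedClasspaths = new HashMap<>();

    // Convenience factory that converts the checked exception into an unchecked one.
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static List<String> paths(List<URL> urls) {
        List<String> result = new ArrayList<>();
        for (URL u : urls) result.add(u.getPath());
        return result;
    }

    // Assumption: a loader gets an id only if every URL is a local JAR file.
    // Here the id is simply the classpath; it could be any function of the URLs.
    static Optional<String> idFor(List<URL> urls) {
        for (URL u : urls) {
            boolean localJar = "file".equals(u.getProtocol()) && u.getPath().endsWith(".jar");
            if (!localJar) return Optional.empty();
        }
        return Optional.of(String.join(":", paths(urls)));
    }

    // Training run: remember which classpath produced this id.
    void recordTraining(List<URL> urls) {
        idFor(urls).ifPresent(id -> cachedClasspaths.put(id, paths(urls)));
    }

    // Production run: cached classes are usable only if the id is known
    // and the runtime classpath equals the recorded one.
    boolean mayUseCachedClasses(List<URL> urls) {
        Optional<String> id = idFor(urls);
        return id.isPresent() && paths(urls).equals(cachedClasspaths.get(id.get()));
    }

    public static void main(String[] args) {
        LoaderIdSketch cache = new LoaderIdSketch();
        List<URL> cp = List.of(url("file:/tmp/foo.jar"), url("file:/tmp/bar.jar"));
        cache.recordTraining(cp);
        System.out.println(idFor(cp).orElse("<none>"));
        System.out.println(cache.mayUseCachedClasses(cp));
    }
}
```

Because the id equals the classpath in this sketch, the classpath comparison is redundant once the id is found; it only becomes meaningful if the id is derived differently, for example from a hash of the JAR contents.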


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/leyden.git pull/109/head:pull/109
$ git checkout pull/109

Update a local copy of the PR:
$ git checkout pull/109
$ git pull https://git.openjdk.org/leyden.git pull/109/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 109

View PR using the GUI difftool:
$ git pr show -t 109

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/leyden/pull/109.diff

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
…s in the assembly phase

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
@ashu-mehra ashu-mehra marked this pull request as draft February 7, 2026 04:35

bridgekeeper bot commented Feb 7, 2026

👋 Welcome back asmehra! A progress list of the required criteria for merging this PR into premain will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@ashu-mehra
Collaborator Author

fyi @iklam @adinn @rose00 @DanHeidinga


openjdk bot commented Feb 7, 2026

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.


openjdk bot commented Feb 7, 2026

@ashu-mehra this pull request can not be integrated into premain due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout custom-loader-support-v2
git fetch https://git.openjdk.org/leyden.git premain
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge premain"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Feb 7, 2026
Member

@iklam iklam left a comment

Hi Ashutosh, thanks for working on the POC. It looks very promising.

I have some suggestions for loading the classes in the assembly phase.

Does this work with simple test cases where a single URLClassLoader is being used?

It would be great if you could merge your code with the latest Leyden repo so others can play with it -- if you're ready for that ... :-)

Comment on lines +291 to +293
} else if (k->is_defined_by_aot_safe_custom_loader()) {
// Use UnregisteredClassLoader to load these classes
Handle unreg_class_loader = UnregisteredClasses::unregistered_class_loader(THREAD);
Member

unregistered_class_loader() can define only one class for each name. I think we should create one ClassLoader instance for each registered AOT compatible loader.

Also, we should rename UnregisteredClassLoader in CDS.java to something like StaticClassLoader, and use it for both the existing "unregistered" classes, as well as classes for new "registered AOT compatible loaders".

// Use UnregisteredClassLoader to load these classes
Handle unreg_class_loader = UnregisteredClasses::unregistered_class_loader(THREAD);
assert(unreg_class_loader.not_null(), "must be");
Klass* actual = SystemDictionary::resolve_or_fail(ik->name(), unreg_class_loader, true, CHECK);
Member

Instead of using _all_klasses, I think we can create one list for each registered loader.

We order the loaders by delegation (parent loaders are processed first).

Each list is sorted by class hierarchy (similar to what happens in AOTLinkedClassBulkLoader).

Then, we can directly call into SystemDictionary::preload_class(loader, k). Doing the name lookup in SystemDictionaryShared::load_shared_class_for_aot_safe_custom_loader() seems unnecessary. Also, such an API would open up an execution path that might be taken inadvertently, so it's better to avoid it.

By the way, classes in _all_klasses are loaded even if AOTClassLinking is false. For simplicity, I think we should support registered loaders only when AOTClassLinking is true.
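
The ordering suggested above (loaders parent-first, then each loader's classes supertypes-first) can be modelled in a few lines of Java. This is only an illustrative sketch, not HotSpot code: `PreloadOrderSketch`, `CachedClass`, and the string loader ids are hypothetical, and both the delegation graph and the class hierarchy are assumed to be acyclic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PreloadOrderSketch {
    // Hypothetical model of a cached class: its name, the id of its defining
    // loader, and the names of its supertypes (super class and interfaces).
    record CachedClass(String name, String loaderId, List<String> supertypes) {}

    // parentOf maps a registered loader id to its parent's id. A parent that is
    // not itself a key (for example a built-in loader) ends the walk.
    static List<String> loadersParentFirst(Map<String, String> parentOf) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String id : parentOf.keySet()) visitLoader(id, parentOf, ordered);
        return new ArrayList<>(ordered);
    }

    private static void visitLoader(String id, Map<String, String> parentOf, Set<String> ordered) {
        if (!parentOf.containsKey(id) || ordered.contains(id)) return;
        visitLoader(parentOf.get(id), parentOf, ordered);  // parent goes first
        ordered.add(id);
    }

    // Sorts one loader's classes so that every supertype defined by the same
    // loader precedes its subtypes. Supertypes from other loaders are skipped,
    // on the assumption that parent loaders were already processed.
    static List<CachedClass> supertypesFirst(List<CachedClass> classes) {
        Map<String, CachedClass> byName = new LinkedHashMap<>();
        for (CachedClass c : classes) byName.put(c.name(), c);
        Map<String, CachedClass> done = new LinkedHashMap<>();
        for (CachedClass c : classes) visitClass(c, byName, done);
        return new ArrayList<>(done.values());
    }

    private static void visitClass(CachedClass c, Map<String, CachedClass> byName,
                                   Map<String, CachedClass> done) {
        if (done.containsKey(c.name())) return;
        for (String s : c.supertypes()) {
            CachedClass sup = byName.get(s);
            if (sup != null) visitClass(sup, byName, done);
        }
        done.put(c.name(), c);
    }

    // One list per loader, concatenated in parent-first loader order.
    static List<CachedClass> preloadPlan(Map<String, String> parentOf, List<CachedClass> all) {
        List<CachedClass> plan = new ArrayList<>();
        for (String loader : loadersParentFirst(parentOf)) {
            List<CachedClass> mine = new ArrayList<>();
            for (CachedClass c : all) if (c.loaderId().equals(loader)) mine.add(c);
            plan.addAll(supertypesFirst(mine));
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new LinkedHashMap<>();
        parentOf.put("b", "a");
        parentOf.put("a", "app");
        List<CachedClass> all = List.of(
            new CachedClass("Bar", "b", List.of("Foo")),
            new CachedClass("Foo", "a", List.of("java.lang.Object")));
        for (CachedClass c : preloadPlan(parentOf, all)) {
            System.out.println(c.loaderId() + ": " + c.name());
        }
    }
}
```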

Collaborator Author

I was wondering about the same. We should be able to use SystemDictionary::preload_class(loader, k) for loading classes in the assembly phase as well. The preloading in the production run and in the assembly phase should be able to use the same code. I will work on this change.

Collaborator Author

@iklam

> Instead of using _all_klasses, I think we can create one list for each registered loader.
>
> We order the loaders by delegation (parent loaders are processed first).

I think in FinalImageRecipes, which is used during the assembly phase, we don't need to maintain a list for each registered loader if we can process the classes in hierarchical order. In 4711732 I updated ArchiveBuilder to store the classes in hierarchical order and to use the sorted class list to create class tables, similar to AOTLinkedClassTable. I also grouped unregistered classes and aot-safe loader classes into a single category, custom_loader_classes. We don't really need to distinguish between the two cases during the assembly phase. All these classes get loaded via SystemDictionary::preload_class() (with some modifications).
This change removed the need to do the lookup through SystemDictionaryShared::load_shared_class_for_aot_safe_custom_loader().
Does this make sense?

Member

Can you handle a situation like this?

URLClassLoader a = foo.jar:bar.jar;   loads classes Foo and Bar
URLClassLoader b = foo.jar:taz.jar;   loads classes Foo and Taz

In the assembly phase, you have only a single class loader instance, UnregisteredClasses::unregistered_class_loader(), that loads all the custom loader classes. I think it won't be able to define two different Foo classes:

InstanceKlass* loaded_ik = SystemDictionary::find_instance_klass(THREAD, ik->name(), loader);
if (loaded_ik == nullptr) {
  SystemDictionary::preload_class(loader, ik, CHECK);
} else {
  assert(loaded_ik == ik, "sanity check");   // <<<<< HERE
}

Also, do you plan to support more complex cases like delegation?

URLClassLoader a = foo.jar, parent = system loader;
URLClassLoader b = bar.jar, parent = a;

a: load class Foo
b: load class Bar, whose super class is Foo

Collaborator Author

> Can you handle a situation like this?
>
> URLClassLoader a = foo.jar:bar.jar; loads classes Foo and Bar
> URLClassLoader b = foo.jar:taz.jar; loads classes Foo and Taz

Nope, this won't be handled with the current state of things. I think we are not handling this scenario in the training phase either. Only one InstanceKlass for class Foo would be stored in the ArchiveBuilder::_klasses list. The other one would be marked for exclusion. If we want to allow this, we will have to allow duplicates in the ArchiveBuilder::_klasses list and then update FinalImageRecipes to maintain a list for each registered loader, as you mentioned earlier. Seems possible. I will take a stab at this.

> Also, do you plan to support more complex cases like delegation?

Yes, I expect the simple parent-first delegation model to work, but I haven't tested it yet. I will check it.

Member

> Only one InstanceKlass for class Foo would be stored in ArchiveBuilder::_klasses list.

This unique-name limitation should apply only to the "unregistered" classes. We currently have two categories of classes: BUILTIN and UNREGISTERED:

https://github.com/openjdk/jdk/blob/7d2b6ed8923d8955afb533ea78c72abd07628c0d/src/hotspot/share/classfile/systemDictionaryShared.hpp#L51-L55

We have the odd UNREGISTERED name because we once had REGISTERED a long time ago (but this was never published in OpenJDK). We had an API to register a class loader for CDS support (it's kind of similar to what you're doing), but we never had a satisfactory solution, so it was abandoned when the AppCDS code was added in JDK 8.

If it makes sense, you can add this enum back.

Another option is to get rid of the name "unregistered" altogether, and use something like BUILTIN vs CUSTOM.

The CUSTOM classes must be unique with the (aot-id + name) combination. A CUSTOM class that's not defined by an aot-safe loader will have an aot-id of null. This way we will retain the old behavior of UNREGISTERED classes.
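
As a rough illustration of the (aot-id + name) uniqueness rule, the following Java sketch keys each CUSTOM class by both values and uses a null aot-id for loaders that are not aot-safe. The names `CustomClassKeySketch`, `Key`, and `tryAdd` are hypothetical and do not correspond to anything in HotSpot.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class CustomClassKeySketch {
    // aotId is null for a class defined by a loader that is not aot-safe,
    // which reproduces the old one-class-per-name UNREGISTERED behavior.
    record Key(String aotId, String className) {}

    private final Set<Key> seen = new HashSet<>();

    // Returns true if the class may be archived, false if an entry with the
    // same (aot-id, name) pair was already recorded.
    boolean tryAdd(String aotId, String className) {
        return seen.add(new Key(aotId, Objects.requireNonNull(className)));
    }

    public static void main(String[] args) {
        CustomClassKeySketch registry = new CustomClassKeySketch();
        System.out.println(registry.tryAdd("foo.jar:bar.jar", "Foo"));
        System.out.println(registry.tryAdd("foo.jar:taz.jar", "Foo"));
        System.out.println(registry.tryAdd(null, "Foo"));
        System.out.println(registry.tryAdd(null, "Foo"));
    }
}
```

With this key, the two Foo classes from the foo.jar:bar.jar and foo.jar:taz.jar loaders can coexist, while a second Foo with no aot-id is still rejected.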

Collaborator Author

@ashu-mehra ashu-mehra Feb 19, 2026

> Another option is to get rid of the name "unregistered" altogether, and use something like BUILTIN vs CUSTOM.

This is exactly what I was planning to do. UNREGISTERED doesn't convey the intention in current code context.

> The CUSTOM classes must be unique with the (aot-id + name) combination. A CUSTOM class that's not defined by an aot-safe loader will have an aot-id of null. This way we will retain the old behavior of UNREGISTERED classes.

Right, so we can continue to maintain a single list in ArchiveBuilder::_klasses, but when writing the preimage or final image, the classes would be stored in a map with the aot-id as the key and the list of classes loaded by that loader as the value.
Regarding duplicates, we want to allow duplicates in the UNREGISTERED category when custom loader support is enabled, but not when a legacy CDS archive is used. So I guess just disabling the duplicate processing when custom loader support is enabled should be sufficient to make it work.
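
The "single list in memory, map keyed by aot-id at write time" idea might look roughly like this in Java. It is a sketch under assumptions: `GroupByAotIdSketch`, `Entry`, and the `NO_ID` sentinel for classes without an aot-id are invented here and say nothing about the actual archive layout.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupByAotIdSketch {
    // Hypothetical flat entry, as it might sit in a single class list.
    record Entry(String aotId, String className) {}

    // Sentinel bucket for classes whose loader has no aot-id (an assumption of this sketch).
    static final String NO_ID = "<no aot-id>";

    // Builds the write-time table: aot-id -> classes defined by that loader,
    // preserving the order in which classes appear in the flat list.
    static Map<String, List<String>> group(List<Entry> klasses) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        for (Entry e : klasses) {
            String key = e.aotId() == null ? NO_ID : e.aotId();
            table.computeIfAbsent(key, k -> new ArrayList<>()).add(e.className());
        }
        return table;
    }

    public static void main(String[] args) {
        List<Entry> klasses = List.of(
            new Entry("foo.jar:bar.jar", "Foo"),
            new Entry("foo.jar:taz.jar", "Foo"),
            new Entry("foo.jar:bar.jar", "Bar"),
            new Entry(null, "Legacy"));
        group(klasses).forEach((id, names) -> System.out.println(id + " -> " + names));
    }
}
```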

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
@ashu-mehra
Collaborator Author

> Does this work with simple test cases where a single URLClassLoader is being used?

I tested it with a single URLClassLoader delegating to system loader and it seemed to work fine.

> It would be great if you could merge your code with the latest Leyden repo so others can play with it -- if you're ready for that ... :-)

Yup, I will do that. I just wanted to get this out to get feedback on whether this is in line with what we have been discussing.

@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Feb 11, 2026
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Feb 18, 2026
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
… any class

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
file

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
Collaborator

rose00 commented Feb 20, 2026

Per our meeting yesterday, here are some edge cases to build test cases for.

A(foo…) ← C(bar…)               may adopt (from AOTC) non-foo classes from bar into C
A(…foo…) ← C(…foo…)             must not adopt foo classes into C (already in some A)
A(…foo…) ← C(…foo2…)            must not adopt foo classes into C (if dups in foo2)
A(foo…:new.jar) ← C(new.jar)    must not adopt new classes into C (already in some A)
A ← C(bar…) & A ← C2(bar…)      identical C2 must not poach those adopted into C
A ← C(bar…) & A ← C2(bar2…)     same as above, but dup classfile in bar2
A ← C ← Q(bar…)                 must not adopt anything into Q (even if bar is in AOTC)

Notation: A is one of the "big three" CLs (boot/sys/app class loaders) which the AOT cache already supports as "live objects". C, C2 are CCLs (custom CLs) that delegate directly to an A. Such delegation is written A←C. (Backwards arrow favors left-to-right order of lookup.) Q is a class loader that delegates to another CCL (not an A CL). To make classpaths explicit we write A(foo:bar) for foo and bar being JARs, and foo:bar being the -classpath argument on the command line, also C(foo:bar) for a CCL whose CP string is "foo:bar". (I’m avoiding writing foo.jar everywhere; the .jar is part of the foo.) The name foo2 suggests a JAR file whose name differs from another JAR foo, but which contains overlapping classfile elements, when there’s a possible problem with multiple class definitions. Ellipsis C(foo…) denotes possibility of additional CP elements (besides foo).

The case involving A(foo…:new.jar) supposes that A(foo…) was in the AOTC because the training run CP was -cp foo…, but then the production run CP appended new, by saying -cp foo…:new.jar. Leyden supports CP appending on the command line. But it is an open question how to manage simultaneously appending on the command line, and then also adding classes from a CCL.

To support simultaneously extending the command-line classpath (A(foo…:new.jar)) and delegating from CCLs (C(new.jar)), we will need to check carefully that classes in new.jar, if stashed in the AOTC for a CCL, will not be adopted if the command-line classpath grabbed new.jar first.

The simplest thing to do, for starters, might be to disable CCLs totally, if there is not an EXACT match for the command-line classpath. (No -cp old.jar:new.jar, just -cp old.jar in order to enable CCLs.)
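
A conservative gate along these lines could be sketched in Java as follows. The class and method names are hypothetical, classpaths are modelled as plain string lists, and the "big three" are assumed to map onto a null parent (boot), `ClassLoader.getPlatformClassLoader()`, and `ClassLoader.getSystemClassLoader()`; none of this is taken from the PR.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

final class CclGateSketch {
    // True when cl delegates directly to a built-in loader (the A <- C case).
    // A null parent is treated as the boot loader.
    static boolean delegatesDirectlyToBuiltin(ClassLoader cl) {
        ClassLoader parent = cl.getParent();
        return parent == null
            || parent == ClassLoader.getPlatformClassLoader()
            || parent == ClassLoader.getSystemClassLoader();
    }

    // "Start stupid": only an exact command-line classpath match enables CCL support.
    static boolean exactClasspathMatch(List<String> trainingCp, List<String> productionCp) {
        return trainingCp.equals(productionCp);
    }

    // A custom loader may adopt cached classes only if both restrictions hold.
    static boolean mayAdoptCachedClasses(ClassLoader cl, List<String> trainingCp,
                                         List<String> productionCp) {
        return exactClasspathMatch(trainingCp, productionCp) && delegatesDirectlyToBuiltin(cl);
    }

    public static void main(String[] args) {
        ClassLoader c = new URLClassLoader(new URL[0], ClassLoader.getSystemClassLoader());
        ClassLoader q = new URLClassLoader(new URL[0], c);  // the A <- C <- Q case
        List<String> trained = List.of("old.jar");
        System.out.println(mayAdoptCachedClasses(c, trained, List.of("old.jar")));
        System.out.println(mayAdoptCachedClasses(c, trained, List.of("old.jar", "new.jar")));
        System.out.println(mayAdoptCachedClasses(q, trained, List.of("old.jar")));
    }
}
```

The sketch covers only the classpath gate and the multi-level delegation case; it does not check for duplicate classfiles across JARs or for two identical loaders competing for the same cached classes.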

Similar points for multi-level delegation (A←C←Q). We can work it out, but the details are likely to be buggy at first, so go slow if at all in doubt.

Our overall consensus yesterday was "start stupid". If in doubt, leave it out, for now.

We are starting with limited capabilities in order to avoid edge cases we might not fully understand yet.

As we gain confidence, we can relax restrictions. (It’s hard to go the other way.)

(Ioi, feel free to correct me if I missed something.)

Member

iklam commented Feb 23, 2026

I think John's rules make a lot of sense. Let's start with more restrictive support and make sure the implementation is solid. Then, we can gradually extend it to support more use cases.

I think we can develop some dynamic checks to validate the current implementation. These checks will also be useful when we extend AOT support to other custom class loaders:

(1) When the JVM is running in training mode, and a class loader is marked as AOT compatible (currently only URLClassLoaders with local files):

  • When defineClass() is called, and the CodeSource is from a local JAR file, check that the classfile bytes match the contents in the JAR file.
  • Check that the behavior is correct (e.g., the rules in John's comment).
  • If any of the checks fail, print an error message and mark the loader as not AOT compatible.
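
The first training-time check, comparing the bytes passed to defineClass() against the JAR entry, could be prototyped in plain Java roughly as below. `DefineClassCheckSketch`, `onDefineClass`, and the `aotCompatible` flag are hypothetical names; a real implementation would live inside the class loader and VM, and this sketch ignores multi-release JARs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

final class DefineClassCheckSketch {
    private boolean aotCompatible = true;

    boolean isAotCompatible() { return aotCompatible; }

    // True when `defined` equals the bytes of <className>.class inside the JAR.
    static boolean bytesMatchJar(Path jar, String className, byte[] defined) {
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jf = new JarFile(jar.toFile())) {
            JarEntry entry = jf.getJarEntry(entryName);
            if (entry == null) return false;
            try (InputStream in = jf.getInputStream(entry)) {
                return Arrays.equals(in.readAllBytes(), defined);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called for each defined class; on a mismatch, report and demote the loader.
    void onDefineClass(Path jar, String className, byte[] defined) {
        if (aotCompatible && !bytesMatchJar(jar, className, defined)) {
            System.err.println("AOT check failed for " + className
                + "; loader marked as not AOT compatible");
            aotCompatible = false;
        }
    }

    // Self-check using a throwaway JAR that holds fake classfile bytes.
    static boolean selfCheck() {
        byte[] original = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 1, 2, 3};
        byte[] tampered = original.clone();
        tampered[6] = 9;
        try {
            Path jar = Files.createTempFile("define-check", ".jar");
            try {
                try (OutputStream out = Files.newOutputStream(jar);
                     JarOutputStream jos = new JarOutputStream(out)) {
                    jos.putNextEntry(new JarEntry("p/Foo.class"));
                    jos.write(original);
                    jos.closeEntry();
                }
                DefineClassCheckSketch loader = new DefineClassCheckSketch();
                loader.onDefineClass(jar, "p.Foo", original);
                boolean okBefore = loader.isAotCompatible();
                loader.onDefineClass(jar, "p.Foo", tampered);
                return okBefore && !loader.isAotCompatible();
            } finally {
                Files.deleteIfExists(jar);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("self-check passed: " + selfCheck());
    }
}
```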

(2) When the JVM is running in a production run, and a ClassLoader is found to be adaptable from the AOT cache:

  • If a diagnostic flag (like -XX:+DiagnoseAOTClassLoaders) is specified, do not preload cached classes into this loader.
  • Instead, allow the loader to load classes from classfiles. Each time a class is defined by this loader, check that the shape of this class matches the cached version. If not, throw a runtime exception.
  • If the cached version had a CodeSource of a local JAR file, check that the CodeSource is the same as used in the production run, and that the classfile bytes are correct. If not, throw a runtime exception.
  • Also check that the supertypes (and defining class loader of the super types) match the cached version.
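
One way to picture the production-run shape check is a reflective comparison like the Java sketch below. What counts as a class's "shape" is an assumption here (name, super class, interfaces, declared method signatures), `ClassShapeSketch` is a hypothetical name, and the defining loaders of the supertypes and the CodeSource are deliberately left out.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

final class ClassShapeSketch {
    // Hypothetical summary of a class's shape, as it might be recorded at training time.
    record Shape(String name, String superName, List<String> interfaces, List<String> methods) {}

    static Shape shapeOf(Class<?> c) {
        Class<?> sup = c.getSuperclass();
        List<String> interfaces = Arrays.stream(c.getInterfaces())
            .map(Class::getName).sorted().toList();
        List<String> methods = Arrays.stream(c.getDeclaredMethods())
            .map(Method::toString).sorted().toList();
        return new Shape(c.getName(), sup == null ? null : sup.getName(), interfaces, methods);
    }

    // Throws a runtime exception when the freshly defined class does not match
    // the cached shape, mirroring the diagnostic behavior described above.
    static void verify(Shape cached, Class<?> defined) {
        Shape actual = shapeOf(defined);
        if (!cached.equals(actual)) {
            throw new IllegalStateException(
                "class shape mismatch: cached " + cached.name() + ", defined " + actual.name());
        }
    }

    public static void main(String[] args) {
        Shape cached = shapeOf(java.util.ArrayList.class);
        verify(cached, java.util.ArrayList.class);
        System.out.println("shape matches for " + cached.name());
    }
}
```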

I think the production run checks can validate the implementation of code generators that are supposed to generate predictable class shapes. This way, we can provide AOT support for dynamic languages implemented on top of the JVM. (@headius what do you think?)

Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
Labels

merge-conflict Pull request has merge conflict with target branch

3 participants