UPDATE 2024-11-29: Investigating per-loader class isolation bug. The shiftLeft static method resolution error is being debugged. Debug output has been added to class-package and static-method-symbol functions.
UPDATE 2024-11-28: Fixed the integer? returning null issue! The root cause was incomplete transitive merging of stack variables at control flow merge points.
UPDATE 2024-11-28 (later): Added invoke-special method dispatch caching to fix a 9-minute hang during Clojure compilation. The hang was caused by repeated compute-applicable-methods-using-classes calls for |<init>()| with 851+ methods.
OpenLDK can now correctly execute Clojure's integer? function. Testing continues to see if Clojure fully loads.
When running Clojure on OpenLDK, static method calls can fail with "undefined function" errors because the generated code looks for method symbols in the wrong Lisp package:
The function OPENLDK::|clojure/lang/Numbers.shiftLeft(JLjava/lang/Object;)| is undefined.
The method should be in OPENLDK.APP (application class loader's package), but the generated code looks in OPENLDK (bootstrap class loader's package).
OpenLDK implements JVM class loader isolation using Lisp packages:
| Class Loader | Lisp Package | Purpose |
|---|---|---|
| Bootstrap | OPENLDK |
JDK core classes (java/, sun/, etc.) |
| Application | OPENLDK.APP |
Application classpath classes |
| DynamicClassLoader | OPENLDK.L1, L2, ... |
Clojure's dynamic class loading |
Each class loaded by a specific loader has its CLOS class definition and static method stubs defined in that loader's package.
# Build OpenLDK
make
# Run Clojure Hello World (triggers the bug)
JAVA_CMD=./openldk LDK_DEBUG=L clojure -M -e '(println "Hello World!")'The error occurs during Clojure's bit_shift_left compilation, which calls clojure.lang.Numbers.shiftLeft().
(defmethod codegen ((insn ir-call-static-method) context)
(with-slots (class method-name args return-type) insn
(make-instance '<expression>
:code (let* ((loader (slot-value context 'ldk-loader))
(_ (classload class))
(declaring-class (or (%find-declaring-class class method-name loader) class))
(pkg (class-package declaring-class)) ; <-- Gets package for the class
(full-name (format nil "~A.~A" declaring-class method-name))
(method-sym (static-method-symbol full-name pkg))) ; <-- Resolves symbol
...))))(defun class-package (class-name)
"Get the Lisp package for a class by its binary name.
Searches boot loader, then app loader for the class.
Falls back to :openldk if class not found or has no loader."
(if-let ((class (%get-ldk-class-by-bin-name class-name t)))
(if-let ((loader (slot-value class 'ldk-loader)))
(loader-package loader)
(find-package :openldk))
;; Not in boot loader - check app loader
(if (and *app-ldk-class-loader*
(gethash class-name (slot-value *app-ldk-class-loader* 'ldk-classes-by-bin-name)))
(loader-package *app-ldk-class-loader*)
(find-package :openldk))))(defun static-method-symbol (method-name loader-pkg)
"Get the symbol for a static method name.
First checks :openldk for native method implementations,
then falls back to the loader's package for Java-defined methods."
(let ((openldk-sym (find-symbol method-name (find-package :openldk))))
(if (and openldk-sym (fboundp openldk-sym))
openldk-sym ; Native method in :openldk
(intern method-name loader-pkg)))) ; Java method in loader's packageDebug statements have been added to trace the bug:
In class-package (triggers for clojure/lang/Numbers):
(when (str:starts-with? "clojure/lang/Numbers" class-name)
(format t "~&; DEBUG: class-package called for ~A~%" class-name)
(format t "~&; DEBUG: class-package for ~A: loader=~A pkg=~A~%"
class-name loader pkg))In static-method-symbol (triggers for shiftLeft):
(when (str:contains? "shiftLeft" method-name)
(format t "~&; DEBUG: static-method-symbol: method-name=~A loader-pkg=~A~%"
method-name loader-pkg)
(format t "~&; DEBUG: static-method-symbol: openldk-sym=~A fboundp=~A~%"
openldk-sym (and openldk-sym (fboundp openldk-sym)))
(format t "~&; DEBUG: static-method-symbol: returning ~A (package ~A)~%"
result (symbol-package result)))Sample output:
; DEBUG: class-package called for clojure/lang/Numbers
; DEBUG: class-package for clojure/lang/Numbers: loader=#<<LDK-CLASS-LOADER> 1 pkg=#<PACKAGE "OPENLDK.APP">> pkg=#<PACKAGE "OPENLDK.APP">
This shows class-package IS returning OPENLDK.APP correctly. The bug may involve:
- Timing: Class not loaded when
class-packageis called during codegen - Symbol already exists: Symbol may exist in
:openldkfrom previous reference - Call site mismatch: Stub defined correctly but call site uses wrong symbol
A related bug was fixed where clojure/core$dotimes was defined multiple times. The fix added find-class-in-loader-hierarchy to check parent loaders before defining a class.
- Verify timing: Add debug to see if
classloadcompletes beforeclass-packageis called - Check symbol creation: Trace when
OPENLDK::|..shiftLeft..|symbol is first created - Inspect compiled code: Use
--dump-dirto see generated code forbit_shift_left - Compare with working case: Check why other static methods work
# Basic test with debug output
JAVA_CMD=./openldk LDK_DEBUG=L clojure -M -e '(println "Hello World!")'
# Capture full output
JAVA_CMD=./openldk LDK_DEBUG=L clojure -M -e '(println "Hello World!")' 2>&1 > debug.log
# Filter for debug lines
grep -E "DEBUG|shiftLeft|bit_shift_left" debug.log
# Simple Hello World (fewer classes)
./openldk Hello- Java 8 JDK (tested with temurin-jdk)
- Clojure 1.12.3 JAR
# Build OpenLDK
cd /home/green/git/cl-openldk/openldk
JAVA_HOME=/usr/lib/jvm/java-8-temurin-jdk/jre make
# Run Clojure
JAVA_HOME=/usr/lib/jvm/java-8-temurin-jdk/jre \
LDK_CLASSPATH=/tmp:~/.m2/repository/org/clojure/clojure/1.12.3/clojure-1.12.3.jar \
./openldk clojure.main -e "(println \"Hello Clojure\")"# Compile the test
/usr/lib/jvm/java-8-temurin-jdk/bin/javac -cp ~/.m2/repository/org/clojure/clojure/1.12.3/clojure-1.12.3.jar:/tmp /tmp/CallClojureIntegerQ.java
# Run in OpenLDK - should print "Result: true"
JAVA_HOME=/usr/lib/jvm/java-8-temurin-jdk/jre \
LDK_CLASSPATH=/tmp:~/.m2/repository/org/clojure/clojure/1.12.3/clojure-1.12.3.jar \
./openldk CallClojureIntegerQLDK_DEBUG=c- Enable codegen debug output (very verbose, prints generated Lisp code)LDK_DEBUG=e- Enable exception debug outputLDK_DEBUG=l- Enable class loading debug output
Clojure's integer? function was returning null instead of Boolean.TRUE, causing even? to throw "Argument must be an integer: 2".
The integer? bytecode has multiple paths that all converge at an areturn:
13: getstatic Boolean.TRUE
16: goto 22
19: getstatic Boolean.FALSE
22: goto 154 ; merge point 1: paths from 16 and 19
...
39: getstatic Boolean.TRUE
42: goto 48
45: getstatic Boolean.FALSE
48: goto 154 ; merge point 2: paths from 42 and 45
...
154: areturn ; final merge: paths from 22, 48, etc.
Each path creates a <stack-variable> with its own var-numbers set (addresses where it's used). At merge points, merge-stacks unifies the var-numbers of converging paths.
The bug: When stack states merge at intermediate points (22, 48) and then at the final point (154), the transitive closure wasn't computed. For example:
- At PC 22:
SV_13andSV_19merge → both get var-numbers {13, 19} - At PC 154:
SV_19(from 22) andSV_45(from 48) merge →SV_19gets {13, 19, 39, 45, ...} - But
SV_13still has only {13, 19} - the later merge doesn't propagate back!
This caused the assignment at address 13 to use variable s{13,19} while the return at address 154 used variable s{13,19,39,45,...} - different variables!
Rewrote fix-stack-variables in src/openldk.lisp (lines 145-208) to use a union-find algorithm with path compression:
- Initialize each stack-variable as its own equivalence class
- For each var-number, union all stack-variables that share it
- After all unions, all transitively connected stack-variables are in the same class
- Update all stack-variables in each class to have the same (union of all) var-numbers
This ensures that if SV_A shares var-numbers with SV_B, and SV_B shares with SV_C, then SV_A, SV_B, and SV_C all get the same unified var-numbers.
src/openldk.lisp (lines 145-208):
fix-stack-variables- now uses union-find for proper transitive closure
src/codegen.lisp (line 850-855):
ir-if-acmpeq- fixed to use directeqcomparison instead ofsxhash(earlier fix in this session)
Caused by: java.lang.IllegalArgumentException: Argument must be an integer: 2
This was caused by integer? returning null (due to the stack variable bug), which even? then compared with Boolean.FALSE using if_acmpeq. Since null != FALSE, it went to the error branch.
Problem: Clojure compilation hung for 9+ minutes at "COMPILING clojure/lang/ChunkedCons"
Root Cause: The invoke-special function calls compute-applicable-methods-using-classes to find the correct method to invoke. For constructor methods like |<init>()|, there can be 851+ methods across all loaded classes. Each call to this CLOS function is expensive, and during class loading, invoke-special is called repeatedly for the same method/owner combinations.
Solution: Added *invoke-special-cache* hash table in src/openldk.lisp (lines 94-145):
- Cache key:
(cons method-symbol owner-symbol) - Cache value:
(cons method-function next-methods) - On cache hit, directly call the cached method function
- On cache miss, compute methods, cache the result, then call
(defvar *invoke-special-cache* (make-hash-table :test 'equal))
(defun invoke-special (method-symbol owner-symbol args)
(let* ((cache-key (cons method-symbol owner-symbol))
(cached (gethash cache-key *invoke-special-cache*)))
(if cached
(funcall (car cached) args (cdr cached))
;; Cache miss - compute and cache...Problem: integer? returned null instead of Boolean.TRUE
Root Cause: fix-stack-variables didn't properly compute transitive closure when merging stack variables across multiple control flow merge points.
Solution: Rewrote using union-find algorithm with path compression for O(α(n)) performance.
Problem: Object reference comparison used sxhash instead of eq
Root Cause: The ir-if-acmpeq codegen was comparing hash codes instead of object identity.
Solution: Changed to use direct eq comparison:
(defmethod codegen ((insn ir-if-acmpeq) context)
(with-slots (offset value1 value2) insn
(make-instance '<expression>
:insn insn
:code (list 'when (list 'eq (code (codegen value1 context))
(code (codegen value2 context)))
(list 'go (intern (format nil "branch-target-~A" offset)))))))Problem: The function OPENLDK::|test(Ljava/lang/Object;)| is undefined
Root Cause: %lambda-metafactory only created LambdaSupplier (implementing Supplier.get()), but Java lambdas use many functional interfaces like Predicate.test(), Function.apply(), Consumer.accept(), etc.
Solution: Added multiple lambda implementation classes in src/native.lisp:
LambdaPredicate-test(Object)for filteringLambdaFunction-apply(Object)for transformationLambdaConsumer-accept(Object)for side effectsLambdaBiConsumer-accept(Object, Object)for two-arg consumersLambdaBinaryOperator-apply(Object, Object)for combining
Modified %lambda-metafactory to select the appropriate class based on the interface method name.
Problem: Virtual methods called via linkToVirtual used fully qualified names like java/util/List.add(Ljava/lang/Object;) but instance methods are defined as defmethod |add(Ljava/lang/Object;)|.
Solution: Modified %invoke-from-member-name in src/native.lisp to check the reference kind (ref-kind) from MemberName flags:
ref-kind = 6(REF_invokeStatic): Use fully qualifiedclass.method(desc)nameref-kind = 5(REF_invokeVirtual) or7(REF_invokeSpecial): Use simplemethod(desc)name
Problem: Constructors called via findConstructor needed special handling because:
- MethodType has the class as return type, but
<init>methods return void - Constructor methods are
defmethod |<init>()|, notdefun |Class.<init>()|
Solution: Added constructor handling in %invoke-from-member-name that:
- Creates a new instance with
make-instance - Extracts params from MethodType and appends
Vfor void return - Calls the instance method
|<init>(...)|(not fully qualified) - Returns the constructed instance
Problem: checkUnprivilegedlookupClass threw IllegalArgumentException for bootstrap classes (java., sun.) when creating lookup objects for lambda metafactory.
Solution: Added native no-op implementation for checkUnprivilegedlookupClass and registered it in native-override-p in src/openldk.lisp.
- Lines 145-208:
fix-stack-variables- union-find based stack variable merging - Lines 63-78:
native-override-pfunction that determines which methods use native implementations - Lines 1165-1192:
<clinit>exception handling that wraps errors inExceptionInInitializerError
- Lines 850-855:
ir-if-acmpeqcodegen (fixed to useeq) - Lines 857-862:
ir-if-acmpnecodegen - Lines 959-976:
ir-instanceofcodegen with integer type handling - Lines 1124-1155:
ir-call-dynamic-methodcodegen forinvokedynamic
- Lines 2195-2330:
findStatic,findSpecial,findConstructor,findVirtualimplementations - Lines 2395-2472:
%invoke-from-member-name- core method handle invocation logic - Lines 2583-2662: Lambda implementation classes and
%lambda-metafactory - Lines 2474-2580:
linkToStatic,linkToVirtual,linkToSpecialintrinsics
- Lines 1957-1975:
merge-stack-variablesandmerge-stacks- stack unification at branch points
When bytecode has multiple paths that converge (like if/else branches), each path may push values onto the operand stack. At the merge point, these values must be treated as the "same" variable in the generated code.
Example:
if (cond) {
push A; // stack var SV_A
} else {
push B; // stack var SV_B
}
// merge point: SV_A and SV_B must become the same variable
use(top_of_stack);
-
During bc-to-ir: Each bytecode instruction creates
<stack-variable>objects withvar-numbers= {address}. Branch instructions record stack states at targets via%record-stack-state. -
After bc-to-ir:
merge-stacksis called for each PC with multiple incoming states. This unifiesvar-numbersof corresponding stack positions. -
After merging:
fix-stack-variablescomputes transitive closure using union-find. -
During codegen: Stack variable names are generated from
var-numbers:s{1,5,9}means this variable is used at addresses 1, 5, and 9.
If SV_A shares address with SV_B (merged at point X), and SV_B shares address with SV_C (merged at point Y), then SV_A and SV_C must also share all addresses (transitive property).
Union-find efficiently computes these equivalence classes in near-linear time.
From java.lang.invoke.MethodHandleInfo:
REF_getField = 1
REF_getStatic = 2
REF_putField = 3
REF_putStatic = 4
REF_invokeVirtual = 5
REF_invokeStatic = 6
REF_invokeSpecial = 7
REF_newInvokeSpecial = 8 (constructor)
REF_invokeInterface = 9
For LambdaMetafactory.metafactory(Lookup, String, MethodType, MethodType, MethodHandle, MethodType):
Lookup- caller's lookup contextString- interface method name (e.g., "test", "apply")MethodType- CallSite signature (functional interface type + captured args)MethodType- interface method signatureMethodHandle- implementation method handleMethodType- instantiated method type (after type erasure)