A contributor-facing map of how the Groovy compiler and runtime are
organised in this repository. This document is for people working on
Groovy. For documentation aimed at people using Groovy, see
https://groovy.apache.org/ and the AsciiDoc sources under
src/spec/doc/ and subprojects/<module>/src/spec/doc/.
This is an overview, not a reference. It exists to give a new contributor — human or AI — enough orientation to read the code productively and to avoid a small set of common mis-steps. Code is the source of truth; this document is a pointer file.
| Path | What lives there |
|---|---|
| src/main/java/org/codehaus/groovy/ | Core compiler and runtime (legacy package; most of the codebase) |
| src/main/java/org/apache/groovy/ | Newer code added under the org.apache.groovy.* package convention |
| src/main/java/groovy/ | User-facing API (groovy.lang.*, groovy.util.*, etc.) |
| src/main/groovy/ | Groovy sources compiled into the core jar |
| src/main/resources/ | Service files, META-INF, default scripts |
| src/antlr/ | ANTLR4 grammar (GroovyLexer.g4, GroovyParser.g4); see "Generated code" below |
| src/spec/doc/ | User-facing AsciiDoc reference docs |
| src/spec/test/ | Executable Groovy snippets include::'d by the AsciiDoc sources |
| src/test/ | JUnit / Spock tests for the core jar |
| subprojects/ | ~50 modular subprojects (groovy-json, groovy-sql, groovy-xml, groovy-typecheckers, parser-antlr4 wiring, etc.) |
| subprojects/groovy-binary/ | Aggregator that produces the final distribution and the published spec |
| subprojects/binary-compatibility/ | Enforces public-API binary compatibility across releases |
| subprojects/tests-preview/ | Tests that depend on preview JDK features |
| bootstrap/, buildSrc/, build-logic/ | Build infrastructure (Gradle convention plugins, bootstrap helpers) |
When in doubt, prefer adding new code under org.apache.groovy.*; the
older org.codehaus.groovy.* packages remain for legacy reasons and
are kept stable for compatibility.
Some subprojects with significant local complexity carry their own
ARCHITECTURE.md alongside the source — see
subprojects/groovy-groovysh/ARCHITECTURE.md
for the worked example.
The driver is org.codehaus.groovy.control.CompilationUnit. A
SourceUnit represents a single source file inside it. Compilation
proceeds in numbered phases declared in
Phases.java
and exposed as the
CompilePhase
enum that AST transformations and customizers attach to:
| # | Phase | What happens | Driver classes |
|---|---|---|---|
| 1 | INITIALIZATION | Source files opened, CompilationUnit configured, customizers applied | CompilationUnit, CompilerConfiguration |
| 2 | PARSING | ANTLR4 lexer + parser produce a CST (parse tree) | Antlr4ParserPlugin, GroovyLangLexer, GroovyLangParser |
| 3 | CONVERSION | CST → AST (ModuleNode / ClassNode / MethodNode / ...) | AstBuilder |
| 4 | SEMANTIC_ANALYSIS | Class resolution, import handling, validity checks the grammar can't catch | ResolveVisitor, StaticImportVisitor, AnnotationConstantsVisitor |
| 5 | CANONICALIZATION | Fill in the AST: synthesized members, generic types, most local AST transforms run here | ASTTransformationVisitor, GenericsVisitor |
| 6 | INSTRUCTION_SELECTION | Optimisations and instruction-set selection; @CompileStatic / @TypeChecked run here | OptimizerVisitor, StaticTypeCheckingVisitor |
| 7 | CLASS_GENERATION | AST → bytecode in memory | AsmClassGenerator, Verifier, classes under classgen/asm/ |
| 8 | OUTPUT | Write generated .class files | CompilationUnit output stage |
| 9 | FINALIZATION | Cleanup, Janitor callbacks | CompilationUnit, Janitor |
Each phase iterates over all SourceUnits before the next phase
begins. AST transformations declare which phase they run in; the
canonical question to ask before adding one is "what state must the
AST be in for this transform to make sense?" — pick the earliest phase
where that holds.
The phase enum is the right anchor for any documentation that talks about "when X happens during compilation". Quoting the phase names verbatim keeps the reference precise; paraphrasing tends to drift.
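As a small illustration of the phase table, the sketch below prints the CompilePhase constants with their numbers and then drives a CompilationUnit only as far as CONVERSION. It assumes the Groovy core jar is on the classpath; the source name Probe.groovy and the class Probe are made up for the example.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.Phases

// Print every phase with its number, in declaration order.
CompilePhase.values().each { phase ->
    println "${phase.phaseNumber} ${phase.name()}"
}

// Stop after CONVERSION: the AST exists, but nothing has been resolved
// or generated yet.
def unit = new CompilationUnit()
unit.addSource('Probe.groovy', 'class Probe { int x }')   // hypothetical source
unit.compile(Phases.CONVERSION)

// Collect the names of the ClassNodes the AST holds at this point.
def classNames = unit.getAST().getClasses()*.name
println classNames
```

Stopping at an early phase like this is a cheap way to inspect what a later phase will receive as input.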
- Grammar lives in src/antlr/GroovyLexer.g4 and src/antlr/GroovyParser.g4. The generated parser is regenerated from these sources on every build, so changes belong in the .g4 files.
- The ANTLR Gradle plugin generates GroovyLexer, GroovyParser, GroovyParserVisitor, and GroovyParserBaseVisitor into build/generated/sources/antlr4/org/apache/groovy/parser/antlr4/.
- Hand-written code that wires the parser into CompilationUnit lives in src/main/java/org/apache/groovy/parser/antlr4/ (Antlr4PluginFactory, Antlr4ParserPlugin, GroovyLangLexer, GroovyLangParser, AstBuilder, plus support classes: ModifierManager, GroovydocManager, SemanticPredicates, PositionInfo).
- AstBuilder is the hand-off from CST to AST. It is large; almost every parser-visible language change touches it.
- Root: org.codehaus.groovy.ast.ASTNode.
- Sub-packages:
  - org.codehaus.groovy.ast.expr: expression nodes (BinaryExpression, MethodCallExpression, ...)
  - org.codehaus.groovy.ast.stmt: statement nodes (BlockStatement, ForStatement, ...)
  - org.codehaus.groovy.ast.tools: helpers (GeneralUtils is the common one; prefer its factory methods over hand-built nodes)
- Top-level structural nodes: ModuleNode (one per source file) → ClassNode → MethodNode / FieldNode / PropertyNode / ConstructorNode.
- ClassNode instances for primitive and common types should be obtained from ClassHelper, not constructed directly. Constructing fresh ClassNodes for int, String, Object, etc. is a frequent source of equality and resolution bugs.
- Visitors: GroovyCodeVisitor (expression + statement), GroovyClassVisitor (class members), with ClassCodeVisitorSupport / CodeVisitorSupport as bases, and ClassCodeExpressionTransformer for transforms that rewrite expressions in place.
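The two recommendations above (ClassHelper for well-known types, GeneralUtils for node construction) might look like this in practice. This is a minimal sketch rather than code taken from the tree; the variable name value is hypothetical.

```groovy
import org.codehaus.groovy.ast.ClassHelper
import org.codehaus.groovy.ast.expr.ConstantExpression
import org.codehaus.groovy.ast.expr.MethodCallExpression
import org.codehaus.groovy.ast.stmt.ReturnStatement

import static org.codehaus.groovy.ast.tools.GeneralUtils.callX
import static org.codehaus.groovy.ast.tools.GeneralUtils.constX
import static org.codehaus.groovy.ast.tools.GeneralUtils.returnS
import static org.codehaus.groovy.ast.tools.GeneralUtils.varX

// Shared ClassNode instances: ask ClassHelper instead of calling new ClassNode(...).
def intType = ClassHelper.int_TYPE
def stringType = ClassHelper.make(String)     // hands back the shared STRING_TYPE instance
println "${intType.name} / ${stringType.name}"

// AST fragment for `return value.toString()`, built from GeneralUtils factories.
// `value` is a hypothetical variable name.
def call = callX(varX('value'), 'toString')
def ret = returnS(call)

// A constant expression holding the literal 42.
def answer = constX(42)
println "${call.methodAsString} / ${ret.getClass().simpleName} / ${answer.value}"
```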
- Entry point: org.codehaus.groovy.transform.stc.StaticTypeCheckingVisitor.
- Driven by @TypeChecked and @CompileStatic. The latter runs the same checker, then directs AsmClassGenerator to emit direct calls rather than dynamic dispatch.
- Extensible from user code via type-checking extension scripts; see src/spec/doc/_type-checking-extensions.adoc for the user-facing documentation of that mechanism.
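To see the checker's effect from the outside, the following sketch evaluates the same method twice, once dynamically and once under @TypeChecked. Both snippets and the deliberate typo lenght are invented for the illustration.

```groovy
import org.codehaus.groovy.control.MultipleCompilationErrorsException

// `lenght` is a deliberate typo: String has no such method.
def dynamicSource = '''
    int twice(String s) { s.lenght() * 2 }
    twice('abc')
'''
def checkedSource = '''
    @groovy.transform.TypeChecked
    int twice(String s) { s.lenght() * 2 }
    twice('abc')
'''

// Evaluate a snippet and report where (if anywhere) it failed.
def outcome = { String source ->
    try {
        new GroovyShell().evaluate(source)
        return 'ran'
    } catch (MultipleCompilationErrorsException e) {
        return 'rejected at compile time'
    } catch (MissingMethodException e) {
        return 'failed at run time'
    }
}

println "dynamic: ${outcome(dynamicSource)}"
println "checked: ${outcome(checkedSource)}"
```

The dynamic version compiles and only fails when the call is dispatched; the checked version never gets past compilation.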
- org.codehaus.groovy.classgen.AsmClassGenerator walks the AST and emits bytecode via ASM. Supporting visitors run here too: Verifier (synthesizes bridge methods, accessors, default constructors), EnumVisitor, EnumCompletionVisitor, InnerClassVisitor, InnerClassCompletionVisitor, VariableScopeVisitor, ReturnAdder.
- ASM-specific helpers: org.codehaus.groovy.classgen.asm.*.
- The class loader path for compiled classes goes through org.codehaus.groovy.reflection.* and the meta-class system in groovy.lang.MetaClass*.
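One way to look at what class generation produces, without writing anything to disk, is to stop a CompilationUnit at CLASS_GENERATION and inspect the in-memory classes. This is a sketch under the assumption that the Groovy core jar is on the classpath; Greeter is a made-up class.

```groovy
import org.codehaus.groovy.control.CompilationUnit
import org.codehaus.groovy.control.Phases

def unit = new CompilationUnit()
unit.addSource('Greeter.groovy', 'class Greeter { String greet() { "hi" } }')  // hypothetical source
unit.compile(Phases.CLASS_GENERATION)   // stop before OUTPUT: nothing is written to disk

unit.classes.each { generated ->
    byte[] bytes = generated.bytes
    // Render the first four bytes as hex; a JVM class file starts with CAFEBABE.
    def magic = bytes[0..3].collect { String.format('%02X', it & 0xFF) }.join()
    println "${generated.name}: ${bytes.length} bytes, magic ${magic}"
}
```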
Most contributor work touches one of these. Each has a dedicated mechanism — knowing which one applies tells you where the change belongs:
- AST transformations: annotation-driven AST rewrites. Local transforms run in CANONICALIZATION by default; global transforms apply to every compilation unit and are registered via META-INF/services/org.codehaus.groovy.transform.ASTTransformation. Implementations live in org.codehaus.groovy.transform.*. AbstractASTTransformation is the usual base class, and org.codehaus.groovy.ast.tools.GeneralUtils is the standard library for building AST fragments.
- Type-checking extensions: DSL scripts that hook into the static type checker. See org.codehaus.groovy.transform.stc.GroovyTypeCheckingExtensionSupport and the user docs at src/spec/doc/_type-checking-extensions.adoc.
- Compilation customizers: org.codehaus.groovy.control.customizers.*. Programmatic configuration registered at INITIALIZATION; each customizer then runs at the compile phase it declares. The package includes ImportCustomizer, ASTTransformationCustomizer, SecureASTCustomizer, and CompilationCustomizer (base class for custom ones).
- Extension modules: add instance / static methods to existing classes via descriptor files. Discovered through META-INF/groovy/org.codehaus.groovy.runtime.ExtensionModule. The GDK itself is built this way; see org.codehaus.groovy.runtime.DefaultGroovyMethods and friends, and the user-facing description in src/spec/doc/core-gdk.adoc.
- Parser plugin: org.codehaus.groovy.control.ParserPluginFactory selects the parser. The ANTLR4 implementation is the only supported one; the older Antlr2-based parser has been removed.
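Of the extension points listed above, compilation customizers are the quickest to try out. The sketch below registers an ImportCustomizer on a CompilerConfiguration and evaluates a one-line script through it; the script text is made up for the example.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ImportCustomizer

// Make the static members of java.lang.Math visible without qualification.
def imports = new ImportCustomizer()
imports.addStaticStars('java.lang.Math')

def config = new CompilerConfiguration()
config.addCompilationCustomizers(imports)

// Every script compiled by this shell now sees the extra import.
def shell = new GroovyShell(config)
def result = shell.evaluate('max(3, 7)')
println result
```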
A few project-specific conventions on top of the architecture above. Each bites contributors quickly if missed:
- Default visibility in Groovy sources is public, not package-private. A static foo() method in a .groovy file is public static foo(). For "visible for testing" helpers in .groovy files, use @groovy.transform.PackageScope so same-package tests see the method while external callers don't. Some existing helpers in the tree (e.g. in groovy.grape.ivy.*) omit the modifier and are technically public; match that pattern only when you intend public exposure.
- Explicit imports, not wildcards. The codebase uses explicit per-class imports; new code should match. IDE-default wildcard imports get flagged in review.
- New code prefers org.apache.groovy.*. Legacy code under org.codehaus.groovy.* stays where it is for compatibility, but new packages should follow the org.apache.groovy.* convention.
- Mark non-public surface explicitly. Use @groovy.transform.Internal or place the code in a package named internal to signal "no stability guarantee"; see the Public API boundaries table below.
- Use ClassHelper for known ClassNode instances. Fresh ClassNode instances for primitives (int, boolean) or common reference types (String, Object) break equality and resolution. Prefer ClassHelper.int_TYPE, ClassHelper.STRING_TYPE, ClassHelper.make(SomeType.class), etc.
- Use GeneralUtils factories over hand-built AST nodes. The helpers under org.codehaus.groovy.ast.tools.GeneralUtils cover the common construction patterns and avoid the shape-mismatch traps that hand-building tends to produce.
- Pick the earliest compile phase that works for an AST transform. Local transforms default to CANONICALIZATION; transforms that need resolved types belong in INSTRUCTION_SELECTION or later. See the Compilation pipeline phase table.
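The default-visibility convention is easy to verify with reflection. In this sketch the class ProbeHelpers and both method names are hypothetical; only @PackageScope and java.lang.reflect.Modifier are real APIs.

```groovy
import groovy.transform.PackageScope
import java.lang.reflect.Modifier

// Hypothetical helper class, used only to inspect the generated modifiers.
class ProbeHelpers {
    static String implicitlyPublic() { 'a' }            // no modifier: compiled as public
    @PackageScope static String testOnly() { 'b' }      // compiled as package-private
}

def implicitMods = ProbeHelpers.getDeclaredMethod('implicitlyPublic').modifiers
def scopedMods = ProbeHelpers.getDeclaredMethod('testOnly').modifiers

println "implicitlyPublic is public: ${Modifier.isPublic(implicitMods)}"
println "testOnly is public: ${Modifier.isPublic(scopedMods)}"
```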
Several Groovy operators and expression forms are defined for multiple backing types, and behaviour across the members of a family is sometimes inconsistent. When investigating a bug reported for one type, probing the same expression across siblings often surfaces nuance the reporter missed — a hidden bug in a sibling type, confirmation that an asymmetry spans the whole family, or a project-wide spec gap that wasn't visible from a single-type report.
| Family | Members | Notes |
|---|---|---|
| Range / index operators (agg[idx], agg[range]) | List / Object[] / primitive arrays (int[], long[], …) / String / CharSequence | Different exception classes (IndexOutOfBoundsException vs ArrayIndexOutOfBoundsException vs StringIndexOutOfBoundsException). Negative-endpoint and out-of-range-negative semantics have historically diverged across types; see GROOVY-3974 for a concrete example surfaced by cross-type probing. |
| GPath expressions (x.y.z, x?.y, x*.y) | In-memory (Map, List, nested combinations) / JSON (JsonSlurper) / XML (XmlSlurper / XmlParser) / POGO / Java POJO / SQL result sets (groovy.sql.Sql) | XML has special handling for attributes (@attr syntax) and returns empty NodeChild collections on missing children rather than null. Map/JSON return null on missing keys. POGOs and POJOs throw MissingPropertyException for missing properties. The asymmetry is by design (each backend's natural type semantics) but surfaces as a cross-type inconsistency from the user's perspective. |
| Numeric coercion (+, -, *, /, comparison) | int / long / BigInteger / BigDecimal / double / Float / Long (boxed) | Coercion rules vary; the result type of int + BigDecimal may surprise. |
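For the numeric-coercion row, a few lines are enough to record which runtime type each combination produces. This is an illustrative sketch and not an exhaustive matrix; the map keys are just labels.

```groovy
// Each entry pairs a label with the value Groovy computes for that expression.
def results = [
    'int + long'         : 1 + 2L,
    'int + BigInteger'   : 1 + 2G,
    'int + BigDecimal'   : 1 + 2.5,      // 2.5 is a BigDecimal literal in Groovy
    'int + float'        : 1 + 2.5f,
    'BigDecimal + double': 2.5 + 2.5d,
    'int / int'          : 1 / 2
]

// Print the runtime type next to each expression.
results.each { expression, value ->
    println "${expression.padRight(20)} -> ${value.getClass().simpleName} (${value})"
}
```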
Some operators have multiple syntactic variants that share a family but dispatch differently:
| Family | Variants | Dispatch notes |
|---|---|---|
| Safe navigation | ?. (SAFE_DOT) / ??. (SAFE_CHAIN_DOT, shorthand for chained ?.) / ?[..] (SAFE_INDEX) | ?. and ??. call getProperty(String). ?[..] calls getAt(Object), but on POGOs that routes through getProperty for missing keys, so the variants behave identically for POGO missing-property access. |
| Spread | *. / *[..] / *: | Different unpacking semantics across iteration / indexing / map-merge. |
| Equality / identity | == / .equals() / is | == is equals-based in Groovy (not reference-equality as in Java); is is Java's == (reference). |
| Coercion | as / asType() / constructor + from | Different conversion paths; as is statically-resolvable, asType is dynamic. |
| Range | .. / ..< / <.. / <..< | Endpoint inclusion / direction differences. |
| Elvis / null-coalesce | ?: and elaborations | Truthy-vs-null differences in the left-hand side. |
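A handful of these variants can be observed side by side in a short script. The sketch below covers only a subset of the rows (equality, safe navigation, Elvis, inclusive and right-exclusive ranges, spread-dot); the labels are free text chosen for the example.

```groovy
// Two distinct String objects with equal content.
def a = new String('groovy')
def b = new String('groovy')

def observations = [
    'a == b'            : a == b,              // equals-based comparison
    'a.is(b)'           : a.is(b),             // reference identity
    'null?.length()'    : null?.length(),      // safe navigation short-circuits
    "'' ?: 'default'"   : '' ?: 'default',     // Elvis tests Groovy truth, not only null
    '(1..3)'            : (1..3).toList(),
    '(1..<3)'           : (1..<3).toList(),
    "['a','bb']*.size()": ['a', 'bb']*.size()
]

observations.each { form, outcome -> println "${form} -> ${outcome}" }
```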
For an investigation of a bug in one family member, probing across siblings is a recurring technique. A ~50-line probe script (constructing each backend, running the same expression, recording outcomes in a table) is usually enough to:
- confirm whether an asymmetry the reporter found spans the family or is type-specific;
- surface a hidden bug in a sibling type the reporter didn't test (and which may warrant its own JIRA);
- reveal that what looks like a bug is actually consistent documented behaviour with a documented or implicit workaround in a sibling form.
See .agents/skills/groovy-reproducer/SKILL.md's "Cross-family
probes" section for the AI-tooling pattern. The probe approach is
equally useful when investigating by hand.
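A probe in the spirit described above can be much shorter than 50 lines when the family is small. The sketch below runs the same two indexing expressions against three members of the index-operator family and records either the value or the exception class; the labels and the attempt helper are made up for the example.

```groovy
// One representative per family member.
def backends = [
    'List'  : [1, 2, 3],
    'int[]' : [1, 2, 3] as int[],
    'String': 'abc'
]

// Run a probe and record either its value or the name of the exception it raised.
def attempt = { Closure probe ->
    try {
        return probe()
    } catch (Exception e) {
        return e.getClass().simpleName
    }
}

// Same expressions for every backend: a negative index and an out-of-range index.
def table = backends.collectEntries { name, subject ->
    [(name): [negative: attempt({ subject[-1] }), beyond: attempt({ subject[5] })]]
}

table.each { name, row ->
    println "${name.padRight(8)} [-1] -> ${row.negative}, [5] -> ${row.beyond}"
}
```

Recording the exception class name instead of letting the probe abort is what makes the output usable as a comparison table.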
The following are produced by the build and regenerated on every run, so direct edits to them are overwritten. Changes belong in the source they're generated from.
| Generated artefact | Source |
|---|---|
| build/generated/sources/antlr4/org/apache/groovy/parser/antlr4/Groovy{Lexer,Parser,ParserVisitor,ParserBaseVisitor}.java | src/antlr/GroovyLexer.g4, src/antlr/GroovyParser.g4 |
| Anything under build/, */build/, out/, subprojects/*/build/ | The build itself; never committed |
| Repackaged dependency classes (ASM, ANTLR runtime, picocli) | Configured in build.gradle under repackagedDependencies |
If a .java file under build/generated/... looks like the right
thing to change, you are looking at the wrong file. The grammar fix
goes in src/antlr/.
The Gradle build is driven by:
- Convention plugins under build-logic/src/main/groovy/org.apache.groovy-*.gradle. These describe the shape every subproject takes: -base, -common, -core, -library, -published-library, -aggregating-project, -tested, and so on. A subproject applies the appropriate conventions and overrides only what's specific to it.
- Shared types under build-logic/src/main/groovy/org/apache/groovy/. Versions, SharedConfiguration, and Services hold the canonical pinned versions and configuration the convention plugins read from.
- Root build.gradle, settings.gradle, and gradle.properties for project-wide settings, version pins, target bytecode, and build flags.
A few build-side conventions:
- Cross-cutting build behaviour belongs in a convention plugin, not duplicated across subproject build.gradle files. If two or more subprojects need the same configuration, add it to the right org.apache.groovy-*.gradle; conversely, don't push a one-off into a shared convention plugin.
- Versions flow through gradle.properties and the shared Versions type, not as 'group:artifact:1.2.3' literals in subproject builds. Ad-hoc version pins drift.
- JDK and bytecode targeting are three separate knobs. targetJavaVersion sets the Java source/target compatibility level for the project's Java sources and Javadoc; groovyTargetBytecodeVersion sets the bytecode level the forked Groovy compiler emits (passed as -Dgroovy.target.bytecode). Both are pinned in gradle.properties and read through SharedConfiguration. That file is the single source of truth, so docs and skills cite the property names, never the values (they change per Groovy version). Neither controls which JVM runs the build or tests: the build compiles with whatever JDK launched Gradle, and tests run on that same JDK unless -Ptarget.java.home=<java-home> overrides the test JVM (the build then auto-derives the matching toolchain from that JDK's release file). See CONTRIBUTING.md for the recipe.
- Dependency changes require regenerating gradle/verification-metadata.xml. The build runs Gradle dependency verification; an unverified artifact fails the build. Regenerate with ./gradlew --write-verification-metadata sha256,pgp help and inspect the diff before committing.
- ASM, ANTLR runtime, and picocli are jarjar-relocated into groovyjarjar* packages, configured via groovyLibrary { repackagedDependencies = ... }. A wrong repackaging rule produces a jar that compiles fine but fails at runtime; run ./gradlew :groovy-binary:installGroovy and exercise groovy / groovyc against a non-trivial script whenever a repackaging change lands.
- The configuration cache constrains build logic. Custom code must not access mutable Project state at execution time. Prefer providers (providers.gradleProperty, providers.environmentVariable, Provider chains) and tasks.register / configureEach over eager realisation.
- Wrapper bumps need a Develocity compatibility check. A Gradle wrapper version bump can disable build scans or break a plugin pinned in the root plugins {} block; check the Develocity compatibility matrix linked in build.gradle before bumping.
- Binary compatibility is enforced. subprojects/binary-compatibility/ runs japicmp against a baseline release. Don't suppress it for green CI: either justify the change in the accepted-changes file or revert the API breakage. See COMPATIBILITY.md.
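The configuration-cache guidance in the list above can be sketched as a build.gradle fragment. This is a hedged illustration in the Gradle Groovy DSL, not code from the repository's build: the task name printTargetBytecode and the maxParallelForks value are hypothetical, while providers.gradleProperty, tasks.register, and configureEach are standard Gradle APIs.

```groovy
// build.gradle fragment (Gradle Groovy DSL); not runnable outside a Gradle build.

// Lazy: the property is read through a Provider, not from mutable Project state.
def bytecodeTarget = providers.gradleProperty('groovyTargetBytecodeVersion')

// Lazy: the task is created and configured only if something needs it.
tasks.register('printTargetBytecode') {   // hypothetical task name
    doLast {
        // Only the captured Provider is used at execution time;
        // reaching for `project` here is what the configuration cache rejects.
        println "groovyTargetBytecodeVersion = ${bytecodeTarget.getOrElse('unset')}"
    }
}

// configureEach applies the setting without realising every Test task eagerly.
tasks.withType(Test).configureEach {
    maxParallelForks = 1   // illustrative value only
}
```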
For the canonical command sequence (targeted → subproject → full
build, with the dependency-verification regeneration and
installed-build smoke test), see
CONTRIBUTING.md. For the public API
contract that the binary-compatibility check defends, see
COMPATIBILITY.md.
Groovy has a covenanted public API. The shape of a change determines
which review path applies — see CONTRIBUTING.md.
| Package convention | Audience |
|---|---|
| groovy.* | End users: the public API surface |
| org.apache.groovy.* | Mixed; preferred location for new code |
| org.codehaus.groovy.* | Historical core: some user-visible, much internal |
| Anything annotated @groovy.transform.Internal or in a package named internal | Implementation detail |
For the stability commitment each tier carries — what counts as
breaking, the deprecation policy, the four-tier stability model
(Public / Incubating / Internal / Generated), and how the
japicmp-based binary-compatibility check is wired up — see
COMPATIBILITY.md. Binary compatibility against
a baseline release is enforced by the
subprojects/binary-compatibility/ module as part of the build.
Test code is laid out in parallel with source code:
| Directory | Purpose |
|---|---|
| src/test/ | Core tests for the core jar |
| subprojects/<module>/src/test/ | Module-specific tests |
| src/spec/test/ and subprojects/<module>/src/spec/test/ | Executable Groovy snippets that AsciiDoc sources include:: to keep documentation examples runnable |
| subprojects/tests-preview/src/test/ | Tests that depend on JDK preview features |
For test framework conventions (JUnit 5, regression-test naming, the
fix-workflow ordering, the executable-AsciiDoc pattern, and
test-writing pitfalls) and the targeted-run command sequence, see
the "Tests" section in
CONTRIBUTING.md.
- CONTRIBUTING.md: how to build, test, and submit a change.
- COMPATIBILITY.md: stability tiers, what counts as a breaking change, deprecation policy, and the binary-compatibility check.
- GOVERNANCE.md: how decisions get made, where discussions happen, review modes, and wait periods (placeholder draft pending dev@ confirmation).
- AGENTS.md: supplemental guidance for AI coding assistants; layered on top of this document, not a replacement for it.
- README.adoc: the canonical build instructions.
- src/spec/doc/core-metaprogramming.adoc: user-facing description of AST transformations and metaprogramming.
- src/spec/doc/_type-checking-extensions.adoc: user-facing description of the type-checking extension mechanism.
- The Groovy issue tracker (https://issues.apache.org/jira/browse/GROOVY) and the existing test suite are the best source of precedent for any given change. git log --grep GROOVY-NNNNN finds the original fix for an issue.