From 8599c0ee86e916becc04fb23d82c7ed580e53908 Mon Sep 17 00:00:00 2001 From: xuba Date: Fri, 13 Mar 2026 10:40:17 +0800 Subject: [PATCH] [Feature] Add AGENTS.md for project overview and documentation --- AGENTS.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 78 insertions(+) create mode 100644 AGENTS.md diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000000..67a9911443 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,78 @@ +# AGENTS.md — Apache Amoro (Incubating) + +## Project Overview + +Apache Amoro is a **Lakehouse management system** (Java 11, Maven, Vue 3 frontend). The central component is **AMS** (Amoro Management Service), which manages table metadata, self-optimizing, and catalog services for Iceberg, Hive, Paimon, and Hudi table formats. + +## Architecture + +- **`amoro-common`** — Core abstractions (`FormatCatalog`, `AmoroTable`, `TableFormat`), Thrift IDL definitions (`src/main/thrift/*.thrift`), and shared client code. All other modules depend on this. +- **`amoro-ams`** — The AMS server. Entry point: `AmoroServiceContainer.main()`. Exposes Thrift services (table metastore on port 1260, optimizer service on port 1261) and an HTTP/REST API via Javalin (port 1630). Routes are registered in `DashboardServer.java`. +- **`amoro-web`** — Vue 3 + TypeScript dashboard (Vite, Ant Design Vue, pnpm). API calls in `src/services/`. Built into AMS static resources during Maven build. +- **`amoro-format-{iceberg,mixed,paimon,hudi}`** — Table format integrations loaded via **Java SPI** (`ServiceLoader`). See `CommonUnifiedCatalog.initializeFormatCatalogs()`. +- **`amoro-optimizer`** — Self-optimizing executor with standalone/Flink/Spark implementations. Communicates with AMS via Thrift (`OptimizingService`). +- **`amoro-metrics`** — Metrics subsystem with pluggable reporters (e.g., Prometheus). +- **Plugin system** — AMS plugins are configured via YAML in `dist/src/main/amoro-bin/conf/plugins/` and managed by `AbstractPluginManager`. Categories: `event-listeners`, `metric-reporters`, `rest-extensions`, `table-runtime-factories`. + +## Build & Test + +```bash +# Build all (skip tests, skip dashboard frontend): +./mvnw clean package -DskipTests -Pskip-dashboard-build + +# Build everything including tests: +./mvnw clean package + +# Run checkstyle + spotless validation only: +./mvnw validate + +# Format all Java/Scala code (Spotless + google-java-format): +dev/reformat + +# Run a single test class: +./mvnw test -pl amoro-ams -Dtest=TestDefaultOptimizingService + +# Regenerate configuration docs after modifying ConfigOptions: +UPDATE=1 ./mvnw test -pl amoro-ams -am -Dtest=ConfigurationsTest + +# Frontend dev (from amoro-web/): +pnpm install && pnpm dev:mock +``` + +Key Maven profiles: `-Phadoop2` (Hadoop 2.x), `-Pspark-3.3`/`-Pspark-3.5`, `-Psupport-all-formats`, `-Pbuild-mixed-format-trino` (requires JDK 17). + +## Conventions & Patterns + +- **PR title format**: `[AMORO-{issue_number}][{module}]{description}` +- **Code style**: google-java-format (v1.7) enforced by Spotless; Scala uses scalafmt. Run `dev/reformat` before committing. +- **Checkstyle**: Config at `tools/maven/checkstyle.xml`, runs during `validate` phase. +- **Copyright header**: Apache License 2.0 header required on all source files. +- **Database layer**: MyBatis with annotation-based mappers (no XML). See `amoro-ams/src/main/java/.../persistence/mapper/`. Services extend `PersistentBase` and use `doAs(MapperClass, lambda)` for DB operations. Supports Derby (dev), MySQL, and PostgreSQL. SQL init scripts in `amoro-ams/src/main/resources/{derby,mysql,postgres}/`. +- **REST API**: Javalin-based, all routes defined in `DashboardServer.apiGroup()` under `/api/ams/v1`. Controllers in `server/dashboard/controller/`. +- **SPI extension points**: `FormatCatalogFactory` (table formats), `FormatTableDescriptor` (metadata display), `SparkTableFormat` (Spark catalog), `ProcessFactory` (optimizing processes). Register implementations in `META-INF/services/`. +- **Configuration**: Typed config via `ConfigOption` pattern. Server configs in `AmoroManagementConf.java`, runtime config in `config.yaml`. +- **Thrift**: IDL files in `amoro-common/src/main/thrift/`. Generated code uses shaded Thrift (`org.apache.amoro.shade.thrift`). Uses `ThriftClientPool` for connection management. +- **Tests**: JUnit 5 for new tests. AMS integration tests use `DerbyPersistence` for in-memory DB. Testcontainers for external service tests. + +## Key Files to Start With + +| Area | File | +|------|------| +| Server entry point | `amoro-ams/src/main/java/.../server/AmoroServiceContainer.java` | +| REST routes | `amoro-ams/src/main/java/.../server/dashboard/DashboardServer.java` | +| Server config keys | `amoro-ams/src/main/java/.../server/AmoroManagementConf.java` | +| Format SPI | `amoro-common/src/main/java/.../FormatCatalogFactory.java` | +| Unified catalog | `amoro-common/src/main/java/.../CommonUnifiedCatalog.java` | +| DB mappers | `amoro-ams/src/main/java/.../server/persistence/mapper/` | +| Plugin system | `amoro-ams/src/main/java/.../server/manager/AbstractPluginManager.java` | +| Default config | `dist/src/main/amoro-bin/conf/config.yaml` | +| Thrift IDL | `amoro-common/src/main/thrift/` | +| Frontend API layer | `amoro-web/src/services/` | + +## Running AMS Locally (IDE) + +1. Copy config: `mkdir -p conf && cp dist/src/main/amoro-bin/conf/config.yaml conf/config.yaml` +2. Update Derby path in `conf/config.yaml` to use absolute path under `conf/derby` +3. Run `AmoroServiceContainer.main()` — dashboard at http://localhost:1630 (admin/admin) +4. Optimizer: Run `StandaloneOptimizer.main()` with args `-a thrift://127.0.0.1:1261 -p 1 -g local` +