feat(aem-cloud-service): add aem-agentkit v1.0.0-beta — agentic workflow bootstrap for AEM as a Cloud Service#172
Open
abhishekgarg18 wants to merge 7 commits into
…low bootstrap for AEM as a Cloud Service
aem-agentkit bootstraps an AEM as a Cloud Service repository for agentic workflows across Claude Code, Cursor, GitHub Copilot, Codex, Continue.dev, Cline, Windsurf, Augment Code, and every AGENTS.md-spec-compliant agent, without modifying any customer source code. It complements (does not replace) ensure-agents-md, which owns the root AGENTS.md + CLAUDE.md.
The skill ships:
Universal layer (always written when missing):
- Per-module AGENTS.md, recursive for nested AEM monorepos so multi-brand customers get focused context at each archetype leaf.
- .aem/context/ codified context: components.json, osgi-services.json, conventions.md, avoid.md, glossary.md, test-patterns.md, aem-api-namespaces.md, README.md, a run manifest (.agentkit-manifest.json), and a workspace advisory lock (.agentkit.lock). Scoped per nested sub-project for monorepos.
Tool-specific layer (opt-in via interactive selection; CI runs honor an .aem/agentkit-overrides.yml ide-targets entry):
- Claude Code: .claude/agents/aem-*.md + .claude/commands/<owned>.md + .mcp.json placeholder.
- Cursor: .cursor/rules/aem-*.mdc + .cursor/mcp.json placeholder.
- GitHub Copilot: .github/instructions/aem-*.instructions.md.
- Continue.dev: .continue/rules/aem-*.md.
- Cline / Windsurf / Augment: single concatenated rule file with .aem-roles-extra.md sidecar for deferred roles.
Deterministic helper (bin/aem-agentkit-helper):
- ~500 lines of Python 3.10+, no third-party deps; POSIX only (Linux + macOS). JSON-line protocol on stdin/stdout. The skill refuses to run when the helper is absent or version-mismatched.
- Closes the byte-exact contracts the LLM cannot uphold: O_NOFOLLOW open with TOCTOU re-check (F_GETPATH on macOS, /proc/self/fd readlink on Linux); SHA-256 over canonical body bytes (Markdown skips marker line; JSON re-emits with sorted keys after stripping the six marker fields); atomic .tmp + rename(2) with parent-dir fsync; Unicode NFC + strip-list drop + length cap; bounded workspace walk with maxFiles / maxDepth / maxFilesPerSubtree and segment-by-segment deny-list pruning at every layer; advisory file lock with stale-lock recovery via PID-liveness; ASCII-lowercase casefold pinned to avoid Turkish-I / German-ß cross-platform divergence.
Test suite (tests/test_helper.py):
- 27 unit tests covering golden SHA-256 across canonical-body shapes, every strip-list code-point category, workspace-escape rejection, case-insensitive deny matching, node_modules/.git/.env pruning, atomic write with no orphan .tmp, absolute-path and dotdot rejection, stale-lock PID recovery.
- npm run test:aem-agentkit-helper hooks the suite into CI.
Hard guarantees:
- Customer source files are never modified. Allow-list of writes documented in SKILL.md § "Hard guarantee".
- Marker-based idempotency: a file lacking the skill marker is treated as human-curated and never overwritten. Marker authentication via SHA-256 over the canonical body; marker spoofing is detected and rejected.
- Re-runs are byte-identical (deterministic discovery + four-level tiebreaker; generatedAt excluded from the marker checksum).
- Privacy deny-list covers AEM SDK state, IDE secret stores, cloud SDK credentials, package-registry build secrets, Adobe IO configs, SSH keys, PGP, JetBrains/VS Code secrets, vim/emacs swap and backup files. Applied per path segment so a directory name match prunes the entire subtree.
- Symlink hardening: workspace root realpath cached at startup; /proc, /sys, /dev, Windows UNC/device paths rejected even when the workspace lives under them; visited-set loop guard.
- Six classes of opt-out / override: _disable_agentkit at workspace root, _disable_agentkit at any nested AEM sub-project root, --silent flag, AEM_AGENTKIT_SILENT env var, .aem/agentkit-overrides.yml ide-targets entry, per-decision heuristic overrides.
Interactive IDE selection (replaces the original silent multi-projection):
- Tightened detection signals (empty .claude/ no longer fires; .github/*.yml is no longer a Copilot signal; .cursor/rules content required for Cursor; .continue/rules content required for Continue).
- Prompts the customer: all / single / multi-select / none.
- Answer persists to .aem/agentkit-overrides.yml under decision: ide-targets so subsequent runs and CI invocations honor it without prompting.
- Headless mode preserved through --silent / env var / pre-existing override entry.
Updates after the customer changes code:
- /new-component and /new-sling-model slash commands scaffold the artifact and call /regen-context to refresh the index.
- /regen-context re-runs only the .aem/context/* generation steps; per-module AGENTS.md and per-tool projections are not touched.
- /agents-md-check is a read-only drift report driven by the run manifest; CI gates can call it and block merge on non-zero exit.
- Pre-commit hook and CI integration recipes documented in the PR description and README.
Sub-project resolution in role bodies:
- Role bodies walk up from the file under edit to the closest enclosing pom.xml that matches the nested-AEM-project detection rule, then resolve <project> and the path prefix from that root.
- In multi-brand monorepos the agent writes into the correct sub-project tree (brand-a/ui.apps/...) rather than guessing from a hard-coded path.
Schema versioning:
- _skillVersion + schemaVersion on every JSON marker.
- Static-reference files (.aem/context/aem-api-namespaces.md, .aem/context/README.md) carry _static: true and are overwritten in place on a version bump rather than producing .agentkit-new sidecars for every customer.
- Worked v1 -> v2 migration example in upgrade-and-migration.md; the migration code path is exercised by golden-output fixtures in the test suite.
Helper version pin:
- Skill metadata.version (1.0.0-beta) must match helper --version before the helper is used for any byte-exact op. Mismatch aborts the run with a single diagnostic.
References (12 reference files):
- per-module-agents-md.md, codified-context.md, per-tool-artifacts.md, mcp-wiring.md, guardrails.md, module-catalog.md, collision-rules.md, upgrade-and-migration.md, privacy-and-sanitization.md, output-format.md, helpers.md, manifest.md.
Beta markers per plugins/aem/cloud-service/CLAUDE.md convention: frontmatter status: beta, [BETA] description prefix, body blockquote with the standard caveat.
Closes the multi-agent PR review findings (Critical 1-13 + Important 14-25) and the tessl-review conciseness threshold.
27/27 helper unit tests pass; npx skills-ref validate clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
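For readers unfamiliar with the "atomic .tmp + rename(2) with parent-dir fsync" contract named in the commit message above, the following is a minimal sketch of that general technique, not the helper's actual code. The function name `write_atomic` and the `.tmp` suffix handling are illustrative assumptions; the allow-list, deny-list, and marker checks described elsewhere in this PR are deliberately left out.

```python
import os


def write_atomic(path: str, data: bytes) -> None:
    """Write data to path via a sibling .tmp file, rename, and parent-dir fsync.

    Hypothetical sketch: the real helper op also enforces allow/deny lists.
    """
    parent = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"
    # O_EXCL makes the create fail if a .tmp (or a symlink named .tmp) already exists.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # file contents reach disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX when both paths share a filesystem
    except BaseException:
        # Never leave an orphan .tmp behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself survives a crash.
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A reader sees either the old file or the complete new file, never a partial write, which is the property the "no orphan .tmp" test in the suite appears to guard.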
…eration step
Direct inspection of two real customer runs (aem-guides-wknd and twilio-reactor) surfaced a gap between the spec and the runtime behavior: in the twilio monorepo (two nested AEM projects: twilio-com and twilio-foundation-reactor), the workspace-root .aem/context/ was written but neither sub-project received its scoped .aem/context/ directory. The manifest correctly tagged every per-module AGENTS.md with subprojectRoot and the heuristics correctly identified both sub-projects as nested-aem-project, but the per-sub-project index materialization was skipped.
Root cause: the prior SKILL.md generation order collapsed steps 1-8 into one line and mentioned per-sub-project scoping as a sub-clause. The contract was in the references but not surfaced in the order as a distinct mandatory step.
Spec changes:
- SKILL.md § "Generation order": per-sub-project materialization is now an explicit Step 9 (mandatory), repeating steps 1-7 scoped to each detected nested AEM sub-project's source tree. Steps 10-13 renumbered. Self-validation now includes a check that every heuristics[] nested-aem-project entry has its scoped components.json and osgi-services.json on disk; missing is a hard failure (exit 1).
- codified-context.md § 11: replaced the brief paragraph with an explicit "mandatory" header, a per-file scope table (what's scoped per sub-project, what stays workspace-root only), and the discovery walk semantics for the scoped pass.
- manifest.md § 4: /agents-md-check now reports a distinct "missing-subproject-context" category and exits non-zero so CI gates catch the gap.
- command.agents-md-check.md.template: drift report now includes the missing per-sub-project category and the suspicious-markers category (which the spec already promised but the template hadn't surfaced).
The fix forces future runs to materialize per-sub-project context in monorepos like twilio-reactor. Existing customer runs that skipped step 9 can be brought to compliance by running /regen-context once the helper / agent loop ships the enforcement.
npx skills-ref validate clean; 27/27 helper unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
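The new self-validation rule in the commit above (every nested-aem-project entry must have its scoped components.json and osgi-services.json on disk) could be checked roughly as follows. This is a sketch under stated assumptions: the entry keys `kind` and `root` are hypothetical, since the commit message does not show the heuristics[] schema, and the function name is invented for illustration.

```python
import os

# The two scoped index files the commit message names as mandatory.
REQUIRED_INDEXES = ("components.json", "osgi-services.json")


def missing_subproject_context(workspace_root, heuristics):
    """Return sorted (subproject_root, missing_file) pairs.

    Assumes each heuristics entry is a dict with hypothetical keys
    "kind" and "root" (root is relative to the workspace).
    """
    missing = []
    for entry in heuristics:
        if entry.get("kind") != "nested-aem-project":
            continue
        context_dir = os.path.join(workspace_root, entry["root"], ".aem", "context")
        for name in REQUIRED_INDEXES:
            if not os.path.isfile(os.path.join(context_dir, name)):
                missing.append((entry["root"], name))
    return sorted(missing)
```

A caller would treat a non-empty result as the missing-subproject-context category and exit with status 1, matching the hard-failure behavior described above.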
@abhishekgarg18 - this is a very broad PR; it will take a bit of time to understand and review.
… tests, CI)
Helper (bin/aem-agentkit-helper):
- op_walk now re-checks resolved-realpath segments against the deny-list,
closing the in-workspace symlink bypass (e.g. safe-name -> .git/).
- op_write_atomic enforces the SKILL.md allow-list AND a per-segment
deny-list; helper is now the trust anchor for "no writes outside the
allow-list" instead of trusting the orchestrating LLM. Adds case-
insensitive FS collision detection and validate-before-makedirs
ancestor checks so a workspace symlink cannot redirect mkdir.
- op_open: fail-closed when the TOCTOU re-check is unavailable (returns
toctouVerified flag); opens the resolved realpath so legitimate
intra-workspace symlinks (pnpm, yarn workspaces) are no longer
rejected; bounded buffer read with actual-size diagnostic.
- op_lock: PID-0/negative-PID rejected as stale; PID-reuse defense via
start-time token; empty/corrupt lock files recovered; stale-lock
recovery is race-safe.
- op_cleanup_tmp: recovers orphan .tmp left by a crashed write-atomic
when the .tmp sits at an allow-listed path. Uses bounded walk with
realpath validation per entry.
- op_sha256_canonical: NFC-normalizes JSON string leaves so HFS+ (NFD)
and ext4/APFS (NFC) hash identically; accepts leading blank lines
before the Markdown marker so IDE auto-prettiers don't reclassify
generated files as human-curated.
- New ops/flags: protocol-version (skill version decoupled from wire
protocol), --self-test, stderr traceback on internal errors.
- Hardening: explicit POSIX allow-list, Python 3.10+ guard, /private/var/run
rejected on macOS, compiled deny-list regex, per-invocation op cap.
- _fd_realpath now uses stdlib fcntl.F_GETPATH on macOS (the prior
ctypes path failed on real macOS, silently disabling the TOCTOU check).
Tests (tests/test_helper.py, 27 -> 67 tests):
- Allow-list + deny-list enforcement on write-atomic (.git/hooks/, .env,
node_modules/, escape via intermediate symlink, opt-out flag).
- Lock empty/corrupt/PID-0/negative; two-live-invocation blocking
(AC 18 now actually exercised, not just manually verified).
- op_walk: symlink-deny-bypass regression (.git via in-workspace
symlink), per-subtree cap doesn't drop other roots, global cap,
depth cap, glob dialect.
- NFC/NFD stability, nested-marker preservation, marker-spoof recompute.
- TOCTOU toctouVerified, maxBytes with actual-size, intra-workspace
symlink open, match-deny ENOENT fallback (allowed + denied), orphan-
tmp recovery + post-recovery write.
- Generous subprocess timeouts with actionable assertion messages.
CI (.github/workflows/validate.yml):
- New test-aem-agentkit-helper job runs the helper unit suite on every
PR across Ubuntu + macOS / Python 3.10, 3.11, 3.12. Previously the
27 tests were documentation only; this makes them a real gate.
Test runner (tests/run-tests.sh):
- Guarded chmod (no working-tree mutation); prints both --version and
--protocol-version.
Docs:
- SKILL.md: all 13 generation steps numbered; self-validation failures
categorized (evidence-resolution, module-mismatch, marker-checksum,
url-scoping, strip-list-survivor, manifest-drift,
missing-subproject-context); new Threat Model section with in-scope
and explicitly out-of-scope (prompt-injection via raw bytes, supply-
chain helper tampering, Windows); allow-list documented as helper-
enforced; semantic-equivalence wording for role projections.
- 24 templates: replaced 1.0.0-beta literal with {{SKILL_VERSION}}
token so future version bumps don't require coordinated find/replace.
- AGENTS.module.core, AGENTS.module.ui.apps, AGENTS.subproject got a
"## After making changes" block instructing the agent to run
/regen-context. AGENTS.md is read at session start by every spec-
compliant agent, so the cross-skill index-mutation contract is
wired where it's actually consumed (rather than touching sibling
skill SKILL.md files).
- helpers.md: glob dialect explicit (fnmatch, not git-style **);
protocol-version vs skill-version separation; new write-atomic
behavior; cleanup-tmp orphan recovery.
- upgrade-and-migration.md: SHA-256 pin table with advisory->hard
graduation path; templates use {{SKILL_VERSION}}.
- privacy-and-sanitization.md § 2.2: raw bytes via op_open are NOT
sanitized; orchestrator's responsibility.
- collision-rules.md: dropped 1024-byte _disable_agentkit threshold;
added case-insensitive FS collision row.
- per-tool-artifacts.md § 7: semantic equivalence (not byte-identical)
across IDE projections.
- manifest.md: files[].mtime field for v2 incremental drift detection.
- codified-context.md: enterprise-scale cost disclosure.
- README.md: IDE detection signal table synced with SKILL.md.
All 67 tests pass on Python 3.14 (macOS Darwin 24.3).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
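The NFC normalization of JSON string leaves mentioned for op_sha256_canonical in the commit above can be illustrated with a short sketch. It is not the helper's implementation: the function names, the compact `separators`, and `ensure_ascii=False` are assumptions, and the marker field names are passed in by the caller because the commit messages do not list the six fields.

```python
import hashlib
import json
import unicodedata


def nfc_leaves(value):
    """Recursively NFC-normalize every string leaf of a parsed JSON value."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [nfc_leaves(item) for item in value]
    if isinstance(value, dict):
        return {key: nfc_leaves(item) for key, item in value.items()}
    return value


def canonical_json_sha256(document, marker_fields=()):
    """Hash a JSON object after dropping marker fields and sorting keys.

    marker_fields is caller-supplied; the real six field names are not
    stated in this PR text, so none are hard-coded here.
    """
    body = {k: v for k, v in document.items() if k not in marker_fields}
    canonical = json.dumps(
        nfc_leaves(body), sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this shape, a title stored decomposed (NFD, as HFS+ may produce) and the same title stored composed (NFC) hash identically, and excluding a timestamp field such as generatedAt keeps re-runs from changing the checksum.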
… to code
- lock: replace PID-file lock with real fcntl.flock(LOCK_EX|LOCK_NB); kernel auto-releases on process exit/kill (crash-safe). Remove the fragile macOS proc_pidinfo ctypes block and now-unused ctypes imports.
- security: add read-for-context op: safe-open (deny-list, O_NOFOLLOW, TOCTOU re-check, size cap) + NFC + Unicode strip (bidi/zero-width/BOM/C0-C1 controls; LF/CR preserved) for LLM-bound source reads, closing the prompt-injection-via-file-content gap. Orchestrator must use it instead of raw open.
- docs: remove false O_NOFOLLOW_ANY/openat2/RESOLVE_NO_SYMLINKS and PID-reuse claims; consolidate marker spec to upgrade-and-migration.md; condense deny-list; drop non-normative guardrails rationale (~115 lines trimmed).
- scope: reinforce AEM as a Cloud Service framing; add "What AI-native means here" to README.
- tests: 27->71 passing (flock mutual-exclusion + crash-safety, read-for-context sanitization, TOCTOU fail-closed). Protocol version 1->2 for the additive op.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
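The switch to fcntl.flock(LOCK_EX|LOCK_NB) described in the commit above can be sketched as below. The function names `try_lock` and `unlock` are illustrative, not the helper's op names, and the sketch assumes a POSIX system. The point it demonstrates is the crash-safety argument: the lock belongs to the open file description, so the kernel drops it when the descriptor is closed, including when the process dies.

```python
import fcntl
import os


def try_lock(lock_path):
    """Return an fd holding an exclusive advisory lock, or None if it is held elsewhere."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB turns "would block" into an immediate BlockingIOError.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock explicitly; closing the fd alone would also release it."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Unlike a PID-file scheme, there is no stale-lock state to recover from, which is presumably why the PID-liveness and PID-reuse logic could be deleted.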
…CS')
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ted CLAUDE.md
- write-atomic now refuses to overwrite a pre-existing file unless it is skill-owned (marker checksum recomputes); overwriting a human-curated file requires explicit allowOverwriteHumanCurated. Makes "never overwrite human-curated files" a helper guarantee, not just orchestrator convention. Factor canonical-body checksum into _canonical_body_sha shared by sha256-canonical and the new _is_skill_owned check.
- CLAUDE.md: skill may add an "AEM as a Cloud Service" section to the root CLAUDE.md but only after explicit developer consent (decision: claude-md in agentkit-overrides.yml; silent default = leave untouched). Root AGENTS.md stays deferred to ensure-agents-md. Allow-list root CLAUDE.md (root-only).
- docs: SKILL.md/README/collision-rules/output-format/helpers updated for both behaviors; remove customer-specific examples in favor of generic brand-a/brand-b.
- tests: 71 -> 82 (overwrite protection, CLAUDE.md allow-list + consent path).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
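The overwrite rule in the commit above (a file is skill-owned only if its marker checksum recomputes) can be illustrated with a small sketch. The marker syntax used here, an HTML comment carrying `sha256=`, is purely hypothetical; the real format is specified in upgrade-and-migration.md and is not shown in this PR text. The function names are likewise invented, and the symlink and duplicate-marker checks the helper performs are omitted.

```python
import hashlib
import re

# Hypothetical marker line; the real marker format is not shown in this PR.
MARKER_RE = re.compile(r"^<!-- aem-agentkit sha256=([0-9a-f]{64}) -->$")


def stamp(body: str) -> str:
    """Prefix body with a marker line carrying the SHA-256 of the body."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f"<!-- aem-agentkit sha256={digest} -->\n{body}"


def is_skill_owned(text: str) -> bool:
    """True only if the first non-blank line is a marker whose checksum recomputes."""
    lines = text.split("\n")
    index = 0
    while index < len(lines) and lines[index].strip() == "":
        index += 1  # leading blank lines are tolerated
    if index == len(lines):
        return False
    match = MARKER_RE.match(lines[index])
    if match is None:
        return False
    body = "\n".join(lines[index + 1:])  # the marker line itself is not hashed
    return hashlib.sha256(body.encode("utf-8")).hexdigest() == match.group(1)


def may_overwrite(existing, allow_overwrite_human_curated=False):
    """Decide whether a pre-existing target (None if absent) may be replaced."""
    if existing is None:
        return True
    return is_skill_owned(existing) or allow_overwrite_human_curated
```

A marker pasted onto hand-edited content fails the recompute step, so the file falls back to the human-curated path and is left alone unless the explicit flag is set.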
…, gitignore it
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds aem-agentkit v1.0.0-beta, a skill that bootstraps an AEM as a Cloud Service repository for agentic workflows across Claude Code, Cursor, GitHub Copilot, OpenAI Codex, Continue.dev, Cline, Windsurf, Augment Code, Aider, and any other AGENTS.md-spec-compliant agent, without modifying customer source code.
It generates the machine-readable context a coding agent needs to work the repo reliably: per-module AGENTS.md, a codified-context layer under .aem/context/, and tool-specific projections (subagents, slash commands, rule files). It complements ensure-agents-md rather than replacing it.
Scope: AEM as a Cloud Service only. The skill exits early on AEM 6.5 LTS, AMS, and on-premise layouts. The context it produces is Cloud Service-native: it understands conf.d/-based dispatcher layouts (not legacy conf/), Cloud Manager pipelines, RDE, and the AEM SDK; Core Components and anything under /libs are excluded, so only customer code is indexed.
Why this skill exists
Coding agents work best when the repository ships machine-readable context they can query before generating code. Without it, every agent session re-derives the same answers from scratch — build command, where components live, the Sling Model annotation style, whether a component already exists — burning thousands of tokens per session and getting it wrong often enough to produce duplicate components, hallucinated class names, drift from house style, and Cloud Service violations.
aem-agentkit answers those questions once, in a stable place. The agent reads authoritative project context at session start instead of rediscovering it, which lowers hallucination on the project's own conventions and removes the per-session rediscovery cost.
What gets created
Universal layer (always written when missing):
- <module>/AGENTS.md
- .aem/context/components.json
- .aem/context/osgi-services.json
- .aem/context/conventions.md
- .aem/context/avoid.md
- .aem/context/glossary.md
- .aem/context/test-patterns.md
- .aem/context/aem-api-namespaces.md
- .aem/context/README.md
- .aem/context/.agentkit-manifest.json
- .aem/context/.agentkit.lock
Tool-specific layer (signal-detected, then the developer confirms):
.claude/agents/aem-*.md + commands, .cursor/rules/aem-*.mdc, .github/instructions/aem-*.instructions.md, .continue/rules/, and single-file .clinerules / .windsurfrules / augment.md. A single canonical role-prompt source is projected into each tool's format, so the agent sees semantically equivalent (not byte-identical) content regardless of IDE. Tools that read AGENTS.md natively (Codex, Aider, Gemini CLI, Zed, RooCode, JetBrains Junie, …) need only the universal layer.
Key design decisions
- … AGENTS.md, not one root file. A single root file is either too large (loads conventions for modules you aren't touching) or too thin. Per-module files are sized for one task (soft cap 40 lines, hard cap 80): editing ui.apps/ loads HTL/component conventions; editing core/ loads Sling Model / package / logging conventions. Written recursively at each archetype leaf so context locality survives monorepos.
- … .aem/context/*). Two questions need two shapes: per-module files describe the project's style (prose with evidence pointers); the JSON indexes describe its inventory (deterministic, sorted). "Does the hero component exist?" is one read of components.json, not 30 reads.
- … .aem/context/ in monorepos. In multi-brand monorepos, conventions and component sets differ between sub-projects (e.g. brand-a vs brand-b). A root-only index conflates them, so a scoped .aem/context/ is materialized at every detected nested AEM project root, and self-validation hard-fails if a detected sub-project is missing one.
- … rename(2), SHA-256 canonical-body checksums, O_NOFOLLOW + TOCTOU re-check, Unicode sanitization, bounded walk, and an advisory flock. These run in bin/aem-agentkit-helper (Python 3.10+, no third-party deps, JSON-line protocol). The skill pins the helper by --version / --protocol-version and refuses to run on mismatch.
- … CLAUDE.md. Both decisions persist in .aem/agentkit-overrides.yml (decision: ide-targets, decision: claude-md) so re-runs don't re-prompt; both are suppressible for CI (--silent / AEM_AGENTKIT_SILENT=1), with the safe default being "don't touch CLAUDE.md."
- … generatedAt excluded, so identical content doesn't churn the file). A file is "skill-owned" only when that checksum recomputes; anything else, including a pasted-in marker, is treated as human-curated and never overwritten.
Hard guarantees
These are enforced in the helper (the LLM cannot bypass them, even if prompt-injected), not merely by convention:
- … <module>/AGENTS.md, .aem/context/*, the per-tool artifact paths, root CLAUDE.md); the deny-list (.git/, .env, *.pem, node_modules/, …) is refused first, regardless of intent.
- write-atomic reads any pre-existing target and refuses to overwrite it unless it is genuinely skill-owned (marker checksum recomputes); it fails closed on symlinks, bad/duplicated markers, or unreadable files. Overwriting a human-curated file requires an explicit allowOverwriteHumanCurated flag, which the orchestrator sets only after developer consent (e.g. the CLAUDE.md prompt).
- … read-for-context op, which neutralizes dangerous Unicode (bidi overrides, zero-width marks, BOM, C0/C1 controls; LF/CR preserved) before bytes reach the model. (It does not defend against natural-language injection; returned content is still treated as untrusted.)
- … fcntl.flock(LOCK_EX|LOCK_NB) advisory lock; the kernel releases it when the helper process exits or is killed, so a crash cannot strand the lock.
- … pom.xml, OSGi config, and the root AGENTS.md are never written. Root CLAUDE.md is touched only with explicit consent.
- … /6.5/ or experience-manager-65/.
Verified on aem-guides-wknd
Real run against a clean checkout of aem-guides-wknd (standard Cloud Service archetype: .cloudmanager/, conf.d/ dispatcher, 11 modules, no nested sub-projects). Universal layer + per-module AGENTS.md:
- … components.json: matches ground truth exactly (jcr:primaryType="cq:Component", source only; the 43 duplicate nodes under ui.apps/target/ build output were correctly pruned by the walk, so no double-counting).
- … osgi-services.json (HelloWorldModel, BylineImpl, ImageListImpl); every FQCN resolves to a real .java; 0 phantom services.
- … git status shows only .aem/ and 7 <module>/AGENTS.md as untracked; no M/D on any pre-existing file.
- … core/...Hack.java, pom.xml, node_modules/, and .git/config were all rejected by the helper; allow-listed paths succeeded.
Monorepo support (per-sub-project .aem/context/ at each nested AEM root, e.g. brand-a/ + brand-b/) follows the same contract and is exercised by self-validation's missing-subproject-context hard-fail.
Deterministic helper + tests
bin/aem-agentkit-helper (Python 3.10+, no third-party deps, POSIX only). Ops: realpath, open (O_NOFOLLOW + TOCTOU re-check), read-for-context (safe, Unicode-sanitized source ingestion), walk (bounded), sha256-canonical (Markdown + JSON), write-atomic (allow-list + deny-list + human-curated protection), cleanup-tmp, sanitize-string, lock/unlock (real fcntl.flock), match-deny, protocol-version.
tests/test_helper.py: 82 unit tests, 100% pass, no third-party deps, run in CI across macOS + Ubuntu × Python 3.10 / 3.11 / 3.12. Coverage includes canonical-body checksums, every Unicode strip category, workspace-escape and deny-list rejection, walk pruning and caps, flock mutual-exclusion and crash-safety, read-for-context sanitization, TOCTOU fail-closed, and human-curated overwrite protection.
Status
Beta (1.0.0-beta), with the three required markers (frontmatter status: beta, [BETA] description prefix, body blockquote). Generated JSON carries schemaVersion: "1". Verify all outputs before applying to production projects. Issues: https://github.com/adobe/skills/issues
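As a closing illustration of the read-for-context sanitization listed under Hard guarantees, here is a minimal sketch of NFC normalization followed by a Unicode strip. It approximates the strip-list with the Unicode general categories Cc (C0/C1 controls) and Cf (format characters, which include bidi overrides, zero-width marks, and the BOM); that approximation and the function name are assumptions, and the helper's real strip-list may be narrower or broader. Only LF and CR are preserved, following the rule as stated in this PR.

```python
import unicodedata

# Characters kept even though they are controls, per the stated "LF/CR preserved" rule.
PRESERVED = {"\n", "\r"}


def sanitize_for_context(text: str) -> str:
    """NFC-normalize, then drop control and format characters except LF/CR.

    Hypothetical approximation of the strip-list using categories Cc and Cf.
    """
    normalized = unicodedata.normalize("NFC", text)
    kept = []
    for char in normalized:
        if char in PRESERVED:
            kept.append(char)
        elif unicodedata.category(char) in ("Cc", "Cf"):
            continue  # bidi overrides, zero-width marks, BOM, C0/C1 controls
        else:
            kept.append(char)
    return "".join(kept)
```

As the description itself notes, this only removes invisible or reordering characters; the visible text that remains must still be treated as untrusted input.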