Skip to content

feat: separate pulled extensions from user extensions (#863)#885

Merged
stack72 merged 10 commits intomainfrom
paul/863-separate-pulled-extensions
Mar 27, 2026
Merged

feat: separate pulled extensions from user extensions (#863)#885
stack72 merged 10 commits intomainfrom
paul/863-separate-pulled-extensions

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented Mar 26, 2026

Summary

Pulled community extensions are now extracted into .swamp/pulled-extensions/{type}/
instead of extensions/{type}/, cleanly separating dependency artifacts from
user-authored code. The .swamp/ directory is already gitignored, so pulled
extension source files will no longer be accidentally committed to git.

A new swamp extension install command restores pulled extensions from the
committed lockfile (upstream_extensions.json), similar to how npm install
restores node_modules/ from package-lock.json.

Closes #863 — reported by @bixu

What changed

New directory layout

repo/
├── extensions/
│   ├── models/
│   │   ├── my-local-model.ts          ← user-authored (committed)
│   │   └── upstream_extensions.json   ← lockfile (committed)
│   └── vaults/
├── .swamp/                            ← gitignored
│   ├── pulled-extensions/
│   │   ├── models/                    ← pulled source files
│   │   ├── vaults/
│   │   ├── workflows/
│   │   ├── drivers/
│   │   ├── datastores/
│   │   └── reports/
│   ├── bundles/                       ← unchanged
│   └── ...

Changes by area

  1. Lockfile path decoupled from modelsDirreadUpstreamExtensions(),
    updateUpstreamExtensions(), and removeUpstreamExtension() now accept
    the full lockfile path instead of deriving it from the models directory.
    This allows the lockfile to stay committed while source files move.

  2. Pull writes to .swamp/pulled-extensions/installExtension() now
    extracts source files (models, vaults, workflows, drivers, datastores,
    reports) into .swamp/pulled-extensions/{type}/. Bundle caches remain in
    .swamp/bundles/ etc. unchanged.

  3. Loaders merge user + pulled directories — All five extension loaders
    (models, vaults, drivers, datastores, reports) accept an additionalDirs
    option and scan multiple directories in a single pass. User extensions are
    discovered first and take precedence via skipAlreadyRegistered.
    skipAlreadyRegistered was also added to the driver, datastore, and report
    loaders (models and vaults already had it).

  4. Legacy layout detection — A requireCurrentExtensionLayout() check
    runs at the start of every extension command (pull, rm, list, update,
    install). If upstream_extensions.json contains file paths outside
    .swamp/, it errors with: "Run 'swamp repo upgrade' to migrate."

  5. Migration in repo upgrade — New migrateExtensionLayout() method
    moves tracked files from extensions/{type}/ to
    .swamp/pulled-extensions/{type}/, updates the lockfile paths, and prunes
    empty directories.

  6. New swamp extension install command — Reads the lockfile and re-pulls
    any extensions whose source files are missing. Use after git clone or in
    CI. Removed the install alias from the pull command since they now have
    distinct behaviors.

  7. Lockfile enhanced with checksum + serverUrlupstream_extensions.json
    entries now store the SHA-256 checksum and registry server URL for reliable
    re-installation.

Design decisions

  • Lockfile stays at extensions/models/upstream_extensions.json — No
    breaking change to its location. It must be committed (not in .swamp/).
    Can revisit the location later.

  • Single-pass loader with additionalDirs rather than calling loaders
    twice — avoids doubling startup time from redundant directory scans, cache
    freshness checks, and import passes.

  • Hard error on legacy layout rather than auto-migration — Users upgrading
    swamp should explicitly run repo upgrade so the migration is predictable
    and not surprising. The error message is clear and actionable.

  • modelsDir in InstallContext reused for pulled paths — Since
    installExtension() only writes (never reads user extensions), the existing
    directory fields naturally become the pulled-extension target dirs without
    needing new fields.

Testing

Automated

  • All 3569 tests pass (0 failures), including 5 new layout detection tests
  • Integration tests updated to use .swamp/pulled-extensions/ paths
  • Unit tests for updateUpstreamExtensions, removeUpstreamExtension,
    readUpstreamExtensions all updated for the new lockfile path API

Manual end-to-end verification

Performed a full migration scenario with the stable binary (installed) vs the
compiled binary from this branch:

Step Result
Create temp dir + swamp repo init (stable) Repo initialized
extension pull @stack72/system-extensions (stable) Files at extensions/models/
model type search system (stable) @stack72/system-usage found
model type search system (new binary) Still loads (user dir scan)
extension list (new binary) Error: "old layout... Run repo upgrade"
repo upgrade (new binary) Migration runs, files moved
Verify file locations Source in .swamp/pulled-extensions/models/, lockfile paths updated
model type search system (new binary) @stack72/system-usage loads from pulled dir
extension list (new binary) Works — shows installed extension

Test plan

  • deno check — type checking passes
  • deno lint — no lint errors
  • deno fmt --check — formatting clean
  • deno run test — all 3569 tests pass
  • deno run compile — binary compiles
  • Manual: pull with stable, upgrade with new, verify migration
  • Manual: fresh pull with new binary writes to .swamp/pulled-extensions/
  • Manual: swamp extension install restores from lockfile after deleting .swamp/

Co-authored-by: Blake Irvin bixu@users.noreply.github.com

🤖 Generated with Claude Code

stack72 and others added 9 commits March 26, 2026 18:08
All functions that read/write the upstream_extensions.json lockfile
(readUpstreamExtensions, updateUpstreamExtensions, removeUpstreamExtension)
now accept the full lockfile path instead of deriving it from modelsDir.
This decoupling is the foundation for moving pulled extension source
files to .swamp/pulled-extensions/ while keeping the lockfile committed
at its current location.

Updated all callers: extension pull, rm, list, update, search commands,
auto-resolver adapter, and all related tests.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add pulled-extension subdirs to SWAMP_SUBDIRS (pulledModels, pulledVaults,
pulledWorkflows, pulledDrivers, pulledDatastores, pulledReports) and update
all extension pull/update/search commands to write source files to
.swamp/pulled-extensions/{type}/ instead of extensions/{type}/.

The .swamp/ directory is already gitignored, so pulled extension source
files will no longer be accidentally committed. Bundle caches remain in
.swamp/bundles/ etc. unchanged.

The auto-resolver adapter in mod.ts also updated to install pulled
extensions into the new directory structure.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All five extension loaders (models, vaults, drivers, datastores, reports)
now accept an additionalDirs option to scan multiple directories in a
single pass. User extensions are discovered first (precedence), then
pulled extensions from .swamp/pulled-extensions/{type}/.

Also adds skipAlreadyRegistered support to driver, datastore, and report
loaders (models and vaults already had it) so pulled extensions with
the same type as a user extension are silently skipped.

CLI mod.ts updated to pass pulled-extension dirs as additionalDirs to
each loader at startup.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add detectLegacyExtensionLayout() and requireCurrentExtensionLayout()
helpers that check upstream_extensions.json for file paths outside
.swamp/, indicating the old layout where pulled extensions lived in
extensions/{type}/.

All extension commands (pull, rm, list, update) now call
requireCurrentExtensionLayout() early and error with a clear message
directing users to run 'swamp repo upgrade'.

Add migrateExtensionLayout() to RepoService.upgrade() which:
- Reads upstream_extensions.json for tracked files
- Moves files from extensions/{type}/ to .swamp/pulled-extensions/{type}/
- Updates file paths in the lockfile
- Prunes empty directories left behind

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New command that reads upstream_extensions.json (the lockfile) and
re-pulls any extensions whose source files are missing from disk.
Analogous to npm install restoring node_modules from package-lock.json.

Use after git clone or in CI where .swamp/pulled-extensions/ is empty.
Skips extensions that already have all files present.

Also removes the "install" alias from the pull command since install
now has its own distinct behavior.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add optional checksum (SHA-256) and serverUrl fields to
UpstreamExtensionEntry. These are stored during pull for reliable
re-installation via swamp extension install — the checksum enables
integrity verification and the serverUrl supports non-default registries.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…kfile dir exists

Update integration test fixtures to use .swamp/pulled-extensions/ paths
instead of extensions/ paths. Ensure lockfile parent directory is created
before writing (needed for fresh repos where extensions/models/ may not
exist yet). Apply formatting fixes.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add startup check: if upstream_extensions.json has entries but source
files are missing from disk, warn the user to run 'swamp extension
install'. This catches post-clone scenarios where .swamp/ is empty.

Also fix workflow discovery for pulled extensions:
ExtensionWorkflowRepository now accepts additionalDirs and scans both
the user workflows dir (extensions/workflows/) and the pulled workflows
dir (.swamp/pulled-extensions/workflows/). Wired through
RepositoryFactoryConfig and all repo_context.ts call sites.

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@stack72 stack72 marked this pull request as ready for review March 26, 2026 19:59
github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

github-actions[bot]

This comment was marked as outdated.

- Add skipAlreadyRegistered: true to all 5 loader calls in mod.ts so
  pulled extensions with the same type as user extensions are silently
  skipped instead of producing warning/failure logs

- Add unit tests for extensionInstall generator (4 tests: empty lockfile,
  files present, missing files detection, missing lockfile)

- Fix duplicate step 6 comment in extension_pull.ts (now 6 → 7)

- Add helpful error when extension install receives arguments, directing
  users to extension pull instead (muscle memory from old alias)

- Add requireCurrentExtensionLayout check to extension install command,
  preventing partial implicit migration on legacy-layout repos

- Add requireCurrentExtensionLayout check to extension search --pull
  path, preventing mixed-layout lockfile entries

- Distinguish "No extensions in lockfile." from "All extensions up to
  date." in extension install output

Refs: #863

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. extension install help text references the internal filename — The description says "Reads upstream_extensions.json" which leaks an implementation detail. A user-facing phrasing like "Reads the committed lockfile" would be cleaner, but this is minor.

  2. upToDate vs snake_case in JSON outputExtensionInstallData uses upToDate (camelCase) while the status enum values use up_to_date (snake_case) in the same output. These are inconsistent within the same JSON object — worth making uniform (either all camelCase or all snake_case for field names).

  3. No progress feedback in JSON mode during multi-extension install — The resolving and installing events are silently swallowed in JSON mode (only completed emits). For CI use cases (the primary audience for --json), streaming progress via newline-delimited JSON per extension would be helpful. Not a blocker given the stated use case.

  4. "Reading lockfile..." doesn't say where — When --verbose or debugging is needed, showing the lockfile path (e.g. "Reading lockfile at extensions/models/upstream_extensions.json...") would help users confirm the right file is being read.

Verdict

PASS — New extension install command is well-designed with clear help text, actionable error messages (both the legacy-layout check and the startup warning are excellent), correct --repo-dir consistency, and solid JSON output shape. The removal of the install alias from pull is handled gracefully with a helpful redirection error.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-structured PR that cleanly separates pulled extensions from user-authored code. The architecture is sound — lockfile stays committed, source files move to .swamp/, loaders merge both directories with user code taking precedence. The new extension install command is a natural analog to npm install.

Blocking Issues

None.

Suggestions

  1. Duplicate step comment in extension_install.ts — Lines 86 and 100 both say // 4. (should be // 4. and // 5.). Minor cosmetic issue.

  2. Silent skip during migration (repo_service.ts:1600-1601) — When migrateExtensionLayout() encounters a NotFound error during file moves, it silently ignores it. Consider logging a warning so users know which files weren't migrated (e.g., "Skipping missing file: {path}").

  3. Heuristic file extension filter (cli/mod.ts:394-395) — checkForMissingPulledExtensions skips .js, .md, .txt files when checking for missing source files. This is a reasonable heuristic but could miss edge cases if new file types are added to bundles. Consider filtering by path prefix (e.g., checking only files under pulled-extensions/) instead of file extension, since the new layout makes this distinction structural.

  4. Path traversal consideration in migration (repo_service.ts:1592-1593) — The file values from upstream_extensions.json are joined to repoPath without validating that the resolved path stays within the repo boundary. Since the lockfile is committed and user-controlled, risk is low, but a resolve() + startsWith() guard would be defense-in-depth. Same applies to hasAnyMissingFiles() in install.ts:134.

  5. DDD notemigrateExtensionLayout() in RepoService reads the lockfile directly via readUpstreamExtensions() and writes it with atomicWriteTextFile(). This is pragmatic for a migration, but if the lockfile gains more operations, consider encapsulating it behind a repository abstraction. Fine for now.

Overall this is clean, well-tested code with good separation of concerns. The event-based async generator pattern, comprehensive deduplication in loaders, and hard-error-on-legacy-layout approach are all solid design choices.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found.

Medium

  1. extension install ignores per-extension serverUrl from lockfilesrc/cli/commands/extension_install.ts:101-111

    The lockfile now stores serverUrl per extension (added in this PR), but extension install creates a single ExtensionApiClient with the global resolveServerUrl() and uses it for all extensions. The createInstallContext callback ignores _name/_version, so it can't look up the entry's stored serverUrl.

    Breaking example: User pulls @corp/internal-model from a private registry (SWAMP_CLUB_URL=https://registry.corp.com). The lockfile records "serverUrl": "https://registry.corp.com". Later, on a fresh clone, swamp extension install tries to fetch it from the default https://swamp.club, which 404s or returns a different extension.

    Suggested fix: Inside createInstallContext, read the entry from the upstream map and use entry.serverUrl ?? resolveServerUrl() to construct the ExtensionApiClient:

    createInstallContext: (name, _version) => {
      const entry = upstream[name];
      const entryServerUrl = entry?.serverUrl ?? serverUrl;
      const entryClient = new ExtensionApiClient(entryServerUrl);
      return {
        getExtension: (n) => entryClient.getExtension(n),
        // ...
      };
    },

    This requires reading upstream before the consumeStream call (which you already do inside extensionInstall, so you'd need to either pass it through or read it once in the command).

Low

  1. Migration writes lockfile without advisory locksrc/domain/repo/repo_service.ts:1614

    migrateExtensionLayout writes the lockfile via atomicWriteTextFile directly, bypassing the advisory .lock file that updateUpstreamExtensions and removeUpstreamExtension use. A concurrent extension pull during repo upgrade could see a torn write. Unlikely in practice since upgrade is manual, but inconsistent with the locking pattern established elsewhere.

  2. alreadyPulled not shared across extensions during installsrc/cli/commands/extension_install.ts:125

    Each createInstallContext call creates a fresh new Set(). If extension A depends on B, and both are in the lockfile, B could be pulled twice (once as A's dependency, once standalone). The outer loop's file-existence check prevents true duplication, but the install count in the completed event may undercount "installed" and overcount "up_to_date" for dependency chains.

  3. No path validation in migration's relativePart computationsrc/domain/repo/repo_service.ts:1579

    relativePart is computed via file.slice(oldPrefix.length) which could contain ../ segments if the lockfile were hand-edited. join() normalizes these, potentially escaping the intended directory. The lockfile is committed/trusted data so exploitation requires repo write access, but a resolve+startsWith guard would be defense-in-depth.

Verdict

PASS — The core design is solid: clean separation of user vs. pulled extensions, single-pass loaders with precedence, safe migration path, and good test coverage. The serverUrl issue (Medium #1) is a real gap for private-registry users but doesn't affect the default single-registry case. No merge-blocking findings.

@stack72 stack72 merged commit eda4673 into main Mar 27, 2026
10 checks passed
@stack72 stack72 deleted the paul/863-separate-pulled-extensions branch March 27, 2026 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

swamp extension pull should gitignore pulled extension source files

1 participant