diff --git a/CHANGELOG.md b/CHANGELOG.md index 5ce8d274..5b5cde4a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,8 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +## [3.3.0] - 2026-02-22 + ### Added +- **Content-on-node (M13 VESSEL)** — Attach rich content to graph nodes using git's native CAS. Content stored as git blobs via `hash-object`, SHA and metadata recorded as WARP node properties under the `_content.*` prefix (#271) +- **`git mind content set <node> --from <file>`** — Attach content from a file. MIME auto-detected from extension, `--mime` override supported. `--json` output (#273) +- **`git mind content show <node>`** — Display attached content. `--raw` for piping (body only, no metadata header). `--json` output (#273) +- **`git mind content meta <node>`** — Show content metadata (SHA, MIME, size). `--json` output (#273) +- **`git mind content delete <node>`** — Remove content attachment from a node. `--json` output (#273) +- **Content store API** — `writeContent()`, `readContent()`, `getContentMeta()`, `hasContent()`, `deleteContent()` exported from public API (#272) +- **SHA integrity verification** — `readContent()` re-hashes retrieved blob and compares to stored SHA on every read (#272) +- **JSON Schema contracts for content CLI** — `content-set.schema.json`, `content-show.schema.json`, `content-meta.schema.json` in `docs/contracts/cli/` (#274) - **ADR-0004: Content Attachments Belong in git-warp** — Decision record establishing that CAS-backed content-on-node is a git-warp substrate responsibility, not a git-mind domain concern.
Aligns with Paper I's `Atom(p)` attachment formalism (#252) - **Chalk formatting for `extension list`** — `formatExtensionList()` renders extension names in cyan bold, versions dimmed, `[builtin]` in yellow / `[custom]` in magenta, consistent with all other CLI commands (#265) - **Prefix collision detection** — `registerExtension()` now checks incoming domain prefixes against all registered extensions and throws a descriptive error on overlap. Idempotent re-registration of the same extension name is still allowed (#264) @@ -17,11 +27,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **JSON Schema contracts for extension CLI output** — 4 new schemas in `docs/contracts/cli/`: `extension-list`, `extension-validate`, `extension-add`, `extension-remove`. Valid samples added to the contract test harness (#262) - **Deferred items documented in ROADMAP** — #261 (ephemeral registration) and #269 (`--extension` flag) documented with rationale and recommended H2 slot +### Fixed + +- **CRITICAL: Command injection in `readContent()`** — Replaced all `execSync` shell interpolation with `execFileSync` arg arrays + SHA validation regex. Zero shell invocations in content module (#276) +- **Dead `encoding` parameter removed** — Removed unused `encoding` field from content store, CLI format, JSON Schema contracts, and tests. Content is always UTF-8 (#276) +- **Static imports in content CLI** — Replaced dynamic `await import('node:fs/promises')` and `await import('node:path')` with static imports (#276) +- **`nodeId` in `content show` metadata** — Non-raw `content show` now passes `nodeId` to `formatContentMeta` for consistent display (#276) +- **Schema `if/then/else` conditional** — `content-meta.schema.json` enforces `sha`, `mime`, and `size` required when `hasContent` is `true`; forbids them when `false` (#276) +- **Redundant null check** — Removed dead `sha !== undefined` in `hasContent()` — `?? 
null` guarantees non-undefined (#276) +- **Misleading integrity test** — Split into blob-not-found test + genuine integrity mismatch test using non-UTF-8 blob (#276) +- **Test SHA assertions accept both SHA-1 (40 chars) and SHA-256 (64 chars)** (#276) +- **Schema test compile-once** — Content schema validators compiled once in `beforeAll` instead of per-test; removed `$id` stripping workaround (#276) +- **Error-path CLI tests** — 4 new tests: nonexistent file, node without content, non-existent node for show/delete (#276) +- **MIME map extended** — Added `.css` → `text/css` and `.svg` → `image/svg+xml` (#276) +- **YAML MIME type** — Changed `.yaml`/`.yml` mapping from `text/yaml` to `application/yaml` (IANA standard) (#276) +- **Missing `content-delete.schema.json` contract** — Added JSON Schema for `content delete --json` output (#276) +- **Content subcommand positional parsing** — `extractPositionals()` helper properly skips `--flag value` pairs instead of naive `!startsWith('--')` check (#276) + ### Changed - **Upgraded `@git-stunts/git-warp`** from v11.3.3 to v11.5.0 - **`registerBuiltinExtensions()` memoized** — Module-level `builtInsLoaded` flag prevents redundant YAML file reads on repeated invocations within the same process (#266) -- **Test count** — 537 tests across 28 files (was 527) +- **Test count** — 571 tests across 29 files (was 537) ## [3.2.0] - 2026-02-17 diff --git a/ROADMAP.md b/ROADMAP.md index 37ce6f0f..8f1b9891 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -2187,6 +2187,16 @@ Two issues were filed during the M12 extension polish pass and intentionally def **Recommended slot:** H2 (CONTENT + MATERIALIZATION) planning. Both issues naturally fall into the extension lifecycle story — persistence is a prerequisite for the extension marketplace vision (H4). 
Design the persistence mechanism during H2 kickoff, implement as the first H2 deliverable so that all subsequent extension work (content system extensions, materializer extensions) benefits from proper registration. +### Content system enhancements (from M13 VESSEL review) + +- **`git mind content list`** — Query all nodes that have `_content.sha` properties. Currently there's no way to discover which nodes carry content without inspecting each one individually. +- **Binary content support** — Add base64 encoding for non-text MIME types. Currently the content system is text-only (UTF-8); non-UTF-8 blobs fail the integrity check by design. Requires reintroducing encoding metadata and updating `readContent()` to handle buffer round-trips. +- **`content meta --verify` flag** — Run the SHA integrity check without dumping the full content body. Useful for bulk health checks across all content-bearing nodes. + +### Codebase hardening (from M13 VESSEL review) + +- **Standardize all git subprocess calls to `execFileSync`** — `src/content.js` now uses `execFileSync` exclusively, but other modules (e.g. `processCommitCmd` in `commands.js`) still use `execSync` with string interpolation. Audit and migrate for consistency and defense-in-depth. 
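The validate-then-arg-array pattern that this migration calls for can be sketched as follows. This is a minimal illustration, not the actual `commands.js` code: `catBlob` is a hypothetical helper name, and only the general shape mirrors what `src/content.js` already does.

```javascript
// Sketch of the execSync → execFileSync migration described above.
// BEFORE (vulnerable): execSync(`git cat-file blob ${sha}`) — a crafted
// value like "deadbeef; rm -rf ." reaches the shell verbatim.
// AFTER (safe): validate the untrusted input first, then pass argv as an
// array so no shell is ever involved in the subprocess invocation.
import { execFileSync } from 'node:child_process';

// Exact-length match: SHA-1 (40 hex chars) or SHA-256 (64 hex chars).
const SHA_RE = /^(?:[0-9a-f]{40}|[0-9a-f]{64})$/;

function catBlob(cwd, sha) {
  if (!SHA_RE.test(sha)) {
    // Reject before any subprocess is spawned.
    throw new Error(`Invalid content SHA: ${sha}`);
  }
  // Argument-array form: git receives `sha` as a literal argv entry.
  return execFileSync('git', ['cat-file', 'blob', sha], { cwd, encoding: 'utf-8' });
}
```

Because validation happens before the spawn, the failure path never touches git, which also makes the guard easy to unit-test without a repository.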
+ ### Other backlog items - `git mind onboarding` as a guided walkthrough (not just a view) diff --git a/bin/git-mind.js b/bin/git-mind.js index 59cf9edf..f76b6ce1 100755 --- a/bin/git-mind.js +++ b/bin/git-mind.js @@ -5,7 +5,7 @@ * Usage: git mind <command> [options] */ -import { init, link, view, list, remove, nodes, status, at, importCmd, importMarkdownCmd, exportCmd, mergeCmd, installHooks, processCommitCmd, doctor, suggest, review, diff, set, unsetCmd, extensionList, extensionValidate, extensionAdd, extensionRemove } from '../src/cli/commands.js'; +import { init, link, view, list, remove, nodes, status, at, importCmd, importMarkdownCmd, exportCmd, mergeCmd, installHooks, processCommitCmd, doctor, suggest, review, diff, set, unsetCmd, contentSet, contentShow, contentMeta, contentDelete, extensionList, extensionValidate, extensionAdd, extensionRemove } from '../src/cli/commands.js'; import { parseDiffRefs, collectDiffPositionals } from '../src/diff.js'; import { createContext } from '../src/context-envelope.js'; import { registerBuiltinExtensions } from '../src/extension.js'; @@ -87,6 +87,17 @@ Commands: review Review pending suggestions --batch accept|reject Non-interactive batch mode --json Output as JSON + content Manage node content + set <node> --from <file> Attach content from a file + --mime <type> Override MIME type detection + --json Output as JSON + show <node> Display attached content + --raw Output body only (no metadata header) + --json Output as JSON + meta <node> Show content metadata + --json Output as JSON + delete <node> Remove attached content + --json Output as JSON extension Manage extensions list List registered extensions --json Output as JSON @@ -101,7 +112,7 @@ Edge types: implements, augments, relates-to, blocks, belongs-to, consumed-by, depends-on, documents`); } -const BOOLEAN_FLAGS = new Set(['json', 'fix', 'dry-run', 'validate']); +const BOOLEAN_FLAGS = new Set(['json', 'fix', 'dry-run', 'validate', 'raw']); /** * Extract a ContextEnvelope from parsed flags. 
@@ -141,6 +152,24 @@ function parseFlags(args) { return flags; } +/** + * Extract positional arguments from args, skipping --flag value pairs. + * @param {string[]} args + * @returns {string[]} + */ +function extractPositionals(args) { + const positionals = []; + for (let i = 0; i < args.length; i++) { + if (args[i].startsWith('--')) { + const flag = args[i].slice(2); + if (!BOOLEAN_FLAGS.has(flag) && i + 1 < args.length) i++; // skip value + } else { + positionals.push(args[i]); + } + } + return positionals; +} + switch (command) { case 'init': await init(cwd); break; @@ -166,16 +195,7 @@ switch (command) { case 'view': { const viewArgs = args.slice(1); const viewFlags = parseFlags(viewArgs); - // Collect positionals: skip flags and their consumed values - const viewPositionals = []; - for (let i = 0; i < viewArgs.length; i++) { - if (viewArgs[i].startsWith('--')) { - const flag = viewArgs[i].slice(2); - if (!BOOLEAN_FLAGS.has(flag) && i + 1 < viewArgs.length) i++; // skip value - } else { - viewPositionals.push(viewArgs[i]); - } - } + const viewPositionals = extractPositionals(viewArgs); const viewCtx = contextFromFlags(viewFlags); await view(cwd, viewPositionals[0], { scope: viewFlags.scope, @@ -373,6 +393,67 @@ switch (command) { break; } + case 'content': { + const contentSubCmd = args[1]; + const contentArgs = args.slice(2); + const contentFlags = parseFlags(contentArgs); + const contentPositionals = extractPositionals(contentArgs); + switch (contentSubCmd) { + case 'set': { + const setNode = contentPositionals[0]; + const fromFile = contentFlags.from; + if (!setNode || !fromFile) { + console.error('Usage: git mind content set <node> --from <file> [--mime <type>] [--json]'); + process.exitCode = 1; + break; + } + await contentSet(cwd, setNode, fromFile, { + mime: contentFlags.mime, + json: contentFlags.json ?? 
false, + }); + break; + } + case 'show': { + const showNode = contentPositionals[0]; + if (!showNode) { + console.error('Usage: git mind content show <node> [--raw] [--json]'); + process.exitCode = 1; + break; + } + await contentShow(cwd, showNode, { + raw: contentFlags.raw ?? false, + json: contentFlags.json ?? false, + }); + break; + } + case 'meta': { + const metaNode = contentPositionals[0]; + if (!metaNode) { + console.error('Usage: git mind content meta <node> [--json]'); + process.exitCode = 1; + break; + } + await contentMeta(cwd, metaNode, { json: contentFlags.json ?? false }); + break; + } + case 'delete': { + const deleteNode = contentPositionals[0]; + if (!deleteNode) { + console.error('Usage: git mind content delete <node> [--json]'); + process.exitCode = 1; + break; + } + await contentDelete(cwd, deleteNode, { json: contentFlags.json ?? false }); + break; + } + default: + console.error(`Unknown content subcommand: ${contentSubCmd ?? '(none)'}`); + console.error('Usage: git mind content <set|show|meta|delete>'); + process.exitCode = 1; + } + break; + } + case 'extension': { await registerBuiltinExtensions(); const subCmd = args[1]; diff --git a/docs/contracts/cli/content-delete.schema.json b/docs/contracts/cli/content-delete.schema.json new file mode 100644 index 00000000..604137ab --- /dev/null +++ b/docs/contracts/cli/content-delete.schema.json @@ -0,0 +1,16 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://github.com/neuroglyph/git-mind/docs/contracts/cli/content-delete.schema.json", + "title": "git-mind content delete --json", + "description": "Content deletion result from `git mind content delete --json`", + "type": "object", + "required": ["schemaVersion", "command", "nodeId", "removed", "previousSha"], + "additionalProperties": false, + "properties": { + "schemaVersion": { "type": "integer", "const": 1 }, + "command": { "type": "string", "const": "content-delete" }, + "nodeId": { "type": "string", "minLength": 1 }, + "removed": { "type": "boolean" }, + 
"previousSha": { "type": ["string", "null"], "pattern": "^[0-9a-f]{40,64}$" } + } +} diff --git a/docs/contracts/cli/content-meta.schema.json b/docs/contracts/cli/content-meta.schema.json new file mode 100644 index 00000000..18329633 --- /dev/null +++ b/docs/contracts/cli/content-meta.schema.json @@ -0,0 +1,37 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://github.com/neuroglyph/git-mind/docs/contracts/cli/content-meta.schema.json", + "title": "git-mind content meta --json", + "description": "Content metadata result from `git mind content meta --json`", + "type": "object", + "required": ["schemaVersion", "command", "nodeId", "hasContent"], + "additionalProperties": false, + "if": { + "properties": { "hasContent": { "const": true } }, + "required": ["hasContent"] + }, + "then": { + "required": ["sha", "mime", "size"], + "properties": { + "sha": { "type": "string", "pattern": "^[0-9a-f]{40,64}$" }, + "mime": { "type": "string", "minLength": 1 }, + "size": { "type": "integer", "minimum": 0 } + } + }, + "else": { + "properties": { + "sha": false, + "mime": false, + "size": false + } + }, + "properties": { + "schemaVersion": { "type": "integer", "const": 1 }, + "command": { "type": "string", "const": "content-meta" }, + "nodeId": { "type": "string", "minLength": 1 }, + "hasContent": { "type": "boolean" }, + "sha": { "type": "string", "pattern": "^[0-9a-f]{40,64}$" }, + "mime": { "type": "string", "minLength": 1 }, + "size": { "type": "integer", "minimum": 0 } + } +} diff --git a/docs/contracts/cli/content-set.schema.json b/docs/contracts/cli/content-set.schema.json new file mode 100644 index 00000000..25f9e672 --- /dev/null +++ b/docs/contracts/cli/content-set.schema.json @@ -0,0 +1,17 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://github.com/neuroglyph/git-mind/docs/contracts/cli/content-set.schema.json", + "title": "git-mind content set --json", + "description": "Content attachment result from 
`git mind content set --json`", + "type": "object", + "required": ["schemaVersion", "command", "nodeId", "sha", "mime", "size"], + "additionalProperties": false, + "properties": { + "schemaVersion": { "type": "integer", "const": 1 }, + "command": { "type": "string", "const": "content-set" }, + "nodeId": { "type": "string", "minLength": 1 }, + "sha": { "type": "string", "pattern": "^[0-9a-f]{40,64}$" }, + "mime": { "type": "string", "minLength": 1 }, + "size": { "type": "integer", "minimum": 0 } + } +} diff --git a/docs/contracts/cli/content-show.schema.json b/docs/contracts/cli/content-show.schema.json new file mode 100644 index 00000000..949fee4c --- /dev/null +++ b/docs/contracts/cli/content-show.schema.json @@ -0,0 +1,18 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://github.com/neuroglyph/git-mind/docs/contracts/cli/content-show.schema.json", + "title": "git-mind content show --json", + "description": "Content display result from `git mind content show --json`", + "type": "object", + "required": ["schemaVersion", "command", "nodeId", "content", "sha", "mime", "size"], + "additionalProperties": false, + "properties": { + "schemaVersion": { "type": "integer", "const": 1 }, + "command": { "type": "string", "const": "content-show" }, + "nodeId": { "type": "string", "minLength": 1 }, + "content": { "type": "string" }, + "sha": { "type": "string", "pattern": "^[0-9a-f]{40,64}$" }, + "mime": { "type": "string", "minLength": 1 }, + "size": { "type": "integer", "minimum": 0 } + } +} diff --git a/package.json b/package.json index 0793154c..e41a74cd 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@neuroglyph/git-mind", - "version": "3.2.0", + "version": "3.3.0", "description": "A project knowledge graph tool built on git-warp", "type": "module", "license": "Apache-2.0", diff --git a/src/cli/commands.js b/src/cli/commands.js index 2cff8c00..a0a82e45 100644 --- a/src/cli/commands.js +++ b/src/cli/commands.js @@ 
-4,8 +4,8 @@ */ import { execSync } from 'node:child_process'; -import { writeFile, chmod, access, constants } from 'node:fs/promises'; -import { join } from 'node:path'; +import { writeFile, chmod, access, constants, readFile } from 'node:fs/promises'; +import { join, extname } from 'node:path'; import { initGraph, loadGraph } from '../graph.js'; import { createEdge, queryEdges, removeEdge, EDGE_TYPES } from '../edges.js'; import { getNodes, hasNode, getNode, getNodesByPrefix, setNodeProperty, unsetNodeProperty } from '../nodes.js'; @@ -25,7 +25,8 @@ import { getPendingSuggestions, acceptSuggestion, rejectSuggestion, skipSuggesti import { computeDiff } from '../diff.js'; import { createContext, DEFAULT_CONTEXT } from '../context-envelope.js'; import { loadExtension, registerExtension, removeExtension, listExtensions, validateExtension } from '../extension.js'; -import { success, error, info, warning, formatEdge, formatView, formatNode, formatNodeList, formatStatus, formatExportResult, formatImportResult, formatDoctorResult, formatSuggestions, formatReviewItem, formatDecisionSummary, formatAtStatus, formatDiff, formatExtensionList } from './format.js'; +import { writeContent, readContent, getContentMeta, hasContent, deleteContent } from '../content.js'; +import { success, error, info, warning, formatEdge, formatView, formatNode, formatNodeList, formatStatus, formatExportResult, formatImportResult, formatDoctorResult, formatSuggestions, formatReviewItem, formatDecisionSummary, formatAtStatus, formatDiff, formatExtensionList, formatContentMeta } from './format.js'; /** * Write structured JSON to stdout with schemaVersion and command fields. @@ -807,6 +808,134 @@ export async function diff(cwd, refA, refB, opts = {}) { } } +// ── Content commands ───────────────────────────────────────────── + +/** MIME type mapping from file extensions. 
*/ +const MIME_MAP = { + '.md': 'text/markdown', + '.markdown': 'text/markdown', + '.txt': 'text/plain', + '.json': 'application/json', + '.yaml': 'application/yaml', + '.yml': 'application/yaml', + '.html': 'text/html', + '.xml': 'application/xml', + '.csv': 'text/csv', + '.css': 'text/css', + '.svg': 'image/svg+xml', +}; + +/** + * Attach content to a graph node from a file. + * @param {string} cwd + * @param {string} nodeId + * @param {string} filePath + * @param {{ mime?: string, json?: boolean }} opts + */ +export async function contentSet(cwd, nodeId, filePath, opts = {}) { + try { + const buf = await readFile(filePath); + const mime = opts.mime ?? MIME_MAP[extname(filePath).toLowerCase()] ?? 'application/octet-stream'; + + const graph = await loadGraph(cwd); + const result = await writeContent(cwd, graph, nodeId, buf, { mime }); + + if (opts.json) { + outputJson('content-set', result); + } else { + console.log(success(`Content attached to ${nodeId}`)); + console.log(formatContentMeta(result)); + } + } catch (err) { + console.error(error(err.message)); + process.exitCode = 1; + } +} + +/** + * Show content attached to a graph node. + * @param {string} cwd + * @param {string} nodeId + * @param {{ raw?: boolean, json?: boolean }} opts + */ +export async function contentShow(cwd, nodeId, opts = {}) { + try { + const graph = await loadGraph(cwd); + const { content, meta } = await readContent(cwd, graph, nodeId); + + if (opts.json) { + outputJson('content-show', { nodeId, content, ...meta }); + return; + } + + if (opts.raw) { + process.stdout.write(content); + } else { + console.log(formatContentMeta({ nodeId, ...meta })); + console.log(''); + console.log(content); + } + } catch (err) { + console.error(error(err.message)); + process.exitCode = 1; + } +} + +/** + * Show content metadata for a graph node. 
+ * @param {string} cwd + * @param {string} nodeId + * @param {{ json?: boolean }} opts + */ +export async function contentMeta(cwd, nodeId, opts = {}) { + try { + const graph = await loadGraph(cwd); + const meta = await getContentMeta(graph, nodeId); + + if (!meta) { + if (opts.json) { + outputJson('content-meta', { nodeId, hasContent: false }); + } else { + console.log(info(`No content attached to ${nodeId}`)); + } + return; + } + + if (opts.json) { + outputJson('content-meta', { nodeId, hasContent: true, ...meta }); + } else { + console.log(formatContentMeta({ nodeId, ...meta })); + } + } catch (err) { + console.error(error(err.message)); + process.exitCode = 1; + } +} + +/** + * Delete content from a graph node. + * @param {string} cwd + * @param {string} nodeId + * @param {{ json?: boolean }} opts + */ +export async function contentDelete(cwd, nodeId, opts = {}) { + try { + const graph = await loadGraph(cwd); + const result = await deleteContent(graph, nodeId); + + if (opts.json) { + outputJson('content-delete', result); + } else if (result.removed) { + console.log(success(`Content removed from ${nodeId}`)); + } else { + console.log(info(`No content to remove from ${nodeId}`)); + } + } catch (err) { + console.error(error(err.message)); + process.exitCode = 1; + } +} + // ── Extension commands ─────────────────────────────────────────── /** diff --git a/src/cli/format.js b/src/cli/format.js index b9025b78..bcfba9c0 100644 --- a/src/cli/format.js +++ b/src/cli/format.js @@ -498,6 +498,22 @@ export function formatProgressMeta(meta) { return lines.join('\n'); } +/** + * Format content metadata for terminal display. 
+ * @param {{ nodeId?: string, sha: string, mime: string, size: number }} meta + * @returns {string} + */ +export function formatContentMeta(meta) { + const lines = []; + if (meta.nodeId) { + lines.push(`  ${chalk.dim('node:')} ${chalk.cyan.bold(meta.nodeId)}`); + } + lines.push(`  ${chalk.dim('sha:')}  ${meta.sha}`); + lines.push(`  ${chalk.dim('mime:')} ${meta.mime}`); + lines.push(`  ${chalk.dim('size:')} ${meta.size} bytes`); + return lines.join('\n'); +} + /** * Format the extension list for terminal display. * @param {import('../extension.js').ExtensionRecord[]} extensions diff --git a/src/content.js b/src/content.js new file mode 100644 index 00000000..2ba1849d --- /dev/null +++ b/src/content.js @@ -0,0 +1,208 @@ +/** + * @module content + * Content-on-node: attach rich content (markdown, text, etc.) to graph nodes + * using git's native content-addressed storage. + * + * Content is stored as git blobs via `git hash-object -w`. The blob SHA and + * metadata are recorded as WARP node properties under the `_content.` prefix. + * + * Property convention: + * _content.sha — git blob SHA + * _content.mime — MIME type (e.g. "text/markdown") + * _content.size — byte count + */ + +import { execFileSync } from 'node:child_process'; + +/** Property key prefix for content metadata. */ +const PREFIX = '_content.'; + +/** Known content property keys. */ +const KEYS = { + sha: `${PREFIX}sha`, + mime: `${PREFIX}mime`, + size: `${PREFIX}size`, +}; + +/** Validates a string is a 40- or 64-hex-char git object hash (SHA-1 or SHA-256). */ +const SHA_RE = /^(?:[0-9a-f]{40}|[0-9a-f]{64})$/; + +/** @throws {Error} if sha is not a valid git object hash (40 or 64 hex chars). 
*/ +function assertValidSha(sha) { + if (typeof sha !== 'string' || !SHA_RE.test(sha)) { + throw new Error(`Invalid content SHA: ${sha}`); + } +} + +/** + * @typedef {object} ContentMeta + * @property {string} sha - Git blob SHA + * @property {string} mime - MIME type + * @property {number} size - Content size in bytes + */ + +/** + * @typedef {object} WriteContentResult + * @property {string} nodeId - Target node + * @property {string} sha - Written blob SHA + * @property {string} mime - MIME type + * @property {number} size - Byte count + */ + +/** + * Write content to a graph node. Stores the content as a git blob and records + * metadata as node properties. + * + * @param {string} cwd - Repository working directory + * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance + * @param {string} nodeId - Target node ID + * @param {Buffer|string} content - Content to store + * @param {{ mime?: string }} [opts] + * @returns {Promise} + */ +export async function writeContent(cwd, graph, nodeId, content, opts = {}) { + const exists = await graph.hasNode(nodeId); + if (!exists) { + throw new Error(`Node not found: ${nodeId}`); + } + + const buf = Buffer.isBuffer(content) ? content : Buffer.from(content, 'utf-8'); + const mime = opts.mime ?? 'text/plain'; + const size = buf.length; + + // Write blob to git object store + const sha = execFileSync('git', ['hash-object', '-w', '--stdin'], { + cwd, + input: buf, + encoding: 'utf-8', + }).trim(); + + // Record metadata as node properties + const patch = await graph.createPatch(); + patch.setProperty(nodeId, KEYS.sha, sha); + patch.setProperty(nodeId, KEYS.mime, mime); + patch.setProperty(nodeId, KEYS.size, size); + await patch.commit(); + + return { nodeId, sha, mime, size }; +} + +/** + * Read content attached to a graph node. Retrieves the blob from git's object + * store and verifies SHA integrity. 
+ * + * @param {string} cwd - Repository working directory + * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance + * @param {string} nodeId - Target node ID + * @returns {Promise<{ content: string, meta: ContentMeta }>} + */ +export async function readContent(cwd, graph, nodeId) { + const meta = await getContentMeta(graph, nodeId); + if (!meta) { + throw new Error(`No content attached to node: ${nodeId}`); + } + + // Validate SHA before passing to git + assertValidSha(meta.sha); + + // Retrieve blob from git object store + let content; + try { + content = execFileSync('git', ['cat-file', 'blob', meta.sha], { + cwd, + encoding: 'utf-8', + }); + } catch { + throw new Error( + `Content blob ${meta.sha} not found in git object store for node: ${nodeId}`, + ); + } + + // Verify integrity: re-hash and compare + const verifyBuf = Buffer.from(content, 'utf-8'); + const verifySha = execFileSync('git', ['hash-object', '--stdin'], { + cwd, + input: verifyBuf, + encoding: 'utf-8', + }).trim(); + + if (verifySha !== meta.sha) { + throw new Error( + `Content integrity check failed for node ${nodeId}: ` + + `expected ${meta.sha}, got ${verifySha}`, + ); + } + + return { content, meta }; +} + +/** + * Get content metadata for a node without retrieving the blob. + * Returns null if no content is attached. + * + * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance + * @param {string} nodeId - Target node ID + * @returns {Promise} + */ +export async function getContentMeta(graph, nodeId) { + const exists = await graph.hasNode(nodeId); + if (!exists) { + throw new Error(`Node not found: ${nodeId}`); + } + + const propsMap = await graph.getNodeProps(nodeId); + const sha = propsMap?.get(KEYS.sha) ?? null; + if (!sha) return null; + + return { + sha, + mime: propsMap.get(KEYS.mime) ?? 'text/plain', + size: propsMap.get(KEYS.size) ?? 0, + }; +} + +/** + * Check whether a node has content attached. 
+ * + * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance + * @param {string} nodeId - Target node ID + * @returns {Promise} + */ +export async function hasContent(graph, nodeId) { + const exists = await graph.hasNode(nodeId); + if (!exists) return false; + + const propsMap = await graph.getNodeProps(nodeId); + const sha = propsMap?.get(KEYS.sha) ?? null; + return sha !== null; +} + +/** + * Delete content from a node by clearing the `_content.*` properties. + * The git blob remains in the object store (cleaned up by git gc). + * + * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance + * @param {string} nodeId - Target node ID + * @returns {Promise<{ nodeId: string, removed: boolean, previousSha: string|null }>} + */ +export async function deleteContent(graph, nodeId) { + const exists = await graph.hasNode(nodeId); + if (!exists) { + throw new Error(`Node not found: ${nodeId}`); + } + + const propsMap = await graph.getNodeProps(nodeId); + const previousSha = propsMap?.get(KEYS.sha) ?? 
null; + + if (!previousSha) { + return { nodeId, removed: false, previousSha: null }; + } + + const patch = await graph.createPatch(); + patch.setProperty(nodeId, KEYS.sha, null); + patch.setProperty(nodeId, KEYS.mime, null); + patch.setProperty(nodeId, KEYS.size, null); + await patch.commit(); + + return { nodeId, removed: true, previousSha }; +} diff --git a/src/index.js b/src/index.js index 44e63c11..9a910e39 100644 --- a/src/index.js +++ b/src/index.js @@ -48,3 +48,6 @@ export { loadExtension, registerExtension, removeExtension, listExtensions, getExtension, validateExtension, resetExtensions, registerBuiltinExtensions, } from './extension.js'; +export { + writeContent, readContent, getContentMeta, hasContent, deleteContent, +} from './content.js'; diff --git a/test/content.test.js b/test/content.test.js new file mode 100644 index 00000000..3194aa5f --- /dev/null +++ b/test/content.test.js @@ -0,0 +1,395 @@ +/** + * @module test/content + * Tests for content-on-node: CAS storage, CLI commands, and schema contracts. + */ + +import { describe, it, expect, beforeEach, afterEach, beforeAll } from 'vitest'; +import { mkdtemp, rm, readFile, writeFile } from 'node:fs/promises'; +import { join } from 'node:path'; +import { tmpdir } from 'node:os'; +import { execFileSync, execSync } from 'node:child_process'; +import Ajv from 'ajv/dist/2020.js'; +import { initGraph } from '../src/graph.js'; +import { writeContent, readContent, getContentMeta, hasContent, deleteContent } from '../src/content.js'; + +/** Create a temp dir with an initialized git repo. 
*/
+async function setupGitRepo() {
+  const dir = await mkdtemp(join(tmpdir(), 'gitmind-content-'));
+  execFileSync('git', ['init'], { cwd: dir, stdio: 'ignore' });
+  execFileSync('git', ['config', 'user.email', 'test@test.com'], { cwd: dir, stdio: 'ignore' });
+  execFileSync('git', ['config', 'user.name', 'Test'], { cwd: dir, stdio: 'ignore' });
+  return dir;
+}
+
+const BIN = join(import.meta.dirname, '..', 'bin', 'git-mind.js');
+const SCHEMA_DIR = join(import.meta.dirname, '..', 'docs', 'contracts', 'cli');
+
+function runCli(args, cwd) {
+  return execFileSync(process.execPath, [BIN, ...args], {
+    cwd,
+    encoding: 'utf-8',
+    timeout: 30_000,
+    env: { ...process.env, NO_COLOR: '1' },
+  });
+}
+
+function runCliJson(args, cwd) {
+  return JSON.parse(runCli(args, cwd));
+}
+
+async function loadSchema(name) {
+  return JSON.parse(await readFile(join(SCHEMA_DIR, name), 'utf-8'));
+}
+
+describe('content store core', () => {
+  let tempDir, graph;
+
+  beforeEach(async () => {
+    tempDir = await setupGitRepo();
+    graph = await initGraph(tempDir);
+
+    // Create a test node
+    const patch = await graph.createPatch();
+    patch.addNode('doc:readme');
+    patch.setProperty('doc:readme', 'title', 'README');
+    await patch.commit();
+  });
+
+  afterEach(async () => {
+    await rm(tempDir, { recursive: true, force: true });
+  });
+
+  it('writeContent stores blob and sets properties', async () => {
+    const result = await writeContent(tempDir, graph, 'doc:readme', '# Hello World\n', {
+      mime: 'text/markdown',
+    });
+
+    expect(result.nodeId).toBe('doc:readme');
+    expect(result.sha).toMatch(/^[0-9a-f]{40,64}$/);
+    expect(result.mime).toBe('text/markdown');
+    expect(result.size).toBe(Buffer.from('# Hello World\n').length);
+    expect(result.encoding).toBeUndefined();
+  });
+
+  it('readContent retrieves correct content', async () => {
+    const body = '# Hello World\n\nThis is a test document.\n';
+    await writeContent(tempDir, graph, 'doc:readme', body, { mime: 'text/markdown' });
+
+    const { content, meta } = await readContent(tempDir, graph, 'doc:readme');
+    expect(content).toBe(body);
+    expect(meta.mime).toBe('text/markdown');
+  });
+
+  it('readContent throws when blob is missing from object store', async () => {
+    await writeContent(tempDir, graph, 'doc:readme', 'original', { mime: 'text/plain' });
+
+    // Point to a valid-looking SHA that doesn't exist in the object store
+    const patch = await graph.createPatch();
+    patch.setProperty('doc:readme', '_content.sha', 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef');
+    await patch.commit();
+
+    await expect(readContent(tempDir, graph, 'doc:readme')).rejects.toThrow(/not found in git object store/);
+  });
+
+  it('readContent detects integrity mismatch on non-UTF-8 blob', async () => {
+    // Write a blob with non-UTF-8 bytes directly via git — the UTF-8
+    // round-trip in readContent will corrupt the data, producing a
+    // different hash and triggering the integrity check.
+    const binaryBuf = Buffer.from([0x80, 0x81, 0x82, 0xFF, 0xFE]);
+    const sha = execFileSync('git', ['hash-object', '-w', '--stdin'], {
+      cwd: tempDir,
+      input: binaryBuf,
+      encoding: 'utf-8',
+    }).trim();
+
+    const patch = await graph.createPatch();
+    patch.setProperty('doc:readme', '_content.sha', sha);
+    patch.setProperty('doc:readme', '_content.mime', 'application/octet-stream');
+    patch.setProperty('doc:readme', '_content.size', 5);
+    await patch.commit();
+
+    await expect(readContent(tempDir, graph, 'doc:readme')).rejects.toThrow(/integrity check failed/);
+  });
+
+  it('getContentMeta returns correct metadata', async () => {
+    await writeContent(tempDir, graph, 'doc:readme', 'test', { mime: 'text/plain' });
+
+    const meta = await getContentMeta(graph, 'doc:readme');
+    expect(meta).not.toBeNull();
+    expect(meta.sha).toMatch(/^[0-9a-f]{40,64}$/);
+    expect(meta.mime).toBe('text/plain');
+    expect(meta.size).toBe(4);
+    expect(meta.encoding).toBeUndefined();
+  });
+
+  it('getContentMeta returns null for node without content', async () => {
+    const meta = await getContentMeta(graph, 'doc:readme');
+    expect(meta).toBeNull();
+  });
+
+  it('hasContent returns true for node with content', async () => {
+    await writeContent(tempDir, graph, 'doc:readme', 'test', { mime: 'text/plain' });
+    expect(await hasContent(graph, 'doc:readme')).toBe(true);
+  });
+
+  it('hasContent returns false for node without content', async () => {
+    expect(await hasContent(graph, 'doc:readme')).toBe(false);
+  });
+
+  it('hasContent returns false for non-existent node', async () => {
+    expect(await hasContent(graph, 'doc:nonexistent')).toBe(false);
+  });
+
+  it('deleteContent removes properties', async () => {
+    await writeContent(tempDir, graph, 'doc:readme', 'test', { mime: 'text/plain' });
+    const result = await deleteContent(graph, 'doc:readme');
+
+    expect(result.removed).toBe(true);
+    expect(result.previousSha).toMatch(/^[0-9a-f]{40,64}$/);
+    expect(await hasContent(graph, 'doc:readme')).toBe(false);
+  });
+
+  it('deleteContent is idempotent on node without content', async () => {
+    const result = await deleteContent(graph, 'doc:readme');
+    expect(result.removed).toBe(false);
+    expect(result.previousSha).toBeNull();
+  });
+
+  it('writeContent fails on non-existent node', async () => {
+    await expect(
+      writeContent(tempDir, graph, 'doc:nonexistent', 'test', { mime: 'text/plain' }),
+    ).rejects.toThrow(/Node not found/);
+  });
+
+  it('readContent fails on node without content', async () => {
+    await expect(
+      readContent(tempDir, graph, 'doc:readme'),
+    ).rejects.toThrow(/No content attached/);
+  });
+
+  it('getContentMeta fails on non-existent node', async () => {
+    await expect(
+      getContentMeta(graph, 'doc:nonexistent'),
+    ).rejects.toThrow(/Node not found/);
+  });
+
+  it('deleteContent fails on non-existent node', async () => {
+    await expect(
+      deleteContent(graph, 'doc:nonexistent'),
+    ).rejects.toThrow(/Node not found/);
+  });
+
+  it('overwrite replaces content cleanly', async () => {
+    await writeContent(tempDir, graph, 'doc:readme', 'version 1', { mime: 'text/plain' });
+    await writeContent(tempDir, graph, 'doc:readme', 'version 2', { mime: 'text/markdown' });
+
+    const { content, meta } = await readContent(tempDir, graph, 'doc:readme');
+    expect(content).toBe('version 2');
+    expect(meta.mime).toBe('text/markdown');
+  });
+
+  it('handles Buffer input', async () => {
+    const buf = Buffer.from('binary-safe content', 'utf-8');
+    await writeContent(tempDir, graph, 'doc:readme', buf, { mime: 'application/octet-stream' });
+
+    const { content } = await readContent(tempDir, graph, 'doc:readme');
+    expect(content).toBe('binary-safe content');
+  });
+});
+
+describe('content CLI commands', () => {
+  let tempDir;
+
+  beforeEach(async () => {
+    tempDir = await setupGitRepo();
+
+    // Init graph and add a node
+    runCli(['init'], tempDir);
+    const graph = await initGraph(tempDir);
+    const patch = await graph.createPatch();
+    patch.addNode('doc:test');
+    patch.setProperty('doc:test', 'title', 'Test Document');
+    await patch.commit();
+
+    // Create a test file to attach
+    await writeFile(join(tempDir, 'test.md'), '# Test\n\nHello world.\n');
+    await writeFile(join(tempDir, 'data.json'), '{"key": "value"}');
+  });
+
+  afterEach(async () => {
+    await rm(tempDir, { recursive: true, force: true });
+  });
+
+  it('content set --from writes and reports success', () => {
+    const output = runCli(['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md')], tempDir);
+    expect(output).toContain('Content attached to doc:test');
+  });
+
+  it('content set --json outputs valid JSON', () => {
+    const result = runCliJson(
+      ['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md'), '--json'],
+      tempDir,
+    );
+    expect(result.command).toBe('content-set');
+    expect(result.nodeId).toBe('doc:test');
+    expect(result.sha).toMatch(/^[0-9a-f]{40,64}$/);
+    expect(result.mime).toBe('text/markdown');
+  });
+
+  it('content set detects MIME from file extension', () => {
+    const result = runCliJson(
+      ['content', 'set', 'doc:test', '--from', join(tempDir, 'data.json'), '--json'],
+      tempDir,
+    );
+    expect(result.mime).toBe('application/json');
+  });
+
+  it('content set --mime overrides detection', () => {
+    const result = runCliJson(
+      ['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md'), '--mime', 'text/plain', '--json'],
+      tempDir,
+    );
+    expect(result.mime).toBe('text/plain');
+  });
+
+  it('content show retrieves correct content', () => {
+    runCli(['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md')], tempDir);
+    const output = runCli(['content', 'show', 'doc:test', '--raw'], tempDir);
+    expect(output).toBe('# Test\n\nHello world.\n');
+  });
+
+  it('content show --json outputs full payload', () => {
+    runCli(['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md')], tempDir);
+    const result = runCliJson(['content', 'show', 'doc:test', '--json'], tempDir);
+    expect(result.command).toBe('content-show');
+    expect(result.content).toBe('# Test\n\nHello world.\n');
+    expect(result.sha).toMatch(/^[0-9a-f]{40,64}$/);
+  });
+
+  it('content meta --json returns metadata', () => {
+    runCli(['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md')], tempDir);
+    const result = runCliJson(['content', 'meta', 'doc:test', '--json'], tempDir);
+    expect(result.command).toBe('content-meta');
+    expect(result.hasContent).toBe(true);
+    expect(result.mime).toBe('text/markdown');
+  });
+
+  it('content meta --json for node without content', () => {
+    const result = runCliJson(['content', 'meta', 'doc:test', '--json'], tempDir);
+    expect(result.hasContent).toBe(false);
+    expect(result.sha).toBeUndefined();
+  });
+
+  it('content delete removes content', () => {
+    runCli(['content', 'set', 'doc:test', '--from', join(tempDir, 'test.md')], tempDir);
+    const output = runCli(['content', 'delete', 'doc:test'], tempDir);
+    expect(output).toContain('Content removed from doc:test');
+
+    // Verify gone
+    const meta = runCliJson(['content', 'meta', 'doc:test', '--json'], tempDir);
+    expect(meta.hasContent).toBe(false);
+  });
+
+  it('content delete on node without content', () => {
+    const output = runCli(['content', 'delete', 'doc:test'], tempDir);
+    expect(output).toContain('No content to remove');
+  });
+
+  it('content set --from nonexistent file throws with file error', () => {
+    expect(() => {
+      runCli(['content', 'set', 'doc:test', '--from', join(tempDir, 'nonexistent.md')], tempDir);
+    }).toThrow(/ENOENT|no such file/i);
+  });
+
+  it('content show on node without content throws with no-content error', () => {
+    expect(() => {
+      runCli(['content', 'show', 'doc:test'], tempDir);
+    }).toThrow(/No content attached/);
+  });
+
+  it('content show on non-existent node throws with not-found error', () => {
+    expect(() => {
+      runCli(['content', 'show', 'doc:nonexistent'], tempDir);
+    }).toThrow(/Node not found/);
+  });
+
+  it('content delete on non-existent node throws with not-found error', () => {
+    expect(() => {
+      runCli(['content', 'delete', 'doc:nonexistent'], tempDir);
+    }).toThrow(/Node not found/);
+  });
+});
+
+describe('content CLI schema contracts', () => {
+  let tempDir;
+  let validateSet, validateShow, validateMeta, validateDelete;
+
+  beforeAll(async () => {
+    const ajv = new Ajv({ strict: true, allErrors: true });
+    validateSet = ajv.compile(await loadSchema('content-set.schema.json'));
+    validateShow = ajv.compile(await loadSchema('content-show.schema.json'));
+    validateMeta = ajv.compile(await loadSchema('content-meta.schema.json'));
+    validateDelete = ajv.compile(await loadSchema('content-delete.schema.json'));
+  });
+
+  beforeEach(async () => {
+    tempDir = await setupGitRepo();
+
+    runCli(['init'], tempDir);
+    const graph = await initGraph(tempDir);
+    const patch = await graph.createPatch();
+    patch.addNode('doc:schema-test');
+    patch.setProperty('doc:schema-test', 'title', 'Schema Test');
+    await patch.commit();
+
+    await writeFile(join(tempDir, 'test.md'), '# Schema Test\n');
+  });
+
+  afterEach(async () => {
+    await rm(tempDir, { recursive: true, force: true });
+  });
+
+  it('content set --json validates against content-set.schema.json', () => {
+    const result = runCliJson(
+      ['content', 'set', 'doc:schema-test', '--from', join(tempDir, 'test.md'), '--json'],
+      tempDir,
+    );
+    expect(validateSet(result), JSON.stringify(validateSet.errors)).toBe(true);
+  });
+
+  it('content show --json validates against content-show.schema.json', () => {
+    runCli(['content', 'set', 'doc:schema-test', '--from', join(tempDir, 'test.md')], tempDir);
+    const result = runCliJson(['content', 'show', 'doc:schema-test', '--json'], tempDir);
+    expect(validateShow(result), JSON.stringify(validateShow.errors)).toBe(true);
+  });
+
+  it('content meta --json (with content) validates against content-meta.schema.json', () => {
+    runCli(['content', 'set', 'doc:schema-test', '--from', join(tempDir, 'test.md')], tempDir);
+    const result = runCliJson(['content', 'meta', 'doc:schema-test', '--json'], tempDir);
+    expect(validateMeta(result), JSON.stringify(validateMeta.errors)).toBe(true);
+    expect(result.hasContent).toBe(true);
+    expect(result.sha).toBeDefined();
+    expect(result.mime).toBeDefined();
+  });
+
+  it('content meta --json (no content) validates against content-meta.schema.json', () => {
+    const result = runCliJson(['content', 'meta', 'doc:schema-test', '--json'], tempDir);
+    expect(validateMeta(result), JSON.stringify(validateMeta.errors)).toBe(true);
+    expect(result.hasContent).toBe(false);
+  });
+
+  it('content delete --json validates against content-delete.schema.json', () => {
+    runCli(['content', 'set', 'doc:schema-test', '--from', join(tempDir, 'test.md')], tempDir);
+    const result = runCliJson(['content', 'delete', 'doc:schema-test', '--json'], tempDir);
+    expect(validateDelete(result), JSON.stringify(validateDelete.errors)).toBe(true);
+    expect(result.removed).toBe(true);
+    expect(result.previousSha).toMatch(/^[0-9a-f]{40,64}$/);
+  });
+
+  it('content delete --json (no content) validates against content-delete.schema.json', () => {
+    const result = runCliJson(['content', 'delete', 'doc:schema-test', '--json'], tempDir);
+    expect(validateDelete(result), JSON.stringify(validateDelete.errors)).toBe(true);
+    expect(result.removed).toBe(false);
+    expect(result.previousSha).toBeNull();
+  });
+});
diff --git a/test/contracts.test.js b/test/contracts.test.js
index 0e28bbe1..2a89dbb1 100644
--- a/test/contracts.test.js
+++ b/test/contracts.test.js
@@ -247,6 +247,39 @@ const VALID_SAMPLES = {
     name: 'test-ext',
     version: '1.0.0',
   },
+  'content-set.schema.json': {
+    schemaVersion: 1,
+    command: 'content-set',
+    nodeId: 'doc:readme',
+    sha: 'a'.repeat(40),
+    mime: 'text/markdown',
+    size: 42,
+  },
+  'content-show.schema.json': {
+    schemaVersion: 1,
+    command: 'content-show',
+    nodeId: 'doc:readme',
+    content: '# Hello World\n',
+    sha: 'a'.repeat(40),
+    mime: 'text/markdown',
+    size: 15,
+  },
+  'content-meta.schema.json': {
+    schemaVersion: 1,
+    command: 'content-meta',
+    nodeId: 'doc:readme',
+    hasContent: true,
+    sha: 'a'.repeat(40),
+    mime: 'text/markdown',
+    size: 15,
+  },
+  'content-delete.schema.json': {
+    schemaVersion: 1,
+    command: 'content-delete',
+    nodeId: 'doc:readme',
+    removed: true,
+    previousSha: 'a'.repeat(40),
+  },
 };
 
 describe('CLI JSON Schema contracts', () => {
@@ -407,6 +440,41 @@ describe('CLI JSON Schema contracts', () => {
     const validate = validators.get('suggest.schema.json');
     expect(validate(sample)).toBe(true);
   });
+
+  it('content-delete schema accepts removed: false with previousSha: null', () => {
+    const validate = validators.get('content-delete.schema.json');
+    const sample = {
+      schemaVersion: 1,
+      command: 'content-delete',
+      nodeId: 'doc:readme',
+      removed: false,
+      previousSha: null,
+    };
+    expect(validate(sample), JSON.stringify(validate.errors)).toBe(true);
+  });
+
+  it('content-meta schema accepts hasContent: false without sha/mime/size', () => {
+    const validate = validators.get('content-meta.schema.json');
+    const sample = {
+      schemaVersion: 1,
+      command: 'content-meta',
+      nodeId: 'doc:readme',
+      hasContent: false,
+    };
+    expect(validate(sample), JSON.stringify(validate.errors)).toBe(true);
+  });
+
+  it('content-meta schema rejects hasContent: false with sha present', () => {
+    const validate = validators.get('content-meta.schema.json');
+    const sample = {
+      schemaVersion: 1,
+      command: 'content-meta',
+      nodeId: 'doc:readme',
+      hasContent: false,
+      sha: 'a'.repeat(40),
+    };
+    expect(validate(sample)).toBe(false);
+  });
+});
 });