diff --git a/.git-ai/lancedb.tar.gz b/.git-ai/lancedb.tar.gz index 2a4430b..6448044 100644 --- a/.git-ai/lancedb.tar.gz +++ b/.git-ai/lancedb.tar.gz @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:edb61bd5c5d8970c2ea78ac25998c4b62126048b001e89a9bf54922ec7773d67 -size 56009 +oid sha256:540a1dc8b2ac0c2431296a8a013dfbc2c4587e076161bccc10fa33b3a2028b37 +size 66332 diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index 111c0f3..08bc02d 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -1,31 +1,33 @@ -# 开发指引 +# Development Guide -## 先决条件 -- Node.js 18+(建议 20+) -- Git(可选:git-lfs) +[简体中文](./DEVELOPMENT.zh-CN.md) | **English** -### Windows / Linux 安装注意 -- `@lancedb/lancedb` 使用 N-API 预编译包,支持 win32/linux/darwin(x64/arm64)。如果安装失败,优先确认:Node 版本 >=18 且架构是 x64/arm64。 -- `tree-sitter` / `tree-sitter-typescript` 依赖原生扩展,通常会拉取预编译包;若你的平台/Node 版本没有命中预编译包,则需要本机编译工具链: - - Windows:安装 “Visual Studio Build Tools(C++)” 与 Python(node-gyp 需要) - - Linux:安装 `build-essential`、`python3`(不同发行版包名略有差异) +## Prerequisites +- Node.js 18+ (20+ recommended) +- Git (optional: git-lfs) -## 安装依赖与构建 +### Windows / Linux Installation Notes +- `@lancedb/lancedb` uses N-API prebuilt binaries, supporting win32/linux/darwin (x64/arm64). If installation fails, first check: Node version >=18 and architecture is x64/arm64. +- `tree-sitter` / `tree-sitter-typescript` rely on native extensions, usually fetching prebuilt binaries; if your platform/Node version doesn't hit a prebuilt binary, you need a local build toolchain: + - Windows: Install "Visual Studio Build Tools (C++)" and Python (required by node-gyp) + - Linux: Install `build-essential`, `python3` (package names may vary by distro) + +## Install Dependencies & Build ```bash npm i npm run build ``` -本项目使用 TypeScript 编译输出到 `dist/`。 +This project uses TypeScript to compile output to `dist/`. -## 本地运行(开发态) +## Local Run (Development) ```bash npm run start -- --help ``` -建议用 `node dist/bin/git-ai.js ...` 验证打包后的行为: +It is recommended to use `node dist/bin/git-ai.js ...` to verify behavior after packaging: ```bash npm run build @@ -33,27 +35,27 @@ node dist/bin/git-ai.js --help node dist/bin/git-ai.js ai --help ``` -## 全局安装(本机验证) +## Global Installation (Local Verification) ```bash npm i -g . git-ai --version ``` -## 端到端测试 +## End-to-End Tests ```bash npm test ``` -测试会在临时目录创建两类仓库(Spring Boot / Vue)并验证: -- `git-ai` 代理 git 的常用命令 +Tests will create two types of repositories (Spring Boot / Vue) in a temporary directory and verify: +- `git-ai` proxies common git commands - `git-ai ai index/pack/unpack/hooks` -- MCP server 的工具暴露 +- MCP server tool exposure -## 常用开发工作流 +## Common Development Workflow -### 1) 在任意仓库里跑索引 +### 1) Run Indexing in Any Repo ```bash cd /path/to/repo @@ -61,32 +63,32 @@ git-ai ai index --overwrite git-ai ai pack ``` -### 2) 安装 hooks(让索引随提交自动更新) +### 2) Install Hooks (Auto-update index on commit) ```bash git-ai ai hooks install git-ai ai hooks status ``` -### 3) 启动 MCP Server(供 Agent 查询) +### 3) Start MCP Server (For Agent Query) ```bash git-ai ai serve ``` -如果宿主无法保证工作目录指向仓库目录,可以先让 Agent 调用 `set_repo({path: ...})`,或在工具参数里传 `path`。 +If the host cannot guarantee the working directory points to the repository directory, you can let the Agent call `set_repo({path: ...})` first, or pass `path` in tool parameters. 
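A minimal sketch of that call sequence, using the MCP tools documented in docs/mcp.md (`set_repo`, `search_symbols`); the absolute path and the query string are placeholders:

```js
// Pin the repository once so later tool calls do not depend on the server's cwd.
set_repo({ path: "/ABS/PATH/TO/REPO" })

// Subsequent calls can omit `path` and run against the pinned repository.
search_symbols({ query: "Indexer", mode: "prefix", limit: 10 })
```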
-## 发布注意事项(npm) -- 确保 `npm run build` 已生成 `dist/**` -- `package.json` 的 `files` 字段已包含 `dist/**` 与 `assets/**` -- 发布前确认未提交任何敏感信息(token/密钥) +## Publishing Notes (npm) +- Ensure `npm run build` has generated `dist/**` +- `package.json` `files` field includes `dist/**` and `assets/**` +- Confirm no sensitive info (tokens/keys) is committed before publishing -### GitHub Actions(归档 + GitHub Packages) -仓库已提供发布工作流:当推送 tag `v*` 时,会: +### GitHub Actions (Archive + GitHub Packages) +Repository provides a release workflow: when pushing tag `v*`, it will: - `npm ci` + `npm test` -- `npm pack` 生成 tgz 并作为 Release 资产上传 -- 发布到 GitHub Packages(npm.pkg.github.com) +- `npm pack` generate tgz and upload as Release asset +- Publish to GitHub Packages (npm.pkg.github.com) -说明: -- GitHub Packages 的 npm 包名需要 scope,工作流会在发布时临时把包名改为 `@/git-ai`(不修改仓库内源码包名)。 -- 如需同时发布 npmjs.org,请在仓库 Secrets 配置 `NPM_TOKEN`。 +Note: +- GitHub Packages npm package names require a scope. The workflow will temporarily change the package name to `@/git-ai` during publishing (without modifying source package.json). +- To publish to npmjs.org simultaneously, please configure `NPM_TOKEN` in repository Secrets. diff --git a/DEVELOPMENT.zh-CN.md b/DEVELOPMENT.zh-CN.md new file mode 100644 index 0000000..4e3c288 --- /dev/null +++ b/DEVELOPMENT.zh-CN.md @@ -0,0 +1,94 @@ +# 开发指引 + +**简体中文** | [English](./DEVELOPMENT.md) + +## 先决条件 +- Node.js 18+(建议 20+) +- Git(可选:git-lfs) + +### Windows / Linux 安装注意 +- `@lancedb/lancedb` 使用 N-API 预编译包,支持 win32/linux/darwin(x64/arm64)。如果安装失败,优先确认:Node 版本 >=18 且架构是 x64/arm64。 +- `tree-sitter` / `tree-sitter-typescript` 依赖原生扩展,通常会拉取预编译包;若你的平台/Node 版本没有命中预编译包,则需要本机编译工具链: + - Windows:安装 “Visual Studio Build Tools(C++)” 与 Python(node-gyp 需要) + - Linux:安装 `build-essential`、`python3`(不同发行版包名略有差异) + +## 安装依赖与构建 + +```bash +npm i +npm run build +``` + +本项目使用 TypeScript 编译输出到 `dist/`。 + +## 本地运行(开发态) + +```bash +npm run start -- --help +``` + +建议用 `node dist/bin/git-ai.js ...` 验证打包后的行为: + +```bash +npm run build +node dist/bin/git-ai.js --help +node dist/bin/git-ai.js ai --help +``` + +## 全局安装(本机验证) + +```bash +npm i -g . 
+git-ai --version +``` + +## 端到端测试 + +```bash +npm test +``` + +测试会在临时目录创建两类仓库(Spring Boot / Vue)并验证: +- `git-ai` 代理 git 的常用命令 +- `git-ai ai index/pack/unpack/hooks` +- MCP server 的工具暴露 + +## 常用开发工作流 + +### 1) 在任意仓库里跑索引 + +```bash +cd /path/to/repo +git-ai ai index --overwrite +git-ai ai pack +``` + +### 2) 安装 hooks(让索引随提交自动更新) + +```bash +git-ai ai hooks install +git-ai ai hooks status +``` + +### 3) 启动 MCP Server(供 Agent 查询) + +```bash +git-ai ai serve +``` + +如果宿主无法保证工作目录指向仓库目录,可以先让 Agent 调用 `set_repo({path: ...})`,或在工具参数里传 `path`。 + +## 发布注意事项(npm) +- 确保 `npm run build` 已生成 `dist/**` +- `package.json` 的 `files` 字段已包含 `dist/**` 与 `assets/**` +- 发布前确认未提交任何敏感信息(token/密钥) + +### GitHub Actions(归档 + GitHub Packages) +仓库已提供发布工作流:当推送 tag `v*` 时,会: +- `npm ci` + `npm test` +- `npm pack` 生成 tgz 并作为 Release 资产上传 +- 发布到 GitHub Packages(npm.pkg.github.com) + +说明: +- GitHub Packages 的 npm 包名需要 scope,工作流会在发布时临时把包名改为 `@/git-ai`(不修改仓库内源码包名)。 +- 如需同时发布 npmjs.org,请在仓库 Secrets 配置 `NPM_TOKEN`。 diff --git a/README.md b/README.md index b119194..4961384 100644 --- a/README.md +++ b/README.md @@ -5,16 +5,23 @@ [![license](https://img.shields.io/github/license/mars167/git-ai-cli)](./LICENSE) [![npm (github packages)](https://img.shields.io/npm/v/%40mars167%2Fgit-ai?registry_uri=https%3A%2F%2Fnpm.pkg.github.com)](https://github.com/mars167/git-ai-cli/packages) -`git-ai` 是一个全局命令行工具:默认行为与 `git` 保持一致(代理系统 git),同时提供 `ai` 子命令用于代码索引与检索能力。 +[🇨🇳 简体中文](./README.zh-CN.md) | **English** -## 支持语言 +`git-ai` is a global command-line tool: it defaults to behaving like `git` (proxying system git), while providing an `ai` subcommand for code indexing and retrieval capabilities. -当前索引/符号提取支持以下语言与文件后缀: -- JavaScript:`.js`、`.jsx` -- TypeScript:`.ts`、`.tsx` -- Java:`.java` +## Supported Languages -## 安装 +Current indexing/symbol extraction supports the following languages and file extensions: +- JavaScript: `.js`, `.jsx` +- TypeScript: `.ts`, `.tsx` +- Java: `.java` +- C: `.c`, `.h` +- Go: `.go` +- Python: `.py` +- PHP: `.php` +- Rust: `.rs` + +## Installation ```bash npm i -g git-ai @@ -22,16 +29,16 @@ npm i -g git-ai yarn global add git-ai ``` -## 文档 -- 开发指引:[DEVELOPMENT.md](./DEVELOPMENT.md) -- 文档中心(使用/概念/排障):[docs/README.md](./docs/README.md) -- 设计说明:[docs/design.md](./docs/design.md) -- 技术原理详解(小白向):[docs/architecture_explained.md](./docs/architecture_explained.md) -- Agent 集成(Skills/Rules):[docs/mcp.md](./docs/mcp.md) +## Documentation +- Development Guide: [DEVELOPMENT.md](./DEVELOPMENT.md) +- Documentation Center (Usage/Concepts/Troubleshooting): [docs/README.md](./docs/README.md) +- Design: [docs/design.md](./docs/zh-CN/design.md) (Chinese) +- Architecture Explained: [docs/architecture_explained.md](./docs/zh-CN/architecture_explained.md) (Chinese) +- Agent Integration (Skills/Rules): [docs/mcp.md](./docs/zh-CN/mcp.md) (Chinese) -## 基本用法(与 git 类似) +## Basic Usage (Like Git) -`git-ai` 会把大多数命令直接转发给 `git`: +`git-ai` forwards most commands directly to `git`: ```bash git-ai init @@ -41,9 +48,9 @@ git-ai commit -m "msg" git-ai push -u origin main ``` -## AI 能力 +## AI Capabilities -所有 AI 相关能力放在 `git-ai ai` 下: +All AI-related capabilities are under `git-ai ai`: ```bash git-ai ai index --overwrite @@ -55,29 +62,29 @@ git-ai ai unpack git-ai ai serve ``` -## MCP Server(stdio) +## MCP Server (stdio) -`git-ai` 提供一个基于 MCP 的 stdio Server,供 Agent/客户端以工具方式调用: -- `search_symbols`:符号检索(substring/prefix/wildcard/regex/fuzzy) -- `semantic_search`:基于 LanceDB + SQ8 的语义检索 -- `ast_graph_query`:基于 CozoDB 的 AST 图查询(CozoScript) +`git-ai` provides an 
MCP-based stdio Server for Agents/Clients to call as tools: +- `search_symbols`: Symbol retrieval (substring/prefix/wildcard/regex/fuzzy) +- `semantic_search`: Semantic retrieval based on LanceDB + SQ8 +- `ast_graph_query`: AST graph query based on CozoDB (CozoScript) -### 启动 +### Startup -建议先在目标仓库生成索引: +It is recommended to generate the index in the target repository first: ```bash git-ai ai index --overwrite ``` -然后启动 MCP Server(会在 stdio 上等待客户端连接,这是正常的): +Then start the MCP Server (it will wait for client connections on stdio, which is normal): ```bash cd /ABS/PATH/TO/REPO git-ai ai serve ``` -### Claude Desktop 配置示例 +### Claude Desktop Configuration Example ```json { @@ -90,39 +97,39 @@ git-ai ai serve } ``` -说明: -- `git-ai ai serve` 默认使用当前目录作为仓库定位起点(类似 git 的用法)。 -- 若宿主无法保证 MCP 进程的工作目录(cwd)指向仓库目录,推荐由 Agent 在首次调用前先执行一次 `set_repo({path: \"/ABS/PATH/TO/REPO\"})`,或在每次 tool 调用里传 `path` 参数。 +Note: +- `git-ai ai serve` defaults to using the current directory as the repository location (similar to git usage). +- If the host cannot guarantee that the MCP process working directory (cwd) points to the repository directory, it is recommended that the Agent execute `set_repo({path: \"/ABS/PATH/TO/REPO\"})` before the first call, or pass the `path` parameter in every tool call. -## Agent Skills / Rules(Trae) +## Agent Skills / Rules (Trae) -本仓库提供了 Agent 可直接复用的 Skill/Rule 模版: -- Skill: [./.trae/skills/git-ai-mcp/SKILL.md](./.trae/skills/git-ai-mcp/SKILL.md) -- Rule: [./.trae/rules/git-ai-mcp/RULE.md](./.trae/rules/git-ai-mcp/RULE.md) +This repository provides reusable Skill/Rule templates for Agents: +- Skill: [./.trae/skills/git-ai-mcp/SKILL.md](./.trae/skills/git-ai-mcp/SKILL.md) +- Rule: [./.trae/rules/git-ai-mcp/RULE.md](./.trae/rules/git-ai-mcp/RULE.md) -使用方式: -- 在 Trae 中打开本仓库后,Agent 会自动加载 `.trae/skills/**` 下的 Skill。 -- 需要给 Agent 加约束时,把 Rule 内容放到你的 Agent 配置/系统规则中(也可以直接引用本仓库的 `.trae/rules/**` 作为规范来源)。 +Usage: +- After opening this repository in Trae, the Agent will automatically load Skills under `.trae/skills/**`. +- When you need to add constraints to the Agent, put the Rule content into your Agent configuration/system rules (or directly reference `.trae/rules/**` in this repository as a source). -## Git hooks(提交前重建索引,push 前打包校验,checkout 自动解包) +## Git hooks (Rebuild index before commit, verify pack before push, auto unpack on checkout) -在任意 git 仓库中安装 hooks: +Install hooks in any git repository: ```bash git-ai ai hooks install git-ai ai hooks status ``` -说明: -- `pre-commit`:自动 `index --overwrite` + `pack`,并把 `.git-ai/meta.json` 与 `.git-ai/lancedb.tar.gz` 加入暂存区。 -- `pre-push`:再次 `pack`,若归档发生变化则阻止 push,提示先提交归档文件。 -- `post-checkout` / `post-merge`:若存在 `.git-ai/lancedb.tar.gz` 则自动 `unpack`。 +Explanation: +- `pre-commit`: Automatically `index --overwrite` + `pack`, and add `.git-ai/meta.json` and `.git-ai/lancedb.tar.gz` to the staging area. +- `pre-push`: `pack` again, if the archive changes, block the push and prompt to submit the archive file first. +- `post-checkout` / `post-merge`: If `.git-ai/lancedb.tar.gz` exists, automatically `unpack`. -## Git LFS(推荐,用于 .git-ai/lancedb.tar.gz) +## Git LFS (Recommended for .git-ai/lancedb.tar.gz) -为了避免把较大的索引归档直接存进 Git 历史,推荐对 `.git-ai/lancedb.tar.gz` 启用 Git LFS。 +To avoid storing large index archives directly in Git history, it is recommended to enable Git LFS for `.git-ai/lancedb.tar.gz`. 
-### 开启(一次性) +### Enable (One-time) ```bash git lfs install @@ -131,14 +138,14 @@ git add .gitattributes git commit -m "chore: track lancedb archive via git-lfs" ``` -也可以用 `git-ai` 触发(仅在已安装 git-lfs 的情况下生效): +Can also be triggered with `git-ai` (only works if git-lfs is installed): ```bash git-ai ai pack --lfs ``` -### 克隆/切分支后(如果未自动拉取 LFS) -如果你环境设置了 `GIT_LFS_SKIP_SMUDGE=1`,或发现 `.git-ai/lancedb.tar.gz` 不是有效的 gzip 文件: +### After Clone/Checkout (If LFS pull is not automatic) +If your environment has `GIT_LFS_SKIP_SMUDGE=1` set, or you find `.git-ai/lancedb.tar.gz` is not a valid gzip file: ```bash git lfs pull diff --git a/README.zh-CN.md b/README.zh-CN.md new file mode 100644 index 0000000..70fe473 --- /dev/null +++ b/README.zh-CN.md @@ -0,0 +1,156 @@ +# git-ai + +[![ci](https://github.com/mars167/git-ai-cli/actions/workflows/ci.yml/badge.svg)](https://github.com/mars167/git-ai-cli/actions/workflows/ci.yml) +[![release](https://github.com/mars167/git-ai-cli/actions/workflows/release.yml/badge.svg)](https://github.com/mars167/git-ai-cli/actions/workflows/release.yml) +[![license](https://img.shields.io/github/license/mars167/git-ai-cli)](./LICENSE) +[![npm (github packages)](https://img.shields.io/npm/v/%40mars167%2Fgit-ai?registry_uri=https%3A%2F%2Fnpm.pkg.github.com)](https://github.com/mars167/git-ai-cli/packages) + +**简体中文** | [English](./README.md) + +`git-ai` 是一个全局命令行工具:默认行为与 `git` 保持一致(代理系统 git),同时提供 `ai` 子命令用于代码索引与检索能力。 + +## 支持语言 + +当前索引/符号提取支持以下语言与文件后缀: +- JavaScript:`.js`、`.jsx` +- TypeScript:`.ts`、`.tsx` +- Java:`.java` +- C: `.c`, `.h` +- Go: `.go` +- Python: `.py` +- PHP: `.php` +- Rust: `.rs` + +## 安装 + +```bash +npm i -g git-ai +# or +yarn global add git-ai +``` + +## 文档 +- 开发指引:[DEVELOPMENT.md](./DEVELOPMENT.zh-CN.md) +- 文档中心(使用/概念/排障):[docs/README.md](./docs/zh-CN/README.md) +- 设计说明:[docs/design.md](./docs/zh-CN/design.md) +- 技术原理详解(小白向):[docs/architecture_explained.md](./docs/zh-CN/architecture_explained.md) +- Agent 集成(Skills/Rules):[docs/mcp.md](./docs/zh-CN/mcp.md) + +## 基本用法(与 git 类似) + +`git-ai` 会把大多数命令直接转发给 `git`: + +```bash +git-ai init +git-ai status +git-ai add -A +git-ai commit -m "msg" +git-ai push -u origin main +``` + +## AI 能力 + +所有 AI 相关能力放在 `git-ai ai` 下: + +```bash +git-ai ai index --overwrite +git-ai ai query Indexer --limit 10 +git-ai ai semantic "semantic search" --topk 5 +git-ai ai graph find GitAIV2MCPServer +git-ai ai pack +git-ai ai unpack +git-ai ai serve +``` + +## MCP Server(stdio) + +`git-ai` 提供一个基于 MCP 的 stdio Server,供 Agent/客户端以工具方式调用: +- `search_symbols`:符号检索(substring/prefix/wildcard/regex/fuzzy) +- `semantic_search`:基于 LanceDB + SQ8 的语义检索 +- `ast_graph_query`:基于 CozoDB 的 AST 图查询(CozoScript) + +### 启动 + +建议先在目标仓库生成索引: + +```bash +git-ai ai index --overwrite +``` + +然后启动 MCP Server(会在 stdio 上等待客户端连接,这是正常的): + +```bash +cd /ABS/PATH/TO/REPO +git-ai ai serve +``` + +### Claude Desktop 配置示例 + +```json +{ + "mcpServers": { + "git-ai": { + "command": "git-ai", + "args": ["ai", "serve"] + } + } +} +``` + +说明: +- `git-ai ai serve` 默认使用当前目录作为仓库定位起点(类似 git 的用法)。 +- 若宿主无法保证 MCP 进程的工作目录(cwd)指向仓库目录,推荐由 Agent 在首次调用前先执行一次 `set_repo({path: \"/ABS/PATH/TO/REPO\"})`,或在每次 tool 调用里传 `path` 参数。 + +## Agent Skills / Rules(Trae) + +本仓库提供了 Agent 可直接复用的 Skill/Rule 模版: +- Skill: [./.trae/skills/git-ai-mcp/SKILL.md](./.trae/skills/git-ai-mcp/SKILL.md) +- Rule: [./.trae/rules/git-ai-mcp/RULE.md](./.trae/rules/git-ai-mcp/RULE.md) + +使用方式: +- 在 Trae 中打开本仓库后,Agent 会自动加载 `.trae/skills/**` 下的 Skill。 +- 需要给 Agent 加约束时,把 Rule 内容放到你的 Agent 配置/系统规则中(也可以直接引用本仓库的 `.trae/rules/**` 
作为规范来源)。 + +## Git hooks(提交前重建索引,push 前打包校验,checkout 自动解包) + +在任意 git 仓库中安装 hooks: + +```bash +git-ai ai hooks install +git-ai ai hooks status +``` + +说明: +- `pre-commit`:自动 `index --overwrite` + `pack`,并把 `.git-ai/meta.json` 与 `.git-ai/lancedb.tar.gz` 加入暂存区。 +- `pre-push`:再次 `pack`,若归档发生变化则阻止 push,提示先提交归档文件。 +- `post-checkout` / `post-merge`:若存在 `.git-ai/lancedb.tar.gz` 则自动 `unpack`。 + +## Git LFS(推荐,用于 .git-ai/lancedb.tar.gz) + +为了避免把较大的索引归档直接存进 Git 历史,推荐对 `.git-ai/lancedb.tar.gz` 启用 Git LFS。 + +### 开启(一次性) + +```bash +git lfs install +git lfs track ".git-ai/lancedb.tar.gz" +git add .gitattributes +git commit -m "chore: track lancedb archive via git-lfs" +``` + +也可以用 `git-ai` 触发(仅在已安装 git-lfs 的情况下生效): + +```bash +git-ai ai pack --lfs +``` + +### 克隆/切分支后(如果未自动拉取 LFS) +如果你环境设置了 `GIT_LFS_SKIP_SMUDGE=1`,或发现 `.git-ai/lancedb.tar.gz` 不是有效的 gzip 文件: + +```bash +git lfs pull +``` + +## License + +[MIT](./LICENSE) diff --git a/docs/README.md b/docs/README.md index f4c3ca6..3346e76 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,38 +1,39 @@ -# 文档中心 +# Documentation Center -这里汇集了 `git-ai` 的所有文档。 +This collects all documentation for `git-ai`. -## 概览 +## Overview -`git-ai` 是一个全局 CLI: -- 默认行为像 `git`:`git-ai status/commit/push/...` 会代理到系统 `git` -- AI 能力放在 `git-ai ai ...`:索引、检索、归档、hooks、MCP Server +`git-ai` is a global CLI: +- Default behavior acts like `git`: `git-ai status/commit/push/...` proxies to system `git`. +- AI capabilities are under `git-ai ai ...`: Indexing, Retrieval, Packing, Hooks, MCP Server. -### 核心目标 -- 把代码仓的结构化索引放在 `.git-ai/` 下,并可通过归档文件 `.git-ai/lancedb.tar.gz` 分享 -- 让 Agent 通过 MCP tools 低成本命中符号/片段,再按需读取文件 +### Core Goals +- Store structured code repository indexes under `.git-ai/`, shareable via archive `.git-ai/lancedb.tar.gz`. +- Enable Agents to hit symbols/snippets via MCP tools at low cost, then read files as needed. -### 重要目录 -- `.git-ai/meta.json`:索引元数据(本地生成,通常不提交) -- `.git-ai/lancedb/`:本地向量索引目录(通常不提交) -- `.git-ai/lancedb.tar.gz`:归档后的索引(可提交/可用 git-lfs 追踪) -- `.git-ai/ast-graph.sqlite`:AST 图数据库(CozoDB) -- `.git-ai/ast-graph.export.json`:AST 图导出快照(用于非 SQLite 后端跨进程复用) +### Important Directories +- `.git-ai/meta.json`: Index metadata (locally generated, usually not committed). +- `.git-ai/lancedb/`: Local vector index directory (usually not committed). +- `.git-ai/lancedb.tar.gz`: Archived index (can be committed/tracked via git-lfs). +- `.git-ai/ast-graph.sqlite`: AST graph database (CozoDB). +- `.git-ai/ast-graph.export.json`: AST graph export snapshot (for non-SQLite backend cross-process reuse). 
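To make the "hit symbols via MCP tools first, read files afterwards" goal concrete, here is a sketch of the two-stage lookup an Agent could run over these indexes (tool names and parameters as documented in docs/mcp.md; the query strings are examples only):

```js
// Cheap structural lookup first: find candidate definitions by symbol name.
search_symbols({ query: "GitAIV2MCPServer", mode: "substring", limit: 10 })

// Fall back to semantic retrieval when the exact name is unknown,
// then read only the files referenced in the returned hits.
semantic_search({ query: "where is the MCP server started", topk: 5 })
```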
-## 目录 +## Directory -### 使用指引 -- [安装与快速开始](./quickstart.md) -- [命令行使用](./cli.md) -- [Hooks 工作流](./hooks.md) -- [MCP Server 接入](./mcp.md) -- [Manifest Workspace 支持](./manifests.md) -- [排障](./troubleshooting.md) +### Usage Guides +- [Installation & Quick Start](./zh-CN/quickstart.md) (Chinese) +- [Windows Setup Guide](./windows-setup.md) +- [CLI Usage](./zh-CN/cli.md) (Chinese) +- [Hooks Workflow](./zh-CN/hooks.md) (Chinese) +- [MCP Server Integration](./zh-CN/mcp.md) (Chinese) +- [Manifest Workspace Support](./zh-CN/manifests.md) (Chinese) +- [Troubleshooting](./zh-CN/troubleshooting.md) (Chinese) -### 进阶与原理 -- [进阶:索引归档与 LFS](./advanced.md) -- [架构设计](./design.md) -- [开发规则](./rules.md) +### Advanced & Principles +- [Advanced: Index Archiving & LFS](./zh-CN/advanced.md) (Chinese) +- [Architecture Design](./zh-CN/design.md) (Chinese) +- [Development Rules](./zh-CN/rules.md) (Chinese) -## Agent 集成 -- [MCP Skill & Rule 模版](./mcp.md#agent-skills--rules) +## Agent Integration +- [MCP Skill & Rule Templates](./zh-CN/mcp.md#agent-skills--rules) (Chinese) diff --git a/docs/windows-setup.md b/docs/windows-setup.md new file mode 100644 index 0000000..d376642 --- /dev/null +++ b/docs/windows-setup.md @@ -0,0 +1,68 @@ +# Windows Development and Installation Guide + +[简体中文](./zh-CN/windows-setup.md) | **English** + +This guide describes how to set up the development environment for `git-ai` on Windows, specifically for the multi-language support (C, Go, Python, PHP, Rust). + +## Prerequisites + +1. **Node.js**: Install Node.js (LTS version recommended) from [nodejs.org](https://nodejs.org/). +2. **Git**: Install Git for Windows from [git-scm.com](https://git-scm.com/). + +## Build Tools for Native Dependencies + +`git-ai` relies on libraries with native bindings: +* `tree-sitter`: For code parsing (C++) +* `cozo-node`: Graph database engine (Rust/C++) + +While these libraries typically provide prebuilt binaries, you may need to build from source in certain environments (e.g., mismatched Node versions or specific architectures). Therefore, setting up a build environment is recommended. + +### Option 1: Install via Admin PowerShell (Recommended) + +Open PowerShell as Administrator and run: + +```powershell +npm install --global --production windows-build-tools +``` + +*Note: This package is sometimes deprecated or problematic. If it hangs or fails, use Option 2.* + +### Option 2: Manual Installation + +1. **Python**: Install Python 3 from [python.org](https://www.python.org/) or the Microsoft Store. +2. **Visual Studio Build Tools**: + * Download [Visual Studio Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/). + * Run the installer and select the **"Desktop development with C++"** workload. + * Ensure "MSVC ... C++ x64/x86 build tools" and "Windows 10/11 SDK" are selected. + +## Installation + +Once prerequisites are met: + +```bash +git clone https://github.com/mars167/git-ai-cli.git +cd git-ai-cli-v2 +npm install +npm run build +``` + +## Running Examples + +To verify support for different languages, you can run the parsing test: + +```bash +npx ts-node test/verify_parsing.ts +``` + +To fully develop with the polyglot examples, you may need to install the respective language runtimes: + +* **C**: Install MinGW or use MSVC (cl.exe). +* **Go**: Install from [go.dev](https://go.dev/dl/). +* **Python**: [python.org](https://www.python.org/). +* **PHP**: Download from [windows.php.net](https://windows.php.net/download/). Add to PATH. 
+* **Rust**: Install via [rustup.rs](https://rustup.rs/). + +## Troubleshooting + +* **node-gyp errors**: Ensure Python and Visual Studio Build Tools are correctly installed and in PATH. You can configure npm to use a specific python version: `npm config set python python3`. +* **Path issues**: Ensure `git-ai` binary or `npm bin` is in your PATH if running globally. diff --git a/docs/DESIGN.md b/docs/zh-CN/DESIGN.md similarity index 100% rename from docs/DESIGN.md rename to docs/zh-CN/DESIGN.md diff --git a/docs/zh-CN/README.md b/docs/zh-CN/README.md new file mode 100644 index 0000000..6c3f31a --- /dev/null +++ b/docs/zh-CN/README.md @@ -0,0 +1,41 @@ +# 文档中心 + +[**English**](../README.md) | 简体中文 + +这里汇集了 `git-ai` 的所有文档。 + +## 概览 + +`git-ai` 是一个全局 CLI: +- 默认行为像 `git`:`git-ai status/commit/push/...` 会代理到系统 `git` +- AI 能力放在 `git-ai ai ...`:索引、检索、归档、hooks、MCP Server + +### 核心目标 +- 把代码仓的结构化索引放在 `.git-ai/` 下,并可通过归档文件 `.git-ai/lancedb.tar.gz` 分享 +- 让 Agent 通过 MCP tools 低成本命中符号/片段,再按需读取文件 + +### 重要目录 +- `.git-ai/meta.json`:索引元数据(本地生成,通常不提交) +- `.git-ai/lancedb/`:本地向量索引目录(通常不提交) +- `.git-ai/lancedb.tar.gz`:归档后的索引(可提交/可用 git-lfs 追踪) +- `.git-ai/ast-graph.sqlite`:AST 图数据库(CozoDB) +- `.git-ai/ast-graph.export.json`:AST 图导出快照(用于非 SQLite 后端跨进程复用) + +## 目录 + +### 使用指引 +- [安装与快速开始](./quickstart.md) +- [Windows 开发与安装指引](./windows-setup.md) +- [命令行使用](./cli.md) +- [Hooks 工作流](./hooks.md) +- [MCP Server 接入](./mcp.md) +- [Manifest Workspace 支持](./manifests.md) +- [排障](./troubleshooting.md) + +### 进阶与原理 +- [进阶:索引归档与 LFS](./advanced.md) +- [架构设计](./design.md) +- [开发规则](./rules.md) + +## Agent 集成 +- [MCP Skill & Rule 模版](./mcp.md#agent-skills--rules) diff --git a/docs/advanced.md b/docs/zh-CN/advanced.md similarity index 100% rename from docs/advanced.md rename to docs/zh-CN/advanced.md diff --git a/docs/architecture_explained.md b/docs/zh-CN/architecture_explained.md similarity index 100% rename from docs/architecture_explained.md rename to docs/zh-CN/architecture_explained.md diff --git a/docs/cli.md b/docs/zh-CN/cli.md similarity index 73% rename from docs/cli.md rename to docs/zh-CN/cli.md index 8456835..2241945 100644 --- a/docs/cli.md +++ b/docs/zh-CN/cli.md @@ -26,6 +26,23 @@ git-ai ai hooks install git-ai ai serve ``` +## RepoMap(全局鸟瞰,可选) + +为了支持类似 aider 的 repomap 能力(重要文件/符号排名、上下文映射、引导 Wiki 关联阅读),repo map 被集成到 **已有检索命令** 中,默认不输出,避免增加输出体积与 token 消耗。 + +在需要时,显式开启: + +```bash +git-ai ai query "HelloController" --with-repo-map --repo-map-files 20 --repo-map-symbols 5 +git-ai ai semantic "where is auth handled" --with-repo-map +``` + +参数说明: +- `--with-repo-map`:在 JSON 输出中附加 `repo_map` 字段 +- `--repo-map-files `:repo map 展示的文件数量上限(默认 20) +- `--repo-map-symbols `:每个文件展示的符号上限(默认 5) +- `--wiki `:指定 Wiki 目录(默认自动探测 `docs/wiki` 或 `wiki`) + ## 符号搜索模式(ai query) `git-ai ai query` 默认是子串搜索;当你的输入包含 `*` / `?` 时,或显式指定 `--mode`,可以启用更适合 code agent 的搜索模式: diff --git a/docs/graph_scenarios.md b/docs/zh-CN/graph_scenarios.md similarity index 100% rename from docs/graph_scenarios.md rename to docs/zh-CN/graph_scenarios.md diff --git a/docs/hooks.md b/docs/zh-CN/hooks.md similarity index 100% rename from docs/hooks.md rename to docs/zh-CN/hooks.md diff --git a/docs/manifests.md b/docs/zh-CN/manifests.md similarity index 100% rename from docs/manifests.md rename to docs/zh-CN/manifests.md diff --git a/docs/mcp.md b/docs/zh-CN/mcp.md similarity index 81% rename from docs/mcp.md rename to docs/zh-CN/mcp.md index d4c603e..a925d52 100644 --- a/docs/mcp.md +++ b/docs/zh-CN/mcp.md @@ -19,14 +19,14 @@ git-ai ai serve - `set_repo({ path 
})`:设置默认仓库路径,避免依赖进程工作目录 ### 索引管理 -- `index_repo({ path?, dim?, overwrite? })`:构建/更新索引 - `check_index({ path? })`:检查索引结构是否与当前版本一致(不一致需重建索引) - `pack_index({ path?, lfs? })`:打包索引为 `.git-ai/lancedb.tar.gz`(可选启用 git-lfs track) - `unpack_index({ path? })`:解包索引归档 ### 检索 -- `search_symbols({ query, mode?, case_insensitive?, max_candidates?, limit?, lang?, path? })`:符号检索(lang: auto/all/java/ts) -- `semantic_search({ query, topk?, lang?, path? })`:基于 LanceDB + SQ8 的语义检索(lang: auto/all/java/ts) +- `search_symbols({ query, mode?, case_insensitive?, max_candidates?, limit?, lang?, path?, with_repo_map?, repo_map_max_files?, repo_map_max_symbols?, wiki_dir? })`:符号检索(lang: auto/all/java/ts;可选附带 repo_map) +- `semantic_search({ query, topk?, lang?, path?, with_repo_map?, repo_map_max_files?, repo_map_max_symbols?, wiki_dir? })`:基于 LanceDB + SQ8 的语义检索(lang: auto/all/java/ts;可选附带 repo_map) +- `repo_map({ path?, max_files?, max_symbols?, wiki_dir? })`:生成 repo map(重要文件/符号排名、引导 Wiki 阅读) - `ast_graph_find({ prefix, limit?, lang?, path? })`:按名字前缀查找符号定义(大小写不敏感;lang: auto/all/java/ts) - `ast_graph_children({ id, as_file?, path? })`:列出包含关系的直接子节点(文件→顶层符号、类→方法等) - `ast_graph_refs({ name, limit?, lang?, path? })`:按名字查引用位置(call/new/type;lang: auto/all/java/ts) @@ -67,6 +67,21 @@ ast_graph_chain({ name: "greet", direction: "upstream", max_depth: 3 }) - 第一次调用先 `set_repo({path: "/ABS/PATH/TO/REPO"})` - 后续工具调用不传 `path`(走默认仓库) +## RepoMap 使用建议 + +repo map 用于给 Agent 一个“全局鸟瞰 + 导航入口”(重要文件/符号 + Wiki 关联),建议作为分析前置步骤: + +```js +repo_map({ max_files: 20, max_symbols: 5 }) +``` + +如果你希望在一次检索结果里顺带附加 repo map(默认关闭,避免输出膨胀): + +```js +search_symbols({ query: "Foo", limit: 20, with_repo_map: true, repo_map_max_files: 20, repo_map_max_symbols: 5 }) +semantic_search({ query: "where is auth handled", topk: 5, with_repo_map: true }) +``` + ## Agent Skills / Rules 本仓库提供了 Agent 可直接复用的 Skill/Rule 模版,旨在让 Agent 能够遵循最佳实践来使用上述工具。 @@ -87,8 +102,9 @@ ast_graph_chain({ name: "greet", direction: "upstream", max_depth: 3 }) - `search_symbols` / `semantic_search` 没结果或明显过时 - 用户刚改了大量文件/刚切分支/刚合并 -调用: -- `index_repo({ overwrite: true, dim: 256 })` +建议: +- 用 `check_index({})` 判断索引结构是否兼容 +- 用 CLI 重建索引:`git-ai ai index --overwrite` - 如需共享索引:`pack_index({ lfs: false })` #### 检视套路(推荐顺序) diff --git a/docs/quickstart.md b/docs/zh-CN/quickstart.md similarity index 100% rename from docs/quickstart.md rename to docs/zh-CN/quickstart.md diff --git a/docs/RULES.md b/docs/zh-CN/rules.md similarity index 100% rename from docs/RULES.md rename to docs/zh-CN/rules.md diff --git a/docs/troubleshooting.md b/docs/zh-CN/troubleshooting.md similarity index 100% rename from docs/troubleshooting.md rename to docs/zh-CN/troubleshooting.md diff --git a/docs/zh-CN/windows-setup.md b/docs/zh-CN/windows-setup.md new file mode 100644 index 0000000..9d62558 --- /dev/null +++ b/docs/zh-CN/windows-setup.md @@ -0,0 +1,68 @@ +# Windows 开发与安装指引 + +**简体中文** | [English](../windows-setup.md) + +本指引介绍如何在 Windows 上设置 `git-ai` 的开发环境,特别是针对多语言支持(C、Go、Python、PHP、Rust)。 + +## 前置条件 + +1. **Node.js**: 从 [nodejs.org](https://nodejs.org/) 安装 Node.js (推荐 LTS 版本)。 +2. **Git**: 从 [git-scm.com](https://git-scm.com/) 安装 Git for Windows。 + +## 原生依赖构建工具 + +`git-ai` 依赖以下包含原生绑定的库: +* `tree-sitter`: 用于代码解析 (C++) +* `cozo-node`: 图数据库引擎 (Rust/C++) + +虽然这些库通常提供预编译二进制包,但在某些环境(如 Node 版本不匹配或特定系统架构)下可能需要从源码编译。因此建议准备好编译环境。 + +### 选项 1: 通过管理员 PowerShell 安装 (推荐) + +以管理员身份打开 PowerShell 并运行: + +```powershell +npm install --global --production windows-build-tools +``` + +*注意:此包有时会过时或有问题。如果卡住或失败,请使用选项 2。* + +### 选项 2: 手动安装 + +1. 
**Python**: 从 [python.org](https://www.python.org/) 或 Microsoft Store 安装 Python 3。 +2. **Visual Studio Build Tools**: + * 下载 [Visual Studio Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/)。 + * 运行安装程序并选择 **"Desktop development with C++" (使用 C++ 的桌面开发)** 工作负载。 + * 确保选中 "MSVC ... C++ x64/x86 build tools" 和 "Windows 10/11 SDK"。 + +## 安装 + +满足前置条件后: + +```bash +git clone https://github.com/mars167/git-ai-cli.git +cd git-ai-cli-v2 +npm install +npm run build +``` + +## 运行示例 + +要验证对不同语言的支持,可以运行解析测试: + +```bash +npx ts-node test/verify_parsing.ts +``` + +要完整开发多语言示例,你可能需要安装各自的语言运行时: + +* **C**: 安装 MinGW 或使用 MSVC (cl.exe)。 +* **Go**: 从 [go.dev](https://go.dev/dl/) 安装。 +* **Python**: [python.org](https://www.python.org/)。 +* **PHP**: 从 [windows.php.net](https://windows.php.net/download/) 下载。添加到 PATH。 +* **Rust**: 通过 [rustup.rs](https://rustup.rs/) 安装。 + +## 排障 + +* **node-gyp 错误**: 确保 Python 和 Visual Studio Build Tools 已正确安装并在 PATH 中。你可以配置 npm 使用特定 python 版本:`npm config set python python3`。 +* **路径问题**: 如果全局运行,确保 `git-ai` 二进制文件或 `npm bin` 在你的 PATH 中。 diff --git a/examples/polyglot-repo/main.c b/examples/polyglot-repo/main.c new file mode 100644 index 0000000..32ff31f --- /dev/null +++ b/examples/polyglot-repo/main.c @@ -0,0 +1,10 @@ +#include + +void hello() { + printf("Hello from C\n"); +} + +int main() { + hello(); + return 0; +} diff --git a/examples/polyglot-repo/main.go b/examples/polyglot-repo/main.go new file mode 100644 index 0000000..2e81b2b --- /dev/null +++ b/examples/polyglot-repo/main.go @@ -0,0 +1,11 @@ +package main + +import "fmt" + +func Hello() { + fmt.Println("Hello from Go") +} + +func main() { + Hello() +} diff --git a/examples/polyglot-repo/main.php b/examples/polyglot-repo/main.php new file mode 100644 index 0000000..dd64353 --- /dev/null +++ b/examples/polyglot-repo/main.php @@ -0,0 +1,15 @@ +greet(); diff --git a/examples/polyglot-repo/main.py b/examples/polyglot-repo/main.py new file mode 100644 index 0000000..99d0db7 --- /dev/null +++ b/examples/polyglot-repo/main.py @@ -0,0 +1,11 @@ +def hello(): + print("Hello from Python") + +class Greeter: + def greet(self): + print("Greetings") + +if __name__ == "__main__": + hello() + g = Greeter() + g.greet() diff --git a/examples/polyglot-repo/main.rs b/examples/polyglot-repo/main.rs new file mode 100644 index 0000000..87e2706 --- /dev/null +++ b/examples/polyglot-repo/main.rs @@ -0,0 +1,17 @@ +fn hello() { + println!("Hello from Rust"); +} + +struct Greeter; + +impl Greeter { + fn greet(&self) { + println!("Greetings"); + } +} + +fn main() { + hello(); + let g = Greeter; + g.greet(); +} diff --git a/package-lock.json b/package-lock.json index 2f8572a..6eb8c77 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "git-ai", - "version": "1.0.0", + "version": "1.1.1", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "git-ai", - "version": "1.0.0", + "version": "1.1.1", "license": "MIT", "dependencies": { "@lancedb/lancedb": "0.22.3", @@ -15,13 +15,17 @@ "@types/node": "^25.0.9", "apache-arrow": "18.1.0", "commander": "^14.0.2", - "cozo-lib-wasm": "0.7.6", "fs-extra": "^11.3.3", "glob": "^13.0.0", "simple-git": "^3.30.0", "tar": "^7.5.3", "tree-sitter": "^0.21.1", + "tree-sitter-c": "^0.21.4", + "tree-sitter-go": "^0.21.2", "tree-sitter-java": "^0.21.0", + "tree-sitter-php": "^0.20.0", + "tree-sitter-python": "^0.21.0", + "tree-sitter-rust": "^0.21.0", "tree-sitter-typescript": "^0.23.2", "ts-node": "^10.9.2", "typescript": "^5.9.3", @@ -1802,6 +1806,12 @@ 
"integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", "license": "MIT" }, + "node_modules/nan": { + "version": "2.24.0", + "resolved": "https://registry.npmmirror.com/nan/-/nan-2.24.0.tgz", + "integrity": "sha512-Vpf9qnVW1RaDkoNKFUvfxqAbtI8ncb8OJlqZ9wwpXzWPEsvsB1nvdUi6oYrHIkQ1Y/tMDnr1h4nczS0VB9Xykg==", + "license": "MIT" + }, "node_modules/negotiator": { "version": "1.0.0", "resolved": "https://registry.npmmirror.com/negotiator/-/negotiator-1.0.0.tgz", @@ -2474,6 +2484,44 @@ "node-gyp-build": "^4.8.0" } }, + "node_modules/tree-sitter-c": { + "version": "0.21.4", + "resolved": "https://registry.npmmirror.com/tree-sitter-c/-/tree-sitter-c-0.21.4.tgz", + "integrity": "sha512-IahxFIhXiY15SUlrt2upBiKSBGdOaE1fjKLK1Ik5zxqGHf6T1rvr3IJrovbsE5sXhypx7Hnmf50gshsppaIihA==", + "hasInstallScript": true, + "license": "MIT", + "dependencies": { + "node-addon-api": "^8.0.0", + "node-gyp-build": "^4.8.1" + }, + "peerDependencies": { + "tree-sitter": "^0.21.0" + }, + "peerDependenciesMeta": { + "tree_sitter": { + "optional": true + } + } + }, + "node_modules/tree-sitter-go": { + "version": "0.21.2", + "resolved": "https://registry.npmmirror.com/tree-sitter-go/-/tree-sitter-go-0.21.2.tgz", + "integrity": "sha512-aMFwjsB948nWhURiIxExK8QX29JYKs96P/IfXVvluVMRJZpL04SREHsdOZHYqJr1whkb7zr3/gWHqqvlkczmvw==", + "hasInstallScript": true, + "license": "MIT", + "dependencies": { + "node-addon-api": "^8.1.0", + "node-gyp-build": "^4.8.1" + }, + "peerDependencies": { + "tree-sitter": "^0.21.0" + }, + "peerDependenciesMeta": { + "tree_sitter": { + "optional": true + } + } + }, "node_modules/tree-sitter-java": { "version": "0.21.0", "resolved": "https://registry.npmmirror.com/tree-sitter-java/-/tree-sitter-java-0.21.0.tgz", @@ -2512,6 +2560,66 @@ } } }, + "node_modules/tree-sitter-php": { + "version": "0.20.0", + "resolved": "https://registry.npmmirror.com/tree-sitter-php/-/tree-sitter-php-0.20.0.tgz", + "integrity": "sha512-di7d1jjAu05Hj9AufXlUQzTHGaThemU2HV9MjE17HnbW8TfuwNzH0Q0BQuJLrIJipjK9bhqhNe1fS9wtkyUkYg==", + "hasInstallScript": true, + "license": "MIT", + "dependencies": { + "nan": "^2.18.0" + } + }, + "node_modules/tree-sitter-python": { + "version": "0.21.0", + "resolved": "https://registry.npmmirror.com/tree-sitter-python/-/tree-sitter-python-0.21.0.tgz", + "integrity": "sha512-IUKx7JcTVbByUx1iHGFS/QsIjx7pqwTMHL9bl/NGyhyyydbfNrpruo2C7W6V4KZrbkkCOlX8QVrCoGOFW5qecg==", + "hasInstallScript": true, + "license": "MIT", + "dependencies": { + "node-addon-api": "^7.1.0", + "node-gyp-build": "^4.8.0" + }, + "peerDependencies": { + "tree-sitter": "^0.21.0" + }, + "peerDependenciesMeta": { + "tree_sitter": { + "optional": true + } + } + }, + "node_modules/tree-sitter-python/node_modules/node-addon-api": { + "version": "7.1.1", + "resolved": "https://registry.npmmirror.com/node-addon-api/-/node-addon-api-7.1.1.tgz", + "integrity": "sha512-5m3bsyrjFWE1xf7nz7YXdN4udnVtXK6/Yfgn5qnahL6bCkf2yKt4k3nuTKAtT4r3IG8JNR2ncsIMdZuAzJjHQQ==", + "license": "MIT" + }, + "node_modules/tree-sitter-rust": { + "version": "0.21.0", + "resolved": "https://registry.npmmirror.com/tree-sitter-rust/-/tree-sitter-rust-0.21.0.tgz", + "integrity": "sha512-unVr73YLn3VC4Qa/GF0Nk+Wom6UtI526p5kz9Rn2iZSqwIFedyCZ3e0fKCEmUJLIPGrTb/cIEdu3ZUNGzfZx7A==", + "hasInstallScript": true, + "license": "MIT", + "dependencies": { + "node-addon-api": "^7.1.0", + "node-gyp-build": "^4.8.0" + }, + "peerDependencies": { + "tree-sitter": "^0.21.1" + }, + "peerDependenciesMeta": { + "tree_sitter": { + "optional": true + } 
+ } + }, + "node_modules/tree-sitter-rust/node_modules/node-addon-api": { + "version": "7.1.1", + "resolved": "https://registry.npmmirror.com/node-addon-api/-/node-addon-api-7.1.1.tgz", + "integrity": "sha512-5m3bsyrjFWE1xf7nz7YXdN4udnVtXK6/Yfgn5qnahL6bCkf2yKt4k3nuTKAtT4r3IG8JNR2ncsIMdZuAzJjHQQ==", + "license": "MIT" + }, "node_modules/tree-sitter-typescript": { "version": "0.23.2", "resolved": "https://registry.npmmirror.com/tree-sitter-typescript/-/tree-sitter-typescript-0.23.2.tgz", diff --git a/package.json b/package.json index 2d10626..bf005f6 100644 --- a/package.json +++ b/package.json @@ -11,7 +11,8 @@ "scripts": { "build": "tsc", "start": "ts-node bin/git-ai.ts", - "test": "npm run build && node --test" + "test": "npm run build && node --test", + "test:parser": "ts-node test/verify_parsing.ts" }, "files": [ "dist/**", @@ -46,7 +47,12 @@ "simple-git": "^3.30.0", "tar": "^7.5.3", "tree-sitter": "^0.21.1", + "tree-sitter-c": "^0.21.4", + "tree-sitter-go": "^0.21.2", "tree-sitter-java": "^0.21.0", + "tree-sitter-php": "^0.20.0", + "tree-sitter-python": "^0.21.0", + "tree-sitter-rust": "^0.21.0", "tree-sitter-typescript": "^0.23.2", "ts-node": "^10.9.2", "typescript": "^5.9.3", diff --git a/src/commands/query.ts b/src/commands/query.ts index 7e01357..12160de 100644 --- a/src/commands/query.ts +++ b/src/commands/query.ts @@ -1,11 +1,13 @@ import { Command } from 'commander'; import path from 'path'; +import fs from 'fs-extra'; import { inferWorkspaceRoot, resolveGitRoot } from '../core/git'; import { defaultDbDir, openTablesByLang } from '../core/lancedb'; import { queryManifestWorkspace } from '../core/workspace'; import { buildCoarseWhere, filterAndRankSymbolRows, inferSymbolSearchMode, pickCoarseToken, SymbolSearchMode } from '../core/symbolSearch'; import { createLogger } from '../core/log'; import { checkIndex, resolveLangs } from '../core/indexCheck'; +import { generateRepoMap, type FileRank } from '../core/repoMap'; export const queryCommand = new Command('query') .description('Query refs table by symbol match (substring/prefix/wildcard/regex/fuzzy)') @@ -16,6 +18,10 @@ export const queryCommand = new Command('query') .option('--case-insensitive', 'Case-insensitive matching', false) .option('--max-candidates ', 'Max candidates to fetch before filtering', '1000') .option('--lang ', 'Language: auto|all|java|ts', 'auto') + .option('--with-repo-map', 'Attach a lightweight repo map (ranked files + top symbols + wiki links)', false) + .option('--repo-map-files ', 'Max repo map files', '20') + .option('--repo-map-symbols ', 'Max repo map symbols per file', '5') + .option('--wiki ', 'Wiki directory (default: docs/wiki or wiki)', '') .action(async (keyword, options) => { const log = createLogger({ component: 'cli', cmd: 'ai query' }); const startedAt = Date.now(); @@ -27,6 +33,7 @@ export const queryCommand = new Command('query') const caseInsensitive = Boolean(options.caseInsensitive ?? false); const maxCandidates = Math.max(limit, Number(options.maxCandidates ?? Math.min(2000, limit * 20))); const langSel = String(options.lang ?? 'auto'); + const withRepoMap = Boolean((options as any).withRepoMap ?? false); if (inferWorkspaceRoot(repoRoot)) { const coarse = (mode === 'substring' || mode === 'prefix') ? 
q : pickCoarseToken(q); @@ -38,7 +45,8 @@ export const queryCommand = new Command('query') : res.rows; const rows = filterAndRankSymbolRows(filteredByLang, { query: q, mode, caseInsensitive, limit }); log.info('query_symbols', { ok: true, repoRoot, workspace: true, mode, case_insensitive: caseInsensitive, limit, max_candidates: maxCandidates, candidates: res.rows.length, rows: rows.length, duration_ms: Date.now() - startedAt }); - console.log(JSON.stringify({ ...res, rows }, null, 2)); + const repoMap = withRepoMap ? { enabled: false, skippedReason: 'workspace_mode_not_supported' } : undefined; + console.log(JSON.stringify({ ...res, rows, ...(repoMap ? { repo_map: repoMap } : {}) }, null, 2)); return; } @@ -71,9 +79,35 @@ export const queryCommand = new Command('query') } const rows = filterAndRankSymbolRows(candidates as any[], { query: q, mode, caseInsensitive, limit }); log.info('query_symbols', { ok: true, repoRoot, workspace: false, lang: langSel, langs, mode, case_insensitive: caseInsensitive, limit, max_candidates: maxCandidates, candidates: candidates.length, rows: rows.length, duration_ms: Date.now() - startedAt }); - console.log(JSON.stringify({ repoRoot, count: rows.length, lang: langSel, rows }, null, 2)); + const repoMap = withRepoMap ? await buildRepoMapAttachment(repoRoot, options) : undefined; + console.log(JSON.stringify({ repoRoot, count: rows.length, lang: langSel, rows, ...(repoMap ? { repo_map: repoMap } : {}) }, null, 2)); } catch (e) { log.error('query_symbols', { ok: false, duration_ms: Date.now() - startedAt, err: e instanceof Error ? { name: e.name, message: e.message, stack: e.stack } : { message: String(e) } }); process.exit(1); } }); + +async function buildRepoMapAttachment(repoRoot: string, options: any): Promise<{ enabled: boolean; wikiDir: string; files: FileRank[] } | { enabled: boolean; skippedReason: string }> { + try { + const wikiDir = resolveWikiDir(repoRoot, String(options.wiki ?? '')); + const files = await generateRepoMap({ + repoRoot, + maxFiles: Number(options.repoMapFiles ?? 20), + maxSymbolsPerFile: Number(options.repoMapSymbols ?? 5), + wikiDir, + }); + return { enabled: true, wikiDir, files }; + } catch (e: any) { + return { enabled: false, skippedReason: String(e?.message ?? e) }; + } +} + +function resolveWikiDir(repoRoot: string, wikiOpt: string): string { + const w = String(wikiOpt ?? 
'').trim(); + if (w) return path.resolve(repoRoot, w); + const candidates = [path.join(repoRoot, 'docs', 'wiki'), path.join(repoRoot, 'wiki')]; + for (const c of candidates) { + if (fs.existsSync(c)) return c; + } + return ''; +} diff --git a/src/commands/semantic.ts b/src/commands/semantic.ts index 782a5e4..6169ec2 100644 --- a/src/commands/semantic.ts +++ b/src/commands/semantic.ts @@ -1,10 +1,12 @@ import { Command } from 'commander'; import path from 'path'; +import fs from 'fs-extra'; import { resolveGitRoot } from '../core/git'; import { defaultDbDir, openTablesByLang } from '../core/lancedb'; import { buildQueryVector, scoreAgainst } from '../core/search'; import { createLogger } from '../core/log'; import { checkIndex, resolveLangs } from '../core/indexCheck'; +import { generateRepoMap, type FileRank } from '../core/repoMap'; export const semanticCommand = new Command('semantic') .description('Semantic search using SQ8 vectors (brute-force over chunks)') @@ -12,11 +14,16 @@ export const semanticCommand = new Command('semantic') .option('-p, --path ', 'Path inside the repository', '.') .option('-k, --topk ', 'Top K results', '10') .option('--lang ', 'Language: auto|all|java|ts', 'auto') + .option('--with-repo-map', 'Attach a lightweight repo map (ranked files + top symbols + wiki links)', false) + .option('--repo-map-files ', 'Max repo map files', '20') + .option('--repo-map-symbols ', 'Max repo map symbols per file', '5') + .option('--wiki ', 'Wiki directory (default: docs/wiki or wiki)', '') .action(async (text, options) => { const log = createLogger({ component: 'cli', cmd: 'ai semantic' }); const startedAt = Date.now(); try { const repoRoot = await resolveGitRoot(path.resolve(options.path)); + const withRepoMap = Boolean((options as any).withRepoMap ?? false); const status = await checkIndex(repoRoot); if (!status.ok) { process.stderr.write(JSON.stringify({ ...status, ok: false, reason: 'index_incompatible' }, null, 2) + '\n'); @@ -87,9 +94,35 @@ export const semanticCommand = new Command('semantic') })); log.info('semantic_search', { ok: true, repoRoot, topk: k, lang: langSel, langs, chunks: totalChunks, hits: hits.length, duration_ms: Date.now() - startedAt }); - console.log(JSON.stringify({ repoRoot, topk: k, lang: langSel, hits }, null, 2)); + const repoMap = withRepoMap ? await buildRepoMapAttachment(repoRoot, options) : undefined; + console.log(JSON.stringify({ repoRoot, topk: k, lang: langSel, hits, ...(repoMap ? { repo_map: repoMap } : {}) }, null, 2)); } catch (e) { log.error('semantic_search', { ok: false, duration_ms: Date.now() - startedAt, err: e instanceof Error ? { name: e.name, message: e.message, stack: e.stack } : { message: String(e) } }); process.exit(1); } }); + +async function buildRepoMapAttachment(repoRoot: string, options: any): Promise<{ enabled: boolean; wikiDir: string; files: FileRank[] } | { enabled: boolean; skippedReason: string }> { + try { + const wikiDir = resolveWikiDir(repoRoot, String(options.wiki ?? '')); + const files = await generateRepoMap({ + repoRoot, + maxFiles: Number(options.repoMapFiles ?? 20), + maxSymbolsPerFile: Number(options.repoMapSymbols ?? 5), + wikiDir, + }); + return { enabled: true, wikiDir, files }; + } catch (e: any) { + return { enabled: false, skippedReason: String(e?.message ?? e) }; + } +} + +function resolveWikiDir(repoRoot: string, wikiOpt: string): string { + const w = String(wikiOpt ?? 
'').trim(); + if (w) return path.resolve(repoRoot, w); + const candidates = [path.join(repoRoot, 'docs', 'wiki'), path.join(repoRoot, 'wiki')]; + for (const c of candidates) { + if (fs.existsSync(c)) return c; + } + return ''; +} diff --git a/src/core/parser.ts b/src/core/parser.ts index 9001e7d..c169459 100644 --- a/src/core/parser.ts +++ b/src/core/parser.ts @@ -1,217 +1,60 @@ import Parser from 'tree-sitter'; -import TypeScript from 'tree-sitter-typescript'; -import Java from 'tree-sitter-java'; import fs from 'fs-extra'; -import { AstReference, AstRefKind, ParseResult, SymbolInfo } from './types'; +import { ParseResult } from './types'; +import { LanguageAdapter } from './parser/adapter'; +import { TypeScriptAdapter } from './parser/typescript'; +import { JavaAdapter } from './parser/java'; +import { CAdapter } from './parser/c'; +import { GoAdapter } from './parser/go'; +import { PythonAdapter } from './parser/python'; +import { PHPAdapter } from './parser/php'; +import { RustAdapter } from './parser/rust'; export class CodeParser { private parser: Parser; + private adapters: LanguageAdapter[]; constructor() { this.parser = new Parser(); + this.adapters = [ + new TypeScriptAdapter(false), + new TypeScriptAdapter(true), + new JavaAdapter(), + new CAdapter(), + new GoAdapter(), + new PythonAdapter(), + new PHPAdapter(), + new RustAdapter(), + ]; } async parseFile(filePath: string): Promise { const content = await fs.readFile(filePath, 'utf-8'); - const language = this.pickLanguage(filePath); - if (!language) return { symbols: [], refs: [] }; + const adapter = this.pickAdapter(filePath); + if (!adapter) return { symbols: [], refs: [] }; - this.parser.setLanguage(language.language); + this.parser.setLanguage(adapter.getTreeSitterLanguage()); try { const tree = this.parser.parse(content); - return this.extractSymbolsAndRefs(tree.rootNode, language.id); + return adapter.extractSymbolsAndRefs(tree.rootNode); } catch (e: any) { const msg = String(e?.message ?? e); if (!msg.includes('Invalid argument')) return { symbols: [], refs: [] }; try { const tree = this.parser.parse(content, undefined, { bufferSize: 1024 * 1024 }); - return this.extractSymbolsAndRefs(tree.rootNode, language.id); + return adapter.extractSymbolsAndRefs(tree.rootNode); } catch { return { symbols: [], refs: [] }; } } } - private pickLanguage(filePath: string): { id: 'typescript' | 'java'; language: any } | null { - if (filePath.endsWith('.ts') || filePath.endsWith('.js')) { - return { id: 'typescript', language: TypeScript.typescript }; - } - if (filePath.endsWith('.tsx') || filePath.endsWith('.jsx')) { - return { id: 'typescript', language: TypeScript.tsx }; - } - if (filePath.endsWith('.java')) { - return { id: 'java', language: Java as any }; + private pickAdapter(filePath: string): LanguageAdapter | null { + for (const adapter of this.adapters) { + for (const ext of adapter.getSupportedFileExtensions()) { + if (filePath.endsWith(ext)) return adapter; + } } return null; } - - private extractSymbolsAndRefs(node: Parser.SyntaxNode, languageId: 'typescript' | 'java'): ParseResult { - const symbols: SymbolInfo[] = []; - const refs: AstReference[] = []; - - const parseHeritage = (head: string): { extends?: string[]; implements?: string[] } => { - const out: { extends?: string[]; implements?: string[] } = {}; - const extendsMatch = head.match(/\bextends\s+([A-Za-z0-9_$.<>\[\]]+)/); - if (extendsMatch?.[1]) out.extends = [extendsMatch[1]]; - - const implMatch = head.match(/\bimplements\s+([A-Za-z0-9_$. 
,<>\[\]]+)/); - if (implMatch?.[1]) { - const raw = implMatch[1]; - const parts: string[] = []; - let current = ''; - let depth = 0; - for (const char of raw) { - if (char === '<') depth++; - else if (char === '>') depth--; - - if (char === ',' && depth === 0) { - if (current.trim()) parts.push(current.trim()); - current = ''; - } else { - current += char; - } - } - if (current.trim()) parts.push(current.trim()); - - if (parts.length > 0) out.implements = parts; - } - return out; - }; - - const pushRef = (name: string, refKind: AstRefKind, n: Parser.SyntaxNode) => { - const nm = String(name ?? '').trim(); - if (!nm) return; - refs.push({ - name: nm, - refKind, - line: n.startPosition.row + 1, - column: n.startPosition.column + 1, - }); - }; - - const findFirstByType = (n: Parser.SyntaxNode, types: string[]): Parser.SyntaxNode | null => { - if (types.includes(n.type)) return n; - for (let i = 0; i < n.childCount; i++) { - const c = n.child(i); - if (!c) continue; - const found = findFirstByType(c, types); - if (found) return found; - } - return null; - }; - - const extractTsCalleeName = (callee: Parser.SyntaxNode | null): string | null => { - if (!callee) return null; - if (callee.type === 'identifier') return callee.text; - if (callee.type === 'member_expression' || callee.type === 'optional_chain') { - const prop = callee.childForFieldName('property'); - if (prop) return prop.text; - const last = callee.namedChild(callee.namedChildCount - 1); - if (last) return last.text; - } - return null; - }; - - const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { - if (languageId === 'typescript') { - if (n.type === 'call_expression') { - const fn = n.childForFieldName('function') ?? n.namedChild(0); - const callee = extractTsCalleeName(fn); - if (callee) pushRef(callee, 'call', fn ?? n); - } else if (n.type === 'new_expression') { - const ctor = n.childForFieldName('constructor') ?? n.namedChild(0); - const callee = extractTsCalleeName(ctor); - if (callee) pushRef(callee, 'new', ctor ?? n); - } else if (n.type === 'type_identifier') { - pushRef(n.text, 'type', n); - } - - if (n.type === 'function_declaration' || n.type === 'method_definition') { - const nameNode = n.childForFieldName('name'); - if (nameNode) { - symbols.push({ - name: nameNode.text, - kind: n.type === 'method_definition' ? 'method' : 'function', - startLine: n.startPosition.row + 1, - endLine: n.endPosition.row + 1, - signature: n.text.split('{')[0].trim(), - container: n.type === 'method_definition' ? 
container : undefined, - }); - } - } else if (n.type === 'class_declaration') { - const nameNode = n.childForFieldName('name'); - if (nameNode) { - const head = n.text.split('{')[0].trim(); - const heritage = parseHeritage(head); - const classSym: SymbolInfo = { - name: nameNode.text, - kind: 'class', - startLine: n.startPosition.row + 1, - endLine: n.endPosition.row + 1, - signature: `class ${nameNode.text}`, - container, - extends: heritage.extends, - implements: heritage.implements, - }; - symbols.push(classSym); - for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, classSym); - return; - } - } - } else { - if (n.type === 'method_invocation') { - const nameNode = n.childForFieldName('name'); - if (nameNode) pushRef(nameNode.text, 'call', nameNode); - } else if (n.type === 'object_creation_expression') { - const typeNode = findFirstByType(n, ['type_identifier', 'identifier']); - if (typeNode) pushRef(typeNode.text, 'new', typeNode); - } - - if (n.type === 'method_declaration' || n.type === 'constructor_declaration') { - const nameNode = n.childForFieldName('name'); - if (nameNode) { - const head = n.text.split('{')[0].split(';')[0].trim(); - symbols.push({ - name: nameNode.text, - kind: 'method', - startLine: n.startPosition.row + 1, - endLine: n.endPosition.row + 1, - signature: head, - container, - }); - } - } else if ( - n.type === 'class_declaration' - || n.type === 'interface_declaration' - || n.type === 'enum_declaration' - || n.type === 'record_declaration' - || n.type === 'annotation_type_declaration' - ) { - const nameNode = n.childForFieldName('name'); - if (nameNode) { - const head = n.text.split('{')[0].split(';')[0].trim(); - const heritage = parseHeritage(head); - const classSym: SymbolInfo = { - name: nameNode.text, - kind: 'class', - startLine: n.startPosition.row + 1, - endLine: n.endPosition.row + 1, - signature: `${n.type.replace(/_declaration$/, '')} ${nameNode.text}`, - container, - extends: heritage.extends, - implements: heritage.implements, - }; - symbols.push(classSym); - for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, classSym); - return; - } - } - } - - for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, container); - }; - - traverse(node, undefined); - return { symbols, refs }; - } } diff --git a/src/core/parser/adapter.ts b/src/core/parser/adapter.ts new file mode 100644 index 0000000..3ab7e21 --- /dev/null +++ b/src/core/parser/adapter.ts @@ -0,0 +1,9 @@ +import Parser from 'tree-sitter'; +import { ParseResult } from '../types'; + +export interface LanguageAdapter { + getLanguageId(): string; + getTreeSitterLanguage(): any; + getSupportedFileExtensions(): string[]; + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult; +} diff --git a/src/core/parser/c.ts b/src/core/parser/c.ts new file mode 100644 index 0000000..11e43f6 --- /dev/null +++ b/src/core/parser/c.ts @@ -0,0 +1,92 @@ +import Parser from 'tree-sitter'; +import C from 'tree-sitter-c'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef } from './utils'; + +export class CAdapter implements LanguageAdapter { + getLanguageId(): string { + return 'c'; + } + + getTreeSitterLanguage(): any { + return C as any; + } + + getSupportedFileExtensions(): string[] { + return ['.c', '.h']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => 
{ + if (n.type === 'call_expression') { + const fn = n.childForFieldName('function'); + if (fn) pushRef(refs, fn.text, 'call', fn); + } else if (n.type === 'type_identifier') { + pushRef(refs, n.text, 'type', n); + } + + let currentContainer = container; + + if (n.type === 'function_definition') { + const declarator = n.childForFieldName('declarator'); + const nameNode = this.findIdentifier(declarator); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'function', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'struct_specifier') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `struct ${nameNode.text}`, + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } + + private findIdentifier(node: Parser.SyntaxNode | null): Parser.SyntaxNode | null { + if (!node) return null; + if (node.type === 'identifier') return node; + // recursive search, but limit depth or prioritize 'declarator' fields? + // In C, function_declarator has 'declarator' field. + if (node.type === 'function_declarator' || node.type === 'pointer_declarator' || node.type === 'parenthesized_declarator') { + const decl = node.childForFieldName('declarator'); + if (decl) return this.findIdentifier(decl); + // if no named field, just check children + for (let i = 0; i < node.childCount; i++) { + const res = this.findIdentifier(node.child(i)); + if (res) return res; + } + } + return null; + } + + private getSignature(node: Parser.SyntaxNode): string { + return node.text.split('{')[0].trim(); + } +} diff --git a/src/core/parser/go.ts b/src/core/parser/go.ts new file mode 100644 index 0000000..c5e3f9a --- /dev/null +++ b/src/core/parser/go.ts @@ -0,0 +1,98 @@ +import Parser from 'tree-sitter'; +import Go from 'tree-sitter-go'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef } from './utils'; + +export class GoAdapter implements LanguageAdapter { + getLanguageId(): string { + return 'go'; + } + + getTreeSitterLanguage(): any { + return Go as any; + } + + getSupportedFileExtensions(): string[] { + return ['.go']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { + if (n.type === 'call_expression') { + const fn = n.childForFieldName('function'); + const nameNode = this.getCallNameNode(fn); + if (nameNode) pushRef(refs, nameNode.text, 'call', nameNode); + } else if (n.type === 'type_identifier') { + pushRef(refs, n.text, 'type', n); + } + + let currentContainer = container; + + if (n.type === 'function_declaration') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'function', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + 
currentContainer = newSymbol; + } + } else if (n.type === 'method_declaration') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'method', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'type_spec') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `type ${nameNode.text}`, + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } + + private getCallNameNode(node: Parser.SyntaxNode | null): Parser.SyntaxNode | null { + if (!node) return null; + if (node.type === 'identifier') return node; + if (node.type === 'selector_expression') { + return node.childForFieldName('field'); + } + return null; + } + + private getSignature(node: Parser.SyntaxNode): string { + return node.text.split('{')[0].trim(); + } +} diff --git a/src/core/parser/java.ts b/src/core/parser/java.ts new file mode 100644 index 0000000..8c6c2f8 --- /dev/null +++ b/src/core/parser/java.ts @@ -0,0 +1,82 @@ +import Parser from 'tree-sitter'; +import Java from 'tree-sitter-java'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef, parseHeritage, findFirstByType } from './utils'; + +export class JavaAdapter implements LanguageAdapter { + getLanguageId(): string { + return 'java'; + } + + getTreeSitterLanguage(): any { + return Java as any; + } + + getSupportedFileExtensions(): string[] { + return ['.java']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { + if (n.type === 'method_invocation') { + const nameNode = n.childForFieldName('name'); + if (nameNode) pushRef(refs, nameNode.text, 'call', nameNode); + } else if (n.type === 'object_creation_expression') { + const typeNode = findFirstByType(n, ['type_identifier', 'identifier']); + if (typeNode) pushRef(refs, typeNode.text, 'new', typeNode); + } + + let currentContainer = container; + + if (n.type === 'method_declaration' || n.type === 'constructor_declaration') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const head = n.text.split('{')[0].split(';')[0].trim(); + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'method', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: head, + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if ( + n.type === 'class_declaration' + || n.type === 'interface_declaration' + || n.type === 'enum_declaration' + || n.type === 'record_declaration' + || n.type === 'annotation_type_declaration' + ) { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const head = n.text.split('{')[0].split(';')[0].trim(); + const heritage = parseHeritage(head); + const classSym: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, +
endLine: n.endPosition.row + 1, + signature: `${n.type.replace(/_declaration$/, '')} ${nameNode.text}`, + container, + extends: heritage.extends, + implements: heritage.implements, + }; + symbols.push(classSym); + currentContainer = classSym; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } +} diff --git a/src/core/parser/php.ts b/src/core/parser/php.ts new file mode 100644 index 0000000..2ab67b5 --- /dev/null +++ b/src/core/parser/php.ts @@ -0,0 +1,101 @@ +import Parser from 'tree-sitter'; +import PHP from 'tree-sitter-php'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef } from './utils'; + +export class PHPAdapter implements LanguageAdapter { + getLanguageId(): string { + return 'php'; + } + + getTreeSitterLanguage(): any { + return (PHP as any).php_only || (PHP as any).php || PHP; + } + + getSupportedFileExtensions(): string[] { + return ['.php']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { + if (n.type === 'call_expression') { + const fn = n.childForFieldName('function'); + const nameNode = this.getCallNameNode(fn); + if (nameNode) pushRef(refs, nameNode.text, 'call', nameNode); + } else if (n.type === 'member_call_expression') { + const nameNode = n.childForFieldName('name'); + if (nameNode) pushRef(refs, nameNode.text, 'call', nameNode); + } else if (n.type === 'object_creation_expression') { + const typeNode = n.childForFieldName('type'); + if (typeNode && (typeNode.type === 'name' || typeNode.type === 'qualified_name')) { + pushRef(refs, typeNode.text, 'new', typeNode); + } + } + + let currentContainer = container; + + if (n.type === 'function_definition') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'function', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'method_declaration') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'method', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'class_declaration' || n.type === 'interface_declaration' || n.type === 'trait_declaration') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `class ${nameNode.text}`, + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } + + private getCallNameNode(node: Parser.SyntaxNode | null): Parser.SyntaxNode | null { + if (!node) return null; + if (node.type === 'name' || node.type === 'qualified_name') return node; + return null; + } + + private getSignature(node: 
Parser.SyntaxNode): string { + return node.text.split('{')[0].trim(); + } +} diff --git a/src/core/parser/python.ts b/src/core/parser/python.ts new file mode 100644 index 0000000..5512194 --- /dev/null +++ b/src/core/parser/python.ts @@ -0,0 +1,83 @@ +import Parser from 'tree-sitter'; +import Python from 'tree-sitter-python'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef } from './utils'; + +export class PythonAdapter implements LanguageAdapter { + getLanguageId(): string { + return 'python'; + } + + getTreeSitterLanguage(): any { + return Python as any; + } + + getSupportedFileExtensions(): string[] { + return ['.py']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { + if (n.type === 'call') { + const fn = n.childForFieldName('function'); + const nameNode = this.getCallNameNode(fn); + if (nameNode) pushRef(refs, nameNode.text, 'call', nameNode); + } + + let currentContainer = container; + + if (n.type === 'function_definition') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const kind = container?.kind === 'class' ? 'method' : 'function'; + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind, + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'class_definition') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `class ${nameNode.text}`, + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } + + private getCallNameNode(node: Parser.SyntaxNode | null): Parser.SyntaxNode | null { + if (!node) return null; + if (node.type === 'identifier') return node; + if (node.type === 'attribute') { + return node.childForFieldName('attribute'); + } + return null; + } + + private getSignature(node: Parser.SyntaxNode): string { + return node.text.split(':')[0].trim(); + } +} diff --git a/src/core/parser/rust.ts b/src/core/parser/rust.ts new file mode 100644 index 0000000..d4853c5 --- /dev/null +++ b/src/core/parser/rust.ts @@ -0,0 +1,103 @@ +import Parser from 'tree-sitter'; +import Rust from 'tree-sitter-rust'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef } from './utils'; + +export class RustAdapter implements LanguageAdapter { + getLanguageId(): string { + return 'rust'; + } + + getTreeSitterLanguage(): any { + return Rust as any; + } + + getSupportedFileExtensions(): string[] { + return ['.rs']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { + if (n.type === 'call_expression') { + const fn = n.childForFieldName('function'); + const nameNode = this.getCallNameNode(fn); + if (nameNode) pushRef(refs, nameNode.text, 'call', nameNode); + } else if 
(n.type === 'type_identifier') { + pushRef(refs, n.text, 'type', n); + } + + let currentContainer = container; + + if (n.type === 'function_item') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + // If container is class (impl block or struct), it's a method + const kind = container?.kind === 'class' ? 'method' : 'function'; + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: kind, + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: this.getSignature(n), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'struct_item' || n.type === 'enum_item' || n.type === 'trait_item') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `${n.type.replace(/_item$/, '')} ${nameNode.text}`, + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'impl_item') { + const typeNode = n.childForFieldName('type'); + if (typeNode) { + const newSymbol: SymbolInfo = { + name: typeNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `impl ${typeNode.text}`, + container: container + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } + + private getCallNameNode(node: Parser.SyntaxNode | null): Parser.SyntaxNode | null { + if (!node) return null; + if (node.type === 'identifier') return node; + if (node.type === 'scoped_identifier') { + return node.childForFieldName('name'); + } + if (node.type === 'field_expression') { + return node.childForFieldName('field'); + } + return null; + } + + private getSignature(node: Parser.SyntaxNode): string { + return node.text.split('{')[0].trim(); + } +} diff --git a/src/core/parser/typescript.ts b/src/core/parser/typescript.ts new file mode 100644 index 0000000..e463026 --- /dev/null +++ b/src/core/parser/typescript.ts @@ -0,0 +1,93 @@ +import Parser from 'tree-sitter'; +import TypeScript from 'tree-sitter-typescript'; +import { LanguageAdapter } from './adapter'; +import { ParseResult, SymbolInfo, AstReference } from '../types'; +import { pushRef, parseHeritage } from './utils'; + +export class TypeScriptAdapter implements LanguageAdapter { + constructor(private isTsx: boolean = false) {} + + getLanguageId(): string { + return 'typescript'; + } + + getTreeSitterLanguage(): any { + return this.isTsx ? TypeScript.tsx : TypeScript.typescript; + } + + getSupportedFileExtensions(): string[] { + return this.isTsx ? 
['.tsx', '.jsx'] : ['.ts', '.js', '.mjs', '.cjs']; + } + + extractSymbolsAndRefs(node: Parser.SyntaxNode): ParseResult { + const symbols: SymbolInfo[] = []; + const refs: AstReference[] = []; + + const extractTsCalleeName = (callee: Parser.SyntaxNode | null): string | null => { + if (!callee) return null; + if (callee.type === 'identifier') return callee.text; + if (callee.type === 'member_expression' || callee.type === 'optional_chain') { + const prop = callee.childForFieldName('property'); + if (prop) return prop.text; + const last = callee.namedChild(callee.namedChildCount - 1); + if (last) return last.text; + } + return null; + }; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { + if (n.type === 'call_expression') { + const fn = n.childForFieldName('function') ?? n.namedChild(0); + const callee = extractTsCalleeName(fn); + if (callee) pushRef(refs, callee, 'call', fn ?? n); + } else if (n.type === 'new_expression') { + const ctor = n.childForFieldName('constructor') ?? n.namedChild(0); + const callee = extractTsCalleeName(ctor); + if (callee) pushRef(refs, callee, 'new', ctor ?? n); + } else if (n.type === 'type_identifier') { + pushRef(refs, n.text, 'type', n); + } + + let currentContainer = container; + + if (n.type === 'function_declaration' || n.type === 'method_definition') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const newSymbol: SymbolInfo = { + name: nameNode.text, + kind: n.type === 'method_definition' ? 'method' : 'function', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: n.text.split('{')[0].trim(), + container: container, + }; + symbols.push(newSymbol); + currentContainer = newSymbol; + } + } else if (n.type === 'class_declaration') { + const nameNode = n.childForFieldName('name'); + if (nameNode) { + const head = n.text.split('{')[0].trim(); + const heritage = parseHeritage(head); + const classSym: SymbolInfo = { + name: nameNode.text, + kind: 'class', + startLine: n.startPosition.row + 1, + endLine: n.endPosition.row + 1, + signature: `class ${nameNode.text}`, + container, + extends: heritage.extends, + implements: heritage.implements, + }; + symbols.push(classSym); + currentContainer = classSym; + } + } + + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, currentContainer); + }; + + traverse(node, undefined); + return { symbols, refs }; + } +} diff --git a/src/core/parser/utils.ts b/src/core/parser/utils.ts new file mode 100644 index 0000000..9b4eedf --- /dev/null +++ b/src/core/parser/utils.ts @@ -0,0 +1,53 @@ +import Parser from 'tree-sitter'; +import { AstRefKind, AstReference } from '../types'; + +export const pushRef = (refs: AstReference[], name: string, refKind: AstRefKind, n: Parser.SyntaxNode) => { + const nm = String(name ?? 
'').trim(); + if (!nm) return; + refs.push({ + name: nm, + refKind, + line: n.startPosition.row + 1, + column: n.startPosition.column + 1, + }); +}; + +export const findFirstByType = (n: Parser.SyntaxNode, types: string[]): Parser.SyntaxNode | null => { + if (types.includes(n.type)) return n; + for (let i = 0; i < n.childCount; i++) { + const c = n.child(i); + if (!c) continue; + const found = findFirstByType(c, types); + if (found) return found; + } + return null; +}; + +export const parseHeritage = (head: string): { extends?: string[]; implements?: string[] } => { + const out: { extends?: string[]; implements?: string[] } = {}; + const extendsMatch = head.match(/\bextends\s+([A-Za-z0-9_$.<>\[\]]+)/); + if (extendsMatch?.[1]) out.extends = [extendsMatch[1]]; + + const implMatch = head.match(/\bimplements\s+([A-Za-z0-9_$. ,<>\[\]]+)/); + if (implMatch?.[1]) { + const raw = implMatch[1]; + const parts: string[] = []; + let current = ''; + let depth = 0; + for (const char of raw) { + if (char === '<') depth++; + else if (char === '>') depth--; + + if (char === ',' && depth === 0) { + if (current.trim()) parts.push(current.trim()); + current = ''; + } else { + current += char; + } + } + if (current.trim()) parts.push(current.trim()); + + if (parts.length > 0) out.implements = parts; + } + return out; +}; diff --git a/src/core/repoMap.ts b/src/core/repoMap.ts new file mode 100644 index 0000000..0dda6bd --- /dev/null +++ b/src/core/repoMap.ts @@ -0,0 +1,202 @@ +import { runAstGraphQuery } from './astGraphQuery'; +import path from 'path'; +import fs from 'fs-extra'; + +export interface RepoMapOptions { + repoRoot: string; + maxFiles?: number; + maxSymbolsPerFile?: number; + wikiDir?: string; +} + +export interface SymbolRank { + id: string; + name: string; + kind: string; + file: string; + rank: number; + signature?: string; + start_line: number; + end_line: number; +} + +export interface FileRank { + path: string; + rank: number; + symbols: SymbolRank[]; + wikiLink?: string; +} + +export async function generateRepoMap(options: RepoMapOptions): Promise<FileRank[]> { + const { repoRoot, maxFiles = 20, maxSymbolsPerFile = 5, wikiDir } = options; + + const symbolsQuery = `?[ref_id, file, name, kind, signature, start_line, end_line] := *ast_symbol{ref_id, file, name, kind, signature, start_line, end_line}`; + const symbolsRes = await runAstGraphQuery(repoRoot, symbolsQuery); + const symbolsRaw = Array.isArray(symbolsRes?.rows) ? symbolsRes.rows : []; + + const symbolMap = new Map(); + for (const row of symbolsRaw) { + symbolMap.set(row[0], { + id: row[0], + file: row[1], + name: row[2], + kind: row[3], + signature: row[4], + start_line: row[5], + end_line: row[6], + inDegree: 0, + outEdges: new Set(), + }); + } + + const relationsQuery = ` + ?[from_id, to_id] := *ast_call_name{caller_id: from_id, callee_name: name}, *ast_symbol{ref_id: to_id, name} + ?[from_id, to_id] := *ast_ref_name{from_id, name}, *ast_symbol{ref_id: to_id, name} + `; + const relationsRes = await runAstGraphQuery(repoRoot, relationsQuery); + const relationsRaw = Array.isArray(relationsRes?.rows) ? 
relationsRes.rows : []; + + for (const [fromId, toId] of relationsRaw) { + if (symbolMap.has(fromId) && symbolMap.has(toId) && fromId !== toId) { + const fromNode = symbolMap.get(fromId); + const toNode = symbolMap.get(toId); + if (!fromNode.outEdges.has(toId)) { + fromNode.outEdges.add(toId); + toNode.inDegree += 1; + } + } + } + + const nodes = Array.from(symbolMap.values()); + const N = nodes.length; + if (N === 0) return []; + + let ranks = new Map(); + nodes.forEach(n => ranks.set(n.id, 1 / N)); + + const damping = 0.85; + const iterations = 10; + + for (let i = 0; i < iterations; i++) { + const newRanks = new Map(); + nodes.forEach(n => newRanks.set(n.id, (1 - damping) / N)); + + for (const node of nodes) { + const currentRank = ranks.get(node.id)!; + if (node.outEdges.size > 0) { + const share = (currentRank * damping) / node.outEdges.size; + for (const targetId of node.outEdges) { + newRanks.set(targetId, newRanks.get(targetId)! + share); + } + } else { + const share = (currentRank * damping) / N; + for (const n2 of nodes) { + newRanks.set(n2.id, newRanks.get(n2.id)! + share); + } + } + } + ranks = newRanks; + } + + const fileMap = new Map(); + for (const node of nodes) { + const rank = ranks.get(node.id)!; + if (!fileMap.has(node.file)) { + fileMap.set(node.file, { rank: 0, symbols: [] }); + } + const fileInfo = fileMap.get(node.file)!; + fileInfo.rank += rank; + fileInfo.symbols.push({ + id: node.id, + name: node.name, + kind: node.kind, + file: node.file, + rank: rank, + signature: node.signature, + start_line: node.start_line, + end_line: node.end_line, + }); + } + + let wikiPages: Array<{ file: string; content: string }> = []; + if (wikiDir && fs.existsSync(wikiDir)) { + const files = fs.readdirSync(wikiDir).filter(f => f.endsWith('.md') && f !== 'index.md'); + wikiPages = files.map(f => ({ + file: f, + content: fs.readFileSync(path.join(wikiDir, f), 'utf8').toLowerCase(), + })); + } + + const result: FileRank[] = Array.from(fileMap.entries()) + .map(([filePath, info]) => { + const sortedSymbols = info.symbols + .sort((a, b) => b.rank - a.rank) + .slice(0, maxSymbolsPerFile); + + let wikiLink: string | undefined; + const baseName = path.basename(filePath, path.extname(filePath)).toLowerCase(); + + const matchedByFile = wikiPages.find(p => p.file.toLowerCase().includes(baseName)); + if (matchedByFile) { + wikiLink = matchedByFile.file; + } else { + const mentioner = wikiPages.find(p => + p.content.includes(baseName) || + sortedSymbols.some(s => s.name.length > 3 && p.content.includes(s.name.toLowerCase())) + ); + if (mentioner) { + wikiLink = mentioner.file; + } + } + + return { + path: filePath, + rank: info.rank, + symbols: sortedSymbols, + wikiLink, + }; + }) + .sort((a, b) => b.rank - a.rank) + .slice(0, maxFiles); + + return result; +} + +export function formatRepoMap(fileRanks: FileRank[]): string { + if (fileRanks.length === 0) return 'No symbols found to map.'; + + let output = 'Repository Map (ranked by importance)\n'; + output += '====================================\n\n'; + + for (const file of fileRanks) { + output += `${file.path} (score: ${(file.rank * 100).toFixed(2)})\n`; + if (file.wikiLink) { + output += ` wiki: ${file.wikiLink}\n`; + } + for (const sym of file.symbols) { + const indent = ' '; + const kindIcon = getKindIcon(sym.kind); + output += `${indent}${kindIcon} ${sym.name} [L${sym.start_line}]\n`; + } + output += '\n'; + } + + return output; +} + +function getKindIcon(kind: string): string { + switch (kind.toLowerCase()) { + case 'function': + case 
'method': + return 'ƒ'; + case 'class': + return '©'; + case 'interface': + return 'ɪ'; + case 'variable': + case 'constant': + return 'ν'; + default: + return '•'; + } +} diff --git a/src/mcp/server.ts b/src/mcp/server.ts index 30148eb..affd537 100644 --- a/src/mcp/server.ts +++ b/src/mcp/server.ts @@ -17,6 +17,7 @@ import { sha256Hex } from '../core/crypto'; import { toPosixPath } from '../core/paths'; import { createLogger } from '../core/log'; import { checkIndex, resolveLangs } from '../core/indexCheck'; +import { generateRepoMap, type FileRank } from '../core/repoMap'; export interface GitAIV2MCPServerOptions { disableAccessLog?: boolean; @@ -101,6 +102,10 @@ export class GitAIV2MCPServer { lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' }, path: { type: 'string', description: 'Repository path (optional)' }, limit: { type: 'number', default: 50 }, + with_repo_map: { type: 'boolean', default: false }, + repo_map_max_files: { type: 'number', default: 20 }, + repo_map_max_symbols: { type: 'number', default: 5 }, + wiki_dir: { type: 'string', description: 'Wiki dir relative to repo root (optional)' }, }, required: ['query'], }, @@ -115,10 +120,27 @@ export class GitAIV2MCPServer { path: { type: 'string', description: 'Repository path (optional)' }, topk: { type: 'number', default: 10 }, lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' }, + with_repo_map: { type: 'boolean', default: false }, + repo_map_max_files: { type: 'number', default: 20 }, + repo_map_max_symbols: { type: 'number', default: 5 }, + wiki_dir: { type: 'string', description: 'Wiki dir relative to repo root (optional)' }, }, required: ['query'], }, }, + { + name: 'repo_map', + description: 'Generate a lightweight repository map (ranked files + top symbols + wiki links)', + inputSchema: { + type: 'object', + properties: { + path: { type: 'string', description: 'Repository path (optional)' }, + max_files: { type: 'number', default: 20 }, + max_symbols: { type: 'number', default: 5 }, + wiki_dir: { type: 'string', description: 'Wiki dir relative to repo root (optional)' }, + }, + }, + }, { name: 'check_index', description: 'Check whether the repository index structure matches current expected schema', @@ -510,6 +532,15 @@ export class GitAIV2MCPServer { } const repoRootForDispatch = await this.resolveRepoRoot(callPath); + + if (name === 'repo_map') { + const wikiDir = resolveWikiDirInsideRepo(repoRootForDispatch, String((args as any).wiki_dir ?? '')); + const maxFiles = Number((args as any).max_files ?? 20); + const maxSymbolsPerFile = Number((args as any).max_symbols ?? 5); + const repoMap = await buildRepoMapAttachment(repoRootForDispatch, wikiDir, maxFiles, maxSymbolsPerFile); + return { content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, repo_map: repoMap }, null, 2) }] }; + } + if (name === 'search_symbols' && inferWorkspaceRoot(repoRootForDispatch)) { const query = String((args as any).query ?? ''); const limit = Number((args as any).limit ?? 50); @@ -525,8 +556,10 @@ export class GitAIV2MCPServer { ? res.rows.filter(r => !String((r as any).file ?? '').endsWith('.java')) : res.rows; const rows = filterAndRankSymbolRows(filteredByLang, { query, mode, caseInsensitive, limit }); + const withRepoMap = Boolean((args as any).with_repo_map ?? false); + const repoMap = withRepoMap ? 
{ enabled: false, skippedReason: 'workspace_mode_not_supported' } : undefined; return { - content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, lang: langSel, rows }, null, 2) }], + content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, lang: langSel, rows, ...(repoMap ? { repo_map: repoMap } : {}) }, null, 2) }], }; } @@ -537,6 +570,10 @@ export class GitAIV2MCPServer { const mode = inferSymbolSearchMode(query, (args as any).mode); const caseInsensitive = Boolean((args as any).case_insensitive ?? false); const maxCandidates = Math.max(limit, Number((args as any).max_candidates ?? Math.min(2000, limit * 20))); + const withRepoMap = Boolean((args as any).with_repo_map ?? false); + const wikiDir = resolveWikiDirInsideRepo(repoRootForDispatch, String((args as any).wiki_dir ?? '')); + const repoMapMaxFiles = Number((args as any).repo_map_max_files ?? 20); + const repoMapMaxSymbols = Number((args as any).repo_map_max_symbols ?? 5); const status = await checkIndex(repoRootForDispatch); if (!status.ok) { return { content: [{ type: 'text', text: JSON.stringify({ ...status, ok: false, reason: 'index_incompatible' }, null, 2) }], isError: true }; @@ -556,13 +593,18 @@ export class GitAIV2MCPServer { for (const r of rows as any[]) candidates.push({ ...r, lang }); } const rows = filterAndRankSymbolRows(candidates as any[], { query, mode, caseInsensitive, limit }); - return { content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, lang: langSel, rows }, null, 2) }] }; + const repoMap = withRepoMap ? await buildRepoMapAttachment(repoRootForDispatch, wikiDir, repoMapMaxFiles, repoMapMaxSymbols) : undefined; + return { content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, lang: langSel, rows, ...(repoMap ? { repo_map: repoMap } : {}) }, null, 2) }] }; } if (name === 'semantic_search') { const query = String((args as any).query ?? ''); const topk = Number((args as any).topk ?? 10); const langSel = String((args as any).lang ?? 'auto'); + const withRepoMap = Boolean((args as any).with_repo_map ?? false); + const wikiDir = resolveWikiDirInsideRepo(repoRootForDispatch, String((args as any).wiki_dir ?? '')); + const repoMapMaxFiles = Number((args as any).repo_map_max_files ?? 20); + const repoMapMaxSymbols = Number((args as any).repo_map_max_symbols ?? 5); const status = await checkIndex(repoRootForDispatch); if (!status.ok) { return { content: [{ type: 'text', text: JSON.stringify({ ...status, ok: false, reason: 'index_incompatible' }, null, 2) }], isError: true }; @@ -588,7 +630,8 @@ export class GitAIV2MCPServer { } } const rows = allScored.sort((a, b) => b.score - a.score).slice(0, topk); - return { content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, lang: langSel, rows }, null, 2) }] }; + const repoMap = withRepoMap ? await buildRepoMapAttachment(repoRootForDispatch, wikiDir, repoMapMaxFiles, repoMapMaxSymbols) : undefined; + return { content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, lang: langSel, rows, ...(repoMap ? 
{ repo_map: repoMap } : {}) }, null, 2) }] }; } return { @@ -616,3 +659,33 @@ export class GitAIV2MCPServer { createLogger({ component: 'mcp' }).info('server_started', { startDir: this.startDir, transport: 'stdio' }); } } + +async function buildRepoMapAttachment( + repoRoot: string, + wikiDir: string, + maxFiles: number, + maxSymbolsPerFile: number +): Promise<{ enabled: true; wikiDir: string; files: FileRank[] } | { enabled: false; skippedReason: string }> { + try { + const files = await generateRepoMap({ repoRoot, maxFiles, maxSymbolsPerFile, wikiDir: wikiDir || undefined }); + return { enabled: true, wikiDir, files }; + } catch (e: any) { + return { enabled: false, skippedReason: String(e?.message ?? e) }; + } +} + +function resolveWikiDirInsideRepo(repoRoot: string, wikiOpt: string): string { + const w = String(wikiOpt ?? '').trim(); + if (w) { + const abs = path.resolve(repoRoot, w); + const rel = path.relative(repoRoot, abs); + if (rel.startsWith('..') || path.isAbsolute(rel)) throw new Error('wiki_dir escapes repository root'); + if (fs.existsSync(abs)) return abs; + return ''; + } + const candidates = [path.join(repoRoot, 'docs', 'wiki'), path.join(repoRoot, 'wiki')]; + for (const c of candidates) { + if (fs.existsSync(c)) return c; + } + return ''; +} diff --git a/src/modules.d.ts b/src/modules.d.ts new file mode 100644 index 0000000..ff14d98 --- /dev/null +++ b/src/modules.d.ts @@ -0,0 +1,5 @@ +declare module 'tree-sitter-c'; +declare module 'tree-sitter-go'; +declare module 'tree-sitter-python'; +declare module 'tree-sitter-php'; +declare module 'tree-sitter-rust'; diff --git a/test/e2e.test.js b/test/e2e.test.js index 349ada5..976397e 100644 --- a/test/e2e.test.js +++ b/test/e2e.test.js @@ -127,6 +127,14 @@ test('git-ai works in Spring Boot and Vue repos', async () => { assert.ok(obj.rows.some(r => String(r.file || '').endsWith('.java'))); } + { + const res = runOk('node', [CLI, 'ai', 'query', 'HelloController', '--limit', '10', '--with-repo-map', '--repo-map-files', '5', '--repo-map-symbols', '2'], springRepo); + const obj = JSON.parse(res.stdout); + assert.ok(obj.repo_map && obj.repo_map.enabled === true); + assert.ok(Array.isArray(obj.repo_map.files)); + assert.ok(obj.repo_map.files.length > 0); + } + { const res = runOk('node', [CLI, 'ai', 'query', 'PingController', '--limit', '10'], springMultiRepo); const obj = JSON.parse(res.stdout); @@ -141,6 +149,15 @@ test('git-ai works in Spring Boot and Vue repos', async () => { assert.ok(obj.hits.length > 0); } + { + const res = runOk('node', [CLI, 'ai', 'semantic', 'hello controller', '--topk', '5', '--with-repo-map', '--repo-map-files', '5', '--repo-map-symbols', '2'], springRepo); + const obj = JSON.parse(res.stdout); + assert.ok(Array.isArray(obj.hits)); + assert.ok(obj.repo_map && obj.repo_map.enabled === true); + assert.ok(Array.isArray(obj.repo_map.files)); + assert.ok(obj.repo_map.files.length > 0); + } + { const res = runOk('node', [CLI, 'ai', 'graph', 'find', 'HelloController'], springRepo); const obj = JSON.parse(res.stdout); @@ -215,4 +232,8 @@ test('git-ai can index repo-tool manifests workspace repos', async () => { const obj = JSON.parse(res.stdout); assert.ok(obj.count > 0); assert.ok(obj.rows.some(r => String(r.project?.path || '') === 'project-b' && String(r.file || '').includes('src/main/java/'))); + + const res2 = runOk('node', [CLI, 'ai', 'query', 'BController', '--limit', '20', '--with-repo-map'], manifestRepo); + const obj2 = JSON.parse(res2.stdout); + assert.ok(obj2.repo_map && obj2.repo_map.enabled === 
false); }); diff --git a/test/mcp.smoke.test.js b/test/mcp.smoke.test.js index 65ab0e4..8372528 100644 --- a/test/mcp.smoke.test.js +++ b/test/mcp.smoke.test.js @@ -81,6 +81,7 @@ test('mcp server exposes set_repo and supports path arg', async () => { assert.ok(toolNames.has('search_symbols')); assert.ok(toolNames.has('semantic_search')); + assert.ok(toolNames.has('repo_map')); assert.ok(toolNames.has('set_repo')); assert.ok(toolNames.has('get_repo')); assert.ok(toolNames.has('check_index')); @@ -136,6 +137,26 @@ test('mcp server exposes set_repo and supports path arg', async () => { assert.ok(parsed.rows.length > 0); } + { + const call = await client.callTool({ + name: 'search_symbols', + arguments: { + query: 'hello', + mode: 'substring', + case_insensitive: true, + limit: 10, + with_repo_map: true, + repo_map_max_files: 5, + repo_map_max_symbols: 2, + }, + }); + const text = String(call?.content?.[0]?.text ?? ''); + const parsed = text ? JSON.parse(text) : null; + assert.ok(parsed && parsed.repo_map && parsed.repo_map.enabled === true); + assert.ok(Array.isArray(parsed.repo_map.files)); + assert.ok(parsed.repo_map.files.length > 0); + } + { const call = await client.callTool({ name: 'semantic_search', arguments: { query: 'hello world', topk: 3 } }); const text = String(call?.content?.[0]?.text ?? ''); @@ -144,6 +165,15 @@ test('mcp server exposes set_repo and supports path arg', async () => { assert.ok(parsed.rows.length > 0); } + { + const call = await client.callTool({ name: 'repo_map', arguments: { max_files: 5, max_symbols: 2 } }); + const text = String(call?.content?.[0]?.text ?? ''); + const parsed = text ? JSON.parse(text) : null; + assert.ok(parsed && parsed.repo_map && parsed.repo_map.enabled === true); + assert.ok(Array.isArray(parsed.repo_map.files)); + assert.ok(parsed.repo_map.files.length > 0); + } + { const call = await client.callTool({ name: 'list_files', arguments: { pattern: 'src/**/*', limit: 50 } }); const text = String(call?.content?.[0]?.text ?? ''); diff --git a/test/verify_parsing.ts b/test/verify_parsing.ts new file mode 100644 index 0000000..0344615 --- /dev/null +++ b/test/verify_parsing.ts @@ -0,0 +1,18 @@ +import test from 'node:test'; +import assert from 'node:assert/strict'; +import path from 'path'; +import { fileURLToPath } from 'node:url'; +import { CodeParser } from '../dist/src/core/parser.js'; + +test('parser can parse polyglot examples', async () => { + const parser = new CodeParser(); + const __filename = fileURLToPath(import.meta.url); + const __dirname = path.dirname(__filename); + const repo = path.join(__dirname, '../examples/polyglot-repo'); + const files = ['main.c', 'main.go', 'main.py', 'main.rs']; + for (const f of files) { + const res = await parser.parseFile(path.join(repo, f)); + assert.ok(Array.isArray(res.symbols)); + assert.ok(Array.isArray(res.refs)); + } +});
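
The adapter contract added in `src/core/parser/adapter.ts` can also be exercised on its own, without going through `CodeParser` or the index. The sketch below is illustrative only and not part of this change set; it assumes the file layout introduced above and a TypeScript-aware runner (e.g. tsx or ts-node) started from the repository root.

```ts
// Illustrative sketch, not part of the diff: drive GoAdapter directly with tree-sitter.
import Parser from 'tree-sitter';
import { GoAdapter } from './src/core/parser/go';

const adapter = new GoAdapter();
const parser = new Parser();
parser.setLanguage(adapter.getTreeSitterLanguage());

const source = [
  'package main',
  '',
  'type Greeter struct{}',
  '',
  'func (g Greeter) Hello(name string) string { return "hi " + name }',
  '',
  'func main() { println(Greeter{}.Hello("world")) }',
].join('\n');

const tree = parser.parse(source);
const { symbols, refs } = adapter.extractSymbolsAndRefs(tree.rootNode);

// Roughly expected: Greeter reported with kind 'class', Hello as 'method', main as
// 'function', plus 'call' refs for Hello/println and 'type' refs for Greeter.
console.log(symbols.map(s => `${s.kind} ${s.name} [L${s.startLine}-${s.endLine}]`));
console.log(refs.map(r => `${r.refKind} ${r.name}@${r.line}:${r.column}`));
```

The same pattern applies to the other adapters; `test/verify_parsing.ts` covers the compiled `CodeParser` end to end, while a direct call like this is handy when debugging a single grammar.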