diff --git a/.git-ai/lancedb.tar.gz b/.git-ai/lancedb.tar.gz
index 8d38ea4..a77b759 100644
--- a/.git-ai/lancedb.tar.gz
+++ b/.git-ai/lancedb.tar.gz
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:987b3d4d330725144c6367f6678826f3608964904ede6339dd11ce68b971d82d
-size 21238
+oid sha256:ec26e551b488aecd466453bd29c62d3817046616833973ab31cfe47028e3945a
+size 25407
diff --git a/.git-ai/meta.json b/.git-ai/meta.json
deleted file mode 100644
index 409419e..0000000
--- a/.git-ai/meta.json
+++ /dev/null
@@ -1,9 +0,0 @@
-{
-  "version": "2.0",
-  "dim": 256,
-  "files": 35,
-  "chunksAdded": 87,
-  "refsAdded": 87,
-  "dbDir": ".git-ai/lancedb",
-  "scanRoot": ""
-}
diff --git a/.gitignore b/.gitignore
index 43fcc50..32600a6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,5 +3,9 @@
 dist/
 .DS_Store
 .git-ai/lancedb/
+.git-ai/meta.json
+.git-ai/cozo.error.json
+.git-ai/ast-graph.sqlite
+.git-ai/ast-graph.export.json
 .git-ai/._*
 .tmp-hook-test/
diff --git a/.trae/skills/git-ai-mcp/SKILL.md b/.trae/skills/git-ai-mcp/SKILL.md
index 8cff63f..ada7dda 100644
--- a/.trae/skills/git-ai-mcp/SKILL.md
+++ b/.trae/skills/git-ai-mcp/SKILL.md
@@ -33,6 +33,8 @@ description: "Via git-ai's MCP tools, inspect/search the code repository. When the user asks to “
 ### 1) Symbol lookup (most reliable)
 When the user mentions a function/class/file name/module name:
 - `search_symbols({ query: "FooBar", limit: 50 })`
+- `search_symbols({ query: "get*repo", mode: "wildcard", case_insensitive: true, limit: 20 })`
+- `search_symbols({ query: "^get.*repo$", mode: "regex", case_insensitive: true, limit: 20 })`
 
 After rows are returned, pick the 1-3 most likely hits and continue reading the code:
 - `read_file({ file: "src/xxx.ts", start_line: 1, end_line: 220 })`
@@ -49,6 +51,10 @@ description: "Via git-ai's MCP tools, inspect/search the code repository. When the user asks to “
 - `list_files({ pattern: "src/**/*.{ts,tsx,js,jsx}", limit: 500 })`
 - `list_files({ pattern: "**/*mcp*", limit: 200 })`
 
+### 4) AST graph queries (recursive/relationship questions)
+When you need to answer questions such as "containment / inheritance / list of children / recursive queries":
+- `ast_graph_query({ query: "", params: {...} })`
+
 ## Output requirements (replies to the user)
 - Give the conclusion first, then the evidence (file + line range)
 - Cite code locations with IDE-clickable links (file://...#Lx-Ly)
@@ -58,4 +64,3 @@ description: "Via 
git-ai's MCP tools, inspect/search the code repository. When the user asks to “
 - MCP's `semantic_search` depends on `.git-ai/lancedb`: no index, no results
 - After changing the index, run `pack_index` and commit `.git-ai/lancedb.tar.gz` (if the team wants to share it)
 - `read_file` can only read repo-relative paths; escaping with `../` is not allowed
-
diff --git a/README.md b/README.md
index fc5af37..b119194 100644
--- a/README.md
+++ b/README.md
@@ -26,6 +26,7 @@ yarn global add git-ai
 - Development guide: [DEVELOPMENT.md](./DEVELOPMENT.md)
 - Documentation hub (usage/concepts/troubleshooting): [docs/README.md](./docs/README.md)
 - Design notes: [docs/design.md](./docs/design.md)
+- Technical principles explained (beginner-friendly): [docs/architecture_explained.md](./docs/architecture_explained.md)
 - Agent integration (Skills/Rules): [docs/mcp.md](./docs/mcp.md)
 
 ## Basic usage (similar to git)
@@ -48,6 +49,7 @@ git-ai push -u origin main
 git-ai ai index --overwrite
 git-ai ai query Indexer --limit 10
 git-ai ai semantic "semantic search" --topk 5
+git-ai ai graph find GitAIV2MCPServer
 git-ai ai pack
 git-ai ai unpack
 git-ai ai serve
@@ -56,8 +58,9 @@ git-ai ai serve
 ## MCP Server (stdio)
 
 `git-ai` provides an MCP-based stdio server that agents/clients can call as tools:
-- `search_symbols`: search symbols by substring and return file locations
+- `search_symbols`: symbol search (substring/prefix/wildcard/regex/fuzzy)
 - `semantic_search`: semantic search based on LanceDB + SQ8
+- `ast_graph_query`: AST graph queries based on CozoDB (CozoScript)
 
 ### Startup
 
diff --git a/docs/DESIGN.md b/docs/DESIGN.md
index 1c554e1..78bf5af 100644
--- a/docs/DESIGN.md
+++ b/docs/DESIGN.md
@@ -11,6 +11,8 @@
 - `.git-ai/`: index directory
   - `lancedb/`: LanceDB data directory
   - `lancedb.tar.gz`: packed LanceDB (for Git LFS tracking and transfer)
+  - `ast-graph.sqlite`: AST relationship graph database (CozoDB, SQLite engine preferred)
+  - `ast-graph.export.json`: exported snapshot of the AST graph (used for cross-process reuse only with non-SQLite backends)
   - `meta.json`: index metadata (dimension, encoding, build time, etc.)
 
 ## 3. 
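Data model (two tables)
@@ -55,6 +57,21 @@
 - Scan chunks (or narrow them with filter conditions), dequantize, and compute cosine similarity;
 - Take the TopK, then join with refs to output location results.
 
+## 6.1 AST graph queries (CozoDB)
+
+During indexing, symbols and their relations are written to CozoDB to represent data such as "containment" and "inheritance" that is better suited to graph/recursive queries:
+
+### Relations
+- `ast_file(file_id => file)`: file node (file_id is `sha256("file:" + file)`)
+- `ast_symbol(ref_id => file, name, kind, signature, start_line, end_line)`: symbol node (ref_id matches the refs table)
+- `ast_contains(parent_id, child_id)`: containment edge (parent_id may be a file_id or a ref_id)
+- `ast_extends_name(sub_id, super_name)`: inheritance relation (recorded by name, to make later joins/resolution easier)
+- `ast_implements_name(sub_id, iface_name)`: implementation relation (recorded by name)
+
+### CLI / MCP
+- CLI: `git-ai ai graph ...`
+- MCP: `ast_graph_query({query, params})`
+
 ## 7. Git hooks integration
 - `pre-commit`: automatically rebuilds the index (index --overwrite) and packs it (pack), then stages `.git-ai/lancedb.tar.gz`; if git-lfs is installed, lfs track is run automatically.
 - `pre-push`: packs again and verifies that the archive has not changed; if it has, the push is blocked with a prompt to commit the archive first.
diff --git a/docs/README.md b/docs/README.md
index 52d115c..f4c3ca6 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -13,9 +13,11 @@
 - Let agents hit symbols/snippets cheaply through MCP tools, then read files on demand
 
 ### Important directories
-- `.git-ai/meta.json`: index metadata
+- `.git-ai/meta.json`: index metadata (generated locally, usually not committed)
 - `.git-ai/lancedb/`: local vector index directory (usually not committed)
 - `.git-ai/lancedb.tar.gz`: archived index (can be committed / tracked with git-lfs)
+- `.git-ai/ast-graph.sqlite`: AST graph database (CozoDB)
+- `.git-ai/ast-graph.export.json`: exported snapshot of the AST graph (for cross-process reuse with non-SQLite backends)
 
 ## Contents
 
diff --git a/docs/architecture_explained.md b/docs/architecture_explained.md
new file mode 100644
index 0000000..cbc0e54
--- /dev/null
+++ b/docs/architecture_explained.md
@@ -0,0 +1,106 @@
+# Technical architecture and technology choices: an in-depth analysis
+
+This document analyzes, from an architecture-design perspective, the core implementation principles of `git-ai`, its key technology choices, and the reasoning behind them. It is intended for technical reviews, as a reference for architecture selection, and as guidance for extending the project.
+
+## 1. 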
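Core architecture design philosophy
+
+The design goal of `git-ai` is to build a **lightweight, decentralized, zero-dependency** semantic indexing engine for code repositories. Unlike traditional centralized code search services (such as Sourcegraph), `git-ai` uses a "Client-Side Indexing" model that pushes indexing down to the developer's local environment.
+
+### 1.1 Design philosophy: Hybrid RAG and a high-recall strategy
+
+We adopt a **Hybrid RAG (hybrid retrieval-augmented generation)** design, in which different components cooperate to balance retrieval precision and recall. The core principle is **"Recall over Precision"**: it is better to retrieve a few extra candidates and let the AI (LLM) filter them than to miss potentially critical information.
+
+* **Tree-sitter (skeleton extraction)**: provides the "precise" structured data. It extracts class, method, and interface definitions to build the "skeleton" of the code.
+* **CozoDB (relationship inference)**: provides the "logical" links. It handles graph relations such as inheritance, implementation, and containment, and supports multi-hop queries (for example, "find all subclasses").
+* **LanceDB (semantic arbitration)**: provides the "fuzzy" recall. It captures semantic features of the code via Hash Embedding, so relevant code can be found from context even when the exact name is unknown.
+* **AI (final filtering)**: as the last stage of RAG, the LLM uses its comprehension ability to re-rank and precisely filter the mixed recall results.
+
+**Core constraints:**
+* **Zero environment dependencies**: no Docker, JVM, or Python environment required; works out of the box.
+* **Fully local execution**: data privacy first; no code is uploaded to the cloud.
+* **High performance**: millisecond-level retrieval with a bounded index size (typically < 20% of the code size).
+
+---
+
+## 2. Indexing pipeline
+
+Index construction is a typical ETL (Extract, Transform, Load) process with three stages:
+
+### 2.1 Structured parsing (Parsing & Chunking)
+
+To avoid the semantic truncation caused by traditional line-based chunking, we use a structured chunking strategy based on the AST (abstract syntax tree).
+
+* **Technology choice: Tree-sitter**
+    * **Background**: an incremental parsing system developed by GitHub's Atom team, now a de facto industry standard for code parsing.
+    * **Mechanism**: `tree-sitter-{lang}` produces a language-specific CST (Concrete Syntax Tree), and a traversal then extracts symbols (classes, functions, interfaces) and their context (range).
+    * **Advantages**:
+        * **Multi-language support**: dozens of mainstream languages are supported through unified WASM/Node.js bindings.
+        * **Error tolerance**: a partial AST can still be built when the code contains syntax errors, which keeps indexing robust.
+        * **Performance**: written in C, parsing is very fast (< 10 ms per file).
+
+### 2.2 Vectorization (Embedding)
+
+This is the key step that converts unstructured code into structured vectors.
+
+* **Technology choice: Random Indexing (Deterministic Hash Embedding)**
+    * **Background**: a dimensionality-reduction technique based on the Johnson-Lindenstrauss lemma (random vectors in a high-dimensional space are nearly orthogonal).
+    * **Mechanism**:
+        1. **Tokenization**: tokenize and normalize code identifiers.
+        2. **Hashing**: compute the SHA-256 hash of each token.
+        3. **Projection**: map the hash into a sparse vector of fixed dimension (e.g. 256) with +1/-1 entries.
+        4. 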
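**Aggregation**: sum all token vectors and normalize.
+    * **Rationale (vs. deep learning models)**:
+        * **Transformer models (e.g. BERT/OpenAI)**: strong semantic understanding, but the model files are large (hundreds of MB), inference latency is high, and a GPU or cloud API is usually required, which contradicts the "lightweight CLI" design goal.
+        * **Hash Embedding**: it cannot capture synonym semantics (e.g. Login ≈ SignIn), but in code search **exact identifier matching** often matters more than fuzzy semantics. This approach achieves **zero model-file dependencies and nanosecond-level inference**.
+
+### 2.3 Relationship graph construction (Knowledge Graph)
+
+To compensate for the weakness of vector retrieval on structural queries (such as inheritance and nesting), we build an AST relationship graph alongside the vector index.
+
+* **Model design**:
+    * **Nodes**: File, Symbol (Class, Method, Interface)
+    * **Edges**: Contains, Extends, Implements
+    * *Note: the current version focuses on "definition" relations and does not yet include "reference" (Reference/Call Graph) relations, to keep index construction lightweight.*
+* **Storage**: AST relations are flattened into Datalog fact tables (facts) and stored in the graph database.
+
+---
+
+## 3. Storage engine selection
+
+We use a "dual-engine" strategy, with separate engines for vector retrieval and graph queries.
+
+### 3.1 Vector storage: LanceDB
+
+* **Technical background**: a new-generation vector database built on Apache Arrow and the Lance data format.
+* **Reasons for selection**:
+    * **Serverless architecture**: unlike Milvus/Qdrant, which need a separate server process, LanceDB is embedded (similar to SQLite); the data is just files.
+    * **Columnar storage**: native Arrow support with zero-copy reads, which greatly reduces memory overhead.
+    * **Multimodal support**: a single table supports both vector indexes (IVF-PQ) and scalar fields (full-text search), which makes hybrid queries easy.
+    * **Rust core**: provides high I/O throughput and stability.
+
+### 3.2 Graph storage: CozoDB
+
+* **Technical background**: a Datalog-based transactional, hybrid relational/graph database.
+* **Reasons for selection**:
+    * **Recursive queries**: native Datalog inference rules handle recursive structures in code (such as multi-level inheritance chains and module dependency trees) elegantly, which is hard to do efficiently in standard SQL (SQLite).
+    * **Lightweight embedding**: the underlying storage engine is pluggable (RocksDB, SQLite, and Sled are supported); we default to the SQLite backend, which keeps deployment to a single file.
+    * **WASM support**: it can fall back to a pure in-memory WASM mode, which keeps it usable in constrained environments.
+
+---
+
+## 4. Technology stack comparison (Benchmark & Comparison)
+
+| Dimension | git-ai (this approach) | Sourcegraph (Zoekt) | CTags / GTags | OpenAI-based approach |
+| :--- | :--- | :--- | :--- | :--- |
+| **Core algorithm** | Hash Embedding + AST Graph | Trigram Index (N-gram) | Regex / lexical analysis | LLM Embedding |
+| **Retrieval mode** | Hybrid retrieval (semantic + structural) | Exact / regex matching | Symbol navigation | Pure semantic similarity |
+| **Runtime dependencies** | Node.js Runtime (zero external dependencies) | Go Server, Docker | C build environment | Python/GPU/API Key |
+| **Index size** | Small (~15-20%) | Medium (~30%) | Very small (<5%) | Very large (high vector dimensionality) |
+| **Semantic understanding** | Medium (bag-of-words based) | None | None | High |
+| **Deployment cost** | **Very low (CLI tool)** | High (requires operating a cluster) | Low | Medium/High |
+
+## 5. 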
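Summary and outlook
+
+At its core, the `git-ai` architecture seeks a balance between **retrieval quality** and **engineering cost**.
+
+By combining **Tree-sitter + Hash Embedding + LanceDB + CozoDB**, we index the code repository at both the **semantic level (Vector)** and the **structural level (Graph)** without introducing any heavyweight dependencies. This architecture is particularly well suited to serve as a "code knowledge external brain" for AI agents, giving them precise and fast context retrieval.
diff --git a/docs/cli.md b/docs/cli.md
index b912e32..8456835 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -15,10 +15,43 @@ git-ai push -u origin main
 ```bash
 git-ai ai index --overwrite
 git-ai ai query "search text" --limit 20
+git-ai ai query "get*repo" --mode wildcard --case-insensitive --limit 20
 git-ai ai semantic "semantic query" --topk 10
+git-ai ai graph find "Foo"
+git-ai ai graph children src/mcp/server.ts --as-file
+git-ai ai graph query "?[name, kind] := *ast_symbol{ref_id, file, name, kind, signature, start_line, end_line}" --params "{}"
 git-ai ai pack
 git-ai ai unpack
 git-ai ai hooks install
 git-ai ai serve
 ```
 
+## Symbol search modes (ai query)
+
+`git-ai ai query` performs substring search by default. When your input contains `*` / `?`, or when you pass `--mode` explicitly, you can enable search modes better suited to code agents:
+
+- `--mode substring`: substring match (default)
+- `--mode prefix`: prefix match
+- `--mode wildcard`: wildcard (`*` matches any string, `?` matches a single character)
+- `--mode regex`: regular expression
+- `--mode fuzzy`: fuzzy match (subsequence)
+
+Common options:
+- `--case-insensitive`: case-insensitive matching
+- `--max-candidates `: upper bound on candidates fetched before filtering (useful in wildcard/regex/fuzzy modes)
+
+## AST graph search (CozoDB)
+
+> **Practical guide**: if the commands feel too abstract, see the [AST graph practical guide](./graph_scenarios.md) for common scenarios such as finding definitions, parent classes, and subclasses.
+
+`git-ai ai index` also maintains an AST relationship graph database under `.git-ai/` (default file name: `.git-ai/ast-graph.sqlite`).
+
+Graph search commands:
+- `git-ai ai graph find `: find symbols by name prefix (case-insensitive)
+- `git-ai ai graph children `: list direct children in the containment relation (`id` can be a `ref_id` or a `file_id`)
+- `git-ai ai graph children --as-file`: treat `` as a repo-relative path and convert it to a `file_id` automatically
+- `git-ai ai graph query "" --params ''`: run a CozoScript query directly
+
+Dependency notes:
+- `cozo-node` (SQLite persistence) is preferred by default
+- If `cozo-node` is unavailable, it falls back to `cozo-lib-wasm` (in-memory engine; cross-process reuse is achieved through an export file)
diff --git a/docs/graph_scenarios.md b/docs/graph_scenarios.md
new file mode 100644
index 0000000..a8da208
--- /dev/null
+++ b/docs/graph_scenarios.md
@@ -0,0 +1,150 @@
+# AST graph practical guide
+
+`git-ai ai graph` 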
commands provide powerful code-structure query capabilities. This document walks through practical scenarios showing how to find definitions, structure, inheritance relations, and more.
+
+## 1. Find definitions
+
+### Scenario: I know the name (or prefix) of a class or method and want to find where it is defined.
+
+**Command:**
+```bash
+# Find symbols whose names start with "GitAI"
+git-ai ai graph find "GitAI"
+```
+
+**Example output:**
+```json
+{
+  "repoRoot": "/path/to/repo",
+  "result": {
+    "headers": ["ref_id", "file", "name", "kind", "signature", "start_line", "end_line"],
+    "rows": [
+      ["...", "src/mcp/server.ts", "GitAIV2MCPServer", "class", "class GitAIV2MCPServer", 16, 120]
+    ]
+  }
+}
+```
+
+> **Tip**: if you only remember part of the name (e.g. `*Server`), use the `ai query` command with wildcard mode:
+> ```bash
+> git-ai ai query "*Server" --mode wildcard
+> ```
+
+---
+
+## 2. Inspect file structure
+
+### Scenario: I want to know which classes, functions, or interfaces a file defines.
+
+**Command:**
+Use the `children` subcommand with the `--as-file` flag and pass the file path directly:
+
+```bash
+# List the top-level symbols in src/mcp/server.ts
+git-ai ai graph children src/mcp/server.ts --as-file
+```
+
+**Example output:**
+```json
+{
+  "result": {
+    "headers": ["child_id", "file", "name", "kind", "signature", "start_line", "end_line"],
+    "rows": [
+      ["...", "src/mcp/server.ts", "GitAIV2MCPServer", "class", "class GitAIV2MCPServer", 16, 120]
+    ]
+  }
+}
+```
+
+### Scenario: I want to drill down and see which methods a class has.
+
+**Steps:**
+1. Copy the class's `child_id` (i.e. its `ref_id`) from the previous result.
+2. Run the `children` command again (without `--as-file` this time).
+
+```bash
+git-ai ai graph children 
+```
+
+---
+
+## 3. 
Find inheritance and implementation
+
+This section uses `git-ai ai graph query` to run CozoScript, a Datalog-like logic query language.
+
+### Scenario: find all subclasses of a class
+
+Suppose you want to find all classes that extend `BaseCommand`.
+
+**CozoScript:**
+```cozo
+?[name, file, start_line] :=
+    *ast_extends_name{sub_id, super_name: 'BaseCommand'},
+    *ast_symbol{ref_id: sub_id, name, file, kind, start_line, end_line}
+```
+
+**CLI command:**
+```bash
+git-ai ai graph query "?[name, file] := *ast_extends_name{sub_id, super_name: 'BaseCommand'}, *ast_symbol{ref_id: sub_id, name, file, kind, signature, start_line, end_line}"
+```
+
+### Scenario: find all implementations of an interface
+
+Suppose you want to find all classes that implement the `Runnable` interface.
+
+**CozoScript:**
+```cozo
+?[name, file] :=
+    *ast_implements_name{sub_id, iface_name: 'Runnable'},
+    *ast_symbol{ref_id: sub_id, name, file, kind, start_line, end_line}
+```
+
+**CLI command:**
+```bash
+git-ai ai graph query "?[name, file] := *ast_implements_name{sub_id, iface_name: 'Runnable'}, *ast_symbol{ref_id: sub_id, name, file, kind, signature, start_line, end_line}"
+```
+
+### Scenario: find the parent class of a class
+
+Suppose you want to know which class `MyClass` extends.
+
+**CozoScript:**
+```cozo
+?[super_name] :=
+    *ast_symbol{ref_id, name: 'MyClass'},
+    *ast_extends_name{sub_id: ref_id, super_name}
+```
+
+---
+
+## 4. 
Find references/usages
+
+**Note**: the AST graph currently stores mainly **definitions** and **declaration relations**; it does **not** include a full function call graph or variable references.
+
+### Alternatives
+
+To find "where is this class/method used", use **symbol search** or **text search**:
+
+**Option A: symbol search (recommended)**
+Find all index records that contain the symbol name (including definitions and some context):
+```bash
+git-ai ai query "MySymbol" --mode wildcard
+```
+
+**Option B: grep (the most exact text match)**
+Search for the string directly in the repository:
+```bash
+git grep "MySymbol"
+```
+
+---
+
+## Appendix: table schema reference
+
+If you want to write more complex queries, refer to the following table schemas:
+
+- **`ast_file`**: `{ file_id, file }`
+- **`ast_symbol`**: `{ ref_id, file, name, kind, signature, start_line, end_line }`
+- **`ast_contains`**: `{ parent_id, child_id }` (parent_id may be a file_id or a ref_id)
+- **`ast_extends_name`**: `{ sub_id, super_name }`
+- **`ast_implements_name`**: `{ sub_id, iface_name }`
diff --git a/docs/mcp.md b/docs/mcp.md
index 1266c60..861b6a6 100644
--- a/docs/mcp.md
+++ b/docs/mcp.md
@@ -24,13 +24,25 @@ git-ai ai serve
 - `unpack_index({ path? })`: unpack the index archive
 
 ### Retrieval
-- `search_symbols({ query, limit?, path? })`: search symbols by substring and return file locations
+- `search_symbols({ query, mode?, case_insensitive?, max_candidates?, limit?, path? })`: symbol search (substring by default; prefix/wildcard/regex/fuzzy also supported)
 - `semantic_search({ query, topk?, path? })`: semantic search based on LanceDB + SQ8
+- `ast_graph_query({ query, params?, path? })`: run a CozoScript query against the AST graph database
 
 ### File reading
 - `list_files({ path?, pattern?, limit? })`: list files by glob (node_modules, .git, etc. are ignored by default)
 - `read_file({ path?, file, start_line?, end_line? 
})`: read a file fragment by line range
 
+## AST graph query example
+
+List the top-level symbols in a given file (convert the file path to a file_id first):
+
+```cozo
+?[file_id] <- [[$file_id]]
+?[child_id, name, kind, start_line, end_line] :=
+    *ast_contains{parent_id: file_id, child_id},
+    *ast_symbol{ref_id: child_id, file, name, kind, signature, start_line, end_line}
+```
+
 ## Recommended call pattern (so the agent passes the right path automatically)
 - On the first call, run `set_repo({path: "/ABS/PATH/TO/REPO"})` first
 - Subsequent tool calls omit `path` (the default repository is used)
@@ -64,6 +76,7 @@ git-ai ai serve
 **1) Symbol lookup (most reliable)**
 When the user mentions a function/class/file name/module name:
 - `search_symbols({ query: "FooBar", limit: 50 })`
+- `search_symbols({ query: "get*repo", mode: "wildcard", case_insensitive: true, limit: 20 })`
 
 After rows are returned, pick the 1-3 most likely hits and continue reading the code:
 - `read_file({ file: "src/xxx.ts", start_line: 1, end_line: 220 })`
diff --git a/package-lock.json b/package-lock.json
index d05f871..2f8572a 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -15,6 +15,7 @@
         "@types/node": "^25.0.9",
         "apache-arrow": "18.1.0",
         "commander": "^14.0.2",
+        "cozo-lib-wasm": "0.7.6",
         "fs-extra": "^11.3.3",
         "glob": "^13.0.0",
         "simple-git": "^3.30.0",
@@ -40,7 +41,9 @@
         "@lancedb/lancedb-linux-x64-gnu": "0.22.3",
         "@lancedb/lancedb-linux-x64-musl": "0.22.3",
         "@lancedb/lancedb-win32-arm64-msvc": "0.22.3",
-        "@lancedb/lancedb-win32-x64-msvc": "0.22.3"
+        "@lancedb/lancedb-win32-x64-msvc": "0.22.3",
+        "cozo-lib-wasm": "^0.7.6",
+        "cozo-node": "^0.7.6"
       }
     },
     "node_modules/@cspotcode/source-map-support": {
@@ -302,6 +305,99 @@
         "node": ">= 18"
       }
     },
+    "node_modules/@mapbox/node-pre-gyp": {
+      "version": "1.0.11",
+      "resolved": "https://registry.npmmirror.com/@mapbox/node-pre-gyp/-/node-pre-gyp-1.0.11.tgz",
+      "integrity": "sha512-Yhlar6v9WQgUp/He7BdgzOz8lqMQ8sU+jkCq7Wx8Myc5YFJLbEe7lgui/V7G1qB1DJykHSGwreceSaD60Y0PUQ==",
+      "license": "BSD-3-Clause",
+      "optional": true,
+      "dependencies": {
+        "detect-libc": "^2.0.0",
+        "https-proxy-agent": "^5.0.0",
+        "make-dir": "^3.1.0",
+        "node-fetch": "^2.6.7",
+        "nopt": "^5.0.0",
+        "npmlog": "^5.0.1",
+        "rimraf": "^3.0.2",
+        "semver": "^7.3.5",
+        "tar": "^6.1.11"
+      },
+      "bin": {
+        
"node-pre-gyp": "bin/node-pre-gyp" + } + }, + "node_modules/@mapbox/node-pre-gyp/node_modules/chownr": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/chownr/-/chownr-2.0.0.tgz", + "integrity": "sha512-bIomtDF5KGpdogkLd9VspvFzk9KfpyyGlS8YFVZl7TGPBHL5snIOnxeshwVgPteQ9b4Eydl+pVbIyE1DcvCWgQ==", + "license": "ISC", + "optional": true, + "engines": { + "node": ">=10" + } + }, + "node_modules/@mapbox/node-pre-gyp/node_modules/minipass": { + "version": "5.0.0", + "resolved": "https://registry.npmmirror.com/minipass/-/minipass-5.0.0.tgz", + "integrity": "sha512-3FnjYuehv9k6ovOEbyOswadCDPX1piCfhV8ncmYtHOjuPwylVWsghTLo7rabjC3Rx5xD4HDx8Wm1xnMF7S5qFQ==", + "license": "ISC", + "optional": true, + "engines": { + "node": ">=8" + } + }, + "node_modules/@mapbox/node-pre-gyp/node_modules/minizlib": { + "version": "2.1.2", + "resolved": "https://registry.npmmirror.com/minizlib/-/minizlib-2.1.2.tgz", + "integrity": "sha512-bAxsR8BVfj60DWXHE3u30oHzfl4G7khkSuPW+qvpd7jFRHm7dLxOjUk1EHACJ/hxLY8phGJ0YhYHZo7jil7Qdg==", + "license": "MIT", + "optional": true, + "dependencies": { + "minipass": "^3.0.0", + "yallist": "^4.0.0" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/@mapbox/node-pre-gyp/node_modules/minizlib/node_modules/minipass": { + "version": "3.3.6", + "resolved": "https://registry.npmmirror.com/minipass/-/minipass-3.3.6.tgz", + "integrity": "sha512-DxiNidxSEK+tHG6zOIklvNOwm3hvCrbUrdtzY74U6HKTJxvIDfOUL5W5P2Ghd3DTkhhKPYGqeNUIh5qcM4YBfw==", + "license": "ISC", + "optional": true, + "dependencies": { + "yallist": "^4.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/@mapbox/node-pre-gyp/node_modules/tar": { + "version": "6.2.1", + "resolved": "https://registry.npmmirror.com/tar/-/tar-6.2.1.tgz", + "integrity": "sha512-DZ4yORTwrbTj/7MZYq2w+/ZFdI6OZ/f9SFHR+71gIVUZhOQPHzVCLpvRnPgyaMpfWxxk/4ONva3GQSyNIKRv6A==", + "license": "ISC", + "optional": true, + "dependencies": { + "chownr": "^2.0.0", + "fs-minipass": "^2.0.0", + "minipass": 
"^5.0.0", + "minizlib": "^2.1.1", + "mkdirp": "^1.0.3", + "yallist": "^4.0.0" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/@mapbox/node-pre-gyp/node_modules/yallist": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/yallist/-/yallist-4.0.0.tgz", + "integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==", + "license": "ISC", + "optional": true + }, "node_modules/@modelcontextprotocol/sdk": { "version": "1.25.2", "resolved": "https://registry.npmmirror.com/@modelcontextprotocol/sdk/-/sdk-1.25.2.tgz", @@ -414,6 +510,13 @@ "undici-types": "~7.16.0" } }, + "node_modules/abbrev": { + "version": "1.1.1", + "resolved": "https://registry.npmmirror.com/abbrev/-/abbrev-1.1.1.tgz", + "integrity": "sha512-nne9/IiQ/hzIhY6pdDnbBtz7DjPTKrY00P/zvPSm5pOFkl6xuGrGnXn/VtTNNfNtAfZ9/1RtehkszU9qcTii0Q==", + "license": "ISC", + "optional": true + }, "node_modules/accepts": { "version": "2.0.0", "resolved": "https://registry.npmmirror.com/accepts/-/accepts-2.0.0.tgz", @@ -451,6 +554,19 @@ "node": ">=0.4.0" } }, + "node_modules/agent-base": { + "version": "6.0.2", + "resolved": "https://registry.npmmirror.com/agent-base/-/agent-base-6.0.2.tgz", + "integrity": "sha512-RZNwNclF7+MS/8bDg70amg32dyeZGZxiDuQmZxKLAlQjr3jGyLx+4Kkk58UO7D2QdgFIQCovuSuZESne6RG6XQ==", + "license": "MIT", + "optional": true, + "dependencies": { + "debug": "4" + }, + "engines": { + "node": ">= 6.0.0" + } + }, "node_modules/ajv": { "version": "8.17.1", "resolved": "https://registry.npmmirror.com/ajv/-/ajv-8.17.1.tgz", @@ -484,6 +600,16 @@ } } }, + "node_modules/ansi-regex": { + "version": "5.0.1", + "resolved": "https://registry.npmmirror.com/ansi-regex/-/ansi-regex-5.0.1.tgz", + "integrity": "sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ==", + "license": "MIT", + "optional": true, + "engines": { + "node": ">=8" + } + }, "node_modules/ansi-styles": { "version": "4.3.0", "resolved": 
"https://registry.npmmirror.com/ansi-styles/-/ansi-styles-4.3.0.tgz", @@ -534,6 +660,28 @@ "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==", "license": "MIT" }, + "node_modules/aproba": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/aproba/-/aproba-2.1.0.tgz", + "integrity": "sha512-tLIEcj5GuR2RSTnxNKdkK0dJ/GrC7P38sUkiDmDuHfsHmbagTFAxDVIBltoklXEVIQ/f14IL8IMJ5pn9Hez1Ew==", + "license": "ISC", + "optional": true + }, + "node_modules/are-we-there-yet": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/are-we-there-yet/-/are-we-there-yet-2.0.0.tgz", + "integrity": "sha512-Ci/qENmwHnsYo9xKIcUJN5LeDKdJ6R1Z1j9V/J5wyq8nh/mYPEpIKJbBZXtZjG04HiK7zV/p6Vs9952MrMeUIw==", + "deprecated": "This package is no longer supported.", + "license": "ISC", + "optional": true, + "dependencies": { + "delegates": "^1.0.0", + "readable-stream": "^3.6.0" + }, + "engines": { + "node": ">=10" + } + }, "node_modules/arg": { "version": "4.1.3", "resolved": "https://registry.npmmirror.com/arg/-/arg-4.1.3.tgz", @@ -549,6 +697,13 @@ "node": ">=6" } }, + "node_modules/balanced-match": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/balanced-match/-/balanced-match-1.0.2.tgz", + "integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==", + "license": "MIT", + "optional": true + }, "node_modules/body-parser": { "version": "2.2.2", "resolved": "https://registry.npmmirror.com/body-parser/-/body-parser-2.2.2.tgz", @@ -573,6 +728,17 @@ "url": "https://opencollective.com/express" } }, + "node_modules/brace-expansion": { + "version": "1.1.12", + "resolved": "https://registry.npmmirror.com/brace-expansion/-/brace-expansion-1.1.12.tgz", + "integrity": "sha512-9T9UjW3r0UW5c1Q7GTwllptXwhvYmEzFhzMfZ9H7FQWt+uZePjZPjBP/W1ZEyZ1twGWom5/56TF4lPcqjnDHcg==", + "license": "MIT", + "optional": true, + "dependencies": { + "balanced-match": "^1.0.0", 
+ "concat-map": "0.0.1" + } + }, "node_modules/bytes": { "version": "3.1.2", "resolved": "https://registry.npmmirror.com/bytes/-/bytes-3.1.2.tgz", @@ -669,6 +835,16 @@ "integrity": "sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==", "license": "MIT" }, + "node_modules/color-support": { + "version": "1.1.3", + "resolved": "https://registry.npmmirror.com/color-support/-/color-support-1.1.3.tgz", + "integrity": "sha512-qiBjkpbMLO/HL68y+lh4q0/O1MZFj2RX6X/KmMa3+gJD3z+WwI1ZzDHysvqHGS3mP6mznPckpXmw1nI9cJjyRg==", + "license": "ISC", + "optional": true, + "bin": { + "color-support": "bin.js" + } + }, "node_modules/command-line-args": { "version": "5.2.1", "resolved": "https://registry.npmmirror.com/command-line-args/-/command-line-args-5.2.1.tgz", @@ -726,6 +902,20 @@ "node": ">=20" } }, + "node_modules/concat-map": { + "version": "0.0.1", + "resolved": "https://registry.npmmirror.com/concat-map/-/concat-map-0.0.1.tgz", + "integrity": "sha512-/Srv4dswyQNBfohGpz9o6Yb3Gz3SrUDqBH5rTuhGR7ahtlbYKnVxw2bCFMRljaA7EXHaXZ8wsHdodFvbkhKmqg==", + "license": "MIT", + "optional": true + }, + "node_modules/console-control-strings": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/console-control-strings/-/console-control-strings-1.1.0.tgz", + "integrity": "sha512-ty/fTekppD2fIwRvnZAVdeOiGd1c7YXEixbgJTNzqcxJWKQnjJ/V1bNEEE6hygpM3WjwHFUVK6HTjWSzV4a8sQ==", + "license": "ISC", + "optional": true + }, "node_modules/content-disposition": { "version": "1.0.1", "resolved": "https://registry.npmmirror.com/content-disposition/-/content-disposition-1.0.1.tgz", @@ -779,6 +969,24 @@ "node": ">= 0.10" } }, + "node_modules/cozo-lib-wasm": { + "version": "0.7.6", + "resolved": "https://registry.npmmirror.com/cozo-lib-wasm/-/cozo-lib-wasm-0.7.6.tgz", + "integrity": "sha512-JxM0JHF2EVY7/S+100FKHB1h+fBgcmtqbs/9gSka0TgD9YutDYFCgIE5I80fRmbjh0h2e+THEe2HooRIbML7dw==", + "license": "MPL-2.0", + "optional": true + }, + "node_modules/cozo-node": { + 
"version": "0.7.6", + "resolved": "https://registry.npmmirror.com/cozo-node/-/cozo-node-0.7.6.tgz", + "integrity": "sha512-St2I4A9mD1I9LmSQo0r/EuOZ0Y0dknSCidLx8+BU5HzjrhqSbgoScDZ0nL/2sXOcUfJnSOYKNOKFUrv10j3MHA==", + "hasInstallScript": true, + "license": "MIT", + "optional": true, + "dependencies": { + "@mapbox/node-pre-gyp": "^1.0.10" + } + }, "node_modules/create-require": { "version": "1.1.1", "resolved": "https://registry.npmmirror.com/create-require/-/create-require-1.1.1.tgz", @@ -816,6 +1024,13 @@ } } }, + "node_modules/delegates": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/delegates/-/delegates-1.0.0.tgz", + "integrity": "sha512-bd2L678uiWATM6m5Z1VzNCErI3jiGzt6HGY8OVICs40JQq/HALfbyNJmp0UDakEY4pMMaN0Ly5om/B1VI/+xfQ==", + "license": "MIT", + "optional": true + }, "node_modules/depd": { "version": "2.0.0", "resolved": "https://registry.npmmirror.com/depd/-/depd-2.0.0.tgz", @@ -825,6 +1040,16 @@ "node": ">= 0.8" } }, + "node_modules/detect-libc": { + "version": "2.1.2", + "resolved": "https://registry.npmmirror.com/detect-libc/-/detect-libc-2.1.2.tgz", + "integrity": "sha512-Btj2BOOO83o3WyH59e8MgXsxEQVcarkUOpEYrubB0urwnN10yQ364rsiByU11nZlqWYZm05i/of7io4mzihBtQ==", + "license": "Apache-2.0", + "optional": true, + "engines": { + "node": ">=8" + } + }, "node_modules/diff": { "version": "4.0.2", "resolved": "https://registry.npmmirror.com/diff/-/diff-4.0.2.tgz", @@ -854,6 +1079,13 @@ "integrity": "sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow==", "license": "MIT" }, + "node_modules/emoji-regex": { + "version": "8.0.0", + "resolved": "https://registry.npmmirror.com/emoji-regex/-/emoji-regex-8.0.0.tgz", + "integrity": "sha512-MSjYzcWNOA0ewAHpz0MxpYFvwg6yjy1NG3xteoqz644VCo/RPgnr1/GGt+ic3iJTzQ8Eu3TdM14SawnVUmGE6A==", + "license": "MIT", + "optional": true + }, "node_modules/encodeurl": { "version": "2.0.0", "resolved": "https://registry.npmmirror.com/encodeurl/-/encodeurl-2.0.0.tgz", @@ -1080,6 
+1312,46 @@ "node": ">=14.14" } }, + "node_modules/fs-minipass": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/fs-minipass/-/fs-minipass-2.1.0.tgz", + "integrity": "sha512-V/JgOLFCS+R6Vcq0slCuaeWEdNC3ouDlJMNIsacH2VtALiu9mV4LPrHc5cDl8k5aw6J8jwgWWpiTo5RYhmIzvg==", + "license": "ISC", + "optional": true, + "dependencies": { + "minipass": "^3.0.0" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/fs-minipass/node_modules/minipass": { + "version": "3.3.6", + "resolved": "https://registry.npmmirror.com/minipass/-/minipass-3.3.6.tgz", + "integrity": "sha512-DxiNidxSEK+tHG6zOIklvNOwm3hvCrbUrdtzY74U6HKTJxvIDfOUL5W5P2Ghd3DTkhhKPYGqeNUIh5qcM4YBfw==", + "license": "ISC", + "optional": true, + "dependencies": { + "yallist": "^4.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/fs-minipass/node_modules/yallist": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/yallist/-/yallist-4.0.0.tgz", + "integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==", + "license": "ISC", + "optional": true + }, + "node_modules/fs.realpath": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/fs.realpath/-/fs.realpath-1.0.0.tgz", + "integrity": "sha512-OO0pH2lK6a0hZnAdau5ItzHPI6pUlvI7jMVnxUQRtw4owF2wk8lOSabtGDCTP4Ggrg2MbGnWO9X8K1t4+fGMDw==", + "license": "ISC", + "optional": true + }, "node_modules/function-bind": { "version": "1.1.2", "resolved": "https://registry.npmmirror.com/function-bind/-/function-bind-1.1.2.tgz", @@ -1089,6 +1361,28 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/gauge": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/gauge/-/gauge-3.0.2.tgz", + "integrity": "sha512-+5J6MS/5XksCuXq++uFRsnUd7Ovu1XenbeuIuNRJxYWjgQbPuFhT14lAvsWfqfAmnwluf1OwMjz39HjfLPci0Q==", + "deprecated": "This package is no longer supported.", + "license": "ISC", + "optional": true, + "dependencies": { + "aproba": "^1.0.3 || 
^2.0.0", + "color-support": "^1.1.2", + "console-control-strings": "^1.0.0", + "has-unicode": "^2.0.1", + "object-assign": "^4.1.1", + "signal-exit": "^3.0.0", + "string-width": "^4.2.3", + "strip-ansi": "^6.0.1", + "wide-align": "^1.1.2" + }, + "engines": { + "node": ">=10" + } + }, "node_modules/get-intrinsic": { "version": "1.3.0", "resolved": "https://registry.npmmirror.com/get-intrinsic/-/get-intrinsic-1.3.0.tgz", @@ -1182,6 +1476,13 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/has-unicode": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/has-unicode/-/has-unicode-2.0.1.tgz", + "integrity": "sha512-8Rf9Y83NBReMnx0gFzA8JImQACstCYWUplepDa9xprwwtmgEZUF0h/i5xSA625zB/I37EtrswSST6OXxwaaIJQ==", + "license": "ISC", + "optional": true + }, "node_modules/hasown": { "version": "2.0.2", "resolved": "https://registry.npmmirror.com/hasown/-/hasown-2.0.2.tgz", @@ -1224,6 +1525,20 @@ "url": "https://opencollective.com/express" } }, + "node_modules/https-proxy-agent": { + "version": "5.0.1", + "resolved": "https://registry.npmmirror.com/https-proxy-agent/-/https-proxy-agent-5.0.1.tgz", + "integrity": "sha512-dFcAjpTQFgoLMzC2VwU+C/CbS7uRL0lWmxDITmqm7C+7F0Odmj6s9l6alZc6AELXhrnggM2CeWSXHGOdX2YtwA==", + "license": "MIT", + "optional": true, + "dependencies": { + "agent-base": "6", + "debug": "4" + }, + "engines": { + "node": ">= 6" + } + }, "node_modules/iconv-lite": { "version": "0.7.2", "resolved": "https://registry.npmmirror.com/iconv-lite/-/iconv-lite-0.7.2.tgz", @@ -1240,6 +1555,18 @@ "url": "https://opencollective.com/express" } }, + "node_modules/inflight": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/inflight/-/inflight-1.0.6.tgz", + "integrity": "sha512-k92I/b08q4wvFscXCLvqfsHCrjrF7yiXsQuIVvVE7N82W3+aqpzuUdBbfhWcy/FZR3/4IgflMgKLOsvPDrGCJA==", + "deprecated": "This module is not supported, and leaks memory. Do not use it. 
Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.", + "license": "ISC", + "optional": true, + "dependencies": { + "once": "^1.3.0", + "wrappy": "1" + } + }, "node_modules/inherits": { "version": "2.0.4", "resolved": "https://registry.npmmirror.com/inherits/-/inherits-2.0.4.tgz", @@ -1255,6 +1582,16 @@ "node": ">= 0.10" } }, + "node_modules/is-fullwidth-code-point": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/is-fullwidth-code-point/-/is-fullwidth-code-point-3.0.0.tgz", + "integrity": "sha512-zymm5+u+sCsSWyD9qNaejV3DFvhCKclKdizYaJUuHA83RLjb7nSuGnddCHGv0hk+KY7BMAlsWeK4Ueg6EV6XQg==", + "license": "MIT", + "optional": true, + "engines": { + "node": ">=8" + } + }, "node_modules/is-promise": { "version": "4.0.0", "resolved": "https://registry.npmmirror.com/is-promise/-/is-promise-4.0.0.tgz", @@ -1323,6 +1660,32 @@ "node": "20 || >=22" } }, + "node_modules/make-dir": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/make-dir/-/make-dir-3.1.0.tgz", + "integrity": "sha512-g3FeP20LNwhALb/6Cz6Dd4F2ngze0jz7tbzrD2wAV+o9FeNHe4rL+yK2md0J/fiSf1sa1ADhXqi5+oVwOM/eGw==", + "license": "MIT", + "optional": true, + "dependencies": { + "semver": "^6.0.0" + }, + "engines": { + "node": ">=8" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/make-dir/node_modules/semver": { + "version": "6.3.1", + "resolved": "https://registry.npmmirror.com/semver/-/semver-6.3.1.tgz", + "integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==", + "license": "ISC", + "optional": true, + "bin": { + "semver": "bin/semver.js" + } + }, "node_modules/make-error": { "version": "1.3.6", "resolved": "https://registry.npmmirror.com/make-error/-/make-error-1.3.6.tgz", @@ -1420,6 +1783,19 @@ "node": ">= 18" } }, + "node_modules/mkdirp": { + "version": "1.0.4", + "resolved": 
"https://registry.npmmirror.com/mkdirp/-/mkdirp-1.0.4.tgz", + "integrity": "sha512-vVqVZQyf3WLx2Shd0qJ9xuvqgAyKPLAiqITEtqW0oIUjzo3PePDd6fW9iFz30ef7Ysp/oiWqbhszeGWW2T6Gzw==", + "license": "MIT", + "optional": true, + "bin": { + "mkdirp": "bin/cmd.js" + }, + "engines": { + "node": ">=10" + } + }, "node_modules/ms": { "version": "2.1.3", "resolved": "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz", @@ -1444,6 +1820,27 @@ "node": "^18 || ^20 || >= 21" } }, + "node_modules/node-fetch": { + "version": "2.7.0", + "resolved": "https://registry.npmmirror.com/node-fetch/-/node-fetch-2.7.0.tgz", + "integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==", + "license": "MIT", + "optional": true, + "dependencies": { + "whatwg-url": "^5.0.0" + }, + "engines": { + "node": "4.x || >=6.0.0" + }, + "peerDependencies": { + "encoding": "^0.1.0" + }, + "peerDependenciesMeta": { + "encoding": { + "optional": true + } + } + }, "node_modules/node-gyp-build": { "version": "4.8.4", "resolved": "https://registry.npmmirror.com/node-gyp-build/-/node-gyp-build-4.8.4.tgz", @@ -1455,6 +1852,36 @@ "node-gyp-build-test": "build-test.js" } }, + "node_modules/nopt": { + "version": "5.0.0", + "resolved": "https://registry.npmmirror.com/nopt/-/nopt-5.0.0.tgz", + "integrity": "sha512-Tbj67rffqceeLpcRXrT7vKAN8CwfPeIBgM7E6iBkmKLV7bEMwpGgYLGv0jACUsECaa/vuxP0IjEont6umdMgtQ==", + "license": "ISC", + "optional": true, + "dependencies": { + "abbrev": "1" + }, + "bin": { + "nopt": "bin/nopt.js" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/npmlog": { + "version": "5.0.1", + "resolved": "https://registry.npmmirror.com/npmlog/-/npmlog-5.0.1.tgz", + "integrity": "sha512-AqZtDUWOMKs1G/8lwylVjrdYgqA4d9nu8hc+0gzRxlDb1I10+FHBGMXs6aiQHFdCUUlqH99MUMuLfzWDNDtfxw==", + "deprecated": "This package is no longer supported.", + "license": "ISC", + "optional": true, + "dependencies": { + "are-we-there-yet": "^2.0.0", + "console-control-strings": "^1.1.0", + 
"gauge": "^3.0.0", + "set-blocking": "^2.0.0" + } + }, "node_modules/object-assign": { "version": "4.1.1", "resolved": "https://registry.npmmirror.com/object-assign/-/object-assign-4.1.1.tgz", @@ -1506,6 +1933,16 @@ "node": ">= 0.8" } }, + "node_modules/path-is-absolute": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/path-is-absolute/-/path-is-absolute-1.0.1.tgz", + "integrity": "sha512-AVbw3UJ2e9bq64vSaS9Am0fje1Pa8pbGqTTsmXfaIiMpnr5DlDhfJOuLj9Sf95ZPVDAUerDfEk88MPmPe7UCQg==", + "license": "MIT", + "optional": true, + "engines": { + "node": ">=0.10.0" + } + }, "node_modules/path-key": { "version": "3.1.1", "resolved": "https://registry.npmmirror.com/path-key/-/path-key-3.1.1.tgz", @@ -1602,6 +2039,21 @@ "node": ">= 0.10" } }, + "node_modules/readable-stream": { + "version": "3.6.2", + "resolved": "https://registry.npmmirror.com/readable-stream/-/readable-stream-3.6.2.tgz", + "integrity": "sha512-9u/sniCrY3D5WdsERHzHE4G2YCXqoG5FTHUiCC4SIbr6XcLZBY05ya9EKjYek9O5xOAwjGq+1JdGBAS7Q9ScoA==", + "license": "MIT", + "optional": true, + "dependencies": { + "inherits": "^2.0.3", + "string_decoder": "^1.1.1", + "util-deprecate": "^1.0.1" + }, + "engines": { + "node": ">= 6" + } + }, "node_modules/reflect-metadata": { "version": "0.2.2", "resolved": "https://registry.npmmirror.com/reflect-metadata/-/reflect-metadata-0.2.2.tgz", @@ -1617,6 +2069,58 @@ "node": ">=0.10.0" } }, + "node_modules/rimraf": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/rimraf/-/rimraf-3.0.2.tgz", + "integrity": "sha512-JZkJMZkAGFFPP2YqXZXPbMlMBgsxzE8ILs4lMIX/2o0L9UBw9O/Y3o6wFw/i9YLapcUJWwqbi3kdxIPdC62TIA==", + "deprecated": "Rimraf versions prior to v4 are no longer supported", + "license": "ISC", + "optional": true, + "dependencies": { + "glob": "^7.1.3" + }, + "bin": { + "rimraf": "bin.js" + }, + "funding": { + "url": "https://github.com/sponsors/isaacs" + } + }, + "node_modules/rimraf/node_modules/glob": { + "version": "7.2.3", + "resolved": 
"https://registry.npmmirror.com/glob/-/glob-7.2.3.tgz", + "integrity": "sha512-nFR0zLpU2YCaRxwoCJvL6UvCH2JFyFVIvwTLsIf21AuHlMskA1hhTdk+LlYJtOlYt9v6dvszD2BGRqBL+iQK9Q==", + "deprecated": "Glob versions prior to v9 are no longer supported", + "license": "ISC", + "optional": true, + "dependencies": { + "fs.realpath": "^1.0.0", + "inflight": "^1.0.4", + "inherits": "2", + "minimatch": "^3.1.1", + "once": "^1.3.0", + "path-is-absolute": "^1.0.0" + }, + "engines": { + "node": "*" + }, + "funding": { + "url": "https://github.com/sponsors/isaacs" + } + }, + "node_modules/rimraf/node_modules/minimatch": { + "version": "3.1.2", + "resolved": "https://registry.npmmirror.com/minimatch/-/minimatch-3.1.2.tgz", + "integrity": "sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw==", + "license": "ISC", + "optional": true, + "dependencies": { + "brace-expansion": "^1.1.7" + }, + "engines": { + "node": "*" + } + }, "node_modules/router": { "version": "2.2.0", "resolved": "https://registry.npmmirror.com/router/-/router-2.2.0.tgz", @@ -1633,12 +2137,46 @@ "node": ">= 18" } }, + "node_modules/safe-buffer": { + "version": "5.2.1", + "resolved": "https://registry.npmmirror.com/safe-buffer/-/safe-buffer-5.2.1.tgz", + "integrity": "sha512-rp3So07KcdmmKbGvgaNxQSJr7bGVSVk5S9Eq1F+ppbRo70+YeaDxkw5Dd8NPN+GD6bjnYm2VuPuCXmpuYvmCXQ==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/feross" + }, + { + "type": "patreon", + "url": "https://www.patreon.com/feross" + }, + { + "type": "consulting", + "url": "https://feross.org/support" + } + ], + "license": "MIT", + "optional": true + }, "node_modules/safer-buffer": { "version": "2.1.2", "resolved": "https://registry.npmmirror.com/safer-buffer/-/safer-buffer-2.1.2.tgz", "integrity": "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg==", "license": "MIT" }, + "node_modules/semver": { + "version": "7.7.3", + "resolved": 
"https://registry.npmmirror.com/semver/-/semver-7.7.3.tgz", + "integrity": "sha512-SdsKMrI9TdgjdweUSR9MweHA4EJ8YxHn8DFaDisvhVlUOe4BF1tLD7GAj0lIqWVl+dPb/rExr0Btby5loQm20Q==", + "license": "ISC", + "optional": true, + "bin": { + "semver": "bin/semver.js" + }, + "engines": { + "node": ">=10" + } + }, "node_modules/send": { "version": "1.2.1", "resolved": "https://registry.npmmirror.com/send/-/send-1.2.1.tgz", @@ -1684,6 +2222,13 @@ "url": "https://opencollective.com/express" } }, + "node_modules/set-blocking": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/set-blocking/-/set-blocking-2.0.0.tgz", + "integrity": "sha512-KiKBS8AnWGEyLzofFfmvKwpdPzqiy16LvQfK3yv/fVH7Bj13/wl3JSR1J+rfgRE9q7xUJK4qvgS8raSOeLUehw==", + "license": "ISC", + "optional": true + }, "node_modules/setprototypeof": { "version": "1.2.0", "resolved": "https://registry.npmmirror.com/setprototypeof/-/setprototypeof-1.2.0.tgz", @@ -1783,6 +2328,13 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/signal-exit": { + "version": "3.0.7", + "resolved": "https://registry.npmmirror.com/signal-exit/-/signal-exit-3.0.7.tgz", + "integrity": "sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==", + "license": "ISC", + "optional": true + }, "node_modules/simple-git": { "version": "3.30.0", "resolved": "https://registry.npmmirror.com/simple-git/-/simple-git-3.30.0.tgz", @@ -1807,6 +2359,44 @@ "node": ">= 0.8" } }, + "node_modules/string_decoder": { + "version": "1.3.0", + "resolved": "https://registry.npmmirror.com/string_decoder/-/string_decoder-1.3.0.tgz", + "integrity": "sha512-hkRX8U1WjJFd8LsDJ2yQ/wWWxaopEsABU1XfkM8A+j0+85JAGppt16cr1Whg6KIbb4okU6Mql6BOj+uup/wKeA==", + "license": "MIT", + "optional": true, + "dependencies": { + "safe-buffer": "~5.2.0" + } + }, + "node_modules/string-width": { + "version": "4.2.3", + "resolved": "https://registry.npmmirror.com/string-width/-/string-width-4.2.3.tgz", + "integrity": 
"sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==", + "license": "MIT", + "optional": true, + "dependencies": { + "emoji-regex": "^8.0.0", + "is-fullwidth-code-point": "^3.0.0", + "strip-ansi": "^6.0.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/strip-ansi": { + "version": "6.0.1", + "resolved": "https://registry.npmmirror.com/strip-ansi/-/strip-ansi-6.0.1.tgz", + "integrity": "sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==", + "license": "MIT", + "optional": true, + "dependencies": { + "ansi-regex": "^5.0.1" + }, + "engines": { + "node": ">=8" + } + }, "node_modules/supports-color": { "version": "7.2.0", "resolved": "https://registry.npmmirror.com/supports-color/-/supports-color-7.2.0.tgz", @@ -1866,6 +2456,13 @@ "node": ">=0.6" } }, + "node_modules/tr46": { + "version": "0.0.3", + "resolved": "https://registry.npmmirror.com/tr46/-/tr46-0.0.3.tgz", + "integrity": "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==", + "license": "MIT", + "optional": true + }, "node_modules/tree-sitter": { "version": "0.21.1", "resolved": "https://registry.npmmirror.com/tree-sitter/-/tree-sitter-0.21.1.tgz", @@ -2044,6 +2641,13 @@ "node": ">= 0.8" } }, + "node_modules/util-deprecate": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/util-deprecate/-/util-deprecate-1.0.2.tgz", + "integrity": "sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw==", + "license": "MIT", + "optional": true + }, "node_modules/v8-compile-cache-lib": { "version": "3.0.1", "resolved": "https://registry.npmmirror.com/v8-compile-cache-lib/-/v8-compile-cache-lib-3.0.1.tgz", @@ -2059,6 +2663,24 @@ "node": ">= 0.8" } }, + "node_modules/webidl-conversions": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/webidl-conversions/-/webidl-conversions-3.0.1.tgz", + "integrity": 
"sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==", + "license": "BSD-2-Clause", + "optional": true + }, + "node_modules/whatwg-url": { + "version": "5.0.0", + "resolved": "https://registry.npmmirror.com/whatwg-url/-/whatwg-url-5.0.0.tgz", + "integrity": "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==", + "license": "MIT", + "optional": true, + "dependencies": { + "tr46": "~0.0.3", + "webidl-conversions": "^3.0.0" + } + }, "node_modules/which": { "version": "2.0.2", "resolved": "https://registry.npmmirror.com/which/-/which-2.0.2.tgz", @@ -2074,6 +2696,16 @@ "node": ">= 8" } }, + "node_modules/wide-align": { + "version": "1.1.5", + "resolved": "https://registry.npmmirror.com/wide-align/-/wide-align-1.1.5.tgz", + "integrity": "sha512-eDMORYaPNZ4sQIuuYPDHdQvf4gyCF9rEEV/yPxGfwPkRodwEgiMUUXTx/dex+Me0wxx53S+NgUHaP7y3MGlDmg==", + "license": "ISC", + "optional": true, + "dependencies": { + "string-width": "^1.0.2 || 2 || 3 || 4" + } + }, "node_modules/wordwrapjs": { "version": "5.1.1", "resolved": "https://registry.npmmirror.com/wordwrapjs/-/wordwrapjs-5.1.1.tgz", diff --git a/package.json b/package.json index a139916..247a5ba 100644 --- a/package.json +++ b/package.json @@ -60,6 +60,8 @@ "@lancedb/lancedb-linux-x64-gnu": "0.22.3", "@lancedb/lancedb-linux-x64-musl": "0.22.3", "@lancedb/lancedb-win32-arm64-msvc": "0.22.3", - "@lancedb/lancedb-win32-x64-msvc": "0.22.3" + "@lancedb/lancedb-win32-x64-msvc": "0.22.3", + "cozo-lib-wasm": "^0.7.6", + "cozo-node": "^0.7.6" } } diff --git a/src/commands/ai.ts b/src/commands/ai.ts index bdb894d..ea0bab2 100644 --- a/src/commands/ai.ts +++ b/src/commands/ai.ts @@ -6,14 +6,15 @@ import { serveCommand } from './serve'; import { packCommand } from './pack'; import { unpackCommand } from './unpack'; import { hooksCommand } from './hooks'; +import { graphCommand } from './graph'; export const aiCommand = new Command('ai') .description('AI 
features (indexing, search, hooks, MCP)') .addCommand(indexCommand) .addCommand(queryCommand) .addCommand(semanticCommand) + .addCommand(graphCommand) .addCommand(packCommand) .addCommand(unpackCommand) .addCommand(hooksCommand) .addCommand(serveCommand); - diff --git a/src/commands/graph.ts b/src/commands/graph.ts new file mode 100644 index 0000000..923aef1 --- /dev/null +++ b/src/commands/graph.ts @@ -0,0 +1,47 @@ +import { Command } from 'commander'; +import path from 'path'; +import { resolveGitRoot } from '../core/git'; +import { sha256Hex } from '../core/crypto'; +import { buildChildrenQuery, buildFindSymbolsQuery, runAstGraphQuery } from '../core/astGraphQuery'; + +export const graphCommand = new Command('graph') + .description('AST graph search powered by CozoDB') + .addCommand( + new Command('query') + .description('Run a CozoScript query against the AST graph database') + .argument('', 'CozoScript query') + .option('-p, --path ', 'Path inside the repository', '.') + .option('--params ', 'JSON params object', '{}') + .action(async (scriptParts, options) => { + const repoRoot = await resolveGitRoot(path.resolve(options.path)); + const query = Array.isArray(scriptParts) ? scriptParts.join(' ') : String(scriptParts ?? ''); + const params = JSON.parse(String(options.params ?? 
'{}')); + const result = await runAstGraphQuery(repoRoot, query, params); + console.log(JSON.stringify({ repoRoot, result }, null, 2)); + }) + ) + .addCommand( + new Command('find') + .description('Find symbols by name prefix') + .argument('', 'Name prefix (case-insensitive)') + .option('-p, --path ', 'Path inside the repository', '.') + .action(async (prefix, options) => { + const repoRoot = await resolveGitRoot(path.resolve(options.path)); + const result = await runAstGraphQuery(repoRoot, buildFindSymbolsQuery(), { prefix: String(prefix) }); + console.log(JSON.stringify({ repoRoot, result }, null, 2)); + }) + ) + .addCommand( + new Command('children') + .description('List direct children in the AST containment graph') + .argument('', 'Parent id (ref_id or file_id)') + .option('-p, --path ', 'Path inside the repository', '.') + .option('--as-file', 'Treat as a repository-relative file path and hash it to file_id', false) + .action(async (id, options) => { + const repoRoot = await resolveGitRoot(path.resolve(options.path)); + const parentId = options.asFile ? 
sha256Hex(`file:${String(id)}`) : String(id); + const result = await runAstGraphQuery(repoRoot, buildChildrenQuery(), { parent_id: parentId }); + console.log(JSON.stringify({ repoRoot, parent_id: parentId, result }, null, 2)); + }) + ); + diff --git a/src/commands/query.ts b/src/commands/query.ts index 04fb297..b906ad4 100644 --- a/src/commands/query.ts +++ b/src/commands/query.ts @@ -3,24 +3,37 @@ import path from 'path'; import { inferWorkspaceRoot, resolveGitRoot } from '../core/git'; import { defaultDbDir, openTables } from '../core/lancedb'; import { queryManifestWorkspace } from '../core/workspace'; +import { buildCoarseWhere, filterAndRankSymbolRows, inferSymbolSearchMode, pickCoarseToken, SymbolSearchMode } from '../core/symbolSearch'; export const queryCommand = new Command('query') - .description('Query refs table by symbol substring') + .description('Query refs table by symbol match (substring/prefix/wildcard/regex/fuzzy)') .argument('', 'Symbol substring') .option('-p, --path ', 'Path inside the repository', '.') .option('--limit ', 'Limit results', '50') + .option('--mode ', 'Mode: substring|prefix|wildcard|regex|fuzzy (default: auto)') + .option('--case-insensitive', 'Case-insensitive matching', false) + .option('--max-candidates ', 'Max candidates to fetch before filtering', '1000') .action(async (keyword, options) => { const repoRoot = await resolveGitRoot(path.resolve(options.path)); const limit = Number(options.limit); const q = String(keyword); + const mode = inferSymbolSearchMode(q, options.mode as SymbolSearchMode | undefined); + const caseInsensitive = Boolean(options.caseInsensitive ?? false); + const maxCandidates = Math.max(limit, Number(options.maxCandidates ?? Math.min(2000, limit * 20))); if (inferWorkspaceRoot(repoRoot)) { - const res = await queryManifestWorkspace({ manifestRepoRoot: repoRoot, keyword: q, limit }); - console.log(JSON.stringify(res, null, 2)); + const coarse = (mode === 'substring' || mode === 'prefix') ? 
q : pickCoarseToken(q); + const res = await queryManifestWorkspace({ manifestRepoRoot: repoRoot, keyword: coarse, limit: maxCandidates }); + const rows = filterAndRankSymbolRows(res.rows, { query: q, mode, caseInsensitive, limit }); + console.log(JSON.stringify({ ...res, rows }, null, 2)); return; } const dbDir = defaultDbDir(repoRoot); const { refs } = await openTables({ dbDir, dim: 256, mode: 'create_if_missing' }); - const rows = await refs.query().where(`symbol LIKE '%${q.replace(/'/g, "''")}%'`).limit(limit).toArray(); - console.log(JSON.stringify({ repoRoot, count: (rows as any[]).length, rows }, null, 2)); + const where = buildCoarseWhere({ query: q, mode, caseInsensitive }); + const candidates = where + ? await refs.query().where(where).limit(maxCandidates).toArray() + : await refs.query().limit(maxCandidates).toArray(); + const rows = filterAndRankSymbolRows(candidates as any[], { query: q, mode, caseInsensitive, limit }); + console.log(JSON.stringify({ repoRoot, count: rows.length, rows }, null, 2)); }); diff --git a/src/core/astGraph.ts b/src/core/astGraph.ts new file mode 100644 index 0000000..123ba6f --- /dev/null +++ b/src/core/astGraph.ts @@ -0,0 +1,72 @@ +import fs from 'fs-extra'; +import { openRepoCozoDb, repoAstGraphExportPath } from './cozo'; + +export interface AstGraphData { + files: Array<[string, string]>; + symbols: Array<[string, string, string, string, string, number, number]>; + contains: Array<[string, string]>; + extends_name: Array<[string, string]>; + implements_name: Array<[string, string]>; +} + +export interface WriteAstGraphResult { + enabled: boolean; + engine?: 'sqlite' | 'mem'; + dbPath?: string; + counts?: { + files: number; + symbols: number; + contains: number; + extends_name: number; + implements_name: number; + }; + skippedReason?: string; +} + +export async function writeAstGraphToCozo(repoRoot: string, data: AstGraphData): Promise<WriteAstGraphResult> { + const db = await openRepoCozoDb(repoRoot); + if (!db) return { enabled: false, 
skippedReason: 'Cozo backend not available (see .git-ai/cozo.error.json)' }; + + const script = ` +{ + ?[file_id, file] <- $files + :replace ast_file { file_id: String => file: String } +} +{ + ?[ref_id, file, name, kind, signature, start_line, end_line] <- $symbols + :replace ast_symbol { ref_id: String => file: String, name: String, kind: String, signature: String, start_line: Int, end_line: Int } +} +{ + ?[parent_id, child_id] <- $contains + :replace ast_contains { parent_id: String, child_id: String } +} +{ + ?[sub_id, super_name] <- $extends_name + :replace ast_extends_name { sub_id: String, super_name: String } +} +{ + ?[sub_id, iface_name] <- $implements_name + :replace ast_implements_name { sub_id: String, iface_name: String } +} +`; + + await db.run(script, data as any); + if (db.engine !== 'sqlite' && db.exportRelations) { + const exported = await db.exportRelations(['ast_file', 'ast_symbol', 'ast_contains', 'ast_extends_name', 'ast_implements_name']); + await fs.writeJSON(repoAstGraphExportPath(repoRoot), exported, { spaces: 2 }); + } + if (db.close) await db.close(); + + return { + enabled: true, + engine: db.engine, + dbPath: db.dbPath, + counts: { + files: data.files.length, + symbols: data.symbols.length, + contains: data.contains.length, + extends_name: data.extends_name.length, + implements_name: data.implements_name.length, + }, + }; +} diff --git a/src/core/astGraphQuery.ts b/src/core/astGraphQuery.ts new file mode 100644 index 0000000..5d69b2c --- /dev/null +++ b/src/core/astGraphQuery.ts @@ -0,0 +1,29 @@ +import { openRepoCozoDb } from './cozo'; + +export async function runAstGraphQuery(repoRoot: string, query: string, params?: Record): Promise { + const db = await openRepoCozoDb(repoRoot); + if (!db) { + throw new Error('AST graph is not available: Cozo backend not available (see .git-ai/cozo.error.json)'); + } + try { + return await db.run(query, params ?? 
{}); + } finally { + if (db.close) await db.close(); + } +} + +export function buildFindSymbolsQuery(): string { + return ` +?[ref_id, file, name, kind, signature, start_line, end_line] := + *ast_symbol{ref_id, file, name, kind, signature, start_line, end_line}, + starts_with(lowercase(name), lowercase($prefix)) +`; +} + +export function buildChildrenQuery(): string { + return ` +?[child_id, file, name, kind, signature, start_line, end_line] := + *ast_contains{parent_id: $parent_id, child_id}, + *ast_symbol{ref_id: child_id, file, name, kind, signature, start_line, end_line} +`; +} diff --git a/src/core/cozo.ts b/src/core/cozo.ts new file mode 100644 index 0000000..b6c9273 --- /dev/null +++ b/src/core/cozo.ts @@ -0,0 +1,154 @@ +import fs from 'fs-extra'; +import path from 'path'; + +export interface CozoClient { + backend: 'cozo-node' | 'cozo-wasm'; + run: (script: string, params?: Record) => Promise; + exportRelations?: (relations: string[]) => Promise; + importRelations?: (data: any) => Promise; + close?: () => Promise; + engine: 'sqlite' | 'mem'; + dbPath?: string; +} + +export function repoAstGraphDbPath(repoRoot: string): string { + return path.join(repoRoot, '.git-ai', 'ast-graph.sqlite'); +} + +export function repoAstGraphExportPath(repoRoot: string): string { + return path.join(repoRoot, '.git-ai', 'ast-graph.export.json'); +} + +let cozoWasmInit: Promise | null = null; + +async function tryImportFromExport(repoRoot: string, client: CozoClient): Promise { + if (!client.importRelations) return; + if (client.engine === 'sqlite') return; + const exportPath = repoAstGraphExportPath(repoRoot); + if (!await fs.pathExists(exportPath)) return; + const data = await fs.readJSON(exportPath).catch(() => null); + if (!data) return; + await client.importRelations(data); +} + +async function openCozoNode(repoRoot: string): Promise { + let mod: any; + try { + const moduleName: string = 'cozo-node'; + mod = await import(moduleName); + } catch (e: any) { + throw new 
Error(`Failed to load cozo-node: ${String(e?.message ?? e)}`); + } + + const CozoDb = mod?.CozoDb ?? mod?.default?.CozoDb ?? mod?.default ?? mod; + if (typeof CozoDb !== 'function') throw new Error('cozo-node loaded but CozoDb export is missing'); + + const dbPath = repoAstGraphDbPath(repoRoot); + await fs.ensureDir(path.dirname(dbPath)); + + let db: any; + let engine: CozoClient['engine'] = 'mem'; + try { + db = new CozoDb('sqlite', dbPath); + engine = 'sqlite'; + } catch (e1) { + try { + db = new CozoDb({ engine: 'sqlite', path: dbPath }); + engine = 'sqlite'; + } catch { + db = new CozoDb(); + engine = 'mem'; + } + } + + const client: CozoClient = { + backend: 'cozo-node', + engine, + dbPath: engine === 'sqlite' ? dbPath : undefined, + run: async (script: string, params?: Record) => db.run(script, params ?? {}), + exportRelations: typeof db.exportRelations === 'function' ? async (rels: string[]) => db.exportRelations(rels) : undefined, + importRelations: typeof db.importRelations === 'function' ? async (data: any) => db.importRelations(data) : undefined, + close: typeof db.close === 'function' ? async () => { await db.close(); } : undefined, + }; + await tryImportFromExport(repoRoot, client); + return client; +} + +async function openCozoWasm(repoRoot: string): Promise { + let mod: any; + try { + const moduleName: string = 'cozo-lib-wasm'; + mod = await import(moduleName); + } catch (e: any) { + throw new Error(`Failed to load cozo-lib-wasm: ${String(e?.message ?? e)}`); + } + + const init = mod?.default; + const CozoDb = mod?.CozoDb; + if (typeof init !== 'function' || typeof CozoDb?.new !== 'function') { + throw new Error('cozo-lib-wasm loaded but exports are not compatible'); + } + + if (!cozoWasmInit) cozoWasmInit = Promise.resolve(init()).then(() => {}); + await cozoWasmInit; + + const db: any = CozoDb.new(); + + const run = async (script: string, params?: Record) => { + const out = db.run(String(script), JSON.stringify(params ?? 
{})); + try { + return JSON.parse(String(out)); + } catch { + return out; + } + }; + + const exportRelations = async (relations: string[]) => { + if (typeof db.export_relations !== 'function') return null; + const out = db.export_relations(JSON.stringify(relations)); + try { + return JSON.parse(String(out)); + } catch { + return out; + } + }; + + const importRelations = async (data: any) => { + if (typeof db.import_relations !== 'function') return null; + const out = db.import_relations(JSON.stringify(data)); + try { + return JSON.parse(String(out)); + } catch { + return out; + } + }; + + const client: CozoClient = { + backend: 'cozo-wasm', + engine: 'mem', + run, + exportRelations, + importRelations, + close: typeof db.free === 'function' ? async () => { db.free(); } : undefined, + }; + + await tryImportFromExport(repoRoot, client); + return client; +} + +export async function openRepoCozoDb(repoRoot: string): Promise<CozoClient | null> { + const errors: string[] = []; + try { + return await openCozoNode(repoRoot); + } catch (e: any) { + errors.push(String(e?.message ?? e)); + } + try { + return await openCozoWasm(repoRoot); + } catch (e: any) { + errors.push(String(e?.message ?? 
e)); + } + await fs.ensureDir(path.join(repoRoot, '.git-ai')); + await fs.writeJSON(path.join(repoRoot, '.git-ai', 'cozo.error.json'), { errors }, { spaces: 2 }).catch(() => {}); + return null; +} diff --git a/src/core/indexer.ts b/src/core/indexer.ts index 46ba083..c5e7c2c 100644 --- a/src/core/indexer.ts +++ b/src/core/indexer.ts @@ -6,6 +6,7 @@ import { defaultDbDir, openTables } from './lancedb'; import { sha256Hex } from './crypto'; import { hashEmbedding } from './embedding'; import { quantizeSQ8 } from './sq8'; +import { writeAstGraphToCozo } from './astGraph'; import { ChunkRow, RefRow } from './types'; export interface IndexOptions { @@ -89,15 +90,40 @@ export class IndexerV2 { const chunkRows: any[] = []; const refRows: any[] = []; + const astFiles: Array<[string, string]> = []; + const astSymbols: Array<[string, string, string, string, string, number, number]> = []; + const astContains: Array<[string, string]> = []; + const astExtendsName: Array<[string, string]> = []; + const astImplementsName: Array<[string, string]> = []; for (const file of files) { const fullPath = path.join(this.scanRoot, file); const symbols = await this.parser.parseFile(fullPath); + const fileId = sha256Hex(`file:${file}`); + astFiles.push([fileId, file]); for (const s of symbols) { const text = buildChunkText(file, s); const contentHash = sha256Hex(text); const refId = sha256Hex(`${file}:${s.name}:${s.kind}:${s.startLine}:${s.endLine}:${contentHash}`); + astSymbols.push([refId, file, s.name, s.kind, s.signature, s.startLine, s.endLine]); + let parentId = fileId; + if (s.container) { + const cText = buildChunkText(file, s.container); + const cHash = sha256Hex(cText); + parentId = sha256Hex(`${file}:${s.container.name}:${s.container.kind}:${s.container.startLine}:${s.container.endLine}:${cHash}`); + } + astContains.push([parentId, refId]); + + if (s.kind === 'class') { + if (s.extends) { + for (const superName of s.extends) astExtendsName.push([refId, superName]); + } + if 
(s.implements) { + for (const ifaceName of s.implements) astImplementsName.push([refId, ifaceName]); + } + } + if (!existingChunkIds.has(contentHash)) { const vec = hashEmbedding(text, { dim: this.dim }); const q = quantizeSQ8(vec); @@ -129,6 +155,14 @@ export class IndexerV2 { if (chunkRows.length > 0) await chunks.add(chunkRows); if (refRows.length > 0) await refs.add(refRows); + const astGraph = await writeAstGraphToCozo(this.repoRoot, { + files: astFiles, + symbols: astSymbols, + contains: astContains, + extends_name: astExtendsName, + implements_name: astImplementsName, + }); + const meta = { version: '2.0', dim: this.dim, @@ -137,6 +171,18 @@ export class IndexerV2 { refsAdded: refRows.length, dbDir: path.relative(this.repoRoot, dbDir), scanRoot: path.relative(this.repoRoot, this.scanRoot), + astGraph: astGraph.enabled + ? { + backend: 'cozo', + engine: astGraph.engine, + dbPath: astGraph.dbPath ? path.relative(this.repoRoot, astGraph.dbPath) : undefined, + counts: astGraph.counts, + } + : { + backend: 'cozo', + enabled: false, + skippedReason: astGraph.skippedReason, + }, }; await fs.writeJSON(path.join(gitAiDir, 'meta.json'), meta, { spaces: 2 }); } diff --git a/src/core/parser.ts b/src/core/parser.ts index 728f9a3..5ebb8a5 100644 --- a/src/core/parser.ts +++ b/src/core/parser.ts @@ -17,8 +17,19 @@ export class CodeParser { if (!language) return []; this.parser.setLanguage(language.language); - const tree = this.parser.parse(content); - return this.extractSymbols(tree.rootNode, language.id); + try { + const tree = this.parser.parse(content); + return this.extractSymbols(tree.rootNode, language.id); + } catch (e: any) { + const msg = String(e?.message ?? 
e); + if (!msg.includes('Invalid argument')) return []; + try { + const tree = this.parser.parse(content, undefined, { bufferSize: 1024 * 1024 }); + return this.extractSymbols(tree.rootNode, language.id); + } catch { + return []; + } + } } private pickLanguage(filePath: string): { id: 'typescript' | 'java'; language: any } | null { @@ -29,7 +40,7 @@ export class CodeParser { return { id: 'typescript', language: TypeScript.tsx }; } if (filePath.endsWith('.java')) { - return { id: 'java', language: Java }; + return { id: 'java', language: Java as any }; } return null; } @@ -37,7 +48,36 @@ export class CodeParser { private extractSymbols(node: Parser.SyntaxNode, languageId: 'typescript' | 'java'): SymbolInfo[] { const symbols: SymbolInfo[] = []; - const traverse = (n: Parser.SyntaxNode) => { + const parseHeritage = (head: string): { extends?: string[]; implements?: string[] } => { + const out: { extends?: string[]; implements?: string[] } = {}; + const extendsMatch = head.match(/\bextends\s+([A-Za-z0-9_$.<>\[\]]+)/); + if (extendsMatch?.[1]) out.extends = [extendsMatch[1]]; + + const implMatch = head.match(/\bimplements\s+([A-Za-z0-9_$. 
,<>\[\]]+)/); + if (implMatch?.[1]) { + const raw = implMatch[1]; + const parts: string[] = []; + let current = ''; + let depth = 0; + for (const char of raw) { + if (char === '<') depth++; + else if (char === '>') depth--; + + if (char === ',' && depth === 0) { + if (current.trim()) parts.push(current.trim()); + current = ''; + } else { + current += char; + } + } + if (current.trim()) parts.push(current.trim()); + + if (parts.length > 0) out.implements = parts; + } + return out; + }; + + const traverse = (n: Parser.SyntaxNode, container?: SymbolInfo) => { if (languageId === 'typescript') { if (n.type === 'function_declaration' || n.type === 'method_definition') { const nameNode = n.childForFieldName('name'); @@ -48,18 +88,27 @@ export class CodeParser { startLine: n.startPosition.row + 1, endLine: n.endPosition.row + 1, signature: n.text.split('{')[0].trim(), + container: n.type === 'method_definition' ? container : undefined, }); } } else if (n.type === 'class_declaration') { const nameNode = n.childForFieldName('name'); if (nameNode) { - symbols.push({ + const head = n.text.split('{')[0].trim(); + const heritage = parseHeritage(head); + const classSym: SymbolInfo = { name: nameNode.text, kind: 'class', startLine: n.startPosition.row + 1, endLine: n.endPosition.row + 1, signature: `class ${nameNode.text}`, - }); + container, + extends: heritage.extends, + implements: heritage.implements, + }; + symbols.push(classSym); + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, classSym); + return; } } } else { @@ -73,6 +122,7 @@ export class CodeParser { startLine: n.startPosition.row + 1, endLine: n.endPosition.row + 1, signature: head, + container, }); } } else if ( @@ -84,21 +134,29 @@ export class CodeParser { ) { const nameNode = n.childForFieldName('name'); if (nameNode) { - symbols.push({ + const head = n.text.split('{')[0].split(';')[0].trim(); + const heritage = parseHeritage(head); + const classSym: SymbolInfo = { name: nameNode.text, kind: 'class', 
startLine: n.startPosition.row + 1, endLine: n.endPosition.row + 1, signature: `${n.type.replace(/_declaration$/, '')} ${nameNode.text}`, - }); + container, + extends: heritage.extends, + implements: heritage.implements, + }; + symbols.push(classSym); + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, classSym); + return; } } } - for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!); + for (let i = 0; i < n.childCount; i++) traverse(n.child(i)!, container); }; - traverse(node); + traverse(node, undefined); return symbols; } } diff --git a/src/core/symbolSearch.ts b/src/core/symbolSearch.ts new file mode 100644 index 0000000..a7089e7 --- /dev/null +++ b/src/core/symbolSearch.ts @@ -0,0 +1,150 @@ +export type SymbolSearchMode = 'substring' | 'prefix' | 'wildcard' | 'regex' | 'fuzzy'; + +export interface SymbolSearchParams { + query: string; + mode?: SymbolSearchMode; + caseInsensitive?: boolean; + limit?: number; + maxCandidates?: number; +} + +export function inferSymbolSearchMode(query: string, mode?: SymbolSearchMode): SymbolSearchMode { + if (mode) return mode; + if (query.includes('*') || query.includes('?')) return 'wildcard'; + return 'substring'; +} + +function escapeQuotes(s: string): string { + return s.replace(/'/g, "''"); +} + +function extractTokens(s: string): string[] { + const tokens = s.match(/[A-Za-z0-9_$.]+/g) ?? []; + return tokens.map(t => t.trim()).filter(Boolean); +} + +export function pickCoarseToken(query: string): string { + const tokens = extractTokens(query); + if (tokens.length === 0) return ''; + let best = tokens[0]!; + for (const t of tokens) if (t.length > best.length) best = t; + return best; +} + +export function buildCoarseWhere(params: SymbolSearchParams): string | null { + const q = String(params.query ?? ''); + const mode = inferSymbolSearchMode(q, params.mode); + const safe = escapeQuotes(q); + const likeOp = params.caseInsensitive ? 
'ILIKE' : 'LIKE'; + + if (mode === 'prefix') { + if (!safe) return null; + return `symbol ${likeOp} '${safe}%'`; + } + + if (mode === 'substring') { + if (!safe) return null; + return `symbol ${likeOp} '%${safe}%'`; + } + + const token = pickCoarseToken(q); + if (!token) return null; + const tokenSafe = escapeQuotes(token); + return `symbol ${likeOp} '%${tokenSafe}%'`; +} + +function escapeRegex(s: string): string { + return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); +} + +function globToRegex(pattern: string, caseInsensitive: boolean): RegExp | null { + try { + const body = pattern + .split('') + .map(ch => { + if (ch === '*') return '.*'; + if (ch === '?') return '.'; + return escapeRegex(ch); + }) + .join(''); + const flags = caseInsensitive ? 'i' : ''; + return new RegExp(`^${body}$`, flags); + } catch { + return null; + } +} + +function buildRegex(pattern: string, caseInsensitive: boolean): RegExp | null { + try { + const flags = caseInsensitive ? 'i' : ''; + return new RegExp(pattern, flags); + } catch { + return null; + } +} + +function normalizeForFuzzy(s: string): string { + return s.toLowerCase().replace(/[^a-z0-9_$.]+/g, ''); +} + +function fuzzySubsequenceScore(needle: string, haystack: string): number { + if (!needle) return 0; + let i = 0; + let score = 0; + let lastMatch = -2; + for (let j = 0; j < haystack.length && i < needle.length; j++) { + if (haystack[j] === needle[i]) { + score += (j === lastMatch + 1) ? 2 : 1; + lastMatch = j; + i++; + } + } + if (i < needle.length) return -1; + return score; +} + +export function filterAndRankSymbolRows<T extends Record<string, any>>(rows: T[], params: SymbolSearchParams): T[] { + const qRaw = String(params.query ?? ''); + const mode = inferSymbolSearchMode(qRaw, params.mode); + const limit = Math.max(1, Number(params.limit ?? 50)); + const caseInsensitive = Boolean(params.caseInsensitive); + + const getSymbol = (r: any) => String(r?.symbol ?? r?.name ?? 
''); + + if (mode === 'substring' || mode === 'prefix') { + const q = caseInsensitive ? qRaw.toLowerCase() : qRaw; + const out = rows.filter(r => { + const s = getSymbol(r); + const ss = caseInsensitive ? s.toLowerCase() : s; + return mode === 'prefix' ? ss.startsWith(q) : ss.includes(q); + }); + return out.slice(0, limit); + } + + if (mode === 'wildcard') { + const re = globToRegex(qRaw, caseInsensitive); + if (!re) return []; + const out = rows.filter(r => re.test(getSymbol(r))); + return out.slice(0, limit); + } + + if (mode === 'regex') { + const re = buildRegex(qRaw, caseInsensitive); + if (!re) return []; + const out = rows.filter(r => re.test(getSymbol(r))); + return out.slice(0, limit); + } + + const q = normalizeForFuzzy(qRaw); + const scored = rows + .map(r => { + const s = normalizeForFuzzy(getSymbol(r)); + const score = fuzzySubsequenceScore(q, s); + return { r, score }; + }) + .filter(x => x.score >= 0) + .sort((a, b) => b.score - a.score) + .slice(0, limit); + + return scored.map(x => x.r); +} diff --git a/src/core/types.ts b/src/core/types.ts index c01fdb0..07243b6 100644 --- a/src/core/types.ts +++ b/src/core/types.ts @@ -4,6 +4,15 @@ export interface SymbolInfo { startLine: number; endLine: number; signature: string; + container?: { + name: string; + kind: 'function' | 'class' | 'method'; + startLine: number; + endLine: number; + signature: string; + }; + extends?: string[]; + implements?: string[]; } export interface RefRow { diff --git a/src/core/workspace.ts b/src/core/workspace.ts index 2327d29..9397e7e 100644 --- a/src/core/workspace.ts +++ b/src/core/workspace.ts @@ -91,7 +91,7 @@ export async function queryManifestWorkspace(params: { const dbDir = defaultDbDir(ensured.repoDir); const { refs } = await openTables({ dbDir, dim: 256, mode: 'create_if_missing' }); - const projectRows = await refs.query().where(`symbol LIKE '%${q}%'`).limit(params.limit).toArray(); + const projectRows = await refs.query().where(`symbol ILIKE 
'%${q}%'`).limit(params.limit).toArray(); for (const r of projectRows as any[]) { rows.push({ project: { name: project.name, path: project.path, repoRoot: ensured.repoDir, from: ensured.from }, diff --git a/src/mcp/server.ts b/src/mcp/server.ts index c989039..4ee4330 100644 --- a/src/mcp/server.ts +++ b/src/mcp/server.ts @@ -11,6 +11,8 @@ import { ensureLfsTracking } from '../core/lfs'; import { IndexerV2 } from '../core/indexer'; import { buildQueryVector, scoreAgainst } from '../core/search'; import { queryManifestWorkspace } from '../core/workspace'; +import { runAstGraphQuery } from '../core/astGraphQuery'; +import { buildCoarseWhere, filterAndRankSymbolRows, inferSymbolSearchMode, pickCoarseToken } from '../core/symbolSearch'; export class GitAIV2MCPServer { private server: Server; @@ -58,11 +60,14 @@ export class GitAIV2MCPServer { }, { name: 'search_symbols', - description: 'Search symbols by substring and return file locations', + description: 'Search symbols and return file locations (substring/prefix/wildcard/regex/fuzzy)', inputSchema: { type: 'object', properties: { query: { type: 'string' }, + mode: { type: 'string', enum: ['substring', 'prefix', 'wildcard', 'regex', 'fuzzy'] }, + case_insensitive: { type: 'boolean', default: false }, + max_candidates: { type: 'number', default: 1000 }, path: { type: 'string', description: 'Repository path (optional)' }, limit: { type: 'number', default: 50 }, }, @@ -152,6 +157,19 @@ export class GitAIV2MCPServer { required: ['path'], }, }, + { + name: 'ast_graph_query', + description: 'Run a CozoScript query against the AST graph database', + inputSchema: { + type: 'object', + properties: { + query: { type: 'string' }, + params: { type: 'object', default: {} }, + path: { type: 'string', description: 'Repository path (optional)' }, + }, + required: ['query'], + }, + }, ], }; }); @@ -238,13 +256,28 @@ export class GitAIV2MCPServer { }; } + if (name === 'ast_graph_query') { + const repoRoot = await 
this.resolveRepoRoot(callPath); + const query = String((args as any).query ?? ''); + const params = (args as any).params && typeof (args as any).params === 'object' ? (args as any).params : {}; + const result = await runAstGraphQuery(repoRoot, query, params); + return { + content: [{ type: 'text', text: JSON.stringify({ repoRoot, result }, null, 2) }], + }; + } + const repoRootForDispatch = await this.resolveRepoRoot(callPath); if (name === 'search_symbols' && inferWorkspaceRoot(repoRootForDispatch)) { const query = String((args as any).query ?? ''); const limit = Number((args as any).limit ?? 50); - const res = await queryManifestWorkspace({ manifestRepoRoot: repoRootForDispatch, keyword: query, limit }); + const mode = inferSymbolSearchMode(query, (args as any).mode); + const caseInsensitive = Boolean((args as any).case_insensitive ?? false); + const maxCandidates = Math.max(limit, Number((args as any).max_candidates ?? Math.min(2000, limit * 20))); + const keyword = (mode === 'substring' || mode === 'prefix') ? query : pickCoarseToken(query); + const res = await queryManifestWorkspace({ manifestRepoRoot: repoRootForDispatch, keyword, limit: maxCandidates }); + const rows = filterAndRankSymbolRows(res.rows, { query, mode, caseInsensitive, limit }); return { - content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, rows: res.rows }, null, 2) }], + content: [{ type: 'text', text: JSON.stringify({ repoRoot: repoRootForDispatch, rows }, null, 2) }], }; } @@ -253,8 +286,14 @@ export class GitAIV2MCPServer { if (name === 'search_symbols') { const query = String((args as any).query ?? ''); const limit = Number((args as any).limit ?? 50); - const safe = query.replace(/'/g, "''"); - const rows = await refs.query().where(`symbol LIKE '%${safe}%'`).limit(limit).toArray(); + const mode = inferSymbolSearchMode(query, (args as any).mode); + const caseInsensitive = Boolean((args as any).case_insensitive ?? 
false); + const maxCandidates = Math.max(limit, Number((args as any).max_candidates ?? Math.min(2000, limit * 20))); + const where = buildCoarseWhere({ query, mode, caseInsensitive }); + const candidates = where + ? await refs.query().where(where).limit(maxCandidates).toArray() + : await refs.query().limit(maxCandidates).toArray(); + const rows = filterAndRankSymbolRows(candidates as any[], { query, mode, caseInsensitive, limit }); return { content: [{ type: 'text', text: JSON.stringify({ repoRoot, rows }, null, 2) }], }; diff --git a/test/mcp.smoke.test.js b/test/mcp.smoke.test.js index 9eae8ae..bfab893 100644 --- a/test/mcp.smoke.test.js +++ b/test/mcp.smoke.test.js @@ -30,6 +30,21 @@ test('mcp server exposes set_repo and supports path arg', async () => { assert.ok(toolNames.has('unpack_index')); assert.ok(toolNames.has('list_files')); assert.ok(toolNames.has('read_file')); + assert.ok(toolNames.has('ast_graph_query')); + + const call = await client.callTool({ + name: 'search_symbols', + arguments: { + query: 'get*repo', + mode: 'wildcard', + case_insensitive: true, + limit: 5, + max_candidates: 50, + }, + }); + const text = String(call?.content?.[0]?.text ?? ''); + const parsed = text ? JSON.parse(text) : null; + assert.ok(parsed && Array.isArray(parsed.rows)); } finally { await transport.close(); } diff --git a/test/symbolSearch.test.js b/test/symbolSearch.test.js new file mode 100644 index 0000000..efc7f76 --- /dev/null +++ b/test/symbolSearch.test.js @@ -0,0 +1,53 @@ +const test = require('node:test'); +const assert = require('node:assert/strict'); + +const { + inferSymbolSearchMode, + filterAndRankSymbolRows, +} = require('../dist/src/core/symbolSearch.js'); + +test('inferSymbolSearchMode auto-detects wildcard', () => { + assert.equal(inferSymbolSearchMode('get*repo'), 'wildcard'); + assert.equal(inferSymbolSearchMode('getRepo'), 'substring'); +}); + +test('wildcard mode supports * and ? 
with case-insensitive', () => { + const rows = [ + { symbol: 'get_repo' }, + { symbol: 'getRepo' }, + { symbol: 'set_repo' }, + ]; + const out = filterAndRankSymbolRows(rows, { query: 'get*repo', mode: 'wildcard', caseInsensitive: true, limit: 50 }); + const syms = out.map(r => r.symbol); + assert.deepEqual(syms.sort(), ['getRepo', 'get_repo'].sort()); +}); + +test('prefix mode matches only prefix', () => { + const rows = [ + { symbol: 'get_repo' }, + { symbol: 'forget_repo' }, + ]; + const out = filterAndRankSymbolRows(rows, { query: 'get_', mode: 'prefix', limit: 50 }); + assert.deepEqual(out.map(r => r.symbol), ['get_repo']); +}); + +test('regex mode filters by regex', () => { + const rows = [ + { symbol: 'get_repo' }, + { symbol: 'getRepo' }, + { symbol: 'set_repo' }, + ]; + const out = filterAndRankSymbolRows(rows, { query: '^get.*repo$', mode: 'regex', caseInsensitive: true, limit: 50 }); + const syms = out.map(r => r.symbol); + assert.deepEqual(syms.sort(), ['getRepo', 'get_repo'].sort()); +}); + +test('fuzzy mode matches subsequence', () => { + const rows = [ + { symbol: 'get_repo' }, + { symbol: 'set_repo' }, + ]; + const out = filterAndRankSymbolRows(rows, { query: 'gtrp', mode: 'fuzzy', limit: 50 }); + assert.deepEqual(out.map(r => r.symbol), ['get_repo']); +}); +
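
The two-stage lookup this change introduces for `search_symbols` (a coarse `LIKE`/`ILIKE` prefilter followed by an exact in-memory match) can be sketched roughly as below. This is a minimal, self-contained illustration under stated assumptions, not the code in `src/core/symbolSearch.ts`: `coarseToken`, `globToRegExp`, and `twoStageSearch` are hypothetical names, and a plain array stands in for the LanceDB `refs` table.

```typescript
// Hypothetical sketch of a coarse-then-fine wildcard symbol lookup.
// The array below stands in for the database table; nothing here calls LanceDB.

interface Row {
  symbol: string;
}

// Pick the longest literal run in the query to use as a cheap prefilter.
function coarseToken(query: string): string {
  const tokens: string[] = query.match(/[A-Za-z0-9_$.]+/g) ?? [];
  return tokens.reduce((best, t) => (t.length > best.length ? t : best), '');
}

// Translate a glob (`*`, `?`) into an anchored regular expression.
function globToRegExp(pattern: string, caseInsensitive: boolean): RegExp {
  const body = pattern
    .split('')
    .map(ch => {
      if (ch === '*') return '.*';
      if (ch === '?') return '.';
      // Escape every other character so it is matched literally.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    })
    .join('');
  return new RegExp(`^${body}$`, caseInsensitive ? 'i' : '');
}

function twoStageSearch(rows: Row[], query: string, caseInsensitive: boolean, limit: number): Row[] {
  const token = coarseToken(query);
  const fold = (s: string) => (caseInsensitive ? s.toLowerCase() : s);
  // Stage 1: substring prefilter (the role played by the SQL LIKE/ILIKE clause).
  const candidates = rows.filter(r => fold(r.symbol).includes(fold(token)));
  // Stage 2: exact glob match on the reduced candidate set.
  const re = globToRegExp(query, caseInsensitive);
  return candidates.filter(r => re.test(r.symbol)).slice(0, limit);
}

const sample: Row[] = [{ symbol: 'get_repo' }, { symbol: 'getRepo' }, { symbol: 'set_repo' }];
const hits = twoStageSearch(sample, 'get*repo', true, 10).map(r => r.symbol);
console.log(hits);
```

A prefilter of this kind is only safe when the chosen token must appear literally in every match. That holds for glob patterns, where only `*` and `?` are special. For regex queries, characters such as `.` or `$` are metacharacters, so a token that contains them may exclude valid matches.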