Draft
Commits
28 commits
58a900d
Phase 1.1: Remove v1 semantic tools
brian-lai Feb 7, 2026
14fd6d1
Phase 1.2: Remove mattn driver stub
brian-lai Feb 7, 2026
e324b72
Phase 1.3: Remove ctags entirely (code changes)
brian-lai Feb 7, 2026
3672b7b
Phase 1.3: Remove ctags entirely (install scripts)
brian-lai Feb 7, 2026
1a7e120
Phase 1.4: Remove v1 documentation
brian-lai Feb 7, 2026
aeeb99f
Phase 1.5: Update ctags reference in Symbol struct comment
brian-lai Feb 7, 2026
9ebcfa8
Phase 1.5: Format code with gofmt
brian-lai Feb 7, 2026
622be8f
Add Phase 1 summary
brian-lai Feb 7, 2026
330d796
Update context.md: Phase 1 complete
brian-lai Feb 7, 2026
5020078
Phase 2.2: Consolidate enrichment methods (DRY)
brian-lai Feb 7, 2026
4de4456
Phase 2.3: Consolidate migration files
brian-lai Feb 7, 2026
7444b80
Phase 2.4: Error handling already standardized
brian-lai Feb 7, 2026
1718820
Phase 2.5: Replace bubble sort with sort.Slice
brian-lai Feb 7, 2026
4831576
Add Phase 2 summary
brian-lai Feb 7, 2026
197571e
Update context.md: Phase 2 complete
brian-lai Feb 7, 2026
5f1632e
Phase 3.A: Consolidate architecture documentation
brian-lai Feb 7, 2026
1b883c1
Phase 3.B: Update README.md (partial - core updates)
brian-lai Feb 7, 2026
1299f3b
Phase 3.B: Update CLAUDE.md
brian-lai Feb 7, 2026
faaabea
Phase 3.C: Update CHANGELOG.md with v2.2.0 and v3.0.0
brian-lai Feb 7, 2026
bf1612c
Phase 3.D: Archive completed plans
brian-lai Feb 7, 2026
1d1b130
Phase 3.E: Add Makefile lint/fmt/tidy targets
brian-lai Feb 7, 2026
8feb997
Add Phase 3 summary
brian-lai Feb 7, 2026
0919045
Merge Phase 2 into working branch
brian-lai Feb 8, 2026
e2e2646
Merge Phase 3 into working branch (keep cleanup plans active)
brian-lai Feb 8, 2026
0d27179
Phase 4.1: Add tests for internal/tools/
brian-lai Feb 8, 2026
971195c
Phase 4.4: Add integration smoke test
brian-lai Feb 8, 2026
87930a5
Phase 4.4: Fix integration test build paths
brian-lai Feb 8, 2026
587a7bd
Add Phase 4 summary
brian-lai Feb 8, 2026
40 changes: 40 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,46 @@ All notable changes to codetect will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [3.0.0] - TBD

### Removed
- **v1 indexer** - Removed `--v1` flag and ctags-based indexing
- **ctags dependency** - Symbol indexing now uses ast-grep exclusively (built-in, no external dependency)
- **mattn SQLite driver stub** - Only modernc and ncruces drivers supported
- **v1 documentation** - Removed `docs/v1/` directory

### Improved
- **Consolidated enrichment logic** - DRY refactor of scope lookup (3 methods → 1 shared method)
- **Standardized error handling** - Consistent patterns across tool handlers
- **Consolidated migration files** - Merged type and database migrations into single file
- **Vector search performance** - Replaced O(n²) bubble sort with O(n log n) sort.Slice (50x-380x faster)
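
To illustrate the sort change described in the last bullet, here is a minimal sketch of ranking results with `sort.Slice`. The `Result` type and `sortByScore` helper are hypothetical names chosen for this example; the actual codetect types and call sites are not shown in this diff and may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical search hit; the real codetect type may differ.
type Result struct {
	Path  string
	Score float64
}

// sortByScore orders results from highest to lowest score, in place.
// sort.Slice runs in O(n log n), unlike a hand-written bubble sort (O(n²)).
func sortByScore(results []Result) {
	sort.Slice(results, func(i, j int) bool {
		return results[i].Score > results[j].Score
	})
}

func main() {
	results := []Result{
		{Path: "a.go", Score: 0.42},
		{Path: "b.go", Score: 0.91},
		{Path: "c.go", Score: 0.67},
	}
	sortByScore(results)
	for _, r := range results {
		fmt.Printf("%s %.2f\n", r.Path, r.Score)
	}
}
```

Note that `sort.Slice` is not stable; if ties between equal scores must keep their original order, `sort.SliceStable` would be the safer choice.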

### Added
- **Makefile lint/fmt/tidy targets** - Code quality tooling

## [2.2.0] - 2026-02-07

### Added
- **Rich context in search results** (Phase 2a)
  - Parent scope extraction (function/class containing each result)
  - Scope kind tracking (function, method, class, etc.)
  - Context enrichment (3-5 lines before/after matches)
  - Receiver type for methods
  - `include_context` parameter for search tools
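
The context enrichment item above (a few lines before and after each match) can be sketched roughly as follows. This is an illustrative example only: `contextWindow` and its parameters are assumptions made for this sketch, not the enrichment code from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// contextWindow returns the lines surrounding a 1-indexed match line,
// including up to `before` lines above and `after` lines below it.
// The window is clamped to the file bounds; an out-of-range match
// line yields nil. (Hypothetical helper, not codetect's API.)
func contextWindow(lines []string, matchLine, before, after int) []string {
	if matchLine < 1 || matchLine > len(lines) {
		return nil
	}
	start := matchLine - 1 - before
	if start < 0 {
		start = 0
	}
	end := matchLine + after // exclusive upper bound
	if end > len(lines) {
		end = len(lines)
	}
	return lines[start:end]
}

func main() {
	src := "package demo\n\nfunc Add(a, b int) int {\n\treturn a + b\n}\n"
	lines := strings.Split(src, "\n")
	// Show the match on line 4 with one line of context on each side.
	fmt.Println(strings.Join(contextWindow(lines, 4, 1, 1), "\n"))
}
```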

### Improved
- **AST chunker** - Extracts scope information during indexing
- **Search results** - Include rich metadata for better LLM understanding
- **Dependency injection** - Clean, removable enrichment pattern

### Performance
- **6.5% token reduction** in evaluations
- **3.2% accuracy improvement** in evaluations

---

## [2.1.1] - 2026-02-02

### Fixed
14 changes: 2 additions & 12 deletions CLAUDE.md
@@ -4,15 +4,15 @@

## About

codetect is a local MCP server providing fast codebase search, file retrieval, symbol navigation, and semantic search for Claude Code. It combines keyword search (ripgrep), symbol indexing (ctags), and semantic search (Ollama embeddings) to enable natural language code exploration.
codetect is a local MCP server providing fast codebase search, file retrieval, symbol navigation, and semantic search for Claude Code. It combines keyword search (ripgrep), symbol indexing (ast-grep), and semantic search (Ollama embeddings) to enable natural language code exploration.

## Tech Stack

- **Go 1.25+** - Primary language
- **SQLite** - Default embedded database (modernc.org/sqlite)
- **PostgreSQL + pgvector** - Optional high-performance vector backend
- **ripgrep** - Fast keyword search
- **universal-ctags** - Symbol indexing
- **tree-sitter (via ast-grep)** - Symbol indexing
- **Ollama** - Local embeddings for semantic search
- **MCP (Model Context Protocol)** - LLM tool integration

@@ -118,16 +118,6 @@ make clean # Clean build artifacts
- `CODETECT_DB_TYPE` - `sqlite` (default) or `postgres`
- `CODETECT_DB_DSN` - PostgreSQL connection string
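
As a rough sketch of how the `CODETECT_DB_TYPE` default listed above might be resolved, the example below falls back to `sqlite` when the variable is unset. The `dbType` helper and the injected `getenv` function are assumptions made for illustration; the project's real configuration loading is not shown in this diff.

```go
package main

import (
	"fmt"
	"os"
)

// dbType returns the configured database backend, defaulting to
// "sqlite" when CODETECT_DB_TYPE is empty or unset.
// getenv is injected so the lookup can be tested without touching
// the process environment. (Hypothetical helper.)
func dbType(getenv func(string) string) string {
	if v := getenv("CODETECT_DB_TYPE"); v != "" {
		return v
	}
	return "sqlite"
}

func main() {
	fmt.Println("database backend:", dbType(os.Getenv))
}
```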

**Project Config (`.codetect.yaml`):**
```yaml
db:
  type: postgres
  dsn: postgres://user:pass@localhost/codetect

embedding:
  provider: ollama
  model: nomic-embed-text
```

## MCP Tools

22 changes: 15 additions & 7 deletions Makefile
@@ -84,13 +84,8 @@ doctor:
	@command -v rg >/dev/null 2>&1 || { echo "❌ missing: ripgrep (rg)"; exit 1; }
	@echo "✓ ripgrep: $$(rg --version | head -1)"
	@echo ""
	@echo "=== Optional (for symbol indexing) ==="
	@if command -v ctags >/dev/null 2>&1 && ctags --version 2>&1 | grep -q "Universal Ctags"; then \
		echo "✓ ctags: $$(ctags --version | head -1)"; \
	else \
		echo "○ ctags: not found (symbol indexing disabled)"; \
		echo " Install with: brew install universal-ctags"; \
	fi
	@echo "=== Symbol Indexing ==="
	@echo "✓ ast-grep: built-in (no external dependency required)"
	@echo ""
	@echo "=== Embedding Provider ==="
	@PROVIDER=$${CODETECT_EMBEDDING_PROVIDER:-ollama}; \
@@ -199,3 +194,16 @@ eval-list: build
# Show latest evaluation report
eval-report: build
	@./$(EVAL) report

# Code quality targets
lint:
	@command -v golangci-lint >/dev/null 2>&1 || { echo "golangci-lint not installed. Install from: https://golangci-lint.run/usage/install/"; exit 1; }
	golangci-lint run ./...

fmt:
	gofmt -s -w .
	@command -v goimports >/dev/null 2>&1 && goimports -w . || echo "Note: goimports not found, skipping import formatting"

tidy:
	go mod tidy
	go mod verify
26 changes: 9 additions & 17 deletions README.md
@@ -2,9 +2,9 @@

A local MCP server providing fast codebase search, file retrieval, symbol navigation, and semantic search for Claude Code.

## What's New in v2.0.0 🎉
## What's New in v2.2.0 🎉

codetect v2.0.0 brings **multi-repo support**, **parallel embedding**, and **improved user experience**:
codetect v2.2.0 brings **rich context enrichment**, **ast-grep symbol indexing**, and **improved search results**:

- ✨ **Dimension-grouped embedding tables** - Multiple repos can use different embedding models without conflicts
- ⚡ **Parallel embedding** with `-j` flag - 3.3x faster embedding with configurable workers
@@ -13,15 +13,15 @@ codetect v2.0.0 brings **multi-repo support**, **parallel embedding**, and **imp
- 🛡️ **Config preservation** - Reinstalls no longer overwrite your settings
- 🐛 **Better error handling** - Improved ripgrep error messages and diagnostics

**Upgrading from v1.x?** v2.0.0 is fully backward compatible. See [Migration Guide](docs/MIGRATION.md) for details.
**Upgrading from v2.0.0?** v2.2.0 adds rich context and removes the ctags dependency. See [CHANGELOG](CHANGELOG.md) for details.

**Full changelog:** [CHANGELOG.md](CHANGELOG.md)

## Features

- **`search_keyword`** - Fast regex search powered by ripgrep
- **`get_file`** - File reading with optional line-range slicing
- **`find_symbol`** - Symbol lookup (functions, types, etc.) via ctags + SQLite
- **`find_symbol`** - Symbol lookup (functions, types, etc.) via ast-grep + SQLite
- **`list_defs_in_file`** - List all definitions in a file
- **`search_semantic`** - Semantic code search via local embeddings (Ollama)
- **`hybrid_search`** - Combined keyword + semantic search
@@ -37,7 +37,6 @@ cd codetect

The installer will:
- ✓ Check for required dependencies (Go, ripgrep)
- ✓ Offer to install ctags automatically for symbol indexing
- ✓ Guide you through Ollama setup for semantic search (with prominent warnings if missing)
- ✓ Build and install globally to `~/.local/bin`
- ✓ Configure your shell PATH automatically
@@ -62,39 +61,32 @@ See [Installation Guide](docs/installation.md) for detailed setup instructions.
|------------|----------|---------|
| Go 1.21+ | Yes | Building from source |
| [ripgrep](https://github.com/BurntSushi/ripgrep) | Yes | Keyword search |
| [universal-ctags](https://github.com/universal-ctags/ctags) | No | Symbol indexing (v1 legacy mode only, v2 uses built-in tree-sitter) |
| [Ollama](https://ollama.ai) | No | Semantic search (local embeddings) |

**Note:** v2 (default) uses built-in tree-sitter parsers for symbol extraction. ctags is only needed if using `--v1` legacy mode.
**Note:** Symbol indexing uses built-in ast-grep (no external dependencies required).

## CLI Commands

### Main Commands

```bash
codetect init # Initialize in current directory (.mcp.json)
codetect index # Index with v2 (AST-based, incremental, 15x faster)
codetect index --v1 # Index with v1 (ctags-based, legacy, deprecated)
codetect index # Index symbols (AST-based, incremental)
codetect embed # Generate embeddings (sequential)
codetect embed -j 10 # Generate embeddings in parallel (10 workers, 3.3x faster)
codetect doctor # Check dependencies and configuration
codetect stats # Show v2 index statistics
codetect stats --v1 # Show v1 index statistics (if v1 index exists)
codetect stats # Show index statistics
codetect migrate # Discover existing indexes and register them
codetect update # Update to latest version
codetect help # Show all commands
```

**v2 features (default):**
**Key features:**
- ⚡ Incremental indexing with Merkle tree change detection (~2s vs ~30s)
- 🧬 AST-based chunking preserves semantic boundaries
- 📦 Content-addressed caching (95% cache hit rate)
- 🔄 Parallel embedding with `-j` flag (3.3x faster)
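
The incremental indexing and content-addressed caching items above both rest on hashing file contents. The sketch below shows a simplified version of that idea: it compares per-file SHA-256 hashes against a stored snapshot to find what needs re-indexing. It is a flat hash comparison, not a full Merkle tree, and `contentHash` and `changedPaths` are hypothetical names rather than codetect's implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// contentHash returns the hex-encoded SHA-256 digest of data.
// Identical content always maps to the same key, which is what
// makes a content-addressed cache possible.
func contentHash(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// changedPaths compares current file contents against previously
// stored hashes and returns the sorted paths that are new or
// modified. Unchanged files can be skipped by the indexer.
// (Hypothetical helper; deletions are not handled in this sketch.)
func changedPaths(prev map[string]string, files map[string][]byte) []string {
	var changed []string
	for path, data := range files {
		if prev[path] != contentHash(data) {
			changed = append(changed, path)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	prev := map[string]string{
		"main.go": contentHash([]byte("package main\n")),
	}
	files := map[string][]byte{
		"main.go": []byte("package main\n"),  // unchanged
		"util.go": []byte("package util\n"), // new file
	}
	fmt.Println(changedPaths(prev, files))
}
```

A Merkle tree extends this by hashing directories from their children's hashes, so an unchanged subtree can be skipped after comparing a single root hash.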

**v1 legacy mode:**
- Use `--v1` flag for ctags-based indexing (deprecated, removed in v3.0.0)
- See [v1 documentation](docs/v1/README.md) for details

### Daemon Commands

```bash
@@ -279,7 +271,7 @@ See [MCP Compatibility](docs/mcp-compatibility.md) for details and roadmap for n

- [x] MCP stdio server
- [x] Keyword search via ripgrep
- [x] Symbol indexing via ctags
- [x] Symbol indexing via ast-grep
- [x] Semantic search via Ollama
- [x] Hybrid search
- [x] Global installation