Commit 08d6d48

docs: sync all markdown files with v0.5.0 final state

- Rename v0.5.0 release to "Beyond the Diff"
- Test count 340 → 367, secret patterns 25 → 24
- PRD v4.2: add FR-062 (security hardening), FR-063 (prompt optimization), update eval harness description, fix test file list, fuzz targets 3 → 5
- CHANGELOG: add Security, Prompt Quality, Testing, API subsections

1 parent 5c94f19 commit 08d6d48

5 files changed

Lines changed: 83 additions & 43 deletions

CHANGELOG.md

Lines changed: 40 additions & 12 deletions
@@ -8,17 +8,45 @@ SPDX-License-Identifier: PolyForm-Noncommercial-1.0.0

 All notable changes to CommitBee are documented here.

-## `v0.5.0` — Understand Everything (current)
-
-- **Full signature extraction** — The LLM sees `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`, not just "Function connect." Extracted from tree-sitter AST nodes using a two-strategy approach: `child_by_field_name("body")` primary, body-node-kind scan fallback. Works across all 10 languages.
-- **Signature diffs for modified symbols** — When a function signature changes, the prompt shows `[~] old_sig → new_sig` so the LLM understands exactly what was modified.
-- **Cross-file connection detection** — Detects when a changed file calls a symbol defined in another changed file. Shown as `CONNECTIONS: validator calls parse() — both changed` in the prompt.
-- **Semantic change classification** — Modified symbols are classified as whitespace-only or semantic via character-stream comparison. Formatting-only changes (cargo fmt, prettier) auto-detected as `style` type when all modified symbols are whitespace-only.
-- **Dual old/new line tracking** — `classify_span_change` correctly tracks old-file and new-file line numbers independently, handling cases where symbols shift positions due to added/removed lines above them.
-- **Token budget rebalance** — Symbol section gets 30% of budget (up from 20%) when signatures are present, since richer symbols reduce the LLM's dependency on raw diff.
-- **BODY_NODE_KINDS coverage** — Signature extraction verified across all 10 languages with dedicated tests for Java, C, C++, Ruby, and C#.
-- **Connection reliability** — Short symbol names (<4 chars) filtered to prevent false positives, short-circuit after 5 connections, sort+dedup for correctness.
-- **Fixed false positive in breaking change detection** — Modified public symbols were incorrectly counted as "removed APIs", causing spurious `breaking_change` validator violations and retry exhaustion.
+## `v0.5.0` — Beyond the Diff (current)
+
+### Semantic Analysis
+
+- **Full signature extraction** — The LLM sees `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`, not just "Function connect." Two-strategy body detection: `child_by_field_name("body")` primary, `BODY_NODE_KINDS` fallback. Works across all 10 languages.
+- **Signature diffs for modified symbols** — When a function signature changes, the prompt shows `[~] old_sig → new_sig`.
+- **Cross-file connection detection** — Detects when a changed file calls a symbol defined in another changed file. Shown as `CONNECTIONS: validator calls parse() — both changed`.
+- **Semantic change classification** — Modified symbols classified as whitespace-only or semantic via character-stream comparison. Formatting-only changes auto-detected as `style`.
+- **Dual old/new line tracking** — Correctly handles symbols shifting positions between HEAD and staged.
+- **Token budget rebalance** — Symbol section gets 30% of budget (up from 20%) when signatures present.
+
+### Security
+
+- **Block project config URL overrides** — `.commitbee.toml` can no longer redirect `openai_base_url`, `anthropic_base_url`, or `ollama_host` to prevent SSRF/exfiltration of API keys and staged code.
+- **Cap streaming line_buffer** — All 3 LLM providers cap `line_buffer` at 1 MB to prevent unbounded memory growth from malicious servers.
+- **Strip URLs from error messages** — `reqwest::Error` display uses `without_url()` to prevent leaking configured base URLs.
+- **Broadened OpenAI secret pattern** — Detects `sk-proj-` and `sk-svcacct-` prefixed keys alongside legacy `sk-` format.
+- **Replaced Box::leak with Cow** — Custom secret pattern names use `Cow<'static, str>` instead of leaked heap allocations.
+
+### Prompt Quality
+
+- **Fixed breaking change subject budget** — Subject character budget now accounts for `!` suffix, preventing guaranteed validator rejection on breaking changes.
+- **Omit empty EVIDENCE section** — Saves ~200 chars when all flags are at default (most changes).
+- **Symbol marker legend** — SYSTEM_PROMPT now explains `[+] added, [-] removed, [~] modified`.
+- **Removed duplicate JSON schema** — System prompt no longer includes a competing schema template.
+- **Replaced emoji with text** — `` replaced with `WARNING:` for better small-model tokenization.
+- **Enhanced Python queries** — Tree-sitter now captures decorated functions and classes.
+
+### Testing & Evaluation
+
+- **Evaluation harness** — 36 fixtures covering all 11 commit types, AST features, and edge cases. Per-type accuracy reporting with `EvalSummary`.
+- **15+ new unit tests** — Coverage for `detect_primary_change`, `detect_metadata_breaking`, `detect_bug_evidence` (all 7 patterns), Deleted/Renamed status, signature edge cases, connection content assertions.
+- **5 fuzz targets** — `fuzz_sanitizer`, `fuzz_safety`, `fuzz_diff_parser`, `fuzz_signature`, `fuzz_classify_span`.
+- **367 tests** total (up from 308 at v0.4.0).
+
+### API
+
+- **Demoted internal types** — `SymbolChangeType`, `GitService`, `Progress` changed from `pub` to `pub(crate)`.
+- **Added `#[non_exhaustive]`** to `SymbolChangeType` for future-safe extension.

 ## `v0.4.0` — See Everything

@@ -27,7 +55,7 @@ All notable changes to CommitBee are documented here.
 - **Multi-language commit messages** — Generate messages in any language with `--locale` flag or `locale` config (e.g., `--locale de` for German).
 - **Commit history style learning** — Learns from recent commit history to match your project's style (`learn_from_history`, `history_sample_size` config).
 - **Rename detection** — Detects file renames with similarity percentage via `git diff --find-renames`, displayed as `old → new (N% similar)` in prompts and split suggestions. Configurable threshold (default 70%, set to 0 to disable).
-- **Expanded secret scanning** — 25 built-in patterns across 13 categories (cloud providers, AI/ML, source control, communication, payment, database, cryptographic, generic). Pluggable engine: add custom regex patterns or disable built-ins by name via config.
+- **Expanded secret scanning** — 24 built-in patterns across 13 categories (cloud providers, AI/ML, source control, communication, payment, database, cryptographic, generic). Pluggable engine: add custom regex patterns or disable built-ins by name via config.
 - **Progress indicators** — Contextual `indicatif` spinners during pipeline phases (analyzing, scanning, generating). Auto-suppressed in non-TTY environments (git hooks, pipes).
 - **Evaluation harness** — `cargo test --features eval` for structured LLM output quality benchmarking.
 - **Fuzz testing** — `cargo-fuzz` targets for sanitizer and diff parser robustness.

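The "Semantic change classification" entry in the CHANGELOG.md diff describes comparing old and new symbol text as character streams and treating a change set as `style` when every modified symbol differs only in whitespace. The commit does not show the implementation, so the following is only a minimal sketch of that idea. `ChangeKind`, `classify_change`, and `is_style_only` are hypothetical names, and the sketch ignores all whitespace (including whitespace inside string literals), which the real classifier may handle differently.

```rust
/// Hypothetical classification result; not CommitBee's actual type.
#[derive(Debug, PartialEq, Eq)]
pub enum ChangeKind {
    WhitespaceOnly,
    Semantic,
}

/// Compare two versions of a symbol as streams of non-whitespace characters.
/// Assumption: any whitespace difference is treated as formatting.
pub fn classify_change(old_src: &str, new_src: &str) -> ChangeKind {
    let old_chars = old_src.chars().filter(|c| !c.is_whitespace());
    let new_chars = new_src.chars().filter(|c| !c.is_whitespace());
    if old_chars.eq(new_chars) {
        ChangeKind::WhitespaceOnly
    } else {
        ChangeKind::Semantic
    }
}

/// A change set counts as formatting-only when it is non-empty and every
/// (old, new) pair of modified symbols is whitespace-only.
pub fn is_style_only(modified: &[(&str, &str)]) -> bool {
    !modified.is_empty()
        && modified
            .iter()
            .all(|(old, new)| classify_change(old, new) == ChangeKind::WhitespaceOnly)
}
```

Comparing filtered iterators avoids allocating normalized copies of both symbol bodies, which matters when many symbols are classified per commit.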
CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -132,7 +132,7 @@ src/
 ├── analyzer.rs # AnalyzerService (tree-sitter queries, parallel via rayon)
 ├── context.rs # ContextBuilder (token budget)
 ├── history.rs # HistoryService (commit style learning)
-├── safety.rs # Secret scanning (25 patterns), conflict detection
+├── safety.rs # Secret scanning (24 patterns), conflict detection
 ├── sanitizer.rs # CommitSanitizer (JSON + plain text, BREAKING CHANGE footer)
 ├── splitter.rs # CommitSplitter (multi-commit detection)
 ├── template.rs # TemplateService (custom prompt templates)
@@ -192,7 +192,7 @@ src/
 ### Running Tests

 ```bash
-cargo test # All tests (340 tests)
+cargo test # All tests (367 tests)
 cargo test --test sanitizer # CommitSanitizer tests
 cargo test --test safety # Safety module tests
 cargo test --test context # ContextBuilder tests

DOCS.md

Lines changed: 5 additions & 5 deletions
@@ -175,7 +175,7 @@ think = false
 # Set to 0 to disable rename detection
 rename_threshold = 70

-# Custom secret patterns (regex). Added to the 25 built-in patterns.
+# Custom secret patterns (regex). Added to the 24 built-in patterns.
 # custom_secret_patterns = ["CUSTOM_KEY_[a-zA-Z0-9]{32}"]

 # Disable built-in secret patterns by name (case-insensitive).
@@ -452,7 +452,7 @@ If the sanitizer can't produce a valid commit message, you get a clear error exp

 ### Secret Scanning

-Before anything is sent to an LLM, CommitBee scans all staged content with **25 built-in patterns** across 13 categories:
+Before anything is sent to an LLM, CommitBee scans all staged content with **24 built-in patterns** across 13 categories:

 | Category | Patterns |
 | --- | --- |
@@ -657,7 +657,7 @@ src/
 ├── git.rs # GitService — gix for discovery, git CLI for diffs
 ├── analyzer.rs # AnalyzerService — tree-sitter parsing via rayon
 ├── context.rs # ContextBuilder — evidence flags, token budget
-├── safety.rs # Secret scanning (25 patterns), conflict detection
+├── safety.rs # Secret scanning (24 patterns), conflict detection
 ├── sanitizer.rs # CommitSanitizer + CommitValidator
 ├── splitter.rs # CommitSplitter — diff-shape + Jaccard clustering
 ├── progress.rs # Progress indicators (indicatif spinners, TTY-aware)
@@ -694,7 +694,7 @@ No panics in user-facing code paths. The sanitizer and validator are tested with

 ### Testing Strategy

-CommitBee has 340 tests across multiple strategies:
+CommitBee has 367 tests across multiple strategies:

 | Strategy | What It Covers |
 | --- | --- |
@@ -707,7 +707,7 @@ CommitBee has 340 tests across multiple strategies:
 Run them:

 ```bash
-cargo test # All 340 tests
+cargo test # All 367 tests
 cargo test --test sanitizer # Just sanitizer tests
 cargo test --test integration # LLM provider mocks
 COMMITBEE_LOG=debug cargo test -- --nocapture # With logging

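The DOCS.md hunks concern built-in and custom secret patterns, and the changelog notes that custom pattern names now use `Cow<'static, str>` instead of `Box::leak`. As a rough illustration of why `Cow` fits, the sketch below stores built-in names as borrowed `&'static str` and runtime-generated names as owned `String`s in the same field. Everything here is an assumption made for illustration: `SecretPattern`, `builtin`, `custom`, `matching_names`, and the `custom_{index}` naming scheme are hypothetical, and a literal prefix stands in for the regex a real scanner would use.

```rust
use std::borrow::Cow;

/// Hypothetical pattern descriptor; the project's real type is not shown in this commit.
#[derive(Debug, Clone)]
pub struct SecretPattern {
    /// Borrowed for built-ins, owned for user-supplied patterns (no leaked allocations).
    pub name: Cow<'static, str>,
    /// Literal prefix used as a simplified stand-in for a regex.
    pub prefix: Cow<'static, str>,
}

impl SecretPattern {
    /// Built-in pattern: both fields point at static string data.
    pub fn builtin(name: &'static str, prefix: &'static str) -> Self {
        Self { name: Cow::Borrowed(name), prefix: Cow::Borrowed(prefix) }
    }

    /// Custom pattern: the name is generated at runtime and owned by the struct.
    pub fn custom(index: usize, prefix: String) -> Self {
        Self { name: Cow::Owned(format!("custom_{}", index)), prefix: Cow::Owned(prefix) }
    }
}

/// True if any token in `line` starts with `prefix` and has at least one more character.
fn line_has_prefixed_token(line: &str, prefix: &str) -> bool {
    line.split(|c: char| !(c.is_ascii_alphanumeric() || c == '-' || c == '_'))
        .any(|tok| tok.len() > prefix.len() && tok.starts_with(prefix))
}

/// Names of all patterns that match somewhere in `line`.
pub fn matching_names<'a>(patterns: &'a [SecretPattern], line: &str) -> Vec<&'a str> {
    patterns
        .iter()
        .filter(|p| line_has_prefixed_token(line, &*p.prefix))
        .map(|p| &*p.name)
        .collect()
}
```

With this shape, dropping a custom pattern frees its name, whereas a name produced through `Box::leak` would stay allocated for the life of the process.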
PRD.md

Lines changed: 31 additions & 19 deletions
@@ -6,19 +6,20 @@ SPDX-License-Identifier: PolyForm-Noncommercial-1.0.0

 # CommitBee — Product Requirements Document

-**Version**: 4.1
+**Version**: 4.2
 **Date**: 2026-03-22
 **Status**: Active
 **Author**: [Sephyi](https://github.com/Sephyi) + [Claude Opus 4.6](https://www.anthropic.com/news/claude-opus-4-6)

 ## Changelog

 <details>
-<summary>Revision history (v3.3 → v4.1)</summary>
+<summary>Revision history (v3.3 → v4.2)</summary>

 | Version | Date | Summary |
 |---------|------------|---------|
-| 4.1 | 2026-03-22 | AST context overhaul (v0.5.0): full signature extraction from tree-sitter nodes, semantic change classification (whitespace vs body vs signature), old→new signature diffs, cross-file connection detection, formatting auto-detection via symbols. 340 tests. |
+| 4.2 | 2026-03-22 | v0.5.0 hardening: security fixes (SSRF prevention, streaming caps), prompt optimization (budget fix, evidence omission, emoji removal), eval harness (36 fixtures, per-type reporting), test coverage (15+ new tests), API hygiene (pub(crate) demotions), 5 fuzz targets. 367 tests. |
+| 4.1 | 2026-03-22 | AST context overhaul (v0.5.0): full signature extraction from tree-sitter nodes, semantic change classification (whitespace vs body vs signature), old→new signature diffs, cross-file connection detection, formatting auto-detection via symbols. 367 tests. |
 | 4.0 | 2026-03-13 | PRD normalization: aligned phases with shipped versions (v0.2.0/v0.3.x/v0.4.0), collapsed revision history, unified status markers, resolved stale critical issues, canonicalized test count to 308, removed dead cross-references. FR-031 (Exclude Files) and FR-033 (Copy to Clipboard) shipped. |
 | 3.3 | 2026-03-13 | v0.4.0 full feature completion — FR-030 (Custom Prompt Templates), FR-032 (Multi-Language), FR-036 (Tree-sitter Query Patterns), FR-057 (Additional Languages), FR-058 (History Learning), TR-006 (Eval Harness), TR-007 (Fuzzing). 308 tests. |
 | 3.2 | 2026-03-13 | FR-035 (Rename Detection), FR-037 (Expanded Secret Scanning), FR-038 (Progress Indicators). 202 tests. |
@@ -91,7 +92,7 @@ CommitBee is a Rust-native CLI tool that uses tree-sitter semantic analysis and
 | Multiple message generation (pick from N) | Common (aicommits, aicommit2) | ✅ v0.2.0 |
 | Commit splitting (multi-concern detection) | No competitor has this | ✅ v0.2.0 |
 | Custom prompt/instruction files | Growing (Copilot, aicommit2) | ✅ v0.4.0 |
-| Unit/integration tests | Non-negotiable for quality |340 tests |
+| Unit/integration tests | Non-negotiable for quality |367 tests |

 ## 3. Architecture

@@ -158,7 +159,7 @@ commitbee
 │ ├── git.rs # GitService trait + impl (async, single-diff)
 │ ├── analyzer.rs # AnalyzerService (parallel parsing via rayon)
 │ ├── context.rs # ContextBuilder (fixed budget math, fallback ladder)
-│ ├── safety.rs # Secret scanning (25 patterns, pluggable engine)
+│ ├── safety.rs # Secret scanning (24 patterns, pluggable engine)
 │ ├── sanitizer.rs # CommitSanitizer (UTF-8 safe) + CommitValidator (7 rules)
 │ ├── splitter.rs # CommitSplitter (Jaccard + fingerprinting)
 │ ├── template.rs # TemplateService (custom prompt templates)
@@ -170,16 +171,19 @@ commitbee
 │ └── anthropic.rs # Anthropic Claude
 ├── tests/
 │ ├── snapshots/ # insta snapshot files
-│ ├── fixtures/ # Test git repos, diff samples, golden semantic fixtures, eval fixtures
-│ ├── languages.rs # Feature-gated language tests
-│ ├── sanitizer.rs # Unit + snapshot + proptest
-│ ├── context.rs # Unit + snapshot
-│ ├── safety.rs # Unit + proptest
-│ ├── analyzer.rs # Unit + snapshot with fixture files
-│ ├── git.rs # Integration with tempfile repos
-│ ├── ollama.rs # Integration with wiremock
-│ └── cli.rs # CLI integration with assert_cmd
-├── fuzz/ # cargo-fuzz targets (sanitizer, safety, diff parser)
+│ ├── fixtures/ # Eval fixtures (36 scenarios), diff samples
+│ ├── helpers.rs # Shared test helpers (make_file_change, make_staged_changes)
+│ ├── context.rs # ContextBuilder, type inference, evidence, signatures, connections
+│ ├── sanitizer.rs # CommitSanitizer + CommitValidator (unit + snapshot + proptest)
+│ ├── splitter.rs # CommitSplitter grouping and merge logic
+│ ├── languages.rs # Feature-gated per-language symbol + signature extraction
+│ ├── safety.rs # Secret scanning patterns + conflict detection
+│ ├── integration.rs # LLM provider round-trips with wiremock
+│ ├── history.rs # HistoryService with tempfile git repos
+│ ├── template.rs # TemplateService custom/default templates
+│ ├── commit_type.rs # CommitType parsing and ALL sync
+│ └── eval.rs # Eval harness fixture validation (feature-gated)
+├── fuzz/ # cargo-fuzz targets (sanitizer, safety, diff parser, signature, classify_span)
 └── completions/ # Generated shell completions
 ```

@@ -438,11 +442,11 @@ Config: `learn_from_history` (default `false`), `history_sample_size` (default 5

 #### TR-006: Evaluation Harness ✅

-`commitbee eval` — runs full pipeline against fixture diffs, compares against expected snapshots. Feature-gated (`eval` feature). Fixtures in `tests/fixtures/eval/`. Pass/fail report with diff of expected vs. actual.
+`commitbee eval` — runs full pipeline against fixture diffs with assertion-based validation. Feature-gated (`eval` feature). 36 fixtures in `tests/fixtures/eval/` covering all 11 commit types, AST features (signatures, connections, whitespace classification), and edge cases. Each fixture has `metadata.toml` (assertions for type, evidence flags, prompt content, connections, breaking changes), `diff.patch`, and optional `symbols.toml` (injected CodeSymbol data). `EvalSummary` reports per-type accuracy and overall score. `run_sync()` method for integration test access.

 #### TR-007: Fuzzing ✅

-3 `cargo-fuzz` targets: `fuzz_sanitizer`, `fuzz_safety`, `fuzz_diff_parser`. `fuzz/Cargo.toml` with `libfuzzer-sys`.
+5 `cargo-fuzz` targets: `fuzz_sanitizer`, `fuzz_safety`, `fuzz_diff_parser`, `fuzz_signature`, `fuzz_classify_span`. `fuzz/Cargo.toml` with `libfuzzer-sys`.

 #### FR-031: Exclude Files ✅

@@ -466,6 +470,14 @@ Modified symbols (same name+kind+file in both HEAD and staged) are classified as

 Scans added diff lines for `symbol_name(` call patterns referencing symbols defined in other changed files. Connections displayed in new `CONNECTIONS:` prompt section (e.g., `validator calls parse() — both changed`). Capped at 5 connections to prevent prompt bloat. SYSTEM_PROMPT updated with connection-aware guidance. 1 test + 1 splitter integration test.

+#### FR-062: Security Hardening ✅
+
+Project-level `.commitbee.toml` can no longer override `openai_base_url`, `anthropic_base_url`, or `ollama_host` (SSRF/exfiltration prevention). All 3 streaming LLM providers cap `line_buffer` at `MAX_RESPONSE_BYTES` (1 MB) to prevent unbounded memory growth. `reqwest::Error` display stripped of URLs via `without_url()`. OpenAI secret pattern broadened to `sk-proj-` and `sk-svcacct-` prefixes. `Box::leak` replaced with `Cow<'static, str>` for custom secret pattern names.
+
+#### FR-063: Prompt Optimization for Small Models ✅
+
+Subject character budget accounts for `!` suffix on breaking changes. EVIDENCE section omitted when all flags are default (~200 chars saved). Symbol marker legend added to SYSTEM_PROMPT (`[+] added, [-] removed, [~] modified`). Duplicate JSON schema removed from system prompt. Emoji replaced with text labels (`WARNING:` instead of ``). CONNECTIONS instruction softened for small models. Python tree-sitter queries enhanced with `decorated_definition` support.
+
 ### 4.6 Future — v0.6.0+ (Market Leadership)

 #### FR-050: MCP Server Mode
@@ -648,7 +660,7 @@ commitbee eval # Run evaluation harness (dev, feature-ga

 ## 8. Testing Requirements

-**Current test count: 334**
+**Current test count: 367**

 ### TR-001: Unit Tests

@@ -806,7 +818,7 @@ Invalid JSON → retry once with repair prompt. Second failure → heuristic ext
 | 2 | v0.3.x | ✅ Shipped | Differentiation — heuristics, validation, spec compliance |
 | 3 | v0.4.0 | ✅ Shipped | Feature completion — templates, languages, rename, history, eval, fuzzing |
 | 4 | v0.4.x | ✅ Shipped | Remaining polish — exclude files (FR-031), clipboard (FR-033) |
-| 5 | v0.5.0 | ✅ Shipped | AST context overhaul — full signatures, semantic change classification, cross-file connections. 340 tests. |
+| 5 | v0.5.0 | ✅ Shipped | AST context overhaul — full signatures, semantic change classification, cross-file connections. 367 tests. |
 | 6 | v0.6.0+ | 📋 Planned | Market leadership — MCP server, changelog, monorepo, version bumping, GitHub Action |

 ## 12. Success Metrics

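FR-062 in the PRD.md diff states that the streaming providers cap `line_buffer` at `MAX_RESPONSE_BYTES` (1 MB) so a misbehaving server cannot grow memory without bound. The provider code is not part of this commit, so the following is only a sketch of one way such a cap could work. `LineBuffer`, `StreamError`, and `push` are hypothetical names, and the exact value of the constant (1024 × 1024 here) is an assumption.

```rust
/// Cap on buffered, not-yet-terminated stream data. The name comes from the PRD;
/// the exact value is assumed for this sketch.
pub const MAX_RESPONSE_BYTES: usize = 1024 * 1024;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum StreamError {
    LineTooLong,
}

/// Accumulates streamed chunks and yields complete lines.
#[derive(Debug, Default)]
pub struct LineBuffer {
    buf: String,
}

impl LineBuffer {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append a chunk and return every complete line it finishes.
    /// Fails (and discards the pending data) if the unterminated remainder
    /// grows past `MAX_RESPONSE_BYTES`.
    pub fn push(&mut self, chunk: &str) -> Result<Vec<String>, StreamError> {
        self.buf.push_str(chunk);
        let mut lines = Vec::new();
        // '\n' is a single byte, so draining through `pos` stays on a char boundary.
        while let Some(pos) = self.buf.find('\n') {
            let line: String = self.buf.drain(..=pos).collect();
            lines.push(line.trim_end_matches(|c| c == '\r' || c == '\n').to_string());
        }
        if self.buf.len() > MAX_RESPONSE_BYTES {
            self.buf.clear();
            return Err(StreamError::LineTooLong);
        }
        Ok(lines)
    }
}
```

Checking the cap after complete lines are drained means ordinary long responses made of many short lines pass through, while a single endless line without a newline is rejected.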