 All notable changes to CommitBee are documented here.

-## `v0.5.0` — Understand Everything (current)
-
-- **Full signature extraction** — The LLM sees `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`, not just "Function connect." Extracted from tree-sitter AST nodes using a two-strategy approach: `child_by_field_name("body")` primary, body-node-kind scan fallback. Works across all 10 languages.
-- **Signature diffs for modified symbols** — When a function signature changes, the prompt shows `[~] old_sig → new_sig` so the LLM understands exactly what was modified.
-- **Cross-file connection detection** — Detects when a changed file calls a symbol defined in another changed file. Shown as `CONNECTIONS: validator calls parse() — both changed` in the prompt.
-- **Semantic change classification** — Modified symbols are classified as whitespace-only or semantic via character-stream comparison. Formatting-only changes (cargo fmt, prettier) auto-detected as `style` type when all modified symbols are whitespace-only.
-- **Dual old/new line tracking** — `classify_span_change` correctly tracks old-file and new-file line numbers independently, handling cases where symbols shift positions due to added/removed lines above them.
-- **Token budget rebalance** — Symbol section gets 30% of budget (up from 20%) when signatures are present, since richer symbols reduce the LLM's dependency on raw diff.
-- **BODY_NODE_KINDS coverage** — Signature extraction verified across all 10 languages with dedicated tests for Java, C, C++, Ruby, and C#.
-- **Connection reliability** — Short symbol names (<4 chars) filtered to prevent false positives, short-circuit after 5 connections, sort+dedup for correctness.
-- **Fixed false positive in breaking change detection** — Modified public symbols were incorrectly counted as "removed APIs", causing spurious `breaking_change` validator violations and retry exhaustion.
+## `v0.5.0` — Beyond the Diff (current)
+
+### Semantic Analysis
+
+- **Full signature extraction** — The LLM sees `pub fn connect(host: &str, timeout: Duration) -> Result<Connection>`, not just "Function connect." Two-strategy body detection: `child_by_field_name("body")` primary, `BODY_NODE_KINDS` fallback. Works across all 10 languages.
+- **Signature diffs for modified symbols** — When a function signature changes, the prompt shows `[~] old_sig → new_sig`.
+- **Cross-file connection detection** — Detects when a changed file calls a symbol defined in another changed file. Shown as `CONNECTIONS: validator calls parse() — both changed`.
+- **Semantic change classification** — Modified symbols classified as whitespace-only or semantic via character-stream comparison. Formatting-only changes auto-detected as `style`.
+- **Dual old/new line tracking** — Correctly handles symbols shifting positions between HEAD and staged.
+- **Token budget rebalance** — Symbol section gets 30% of budget (up from 20%) when signatures present.
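Note on the semantic change classification entry above: the character-stream comparison can be pictured with a minimal sketch like the one below. It assumes the comparison simply skips every whitespace character on both sides; `ChangeKind`, `classify_change`, and `is_style_only` are hypothetical names for illustration, not CommitBee's actual API.

```rust
/// Hypothetical classification result (illustrative, not CommitBee's real type).
#[derive(Debug, PartialEq, Eq)]
pub enum ChangeKind {
    WhitespaceOnly,
    Semantic,
}

/// Compare two versions of a symbol's source as character streams,
/// ignoring every whitespace character on both sides (assumed behavior).
pub fn classify_change(old_src: &str, new_src: &str) -> ChangeKind {
    let old_chars = old_src.chars().filter(|c| !c.is_whitespace());
    let new_chars = new_src.chars().filter(|c| !c.is_whitespace());
    if old_chars.eq(new_chars) {
        ChangeKind::WhitespaceOnly
    } else {
        ChangeKind::Semantic
    }
}

/// A commit is a candidate for the `style` type only when at least one
/// symbol was modified and every modified symbol is whitespace-only.
pub fn is_style_only(changes: &[ChangeKind]) -> bool {
    !changes.is_empty() && changes.iter().all(|c| *c == ChangeKind::WhitespaceOnly)
}
```

Under these assumptions, reformatting `fn add(a:i32,b:i32)->i32{a+b}` across several lines classifies as whitespace-only, while changing a literal inside the body classifies as semantic. A comparison this simple also treats whitespace inside string literals as insignificant, which a real implementation may want to handle differently.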
+
+### Security
+
+- **Block project config URL overrides** — `.commitbee.toml` can no longer redirect `openai_base_url`, `anthropic_base_url`, or `ollama_host` to prevent SSRF/exfiltration of API keys and staged code.
+- **Cap streaming line_buffer** — All 3 LLM providers cap `line_buffer` at 1 MB to prevent unbounded memory growth from malicious servers.
+- **Strip URLs from error messages** — `reqwest::Error` display uses `without_url()` to prevent leaking configured base URLs.
+- **Demoted internal types** — `SymbolChangeType`, `GitService`, `Progress` changed from `pub` to `pub(crate)`.
+- **Added `#[non_exhaustive]`** to `SymbolChangeType` for future-safe extension.

 ## `v0.4.0` — See Everything

@@ -27,7 +55,7 @@ All notable changes to CommitBee are documented here.
 - **Multi-language commit messages** — Generate messages in any language with `--locale` flag or `locale` config (e.g., `--locale de` for German).
 - **Commit history style learning** — Learns from recent commit history to match your project's style (`learn_from_history`, `history_sample_size` config).
 - **Rename detection** — Detects file renames with similarity percentage via `git diff --find-renames`, displayed as `old → new (N% similar)` in prompts and split suggestions. Configurable threshold (default 70%, set to 0 to disable).
-- **Expanded secret scanning** — 25 built-in patterns across 13 categories (cloud providers, AI/ML, source control, communication, payment, database, cryptographic, generic). Pluggable engine: add custom regex patterns or disable built-ins by name via config.
+- **Expanded secret scanning** — 24 built-in patterns across 13 categories (cloud providers, AI/ML, source control, communication, payment, database, cryptographic, generic). Pluggable engine: add custom regex patterns or disable built-ins by name via config.
 - **Progress indicators** — Contextual `indicatif` spinners during pipeline phases (analyzing, scanning, generating). Auto-suppressed in non-TTY environments (git hooks, pipes).
 - **Evaluation harness** — `cargo test --features eval` for structured LLM output quality benchmarking.
 - **Fuzz testing** — `cargo-fuzz` targets for sanitizer and diff parser robustness.
-`commitbee eval` — runs full pipeline against fixture diffs, compares against expected snapshots. Feature-gated (`eval` feature). Fixtures in `tests/fixtures/eval/`. Pass/fail report with diff of expected vs. actual.
+`commitbee eval` — runs full pipeline against fixture diffs with assertion-based validation. Feature-gated (`eval` feature). 36 fixtures in `tests/fixtures/eval/` covering all 11 commit types, AST features (signatures, connections, whitespace classification), and edge cases. Each fixture has `metadata.toml` (assertions for type, evidence flags, prompt content, connections, breaking changes), `diff.patch`, and optional `symbols.toml` (injected CodeSymbol data). `EvalSummary` reports per-type accuracy and overall score. `run_sync()` method for integration test access.

 #### TR-007: Fuzzing ✅

-3 `cargo-fuzz` targets: `fuzz_sanitizer`, `fuzz_safety`, `fuzz_diff_parser`. `fuzz/Cargo.toml` with `libfuzzer-sys`.
+5 `cargo-fuzz` targets: `fuzz_sanitizer`, `fuzz_safety`, `fuzz_diff_parser`, `fuzz_signature`, `fuzz_classify_span`. `fuzz/Cargo.toml` with `libfuzzer-sys`.

 #### FR-031: Exclude Files ✅

@@ -466,6 +470,14 @@ Modified symbols (same name+kind+file in both HEAD and staged) are classified as

 Scans added diff lines for `symbol_name(` call patterns referencing symbols defined in other changed files. Connections displayed in new `CONNECTIONS:` prompt section (e.g., `validator calls parse() — both changed`). Capped at 5 connections to prevent prompt bloat. SYSTEM_PROMPT updated with connection-aware guidance. 1 test + 1 splitter integration test.
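Note: the scan described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `Connection`, `find_connections`, and both constants are hypothetical names, the 4-character minimum is borrowed from the changelog's "Connection reliability" entry, and the sketch truncates after sorting rather than short-circuiting during the scan.

```rust
/// Hypothetical record of one cross-file call (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Connection {
    pub caller_file: String,
    pub symbol: String,
}

/// Cap stated in the text above.
const MAX_CONNECTIONS: usize = 5;
/// Names shorter than this are skipped to limit false positives.
const MIN_SYMBOL_LEN: usize = 4;

/// `added_lines`: (file, added line text) pairs taken from the diff.
/// `symbols`: (defining file, symbol name) pairs from the changed files.
pub fn find_connections(
    added_lines: &[(&str, &str)],
    symbols: &[(&str, &str)],
) -> Vec<Connection> {
    let mut found = Vec::new();
    for (def_file, name) in symbols {
        if name.len() < MIN_SYMBOL_LEN {
            continue;
        }
        // Look for the literal call pattern `name(` in added lines.
        let pattern = format!("{name}(");
        for (file, line) in added_lines {
            // Only calls from a *different* changed file count as connections.
            if file != def_file && line.contains(pattern.as_str()) {
                found.push(Connection {
                    caller_file: file.to_string(),
                    symbol: name.to_string(),
                });
            }
        }
    }
    // Sort and dedup so repeated calls from one file collapse to one entry,
    // then apply the cap.
    found.sort();
    found.dedup();
    found.truncate(MAX_CONNECTIONS);
    found
}
```

A plain substring match like this would also fire on `reparse(` when looking for `parse(`, so a real implementation would likely check the character before the match as well.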

+#### FR-062: Security Hardening ✅
+
+Project-level `.commitbee.toml` can no longer override `openai_base_url`, `anthropic_base_url`, or `ollama_host` (SSRF/exfiltration prevention). All 3 streaming LLM providers cap `line_buffer` at `MAX_RESPONSE_BYTES` (1 MB) to prevent unbounded memory growth. `reqwest::Error` display stripped of URLs via `without_url()`. OpenAI secret pattern broadened to `sk-proj-` and `sk-svcacct-` prefixes. `Box::leak` replaced with `Cow<'static, str>` for custom secret pattern names.
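Note: to make the `line_buffer` cap concrete, here is a minimal sketch of a bounded streaming line buffer. Only the name `MAX_RESPONSE_BYTES` and the 1 MB figure come from the text. Reading 1 MB as 1024 * 1024 bytes is an assumption, and `LineBuffer`, `StreamError`, and `push_chunk` are hypothetical names for illustration.

```rust
/// Name taken from the text above; the exact value is an assumption.
pub const MAX_RESPONSE_BYTES: usize = 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum StreamError {
    /// The pending (unterminated) line grew past the configured limit.
    LineTooLong { limit: usize },
}

/// Hypothetical bounded buffer for newline-delimited streaming responses.
pub struct LineBuffer {
    buf: String,
    limit: usize,
}

impl LineBuffer {
    pub fn new(limit: usize) -> Self {
        Self { buf: String::new(), limit }
    }

    /// Append a chunk and return every complete line it finishes.
    /// Fails if the leftover partial line exceeds the limit, so a server
    /// that never sends a newline cannot grow the buffer without bound.
    pub fn push_chunk(&mut self, chunk: &str) -> Result<Vec<String>, StreamError> {
        self.buf.push_str(chunk);
        let mut lines = Vec::new();
        while let Some(pos) = self.buf.find('\n') {
            // Remove the line, including its newline, from the buffer.
            let line: String = self.buf.drain(..=pos).collect();
            lines.push(line.trim_end().to_string());
        }
        if self.buf.len() > self.limit {
            self.buf.clear();
            return Err(StreamError::LineTooLong { limit: self.limit });
        }
        Ok(lines)
    }
}
```

A provider loop would create `LineBuffer::new(MAX_RESPONSE_BYTES)` once per response and abort the stream on the first error.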
+
+#### FR-063: Prompt Optimization for Small Models ✅
+
+Subject character budget accounts for `!` suffix on breaking changes. EVIDENCE section omitted when all flags are default (~200 chars saved). Symbol marker legend added to SYSTEM_PROMPT (`[+] added, [-] removed, [~] modified`). Duplicate JSON schema removed from system prompt. Emoji replaced with text labels (`WARNING:` instead of `⚠`). CONNECTIONS instruction softened for small models. Python tree-sitter queries enhanced with `decorated_definition` support.
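Note: the subject budget adjustment can be illustrated with a small calculation. This sketch assumes a Conventional Commits header of the form `type(scope)!: subject` and takes the maximum first-line length as a parameter. `subject_budget` is a hypothetical helper, not a function from the codebase.

```rust
/// Characters left for the subject after the `type(scope)!: ` prefix.
/// Hypothetical helper; the header layout is an assumption.
pub fn subject_budget(
    max_first_line: usize,
    commit_type: &str,
    scope: Option<&str>,
    breaking: bool,
) -> usize {
    // A scope costs its own length plus the two parentheses.
    let scope_len = scope.map_or(0, |s| s.chars().count() + 2);
    // The `!` marker costs one character on breaking changes.
    let bang = usize::from(breaking);
    // The trailing ": " separator costs two characters.
    let prefix = commit_type.chars().count() + scope_len + bang + 2;
    max_first_line.saturating_sub(prefix)
}
```

With an assumed 72-character first line, `feat(api): ` leaves 61 characters for the subject and `feat(api)!: ` leaves 60, so a budget computed without the `!` would let the header overflow by one character.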