fix(pr-analyze): area-via-git-diff + skip-uncached-cross-PR by coseto6125 · Pull Request #390 · coseto6125/egent-code-plexus

coseto6125 · 2026-05-23T11:37:28Z

Unblocks T14 round-2 routing test (PRs #386/#387/#388). See commit body for both bug root-causes.

…lock) T14 round-2 routing test (PRs #386 / #387 / #388) surfaced two bugs: 1. Comment-only and docs PRs analyzed with area=null even when the only changed path was a clear area match (README.md, parser comment, cli WHY-comment). `ImpactJson::changed_files()` derived from `changed_symbols[].filePath`, but `ecp impact` returns 0 changed_symbols for comment-only diffs (no symbol semantically changed). Empty changed_files → `classify_area(&[]) → None` → no `ecp:area-*` label → wrong Mergify queue. Pull the diff file list from git directly via a new `git_diff_files(baseline, pr_head)` helper that runs `git diff --name-only $baseline..$pr_head`. Now docs PRs route to docs-only queue and parser-comment PRs route to parser-changes. The unused `ImpactJson::changed_files` is removed; the `file_path` field on `ChangedSymbol` stays for future consumers and to keep the deserializer happy. 2. Cross-PR conflict deadlock: when 3 PRs pushed in parallel, each `analyze` run reads the other two's caches before they've been written. Old code reported each missing cache as a conflict (`__pending_analysis__` sentinel), so all 3 PRs ended up with `ecp/cross-pr-conflict=pending` and Mergify queued none. Spec actually said missing-cache should fall through to "let Mergify speculative trial catch real conflicts" — the conservative implementation was inconsistent with the design. Switch the `None => …` arm to `continue` so uncached siblings are skipped, not blocked. Real symbol overlap on cached siblings still reports normally. Updates `cross_pr_conflict_missing_cache_is_conservative` test (renamed `_skipped`) to assert the new behaviour, and tightens `cross_pr_conflict_excludes_self` to use cached impact (so it actually exercises the overlap path rather than the now-skipped missing-cache path).

github-actions · 2026-05-23T11:49:04Z

["run","cross_pr_conflict_disjoint","cross_pr_conflict_overlap","cross_pr_conflict_missing_cache_skipped","cross_pr_conflict_excludes_self"]

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

PR #390 changed `run()` to use `git_diff_files()` for area classification (comment-only / whitespace-only diff coverage), which left `ImpactJson::changed_files` without an in-crate caller. The struct-level `#[allow(dead_code)]` doesn't cascade to methods, so clippy `-D warnings` in the pre-push hook flags this method. The method itself is intentionally retained as a library-API derived view alongside `impact_set_names` / `changed_symbol_names` — see the doc comment for the rationale. Add a method-local `#[allow(dead_code)]` so the deferred-API status is encoded in the source rather than relying on the struct attribute (which a reader could miss).

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

PR #390 changed `run()` to use `git_diff_files()` for area classification (comment-only / whitespace-only diff coverage), which left `ImpactJson::changed_files` without an in-crate caller. The struct-level `#[allow(dead_code)]` doesn't cascade to methods, so clippy `-D warnings` in the pre-push hook flags this method. The method itself is intentionally retained as a library-API derived view alongside `impact_set_names` / `changed_symbol_names` — see the doc comment for the rationale. Add a method-local `#[allow(dead_code)]` so the deferred-API status is encoded in the source rather than relying on the struct attribute (which a reader could miss).

…de (6 atomic commits) (#385) * test(cypher): pin executor row-width invariant (FU-2026-05-23-040) The CLI's `cypher::build_payload` collapses single-column rows to flat scalars and falls back to `Value::Null` when a row is empty. The fallback is intentionally tolerant — but if the executor ever violates `row.len() == columns.len()` (the implicit projection contract), the silent null would mimic a legitimate OPTIONAL MATCH miss. Pin the invariant on the executor side so future projection refactors can't silently regress: - `execute_inner` carries a `debug_assert!` before returning that every emitted row has `row.len() == columns.len()`. Catches violations in dev / test builds. - `row_width_matches_columns_for_all_projection_shapes` covers the three representative shapes — single-column, multi-column, and aggregation (COUNT). All 46 existing executor tests continue to pass since the invariant was already true; the assertion now codifies it. * refactor(pr-analyze): 3 minor cleanups (FU-2026-05-23-043) (1) Eliminate `changed_files_pb: Vec<PathBuf>` redundant state — was only consumed by `classify_area`. Generalised `classify_area` to `<P: AsRef<Path>>(&[P])` so callers can pass `&[String]` directly. (2) Make the non-JSON output format path an explicit error instead of a catch-all stderr fallback. `pr-analyze` output is consumed by Mergify CI hooks which require JSON; silently routing `--format llm/toon/text` to a one-line summary misleads anyone testing the flag and risks hiding bugs in the JSON serialisation path. (3) `ImpactJson::changed_files` / `impact_set_names` now hold the dedup `HashSet<&str>` against the borrowed source field instead of cloning into the set on every insert. The final `.map(...).clone()` only fires for items that survived the dedup filter, so the total clone count drops from `2 × n` to `1 × (unique_count)`. Tests: 14 pr_analyze unit tests + integration tests pass. classify_area signature change is callable from `[PathBuf]` / `[String]` alike via `AsRef<Path>` blanket impl. * refactor(shape-check): widen render_text to consume typed counters (FU-2026-05-23-038) `render_text` was reaching back into the just-serialised JSON value to read `unparseable_fetches` / `unknown_target_shapes_total` / `unknown_target_shapes_truncated` — a leaky abstraction that round-trips typed `u64` / `bool` through `serde_json::Value::as_u64()` / `as_bool()` after `build_payload_with_hints` already held the same values in proper typed form. Extend `ShapeCheckHints` with the three counter fields and pass `&hints` into `render_text` alongside the JSON value. The JSON payload still carries the same keys for the JSON/Toon output paths (unchanged contract); text rendering now bypasses the JSON detour for fields it already knew about. Tests: shape_check_cmd / _ab / _route_filter / _build_payload — all 10 passes preserved. * refactor(tests): hoist write + init_and_analyze into common module (FU-2026-05-23-037) 18 integration-test fixtures redefined the same three helpers verbatim (`ecp_bin`, `write`, `init_and_analyze`) on every file. `tests/common/mod.rs` already housed `ecp_bin` + `write_graph` + `run_git`; promote the remaining two helpers there and switch all 18 fixtures to consume from `common::*`. Two variants existed for `init_and_analyze`: - canonical (blank lines between git arg groups) — 13 files - compact (no blank lines) — 5 of the inspect_*_qualified_calls fixtures Both produced identical behaviour (single-commit main branch + `ecp admin index --repo .` with HOME=tempdir). The canonical form is what now lives in common; per-file copies removed. Files migrated: - 6 silent_drop_*.rs (silent-drop audit fixtures from PR #378) - 5 inspect_*_qualified_calls.rs - inspect_cmd / inspect_fqn / inspect_enum_variants_and_decorators - class_membership_inspect / context_cmd / impact_fqn / imports_edge_inspect All 18 fixtures (64 sub-tests across them) pass. Per-file `setup_repo*` helpers that build different git state (branches, remotes, multi-commit) remain unchanged — common's doc comment already explained when to keep them per-file. * feat(path-literal): sink-override for ext-change callees (FU-2026-05-23-024) `File::new(...).with_extension("json")` / `path.with_file_name("session.toml")` calls produce `string_literal` nodes whose values (`"json"`, `"toml"`) are too short / too plain to pass `is_path_shaped`. They were silently dropped from the path-literal index, so `ecp impact --literal config.json` (or any value-grouped query) saw an incomplete sink set. The classifier always had `with_extension` / `with_file_name` mapped to `(ExtChange, High)`, but the gate was upstream of it. Resolution: - New helper `is_ext_change_callee(callee: Option<&str>) -> bool` recognises HIGH-confidence ext-change names (`with_extension`, `with_file_name`, `set_extension`, `set_file_name`) using the same `trailing_ident` trim as `classify_sink`. - Per-lang extractors (14 languages) reordered: pull `callee` BEFORE the `is_path_shaped` check, then accept the literal when either the shape check passes OR the callee is an ext-change op. - Conservative: bare extension strings appearing in unknown callees (`format_picker("json")`) still get filtered by `is_path_shaped` so option/error tokens don't pollute the index. Tests: - New `is_ext_change_callee_recognises_canonical_names` + `classify_sink_extchange_high_for_ext_change_names` unit tests. - All 14 per-language `*_path_literals.rs` integration tests pass unchanged (46 sub-tests total) — no regression on the existing is-path-shaped path. * chore(tests): drop unused Path import in 6 silent_drop_* fixtures Leftover from the FU-037 fixture-helpers migration — the fixtures previously took `Path` as a parameter for the inline `init_and_analyze` and `write` helpers. Now that both live in `common::*`, the per-file import is dead. clippy flagged it; remove. * feat(path-literal): Kotlin chained-call sink promotion (FU-2026-05-23-023, kotlin only) `File("session_meta.json").readText()` / `.writeText(data)` chains used to classify as `sink:join|medium` because the immediate parent of the string literal is the constructor (`File`), not the read/write terminal. The LLM consumer couldn't tell read-only sinks from write-only ones — exactly the split-brain detection use case PathLiteral is built to support. Resolution (Kotlin only — Swift / Python / Rust deferred to follow-ups): - `enclosing_callee` now walks outward through `value_arguments → call_suffix → call_expression`. When the inner call is wrapped in a `navigation_expression` that's the receiver of an outer call_expression, the outer call's method name is treated as the terminal sink if it's in the high-confidence read/write whitelist (`readText` / `readBytes` / `readLines` / `writeText` / `writeBytes` / `appendText` / `appendBytes`). - `classify_sink` HIGH writes list extended with `writeText`, `writeBytes`, `appendText`, `appendBytes` to make the promotion actually classify. `readText` / `read_text` were already present. Fixture updates (`kotlin_path_literals.rs`): - `free_function_with_read_sink_classified_as_read_high` now asserts `sink:read|confidence:high` (was P2-deferred). - `method_in_class_with_write_sink_resolves_owner` asserts `sink:write|confidence:high`. Per-language Swift / Python / Rust / C++ AST shapes differ; each deserves a separate commit. Tracked under FU-2026-05-23-023. * feat(path-literal): Python chained-call sink promotion (FU-2026-05-23-023, python) Mirrors the Kotlin chain promotion from commit d052333 for Python's `pathlib.Path` idiom: ```python Path("config.json").read_text() # was: sink:join|medium Path("output.json").write_text(data) # was: sink:join|medium ``` After promotion both classify as `sink:{read,write}|confidence:high`. Python AST chain shape (from spike): ``` call (outer — read_text()) attribute (Path("x.json").read_text) call (inner — Path(...)) ← immediate parent of argument_list identifier "Path" argument_list string "x.json" ← str_node "." identifier "read_text" ← last identifier child of attribute argument_list () ``` Differs from Kotlin: `navigation_expression` → `attribute`; no `navigation_suffix` wrapper; method name is the last identifier child of the attribute node. classify_sink HIGH list extended with the snake_case pathlib equivalents of the Kotlin camelCase entries (added in commit d052333): - HIGH reads: `read_bytes` (already had `read_text`) - HIGH writes: `write_text`, `write_bytes` Tests: `pathlib_chain_read_text_classified_high` + `pathlib_chain_write_text_classified_high`. Full ecp-analyzer suite passes (no regression). * feat(path-literal): Swift labelled-arg sink promotion (FU-2026-05-23-023, swift) Swift's path-I/O idiom routes through *argument labels*, not chained method names like Kotlin / Python. Two shapes covered: ```swift let s = try String(contentsOfFile: "config.json") // labelled-arg constructor try "x".write(toFile: "output.json", atomically: ..., encoding: ...) // labelled-arg method ``` In both, the literal sits inside a `value_argument` whose label (`contentsOfFile`, `toFile`) carries the file-op semantics, not the function callee (`String` / `write`). Pre-existing latent bug: `child_by_field_name("function")` returned None unconditionally because the Swift grammar has no named `function` field, so every Swift path literal was emitting `sink:free|confidence:high` regardless of context. Resolution: - `enclosing_callee` walks up via `child(0)` (which actually works) and tracks the nearest enclosing `value_argument`. When the argument label is in the HIGH-confidence whitelist (`contentsOfFile` / `contentsOf` / `toFile`), prefer it over the function callee text. Applies whether the function is a bare `simple_identifier` (constructor) or a `navigation_expression` (method chain). - `is_high_confidence_label` whitelist matches `classify_sink`'s new Swift entries below; spurious match on non-file labels avoided. classify_sink HIGH list extended: - HIGH reads: `contentsOfFile`, `contentsOf` (String / NSString / Data initialisers) - HIGH writes: `toFile` (String / Data `write(toFile:...)`) Tests: - `function_with_read_sink` — assertion tightened from no-sink-check to `sink:read|confidence:high`. - `labelled_arg_constructor_contentsoffile` (new). - `navigation_chain_write_tofile` (new). Full ecp-analyzer suite continues passing (no regression). * feat(path-literal): Rust chained-call sink promotion (FU-2026-05-23-023, rust) Closes the 4th and final lang for FU-023. Mirrors the Kotlin / Python chain promotion pattern for Rust's `File::open + chain` idiom: ```rust File::open("session_meta.json").unwrap().read_to_string(&mut s).unwrap(); File::create("output.json").unwrap().write_all(data).unwrap(); ``` Both used to surface as `sink:open-read|medium` / `sink:open-write|high` (stuck on the constructor leg); now promote to `sink:read|high` and `sink:write|high` respectively. Rust AST chain shape (from spike): ``` string_literal ↑ arguments ↑ call_expression (inner: File::open(...)) ↑ field_expression (.unwrap) ← adapter step ↑ call_expression (.unwrap()) ↑ field_expression (.read_to_string) ← terminal step ↑ call_expression (.read_to_string(...)) ← TERMINAL ↑ ... (further .unwrap()/.expect() ignored) ``` Differs from Kotlin / Python — Rust uses `field_expression` (with `field_identifier` as the method name) instead of Kotlin's `navigation_expression` / Python's `attribute`. `.unwrap()` / `.expect()` / `.ok()` etc. are real chain links rather than implicit in the language; the walk steps THROUGH them via the new `is_transparent_adapter` helper rather than terminating. Whitelist: `read_to_string`, `read_to_end`, `read_text`, `write_all`, `write_text`, plus the four `with_*_extension` / `set_*_file_name` ext-change names already used by FU-024. No new entries needed in `classify_sink` — Rust idioms were already covered there. Tests: `file_open_chain_with_unwrap_promotes_to_read_high` + `file_create_chain_write_all_promotes_to_write_high`. Full ecp-analyzer suite passes (12 / 12 in rust_path_literals; no regression). * feat(path-literal): C++ chained-call sink promotion (FU-2026-05-23-023, c++) C++ chain idiom turns out structurally identical to Rust: `call_expression > field_expression > call_expression` with a `field_identifier` carrying the method name. Earlier judgement that "constructor temporaries can't be chained" was wrong — C++11+ permits rvalue method calls, and `std::filesystem::path(...).replace_extension(...)` is a real C++17 idiom (not just a theoretical chain). ```cpp std::filesystem::path("config.json").replace_extension("toml") // ^ "config.json" ^ "toml" // inner ctor terminal ext-change ``` Without promotion, "config.json" classified as `sink:join|medium` (constructor) and "toml" was dropped entirely (too short for `is_path_shaped`). After promotion both reach `sink:ext-change|high`: - chain walk promotes "config.json" to `replace_extension` - FU-024 `is_ext_change_callee` sink-override accepts the short "toml" because the call is a known ext-change op `classify_sink` ExtChange list + `is_ext_change_callee` whitelist extended: - `replace_extension` (C++17 `std::filesystem::path::replace_extension`) - `replace_filename` (mirror — `replace_filename` is the path-component equivalent of `set_file_name`) Stream `ifstream/ofstream` `.read()` / `.write()` intentionally not promoted — they're MEDIUM-confidence by design (overloaded with non-file stream IO). Tests: `filesystem_path_chain_replace_extension_promotes_to_ext_change_high` asserts both the primary-path-shaped literal AND the short ext literal surface as `sink:ext-change|confidence:high`. 14-lang path_literals fixtures still pass — `replace_extension` was previously absent from the classifier, so no existing test exercised the (now-classified) name. * chore(pr-analyze): allow dead_code on ImpactJson::changed_files PR #390 changed `run()` to use `git_diff_files()` for area classification (comment-only / whitespace-only diff coverage), which left `ImpactJson::changed_files` without an in-crate caller. The struct-level `#[allow(dead_code)]` doesn't cascade to methods, so clippy `-D warnings` in the pre-push hook flags this method. The method itself is intentionally retained as a library-API derived view alongside `impact_set_names` / `changed_symbol_names` — see the doc comment for the rationale. Add a method-local `#[allow(dead_code)]` so the deferred-API status is encoded in the source rather than relying on the struct attribute (which a reader could miss).

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

coseto6125 added the ci:run fire CI test matrix on this PR label May 23, 2026

coseto6125 enabled auto-merge (squash) May 23, 2026 11:37

github-actions Bot added ecp:area-cli ecp signal ecp:risk-low ecp signal labels May 23, 2026

coseto6125 merged commit c75f3e4 into main May 23, 2026
15 of 22 checks passed

coseto6125 deleted the fix/pr-analyze-area-race branch May 23, 2026 11:55

coseto6125 added a commit that referenced this pull request May 23, 2026

chore(test): Area::to_kebab doc comment (T14 routing test v2)

51fd3d9

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

coseto6125 added a commit that referenced this pull request May 23, 2026

chore(test): Area::to_kebab doc comment (T14 routing test v2)

6155602

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

coseto6125 added a commit that referenced this pull request May 23, 2026

chore(test): Area::to_kebab doc comment (T14 routing test v2)

fb37a42

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

coseto6125 added a commit that referenced this pull request May 23, 2026

chore(test): Area::to_kebab doc comment (T14 routing test v2)

f5c4f94

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

coseto6125 added a commit that referenced this pull request May 25, 2026

chore(test): Area::to_kebab doc comment (T14 routing test v2) (#387)

47883bb

Replaces previous test commit that conflicted with #390 refactor of __pending_analysis__ region. Same purpose: trigger area-cli routing.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(pr-analyze): area-via-git-diff + skip-uncached-cross-PR#390

fix(pr-analyze): area-via-git-diff + skip-uncached-cross-PR#390
coseto6125 merged 1 commit into
mainfrom
fix/pr-analyze-area-race

coseto6125 commented May 23, 2026

Uh oh!

github-actions Bot commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

coseto6125 commented May 23, 2026

Uh oh!

github-actions Bot commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant