Conversation
…, set_formula, and add_sheet operations; add comprehensive tests for patch functionality and validation.
…tured error handling
…fy output handling
…ce patch functionality with output directory creation and error handling
…ifications for dry_run, inverse_ops, and formula checks
- Added new patch operations: set_range_values, fill_formula, set_value_if, and set_formula_if. - Introduced FormulaIssue model for tracking formula-related issues during patching. - Implemented preflight formula checks to validate formulas before applying patches. - Enhanced PatchOp model to include new fields for range and base_cell. - Updated run_patch function to support dry_run and return_inverse_ops options. - Added tests for new operations and validation scenarios, ensuring correct behavior and error handling.
…update conflict policy to overwrite
…s and implement related utilities and tests
…d specifications and tests
…ol default as true
… lists and JSON strings
…essages, and support alpha column keys
- Introduced new tools for reading ranges, specific cells, and formulas from extracted JSON data. - Implemented input and output models for each tool, including validation. - Enhanced server registration to support the new tools. - Updated existing tools and models to accommodate the new functionalities. - Added comprehensive tests for the new tools and their interactions with the server. - Created a new sheet_reader module to handle reading logic for ranges, cells, and formulas.
…read tools, and improvements
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds MCP Excel patching (exstruct_patch) and A1-oriented direct-read tools, alpha-column key support with merged-range reporting, extensive MCP modules (patch runner, sheet reader, tools/server), CLI/engine alpha_col wiring, many tests, and v0.4.4 docs and release notes. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant MCPServer as MCP Server
participant PatchRunner as Patch Runner
participant Backend as Excel Backend
participant FS as Filesystem
Client->>MCPServer: exstruct_patch(xlsx_path, ops, flags)
MCPServer->>MCPServer: _coerce_patch_ops(normalize JSON ops)
MCPServer->>PatchRunner: run_patch(PatchRequest)
PatchRunner->>FS: resolve path & check extension
PatchRunner->>PatchRunner: validate ops & select backend
alt dry_run
PatchRunner->>PatchRunner: simulate operations (no save)
PatchRunner-->>MCPServer: PatchResult (simulated diff)
else execute
PatchRunner->>Backend: apply operations in order
alt success
Backend->>FS: save workbook
PatchRunner-->>MCPServer: PatchResult (out_path, patch_diff)
else failure
PatchRunner->>PatchRunner: rollback/no write
PatchRunner-->>MCPServer: PatchResult (error, details)
end
end
MCPServer-->>Client: PatchToolOutput
sequenceDiagram
participant Client
participant MCPServer as MCP Server
participant SheetReader as Sheet Reader
participant FS as Filesystem
participant JSON as JSON Parser
Client->>MCPServer: exstruct_read_range(out_path, sheet, range, flags)
MCPServer->>SheetReader: read_range(ReadRangeRequest)
SheetReader->>FS: read JSON file
JSON->>SheetReader: parse WorkbookData
SheetReader->>SheetReader: select sheet & parse A1 range
SheetReader->>SheetReader: build maps & filter cells
SheetReader-->>MCPServer: ReadRangeResult (cells, warnings)
MCPServer-->>Client: ReadRangeToolOutput
Estimated Code Review Effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly Related PRs
Suggested Labels
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@docs/README.ja.md`:
- Around line 72-78: The tools list is missing the three read tools introduced
in v0.4.4; add bullet entries for `exstruct_read_range`, `exstruct_read_cells`,
and `exstruct_read_formulas` into the same MCP tools list near
`exstruct_extract`, `exstruct_patch`, `exstruct_read_json_chunk`, and
`exstruct_validate_input`, and include a short note (in Japanese) that these
were added in v0.4.4 and are registered/tested on the server; keep the existing
MCP remark about `exstruct_extract`'s `options.alpha_col=true` unchanged.
In `@src/exstruct/mcp/patch_runner.py`:
- Around line 534-554: The PatchErrorDetail is hardcoding op_index=0 for
preflight formula failures; update the logic in the preflight block (the code
that constructs PatchErrorDetail before returning PatchResult) to locate the
actual op index that modified the failing sheet/cell by scanning request.ops for
the first op that targets issue.sheet/issue.cell and set op_index to that index
(or set op_index to a sentinel like -1 if no matching op is found or request.ops
is empty); also set the op field accordingly (use the matching op's op name or
fallback "set_value") so consumers get accurate location info.
- Around line 530-533: The warning about "openpyxl editing may drop
shapes/charts or unsupported elements." is appended unconditionally; change the
logic in patch_runner.py so that this warning is only added when not in dry_run
mode (i.e., when actual saving can occur). Locate the block that appends to
warnings (the list named warnings) and the call to
_append_skip_warnings(warnings, diff) and wrap or gate the warnings.append(...)
with a check of the dry_run flag (or equivalent parameter/state used by the
patch runner) so dry-run previews do not receive that warning.
In `@tests/com/test_charts_extraction.py`:
- Around line 14-15: The contextmanager generator `_excel_app` is incorrectly
annotated as returning `xw.App`; change its return annotation to
`Iterator[xw.App]` (from `collections.abc`) and add the corresponding import,
and apply the same fix for the `_excel_app` in
`tests/com/test_shapes_extraction.py` (line 15) so both contextmanager
generators are typed as `Iterator[xw.App]`.
In `@tests/com/test_shapes_extraction.py`:
- Around line 15-16: The `@contextmanager-decorated` generator _excel_app is
incorrectly annotated as returning xw.App; change its return annotation to
Iterator[xw.App] (and import typing.Iterator if not already) so the signature
reads def _excel_app() -> Iterator[xw.App]:; apply the same fix for the
identical `@contextmanager` function in the other test file
(test_charts_extraction.py) to satisfy mypy strict mode.
🧹 Nitpick comments (13)
tests/com/test_charts_extraction.py (1)
14-26: Duplicated_excel_apphelper across test modules.This context manager is identical to the one in
tests/com/test_shapes_extraction.py. Consider extracting it into a sharedconftest.pyfixture (or a common test utility module) to avoid maintaining the same code in multiple places.tests/com/test_shapes_extraction.py (1)
129-153: Missing docstring — inconsistent with the other test functions in this file.
test_shapes_basicandtest_line_directionboth have Google-style docstrings, but this one does not.As per coding guidelines, "Add Google-style docstrings to all functions and classes".
tests/conftest.py (1)
18-20: Substring match"com" in markexprmay false-positive on markers like"command"or"compare".Consider a more precise check, e.g. a regex word boundary or splitting on known delimiters.
♻️ Safer marker check
+import re + def pytest_configure(config: pytest.Config) -> None: """Register custom markers to avoid pytest warnings.""" markexpr = getattr(config.option, "markexpr", "") or "" - if "com" in markexpr: + if re.search(r'\bcom\b', markexpr): os.environ.pop("SKIP_COM_TESTS", None)src/exstruct/mcp/sheet_reader.py (2)
414-436: Column conversion utilities duplicate logic already inmodels/__init__.py.
_alpha_to_col(1-based) and_col_to_alpha(1-based) reimplement the same bijective base-26 math ascol_index_to_alpha(0-based) insrc/exstruct/models/__init__.py. The only difference is whether the input is 0-based or 1-based. Consider reusing one in terms of the other (e.g.,_col_to_alpha(col) = col_index_to_alpha(col - 1)) to keep a single source of truth for the conversion logic.Also applies to: 425-456
152-195:read_cellsadds all requested addresses toitemsincluding missing ones — document this intentional behavior.A cell that has
value is None and formula is Noneis both appended toitemsand added tomissing_cells. This is likely intentional (caller sees full results + a separate missing list), but it's a subtle contract. A brief docstring note would help.tests/mcp/test_extract_alpha_col.py (1)
8-24: Consider adding a test foralpha_col=False.The tests cover default (
True) and explicitTrue, but don't verify thatalpha_colcan be explicitly set toFalse. Since the MCP default differs from the CLI default, testing the opt-out path is valuable.🧪 Suggested additional test
def test_extract_options_alpha_col_false() -> None: """alpha_col should be settable to False to opt out.""" opts = ExtractOptions(alpha_col=False) assert opts.alpha_col is Falsesrc/exstruct/models/__init__.py (1)
556-564:_merged_items_to_a1_rangesalways emitsstart:endeven for single-cell ranges.When
r1 == r2andc1 == c2, this produces"A1:A1"rather than just"A1". The sister function_format_rangeinsheet_reader.pycollapses single-cell ranges. This inconsistency is harmless (single-cell merges are unusual) but worth noting if you want consistent A1 notation across modules.tests/engine/test_engine_alpha_col.py (1)
8-8: Use publicpytest.MonkeyPatchinstead of private_pytest.monkeypatch.Other test files in this PR (e.g.,
test_tools_handlers.py) consistently usepytest.MonkeyPatch. Importing from_pytestis relying on a private API that could change without notice.Suggested fix
-from _pytest.monkeypatch import MonkeyPatch +import pytestThen update the type hints in the test signatures:
-def test_engine_alpha_col_false(monkeypatch: MonkeyPatch, tmp_path: Path) -> None: +def test_engine_alpha_col_false(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:-def test_engine_alpha_col_true(monkeypatch: MonkeyPatch, tmp_path: Path) -> None: +def test_engine_alpha_col_true(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:-def test_engine_serialize_alpha_col_includes_merged_ranges( - monkeypatch: MonkeyPatch, tmp_path: Path +def test_engine_serialize_alpha_col_includes_merged_ranges( + monkeypatch: pytest.MonkeyPatch, tmp_path: Pathdocs/mcp.md (2)
93-113: Multiple JSON objects in a single code block aren't valid JSON.Each tool call example is a separate JSON object, but placing them sequentially in one code block produces invalid JSON. Consider separating them into individual code blocks or wrapping in an array for clarity.
57-69: Same issue in the quick-start example — multiple bare JSON objects in one block.tests/mcp/test_server.py (1)
220-336: Consider extracting shared fake runner setup into a fixture.The fake
run_*function definitions and monkeypatch wiring are repeated nearly identically acrosstest_register_tools_uses_default_on_conflict,test_register_tools_passes_read_tool_arguments,test_register_tools_accepts_patch_ops_json_strings,test_register_tools_rejects_invalid_patch_ops_json_strings,test_register_tools_passes_patch_default_on_conflict, andtest_register_tools_passes_patch_extended_flags. A shared pytest fixture or helper that sets up theDummyApp+ monkeypatched fakes and returns acallsdict would significantly reduce the boilerplate (~60 repeated lines per test).src/exstruct/mcp/patch_runner.py (2)
1063-1107: Formula scanning bypasses protocol abstractions.
_collect_formula_issues_openpyxlusesgetattr(sheet, "iter_rows", None)andgetattr(cell, "coordinate", "")which are not declared inOpenpyxlWorksheetProtocolorOpenpyxlCellProtocol. While the function is explicitly openpyxl-specific andgetattrusage is defensive, this is a deviation from the Protocol abstraction used elsewhere.Also note: for very large workbooks, scanning every cell in every sheet may be slow since it's not restricted to the sheets/cells affected by the patch ops. Consider scoping the scan to only sheets referenced by the ops.
1136-1198: Xlwings backend only supportsadd_sheet,set_value,set_formula.The xlwings op handler raises
ValueError("Unsupported op: ...")for extended op types (set_range_values,fill_formula,set_value_if,set_formula_if). This is correct by design since_requires_openpyxl_backendroutes those to openpyxl, but there's no safeguard preventing future code paths from accidentally reaching here with extended ops. A debug-level log or comment documenting this intentional limitation would aid future maintainers.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 69c20ced98
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/exstruct/mcp/patch_runner.py
Outdated
| if input_path.suffix.lower() == ".xls": | ||
| raise ValueError("openpyxl cannot edit .xls files.") | ||
|
|
||
| workbook = load_workbook(input_path) |
There was a problem hiding this comment.
Preserve VBA when patching .xlsm with openpyxl
.xlsm workbooks are loaded with load_workbook(input_path) but keep_vba=True is never set, so when this backend saves the patched file (for example on non-Windows/COM-unavailable environments or when extended flags force openpyxl), the VBA project is dropped and macro-enabled files are silently corrupted.
Useful? React with 👍 / 👎.
src/exstruct/mcp/patch_runner.py
Outdated
| if skipped: | ||
| return PatchResult( |
There was a problem hiding this comment.
Honor dry-run even when conflict policy is skip
The conflict-policy short-circuit executes before any patch simulation, so dry_run=True combined with on_conflict='skip' returns immediately with an empty diff whenever the target output already exists. This prevents users from getting the preview data that dry-run is supposed to provide, even though no file write would occur.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Adds Excel editing and targeted read capabilities to the ExStruct MCP server, plus alpha-column (A, B, …) output support and documentation/test updates to support agent workflows.
Changes:
- Introduces
exstruct_patch(atomic Excel patch ops with dry-run/inverse ops/formula checks) and new direct-read tools (exstruct_read_range,exstruct_read_cells,exstruct_read_formulas). - Adds
alpha_colsupport across engine/CLI/MCP extraction plusmerged_rangesfor alpha-column mode. - Updates MCP docs/release notes and expands automated test coverage for new behaviors.
Reviewed changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/models/test_alpha_col.py | Tests for alpha-column conversions and collision handling. |
| tests/mcp/test_tools_handlers.py | Verifies MCP tool handlers build correct internal requests for new tools. |
| tests/mcp/test_tool_models.py | Validates new MCP tool input model defaults/validation. |
| tests/mcp/test_sheet_reader.py | Coverage for new A1-based sheet JSON read utilities. |
| tests/mcp/test_server.py | Tests MCP server tool registration and patch ops coercion. |
| tests/mcp/test_patch_runner.py | End-to-end tests for patch operations and safety flags. |
| tests/mcp/test_extract_alpha_col.py | Ensures MCP extract defaults alpha_col=True. |
| tests/mcp/test_chunk_reader.py | Tests improved chunk-reader errors + alpha-column filter.cols compatibility. |
| tests/engine/test_engine_alpha_col.py | Ensures engine and serialization honor alpha_col and merged range output. |
| tests/conftest.py | Test harness changes for COM gating and render prerequisites. |
| tests/com/test_shapes_extraction.py | Improves Excel app lifecycle handling in COM shape tests. |
| tests/com/test_charts_extraction.py | Improves Excel app lifecycle handling in COM chart tests. |
| tests/cli/test_cli_alpha_col.py | CLI coverage for --alpha-col. |
| src/exstruct/models/init.py | Adds alpha-column conversion utilities + merged_ranges field. |
| src/exstruct/mcp/tools.py | Adds tool I/O models and handlers for patch + direct-read tools. |
| src/exstruct/mcp/sheet_reader.py | New module: A1-based reads from extracted JSON. |
| src/exstruct/mcp/server.py | Registers new MCP tools and adds patch-ops coercion helpers. |
| src/exstruct/mcp/patch_runner.py | New module: patch execution (openpyxl + optional COM path) with safety features. |
| src/exstruct/mcp/extract_runner.py | Adds MCP-level alpha_col option defaulting to true. |
| src/exstruct/mcp/chunk_reader.py | Better error messages and alpha-column support for filter.cols. |
| src/exstruct/mcp/init.py | Re-exports new patch/read APIs and tool models. |
| src/exstruct/engine.py | Adds StructOptions.alpha_col and applies workbook key conversion post-extract. |
| src/exstruct/cli/main.py | Adds --alpha-col flag and passes through to processing. |
| src/exstruct/init.py | Public API adds alpha_col option; exports alpha conversion helpers. |
| pyproject.toml | Version bump to 0.4.4. |
| docs/release-notes/v0.4.4.md | New release notes for v0.4.4. |
| docs/mcp.md | Expanded MCP docs: new tools, workflows, chunking guidance. |
| docs/agents/TASKS.md | Updates task tracking for new MCP features. |
| docs/agents/RAG_INTEGRATION.md | Removed legacy RAG integration doc. |
| docs/agents/FEATURE_SPEC.md | Major expansion to include patch/read tools, alpha_col, chunking guidance. |
| docs/agents/API_DESIGN.md | Removed legacy API design doc. |
| docs/README.ja.md | MCP tool list updated + alpha_col default note. |
| docs/README.en.md | MCP tool list updated + alpha_col default note. |
| README.md | MCP tool list updated + alpha_col default note. |
| CHANGELOG.md | Adds 0.4.4 entry describing new features and changes. |
| AGENTS.md | Documentation cleanup and updated AI workflow guidance. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| markexpr = getattr(request.config.option, "markexpr", "") or "" | ||
| if "com" in markexpr: | ||
| monkeypatch.delenv("SKIP_COM_TESTS", raising=False) | ||
| return |
There was a problem hiding this comment.
This branch treats any markexpr containing the substring "com" as a signal to enable COM, so -m "not com" will also disable SKIP_COM_TESTS. That defeats the purpose of this fixture and can cause accidental COM access on CI/non-Windows. Match com as a marker token (and avoid the negated case) instead of using a substring check.
src/exstruct/mcp/patch_runner.py
Outdated
| if input_path.suffix.lower() == ".xls": | ||
| raise ValueError("openpyxl cannot edit .xls files.") | ||
|
|
||
| workbook = load_workbook(input_path) |
There was a problem hiding this comment.
When patching via openpyxl, loading an .xlsm workbook without keep_vba=True will typically drop the VBA project on save. Since .xlsm is allowed here, consider enabling keep_vba for .xlsm (or otherwise preventing openpyxl writes to .xlsm and emitting a clear warning/error) to avoid corrupting macro-enabled workbooks.
| if input_path.suffix.lower() == ".xls": | |
| raise ValueError("openpyxl cannot edit .xls files.") | |
| workbook = load_workbook(input_path) | |
| suffix = input_path.suffix.lower() | |
| if suffix == ".xls": | |
| raise ValueError("openpyxl cannot edit .xls files.") | |
| # For macro-enabled workbooks, ensure VBA is preserved on save. | |
| if suffix == ".xlsm": | |
| workbook = load_workbook(input_path, keep_vba=True) | |
| else: | |
| workbook = load_workbook(input_path) |
src/exstruct/mcp/patch_runner.py
Outdated
| op_index=0, | ||
| op=request.ops[0].op if request.ops else "set_value", | ||
| sheet=issue.sheet, | ||
| cell=issue.cell, | ||
| message=f"Formula health check failed: {issue.message}", |
There was a problem hiding this comment.
The formula preflight failure builds a PatchErrorDetail with op_index=0/op=request.ops[0].op even though the failure may not be attributable to the first operation (or any single op). This can mislead callers when debugging. Consider using a sentinel op_index (e.g., -1) and a clearer op/message that indicates this is a workbook-level preflight failure, while keeping sheet/cell pointing to the first detected issue.
| op_index=0, | |
| op=request.ops[0].op if request.ops else "set_value", | |
| sheet=issue.sheet, | |
| cell=issue.cell, | |
| message=f"Formula health check failed: {issue.message}", | |
| op_index=-1, | |
| op="workbook_preflight_formula_check", | |
| sheet=issue.sheet, | |
| cell=issue.cell, | |
| message=f"Formula health check failed (workbook-level preflight): {issue.message}", |
| def _validate_add_sheet(op: PatchOp) -> None: | ||
| """Validate add_sheet operation.""" | ||
| if op.cell is not None: | ||
| raise ValueError("add_sheet does not accept cell.") | ||
| if op.value is not None: | ||
| raise ValueError("add_sheet does not accept value.") | ||
| if op.formula is not None: | ||
| raise ValueError("add_sheet does not accept formula.") | ||
|
|
There was a problem hiding this comment.
add_sheet validation only rejects cell/value/formula, but it still allows unrelated fields like range/base_cell/expected/values to be provided (and silently ignored). This makes the PatchOp schema easier to misuse and harder to debug. Consider extending validation to reject any fields not applicable to the selected op type (similarly for set_value/set_formula).
src/exstruct/mcp/patch_runner.py
Outdated
| try: | ||
| app.quit() | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
If app.quit() fails, the Excel process can be left running. The COM tests added a kill() fallback in this PR; it would be good to apply the same pattern here (attempt app.kill() in the exception path) to reduce the risk of orphaned Excel instances during patching.
| pass | |
| try: | |
| app.kill() | |
| except Exception: | |
| pass |
| markexpr = getattr(config.option, "markexpr", "") or "" | ||
| if "com" in markexpr: | ||
| os.environ.pop("SKIP_COM_TESTS", None) |
There was a problem hiding this comment.
The markexpr check uses substring matching ("com" in markexpr), which will also match expressions like "not com and not render". That will incorrectly unset SKIP_COM_TESTS for the unit-test task and allow COM usage in non-COM tests. Consider parsing markexpr tokens (or at least matching com as a standalone marker and ensuring it's not negated) before changing environment behavior.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/exstruct/mcp/patch_runner.py`:
- Around line 117-234: Docstring for PatchOp incorrectly states that `expected`
is required for conditional ops; clarify that `expected` may be null to indicate
matching empty cells. Update the PatchOp class docstring (the section describing
`set_value_if` and `set_formula_if`) to say "`expected` may be null to match an
empty cell" and/or "use null to clear or to match empty cells" so it matches the
behavior enforced by the validators `_validate_set_value_if` and
`_validate_set_formula_if`; reference the `expected` field and the `PatchOp`
model in the docstring update.
In `@tests/com/test_shapes_extraction.py`:
- Around line 130-131: The docstring for test_connector_connections is written
in Japanese while the rest of the tests use English; replace the Japanese
docstring with an English one that conveys the same meaning (e.g., "Ensure the
connector destination IDs match the shape IDs extracted from results.") by
updating the docstring in the test_connector_connections function so it matches
the English naming convention used across the test suite.
🧹 Nitpick comments (5)
.agents/skills/codacy-issues-fetcher/SKILL.md (1)
26-35: Consider usingbashinstead ofpowershellfor the code block language.The commands are standard CLI invocations (
python scripts/...) and are not PowerShell-specific. Usingbashorshellwould be more broadly applicable and avoid implying a Windows-only workflow.-```powershell +```bashtests/com/test_shapes_extraction.py (1)
16-28: Duplicated_excel_appcontext manager across chart and shape test files.This exact context manager is duplicated in
tests/com/test_charts_extraction.py(lines 15–27). Consider extracting it into a shared fixture intests/com/conftest.pyto reduce duplication.src/exstruct/mcp/patch_runner.py (3)
930-971: Implicit fall-through toset_formula_ifin_apply_openpyxl_cell_op.After explicit
ifblocks forset_value,set_formula, andset_value_if(all with early returns), the remaining code at line 964 implicitly handlesset_formula_if. This works because_apply_openpyxl_op(line 843) already guarantees only these four ops reach this function, but adding an explicit guard would improve clarity.Proposed clarification
- formula_if = _require_formula(op.formula, "set_formula_if") + # set_formula_if — only remaining path after set_value/set_formula/set_value_if + assert op.op == "set_formula_if", f"Unexpected op: {op.op}" + formula_if = _require_formula(op.formula, "set_formula_if")
1139-1154:"==" in normalizedcheck may produce false positive on legitimate formulas.While Excel doesn't use
==as an operator, the check on line 1145 scans the entire formula string (including the leading=). A formula like=A1=B1(valid Excel comparison) normalizes to=A1=B1, and"==" not in "=A1=B1"— this is fine, no false positive here. However,==could appear in string literals within formulas (e.g.,=SUBSTITUTE(A1,"==","")) and would trigger a false warning. Since the level is "warning" (not "error") this is acceptable, but worth documenting in the code.
1169-1192: xlwings path silently wraps all non-ValueError exceptions twice.When
_apply_ops_xlwingsraises a generalException(line 1190–1191), it wraps it inRuntimeError. Then inrun_patch(line 507–517), the same exception triggers the fallback logic or anotherRuntimeErrorwrap. This means the original exception context could be buried under two layers of wrapping. Consider lettingPatchOpErrorandValueErrorpropagate cleanly (which is already done at 1188–1189), and only wrapping truly unexpected errors once.
Codacy's Analysis Summary0 new issue (≤ 0 issue) Review Pull Request in Codacy →
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
tests/com/test_shapes_extraction.py (2)
91-92: Compound assertions obscure which condition failed.If either of these multi-condition
assertstatements fails, pytest will report a single failure without indicating which sub-condition was false. Splitting them aids debugging.Proposed fix
- assert rect.l >= 0 and rect.t >= 0 - assert rect.id is not None and rect.id > 0 + assert rect.l >= 0 + assert rect.t >= 0 + assert rect.id is not None + assert rect.id > 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/com/test_shapes_extraction.py` around lines 91 - 92, The tests use compound assertions that hide which sub-condition failed; replace the two multi-condition asserts with separate assertions so failures are clear: split "assert rect.l >= 0 and rect.t >= 0" into "assert rect.l >= 0" and "assert rect.t >= 0", and split "assert rect.id is not None and rect.id > 0" into "assert rect.id is not None" and "assert rect.id > 0", referring to the rect object's attributes (rect.l, rect.t, rect.id) and updating any accompanying messages if present.
31-31: Missing docstring on helper function.Per coding guidelines, all functions should have Google-style docstrings.
_make_workbook_with_shapes(and_excel_appabove) are missing them.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/com/test_shapes_extraction.py` at line 31, Add Google-style docstrings to the helper functions _make_workbook_with_shapes and _excel_app: for each function include a short summary line, a blank line, and parameter and return sections as applicable (e.g., Args: path (Path): description for _make_workbook_with_shapes; Returns: type and description for _excel_app if it returns a value or None). Ensure the docstrings follow the Google style (triple-quoted) and describe purpose, parameters, and return value/side effects to satisfy the project's docstring guidelines.src/exstruct/mcp/patch_runner.py (2)
1185-1192:except ValueErrorimplicitly re-raisesPatchOpError— works but is subtle.
PatchOpErrorinherits fromValueError(line 1298), so theexcept ValueError: raiseon line 1188 also catches and re-raisesPatchOpErrorfrom line 1186. This works correctly, but the intent is non-obvious. A more explicit ordering would be to catchPatchOpErrorbeforeValueError:Suggested clarification
- except ValueError: + except (PatchOpError, ValueError): raise🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/exstruct/mcp/patch_runner.py` around lines 1185 - 1192, The current except ordering catches ValueError before PatchOpError (PatchOpError subclasses ValueError), which makes the re-raise of PatchOpError implicit and confusing; modify the exception handlers so that you explicitly catch PatchOpError first (add an "except PatchOpError: raise" block immediately before the existing "except ValueError: raise") and then handle other ValueError cases, leaving the generic "except Exception as exc: raise RuntimeError(...)" unchanged so intent is clear; refer to the PatchOpError class and the try/except around workbook.save and the subsequent exception handlers in patch_runner.py to locate where to add this explicit except.
472-476:get_com_availability()is called on everyrun_patchinvocation.Per
src/exstruct/cli/availability.py, this function launches and quits an Excel COM app to probe availability. Ifrun_patchis called in a loop (e.g., batch patching), this becomes an expensive per-call overhead. Consider caching the result (e.g.,functools.lru_cacheat the availability module level or passing availability status intorun_patch).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/exstruct/mcp/patch_runner.py` around lines 472 - 476, The call to get_com_availability() inside run_patch causes an expensive probe on every invocation; either memoize the probe at its definition (e.g., add functools.lru_cache to get_com_availability in src/exstruct/cli/availability.py so subsequent calls return cached availability) or change the run_patch signature to accept a precomputed availability object (e.g., add a parameter com_availability or com_available) and compute it once in the caller loop before calling run_patch; update usages of run_patch to pass the cached value and replace the in-function get_com_availability() call with the passed-in value (references: get_com_availability, run_patch, resolved_input).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/exstruct/mcp/patch_runner.py`:
- Around line 1144-1154: The current substring checks (e.g., "==" in normalized)
false-positive on contents inside string literals; update the logic in
patch_runner.py (around variables raw, normalized and where FormulaIssue is
appended) to ignore quoted string literals before performing checks: implement a
small parser that walks raw, removes or masks characters within Excel-style
quotes (handling doubled quotes "" as an escaped quote), then uppercase and run
the existing "==" and token_map checks against that masked string; apply the
same masking to all token_map checks (e.g., "#REF!", "#N/A") so tokens inside
string arguments are not matched.
---
Nitpick comments:
In `@src/exstruct/mcp/patch_runner.py`:
- Around line 1185-1192: The current except ordering catches ValueError before
PatchOpError (PatchOpError subclasses ValueError), which makes the re-raise of
PatchOpError implicit and confusing; modify the exception handlers so that you
explicitly catch PatchOpError first (add an "except PatchOpError: raise" block
immediately before the existing "except ValueError: raise") and then handle
other ValueError cases, leaving the generic "except Exception as exc: raise
RuntimeError(...)" unchanged so intent is clear; refer to the PatchOpError class
and the try/except around workbook.save and the subsequent exception handlers in
patch_runner.py to locate where to add this explicit except.
- Around line 472-476: The call to get_com_availability() inside run_patch
causes an expensive probe on every invocation; either memoize the probe at its
definition (e.g., add functools.lru_cache to get_com_availability in
src/exstruct/cli/availability.py so subsequent calls return cached availability)
or change the run_patch signature to accept a precomputed availability object
(e.g., add a parameter com_availability or com_available) and compute it once in
the caller loop before calling run_patch; update usages of run_patch to pass the
cached value and replace the in-function get_com_availability() call with the
passed-in value (references: get_com_availability, run_patch, resolved_input).
In `@tests/com/test_shapes_extraction.py`:
- Around line 91-92: The tests use compound assertions that hide which
sub-condition failed; replace the two multi-condition asserts with separate
assertions so failures are clear: split "assert rect.l >= 0 and rect.t >= 0"
into "assert rect.l >= 0" and "assert rect.t >= 0", and split "assert rect.id is
not None and rect.id > 0" into "assert rect.id is not None" and "assert rect.id
> 0", referring to the rect object's attributes (rect.l, rect.t, rect.id) and
updating any accompanying messages if present.
- Line 31: Add Google-style docstrings to the helper functions
_make_workbook_with_shapes and _excel_app: for each function include a short
summary line, a blank line, and parameter and return sections as applicable
(e.g., Args: path (Path): description for _make_workbook_with_shapes; Returns:
type and description for _excel_app if it returns a value or None). Ensure the
docstrings follow the Google style (triple-quoted) and describe purpose,
parameters, and return value/side effects to satisfy the project's docstring
guidelines.
| normalized = raw.upper() | ||
| if "==" in normalized: | ||
| issues.append( | ||
| FormulaIssue( | ||
| sheet=sheet_name, | ||
| cell=str(getattr(cell, "coordinate", "")), | ||
| level="warning", | ||
| code="invalid_token", | ||
| message="Formula contains duplicated '=' token.", | ||
| ) | ||
| ) |
There was a problem hiding this comment.
"==" in normalized check can false-positive on string literals inside formulas.
A formula like =IF(A1="==test",1,0) would trigger this warning even though the == is inside a string literal, not a mistyped operator. Similarly, all token_map checks (e.g., #REF!, #N/A) could match tokens embedded in string arguments. Since these are all warnings/errors surfaced to users, consider parsing formula tokens rather than plain substring matching, or documenting this as a known limitation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/exstruct/mcp/patch_runner.py` around lines 1144 - 1154, The current
substring checks (e.g., "==" in normalized) false-positive on contents inside
string literals; update the logic in patch_runner.py (around variables raw,
normalized and where FormulaIssue is appended) to ignore quoted string literals
before performing checks: implement a small parser that walks raw, removes or
masks characters within Excel-style quotes (handling doubled quotes "" as an
escaped quote), then uppercase and run the existing "==" and token_map checks
against that masked string; apply the same masking to all token_map checks
(e.g., "#REF!", "#N/A") so tokens inside string arguments are not matched.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
.codacy.yml (1)
3-3: Consider whether excluding all tests from static analysis is intentional.Excluding
tests/**from Codacy will suppress all static analysis findings (e.g., security issues, correctness bugs) in test code. If the goal is only to suppress specific noisy rules (like missing docstrings or naming conventions), a more targeted exclusion via rule configuration may be preferable.If this is a deliberate decision to reduce CI noise from the new test suites, this is fine as-is.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.codacy.yml at line 3, The .codacy.yml currently excludes "tests/**", which silences all static analysis in test code; either remove or narrow this blanket exclusion by deleting the "tests/**" pattern and instead configure targeted rule exclusions (or path-specific rule overrides) for only the noisy rules you want suppressed, or add a comment in .codacy.yml documenting that the full-test exclusion is intentional so reviewers know it’s deliberate.tests/com/test_shapes_extraction.py (2)
16-28:app = Noneon Line 28 is a dead assignment; usepassfor consistency.If
app.kill()raises, assigningNoneto the localapphas no effect — the variable is about to go out of scope and no further cleanup depends on it. The sibling helper intest_charts_extraction.pyusespassin the same position.Proposed fix
except Exception: - app = None + pass🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/com/test_shapes_extraction.py` around lines 16 - 28, In the _excel_app contextmanager, remove the dead assignment "app = None" in the nested except and replace it with "pass" (or simply do nothing) to match the sibling helper and avoid a meaningless local reassignment; update the finally/except block in the _excel_app function to use pass after the second except that handles app.kill() failures.
31-70: Helper function missing a Google-style docstring.
_make_workbook_with_shapesis a non-trivial helper that builds multiple shapes, a group, and a connector — a brief docstring would help future readers.As per coding guidelines, "Add Google-style docstrings to all functions and classes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/com/test_shapes_extraction.py` around lines 31 - 70, Add a Google-style docstring to the helper function _make_workbook_with_shapes describing its behavior (creates an Excel workbook with multiple shapes: rectangle, oval, line with arrowhead, grouped shapes, and an optional connector between src and dst), document the parameter path (type: Path) and that it returns None, and note side effects (uses _excel_app context manager, saves the workbook to the given path, and closes it). Keep the docstring concise (one short summary line, a longer description of what shapes are created and connector behavior, Args: path (Path), Returns: None, and any side effects/exceptions).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.codacy.yml:
- Line 3: The .codacy.yml currently excludes "tests/**", which silences all
static analysis in test code; either remove or narrow this blanket exclusion by
deleting the "tests/**" pattern and instead configure targeted rule exclusions
(or path-specific rule overrides) for only the noisy rules you want suppressed,
or add a comment in .codacy.yml documenting that the full-test exclusion is
intentional so reviewers know it’s deliberate.
In `@tests/com/test_shapes_extraction.py`:
- Around line 16-28: In the _excel_app contextmanager, remove the dead
assignment "app = None" in the nested except and replace it with "pass" (or
simply do nothing) to match the sibling helper and avoid a meaningless local
reassignment; update the finally/except block in the _excel_app function to use
pass after the second except that handles app.kill() failures.
- Around line 31-70: Add a Google-style docstring to the helper function
_make_workbook_with_shapes describing its behavior (creates an Excel workbook
with multiple shapes: rectangle, oval, line with arrowhead, grouped shapes, and
an optional connector between src and dst), document the parameter path (type:
Path) and that it returns None, and note side effects (uses _excel_app context
manager, saves the workbook to the given path, and closes it). Keep the
docstring concise (one short summary line, a longer description of what shapes
are created and connector behavior, Args: path (Path), Returns: None, and any
side effects/exceptions).
#53
Summary by CodeRabbit
New Features
Documentation
Chores
Tests