Skip to content

feat(evaluator): implement indirect offset resolution (#37)#199

Merged
unclesp1d3r merged 29 commits intomainfrom
37-evaluator-implement-indirect-offset-resolution
Mar 30, 2026
Merged

feat(evaluator): implement indirect offset resolution (#37)#199
unclesp1d3r merged 29 commits intomainfrom
37-evaluator-implement-indirect-offset-resolution

Conversation

@unclesp1d3r
Copy link
Copy Markdown
Member

Summary

Implements indirect offset resolution (OffsetSpec::Indirect) for the evaluator, enabling detection of complex binary formats like PE executables where a pointer at a fixed offset must be dereferenced to locate the actual header.

  • 4-step evaluation pipeline: resolve base offset → read pointer value (with endianness) → apply adjustment (checked arithmetic) → validate bounds
  • Full parser support: indirect offset syntax (base.type)+adj with GNU file semantics — lowercase specifiers = little-endian, uppercase = big-endian, signed by default
  • 35 unit tests + 9 integration tests covering all pointer types, endiannesses, signed/unsigned values, adjustments, overflow, and PE-header-style scenarios

Key files

  • src/evaluator/offset/indirect.rs — core implementation (740 lines)
  • src/parser/grammar/mod.rsparse_indirect_offset() and pointer_specifier_to_type()
  • tests/indirect_offset_integration.rs — end-to-end tests

Test Plan

  • All 1,299 tests pass (cargo test)
  • All pointer types (byte, short, long, quad) with both endiannesses
  • Signed and unsigned pointer values (signed reinterpreted as raw unsigned per libmagic)
  • Positive and negative adjustments with overflow protection
  • Buffer overrun detection at both pointer-read and final-offset stages
  • PE-header-style real-world scenario (0x3C pointer dereference)
  • Integration tests verifying full parser → evaluator pipeline

Closes #37

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…lines

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…lity

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…details

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…unctions

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…mantics

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
@unclesp1d3r unclesp1d3r linked an issue Mar 30, 2026 that may be closed by this pull request
7 tasks
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Mar 30, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🏷️ Required labels (at least one) (3)
  • rust
  • testing
  • compatibility

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 690f6b63-6e6b-4dab-a136-21e869744db2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR implements full indirect offset resolution (evaluating pointer-chasing offsets like (0x3c.l)), adds parsing support for indirect offset syntax with optional adjustments, and extends documentation with policy and implementation guidance. Core evaluator logic replaces a stub with complete pointer reading, endianness handling, and bounds checking.

Changes

Cohort / File(s) Summary
Indirect Offset Evaluator Implementation
src/evaluator/offset/indirect.rs
Fully implemented indirect offset resolution: reads pointer values from buffer (supporting Byte, Short, Long, Quad), handles endianness via Endianness enum, applies signed/unsigned interpretation, validates adjustment arithmetic (including i64::MIN edge cases), and performs bounds checking. Replaced unconditional UnsupportedType error with actual resolution logic.
Offset Resolution Integration
src/evaluator/offset/mod.rs
Made map_offset_error public to crate, updated resolve_offset documentation to reflect indirect offset support, and replaced test stub with real indirect offset validation using byte pointer types.
Indirect Offset Parsing
src/parser/grammar/mod.rs
Added parsing for indirect offset syntax (base_offset.pointer_spec) with optional post-paren adjustments (+N or -N). New private helpers: pointer_specifier_to_type (maps b/B/s/S/l/L/q/Q to type + endianness) and parse_indirect_offset. Updated parse_offset branching to detect and handle ( prefix.
Parser Test Suite
src/parser/grammar/tests.rs
Comprehensive unit tests for indirect offset parsing: all pointer specifiers with endianness verification, adjustment syntax (positive/negative), negative base offsets, whitespace handling, malformed syntax error cases, and multi-level nesting (e.g., >>(...)) with correct level propagation.
Documentation Formatting Updates
docs/src/ast-structures.md, docs/src/evaluator.md, docs/src/magic-format.md, docs/src/parser.md
Markdown spacing/alignment adjustments: blank lines around code blocks in ast-structures.md, trailing-space removal in evaluator.md, table column realignment in magic-format.md, and blank-line insertion in parser.md date section.
Project Metadata & Policy
AGENTS.md, AI_POLICY.md, GOTCHAS.md, tessl.json
Added AI usage policy (disclosure, review, accountability), documented parsing gotchas (sign handling, string quoting, indirect offset semantics), updated AGENTS.md feature matrix, and bumped tessl.json project identifier and dependencies.

Sequence Diagram

sequenceDiagram
    participant Parser as Parser
    participant Evaluator as Evaluator
    participant Buffer as Buffer
    participant TypeReader as Type Reader

    Parser->>Parser: Detect '(' prefix in offset
    Parser->>Parser: Parse (base_offset.pointer_spec)
    Parser->>Parser: Parse optional +/- adjustment
    Parser->>Evaluator: Return OffsetSpec::Indirect{base_offset, pointer_type, adjustment, endian}
    
    Evaluator->>Evaluator: resolve_offset(OffsetSpec::Indirect)
    Evaluator->>Evaluator: Resolve base_offset to absolute index
    Evaluator->>Buffer: Read at base_offset position
    Buffer->>TypeReader: Read pointer value (Byte/Short/Long/Quad)
    TypeReader->>TypeReader: Apply endianness (lowercase=LE, uppercase=BE)
    TypeReader->>TypeReader: Handle signedness (default signed)
    TypeReader->>Evaluator: Return pointer value
    Evaluator->>Evaluator: Apply adjustment (+/- arithmetic)
    Evaluator->>Evaluator: Bounds check final offset
    Evaluator->>Parser: Return final offset or error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • PR #156 — Refactors type-reading functionality (types/{mod,numeric,string}.rs) that indirect offset resolution directly depends on for reading pointer values from buffers with endianness handling.

Poem

Pointers chase through binary streams, 🎯
Indirect offsets unlock the dreams,
Endian-aware, with bounds so tight,
Memory-safe in Rust's pure light! 🦀

🚥 Pre-merge checks | ✅ 8 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Libmagic-Compatibility-Check ⚠️ Warning Implementation maps both .b and .B byte pointer specifiers to Endianness::Little, losing semantic distinction required by GNU file format compatibility and breaking round-trip AST fidelity. Fix pointer_specifier_to_type to map .b to Endianness::Little and .B to Endianness::Big; update AGENTS.md offset documentation; add tests for uppercase specifier preservation.
Test-Coverage-Check ⚠️ Warning Test suite lacks property-based testing (proptest/quickcheck), relies on individual test functions instead of table-driven consolidation, and has unclear coverage of malformed input handling and boundary conditions. Implement property-based tests using proptest for arbitrary input generation, consolidate repetitive pointer type/endianness test combinations into table-driven tests, expand malformed input coverage across all parse error paths, and add property tests for arithmetic boundary conditions (i64::MIN/MAX adjustments).
✅ Passed checks (8 passed)
Check name Status Explanation
Title check ✅ Passed Title 'feat(evaluator): implement indirect offset resolution (#37)' follows Conventional Commits specification with valid type (feat), scope (evaluator), and describes the primary change accurately.
Description check ✅ Passed PR description comprehensively explains indirect offset resolution implementation, test coverage, and key files changed, directly relating to the changeset.
Linked Issues check ✅ Passed All acceptance criteria from #37 are met: indirect offset evaluation implemented with full pointer type support, endianness handling, adjustment arithmetic, bounds checking, comprehensive unit and integration tests including PE header scenario.
Out of Scope Changes check ✅ Passed All changes are directly aligned with #37 objectives: evaluator implementation, parser support, documentation updates, and test coverage for indirect offset resolution.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 85.00%.
Memory-Safety-Check ✅ Passed Implementation maintains strict memory safety with safe buffer access using .get() methods, checked arithmetic operations, no unsafe blocks or pointer manipulations, and comprehensive bounds validation.
Performance-Regression-Check ✅ Passed Implementation uses zero-copy buffer access and stack-allocated primitives, maintaining performance without safety compromises.
Error-Handling-Check ✅ Passed Public APIs use Result types throughout; no unwrap/expect/panic in library code; structured LibmagicError variants handle all error cases appropriately.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch 37-evaluator-implement-indirect-offset-resolution

Comment @coderabbitai help to get the list of available commands and usage tips.

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Mar 30, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Require conventional commit format per https://www.conventionalcommits.org/en/v1.0.0/. Skipped for bots.

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?!?:

🟢 Full CI must pass

Wonderful, this rule succeeded.

All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.

  • check-success = coverage
  • check-success = quality
  • check-success = test
  • check-success = test-cross-platform (macos-latest, macOS)
  • check-success = test-cross-platform (ubuntu-22.04, Linux)
  • check-success = test-cross-platform (ubuntu-latest, Linux)
  • check-success = test-cross-platform (windows-latest, Windows)

🟢 Do not merge outdated PRs

Wonderful, this rule succeeded.

Make sure PRs are within 10 commits of the base branch before merging

  • #commits-behind <= 10

@dosubot dosubot bot added evaluator Rule evaluation engine and logic parser Magic file parsing components and grammar testing Test infrastructure and coverage labels Mar 30, 2026
@coderabbitai coderabbitai bot removed parser Magic file parsing components and grammar evaluator Rule evaluation engine and logic labels Mar 30, 2026
@coderabbitai coderabbitai bot removed the testing Test infrastructure and coverage label Mar 30, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/src/parser.md (1)

38-58: ⚠️ Potential issue | 🟡 Minor

Parser docs still advertise indirect offsets as unimplemented.

Both the parse_offset section and the limitations list still describe indirect offsets as planned / not yet implemented, but this PR adds parse_indirect_offset() plus evaluator support. Please update those sections to document the accepted indirect grammar and supported pointer specifiers so the guide matches the shipped behavior.

As per coding guidelines, docs/**: Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides.

Also applies to: 393-401

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/parser.md` around lines 38 - 58, Update the parse_offset docs to
reflect that indirect offsets are implemented: replace the "planned" notes with
a description of the indirect grammar accepted by parse_indirect_offset (include
example strings like "@0x10", "@(reg+4)" or whatever pointer syntax your parser
accepts), list supported pointer specifiers (e.g., '@', '&', register/label
forms) and show at least one example and its return (e.g.,
parse_indirect_offset("@0x10") -> Ok(...)). Mark "Indirect offset parsing" as
implemented in the Features checklist, update any references in the limitations
section (lines ~393-401) to document evaluator support and accepted pointer
specifiers, and ensure you reference the functions parse_offset and
parse_indirect_offset in the prose so readers can find implementation details.
docs/src/evaluator.md (1)

9-14: ⚠️ Potential issue | 🟡 Minor

Mark indirect offset resolution as supported in this guide.

The overview, offset-resolution summary, and implementation checklist still omit or leave indirect offsets unchecked even though src/evaluator/offset/indirect.rs now implements them. Right now the evaluator docs understate the feature set exposed by this PR.

As per coding guidelines, docs/**: Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides.

Also applies to: 92-100, 549-561

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/evaluator.md` around lines 9 - 14, Update the evaluator
documentation to mark indirect offset resolution as supported: edit the
overview, the offset-resolution summary table/section, and the implementation
checklist to indicate "indirect offsets" (or "indirect offset resolution") is
implemented; reference the implemented module/file indirect.rs and any API names
that expose indirect resolution so readers know it's available; ensure examples
and the list (the steps like Load file, Resolve offsets, Read typed values,
Apply operators, Process children, Collect results) and any checklist or
migration notes explicitly mention indirect offsets as supported and include a
short usage note or link to the indirect resolver implementation for further
details.
docs/src/magic-format.md (1)

97-107: ⚠️ Potential issue | 🟠 Major

Document )+adj, not +adj) for indirect offsets.

The implemented grammar is (base.type)+adj, with the adjustment after the closing ), and it accepts uppercase specifiers for big-endian pointer reads. The current reference shows (base.type+adj) and lowercase-only specifiers, so readers following this page will write syntax the parser rejects.

As per coding guidelines, docs/**: Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides.

Also applies to: 590-593

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/magic-format.md` around lines 97 - 107, Update the indirect offset
syntax examples and type list to match the implemented grammar: change any
occurrences of "(base.type+adj)" to "(base.type)+adj" (i.e., the adjustment
appears after the closing parenthesis) and note that the parser accepts
uppercase type specifiers for big-endian pointer reads; also update the types
section listing ".b .s .l .q" to mention their uppercase equivalents (".B .S .L
.Q") are valid. Ensure all other occurrences of the old syntax and
lowercase-only type documentation (including the similar block referenced later)
are corrected to the "(base.type)+adj" form and mention uppercase specifiers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@AGENTS.md`:
- Around line 207-208: Update AGENTS.md to accurately describe implemented
offset support: remove or reword the claim that indirect offsets are fully
implemented and instead state that only absolute and from-end offsets are
functional and that relative offsets are parsed but not evaluated; mention the
parser behavior in src/parser/grammar/mod.rs which currently only branches on
indirect '(' and numeric absolute forms to justify the limitation, and apply the
same correction to the other referenced bullet block (lines 248-250) so the
document matches the current implementation rather than aspirational features.

In `@docs/src/ast-structures.md`:
- Around line 245-250: The fenced code block showing the example currently lacks
a language tag (MD040); update the triple-backtick fence that surrounds the
snippet containing "0    pstring    JPEG" and "0    pstring/H  JPEG" to include
the language identifier (use "text") by replacing ``` with ```text so
markdownlint stops flagging it.

In `@src/evaluator/offset/indirect.rs`:
- Around line 35-41: OffsetSpec::Indirect carries endian twice but
read_pointer() only uses the outer endian for Short/Long/Quad; add a consistency
check in the pattern match where OffsetSpec::Indirect is destructured (the
binding of base_offset, pointer_type, adjustment, endian) to verify that any
inner/endian metadata (the one inside pointer_type for Short/Long/Quad) matches
the outer endian, and return an error (or panic) when they contradict, or
alternatively derive the read endianness from a single canonical field and
remove/ignore the duplicate; update the same logic for the second occurrence
around the 80-84 area so all indirect pointer paths enforce the
single-source-of-truth endianness.
- Around line 147-735: Many tests in the tests module duplicate permutations
(sizes/endian/signed) and push the file beyond the repo size cap; refactor by
converting repetitive one-assertion tests into table-driven tests that iterate
over a Vec/array of cases (use a struct Case { name: &str, buffer: Vec<u8> or
&'static [u8], spec_params... } and call resolve_indirect_offset per case)
reusing the existing indirect() helper and asserting with the case name in the
failure message, and/or split the huge tests module into smaller submodules
(e.g., pointer_read_tests, adjustment_tests, unsupported_type_tests) to keep
each source file under ~500–600 lines; update references to
resolve_indirect_offset, indirect, and test case names accordingly.

In `@src/parser/grammar/mod.rs`:
- Around line 158-175: The match arm in pointer_specifier_to_type currently
collapses 'b' and 'B' into the same tuple, losing the distinction between
lowercase and uppercase pointer specifiers; update pointer_specifier_to_type so
'b' returns Some((TypeKind::Byte { signed: true }, Endianness::Little)) and 'B'
returns Some((TypeKind::Byte { signed: true }, Endianness::Big)) (i.e., split
the 'b' | 'B' arm into two arms) so OffsetSpec::Indirect's endian field
preserves the original `.b` vs `.B` spelling in the AST and round-trips
correctly.

In `@src/parser/grammar/tests.rs`:
- Around line 284-289: Move the indirect-offset test suite out of the oversized
tests.rs into a new focused test module file named indirect_offset_tests.rs:
create the new file (e.g., src/parser/grammar/tests/indirect_offset_tests.rs),
add "use super::*;" at top and copy the indirect-offset test functions/comments
unchanged (they reference parsing behaviors like indirect offset parsing and any
test functions in tests.rs that mention "indirect offset" / GNU file semantics),
then in the original tests.rs replace the moved block with a single module
declaration (mod indirect_offset_tests;) so the tests compile and run; ensure
the new file is under the same parent cfg(test) test module scope so visibility
from super::* remains correct.

---

Outside diff comments:
In `@docs/src/evaluator.md`:
- Around line 9-14: Update the evaluator documentation to mark indirect offset
resolution as supported: edit the overview, the offset-resolution summary
table/section, and the implementation checklist to indicate "indirect offsets"
(or "indirect offset resolution") is implemented; reference the implemented
module/file indirect.rs and any API names that expose indirect resolution so
readers know it's available; ensure examples and the list (the steps like Load
file, Resolve offsets, Read typed values, Apply operators, Process children,
Collect results) and any checklist or migration notes explicitly mention
indirect offsets as supported and include a short usage note or link to the
indirect resolver implementation for further details.

In `@docs/src/magic-format.md`:
- Around line 97-107: Update the indirect offset syntax examples and type list
to match the implemented grammar: change any occurrences of "(base.type+adj)" to
"(base.type)+adj" (i.e., the adjustment appears after the closing parenthesis)
and note that the parser accepts uppercase type specifiers for big-endian
pointer reads; also update the types section listing ".b .s .l .q" to mention
their uppercase equivalents (".B .S .L .Q") are valid. Ensure all other
occurrences of the old syntax and lowercase-only type documentation (including
the similar block referenced later) are corrected to the "(base.type)+adj" form
and mention uppercase specifiers.

In `@docs/src/parser.md`:
- Around line 38-58: Update the parse_offset docs to reflect that indirect
offsets are implemented: replace the "planned" notes with a description of the
indirect grammar accepted by parse_indirect_offset (include example strings like
"@0x10", "@(reg+4)" or whatever pointer syntax your parser accepts), list
supported pointer specifiers (e.g., '@', '&', register/label forms) and show at
least one example and its return (e.g., parse_indirect_offset("@0x10") ->
Ok(...)). Mark "Indirect offset parsing" as implemented in the Features
checklist, update any references in the limitations section (lines ~393-401) to
document evaluator support and accepted pointer specifiers, and ensure you
reference the functions parse_offset and parse_indirect_offset in the prose so
readers can find implementation details.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4045876d-1fd3-4d8f-9d19-86b4d404478b

📥 Commits

Reviewing files that changed from the base of the PR and between 20b23c0 and c481d42.

⛔ Files ignored due to path filters (6)
  • .tessl/.gitignore is excluded by none and included by none
  • docs/solutions/integration-issues/indirect-offset-parser-evaluator-sync.md is excluded by none and included by none
  • docs/solutions/logic-errors/indirect-offset-gnu-file-semantics.md is excluded by none and included by none
  • docs/solutions/logic-errors/indirect-offset-resolution.md is excluded by none and included by none
  • mise.lock is excluded by !**/*.lock and included by none
  • tests/indirect_offset_integration.rs is excluded by none and included by none
📒 Files selected for processing (12)
  • AGENTS.md
  • AI_POLICY.md
  • GOTCHAS.md
  • docs/src/ast-structures.md
  • docs/src/evaluator.md
  • docs/src/magic-format.md
  • docs/src/parser.md
  • src/evaluator/offset/indirect.rs
  • src/evaluator/offset/mod.rs
  • src/parser/grammar/mod.rs
  • src/parser/grammar/tests.rs
  • tessl.json

@unclesp1d3r unclesp1d3r self-assigned this Mar 30, 2026
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Fix AGENTS.md: remove contradictory "only absolute and from-end are
  fully functional" line since indirect offsets are now implemented
- Fix ast-structures.md: add `text` language tag to fenced code block (MD040)
- Split .b/.B byte pointer specifiers into separate match arms to
  preserve GNU file endianness distinction in the AST
- Add debug_assert for endian consistency between inner TypeKind and
  outer OffsetSpec::Indirect endian field
- Consolidate indirect.rs tests into table-driven format (735 -> 538 lines)
- Extract indirect offset parser tests into dedicated submodule

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 30, 2026
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 30, 2026
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 97.29207% with 14 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/evaluator/offset/indirect.rs 96.99% 13 Missing ⚠️
src/parser/grammar/mod.rs 98.73% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@unclesp1d3r unclesp1d3r enabled auto-merge (squash) March 30, 2026 06:25
@unclesp1d3r unclesp1d3r merged commit 6426d18 into main Mar 30, 2026
25 of 27 checks passed
@unclesp1d3r unclesp1d3r deleted the 37-evaluator-implement-indirect-offset-resolution branch March 30, 2026 06:27
@dosubot
Copy link
Copy Markdown
Contributor

dosubot bot commented Mar 30, 2026

Documentation Updates

3 document(s) were updated by changes in this PR:

architecture
View Changes
@@ -71,7 +71,7 @@
 - ✅ **File parsing**: Complete magic file parsing with `parse_text_magic_file()`
 - ✅ **Hierarchy building**: Parent-child relationships via `build_rule_hierarchy()`
 - ✅ **Format detection**: Text, directory, and binary format detection
-- 📋 **Indirect offsets**: Pointer dereferencing patterns
+- ✅ **Indirect offsets**: Pointer dereferencing patterns — fully implemented with `.b/.B`, `.s/.S`, `.l/.L`, `.q/.Q` specifiers; lowercase=little-endian, uppercase=big-endian; adjustment after closing paren: `(base.type)+adj`
 
 ### 2. AST Data Structures (`src/parser/ast.rs`)
 
@@ -164,7 +164,7 @@
 - `offset/`: Offset resolution submodule
   - `mod.rs`: Dispatcher (`resolve_offset`) and re-exports
   - `absolute.rs`: `OffsetError`, `resolve_absolute_offset`
-  - `indirect.rs`: `resolve_indirect_offset` stub (issue #37)
+  - `indirect.rs`: `resolve_indirect_offset` — indirect offset resolution, fully implemented.
   - `relative.rs`: `resolve_relative_offset` stub (issue #38)
 - `operators/`: Operator application submodule
   - `mod.rs`: Dispatcher (`apply_operator`) and re-exports
@@ -185,7 +185,7 @@
 - ✅ **Recursion Limiting**: Prevent stack overflow from deep nesting
 - ✅ **Signedness Coercion**: Automatic value coercion for signed type comparisons (e.g., `0xff` → `-1` for signed byte)
 - ✅ **Comparison Operators**: Full support for `<`, `>`, `<=`, `>=` with numeric and lexicographic ordering
-- 📋 **Indirect Offsets**: Pointer dereferencing (planned)
+- ✅ **Indirect Offsets**: Pointer dereferencing — 4-step pipeline: resolve base offset → read pointer value → apply adjustment (checked arithmetic) → validate bounds.
 
 ### 4. I/O Module (`src/io/`)
 
compatibility
View Changes
@@ -73,7 +73,7 @@
 | ------------------ | -------- | ------ | ----------- | ---------------------------- |
 | Basic patterns     | ✅       | ✅     | Complete    | String, numeric matching     |
 | Hierarchical rules | ✅       | 🔄     | In Progress | Parent-child relationships   |
-| Indirect offsets   | ✅       | 📋     | Planned     | Pointer dereferencing        |
+| Indirect offsets   | ✅       | ✅     | Complete    | Pointer types byte/short/long/quad; lowercase specifiers = LE, uppercase = BE; optional adjustment with checked arithmetic; buffer bounds validation |
 | Relative offsets   | ✅       | 📋     | Planned     | Position-relative addressing |
 | Search patterns    | ✅       | 📋     | Planned     | Pattern searching in ranges  |
 | Bitwise operations | ✅       | ✅     | Complete    | AND, OR operations           |
MAGIC_FORMAT
View Changes
@@ -115,14 +115,20 @@
 Indirect offset syntax:
 
 - `(base.type)` - Read pointer at base, interpret as type
-- `(base.type+adj)` - Add adjustment to pointer value
-
-Types for indirect offsets:
-
-- `.b` - byte (1 byte)
-- `.s` - short (2 bytes)
-- `.l` - long (4 bytes)
-- `.q` - quad (8 bytes)
+- `(base.type)+adj` - Add adjustment to pointer value after dereferencing
+
+Pointer type specifiers (lowercase = little-endian, uppercase = big-endian; all signed by default):
+
+| Specifier | Size    | Endianness    |
+| --------- | ------- | ------------- |
+| `.b`      | 1 byte  | little-endian |
+| `.B`      | 1 byte  | big-endian    |
+| `.s`      | 2 bytes | little-endian |
+| `.S`      | 2 bytes | big-endian    |
+| `.l`      | 4 bytes | little-endian |
+| `.L`      | 4 bytes | big-endian    |
+| `.q`      | 8 bytes | little-endian |
+| `.Q`      | 8 bytes | big-endian    |
 
 ### Relative Offset
 
@@ -557,7 +563,7 @@
 
 - Absolute offsets
 - Relative offsets
-- Indirect offsets (basic)
+- Indirect offsets
 - Byte, short, long, quad types (8-bit, 16-bit, 32-bit, 64-bit integers)
 - String types (`string`, `pstring`)
 - Date and timestamp types (32-bit and 64-bit Unix timestamps)

How did I do? Any feedback?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Evaluator: implement indirect offset resolution

1 participant