From 88189e19652f2299dc3fe02fc43ed5703a47c444 Mon Sep 17 00:00:00 2001
From: UncleSp1d3r
Date: Fri, 29 May 2026 18:26:00 -0400
Subject: [PATCH] docs: correct stale &+N relative-offset parsing TODO claims

AGENTS.md and GOTCHAS.md both stated that magic-file &+N/&-N relative-offset parsing was "still TODO". Parsing has been implemented for some time and is exercised by parse_offset_relative tests in src/parser/grammar/tests/mod.rs (covers &0, &4, &+4, &-4, &0x10, &-0x10; bare & rejected).

- AGENTS.md Offset Specifications: replace the TODO clause with a statement that &N/&+N/&-N parsing is implemented, including hex forms, and point to the parse-side test module.
- GOTCHAS.md S3.11: replace the canonical fail-fast example (which used &+N as the unsupported syntax) with an unquoted $VAR string value on a non-string-family type, which remains a real parse failure per S3.6.

Pure documentation. No code changes.

Closes #287

Signed-off-by: UncleSp1d3r
---
 AGENTS.md  | 2 +-
 GOTCHAS.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 4316903..32d71a7 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -242,7 +242,7 @@ See **Development Phases** below for the planned roadmap of features not yet imp
 - Indirect offsets are fully implemented (parsing + evaluation) with specifiers: `.b/.B` (byte), `.s/.S` (short), `.l/.L` (long), `.q/.Q` (quad); lowercase = little-endian, uppercase = big-endian (GNU `file` semantics); pointer types signed by default. Adjustment is accepted both **inside** the parens (canonical magic(5): `(base.type+N)`, `(base.type-N)`, `(base.type*N)`, `(base.type/N)`, `(base.type%N)`, `(base.type&N)`, `(base.type|N)`, `(base.type^N)`) and **after** the closing paren as a legacy form (`(base.type)+N`, `(base.type)-N` only). Only one form per rule.
Anchor-relative variants are supported via `base_relative` (the `(&N.X)` form -- pointer-read base is `anchor + N`) and `result_relative` (the `&(N.X)` wrapper -- final offset is added to the anchor) on `OffsetSpec::Indirect`.
 - **Pre-comparison value transforms**: `type+N`, `type-N`, `type*N`, `type/N`, `type%N`, `type|N`, `type^N` between the type keyword and the operator. The transform runs on the read value before the comparison and before printf-style format substitution. Stored on `MagicRule::value_transform` as `ValueTransform { op, operand }` (default: `None`). magic(5) usage: `lelong+1 x volume %d`, `ulequad/1073741824 x size %lluGB`. Bitwise AND (`&MASK`) is *not* part of `ValueTransformOp` -- it predates this enum and is encoded at the operator layer via `Operator::BitwiseAndMask`.
-- Relative offsets are fully evaluated against the GNU `file` previous-match anchor: the engine tracks `EvaluationContext::last_match_end()`, advancing it after each successful match by the bytes consumed (variable-width types include c-string NUL terminators and pstring length prefixes). Top-level relative offsets resolve from anchor 0. Magic-file `&+N`/`&-N` *parsing* is still TODO -- relative offsets are exercised programmatically through the AST.
+- Relative offsets are fully evaluated against the GNU `file` previous-match anchor: the engine tracks `EvaluationContext::last_match_end()`, advancing it after each successful match by the bytes consumed (variable-width types include c-string NUL terminators and pstring length prefixes). Top-level relative offsets resolve from anchor 0. Magic-file `&N`, `&+N`, and `&-N` parsing is implemented (decimal and `0x` hex forms supported; bare `&` is rejected). See `parse_offset_relative` in `src/parser/grammar/tests/mod.rs` for the parse-side coverage.
 ### Magic File Syntax

diff --git a/GOTCHAS.md b/GOTCHAS.md
index 5628535..3650c96 100644
--- a/GOTCHAS.md
+++ b/GOTCHAS.md
@@ -150,7 +150,7 @@ Inside a `MetaType::Use` subroutine body, `OffsetSpec::Absolute(n)` with `n >= 0
 ### 3.11 `parse_text_magic_file` is Fail-Fast, Not Skip-on-Error

-`build_rule_hierarchy` propagates any `parse_magic_rule_line` error immediately, so a single unparseable rule (e.g., a child using unsupported `&+N` relative-offset syntax or an unquoted `$VAR` string value -- see S3.6) causes the **entire file load** to fail with `ParseError::InvalidSyntax`. There is no skip-and-continue mode. When writing corpus tests against third_party `.magic` files that mix supported and unsupported syntax, bypass the parser and build the equivalent `MagicRule` tree programmatically via the AST; the runtime evaluator can still be exercised end-to-end against the real testfile buffer. See `tests/evaluator_tests.rs::test_regex_eol_corpus` for a worked example.
+`build_rule_hierarchy` propagates any `parse_magic_rule_line` error immediately, so a single unparseable rule (e.g., a string value using unquoted `$VAR` substitution on a non-string-family type -- see S3.6) causes the **entire file load** to fail with `ParseError::InvalidSyntax`. There is no skip-and-continue mode. When writing corpus tests against third_party `.magic` files that mix supported and unsupported syntax, bypass the parser and build the equivalent `MagicRule` tree programmatically via the AST; the runtime evaluator can still be exercised end-to-end against the real testfile buffer. See `tests/evaluator_tests.rs::test_regex_eol_corpus` for a worked example.

 ### 3.10 `parse_text_magic_file` Returns `ParsedMagic`, Not `Vec`
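
For readers unfamiliar with the syntax, the accepted forms listed in the commit message (`&0`, `&4`, `&+4`, `&-4`, `&0x10`, `&-0x10`, with bare `&` rejected) can be sketched as a small standalone Rust function. This is only an illustration under assumptions, not the crate's parser: the name `parse_relative_offset`, the `Option<i64>` return type, and the lowercase-only `0x` prefix are hypothetical choices made for the sketch. The actual behaviour is whatever the `parse_offset_relative` tests in `src/parser/grammar/tests/mod.rs` assert.

```rust
/// Hypothetical stand-in for a relative-offset parser (not the crate's API).
/// Accepts `&N`, `&+N`, `&-N` in decimal or `0x` hex; returns the signed offset.
fn parse_relative_offset(input: &str) -> Option<i64> {
    // A relative offset must begin with '&'.
    let rest = input.strip_prefix('&')?;

    // At most one optional sign may follow the '&'.
    let (negative, digits) = if let Some(r) = rest.strip_prefix('-') {
        (true, r)
    } else if let Some(r) = rest.strip_prefix('+') {
        (false, r)
    } else {
        (false, rest)
    };

    // Assumption: only a lowercase `0x` prefix selects hexadecimal.
    let (radix, body) = match digits.strip_prefix("0x") {
        Some(hex) => (16, hex),
        None => (10, digits),
    };

    // Reject bare `&`, `&+`, `&0x`, and stray characters such as a second sign.
    if body.is_empty() || !body.chars().all(|c| c.is_digit(radix)) {
        return None;
    }

    // Out-of-range magnitudes are rejected rather than wrapped.
    let magnitude = i64::from_str_radix(body, radix).ok()?;
    Some(if negative { -magnitude } else { magnitude })
}
```

Under these assumptions, `parse_relative_offset("&-0x10")` yields `Some(-16)` and `parse_relative_offset("&")` yields `None`. The digit check runs before `from_str_radix` because that function itself tolerates a leading sign, which would otherwise let doubled signs such as `&--4` slip through.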