Improve MMIO base register resolution #220

Merged
mad-sol-dev merged 1 commit into main from codex/extend-data-flow-tracking-in-mmio.py
Jan 9, 2026
@mad-sol-dev
Owner

Motivation

  • Improve MMIO heuristics to detect when load instructions populate registers with literal addresses so later register-indirect loads/stores can be resolved.
  • Handle PC-relative literal loads (e.g. LDR rX, [PC, #imm]) and literal pool data so pointer-table indirections can be followed.
  • Support derived addresses formed by immediate ADD instructions (e.g. ADD rY, rX, #imm) to compute absolute targets.
  • Reduce missed MMIO targets by emitting resolved absolute targets for LDR/STR [rX, #imm] when the base register value is known.
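The pointer-table case can be illustrated with a minimal sketch. It assumes the literal-pool contents are available as a plain dict mapping an address to the 32-bit value stored there; the PR names its collection memory_literals, but its exact shape is not shown on this page. follow_pointer is a hypothetical helper written for illustration, not a function from the PR.

```python
def follow_pointer(literal_addr, memory_literals):
    """Follow one level of indirection through known literal data.

    Returns (pointer, pointee). Either element is None when the
    corresponding memory slot has no recorded data.
    """
    # First hop: the value stored in the literal pool slot.
    pointer = memory_literals.get(literal_addr)
    if pointer is None:
        return None, None
    # Second hop: the value stored at the address the slot points to.
    return pointer, memory_literals.get(pointer)


# Hypothetical data: a literal slot at 0x1018 holds the address of a
# table entry at 0x2000, which in turn holds a peripheral address.
literals = {0x1018: 0x2000, 0x2000: 0x40021000}
pointer, pointee = follow_pointer(0x1018, literals)
```

When the second hop is unknown, the sketch still reports the pointer itself, which matches the idea of emitting whatever absolute target can be resolved.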

Description

  • Added data-line parsing (_parse_data_line and _DATA_VALUE_RE) and register/operand helpers (_REGISTER_RE, _BRACKET_BASE_RE, _ADD_IMMEDIATE_RE, _normalize_register, _extract_dest_register).
  • Collected memory literals (memory_literals) and tracked register base values (register_bases) inside _collect_operations to record LDR rX, =..., LDR rX, [PC, #imm], and indirect literal-to-pointer mappings.
  • Recognized ADD ...,#imm sequences to compute derived register addresses and resolved register-indirect addresses (_resolve_register_indirect_address) so READ/WRITE operations may include absolute target values.
  • Updated unit tests in bridge/tests/unit/test_mmio_heuristics.py to add coverage for literal base-register indirect accesses and pointer-table literal resolution.
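The register tracking described above might look roughly like the sketch below. It is an assumption-laden illustration, not the code in bridge/features/mmio.py: the regular expressions, the (mnemonic, operands) input shape, and the function name resolve_targets are all hypothetical, and only the `LDR rX, =literal` form of literal load is handled.

```python
import re

# Hypothetical patterns; the PR's own _BRACKET_BASE_RE and
# _ADD_IMMEDIATE_RE may be written differently.
LDR_LITERAL = re.compile(r"^(\w+),\s*=\s*(0x[0-9a-fA-F]+)$")
ADD_IMM = re.compile(r"^(\w+),\s*(\w+),\s*#(0x[0-9a-fA-F]+|\d+)$")
BRACKET = re.compile(r"\[(\w+)(?:,\s*#(0x[0-9a-fA-F]+|\d+))?\]")


def parse_imm(text):
    """Parse an immediate written as hex (0x...) or decimal."""
    return int(text, 16) if text.lower().startswith("0x") else int(text)


def resolve_targets(instructions):
    """Return (mnemonic, absolute_target) for LDR/STR with a known base.

    Registers overwritten by other instructions are NOT invalidated here.
    """
    register_bases = {}
    resolved = []
    for mnemonic, operands in instructions:
        mnemonic = mnemonic.upper()
        if mnemonic == "LDR":
            m = LDR_LITERAL.match(operands)
            if m:
                # LDR rX, =0x... : rX now holds a known absolute address.
                register_bases[m.group(1).lower()] = parse_imm(m.group(2))
                continue
        if mnemonic == "ADD":
            m = ADD_IMM.match(operands)
            if m and m.group(2).lower() in register_bases:
                # ADD rY, rX, #imm : derive rY from the known value of rX.
                base = register_bases[m.group(2).lower()]
                register_bases[m.group(1).lower()] = base + parse_imm(m.group(3))
            continue
        if mnemonic in ("LDR", "STR"):
            m = BRACKET.search(operands)
            if m and m.group(1).lower() in register_bases:
                offset = parse_imm(m.group(2)) if m.group(2) else 0
                resolved.append((mnemonic, register_bases[m.group(1).lower()] + offset))
    return resolved


sample = [
    ("LDR", "r0, =0x40021000"),
    ("ADD", "r1, r0, #0x10"),
    ("STR", "r2, [r0, #4]"),
    ("LDR", "r3, [r1]"),
]
targets = resolve_targets(sample)
# targets == [("STR", 0x40021004), ("LDR", 0x40021010)]
```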

Testing

  • Ran pytest bridge/tests/unit/test_mmio_heuristics.py and all tests passed (4 passed).
  • Exercised the updated _collect_operations logic with cases for literal pool loads, register-derived addresses, and pointer-table dereferences.
  • Verified the produced samples include resolved target values for indirect LDR/STR when resolvable.
  • No other automated test suites were run as part of this change.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ff2dab2ba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread bridge/features/mmio.py
Comment on lines +178 to +182
try:
    offset = int(bracket.group(2), 16) if bracket.group(2) else 0
except ValueError:
    return None
literal_addr = addr + 8 + offset
[P2] Avoid hard-coding ARM PC offset for literal loads

The PC-relative literal resolution assumes literal_addr = addr + 8 + offset, which is only correct in ARM state. In Thumb state the PC reads as addr + 4 (and is word-aligned for literal loads), so this computes the wrong literal pool address: it either misses the literal or, if there happens to be data at addr + 8 + offset, resolves to the wrong value. The result is incorrect target values (or skipped samples) for Thumb disassembly. Consider making the PC advance conditional on the instruction mode, or leaving the load unresolved when the mode is unknown.
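One way to act on this suggestion is sketched below. The function name literal_pool_address and the string-valued mode parameter are hypothetical; how mmio.py would actually learn the instruction mode is not shown on this page. The arithmetic follows the architectural rule that the PC reads as the instruction address plus 8 in ARM state and plus 4 in Thumb state, with the Thumb value aligned down to a word boundary for literal loads.

```python
def literal_pool_address(instr_addr, offset, mode):
    """Address read by `LDR rX, [PC, #offset]`, or None if mode is unknown."""
    if mode == "arm":
        # ARM state: PC reads as the instruction address + 8.
        return instr_addr + 8 + offset
    if mode == "thumb":
        # Thumb state: PC reads as the instruction address + 4,
        # aligned down to a 4-byte boundary for literal loads.
        return ((instr_addr + 4) & ~0x3) + offset
    # Unknown mode: leave the load unresolved rather than guess.
    return None


arm_addr = literal_pool_address(0x1000, 0x10, "arm")      # 0x1018
thumb_addr = literal_pool_address(0x1002, 0x08, "thumb")  # 0x100C
```

Returning None for an unknown mode keeps the previous behavior of skipping the sample instead of emitting a possibly wrong target.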


Comment thread bridge/features/mmio.py
Comment on lines +252 to +256
resolved_address = _resolve_register_indirect_address(operands, register_bases)
if resolved_address is not None:
    if op == "READ" and resolved_address in memory_literals:
        target = memory_literals[resolved_address]
    else:
[P2] Invalidate stale register bases before resolving indirects

register_bases is only populated on literal LDRs and immediate ADDs, but it is never cleared when the same register is overwritten by other instructions (e.g., MOV, SUB, LDR from a non-literal address). That means later [Rn,#imm] loads/stores can be resolved using a stale base and emit incorrect MMIO targets, whereas previously they would be skipped. Consider invalidating mappings on any instruction that writes a tracked register, or restricting the resolution to immediately following instructions.
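A conservative form of the first option could look like the sketch below. Everything in it is an assumption for illustration: drop_stale_base, the _FIRST_REG pattern, and the NON_WRITING set are hypothetical names, and the rule "the first operand is the destination unless the mnemonic is a store or compare" is a simplification. It does not cover POP or LDM register lists, writeback addressing, or registers clobbered by calls.

```python
import re

# Hypothetical: matches a register name at the start of the operand string.
_FIRST_REG = re.compile(r"^\s*(r\d+|sp|lr|pc|ip|fp|sl|sb)\b", re.IGNORECASE)

# Hypothetical: mnemonics whose first operand is read, not written.
# Conditional variants (e.g. STRNE) are not listed, so they are treated
# as writes, which only drops more mappings than strictly necessary.
NON_WRITING = frozenset({"STR", "STRB", "STRH", "CMP", "CMN", "TST", "TEQ"})


def drop_stale_base(mnemonic, operands, register_bases):
    """Forget the tracked value of the register this instruction overwrites."""
    if mnemonic.upper() in NON_WRITING:
        return
    m = _FIRST_REG.match(operands)
    if m:
        register_bases.pop(m.group(1).lower(), None)


bases = {"r0": 0x40021000, "r1": 0x40021010}
drop_stale_base("MOV", "r0, r5", bases)     # r0 is overwritten: forget it
drop_stale_base("STR", "r1, [r2]", bases)   # r1 is only read: keep it
```

In a tracker, such a call would run for every instruction before any new mapping is recorded, so a literal LDR or immediate ADD re-adds its destination right after the stale value is dropped.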


@mad-sol-dev mad-sol-dev merged commit ea9bcb0 into main Jan 9, 2026
2 of 5 checks passed
@mad-sol-dev mad-sol-dev deleted the codex/extend-data-flow-tracking-in-mmio.py branch January 9, 2026 02:11
mad-sol-dev added a commit that referenced this pull request Jan 9, 2026
…ing-in-mmio.py

Improve MMIO base register resolution