Skip to content

feat(lint): add encode-packed-collision#14914

Open
mablr wants to merge 10 commits into
masterfrom
mablr/encode-packed-collision_lint
Open

feat(lint): add encode-packed-collision#14914
mablr wants to merge 10 commits into
masterfrom
mablr/encode-packed-collision_lint

Conversation

@mablr
Copy link
Copy Markdown
Collaborator

@mablr mablr commented May 26, 2026

Description

Closes OSS-58

Add encode-packed-collision lint.

Detects abi.encodePacked() calls with 2+ dynamic-type arguments (string, bytes, T[]) which can produce hash collisions, e.g. encodePacked("a","bc") == encodePacked("ab","c").

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

Detects `abi.encodePacked()` calls with 2+ dynamic-type arguments
(`string`, `bytes`, `T[]`) which can produce hash collisions, e.g.
`encodePacked("a","bc") == encodePacked("ab","c")`.
Comment thread crates/lint/src/sol/high/encode_packed_collision.rs Outdated
Comment thread crates/lint/testdata/EncodePackedCollision.sol
figtracer
figtracer previously approved these changes May 26, 2026
Comment thread crates/lint/src/sol/high/encode_packed_collision.rs Outdated
Copy link
Copy Markdown
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Please fix member-call return inference before shipping. The current logic scans inherited contract items by method name and then bails unless there is exactly one match: encode_packed_collision.rs#L110-L126.

For concrete contracts that override inherited methods, e.g. name() / symbol(), both the base function and override can match, so abi.encodePacked(token.name(), token.symbol()) is missed even though both return string.

Suggested fix: resolve the actual called function using the call args / semantic function type instead of collecting by name only, or filter overridden base functions before the uniqueness check. Please also add a regression test with a concrete contract overriding inherited name() and symbol() that still warns.

mablr added 3 commits May 28, 2026 11:07
ternary, and slices

- Extend `contract_id_of` to handle interface/contract casts
  (`IERC20Metadata(addr).name()`) by inspecting the call callee for
  `Ident -> Contract` and `Type -> Contract` resolutions; add a general
  fallback through `expr_type` to cover struct-field and array-index
  receivers (`cfg.token.name()`, `tokens[i].name()`).
- Thread `n_call_args: Option<usize>` through `call_return_type` so
  same-name overloads that differ only in arity are disambiguated before
  the dynamic-agreement check (`f()` vs `f(uint256)` now resolves
  correctly).
- Handle `ExprKind::Slice` in `expr_type`; calldata slice `data[:4]`
  preserves the bytes type of the base expression.
- Handle `ExprKind::Ternary` in `expr_type`; both branches must agree
  on dynamic-ness; if they do, the then-branch type is returned.
@mablr mablr requested a review from mattsse May 28, 2026 10:24
figtracer
figtracer previously approved these changes May 28, 2026
@grandizzy
Copy link
Copy Markdown
Collaborator

A few things worth addressing before merge, mostly around the hand-rolled type inference in expr_type / call_return_type:

Likely false negatives (silent miss on real bugs)

The ExprKind::Member arm in expr_type only handles struct fields and returns None for everything else. That causes the lint to miss several common dynamic expressions:

  • abi.encodePacked(msg.data, s)msg.data is bytes calldata
  • abi.encodePacked(addr.code, s) / address(this).code / type(C).creationCode|runtimeCode
  • abi.encodePacked(abi.encode(a), abi.encode(b)) — works only with an explicit bytes(...) cast today (that's all the fixture covers); the bare nested call resolves as a member call on the abi builtin, and contract_id_of bails because abi is not a contract. Same for nested abi.encodePacked, abi.encodeWithSelector, etc.
  • abi.encodePacked(string.concat(a, b), c) and bytes.concat(...) — receiver is ExprKind::Type
  • abi.encodePacked(new bytes(n), s)
  • abi.encodePacked(token.name(), s) where name is a string public state variable — item.as_function() skips synthesized getters if the HIR represents them as variables (worth checking)
  • using X for string method-style calls
  • Literal inside a ternary: flag ? a : "x"expr_type(Ternary) recurses into expr_type, which has no Lit arm, so the literal branch isn't classified as dynamic
  • Overload selection: call_return_type filters by name+arity only. If two same-arity overloads exist and the chosen one returns string, the bail returns None. The current fixture only exercises the inverse direction (selected overload non-dynamic → no warn); please add the opposite.

Likely false positives worth refining

The literal short-circuit (LitKind::Str → dynamic) over-fires on safe idioms:

  • abi.encodePacked("prefix:", s) and abi.encodePacked(hex"dead", s)
  • abi.encodePacked(uint256(bytes(a).length), a, uint256(bytes(b).length), b) — the canonical safe pattern
  • abi.encodePacked(s, s) — injective in s
  • EIP-191 personal-sign with uintToString(len) before data

A common mitigation other linters use: only count non-literal dynamics, or require ≥2 non-literal dynamic args.

Minor soundness / style notes

  • is_abi_builtin uses reses.iter().any(...) — if abi is ever shadowed and co-resolves with the builtin, the lint still fires. Exact-match would be safer.
  • call_return_type's "all candidates agree on dynamic-ness, return first" comment should be explicit that the result is only meaningful for the dynamic/non-dynamic decision, not as a true type — easy footgun if reused.
  • The Slice arm comment says "preserves the element type of the base"; it actually preserves the base type (correct for bytes slices, just misleading wording).

Suggested additional fixtures

// Currently MISSED — should warn
abi.encodePacked(msg.data, s);
abi.encodePacked(addr.code, s);
abi.encodePacked(abi.encode(a), abi.encode(b));      // no bytes() cast
abi.encodePacked(string.concat(a, b), c);
abi.encodePacked(bytes.concat(a, b), c);
abi.encodePacked(new bytes(10), s);
abi.encodePacked(flag ? a : "x", b);
abi.encodePacked(token.name(), s);                    // public string getter
abi.encodePacked(Strings.toString(x), Strings.toString(y)); // library

// Known FPs
abi.encodePacked("prefix:", s);
abi.encodePacked(s, s);
abi.encodePacked(uint256(bytes(a).length), a, uint256(bytes(b).length), b);

Minimal fix suggestion

If solar exposes a resolved-callee or expression-type API, prefer that over the ad-hoc walker. If not, special-casing the handful of well-known Solidity builtins inside expr_type(Member) / call_return_type (abi.encode*bytes, msg.databytes, *.codebytes, string.concatstring, bytes.concatbytes) plus a Lit arm in expr_type would close most of the gaps above with little added complexity.

Copy link
Copy Markdown
Collaborator Author

@mablr mablr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grandizzy I fixed your findings in 8f60378

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

5 participants