Skip to content

Conversation

@MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Jan 9, 2026

In C/C++, assertions are often done via a macro defined like:

#ifdef NDEBUG
#define assert(condition) ((void)0)
#else
#define assert(condition) /* implementation defined */
#endif

where /* implementation defined */ represents the actual operation that implements the assertion in a debug build.

However, in a release build (i.e., when NDEBUG is defined) then no check is performed. This is great for performance, but it means the CodeQL database has no way of observing these conditions. And these conditions often help us remove FPs (i.e., a null check or an index validation prior to a dereference).

This PR adds support for identifying (a small subset of) assertions by generating IR corresponding to the check which would have been performed had assertions been enabled (the rationale being basically the same as what Schack wrote for Java here).

This PR only covers a small subset of assertions since we only have the assertion as text since this is a macro argument. So we have to parse that macro argument in QL 😭. Because of this, I've limited this PR to only genearte IR for an assertion of the form E op E where E is an integer constant, or a local variable, and op is =, !=, <, >, <=, or >=. (Locally, I have a follow-up PR to add support for negations, disjunctions, and conjunctions.)

As I didn't feel like implementing all of C++'s conversion rules the generated IR will also not be totally conversion-correct. For example, in an expression like x < y where x is int and y is unsigned int there would normally be a signed-to-unsigned conversion on x but currently we simply generate a comparison between types of different types. I don't imagine this will be a problem in practice, though.

Commit-by-commit review recommended.

The three new alerts look genuine. They arise because we now realize that there's a suspicious looking assertion here which ought to have been fhSize < dstCapacity.

@github-actions github-actions bot added the C++ label Jan 9, 2026
@MathiasVP MathiasVP force-pushed the ir-support-for-assertions branch 3 times, most recently from 0cf33ff to 20c0239 Compare January 12, 2026 12:00
@MathiasVP MathiasVP marked this pull request as ready for review January 12, 2026 19:30
@MathiasVP MathiasVP requested a review from a team as a code owner January 12, 2026 19:30
Copilot AI review requested due to automatic review settings January 12, 2026 19:30
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Jan 12, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for generating IR for assertions in C++ release builds where they would normally be compiled out. The implementation parses macro arguments from assertion macros (like assert and __analysis_assume) to synthesize the conditional checks that would have been present in debug builds, helping CodeQL remove false positives by understanding assertion constraints.

Changes:

  • Adds new TranslatedAssertion.qll module that parses assertion macro arguments and generates synthetic IR for simple comparisons
  • Updates TranslatedStmt.qll and TranslatedElement.qll to exclude assertion statements from normal translation
  • Adds new instruction tags and test coverage for assertion IR generation

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedAssertion.qll New module implementing assertion parsing and IR generation for simple comparison assertions
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedStmt.qll Excludes assertion statements from normal translation to avoid duplication
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll Integrates assertion translation into the element translation framework
cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/InstructionTag.qll Adds four new instruction tags for assertion IR generation
cpp/ql/lib/semmle/code/cpp/stmts/Stmt.qll Adds override for getEnclosingFunction() in StmtParent base class
cpp/ql/lib/semmle/code/cpp/exprs/Expr.qll Adds override keyword to getEnclosingFunction()
cpp/ql/lib/semmle/code/cpp/Element.qll Adds new isAffectedByMacro(MacroInvocation) predicate
cpp/ql/test/library-tests/ir/ir/ir.cpp Adds test cases for assertion IR generation including templates and shadowed variables
cpp/ql/test/library-tests/ir/ir/raw_ir.expected Expected IR output for assertion tests
cpp/ql/test/library-tests/ir/ir/aliased_ir.expected Expected aliased IR output for assertion tests
cpp/ql/test/library-tests/ir/ir/PrintAST.expected Expected AST output for assertion tests
Comments suppressed due to low confidence (1)

cpp/ql/lib/semmle/code/cpp/ir/implementation/raw/internal/TranslatedStmt.qll:330

  • The comment has a typo: "synthedsize" should be "synthesize".
    // we synthedsize the check that would have occurred.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@MathiasVP MathiasVP force-pushed the ir-support-for-assertions branch from 4370bc6 to 4f4baee Compare January 12, 2026 21:08
Comment on lines +2840 to +2841
int shadowed = x;
assert(shadowed > 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the idea behind this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that I don't to implement local name resolution for C/C++ in QL 😅 So I simplified the necessary name resolution by requiring that there's only 1 local variable with the given name in the enclosing function (see the unique here which achieves this).

Comment on lines +195 to +198
/** Holds if this element is affected by the expansion of `mi`. */
predicate isAffectedByMacro(MacroInvocation mi) {
affectedbymacroexpansion(underlyingElement(this), unresolveElement(mi))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this might be potentially confusing, because isAffectedByMacro/0 is not implemented as affectedbymacroexpansion(underlyingElement(this), _).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Do you have any suggestion for a better QLDoc?

private predicate macroInvocationLocation(int startline, Function f, MacroInvocation mi) {
mi.getMacroName() = ["assert", "__analysis_assume"] and
mi.getLocation().hasLocationInfo(_, startline, _, _, _) and
f.getEntryPoint().isAffectedByMacro(mi)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this line achieve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relates the function f to the macro invocation mi. Otherwise, we don't know which function mi occurs in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so why does this look at - what is effectively - the whole body of the function, and not just at the statement we're interested in?

Copy link
Contributor Author

@MathiasVP MathiasVP Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that as well. I guess we could do something like:

exists(Stmt stmt |
  stmt.isAffectedByMacro(mi) and
  f = stmt.getEnclosingFunction()
)

I only did the above since that didn't need me to existentially quantify a Stmt. But if you prefer the other thing I guess that's just as fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that really resolves the thing I have trouble wrapping my head around: Why do we need to refer to the function at all here? Why isn't it sufficient to look at the statement?

Copy link
Contributor Author

@MathiasVP MathiasVP Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The statement isn't enough because of templates. Consider this scenario (let's ignore the uninstantiated function template for the discussion since we don't generate IR for that anyway):

1 | #define assert(x) ((void)0)
2 | 
3 | template<typename T>
4 | void f(int x) {
5 |   assert(x > 10);
6 | }
7 |
8 | // two arbitrary instantiations
9 | template void f<short>();
10| template void f<long>();

Now there are two (instantiated) function templates: f<short> and f<long> and they each have an ExprStmt for ((void)0) that we want to generate an assertion in the IR for.

To make it easy to figure out where we should place the assertion in the IR we require that the only thing on the line is the assertion.

However, the two ExprStmts have the same location. So the unique aggregate here require that there's a unique statement in the enclosing function where the macro is expanded.

So even though there are two ExprStmts on line 5 the condition "there is only 1 statement on the line" is still satisfied because we require it per-function. We could remove the function stuff, but then we wouldn't generate IR for the assertion since there would be two statements on the line (corresponding to the two ExprStmts from the two instantiations).

Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense.

MathiasVP and others added 2 commits January 15, 2026 14:41
…slatedStmt.qll

Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
…slatedStmt.qll

Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
…slatedAssertion.qll

Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
Comment on lines 55 to 64
unique(StmtParent p, int startline, Function f |
macroInvocationLocation(startline, f, mi) and
stmtParentLocation(startline, f, p) and
// Also do not count the elements from the expanded macro. i.e., when checking
// if `assert(x)` is the only thing on the line we do not count the
// generated `((void)0)` expression.
not p = mi.getAnExpandedElement()
|
p
)
Copy link
Contributor

@jketema jketema Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for using StmtParent here? what's the relevance of StmtExprs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to capture the notion "this is the only entity we'll be generating IR for on this line (in this function)" to have a meaningful place to insert the assertion, and StmtParent (i.e., the union of Expr and Stmt) is the most general class which has an enclosing function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would change if we just check that there's just a single macro that affects the statement we're interested in? Instead of doing all this line business?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

single macro = single macro of the kind we're interested in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how you would do that without doing all the line business. If you have:

#define assert(x)

void test(int x) {
  assert(x < 10);
}

I'm not sure how you relate the EmptyStmt to the MacroInvocation for assert(x < 10) without resorting to line matching. The affectedbymacroexpansion extensional on this DB only relates the macro invocation to the enclosing block statement:

image

Maybe that's a bug? If it related it to the macro invocation to the EmptyStmt (which is the statement we use as the TranslatedElement) then we could probably avoid line matching.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants