Integrate libbacktrace for enhanced stack trace resolution#7721
Integrate libbacktrace for enhanced stack trace resolution#7721eddyashton wants to merge 2 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Integrates libbacktrace into CCF’s task worker exception handling to produce higher-fidelity stack traces (function names, and optionally file/line with debug info) compared to the previous execinfo/backtrace_symbols approach.
Changes:
- Replace
execinfo-based stack trace formatting with libbacktrace-based DWARF-aware resolution insrc/tasks/worker.cpp. - Update task-system unit test expectations and prevent inlining of call-chain helpers to make stack frames stable in optimised builds.
- Add CI package installation and CMake linkage for libbacktrace.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/tasks/worker.cpp |
Uses libbacktrace to capture/resolve throw-point stack traces and demangle symbols. |
src/tasks/test/basic_tasks.cpp |
Adjusts stack-trace assertions and marks helper functions noinline to preserve frames. |
scripts/setup-ci.sh |
Installs libbacktrace-static in CI images. |
CMakeLists.txt |
Removes Debug-only -rdynamic/-fno-omit-frame-pointer flags and links ccf_tasks to libbacktrace. |
| int pcinfo_callback( | ||
| void* data, | ||
| uintptr_t /*pc*/, | ||
| const char* filename, | ||
| int lineno, | ||
| const char* function) | ||
| { | ||
| // backtrace_symbols format: "binary(mangled+0xoffset) [0xaddr]" | ||
| // Try to extract and demangle the symbol name between '(' and '+'/')' | ||
| std::string entry(raw); | ||
| auto open = entry.find('('); | ||
| auto plus = entry.find('+', open != std::string::npos ? open : 0); | ||
| auto close = entry.find(')', open != std::string::npos ? open : 0); | ||
|
|
||
| if ( | ||
| open != std::string::npos && close != std::string::npos && | ||
| close > open + 1) | ||
| auto* result = static_cast<PcinfoResult*>(data); | ||
| if (function != nullptr) | ||
| { | ||
| auto end = (plus != std::string::npos && plus < close) ? plus : close; | ||
| std::string mangled = entry.substr(open + 1, end - open - 1); | ||
|
|
||
| if (!mangled.empty()) | ||
| { | ||
| int status = 0; | ||
| std::unique_ptr<char, FreeDeleter> demangled( | ||
| abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status)); | ||
| if (status == 0 && demangled != nullptr) | ||
| { | ||
| std::string rest = entry.substr(end); | ||
| entry = entry.substr(0, open + 1) + demangled.get() + rest; | ||
| } | ||
| } | ||
| result->resolved = true; | ||
| result->function = demangle(function); | ||
| result->filename = (filename != nullptr) ? filename : ""; | ||
| result->lineno = lineno; | ||
| } |
There was a problem hiding this comment.
pcinfo_callback only marks the frame as resolved and records file/line when function != nullptr. libbacktrace can provide filename/lineno even when function is null; with the current logic, those frames will be treated as unresolved and printed as raw addresses. Consider setting resolved (and recording filename/lineno) whenever any of {function, filename, lineno} is present, and using a placeholder like <unknown> when the function name is missing.
| find_library(BACKTRACE_LIBRARY backtrace REQUIRED) | ||
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY}) |
There was a problem hiding this comment.
This changes user-visible behaviour (stack traces in fatal logs) and adds a new dependency. Per repo guidance, user-facing behaviour changes should be recorded in CHANGELOG.md. Please add an entry describing the improved stack trace resolution and the new libbacktrace dependency.
| find_library(BACKTRACE_LIBRARY backtrace REQUIRED) | |
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY}) | |
| find_library(BACKTRACE_LIBRARY backtrace) | |
| if(BACKTRACE_LIBRARY) | |
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY}) | |
| else() | |
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS}) | |
| endif() |
| # libbacktrace reads DWARF debug info directly, providing file/line/function | ||
| # resolution in stack traces without requiring -rdynamic. | ||
| find_library(BACKTRACE_LIBRARY backtrace REQUIRED) | ||
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY}) |
There was a problem hiding this comment.
ccf_tasks is a STATIC library, and target_link_libraries(ccf_tasks PRIVATE ... ${BACKTRACE_LIBRARY}) will not propagate libbacktrace to dependents. Targets like task_system_test link only ccf_tasks, so they'll fail to link with unresolved backtrace_* symbols. Make this dependency PUBLIC (or add it via add_ccf_static_library(... LINK_LIBS ...)) so consumers link libbacktrace automatically (and consider doing the same for ${CMAKE_DL_LIBS} if still required).
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY}) | |
| target_link_libraries(ccf_tasks PUBLIC ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY}) |
| target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS}) | ||
| # libbacktrace reads DWARF debug info directly, providing file/line/function | ||
| # resolution in stack traces without requiring -rdynamic. | ||
| find_library(BACKTRACE_LIBRARY backtrace REQUIRED) |
There was a problem hiding this comment.
This introduces a hard build dependency on libbacktrace (find_library(... REQUIRED) + include <backtrace.h>). If this is intended to be mandatory, it should be surfaced with a clearer configure-time message and documented as a new build prerequisite (and/or made optional with a fallback to the previous implementation when libbacktrace isn't available).
| find_library(BACKTRACE_LIBRARY backtrace REQUIRED) | |
| find_library(BACKTRACE_LIBRARY backtrace) | |
| if(NOT BACKTRACE_LIBRARY) | |
| message( | |
| FATAL_ERROR | |
| "libbacktrace (library 'backtrace') is required to build the CCF task " | |
| "system (target ccf_tasks). Please install libbacktrace and retry " | |
| "configuration." | |
| ) | |
| endif() |
| auto& trace = ccf::tasks::current_throw_trace; | ||
| trace.num_frames = | ||
| backtrace(trace.frames, ccf::tasks::throw_trace_max_frames); | ||
| trace.num_frames = 0; | ||
| auto* bt_state = ccf::tasks::get_backtrace_state(); | ||
| if (bt_state != nullptr) | ||
| { | ||
| backtrace_simple( | ||
| bt_state, | ||
| 0, // skip = 0, capture from here | ||
| [](void* data, uintptr_t pc) -> int { | ||
| auto* t = static_cast<ccf::tasks::ThrowTrace*>(data); | ||
| if (t->num_frames < ccf::tasks::throw_trace_max_frames) | ||
| { | ||
| t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT | ||
| } | ||
| return 0; | ||
| }, | ||
| nullptr, // ignore errors | ||
| &trace); | ||
| } |
There was a problem hiding this comment.
If backtrace_create_state() fails (so bt_state == nullptr), this code records trace.num_frames = 0 and captures nothing. Downstream, this means exceptions may log with no stack trace at all. Consider adding a fallback capture path when bt_state is null (for example, a glibc backtrace() fallback, or a secondary libbacktrace state initialised with an error callback so failures are visible).
| backtrace_simple( | ||
| bt_state, | ||
| 0, // skip = 0, capture from here | ||
| [](void* data, uintptr_t pc) -> int { | ||
| auto* t = static_cast<ccf::tasks::ThrowTrace*>(data); | ||
| if (t->num_frames < ccf::tasks::throw_trace_max_frames) | ||
| { | ||
| t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT | ||
| } | ||
| return 0; | ||
| }, |
There was a problem hiding this comment.
The backtrace_simple callback stores every pc value it receives and always returns 0 (continue). In the example output, this results in a bogus trailing frame 0xffffffffffffffff. Consider stopping/ignoring sentinel PCs (eg pc == 0 or pc == std::numeric_limits<uintptr_t>::max()) by returning non-zero and/or skipping these values so they don’t appear as frames.
| # backtrace_symbols can resolve function names in stacktraces, and preserve | ||
| # frame pointers so that backtrace() can walk the full call stack. | ||
| add_link_options($<$<CONFIG:Debug>:-rdynamic>) | ||
| add_compile_options($<$<CONFIG:Debug>:-fno-omit-frame-pointer>) |
There was a problem hiding this comment.
Not omitting frame pointers still seems useful?
| auto* bt_state = ccf::tasks::get_backtrace_state(); | ||
| if (bt_state != nullptr) | ||
| { | ||
| backtrace_simple( |
There was a problem hiding this comment.
Can we backtrace_full() depending on NDEBUG? It's not obvious from the doc to what extent full is nicer than simple, but it sounds like it might be?
| for (int i = 0; i < num_frames; ++i) | ||
| { | ||
| oss << " #" << i << ": " << demangle_symbol(symbols.get()[i]) << "\n"; | ||
| auto pc = reinterpret_cast<uintptr_t>(frames[i]); |
Closes #7714.
BEFORE (Release):
AFTER (Release):
AFTER (RelWithDebInfo):