Integrate libbacktrace for enhanced stack trace resolution #7721

Open
eddyashton wants to merge 2 commits into microsoft:main from eddyashton:try_libbacktrace

@eddyashton
Member

Closes #7714.

BEFORE (Release):

2026-03-06T15:57:58.537393Z 0   [fatal] CCF/src/tasks/worker.cpp:104         | BasicTask task failed with exception: I went boom on purpose
2026-03-06T15:57:58.537544Z 0   [fatal] CCF/src/tasks/worker.cpp:109         | Stack trace:
  #0: ./js_generic(__cxa_throw+0x31) [0x592b5265ffa1]
  #1: ./js_generic(+0x398c99) [0x592b52430c99]
  #2: ./js_generic(+0x5bf1fe) [0x592b526571fe]
  #3: ./js_generic(+0x3e6cd1) [0x592b5247ecd1]
  #4: ./js_generic(+0x1ffac5) [0x592b52297ac5]
  #5: ./js_generic(+0x1ca502) [0x592b52262502]
  #6: ./js_generic(+0x634cb) [0x592b520fb4cb]
  #7: /usr/lib/libstdc++.so.6(+0xec013) [0x79c7a4465013]
  #8: /usr/lib/libc.so.6(+0x8bca7) [0x79c7a40b2ca7]
  #9: /usr/lib/libc.so.6(+0x10fb1c) [0x79c7a4136b1c]

AFTER (Release):

2026-03-06T16:06:31.506310Z 0   [fatal] CCF/src/tasks/worker.cpp:166         | BasicTask task failed with exception: I went boom on purpose
2026-03-06T16:06:31.569938Z 0   [fatal] CCF/src/tasks/worker.cpp:171         | Stack trace:
  #0: __cxa_throw
  #1: ccf::JwtKeyAutoRefresh::start()::{lambda()#1}::operator()() const
  #2: ccf::tasks::BaseTask::do_task()
  #3: ccf::tasks::try_do_task(ccf::tasks::BaseTask&, bool)
  #4: ccf::Enclave::run_main()
  #5: ccf::enclave_run()
  #6: std::thread::_State_impl<std::thread::_Invoker<std::tuple<ccf::run_enclave_threads(host::CCHostConfig const&)::$_0, unsigned int> > >::_M_run()
  #7: execute_native_thread_routine
  #8: start_thread
  #9: clone
  #10: 0xffffffffffffffff

AFTER (RelWithDebInfo):

2026-03-06T16:28:41.220970Z 0   [fatal] CCF/src/tasks/worker.cpp:169         | BasicTask task failed with exception: I went boom on purpose
2026-03-06T16:28:41.810765Z 0   [fatal] CCF/src/tasks/worker.cpp:174         | Stack trace:
  #0: __cxa_throw at CCF/build/CCF/src/tasks/worker.cpp:208
  #1: ccf::JwtKeyAutoRefresh::start()::{lambda()#1}::operator()() const at CCF/src/node/jwt_key_auto_refresh.h:65
  #2: ccf::tasks::BaseTask::do_task() at CCF/build/CCF/src/tasks/task_system.cpp:32
  #3: ccf::tasks::try_do_task(ccf::tasks::BaseTask&, bool) at CCF/src/tasks/worker.h:32
  #4: ccf::Enclave::run_main() at CCF/src/enclave/enclave.h:394
  #5: ccf::enclave_run() at CCF/build/CCF/src/enclave/main.cpp:179
  #6: std::thread::_State_impl<std::thread::_Invoker<std::tuple<ccf::run_enclave_threads(host::CCHostConfig const&)::$_0, unsigned int> > >::_M_run() at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.0/../../../../include/c++/13.2.0/bits/std_thread.h:244
  #7: execute_native_thread_routine
  #8: start_thread
  #9: clone
  #10: 0xffffffffffffffff

@eddyashton eddyashton requested a review from a team as a code owner March 6, 2026 16:51
Copilot AI review requested due to automatic review settings March 6, 2026 16:51
Contributor

Copilot AI left a comment

Pull request overview

Integrates libbacktrace into CCF’s task worker exception handling to produce higher-fidelity stack traces (function names, and optionally file/line with debug info) compared to the previous execinfo/backtrace_symbols approach.

Changes:

  • Replace execinfo-based stack trace formatting with libbacktrace-based DWARF-aware resolution in src/tasks/worker.cpp.
  • Update task-system unit test expectations and prevent inlining of call-chain helpers to make stack frames stable in optimised builds.
  • Add CI package installation and CMake linkage for libbacktrace.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File | Description
src/tasks/worker.cpp | Uses libbacktrace to capture/resolve throw-point stack traces and demangle symbols.
src/tasks/test/basic_tasks.cpp | Adjusts stack-trace assertions and marks helper functions noinline to preserve frames.
scripts/setup-ci.sh | Installs libbacktrace-static in CI images.
CMakeLists.txt | Removes Debug-only -rdynamic/-fno-omit-frame-pointer flags and links ccf_tasks to libbacktrace.

Comment on lines +81 to 95
+ int pcinfo_callback(
+   void* data,
+   uintptr_t /*pc*/,
+   const char* filename,
+   int lineno,
+   const char* function)
  {
-   // backtrace_symbols format: "binary(mangled+0xoffset) [0xaddr]"
-   // Try to extract and demangle the symbol name between '(' and '+'/')'
-   std::string entry(raw);
-   auto open = entry.find('(');
-   auto plus = entry.find('+', open != std::string::npos ? open : 0);
-   auto close = entry.find(')', open != std::string::npos ? open : 0);
-
-   if (
-     open != std::string::npos && close != std::string::npos &&
-     close > open + 1)
+   auto* result = static_cast<PcinfoResult*>(data);
+   if (function != nullptr)
    {
-     auto end = (plus != std::string::npos && plus < close) ? plus : close;
-     std::string mangled = entry.substr(open + 1, end - open - 1);
-
-     if (!mangled.empty())
-     {
-       int status = 0;
-       std::unique_ptr<char, FreeDeleter> demangled(
-         abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status));
-       if (status == 0 && demangled != nullptr)
-       {
-         std::string rest = entry.substr(end);
-         entry = entry.substr(0, open + 1) + demangled.get() + rest;
-       }
-     }
+     result->resolved = true;
+     result->function = demangle(function);
+     result->filename = (filename != nullptr) ? filename : "";
+     result->lineno = lineno;
    }

Copilot AI Mar 6, 2026

pcinfo_callback only marks the frame as resolved and records file/line when function != nullptr. libbacktrace can provide filename/lineno even when function is null; with the current logic, those frames will be treated as unresolved and printed as raw addresses. Consider setting resolved (and recording filename/lineno) whenever any of {function, filename, lineno} is present, and using a placeholder like <unknown> when the function name is missing.

Copilot uses AI. Check for mistakes.
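
A minimal sketch of the behaviour suggested in the comment above, not the PR's actual code. It assumes a `PcinfoResult` with the four fields visible in the diff, a `FreeDeleter` like the one the old code used, and a hypothetical `demangle` helper that falls back to the raw name. The callback has the same signature as libbacktrace's `backtrace_full_callback`, so it can be exercised without linking libbacktrace.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical mirror of the PR's PcinfoResult (fields taken from the diff).
struct PcinfoResult
{
  bool resolved = false;
  std::string function;
  std::string filename;
  int lineno = 0;
};

// Frees the malloc'd buffer returned by abi::__cxa_demangle.
struct FreeDeleter
{
  void operator()(char* p) const
  {
    std::free(p);
  }
};

// Hypothetical helper: demangle a symbol, or return it unchanged on failure.
std::string demangle(const char* name)
{
  int status = 0;
  std::unique_ptr<char, FreeDeleter> out(
    abi::__cxa_demangle(name, nullptr, nullptr, &status));
  return (status == 0 && out != nullptr) ? out.get() : name;
}

// Same signature as libbacktrace's backtrace_full_callback.
int pcinfo_callback(
  void* data,
  uintptr_t /*pc*/,
  const char* filename,
  int lineno,
  const char* function)
{
  assert(data != nullptr);
  auto* result = static_cast<PcinfoResult*>(data);

  // Nothing at all is known about this frame: leave it unresolved.
  if (function == nullptr && filename == nullptr && lineno == 0)
  {
    return 0;
  }

  // Any one of function, filename or lineno is enough to count as resolved.
  result->resolved = true;
  result->function = (function != nullptr) ? demangle(function) : "<unknown>";
  result->filename = (filename != nullptr) ? filename : "";
  result->lineno = lineno;
  return 0; // 0 tells libbacktrace to keep going
}
```

With this shape, a frame that has file/line information but no function name would be printed with the `<unknown>` placeholder rather than as a raw address.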
Comment on lines +282 to +283
find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})

Copilot AI Mar 6, 2026


This changes user-visible behaviour (stack traces in fatal logs) and adds a new dependency. Per repo guidance, user-facing behaviour changes should be recorded in CHANGELOG.md. Please add an entry describing the improved stack trace resolution and the new libbacktrace dependency.

Suggested change
- find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
- target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})
+ find_library(BACKTRACE_LIBRARY backtrace)
+ if(BACKTRACE_LIBRARY)
+   target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})
+ else()
+   target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS})
+ endif()

# libbacktrace reads DWARF debug info directly, providing file/line/function
# resolution in stack traces without requiring -rdynamic.
find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})

Copilot AI Mar 6, 2026


ccf_tasks is a STATIC library, and target_link_libraries(ccf_tasks PRIVATE ... ${BACKTRACE_LIBRARY}) will not propagate libbacktrace to dependents. Targets like task_system_test link only ccf_tasks, so they'll fail to link with unresolved backtrace_* symbols. Make this dependency PUBLIC (or add it via add_ccf_static_library(... LINK_LIBS ...)) so consumers link libbacktrace automatically (and consider doing the same for ${CMAKE_DL_LIBS} if still required).

Suggested change
- target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})
+ target_link_libraries(ccf_tasks PUBLIC ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})

- target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS})
+ # libbacktrace reads DWARF debug info directly, providing file/line/function
+ # resolution in stack traces without requiring -rdynamic.
+ find_library(BACKTRACE_LIBRARY backtrace REQUIRED)

Copilot AI Mar 6, 2026


This introduces a hard build dependency on libbacktrace (find_library(... REQUIRED) + include <backtrace.h>). If this is intended to be mandatory, it should be surfaced with a clearer configure-time message and documented as a new build prerequisite (and/or made optional with a fallback to the previous implementation when libbacktrace isn't available).

Suggested change
- find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
+ find_library(BACKTRACE_LIBRARY backtrace)
+ if(NOT BACKTRACE_LIBRARY)
+   message(
+     FATAL_ERROR
+       "libbacktrace (library 'backtrace') is required to build the CCF task "
+       "system (target ccf_tasks). Please install libbacktrace and retry "
+       "configuration."
+   )
+ endif()

Comment on lines 201 to +219
  auto& trace = ccf::tasks::current_throw_trace;
- trace.num_frames =
-   backtrace(trace.frames, ccf::tasks::throw_trace_max_frames);
+ trace.num_frames = 0;
+ auto* bt_state = ccf::tasks::get_backtrace_state();
+ if (bt_state != nullptr)
+ {
+   backtrace_simple(
+     bt_state,
+     0, // skip = 0, capture from here
+     [](void* data, uintptr_t pc) -> int {
+       auto* t = static_cast<ccf::tasks::ThrowTrace*>(data);
+       if (t->num_frames < ccf::tasks::throw_trace_max_frames)
+       {
+         t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT
+       }
+       return 0;
+     },
+     nullptr, // ignore errors
+     &trace);
+ }

Copilot AI Mar 6, 2026


If backtrace_create_state() fails (so bt_state == nullptr), this code records trace.num_frames = 0 and captures nothing. Downstream, this means exceptions may log with no stack trace at all. Consider adding a fallback capture path when bt_state is null (for example, a glibc backtrace() fallback, or a secondary libbacktrace state initialised with an error callback so failures are visible).
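
One possible shape for the fallback described above, as a sketch only. `ThrowTrace`, `max_frames` and the `PrimaryCapture` hook are hypothetical stand-ins for the PR's types and for the libbacktrace capture path. The fallback uses glibc's `backtrace()` from `<execinfo.h>`, so it assumes a glibc (or compatible) platform.

```cpp
#include <cassert>
#include <execinfo.h> // glibc backtrace(); not part of ISO C++

// Hypothetical stand-ins for the PR's ThrowTrace and throw_trace_max_frames.
constexpr int max_frames = 64;

struct ThrowTrace
{
  void* frames[max_frames] = {};
  int num_frames = 0;
};

// Hypothetical hook for the libbacktrace path. It returns false when no
// backtrace state is available (e.g. state creation failed).
using PrimaryCapture = bool (*)(ThrowTrace&);

void capture_trace(ThrowTrace& trace, PrimaryCapture primary)
{
  trace.num_frames = 0;

  // Preferred path: use the primary capture if it exists and found frames.
  if (primary != nullptr && primary(trace) && trace.num_frames > 0)
  {
    return;
  }

  // Fallback path: raw return addresses from glibc. These can still be
  // printed as addresses even if they cannot be resolved to names.
  trace.num_frames = backtrace(trace.frames, max_frames);
  assert(trace.num_frames <= max_frames);
}
```

The design choice here is that an unresolved address list is still more useful in a fatal log than an empty trace.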

Comment on lines +206 to +216
+ backtrace_simple(
+   bt_state,
+   0, // skip = 0, capture from here
+   [](void* data, uintptr_t pc) -> int {
+     auto* t = static_cast<ccf::tasks::ThrowTrace*>(data);
+     if (t->num_frames < ccf::tasks::throw_trace_max_frames)
+     {
+       t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT
+     }
+     return 0;
+   },

Copilot AI Mar 6, 2026


The backtrace_simple callback stores every pc value it receives and always returns 0 (continue). In the example output, this results in a bogus trailing frame 0xffffffffffffffff. Consider stopping at or ignoring sentinel PCs (e.g. pc == 0 or pc == std::numeric_limits<uintptr_t>::max()), either by returning non-zero or by skipping these values, so they don't appear as frames.
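
The sentinel check could look roughly like the sketch below. `ThrowTrace` and `max_frames` are hypothetical stand-ins for the PR's `ccf::tasks::ThrowTrace` and `throw_trace_max_frames`, and treating both 0 and all-ones as sentinels is an assumption taken from the comment above. The function matches libbacktrace's `backtrace_simple_callback` signature, where a non-zero return stops the walk.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for the PR's ThrowTrace and throw_trace_max_frames.
constexpr int max_frames = 64;

struct ThrowTrace
{
  void* frames[max_frames] = {};
  int num_frames = 0;
};

// Same signature as libbacktrace's backtrace_simple_callback.
// Returning non-zero asks libbacktrace to stop walking the stack.
int capture_frame(void* data, uintptr_t pc)
{
  assert(data != nullptr);
  auto* trace = static_cast<ThrowTrace*>(data);

  // Assumed sentinels: a zero or all-ones PC marks the end of useful frames.
  if (pc == 0 || pc == std::numeric_limits<uintptr_t>::max())
  {
    return 1;
  }

  // Stop once the fixed-size buffer is full instead of continuing to walk.
  if (trace->num_frames >= max_frames)
  {
    return 1;
  }

  trace->frames[trace->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT
  return 0;
}
```

Stopping early when the buffer is full is an extra choice in this sketch; the PR's lambda keeps returning 0 and simply drops the surplus frames.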

- # backtrace_symbols can resolve function names in stacktraces, and preserve
- # frame pointers so that backtrace() can walk the full call stack.
- add_link_options($<$<CONFIG:Debug>:-rdynamic>)
- add_compile_options($<$<CONFIG:Debug>:-fno-omit-frame-pointer>)

Member

Not omitting frame pointers still seems useful?

auto* bt_state = ccf::tasks::get_backtrace_state();
if (bt_state != nullptr)
{
backtrace_simple(

Member

Can we use backtrace_full() depending on NDEBUG? It's not obvious from the docs to what extent full is nicer than simple, but it sounds like it might be.

  for (int i = 0; i < num_frames; ++i)
  {
-   oss << " #" << i << ": " << demangle_symbol(symbols.get()[i]) << "\n";
+   auto pc = reinterpret_cast<uintptr_t>(frames[i]);

Member

Do we need this?

Development

Successfully merging this pull request may close these issues.

Use libbacktrace for stack trace printing

3 participants