In cpp/HybridCactus.cpp:77:
// Remove null terminator
responseBuffer.resize(strlen(responseBuffer.c_str()));
strlen stops at the first null byte. If the model output contains a null byte mid-response (possible with certain tokenizers or tool-call JSON payloads), the resize silently truncates the response at that point and everything after it is dropped.
It would be safer to use the actual written length reported by cactus_complete, if available, or at least to trim only trailing nulls rather than calling strlen on the whole buffer.
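A minimal sketch of both options, to make the suggestion concrete. The helper names trimTrailingNulls and shrinkToWritten are hypothetical and not part of the codebase, and I have not checked whether cactus_complete actually returns a byte count, so the second helper assumes such a value is available from somewhere.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: drop only the '\0' padding at the end of the buffer,
// leaving any null bytes embedded in the middle of the response untouched.
inline void trimTrailingNulls(std::string& buffer) {
    const std::size_t lastNonNull = buffer.find_last_not_of('\0');
    if (lastNonNull == std::string::npos) {
        buffer.clear();  // empty, or nothing but null bytes
    } else {
        buffer.resize(lastNonNull + 1);
    }
}

// Hypothetical helper for the preferred case. ASSUMPTION: the caller knows
// how many bytes were written (e.g. from the completion call's result).
// The count is clamped so a bogus value can never grow the buffer.
inline void shrinkToWritten(std::string& buffer, std::size_t written) {
    buffer.resize(std::min(written, buffer.size()));
}
```

At the call site this would replace the strlen-based resize with something like trimTrailingNulls(responseBuffer). Note that trimming cannot tell padding apart from a response that genuinely ends in a null byte, which is why the written length is the better source when it exists.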