Full call-path tracing for C++: a slightly advanced approach

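Background: I mainly work on a C++ SDK that business teams integrate into their own products. During integration, functional bugs can show up, meaning the SDK does not produce the expected result. How do you debug that?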

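Analysis: These problems are harder to debug than crashes. A crash at least leaves a stack trace to analyze; a functional bug does not crash, so no stack is captured at the moment things go wrong. The code is not running locally, so attaching a debugger is not an option either. That leaves logging. One approach is to print detailed logs on the SDK's critical paths and analyze them when a problem occurs. But something always gets missed: nobody can guarantee the logs are complete, and the problem may well sit in a function that has no logging at all.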

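Solution: Given the background and analysis above, I looked at whether a full call-path tracing scheme was feasible: print the SDK's entire call path, showing which function was entered and which function was exited.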

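Idea 1: Add a context struct parameter to every SDK interface and record the call path in it. This is probably the more general and effective approach, but our SDK interface is already fixed. Changing it would be very difficult and the business teams would almost certainly refuse, so it does not fit our current situation. A middleware or SDK being built from scratch could certainly consider it.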
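To make Idea 1 concrete, here is a minimal sketch of what a context parameter could look like. TraceContext, ScopedTrace, sdk_parse and sdk_process are all hypothetical names made up for illustration; they are not part of our SDK.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical context object that every SDK entry point would accept.
struct TraceContext {
    std::vector<std::string> events;  // ordered "enter X" / "exit X" records
};

// RAII guard: records entry on construction and exit on destruction,
// so early returns are recorded as well.
class ScopedTrace {
public:
    ScopedTrace(TraceContext &ctx, std::string name)
        : ctx_(ctx), name_(std::move(name)) {
        ctx_.events.push_back("enter " + name_);
    }
    ~ScopedTrace() { ctx_.events.push_back("exit " + name_); }
    ScopedTrace(const ScopedTrace &) = delete;
    ScopedTrace &operator=(const ScopedTrace &) = delete;

private:
    TraceContext &ctx_;
    std::string name_;
};

// Hypothetical SDK functions; note the extra parameter on every signature.
int sdk_parse(TraceContext &ctx, int input) {
    ScopedTrace trace(ctx, "sdk_parse");
    return input * 2;
}

int sdk_process(TraceContext &ctx, int input) {
    ScopedTrace trace(ctx, "sdk_process");
    return sdk_parse(ctx, input) + 1;
}
```

The sketch also shows the cost: every signature gains a parameter and every function body gains a guard line, which is exactly the interface change that rules this idea out for an SDK that has already shipped.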

Idea 2: Is there a way to trace the function call path without changing the interface?

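Following this line of investigation, I found a compile option supported by both gcc and clang: -finstrument-functions. With this option the compiler inserts a call to a fixed callback at the entry and at the exit of each instrumented function: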

void __cyg_profile_func_enter(void *callee, void *caller);
void __cyg_profile_func_exit(void *callee, void *caller);

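The arguments are the address of the function being called (callee) and the address of its call site in the caller. How do you turn an address into a function name? Use the dladdr function: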

int dladdr(const void *addr, Dl_info *info);
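Before wiring dladdr into the hooks, it can help to look at it in isolation. The helper below is a sketch of the lookup-then-demangle step; describe_address and its fallback labels are hypothetical names of my own. On older glibc versions it has to be linked with -ldl.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle
#include <dlfcn.h>   // dladdr, Dl_info
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: best-effort mapping from a code address to a readable name.
std::string describe_address(const void *addr) {
    Dl_info info;
    // dladdr returns 0 when the address is not inside any loaded object.
    if (dladdr(addr, &info) == 0) {
        return "[unresolved]";
    }
    // dli_sname is NULL when no symbol visible to the dynamic linker covers the address.
    if (info.dli_sname == nullptr) {
        return "[no symbol]";
    }
    int status = 0;
    char *demangled = abi::__cxa_demangle(info.dli_sname, nullptr, nullptr, &status);
    // Fall back to the raw symbol when it is not a mangled C++ name (e.g. main).
    std::string name = (status == 0 && demangled) ? demangled : info.dli_sname;
    std::free(demangled);  // free(NULL) is a no-op
    return name;
}
```

Whether a function in the executable itself resolves to a name depends on it being present in the dynamic symbol table, which is what -rdynamic is for in the build command used later.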

Consider the following code:

// tracing.cc
#include <cxxabi.h> // for abi::__cxa_demangle
#include <dlfcn.h>  // for dladdr
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// The hooks themselves must not be instrumented, otherwise they would recurse.
#ifndef NO_INSTRUMENT
#define NO_INSTRUMENT __attribute__((no_instrument_function))
#endif

// Called on entry to every instrumented function.
extern "C" NO_INSTRUMENT void __cyg_profile_func_enter(void *callee, void *caller) {
    Dl_info info;
    if (dladdr(callee, &info)) {
        int status;
        const char *name;
        // On success (status == 0) the result is a malloc'ed string we must free.
        char *demangled = abi::__cxa_demangle(info.dli_sname, NULL, 0, &status);
        if (status == 0) {
            name = demangled ? demangled : "[not demangled]";
        } else {
            // Not a mangled C++ name (e.g. main), or dladdr found no symbol.
            name = info.dli_sname ? info.dli_sname : "[no dli_sname nd std]";
        }
        printf("enter %s (%s)\n", name, info.dli_fname);
        if (demangled) {
            free(demangled);
            demangled = NULL;
        }
    }
}

// Called on exit from every instrumented function.
extern "C" NO_INSTRUMENT void __cyg_profile_func_exit(void *callee, void *caller) {
    Dl_info info;
    if (dladdr(callee, &info)) {
        int status;
        const char *name;
        char *demangled = abi::__cxa_demangle(info.dli_sname, NULL, 0, &status);
        if (status == 0) {
            name = demangled ? demangled : "[not demangled]";
        } else {
            name = info.dli_sname ? info.dli_sname : "[no dli_sname and std]";
        }
        printf("exit %s (%s)\n", name, info.dli_fname);
        if (demangled) {
            free(demangled);
            demangled = NULL;
        }
    }
}

This is the test file:

// test_trace.cc
void func1() {}
void func() { func1(); }
int main() { func(); }
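Compile and link test_trace.cc together with tracing.cc to get the call-path trace. The -rdynamic flag exports the executable's own symbols so that dladdr can resolve them:

g++ test_trace.cc tracing.cc -std=c++14 -finstrument-functions -rdynamic -ldl && ./a.out

Output:
enter main (./a.out)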
enter func() (./a.out)
enter func1() (./a.out)
exit func1() (./a.out)
exit func() (./a.out)
exit main (./a.out)
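Flat enter/exit lines get hard to read once the trace grows. As a small post-processing sketch (indent_trace is a hypothetical helper, not part of tracing.cc), the lines can be indented by call depth, assuming each one starts with "enter " or "exit " exactly as the hooks print them:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical post-processing helper: indents each trace line by call depth.
// Assumes every line starts with "enter " or "exit " as printed by the hooks.
std::vector<std::string> indent_trace(const std::vector<std::string> &lines) {
    std::vector<std::string> out;
    std::size_t depth = 0;
    for (const std::string &line : lines) {
        const bool is_exit = line.compare(0, 5, "exit ") == 0;
        if (is_exit && depth > 0) {
            --depth;  // an exit closes the current level
        }
        out.push_back(std::string(depth * 2, ' ') + line);
        if (!is_exit) {
            ++depth;  // an enter opens a new level
        }
    }
    return out;
}
```

This runs over a collected log after the fact, so it adds nothing to the hooks themselves. It also assumes the log comes from a single thread; interleaved output from several threads would need to be split per thread first.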

What if func() calls some other functions?

#include <iostream>
#include <vector>
void func1() {}
void func() {
    std::vector<int> v{1, 2, 3};
    std::cout << v.size();
    func1();
}
int main() { func(); }

After recompiling, the output looks like this:

enter [no dli_sname nd std] (./a.out)
enter [no dli_sname nd std] (./a.out)
exit [no dli_sname and std] (./a.out)
exit [no dli_sname and std] (./a.out)
enter main (./a.out)
enter func() (./a.out)
enter std::allocator<int>::allocator() (./a.out)
enter __gnu_cxx::new_allocator<int>::new_allocator() (./a.out)
exit __gnu_cxx::new_allocator<int>::new_allocator() (./a.out)
exit std::allocator<int>::allocator() (./a.out)
enter std::vector<int, std::allocator<int> >::vector(std::initializer_list<int>, std::allocator<int> const&) (./a.out)
enter std::_Vector_base<int, std::allocator<int> >::_Vector_base(std::allocator<int> const&) (./a.out)
enter std::_Vector_base<int, std::allocator<int> >::_Vector_impl::_Vector_impl(std::allocator<int> const&) (./a.out)
enter std::allocator<int>::allocator(std::allocator<int> const&) (./a.out)
enter __gnu_cxx::new_allocator<int>::new_allocator(__gnu_cxx::new_allocator<int> const&) (./a.out)
exit __gnu_cxx::new_allocator<int>::new_allocator(__gnu_cxx::new_allocator<int> const&) (./a.out)
exit std::allocator<int>::allocator(std::allocator<int> const&) (./a.out)
exit std::_Vector_base<int, std::allocator<int> >::_Vector_impl::_Vector_impl(std::allocator<int> const&) (./a.out)
exit std::_Vector_base<int, std::allocator<int> >::_Vector_base(std::allocator<int> const&) (./a.out)

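I have pasted only part of the output above. This is clearly not what we want: we only want to see the call path through our own functions, with everything else filtered out. What can be done?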

One option is to give all of our own functions a common prefix and print only the symbols that carry it. I consider this the more general approach.
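A prefix check could look roughly like the sketch below. It assumes, purely for illustration, that all SDK code lives in a namespace called mysdk; both that namespace and the should_trace helper are hypothetical.

```cpp
#include <cstring>

// Hypothetical prefix: assumes all SDK code lives in namespace `mysdk`.
static const char kTracePrefix[] = "mysdk::";

// Returns true only when the demangled name starts with the SDK prefix.
// Marked no_instrument_function because the hooks would call it; an
// instrumented helper would re-enter the hooks and recurse forever.
__attribute__((no_instrument_function))
bool should_trace(const char *demangled_name) {
    if (demangled_name == nullptr) {
        return false;
    }
    return std::strncmp(demangled_name, kTracePrefix,
                        sizeof(kTracePrefix) - 1) == 0;
}
```

In the hooks, the printf would then be guarded by should_trace(name). One caveat: demangled names of function templates begin with the return type, so a strict starts-with test would miss them; a real filter would have to allow for that.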

Below is the code I used instead, which filters out any name containing the substring std or gnu:

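// In __cyg_profile_func_enter, replacing the unconditional printf.
// The fallback label also contains "std", so unresolved symbols are dropped too.
if (!strcasestr(name, "std") && !strcasestr(name, "gnu")) {
    printf("enter %s (%s)\n", name, info.dli_fname);
}

// In __cyg_profile_func_exit, replacing the unconditional printf.
if (!strcasestr(name, "std") && !strcasestr(name, "gnu")) {
    printf("exit %s (%s)\n", name, info.dli_fname);
}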

After recompiling, the output is what I wanted:

g++ test_trace.cc tracing.cc -std=c++14 -finstrument-functions -rdynamic -ldl && ./a.out

Output:
enter main (./a.out)
enter func() (./a.out)
enter func1() (./a.out)
exit func1() (./a.out)
exit func() (./a.out)
exit main (./a.out)