A profiler running in kernel space sees only addresses. Useful insight comes only after symbolization, and for Go this stage works differently than for other languages.
The problem appears once a profile has been collected but cannot yet be interpreted. The eBPF profiler captures stack traces at the kernel level and ends up with a set of program counter values: raw addresses in memory. Without symbolization, these are just hexadecimal numbers with no context. Unlike a traditional profiler, it cannot access the runtime of the process or inject an agent. All it has are addresses and binaries on disk. At the same time, the system must stay within a strict overhead budget (less than ~1% CPU). This turns symbolization into a task with stringent requirements on latency and algorithmic complexity.
The solution revolves around offline analysis of the binary and a fast “address → function” lookup. For Go, the key piece is the .gopclntab section, a table embedded in the binary that maps address ranges to functions and source lines. Unlike DWARF debug information or the ELF symbol table, it is not removed when the binary is stripped. This is a trade-off: the binary is larger (in the example, .gopclntab accounts for about 22% of its size), but symbolization works reliably without external symbol servers. That matters especially for eBPF, where fallback strategies (as used for C/C++) are either costly or unavailable.
The implementation is a pipeline with several phases. First, the profiler reads /proc/<pid>/maps to determine which binary an address belongs to. It then opens the ELF file and extracts .gopclntab. The table is already sorted by address, so a binary search (O(log n)) finds the function whose range contains the address. Results are cached (LRU) so that repeated lookups take microseconds. This is critical: at a sampling frequency of 20–100 Hz across dozens of processes, with every sample carrying a stack of many frames, the system can easily reach tens of thousands of address resolutions per second. As an illustration, 20 processes sampled at 100 Hz with 20-frame stacks already produce 40,000 lookups per second. A linear search at that rate immediately leads to unacceptable CPU overhead.
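The phases above can be sketched in a few dozen lines. The example below is a simplified illustration under stated assumptions: the maps content, the function table, and all type and function names (parseMaps, findPath, symbolizer, resolve) are hypothetical, the function table is hard-coded instead of being read from .gopclntab, and runtime addresses are assumed to equal file addresses (a non-PIE binary).

```go
package main

import (
	"container/list"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Hypothetical /proc/<pid>/maps content and function table, for illustration only.
const sampleMaps = `00400000-00480000 r-xp 00000000 08:01 1234 /usr/bin/demo
00480000-00500000 r--p 00080000 08:01 1234 /usr/bin/demo
7ffd1000-7ffd3000 r-xp 00000000 00:00 0 [vdso]`

var demoFuncs = []fn{{0x401000, "main.main"}, {0x401200, "main.handle"}, {0x410000, "runtime.mallocgc"}}

type mapping struct {
	start, end uint64
	path       string
}

// parseMaps keeps executable, named mappings (paths with spaces are not handled).
func parseMaps(text string) []mapping {
	var out []mapping
	for _, line := range strings.Split(text, "\n") {
		f := strings.Fields(line)
		if len(f) < 6 || !strings.Contains(f[1], "x") {
			continue
		}
		lo, hi, _ := strings.Cut(f[0], "-")
		start, err1 := strconv.ParseUint(lo, 16, 64)
		end, err2 := strconv.ParseUint(hi, 16, 64)
		if err1 == nil && err2 == nil {
			out = append(out, mapping{start, end, f[5]})
		}
	}
	return out
}

// findPath returns the file backing pc, or "" if no executable mapping contains it.
func findPath(maps []mapping, pc uint64) string {
	for _, m := range maps {
		if pc >= m.start && pc < m.end {
			return m.path
		}
	}
	return ""
}

type fn struct {
	entry uint64 // start address of the function
	name  string
}

// symbolizer holds one binary's sorted function table plus a small LRU cache.
type symbolizer struct {
	funcs []fn       // sorted by entry address
	end   uint64     // first address past the last function
	limit int        // maximum number of cached addresses
	order *list.List // front = most recently used; values are fn{pc, name}
	items map[uint64]*list.Element
}

func newSymbolizer(funcs []fn, end uint64, limit int) *symbolizer {
	return &symbolizer{funcs, end, limit, list.New(), map[uint64]*list.Element{}}
}

// resolve answers from the cache when possible, otherwise binary-searches the table.
func (s *symbolizer) resolve(pc uint64) string {
	if el, ok := s.items[pc]; ok {
		s.order.MoveToFront(el)
		return el.Value.(fn).name
	}
	// Index of the first function starting after pc; the one before it contains pc.
	i := sort.Search(len(s.funcs), func(i int) bool { return s.funcs[i].entry > pc })
	name := "?"
	if i > 0 && pc < s.end {
		name = s.funcs[i-1].name
	}
	s.items[pc] = s.order.PushFront(fn{pc, name})
	if s.order.Len() > s.limit {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.items, oldest.Value.(fn).entry)
	}
	return name
}

func main() {
	maps := parseMaps(sampleMaps)
	s := newSymbolizer(demoFuncs, 0x480000, 1024)
	for _, pc := range []uint64{0x401234, 0x401234, 0x410010} {
		fmt.Printf("%#x in %s -> %s\n", pc, findPath(maps, pc), s.resolve(pc))
	}
}
```

The second lookup of the same address is served from the cache without touching the table, which is the case that dominates when hot code paths are sampled repeatedly. Keying the cache by raw address is one possible choice; a production system would also have to invalidate entries when a process exits or a binary is replaced.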
In practice, this gives predictable behavior. Even if the binary is stripped, a Go program is still symbolized correctly, because the Go runtime itself requires .gopclntab to be present. In other languages, removing symbols often renders a profile useless without external debug files. Limitations remain: for example, inlined functions do not appear as separate frames, and CGO calls may require separate symbolization. The source material gives no improvement metrics, but architecturally the system achieves microsecond lookups and keeps overhead low, making continuous profiling practically unnoticeable in production.