Pseudo-registers & the function stack #

Pseudo-registers #

The four pseudo-registers used most often are:

FP: Frame pointer: arguments and locals.
PC: Program counter: jumps and branches.
SB: Static base pointer: global symbols.
SP: Stack pointer: top of stack.

Below are the explanations from the official documentation¹, followed by a summary to make them easier to understand.

FP #

The FP pseudo-register is a virtual frame pointer used to refer to function arguments. The compilers maintain a virtual frame pointer and refer to the arguments on the stack as offsets from that pseudo-register. Thus 0(FP) is the first argument to the function, 8(FP) is the second (on a 64-bit machine), and so on. However, when referring to a function argument this way, it is necessary to place a name at the beginning, as in first_arg+0(FP) and second_arg+8(FP). (The meaning of the offset—offset from the frame pointer—distinct from its use with SB, where it is an offset from the symbol.) The assembler enforces this convention, rejecting plain 0(FP) and 8(FP). The actual name is semantically irrelevant but should be used to document the argument’s name. It is worth stressing that FP is always a pseudo-register, not a hardware register, even on architectures with a hardware frame pointer.

For assembly functions with Go prototypes, go vet will check that the argument names and offsets match. On 32-bit systems, the low and high 32 bits of a 64-bit value are distinguished by adding a _lo or _hi suffix to the name, as in arg_lo+0(FP) or arg_hi+4(FP). If a Go prototype does not name its result, the expected assembly name is ret.
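To make the naming rule concrete, here is a minimal Go sketch that prints the `name+offset(FP)` operands for a prototype. The `param` type, the `fpOperands` helper, and the example prototype `func add(x, y int64) int64` are hypothetical, introduced only for illustration. The sketch assumes the stack-based calling convention used by hand-written assembly, a 64-bit machine, and that each value's alignment equals its size.

```go
package main

import "fmt"

// param is a hypothetical description of one argument or result.
type param struct {
	name string
	size int // size in bytes; this sketch also uses it as the alignment
}

// alignUp rounds off up to the next multiple of align.
func alignUp(off, align int) int {
	return (off + align - 1) / align * align
}

// fpOperands returns the name+offset(FP) operands for the arguments
// followed by the results. Results start at a pointer-aligned offset.
func fpOperands(args, results []param, ptrSize int) []string {
	var out []string
	off := 0
	for _, p := range args {
		off = alignUp(off, p.size)
		out = append(out, fmt.Sprintf("%s+%d(FP)", p.name, off))
		off += p.size
	}
	off = alignUp(off, ptrSize)
	for _, p := range results {
		off = alignUp(off, p.size)
		out = append(out, fmt.Sprintf("%s+%d(FP)", p.name, off))
		off += p.size
	}
	return out
}

func main() {
	// Hypothetical prototype: func add(x, y int64) int64
	// The result is unnamed, so the expected assembly name is ret.
	args := []param{{"x", 8}, {"y", 8}}
	results := []param{{"ret", 8}}
	for _, op := range fpOperands(args, results, 8) {
		fmt.Println(op)
	}
}
```

The names carry no meaning for the assembler beyond documentation, but go vet compares them, and the offsets, against the Go prototype.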

SP #

The SP pseudo-register is a virtual stack pointer used to refer to frame-local variables and the arguments being prepared for function calls. It points to the top of the local stack frame, so references should use negative offsets in the range [−framesize, 0): x-8(SP), y-4(SP), and so on.

On architectures with a hardware register named SP, the name prefix distinguishes references to the virtual stack pointer from references to the architectural SP register. That is, x-8(SP) and -8(SP) are different memory locations: the first refers to the virtual stack pointer pseudo-register, while the second refers to the hardware’s SP register.

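Summary #

How should we understand the pseudo-registers FP and SP? In effect they are mnemonics in Plan 9 pseudo-assembly: each one is a position computed from the current function's stack frame size, expressed as an offset from the hardware SP register.
The distance between pseudo-SP and pseudo-FP is not fixed, so do not try to use pseudo-SP to reach values that are meant to be referenced as FP+offset, such as a function's arguments and return values.
The official documentation's statement that pseudo-SP points to the "top" of the stack is misleading. The location it points to, where the local variables begin, is actually the bottom of the whole frame (not counting the saved caller BP), so "bottom" would be the more fitting word.
An operand such as MOVQ 0(SP), AX, with no symbol name in front of the offset, refers to the real hardware register rather than the pseudo-register.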
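The relationship can be sketched in Go as plain address arithmetic. This is a model, not how the assembler is implemented: it assumes amd64, a non-zero frame size, and a prologue that saves the caller's BP (which matches the disassembly shown later in this article). The `frame` type, its methods, and the starting SP value are hypothetical.

```go
package main

import "fmt"

const ptrSize = 8 // assumption: amd64

// frame models a function after its prologue has run.
type frame struct {
	hwSP      uint64 // value of the hardware SP register
	frameSize uint64 // the $framesize declared on the TEXT line
}

// pseudoSP is the address just above the locals, where the caller's BP is saved.
func (f frame) pseudoSP() uint64 { return f.hwSP + f.frameSize }

// pseudoFP skips the saved BP and the return address.
func (f frame) pseudoFP() uint64 { return f.pseudoSP() + 2*ptrSize }

// resolveSP returns the address of off(SP). A named operand such as
// x-8(SP) uses the pseudo-register; an unnamed one uses the hardware SP.
func (f frame) resolveSP(named bool, off int64) uint64 {
	base := f.hwSP
	if named {
		base = f.pseudoSP()
	}
	return uint64(int64(base) + off)
}

func main() {
	f := frame{hwSP: 0xc000032000, frameSize: 1040} // illustrative values
	fmt.Printf("pseudo-SP: %#x\n", f.pseudoSP())
	fmt.Printf("pseudo-FP: %#x\n", f.pseudoFP())
	fmt.Printf("x-8(SP):   %#x\n", f.resolveSP(true, -8))
	fmt.Printf("-8(SP):    %#x\n", f.resolveSP(false, -8))
}
```

Both pseudo-registers are derived from the hardware SP, which is why changing the hardware SP inside a function also moves what `x-8(SP)` and `arg+0(FP)` refer to.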

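Function call stack analysis #

Some background on the function call stack (for convenience: when function A1 calls function A2, we call A1 the caller and A2 the callee). Every process, thread, and goroutine has its own call stack. Passing arguments and return values and storing a function's local variables are usually done through this stack. Like the stack data structure, the memory stack is last in, first out, and it grows from high addresses toward low addresses. A stack frame, or simply a frame, corresponds to one function call that has not yet returned, and the frame itself stores its data in stack fashion. The stack consists of many frames, which together describe the call relationships between functions.

As the figure below shows, the stack in memory grows from high addresses toward low addresses: the top of the stack is at a lower address than the bottom, and allocating stack space means decreasing SP.

[Figure 20220622230646: the stack growing from high addresses to low addresses]

For Go versions before 1.17, the relationship between caller and callee is shown in the figure below. From Go 1.17 on, arguments and return values are passed in registers (on amd64).

[Figure 20220622204818: caller and callee stack frames before Go 1.17]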

Example #

package main

func main() {
    Sub(2, 1)
}

//go:noinline
func Sub(a, b int) int {
    d := a - b
    return d
}

The generated assembly is as follows:

# disass $pc,+40 
# disass $pc-30, +150
# use help disass to see the usage
# $pc-30, +150 prints the disassembly for that range

(gdb) disass $pc-30, +150
Dump of assembler code from 0x45259f to 0x452635:
   0x000000000045259f:	int3
   0x00000000004525a0 <main.main+0>:	mov    %fs:0xfffffffffffffff8,%rcx
   0x00000000004525a9 <main.main+9>:	cmp    0x10(%rcx),%rsp
   0x00000000004525ad <main.main+13>:	jbe    0x4525dd <main.main+61>
   0x00000000004525af <main.main+15>:	sub    $0x20,%rsp
   0x00000000004525b3 <main.main+19>:	mov    %rbp,0x18(%rsp)
   0x00000000004525b8 <main.main+24>:	lea    0x18(%rsp),%rbp
=> 0x00000000004525bd <main.main+29>:	movq   $0x2,(%rsp)
   0x00000000004525c5 <main.main+37>:	movq   $0x1,0x8(%rsp)
   0x00000000004525ce <main.main+46>:	callq  0x4525f0 <main.Sub>
   0x00000000004525d3 <main.main+51>:	mov    0x18(%rsp),%rbp
   0x00000000004525d8 <main.main+56>:	add    $0x20,%rsp
   0x00000000004525dc <main.main+60>:	retq
   0x00000000004525dd <main.main+61>:	callq  0x44a0f0 <runtime.morestack_noctxt>
   0x00000000004525e2 <main.main+66>:	jmp    0x4525a0 <main.main>
   0x00000000004525e4:	int3
   0x00000000004525e5:	int3
   0x00000000004525e6:	int3
   0x00000000004525e7:	int3
   0x00000000004525e8:	int3
   0x00000000004525e9:	int3
   0x00000000004525ea:	int3
   0x00000000004525eb:	int3
   0x00000000004525ec:	int3
   0x00000000004525ed:	int3
   0x00000000004525ee:	int3
   0x00000000004525ef:	int3
   0x00000000004525f0 <main.Sub+0>:	sub    $0x10,%rsp
   0x00000000004525f4 <main.Sub+4>:	mov    %rbp,0x8(%rsp)
   0x00000000004525f9 <main.Sub+9>:	lea    0x8(%rsp),%rbp
   0x00000000004525fe <main.Sub+14>:	movq   $0x0,0x28(%rsp)
   0x0000000000452607 <main.Sub+23>:	mov    0x18(%rsp),%rax
   0x000000000045260c <main.Sub+28>:	sub    0x20(%rsp),%rax
   0x0000000000452611 <main.Sub+33>:	mov    %rax,(%rsp)
   0x0000000000452615 <main.Sub+37>:	mov    %rax,0x28(%rsp)
   0x000000000045261a <main.Sub+42>:	mov    0x8(%rsp),%rbp
   0x000000000045261f <main.Sub+47>:	add    $0x10,%rsp
   0x0000000000452623 <main.Sub+51>:	retq
   0x0000000000452624:	add    %al,(%rax)
   0x0000000000452626:	add    %al,(%rax)
   0x0000000000452628:	add    %al,(%rax)

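Note that everything above lives in the code segment, so the addresses on the left are code addresses. To analyze the stack, we need to look at the contents of stack addresses instead.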

# x/10og $rsp

# define rr
# info threads
# info register rbp rsp pc
# end

(gdb) rr
  Id   Target Id         Frame
* 1    LWP 3496 "test"   main.Sub (a=2, b=1, ~r2=824634122328) at /root/pprof/common_test/main.go:8
  2    LWP 3500 "test"   runtime.usleep () at /usr/lib/go-1.13/src/runtime/sys_linux_amd64.s:131
  3    LWP 3501 "test"   runtime.futex () at /usr/lib/go-1.13/src/runtime/sys_linux_amd64.s:536
  4    LWP 3502 "test"   runtime.futex () at /usr/lib/go-1.13/src/runtime/sys_linux_amd64.s:536
rbp            0xc000032750        0xc000032750
rsp            0xc000032730        0xc000032730
pc             0x4525f0            0x4525f0 <main.Sub>
(gdb)  x/10og $rsp
0xc000032730:	021222723	02
0xc000032740:	01	014000001420130
0xc000032750:	014000000623530	020463176
0xc000032760:	014000001420000	0
0xc000032770:	014000001420000	0
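The `x/10og` output is in octal, which makes it hard to match against the hexadecimal addresses in the disassembly. A small Go sketch can convert the first few words. The `decodeOctal` helper is a name made up for this example; it relies on `strconv.ParseUint` with base 0 treating a leading `0` as an octal prefix.

```go
package main

import (
	"fmt"
	"strconv"
)

// decodeOctal parses one word as printed by gdb's x/og format.
func decodeOctal(word string) (uint64, error) {
	// Base 0 lets ParseUint honor the leading "0" as an octal prefix.
	return strconv.ParseUint(word, 0, 64)
}

func main() {
	// The first four 8-byte words of the dump, starting at $rsp.
	words := []string{"021222723", "02", "01", "014000001420130"}
	addr := uint64(0xc000032730)
	for _, w := range words {
		v, err := decodeOctal(w)
		if err != nil {
			fmt.Println("parse error:", err)
			return
		}
		fmt.Printf("%#x: %#x\n", addr, v)
		addr += 8
	}
}
```

Decoded this way, the word at `$rsp` is 0x4525d3, the address of main+51 (the instruction after the `callq`). It is followed by the arguments 2 and 1, and then by the not yet initialized result slot, whose value 824634122328 is what gdb reports as `~r2`.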

Now let's walk through the call step by step:

[Figure 20220622193033: the call sequence, step by step]
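The same walk-through can be replayed as a toy simulation in Go. This is only a model of the listing above: the `machine` type and both functions are hypothetical, memory is a map of 8-byte slots, and the starting `rsp` is chosen so that the addresses line up with the gdb session.

```go
package main

import "fmt"

// machine is a toy model of the CPU state we care about: rsp, rbp and
// 8-byte stack slots keyed by address.
type machine struct {
	rsp, rbp uint64
	mem      map[uint64]uint64
}

// runSub replays main.Sub from the listing, starting right after CALL.
func (m *machine) runSub() {
	m.rsp -= 0x10            // sub $0x10,%rsp
	m.mem[m.rsp+0x8] = m.rbp // mov %rbp,0x8(%rsp)
	m.rbp = m.rsp + 0x8      // lea 0x8(%rsp),%rbp
	m.mem[m.rsp+0x28] = 0    // movq $0x0,0x28(%rsp)
	rax := m.mem[m.rsp+0x18] // mov 0x18(%rsp),%rax  (a)
	rax -= m.mem[m.rsp+0x20] // sub 0x20(%rsp),%rax  (b)
	m.mem[m.rsp] = rax       // mov %rax,(%rsp)      (d)
	m.mem[m.rsp+0x28] = rax  // mov %rax,0x28(%rsp)  (result)
	m.rbp = m.mem[m.rsp+0x8] // mov 0x8(%rsp),%rbp
	m.rsp += 0x10            // add $0x10,%rsp
	m.rsp += 8               // retq pops the return address
}

// callSub replays main's call sequence and returns the result slot plus
// the value rsp had on entry to Sub.
func callSub(a, b uint64) (result, rspAtEntry uint64) {
	m := &machine{rsp: 0xc000032758, mem: map[uint64]uint64{}}
	m.rsp -= 0x20             // sub $0x20,%rsp
	m.mem[m.rsp+0x18] = m.rbp // mov %rbp,0x18(%rsp)
	m.rbp = m.rsp + 0x18      // lea 0x18(%rsp),%rbp
	m.mem[m.rsp] = a          // movq $0x2,(%rsp)
	m.mem[m.rsp+0x8] = b      // movq $0x1,0x8(%rsp)
	m.rsp -= 8                // callq pushes the return address...
	m.mem[m.rsp] = 0x4525d3   // ...which is main+51
	rspAtEntry = m.rsp
	m.runSub()
	return m.mem[m.rsp+0x10], rspAtEntry
}

func main() {
	result, entry := callSub(2, 1)
	fmt.Printf("rsp on entry to Sub: %#x\n", entry)
	fmt.Printf("result slot: %d\n", result)
}
```

The point of the model is the offset bookkeeping: `0x18(%rsp)` inside Sub is the same slot as `(%rsp)` in main, because the `callq` and Sub's own `sub $0x10,%rsp` have moved `rsp` down by 0x18 bytes in between.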

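Where the pseudo-registers point #

Now let's run an experiment.
- Determine where pseudo-FP and pseudo-SP sit relative to the real hardware registers
  - pseudo-FP should be 8 bytes above the slot holding the caller's next PC (the return address)
  - pseudo-SP should be at the slot holding the caller's BP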
main.go

package main

func test_FP_SP(a, b int64) (first uintptr, second uintptr)

func main() {
	first, second := test_FP_SP(1, 2)
	first -= second
	_ = first
}

test_FP_SP.s

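// func test_FP_SP(a, b int64) (first uintptr, second uintptr)
// $1040-32: 1040 bytes of local frame and 32 bytes of arguments plus results
// (two int64 arguments and two uintptr results). The "-" is a separator, not a subtraction.
TEXT ·test_FP_SP(SB),$1040-32
        LEAQ x-0(SP), DI         // DI = the address that pseudo-SP resolves to
        MOVQ DI, first+16(FP)    // store that address in the return value first

        MOVQ    SP, BP           // save the hardware SP in BP
        ADDQ    $512, SP         // move the hardware SP up by 0.5K

        LEAQ x-0(SP), SI         // only the hardware SP has changed; evaluate the very same x-0(SP) again into SI

        /* first MOVQ BP, SP */
        MOVQ    BP, SP           // restore the hardware SP: pseudo FP/SP move along with it,
                                 // so restore it before accessing FP
        MOVQ SI, second+24(FP)   // store the shifted pseudo-SP address in the return value second

        /* second MOVQ BP, SP */
        //MOVQ    BP, SP

        RET                      // in the output, second-first is exactly 0.5K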

Compile the source:


# linux
go build -gcflags "-N -l" -o test .
# or 
go build -gcflags "all=-N -l" -o test .

# macOS:
go build -gcflags "all=-N -l" -ldflags=-compressdwarf=false   -o test .

# result
[root@iZf8z14idfp0rwhiicngwqZ FP_SP]# tree .
.
├── main.go
├── test
└── test_FP_SP.s

The gdb commands we use:


gdb ./test
list
b 6
display /25i $pc-8
si
si
si

(gdb) disass $pc, + 240
Dump of assembler code from 0x4591dd to 0x4592cd:
=> 0x00000000004591dd <main.main+29>:	movq   $0x1,(%rsp)
   0x00000000004591e5 <main.main+37>:	movq   $0x2,0x8(%rsp)
   0x00000000004591ee <main.main+46>:	callq  0x459240 <main.test_FP_SP>
   0x00000000004591f3 <main.main+51>:	mov    0x10(%rsp),%rax
   0x00000000004591f8 <main.main+56>:	mov    %rax,0x38(%rsp)
   0x00000000004591fd <main.main+61>:	mov    0x18(%rsp),%rax
   0x0000000000459202 <main.main+66>:	mov    %rax,0x30(%rsp)
   0x0000000000459207 <main.main+71>:	mov    0x38(%rsp),%rax
   0x000000000045920c <main.main+76>:	mov    %rax,0x28(%rsp)
   0x0000000000459211 <main.main+81>:	mov    0x30(%rsp),%rax
   0x0000000000459216 <main.main+86>:	mov    %rax,0x20(%rsp)
   0x000000000045921b <main.main+91>:	mov    0x28(%rsp),%rax
   0x0000000000459220 <main.main+96>:	sub    0x20(%rsp),%rax
   0x0000000000459225 <main.main+101>:	mov    %rax,0x28(%rsp)
   0x000000000045922a <main.main+106>:	mov    0x40(%rsp),%rbp
   0x000000000045922f <main.main+111>:	add    $0x48,%rsp
   0x0000000000459233 <main.main+115>:	retq
   0x0000000000459234 <main.main+116>:	callq  0x450c70 <runtime.morestack_noctxt>
   0x0000000000459239 <main.main+121>:	jmp    0x4591c0 <main.main>
   0x000000000045923b:	int3
   0x000000000045923c:	int3
   0x000000000045923d:	int3
   0x000000000045923e:	int3
   0x000000000045923f:	int3
   0x0000000000459240 <main.test_FP_SP+0>:	mov    %fs:0xfffffffffffffff8,%rcx
   0x0000000000459249 <main.test_FP_SP+9>:	lea    -0x398(%rsp),%rax
   0x0000000000459251 <main.test_FP_SP+17>:	cmp    0x10(%rcx),%rax
   0x0000000000459255 <main.test_FP_SP+21>:	jbe    0x4592ab <main.test_FP_SP+107>
   0x0000000000459257 <main.test_FP_SP+23>:	sub    $0x418,%rsp
   0x000000000045925e <main.test_FP_SP+30>:	mov    %rbp,0x410(%rsp)
   0x0000000000459266 <main.test_FP_SP+38>:	lea    0x410(%rsp),%rbp
   0x000000000045926e <main.test_FP_SP+46>:	lea    0x410(%rsp),%rdi
   0x0000000000459276 <main.test_FP_SP+54>:	mov    %rdi,0x430(%rsp)
   0x000000000045927e <main.test_FP_SP+62>:	mov    %rsp,%rbp
   0x0000000000459281 <main.test_FP_SP+65>:	add    $0x200,%rsp
   0x0000000000459288 <main.test_FP_SP+72>:	lea    0x410(%rsp),%rsi
   0x0000000000459290 <main.test_FP_SP+80>:	mov    %rbp,%rsp
   0x0000000000459293 <main.test_FP_SP+83>:	mov    %rsi,0x438(%rsp)
   0x000000000045929b <main.test_FP_SP+91>:	mov    0x410(%rsp),%rbp
   0x00000000004592a3 <main.test_FP_SP+99>:	add    $0x418,%rsp
   0x00000000004592aa <main.test_FP_SP+106>:	retq
   0x00000000004592ab <main.test_FP_SP+107>:	callq  0x450c70 <runtime.morestack_noctxt>
   0x00000000004592b0 <main.test_FP_SP+112>:	jmp    0x459240 <main.test_FP_SP>

si (stepi): execute exactly one machine instruction. To execute one line of source code instead, type "step" or "s"; if that line is a function call, gdb steps into the function and continues one line at a time.
n: to run an entire function call with one keypress, type "next" or "n"
set pagination off
- turn off paged output
Define a custom command:

define rr
info threads
info register rbp rsp pc
end

Print the values stored at a given address
- x/10og 0xc000032738
- x/40xg

(gdb) help x
Examine memory: x/FMT ADDRESS.
ADDRESS is an expression for the memory address to examine.
FMT is a repeat count followed by a format letter and a size letter.
Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),
  t(binary), f(float), a(address), i(instruction), c(char) and s(string),
  T(OSType), A(floating point values in hex).
Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes).
The specified number of objects of the specified size are printed
according to the format.

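Print a variable's value and address
- p first
- p &first
View the disassembly; take care to start at an instruction boundary
- help disass
- ~~disass $pc-1,+100~~
- disass $pc,+40

disass starts decoding at the address you give it. If that address falls in the middle of an instruction, everything after it is decoded incorrectly. Also note that if you want to disassemble function bar in file foo.c you must type "disassemble 'foo.c'::bar" and not "disassemble foo.c:bar".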

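From the disassembly above we can see that x-0(SP) and first+16(FP) in Go assembly are both addressed relative to the hardware SP register. The positions of the hardware SP, pseudo-FP, and pseudo-SP are marked in the diagram below: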

                           +------------------------+
                           |         second         |
                           +------------------------+
                           |         first          |
                           +------------------------+
                           |           b            |
                           +------------------------+
                           |           a            |
                 pseudo-FP +------------------------+
                           |      caller's PC       |
                           +------------------------+
                           |      caller's BP       |
   pseudo-SP = callee's BP +------------------------+
                           |          ...           |
               hardware SP +------------------------+

hardware SP = caller's SP - 8 (the caller's next PC, pushed by CALL) - callee's stack size,
as marked in the diagram above.

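If we comment out the first MOVQ BP, SP (the one just before MOVQ SI, second+24(FP)), the program panics when run. The cause is that the return PC is then read from the wrong stack location; the line that follows the commented-out instruction is not to blame.

To check this, uncomment the second MOVQ BP, SP (the one just before RET): the program then runs normally, and only the return value is wrong.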
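The numbers in this experiment can be checked with a few lines of Go. The sketch below only models the address arithmetic visible in the disassembly (`sub $0x418,%rsp`, `lea 0x410(%rsp)`). The function names and the starting SP are hypothetical.

```go
package main

import "fmt"

const (
	frameSize = 1040          // $1040 from the TEXT line, 0x410
	subAmount = frameSize + 8 // sub $0x418,%rsp: locals plus the saved BP slot
)

// pseudoSP models x-0(SP), which the assembler emitted as 0x410(%rsp).
func pseudoSP(hwSP uint64) uint64 { return hwSP + frameSize }

// experiment returns the two values the assembly stores in first and second.
func experiment(hwSP uint64) (first, second uint64) {
	first = pseudoSP(hwSP)
	shifted := hwSP + 512 // ADDQ $512, SP
	second = pseudoSP(shifted)
	return first, second
}

// retAddrSlot is where retq pops the return address from, after the
// epilogue has executed add $0x418,%rsp.
func retAddrSlot(hwSP uint64) uint64 { return hwSP + subAmount }

func main() {
	sp := uint64(0xc000032000) // illustrative hardware SP after the prologue
	first, second := experiment(sp)
	fmt.Println("second - first =", second-first)
	fmt.Printf("return address slot, SP restored:     %#x\n", retAddrSlot(sp))
	fmt.Printf("return address slot, SP not restored: %#x\n", retAddrSlot(sp+512))
}
```

With the hardware SP left 512 bytes too high, the epilogue pops the return address from a slot 512 bytes above the real one, which is consistent with the panic described above.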

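Compiler-inserted assembly #

The Go compiler inserts extra assembly at the head of a function to check whether the current goroutine's stack is about to overflow. If it is, the stack has to be grown.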

go version go1.13.8 linux/amd64

   0x0000000000459240 <main.test_FP_SP+0>:	mov    %fs:0xfffffffffffffff8,%rcx
   0x0000000000459249 <main.test_FP_SP+9>:	lea    -0x398(%rsp),%rax
   0x0000000000459251 <main.test_FP_SP+17>:	cmp    0x10(%rcx),%rax
   0x0000000000459255 <main.test_FP_SP+21>:	jbe    0x4592ab <main.test_FP_SP+107>
   // ...
   0x00000000004592ab <main.test_FP_SP+107>:	callq  0x450c70 <runtime.morestack_noctxt>
   0x00000000004592b0 <main.test_FP_SP+112>:	jmp    0x459240 <main.test_FP_SP>

go version go1.15

=> 0x459240 <main.test_FP_SP>:	mov    %fs:0xfffffffffffffff8,%rcx
   0x459249 <main.test_FP_SP+9>:	mov    0x10(%rcx),%rsi
   0x45924d <main.test_FP_SP+13>:	cmp    $0xfffffffffffffade,%rsi
   0x459254 <main.test_FP_SP+20>:	je     0x4592bd <main.test_FP_SP+125>
   //...
   0x4592bd <main.test_FP_SP+125>:	callq  0x450c70 <runtime.morestack_noctxt>
   0x4592c2 <main.test_FP_SP+130>:	jmpq   0x459240 <main.test_FP_SP>
   0x4592c7:	add    %al,(%rax)
   0x4592c9:	add    %al,(%rax)

We can see that jump code has been added at both the head and the tail of the function.
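The check performed by the go1.13 prologues shown in this article can be sketched in Go as follows. This is a reading of the two listings (`cmp 0x10(%rcx),%rsp` in main.main and `lea -0x398(%rsp),%rax` in test_FP_SP), not the compiler's source. The function name and the sample values are hypothetical, `stackSmall = 128` is assumed to match the runtime's StackSmall constant, and very large frames, which use a different sequence, are left out.

```go
package main

import "fmt"

// stackSmall is assumed to match the runtime's StackSmall constant.
const stackSmall = 128

// needMoreStack models the prologue check. frameSize is the amount the
// function subtracts from rsp (0x20 for main.main, 0x418 for test_FP_SP).
func needMoreStack(sp, stackguard0, frameSize uint64) bool {
	if frameSize <= stackSmall {
		// cmp 0x10(%rcx),%rsp ; jbe morestack
		return sp <= stackguard0
	}
	// lea -(frameSize-stackSmall)(%rsp),%rax ; cmp 0x10(%rcx),%rax ; jbe morestack
	return sp-(frameSize-stackSmall) <= stackguard0
}

func main() {
	guard := uint64(0xc000030380) // illustrative g.stackguard0
	fmt.Println(needMoreStack(0xc000032730, guard, 0x20))  // plenty of room
	fmt.Println(needMoreStack(0xc000030400, guard, 0x418)) // large frame near the guard
	fmt.Println(needMoreStack(0xc000030400, guard, 0x20))  // small frame at the same SP
}
```

The value 0x398 in the test_FP_SP listing is 0x418 - 0x80, that is, the frame size minus the 128 bytes of slack.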

0x10(%rcx)
$0xfffffffffffffade

type stack struct {
	lo uintptr
	hi uintptr
}

type g struct {
	// Stack parameters.
	// stack describes the actual stack memory: [stack.lo, stack.hi).
	// stackguard0 is the stack pointer compared in the Go stack growth prologue.
	// It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption.
	// stackguard1 is the stack pointer compared in the C stack growth prologue.
	// It is stack.lo+StackGuard on g0 and gsignal stacks.
	// It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash).
	stack       stack   // offset known to runtime/cgo
	stackguard0 uintptr // offset known to liblink
	stackguard1 uintptr // offset known to liblink
	//...

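[Figure 20220626164439]

Here rcx holds the address of g, so 0x10(%rcx) is the value of g.stackguard0. When preemption has been requested, that value is StackPreempt = -1314, that is, 0xfff...fade.

When preemption has been requested, the program jumps to the runtime.morestack_noctxt call at the tail of the function.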
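Both numbers can be checked with a short Go program. The `gPrefix` struct is a hypothetical mirror of only the leading fields quoted above, not the real `runtime.g`, and the offset shown assumes a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stack and gPrefix mirror the leading fields of runtime.g quoted above.
// They exist only for this illustration.
type stack struct {
	lo uintptr
	hi uintptr
}

type gPrefix struct {
	stack       stack
	stackguard0 uintptr
	stackguard1 uintptr
}

// stackguard0Offset is the byte offset of stackguard0 within the struct.
func stackguard0Offset() uintptr {
	return unsafe.Offsetof(gPrefix{}.stackguard0)
}

// stackPreemptValue returns -1314 reinterpreted as an unsigned 64-bit word.
func stackPreemptValue() uint64 {
	v := int64(-1314)
	return uint64(v)
}

func main() {
	fmt.Printf("offset of stackguard0: %#x\n", stackguard0Offset())
	fmt.Printf("StackPreempt: %#x\n", stackPreemptValue())

	// Any real stack pointer compares below this value, so the prologue
	// check takes the morestack branch once preemption is requested.
	sp := uint64(0xc000032730) // rsp from the gdb session above
	fmt.Println("sp <= StackPreempt:", sp <= stackPreemptValue())
}
```

Because `stack` occupies the first 16 bytes (`lo` and `hi`), `stackguard0` lands at offset 0x10, which is the `0x10(%rcx)` operand seen in the prologue.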

Go assembly #

Addressing modes:

(DI)(BX*2): The location at address DI plus BX*2.
64(DI)(BX*2): The location at address DI plus BX*2 plus 64. These modes accept only 1, 2, 4, and 8 as scale factors.
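As a worked example of these addressing modes, the following Go sketch computes the effective address of an operand of the form `disp(base)(index*scale)`. The `effectiveAddress` function and the register values are hypothetical. Only the formula and the allowed scale factors come from the text above.

```go
package main

import "fmt"

// effectiveAddress computes the address named by disp(base)(index*scale).
// It rejects scale factors other than 1, 2, 4, and 8.
func effectiveAddress(disp int64, base, index uint64, scale uint8) (uint64, error) {
	switch scale {
	case 1, 2, 4, 8:
	default:
		return 0, fmt.Errorf("invalid scale factor %d", scale)
	}
	return uint64(int64(base)+disp) + index*uint64(scale), nil
}

func main() {
	di, bx := uint64(0x1000), uint64(3) // illustrative register values

	addr, err := effectiveAddress(0, di, bx, 2) // (DI)(BX*2)
	fmt.Printf("(DI)(BX*2)   = %#x, err = %v\n", addr, err)

	addr, err = effectiveAddress(64, di, bx, 2) // 64(DI)(BX*2)
	fmt.Printf("64(DI)(BX*2) = %#x, err = %v\n", addr, err)

	_, err = effectiveAddress(0, di, bx, 3) // not a valid scale factor
	fmt.Println("scale 3:", err)
}
```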
