Pseudo-assembly | Go scheduler source code analysis

Go pseudo-assembly instructions #

Addressing modes #

(DI)(BX*2): The location at address DI plus BX*2.
64(DI)(BX*2): The location at address DI plus BX*2 plus 64. These modes accept only 1, 2, 4, and 8 as scale factors.

Struct offset + register #

For example:

// (m_morebuf+gobuf_pc)(REGISTER)

	MOVQ	8(SP), AX	// f's caller's PC
	MOVQ	AX, (m_morebuf+gobuf_pc)(BX)

type m struct {
	g0      *g     // goroutine with scheduling stack
	morebuf gobuf  // gobuf arg to morestack   //-----------morebuf-------------//
	divmod  uint32 // div/mod denominator for arm - known to liblink
	//...
}

type gobuf struct {
	// The offsets of sp, pc, and g are known to (hard-coded in) libmach.
	//
	// ctxt is unusual with respect to GC: it may be a
	// heap-allocated funcval, so GC needs to track it, but it
	// needs to be set and cleared from assembly, where it's
	// difficult to have write barriers. However, ctxt is really a
	// saved, live register, and we only ever exchange it between
	// the real register and the gobuf. Hence, we treat it as a
	// root during stack scanning, which means assembly that saves
	// and restores it doesn't need write barriers. It's still
	// typed as a pointer so that any other writes from Go get
	// write barriers.
	sp   uintptr
	pc   uintptr   // <<<--- 
	g    guintptr
	ctxt unsafe.Pointer
	ret  sys.Uintreg
	lr   uintptr
	bp   uintptr // for GOEXPERIMENT=framepointer
}

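From m_morebuf+gobuf_pc we can tell that the operand refers to the pc field of the morebuf field (a gobuf) inside the m struct: m_morebuf is the offset of morebuf within m, gobuf_pc is the offset of pc within gobuf, and their sum is applied to the m pointer held in the register.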

Go pseudo-assembly functions #

The dropg function #

dropg() removes the association between the current g and its m; in effect it sets g->m = nil and m->curg = nil.

func dropg() {
	_g_ := getg()

	setMNoWB(&_g_.m.curg.m, nil)
	setGNoWB(&_g_.m.curg, nil)
}
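As a rough model of what dropg undoes, the following sketch links and unlinks two toy structs. The g and m types here are hypothetical, heavily simplified stand-ins, and attach and detach are names made up for this illustration; the real runtime uses setMNoWB and setGNoWB to skip write barriers.

```go
package main

import "fmt"

// m and g are hypothetical, heavily simplified stand-ins for the runtime types.
type m struct {
	curg *g // goroutine currently running on this thread
}

type g struct {
	m *m // thread currently running this goroutine
}

// attach links gp and mp in both directions.
func attach(mp *m, gp *g) {
	mp.curg = gp
	gp.m = mp
}

// detach mirrors the order used by dropg: clear curg.m first, then m.curg.
func detach(mp *m) {
	mp.curg.m = nil
	mp.curg = nil
}

func main() {
	mp, gp := &m{}, &g{}
	attach(mp, gp)
	detach(mp)
	fmt.Println(gp.m == nil, mp.curg == nil) // prints: true true
}
```

The order matters in this model: mp.curg must still be non-nil when curg.m is cleared, otherwise the first assignment would dereference a nil pointer.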

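Exit and wrap-up functions #

- A non-main goroutine finishes running: goexit0
- Voluntary scheduling: gosched_m
- Preemptive scheduling (the goroutine has run for too long): gopreempt_m
  - gopreempt_m and gosched_m both call goschedImpl internally, so they behave the same way.
- Passive scheduling: park_m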
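From user code, three of these paths can be reached on purpose; preemption depends on runtime timing and is not shown. The sketch below is only an illustration and the helper name run is made up here. It starts one goroutine that simply returns, one that calls runtime.Gosched, and one that receives from an unbuffered channel (if no sender is ready yet, the receiver is parked; otherwise the sender is the one that parks).

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// run starts three goroutines that leave the CPU in different ways
// and returns the strings they recorded (in no particular order).
func run() []string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	record := func(s string) {
		mu.Lock()
		out = append(out, s)
		mu.Unlock()
	}
	ch := make(chan string)
	wg.Add(3)
	go func() { // returns normally: the runtime finishes it via goexit0
		defer wg.Done()
		record("returned")
	}()
	go func() { // yields voluntarily: runtime.Gosched leads to gosched_m
		defer wg.Done()
		runtime.Gosched()
		record("yielded")
	}()
	go func() { // blocks on a receive: a blocked goroutine is parked via park_m
		defer wg.Done()
		record(<-ch)
	}()
	ch <- "parked"
	wg.Wait()
	return out
}

func main() {
	fmt.Println(len(run()))
}
```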

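TLS-related functions #

With an ordinary global variable, a modification made by one thread is visible to every thread. A thread-private global variable is different: each thread has its own copy, and a change made by one thread does not affect the copies held by other threads. This is thread-local storage (TLS).

TLS is a pseudo-register that holds the base address of thread-local storage (the base address plus an offset gives a complete address). Semantically,

MOVQ TLS, reg
off(reg)(TLS*1)

is equivalent to

off(TLS)

and the (TLS*1) annotation marks the access as indexing from the thread-local storage base.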

	// Thread-local storage references use the TLS pseudo-register.
	// As a register, TLS refers to the thread-local storage base, and it
	// can only be loaded into another register:
	//
	//         MOVQ TLS, AX
	//
	// An offset from the thread-local storage base is written off(reg)(TLS*1).
	// Semantically it is off(reg), but the (TLS*1) annotation marks this as
	// indexing from the loaded TLS base. This emits a relocation so that
	// if the linker needs to adjust the offset, it can. For example:
	//
	//         MOVQ TLS, AX
	//         MOVQ 0(AX)(TLS*1), CX // load g into CX
	//
	// On systems that support direct access to the TLS memory, this
	// pair of instructions can be reduced to a direct TLS memory reference:
	//
	//         MOVQ 0(TLS), CX // load g into CX
	//
	// The 2-instruction and 1-instruction forms correspond to the two code
	// sequences for loading a TLS variable in the local exec model given in "ELF
	// Handling For Thread-Local Storage".

    // When building for inclusion into a shared library, an instruction of the form
    //     MOV off(CX)(TLS*1), AX
    // becomes
    //     mov %fs:off(%rcx), %rax
    // which assumes that the correct TLS offset has been loaded into %rcx (today
    // there is only one TLS variable -- g -- so this is OK). When not building for
    // a shared library the instruction does not require a prefix.

runtime·rt0_go(SB) #

TEXT runtime·rt0_go(SB),NOSPLIT,$0
    //...
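    LEAQ	runtime·m0+m_tls(SB), DI  // DI = &m0.tls: load the address of m0's tls field into DI
    CALL	runtime·settls(SB)        // call settls to set up thread-local storage; its argument is passed in DI

    get_tls(BX)                       // load the TLS base into BX to test whether the binding above worked
    MOVQ	$0x123, g(BX)             // store 0x123 into m0.tls[0]
    MOVQ	runtime·m0+m_tls(SB), AX
    CMPQ	AX, $0x123                // compare to confirm that the store landed in m0.tls[0]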
    JEQ 2(PC)
    CALL	runtime·abort(SB)

settls #

Sets the segment base address.

// set tls base to DI
TEXT runtime·settls(SB),NOSPLIT,$32
    ADDQ	$8, DI      // ELF wants to use -8(FS) // https://akkadia.org/drepper/tls.pdf
    MOVQ	DI, SI      // SI is the second argument of arch_prctl, addr
    MOVQ	$0x1002, DI // ARCH_SET_FS, the first argument of arch_prctl, code
    MOVQ	$SYS_arch_prctl, AX
    SYSCALL
    CMPQ	AX, $0xfffffffffffff001
    JLS	2(PC)
    MOVL	$0xf1, 0xf1 // crash
    RET

This mainly relies on the arch_prctl system call:

// arch_prctl() sets architecture-specific process or thread state. code selects a subfunction and passes argument addr to it;
int arch_prctl(int code, unsigned long addr);

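The first argument selects the operation. ARCH_SET_FS sets the 64-bit base of the FS register to the value given in addr; its operation code is 0x1002.

Because settls first adds 8 to DI, the steps above set the FS segment base to the address of m0.tls[0] plus 8, so that -8(FS) refers to m0.tls[0].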

get_tls(BX) and g(BX) #

Both get_tls(BX) and g() are actually macro definitions:

#ifdef GOARCH_amd64
#define	get_tls(r)	MOVQ TLS, r
 
#define	g(r)	0(r)(TLS*1)
#endif

They expand to:

MOVQ TLS, BX
0(BX)(TLS*1)

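Since 0(BX)(TLS*1) is equivalent to 0(TLS), the code above stores 0x123 into m0.tls[0], and the CMPQ AX, $0x123 comparison then checks whether the binding that was just set up works.

The getg function #

getg() on its own returns the current g. To get the current user g, prefer getg().m.curg: when executing on the system stack or the signal stack, getg() returns the current M's "g0" or "gsignal" respectively, which is usually not what you want. To determine whether you are running on the user stack or the system stack, use getg() == getg().m.curg.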

// getg returns the pointer to the current g.
// The compiler rewrites calls to this function into instructions
// that fetch the g directly (from TLS or from the dedicated register).
func getg() *g

https://github.com/golang/go/blob/master/src/runtime/HACKING.md#getg-and-getgmcurg

Appendix #

https://lrita.github.io/2017/12/12/golang-asm/#interface

https://www.zhihu.com/question/284288720
https://segmentfault.com/a/1190000038626134
https://www.altoros.com/blog/golang-internals-part-5-the-runtime-bootstrap-process/

gid

https://golang.org/doc/asm#directives
https://lrita.github.io/2017/12/12/golang-asm/#%E5%B1%80%E9%83%A8%E5%8F%98%E9%87%8F%E5%A3%B0%E6%98%8E
https://github.com/go-internals-cn/go-internals