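Code That Runs Too Long

Preempting code that runs too long #

preemptone: requesting preemption #

Continuing from the previous article, look at the preemptone function. It sets g.preempt (the preemption flag) to true and sets g.stackguard0 to a very large value, stackPreempt: (1<<(8*sys.PtrSize) - 1) & -1314 ---> 0xfffffffffffffade on 64-bit platforms. As a result, the next time the targeted goroutine calls a function, the stack-overflow check in the function prologue fails, and the goroutine ends up handling the preemption request.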

// Tell the goroutine running on processor P to stop.
// This function is purely best-effort. It can incorrectly fail to inform the
// goroutine. It can send inform the wrong goroutine. Even if it informs the
// correct goroutine, that goroutine might ignore the request if it is
// simultaneously executing newstack.
// No lock needs to be held.
// Returns true if preemption request was issued.
// The actual preemption will happen at some point in the future
// and will be indicated by the gp->status no longer being
// Grunning
func preemptone(_p_ *p) bool {
	mp := _p_.m.ptr()
	if mp == nil || mp == getg().m {
		return false
	}
	gp := mp.curg // gp == the goroutine to be preempted
	if gp == nil || gp == mp.g0 {
		return false
	}

	gp.preempt = true // set the preemption flag: preempt == true

	// Every call in a go routine checks for stack overflow by
	// comparing the current stack pointer to gp->stackguard0.
	// Setting gp->stackguard0 to StackPreempt folds
	// preemption into the normal stack overflow check.

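	// (1<<(8*sys.PtrSize) - 1) & -1314 ---> 0xfffffffffffffade, a very large value
	gp.stackguard0 = stackPreempt // stackguard0 == a very large value, so the next function call made by the preempted goroutine fails the stack-overflow check and handles the preemption request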
	return true
}
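The value quoted in the comments can be checked with ordinary Go constant arithmetic. The following is a minimal sketch, assuming a 64-bit platform; ptrSize, uintptrMask and stackPreemptValue are illustrative names chosen for this example, not the runtime's own declarations.

```go
package main

import "fmt"

// ptrSize is an assumption: pointer size in bytes on a 64-bit platform.
const ptrSize = 8

const (
	// All-ones mask as wide as a pointer: 1<<64 - 1 on 64-bit.
	uintptrMask = 1<<(8*ptrSize) - 1
	// Untyped constants have arbitrary precision, so the negative
	// operand is masked down to its 64-bit two's complement pattern.
	stackPreempt = uintptrMask & -1314
)

// stackPreemptValue returns the sentinel as a uint64 so it can be printed.
func stackPreemptValue() uint64 { return stackPreempt }

func main() {
	fmt.Printf("%#x\n", stackPreemptValue()) // prints 0xfffffffffffffade
}
```

Because the result sits just below the top of the address space, no real stack pointer can be above it.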

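Triggering the preemption #

Once preemptone has set the preemption request, the next question is what actually triggers the preemption. An earlier article described the extra assembly that the compiler adds at the start and end of every function: the function prologue added by the compiler.

After the request is set, the next function call made by the goroutine goes through the following call chain: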

morestack_noctxt() -> morestack() -> newstack()
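How the request reaches this call chain can be modelled in a few lines of Go. This is only a sketch: the g struct here is a hypothetical, stripped-down stand-in for the runtime's g, stackGuard is an illustrative guard size, and the addresses are made up.

```go
package main

import "fmt"

// stackPreempt is the sentinel value quoted in the text (64-bit).
const stackPreempt uint64 = 0xfffffffffffffade

// stackGuard is an assumption: an illustrative guard size in bytes.
const stackGuard = 880

// g is a hypothetical, stripped-down stand-in for the runtime's g struct.
type g struct {
	stackLo     uint64
	stackguard0 uint64
	preempt     bool
}

// requestPreempt mimics what preemptone does to the target goroutine.
func requestPreempt(gp *g) {
	gp.preempt = true
	gp.stackguard0 = stackPreempt
}

// prologueCallsMorestack models the compiler-inserted check: the function
// prologue branches to morestack when sp <= stackguard0 (unsigned).
func prologueCallsMorestack(gp *g, sp uint64) bool {
	return sp <= gp.stackguard0
}

func main() {
	gp := &g{stackLo: 0xc000040000} // hypothetical stack base
	gp.stackguard0 = gp.stackLo + stackGuard
	sp := uint64(0xc000041f00) // hypothetical stack pointer

	fmt.Println(prologueCallsMorestack(gp, sp)) // false: enough stack, no request
	requestPreempt(gp)
	fmt.Println(prologueCallsMorestack(gp, sp)) // true: the check now always fails
}
```

The same comparison serves both purposes, which is why preemption costs nothing extra on the normal call path.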

runtime·morestack #

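First, look at the runtime·morestack function:

  • It is similar to mcall
    • Save the goroutine that called morestack (call it gN) into its sched field -> associate the current worker thread's g0 with the thread's TLS -> restore the g0 stack of the current worker thread into the CPU registers
    • On the g0 stack, mcall calls the function passed in as its argument, whereas morestack calls runtime·newstack(SB), so gN itself is not affected. When gN is scheduled again, it resumes at the saved PC and re-executes the head of the function it was entering. The analysis above also exposes the weakness: if the running code never calls another function, it cannot be preempted. This is preemption based on compiler-inserted checks; Go 1.14 introduced signal-based preemption.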
# morestack but not preserving ctxt.
TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0
	MOVL	$0, DX
	JMP	runtime·morestack(SB)

# 
# support for morestack
# 

# Called during function prolog when more stack is needed.
#
# The traceback routines see morestack on a g0 as being
# the top of a stack (for example, morestack calling newstack
# calling the scheduler calling newm calling gc), so we must
# record an argument size. For that purpose, it has no arguments.
TEXT runtime·morestack(SB),NOSPLIT,$0-0	
# Start with a few sanity checks
	#  Cannot grow scheduler stack (m->g0).
	# ...
	#  Cannot grow signal stack (m->gsignal).
	# ...

# Set the PC, SP and g in m->morebuf to those of f's caller ('main' in this example)
	# Called from f.
	# Set m->morebuf to f's caller.
	NOP	SP	# tell vet SP changed - stop checking offsets
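	MOVQ	8(SP), AX	# f's caller's PC
	# Example call path: 'main' ---> 'sub_function'.
	# A preemption request is pending, so the prologue takes this path instead: -> morestack_noctxt() -> morestack() -> newstack()
	# In this example f is therefore sub_function, and f's caller is main.
	# Note that morestack_noctxt and morestack both declare a frame size of 0,
	# and control passes between them with JMP rather than CALL.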
	MOVQ	AX, (m_morebuf+gobuf_pc)(BX)
	LEAQ	16(SP), AX	# f's caller's SP
	MOVQ	AX, (m_morebuf+gobuf_sp)(BX)
	get_tls(CX)  #...
	MOVQ	g(CX), SI
	MOVQ	SI, (m_morebuf+gobuf_g)(BX)

# Save the current register state into g->sched
	# Set g->sched to context in f.
	MOVQ	0(SP), AX # f's PC
	MOVQ	AX, (g_sched+gobuf_pc)(SI)
	MOVQ	SI, (g_sched+gobuf_g)(SI)
	LEAQ	8(SP), AX # f's SP
	MOVQ	AX, (g_sched+gobuf_sp)(SI) # the SP value is already saved here, inside morestack
	MOVQ	BP, (g_sched+gobuf_bp)(SI)
	MOVQ	DX, (g_sched+gobuf_ctxt)(SI)

# Make g0 the G currently running on m; restore g0->sched->sp into the SP register
	#Call newstack on m->g0's stack.
	MOVQ	m_g0(BX), BX
	MOVQ	BX, g(CX)
	MOVQ	(g_sched+gobuf_sp)(BX), SP # load g0's saved SP into the real SP register, so everything below runs on the g0 stack
# Call newstack
	CALL	runtime·newstack(SB)
	CALL	runtime·abort(SB)	#crash if newstack returns
	RET
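The offsets used above (0(SP), 8(SP), 16(SP)) can be traced with a small model. This is a sketch under stated assumptions: memory is modelled as a map from address to 8-byte word, gobuf is reduced to two fields, and all addresses except 0x48cfbe (the instruction after the call to morestack_noctxt in the gdb session later in this article) are hypothetical.

```go
package main

import "fmt"

// gobuf is a hypothetical two-field reduction of the runtime's gobuf.
type gobuf struct {
	sp, pc uint64
}

// saveContext models what morestack records on entry. f has not built a
// frame yet, so 0(SP) holds the return address into f and 8(SP) holds
// the return address into f's caller.
func saveContext(mem map[uint64]uint64, sp uint64) (sched, morebuf gobuf) {
	sched = gobuf{pc: mem[sp], sp: sp + 8}      // g->sched: context in f
	morebuf = gobuf{pc: mem[sp+8], sp: sp + 16} // m->morebuf: f's caller
	return sched, morebuf
}

func main() {
	const sp = 0x7000 // hypothetical SP on entry to morestack
	mem := map[uint64]uint64{
		sp:     0x48cfbe, // return address into f
		sp + 8: 0x4290e0, // hypothetical return address into f's caller
	}
	sched, morebuf := saveContext(mem, sp)
	fmt.Printf("sched:   pc=%#x sp=%#x\n", sched.pc, sched.sp)
	fmt.Printf("morebuf: pc=%#x sp=%#x\n", morebuf.pc, morebuf.sp)
}
```

When the goroutine is later resumed from g->sched, it continues at the saved PC inside f, which is why the prologue check runs a second time.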

newstack(SB) #

The part to focus on in the code below is the check for whether a preemption request has been set:

func newstack() {
	thisg := getg() // at this point we are on the g0 stack again

	//...

	gp := thisg.m.curg // this is the original goroutine

	//...

	// NOTE: stackguard0 may change underfoot, if another thread
	// is about to try to preempt gp. Read it just once and use that same
	// value now and below.
	preempt := atomic.Loaduintptr(&gp.stackguard0) == stackPreempt // check whether stackguard0 carries the preemption marker

	// Be conservative about where we preempt.
	// We are interested in preempting user Go code, not runtime code.
	// If we're holding locks, mallocing, or preemption is disabled, don't
	// preempt.
	// This check is very early in newstack so that even the status change
	// from Grunning to Gwaiting and back doesn't happen in this case.
	// That status change by itself can be viewed as a small preemption,
	// because the GC might change Gwaiting to Gscanwaiting, and then
	// this goroutine has to wait for the GC to finish before continuing.
	// If the GC is in some way dependent on this goroutine (for example,
	// it needs a lock held by the goroutine), that small preemption turns
	// into a real deadlock.
	if preempt {
		// A series of states is checked here; if any of them holds, the goroutine is not preempted and keeps running.
		if thisg.m.locks != 0 || thisg.m.mallocing != 0 || thisg.m.preemptoff != "" || thisg.m.p.ptr().status != _Prunning {
			// Let the goroutine keep running for now.
			// gp->preempt is set, so it will be preempted next time.
			gp.stackguard0 = gp.stack.lo + _StackGuard // restore stackguard0 to its normal value; this request has been seen
			gogo(&gp.sched) // never return
		}
	}

	if gp.stack.lo == 0 {
		throw("missing stack in newstack")
	}
	sp := gp.sched.sp
	if sys.ArchFamily == sys.AMD64 || sys.ArchFamily == sys.I386 || sys.ArchFamily == sys.WASM {
		// The call to morestack cost a word.
		sp -= sys.PtrSize
	}
	if stackDebug >= 1 || sp < gp.stack.lo {
		print("runtime: newstack sp=", hex(sp), " stack=[", hex(gp.stack.lo), ", ", hex(gp.stack.hi), "]\n",
			"\tmorebuf={pc:", hex(morebuf.pc), " sp:", hex(morebuf.sp), " lr:", hex(morebuf.lr), "}\n",
			"\tsched={pc:", hex(gp.sched.pc), " sp:", hex(gp.sched.sp), " lr:", hex(gp.sched.lr), " ctxt:", gp.sched.ctxt, "}\n")
	}
	if sp < gp.stack.lo {
		print("runtime: gp=", gp, ", goid=", gp.goid, ", gp->status=", hex(readgstatus(gp)), "\n ")
		print("runtime: split stack overflow: ", hex(sp), " < ", hex(gp.stack.lo), "\n")
		throw("runtime: split stack overflow")
	}

	if preempt {
		if gp == thisg.m.g0 {
			throw("runtime: preempt g0")
		}
		if thisg.m.p == 0 && thisg.m.locks == 0 {
			throw("runtime: g is running but p is not")
		}
		// Synchronize with scang.
		casgstatus(gp, _Grunning, _Gwaiting) // switch gp to the waiting state; the GC handling below expects _Gwaiting
		if gp.preemptscan { // GC-related, ignore for now
			for !castogscanstatus(gp, _Gwaiting, _Gscanwaiting) {
				// Likely to be racing with the GC as
				// it sees a _Gwaiting and does the
				// stack scan. If so, gcworkdone will
				// be set and gcphasework will simply
				// return.
			}
			if !gp.gcscandone {
				// gcw is safe because we're on the
				// system stack.
				gcw := &gp.m.p.ptr().gcw
				scanstack(gp, gcw)
				gp.gcscandone = true
			}
			gp.preemptscan = false
			gp.preempt = false
			casfrom_Gscanstatus(gp, _Gscanwaiting, _Gwaiting)
			// This clears gcscanvalid.
			casgstatus(gp, _Gwaiting, _Grunning)
			gp.stackguard0 = gp.stack.lo + _StackGuard
			gogo(&gp.sched) // never return
		}

		// Act like goroutine called runtime.Gosched.
		casgstatus(gp, _Gwaiting, _Grunning) // restore the status
		gopreempt_m(gp) // put gp on the global run queue and call schedule() again; never return. gopreempt_m(gp) ---call---> goschedImpl(gp) ---call---> globrunqput() (global queue) / schedule()
	}

	//...
}
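The branching in this excerpt can be condensed into a small decision function. This is a simplified sketch, not the runtime's code: mState and decide are hypothetical names, pRunning stands in for the m.p status check, and the GC-related preemptscan branch is left out.

```go
package main

import "fmt"

// mState is a hypothetical subset of the m fields that newstack inspects.
type mState struct {
	locks      int32
	mallocing  int32
	preemptoff string
	pRunning   bool // stands in for thisg.m.p.ptr().status == _Prunning
}

// decide summarizes which path the excerpt takes.
// The preemptscan (GC) branch is intentionally omitted.
func decide(preempt bool, m mState) string {
	if !preempt {
		return "grow stack"
	}
	if m.locks != 0 || m.mallocing != 0 || m.preemptoff != "" || !m.pRunning {
		return "resume without preempting"
	}
	return "yield via gopreempt_m"
}

func main() {
	idle := mState{pRunning: true}
	fmt.Println(decide(false, idle))                             // grow stack
	fmt.Println(decide(true, mState{locks: 1, pRunning: true})) // resume without preempting
	fmt.Println(decide(true, idle))                              // yield via gopreempt_m
}
```

In the deferred case gp.preempt stays set, so the goroutine is preempted at a later opportunity.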

gopreempt_m #

Its main job is to put the preempted goroutine back on the global run queue:

// gopreempt_m(gp) ---> goschedImpl(gp)
func gopreempt_m(gp *g) {
    if trace.enabled {
        traceGoPreempt()
    }
    goschedImpl(gp)
}

goschedImpl was covered earlier: goschedImpl

Stack growth code #

func newstack() {
	//... preemption code omitted

	// Allocate a bigger segment and move the stack.
	oldsize := gp.stack.hi - gp.stack.lo
	newsize := oldsize * 2  // the new stack is simply twice the old size
	if newsize > maxstacksize {
		print("runtime: goroutine stack exceeds ", maxstacksize, "-byte limit\n")
		throw("stack overflow")
	}

	// The goroutine must be executing in order to call newstack,
	// so it must be Grunning (or Gscanrunning).
	casgstatus(gp, _Grunning, _Gcopystack)

	// The concurrent GC will not scan the stack while we are doing the copy since
	// the gp is in a Gcopystack status.
	copystack(gp, newsize, true)
	if stackDebug >= 1 {
		print("stack grow done\n")
	}
	casgstatus(gp, _Gcopystack, _Grunning)
	gogo(&gp.sched)
}
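The doubling rule bounds how often a stack can grow. The sketch below counts the successful growths; the starting size of 2048 bytes and the limit of 1000000000 bytes are assumptions about typical 64-bit defaults and are not stated in this article, and growthSteps is a name made up for the example.

```go
package main

import "fmt"

// growthSteps doubles size for as long as the doubled size stays within
// limit, mirroring the newsize > maxstacksize check, and returns the
// number of successful growths and the final stack size.
func growthSteps(initial, limit uint64) (steps int, final uint64) {
	size := initial
	for size*2 <= limit {
		size *= 2
		steps++
	}
	return steps, size
}

func main() {
	// Assumed values: 2048-byte initial stack, 1000000000-byte limit.
	steps, final := growthSteps(2048, 1000000000)
	fmt.Println(steps, final) // prints 18 536870912
}
```

With these numbers, the next attempt to grow would ask for 1073741824 bytes and hit the "stack overflow" throw.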
func copystack(gp *g, newsize uintptr, sync bool) {
	//...

	// allocate new stack
	new := stackalloc(uint32(newsize))

	//...
}
  • stackalloc
// stackalloc allocates an n byte stack.
//
// stackalloc must run on the system stack because it uses per-P
// resources and must not split the stack.
//
//go:systemstack
func stackalloc(n uint32) stack {

	// Small stacks are allocated with a fixed-size free-list allocator.
	// If we need a stack of a bigger size, we fall back on allocating
	// a dedicated span.
	var v unsafe.Pointer
	if n < _FixedStack<<_NumStackOrders && n < _StackCacheSize {
		// Small stacks are allocated with the fixed-size free-list allocator.
	} else {
		//...
		if s == nil {
			// If a bigger stack is needed, a dedicated span is allocated for it.
			// Allocate a new stack from the heap.
			s = mheap_.allocManual(npage, &memstats.stacks_inuse)
			if s == nil {
				throw("out of memory")
			}
			osStackAlloc(s)
			s.elemsize = uintptr(n)
		}
		//...
	}
	//...
}
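The size test at the top of stackalloc decides which allocator serves a request. The sketch below evaluates that condition for a few sizes; the three constants are assumptions about typical Linux values, not figures taken from this article, and usesFreeList is a hypothetical helper.

```go
package main

import "fmt"

// Assumed values for Linux; the article does not state them.
const (
	fixedStack     = 2048
	numStackOrders = 4
	stackCacheSize = 32768
)

// usesFreeList mirrors the condition in stackalloc: small stacks come
// from the fixed-size free lists, everything else gets a dedicated span.
func usesFreeList(n uint32) bool {
	return n < fixedStack<<numStackOrders && n < stackCacheSize
}

func main() {
	for _, n := range []uint32{2048, 4096, 8192, 16384, 32768, 65536} {
		fmt.Println(n, usesFreeList(n))
	}
}
```

Under these assumptions a stack that keeps doubling leaves the free-list path once it reaches 32768 bytes.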

Verifying by experiment #

The test program #

main.go

package main

import "fmt"

func call_some_job() {

	fmt.Println("complete this job")
}

func main() {
	for i := 0; i < 100000; i++ {
		i = i
	}
	call_some_job()
}
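The loop in main contains no function calls, so it has no prologue check at which a pending request could be noticed. The following variation is not part of the original experiment; it is a hedged sketch that makes the difference observable. It assumes Go 1.14 or later on a platform with signal-based preemption, and survivesBusyLoop is a name invented for this example. On an older toolchain the call is expected to hang.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// survivesBusyLoop runs a call-free spin loop on a single P and reports
// whether the calling goroutine got to run again to stop it.
func survivesBusyLoop() bool {
	prev := runtime.GOMAXPROCS(1) // one P, so the spinner can starve us
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	go func() {
		// The loop body makes no function calls, so no prologue
		// stack check runs while it spins.
		for atomic.LoadInt32(&stop) == 0 {
		}
		close(done)
	}()

	time.Sleep(20 * time.Millisecond) // the spinner now owns the only P
	atomic.StoreInt32(&stop, 1)       // reached only if the spinner was preempted
	<-done
	return true
}

func main() {
	fmt.Println("main ran again:", survivesBusyLoop())
}
```

With only prologue-based preemption the store to stop is never reached, because the spinning goroutine never gives up the P.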

Preparing for gdb debugging #

Building the program #

Build the source: go build -gcflags "-N -l" -o test ..

Files for setting the mcall breakpoints #

  • gdb
    • list /usr/lib/golang/src/runtime/proc.go:267
    • list /tmp/kubernets/test_preempt/main.go:1
    • list /usr/lib/golang/src/runtime/asm_amd64.s:454

A user-defined gdb command #

define zxc
info threads
info register rbp rsp pc
end

gdb #

[root@gitlab test_preempt]# gdb ./test
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/kubernets/test_preempt/test...done.
Loading Go Runtime support.
(gdb) list
1	package main
2
3	import "fmt"
4
5	func call_some_job() {
6
7		fmt.Println("complete this job")
8	}
9
10	func main() {
(gdb)
11		call_some_job()
12	}
(gdb) b 10
Breakpoint 1 at 0x48cf90: file /tmp/kubernets/test_preempt/main.go, line 10.
(gdb) run
Starting program: /tmp/kubernets/test_preempt/./test

Breakpoint 1, main.main () at /tmp/kubernets/test_preempt/main.go:10
10	func main() {
(gdb) disas
Dump of assembler code for function main.main:
=> 0x000000000048cf90 <+0>:	    mov    %fs:0xfffffffffffffff8,%rcx     --------------------------------here
   0x000000000048cf99 <+9>: 	cmp    0x10(%rcx),%rsp                     --------------------------------here
   0x000000000048cf9d <+13>:	jbe    0x48cfb9 <main.main+41>
   0x000000000048cf9f <+15>:	sub    $0x8,%rsp
   0x000000000048cfa3 <+19>:	mov    %rbp,(%rsp)
   0x000000000048cfa7 <+23>:	lea    (%rsp),%rbp
   0x000000000048cfab <+27>:	callq  0x48cef0 <main.call_some_job>
   0x000000000048cfb0 <+32>:	mov    (%rsp),%rbp
   0x000000000048cfb4 <+36>:	add    $0x8,%rsp
   0x000000000048cfb8 <+40>:	retq
   0x000000000048cfb9 <+41>:	callq  0x4517d0 <runtime.morestack_noctxt> --------------------------------here
   0x000000000048cfbe <+46>:	jmp    0x48cf90 <main.main>
End of assembler dump.

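The three lines marked --------------------------------here are the check described earlier: g.stackguard0 is compared with the sp register, and if sp is less than or equal to g.stackguard0 (jbe), execution jumps to runtime.morestack_noctxt. The preemption request set earlier, gp.stackguard0 = stackPreempt, makes stackguard0 a very large value that is guaranteed to be greater than sp, so the next function call made by the preempted goroutine fails the stack check and jumps to runtime.morestack_noctxt, where the preemption request is handled.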
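The guarantee that the sentinel exceeds every sp depends on jbe being an unsigned comparison. A minimal sketch with a hypothetical stack pointer value shows the difference; the two helper names are made up for this example.

```go
package main

import "fmt"

// unsignedBelowOrEqual is the comparison jbe performs.
func unsignedBelowOrEqual(sp, guard uint64) bool { return sp <= guard }

// signedLessOrEqual shows what a signed comparison of the same bits would do.
func signedLessOrEqual(sp, guard uint64) bool { return int64(sp) <= int64(guard) }

func main() {
	guard := uint64(0xfffffffffffffade) // stackPreempt on 64-bit
	sp := uint64(0xc000041f00)          // hypothetical stack pointer

	fmt.Println(unsignedBelowOrEqual(sp, guard)) // true: branch to morestack
	fmt.Println(signedLessOrEqual(sp, guard))    // false: same bits read as signed
	fmt.Println(int64(guard))                    // -1314
}
```

Read as a signed number, the same bit pattern is -1314, which is where the constant in the definition of stackPreempt comes from.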

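In morestack, MOVQ 0(SP), AX // f's PC can read f's PC directly from 0(SP), the return address pushed by the call to morestack_noctxt, because at that point f has not yet saved rbp or allocated any stack space of its own.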

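Continuing from there: when <runtime.morestack_noctxt> is called, the next PC after the call is jmp 0x48cf90 <main.main>, which jumps straight back to the start of the function. See the disas output above.

Appendix #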