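Preempting goroutines that run too long #
preemptone: requesting preemption #
Continuing from the previous section, look at the preemptone function. It sets g.preempt (the preemption flag) to true and sets g.stackguard0 to a very large value, stackPreempt, defined as (1<<(8*sys.PtrSize) - 1) & -1314, which is 0xfffffffffffffade on a 64-bit platform. The next time the goroutine being preempted makes a function call, the stack-overflow check in the function prologue fails, and the preemption request is handled on that path.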
// Tell the goroutine running on processor P to stop.
// This function is purely best-effort. It can incorrectly fail to inform the
// goroutine. It can send inform the wrong goroutine. Even if it informs the
// correct goroutine, that goroutine might ignore the request if it is
// simultaneously executing newstack.
// No lock needs to be held.
// Returns true if preemption request was issued.
// The actual preemption will happen at some point in the future
// and will be indicated by the gp->status no longer being
// Grunning
func preemptone(_p_ *p) bool {
mp := _p_.m.ptr()
if mp == nil || mp == getg().m {
return false
}
gp := mp.curg // gp is the goroutine to be preempted
if gp == nil || gp == mp.g0 {
return false
}
gp.preempt = true // set the preemption flag
// Every call in a go routine checks for stack overflow by
// comparing the current stack pointer to gp->stackguard0.
// Setting gp->stackguard0 to StackPreempt folds
// preemption into the normal stack overflow check.
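// (1<<(8*sys.PtrSize) - 1) & -1314 ---> 0xfffffffffffffade, a very large value
gp.stackguard0 = stackPreempt // stackguard0 is now a very large value, so the next function call made by this goroutine fails the stack-overflow check and handles the preemption request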
return true
}
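The value of stackPreempt and its effect on the prologue check can be reproduced in a small standalone program. This is a minimal sketch, not runtime code: ptrSize is assumed to be 8 (a 64-bit platform), and needsMorestack, the sample stack pointer, and the sample guard value are hypothetical names and numbers chosen only for illustration.

```go
package main

import "fmt"

// ptrSize is assumed to be 8, i.e. a 64-bit platform such as amd64.
const ptrSize = 8

// stackPreempt repeats the expression quoted in the text above.
const stackPreempt = (1<<(8*ptrSize) - 1) & -1314

// needsMorestack is a hypothetical model of the prologue comparison:
// the slow path is taken when sp is below or equal to stackguard0.
func needsMorestack(sp, stackguard0 uint64) bool {
	return sp <= stackguard0
}

func main() {
	var guard uint64 = stackPreempt
	fmt.Printf("stackPreempt = %#x\n", guard)

	// Illustrative values only: a stack pointer and an ordinary guard below it.
	sp := uint64(0xc000040000)
	normalGuard := uint64(0xc00003e370)

	fmt.Println("normal guard:   ", needsMorestack(sp, normalGuard))
	fmt.Println("preempt request:", needsMorestack(sp, guard))
}
```

With an ordinary guard the check passes and the function body runs; once the guard is replaced by stackPreempt, every realistic stack pointer compares below it, so the slow path is always taken.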
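Triggering preemption #
Once preemptone has set the preemption request, the next question is what actually triggers the preemption. An earlier section described how the compiler adds extra assembly at the start and end of each function (see: the function prologue added by the compiler).
After preemption has been requested, the next function call made by the goroutine goes through the following call chain: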
morestack_noctxt() -> morestack() -> newstack()
runtime·morestack #
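First, look at the runtime·morestack function:
- It is similar to mcall.
- It saves the state of the goroutine that called morestack (call it gN) into its sched field -> installs the current worker thread's g0 in the thread's TLS -> restores g0's stack pointer into the CPU register.
- On the g0 stack it then makes a call: mcall calls the function passed in as its argument, while morestack calls runtime·newstack(SB).
Because this work runs on g0, gN is not affected. When gN is scheduled again, it resumes at the saved PC, which leads back to the start of the function that called morestack. The analysis above also shows the weakness of this scheme: if the running code never calls another function, it cannot be preempted. This is preemption based on checks inserted by the compiler; Go 1.14 added signal-based preemption.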
# morestack but not preserving ctxt.
TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0
MOVL $0, DX
JMP runtime·morestack(SB)
#
# support for morestack
#
# Called during function prolog when more stack is needed.
#
# The traceback routines see morestack on a g0 as being
# the top of a stack (for example, morestack calling newstack
# calling the scheduler calling newm calling gc), so we must
# record an argument size. For that purpose, it has no arguments.
TEXT runtime·morestack(SB),NOSPLIT,$0-0
# First, some sanity checks
# Cannot grow scheduler stack (m->g0).
# ...
# Cannot grow signal stack (m->gsignal).
# ...
# Record the PC, SP and g of f's caller in m->morebuf
# Called from f.
# Set m->morebuf to f's caller.
NOP SP # tell vet SP changed - stop checking offsets
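MOVQ 8(SP), AX # f's caller's PC # For example, my call path is 'main' ---> 'sub_function',
# but with preemption requested it takes this path instead: -> morestack_noctxt() -> morestack() -> newstack(),
# so f here is main in my case.
# Note that morestack_noctxt and morestack both use a frame size of 0, and morestack_noctxt reaches morestack with JMP rather than CALL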
MOVQ AX, (m_morebuf+gobuf_pc)(BX)
LEAQ 16(SP), AX # f's caller's SP
MOVQ AX, (m_morebuf+gobuf_sp)(BX)
get_tls(CX) #...
MOVQ g(CX), SI
MOVQ SI, (m_morebuf+gobuf_g)(BX)
# Save the current register state into g->sched
# Set g->sched to context in f.
MOVQ 0(SP), AX # f's PC
MOVQ AX, (g_sched+gobuf_pc)(SI)
MOVQ SI, (g_sched+gobuf_g)(SI)
LEAQ 8(SP), AX # f's SP
MOVQ AX, (g_sched+gobuf_sp)(SI) # sp is already saved here, inside morestack
MOVQ BP, (g_sched+gobuf_bp)(SI)
MOVQ DX, (g_sched+gobuf_ctxt)(SI)
# Make g0 the g currently running on m; restore g0->sched.sp into the SP register
#Call newstack on m->g0's stack.
MOVQ m_g0(BX), BX
MOVQ BX, g(CX)
MOVQ (g_sched+gobuf_sp)(BX), SP # restore g0's saved SP into the real SP register, so everything below runs on g0's stack
# Call newstack
CALL runtime·newstack(SB)
CALL runtime·abort(SB) #crash if newstack returns
RET
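The weakness described before the assembly listing, that a preemption request is only noticed at a function prologue, can be illustrated with a toy model. This is a sketch under stated assumptions and not runtime code: step, runUntilPreempted, and the idea of representing a goroutine as a list of steps are all hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// step is one unit of work in this toy model. isCall marks steps that
// begin with a function prologue and therefore a stack-guard check.
type step struct{ isCall bool }

// runUntilPreempted walks the steps in order and returns how many steps
// completed before a pending preemption request was noticed. The request
// is noticed only at a call step, mirroring the prologue-based check.
func runUntilPreempted(steps []step, preemptRequested bool) int {
	for i, s := range steps {
		if s.isCall && preemptRequested {
			return i
		}
	}
	return len(steps)
}

func main() {
	tightLoop := []step{{false}, {false}, {false}, {false}}
	withCall := []step{{false}, {false}, {true}, {false}}

	// No call steps: the request is never noticed, all 4 steps run.
	fmt.Println(runUntilPreempted(tightLoop, true))
	// The third step is a call: the goroutine yields after 2 steps.
	fmt.Println(runUntilPreempted(withCall, true))
}
```

In this model the loop without calls always runs to completion, which is the situation signal-based preemption in Go 1.14 was introduced to handle.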
newstack(SB) #
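The part of newstack that matters here checks whether preemption has been requested; if it has, the goroutine is handed to gopreempt_m instead of having its stack grown.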
gopreempt_m #
It mainly puts the preempted goroutine back on the global run queue:
// gopreempt_m(gp) ---> goschedImpl(gp)
func gopreempt_m(gp *g) {
if trace.enabled {
traceGoPreempt()
}
goschedImpl(gp)
}
goschedImpl was covered in an earlier section (see: goschedImpl).
Stack growth code #
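The decision that newstack makes between preempting and growing the stack can be sketched as a toy model. This is not the runtime's code: toyG and toyNewstack are hypothetical names, the struct keeps only the fields mentioned in this article, and the real function performs additional checks that are left out here.

```go
package main

import "fmt"

// stackPreempt is the sentinel value discussed earlier (64-bit form).
const stackPreempt uint64 = 0xfffffffffffffade

// toyG is a hypothetical, heavily reduced stand-in for the runtime's g.
type toyG struct {
	preempt     bool
	stackguard0 uint64
}

// toyNewstack models only the choice between the two paths described in
// this article: a guard equal to stackPreempt means "preempt", anything
// else means the stack really is too small and must be copied.
func toyNewstack(gp *toyG) string {
	if gp.stackguard0 == stackPreempt {
		return "gopreempt_m"
	}
	return "copystack"
}

func main() {
	requested := &toyG{preempt: true, stackguard0: stackPreempt}
	ordinary := &toyG{stackguard0: 0xc00003e370} // illustrative guard value

	fmt.Println(toyNewstack(requested))
	fmt.Println(toyNewstack(ordinary))
}
```

The point of the model is that the same entry path serves two purposes, and the sentinel guard value alone tells them apart.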
func newstack() {
// ... preemption code omitted
// Allocate a bigger segment and move the stack.
oldsize := gp.stack.hi - gp.stack.lo
newsize := oldsize * 2 // the new stack is simply twice the old size
if newsize > maxstacksize {
print("runtime: goroutine stack exceeds ", maxstacksize, "-byte limit\n")
throw("stack overflow")
}
// The goroutine must be executing in order to call newstack,
// so it must be Grunning (or Gscanrunning).
casgstatus(gp, _Grunning, _Gcopystack)
// The concurrent GC will not scan the stack while we are doing the copy since
// the gp is in a Gcopystack status.
copystack(gp, newsize, true)
if stackDebug >= 1 {
print("stack grow done\n")
}
casgstatus(gp, _Gcopystack, _Grunning)
gogo(&gp.sched)
}
func copystack(gp *g, newsize uintptr, sync bool) {
//...
// allocate new stack
new := stackalloc(uint32(newsize))
//...
}
- stackalloc
// stackalloc allocates an n byte stack.
//
// stackalloc must run on the system stack because it uses per-P
// resources and must not split the stack.
//
//go:systemstack
func stackalloc(n uint32) stack {
// Small stacks are allocated with a fixed-size free-list allocator.
// If we need a stack of a bigger size, we fall back on allocating
// a dedicated span.
var v unsafe.Pointer
if n < _FixedStack<<_NumStackOrders && n < _StackCacheSize {
// Small stacks are served by the fixed-size free-list allocator (code omitted).
} else {
//...
if s == nil {
// For a bigger stack, allocate a dedicated span.
// Allocate a new stack from the heap.
s = mheap_.allocManual(npage, &memstats.stacks_inuse)
if s == nil {
throw("out of memory")
}
osStackAlloc(s)
s.elemsize = uintptr(n)
}
//...
}
//...
}
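The doubling in newstack and the size test in stackalloc can be tried out together in a short program. This is a sketch under assumptions: the three constants are the values I would expect for linux/amd64 and may differ on other platforms or Go versions, and grow and isSmall are hypothetical helper names, not runtime functions.

```go
package main

import "fmt"

// Assumed values for linux/amd64; other platforms may differ.
const (
	fixedStack     = 2048  // assumed _FixedStack
	numStackOrders = 4     // assumed _NumStackOrders
	stackCacheSize = 32768 // assumed _StackCacheSize
)

// grow returns the size newstack asks for: twice the old size.
func grow(oldsize uintptr) uintptr { return oldsize * 2 }

// isSmall repeats the condition from stackalloc that selects the
// fixed-size free-list path instead of a dedicated span.
func isSmall(n uint32) bool {
	return n < fixedStack<<numStackOrders && n < stackCacheSize
}

func main() {
	size := uintptr(fixedStack)
	for i := 0; i < 5; i++ {
		fmt.Printf("%6d bytes  free-list path: %v\n", size, isSmall(uint32(size)))
		size = grow(size)
	}
}
```

Under these assumed constants, stacks of 2048 through 16384 bytes take the free-list path, and the first doubling that reaches 32768 bytes falls through to the dedicated-span branch.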
Verifying with an experiment #
The test program #
main.go
package main
import "fmt"
func call_some_job() {
fmt.Println("complete this job")
}
func main() {
for i := 0; i < 100000; i++ {
i = i
}
call_some_job()
}
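Preparation before debugging with gdb #
Building the program #
Build the source with: go build -gcflags "-N -l" -o test . (the trailing dot is the package path).
Source locations for the mcall breakpoints #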
- gdb
list /usr/lib/golang/src/runtime/proc.go:267
list /tmp/kubernets/test_preempt/main.go:1
list /usr/lib/golang/src/runtime/asm_amd64.s:454
A user-defined gdb command #
define zxc
info threads
info register rbp rsp pc
end
gdb #
[root@gitlab test_preempt]# gdb ./test
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/kubernets/test_preempt/test...done.
Loading Go Runtime support.
(gdb) list
1 package main
2
3 import "fmt"
4
5 func call_some_job() {
6
7 fmt.Println("complete this job")
8 }
9
10 func main() {
(gdb)
11 call_some_job()
12 }
(gdb) b 10
Breakpoint 1 at 0x48cf90: file /tmp/kubernets/test_preempt/main.go, line 10.
(gdb) run
Starting program: /tmp/kubernets/test_preempt/./test
Breakpoint 1, main.main () at /tmp/kubernets/test_preempt/main.go:10
10 func main() {
(gdb) disas
Dump of assembler code for function main.main:
=> 0x000000000048cf90 <+0>: mov %fs:0xfffffffffffffff8,%rcx --------------------------------here
0x000000000048cf99 <+9>: cmp 0x10(%rcx),%rsp --------------------------------here
0x000000000048cf9d <+13>: jbe 0x48cfb9 <main.main+41>
0x000000000048cf9f <+15>: sub $0x8,%rsp
0x000000000048cfa3 <+19>: mov %rbp,(%rsp)
0x000000000048cfa7 <+23>: lea (%rsp),%rbp
0x000000000048cfab <+27>: callq 0x48cef0 <main.call_some_job>
0x000000000048cfb0 <+32>: mov (%rsp),%rbp
0x000000000048cfb4 <+36>: add $0x8,%rsp
0x000000000048cfb8 <+40>: retq
0x000000000048cfb9 <+41>: callq 0x4517d0 <runtime.morestack_noctxt> --------------------------------here
0x000000000048cfbe <+46>: jmp 0x48cf90 <main.main>
End of assembler dump.
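The three lines marked --------------------------------here are what was described earlier: g.stackguard0 is compared with the sp register, and if sp is less than or equal to g.stackguard0 (jbe), execution jumps to the call to runtime.morestack_noctxt. The preemption request set earlier, gp.stackguard0 = stackPreempt, makes stackguard0 a very large value that is always greater than sp, so the jump to runtime.morestack_noctxt is always taken.
In morestack, MOVQ 0(SP), AX // f's PC finds the PC of morestack's caller at 0(SP) because at that point rbp has not yet been saved in the callee's stack frame.
Finally, consider what happens after <runtime.morestack_noctxt> has been called: the next PC is jmp 0x48cf90 <main.main>, so execution jumps back to the start of main.main.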