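4.2: Traps from user space

High-level path of a trap from user space

A trap from user space happens when a user program makes a system call (the ecall instruction), does something illegal (for example, accessing an unmapped address; note that integer division by zero does not trap on RISC-V), or when a device interrupt arrives. The high-level execution path of a trap from user space is: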

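  • uservec (kernel/trampoline.S): saves the user registers in TRAPFRAME, then sets up the kernel stack pointer, the kernel page table, and the jump target

  • usertrap (kernel/trap.c): determines the cause of the trap and handles it

  • usertrapret (kernel/trap.c): prepares the trapframe and control registers for the return to user space

  • userret (kernel/trampoline.S): switches to the user page table, restores the user registers, and returns to user mode with sret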

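As the uservec listing in this section shows, uservec runs in supervisor mode but still uses the process's user page table.

When a user process is created, stvec is first pointed at uservec along this path: allocproc -> forkret -> usertrapret -> w_stvec(trampoline_uservec). allocproc does not call forkret directly; it sets the new process's saved return address to forkret, so the first context switch into the process lands there.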

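How uservec works

The overall flow of uservec is:

  • The user process makes a system call. Following the stub and the calling convention, the system call number goes into register a7 and the arguments go into a0 through a5; the stub then executes ecall to enter the trap

  • ecall transfers control to uservec (the address held in stvec). uservec saves the current CPU context (the user registers) and then loads the kernel values stored in the trapframe, such as kernel_sp and the kernel page table address, into the corresponding CPU registers

    • uservec first parks a0 in the scratch register sscratch. With a0 free, it can use a0 as the base address for storing every other register into the trapframe that belongs to the process's proc structure

    • The trapframe holds the address of the current process's kernel stack (kernel_sp), the current CPU's hartid, the address of the usertrap function, and the address of the kernel page table. uservec loads the kernel page table address into satp and jumps to usertrap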

.section trampsec
.globl trampoline
.globl usertrap
trampoline:
.align 4
.globl uservec
uservec:    
	#
        # trap.c sets stvec to point here, so
        # traps from user space start here,
        # in supervisor mode, but with a
        # user page table.
        #

        # save user a0 in sscratch so
        # a0 can be used to get at TRAPFRAME.
        csrw sscratch, a0

        # each process has a separate p->trapframe memory area,
        # but it's mapped to the same virtual address
        # (TRAPFRAME) in every process's user page table.
        li a0, TRAPFRAME
        
        # save the user registers in TRAPFRAME
        sd ra, 40(a0)
        sd sp, 48(a0)
        sd gp, 56(a0)
        sd tp, 64(a0)
        sd t0, 72(a0)
        sd t1, 80(a0)
        sd t2, 88(a0)
        sd s0, 96(a0)
        sd s1, 104(a0)
        sd a1, 120(a0)
        sd a2, 128(a0)
        sd a3, 136(a0)
        sd a4, 144(a0)
        sd a5, 152(a0)
        sd a6, 160(a0)
        sd a7, 168(a0)
        sd s2, 176(a0)
        sd s3, 184(a0)
        sd s4, 192(a0)
        sd s5, 200(a0)
        sd s6, 208(a0)
        sd s7, 216(a0)
        sd s8, 224(a0)
        sd s9, 232(a0)
        sd s10, 240(a0)
        sd s11, 248(a0)
        sd t3, 256(a0)
        sd t4, 264(a0)
        sd t5, 272(a0)
        sd t6, 280(a0)

	# save the user a0 in p->trapframe->a0
        csrr t0, sscratch
        sd t0, 112(a0)

        # initialize kernel stack pointer, from p->trapframe->kernel_sp
        ld sp, 8(a0)

        # make tp hold the current hartid, from p->trapframe->kernel_hartid
        ld tp, 32(a0)

        # load the address of usertrap(), from p->trapframe->kernel_trap
        ld t0, 16(a0)

        # fetch the kernel page table address, from p->trapframe->kernel_satp.
        ld t1, 0(a0)

        # wait for any previous memory operations to complete, so that
        # they use the user page table.
        sfence.vma zero, zero

        # install the kernel page table.
        csrw satp, t1

        # flush now-stale user entries from the TLB.
        sfence.vma zero, zero

        # jump to usertrap(), which does not return
        jr t0
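
The numeric offsets in the listing (40(a0), 112(a0), and so on) are byte offsets into the trapframe. The following is a minimal sketch of a struct whose layout reproduces those offsets. The field names follow the comments in the listing plus the epc field used by usertrap; the struct itself is a reconstruction for illustration, and the real definition in kernel/proc.h should be treated as authoritative.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Reconstructed from the offsets used by uservec; illustration only. */
struct trapframe {
  /*   0 */ uint64_t kernel_satp;   /* kernel page table */
  /*   8 */ uint64_t kernel_sp;     /* top of the process's kernel stack */
  /*  16 */ uint64_t kernel_trap;   /* address of usertrap() */
  /*  24 */ uint64_t epc;           /* saved user program counter */
  /*  32 */ uint64_t kernel_hartid; /* hartid, loaded into tp */
  /*  40 */ uint64_t ra;
  /*  48 */ uint64_t sp;
  /*  56 */ uint64_t gp;
  /*  64 */ uint64_t tp;
  /*  72 */ uint64_t t0;
  /*  80 */ uint64_t t1;
  /*  88 */ uint64_t t2;
  /*  96 */ uint64_t s0;
  /* 104 */ uint64_t s1;
  /* 112 */ uint64_t a0;
  /* 120 */ uint64_t a1;
  /* 128 */ uint64_t a2;
  /* 136 */ uint64_t a3;
  /* 144 */ uint64_t a4;
  /* 152 */ uint64_t a5;
  /* 160 */ uint64_t a6;
  /* 168 */ uint64_t a7;
  /* 176 */ uint64_t s2;
  /* 184 */ uint64_t s3;
  /* 192 */ uint64_t s4;
  /* 200 */ uint64_t s5;
  /* 208 */ uint64_t s6;
  /* 216 */ uint64_t s7;
  /* 224 */ uint64_t s8;
  /* 232 */ uint64_t s9;
  /* 240 */ uint64_t s10;
  /* 248 */ uint64_t s11;
  /* 256 */ uint64_t t3;
  /* 264 */ uint64_t t4;
  /* 272 */ uint64_t t5;
  /* 280 */ uint64_t t6;
};

/* Compile-time checks against the constants hard-coded in the assembly. */
static_assert(offsetof(struct trapframe, kernel_satp) == 0, "ld t1, 0(a0)");
static_assert(offsetof(struct trapframe, kernel_sp) == 8, "ld sp, 8(a0)");
static_assert(offsetof(struct trapframe, kernel_trap) == 16, "ld t0, 16(a0)");
static_assert(offsetof(struct trapframe, kernel_hartid) == 32, "ld tp, 32(a0)");
static_assert(offsetof(struct trapframe, ra) == 40, "sd ra, 40(a0)");
static_assert(offsetof(struct trapframe, a0) == 112, "sd t0, 112(a0)");
static_assert(offsetof(struct trapframe, t6) == 280, "sd t6, 280(a0)");
```

Because the assembly hard-codes these numbers, reordering the struct fields without updating trampoline.S would silently save registers into the wrong slots; compile-time checks like the ones above are one way to catch that.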

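Implementation of usertrap

  • uservec jumps to usertrap; for a system call, usertrap goes on to call syscall, which actually carries out the call

    • usertrap determines the cause of the trap, handles it, and returns

      • It first points stvec at kernelvec, so that interrupts and exceptions taken while in the kernel are handled by kerneltrap

      • It saves sepc into p->trapframe->epc, because usertrap may call yield to switch to another process's kernel thread; that process may return to user space and overwrite sepc along the way

      • Depending on the cause of the trap:

        • System call: usertrap calls syscall to handle it. User code must resume at the instruction after ecall, so the saved pc is advanced by the size of the ecall instruction (4 bytes): p->trapframe->epc += 4;

        • Device interrupt: usertrap calls devintr to handle it

        • Exception: the kernel kills the faulting process

      • Finally, before returning, it exits if the process has been killed, and if the trap was a timer interrupt (which_dev == 2) it yields the CPU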
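
The three-way decision in the list above depends on how RISC-V encodes scause: the top bit distinguishes interrupts from exceptions, and exception code 8 means an ecall from user mode. Here is a small sketch of that decoding. The names trap_kind, classify_scause, and pc_after_ecall are hypothetical helpers for illustration, not xv6 functions; xv6 itself tests scause == 8 first and otherwise delegates to devintr.

```c
#include <stdint.h>

#define SCAUSE_INTERRUPT (1ULL << 63) /* top bit set: the trap is an interrupt */
#define SCAUSE_ECALL_U   8ULL         /* exception code: ecall from user mode */

/* Hypothetical classification, mirroring the three cases in the list. */
enum trap_kind { TRAP_SYSCALL, TRAP_INTERRUPT, TRAP_EXCEPTION };

enum trap_kind classify_scause(uint64_t scause)
{
  if (scause & SCAUSE_INTERRUPT)
    return TRAP_INTERRUPT;   /* xv6 hands these to devintr() */
  if (scause == SCAUSE_ECALL_U)
    return TRAP_SYSCALL;     /* xv6 calls syscall() */
  return TRAP_EXCEPTION;     /* xv6 marks the process as killed */
}

/* ecall is always a 4-byte instruction (there is no compressed form),
   so the resume address is sepc + 4. */
uint64_t pc_after_ecall(uint64_t sepc)
{
  return sepc + 4;
}
```

For example, classify_scause(0x8000000000000009ULL) reports an interrupt (supervisor external interrupt), while classify_scause(13) reports an exception (load page fault).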

//
// handle an interrupt, exception, or system call from user space.
// called from trampoline.S
//
void
usertrap(void)
{
  int which_dev = 0;

  if((r_sstatus() & SSTATUS_SPP) != 0)
    panic("usertrap: not from user mode");

  // send interrupts and exceptions to kerneltrap(),
  // since we're now in the kernel.
  w_stvec((uint64)kernelvec);

  struct proc *p = myproc();
  
  // save user program counter.
  p->trapframe->epc = r_sepc();
  
  if(r_scause() == 8){
    // system call

    if(killed(p))
      exit(-1);

    // sepc points to the ecall instruction,
    // but we want to return to the next instruction.
    p->trapframe->epc += 4;

    // an interrupt will change sepc, scause, and sstatus,
    // so enable only now that we're done with those registers.
    intr_on();

    syscall();
  } else if((which_dev = devintr()) != 0){
    // ok
  } else {
    printf("usertrap(): unexpected scause 0x%lx pid=%d\n", r_scause(), p->pid);
    printf("            sepc=0x%lx stval=0x%lx\n", r_sepc(), r_stval());
    setkilled(p);
  }

  if(killed(p))
    exit(-1);

  // give up the CPU if this is a timer interrupt.
  if(which_dev == 2)
    yield();

  usertrapret();
}
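
usertrap only calls syscall(); the dispatch itself uses the values uservec saved in the trapframe. The sketch below shows the general shape of that dispatch on a cut-down trapframe: the number in a7 indexes a table of handlers and the result is written back into a0, which becomes the user-visible return value. The struct mini_trapframe, the two demo handlers, and their numbers are all made up for illustration and are not xv6's actual table.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for the real trapframe: only the two fields used here. */
struct mini_trapframe {
  uint64_t a0; /* first argument on entry, return value on exit */
  uint64_t a7; /* system call number */
};

typedef uint64_t (*syscall_fn)(struct mini_trapframe *tf);

/* Hypothetical handlers, not real xv6 system calls. */
uint64_t sys_answer_demo(struct mini_trapframe *tf) { (void)tf; return 42; }
uint64_t sys_double_demo(struct mini_trapframe *tf) { return tf->a0 * 2; }

/* Handler table indexed by system call number; slot 0 is left empty. */
static const syscall_fn demo_syscalls[] = {
  [1] = sys_answer_demo,
  [2] = sys_double_demo,
};

#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

void dispatch_syscall(struct mini_trapframe *tf)
{
  uint64_t num = tf->a7;
  if (num > 0 && num < NELEM(demo_syscalls) && demo_syscalls[num] != NULL)
    tf->a0 = demo_syscalls[num](tf); /* result goes back to user a0 */
  else
    tf->a0 = (uint64_t)-1;           /* unknown call number: report failure */
}
```

Writing the result into the trapframe rather than returning it directly matters because the user registers are only reloaded later, by userret, from the trapframe.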

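Implementation of usertrapret

  • The first step in returning to user space is the call to usertrapret. It points stvec back at uservec, refills the kernel fields of the trapframe that uservec will need on the next trap, sets sstatus and sepc for the return, and finally jumps to userret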

//
// return to user space
//
void
usertrapret(void)
{
  struct proc *p = myproc();

  // we're about to switch the destination of traps from
  // kerneltrap() to usertrap(), so turn off interrupts until
  // we're back in user space, where usertrap() is correct.
  intr_off();

  // send syscalls, interrupts, and exceptions to uservec in trampoline.S
  uint64 trampoline_uservec = TRAMPOLINE + (uservec - trampoline);
  w_stvec(trampoline_uservec);

  // set up trapframe values that uservec will need when
  // the process next traps into the kernel.
  p->trapframe->kernel_satp = r_satp();         // kernel page table
  p->trapframe->kernel_sp = p->kstack + PGSIZE; // process's kernel stack
  p->trapframe->kernel_trap = (uint64)usertrap;
  p->trapframe->kernel_hartid = r_tp();         // hartid for cpuid()

  // set up the registers that trampoline.S's sret will use
  // to get to user space.
  
  // set S Previous Privilege mode to User.
  unsigned long x = r_sstatus();
  x &= ~SSTATUS_SPP; // clear SPP to 0 for user mode
  x |= SSTATUS_SPIE; // enable interrupts in user mode
  w_sstatus(x);

  // set S Exception Program Counter to the saved user pc.
  w_sepc(p->trapframe->epc);

  // tell trampoline.S the user page table to switch to.
  uint64 satp = MAKE_SATP(p->pagetable);

  // jump to userret in trampoline.S at the top of memory, which 
  // switches to the user page table, restores user registers,
  // and switches to user mode with sret.
  uint64 trampoline_userret = TRAMPOLINE + (userret - trampoline);
  ((void (*)(uint64))trampoline_userret)(satp);
}
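
Two of the steps in usertrapret are plain bit manipulation: preparing sstatus for the sret, and building the satp value for the user page table. The sketch below reproduces them as ordinary functions so the arithmetic can be checked on any machine. The bit positions (SPP is bit 8, SPIE is bit 5) and the Sv39 mode value 8 in the top four bits of satp come from the RISC-V privileged specification; the function names are hypothetical, and the physical address in the example is arbitrary.

```c
#include <stdint.h>

#define SSTATUS_SPP  (1ULL << 8)  /* previous privilege: 1 = supervisor, 0 = user */
#define SSTATUS_SPIE (1ULL << 5)  /* interrupt-enable value restored by sret */
#define SATP_SV39    (8ULL << 60) /* satp MODE field = Sv39 */

/* Mirror of the sstatus update in usertrapret: make sret drop to user
   mode with interrupts enabled. */
uint64_t sstatus_for_user_return(uint64_t sstatus)
{
  sstatus &= ~SSTATUS_SPP; /* return to user mode */
  sstatus |= SSTATUS_SPIE; /* interrupts on once in user mode */
  return sstatus;
}

/* Same arithmetic as a MAKE_SATP-style macro: mode bits plus the
   physical page number of the root page table. */
uint64_t make_satp(uint64_t pagetable_pa)
{
  return SATP_SV39 | (pagetable_pa >> 12);
}
```

For instance, a root page table at physical address 0x87f6e000 has page number 0x87f6e, so make_satp yields 0x8000000000087f6e.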

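The role of the trampoline page

The trampoline page contains the uservec code, which is the xv6 trap-handling code that stvec points to. The page is mapped at the address TRAMPOLINE in every process's user page table. The source of the uservec trap handler is kernel/trampoline.S.

When uservec starts, all 32 CPU registers hold values that belong to the interrupted user code. These 32 values must be saved somewhere in memory so that the kernel can restore them when it returns to user space. Storing to memory needs a register to hold the address, but at this point no general-purpose register is free. RISC-V provides the sscratch register for this purpose: the csrw instruction at the start of uservec saves a0 in sscratch, after which uservec has one register, a0, to work with.

uservec's next task is to save the 32 user registers. The kernel allocates a page of memory for each process to hold a trapframe structure, which has room for the 32 user registers (kernel/proc.h). Because satp still refers to the user page table, the trapframe must be mapped in the user address space. xv6 therefore maps each process's trapframe at the virtual address TRAPFRAME in that process's user page table; TRAPFRAME sits just below TRAMPOLINE. The process's p->trapframe also points to the trapframe, but through its physical address, so that the kernel can reach it via the kernel page table.

uservec loads the address TRAPFRAME into a0 and saves all the user registers there, including the user's a0, which it reads back from sscratch.

The trapframe also contains the address of the current process's kernel stack, the current CPU's hartid, the address of the usertrap function, and the address of the kernel page table. uservec retrieves these values, switches satp to the kernel page table, and jumps to usertrap.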
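
To make "TRAPFRAME sits just below TRAMPOLINE" concrete, the following sketch computes both addresses. It assumes the usual xv6 Sv39 constants (a 4096-byte page, and MAXVA one bit short of the full 39-bit range); these definitions are reproduced from memory of kernel/riscv.h and kernel/memlayout.h and should be checked against the source tree in use. trampoline_symbol_va is a hypothetical helper showing the same arithmetic as TRAMPOLINE + (uservec - trampoline) in usertrapret.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumed xv6 constants; verify against your kernel sources. */
#define PGSIZE     4096ULL
#define MAXVA      (1ULL << (9 + 9 + 9 + 12 - 1))
#define TRAMPOLINE (MAXVA - PGSIZE)      /* highest page of the address space */
#define TRAPFRAME  (TRAMPOLINE - PGSIZE) /* the page directly below it */

static_assert(TRAMPOLINE - TRAPFRAME == PGSIZE, "trapframe is one page below");

/* Virtual address of a symbol inside the trampoline page, given its byte
   offset from the start of the trampoline code (hypothetical helper). */
uint64_t trampoline_symbol_va(uint64_t offset_in_page)
{
  return TRAMPOLINE + offset_in_page;
}
```

With these constants TRAMPOLINE is 0x3ffffff000 and TRAPFRAME is 0x3fffffe000. Since the kernel page table maps the trampoline page at the same virtual address, the instruction stream of uservec stays valid across the write to satp.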
