Saturday, June 18, 2011

Switching between user mode and kernel mode.

Hello,
Today I want to dig into something we hear about very often during software development and debugging: the different modes the processor runs in on Windows, namely kernel mode and user mode.
We also call these ring 0 and ring 3 execution.
There are several situations in which the OS switches from user mode to kernel mode (from a lower privilege level to a higher one) and back again. Examples are:
> Interrupts
> Exceptions
> System Calls

I will go into the details of system calls today. When we call a Windows API from user mode, ntdll makes the transition to kernel mode, and after the kernel completes the function call the results are returned to user mode.
For our example let's take ntdll!NtReadFile. A user mode file read ends up in this function and is then transferred to kernel mode. Let's check what is inside it. Below is the disassembly of this function:
          mov     r10,rcx
          mov     eax,3
          syscall
          ret

Notice the syscall instruction: this is the instruction that makes the fast call into kernel mode. For reference, this output is from an x64 based PC.
On Intel x64 CPUs, syscall is supported only in 64-bit mode, not in compatibility mode.
So what does the processor do when it sees the syscall instruction? Here are the steps, taken from the Intel manual.
The processor saves RFLAGS to R11 and the RIP of the next instruction to RCX. Because RCX is overwritten this way, the stub first copies the first argument from RCX into R10 (the mov r10,rcx above).
To run the kernel mode code, the processor needs the following pieces of information:
Target Code Segment (CS)
Target Instruction (where the execution should start)
Stack Segment
System flags
It gets all of this from CPU-specific registers:
CS = IA32_STAR[47:32]
RIP = IA32_LSTAR (64bit Canonical address)
SS = IA32_STAR[47:32] +8
System flags: the processor sets RFLAGS to the logical AND of its current value with the complement of the value in IA32_FMASK.
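
The register loads above can be sketched as a small Python model. This is only an illustration of the arithmetic, not how any real tool works: the function name and all sample values are made up for the example, and the masking of the low two selector bits for CS follows the pseudocode in the Intel manual.

```python
MASK64 = (1 << 64) - 1

def syscall_entry_state(star, lstar, fmask, rflags, next_rip):
    """Model what SYSCALL loads, given the MSR values and the caller's state.

    Hypothetical helper for illustration only.
    """
    base_selector = (star >> 32) & 0xFFFF      # IA32_STAR[47:32]
    return {
        "rcx": next_rip,                       # return address saved in RCX
        "r11": rflags,                         # caller's RFLAGS saved in R11
        "cs": base_selector & 0xFFFC,          # low two bits (RPL) forced to 0
        "ss": (base_selector + 8) & 0xFFFF,    # SS selector is CS selector + 8
        "rip": lstar,                          # entry point from IA32_LSTAR
        "rflags": rflags & ~fmask & MASK64,    # clear every bit set in IA32_FMASK
    }

# Sample values are assumptions chosen for the example, not read from a machine.
state = syscall_entry_state(
    star=0x0023001000000000,
    lstar=0xFFFFF80001234560,
    fmask=0x4700,
    rflags=0x246,
    next_rip=0x77001234,
)
print(hex(state["cs"]), hex(state["ss"]), hex(state["rflags"]))
```

With these sample values the interrupt flag (bit 9) is set in the caller's RFLAGS and also set in the mask, so it comes out cleared in the kernel's RFLAGS.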

IA32_STAR, IA32_LSTAR and IA32_FMASK are MSRs (Model Specific Registers). You can read them with the rdmsr WinDbg command if you know their indexes, which are listed in the processor manuals: C0000081 for IA32_STAR, C0000082 for IA32_LSTAR and C0000084 for IA32_FMASK.

When I checked on my box, I found that IA32_LSTAR (the new RIP) holds the address of nt!KiSystemCall64.
So now you know where execution goes when the syscall instruction is executed. Note that this is the kernel entry point for every system call, not just ntdll!NtReadFile.

Once in the kernel, nt!KiSystemCall64 decides what to do, that is, which API to call, based on a parameter it receives from user mode. (I explain below where this parameter comes from.)
The syscall instruction itself does not switch stacks (the kernel stack pointer in the TSS is what the processor loads on interrupts and exceptions), so the entry code has to load the current thread's kernel stack pointer itself. The user mode stack pointer, RIP, RFLAGS and so on are then saved on the kernel stack so that they can be restored when the call returns to user mode.

As I said, an argument is passed from user mode to identify which system call to invoke in kernel mode. That argument is the value placed in the EAX register right before the syscall instruction:

mov r10,rcx
mov eax,3
syscall
ret

Corresponding to this number 3, we need to find the API in the system service table. On 64-bit Windows the table holds offsets rather than pointers: each entry is 32 bits wide and is relative to nt!KiServiceTable. So to find the address of a specific service routine, pick its entry out of the nt!KiServiceTable array and add the offset to the base address of nt!KiServiceTable.

Hang on, the statement above needs a small correction. You should not add the raw entry to the base. The kernel uses a trick: only the upper 28 bits of each entry hold the offset, and the low 4 bits hold the number of arguments the system call passes on the stack. So what you actually add is (entry >> 4), i.e. the entry with the low 4 bits removed. The offset is signed, so a negative entry such as fff72d00 below needs a sign-preserving shift. System call number 3 is the fourth entry in the table:
kd> dd nt!KiServiceTable l4
fffff800`014c7b00 04106900 02f6f000 fff72d00 031a0105

kd> ln nt!KiServiceTable+(031a0105>>4)
nt!NtReadFile (.....)
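
The same lookup can be sketched in Python, using the base address and the four table entries from the dd output above. The function names are made up for this illustration, and treating each entry as a signed 32-bit value is my reading of how the negative third entry should be handled.

```python
# Base address and entries copied from the `dd nt!KiServiceTable l4` output.
KI_SERVICE_TABLE_BASE = 0xFFFFF800014C7B00
TABLE = [0x04106900, 0x02F6F000, 0xFFF72D00, 0x031A0105]

def decode_entry(entry):
    """Split a 32-bit table entry into (signed offset, low 4 bits)."""
    # Reinterpret the raw value as a signed 32-bit integer first,
    # so that the right shift keeps the sign of negative offsets.
    signed = entry - (1 << 32) if entry & 0x80000000 else entry
    return signed >> 4, entry & 0xF

def resolve_service(base, table, index):
    """Return (routine address, low 4 bits) for a system call number."""
    offset, low_bits = decode_entry(table[index])
    return (base + offset) & 0xFFFFFFFFFFFFFFFF, low_bits

# EAX = 3 in the ntdll!NtReadFile stub selects the fourth entry.
address, low_bits = resolve_service(KI_SERVICE_TABLE_BASE, TABLE, 3)
print(hex(address), low_bits)
```

For entry 031a0105 the low 4 bits come out as 5, which fits NtReadFile: it takes nine parameters, four of which travel in registers and five on the stack.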


Here we can see that the user mode ReadFile ultimately ends up in kernel mode nt!NtReadFile. OK, so what next? If nt!NtReadFile takes arguments, where do they come from? How does the kernel receive them from user mode? How does control go back to user mode once the system call completes its job? We will look at all these details in the next post.






