RealiZing VirtualiZation !!!

One Final attempt to get going with our Major Project through Blogging.....

Tuesday, November 6, 2007

HVMM in PseudoCode

                

Control Flow

BIOS -> BOOTLOADER -> HVMM


HVMM Pseudo-Code
{
INITIALIZE_SVM();
HYPERVISOR();
}

LAUNCH_VMM_UI() {
//display text based interface listing guest OS choices
//wait for user input
}

INITIALIZE_SVM() {
ENABLE_SVM();
SETUP_HYPERVISOR();
}

ENABLE_SVM() {
EFER.SVME = 1;
}

SETUP_HYPERVISOR() {
ALLOCATE_HYPERVISOR_CODE();
LOAD_HYPERVISOR_CODE();
ALLOCATE_HOST_STATE_AREA();
}

ALLOCATE_HYPERVISOR_CODE() {
//allocate a non-paged area in kernel memory
}

LOAD_HYPERVISOR_CODE() {
//copy the hypervisor code to memory
}

ALLOCATE_HOST_STATE_AREA() {
//allocate a non-pageable, physically contiguous, page-aligned region of memory for the host save area
//store the physical address of this area in the VM_HSAVE_PA MSR
}

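HYPERVISOR() {
    while (1) {
        if (vmm_switch) {
            LAUNCH_VMM_UI();
            vmcb = GET_SELECTED_VMCB();
            if (vmcb == NULL) {
                // the selected guest has no VMCB yet: create one
                vmcb = SETUP_VMCB();
                ADD_ACTIVE(vmcb);
            }
            LAUNCH_VM_UI();
        }
        else {
            vmcb = GET_NEXT_VMCB();    // from scheduler
        }
        rax = PHYSICAL_ADDRESS(vmcb);  // VMLOAD, VMRUN and VMSAVE take the VMCB physical address in rax
        VMLOAD(rax);
        running_vm = 1;
        while (running_vm) {
            VMRUN(rax);
            switch (exit_code) {
                // handle the intercepted event in each case
            }
            // leave the inner loop outside the switch, where a break would only exit the switch
            if (timer_expire || vmm_switch || power_off) running_vm = 0;
        }
        if (power_off) REMOVE_ACTIVE(vmcb);
        else VMSAVE(rax);
    }
}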

SETUP_VMCB() {
vmcb = ALLOCATE_VMCB();
// CLGI instruction is executed to disable global interrupts
// initialize the control area of the VMCB with a set of intercept conditions that will cause execution to transfer out of the guest and back to the hypervisor
// initialize the state area of the VMCB with the address where guest execution should begin
return vmcb;
}

ALLOCATE_VMCB() {
//allocate a region of physically contiguous, page-aligned, non-pageable memory
}
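
The exit handling inside HYPERVISOR() above can be sketched in plain C. This is only an illustration under our own assumptions, not the HVMM code: the guest struct, handle_exit(), run_guest() and the scripted_vmrun() stand-in are hypothetical names, VMRUN is replaced by a callback so the loop can be tried in user space, and treating every INTR exit as the scheduler timer is a simplification. The exit-code values are the ones we recall from the AMD64 manual's intercept-code table and should be checked against it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A few SVM exit codes (values as listed in the AMD64 manual). */
enum {
    VMEXIT_INTR     = 0x60,  /* physical interrupt; assumed here to be the timer */
    VMEXIT_HLT      = 0x78,
    VMEXIT_SHUTDOWN = 0x7F,
    VMEXIT_VMMCALL  = 0x81
};

/* Hypothetical per-guest bookkeeping kept by the VMM, not by hardware. */
struct guest {
    bool     deschedule;  /* give the CPU to another guest */
    bool     power_off;   /* remove the guest from the active list */
    unsigned hypercalls;  /* number of VMMCALLs seen so far */
};

/* Handles one exit; returns true if the guest may be resumed right away. */
bool handle_exit(struct guest *g, uint64_t exit_code)
{
    switch (exit_code) {
    case VMEXIT_VMMCALL:  g->hypercalls++;      break;
    case VMEXIT_INTR:     g->deschedule = true; break;
    case VMEXIT_HLT:      g->deschedule = true; break; /* idle guest: yield */
    case VMEXIT_SHUTDOWN: g->power_off = true;  break;
    default:              g->power_off = true;  break; /* unhandled exit: stop */
    }
    return !(g->deschedule || g->power_off);
}

/* Stand-in for VMRUN: returns the exit code of the next #VMEXIT. */
typedef uint64_t (*vmrun_fn)(void *ctx);

/* Re-enters the guest until it must stop; returns the number of entries. */
unsigned run_guest(struct guest *g, vmrun_fn vmrun, void *ctx)
{
    unsigned entries = 0;

    assert(g != NULL && vmrun != NULL);
    g->deschedule = false;
    do {
        entries++;
    } while (handle_exit(g, vmrun(ctx)));
    return entries;
}

/* Test double: replays a fixed list of exit codes instead of running a guest. */
struct script {
    const uint64_t *codes;
    unsigned        next;
};

uint64_t scripted_vmrun(void *ctx)
{
    struct script *s = ctx;
    return s->codes[s->next++];
}
```

Keeping the "may I resume?" decision as the return value of handle_exit() sidesteps the problem of a break inside the switch only leaving the switch and not the run loop.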



Monday, November 5, 2007

The AMD SVM Architecture - An Overview

The AMD SVM processor support provides a set of hardware extensions designed to enable economical and efficient implementation of virtual machine systems. The term host refers to the execution context of the VMM, and guest refers to the execution context of an OS running on top of the VMM. A world switch is the operation of switching between the host and a guest. The AMD virtual machine architecture is designed to provide:
  • Mechanisms for fast world switch between guest and host.
  • The ability to intercept selected instructions or events in the guest.
  • External (DMA) access protection for memory.
  • Assists for interrupt handling and virtual interrupt support.
  • A guest/host tagged TLB and nested paging to reduce virtualization overhead.

Instruction Set Additions

AMD SVM introduces several new instructions and modifies several existing instructions to facilitate the implementation of VMM systems on the x86 architecture, or more specifically the AMD64 architecture. The following are the virtualization-specific additions to the instruction set.

  1. VMRUN - Start execution of a guest

  2. VMLOAD - Load a subset of processor state from the VMCB

  3. VMSAVE - Save a subset of processor state to the VMCB

  4. VMMCALL - Allow guests to explicitly communicate with the VMM

  5. STGI - set the global interrupt flag

  6. CLGI - clear the global interrupt flag

  7. SKINIT - Secure init and control transfer with attestation

  8. INVLPGA - Invalidate TLB entries in a specified ASID

Guest Mode

This new processor mode is entered through the VMRUN instruction. When in guest mode, the behavior of some x86 instructions changes to facilitate virtualization.

Virtual Machine Control Block (VMCB)

There is a VMCB for each running guest OS. The VMCB is divided into two areas.

  1. Control Area: contains various control bits, including the intercept vector, whose settings determine which actions cause a #VMEXIT (transfer of control from the guest to the host). A rich set of intercepts allows the host to customize each guest's privileges. Information about the intercepted event is written to this area on #VMEXIT.

  2. State Area: the guest's CPU state is saved in this area.
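
To make the two areas concrete, here is a rough C sketch of a VMCB. It is a simplified layout under our own assumptions, not a complete definition: the offsets are the ones we recall from the VMCB layout appendix of the AMD64 manual and should be verified there, fields we do not use are folded into padding, the state area is left opaque, and vmcb_alloc() is a hypothetical helper. It uses C11 aligned_alloc as a user-space stand-in, so it only guarantees virtual page alignment, whereas a real VMM needs a page-aligned physical address. The packed attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define VMCB_SIZE 4096u

/* Bit positions in the 64-bit intercept vector that starts at offset 0x00C. */
#define INTERCEPT_INTR    (1ull << 0)
#define INTERCEPT_HLT     (1ull << 24)
#define INTERCEPT_VMRUN   (1ull << 32)
#define INTERCEPT_VMMCALL (1ull << 33)

struct vmcb_control {
    uint32_t intercept_cr;          /* 0x000: CR read/write intercepts */
    uint32_t intercept_dr;          /* 0x004: DR read/write intercepts */
    uint32_t intercept_exceptions;  /* 0x008 */
    uint64_t intercept;             /* 0x00C: instruction/event intercepts */
    uint8_t  reserved_1[44];
    uint64_t iopm_base_pa;          /* 0x040 */
    uint64_t msrpm_base_pa;         /* 0x048 */
    uint64_t tsc_offset;            /* 0x050 */
    uint32_t guest_asid;            /* 0x058 */
    uint8_t  tlb_control;           /* 0x05C */
    uint8_t  unused_1[19];          /* virtual interrupt fields, not modelled */
    uint64_t exit_code;             /* 0x070: reason for the last #VMEXIT */
    uint64_t exit_info_1;           /* 0x078 */
    uint64_t exit_info_2;           /* 0x080 */
    uint8_t  unused_2[0x400 - 0x088]; /* rest of the control area, not modelled */
} __attribute__((packed));

struct vmcb {
    struct vmcb_control control;    /* 0x000 .. 0x3FF */
    uint8_t save[VMCB_SIZE - sizeof(struct vmcb_control)]; /* state area, opaque here */
} __attribute__((packed));

_Static_assert(sizeof(struct vmcb_control) == 0x400, "control area is 1 KiB");
_Static_assert(sizeof(struct vmcb) == VMCB_SIZE, "VMCB is one 4 KiB page");

/* Hypothetical helper: allocates a zeroed, page-aligned VMCB with a minimal
 * set of intercepts. The VMRUN intercept must always be set, and ASID 0 is
 * reserved for the host. Returns NULL if the allocation fails. */
struct vmcb *vmcb_alloc(uint32_t asid)
{
    struct vmcb *v;

    assert(asid != 0);
    v = aligned_alloc(VMCB_SIZE, sizeof *v);
    if (v == NULL)
        return NULL;
    memset(v, 0, sizeof *v);
    v->control.guest_asid = asid;
    v->control.intercept  = INTERCEPT_VMRUN | INTERCEPT_VMMCALL |
                            INTERCEPT_HLT | INTERCEPT_INTR;
    return v;
}
```

A caller would do something like `struct vmcb *v = vmcb_alloc(1);`, fill in the state area with the guest's entry point, and release it with `free(v)` when the guest is powered off.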

VMRUN

  • Host state is saved to memory

  • Guest state loaded from VMCB

  • Guest runs

#VMEXIT

  • Guest state is saved back to VMCB

  • Host state loaded from memory

The host state save area is pointed to by the Model Specific Register (MSR) VM_HSAVE_PA, and the physical address of the VMCB is passed to VMRUN in register RAX.
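
The MSR setup that has to happen before the first VMRUN can be sketched as follows. This is a minimal sketch under stated assumptions: svm_enable() and the msr_ops struct are hypothetical, the MSR accessors are passed in as function pointers because RDMSR/WRMSR only work in ring 0, and the in-memory fake at the bottom exists purely so the logic can be exercised in user space. The MSR numbers and bit positions are the ones we know from the AMD64 manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_EFER        0xC0000080u
#define MSR_VM_CR       0xC0010114u
#define MSR_VM_HSAVE_PA 0xC0010117u

#define EFER_SVME       (1ull << 12)  /* EFER.SVME: enables the SVM instructions */
#define VM_CR_SVMDIS    (1ull << 4)   /* set when SVM has been disabled, e.g. by the BIOS */
#define CPUID_ECX_SVM   (1u << 2)     /* CPUID Fn8000_0001 ECX bit 2: SVM present */

/* Hypothetical MSR accessors; in the VMM these would wrap RDMSR and WRMSR. */
struct msr_ops {
    uint64_t (*read)(uint32_t msr);
    void     (*write)(uint32_t msr, uint64_t value);
};

/* Enables SVM and registers the host save area.
 * Returns false if SVM is absent or disabled, or hsave_pa is not 4 KiB aligned. */
bool svm_enable(const struct msr_ops *ops, uint32_t cpuid_8000_0001_ecx,
                uint64_t hsave_pa)
{
    assert(ops != NULL);
    if (!(cpuid_8000_0001_ecx & CPUID_ECX_SVM))
        return false;
    if (ops->read(MSR_VM_CR) & VM_CR_SVMDIS)
        return false;
    if (hsave_pa & 0xFFF)
        return false;
    ops->write(MSR_EFER, ops->read(MSR_EFER) | EFER_SVME);
    ops->write(MSR_VM_HSAVE_PA, hsave_pa);
    return true;
}

/* In-memory fake MSRs, only for trying the logic outside ring 0. */
static uint64_t fake_efer, fake_vm_cr, fake_hsave;

static uint64_t fake_read(uint32_t msr)
{
    switch (msr) {
    case MSR_EFER:        return fake_efer;
    case MSR_VM_CR:       return fake_vm_cr;
    case MSR_VM_HSAVE_PA: return fake_hsave;
    default:              return 0;
    }
}

static void fake_write(uint32_t msr, uint64_t value)
{
    switch (msr) {
    case MSR_EFER:        fake_efer = value;  break;
    case MSR_VM_CR:       fake_vm_cr = value; break;
    case MSR_VM_HSAVE_PA: fake_hsave = value; break;
    default:              break;
    }
}

const struct msr_ops fake_msr_ops = { fake_read, fake_write };
```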

Nested Paging

The SVM Nested Paging facility provides for two levels of address translation in hardware, thus eliminating the need for the VMM to maintain the so-called shadow page tables in software.

With nested paging enabled, the processor applies two levels of address translation. A guest page table (gPT) maps guest virtual addresses to guest physical addresses located in guest physical space. Each guest also has a host page table (hPT) that maps host virtual addresses to host physical addresses located in host physical space. Both host and guest levels have their own copy of the CR3 register, referred to as hCR3 and gCR3, respectively.

After translating a guest virtual address using the guest page tables, the resulting (guest physical) address is treated as a host virtual address and is further translated, using the host page tables, into a host physical address. The resulting translation from guest virtual to host physical address is cached in the TLB and used on subsequent guest accesses.

Nested paging is enabled by the VMRUN instruction if the NP_ENA bit in the VMCB is set to 1; nested paging is disabled by #VMEXIT.
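
The two-step translation can be illustrated with a toy model in C. This is a sketch only: it assumes 4 KiB pages and flat, single-level tables (real x86 page tables are multi-level, and the hardware also translates the guest's own page-table accesses through the host tables, which is left out here). The names page_table, translate() and nested_translate() are ours.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1ull << PAGE_SHIFT) - 1)
#define NOT_MAPPED UINT32_MAX

/* Toy single-level page table: frames[i] is the frame number backing page i. */
struct page_table {
    const uint32_t *frames;
    size_t          entries;
};

/* One level of translation; returns false on a (toy) page fault. */
static bool translate(const struct page_table *pt, uint64_t in, uint64_t *out)
{
    uint64_t page = in >> PAGE_SHIFT;

    if (page >= pt->entries || pt->frames[page] == NOT_MAPPED)
        return false;
    *out = ((uint64_t)pt->frames[page] << PAGE_SHIFT) | (in & PAGE_MASK);
    return true;
}

/* Guest virtual -> guest physical via gPT, then the result is treated as a
 * host virtual address and translated via hPT into a host physical address. */
bool nested_translate(const struct page_table *gpt, const struct page_table *hpt,
                      uint64_t gva, uint64_t *hpa)
{
    uint64_t gpa;

    assert(gpt != NULL && hpt != NULL && hpa != NULL);
    return translate(gpt, gva, &gpa) && translate(hpt, gpa, hpa);
}
```

For example, with a gPT that maps guest page 0 to guest frame 2 and an hPT that maps page 2 to host frame 9, the guest virtual address 0xabc ends up at host physical address 0x9abc. A failure in either step corresponds to a fault at that level.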

Thus there are 3 different registers – hCR3, gCR3 and CR3. The value of hCR3 can be different from the CR3 in effect while the VMM is running; this gives the VMM maximum flexibility on how to further remap guests’ physical address spaces, and where to optionally map guest physical pages in the VMM’s address space. The optional host paging mechanism allows a VMM to page out guest pages and to use copy-on-write techniques (i.e. sharing of redundant physical pages) between guests. We are not planning to implement host paging in our HVMM at this point.

Tagged TLB

In the SVM usage model, the VMM is mapped in a different address space than the guests, each of which in turn has its own address space. To reduce the cost of world switches, the TLB is tagged with an address space identifier (ASID) distinguishing host-space entries and different guest-space entries from each other. The ASID tag identifies the address space (the host or a particular guest) to which each TLB entry belongs. This allows more efficient switching between virtual machines, because the TLB no longer has to be flushed each time a different virtual machine is scheduled.
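
A toy software model of an ASID-tagged TLB shows why no flush is needed on a world switch. This is purely illustrative and makes no claim about how the hardware is organised: the 8-entry size, the round-robin replacement and all names (tlb, tlb_insert, tlb_lookup, tlb_flush_asid) are our own assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 8

struct tlb_entry {
    bool     valid;
    uint32_t asid;  /* address space the entry belongs to (0 = host) */
    uint64_t vpn;   /* virtual page number */
    uint64_t pfn;   /* physical frame number */
};

struct tlb {
    struct tlb_entry e[TLB_ENTRIES];
    unsigned         next;  /* round-robin replacement cursor */
};

/* Caches a translation for the given address space. */
void tlb_insert(struct tlb *t, uint32_t asid, uint64_t vpn, uint64_t pfn)
{
    assert(t != NULL);
    t->e[t->next] = (struct tlb_entry){ true, asid, vpn, pfn };
    t->next = (t->next + 1) % TLB_ENTRIES;
}

/* A hit requires both the page number and the current ASID to match, so
 * entries of other guests can stay cached without being visible. */
bool tlb_lookup(const struct tlb *t, uint32_t asid, uint64_t vpn, uint64_t *pfn)
{
    assert(t != NULL && pfn != NULL);
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].asid == asid && t->e[i].vpn == vpn) {
            *pfn = t->e[i].pfn;
            return true;
        }
    }
    return false;
}

/* Invalidates only the entries of one address space. */
void tlb_flush_asid(struct tlb *t, uint32_t asid)
{
    assert(t != NULL);
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (t->e[i].asid == asid)
            t->e[i].valid = false;
    }
}
```

Two guests can cache a translation for the same virtual page at the same time, and switching between them only changes the ASID used for lookups.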


Project Virtualization - A Brief !DEA

The project deals with the design and implementation of a prototype Hardware Virtual Machine Monitor (HVMM) from scratch.

A Virtual Machine Monitor (VMM), also known as a hypervisor, consists of software that controls the execution of multiple guest operating systems on a single physical machine. It is a thin software layer that exports a virtual machine abstraction. The abstraction looks enough like the hardware that any software written for that hardware will run in the virtual machine. The VMM provides each guest the appearance of full control over a complete computer system (memory, CPU, and all peripheral devices). Fundamentally, a VMM works by intercepting sensitive operations in the guest (such as changing the page tables, which could give a guest access to memory it is not allowed to access) and emulating them in a safe manner.

There are numerous techniques for implementing virtualization. Until the recent arrival of Intel and AMD hardware support, x86, the world's most popular architecture, was very hostile towards virtualization. Complicated software workarounds were developed to virtualize x86 machines, but not without the associated overhead. Hardware virtualization, though immature at present, represents the future of virtualization. This is the inspiration behind the HVMM design.

The HVMM is designed to run on machines with AMD-V hardware virtualization support, codenamed “Pacifica”. The AMD Secure Virtual Machine (SVM) architecture provides hardware assists to improve performance and facilitate implementation of virtualization.