Integrated Resource Management for Virtualized Embedded Systems

Kishore Ramachandran

Georgia Tech PI:  Kishore Ramachandran
Students: Dushmanta Mohapatra, Hyojun Kim
Samsung Technical POC: Dr. Sang-bum Suh, Junghyun Yoo

1. Project Goals

The primary goal of this project is dynamic memory resource management in a virtualized embedded system: adapting Transcendent Memory (T-Mem) to Xen-ARM so that guest operating systems on resource-constrained mobile devices can share physical memory efficiently under dynamic memory pressure.

2. Project Details

Background


System virtualization has been very successful for high-end server-class machines because it provides benefits such as consolidation, increased utilization, rapid provisioning, tolerance of software failures through rapid bootstrapping or rebooting, and tolerance of hardware failures through migration of a virtual machine to different hardware.
Until now, virtualization techniques have hardly been used on mobile embedded devices, even though they offer many potential benefits there. For instance, system virtualization would allow applications developed on diverse platforms to run on a single generic platform, increasing the extensibility of mobile devices without compromising security for trusted applications. It would also allow seamless user mobility across platforms via migration functionality. The increasing processing power and storage capacity of mobile devices, coupled with the trend of mobile OS vendors opening up their systems, points to a healthy virtualization-based ecosystem for mobile devices.


System Architecture


To use virtualization technology for mobile embedded devices, we need to minimize the various virtualization overheads. Memory resource management in particular is a critical problem for resource-constrained embedded systems, and it shaped the goals of this project.

Memory has always been a scarce resource in virtualized environments. The need for memory and its efficient utilization are much more pronounced in the case of embedded devices (as compared to server environments). As such, the primary goal of the project has been to achieve dynamic memory resource allocation in a virtualized environment targeted for embedded devices. 

In the normal mode of operation, the OS kernel is entrusted with efficiently multiplexing the available physical memory among competing processes. Managing a fixed amount of memory (RAM) optimally is a long-solved problem in the Linux kernel; managing RAM optimally in a virtualized environment, however, remains challenging, as discussed below.

Previous research has produced mechanisms such as ballooning and memory hot-plugging, which allow memory to be moved from a guest domain with under-utilized memory to a domain that needs more. When more than two domains are running, however, choosing the domain from which to take memory becomes a problem, because of the inherent difficulty of estimating the actual memory needs of each domain at the hypervisor level. The Collaborative Memory Management (CMM) project at IBM attempted to solve this by transferring memory-usage information from each guest to the hypervisor, but the resulting architecture was deemed too complex for mainstream usage.

Transcendent Memory (T-Mem) is another effort directed at this problem of efficiently allocating memory among a group of competing domains. In the original T-Mem approach, under-utilized RAM from each domain, along with RAM unassigned to any domain, is collected into a central pool, and the VMM provides indirect access to this pool through a page-copy based interface. The modifications required to the Linux kernel to make it compatible with the T-Mem architecture are relatively small, and the performance benefit achieved (reduced I/O cost and lower latency for memory transfers) more than compensates for the development effort involved.

T-Mem aims to provide a mechanism by which memory can be treated as a renewable resource. As explained above, this is achieved by creating a central memory pool and allowing domains to use it in a controlled manner. If space is available in the central pool and a domain needs more memory for its workload, it can ask the hypervisor to put some of its pages into this global pool; this saves the guest from costly disk/swap I/O because the pages remain in memory. When another domain needs memory, some of the pool pages holding other domains' content can be freed. From the perspective of an operating system, T-Mem is a fast pseudo-RAM of indeterminate and varying size that is useful primarily when real RAM is in short supply, and it is accessible only via a somewhat quirky copy-based interface.
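
To make the copy-based interface concrete, the sketch below shows roughly what the guest-side operations look like, loosely following the operation names in the published tmem specification (TMEM_NEW_POOL, TMEM_PUT_PAGE, TMEM_GET_PAGE). The structure layout, field names, and the tmem_hypercall() wrapper are simplified placeholders for illustration, not the exact Xen ABI.

    /* Simplified sketch of the guest-side T-Mem interface (not the exact
     * Xen ABI).  Every access is a whole-page copy through a hypercall;
     * the guest never maps pool pages directly. */
    #include <stdint.h>

    #define TMEM_NEW_POOL 1   /* op codes loosely follow the tmem spec */
    #define TMEM_PUT_PAGE 4
    #define TMEM_GET_PAGE 5

    struct tmem_op {
        uint32_t cmd;        /* one of the op codes above */
        int32_t  pool_id;    /* handle returned by TMEM_NEW_POOL */
        uint64_t object_id;  /* e.g., identifies a swap area or file */
        uint32_t index;      /* page offset within the object */
        uint64_t gmfn;       /* guest page frame to copy from or to */
    };

    long tmem_hypercall(struct tmem_op *op);  /* placeholder for the real hypercall */

    /* Ask the hypervisor for a new pool; returns a pool id or a negative error. */
    int tmem_new_pool(void)
    {
        struct tmem_op op = { .cmd = TMEM_NEW_POOL };
        return (int)tmem_hypercall(&op);
    }

    /* Copy one guest page into the central pool.  The hypervisor may refuse
     * (e.g., the pool is exhausted), in which case the caller must fall back
     * to ordinary swap/disk I/O. */
    int tmem_put_page(int pool, uint64_t obj, uint32_t idx, uint64_t gmfn)
    {
        struct tmem_op op = { .cmd = TMEM_PUT_PAGE, .pool_id = pool,
                              .object_id = obj, .index = idx, .gmfn = gmfn };
        return (int)tmem_hypercall(&op);
    }

    /* Copy a previously put page back into guest RAM. */
    int tmem_get_page(int pool, uint64_t obj, uint32_t idx, uint64_t gmfn)
    {
        struct tmem_op op = { .cmd = TMEM_GET_PAGE, .pool_id = pool,
                              .object_id = obj, .index = idx, .gmfn = gmfn };
        return (int)tmem_hypercall(&op);
    }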

Figure 1 shows the T-MEM usage model.

  Figure 1. T-MEM Usage Model


Each domain has direct access to the RAM allocated to it. In addition, each domain has page-copy based indirect access to the global pool of memory assigned to T-Mem. Domains (virtual machines) can create pools within this T-Mem space and use them to store memory pages.
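
As a usage illustration, a guest kernel can treat a put as best-effort, in the spirit of the Linux frontswap frontend that later became a standard consumer of tmem: on swap-out, try the in-memory pool first and pay for device I/O only if the hypervisor refuses. In the sketch below, swap_writepage_to_dev() is a hypothetical stand-in for the normal swap path, and tmem_put_page() is the placeholder wrapper from the earlier sketch.

    #include <stdint.h>

    /* From the earlier sketch: */
    int tmem_put_page(int pool, uint64_t obj, uint32_t idx, uint64_t gmfn);
    /* Hypothetical stand-in for the normal swap-device write path: */
    void swap_writepage_to_dev(uint32_t slot, uint64_t gmfn);

    /* Best-effort swap-out: prefer the central pool, fall back to the device. */
    void swap_out_page(int pool, uint64_t swap_area, uint32_t slot, uint64_t gmfn)
    {
        if (tmem_put_page(pool, swap_area, slot, gmfn) == 0)
            return;                           /* page now lives in the pool */
        swap_writepage_to_dev(slot, gmfn);    /* pool refused: pay disk I/O */
    }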

T-MEM in Xen-ARM


In this project, we adapted T-Mem to work on Xen-ARM, using nVidia's Tegra board as the platform. The currently available Xen-ARM version is 3.0.1, whereas the oldest T-Mem release targets Xen 3.3.1; to enable T-Mem on Xen-ARM, we therefore had to resolve issues related to the Xen version gap, the 32-bit architecture, and the ARM architecture. One architecture-specific aspect of the porting task is sketched below; complete details appear in the final project report.
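
For example, the way a guest traps into the hypervisor differs between x86 Xen and Xen-ARM: on ARM cores without virtualization extensions, the hypervisor is typically entered through a software interrupt rather than x86's int 0x82. The stub below is a hypothetical sketch only; the register convention, SWI immediate, and hypercall number are assumptions, not the actual Xen-ARM 3.0.1 ABI.

    struct tmem_op;                      /* declared in the earlier sketch */

    #define __HYPERVISOR_tmem_op 38     /* hypercall number assumed here */

    /* Hypothetical Xen-ARM hypercall stub: argument in r0, call number in
     * r12, software interrupt to enter the hypervisor.  All three choices
     * are assumptions made for this sketch. */
    static inline long xen_arm_tmem_op(struct tmem_op *op)
    {
        register long r0  asm("r0")  = (long)op;
        register long r12 asm("r12") = __HYPERVISOR_tmem_op;
        asm volatile("swi 0" : "+r"(r0) : "r"(r12) : "memory");
        return r0;                       /* hypervisor return code */
    }

A related 32-bit concern is structure layout: the ARM EABI aligns 64-bit fields to 8-byte boundaries, so the guest and the hypervisor must agree on the exact layout of any operation structure that carries 64-bit identifiers across the interface.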

Snapshot of Performance Results

We have conducted micro-measurements of the T-Mem operations, such as put-page and get-page. The table below compares their throughput with that of several storage devices. In mobile devices, a MicroSDHC card is often used as the swap device, and as the results show, T-Mem is 15,462 times faster than the measured MicroSDHC card.

    Storage Device                 Possible Swap Ops/second   Comparison with T-Mem
    T-Mem                          47,846.9                   -
    2.5-inch 5400 RPM HDD, 80 GB   63.8                       750x slower
    Samsung SLC SSD, 64 GB         146.6                      326x slower
    Kingston MLC SSD, 64 GB        52.4                       914x slower
    USB memory stick, 2 GB         3.7                        13,015x slower
    Kingston MicroSDHC, 16 GB      3.1                        15,462x slower
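
For context on how such numbers can be obtained, the sketch below times a batch of put operations and reports throughput in operations per second. tmem_put_page() is the placeholder wrapper from the earlier sketch, and page_to_gmfn() is a hypothetical helper mapping a test-page index to a guest frame number; a device measurement would instead time single-page writes to the swap device.

    #include <stdint.h>
    #include <time.h>

    int tmem_put_page(int pool, uint64_t obj, uint32_t idx, uint64_t gmfn);  /* earlier sketch */
    uint64_t page_to_gmfn(uint32_t i);  /* hypothetical: frame of the i-th test page */

    #define N_OPS 100000

    /* Time N_OPS put-page operations and report throughput in ops/second. */
    double measure_put_throughput(int pool)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (uint32_t i = 0; i < N_OPS; i++)
            tmem_put_page(pool, /*obj=*/0, i, page_to_gmfn(i));
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        return N_OPS / secs;
    }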

We have also developed a demonstration application to showcase how T-Mem behaves as memory pressure builds up in different domains. The application uses two client programs running on the Tegra board, which make dynamic memory requests as dictated by a server program running on a network-connected Linux box; a sketch of such a client appears after the figure. The following snapshot shows the results for a scenario in which the workload is generated statically on the server side and client-side memory allocation happens incrementally.

Figure 2. Memory Allocation Graphs
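
For reference, the client side of such a demo can be as simple as the following sketch: it connects to the workload server, receives allocation requests, and touches every page of each allocation so that real frames are consumed and memory pressure actually builds. The address, port, and one-uint32-per-request wire format are illustrative assumptions, not the actual demo protocol.

    /* Hypothetical demo client: obey allocation requests from the server and
     * keep the memory resident so pressure builds in this domain.  Address,
     * port, and wire format (one uint32 of MB per request, network byte
     * order) are assumptions for illustration. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in srv = { .sin_family = AF_INET,
                                   .sin_port   = htons(5555) };   /* assumed */
        inet_pton(AF_INET, "192.168.0.1", &srv.sin_addr);         /* assumed */
        if (s < 0 || connect(s, (struct sockaddr *)&srv, sizeof srv) < 0)
            return 1;

        uint32_t mb;
        while (read(s, &mb, sizeof mb) == sizeof mb) {      /* next request */
            size_t bytes = (size_t)ntohl(mb) << 20;
            char *buf = malloc(bytes);
            if (!buf)
                continue;                      /* allocation failed: skip */
            for (size_t off = 0; off < bytes; off += 4096)
                buf[off] = 1;      /* touch each page so frames are consumed */
            /* buffers are intentionally kept so the pressure stays resident */
        }
        close(s);
        return 0;
    }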

The goal of this project was to implement dynamic memory resource management in a virtualized embedded system. Using nVidia's Tegra board and the Xen-ARM distribution from Samsung, we have successfully implemented transcendent memory for use by guest operating systems executing on top of Xen-ARM to deal with dynamic memory pressure. We have conducted preliminary performance measurements of our implementation to validate the utility of transcendent memory as an intermediate level in the memory hierarchy between RAM and stable storage. Complete details can be found in the final project report delivered to Samsung.

3. Research Artifacts