TG Daily Special: Virtualization Explored
Virtualization stands at the threshold of changing the way we think about computing. We are headed into uncharted waters where computers are opening up a whole new world of opportunities, enabled by faster CPUs that integrate more and more cores. Join us for a three-part article in which we look at what virtualization is, how it applies and where it's headed.
You may have heard of virtualization before. Intel has been talking about it for more than two years, and AMD is following its rival. But even if you are deeply involved in enterprise architecture, an environment that is expected to be leading the virtualization world, there is a good chance that you are only at what could be the beginning of a long journey. A journey that has the potential to change how we think about computing today.
Even though virtualization technology today is directed solely at business and enterprise users, there are implications for home users as well. It will offer an opportunity to run an application-focused system rather than an operating-system-focused computer. Backups of entire computers will become much easier, while the security of systems and their content has the potential to increase dramatically.
In this installment we will explain what it will take to move into this new world of computing and what benefits can be expected. You'll get a glimpse of what's coming and why we're no longer trapped inside the physical machine. You'll discover why we have the power and potential today to wield multiple virtual machines simultaneously. You may also discover that it is a technology all of us will want to begin experimenting with.
Following this article, we will dive deeper into the topic, business and personal uses alike. We will take a virtual tour through server farms, ISPs and businesses who are turning to virtualization as a key component of their server infrastructure. And finally, we'll wrap up the series with a look at where things are headed. It's a future showing us that this thing called virtualization, in our opinion, has the potential to change the computing world.
The first thing we need to look at when understanding virtualization is: What is it? In abstract terms, virtualization can be defined as the process of taking something physical and abstracting it to something logical. This article specifically applies to the virtualization of the operating system (OS) on x86 hardware.
A virtualized environment is one that runs multiple guest operating systems on a host operating system. This means a machine can physically boot into a small version of Linux, for example. And from within that single copy of Linux running on the one machine, it can launch either windowed or full-screen versions of several other operating systems (limited only by processors, disk space and memory). A single Linux box can, therefore, boot Windows 95, 98, ME, 2000, XP and Vista (in both 32-bit and 64-bit modes), as well as Solaris, other versions of Linux and Unix, even Mac OS X (though it's illegal to do so), DOS, and more, all at the same time. Provided enough resources exist on the machine, little or no slowdown will be noticeable for most users on most applications across the board.
How does virtualization work?
Virtualization is offered in two forms today. There is the complete software solution which works on all x86 machines. And then there's the biggie. It's the software/hardware combination solution which requires modern CPUs with virtualization extensions to operate.
The software solution will work on all VIA, AMD, Intel and even Transmeta CPUs. Right now it can be somewhat slower, depending on the workload being run. It does allow multiple OSes to run without error at the same time. A software layer handles interpretation of all requests from the OSes and the applications within them. This interpretation is a somewhat heavy process (lots of compute time) but it works without error. Companies like VMware perfected this technology for x86 several years ago, before special hardware existed for x86 systems.
The other solution requires an AMD or Intel CPU with hardware-assisted virtualization. This is referred to generally as Virtualization Technology, or VT. AMD calls their solution AMD-V while Intel calls theirs IVT (and specifically VT-x for x86, as Intel also has virtualization technology in their Itanium, called VT-i). Hardware-assisted virtualization still requires a software layer, but that software layer does not need to be as complex or do as much as the strictly software model does. It also can exist entirely outside of the guest operating system. It has the potential to be much faster, because the hardware itself carries out much of the heavy lifting that must otherwise be done in software. Using today's hardware assistance, the software and hardware models are comparable in speed. There are advantages to each depending on the type of workload. However, the future is definitely moving toward full-on hardware integration at all of the key slowdown points seen in software today. As virtualization moves forward in revisions, the hardware side will only get faster and faster.
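On a Linux host, one quick way to see whether a CPU advertises these extensions is to look at the flags line in /proc/cpuinfo, where Intel VT-x shows up as vmx and AMD-V as svm. The short Python sketch below runs that check on text in that format. The function name is made up for illustration, and the approach is Linux-specific. A flag being present only means the CPU supports the feature, not that the BIOS has enabled it.

```python
def detect_hw_virtualization(cpuinfo_text):
    """Return 'VT-x', 'AMD-V', or None for /proc/cpuinfo-style text.

    Hypothetical helper for illustration: it only inspects the CPU flag
    list (vmx = Intel VT-x, svm = AMD-V) and says nothing about whether
    the firmware has actually enabled the feature.
    """
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line looks like: "flags : fpu vme de pse ... vmx ..."
            _, _, value = line.partition(":")
            flags = value.split()
            if "vmx" in flags:
                return "VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None
```

To try it on a real Linux machine, read the file and pass its contents in: `detect_hw_virtualization(open("/proc/cpuinfo").read())`.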
The required software layer is called a hypervisor. The hypervisor is set up to maintain the state of the virtual machine for all guests, and it serves as a go-between for the hardware and software. This component is the virtualization engine, be it done primarily in software or via the software/hardware combination.
Basically, the virtualization engine sits atop the OS (or below if you look at it in this illustration). It's closer to the hardware and has physical access to everything hardware related. It intercepts all of the things the OS or application would generally do in hardware, but without hardware actually being updated. The virtualization engine reads the intent of the thing being requested by the OS or software, and then sets up an internal state which either emulates that ability or passes it through to hardware. To the OS or software, it's completely transparent. They both "think" they have physically updated hardware and are proceeding accordingly. And that's just how it's supposed to work: transparently.
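The intercept-then-emulate-or-pass-through idea can be sketched in a few lines of Python. This is a toy model only, not how VMware or any real hypervisor is built: the class names, the notion of numbered "ports" and the in-memory dictionary standing in for physical hardware are all assumptions made for illustration.

```python
class DeviceState:
    """A register file; used both for per-guest emulated state and,
    in this sketch, as a stand-in for the real hardware."""

    def __init__(self):
        self.registers = {}

    def write(self, port, value):
        self.registers[port] = value

    def read(self, port):
        return self.registers.get(port, 0)


class ToyHypervisor:
    """Intercepts guest hardware accesses and emulates or passes them through."""

    def __init__(self, passthrough_ports=()):
        self.passthrough_ports = set(passthrough_ports)
        self.guest_state = {}          # guest name -> private emulated state
        self.physical = DeviceState()  # pretend hardware (assumption)

    def add_guest(self, name):
        self.guest_state[name] = DeviceState()

    def port_write(self, guest, port, value):
        if port in self.passthrough_ports:
            # Pass the request through to the (pretend) hardware.
            self.physical.write(port, value)
            return "passthrough"
        # Otherwise only the guest's private, emulated state is updated.
        self.guest_state[guest].write(port, value)
        return "emulated"

    def port_read(self, guest, port):
        if port in self.passthrough_ports:
            return self.physical.read(port)
        return self.guest_state[guest].read(port)
```

Two guests writing to the same emulated port each read back their own value, which is the "transparent" part: each one believes it owns the hardware, while the physical state is untouched unless the port was explicitly passed through.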
Some implementations do allow immediate pass-through utilization of hardware resources for greater performance. These are typically used for 3D graphics or specialized pieces of equipment which, if emulated, would defeat the purpose of having the hardware there in the first place. Such implementations are specific and, because they are allowing physical use of the hardware, are much more limited for sharing. You cannot, for example, have a virtualization engine running two versions of Windows with both of them physically operating the high-end graphics card at the same time. I'm told by AMD and others that such technology is coming, but it's still several years away.
What is it that's brought the idea of virtualization into reality? It's very simple: hardware advancements. We've seen increases in computing power, multiple cores, memory, disk space and networking speed, coupled with the ongoing increase of the average user's knowledge base and use requirements for their machine. All of these factors have come together to make this potential a viable reality.
If we look at machine resources today the numbers are just astounding. In 20 years we've gone from a standard of 1 MB of memory or less to 2 GB, with 4 GB not even unusual. Hard drive capacities have exploded. Our first hard drives about 20 years ago were single-digit megabyte disks and they cost thousands of dollars. Today's $100 hard drives hold 50,000 times more. Our processors have also moved on from 4.77 MHz 16-bit CPUs that needed 3-40 clocks per operation. We're now sitting atop super-scalar, super-pipelined, out-of-order, multiple-core, SIMD, 64-bit processors at 3.0 GHz and higher (about 630 times faster in raw clock rate at 3.0 GHz, closer to 6,000 times faster in raw performance). And of course, there is hardware-assisted virtualization, which allows even several virtual machines running on a multi-core processor to all appear as if they're running at nearly native speeds.
The bottom line is: It is this culmination of hardware advancements that makes it all possible. Without the memory, it would be difficult. Without large hard drives, it would be limited. Without the multiple cores, it would be slow. And without hardware-assisted virtualization it would be more complex and more prone to errors and failure.
So the next question becomes: Why? What do we gain from this ability? Why would we want to run multiple operating systems on a single machine? These are very likely the last questions keeping the average computer user from embracing virtualization.
Virtualization affords abilities that do not exist today. And not all of these relate strictly to software. A virtualized computer is basically a hard disk file. If a logical machine is created with a 16 GB hard drive (to install the OS and software within) then to the host machine it's just a 16 GB data file sitting on the hard drive. Internally, it maintains a file structure and form which, when loaded into the virtualization engine (like VMware), represents that machine's state and everything about it.
The true power, however, comes from the fact that since the machine now exists in soft form, it could be backed up completely, with everything intact within, just by doing a single copy operation. Or it could just be copied for the purpose of moving it to another machine where it will now run. Suppose you need to launch a Windows 2000 machine for software testing. Or suppose you would like to take the very Windows XP instance (with all your custom settings) from your desktop on the road with you. Just copy the file down to your laptop and go. The entire machine state is there. All disk files, settings, user preferences, login scripts, everything. Whatever you would ordinarily do inside of the OS on your desktop, it's all there now on your laptop.
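Here is what that "single copy operation" might look like as a small Python sketch. It assumes the simple case described above, where the whole machine lives in one image file and the virtual machine is shut down while it is copied. The function name and the file names are hypothetical, and some products keep settings in additional files next to the disk image, which would need to be copied as well.

```python
import shutil
from pathlib import Path


def back_up_vm(disk_image, backup_dir):
    """Copy a powered-off virtual machine's disk image into backup_dir.

    Hypothetical helper: assumes the whole machine is one image file.
    Returns the path of the new copy.
    """
    src = Path(disk_image)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copies file contents plus timestamps
    return dest
```

Moving the machine to a laptop is the same operation with a different destination, for example `back_up_vm("winxp.img", "/mnt/laptop/vms")`, where both paths are placeholders.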
Also, consider security. Suppose a nasty virus attacks. In a virtualized machine, it won't compromise anything outside of the virtualized environment. And, depending on how isolated that particular machine is (whether it's sharing hard drive resources, for example), that could mean nothing more than going back to yesterday's backup and just starting over (just remember not to open that email attachment this time).
Virtualization affords many software advantages. Developers often need to test new software on several OS versions. Users might like to try software out before they risk their machine by physically installing it. The ability to set up virtual "test machines," even to the point of setting up a master test machine which can be copied and rolled out as necessary, offers tremendous advantages.
There are also practical reasons for running minimalistic virtual machines. Some websites, for example, do not support browsers other than IE. Simple Windows-based virtual machines could be set up that allow various browser versions to run. Also, some software exists only for Windows. Or only for DOS, etc. In those cases, you could set up a full version of the OS that can be wielded as necessary. It can even be zipped up and archived when not in use.
Limitations in hardware virtualization
As advanced as hardware virtualization is today, more is needed. AMD and Intel are both working on I/O virtualization techniques which will greatly speed up throughput, specifically on heavy I/O apps like servers. These will allow virtual machines to communicate I/O directly to hardware, but in a controlled, shared and cooperative manner - something that would not work today without some software intervention. Future models will not require that direct intervention by the hypervisor layer once everything is set up. The hypervisor still has control and can intervene at any time. However, until the direct I/O need from another virtual machine is set up, the hypervisor grants a type of "pass-through" lane for the application.
I/O virtualization is a huge technology move in terms of design and implementation. It's the primary reason we don't have it yet. But for now, think of it like a train track routing system. The hypervisor sets up the switches so that all I/O (trains) coming out of a particular virtual machine go to their correct destination. If a switch is set up one way, the I/O goes straight through. If it's set up a different way, the I/O traps to the hypervisor, where the track switches to allow direct pass-through for that one. The actual implementation is far more complex, but that's the gist.
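The train track picture can be turned into a toy Python model. Nothing here reflects how AMD or Intel actually implement I/O virtualization; the class, its method names and the idea of keying routes by a (virtual machine, device) pair are assumptions chosen purely to mirror the analogy.

```python
class ToyIORouter:
    """Toy model of the hypervisor's 'track switches' for guest I/O."""

    def __init__(self):
        self.direct_routes = set()  # (vm, device) pairs with a pass-through lane
        self.log = []               # record of how each request was handled

    def handle_io(self, vm, device, payload):
        key = (vm, device)
        if key in self.direct_routes:
            # Switch already set: the train goes straight through.
            self.log.append(("direct", vm, device, payload))
            return "direct"
        # No lane yet: trap to the hypervisor, which sets the switch
        # so later I/O from this VM to this device passes straight through.
        self.log.append(("trapped", vm, device, payload))
        self.direct_routes.add(key)
        return "trapped"

    def revoke(self, vm, device):
        """The hypervisor keeps control and can take a lane away at any time."""
        self.direct_routes.discard((vm, device))
```

The first request from a virtual machine to a device is trapped, later ones go direct, and a revoked lane falls back to trapping again.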
There are also some software advantages which are lost in hardware virtualization. For example, there is a technology called paravirtualization. It allows the OS itself to be recompiled with knowledge of the fact that it's running in a virtualized state. Such an OS can be optimized in many ways which provide additional speedups. Several Linux-based virtualization solutions offer paravirtualization extensions or options. However, in order to run canned operating systems like Windows XP (where the OS cannot be recompiled), the hardware solution is still preferred because it's faster.
Another popular model for virtualization is complete emulation. In these models we see a software program running just like any other application. This program's goal is to read a virtual hard-drive and process machine instructions for the target CPU. It presents itself as a whole machine. Some of these machines are very capable, but without special hardware intervening all of them are very slow. A commercial application called Simics by Virtutech operates in this way for a wide array of hardware.
These emulators allow a whole machine to run within a machine. It doesn't have to be the same processor either. An x86 processor could emulate an ARM environment, for example. Many of the older video games written for the original PC, Nintendo, Atari, Super Nintendo, Sega, Commodore 64, etc., all operate this way. Their older processors are emulated on the x86, yet they provide a full virtual machine which runs at comparable speed in a window thanks to the high speed of modern x86 processors.
This is one of the main reasons emulation still exists. If you have an x86-based machine and need to run an ARM application, it can be done. The primary use for these machines is development and testing, but also for largely proprietary runtime installations which need to be operated in the field on modern laptops, for example.
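At the heart of any such emulator is a loop that fetches an instruction, decodes it and updates a purely software copy of the CPU state. The Python sketch below shows that loop for a made-up four-instruction machine. The instruction set (LOAD, ADD, JNZ, HALT) and the register names are invented for illustration and do not correspond to ARM, x86 or any product mentioned here.

```python
def run(program, max_steps=10_000):
    """Interpret a toy program and return the final virtual CPU state.

    program is a list of (opcode, argument) pairs for an invented machine:
      LOAD n  set the accumulator to n
      ADD n   add n to the accumulator
      JNZ t   jump to instruction index t if the accumulator is not zero
      HALT    stop execution
    """
    cpu = {"pc": 0, "acc": 0, "halted": False}  # the whole virtual CPU state
    steps = 0
    while not cpu["halted"] and cpu["pc"] < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit exceeded")
        op, arg = program[cpu["pc"]]  # fetch
        cpu["pc"] += 1
        if op == "LOAD":              # decode and execute
            cpu["acc"] = arg
        elif op == "ADD":
            cpu["acc"] += arg
        elif op == "JNZ":
            if cpu["acc"] != 0:
                cpu["pc"] = arg
        elif op == "HALT":
            cpu["halted"] = True
        else:
            raise ValueError(f"unknown opcode {op!r}")
        steps += 1
    return cpu
```

Every emulated instruction costs many host instructions in a loop like this, which is a large part of why pure emulation is slow unless the host is far faster than the machine being emulated.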
There are countless virtualization tools out there. Some of the more popular ones are: VMware, Xen, SimNow by AMD, Virtual Iron, and Virtual PC / Virtual Server by Microsoft. Each of these products has different benefits. Some are free, some cost money. Some run at near native speeds, others are quite slow. For this article I used only VMware Server and Player. Both of these only support IVT right now (no AMD-V support yet). They are free, and I have found them to be completely robust, running even 32-bit Vista without problems (though I haven't used Aero Glass mode).
Technologies and tools
Several virtualization tools and technologies exist, including emulators. And many more custom implementations which essentially achieve the same things have been created over the decades. Most modern models are along the lines of what we see today in generic tools like VMware and Xen. These capture and direct everything, either via software or hardware (or both). Anything that would otherwise affect the underlying hardware is intercepted. These can utilize OS support for speed enhancements, but they also run without it.
The primary difference between these modern hardware approaches and software emulation is speed and flexibility. Virtualization engines designed around hardware utilize technologies that allow the physical machine to handle computing. In an emulated environment, everything must be interpreted. A whole virtual CPU state must be maintained and that means lots of complexity and overhead. However, products like Bochs, which emulate the whole machine, can be taken and recompiled to run on just about any platform. This means an ARM-based PDA could theoretically run x86 DOS. And I think you'll agree, there's definite benefit there.
Virtualization is here to stay. It comes in many forms and its history has been one of software migrating to hardware. It is becoming ubiquitous in business and will soon be the same in consumer spaces. The possibilities of running virtualized environments on the desktop are more feasible and desirable now than ever before. They open up opportunities where only walls once existed.
We are at a fork in the road today. The idea of any future computing machine running solely on its own will fall by the wayside. In fact, with virtualization hardware coming online with such a presence, it is very likely the machines of tomorrow will be far simpler than the machines of today. They will be designed for the virtual model, with all of the benefits it provides. We'll see software protocols replace hardware interfaces. And we'll see the benefits of virtualization abound.