08. Virtualization

Virtualization is a foundational technology in modern computing that enables multiple operating systems and applications to run on a single physical machine. This lecture explores the core concepts, history, and types of virtualization.

What is Virtualization?

General Definition

Virtualization refers to creating a virtual (rather than physical) version of computing resources. In the context of this course:

Common Distinction

There are two main categories of virtual machines:

1. Language-Based Virtual Machines

2. Virtual Machine Monitors (VMM) or Hypervisors

This course focuses primarily on hypervisor-based virtualization used in cloud computing.

History of Virtualization

The evolution of virtualization technology:

1972 → IBM VM/370
       First VM architecture for mainframe machines
       
1997 → Virtual PC for Mac
       Connectix brings virtualization to personal computers
       
1999 → VMware Virtual Platform
       Commercial virtualization for x86 architecture
       
2003 → Xen Hypervisor
       Open-source hypervisor project launched
       
2005 → VMware Player
       Free VM player for end users
       
2007 → VirtualBox
       Cross-platform virtualization software
       
2010s → Cloud Era
       AWS, Azure, GCP built on virtualization

Virtualization Terminology

Understanding the key terms:

Guest OS vs Host OS

Type 1 Hypervisor (Bare Metal)

┌─────────────────────────────────────┐
│     Virtual Machine 1  │  VM 2      │
│  ┌──────────────┐  ┌──────────────┐ │
│  │  Guest OS 1  │  │  Guest OS 2  │ │
│  └──────────────┘  └──────────────┘ │
├─────────────────────────────────────┤
│      Type 1 Hypervisor              │
├─────────────────────────────────────┤
│      Hardware (CPU/RAM/Disk)        │
└─────────────────────────────────────┘

Type 2 Hypervisor (Hosted)

┌─────────────────────────────────────┐
│     Virtual Machine 1  │  VM 2      │
│  ┌──────────────┐  ┌──────────────┐ │
│  │  Guest OS 1  │  │  Guest OS 2  │ │
│  └──────────────┘  └──────────────┘ │
├─────────────────────────────────────┤
│      Type 2 Hypervisor              │
├─────────────────────────────────────┤
│         Host OS                     │
├─────────────────────────────────────┤
│      Hardware (CPU/RAM/Disk)        │
└─────────────────────────────────────┘

Properties of Virtual Machines

Virtual machines provide four key properties that make them valuable:

1. Partitioning

Example:

Physical Server: 64 GB RAM, 16 CPU cores
├── VM1 (Web Server):    16 GB RAM, 4 cores
├── VM2 (Database):      32 GB RAM, 8 cores
└── VM3 (App Server):    16 GB RAM, 4 cores
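
The partitioning example can be checked with a few lines of Python. This is a minimal sketch, assuming static partitioning with no overcommit and ignoring the memory the hypervisor itself needs; the names `Allocation` and `check_partitioning` are made up for illustration and do not come from any hypervisor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Resources assigned to one VM (hypothetical helper type)."""
    name: str
    ram_gb: int
    cores: int


def check_partitioning(host_ram_gb, host_cores, vms):
    """Return (free_ram_gb, free_cores) left on the host.

    Raises ValueError if the VMs together ask for more than the host has.
    Assumption: no overcommit, which real hypervisors often do allow.
    """
    used_ram = sum(vm.ram_gb for vm in vms)
    used_cores = sum(vm.cores for vm in vms)
    if used_ram > host_ram_gb or used_cores > host_cores:
        raise ValueError(
            f"over-allocated: {used_ram} GB / {used_cores} cores requested, "
            f"{host_ram_gb} GB / {host_cores} cores available"
        )
    return host_ram_gb - used_ram, host_cores - used_cores


# The allocation from the example: a 64 GB, 16-core physical server.
vms = [
    Allocation("VM1 (Web Server)", ram_gb=16, cores=4),
    Allocation("VM2 (Database)", ram_gb=32, cores=8),
    Allocation("VM3 (App Server)", ram_gb=16, cores=4),
]

print(check_partitioning(64, 16, vms))  # (0, 0): the server is fully partitioned
```

In this example the three VMs consume exactly the whole server, so adding a fourth VM of any size would raise the error.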

2. Isolation

Benefits:

3. Encapsulation

Use Cases:

# VM files typically include:
- .vmdk / .vdi  → Virtual disk files
- .vmx / .vbox  → VM configuration
- .nvram        → BIOS settings
- .vmem         → Memory snapshot

This enables:

4. Hardware Independence

Advantages:

How Virtualization Works

The Illusion

The VM gives users an illusion of running on a physical machine:

The Reality

Behind the scenes:

  1. Operating systems normally run in privileged (kernel) mode
    • Direct access to hardware
    • Can execute privileged instructions
  2. Guest OSs run deprivileged, in user mode
    • No direct hardware access
    • Attempts to execute privileged instructions trap to the hypervisor
  3. Most instructions execute directly
    • The CPU runs them without hypervisor intervention
    • This is what provides near-native performance
  4. The hypervisor handles resource management
    • Memory allocation
    • Peripheral access
    • CPU scheduling
  5. The hypervisor handles each “trapped” privileged instruction
    • Intercepts the instruction
    • Emulates its effect on the VM's virtual state
    • Returns control to the VM
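
The trap-and-emulate steps above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the instruction names (`ADD`, `NOP`, `OUT`, `HLT`), the `ToyHypervisor` class, and the dictionary used as CPU state are all invented for illustration and do not model any real instruction set or hypervisor.

```python
# Hypothetical privileged instructions for this toy machine.
PRIVILEGED = {"OUT", "HLT"}


class PrivilegedInstructionTrap(Exception):
    """Raised by the toy CPU when user-mode code issues a privileged instruction."""

    def __init__(self, instruction):
        super().__init__(instruction)
        self.instruction = instruction


def cpu_execute_user_mode(instruction, state):
    """Toy CPU in user mode: run unprivileged instructions directly, trap otherwise."""
    op, *args = instruction.split()
    if op in PRIVILEGED:
        raise PrivilegedInstructionTrap(instruction)
    if op == "ADD":
        state["acc"] += int(args[0])
    elif op != "NOP":
        raise ValueError(f"unknown instruction: {instruction}")


class ToyHypervisor:
    def __init__(self):
        self.trap_count = 0
        self.virtual_device_log = []  # stands in for a virtual output device

    def emulate(self, instruction, state):
        """Emulate a trapped instruction against the VM's virtual state."""
        self.trap_count += 1
        op = instruction.split()[0]
        if op == "OUT":
            self.virtual_device_log.append(state["acc"])
        elif op == "HLT":
            state["halted"] = True  # halts only this VM, not the real machine

    def run(self, program):
        state = {"acc": 0, "halted": False}
        for instruction in program:
            if state["halted"]:
                break
            try:
                cpu_execute_user_mode(instruction, state)  # direct execution
            except PrivilegedInstructionTrap as trap:
                self.emulate(trap.instruction, state)  # trap, emulate, resume
        return state


hv = ToyHypervisor()
final = hv.run(["ADD 2", "ADD 3", "OUT", "HLT", "ADD 100"])
print(final["acc"], hv.trap_count, hv.virtual_device_log)  # 5 2 [5]
```

Only two of the five instructions reach the hypervisor; the two `ADD` instructions before the halt run without intervention, which is the source of the near-native performance mentioned in step 3.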

Hardware Assistance

Modern CPUs include virtualization support:

These provide:
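
As a rough illustration, hardware virtualization support can often be detected from user space. The sketch below assumes a Linux x86 system, where `/proc/cpuinfo` lists the `vmx` flag for Intel VT-x and the `svm` flag for AMD-V. These extension names and the helper functions are not taken from this lecture, and the flags may be hidden when the check itself runs inside a VM.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in /proc/cpuinfo-style text.

    Looks only at lines whose key is exactly "flags" and reports
    "vmx" (Intel VT-x) and/or "svm" (AMD-V) if present.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string if it is unavailable."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except OSError:
        return ""  # e.g. not running on Linux


if __name__ == "__main__":
    flags = virtualization_flags(read_cpuinfo())
    if flags:
        print("hardware virtualization flags:", sorted(flags))
    else:
        print("no vmx/svm flag found (unsupported, disabled, hidden, or not Linux x86)")
```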

VM Components

A virtual machine virtualizes several key components:

1. CPU Virtualization

2. Memory Virtualization

3. Network Virtualization

4. Disk Virtualization

Storage and Migration

Paravirtualization

An alternative approach to full virtualization:

Concept

Advantages

Example: Xen

Xen hypervisor uses paravirtualization:

Modern Trend

Comparison: Type 1 vs Type 2

Aspect        | Type 1 (Bare Metal)              | Type 2 (Hosted)
--------------+----------------------------------+--------------------------------
Installation  | Directly on hardware             | On top of host OS
Performance   | Higher (direct hardware access)  | Lower (host OS overhead)
Use Case      | Production servers, cloud        | Development, testing
Examples      | ESXi, Xen, Hyper-V               | VirtualBox, VMware Workstation
Management    | More complex                     | Easier to use
Cost          | Often enterprise licensing       | Often free or low-cost

Real-World Applications

Cloud Computing

All major cloud providers use virtualization:

Enterprise Data Centers

Desktop Virtualization (VDI)

Summary

Virtualization is a cornerstone technology that:

- Enables multiple VMs on a single physical machine
- Provides isolation for security and fault tolerance
- Allows easy migration and backup through encapsulation
- Abstracts hardware for flexibility and portability
- Powers modern cloud computing platforms

In the next lecture, we’ll dive deeper into how virtualization works internally, including binary translation and dynamic translation techniques.