08-02 Containerization and Linux Primitives Required

While virtual machines provide complete isolation by virtualizing entire operating systems, containers offer a lightweight alternative that shares the OS kernel while providing isolated execution environments. This lecture explores containerization and the Linux primitives that make it possible.

Motivation for Containerization

Limitations of VMs

Virtual machines are powerful but have drawbacks:

  • High runtime overhead: Each VM runs a complete OS
  • Slow startup: Booting an OS takes time
  • Resource intensive: Multiple OS copies consume significant memory
  • Large image sizes: VM images are typically gigabytes in size

The Container Solution

What if we could sandbox applications but share the OS kernel?

Containers provide:

  • Faster scaling: Start in seconds instead of minutes
  • Lower overhead: Share kernel, only package app dependencies
  • Smaller footprint: Container images are megabytes, not gigabytes
  • Higher density: Run more containers than VMs on same hardware

This enables:

  • Different software architectures (microservices)
  • New development practices (DevOps, CI/CD)
  • More efficient resource utilization

Containers vs Virtual Machines

Architecture Comparison

Traditional VM-based Infrastructure:

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│    App A     │  │    App B     │  │    App C     │
├──────────────┤  ├──────────────┤  ├──────────────┤
│   Bins/Libs  │  │   Bins/Libs  │  │   Bins/Libs  │
├──────────────┤  ├──────────────┤  ├──────────────┤
│   Guest OS   │  │   Guest OS   │  │   Guest OS   │
├──────────────┴──┴──────────────┴──┴──────────────┤
│              Hypervisor                           │
├───────────────────────────────────────────────────┤
│              Host OS (optional)                   │
├───────────────────────────────────────────────────┤
│              Physical Hardware                    │
└───────────────────────────────────────────────────┘

Container Infrastructure:

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│    App A     │  │    App B     │  │    App C     │
├──────────────┤  ├──────────────┤  ├──────────────┤
│   Bins/Libs  │  │   Bins/Libs  │  │   Bins/Libs  │
├──────────────┴──┴──────────────┴──┴──────────────┤
│           Container Runtime (Docker)              │
├───────────────────────────────────────────────────┤
│              Host OS / Kernel                     │
├───────────────────────────────────────────────────┤
│              Physical Hardware                    │
└───────────────────────────────────────────────────┘

Key Differences

Aspect        Virtual Machines           Containers
OS            Complete OS per VM         Shared kernel
Size          Gigabytes                  Megabytes
Startup       Minutes                    Seconds
Isolation     Strong (hardware-level)    Process-level
Overhead      Higher                     Lower
Density       Tens per host              Hundreds per host
Portability   Less portable              Highly portable

What Containers Share

Containers have a separate view of:

  • Root filesystem
  • Libraries and utilities
  • Process tree
  • Users and permissions
  • Networking stack
  • IPC endpoints

But they share:

  • The same OS kernel
  • System call interface
  • Hardware resources (managed by kernel)
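
One way to observe this split is sketched below, assuming a Linux system that ships /etc/os-release. The distribution name is read from a file in the (per-container) root filesystem, while the kernel release is reported by the shared kernel. The helper names parse_os_release and describe_environment are illustrative, not part of any framework.

```python
import os

def parse_os_release(text):
    """Parse KEY=value lines (the os-release format) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info

def describe_environment(os_release_path="/etc/os-release"):
    """Return (distribution name, kernel release) for the current process."""
    try:
        with open(os_release_path) as f:
            distro = parse_os_release(f.read()).get("PRETTY_NAME", "unknown")
    except OSError:
        distro = "unknown"
    # os.uname() asks the kernel, so all containers on one host report the same release.
    return distro, os.uname().release

if __name__ == "__main__":
    distro, kernel = describe_environment()
    print(f"userland: {distro}")
    print(f"kernel:   {kernel}")
```

Run on the host and again inside a container with a different distribution, the userland line should differ while the kernel line stays the same.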

Big Idea: Less OS Overhead

┌─────────────────────────────────────────────┐
│  Traditional VMs: 3 VMs on one host         │
│  - 3 complete OS copies                     │
│  - High memory usage                        │
│  - Slower startup                           │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│  Containers: 10+ containers on same host    │
│  - 1 shared kernel                          │
│  - Lower memory usage                       │
│  - Startup in seconds                       │
└─────────────────────────────────────────────┘

Important Constraint: Kernel Compatibility

Important: Container-Kernel Dependency

  • A Linux container needs a Linux kernel
  • A Windows container needs a Windows kernel
  • You cannot run a Windows container on a Linux host natively (or vice versa)

Solutions

On Windows:

  • Use Windows Subsystem for Linux (WSL 2), which runs a real Linux kernel in a lightweight VM, to run Linux containers
  • Use Hyper-V to run a full Linux VM that hosts containers
  • Use Windows Server to run native Windows containers

On macOS:

  • Docker Desktop uses a lightweight Linux VM
  • Containers run inside this VM

On Linux:

  • Native container support
  • Best performance and compatibility

Linux Primitives for Containers

Containers are built on two fundamental Linux kernel mechanisms:

1. Namespaces

Purpose: Provide isolated view of global resources

  • A group of processes sees only its “slice” of a resource
  • Processes outside the group cannot see or interfere with this slice
  • Provides the isolation aspect of containers

2. Control Groups (Cgroups)

Purpose: Control and limit resource usage

  • Set limits on CPU, memory, I/O for process groups
  • Prevent resource exhaustion
  • Enables resource management for containers

┌─────────────────────────────────────────┐
│  Namespaces + Cgroups = Container       │
│                                         │
│  Namespaces → Isolation                 │
│  Cgroups    → Resource Limits           │
└─────────────────────────────────────────┘

Namespaces in Detail

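Linux provides several types of namespaces to isolate different resources. The six below are the ones most relevant to containers; newer kernels also provide cgroup and time namespaces.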

Types of Namespaces

1. Mount Namespace

  • Isolates: Filesystem mount points
  • Effect: Each namespace has its own view of the filesystem hierarchy
  • Use: Containers have their own root filesystem

# Container sees:
/
├── bin/
├── lib/
├── app/
└── ...

# Host sees different filesystem

2. PID Namespace

  • Isolates: Process ID number space
  • Effect: First process in namespace gets PID 1
  • Use: Containers have their own process tree

Host PID Namespace:
  PID 1234 → Container init process
  
Container PID Namespace:
  PID 1 → Same process (appears as init)
  PID 2 → First child process
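
The double identity of a process can be read back from the kernel. The sketch below assumes Linux 4.1 or later, where /proc/<pid>/status contains an NSpid line listing the process's PID in each nested PID namespace, outermost (as seen from the procfs mount) first. The function names are illustrative.

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line of /proc/<pid>/status, outermost first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # older kernels do not expose NSpid

def pids_across_namespaces(pid="self"):
    """Read the NSpid entry for a process (defaults to the caller)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_nspid(f.read())

if __name__ == "__main__":
    # Sample text mirroring the diagram above: PID 1234 on the host, PID 1 inside.
    sample = "Name:\tinit\nPid:\t1234\nNSpid:\t1234\t1\n"
    print(parse_nspid(sample))          # [1234, 1]
    print(pids_across_namespaces())     # a single entry when not in a nested namespace
```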

3. Network Namespace

  • Isolates: Network resources (IP addresses, routing tables, ports)
  • Effect: Each namespace has its own network stack
  • Use: Containers can have their own IP addresses

# Container can bind to port 80
# Host can also bind to port 80
# No conflict because different network namespaces
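
A quick way to see which network stack a process lives in is to list the interfaces it can see. The sketch below parses /proc/net/dev, which reflects the network namespace of the reading process; a freshly created network namespace contains only the loopback interface. The function names and the sample text are illustrative.

```python
def parse_net_dev(text):
    """Return interface names from the contents of /proc/net/dev."""
    names = []
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" in line:
            names.append(line.split(":", 1)[0].strip())
    return names

def visible_interfaces(path="/proc/net/dev"):
    """List the interfaces visible in the caller's network namespace."""
    with open(path) as f:
        return parse_net_dev(f.read())

if __name__ == "__main__":
    sample = (
        "Inter-|   Receive                  |  Transmit\n"
        " face |bytes    packets errs drop  |bytes    packets errs drop\n"
        "    lo:    1234      10    0    0      1234      10    0    0\n"
        "  eth0:    5678      20    0    0      4321      15    0    0\n"
    )
    print(parse_net_dev(sample))  # ['lo', 'eth0']
```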

4. UTS Namespace

  • Isolates: Hostname and domain name
  • Effect: Each namespace can have different hostname
  • Use: Containers have unique hostnames

5. User Namespace

  • Isolates: User and group ID number space
  • Effect: Process can be root in container but unprivileged on host
  • Use: Enhanced security (rootless containers)

Container: UID 0 (root)
    ↓ mapped to
Host: UID 1000 (regular user)
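
The mapping is stored by the kernel in /proc/<pid>/uid_map (and gid_map for groups), one range per line in the form "inside-start outside-start length". The following sketch translates an ID through such a map; the function names are hypothetical, and "outside" here means the namespace of whoever reads the file, typically the host.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(tuple(int(field) for field in fields))
    return ranges

def to_outside_id(ranges, inside_id):
    """Translate an ID seen inside the namespace to the outside ID, or None if unmapped."""
    for inside_start, outside_start, length in ranges:
        if inside_start <= inside_id < inside_start + length:
            return outside_start + (inside_id - inside_start)
    return None

if __name__ == "__main__":
    # The mapping from the diagram above: container UID 0 is host UID 1000.
    ranges = parse_id_map("0 1000 1\n")
    print(to_outside_id(ranges, 0))  # 1000
    print(to_outside_id(ranges, 1))  # None (no mapping for UID 1)
```

An unmapped ID has no identity outside the namespace, which is why a "root" process in a rootless container cannot act as root on the host.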

6. IPC Namespace

  • Isolates: Inter-Process Communication endpoints
  • Effect: Separate message queues, semaphores, shared memory
  • Use: Prevent IPC interference between containers

Namespace API

Three key system calls:

1. clone()

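// Create a child process in new PID and network namespaces
clone(child_func, stack_top, CLONE_NEWPID | CLONE_NEWNET | SIGCHLD, args);

  • More general version of fork()
  • Flags specify which resources the child shares with its parent and which get new namespaces
  • stack_top points to the top of the child's stack, since stacks grow downward on most architectures
  • Creating namespaces generally requires CAP_SYS_ADMIN (user namespaces are the exception)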

2. setns()

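// Join an existing namespace; namespace_fd is an open file descriptor
// for a namespace file such as /proc/<pid>/ns/net
setns(namespace_fd, CLONE_NEWNET);

  • Allows a process to enter an existing namespace
  • Useful for debugging containers (tools such as nsenter are built on it)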

3. unshare()

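// Create new namespaces for the calling process
unshare(CLONE_NEWPID | CLONE_NEWNET);

  • Calling process moves into the new namespaces without creating a new process
  • Exception: with CLONE_NEWPID the caller stays in its current PID namespace; only children it creates afterwards enter the new one
  • Comparable to clone() with namespace flags, but without the fork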
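
To make the flag mechanics concrete, here is a minimal sketch that builds a namespace bitmask and calls unshare() through ctypes. The flag values are the ones defined in <linux/sched.h>; the helper names are illustrative. The call is made in a forked child so the caller's own namespaces are left alone, and it is expected to fail with EPERM when the process lacks the needed privileges.

```python
import ctypes
import os

# Namespace flag values from <linux/sched.h>.
CLONE_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}

def namespace_flags(*kinds):
    """OR together the flags for the requested namespace kinds."""
    mask = 0
    for kind in kinds:
        mask |= CLONE_FLAGS[kind]
    return mask

def try_unshare(flags):
    """Call unshare(flags) in a forked child.

    Returns 0 on success, the errno on failure, or 255 if libc has no unshare().
    """
    pid = os.fork()
    if pid == 0:
        code = 255
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            code = 0 if libc.unshare(flags) == 0 else ctypes.get_errno()
        finally:
            os._exit(code & 0xFF)  # never return into the parent's code path
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print(hex(namespace_flags("pid", "net")))   # 0x60000000
    print(try_unshare(namespace_flags("uts")))  # 0 if permitted, otherwise an errno
```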

Viewing Namespaces

Each process's namespaces are exposed as symbolic links under /proc/<pid>/ns:

# View namespaces for your current shell
ls -l /proc/$$/ns

# Output:
lrwxrwxrwx 1 user user 0 Jan  5 10:00 ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 user user 0 Jan  5 10:00 mnt -> 'mnt:[4026531840]'
lrwxrwxrwx 1 user user 0 Jan  5 10:00 net -> 'net:[4026531992]'
lrwxrwxrwx 1 user user 0 Jan  5 10:00 pid -> 'pid:[4026531836]'
lrwxrwxrwx 1 user user 0 Jan  5 10:00 user -> 'user:[4026531837]'
lrwxrwxrwx 1 user user 0 Jan  5 10:00 uts -> 'uts:[4026531838]'

The number in brackets (e.g., 4026531839) is an inode number that uniquely identifies the namespace. Processes whose links show the same number share that namespace.
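
The same comparison can be scripted. This sketch parses the link targets and checks whether two processes share a namespace; the function names are illustrative, and reading another process's links may require appropriate permissions.

```python
import os
import re

_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")

def parse_ns_link(target):
    """Split a link target such as 'net:[4026531992]' into ('net', 4026531992)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))

def namespace_id(pid, kind):
    """Return the namespace ID of the given kind (e.g. 'pid', 'net') for a process."""
    return parse_ns_link(os.readlink(f"/proc/{pid}/ns/{kind}"))[1]

def same_namespace(pid_a, pid_b, kind):
    """True if both processes are in the same namespace of the given kind."""
    return namespace_id(pid_a, kind) == namespace_id(pid_b, kind)

if __name__ == "__main__":
    print(parse_ns_link("net:[4026531992]"))  # ('net', 4026531992)
    # Compare this process with PID 1; inside a container these usually match,
    # while a container process compared with the host's init would not.
    print(same_namespace("self", 1, "pid"))
```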

Control Groups (Cgroups)

Purpose

Cgroups allow you to:

  • Limit resources (CPU, memory, I/O)
  • Prioritize resource allocation
  • Account for resource usage
  • Control which CPUs/memory nodes processes can use

Cgroup Hierarchies

Resources can be organized hierarchically:

All CPU Resources
├── CPU-Faculty (40%)
│   ├── Fac-Web (50%)
│   └── Fac-Non-Web (50%)
└── CPU-Students (60%)
    ├── Student-Web (50%)
    └── Student-Non-Web (50%)
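
Reading the tree, each percentage is taken here as a share of the parent's allocation (an assumption about how the figure is meant), so the absolute share of a leaf is the product of the fractions along its path. A small sketch of that arithmetic, with an illustrative data layout:

```python
def effective_shares(tree, parent_share=1.0):
    """Flatten a nested {name: (fraction, children)} tree into absolute shares."""
    result = {}
    for name, (fraction, children) in tree.items():
        share = parent_share * fraction
        result[name] = share
        result.update(effective_shares(children, share))
    return result

HIERARCHY = {
    "CPU-Faculty": (0.40, {"Fac-Web": (0.50, {}), "Fac-Non-Web": (0.50, {})}),
    "CPU-Students": (0.60, {"Student-Web": (0.50, {}), "Student-Non-Web": (0.50, {})}),
}

if __name__ == "__main__":
    for name, share in effective_shares(HIERARCHY).items():
        print(f"{name}: {share:.0%}")
```

Under that reading, Fac-Web ends up with 40% × 50% = 20% of all CPU resources and Student-Web with 60% × 50% = 30%.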

Creating Cgroups

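Managed via the filesystem (no new system calls). The paths below use the cgroup v1 layout, with one directory tree per controller:

# The cgroup filesystem is mounted at /sys/fs/cgroup/

# Create a cgroup for limiting memory
mkdir /sys/fs/cgroup/memory/mycontainer

# Set memory limit to 512 MiB (512 * 1024 * 1024 bytes)
echo 536870912 > /sys/fs/cgroup/memory/mycontainer/memory.limit_in_bytes

# Add process to cgroup
echo $PID > /sys/fs/cgroup/memory/mycontainer/tasks

# On cgroup v2 (the default on most current distributions) there is a single
# tree: the directory is /sys/fs/cgroup/mycontainer, the limit file is
# memory.max, and processes are added by writing to cgroup.procs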
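
Because the interface is just files, the same steps can be scripted in any language. The sketch below uses the cgroup v2 file names (memory.max and cgroup.procs) and, so that it runs without root, writes into a temporary directory that stands in for /sys/fs/cgroup. The function name create_cgroup is hypothetical. On a real v2 mount the kernel creates these files itself, and the memory controller must be enabled for the parent cgroup.

```python
import os
import tempfile

def create_cgroup(root, name, memory_bytes, pid):
    """Create a cgroup directory under root and write its limit and member files."""
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=True)          # equivalent of: mkdir <root>/<name>
    with open(os.path.join(path, "memory.max"), "w") as f:
        f.write(str(memory_bytes))            # equivalent of: echo <bytes> > memory.max
    with open(os.path.join(path, "cgroup.procs"), "w") as f:
        f.write(str(pid))                     # equivalent of: echo <pid> > cgroup.procs
    return path

if __name__ == "__main__":
    # A temporary directory plays the role of the cgroup mount point.
    with tempfile.TemporaryDirectory() as fake_root:
        path = create_cgroup(fake_root, "mycontainer", 512 * 1024**2, os.getpid())
        for name in sorted(os.listdir(path)):
            with open(os.path.join(path, name)) as f:
                print(name, "=", f.read())
```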

Resource Types

Cgroups can control:

  • CPU: CPU time, CPU shares, CPU quotas
  • Memory: Memory limits, swap limits
  • Block I/O: I/O bandwidth limits
  • Network: Packet classification and priority (used together with tc)
  • Devices: Device access control
  • CPUsets: Which CPUs/memory nodes to use
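
To check which cgroups a process currently belongs to, the kernel exposes /proc/<pid>/cgroup, with one line per hierarchy in the form "hierarchy-ID:controller-list:path" (on cgroup v2 the single line starts with "0::"). A minimal parsing sketch, with an illustrative function name and sample text:

```python
def parse_proc_cgroup(text):
    """Parse /proc/<pid>/cgroup into (hierarchy_id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy_id, controllers, path = parts
        entries.append(
            (int(hierarchy_id), controllers.split(",") if controllers else [], path)
        )
    return entries

if __name__ == "__main__":
    v1_sample = "4:memory:/mycontainer\n3:cpu,cpuacct:/\n"
    v2_sample = "0::/user.slice\n"
    print(parse_proc_cgroup(v1_sample))
    print(parse_proc_cgroup(v2_sample))
    with open("/proc/self/cgroup") as f:   # membership of the current process
        print(parse_proc_cgroup(f.read()))
```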

How to Create a Container

Putting it all together:

1. Create namespaces for isolation
   ├── PID namespace (process isolation)
   ├── Mount namespace (filesystem isolation)
   ├── Network namespace (network isolation)
   └── User namespace (security)

2. Create and configure cgroups for resource limits
   ├── CPU limits
   ├── Memory limits
   └── I/O limits

3. Create root filesystem
   ├── Base OS files (minimal)
   ├── Application binaries
   ├── Required libraries
   └── Configuration files

4. Enter namespaces, mount rootfs, register in cgroups

5. Execute application or shell

→ Your application is now running in a "container"!
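
Steps 1, 4, and 5 can be approximated without writing any system calls by driving the util-linux unshare command. The sketch below only assembles the command line and runs it on request; it leaves out the cgroup and root filesystem steps, and the function names are illustrative. Whether unprivileged user namespaces are permitted depends on the host's configuration.

```python
import shutil
import subprocess

def build_unshare_argv(command, hostname_isolation=True, network_isolation=True):
    """Build an argv for util-linux `unshare` that starts command in fresh namespaces."""
    argv = [
        "unshare",
        "--user", "--map-root-user",  # user namespace; caller appears as root inside
        "--pid", "--fork",            # PID namespace; fork so the command runs as PID 1
        "--mount", "--mount-proc",    # mount namespace with a /proc matching the new PID space
    ]
    if hostname_isolation:
        argv.append("--uts")
    if network_isolation:
        argv.append("--net")
    return argv + ["--"] + list(command)

def run_isolated(command):
    """Run command inside new namespaces; raises if unshare is not installed."""
    if shutil.which("unshare") is None:
        raise RuntimeError("util-linux unshare not found")
    return subprocess.run(build_unshare_argv(command), check=False)

if __name__ == "__main__":
    print(" ".join(build_unshare_argv(["ps", "-ef"])))
    # run_isolated(["ps", "-ef"])  # uncomment on a host that permits user namespaces
```

If the host allows it, the process listing inside should contain only the command itself, illustrating the PID namespace from step 1.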

Container Frameworks

Why Use Frameworks?

Creating containers manually is complex. Frameworks automate:

  • Namespace and cgroup configuration
  • Filesystem management
  • Network setup
  • Image distribution

LXC (Linux Containers)

  • General-purpose container framework
  • Provides standard OS shell interface
  • Acts like a lightweight VM
  • Uses namespaces and cgroups under the hood

Docker

  • Application-focused containers
  • Optimized to run a single application
  • Easy packaging and distribution
  • Dockerfile for reproducible builds
  • Docker Hub for sharing images

Comparison

Feature     LXC                   Docker
Purpose     System containers     Application containers
Interface   Full OS environment   Single application
Use Case    VM replacement        App deployment
Ecosystem   Smaller               Large (Docker Hub)

What Containers CAN Do

Run different Linux distributions on the same host

  • Ubuntu container on Red Hat host
  • Alpine container on Ubuntu host

Run applications with different dependencies

  • Python 3.9 in one container
  • Python 3.11 in another
  • Even if host has no Python installed

Use the host’s hardware and system calls

  • Access network interfaces
  • Use GPUs (with proper drivers)
  • Access storage

Provide isolation and security

  • Process isolation
  • Filesystem isolation
  • Network isolation

Summary

Containers provide lightweight isolation with lower overhead than VMs:

  • Share the same kernel but have different root filesystems
  • Built on Linux primitives:
    • Namespaces for isolation
    • Cgroups for resource limits
  • Frameworks like Docker and LXC provide user-friendly interfaces
  • Enable modern architectures: microservices, serverless, cloud-native

In the next lecture, we’ll explore Docker in depth, including how to create, manage, and deploy containerized applications.
