08-03 Docker and Container Orchestration
Docker has revolutionized application deployment by making containers accessible and easy to use. This lecture covers Docker concepts, architecture, and container orchestration frameworks.
Docker Concepts
Core Components
1. Image
- Frozen description of an environment
- Read-only template containing:
  - Base operating system files
  - Application code
  - Dependencies and libraries
  - Configuration files
- Stored in layers for efficiency
- Can be shared via Docker Hub or private registries
2. Container
- Running instantiation of an image
- Writable layer on top of image
- Isolated execution environment
- Can be started, stopped, moved, and deleted
- Ephemeral by default (state lost when removed)
3. Volume
- Persistent data storage
- Survives container lifecycle
- Can be shared between containers
- Managed by Docker or mounted from host
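In a Dockerfile, the VOLUME instruction marks a path whose contents should live in a volume rather than in the container's writable layer (a minimal sketch; /data is an arbitrary example path):

```dockerfile
FROM ubuntu:22.04
# Data written under /data goes to a Docker-managed volume,
# so it can outlive the container's writable layer
VOLUME /data
```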
┌─────────────────────────────────────┐
│ Image (Read-Only Template) │
│ ├── Ubuntu base layer │
│ ├── Python installation │
│ ├── Application dependencies │
│ └── Application code │
└─────────────────────────────────────┘
↓ docker run
┌─────────────────────────────────────┐
│ Container (Running Instance) │
│ ├── Writable layer │
│ └── All image layers (read-only) │
└─────────────────────────────────────┘
↓ uses
┌─────────────────────────────────────┐
│ Volume (Persistent Data) │
│ └── Database files, logs, etc. │
└─────────────────────────────────────┘
The Dockerfile
A Dockerfile describes everything your container needs:
- Dependencies
- Source code / binaries
- Configuration
- Startup command
Overview: Running Code in Docker
- Inherit from a parent OS/platform container
- Install any packages/libraries you need
- Add any source code you need
- Attach any volumes for data persistence
- Set a command to be run at startup
Essential Dockerfile Commands
FROM - Inherit from a parent container
FROM ubuntu:22.04
# or
FROM python:3.11-slim
# or
FROM node:18-alpine
RUN - Execute commands during build
RUN apt-get update && apt-get install -y \
python3 \
python3-pip \
&& rm -rf /var/lib/apt/lists/*
COPY / ADD - Copy files into the image
# COPY is preferred for simple file copying
COPY app.py /usr/local/app/
COPY requirements.txt /usr/local/app/
# ADD can also extract local tar archives and fetch remote URLs
ADD myapp.tar.gz /usr/local/
WORKDIR - Set working directory
WORKDIR /usr/local/app
EXPOSE - Document ports the container listens on (informational only; publish with -p at run time)
EXPOSE 80
EXPOSE 443
ENV - Set environment variables
ENV PYTHONUNBUFFERED=1
ENV APP_ENV=production
CMD - Default command to execute on startup
CMD ["python", "/usr/local/app/app.py"]
# or
CMD ["npm", "start"]
ENTRYPOINT - Configure container as executable
ENTRYPOINT ["python", "app.py"]
# Arguments can be passed: docker run myimage arg1 arg2
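ENTRYPOINT and CMD are often combined so that CMD supplies overridable default arguments (a sketch; the file and port are illustrative):

```dockerfile
# ENTRYPOINT fixes the executable; CMD supplies default arguments
ENTRYPOINT ["python", "app.py"]
CMD ["--port", "8000"]
# docker run myimage            → python app.py --port 8000
# docker run myimage --port 80  → python app.py --port 80
```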
Example Dockerfile
# Use official Python runtime as base
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Copy requirements file
COPY requirements.txt .
# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Expose port
EXPOSE 8000
# Set environment variables
ENV PYTHONUNBUFFERED=1
# Run the application
CMD ["python", "app.py"]
Docker Build Process
Each command in a Dockerfile creates an intermediate image layer:
FROM ubuntu:22.04 → Layer 1 (base)
RUN apt-get update → Layer 2
RUN apt-get install python → Layer 3
COPY app.py /app/ → Layer 4
CMD ["python", "/app/app.py"] → Layer 5
Benefits of layering:
- Caching: Unchanged layers are reused
- Efficiency: Only modified layers are rebuilt
- Sharing: Common layers shared between images
Optimizing Dockerfiles for Caching
Structure your Dockerfile to maximize cache hits:
# ❌ BAD: Code changes invalidate all layers
FROM python:3.11
COPY . /app
RUN pip install -r /app/requirements.txt
CMD ["python", "/app/app.py"]
# ✅ GOOD: Dependencies cached separately
FROM python:3.11
WORKDIR /app
# Install dependencies first (changes less frequently)
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy code last (changes frequently)
COPY . .
CMD ["python", "app.py"]
Principle: “Funnel down” from most general to most specific
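A related technique for keeping images small is a multi-stage build: build tools stay in the first stage, and only the installed artifacts are copied into the final image (a sketch; stage names and paths are illustrative):

```dockerfile
# Stage 1: install dependencies with the full toolchain available
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install --no-cache-dir -r requirements.txt

# Stage 2: slim runtime image without compilers or pip caches
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY . .
CMD ["python", "app.py"]
```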
Docker Architecture
Docker uses a client-server architecture:
┌─────────────────────────────────────────────────┐
│ Docker Client (CLI) │
│ $ docker run, docker build, docker pull │
└────────────────┬────────────────────────────────┘
│ REST API
↓
┌─────────────────────────────────────────────────┐
│ Docker Daemon (dockerd) │
│ ├── Manages images, containers, networks │
│ ├── Handles build requests │
│ └── Communicates with registries │
└────────────────┬────────────────────────────────┘
│
┌──────────┴──────────┬──────────────┐
↓ ↓ ↓
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Images │ │Containers│ │ Networks │
└──────────┘ └──────────┘ └──────────┘
Components
- Docker Daemon (dockerd)
  - Long-running background process
  - Manages containers, images, networks, volumes
  - Listens for API requests
- Docker Client (docker CLI)
  - Command-line interface
  - Sends commands to daemon via REST API
  - Can connect to remote daemons
- Docker Registry
  - Stores Docker images
  - Docker Hub (public)
  - Private registries (AWS ECR, Google GCR, Azure ACR)
- Docker Images
  - Read-only templates
  - Built from Dockerfiles
  - Stored in layers
- Docker Containers
  - Running instances of images
  - Isolated processes
Basic Docker Commands
Working with Images
# Pull an image from registry
docker pull ubuntu:22.04
# List local images
docker images
# Build an image from Dockerfile
docker build -t myapp:v1.0 .
# Tag an image
docker tag myapp:v1.0 username/myapp:v1.0
# Push image to registry
docker push username/myapp:v1.0
# Remove an image
docker rmi myapp:v1.0
# Search Docker Hub for images
docker search nginx
Working with Containers
# Run a container
docker run -d -p 8080:80 --name webserver nginx
# List running containers
docker ps
# List all containers (including stopped)
docker ps -a
# Stop a container
docker stop webserver
# Start a stopped container
docker start webserver
# Restart a container
docker restart webserver
# Execute command in running container
docker exec -it webserver bash
# View container logs
docker logs webserver
# Attach to running container
docker attach webserver
# Remove a container
docker rm webserver
# Remove all stopped containers
docker container prune
Advanced Operations
# Run container with volume mount
docker run -v /host/path:/container/path myapp
# Run with environment variables
docker run -e DB_HOST=localhost -e DB_PORT=5432 myapp
# Run with resource limits
docker run --memory="512m" --cpus="1.5" myapp
# Export container filesystem
docker export webserver > webserver.tar
# Create image from container changes
docker commit webserver myapp:v1.1
# Inspect container details
docker inspect webserver
# View container resource usage
docker stats
Container Orchestration
Managing multiple containers across multiple hosts requires orchestration:
Docker Swarm
- Native Docker clustering
- Pools multiple Docker engines into a virtual host
- Allows multiple hosts (physical or virtual) to collaborate
- Built-in load balancing
- Service discovery
# Initialize swarm
docker swarm init
# Deploy a service
docker service create --name web --replicas 3 -p 80:80 nginx
# Scale a service
docker service scale web=5
# List services
docker service ls
Docker Compose
- Multi-container application orchestration
- Declarative YAML format
- Defines services, networks, volumes
- Easy local development
docker-compose.yml example:
version: '3.8'
services:
  web:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://db:5432/myapp
    depends_on:
      - db
  db:
    image: postgres:15
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=secret
volumes:
  postgres_data:
# Start all services
docker-compose up -d
# Stop all services
docker-compose down
# View logs
docker-compose logs -f
# Scale a service
docker-compose up -d --scale web=3
Kubernetes
The most popular container orchestration platform:
Key Concepts
- Nodes: Physical or virtual machines
- Pods: Group of one or more containers
  - Share same network namespace
  - Have the same IP address
  - Often represent one instance of a tier in a multi-tier app (frontend, backend, database)
- Services: Stable network endpoint for pods
- Deployments: Declarative updates for pods
Kubernetes Features
- Auto-scaling: Scale pods based on load
- Self-healing: Restart failed containers
- Load balancing: Distribute traffic across pods
- Rolling updates: Zero-downtime deployments
- Service discovery: Automatic DNS for services
# Example Kubernetes deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:v1.0
          ports:
            - containerPort: 8000
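A Deployment is typically paired with a Service that gives the pods a stable network endpoint (a sketch; the name and ports assume a Deployment like the one above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web            # matches the pod labels from the Deployment
  ports:
    - port: 80          # port exposed by the Service
      targetPort: 8000  # containerPort on the pods
  type: ClusterIP
```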
What Containers CAN Do
✓ Run different Linux distributions on the same host
  - Example: Ubuntu container on Red Hat host
✓ Run applications with different dependencies
  - Example: Python 3.9 in one container, Python 3.11 in another
✓ Use the host’s hardware
  - Access network interfaces
  - Access GPUs (with NVIDIA drivers)
  - Access storage
✓ Isolate applications
  - Each container has its own filesystem
  - Process isolation
  - Network isolation
Best Practices
Image Best Practices
- Use official base images when possible
- Keep images small: Use alpine or slim variants
- Use specific tags: Avoid "latest" in production
- Minimize layers: Combine RUN commands
- Use .dockerignore: Exclude unnecessary files
- Don’t run as root: Use USER directive
- Scan for vulnerabilities: Use docker scan
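Several of these practices can be combined in one Dockerfile (a sketch; the user name and paths are illustrative):

```dockerfile
# Pin a specific slim tag rather than "latest"
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Create and switch to an unprivileged user instead of running as root
RUN useradd --create-home appuser
USER appuser
# --chown so the files are owned by the unprivileged user
COPY --chown=appuser:appuser . .
CMD ["python", "app.py"]
```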
Container Best Practices
- One process per container: Follow microservices principle
- Use volumes for data: Don’t store data in containers
- Log to stdout/stderr: Let Docker handle log management
- Use health checks: Enable automatic restart
- Set resource limits: Prevent resource exhaustion
- Use environment variables: For configuration
- Keep containers stateless: Enable easy scaling
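A health check can be declared directly in the Dockerfile (a sketch; the endpoint and intervals are illustrative, and this assumes curl is installed in the image):

```dockerfile
# Mark the container unhealthy if the app stops answering
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1
```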
Summary
Docker simplifies container management:
- Images are templates, containers are running instances
- Dockerfiles define how to build images
- Docker architecture uses client-server model
- Layering enables efficient caching and sharing
- Orchestration tools (Swarm, Compose, Kubernetes) manage multiple containers
- Best practices ensure secure, efficient deployments
In the next lecture, we’ll explore serverless computing and how it abstracts away even container management.