Cgroups, namespaces, and beyond: what are containers made from?
https://youtu.be/sK5i-N34im8?si=Db5HG2LRPgAggBsm
- A hypervisor is software that pools computing resources, such as processing, memory, and storage, and reallocates them among virtual machines (VMs).
- This technology makes virtualization possible, meaning you can create and run many VMs from a single physical machine.
- e.g. Oracle VirtualBox
https://www.redhat.com/en/topics/virtualization/what-is-a-hypervisor
Check Docker version:
docker --version
Pull an image (replace "image_name" with the actual image name):
docker pull image_name
Run a container in detached mode (background):
docker run -d image_name
Run a container with a custom name (replace "container_name" and "image_name"):
docker run --name container_name image_name
Run a container interactively (get a terminal inside):
docker run -it image_name
List all running containers:
docker ps
List all containers (running and stopped):
docker ps -a
Stop a running container (replace "container_name" or "container_id"):
docker stop container_name   # OR docker stop container_id
Remove a stopped container (replace "container_name" or "container_id"):
docker rm container_name   # OR docker rm container_id

Image Management:
List all images:
docker images
Remove an image (replace "image_name" or "image_id"):
docker rmi image_name   # OR docker rmi image_id
Delete unused images:
docker system prune -a -f
Note: docker image prune removes only dangling images (untagged layers); add -a to also remove all images not used by any container.
- Docker is written in the Go language.
- Docker is a tool that performs OS-level virtualization, also known as containerization.
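Because containerization virtualizes at the OS level rather than emulating hardware, every container shares the host's kernel. A quick shell sketch to see this (assumes Docker is installed; guarded so it degrades gracefully otherwise):

```shell
# Containers virtualize the OS userspace, not the hardware, so the kernel
# version reported inside a container matches the host's.
host_kernel=$(uname -r)
echo "host kernel:      $host_kernel"

if command -v docker >/dev/null 2>&1; then
  container_kernel=$(docker run --rm alpine uname -r)
  echo "container kernel: $container_kernel"   # same kernel as the host
else
  echo "docker not installed; skipping container check"
fi
```

A VM run under a hypervisor would report its own guest kernel here instead.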
FROM node:12.2.0-alpine
WORKDIR /app
COPY . .
RUN npm install
RUN npm run test
EXPOSE 8000
CMD ["node","app.js"]

Flask example:
WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
ENV FLASK_RUN_HOST=0.0.0.0
EXPOSE 5000
CMD ["flask", "run"]

Multi-stage Docker build (Java, 2 stages):
# First Stage: Build Stage
FROM openjdk:8-jdk as build
# Working Directory where all code will be kept
WORKDIR /app/
# Copy the app to the current directory of the build stage
COPY Hello.java .
# Compile code
RUN javac Hello.java
# Second Stage: Runtime Stage
FROM openjdk:8-jdk-alpine
# Set the working directory for the second stage
WORKDIR /app/
# Copy the compiled code from the build stage to the runtime stage
COPY --from=build /app/ .
# Run java compiled code
CMD ["java", "Hello"]

Single-stage Flask example:
FROM python:3.12.0b4-slim-bullseye
WORKDIR /app
COPY . .
RUN pip3 install --no-cache-dir -r requirements.txt
ENV FLASK_RUN_HOST=0.0.0.0
EXPOSE 5000
CMD ["flask", "run"]

Multi-stage Docker build (3 stages):
# Stage 1: Base
FROM python:3.9-slim-buster AS base
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Stage 2: Build
FROM base AS build
COPY . .
# Stage 3: Final
FROM python:3.9-slim-buster AS final
WORKDIR /app
# Copy the installed packages and console scripts as well as the app,
# so the `flask` command actually exists in the final image
# (paths assume the python:3.9-slim-buster layout)
COPY --from=build /usr/local/lib/python3.9/site-packages /usr/local/lib/python3.9/site-packages
COPY --from=build /usr/local/bin /usr/local/bin
COPY --from=build /app /app
CMD ["flask", "run"]
The default Docker volume storage path:
/var/lib/docker/volumes
Root Directory:
Docker stores all its data, including images, containers, volumes, and networks, within the /var/lib/docker directory.
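This layout can be confirmed with the volume commands; a small sketch (assumes a running Docker daemon; demo_vol is a throwaway name):

```shell
# Named volumes live under /var/lib/docker/volumes/<name>/_data by default.
if command -v docker >/dev/null 2>&1; then
  docker volume create demo_vol
  docker volume inspect demo_vol --format '{{ .Mountpoint }}'
  # typically prints: /var/lib/docker/volumes/demo_vol/_data
  docker volume rm demo_vol
else
  echo "docker not installed; default path would be /var/lib/docker/volumes/demo_vol/_data"
fi
```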
--build-context stringArray   Additional build contexts (e.g., name=path)

docker build -t sample-site -f sample-site/docker/Dockerfile --build-context sample-site/ .

In this command, --build-context sample-site/ specifies that the build context should be the sample-site/ directory, which includes the html/ and docker/ directories. The . at the end of the command specifies the current directory as the build context for files that are not in the sample-site/ directory (like the config/ directory).
It's a common practice to keep the Dockerfile at the project root directory. The command, by default, expects the Dockerfile to be present there. All the files we want to include in the image should exist somewhere inside that context.
General Dockerfile folder structure:
project-root/
│
├── Dockerfile
├── app/
│   ├── src/
│   │   └── (application source files)
│   ├── static/
│   │   └── (static files)
│   └── templates/
│       └── (HTML template files)
│
├── config/
│   └── (configuration files)
│
├── tests/
│   └── (test scripts and data)
│
└── README.md
If the structure is like this:
projects/
├── <some other projects>...
├── sample-site/
│   ├── html/
│   │   └── index.html
│   └── docker/
│       └── Dockerfile
└── config/
    └── nginx.conf
docker build -t sample-site -f sample-site/docker/Dockerfile .

A Jenkins pipeline snippet that picks the Dockerfile path dynamically:
def dockerfilePath = fileExists('Dockerfile') ? 'Dockerfile' : 'KYC/monitor/Dockerfile'
sh "docker buildx build -t ${CONTAINER_REGISTRY_URL}/${ECR_REPO_NAME}:amd-${VERSION} -f ${dockerfilePath} ${buildContext} --build-arg PACKAGE_READ_TOKEN=${PACKAGE_READ_TOKEN} --build-arg BUILD_VERSION=${VERSION} --build-arg GIT_COMMIT=${scmVars.GIT_COMMIT[0..7]}"

docker build with --build-arg and multiple arguments:
Use --build-arg with each argument.
If you are passing two arguments, add --build-arg before each one:
docker build \
-t essearch/ess-elasticsearch:1.7.6 \
--build-arg number_of_shards=5 \
--build-arg number_of_replicas=2 \
--no-cache .

https://kodekloud.com/blog/docker-build-args/
https://stackoverflow.com/questions/42297387/docker-build-with-build-arg-with-multiple-arguments
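On the Dockerfile side, each --build-arg must correspond to an ARG instruction; a minimal sketch (the filename Dockerfile.args is made up; the variable names mirror the example above):

```shell
# Each --build-arg only takes effect if the Dockerfile declares a matching ARG.
cat > Dockerfile.args <<'EOF'
FROM alpine:3.19
ARG number_of_shards=1
ARG number_of_replicas=0
# ARG values exist only at build time; bake them into ENV if the
# running container needs them too.
ENV SHARDS=$number_of_shards REPLICAS=$number_of_replicas
EOF

# Build with overridden values (requires Docker):
# docker build -f Dockerfile.args \
#   --build-arg number_of_shards=5 \
#   --build-arg number_of_replicas=2 \
#   -t argdemo .
```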
Reducing Docker image sizes is crucial for optimizing container deployment, enhancing scalability, and minimizing storage costs. Here's how you can effectively reduce Docker image sizes:
- Choose Lightweight Versions: Start with minimal base images like python:3.9-slim or python:3.9-alpine instead of full-sized OS images. For instance, python:3.9-alpine is significantly smaller (around 95% smaller) than python:3.9.
- Combine Commands: Each command in a Dockerfile creates a new layer, increasing the image size. Combine similar commands to reduce layers.
- Example:
# Instead of this:
RUN apk update
RUN apk add --no-cache git
RUN rm -rf /var/cache/apk/*
# Do this:
RUN apk update && apk add --no-cache git && rm -rf /var/cache/apk/*
- Exclude Unnecessary Files: Use a .dockerignore file to prevent unnecessary files and directories from being copied into the image, reducing the final image size.
- Sample .dockerignore:
__pycache__
*.pyc
*.pyo
*.pyd
venv/
- Separate Build and Runtime Stages: Multi-stage builds allow you to keep only the essential parts of the application in the final image, drastically reducing its size.
- Example:
# Stage 1: Build
FROM python:3.9-alpine AS builder
RUN apk add --no-cache build-base gfortran musl-dev lapack-dev
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Stage 2: Production
FROM python:3.9-alpine
WORKDIR /app
COPY --from=builder /app /app
EXPOSE 5000
CMD ["python", "app.py"]
- Result: The image size drops from 588 MB (single-stage) to 47.7 MB (multi-stage).
- Empty Base Image: If your application is a static binary, use the scratch base image, which is empty and results in a very small final image.
- Example:
FROM scratch
COPY myapp /
CMD ["/myapp"]
- Use Trusted and Official Base Images: Always start with verified images to ensure security.
- Run Containers as Non-Root Users: Reduce the risk of privilege escalation by running containers with non-root users.
- Regular Vulnerability Scans: Regularly scan Docker images for vulnerabilities using tools like Clair or Trivy.
- Limit Network Exposure: Restrict ports and IP addresses to reduce attack surfaces.
docker run -p 127.0.0.1:8080:8080 myimage
- Avoid Hardcoding Sensitive Information: Never hardcode secrets in Dockerfiles. Use environment variables or secret management tools.
- Smaller Image Size = Faster Deployments + Quicker Scaling + Leaner Infrastructure
By following these practices, you can significantly reduce Docker image sizes, improving both performance and security.
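Before applying these practices, it helps to see where an existing image's size actually comes from; docker history breaks an image down per layer. A small sketch (assumes Docker and that python:3.9-slim has been pulled locally):

```shell
if command -v docker >/dev/null 2>&1; then
  # Per-layer sizes: each RUN/COPY instruction shows up with its own size,
  # which makes oversized layers easy to spot.
  docker history --format '{{.Size}}\t{{.CreatedBy}}' python:3.9-slim

  # Overall disk usage by images, containers, and volumes.
  docker system df
else
  echo "docker not installed; skipping"
fi
```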
Docker networking enables communication between containers, between containers and the host, and sometimes between containers across different Docker hosts.
Containers are isolated by default but can be connected using various network drivers.
https://docs.docker.com/engine/network/
https://spacelift.io/blog/docker-networking#docker-network-types
https://docs.docker.com/reference/cli/docker/network/
How Docker Networking Works Under the Hood
Docker networking relies heavily on Linux kernel features to provide isolation and connectivity between containers. The main building blocks are:
- A network namespace is a lightweight, isolated network stack for a group of processes.
- When Docker creates a container, it creates a separate network namespace for it.
- This namespace has its own interfaces, routing tables, firewall rules, and network devices, completely isolated from other namespaces (including the host's default namespace).
- This means a container can have its own IP address and network configuration independent from the host or other containers.
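The namespace isolation described above can be reproduced by hand with iproute2's ip netns, which exercises roughly the same kernel facility Docker uses per container (a sketch only; needs root, and demo_ns is a made-up name):

```shell
# Create an isolated network namespace like the one Docker gives a container.
if [ "$(id -u)" -eq 0 ] && command -v ip >/dev/null 2>&1; then
  ip netns add demo_ns
  # The new namespace starts with only a loopback interface, and it is down;
  # its interfaces, routes, and firewall rules are invisible to the host ns.
  ip netns exec demo_ns ip link show
  ip netns del demo_ns
else
  echo "needs root + iproute2; skipping"
fi
```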
- To connect container network namespaces to the Docker host network, Docker uses virtual Ethernet (veth) pairs.
- A veth pair acts like a virtual network cable: packets sent on one end appear on the other end.
- When a container is created:
  - One end of the veth pair is placed inside the container's network namespace (usually named eth0).
  - The other end remains in the host's default network namespace and is attached to a Docker network bridge (e.g., docker0).
- The default Docker bridge (docker0) is a virtual Ethernet bridge created on the host.
- It acts like a virtual switch, connecting all container-side veth interfaces attached to the bridge.
- Containers connected to the same bridge network can communicate directly using their internal IP addresses.
- The bridge also assigns IP addresses from a private subnet to connected containers.
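The veth/bridge plumbing above can be sketched manually; this is an illustrative approximation of what Docker automates, not its actual implementation (needs root; all interface and namespace names are made up):

```shell
# Roughly what Docker does when it attaches a container to docker0:
if [ "$(id -u)" -eq 0 ] && command -v ip >/dev/null 2>&1; then
  ip netns add c1                                   # "container" namespace
  ip link add veth-host type veth peer name veth-c  # the virtual cable
  ip link set veth-c netns c1                       # one end into the namespace
  ip netns exec c1 ip link set veth-c name eth0     # rename it like Docker does
  ip netns exec c1 ip addr add 172.30.0.2/24 dev eth0
  ip netns exec c1 ip link set eth0 up
  # On a real Docker host the other end is enslaved to the bridge:
  #   ip link set veth-host master docker0 && ip link set veth-host up
  ip netns del c1                                   # cleanup removes the pair too
else
  echo "needs root + iproute2; skipping"
fi
```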
- Docker uses an internal IP address management (IPAM) system to allocate IP addresses to containers.
- When a container joins a network (bridge by default), it gets assigned an IP address from the subnet defined for that network.
- Containers use this IP address within the network namespace for communication.
- Containers on the bridge network have private IPs that are not directly reachable outside the host.
- Docker uses iptables rules on the host to perform Network Address Translation (NAT), allowing containers to access external networks (internet).
- For external access to containers (e.g., web servers), Docker maps host ports to container ports (port forwarding) using iptables rules.
- This allows you to expose container services on the host IP and ports.
- Docker modifies the host's IP routing tables and firewall (iptables) rules to:
  - Enable container-to-container communication on the same network.
  - Enable container-to-host and container-to-external network communication.
- Docker automatically manages these rules as containers start and stop.
- When you create a user-defined bridge network, Docker creates a new bridge interface with its own subnet and routing.
- Containers on the same user-defined network can resolve each other by name via embedded DNS, simplifying communication.
Host Network Namespace
├── docker0 (bridge)  <--- virtual switch
│   ├── veth-host-1 (host end of veth pair)
│   ├── veth-host-2
│   └── ...
│
├── Container 1 Namespace
│   └── eth0 (container end of veth pair)
│
├── Container 2 Namespace
│   └── eth0
│
└── Routing & iptables for NAT & port forwarding
+--------------------------------+
|          Docker Host           |
|                                |
|      +------------------+      |
|      | docker0 (bridge) | <---- Default bridge network
|      +------------------+      |
|          ^          ^          |
|          |          |          |
|  +-------+----+ +---+--------+ |
|  | veth-host1 | | veth-host2 |  Virtual Ethernet pairs
|  +-------+----+ +---+--------+ |
|          |          |          |
+----------|----------|----------+
           v          v
 +----------------+  +----------------+
 |  Container 1   |  |  Container 2   |
 |  Network NS    |  |  Network NS    |
 |  eth0: 172.X   |  |  eth0: 172.X   |  Each has its own IP
 +----------------+  +----------------+
- docker0 (bridge): Virtual switch that connects containers on the same bridge network.
- veth pair: Acts like a virtual cable between host and container namespaces.
- eth0 inside container: The container's network interface.
- Namespace isolation: Each container has its own isolated networking stack.
- IP Assignment: Each container gets a private IP from the Docker bridge subnet.
- This isolation and networking model is what makes containers lightweight yet securely isolated.
- It enables flexible container-to-container networking on the same host without IP conflicts.
- You can extend these concepts for multi-host networking using overlay networks.
Docker supports six network types (drivers) that implement core container networking functionality:
- bridge: The default for standalone containers. It creates a private internal network on the host, and containers can communicate through it using IPs or container names.
- host: Removes network isolation by using the host's network stack directly. The container shares the host's IP and ports, which is useful for performance or compatibility needs.
- none: Disables networking completely. Useful for security or manual configuration.
- overlay: Enables multi-host networking using Docker Swarm. It creates a distributed network across nodes, allowing containers on different hosts to communicate securely.
- macvlan: Assigns a MAC address to each container, making it appear as a physical device on the network. Used for scenarios requiring full network integration, such as legacy apps.
- ipvlan: Similar to macvlan but uses a different method for traffic handling. It's more efficient for high-density environments but less flexible.
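The first three drivers can be compared quickly from the CLI; a minimal sketch (assumes Docker; mybridge is an arbitrary name, and the commands are guarded so the script still runs without Docker):

```shell
if command -v docker >/dev/null 2>&1; then
  docker network create --driver bridge mybridge     # user-defined bridge
  docker run --rm --network mybridge alpine ip addr  # own eth0 + private IP
  docker run --rm --network host alpine ip addr      # sees the host's interfaces
  docker run --rm --network none alpine ip addr      # only a loopback interface
  docker network rm mybridge
else
  echo "docker not installed; skipping"
fi
```

Comparing the three `ip addr` outputs makes the isolation differences concrete.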
BRIDGE NETWORK
------------------------
+--------------------------------+
|          Docker Host           |
|                                |
|      +----------------+        |
|      | docker0 bridge |        |
|      +----------------+        |
|         ^          ^           |
|         |  (veth pairs)        |
|  +------+-----+ +--+---------+ |
|  | veth-host1 | | veth-host2 | |
|  +------+-----+ +--+---------+ |
|         |          |           |
+---------|----------|-----------+
          v          v
    +-----------+  +-----------+
    | eth0 in A |  | eth0 in B |
    +-----------+  +-----------+
Containers A and B (IP: 172.17.x.x) can talk if on the same bridge network.

HOST NETWORK (shares host net stack)
----------------------------
+-------------------+        +-----------------------+
| Container C       |------->| Shares host's IP      |
| Uses host network |        | No isolation (perf)   |
+-------------------+        +-----------------------+
No separate IP or interface
OVERLAY NETWORK (multi-host)
----------------------------
+------------------+          +------------------+
| Host 1           |          | Host 2           |
|  +-------------+ |          |  +-------------+ |
|  | Container D | | -------> |  | Container E | |
|  | IP 10.0.0.2 | | Overlay  |  | IP 10.0.0.3 | |
|  +-------------+ |  VXLAN   |  +-------------+ |
+------------------+          +------------------+
Needs Docker Swarm (or plugins)
MACVLAN NETWORK (direct LAN access)
----------------------------
+--------------------+        +------------------+
| Container F        |------->| LAN Switch       |
| IP: 192.168.1.100  |        | Real network     |
| MAC: unique        |        | Direct traffic   |
+--------------------+        +------------------+
Appears as a physical device on the LAN
NONE NETWORK (no networking)
----------------------------
+------------------+
| Container G |
| No external access|
| Only loopback |
+------------------+
For max isolation / testing
| Network Type | Isolated IP | Host Access | Multi-Host | Use Case |
|---|---|---|---|---|
| bridge | Yes | Via ports | No | Default, simple setups |
| host | No (shares) | Full | No | High-performance, same-host apps |
| overlay | Yes | Via ingress | Yes | Docker Swarm multi-host |
| macvlan | Yes (LAN IP) | No (by default) | Depends | LAN-level access, legacy systems |
| none | No | No | No | Debugging, sandboxing |
User-defined Bridge Network (with DNS)
--------------------------------------
+------------------+         +------------------+
| Container A      |-------> | Container B      |
| Name: web        |         | Name: db         |
| IP: 172.18.0.2   |         | IP: 172.18.0.3   |
+------------------+         +------------------+
        |
        v  DNS resolution via Docker's embedded DNS
  curl http://db:3306
Note: Works only in user-defined bridge or overlay networks, not the default bridge.
+------------------+
| Container        |
| eth0: 172.x.x.x  |
+--------^---------+
         |
         v
Use host.docker.internal (Docker Desktop)
OR use the host IP (Linux)
Note: On Linux, a container must reach the host via the host IP, not localhost.
+------------------+          +------------------------+
| Container        |          | Docker Host (iptables) |
| 172.17.0.2       |--------->| NAT (MASQUERADE)       |
| Outbound traffic |          | Rewrites to host IP    |
+------------------+          +------------------------+
                                    |
                                    v
                                Internet
Note: Enabled by default using Docker-managed iptables rules.
+------------------------+
| Host: localhost:8080   | <--- You access
+----------^-------------+
           |
    iptables DNAT rule
           |
+----------v-------------+
| Container: port 80     |
| (e.g., Nginx)          |
+------------------------+
Command:
docker run -p 8080:80 nginx
Note: The host forwards traffic to the container's internal port.
+--------------------------+
| Container (API Server)   |
| Exposes: 8080->80        |
+-----------^--------------+
            |
   curl http://hostIP:8080
            |
            v
  iptables loopback rule redirects
Note: Enables a container to access its own service via the host-mapped port (e.g., self-calls to APIs).
+------------------+
| Container A      |
| eth0 -> bridge1  |
| eth1 -> bridge2  |
+------------------+
Command:
docker network connect bridge2 containerA
Note: Use case: an API container talks to both the DB and Web networks separately.
+--------------------------+
| Overlay Network (Swarm)  |
| Service traffic routing  |
+------------^-------------+
             |
  Internal NAT & bridge routing
             |
+------------v-------------+
| docker_gwbridge (host)   |
| For outside -> service   |
+--------------------------+
Note: Enables external access to Swarm containers via the host.
+-----------------------+
| Service: web          |
| VIP: 10.0.0.2         |
+----------^------------+
           |
   Internal DNS (VIP)
           |
+----------v------------+
| Task 1 (web.1)        |
| Task 2 (web.2)        |
| Task 3 (web.3)        |
+-----------------------+
Docker load-balances between tasks automatically via **IPVS**.
Note: DNS-based load balancing for services across replicas.
- List networks:
  docker network ls
- Inspect a network:
  docker network inspect <network-name>
- Create a network:
  docker network create --driver bridge my_bridge_network
- Connect a container to a network:
  docker network connect <network-name> <container-name>
- Disconnect a container from a network:
  docker network disconnect <network-name> <container-name>
docker network create my_net
docker run -dit --name container1 --network my_net alpine sh
docker run -dit --name container2 --network my_net alpine sh

Inside container1, you can ping container2 by name:
docker exec container1 ping container2

Port Mapping
- Host ports are mapped to container ports to allow access from outside the host.
- Example: docker run -p 8080:80 nginx maps host port 8080 to container port 80.
- Port mapping uses NAT and iptables to forward traffic.
docker inspect <container-name or ID>

Look for:
- Networks: the container's assigned IP and which networks it's connected to
- NetworkMode: the network mode the container was started with
Note: Use docker network inspect <network-name> to see all containers attached to a specific network.
If two containers are on the same custom bridge network, they should resolve each other by name:
docker exec container1 ping container2
Warning: On the default bridge network, DNS-based name resolution does not work. Use container IPs or a user-defined bridge network.
You might be running a service inside a container, but if it's not mapped to a host port, it won't be accessible:
docker ps

Check the PORTS column for something like 0.0.0.0:8080->80/tcp.
Note: If missing, the container is not exposed to the host. Re-run with:
docker run -p 8080:80 <image>

Ensure the host port isn't already in use:
sudo lsof -i -P -n | grep LISTEN

If something else is using the host port, Docker won't bind to it.
Docker sets up iptables for NAT and forwarding.
To list Docker-related rules:
sudo iptables -t nat -L -n

Note: If you've disabled iptables or a firewall has blocked the rules, containers may not get outbound access.
Sometimes DNS isn't resolving from inside the container:
docker exec <container> cat /etc/resolv.conf

Try:
docker exec <container> ping google.com

Fix:
- Pass --dns 8.8.8.8 when running the container
- Or configure DNS in /etc/docker/daemon.json:
{
  "dns": ["8.8.8.8", "1.1.1.1"]
}

For deep debugging:
PID=$(docker inspect -f '{{.State.Pid}}' <container>)
sudo nsenter -t $PID -n ip a

This lets you inspect the container's network from the host's perspective.
Sometimes your Docker networks clash with your host network (e.g., VPN or LAN).
docker network inspect <network>

If your container's subnet overlaps with the host/VPN range, change it:
docker network create \
--subnet=192.168.200.0/24 \
my_custom_net

Inside the container, test connectivity:
# Try to reach an external service
curl http://host.docker.internal:8080
# Check if a port is open
nc -zv <target-host> <port>

host.docker.internal works on Docker Desktop (Mac/Windows) for host access.
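On Linux specifically, host.docker.internal can be opted into per container with the host-gateway alias (available in Docker Engine 20.10+); a sketch, guarded so it runs without Docker:

```shell
if command -v docker >/dev/null 2>&1; then
  # Maps host.docker.internal to the host's gateway IP inside the container.
  docker run --rm --add-host=host.docker.internal:host-gateway \
    alpine ping -c1 host.docker.internal
else
  echo "docker not installed; skipping"
fi
```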
Verify IP forwarding is enabled on host:
cat /proc/sys/net/ipv4/ip_forward

It should return 1. If not:
sudo sysctl -w net.ipv4.ip_forward=1

Start a lightweight container for testing:
docker run --rm -it --network=my_net alpine sh
# Inside: ping, nslookup, curl, etc.

Install tools if needed:
apk add curl iputils iproute2 net-tools

| What to Check | Tool/Command |
|---|---|
| Network and IP config | docker inspect |
| Container-to-container ping | docker exec ping <other-container> |
| Port bindings | docker ps, -p option |
| DNS issues | cat /etc/resolv.conf, ping google.com |
| Iptables/NAT issues | sudo iptables -t nat -L -n |
| Namespace debugging | nsenter -t <PID> -n |
| Host port conflicts | lsof -i, netstat, ss |
end!!