Docker Demystified: Everything You Need to Know
Get introduced to Docker and its history, and learn why searching for Docker keeps turning up talk of container images.
Introduction
Docker is a popular and powerful tool for creating, deploying, and managing containers. Containers are isolated environments that run applications and services without affecting the rest of the system. They are lightweight, portable, and efficient, making them ideal for modern software development.
In this blog post, we will explore the history of containerization technology, the main features and components of Docker, its drawbacks and challenges, and some of the behind-the-scenes details of how it works. We will also introduce gRPC, a framework that enables communication between services, including services running in Docker containers.
History of Containerization
Containerization is not a new concept. It has been around since the early days of computing when operating systems used techniques such as chroot and jails to isolate processes and resources. However, these methods were not very flexible or scalable, and they required a lot of manual configuration and maintenance.
The breakthrough came in 2008, when the Linux kernel introduced a feature called cgroups (control groups), which allowed fine-grained control over CPU, memory, disk, network, and other resources for a group of processes. Together with namespaces, this enabled Linux Containers (LXC), which were more secure, efficient, and easier to use than previous methods.
However, LXC still had some limitations. For example, containers had to be compatible with the host’s kernel, there was no standard image format or interface, and there were few tools for building and distributing images.
This is where Docker came in. Docker was launched in 2013 as an open-source project that aimed to simplify and standardize the creation and management of containers. It provided a simple command-line interface, a universal image format, a registry service for storing and sharing images, and an orchestration platform for managing clusters of containers. Docker quickly gained popularity and adoption among developers and organizations, becoming the de facto standard for containerization technology.
Understanding Docker
Docker is a platform that enables you to build, run, and share applications using containers. It consists of several components that work together to provide a complete solution for containerization.
Docker runtime: This is the software that runs on your machine (or server) that allows you to create and run containers. It includes the Docker daemon (or server), which communicates with the Docker client (or CLI) to execute commands and manage containers.
Docker engine: This is the core component of Docker that provides the functionality for building images and running containers. It consists of several sub-components that handle different aspects of containerization.
Docker images: These are read-only templates that contain the application code, dependencies, libraries, configuration files, and other resources needed to run an application in a container. You can create your own images using a Dockerfile or use pre-built images from the Docker Hub or other registries.
Docker containers: These are instances of images that run as isolated processes on the host machine. Each container has its own filesystem, network interfaces, environment variables, and other settings. You can start, stop, restart, attach to, inspect, or remove containers using various Docker commands.
Docker orchestration: This is the process of managing multiple containers across multiple machines or servers. It involves tasks such as scheduling, scaling, load balancing, networking, service discovery, health checking, and fault tolerance. Docker provides tools for orchestration such as Docker Compose (for defining multi-container applications) and Docker Swarm (for clustering and distributing containers), and it also integrates with Kubernetes (for managing large-scale container deployments).
Docker Hub: This is a cloud-based service that allows you to store and share your images with other users or organizations. You can also browse and download images from public repositories or create your private repositories. You can also use other registries such as Google Container Registry or Amazon Elastic Container Registry.
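To see how these pieces fit together in practice, here is a minimal, illustrative workflow that pulls an image from a registry, runs it as a container, lists it, and cleans up. The nginx image is just a common public image used as an example:

# Pull an image from Docker Hub (the registry component)
$ docker pull nginx
# Ask the daemon to create and start a container from that image
$ docker run -d --name web nginx
# List running containers
$ docker ps
# Stop and remove the container when you are done
$ docker stop web
$ docker rm web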
Containers and Operating Systems
A common question that arises when using Docker is where a container actually runs within the operating system. Does it run on top of the host OS, inside a virtual machine, or somewhere else?
The answer is that a container runs on top of the host OS but in an isolated environment. A container does not have its own kernel or operating system; instead, it uses the kernel and system resources of the host OS. However, a container does not have direct access to these resources; rather, it uses Linux-specific features such as cgroups and namespaces to limit and isolate its access to CPU, memory, disk, network, and other resources.
This means that a container can run any application or service that is compatible with the host OS and kernel. For example, if you are running Docker on a Linux machine with a 5.10 kernel, you can run any container that works with Linux 5.10. However, you cannot natively run a container that requires a different operating system or kernel, such as a Windows or macOS application.
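You can check this kernel sharing yourself. As a quick illustration (assuming the public alpine image is available to pull, and noting that your exact version string will differ), the kernel version reported inside a container matches the host’s:

# Kernel version on the host
$ uname -r
5.10.0-21-amd64
# Kernel version inside a container: the same kernel, shared with the host
$ docker run --rm alpine uname -r
5.10.0-21-amd64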
Behind the Scenes: How ‘docker run hello-world’ Works
One of the simplest commands you can run with Docker is ‘docker run hello-world’, which prints a welcome message to the standard output. But what happens behind the scenes when you execute this command? Let’s break it down step by step:
The Docker client sends a request to the Docker daemon to run a container from the image ‘hello-world’.
The Docker daemon checks if it has the ‘hello-world’ image locally. If not, it pulls it from the Docker Hub registry.
The Docker daemon creates a new container from the ‘hello-world’ image and assigns it a unique ID and name.
The Docker daemon starts the container and runs the command specified in the image, which is ‘/hello’.
The container executes the ‘/hello’ binary, which prints a message to the standard output.
The Docker daemon streams the output back to the Docker client, which displays it on the terminal.
The container exits after running the command. It is not removed automatically; it remains in the ‘Exited’ state until you remove it yourself (or unless you started it with the ‘--rm’ flag).
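Running the command yourself makes these steps visible. The output below is abridged and will vary slightly between Docker versions, but you can see the pull from Docker Hub (step 2) followed by the message printed by the ‘/hello’ binary (step 5):

$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.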
Running Containers in Detached Mode
By default, when you run a container using ‘docker run’, it attaches to your terminal and displays its output on your screen. This is useful for debugging or testing purposes, but not for running long-running or background processes.
To run a container in detached mode, you can use the ‘-d’ or ‘--detach’ flag with ‘docker run’. This tells Docker to run the container in the background and print its ID. For example:
$ docker run -d --name my_container ubuntu sleep 60
Output:
a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2b3

This command runs an Ubuntu container in detached mode with the name ‘my_container’ and executes the command ‘sleep 60’, which pauses for 60 seconds before exiting. The long string in the output is the container ID, a unique identifier for the container; you can use it (or the container name) to manage the container with other Docker commands.
To see the status of your detached containers, you can use ‘docker ps’. To see their logs, you can use ‘docker logs’. To stop or remove them, you can use ‘docker stop’ or ‘docker rm’.
Running containers in detached mode allows you to run multiple containers simultaneously without blocking your terminal or interfering with each other’s output. It also allows you to keep your containers running even after you close your terminal session.
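Putting those commands together, a typical session with the ‘my_container’ example might look like this (if the 60 seconds have already elapsed, the container will only show up with ‘docker ps -a’):

# Check that the container is still running
$ docker ps
# Inspect its output ('sleep' writes nothing, so this will be empty)
$ docker logs my_container
# Stop it early, then remove it
$ docker stop my_container
$ docker rm my_container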
Layers in Docker
Docker uses a layered architecture to build and store images. Each image is composed of one or more layers, each representing a change or modification to the image. Layers are read-only and immutable: once a layer is created, it cannot be modified. Instead, any change is applied by creating a new layer on top of the existing ones, which keeps the underlying layers intact and preserves the integrity of the image. Because layers never change, they can also be shared and reused by other images.
For example, let’s say you want to create an image that runs a Python application. You could start with a base image that contains the Python interpreter and some common libraries. Then you could add another layer that contains your application code and dependencies. Finally, you could add another layer that specifies how to run your application.
Each layer has a unique identifier (or hash) that represents its content and metadata. When you build an image using a Dockerfile, Docker creates a new layer for each instruction in the file. When you pull or push an image from or to a registry, Docker only transfers the layers that are not already present on the source or destination.
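As a rough sketch of the Python example above (the file names app.py and requirements.txt and the image tag my-python-app are hypothetical), each instruction in a Dockerfile like this produces its own layer:

# Base layer: Python interpreter and common libraries
FROM python:3.11-slim
# Dependency layer: install third-party packages
COPY requirements.txt .
RUN pip install -r requirements.txt
# Application layer: copy in the application code
COPY app.py .
# Metadata: how to run the application
CMD ["python", "app.py"]

After building it with ‘docker build -t my-python-app .’, you can run ‘docker history my-python-app’ to see the individual layers, their sizes, and the instructions that created them.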
Layers help to optimize storage space and network bandwidth by avoiding duplication and redundancy. They also help to improve performance and security by caching and verifying layers.
Using Docker Logs
Docker logs are the output of your containers, which can be useful for debugging, monitoring, or auditing purposes. You can view the logs of your containers using the ‘docker logs’ command, which takes the name or ID of the container as an argument. For example:
$ docker logs my_container
Output:
Hello, world!
This command displays the logs of the container named ‘my_container’. In this example, we assume the container’s main process printed ‘Hello, world!’ to the standard output; the ‘sleep 60’ container from the previous section would produce no log output at all.
You can also use various flags and options with ‘docker logs’ to filter or format the output. For instance, the ‘--since’ option shows only the logs generated after a certain point in time:
$ docker logs --since 5s my_container
Output:
No output
This command shows only the logs of the container named ‘my_container’ that were generated in the last 5 seconds. Since there was no output in that time period, it returns nothing.
You can also use the ‘--tail’ option to show only the last N lines of the logs. For example:
$ docker logs --tail 3 my_container
Output:
Hello, world!
Goodbye, world!
Exit code: 0
This command shows only the last 3 lines of the logs of the container named ‘my_container’, which prints ‘Hello, world!’, ‘Goodbye, world!’, and ‘Exit code: 0’ to the standard output.
There are many other options and flags that you can use with ‘docker logs’ to customize your output. You can refer to the official documentation for more details.
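Two that are worth knowing are ‘-f’ (or ‘--follow’), which streams new log lines as they are produced, and ‘--timestamps’, which prefixes each line with the time it was written:

# Stream logs in real time, with timestamps, for a running container
$ docker logs -f --timestamps my_container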
Docker Architecture
Docker architecture is composed of several components that work together to provide functionality for building images and running containers. These components are:
Docker Client: This is the command-line interface (CLI) that you use to interact with Docker. It sends requests to the Docker daemon using a REST API.
Docker Daemon: This is the server process that runs on your machine (or server) that handles all the requests from the Docker client. It manages images, containers, networks, volumes, and other Docker objects.
containerd: This is a daemon that provides a low-level interface for managing containers. It handles tasks such as creating, starting, stopping, pausing, resuming, deleting, and inspecting containers. It also manages images, snapshots, and storage drivers.
Shim: This is a process that sits between containerd and runc to monitor and report on the state of a container. It also acts as a proxy for signals and I/O streams between containerd and runc, and it allows a container to keep running even if the daemon is restarted.
runc: This is a lightweight tool that runs containers according to the Open Container Initiative (OCI) specification. It creates and runs containers using Linux-specific features such as cgroups and namespaces.
These components form the Docker engine, which is responsible for creating and running containers on your machine.
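On a Linux host you can often see this chain in the process list while a container is running. The exact process names vary between Docker and containerd versions (for example, the shim may appear as containerd-shim-runc-v2), so treat the output below as illustrative:

$ ps -e -o pid,ppid,comm | grep -E 'dockerd|containerd|shim'
  640     1 containerd
  812     1 dockerd
 4321     1 containerd-shim-runc-v2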
Drawbacks of Docker
Docker is not without its challenges and limitations. Some of the common drawbacks of Docker technology are:
Security: Containers are not as secure as virtual machines because they share the kernel with the host machine. This means that if a container is compromised or runs malicious code, it could potentially affect other containers or the host system. Therefore, it is important to follow best practices such as using trusted image sources, running containers with only the privileges they need, and keeping the host kernel up to date.
Performance: Containers have lower overhead than virtual machines because they do not need to emulate hardware or run a full operating system. However, they still consume host resources such as CPU, memory, and disk I/O, so a busy container can affect its neighbours unless you set resource limits (see the example after this list).
Compatibility: Containers are designed to run on Linux-based systems because they rely on Linux-specific features such as cgroups and namespaces. This also means that they cannot natively run applications or services that require a different operating system or kernel, such as Windows or macOS. To overcome this limitation, solutions such as Docker Desktop or VMware Fusion can be used to run Docker on non-Linux systems (typically by running a lightweight Linux virtual machine under the hood), but they may introduce additional complexity and overhead.
Portability: Containers are portable across different machines or platforms because they contain all the dependencies needed to run an application. However, this also means that they may not be compatible with the specific configuration or environment of the target machine or platform. For example, a container that runs on a laptop may not work on a cloud server due to differences in network settings, security policies, or hardware capabilities. To ensure portability, it is important to test and verify containers on different environments before deploying them.
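For the performance point above, Docker exposes flags that cap what a single container may consume, which helps keep one busy container from starving the others. A small sketch, with the limits and the nginx image chosen purely for illustration:

# Limit the container to half a CPU core and 256 MB of memory
$ docker run -d --name capped --cpus 0.5 --memory 256m nginx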
Introduction to gRPC
You may be wondering how Docker containers communicate with each other and with other services. How do they exchange data and messages in a fast, efficient, and reliable way? One popular answer is gRPC.
gRPC is a framework that enables communication between services using a high-performance binary serialization format called Protocol Buffers (or Protobuf). It supports multiple languages such as C#, C++, Go, Java, Python, Ruby, and more. It also supports multiple platforms such as Linux, Windows, macOS, Android, and iOS.
gRPC allows you to define your service interface using a .proto file, which describes the methods, parameters, and return types of your service. Then you can use a compiler tool to generate client and server code in your preferred language. You can also use plugins to generate additional code such as documentation or testing tools.
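As a minimal, hypothetical sketch (the Greeter service and its method names are made up for illustration), a .proto file might look like this:

// greeter.proto: a hypothetical service definition
syntax = "proto3";

package demo;

service Greeter {
  // Unary call: one request, one response
  rpc SayHello (HelloRequest) returns (HelloReply);
  // Server streaming: one request, many responses
  rpc StreamGreetings (HelloRequest) returns (stream HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}

From this file, a compiler invocation such as ‘protoc --go_out=. --go-grpc_out=. greeter.proto’ (the exact flags depend on your target language and installed plugins) generates the client stubs and server skeletons.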
gRPC uses HTTP/2 as its transport layer, which provides features such as multiplexing, streaming, compression, and security. gRPC also supports various types of communication patterns such as unary (one request-one response), server streaming (one request-many responses), client streaming (many requests-one response), and bidirectional streaming (many requests-many responses).
gRPC is useful for building microservices or distributed systems that need fast, efficient, and reliable communication between services. It also integrates well with Docker technology because it allows you to create lightweight and scalable containers that communicate with each other using gRPC.
Conclusion
Docker is a powerful and popular containerization tool with many benefits for modern software development. It allows us to create, deploy, and manage applications and services in a fast, efficient, and reliable way.
In this post, we traced the history of containerization, looked at how Docker works behind the scenes through layers, detached mode, logs, and its architecture, discussed challenges such as security, performance, compatibility, and portability, and introduced gRPC, a framework that enables communication between services, including those running in Docker containers.
If you're eager to delve deeper into Docker and harness its capabilities for your projects, there are various resources you can explore to expand your knowledge.
Here are a few recommendations to help you on your Docker learning journey:
Official Docker Documentation: The official Docker documentation is an excellent starting point. It provides comprehensive guides, tutorials, and references covering various Docker topics. You can find it at: https://docs.docker.com/
Docker's Online Training: Docker offers online training courses that cater to different skill levels, from beginner to advanced. These courses provide hands-on experience and cover topics such as containerization, Docker Compose, and orchestration with Docker Swarm and Kubernetes. You can access the courses at: https://www.docker.com/get-started/training
Docker Community: Engaging with the Docker community can be beneficial. The Docker Forums (https://forums.docker.com/) and Docker Community Slack (https://community.docker.com/) are great places to ask questions, seek guidance, and connect with fellow Docker enthusiasts.
YouTube Resources: TechWorld With Nana, Kunal Kushwaha, KubeSimplify, Automation Step By Step, and many more.
Blogs and Tutorials: Many tech bloggers share their knowledge and experiences with Docker. Websites like Docker's official blog (https://www.docker.com/blog/), and Hashnode (https://hashnode.com/) host numerous Docker-related articles. Additionally, stay tuned for my blog posts, where I'll be diving deeper into Docker topics.
Remember, Docker is a vast and evolving ecosystem, so it's important to keep exploring and staying up to date with the latest developments. By combining official documentation, online courses, community engagement, and reading insightful blogs, you can enhance your understanding of Docker and make the most out of its capabilities in your projects.
Thank you for reading this blog post. I hope you found it informative and useful. If you have any questions or feedback, please feel free to leave a comment below. Happy Docking!