Painless Docker Basic Edition: A Practical Guide To Master Docker And Its Ecosystem Based On Real World Examples


Table of Contents
1. Introduction
2. Preface
3. Chapter I - Introduction To Docker & Containers
4. Chapter II - Installation & Configuration
5. Chapter III - Basic Concepts
6. Chapter IV - Advanced Concepts
7. Chapter V - Working With Docker Images
8. Chapter VI - Working With Docker Containers
9. Chapter VII - Working With Docker Machine
10. Chapter VIII - Docker Networking
11. Chapter IX - Composing Services Using Compose
12. Chapter X - Docker Logging
13. Chapter XI - Docker Debugging And Troubleshooting
14. Chapter XII - Orchestration - Docker Swarm
15. Chapter XIII - Orchestration - Kubernetes
16. Chapter XIV - Orchestration - Rancher/Cattle
17. Chapter XV - Docker API
18. Chapter XVI - Docker Security
19. Chapter XVII - Docker, Containerd & Standalone Runtimes Architecture
20. Final Words
Painless Docker
About The Author
Aymen is a Cloud & Software Architect, entrepreneur, author, CEO of Eralabs, a DevOps & Cloud consulting company, and founder of the DevOpsLinks community.
He has been using Docker since the word Docker was just a buzz. He has worked on web development, system engineering, infrastructure and architecture for companies and startups. He is interested in Docker, Cloud Computing, the DevOps philosophy, lean programming and related tools and methodologies.
You can find Aymen on Twitter.
Don't forget to join the DevOpsLinks & Shipped newsletters and the community job board, JobsForDevOps. You can also follow this book's Twitter account for future updates.
Wishing you a pleasant reading.
Disclaimer
Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or
other countries.
Preface
Docker is an amazing tool. Maybe you have tried or tested it, or maybe you have started using it on some or all of your production servers, but managing and optimizing it can become complex very quickly if you don't understand some of the basic and advanced concepts that I try to explain in this book.
The fact that the ecosystem of containers is rapidly changing is also a constraint to stability and a source of
confusion for many operation engineers and developers.
Most of the examples found in blog posts and tutorials are, in many cases, either promoting Docker or giving tiny examples; managing and orchestrating Docker is more complicated, especially with high-availability constraints.
This containerization technology is changing the way system engineering, development and release management have worked for years, so it deserves all of your attention, because it will be one of the pillars of future IT technologies, if it is not already the case.
At Google, everything runs in a container. According to The Register, two billion containers are launched every week. Google has been running containers for years, since before containerization technologies were democratized, and this is one of the secrets of the performance and operational smoothness of the Google search engine and all of its other services.
Some years ago, I had doubts about using Docker. I played with Docker on testing machines and later decided to use it in production. I have never regretted this choice: a few months ago I created a self-service platform for the developers in my startup, an internal scalable PaaS, and that was awesome! I gained more than 14x on some production metrics and reached my goal of having a service with an SLA and an Apdex score of 99%.
Apdex (Application Performance Index) is an open standard that defines a standardized method to report, benchmark, and track application performance.
SLA (Service Level Agreement) is a contract between a service provider (either internal or external) and the
end user that defines the level of service expected from the service provider.
It was not just the usage of Docker (that would be too easy); it was a list of things to do, like moving to micro-services and service-oriented architectures, changing the application and infrastructure architecture, continuous integration, etc. But Docker was one of the most important items on my checklist, because it smoothed the whole stack's operations and transformation, helped me with continuous integration and the automation of routine tasks, and was a good platform to create our own internal PaaS.
Some years ago, computers had a central processing unit and a main memory hosted in one main machine; then came mainframes, which were inspired by that technology. Just after that, IT had a newborn called virtual machines. The revolution was in the fact that, using a hypervisor, a single physical machine can act as if it were many machines. Virtual machines used to run almost exclusively on on-premise servers, but since the emergence of cloud technologies, VMs have moved to the cloud, so instead of having to invest heavily in data centers and physical servers, one can run the same virtual machine on a provider's infrastructure and benefit from the 'pay-as-you-go' cloud advantage.
Over the years, requirements change and new problems appear; that's why solutions also tend to change and new technologies emerge.
Nowadays, with the fast democratization of software development and cloud infrastructures, new problems appear, and containers are being largely adopted since they offer suitable solutions.
A good example of these problems is supporting software, during development, in an environment identical to production. Weird things happen when your development, testing and production environments are not the same. In this particular case, you should provide and distribute this environment to your R&D and QA teams.
But running a Node.js application that has 1 MB of dependencies, plus the 20 MB Node.js runtime, in an Ubuntu 14.04 VM will cost you up to 1.75 GB. It is better to distribute a small container image than 1 GB of unused libraries. A container holds only the OS libraries and the Node.js dependencies, so rather than starting with everything included, you can start with the minimum and then add dependencies, so that the same Node.js application can be 22 times smaller! When using optimized containers, you can run more applications per host.
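You can observe this difference yourself by pulling a full distribution image and a slim Node.js image, then comparing their sizes (the node:alpine tag is an example of a Node.js image based on Alpine Linux; exact sizes vary with the tags you pull):

docker pull ubuntu:14.04   # a full Ubuntu base image (several hundred MB)
docker pull node:alpine    # a Node.js image built on the tiny Alpine Linux
docker images              # compare the SIZE column of both images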
Containers are a problem solver, and one of the most sophisticated and widely adopted container solutions is Docker.
To Whom Is This Book Addressed?
To developers, system administrators, QA engineers, operations engineers, architects, and anyone required to work in one of these environments in collaboration with the others, or simply in an environment that requires knowledge of development, integration and system administration.
The most common idea is that developers think they are there to serve the machines by writing code and applications, while system administrators think that machines should work for them, simply by keeping them happy (maintenance, optimization, etc.).
Moreover, within the same company there is generally some tension between the two teams:
System administrators accuse developers of writing code that consumes memory, does not meet system
security standards or is not adapted to the available machines' configuration.
Developers accuse system administrators of being lazy, lacking innovation and being seriously uncool!
No more mutual accusations, now with the evolution of software development, infrastructure and Agile engineering,
the concept of DevOps was born.
DevOps is more a philosophy and a culture than a job (even if some of the positions I have held were called
"DevOps"). This philosophy seeks closer collaboration and a combination of the different roles involved in
software development: the developer, the person responsible for operations, and the person responsible for quality assurance.
Software must be produced at a frenetic pace, while at the same time waterfall development seems to have
reached its limits.
If you are a fan of service-oriented architectures, automation and the collaboration culture;
if you are a system engineer, a release manager or an IT administrator working on DevOps, SysOps or WebOps;
if you are a developer seeking to join the new movement;
then this book is addressed to you. Docker is one of the most used tools in DevOps environments.
And if you are new to the Docker ecosystem, no matter what your Docker level is, through this book you will first learn the basics of Docker (installation, configuration, the Docker CLI, etc.) and then move easily on to more complicated things, like using Docker in your development, testing and live environments.
You will also see how to write your own Docker API wrapper, and then master the Docker ecosystem, from orchestration and continuous integration to configuration management and much more.
I believe in learning led by practical, real-world examples, and you will be guided throughout this book by tested examples.
How To Properly Enjoy This Book
This book contains technical explanations, showing in each case an example of a command or a configuration to follow. The explanation gives you a general idea, and the code that follows helps you practice what you are reading. Preferably, you should always look at both parts for maximum understanding.
Like any new tool or programming language you have learned, it is normal to face difficulties and confusion at the beginning, perhaps even later. If you are not used to learning new technologies, you may even have only a modest understanding while being at an advanced stage of this book. Do not worry, everyone has been in this kind of situation at least once.
At the beginning, you could skim through the book while focusing on the basic concepts, then try the first practical manipulations on your server or laptop, and occasionally come back to this book for further reading about a specific subject or concept.
This book is not an encyclopedia, but it sets out the most important parts of learning, and even mastering, Docker and its fast-growing ecosystem. If you find words or concepts that you are not comfortable with, just take your time and do your own online research.
Learning can be serial, so understanding one topic may require understanding another one first; do not lose patience. You will go through chapters with good examples of explained, practical use cases.
Through the examples, try to showcase your acquired understanding, and, no, it will not hurt to go back to previous
chapters if you are unsure or in doubt.
Finally, try to be pragmatic and have an open mind if you encounter a problem. The resolution begins by asking the
right questions.
Conventions Used In This Book
Basically, this is a technical book where you will find commands (Docker commands) and code (YAML, Python
..etc).
Commands and code are written in a different format.
Example :
docker run hello-world
This book uses an italic font for technical words such as libraries, modules and language names. The goal is to get your attention when you are reading and help you identify them.
You will find two icons; I have tried to be as simple as possible, so I have chosen not to use too many symbols. You will only find:
One to highlight useful and important information.
One to highlight a warning or a cautionary advice.
Some container/service/network identifiers are too long to format well, so you will find some ids in the format 4..d or e..3: instead of writing the whole string 4lrmlrazrlm4213lkrnalknra125009hla94l1419u14N14d, I only use the first and the last characters, which gives 4..d.
How To Contribute And Support This Book?
This work will always be a work in progress, but that does not mean it is not a complete learning resource: writing a perfect book is impossible, unlike iterative and continuous improvement.
I am an adopter of the lean philosophy, so the book will be continuously improved based on several criteria, the most important one being your feedback.
I imagine that some readers do not know how "lean publishing" works, so I'll try to explain briefly: say the book is 25% complete; if you pay for it at this stage, you will pay the price of the 25% version but get all of the updates until 100%.
Another point: lean publishing, for me, is not about money. I refused several interesting offers from known publishers because I want to be free from restrictions, DRM, etc.
If you have any suggestions, or if you encounter a problem, it is better to use a tracking system for issues and recommendations about this book; I recommend using this GitHub repository.
You can find me on Twitter or you can use my blog contact page if you would like to get in touch.
This book is not perfect, so you may find typos, punctuation errors or missing words. On the other hand, every line of the code, configurations and commands used here was tested beforehand.
If you enjoyed reading Painless Docker and would like to support it, your testimonials will be more than welcome; just send me an email. If you need a development/testing server to manipulate Docker, I recommend using Digital Ocean, and you can also show your support by using this link to sign up.
If you want to join more than 1000 developers, SRE engineers, sysadmins and IT experts, you can subscribe to the DevOpsLinks community, and you will be invited to join our newsletter and our team chat.
Chapter I - Introduction To Docker & Containers
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
What Are Containers
"Containers are the new virtualization." : This is what pushed me to adopt containers, I have been always curious
about the virtualization techniques and types and containers are - for me - a technology that brings together two of
my favorite fields: system design and software engineering.
Container like Docker is the technology that allows you to isolate, build, package, ship and run an application.
A container makes it easy to move an application between development, testing, staging and production
environments.
Containers exist since years now, it is not a new revolutionary technology: The real value it gives is not the
technology itself but getting people to agree on something. In the other hand, it is experiencing a rebirth with easy-
to-manage containerization tools like Docker.
Containers Types
The popularity of Docker made some people think that it is the only container technology but there are many others.
Let's enumerate most of them.
System administrators will be more familiar with the following technologies, but this book is not just for Linux specialists, operations engineers and system architects; it is also addressed to developers and software architects.
The following list is ordered from the oldest to the most recent technology.
Chroot Jail
Historically, the first container was the chroot.
Chroot is a system call on *nix OSs that changes the root directory of the currently running process and its children. A process running in a chroot jail will not know about the real filesystem root directory.
A program run in such an environment cannot access files and commands outside that environment's directory tree. This modified environment is called a chroot jail.
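As a minimal illustration (assuming a Debian/Ubuntu host with debootstrap installed; the /srv/jail path is just an example), you could build and enter a chroot jail like this:

sudo debootstrap stable /srv/jail http://deb.debian.org/debian   # populate a minimal root filesystem
sudo chroot /srv/jail /bin/bash                                  # processes started here see /srv/jail as /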
FreeBSD Jails
The FreeBSD jail mechanism is an implementation of OS-level virtualization.
A FreeBSD-based operating system can be partitioned into several independent jails. While a chroot jail restricts processes to a particular filesystem view, FreeBSD jails are an OS-level virtualization: a jail restricts the activities of a process with respect to the rest of the system. Jailed processes are "sandboxed".
Linux-VServer
Linux-VServer is a virtual private server implementation using OS-level virtualization capabilities that were added to the Linux kernel.
Linux-VServer technology has many advantages, but its networking is based on isolation, not virtualization, which prevents each virtual server from creating its own internal routing policy.
Solaris Containers
Solaris Containers are an OS-level virtualization technology for x86 and SPARC systems. A Solaris Container is a
combination of system resource controls and the boundary separation provided by zones.
Zones are the equivalent of completely isolated virtual servers within a single OS instance. System administrators can place multiple sets of application services onto one system, each in its own isolated Solaris Container.
OpenVZ
Open Virtuozzo, or OpenVZ, is also an OS-level virtualization technology for Linux. OpenVZ allows system administrators to run multiple isolated OS instances (containers), also called virtual private servers or virtual environments.
Process Containers
Engineers at Google (primarily Paul Menage and Rohit Seth) started the work on this feature in 2006 under the
name "Process Containers". It was then called cgroups (Control Groups). We will see more details about cgroups
later in this book.
LXC
Linux Containers or LXC is an OS-level virtualization technology that allows running multiple isolated Linux
systems (containers) on a control host using a single Linux kernel. LXC provides a virtual environment that has its
own process and network space. It relies on cgroups (Process Containers).
The difference between Docker and LXC is detailed later in this book.
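For readers who want a feel for the classic LXC tooling, here is a minimal sketch (assuming the lxc package is installed; the container name demo is arbitrary):

sudo lxc-create -n demo -t ubuntu   # create a container named demo from the ubuntu template
sudo lxc-start -n demo              # start it
sudo lxc-ls --fancy                 # list containers and their state
sudo lxc-stop -n demo               # stop it
sudo lxc-destroy -n demo            # delete it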
Warden
Warden used LXC at its initial stage; it was later replaced with a CloudFoundry implementation. Unlike LXC, Warden is not bound to Linux: it can provide isolation on any other system that supports it.
LMCTFY
Let Me Contain That For You, or LMCTFY, is the open-source version of Google's container stack, which provides Linux application containers.
Google engineers have been collaborating with Docker on libcontainer, porting the core lmctfy concepts and abstractions to libcontainer.
The project is no longer actively developed; in the future, the core of lmctfy will be replaced by libcontainer.
Docker
This is what we are going to discover in this book.
RKT
CoreOS started building a container runtime called rkt (pronounced "rocket").
CoreOS designed rkt following the original premise of containers that Docker introduced, but with more focus on:
Composable ecosystem
Security
A different image distribution philosophy
Openness
rkt is Pod-native which means that its basic unit of execution is a pod, linking together resources and user
applications in a self-contained environment.
Introduction To Docker
Docker is a containerization tool with a rich ecosystem that was conceived to help you develop, deploy and run any
application, anywhere.
Unlike a traditional virtual machine, a Docker container shares the resources of the host machine without needing an intermediary (a hypervisor), and therefore you don't need to install an operating system. A container holds the application and its dependencies, but works in an isolated and autonomous way.
In other words, instead of a hypervisor with a guest operating system on top, Docker uses its engine and containers
on top.
Most of us are used to virtual machines, so why are containers and Docker taking such an important place in today's infrastructures?
This table briefly explains the differences and the advantages of using Docker over a VM:

                 VM                      Docker
Size             Small CoreOS = 1.2 GB   A BusyBox container = 2.5 MB
Startup Time     Measured in minutes     An optimized Docker container runs in less than a second
Integration      Difficult               More open to integration with other tools
Dependency Hell  Frustration             Docker fixes this
Versioning       No                      Yes
Docker is a process isolation tool that used LXC (an operating-system-level virtualization method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel) until version 0.9.
The basic difference between LXC and VMs is that with LXC only one instance of the Linux kernel is running.
For curious readers: LXC was replaced by Docker's own libcontainer library, written in the Go programming language.
So, a Docker container isolates your application running on a host OS, and the latter can run many other containers. Using Docker and its ecosystem, you can easily manage a cluster of containers; stop, start and pause multiple applications; scale them; take snapshots of running containers; link multiple services running Docker; manage containers and clusters using APIs on top of them; automate tasks; create application watchdogs; and many other things that are complicated without containers.
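For a first taste of these operations, here is a minimal sketch (the container name web is arbitrary):

docker run -d --name web nginx   # run an Nginx container in the background
docker pause web                 # freeze all of its processes
docker unpause web               # resume them
docker stop web                  # stop the container gracefully
docker start web                 # start it again
docker commit web web-snapshot   # snapshot its filesystem as a new image
docker rm -f web                 # remove the container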
Using this book, you will learn how to use all of these features and more.
What Is The Relation Between The Host OS And Docker
In a simple phrase, the host OS and the container share the same Kernel.
If you are running Ubuntu as a host, a container is going to use the same kernel as the Ubuntu system, but you can use CentOS or any other OS image inside your container. That is why the main difference between a virtual machine and a Docker container is the absence of an intermediary layer between the kernel and the guest: Docker works directly with your host's kernel.
You are probably saying: "If Docker is using the host kernel, why should I install an OS within my container?"
You are right, in some cases you can use Docker's scratch image, which is an explicitly empty image, especially for
building images from scratch. This is useful for containers that contain only a single binary and whatever it requires,
such as the "hello-world" container that we are going to use in the next section.
So Docker is a process isolation environment, not an OS isolation environment (like virtual machines), and you can, as said, use a container without an OS. But imagine you want to run an Nginx or an Apache container: you can run the server's binary, but you will need access to the file system in order to configure nginx.conf, apache.conf, httpd.conf, etc., or the available/enabled sites configurations.
In this case, if you run a container without an OS, you will need to map folders from the host into the container, like the /etc directory (since configuration files live under /etc).
You can actually do that, but you will lose the change management feature that Docker containers offer: every change within the container's file system will also be mapped to the host's file system, and even if you map them to different folders, things can become complex in advanced production/development scenarios and environments.
Therefore, amongst other reasons, Docker containers running an OS are used for portability and change management.
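As an illustration of this kind of folder mapping, here is a hedged sketch with Nginx (the host path /opt/nginx/nginx.conf is just an example; any valid Nginx configuration file would do):

docker run -d --name proxy -p 80:80 \
  -v /opt/nginx/nginx.conf:/etc/nginx/nginx.conf:ro \
  nginx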
In the examples explained in this book, we often rely on official images that can be found on the official Docker Hub; we will also create some custom images.
What Does Docker Add To LXC Tools?
LXC owes its origin to the development of cgroups and namespaces in the Linux kernel. One of the most asked questions on the net about Docker is the difference between Docker and VMs, but also the difference between Docker and LXC.
This question was asked on Stack Overflow, and I am sharing the response of Solomon Hykes (the creator of Docker), published under the CC BY-SA 3.0 license:
Docker is not a replacement for LXC. LXC refers to capabilities of the Linux Kernel (specifically namespaces and
control groups) which allow sandboxing processes from one another, and controlling their resource allocations.
On top of this low-level foundation of Kernel features, Docker offers a high-level tool with several powerful
functionalities:
Portable deployment across machines
Docker defines a format for bundling an application and all its dependencies into a single object which can be
transferred to any docker-enabled machine, and executed there with the guarantee that the execution environment
exposed to the application will be the same. LXC implements process sandboxing, which is an important pre-
requisite for portable deployment, but that alone is not enough for portable deployment. If you sent me a copy of
your application installed in a custom LXC configuration, it would almost certainly not run on my machine the way
it does on yours, because it is tied to your machine's specific configuration: networking, storage, logging, distro, etc.
Docker defines an abstraction for these machine-specific settings, so that the exact same Docker container can run -
unchanged - on many different machines, with many different configurations.
Application-centric
Docker is optimized for the deployment of applications, as opposed to machines. This is reflected in its API, user
interface, design philosophy and documentation. By contrast, the LXC helper scripts focus on containers as
lightweight machines - basically servers that boot faster and need less ram. We think there's more to containers than
just that.
Automatic build
Docker includes a tool for developers to automatically assemble a container from their source code, with full control
over application dependencies, build tools, packaging etc. They are free to use make, Maven, Chef, Puppet,
SaltStack, Debian packages, RPMS, source tarballs, or any combination of the above, regardless of the
configuration of the machines.
Versioning
Docker includes git-like capabilities for tracking successive versions of a container, inspecting the diff between
versions, committing new versions, rolling back etc. The history also includes how a container was assembled and
by whom, so you get full traceability from the production server all the way back to the upstream developer. Docker
also implements incremental uploads and downloads, similar to git pull , so new versions of a container can be
transferred by only sending diffs.
Component re-use
Any container can be used as an "base image" to create more specialized components. This can be done manually or
as part of an automated build. For example you can prepare the ideal Python environment, and use it as a base for 10
different applications. Your ideal PostgreSQL setup can be re-used for all your future projects. And so on.
Sharing
Docker has access to a public registry (https://registry.hub.docker.com/) where thousands of people have uploaded
useful containers: anything from Redis, Couchdb, PostgreSQL to IRC bouncers to Rails app servers to Hadoop to
base images for various distros. The registry also includes an official "standard library" of useful containers
maintained by the Docker team. The registry itself is open-source, so anyone can deploy their own registry to store
and transfer private containers, for internal server deployments for example.
Tool ecosystem
Docker defines an API for automating and customizing the creation and deployment of containers. There are a huge
number of tools integrating with Docker to extend its capabilities. PaaS-like deployment (Dokku, Deis, Flynn),
multi-node orchestration (Maestro, Salt, Mesos, OpenStack Nova), management dashboards (Docker-UI, OpenStack
horizon, Shipyard), configuration management (chef, puppet), continuous integration (jenkins, strider, travis), etc.
Docker is rapidly establishing itself as the standard for container-based tooling.
Docker Use Cases
Docker has many use cases and advantages:
Versioning & Fast Deployment
The Docker registry (or Docker Hub) can be considered a version control system for a given application. Rollbacks and updates are easier this way.
Just like with GitHub, Bitbucket or any other Git system, you can use tags to tag your image versions. If you tag a container image for each application release, it becomes easier to deploy and to roll back to the n-1 release.
As you may already know, Git-like systems give you commit identifiers like 2.1-3-xxxxxx; these are not tags. You can also use your Git system to tag your code, but for deployment you will need to download these tags or their artifacts.
If your fellow developers are working on an application with thousands of package dependencies, like JavaScript apps (using package.json), a deployment may involve downloading or updating many small files, plus probably some new configurations. A single Docker image with your new code, already built and tested, together with its configurations, will be easier and faster to ship and deploy.
Tagging is done with the docker tag command; these tags are the basis for commits, and this is how Docker's versioning and tagging system works.
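A minimal sketch of this workflow (the repository eon01/myapp, the tag 1.0 and the container name my_container are examples):

docker tag myapp:latest eon01/myapp:1.0   # tag a local image for a given release
docker push eon01/myapp:1.0               # publish that release to the registry
docker history eon01/myapp:1.0            # inspect how the image was assembled
docker diff my_container                  # list filesystem changes inside a running container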
Distribution & Collaboration
If you would like to share images and containers, Docker offers this social feature, so that anyone can contribute to a public (or private) image.
Individuals and communities can collaborate and share images, and users can also vote for images. On Docker Hub, you can find trusted (official) and community images.
Some images have continuous build and security scan features to keep them up to date.
Multi Tenancy & High Availability
Using the right tools from the ecosystem, it is easier to run many instances of the same application on the same server with Docker than the "mainstream" way.
Using a proxy, service discovery and a scheduling tool, you can start a second server (or more) and load-balance your traffic between the cluster nodes where your containers are "living".
CI/CD
Docker is used in production systems, but it is also a tool to run the same application on a developer's laptop or server. A Docker image may move from development to QA to production without being changed. If you would like to be as close as possible to production, then Docker is a good solution.
Since it solves the problem of "it works on my machine", it is important to highlight this use case. Most problems in software development and operations are due to differences between development and production environments. If your R&D team uses the same image that the QA team will test against, and the same environment is pushed to live servers, it is certain that a great part of the (dev vs ops) problems will disappear.
There are many DevOps topologies in the software industry now and "container-centric" (or "container-based")
topology is one of them.
This topology makes Ops and Dev teams share more responsibilities in common, which is a DevOps approach to blur the boundaries between teams and encourage co-creation.
Isolation & The Dependency Hell
Dockerizing an application is also isolating it into a separate environment.
Imagine you have to run two APIs written in two different languages, or in the same language but with different versions.
You may need two incompatible versions of the same language, each API running on one of them, for example Python 2 and Python 3.
If the two apps are dockerized, you don't need to install anything on your host machine but Docker; each version will run in an isolated environment.
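A minimal sketch, using the official python images:

docker run --rm python:2 python --version   # runs Python 2.x in its own container
docker run --rm python:3 python --version   # runs Python 3.x in another, fully isolated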
Since I started running Docker in production, most of my apps have been dockerized. I stopped using the host system's package manager at that time; every new application or middleware is installed inside a container.
Docker simplifies system package management and eliminates the "dependency hell" through its isolation feature.
Using The Ecosystem
You can use Docker with multiple external tools, like configuration management tools, orchestration tools, file storage technologies, filesystem types, logging software, monitoring tools, self-healing tools, etc.
On the other hand, even with all the benefits of Docker, it is not always the best solution; there are always exceptions.
Chapter II - Installation & Configuration
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
In the Painless Docker book, we are going to use a Docker version greater than or equal to 1.12.
I used to use previous stable versions, like 1.11, but an important new feature, the Swarm Mode, was introduced in version 1.12: Swarm orchestration technology is now directly integrated into Docker, whereas before it was an add-on.
I am a GNU/Linux user, but for Windows and Mac users, Docker unveiled, with the same version, the first full desktop editions of the software for development on Mac and Windows machines.
There are many other interesting features, enhancements and simplifications in version 1.12 of Docker; you can find the whole list in the Docker GitHub repository.
If you are completely new to Docker, you will not understand all of the following new features yet, but you will be able to understand them as you go along with this book.
The most important new features in Docker 1.12 are about the Swarm Mode:
Built-in Virtual-IP based internal and ingress load-balancing using IPVS
Routing Mesh using ingress overlay network
New swarm command to manage swarms, with the init, join, join-token, leave and update subcommands
New service command to manage swarm-wide services, with the create, inspect, update, rm and ps subcommands
New node command to manage nodes, with the accept, promote, demote, inspect, update, ps, ls and rm subcommands
New stack and deploy commands to manage and deploy multi-service applications
Add support for local and global volume scopes (analogous to network scopes)
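To give you a first taste of these commands before we cover them in detail in Chapter XII, here is a minimal sketch of the Swarm Mode workflow (the service name web is arbitrary):

docker swarm init                                     # turn this engine into a single-node swarm manager
docker service create --name web --replicas 2 nginx   # create a swarm-wide service with two replicas
docker service ps web                                  # list the tasks backing the service
docker node ls                                         # list the nodes of the swarm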
Other important features were introduced in later versions of Docker, like the multi-stage build support in version v17.05.0-ce.
While writing this book, I used Ubuntu 16.04 server edition with a 64-bit architecture as my main operating system, but you will also see how to install Docker on other OSs, like Windows and macOS.
For users of other Linux distributions, things are not really different, except for the package manager (apt/aptitude), which you should replace with your own; we are going to explain the installation for other distributions like CentOS.
Requirements & Compatibility
Docker itself does not need many resources, so a little RAM is enough to install and run the Docker Engine. Running containers, however, depends on what you are running exactly: if you are running MySQL or a busy MongoDB inside a container, you will need more memory than for a small Node.js or Python application.
Docker requires a 64 bit Kernel.
For developers using Windows or Mac, you have the choice between Docker Toolbox and native Docker. Native Docker is certainly faster, but you still have the choice.
If you will use Docker Toolbox:
Mac users: Your Mac must be running OS X 10.8 "Mountain Lion" or newer to run Docker.
Windows users: Your machine must have a 64-bit operating system running Windows 7 or higher, and virtualization must be enabled.
If you prefer Docker for Mac as it is mentioned in the official Docker website:
Your Mac must be a 2010 or newer model, with Intel's hardware support for memory management unit (MMU) virtualization, i.e. Extended Page Tables (EPT)
You must be running OS X 10.10.3 Yosemite or newer
You must have at least 4GB of RAM
VirtualBox versions prior to 4.3.30 must NOT be installed (they are incompatible with Docker for Mac): uninstall the older version of VirtualBox and retry the install if you missed this.
And if you prefer Docker for Windows:
Your machine should have 64-bit Windows 10 Pro, Enterprise or Education (1511 November update, Build 10586 or later).
The Hyper-V package must be enabled; the Docker for Windows installer will enable it for you if needed.
Installing Docker On Linux
Docker is supported by all Linux distributions satisfying the requirements, but not by all of their versions, due to Docker's incompatibility with old kernel versions.
Kernels older than 3.10 do not support Docker; running Docker on them can cause data loss and other bugs.
Check your Kernel by typing:
uname -r
Docker recommends upgrading your packages (including a dist-upgrade) and having the latest kernel version on your servers before using it in production.
Ubuntu
For Ubuntu, only these versions are supported for running and managing containers:
Ubuntu Xenial 16.04 (LTS)
Ubuntu Wily 15.10
Ubuntu Trusty 14.04 (LTS)
Ubuntu Precise 12.04 (LTS)
Update your package manager, add the apt key and the Docker list, then type the update command (replace ubuntu-trusty with your version's codename, e.g. ubuntu-xenial for 16.04):
sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates
sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" | sudo tee -a /etc/apt/sources.list.d/docker.list
sudo apt-get update
Purge the old lxc-docker if you were using it before and install the new Docker Engine:
sudo apt-get purge lxc-docker
sudo apt-get install docker-engine
If you need to run Docker without root rights (i.e. with your actual user), run the following commands, then log out and log back in so that the group change takes effect:
sudo groupadd docker
sudo usermod -aG docker $USER
If everything is OK, running this command will create a container that prints a Hello World message, then exits without errors:
docker run hello-world
There is a good explanation about how Docker works in the output, if you have not noticed it, here it is:
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
CentOS
Docker runs only on CentOS 7.x. The same installation steps may apply to other EL7 distributions, but they are not supported by Docker.
Add the yum repo.
sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/7/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF
Install Docker:
sudo yum install docker-engine
Start its service:
sudo service docker start
Set the daemon to run at system boot:
sudo chkconfig docker on
Test the Hello World image:
docker run hello-world
If you see output similar to the following, then your installation is fine:
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
c04b14da8d14: Pull complete
Digest: sha256:0256e8a36e2070f7bf2d0b0763dbabdd67798512411de4cdcf9431a1feb60fd9
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free *Docker Hub* account:
https://hub.docker.com
For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/
Now, if you would like to create a Docker group and add your current user to it, in order to avoid running commands with sudo privileges:
sudo groupadd docker
sudo usermod -aG docker $USER
Verify your work by running the hello-world container without sudo.
Debian
Only:
Debian testing stretch (64-bit)
Debian 8.0 Jessie (64-bit)
Debian 7.7 Wheezy (64-bit) (backports required)
are supported.
We are going to use the installation steps for Wheezy. In order to install Docker on Jessie (8.0), change the backports and source.list entries to Jessie.
First of all, enable backports:
sudo su
echo "deb http://http.debian.net/debian wheezy-backports main"|tee -a /etc/apt/sources.list.d/backports.list
apt-get update
Purge other Docker versions if you have already used them:
apt-get purge "lxc-docker*"
apt-get purge "docker.io*"
and update your package manager:
apt-get update
Install apt-transport-https and ca-certificates
apt-get install apt-transport-https ca-certificates
Add the GPG key.
apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
Add the repository:
echo "deb https://apt.dockerproject.org/repo debian-wheezy main"|tee -a /etc/apt/sources.list.d/docker.list
apt-get update
And install Docker:
apt-get install docker-engine
Start the service
service docker start
Run the Hello World container in order to check if everything is good:
sudo docker run hello-world
You will have a similar output to the following, if Docker is installed without problems:
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
c04b14da8d14: Pull complete
Digest: sha256:0256e8a36e2070f7bf2d0b0763dbabdd67798512411de4cdcf9431a1feb60fd9
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free *Docker Hub* account:
https://hub.docker.com
For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/
Now, in order to use your current user (not the root user) to manage and run Docker, add the docker group if it does not already exist:
exit # Exit from sudo user
sudo groupadd docker
Add your preferred user to this group:
sudo gpasswd -a ${USER} docker
Restart the Docker daemon.
sudo service docker restart
Test the hello-world container to check that your current user has the right to execute Docker commands.
Docker Toolbox
A few months ago, installing Docker for my developers using macOS and Windows was a pain. The new Docker Toolbox has made things easier. Docker Toolbox is a quick and easy installer that will set up a full Docker environment. The installation includes Docker, Machine, Compose, Kitematic and VirtualBox.
Docker Toolbox can be downloaded from Docker's website.
Using this tool you will be able to work with:
docker-machine commands
docker commands
docker-compose commands
The Docker GUI (Kitematic)
a shell preconfigured for a Docker command-line environment
and Oracle VirtualBox
The installation is quite easy: if you would like a default installation, press "Next" to accept everything and then click on Install. If you are running Windows, make sure you allow the installer to make the necessary changes.
Once the installation is finished, in the application folder, click on "Docker Quickstart Terminal".
Mac users, type the following command, which will:
Create a machine called dev
Create a VirtualBox VM
Create SSH keys
Start the VirtualBox VM
Start the machine dev
Set the environment variables for machine dev

bash '/Applications/Docker Quickstart Terminal.app/Contents/Resources/Scripts/start.sh'

Windows users can follow the same instructions, since many commands are common between the two OSs.
Running the following command will show you how to connect Docker to this machine:
docker-machine env dev
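The env command prints a list of export statements. To point your Docker client at the dev machine, you would typically evaluate them in your current shell:

eval "$(docker-machine env dev)"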
Now for testing, use the Hello World container:
docker run hello-world
You will probably see this message:
Unable to find image 'hello-world:latest' locally
This is not an error; Docker is saying that the hello-world image is not on your local disk, so it will be pulled from Docker Hub.
latest: Pulling from library/hello-world
535020c3e8ad: Pull complete
af340544ed62: Pull complete
Digest: sha256:a68868bfe696c00866942e8f5ca39e3e31b79c1e50feaee4ce5e28df2f051d5c
Status: Downloaded newer image for hello-world:latest
Hello from Docker.
This message shows that your installation appears to be working correctly.
If you are using Windows, it is actually not very different.
Click on the "Docker Quickstart Terminal" icon, if your operating system displays a prompt to allow VirtualBox,
choose yes and a terminal will show on your screen.
To test if Docker is working, type:
docker run hello-world
You will see the following message:
Hello from Docker.
You may also notice the explanation of how Docker is working on your local machine.
To generate this message ("Hello World" message), Docker took the following steps:
- The *Docker Engine CLI* client contacted the *Docker Engine daemon*.
- The *Docker Engine daemon* pulled the *hello-
world* image from the *Docker Hub*. (Assuming it was not already locally available.)
- The *Docker Engine daemon* created a new container from that image which runs the executable that produces the output you are currently reading.
- The *Docker Engine daemon* streamed that output to the *Docker Engine CLI* client, which sent it to your terminal.
After the installation, you can also start using the GUI or the command line; click on the create button to create a hello-world container, just to make sure everything is OK.
Docker Toolbox is a very good tool for every developer, but you may need more performance for larger projects in your local development. Docker for Mac and Docker for Windows are native to each OS.
Docker For Mac
Use the following link to download the .dmg file and install the native Docker.
https://download.docker.com/mac/stable/Docker.dmg
To use native Docker, get back to the requirements section and make sure of your system configuration.
After the installation, drag and drop Docker.app into your Applications folder and start Docker from your applications list.
You will see a whale icon in your status bar; when you click on it, you will see a list of choices, and you can also click on About Docker to verify that you are using the right version.
If you prefer using the CLI, open your terminal and type:
docker --version
or
docker -v
If you installed Docker 1.12, you will see:
Docker version 1.12.0, build 8eab29e
If you go to the Docker.app preferences, you will find some settings; one of the most important is file sharing.
In many cases, containers running on your local machine use a file system mounted from a folder on your host. We will not need this for the moment, but remember later in this book that if you mount a local folder into a container, you should come back to this step and share the concerned files, directories, users or volumes on your local system with your containers.
Docker For Windows
Use the following link to download the .msi file and install the native Docker.
https://download.docker.com/win/stable/InstallDocker.msi
The same applies for Windows: to use native Docker, get back to the requirements section and make sure of your system configuration.
Double-click InstallDocker.msi and run the installer
Follow the installation wizard
Authorize Docker if you were asked for that by your system
Click Finish to start Docker
If everything went OK, you will get a popup with a success message.
Now open cmd.exe (or PowerShell) and type:
docker --version
or
docker version
Containers running on your local development environment may need, in many cases (that we will see in this book), to access your file system, folders, files or drives. This is the case when you mount a folder from your host file system inside a Docker container. We will see many examples of this type, so remember to come back here and make the right configuration if mounting a directory or a file is needed later in this book.
Docker Experimental Features
Even if Docker has a stable version that can be safely used in production environments, many features are still in development. You may be planning future projects using Docker, and in this case you will need to test some of these features.
I have been testing the Docker Swarm Mode since it was experimental; I needed to evaluate this feature in order to prepare the adequate architecture and servers, and to adapt development and integration workflows to the coming changes.
You may encounter some instability and bugs using the experimental installation packages, which is normal.
Docker Experimental Features For Mac And Windows
For native Docker running on both systems, to evaluate the experimental features, you need to download the beta
channel installation packages.
For Mac:
https://download.docker.com/mac/beta/Docker.dmg
For Windows:
https://download.docker.com/win/beta/InstallDocker.msi
Docker Experimental Features For Linux
Running the following command will install the experimental version of Docker:
You should have curl installed
curl -sSL https://experimental.docker.com/ | sh
Generally, curl | bash is not a good security practice, even if the transport is over HTTPS: content can be modified on the server.
You can download the script, read it and execute it:
wget https://experimental.docker.com/
Or you can get one of the following binaries, depending on your system architecture:
https://experimental.docker.com/builds/Linux/i386/docker-latest.tgz
https://experimental.docker.com/builds/Linux/x86_64/docker-latest.tgz
For the remainder of the installation:
tar -xvzf docker-latest.tgz
sudo mv docker/* /usr/bin/
sudo dockerd &
Removing Docker
Let's take Ubuntu as an example.
Purge the Docker Engine:
sudo apt-get purge docker-engine
sudo apt-get autoremove --purge docker-engine
sudo apt-get autoclean
This is enough in most cases, but to remove all of Docker's files, follow the next steps.
If you wish to remove all the images, containers and volumes:
sudo rm -rf /var/lib/docker
Then remove docker from apparmor.d:
sudo rm /etc/apparmor.d/docker
Then remove docker group:
sudo groupdel docker
You have now completely removed Docker.
Docker Hub
Docker Hub is a cloud registry service for Docker.
Docker allows you to package artifacts/code and configurations into a single image. These images can be reused by you, your colleagues or even your customers. If you would like to share your code, you will generally use a Git repository like GitHub or Bitbucket.
You can also run your own GitLab, which allows you to have your own private, on-premise Git repositories.
Things are very similar with Docker: you can use a cloud-based solution to share your images, like Docker Hub, or use your own hub (a private Docker registry).
Docker Hub is a public Docker repository, but if you want to use a cloud-based solution while keeping your images private, the paid version of Docker Hub allows you to have private repositories.
Docker Hub allows you to:
Access community, official and private image libraries
Have public or paid private image repositories, to which you can push your images and from which you can pull them to your servers
Create and build new images with different tags when the source code inside your container changes
Create and configure webhooks and trigger actions after a successful push to a repository
Create workgroups and manage access to your private images
Integrate with GitHub and Bitbucket
Basically, Docker Hub can be a component of your dev-test pipeline automation.
In order to use Docker Hub, go to the following link and create an account:
https://hub.docker.com/
If you would like to test whether your account is enabled, type:
docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head
over to https://hub.docker.com to create one.
Now, go to the Docker Hub website and create a public repository. We will see how to send a running container as an image to Docker Hub, and for this purpose we are going to use a sample app generally used by Docker for demos, called vote (which you can also find in Docker's official GitHub repository).
The vote application is a Python webapp that lets you vote between two options. It uses a Redis queue to collect new votes, a .NET worker that consumes votes and stores them in a Postgres database backed by a Docker volume, and a Node.js webapp that shows the results of the voting in real time.
I assume that you have created a working account on Docker Hub, typed the login command and entered the right password.
If you are a Docker beginner, you may not understand all of the next commands, but the goal of this section is just to demonstrate how a Docker registry works (in this case, the registry used is a cloud-based one built by Docker and, as said, it is called Docker Hub).
When you type the following command, Docker will first check whether it has the image locally; otherwise, it will look for it on Docker Hub:
docker run -d -it -p 80:80 instavote/vote
You can find the image here:
https://hub.docker.com/r/instavote/vote/
Now type this command to show the running containers; for Docker, this is the equivalent of the ps command on Linux systems:
docker ps
You can see here that the nauseous_albattani container (a name given automatically by Docker) is running the vote application pulled from the instavote/vote repository.
CONTAINER ID   IMAGE            COMMAND                  CREATED         STATUS         PORTS                NAMES
136422f45b02   instavote/vote   "gunicorn app:app -b "   8 minutes ago   Up 8 minutes   0.0.0.0:80->80/tcp   nauseous_albattani
The container id is 136422f45b02 and the application is reachable via http://0.0.0.0:80.
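You can quickly verify that the application answers (assuming curl is installed on your machine):

curl -I http://localhost:80   # expect an HTTP response header from the vote webapp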
Just like with Git, we are going to commit and push the image to our Docker Hub repository. There is no need to create a new repository first; the commit/push can be used in a lazy mode, and it will create the repository for you.
Commit:
docker commit -m "Painless Docker first commit" -a "Aymen El Amri" 136422f45b02 eon01/painlessdocker.com_voteapp:v1
sha256:bf2a7905742d85cca806eefa8618a6f09a00c3802b6f918cb965b22a94e7578a
And push:
docker push eon01/painlessdocker.com_voteapp:v1
The push refers to a repository [docker.io/eon01/painlessdocker.com_voteapp]
1f31ef805ed1: Mounted from eon01/painless_docker_vote_app
3c58cbbfa0a8: Mounted from eon01/painless_docker_vote_app
02e23fb0be8d: Mounted from eon01/painless_docker_vote_app
f485a8fdd8bd: Mounted from eon01/painless_docker_vote_app
1f1dc3de0e7d: Mounted from eon01/painless_docker_vote_app
797c28e44049: Mounted from eon01/painless_docker_vote_app
77f08abee8bf: Mounted from eon01/painless_docker_vote_app
v1: digest: sha256:658750e57d51df53b24bf0f5a7bc6d52e3b03ce710a312362b99b530442a089f size: 1781
Replace eon01 with your username.
Notice that a new repository is automatically added to my Docker Hub dashboard.
Now you can pull the same image with the latest tag:
docker pull eon01/painlessdocker.com_voteapp
Or with a specific tag:
docker pull eon01/painlessdocker.com_voteapp:v1
In our case, v1 is the latest version, so the two commands above will pull the same image to your local machine.
Docker Registry
Docker Registry is a scalable server-side application conceived to be an on-premise Docker Hub. Just like Docker Hub, it helps you push, pull and distribute your images.
The software powering Docker Registry is open source, under the Apache license. Docker also has a commercial offering called Docker Trusted Registry, which adds features on top of the open-source registry.
Docker Registry itself can be run using Docker. A Docker image for the Docker Registry is available here:
https://hub.docker.com/_/registry/
It is easy to create a registry: just pull and run the image like this:
docker run -d -p 5000:5000 --name registry registry:2.5.0
Let's test it: we will pull an image from Docker Hub, then tag and push it to our own registry.
docker pull ubuntu
docker tag ubuntu localhost:5000/myfirstimage
docker push localhost:5000/myfirstimage
docker pull localhost:5000/myfirstimage
Deploying Docker Registry On Amazon Web Services
You need to have:
An Amazon Web Services account
The right IAM privileges
Your aws CLI configured
Type:
aws configure
Type your credentials, choose your region and your preferred output format:
AWS Access Key ID [None]: ******************
AWS Secret Access Key [None]: ***********************
Default region name [None]: eu-west-1
Default output format [None]: json
Create an EBS disk (Elastic Block Store), specify the region you are using and the availability zone.
aws ec2 create-volume --size 80 --region eu-west-1 --availability-zone eu-west-1a --volume-type standard
You should have a similar output to the following one:
{
"AvailabilityZone": "eu-west-1a",
"Encrypted": false,
"VolumeType": "standard",
"VolumeId": "vol-xxxxxx",
"State": "creating",
"SnapshotId": "",
"CreateTime": "2016-10-14T15:29:35.400Z",
"Size": 80
}
Keep the output, because we are going to use the volume id later.
Our choice for the volume type was 'standard'; choose your preferred volume type, which is mainly a question of IOPS.
The following table may help:

Type       IOPS                      Use Case
Magnetic   Up to 100 IOPS/volume     Little access
GP         Up to 3,000 IOPS/volume   Larger access needs, suitable for the majority of classic cases
PIOPS      Up to 4,000 IOPS/volume   High speed access
Start an EC2 instance:
aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t2.medium \
--key-name MyKeyPair --security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx
Replace the image id, instance type, key name, security group ids and subnet id with your own values.
On the output look for the instance id because we are going to use it.
{
"OwnerId": "xxxxxxxx",
"ReservationId": "r-xxxxxxx",
"Groups": [
{
[..]
}
],
"Instances": [
{
"InstanceId": "i-5203422c",
[..]
}
]
}
Attach the volume to the instance, using the volume id and the instance id from the previous outputs:
aws ec2 attach-volume --volume-id vol-xxxxxxxxxxx --instance-id i-xxxxxxxx --device /dev/sdf
Now that the volume is attached, you can list the block devices in the EC2 instance with lsblk and you will see the
newly attached volume (it will not show up in df -kh until it has a filesystem and is mounted).
In this example, let's say the attached EBS volume has the device name /dev/xvdf . Create a new filesystem on the
volume and a folder to mount it on:
sudo mkfs -t ext4 /dev/xvdf
mkdir /data
Make sure you use the right device name for the newly attached volume.
Now open the fstab configuration file /etc/fstab and add:
/dev/xvdf /data ext4 defaults 1 1
Now mount the volume by typing:
mount -a
You should have Docker installed in order to run a private Docker registry.
The next step is running the registry:
docker run -d -p 80:5000 --restart=always -v /data:/var/lib/registry registry:2
If you type docker ps , you should see the registry running:
CONTAINER ID   IMAGE        COMMAND                  CREATED        STATUS             PORTS
bb6201f63cc5   registry:2   "/entrypoint.sh /etc/"   21 hours ago   Up About an hour   0.0.0.0:80->5000/tcp
Now you should create an ELB, but first create its security group (exposing port 443).
Create the ELB using the AWS CLI or the AWS Console and redirect its traffic to port 80 of the EC2 instance. Note
the ELB DNS name, since we are going to use it to push and pull images.
Opening port 443 is needed because Docker clients expect to reach a registry over HTTPS; that's why we used an
ELB, since it has integrated certificate management and SSL termination.
It is also useful for building a highly available system.
Now let's test it by pushing an image:
docker pull hello-world
docker tag hello-world load_balancer_dns/hello-world:1
docker push load_balancer_dns/hello-world:1
docker pull load_balancer_dns/hello-world:1
If you don't want to use ELB, you should bring your own certificates and run:
docker run -d -p 5000:5000 --restart=always --name registry \
-v `pwd`/certs:/certs \
-v /data:/var/lib/registry \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
registry:2
Another option is to use AWS S3 as the storage backend:
docker run \
-e SETTINGS_FLAVOR=s3 \
-e AWS_BUCKET=my_bucket \
-e STORAGE_PATH=/data \
-e AWS_REGION="eu-west-1" \
-e AWS_KEY=*********** \
-e AWS_SECRET=*********** \
-e SEARCH_BACKEND=sqlalchemy \
-p 80:5000 \
registry
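Note that the environment variables above ( SETTINGS_FLAVOR , AWS_BUCKET .. ) belong to the legacy v1 registry image. If you are running registry:2 instead, the S3 backend is configured through REGISTRY_STORAGE_* variables; a minimal sketch, where the bucket name and the credentials are placeholders:
docker run -d -p 80:5000 \
-e REGISTRY_STORAGE=s3 \
-e REGISTRY_STORAGE_S3_REGION=eu-west-1 \
-e REGISTRY_STORAGE_S3_BUCKET=my_bucket \
-e REGISTRY_STORAGE_S3_ACCESSKEY=*********** \
-e REGISTRY_STORAGE_S3_SECRETKEY=*********** \
registry:2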
In both cases, don't forget to add an IAM policy that allows the Docker Registry to read and write your
images to S3.
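As a sketch, a minimal IAM policy granting the registry read/write access to the bucket could look like the following ( my_bucket is a placeholder):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:ListBucket", "s3:GetBucketLocation"],
"Resource": "arn:aws:s3:::my_bucket"
},
{
"Effect": "Allow",
"Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
"Resource": "arn:aws:s3:::my_bucket/*"
}
]
}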
Deploying Docker Registry On Azure
On Azure, we are going to deploy the same Docker Registry, this time backed by the Azure Storage service.
You will also need a configured Azure CLI.
We need to create a storage account using the Azure CLI:
azure storage account create -l "North Europe" <storage_account_name>
Replace <storage_account_name> with your own value.
Now we need to list the storage account keys to use one of them later:
azure storage account keys list <storage_account_name>
Run:
docker run -d -p 80:5000 \
-e REGISTRY_STORAGE=azure \
-e REGISTRY_STORAGE_AZURE_ACCOUNTNAME="<storage_account_name>" \
-e REGISTRY_STORAGE_AZURE_ACCOUNTKEY="<storage_key>" \
-e REGISTRY_STORAGE_AZURE_CONTAINER="registry" \
--name=registry \
registry:2
If the port 80 is closed on your Azure virtual machine, you should open it:
azure vm endpoint create <machine-name> 80 80
Configuring security for the Docker Registry is not covered in this part.
Docker Store
Docker Store is a Docker Inc. product designed to provide a scalable self-service system for ISVs to publish
and distribute trusted and enterprise-ready content.
It provides a publishing process that includes:
Security scanning
Component inventory
Open-source license usage
Image construction guidelines
In other words, it is an official marketplace, with publishing workflows, where you can find free
and commercial images.
Chapter III - Basic Concepts
        o   ^__^
         o  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Hello World, Hello Docker
Through the following chapters, you will learn how to manipulate Docker containers (create, run, delete, update,
scale, etc.).
But to ensure a better understanding of some concepts, we need to learn at least how to run a Hello World container.
We have already seen this, but as a reminder, and to be sure that your installation is working, we will run the Hello
World container again:
docker run --rm -it hello-world
You can see a brief explanation of how Docker has created the container on the command output:
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
This gives a good explanation of how Docker created the container, and it would be a waste of energy to re-explain
it in my own words, albeit Docker uses more energy to do the same task :-)
General Information About Docker
Docker is a very active project and its code changes frequently. To understand many concepts about this
technology, I have been following the project on GitHub and browsing its issues when I had problems, especially
when I found bugs.
Therefore, it is important to know the version you are using on your production servers. docker -v will give you the
version and the build number you are using.
Example:
Docker version 1.12.2, build bb80604
You can get other general information about the server/client version, the architecture, the Go version, etc. Use
docker version to get this information. Example:
Client:
Version: 1.12.2
API version: 1.24
Go version: go1.6.3
Git commit: bb80604
Built: Tue Oct 11 18:19:35 2016
OS/Arch: linux/amd64
Server:
Version: 1.12.2
API version: 1.24
Go version: go1.6.3
Git commit: bb80604
Built: Tue Oct 11 18:19:35 2016
OS/Arch: linux/amd64
Docker Help
You can view the Docker help using the command line:
docker --help
You will get a list of options like:
--config=~/.docker Location of client config files
-D, --debug Enable debug mode
-H, --host=[] Daemon socket(s) to connect to
-h, --help Print usage
-l, --log-level=info Set the logging level
--tls Use TLS; implied by --tlsverify
--tlscacert=~/.docker/ca.pem Trust certs signed only by this CA
--tlscert=~/.docker/cert.pem Path to TLS certificate file
--tlskey=~/.docker/key.pem Path to TLS key file
--tlsverify Use TLS and verify the remote
-v, --version Print version information and quit
commands like:
attach Attach to a running container
build Build an image from a Dockerfile
commit Create a new image from a container's changes
cp Copy files/folders between a container and the local filesystem
create Create a new container
diff Inspect changes on a container's filesystem
If you need more help about a specific command like cp or rmi, you need to type:
docker cp --help
docker rmi --help
In some cases, you may even get a third level of help, like:
docker swarm init --help
Docker Events
To start this section, let's run a MariaDB container and list Docker events. For this manipulation, you can use
terminator to split your screen into two panes and watch the events output while you type the following
command:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /data/db:/var/lib/mysql -d mariadb
Unable to find image 'mariadb:latest' locally
latest: Pulling from library/mariadb
386a066cd84a: Already exists
827c8d62b332: Pull complete
de135f87677c: Pull complete
05822f26ca6e: Pull complete
ad65f56a251e: Pull complete
d71752ae05f3: Pull complete
87cb39e409d0: Pull complete
8e300615ba09: Pull complete
411bb8b40c58: Pull complete
f38e00663fa6: Pull complete
fb7471e9a58d: Pull complete
2d1b7d9d1b69: Pull complete
Digest: sha256:6..c
Status: Downloaded newer image for mariadb:latest
0..7
The events launched by the last command are:
docker events
2016-12-09T00:54:51.827303500+01:00 image pull mariadb:latest (name=mariadb)
2016-12-09T00:54:52.000498892+01:00 container create 0..7 (image=mariadb, name=mariadb)
2016-12-09T00:54:52.137792817+01:00 network connect 8..d (container=0..7, name=bridge, type=bridge)
2016-12-09T00:54:52.550648364+01:00 container start 0..7 (image=mariadb, name=mariadb)
Explicitly:
image pull : The image is pulled from the public repository by its identifier mariadb .
container create : The container is created from the pulled image and given the Docker identifier
0..7 . At this stage the container is not running yet.
network connect : The container is attached to a network called bridge , having the identifier 8..d .
At this stage the container is still not running.
container start : The container is now running on your host system, with the same identifier
0..7 .
This is the full list of Docker events (44) that can be reported by containers, images, plugins, volumes, networks and
the Docker daemon:
Containers:
- attach
- commit
- copy
- create
- destroy
- detach
- die
- exec_create
- exec_detach
- exec_start
- export
- health_status
- kill
- oom
- pause
- rename
- resize
- restart
- start
- stop
- top
- unpause
- update
Images:
- delete
- import
- load
- pull
- push
- save
- tag
- untag
Plugins:
- install
- enable
- disable
- remove
Volumes:
- create
- mount
- unmount
- destroy
Networks:
- create
- connect
- disconnect
- destroy
Daemon:
- reload
Using Docker API To List Events
There is a chapter about the Docker API that will help you discover it in more detail, but there is no harm in starting
to use it now.
You can, in fact, use it to see Docker events while manipulating containers, images, networks, etc.
Install curl and type curl --unix-socket /var/run/docker.sock http:/events , then open another terminal window (or a new
tab) and type docker pull mariadb , then docker rmi -f mariadb to remove the pulled image, then docker pull mariadb to
pull it again.
These are the 3 events reported by the 3 commands typed above:
Action = Pulling an image that already exists on the host:
{
"status":"pull",
"id":"mariadb:latest",
"Type":"image",
"Action":"pull",
"Actor": {
"ID":"mariadb:latest",
"Attributes": {
"name":"mariadb"
}
},
"time":1481381043,
"timeNano":1481381043157632493
}
Action = Removing it:
{
"status":"untag",
"id":"sha256:0..c",
"Type":"image",
"Action":"untag",
"Actor" {
"ID":"sha256:0..c",
"Attributes" {
"name":"sha256:0..c"
}
},
"time":1481381060,
"timeNano":1481381060026443422
}
Action = Pulling the same image again from a remote repository:
{
"status":"pull",
"id":"mariadb:latest",
"Type":"image",
"Action":"pull",
"Actor":{
"ID":"mariadb:latest",
"Attributes":{
"name":"mariadb"
}
},
"time":1481381069,
"timeNano":1481381069629194420
}
Each event has a status (pull, untag for removing the image, etc.), a resource identifier (id) with its Type (image,
container, network ..) and other information like Actor, time and timeNano.
You can use DoMonit (a Docker API wrapper) that I created for this book to discover Docker API.
Docker API is detailed in a separate chapter.
To test DoMonit, you can create a folder and install a Python virtual environment:
virtualenv DoMonit/
New python executable in /home/eon01/DoMonit/bin/python
Installing setuptools, pip, wheel...done.
cd DoMonit
Clone the repository:
git clone https://github.com/eon01/DoMonit.git
Cloning into 'DoMonit'...
remote: Counting objects: 237, done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 237 (delta 9), reused 0 (delta 0), pack-reused 218
Receiving objects: 100% (237/237), 106.14 KiB | 113.00 KiB/s, done.
Resolving deltas: 100% (128/128), done.
Checking connectivity... done.
List the files inside the created folder:
ls -l
total 24
drwxr-xr-x 2 eon01 sudo 4096 Dec 11 14:46 bin
drwxr-xr-x 6 eon01 sudo 4096 Dec 11 14:46 DoMonit
drwxr-xr-x 2 eon01 sudo 4096 Dec 11 14:45 include
drwxr-xr-x 3 eon01 sudo 4096 Dec 11 14:45 lib
drwxr-xr-x 2 eon01 sudo 4096 Dec 11 14:45 local
-rw-r--r-- 1 eon01 sudo 60 Dec 11 14:46 pip-selfcheck.json
Activate the execution environment:
. bin/activate
Install the requirements:
pip install -r DoMonit/requirements.txt
Collecting pathlib==1.0.1 (from -r DoMonit/requirements.txt (line 1))
/home/eon01/DoMonit/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform.
This may cause the server to present an incorrect TLS certificate, which can cause validation failures.
You can upgrade to a newer version of Python to solve this.
For more information, see https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
SNIMissingWarning
/home/eon01/DoMonit/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
InsecurePlatformWarning:
A true SSLContext object is not available.
This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail.
You can upgrade to a newer version of Python to solve this.
For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Collecting PyYAML==3.11 (from -r DoMonit/requirements.txt (line 2))
Collecting requests==2.10.0 (from -r DoMonit/requirements.txt (line 3))
Using cached requests-2.10.0-py2.py3-none-any.whl
Collecting requests-unixsocket==0.1.5 (from -r DoMonit/requirements.txt (line 4))
Collecting simplejson==3.8.2 (from -r DoMonit/requirements.txt (line 5))
Collecting urllib3==1.16 (from -r DoMonit/requirements.txt (line 6))
Using cached urllib3-1.16-py2.py3-none-any.whl
Installing collected packages: pathlib, PyYAML, requests, urllib3, requests-unixsocket, simplejson
Successfully installed PyYAML-3.11 pathlib-1.0.1 requests-2.10.0 requests-unixsocket-0.1.5 simplejson-3.8.2 urllib3-1.16
and start streaming the events using:
python DoMonit/events_test.py
On another terminal, create a network and notice the output of the executed program:
docker network create test1
6d517f8251736446874e14bafeb08c76cdc6d8219537b1ed1af64984bb3590b2
The output should be something like:
{
"Type": "network",
"Action": "create",
"Actor": {
"ID": "6d517f8251736446874e14bafeb08c76cdc6d8219537b1ed1af64984bb3590b2",
"Attributes": {
"name": "test1",
"type": "bridge"
}
},
"time": 1481464270,
"timeNano": 1481464270716590446
}
This program called the Docker Events API:
curl --unix-socket /var/run/docker.sock http:/events
Let's get back to the docker events command, since we haven't yet seen the options used to filter the streamed events.
Docker Events supports the following filters (using, in most cases, the name or the id of the resource):
container=<name/id>
event=<action>
image=<tag/id>
plugin=<name/id>
label=<key> / label=<key>=<value>
type=<container/image/volume/network/daemon>
volume=<name/id>
network=<name/id>
daemon=<name/id>
Examples:
docker events --filter 'event=start'
docker events --filter 'image=mariadb'
docker events --filter 'container=781567cd25632'
docker events --filter 'container=781567cd25632' --filter 'event=stop'
docker events --filter 'type=volume'
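Since Docker 1.10, you can also control the output format of docker events with the --format flag and Go templates; for instance, to print one compact JSON object per event, or a short custom line using documented fields:
docker events --format '{{json .}}'
docker events --filter 'type=container' --format 'Type={{.Type}} Status={{.Status}} ID={{.ID}}'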
In the next sections we are going to see what each Docker component (volumes, containers ..etc) reports as an event.
Monitoring A Container Using Docker Events
Note that we are still using the MariaDB container, which we can create like this:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /data/db:/var/lib/mysql -d mariadb
If you haven't created the container yet, do it.
Based on the examples of Docker events given above, we are going to create a small monitoring script that sends us an
email when the MariaDB container reports a stop event.
We are going to listen to a file in /tmp and send an email as soon as we read a line.
tail -f /tmp/events | while read line; do echo "$line" | mail -s subject "amri.aymen@gmail.com"; done
Yes, this is my real email address if you would like to stay in touch !
Using the Docker Events command, we can redirect all the logs to a file (in this case we will use /tmp/events ).
docker events --filter 'event=stop' --filter 'container=mariadb' > /tmp/events
If you have already created the container, you can test this simple monitoring system using :
docker stop mariadb
docker start mariadb
This is what you should get by email:
{
"Type":"network",
"Action":"connect",
"Actor":{
"ID":"7b64910a4b7b6b2b90245db1c7a1d8330fb1d0db9237da8c64378c0ad7745e9e",
"Attributes":{
"container":"9..d",
"name":"bridge",
"type":"bridge"
}
},
"time":1481466943,
"timeNano":1481466943153467120
}{
"Type":"network",
"Action":"connect",
"Actor":{
"ID":"7b64910a4b7b6b2b90245db1c7a1d8330fb1d0db9237da8c64378c0ad7745e9e",
"Attributes":{
"container":"9..d",
"name":"bridge",
"type":"bridge"
}
},
"time":1481466943,
"timeNano":1481466943153467120
}{
"status":"start",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"start",
"Actor":{
"ID":"9..d",
"Attributes":{
"image":"mariadb",
"name":"mariadb"
}
},
"time":1481466943,
"timeNano":1481466943266967807
}{
"s TAG IMAGE ID CREATED SIZE
hello-world tatus":"start",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"start",
"Actor":{
"ID":"9..d",
"Attributes":{
"image":"mariadb",
"name":"mariadb"
}
},
"time":1481466943,
"timeNano":1481466943266967807
}{
"status":"kill",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"kill",
"Actor":{
"ID":"9..d",
"Attributes":{
"image":"mariadb",
"name":"mariadb",
"signal":"15"
}
},
"time":1481466945,
"timeNano":1481466945523301735
}{
"status":"kill",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"kill",
"Actor":{
"ID":"9..d",
"Attributes":{
"image":"mariadb",
"name":"mariadb",
"signal":"15"
}
},
"time":1481466945,
"timeNano":1481466945523301735
}{
"status":"die",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"die",
"Actor":{
"ID":"9..d",
"Attributes":{
"exitCode":"0",
"image":"mariadb",
"name":"mariadb"
}
},
"time":1481466948,
"timeNano":1481466948241703985
}{
"status":"die",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"die",
"Actor":{
"ID":"9..d",
"Attributes":{
"exitCode":"0",
"image":"mariadb",
"name":"mariadb"
}
},
"time":1481466948,
"timeNano":1481466948241703985
}{
"Type":"network",
"Action":"disconnect",
"Actor":{
"ID":"7b64910a4b7b6b2b90245db1c7a1d8330fb1d0db9237da8c64378c0ad7745e9e",
"Attributes":{
"container":"9..d",
"name":"bridge",
"type":"bridge"
}
},
"time":1481466948,
"timeNano":1481466948357530397
}{
"Type":"network",
"Action":"disconnect",
"Actor":{
"ID":"7b64910a4b7b6b2b90245db1c7a1d8330fb1d0db9237da8c64378c0ad7745e9e",
"Attributes":{
"container":"9..d",
"name":"bridge",
"type":"bridge"
}
},
"time":1481466948,
"timeNano":1481466948357530397
}{
"status":"stop",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"stop",
"Actor":{
"ID":"9..d",
"Attributes":{
"image":"mariadb",
"name":"mariadb"
}
},
"time":1481466948,
"timeNano":1481466948398153788
}{
"status":"stop",
"id":"9..d",
"from":"mariadb",
"Type":"container",
"Action":"stop",
"Actor":{
"ID":"9..d",
"Attributes":{
"image":"mariadb",
"name":"mariadb"
}
},
"time":1481466948,
"timeNano":1481466948398153788
}
This is a simple example for sure! But it could give you ideas for building a home-made event
alerting/monitoring system using simple Bash commands.
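As a variation, you can skip the intermediate file entirely and pipe docker events straight into the mail loop; the filters are the same as above:
docker events --filter 'event=stop' --filter 'container=mariadb' \
| while read line; do echo "$line" | mail -s "mariadb stopped" amri.aymen@gmail.com; done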
Docker Images
If you type docker images , you will see that you have the hello-world image, even if its container is not running:
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest c54a2cc56cbb 6 months ago 1.848 kB
An image has a tag (latest), an id (c54a2cc56cbb), a creation date (6 months ago) and a size (1.848 kB).
Images are generally stored in a registry, unless it is an image from scratch that you created using the tar method and
didn't push to any public or private registries.
Registries are to Docker images what git is to code.
Docker images report the following Docker events:
delete,
import,
load,
pull,
push,
save,
tag,
untag
Docker Containers
Docker is a containerization software that solves many problems in modern IT, as mentioned in the introduction
of this book.
A container is basically an image that is running on your host OS. The important thing to remember is that there is a
big difference between containers and VMs, and that's why Docker is so darn popular:
Containers share the host operating system's kernel, while VMs emulate the hardware.
The command to create a container is docker create , followed by options, the image name, the command to execute at
the container creation and its arguments.
Other commands are used to start, pause, stop, run and delete containers, and we are going to see these commands
and more in another chapter.
Keep in mind that a container has a life cycle with different phases that correspond to the following events:
attach
commit
copy
create
destroy
detach
die
exec_create
exec_detach
exec_start
export
health_status
kill
oom
pause
rename
resize
restart
start
stop
top
unpause
update
Docker Volumes
We are still using the MariaDB example:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /data/db:/var/lib/mysql -d mariadb
In the docker run command we are using to create a dockerized database, the -v flag mounts a
volume. The mounted path inside the container is /var/lib/mysql , which is the MySQL data directory
where databases and other MySQL data live.
Without logging into the container, you can see the data directory mapped to the host in the /data/db directory:
ls -l /data/db/
total 110632
-rw-rw---- 1 999 999 16384 Dec 11 21:27 aria_log.00000001
-rw-rw---- 1 999 999 52 Dec 11 21:27 aria_log_control
-rw-rw---- 1 999 999 12582912 Dec 11 21:27 ibdata1
-rw-rw---- 1 999 999 50331648 Dec 11 21:27 ib_logfile0
-rw-rw---- 1 999 999 50331648 Dec 11 21:27 ib_logfile1
drwx------ 2 999 999 4096 Dec 11 21:27 mysql
drwx------ 2 999 999 4096 Dec 11 21:27 performance_schema
You can create the same volume without mapping it to the host filesystem:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /var/lib/mysql -d mariadb
This way, you can't access the files of your MariaDB databases from the host machine.
Nope! In reality, you can. Docker volumes can be found in:
/var/lib/docker/volumes/
You can see a list of volume identifiers:
ls -lrth /var/lib/docker/volumes/
3..e
3..7
5..0
b..2
b..7
metadata.db
One of the above identifiers is the volume that was created using the command:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /var/lib/mysql -d mariadb
Docker has a helpful command to inspect a container, called docker inspect .
We are going to see it in detail later, but we will use it now to identify the Docker volume attached to our
running container.
docker inspect -f '{{ (index .Mounts 0).Source }}' mariadb
It should output one of the ids above:
/var/lib/docker/volumes/b..2/_data
The docker inspect command gives you detailed information about a container (or an image). It generally returns
a JSON document containing all of the information, but using the -f or --format flag, you can filter on an attribute.
docker inspect -f '{{ (index .Mounts 0).Source }}' mariadb shows the Source field of the first element of the list
referenced by the name Mounts.
This is the part of the json output containing the information about the mounts:
"Mounts": [
{
"Name": "b..2",
"Source": "/var/lib/docker/volumes/b..2/_data",
"Destination": "/var/lib/mysql",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
}
],
Now you can understand why the format '{{ (index .Mounts 0).Source }}' shows
/var/lib/docker/volumes/b..2/_data .
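The same --format mechanism works for any attribute of the inspect JSON; for example, to get the IP address of the container on the default bridge network:
docker inspect -f '{{ .NetworkSettings.IPAddress }}' mariadb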
Keep in mind that the host directory /var/lib/docker/volumes/b..2/_data depends on the host; that's why declaring
a volume in a Dockerfile does not let you choose where it is mounted on the host filesystem.
Data Volumes
Data volumes are different from the previous volumes created to host the MariaDB databases. They are separate
containers dedicated to storing data; they are persistent, they can be shared between containers, and they can be used
as a backup, a restore point, or for data migration.
For the same container (MariaDB), we are going to create a volume in a separate container:
docker run --name mariadb_storage -v /var/lib/mysql -it -d busybox
Now let's run the MariaDB container without creating a new volume, using the previously created data
container instead:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password --volumes-from mariadb_storage -d mariadb
Notice the usage of --volumes-from mariadb_storage that will attach the data container as a volume to the MariaDB
container.
In the same way, we can inspect the data container:
docker inspect -f '{{ (index .Mounts 0).Source }}' mariadb_storage
Say we have this as an output:
/var/lib/docker/volumes/f..9/_data
This means that the latter directory contains all of our database files. You can check it by typing:
ls -l /var/lib/docker/volumes/f..9/_data
total 110656
-rw-rw---- 1 999 999 16384 Dec 11 22:23 aria_log.00000001
-rw-rw---- 1 999 999 52 Dec 11 22:23 aria_log_control
-rw-rw---- 1 999 999 12582912 Dec 11 22:23 ibdata1
-rw-rw---- 1 999 999 50331648 Dec 11 22:23 ib_logfile0
-rw-rw---- 1 999 999 50331648 Dec 11 22:23 ib_logfile1
-rw-rw---- 1 999 999 0 Dec 11 22:23 multi-master.info
drwx------ 2 999 999 4096 Dec 11 22:23 mysql
drwx------ 2 999 999 4096 Dec 11 22:23 performance_schema
-rw-rw---- 1 999 999 24576 Dec 11 22:23 tc.log
Now that we have a separate volume container, we can create 3 or more MariaDB containers using the same
volume:
docker run \
--name mariadb1 \
-e MYSQL_ROOT_PASSWORD=password \
--volumes-from mariadb_storage \
-d mariadb \
&& \
docker run \
--name mariadb2 \
-e MYSQL_ROOT_PASSWORD=password \
--volumes-from mariadb_storage \
-d mariadb \
&& \
docker run \
--name mariadb3 \
-e MYSQL_ROOT_PASSWORD=password \
--volumes-from mariadb_storage \
-d mariadb
f0ff0622ea755c266e069a42a2a627380042c8f1a3464521dd0fb16ef3257534
0c10ea5ec23a3057de30dc4f97e485349ac7b0a335a98e1c3deece56f3594964
128e66ffa823e3156fb28b48f45e3234d235949508780f1c1cef8c38ed5bfb0b
Now, you can see all of our 4 containers:
docker ps
CONTAINER ID IMAGE COMMAND PORTS NAMES
049824509c18 mariadb "docker-entrypoint.sh" 3306/tcp mariadb1
128e66ffa823 mariadb "docker-entrypoint.sh" 3306/tcp mariadb3
0c10ea5ec23a mariadb "docker-entrypoint.sh" 3306/tcp mariadb2
03808f86ba9b busybox "sh" mariadb_storage
Sharing a data volume between three MariaDB Docker instances in this example is just for testing and
learning purposes; it is not production-ready.
We can indeed manage to have multiple containers sharing the same volumes. However, the fact that it is possible
does not mean that it is safe to do. Multiple containers writing to a single shared volume may cause data
corruption or higher-level application problems like consumer/producer problems.
Now type docker ps again and I am pretty sure that only one MariaDB container will be running; the others will
have stopped.
If you want more explanation, look at the MariaDB logs of a stopped container and you will notice
that mysqld could not get an exclusive lock on files that are in use by another process, which is exactly what is
happening:
[ERROR] mysqld: Got error 'Could not get an exclusive lock; file is probably in use by another process' when trying to use aria control file '/var/lib/mysql/aria_log_control'
Other logs are showing the same problem:
2016-12-11 22:02:25 140419388524480 [ERROR] InnoDB: Can't open './ibdata1'
2016-12-11 22:02:25 140419388524480 [ERROR] InnoDB: Could not open or create
the system tablespace. If you tried to add new data files to the system tablespace,
and it failed here, you should now edit innodb_data_file_path in my.cnf back to what it was,
and remove the new ibdata files InnoDB created in this failed attempt.
InnoDB only wrote those files full of zeros, but did not yet use them in any way.
But be careful: do not remove old data files which contain your precious data!
2016-12-11 22:02:25 140419388524480 [ERROR] Plugin 'InnoDB' init function returned error.
2016-12-11 22:02:25 140419388524480 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2016-12-11 22:02:25 140419388524480 [ERROR] mysqld: Can't lock aria control file
'/var/lib/mysql/aria_log_control' for exclusive use, error: 11. Will retry for 30 seconds
2016-12-11 22:02:56 140419388524480 [ERROR] mysqld: Got error 'Could not get an exclusive lock;
file is probably in use by another process' when trying to use aria control file '/var/lib/mysql/aria_log_control'
2016-12-11 22:02:56 140419388524480 [ERROR] Plugin 'Aria' init function returned error.
2016-12-11 22:02:56 140419388524480 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
2016-12-11 22:02:56 140419388524480 [Note] Plugin 'FEEDBACK' is disabled.
2016-12-11 22:02:56 140419388524480 [ERROR] Unknown/unsupported storage engine: InnoDB
2016-12-11 22:02:56 140419388524480 [ERROR] Aborting
Make sure your applications are designed to safely share data volumes.
Executing docker rm -f mariadb will remove the container from the host, but if you type ls -l
/var/lib/docker/volumes/f..9/_data you can still find your data on the host machine, and you can verify that the Docker
volume container keeps running independently of the removed container.
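Since the data outlives any single container, you can use a throwaway container to back it up; a minimal sketch using the classic --volumes-from pattern (the archive lands in your current directory):
docker run --rm --volumes-from mariadb_storage -v $(pwd):/backup busybox \
tar cvf /backup/mariadb-backup.tar /var/lib/mysql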
Cleaning Docker Dangling Volumes
You can find yourself in a situation where some volumes are not used (or referenced) by any container; we call
this type of volumes "dangling" or "orphaned" volumes. To see your orphaned volumes, type:
docker volume ls -qf dangling=true
3..e
3..7
5..0
b..2
b..7
The fastest way to cleanup your host from these volumes is to run:
docker volume ls -qf dangling=true | xargs -r docker volume rm
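Note that Docker 1.13 and later ship a built-in command for the same cleanup:
docker volume prune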
Docker Volumes Events
Docker volumes report the following events: create, mount, unmount, destroy.
Docker Networks
Like a VM, a container can be part of one or many networks.
You may have noticed - just after the installation of Docker - that you have a new network interface on your host
machine:
ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:ef:e0:98:84
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:356 (356.0 B) TX bytes:0 (0.0 B)
And if you type docker network ls you can see the 3 default networks.
NETWORK ID NAME DRIVER SCOPE
2aaea08714ba bridge bridge local
608a539ce1e3 host host local
ede46dbb22d7 none null local
Docker Networks Types
When you typed docker network ls you saw the 3 default networks implemented in Docker:
bridge (driver: bridge)
none (driver: null)
host (driver: host)
Bridge Networks
The bridge network is attached to one of the host's network interfaces (docker0), which is what you see when you
type ifconfig .
All Docker containers are connected to this network unless you pass the --network flag to the container.
Let's verify this by creating a Docker container and inspecting the default network (bridge):
docker run --name mariadb_storage -v /var/lib/mysql -it -d busybox
Let's inspect the bridge network:
docker network inspect bridge
[
{
"Name": "bridge",
"Id": "2aaea08714ba2e3927334e8d0044afeb85f68ec0a0624ae7ab1f81d976b49292",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
}
]
},
"Internal": false,
"Containers": {
"3c9afd0c72eb4738750ad1b83d38587a1c47636429e5f67c5deec5ec5dfa3210": {
"Name": "mariadb_storage",
"EndpointID": "ed47ef978ed78c5677c81a11d439d0af19a54f0620387794db1061807e39c2b7",
"MacAddress": "02:42:ac:11:00:02",
"IPv4Address": "172.17.0.2/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
We can see that the mariadb_storage data container is living inside the bridge network.
Now, if you create a new network db_network:
docker network create db_network
Then create a Docker container attached to the same network (db_network):
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password --network db_network -d mariadb
If you inspect the default bridge network, you will notice that there are no containers attached to it, or at least that
the newly created container is not attached to it:
docker network inspect bridge
[
{
"Name": "bridge",
"Id": "2aaea08714ba2e3927334e8d0044afeb85f68ec0a0624ae7ab1f81d976b49292",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
}
]
},
"Internal": false,
"Containers": {},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
And if you inspect the new network, you can see that the mariadb container is running in this network (db_network).
docker network inspect db_network
[
{
"Name": "db_network",
"Id": "ec809d635d4ad22f852a2419032812cb7c9361e8bb0111815dc79a34fee30668",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1/16"
}
]
},
"Internal": false,
"Containers": {
"3ddfbc0f03f4a574480d842330add4169f834cfcde96e52956ee34164629dd52": {
"Name": "mariadb",
"EndpointID": "0efca766ec11f7442399a4c81b6e79db825eeb81736db1760820fe61f24b9805",
"MacAddress": "02:42:ac:12:00:02",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]
This network has 172.18.0.0/16 as a subnet and the container address is 172.18.0.2. In all cases, if you create a
network without specifying its type, it will have bridge as its default type.
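You can also attach an already running container to a network after the fact; for instance, assuming the mariadb_storage container from earlier is still running:
docker network connect db_network mariadb_storage
docker network disconnect db_network mariadb_storage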
Overlay Networks
Generally speaking, an overlay network is a network built on top of another network. For your information, VoIP,
VPNs and CDNs are overlay networks.
Docker also uses overlay networks, notably in Swarm mode, where you can create this type of network on a
manager node. While you can do this with managed containers (in Swarm mode or using an external key-value store),
unmanaged containers can not be part of an overlay network.
Docker overlay networks allow creating a multi-host network.
To create a network called database_network:
docker network create --driver overlay --subnet 10.0.9.0/32 database_network
This will show an error message, telling you that there is an error response from the daemon, which failed to allocate
a gateway.
Since overlay networks only work in Swarm mode (or with an external key-value store), you should run your Docker
engine in this mode before creating the network:
docker swarm init --advertise-addr 127.0.0.1
Swarm mode is probably the easiest way to create and manage overlay networks, but you can also use external
components like Consul, Etcd or ZooKeeper, because outside of Swarm mode overlay networks require a valid
key-value store service.
Using Swarm Mode
Docker Swarm mode lets you use the built-in container orchestrator implemented in the Docker engine
since version 1.12.
We are going to see Docker Swarm mode and Docker orchestration in detail later, but for the networking part it is
important to briefly define orchestration.
Container orchestration allows users to coordinate containers in the cloud in a multi-container environment.
Say you are deploying a web application: with an orchestrator you can define a service for your web app, deploy its
container and scale it to 10 or 20 instances, create a network and attach your different containers to it, and consider all
of the web application containers as a single entity from a deployment, availability, scaling and networking point of
view.
To start working with Swarm mode you need at least version 1.12. Initialize the Swarm cluster, and don't forget
to put your machine's IP instead of 192.168.0.47 :
docker swarm init --advertise-addr 192.168.0.47
You will notice an information message similar to the following one:
Swarm initialized: current node (0kgtmbz5flwv4te3cxewdi1u5) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-1re2072eii1walbueysk8yk14cqnmgltnf4vyuhmjli2od2quk-8qc8iewqzmlvp1igp63xyftua \
192.168.0.47:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
In this part of the book, since we are focusing on networking rather than the Swarm mode, we are going to use a
single node.
Now that the Swarm mode is activated, we can create a new network that we call webapp_network:
docker network create --driver overlay --subnet 10.0.9.0/16 --opt encrypted webapp_network
Let's check our network using docker network ls :
NETWORK ID NAME DRIVER SCOPE
1wdbz5gldym1 webapp_network overlay swarm
If you inspect the network by typing docker inspect webapp_network you will be able to see different information about
webapp_network like its name, its id, its IPv6 status, the containers using it, some configurations like the subnet, the
gateway and the scope which is swarm.
[
{
"Name": "webapp_network",
"Id": "1wdbz5gldym121beq4ftdttgg",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.9.0/16",
"Gateway": "10.0.0.1"
}
]
},
"Internal": false,
"Containers": null,
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "257",
"encrypted": ""
},
"Labels": null
}
]
After learning how to use and manage Docker Swarm mode you will be able to create services and containers and
attach them to a network, but for the moment let's just continue exploring the networking part briefly.
In our example, we assume that a service called webapp was created and that we are running 3 instances
of its container.
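As a sketch of how such a service could have been created (the image name here is a placeholder):
docker service create \
--name webapp \
--replicas 3 \
--network webapp_network \
your_webapp_image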
If we run the same inspect command, we will notice that the 3 Docker instances are now included:
[
{
"Name": "webapp_network",
"Id": "2tb6djzmq4x4u5ey3h2e74y9e",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.9.0/16",
"Gateway": "10.0.0.2"
}
]
},
"Internal": false,
"Containers": {
"ab364a18eb272533e6bda88a0c705004df4b7974e64ad63c36fd3061a716f613": {
"Name": "webapp.3.3fmrpa3cd7y6b4ol33x7m413v",
"EndpointID": "e6980c952db5aa7b6995bc853d5ab82668a4692af98540feef1aa58a4e8fd282",
"MacAddress": "02:42:0a:00:00:06",
"IPv4Address": "10.0.0.6/16",
"IPv6Address": ""
},
"c5628fcfcf6a5f0e5a49ddfe20b141591c95f01c45a37262ee93e8ad3e3cff7e": {
"Name": "webapp.2.dckgxiffiay2kk7t2n149rpnj",
"EndpointID": "ce05566b0110f448018ccdaed4aa21b402ae659efa2adff3102492ee6c04fce8",
"MacAddress": "02:42:0a:00:00:05",
"IPv4Address": "10.0.0.5/16",
"IPv6Address": ""
},
"f0a7b5201e84c010616e6033fee7d245af7063922c61941cfc01f9edd0be7226": {
"Name": "webapp.1.73m1914qlcj8agxf5lkhzb7se",
"EndpointID": "1dce8dd253b818f8b0ba129a59328fed427da91d86126a1267ac7855666ec434",
"MacAddress": "02:42:0a:00:00:04",
"IPv4Address": "10.0.0.4/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "258",
"encrypted": ""
},
"Labels": {}
}
]
Let's make some experiments to better discover networking in Swarm mode.
If we execute the Linux command route from inside one of the containers, we can see its default gateway. And if we
execute traceroute google.com from inside one of the containers, this is a sample output:
traceroute to google.com (216.58.209.238), 30 hops max, 46 byte packets
1 172.19.0.1 (172.19.0.1) 0.003 ms 0.012 ms 0.004 ms
2 192.168.3.1 (192.168.3.1) 2.535 ms 0.481 ms 0.609 ms
3 192.168.1.1 (192.168.1.1) 2.228 ms 1.442 ms 1.826 ms
4 80.10.234.169 (80.10.234.169) 10.377 ms 5.951 ms 17.026 ms
[...]
11 par10s29-in-f14.1e100.net (216.58.209.238) 4.723 ms 3.428 ms 4.191 ms
To go outside and reach the Internet, the first hop was 172.19.0.1 . If you type ifconfig in your host machine, you
can find that this address is allocated to docker_gwbridge interface that we are going to see later in this section.
docker_gwbridge Link encap:Ethernet HWaddr 02:42:d2:fd:fd:dc
inet addr:172.19.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7334 errors:0 dropped:0 overruns:0 frame:0
TX packets:3843 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:595520 (595.5 KB) TX bytes:358648 (358.6 KB)
The containers provisioned in Docker Swarm mode can be reached through service discovery in two ways:
Via a Virtual IP (VIP), routed through the Docker Swarm ingress overlay network.
Or via DNS round robin (DNSRR).
Because VIP is not available on the ingress overlay network itself, you need to create a user-defined overlay network
in order to use VIP or DNSRR.
Natively, Docker Engine in Swarm mode supports overlay networks, so you don't need an external key-value store,
and container-to-container networking is enabled.
The eth0 interface represents the container interface that is connected to the overlay network; if you create an
overlay network, the VIPs are associated with it. The eth1 interface represents the container interface that is
connected to the docker_gwbridge network, used for external connectivity outside of the container cluster.
Using External Key/Value Store
Our choice for this part will be Consul, a service discovery and configuration manager for distributed systems
implementing a key/value storage.
To use Consul 0.7.1, you should download it:
cd /tmp
wget https://releases.hashicorp.com/consul/0.7.1/consul_0.7.1_linux_amd64.zip
unzip consul_0.7.1_linux_amd64.zip
chmod +x consul
mv consul /usr/bin/consul
Create the configuration directory for Consul:
mkdir -p /etc/consul.d/
If you want to install the web UI, download and unzip it:
cd /tmp
wget https://releases.hashicorp.com/consul/0.7.1/consul_0.7.1_web_ui.zip
mkdir -p /opt/consul
cd /opt/consul
unzip /tmp/consul_0.7.1_web_ui.zip
Now you can run a Consul agent:
consul agent \
-server \
-bootstrap-expect 1 \
-data-dir /tmp/consul \
-node=agent-one \
-bind=192.168.0.47 \
-client=0.0.0.0 \
-config-dir /etc/consul.d \
-ui-dir /opt/consul
Replace the option values with your own, where:
-bootstrap-expect=0 Sets server to expect bootstrap mode.
-bind=0.0.0.0 Sets the bind address for cluster communication
-client=127.0.0.1 Sets the address to bind for client access.
This includes RPC, DNS, HTTP and HTTPS (if configured)
-node=hostname Name of this node. Must be unique in the cluster
-config-dir=foo Path to a directory to read configuration files
from. This will read every file ending in ".json"
as configuration in this directory in alphabetical
order. This can be specified multiple times.
-ui-dir=path Path to directory containing the Web UI resources
You can use other options like:
-bootstrap Sets server to bootstrap mode
-advertise=addr Sets the advertise address to use
-atlas=org/name Sets the Atlas infrastructure name, enables SCADA.
-atlas-join Enables auto-joining the Atlas cluster
-atlas-token=token Provides the Atlas API token
-atlas-endpoint=1.2.3.4 The address of the endpoint for Atlas integration.
-http-port=8500 Sets the HTTP API port to listen on
-config-file=foo Path to a JSON file to read configuration from.
This can be specified multiple times.
-data-dir=path Path to a data directory to store agent state
-recursor=1.2.3.4 Address of an upstream DNS server.
Can be specified multiple times.
-dc=east-aws Datacenter of the agent
-encrypt=key Provides the gossip encryption key
-join=1.2.3.4 Address of an agent to join at start time.
Can be specified multiple times.
-join-wan=1.2.3.4 Address of an agent to join -wan at start time.
Can be specified multiple times.
-retry-join=1.2.3.4 Address of an agent to join at start time with
retries enabled. Can be specified multiple times.
-retry-interval=30s Time to wait between join attempts.
-retry-max=0 Maximum number of join attempts. Defaults to 0, which
will retry indefinitely.
-retry-join-wan=1.2.3.4 Address of an agent to join -wan at start time with
retries enabled. Can be specified multiple times.
-retry-interval-wan=30s Time to wait between join -wan attempts.
-retry-max-wan=0 Maximum number of join -wan attempts. Defaults to 0, which
will retry indefinitely.
-log-level=info Log level of the agent.
-protocol=N Sets the protocol version. Defaults to latest.
-rejoin Ignores a previous leave and attempts to rejoin the cluster.
-server Switches agent to server mode.
-syslog Enables logging to syslog
-pid-file=path Path to file to store agent PID
You can see the execution logs on your terminal:
==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'agent-one'
Datacenter: 'dc1'
Server: true (bootstrap: true)
Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.47 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2016/12/17 18:50:55 [INFO] raft: Node at 192.168.0.47:8300 [Follower] entering Follower state
2016/12/17 18:50:55 [INFO] serf: EventMemberJoin: agent-one 192.168.0.47
2016/12/17 18:50:55 [INFO] consul: adding LAN server agent-one (Addr: 192.168.0.47:8300) (DC: dc1)
2016/12/17 18:50:55 [INFO] serf: EventMemberJoin: agent-one.dc1 192.168.0.47
2016/12/17 18:50:55 [ERR] agent: failed to sync remote state: No cluster leader
2016/12/17 18:50:55 [INFO] consul: adding WAN server agent-one.dc1 (Addr: 192.168.0.47:8300) (DC: dc1)
2016/12/17 18:50:56 [WARN] raft: Heartbeat timeout reached, starting election
2016/12/17 18:50:56 [INFO] raft: Node at 192.168.0.47:8300 [Candidate] entering Candidate state
2016/12/17 18:50:56 [INFO] raft: Election won. Tally: 1
2016/12/17 18:50:56 [INFO] raft: Node at 192.168.0.47:8300 [Leader] entering Leader state
2016/12/17 18:50:56 [INFO] consul: cluster leadership acquired
2016/12/17 18:50:56 [INFO] consul: New leader elected: agent-one
2016/12/17 18:50:56 [INFO] raft: Disabling EnableSingleNode (bootstrap)
2016/12/17 18:50:56 [INFO] consul: member 'agent-one' joined, marking health alive
2016/12/17 18:50:56 [INFO] agent: Synced service 'consul'
You can check Consul nodes by typing consul members :
Node Address Status Type Build Protocol DC
agent-one 192.168.0.47:8301 alive server 0.6.0 2 dc1
Now go to your Docker configuration file /etc/default/docker and add the following line with the right
configuration:
DOCKER_OPTS="--cluster-store=consul://192.168.0.47:8500 --cluster-advertise=192.168.0.47:4000"
Restart Docker:
service docker restart
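Note that on distributions where Docker is managed by systemd, the /etc/default/docker file may be ignored; in that case, a sketch of the equivalent configuration using a systemd drop-in (adjust the dockerd path to your system) would be:
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/cluster.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --cluster-store=consul://192.168.0.47:8500 --cluster-advertise=192.168.0.47:4000
EOF
systemctl daemon-reload
systemctl restart docker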
And create an overlay network:
docker network create --driver overlay --subnet=10.0.1.0/24 databases_network
Check if the network was created:
docker network ls
NETWORK ID NAME DRIVER SCOPE
6d6484687002 databases_network overlay global
If you inspect the recently created network using docker network inspect databases_network , you will have something
similar to the following output:
[
{
"Name": "databases_network",
"Id": "6d64846870029d114bb8638f492af7fc9b5c87c2facb67890b0f40187b73ac87",
"Scope": "global",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "10.0.1.0/24"
}
]
},
"Internal": false,
"Containers": {},
"Options": {},
"Labels": {}
}
]
Notice that the network Scope is now global :
"Scope": "global",
If you compare it to the other network created using the Swarm mode, you will notice that the scope changed from
swarm to global.
Docker Networks Events
Docker networks system reports the following events: create, connect, disconnect, destroy.
Docker Daemon & Architecture
Docker Daemon
Like the init daemon, the cron daemon crond or the dhcp daemon dhcpd, Docker has its own daemon: dockerd.
To find the Docker daemon, list all Linux daemons:
ps -U0 -o 'tty,pid,comm' | grep ^?
And grep for dockerd in the output:
ps -U0 -o 'tty,pid,comm' | grep ^?|grep -i dockerd
? 2779 dockerd
Note that you may see docker-containe or another truncated form of docker-containerd-shim , since the comm column
is limited to 15 characters.
If you are already running Docker, when you type dockerd you will get an error message similar to this:
FATA[0000] Error starting daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid
Now let's stop the Docker service ( service docker stop ) and run the daemon directly using the dockerd command.
Running the daemon directly with dockerd is a good debugging tool; as you may see, you will have the running traces
right on your terminal screen:
INFO[0000] libcontainerd: new containerd process, pid: 19717
WARN[0000] containerd: low RLIMIT_NOFILE changing to max current=1024 max=4096
INFO[0001] [graphdriver] using prior storage driver "aufs"
INFO[0003] Graph migration to content-addressability took 0.63 seconds
WARN[0003] Your kernel does not support swap memory limit.
WARN[0003] mountpoint for pids not found
INFO[0003] Loading containers: start.
INFO[0003] Firewalld running: false
INFO[0004] Removing stale sandbox ingress_sbox (ingress-sbox)
INFO[0004] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. \
Daemon option --bip can be used to set a preferred IP address
INFO[0004] Loading containers: done.
INFO[0004] Listening for local connections addr=/var/lib/docker/swarm/control.sock proto=unix
INFO[0004] Listening for connections addr=[::]:2377 proto=tcp
INFO[0004] 61c88d41fce85c57 became follower at term 12
INFO[0004] newRaft 61c88d41fce85c57 [peers: [], term: 12, commit: 290, applied: 0, lastindex: 290, lastterm: 12]
INFO[0004] 61c88d41fce85c57 is starting a new election at term 12
INFO[0004] 61c88d41fce85c57 became candidate at term 13
INFO[0004] 61c88d41fce85c57 received vote from 61c88d41fce85c57 at term 13
INFO[0004] 61c88d41fce85c57 became leader at term 13
INFO[0004] raft.node: 61c88d41fce85c57 elected leader 61c88d41fce85c57 at term 13
INFO[0004] Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=192.168.0.47 Adv-addr=192.168.0.47 Remote-addr =
INFO[0004] Daemon has completed initialization
INFO[0004] Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=192.168.0.47 Adv-addr=192.168.0.47 Remote-addr =
INFO[0004] Docker daemon commit=7392c3b graphdriver=aufs version=1.12.5
INFO[0004] Gossip cluster hostname eonSpider-3e64aecb2dd5
INFO[0004] API listen on /var/run/docker.sock
INFO[0004] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers :
[nameserver 8.8.8.8 nameserver 8.8.4.4]
INFO[0004] IPv6 enabled; Adding default IPv6 external servers :
[nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
Now, if you create or remove containers, for example, you will see in these traces how the Docker daemon connects
you (the Docker client) to your Docker containers.
The Docker daemon then communicates with the rest of the Docker modules, starting with containerd.
Containerd
Containerd is one of the recent projects in the Docker ecosystem; its purpose is bringing more modularity to
the Docker architecture and more neutrality vis-à-vis the other industry actors (cloud providers and other orchestration
services).
According to Solomon Hykes, containerd has been deployed on millions of machines since April 2016, when it was
included in Docker 1.11. The announced roadmap to extend containerd gets its input from cloud providers and
actors like Alibaba Cloud, AWS, Google, IBM, Microsoft, and other active members of the container ecosystem.
More Docker engine functionality will be added to containerd so that containerd 1.0 will provide all the core
primitives you need to manage containers with parity on Linux and Windows hosts:
Container execution and supervision
Image distribution
Network Interfaces Management
Local storage
Native plumbing level API
Full OCI support, including the extended OCI image specification
To build, ship and run containerized applications, you may continue to use Docker, but if you are looking for
specialized components you could consider containerd.
Docker Engine 1.11 was the first release built on runC (a runtime based on Open Container Initiative technology)
and containerd.
Formed in June 2015, the Open Container Initiative (OCI) aims to establish common standards for software
containers in order to avoid a potential fragmentation and divisions inside the container ecosystem.
It contains two specifications:
runtime-spec: The runtime specification
image-spec: The image specification
The runtime specification outlines how to run a filesystem bundle that is unpacked on disk:
A standardized container bundle should contain the needed information and configurations to load and run a
container in a config.json file residing in the root of the bundle directory.
A standardized container bundle should contain a directory representing the root filesystem of the container.
Generally this directory has a conventional name like rootfs.
You can see this json file if you export and extract an image and then generate a spec for the resulting bundle. In the
following example, we are going to use the busybox image.
mkdir my_container
cd my_container
mkdir rootfs
docker export $(docker create busybox) | tar -C rootfs -xvf -
Now we have an extracted busybox image inside the rootfs directory.
tree -d my_container/
my_container/
└── rootfs
├── bin
├── dev
│ ├── pts
│ └── shm
├── etc
├── home
├── proc
├── root
├── sys
├── tmp
├── usr
│ └── sbin
└── var
├── spool
│ └── mail
└── www
We can generate the config.json file (run this from inside the my_container directory, so that the file lands next to
rootfs ):
docker-runc spec
You could also use runC itself to generate the json file:
runc spec
This is the generated configuration file (config.json):
{
"ociVersion": "1.0.0-rc2-dev",
"platform": {
"os": "linux",
"arch": "amd64"
},
"process": {
"terminal": true,
"consoleSize": {
"height": 0,
"width": 0
},
"user": {
"uid": 0,
"gid": 0
},
"args": [
"sh"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/",
"capabilities": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"rlimits": [
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
],
"noNewPrivileges": true
},
"root": {
"path": "rootfs",
"readonly": true
},
"hostname": "runc",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc"
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=65536k"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev",
"ro"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"nosuid",
"noexec",
"nodev",
"relatime",
"ro"
]
}
],
"hooks": {},
"linux": {
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
}
]
},
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
}
],
"maskedPaths": [
"/proc/kcore",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/sys/firmware"
],
"readonlyPaths": [
"/proc/asound",
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}
Now you can edit any of the configurations listed above and run a container again without even using Docker, just
runC:
runc run container-name
Note that you should install runC first in order to use it ( sudo apt install runc on Ubuntu 16.04). You could
also install it from source:
mkdir -p ~/golang/src/github.com/opencontainers/
cd ~/golang/src/github.com/opencontainers/
git clone https://github.com/opencontainers/runc
cd ./runc
make
sudo make install
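Putting the pieces together, a minimal end-to-end run using the busybox bundle we prepared earlier would look
like this (the container name my-busybox is arbitrary):

cd my_container
runc spec
sudo runc run my-busybox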
runC, a standalone container runtime, implements the full OCI specification: it allows you to spin up containers,
interact with them and manage their lifecycle, and that's why containers built with one engine (like Docker) can run
on another engine. Containers are started as a child process of runC and can be embedded into various other systems
without having to run a daemon (like the Docker daemon).
runC is built on libcontainer, which is the same container library powering a Docker engine installation. Prior to
version 1.11, the Docker engine alone managed volumes, networks, containers, images, etc. Now, the Docker
architecture is broken into four components: the Docker engine, containerd, containerd-shim and runC. The binaries
are respectively called docker, docker-containerd, docker-containerd-shim, and docker-runc.
To run a container, the Docker engine creates the image and passes it to containerd. containerd calls containerd-shim,
which uses runC to run the container. containerd-shim then allows the runtime (runC in this case) to exit after it
starts the container: this way we can run daemon-less containers, because we do not need to keep a long-running
runtime process around for each container.
Currently, the creation of a container is handled by runC (via containerd), but it is possible to use another
binary (instead of runC) that exposes the same command-line interface and accepts an OCI bundle.
You can see the different runtimes that you have on your host by typing:
docker info|grep -i runtime
Since I am using the default runtime, this is what I should get as an output:
Runtimes: runc
Default Runtime: runc
To add another runtime, use the following command:
docker daemon --add-runtime "<runtime-name>=<runtime-path>"
Example:
docker daemon --add-runtime "oci=/usr/local/sbin/runc"
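Once the daemon knows about the additional runtime, you can select it per container with the --runtime flag of
docker run ; a short example using the oci runtime name registered above:

docker run --rm --runtime=oci hello-world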
There is one containerd-shim per container; it manages the container's STDIO FIFOs and keeps them open in case
containerd or Docker dies.
It is also in charge of reporting the container's exit status to a higher-level component like Docker.
Container runtime, lifecycle support and execution (create, start, stop, pause, resume, exec, signal & delete) are
some of the features implemented in containerd. Others are managed by other components of Docker (volumes,
logging, etc.). Here is a table from the containerd GitHub repository that lists the different features and tells
whether they are in or out of scope.
| Name | Description | In/Out | Reason |
|------|-------------|--------|--------|
| execution | Provide an extensible execution layer for executing a container | in | Create, start, stop, pause, resume, exec, signal, delete |
| cow filesystem | Built-in functionality for overlay, aufs, and other copy-on-write filesystems for containers | in | |
| distribution | Having the ability to push and pull images as well as operations on images as a first class API object | in | containerd will fully support the management and retrieval of images |
| low-level networking drivers | Providing network functionality to containers along with configuring their network namespaces | in | Network support will be added via interface and network namespace operations, not service discovery and service abstractions. |
| build | Building images as a first class API | out | Build is a higher level tooling feature and can be implemented in many different ways on top of containerd |
| volumes | Volume management for external data | out | The API supports mounts, binds, etc. where all volume type systems can be built on top of. |
| logging | Persisting container logs | out | Logging can be built on top of containerd because the container's STDIO will be provided to the clients and they can persist any way they see fit. There is no io copying of container STDIO in containerd. |
If we run a container:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /data/lists:/var/lib/mysql -d mariadb
Unable to find image 'mariadb:latest' locally
latest: Pulling from library/mariadb
75a822cd7888: Pull complete
b8d5846e536a: Pull complete
b75e9152a170: Pull complete
832e6b030496: Pull complete
034e06b5514d: Pull complete
374292b6cca5: Pull complete
d2a2cf5c3400: Pull complete
f75e0958527b: Pull complete
1826247c7258: Pull complete
68b5724d9fdd: Pull complete
d56c5e7c652e: Pull complete
b5d709749ac4: Pull complete
Digest: sha256:0ce9f13b5c5d235397252570acd0286a0a03472a22b7f0384fce09e65c680d13
Status: Downloaded newer image for mariadb:latest
db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
If you type ps aux , you will notice the docker-containerd-shim process related to this container, running with the
container id db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843 , the bundle directory
/var/run/docker/libcontainerd/db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
and the runC binary ( docker-runc ) as parameters:
docker-containerd-shim \
db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843 \
/var/run/docker/libcontainerd/db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843 \
docker-runc
db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843 is the id of the container that you can see at the
end of the container creation.
ls -l /var/run/docker/libcontainerd/db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
total 4
-rw-r--r-- 1 root root 3653 Dec 27 22:21 config.json
prwx------ 1 root root 0 Dec 27 22:21 init-stderr
prwx------ 1 root root 0 Dec 27 22:21 init-stdin
prwx------ 1 root root 0 Dec 27 22:21 init-stdout
Docker Daemon Events
Docker daemon reports the following event: reload.
Docker Plugins
Even if they are not widely used, but Docker plugins are an interesting feature because it extends Docker by third-
party plugins like network or storage plugins. Docker plugins run out of Docker processes and expose a webhook-
like functionality which the Docker daemon uses to send HTTP POST requests in order the plugins acts as Docker
Events.
Let's take Flocker as an example.
Flocker is an open-source container data volume orchestrator for dockerized applications. Flocker gives Ops teams
the tools they need to run containerized stateful services like databases in production, since it provides a tool that
migrates data along with containers as they change hosts.
The standard way to run a volume container is:
docker run --name my_volume_container_1 -v /data -it -d busybox
But if you want to use Flocker plugin, you should run it like this:
docker run --name my_volume_container_2 -v /data -it --volume-driver=flocker -d busybox
The Docker community was one of the most active communities during 2016, and the value of plugins is to allow
developers to contribute to a growing ecosystem.
Another well-known plugin is developed by Weave: Weave Net integrates with Weave Scope so you can see how
containers are connected, and using Weave Flux you can automate request routing in order to turn containers into
microservices.
We have seen how to use a storage plugin; its usage looks something like:
docker run -v <volume_name>:<mountpoint> --volume-driver=<plugin_name> <..> <image>
Network plugins have a different usage: first of all, you create the network:
docker network create -d <plugin_name> <network_name>
Then you use it:
docker run --net=<networkname> <..> <image>
You can use some commands to manage Docker plugins, but they are all experimental and may change before they
become generally available.
To use the following commands, you should install the experimental version of Docker.
docker plugin ls
docker plugin enable
docker plugin disable
docker plugin inspect
docker plugin install
docker plugin rm
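As an illustration (using vieux/sshfs , a volume plugin published on Docker Hub; the host, path and volume name
below are placeholders), installing and using a managed plugin looks like this:

docker plugin install vieux/sshfs
docker plugin ls
docker volume create -d vieux/sshfs -o sshcmd=user@somehost:/remote/path sshvolume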
Overview Of Available Plugins
This is a non-exhaustive overview of the available plugins, which you can also find in the official documentation:
Contiv Networking An open source network plugin to provide infrastructure and security policies for
a multi-tenant micro services deployment, while providing an integration to physical network for
non-container workload. Contiv Networking implements the remote driver and IPAM APIs available
in Docker 1.9 onwards.
Kuryr Network Plugin A network plugin is developed as part of the OpenStack Kuryr project and
implements the Docker networking (libnetwork) remote driver API by utilizing Neutron, the
OpenStack networking service. It includes an IPAM driver as well.
Weave Network Plugin A network plugin that creates a virtual network that connects your Docker
containers - across multiple hosts or clouds and enables automatic discovery of applications. Weave
networks are resilient, partition tolerant, secure and work in partially connected networks, and other
adverse environments - all configured with delightful simplicity.
Azure File Storage plugin Lets you mount Microsoft Azure File Storage shares to Docker containers
as volumes using the SMB 3.0 protocol. Learn more.
Blockbridge plugin A volume plugin that provides access to an extensible set of container-based
persistent storage options. It supports single and multi-host Docker environments with features that
include tenant isolation, automated provisioning, encryption, secure deletion, snapshots and QoS.
Contiv Volume Plugin An open source volume plugin that provides multi-tenant, persistent,
distributed storage with intent based consumption. It has support for Ceph and NFS.
Convoy plugin A volume plugin for a variety of storage back-ends including device mapper and
NFS. It’s a simple standalone executable written in Go and provides the framework to support
vendor-specific extensions such as snapshots, backups and restore.
DRBD plugin A volume plugin that provides highly available storage replicated by
DRBD. Data written to the docker volume is replicated in a cluster of DRBD nodes.
Flocker plugin A volume plugin that provides multi-host portable volumes for Docker, enabling you
to run databases and other stateful containers and move them around across a cluster of machines.
gce-docker plugin A volume plugin able to attach, format and mount Google Compute persistent-
disks.
GlusterFS plugin A volume plugin that provides multi-host volumes management for Docker using
GlusterFS.
Horcrux Volume Plugin A volume plugin that allows on-demand, version controlled access to your
data. Horcrux is an open-source plugin, written in Go, and supports SCP, Minio and Amazon S3.
HPE 3Par Volume Plugin A volume plugin that supports HPE 3Par and StoreVirtual iSCSI storage
arrays.
IPFS Volume Plugin An open source volume plugin that allows using an ipfs filesystem as a volume.
Keywhiz plugin A plugin that provides credentials and secret management using Keywhiz as a central
repository.
Local Persist Plugin A volume plugin that extends the default local driver's functionality by
allowing you to specify a mountpoint anywhere on the host, which enables the files to always persist,
even if the volume is removed via docker volume rm .
NetApp Plugin (nDVP) A volume plugin that provides direct integration with the Docker ecosystem
for the NetApp storage portfolio. The nDVP package supports the provisioning and management of
storage resources from the storage platform to Docker hosts, with a robust framework for adding
additional platforms in the future.
Netshare plugin A volume plugin that provides volume management for NFS 3/4, AWS EFS and
CIFS file systems.
OpenStorage Plugin A cluster-aware volume plugin that provides volume management for file and
block storage solutions. It implements a vendor neutral specification for implementing extensions
such as CoS, encryption, and snapshots. It has example drivers based on FUSE, NFS, NBD and EBS
to name a few.
Portworx Volume Plugin A volume plugin that turns any server into a scale-out converged
compute/storage node, providing container granular storage and highly available volumes across any
node, using a shared-nothing storage backend that works with any docker scheduler.
Quobyte Volume Plugin A volume plugin that connects Docker to Quobyte’s data center file system,
a general-purpose scalable and fault-tolerant storage platform.
REX-Ray plugin A volume plugin which is written in Go and provides advanced storage
functionality for many platforms including VirtualBox, EC2, Google Compute Engine, OpenStack,
and EMC.
Virtuozzo Storage and Ploop plugin A volume plugin with support for Virtuozzo Storage distributed
cloud file system as well as ploop devices.
VMware vSphere Storage Plugin Docker Volume Driver for vSphere enables customers to address
persistent storage requirements for Docker containers in vSphere environments.
Twistlock AuthZ Broker A basic extendable authorization plugin that runs directly on the host or
inside a container. This plugin allows you to define user policies that it evaluates during
authorization. Basic authorization is provided if the Docker daemon is started with the --tlsverify flag
(the username is extracted from the certificate common name).
Docker Plugins Events
Docker plugins report the following events: install, enable, disable, remove.
Docker Philosophy
Docker is a new way to build, ship and run applications that differs from, and is easier than, "traditional" ways.
This isn't just about the technology but also about the philosophy.
Build Ship & Run
Virtualization is a useful technology to run different OSs on a single bare-metal server, but containers go further:
they decouple applications and software from the underlying OS and can be used to create an application self-service
or a PaaS. Docker is development-oriented and user-friendly; it can be used in DevOps pipelines in order to develop,
test and run applications.
In the same way, Docker allows DevOps teams to easily have a local development environment similar to production:
just build and ship.
After finishing tests, the developer can publish the container using Docker Machine.
In the developer-driven era, cloud infrastructures, IoT, mobile devices and other technologies are innovating every
day. From a DevOps point of view, it is really important to reduce the complexity of software production and to
improve quality, stability and scalability by integrating containers as wrappers that standardize runtime systems.
Docker Is Single Process
A container can be single-process or multi-process. While other containerization technologies allow containers to
run multiple processes (LXC, for example), Docker restricts containers to running a single process: if you would
like to run n processes for your application, you should run n Docker containers.
In reality, you can run multiple processes: since Docker has an instruction called ENTRYPOINT that starts a
program, you can set the ENTRYPOINT to execute a script and run as many programs as you want, but this is not
completely aligned with the Docker philosophy (see the sketch below).
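As a minimal sketch of this pattern (the busybox applets used here, crond and httpd, are just examples and may vary
with your busybox build), the ENTRYPOINT runs a script that starts one program in the background and then execs the
main one:

FROM busybox
COPY start.sh /start.sh
RUN chmod +x /start.sh
ENTRYPOINT ["/start.sh"]

And the start.sh script:

#!/bin/sh
# start a helper process in the background
crond &
# replace the shell with the main process so it runs as PID 1
exec httpd -f -p 8080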
Building an application based on single-process containers is efficient for creating microservices architectures,
developing PaaS (Platform as a Service) or FaaS (Function as a Service) platforms, serverless computing and
event-based programming.
We are going to learn about building microservices applications using Docker later in this book. Briefly, it is an
approach to implementing distributed architectures where services are independent processes that communicate with
each other over a network.
Docker Is Stateless
A Docker container consists of a series of layers created previously in an image; once the image becomes a
container, these layers become read-only layers.
When a process modifies something in the container, a diff is made: Docker notices the difference between the
container's layer and the matching image's layer, but only when you type
docker commit <container id>
In this case a new image will be created with the last modifications, but if you delete the container, the diff will
disappear.
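You can inspect this diff yourself with the docker diff command, which lists added (A) and changed (C) files in the
container's top layer; a small illustration (the container and image names are arbitrary):

docker run --name c1 busybox touch /tmp/hello
docker diff c1
C /tmp
A /tmp/hello
docker commit c1 my-busybox:modified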
Docker does not persist container data by default: if you want your storage to be persistent, you should use Docker
volumes and mount your files on the host filesystem. In this case your data will be part of a volume, not of a
container, as the example below shows.
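As a quick, hedged illustration (the host path /srv/mysql-data and the container names are arbitrary), data written
under the mounted directory survives the removal of the container:

docker run -d --name db -v /srv/mysql-data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=password mariadb
docker rm -f db
docker run -d --name db2 -v /srv/mysql-data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=password mariadb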
Docker is stateless: if any change happens in a running container, a new and different image can be created from it,
and this is one of its strengths because you can do a fast and easy rollback anytime.
Docker Is Portable
I created some public images that I pushed to my public Docker Hub account, like eon01/vote and eon01/nginx-static;
these images were pulled more than 5k times. I created both of them using my laptop, and all of the people who
pulled them used them as containers in different environments: bare-metal servers, desktops, laptops, virtual
machines, etc. All of these containers were built on one machine and run in the same way on thousands of other
machines.
Chapter IV - Advanced Concepts
   o   ^__^
    o  (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||
Namespaces
According to the Linux man pages, a namespace wraps a global system resource in an abstraction that makes it appear to
the processes within the namespace that they have their own isolated instance of the global resource. Changes to the
global resource are visible to other processes that are members of the namespace, but are invisible to other
processes. One use of namespaces is to implement containers.
| Namespace | Constant | Isolates |
|------------|------------------|-------------------------------------|
| Cgroup | CLONE_NEWCGROUP | Cgroup root directory |
| IPC | CLONE_NEWIPC | System V IPC, POSIX message queues |
| Network | CLONE_NEWNET | Network devices, stacks, ports, etc.|
| Mount | CLONE_NEWNS | Mount points |
| PID | CLONE_NEWPID | Process IDs |
| User | CLONE_NEWUSER | User and group IDs |
| UTS | CLONE_NEWUTS | Hostname and NIS domain name |
Namespaces are the first form of isolation: a process running in container A cannot see or affect another
process running in container B, even if both processes are running on the same host machine.
Kernel namespaces allows a single host to have multiple:
Network devices
IP addresses
Routing rules
Netfilter rules
Timewait buckets
Bind buckets
Routing cache
etc ..
Every Docker container will have its own IP address and will see other docker containers as hosts connected to the
same "switch".
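You can observe these namespaces from the host; a small sketch (the container name ns-demo is arbitrary) that lists
the namespaces of a container's main process, using the pid reported by docker inspect :

docker run -d --name ns-demo busybox sleep 3600
sudo ls -l /proc/$(docker inspect -f '{{.State.Pid}}' ns-demo)/ns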
Control Groups (cgroups)
According to Wikipedia, cgroups or control groups is a Linux kernel feature that limits, accounts for, and isolates the
resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.
This feature was introduced by engineers at Google in 2006 under the name "process containers". In late 2007, the
nomenclature changed to "control groups" to avoid confusion caused by multiple meanings of the term "container"
in the Linux kernel context, and the control groups functionality was merged into the Linux kernel mainline in kernel
version 2.6.24, which was released in January 2008.
Control Groups are used by Docker not only to implement resource accounting and limiting, but also because they
provide many useful metrics that help ensure that each container gets enough memory, CPU, disk I/O, memory cache,
CPU cache, etc.
Control Groups also provide a level of security, because using this feature ensures that a single container will not
bring down the host OS by exhausting its resources.
Public and private PaaS also make use of cgroups in order to guarantee a stable service with no downtime when an
application exhausts CPU, memory or any other critical resource. Other software and container technologies use
cgroups as well, like LXC, libvirt, systemd, Open Grid Scheduler and lmctfy. A hands-on example follows below.
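As a hedged example (the cgroup path shown is the typical cgroup v1 layout with the cgroupfs driver and may differ
on your distribution), you can ask Docker to enforce cgroup limits at run time and read them back from the host:

docker run -d --name limited --memory=512m --cpu-shares=512 nginx
cat /sys/fs/cgroup/memory/docker/$(docker inspect -f '{{.Id}}' limited)/memory.limit_in_bytes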
Linux Capabilities
Unix was designed to run two categories of processes:
privileged processes, with user id 0, which is the root user;
unprivileged processes, where the user id is different from zero.
If you are familiar with Linux, you can list the user ids of all processes by typing:
Every process in a Unix system has an owner (designated by its UID) and a group owner (identified by a GID). When
Unix starts a process, its UID and GID are set to those of its parent process. The main purpose of this scheme is
performing permission checks.
While privileged processes bypass all kernel permission checks, all other processes are subject to permission
checking: the Unix-like system kernel makes the permission check based on the UID and the GID of the non-root
process.
Starting with Kernel 2.2, Linux divides root privileges into distinct units that we call capabilities. Capabilities
can be independently enabled and disabled, which is a great security and permission management feature. A process
can have the CAP_NET_RAW capability while being denied the CAP_NET_ADMIN capability. This is just an example
explaining how capabilities work. You can find the complete list of the other capabilities in the Linux manpage:
man capabilities
By default, Docker starts containers with a limited list of capabilities, so that containers do not get real root
privileges and the root user within the container does not have all of the privileges of a real root user.
This is the list of the capabilities used by a default Docker installation:
s.Process.Capabilities = []string{
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE",
}
You can find the same list in the Docker source code on GitHub.
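You can tighten or relax this default list per container with the --cap-drop and --cap-add flags. A quick way to see
the effect: ping needs the CAP_NET_RAW capability, which Docker grants by default, so the first command below
succeeds while the second fails with a permission error:

docker run --rm busybox ping -c 1 8.8.8.8
docker run --rm --cap-drop NET_RAW busybox ping -c 1 8.8.8.8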
Secure Computing Mode (seccomp)
Since Kernel version 2.6.12, Linux allows a process to make a one-way transition into a "secure" state where the
only system calls it can make are exit(), sigreturn(), and read() and write() on already-open file descriptors (fd).
Any other system call made by a process in this "secure" state will beget its termination: the Kernel will send it a
SIGKILL. This is the sandboxing mechanism in the Linux Kernel, based on the seccomp Kernel feature.
Docker allows using seccomp to restrict the actions available within a container and therefore restrict your
application's access.
To use the seccomp feature, the Kernel should be configured with CONFIG_SECCOMP enabled and Docker should be
built with seccomp support.
If you would like to check if your Kernel has seccomp activated, type:
cat /boot/config-`uname -r` | grep CONFIG_SECCOMP=
If it is activated, you should see this:
CONFIG_SECCOMP=y
Docker uses seccomp profiles, which are configurations that disable around 44 system calls out of 300+. You can
find the default profile in the official repository.
You can choose a specific seccomp profile when running a container by adding the following options to the Docker
run command:
--security-opt seccomp=/path/to/seccomp/profile.json
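As a minimal, hedged sketch of such a profile (the file name no-chmod.json and the blocked syscall are arbitrary,
and the exact JSON schema depends on your Docker version), the following allows everything by default but makes
chmod fail:

{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "name": "chmod",
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}

docker run --rm -it --security-opt seccomp=./no-chmod.json busybox chmod 400 /etc/hosts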
The following table can be found in the official Docker repository; it represents some of the blocked syscalls.
| Syscall | Description |
|---------|-------------|
| acct | Accounting syscall which could let containers disable their own resource limits or process accounting. Also gated by CAP_SYS_PACCT. |
| add_key | Prevent containers from using the kernel keyring, which is not namespaced. |
| adjtimex | Similar to clock_settime and settimeofday, time/date is not namespaced. |
| bpf | Deny loading potentially persistent bpf programs into kernel, already gated by CAP_SYS_ADMIN. |
| clock_adjtime | Time/date is not namespaced. |
| clock_settime | Time/date is not namespaced. |
| clone | Deny cloning new namespaces. Also gated by CAP_SYS_ADMIN for CLONE_* flags, except CLONE_USERNS. |
| create_module | Deny manipulation and functions on kernel modules. |
| delete_module | Deny manipulation and functions on kernel modules. Also gated by CAP_SYS_MODULE. |
| finit_module | Deny manipulation and functions on kernel modules. Also gated by CAP_SYS_MODULE. |
| get_kernel_syms | Deny retrieval of exported kernel and module symbols. |
| get_mempolicy | Syscall that modifies kernel memory and NUMA settings. Already gated by CAP_SYS_NICE. |
| init_module | Deny manipulation and functions on kernel modules. Also gated by CAP_SYS_MODULE. |
| ioperm | Prevent containers from modifying kernel I/O privilege levels. Already gated by CAP_SYS_RAWIO. |
| iopl | Prevent containers from modifying kernel I/O privilege levels. Already gated by CAP_SYS_RAWIO. |
| kcmp | Restrict process inspection capabilities, already blocked by dropping CAP_SYS_PTRACE. |
| kexec_file_load | Sister syscall of kexec_load that does the same thing, slightly different arguments. |
| kexec_load | Deny loading a new kernel for later execution. |
| keyctl | Prevent containers from using the kernel keyring, which is not namespaced. |
| lookup_dcookie | Tracing/profiling syscall, which could leak a lot of information on the host. |
| mbind | Syscall that modifies kernel memory and NUMA settings. Already gated by CAP_SYS_NICE. |
| mount | Deny mounting, already gated by CAP_SYS_ADMIN. |
| move_pages | Syscall that modifies kernel memory and NUMA settings. |
| name_to_handle_at | Sister syscall to open_by_handle_at. Already gated by CAP_DAC_READ_SEARCH. |
| nfsservctl | Deny interaction with the kernel nfs daemon. |
| open_by_handle_at | Cause of an old container breakout. Also gated by CAP_DAC_READ_SEARCH. |
| perf_event_open | Tracing/profiling syscall, which could leak a lot of information on the host. |
| personality | Prevent container from enabling BSD emulation. Not inherently dangerous, but poorly tested, potential for a lot of kernel vulns. |
| pivot_root | Deny pivot_root, should be privileged operation. |
| process_vm_readv | Restrict process inspection capabilities, already blocked by dropping CAP_SYS_PTRACE. |
| process_vm_writev | Restrict process inspection capabilities, already blocked by dropping CAP_SYS_PTRACE. |
| ptrace | Tracing/profiling syscall, which could leak a lot of information on the host. Already blocked by dropping CAP_SYS_PTRACE. |
| query_module | Deny manipulation and functions on kernel modules. |
| quotactl | Quota syscall which could let containers disable their own resource limits or process accounting. Also gated by CAP_SYS_ADMIN. |
| reboot | Don't let containers reboot the host. Also gated by CAP_SYS_BOOT. |
| request_key | Prevent containers from using the kernel keyring, which is not namespaced. |
| set_mempolicy | Syscall that modifies kernel memory and NUMA settings. Already gated by CAP_SYS_NICE. |
| setns | Deny associating a thread with a namespace. Also gated by CAP_SYS_ADMIN. |
| settimeofday | Time/date is not namespaced. Also gated by CAP_SYS_TIME. |
| stime | Time/date is not namespaced. Also gated by CAP_SYS_TIME. |
| swapon | Deny start/stop swapping to file/device. Also gated by CAP_SYS_ADMIN. |
| swapoff | Deny start/stop swapping to file/device. Also gated by CAP_SYS_ADMIN. |
| sysfs | Obsolete syscall. |
| _sysctl | Obsolete, replaced by /proc/sys. |
| umount | Should be a privileged operation. Also gated by CAP_SYS_ADMIN. |
| umount2 | Should be a privileged operation. |
| unshare | Deny cloning new namespaces for processes. Also gated by CAP_SYS_ADMIN, with the exception of unshare --user. |
| uselib | Older syscall related to shared libraries, unused for a long time. |
| userfaultfd | Userspace page fault handling, largely needed for process migration. |
| ustat | Obsolete syscall. |
| vm86 | In kernel x86 real mode virtual machine. Also gated by CAP_SYS_ADMIN. |
| vm86old | In kernel x86 real mode virtual machine. Also gated by CAP_SYS_ADMIN. |
If you would like to run a container without the default seccomp profile described above, you can run:
docker run --rm -it --security-opt seccomp=unconfined hello-world
If you are using a distribution that does not support seccomp profiles, like Ubuntu 14.04, you will get the
following error: docker: Error response from daemon: Linux seccomp: seccomp profiles are not supported on this daemon, you
cannot specify a custom seccomp profile.
Application Armor (AppArmor)
Application Armor is a Linux Kernel security module. It allows the Linux administrator to restrict programs'
capabilities with per-program profiles.
In other words, this is a tool to lock down applications by limiting their access to only the resources they are
supposed to use, without disturbing either their execution or their performance.
Profiles can allow capabilities like raw sockets, network access, and reading, writing or executing files. This
functionality was added to Linux to supplement the classic Discretionary Access Control (DAC) model with Mandatory
Access Control (MAC).
You can check your Apparmor status by typing:
sudo apparmor_status
You will see a similar output to the following one, if it is activated:
apparmor module is loaded.
19 profiles are loaded.
18 profiles are in enforce mode.
/sbin/dhclient
/usr/bin/evince
/usr/bin/evince-previewer
/usr/bin/evince-previewer//sanitized_helper
/usr/bin/evince-thumbnailer
/usr/bin/evince-thumbnailer//sanitized_helper
/usr/bin/evince//sanitized_helper
/usr/bin/lxc-start
/usr/lib/NetworkManager/nm-dhcp-client.action
/usr/lib/connman/scripts/dhclient-script
/usr/sbin/cups-browsed
/usr/sbin/mysqld
/usr/sbin/ntpd
/usr/sbin/tcpdump
docker-default
lxc-container-default
lxc-container-default-with-mounting
lxc-container-default-with-nesting
1 profiles are in complain mode.
/usr/sbin/sssd
8 processes have profiles defined.
8 processes are in enforce mode.
/sbin/dhclient (1815)
/usr/sbin/cups-browsed (1697)
/usr/sbin/ntpd (3363)
docker-default (3214)
docker-default (3380)
docker-default (3381)
docker-default (3382)
docker-default (3390)
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
You can see in the list above that the Docker profile is active and is called docker-default . It is the default
profile, so when you run a container, it is as if the following command were explicitly executed:
The default profile is the following:
#include <tunables/global>

profile docker-default flags=(attach_disconnected,mediate_deleted) {
  #include <abstractions/base>

  network,
  capability,
  file,
  umount,

  deny @{PROC}/{*,**^[0-9*],sys/kernel/shm*} wkx,
  deny @{PROC}/sysrq-trigger rwklx,
  deny @{PROC}/mem rwklx,
  deny @{PROC}/kmem rwklx,
  deny @{PROC}/kcore rwklx,

  deny mount,

  deny /sys/[^f]*/** wklx,
  deny /sys/f[^s]*/** wklx,
  deny /sys/fs/[^c]*/** wklx,
  deny /sys/fs/c[^g]*/** wklx,
  deny /sys/fs/cg[^r]*/** wklx,
  deny /sys/firmware/efi/efivars/** rwklx,
  deny /sys/kernel/security/** rwklx,
}
You can confine containers using AppArmor, but not the programs running inside a container: if you are running
Nginx in a container confined by a profile that protects the host, you will not be able to confine Nginx itself with
a different profile to protect the container.
AppArmor has been available in the Linux Kernel since the 2.6.36 release. A sketch of using a custom profile follows.
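As a hedged sketch (the profile name docker-nginx and its path are hypothetical; writing the profile itself is out
of scope here), you would load a custom profile into the kernel with apparmor_parser and then reference it at run
time:

sudo apparmor_parser -r -W /etc/apparmor.d/containers/docker-nginx
docker run --rm -it --security-opt apparmor=docker-nginx nginx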
Docker Union Filesystem
A Docker image is built of layers; UnionFS is the technology that allows this.
Union Filesystem, or UnionFS, is a filesystem service used in Linux, FreeBSD and NetBSD. It allows a set of files
and directories to form a single filesystem by grouping them in a single branch. Any Docker image is in reality a set
of layered filesystems superposed one over the other.
UnionFS layers are transparently overlaid and form a coherent file system; these layered file systems are called
branches. They may be either read-only or read-write file systems.
Docker uses UnionFS in order to avoid duplicating a branch. The boot filesystem (bootfs), for example, is one of the
branches used in many containers, and it resembles the typical Unix boot filesystem. It is used by Ubuntu,
Debian, CentOS and many other containers.
When a container is started, Docker mounts a writable filesystem on top of the layers below, so any change is
applied only to this top layer. If you want to modify a file, it is copied up from the read-only layer below into
the read-write layer at the top.
Let's start a container and see its filesystem layers.
docker create -it redis
You can notice that Docker is downloading different layers in the output:
latest: Pulling from library/redis
6a5a5368e0c2: Pull complete
2f1103ce5ca9: Pull complete
086a40c85e01: Pull complete
9a5e9d112ec4: Pull complete
dadc4b601bb4: Pull complete
b8066982e7a1: Pull complete
2bcdfa1b63bf: Pull complete
Digest: sha256:38e873a0db859d0aa8ab6bae7bcb03c1bb65d2ad120346a09613084b49185912
Status: Downloaded newer image for redis:latest
6ab94a06bc0263110b973174d65cbc6ebd6d9fc637526b2c9dd3eac3c3bcf032
docker create creates a writable container layer over the specified image ( redis in the last case) and
prepares it for running a command. It is similar to docker run -d except that the container will not start running.
When running the last command, an id is printed on your terminal. It is the id of the prepared
container.
6ab94a06bc0263110b973174d65cbc6ebd6d9fc637526b2c9dd3eac3c3bcf032
The container is ready, let's start it:
docker start 6ab94a06bc0263110b973174d65cbc6ebd6d9fc637526b2c9dd3eac3c3bcf032
The command docker start creates a process space around the UnionFS block.
We can't have more than one process space per container.
A docker ps will show that the container is running:
CONTAINER ID IMAGE COMMAND PORTS NAMES
6ab94a06bc02 redis "docker-entrypoint.sh" 6379/tcp trusting_carson
If you would like to see the layers, then type:
ls -l /var/lib/docker/aufs/layers
Layers of the same image will have a name that starts with the container id 6ab94a06bc02 .
We haven't yet seen the concept of a Dockerfile, so don't worry if this is the first time you see it. For the
moment, let's just limit our understanding to the explanation given in the official documentation:
Docker can build images automatically by reading the instructions from a Dockerfile. A Dockerfile is a text
document that contains all the commands a user could call on the command line to assemble an image. Using
Docker build users can create an automated build that executes several command-line instructions in
succession.
In general, when writing a Dockerfile, every line in this file will create an additional layer.
FROM busybox # This is a layer
MAINTAINER Aymen El Amri # This is a layer
RUN ls # This is a layer
Let's build this image:
docker build .
The id of the built image, in the output of this example, was:
7f49abaf7a69
In other words, a layer is a change in an image.
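You can list the layers of the resulting image with docker history , using the image id from the build output above:

docker history 7f49abaf7a69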
So the second advantage of using UnionFS and the layered filesystem is the isolation of modifications: any change
that happens to the running container lives only in its top read-write layer and is lost once the container is
removed and recreated. The bootfs layer is one of the layers that users will not interact with.
To sum up, when Docker mounts the rootfs, it starts as a read-only system, as in a traditional Linux boot, but then,
instead of changing the file system to the read-write mode, Docker takes advantage of a union mount to add a read-
write filesystem over the first read-only filesystem. There may be multiple read-only file systems layered on top of
each other. These file systems are called layers.
Docker supports a number of different union file systems. The next chapter about images will reinforce your
understanding about this.
Storage Drivers
OverlayFS, AUFS, VFS or Device Mapper are some storage technologies that are compatible with Docker. They are
easily "pluggable" to Docker and we call them storage drivers. Every technology has its own specificities and
choosing one depends on your technical requirements and criteria (like stability, maintenance, backup, speed,
hardware ..etc)
When we start running an image, Docker uses a mechanism called copy-on-write (CoW).
CoW is a standard Unix pattern that provides a single shared copy of some data until it is modified. In general, CoW
is an implicit sharing of resources used to implement a copy operation on modifiable resources.
// define x
std::string x("Hello");
// with a CoW string implementation (as in some pre-C++11 standard libraries),
// x and y share the same buffer here
std::string y = x;
// on the first write, y gets its own buffer while x keeps the old one
y += ", World!";
In storage, CoW is used as an underlying mechanism to create snapshots, as is the case for the logical volume
management done by ZFS, AUFS, etc.
In a Redirect-on-Write (ROW) algorithm, the original storage is never modified: when a write operation is requested,
it is automatically redirected to a new storage area, and the original one is left untouched.
In Copy-on-Write (COW), when a write request is made, the data is first copied into a new storage resource and then
modified.
You have probably seen blog posts claiming that one storage driver is better than the others, or some post-mortems.
Keep in mind that no technology is better than another in absolute terms: every technology has its pros and its cons.
OverlayFS
OverlayFS is a filesystem service for Linux that implements a union mount for other file systems. In the Docker
commit 453552c8384929d8ae04dcf1c6954435c0111da0 , OverlayFS support was added to Docker by @alexlarsson from
Red Hat. In the commit message, it was described as follows:
Each container/image can have a "root" subdirectory which is a plain filesystem hierarchy, or they can use
overlayfs.
If they use overlayfs there is a "upper" directory and a "lower-id" file, as well as "merged" and "work"
directories. The "upper" directory has the upper layer of the overlay, and "lower-id" contains the id of the
parent whose "root" directory shall be used as the lower layer in the overlay. The overlay itself is mounted in
the "merged" directory, and the "work" dir is needed for overlayfs to work.
When an overlay layer is created there are two cases: either the parent has a "root" dir, then we start out with an
empty "upper" directory overlaid on the parent's root. This is typically the case with the init layer of a container
which is based on an image. If there is no "root" in the parent, we inherit the lower-id from the parent and start
by making a copy of the parent's "upper" dir. This is typically the case for a container layer which copies its
parent -init upper layer.
Additionally we also have a custom implementation of ApplyLayer which makes a recursive copy of the parent
"root" layer using hardlinks to share file data, and then applies the layer on top of that. This means all child
images share file (but not directory) data with the parent.
Overlay2 was added to Docker in pull request #22126.
Pro
OverlayFS is similar to aufs, but it is supported in and has been merged into the mainline Linux kernel since
version 3.18. Like aufs, it enables shared memory between containers using the same shared libraries (on the same
disk). The advantage of using OverlayFS is the fact that it is under continued development. It is simpler than aufs
and in most cases faster.
Since version 1.12, Docker provides the overlay2 storage driver, which is more efficient than overlay. Version 2 of
the driver is compatible with Kernel 4.0 and later.
Many users had issues with OverlayFS because they ran out of inodes; this problem was solved by overlay2. Testing
shows a significant reduction in the number of inodes used, especially for images having multiple layers.
Cons
Mainly, the inode exhaustion problem. It is a serious problem that was fixed by overlay2, along with some other
issues; since the majority of these problems were fixed in version 2, if you are planning to use OverlayFS, use
overlay2 instead.
On the other hand, overlay2 is a young codebase and Docker 1.12 is the first release offering it, so logically some
other bugs will be discovered in the future; it is a driver to use with vigilance.
AUFS
aufs (short for advanced multi-layered unification filesystem) is a union filesystem that has existed in Docker
since the beginning. It was developed to improve reliability and performance, and it introduced some new concepts,
like writable branch balancing. In its manpage, aufs is described as a stackable unification filesystem, such as
Unionfs, which unifies several directories and provides a merged single directory.
In the early days, aufs entirely re-designed and re-implemented the Unionfs 1.x series. After many original ideas,
approaches and improvements, it became totally different from Unionfs while keeping the basic features.
Pro
aufs is one of the most popular drivers for Docker. It is stable and used by many distributions, like the Knoppix,
Ubuntu 10.04, Gentoo Linux 12.0 and Puppy Linux live CDs. It is the longest existing and possibly the most tested
graphdriver backend for Docker, and it is reasonably performant and stable for a wide range of use cases. Even
though it is only available on Ubuntu and Debian kernels (as noted below), there has been significant use of these
two distributions with Docker, allowing lots of airtime for the aufs driver to be tested in a broad set of
environments. aufs also enables shared memory pages between different containers loading the same shared libraries
from the same layer (because they are the same inode on disk).
Cons
aufs was rejected for merging into mainline Linux. Its code was criticized for being "dense, unreadable, and
uncommented".
Aufs consists of about 20,000 lines of dense, unreadable, uncommented code, as opposed to around 10,000 for
Unionfs and 3,000 for union mounts and 60,000 for all of the VFS. The aufs code is generally something that
one does not want to look at. source
Instead, OverlayFS was merged in the Linux kernel.
Btrfs
Btrfs is a modern CoW filesystem for Linux.
The philosophy behind Btrfs is to implement advanced features while focusing on fault tolerance and easy
administration. Btrfs is developed at multiple companies and licensed under the GPL. The name stands for "B TRee
File System", so it is easier and probably acceptable to pronounce it "B-Tree FS" instead of b-t-r-fs, though it is
also commonly pronounced "Butter FS".
Btrfs has native features named subvolumes and snapshots, which together provide CoW-like features.
It actually consists of three types of on-disk structures:
block headers,
keys,
and items
currently defined as follows:
struct btrfs_header {
    u8 csum[32];
    u8 fsid[16];
    __le64 blocknr;
    __le64 flags;

    u8 chunk_tree_uid[16];
    __le64 generation;
    __le64 owner;
    __le32 nritems;
    u8 level;
}

struct btrfs_disk_key {
    __le64 objectid;
    u8 type;
    __le64 offset;
}

struct btrfs_item {
    struct btrfs_disk_key key;
    __le32 offset;
    __le32 size;
}
Btrfs was added to Docker in this commit by Alex Larsson from Red Hat.
Pro
Btrfs development started in 2007 and it was merged into the mainline Linux kernel in 2009. It has been in use for
several years, and Linus Torvalds used it as the root file system on one of his laptops. When Btrfs was released, it
was well optimized compared to old-school filesystems. According to the official wiki, this filesystem has the
following features:
Extent based file storage
2^64 byte == 16 EiB maximum file size (practical limit is 8 EiB due to Linux VFS)
Space-efficient packing of small files
Space-efficient indexed directories
Dynamic inode allocation
Writable snapshots, read-only snapshots
Subvolumes (separate internal filesystem roots)
Checksums on data and metadata (crc32c)
Compression (zlib and LZO)
Integrated multiple device support
File Striping
File Mirroring
File Striping+Mirroring
Single and Dual Parity implementations (experimental, not production-ready)
SSD (flash storage) awareness (TRIM/Discard for reporting free blocks for reuse) and optimizations (e.g.
avoiding unnecessary seek optimizations, sending writes in clusters, even if they are from unrelated files. This
results in larger write operations and faster write throughput)
Efficient incremental backup
Background scrub process for finding and repairing errors of files with redundant copies
Online filesystem defragmentation
Offline filesystem check
In-place conversion of existing ext3/4 file systems
Seed devices. Create a (readonly) filesystem that acts as a template to seed other Btrfs filesystems. The original
filesystem and devices are included as a readonly starting point for the new filesystem. Using copy on write,
all modifications are stored on different devices; the original is unchanged.
Subvolume-aware quota support
Send/receive of subvolume changes
Efficient incremental filesystem mirroring
Batch, or out-of-band deduplication (happens after writes, not during)
Cons
Btrfs hasn't been a first choice for many Linux distributions, and this fact has made Btrfs a technology without
much testing and bug hunting.
Device Mapper
The device mapper is a framework provided by the Linux kernel to map physical block devices onto virtual block
devices, which is a higher-level abstraction. LVM2, software RAIDs and dm-crypt disk encryption are targets of this
technology. It also offers other features like file system snapshots. The Device Mapper algorithm works at the block
level, not the file level.
The devicemapper storage driver stores every image and container on a separate virtual device; these devices are
provisioned copy-on-write snapshot devices. We call this thin provisioning, or "thinp".
Pro
It was tested and used by many communities, unlike some other filesystems.
Many projects and Linux features are built on top of the device mapper:
LVM2 – logical volume manager for the Linux kernel
dm-crypt – mapping target that provides volume encryption
dm-cache – mapping target that allows creation of hybrid volumes
dm-log-writes mapping target that uses two devices, passing through the first device and logging the write
operations performed to it on the second device
dm-verity validates the data blocks contained in a file system against a list of cryptographic hash values,
developed as part of the Chromium OS project
dmraid – provides access to "fake" RAID configurations via the device mapper
DM Multipath – provides I/O failover and load-balancing of block devices within the Linux kernel
Linux version of TrueCrypt
DRBD (Distributed Replicated Block Device)
kpartx – utility called from hotplug upon device maps creation and deletion
EVMS (deprecated)
cryptsetup – utility used to conveniently setup disk encryption based on dm-crypt
And of course Docker that uses device mapper to create copy-on-write storage for software containers.
Cons
If you would like to get good performance out of the device mapper storage driver, you should do some
configuration, and you should not run the "loopback" mode in production.
You should follow the 15 steps explained in the official Docker documentation in order to use the devicemapper
driver in direct-lvm mode.
I have never used it myself, but I have seen feedback from users having problems with its usage.
ZFS
This filesystem was first merged into the Docker engine in May 2015 and has been available since the Docker 1.7
release. ZFS support (through the Docker/go-zfs wrapper) requires the installation of zfs-utils or zfs (e.g. on
Ubuntu 16.04).
ZFS, or Zettabyte File System, is a combined file system and logical volume manager created at Sun Microsystems. It
has been used by OpenSolaris since 2005.
Pro
ZFS is a native filesystem for Solaris, OpenSolaris, OpenIndiana, illumos, Joyent SmartOS, OmniOS, FreeBSD,
Debian GNU/kFreeBSD, NetBSD and OSv.
ZFS is a killer app for Solaris, as it allows straightforward administration of disks and pools of disks while
providing performance and integrity. It protects data against "silent errors" of the disk (caused by firmware bugs
or even hardware malfunctions like bad cables, etc.).
According to Sun Microsystems official website, ZFS meets the needs of a file system for everything from desktops
to data centers and offers:
Simple administration: ZFS automates and consolidates complicated storage administration concepts, reducing
administrative overhead by 80 percent.
Provable data integrity: ZFS protects all data with 64-bit checksums that detect and correct silent data
corruption.
Unlimited scalability: As the world's first 128-bit file system, ZFS offers 16 billion billion times the capacity of
32- or 64-bit systems.
Blazing performance: ZFS is based on a transactional object model that removes most of the traditional
constraints on the order of issuing I/Os, which results in huge performance gains.
It achieves its performance through a number of techniques:
Dynamic striping across all devices to maximize throughput
Copy-on-write design makes most disk writes sequential
Multiple block sizes, automatically chosen to match workload
Explicit I/O priority with deadline scheduling
Globally optimal I/O sorting and aggregation
Multiple independent prefetch streams with automatic length and stride detection
Unlimited, instantaneous read/write snapshots
Parallel, constant-time directory operations
Creating a new zpool is needed in order to use zfs:
sudo zpool create -f zpool-docker /dev/xvdb
Cons
The ZFS license is incompatible with the Linux kernel license, so the mainline Linux kernel does not include a ZFS
implementation, and not every OS can use ZFS (e.g. Windows).
It takes some learning to use, so if you are using it as your main filesystem you will need some knowledge
about it. zfs lacks inode sharing for shared libraries between containers, but in reality it is not the only driver
that does not implement this.
VFS
vfs simply stands for Virtual File System, which is the abstraction layer on top of a concrete physical filesystem.
It does not use a union filesystem or CoW, which is why it is used by developers for debugging only.
You can find Docker volumes using vfs in /var/lib/docker/vfs/dir .
Pro
To test the Docker engine, VFS is very useful since it is simpler to validate tests using this simple FS. This
filesystem is also helpful for running Docker in Docker (dind); the official Docker image uses vfs as its default
storage driver:
--storage-driver=vfs
vfs is the only driver which is guaranteed to work regardless of the underlying filesystem in the case of Docker in
Docker, but it can be very slow and inefficient. Running Docker in Docker will be detailed later in this book.
Cons
It is not recommended at all to run VFS in production, since it is not intended to be used with Docker production
clusters.
What Storage Driver To Choose
Your choice of storage driver cannot be based only on the pros and cons of each filesystem; there are other
selection criteria. For example, overlay and overlay2 cannot be used on top of btrfs.
In general, some of these storage drivers can operate on top of different backing filesystems (the host filesystem),
but not all of them. This table shows the common usage of each storage driver:
| Storage Driver | Commonly Used On Top Of |
|----------------|-------------------------|
| overlay | xfs & ext4 |
| overlay2 | xfs & ext4 |
| aufs | xfs & ext4 |
| btrfs | btrfs |
| devicemapper | direct-lvm |
| zfs | zfs |
| vfs | debugging |
The Docker community created a practical diagram that simply shows the strengths and weaknesses of some storage
drivers and their best use cases.
Finally, there is no storage driver adapted to all use cases and no "ultimate choice" to make: it depends on your
use case. If you don't know exactly what to choose, go for aufs or overlay2. You can check and change the active
driver as shown below.
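As a hedged example (note that switching drivers hides the images and containers created with the previous driver
until you switch back), you can check which storage driver the daemon is using and start the daemon with another
one:

docker info | grep -i "storage driver"
docker daemon --storage-driver=overlay2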
Chapter V - Working With Docker Images
   o   ^__^
    o  (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||
Managing Docker Images
Images, Intermediate Images & Dangling Images
If you are running Docker on your laptop or your server, you can see a list of the images that Docker used or uses by
typing:
docker images
You will have a similar list:
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 57ce8f25dd63 2 days ago 229.7 MB
scratch latest f02aa3980a99 2 days ago 0 B
xenial latest 7a409243b212 2 days ago 229.7 MB
<none> <none> 149b13361203 2 days ago 12.96 MB
Images with <none> are untagged images.
You can print a custom output where you choose to view the ID and the size of the image:
docker images --format "{{.ID}}: {{.Size}}"
Or to view the repository:
docker images --format "{{.ID}}: {{.Repository}}"
In many cases, we just need to get IDs:
docker images -q
If you want to list all of them, then type:
docker images -a
or
docker images --all
In this list you will see all of the images even the intermediate ones.
REPOSITORY TAG IMAGE ID SIZE
my_app latest 7f49abaf7a69 1.093 MB
<none> <none> afe4509e17bc 225.6 MB
All of the <none>:<none> entries are intermediate images.
<none>:<none> images will grow quickly with the number of images you download.
As you know, each Docker image is composed of layers with a parent-child hierarchical relationship.
These intermediate layers are the result of caching build operations, which decreases disk usage and speeds up
builds. Every build step is cached; that's why you may have experienced some disk space problems after using Docker
for a while.
All Docker layers are stored in /var/lib/docker/graph , called the graph database.
Some of the intermediary images are not tagged at all: they are called dangling images.
docker images --filter "dangling=true"
Other filters may be used:
- label=<key> or label=<key>=<value>
- before=(<image-name>[:tag]|<image-id>|<image@digest>)
- since=(<image-name>[:tag]|<image-id>|<image@digest>)
Finding Images
If you type:
docker search ubuntu
you will get a list of images called ubuntu that people have shared publicly on the Docker Hub (hub.docker.com).
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
ubuntu Ubuntu is a Debian-based Linux operating s... 4958 [OK]
ubuntu-upstart Upstart is an event-based replacement for ... 67 [OK]
rastasheep/ubuntu-sshd Dockerized SSH service, built on top of of... 47 [OK]
ubuntu-debootstrap debootstrap --variant=minbase --components... 28 [OK]
torusware/speedus-ubuntuAlways updated official Ubuntu docker imag... 27 [OK]
consol/ubuntu-xfce-vnc Ubuntu container with "headless" VNC sessi... 26 [OK]
ioft/armhf-ubuntu ABR] Ubuntu Docker images for the ARMv7(a... 19 [OK]
nickistre/ubuntu-lamp LAMP server on Ubuntu 10 [OK]
nuagebec/ubuntu Simple always updated Ubuntu docker images... 9 [OK]
nimmis/ubuntu This is a docker images different LTS vers... 5 [OK]
maxexcloo/ubuntu Base image built on Ubuntu with init, Supe... 2 [OK]
jordi/ubuntu Ubuntu Base Image 1 [OK]
admiringworm/ubuntu Base ubuntu images based on the official u... 1 [OK]
darksheer/ubuntu Base Ubuntu Image -- Updated hourly 1 [OK]
lynxtp/ubuntu https://github.com/lynxtp/docker-ubuntu 0 [OK]
datenbetrieb/ubuntu custom flavor of the official ubuntu base ... 0 [OK]
teamrock/ubuntu TeamRock's Ubuntu image configured with AW... 0 [OK]
labengine/ubuntu Images base ubuntu 0 [OK]
esycat/ubuntu Ubuntu LTS 0 [OK]
ustclug/ubuntu ubuntu image for docker with USTC mirror 0 [OK]
widerplan/ubuntu Our basic Ubuntu images. 0 [OK]
konstruktoid/ubuntu Ubuntu base image 0 [OK]
vcatechnology/ubuntu A Ubuntu image that is updated daily 0 [OK]
webhippie/ubuntu Docker images for ubuntu 0 [OK]
As you can notice, some images have automated builds while others don't have this feature activated. Automated
builds allow your image to stay up-to-date with changes to your private or public git (GitHub or Bitbucket)
repository.
Notice that if you type the search command you will get only 25 images; if you want more, you can use the --limit
option:
docker search --limit 100 mongodb
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
mongo MongoDB document databases provide high av... 2616 [OK]
tutum/mongodb MongoDB Docker image – listens in port 2... 166 [OK]
frodenas/mongodb A Docker Image for MongoDB 12 [OK]
agaveapi/mongodb-sync Docker image that regularly backs up and/o... 7 [OK]
sameersbn/mongodb 6 [OK]
bitnami/mongodb Bitnami MongoDB Docker Image 5 [OK]
waitingkuo/mongodb MongoDB 2.4.9 4 [OK]
tobilg/mongodb-marathon A Docker image to start a dynamic MongoDB ... 4 [OK]
appelgriebsch/mongodb Configurable MongoDB container based on Al... 3 [OK]
azukiapp/mongodb Docker image to run MongoDB by Azuki - htt... 3 [OK]
tianon/mongodb-mms https://mms.mongodb.com/ 2 [OK]
cpuguy83/mongodb 2 [OK]
zokeber/mongodb MongoDB Dockerfile in CentOS 7 2 [OK]
triply/mongodb Extension of official mongodb image that a... 1 [OK]
hairmare/mongodb MongoDB on Gentoo 1 [OK]
networld/mongodb Networld PaaS MongoDB image in default ins... 1 [OK]
jetlabs/mongodb Build MongoDB 3 image 1 [OK]
mattselph/ubuntu-mongodb Ubuntu 14.04 LTS with Mongodb 2.6.4 1 [OK]
oliverwehn/mongodb Out-of-the-box app-ready MongoDB server wi... 1 [OK]
gorniv/mongodb MongoDB Docker image 1 [OK]
vaibhavtodi/mongodb A MongoDB Docker image on Ubuntu 14.04.3. ... 1 [OK]
tozd/meteor-mongodb MongoDB server image for Meteor applications. 1 [OK]
ncarlier/mongodb MongoDB Docker image based on debian. 1 [OK]
pulp/mongodb 1 [OK]
ulboralabs/alpine-mongodb Docker Alpine Mongodb 1 [OK]
mminke/mongodb Mongo db image which downloads the databas... 1 [OK]
kardasz/mongodb MongoDB 0 [OK]
tcaxias/mongodb Percona's MongoDB on Debian. Storage Engin... 0 [OK]
whatwedo/mongodb 0 [OK]
guttertec/mongodb MongoDB is a free and open-source cross-pl... 0 [OK]
peerlibrary/mongodb 0 [OK]
jecklgamis/mongodb mongodb 0 [OK]
unzeroun/mongodb Mongodb image 0 [OK]
falinux/mongodb mongodb docker image. 0 [OK]
hysoftware/mongodb Docker mongodb image for hysoftware.net 0 [OK]
birdhouse/mongodb Docker image for MongoDB used in Birdhouse. 0 [OK]
airdock/mongodb 0 [OK]
bitergia/mongodb MongoDB Docker image (deprecated) 0 [OK]
tianon/mongodb-server 0 [OK]
andreynpetrov/mongodb mongodb 0 [OK]
lukaszm/mongodb MongoDB 0 [OK]
radiantwf/mongodb MongoDB Enterprise Docker image 0 [OK]
luca3m/mongodb-example Sample mongodb app 0 [OK]
derdiedasjojo/mongodb mongodb cluster prepared 0 [OK]
faboulaye/mongodb Mongodb container 0 [OK]
tcloud/mongodb mongodb 0 [OK]
omallo/mongodb MongoDB image build 0 [OK]
romeoz/docker-mongodb MongoDB container image which can be linke... 0 [OK]
denmojo/mongodb A simple mongodb container that can be lin... 0 [OK]
pl31/debian-mongodb mongodb from debian packages 0 [OK]
kievechua/mongodb Based on Tutum's Mongodb with official image 0 [OK]
apiaryio/base-dev-mongodb WARNING: to be replaced by apiaryio/mongodb 0 [OK]
babim/mongodb docker-mongodb 0 [OK]
blkpark/mongodb mongodb 0 [OK]
partlab/ubuntu-mongodb Docker image to run an out of the box Mong... 0 [OK]
jbanetwork/mongodb mongodb 0 [OK]
baselibrary/mongodb ThoughtWorks Docker Image: mongodb 0 [OK]
glnds/mongodb CentOS 7 / MongoDB 3 0 [OK]
hpess/mongodb 0 [OK]
hope/mongodb MongoDB image 0 [OK]
recteurlp/mongodb Fedora DockerFile for MongoDB 0 [OK]
docker search --limit 100 mongodb|wc -l
101
To refine your search, you can filter it using the --filter option.
Let's search for the best Mongodb images according to the community (images with at least 5 stars):
docker search --filter=stars=5 mongo
Tutum, Frodenas and Bitnami have the most popular Mongodb images:
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
tutum/mongodb MongoDB.. 166 [OK]
frodenas/mongodb A Docker.. 12 [OK]
bitnami/mongodb Bitnami.. 5 [OK]
Images can be official or not. Just like any open source project, public Docker images can be made by anyone who
has access to Docker Hub, so consider double-checking images before using them in your production
servers.
Official images could be filtered in this way:
docker search --filter=is-official=true mongo
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
mongo MongoDB document databases provide high av... 2616 [OK]
mongo-express Web-based MongoDB admin interface, written... 89 [OK]
You can filter the output based on these conditions:
stars=<number>
is-automated=(true|false)
is-official=(true|false)
The images you can find are either public images or your own private images stored in Docker Hub. In the next
section we will use a private registry.
Finding Private Images
If you have a private registry, you can use the Docker API:
curl -X GET http://localhost:5000/v1/search?q=ubuntu
Or use Docker client:
docker search localhost:5000/ubuntu
In the last example, I am using a local private registry without any SSL encryption; replace localhost with your
secure remote server's domain or IP address.
Pulling Images
If you want to pull the latest tag of an image, say Ubuntu, you just need to type:
docker pull ubuntu
But the Ubuntu image has many tags, like devel, 16.10, 14.04, trusty, etc.
You can find all of the tags here: https://hub.docker.com/r/library/ubuntu/tags/
To pull the development image, type:
docker pull ubuntu:devel
You will see the tag in the pull output:
devel: Pulling from library/ubuntu
8e21f82d32cf: Pulling fs layer
54d6ba364cfb: Pulling fs layer
451f851c9d9f: Pulling fs layer
55e763f0d444: Waiting
b112a925308f: Waiting
Removing Images
To remove all of your local images, simply type:
docker rmi $(docker images -q)
We can also safely remove only dangling images by typing:
docker rmi $(docker images -f "dangling=true" -q)
Creating New Images Using Dockerfile
The Dockerfile is the file that Docker reads in order to build images. It is a simple text file with a specific
instructional language to assemble the different layers of an image.
You can find below a list of the different instructions that could be used to create an image and then we will see how
to build the image using Dockerfile.
FROM
In the Dockerfile, the first line should start with this instruction.
It is not widely used, but when we want to build multiple images, we can have multiple FROM instructions in the
same Dockerfile.
FROM <image>:<tag>
Example:
FROM ubuntu:14.04
If the tag is not specified then Docker will download the latest tagged image.
MAINTAINER
The maintainer is not really an instruction, but it indicates the name, email or website of the image maintainer. It is
the equivalent of the author in code documentation.
MAINTAINER <name>
Example:
MAINTAINER Aymen EL Amri - @eon01
RUN
You can run commands (like Linux CLI commands) or executables.
RUN <command>
or
RUN ["<executable>", "<param>", "<param1>", ... ,"<paramN>"]
The first form runs a command just like any other Linux command, using the /bin/sh -c shell. Windows
commands are executed using the cmd /S /C shell.
Examples:
RUN ls -l
During the build process, you will see the output of the command:
Step 4 : RUN ls -l
---> Running in b3e87d26c09a
total 64
drwxr-xr-x 2 root root 4096 Oct 6 07:47 bin
drwxr-xr-x 2 root root 4096 Apr 10 2014 boot
drwxr-xr-x 5 root root 360 Nov 3 21:55 dev
drwxr-xr-x 64 root root 4096 Nov 3 21:55 etc
drwxr-xr-x 2 root root 4096 Apr 10 2014 home
drwxr-xr-x 12 root root 4096 Oct 6 07:47 lib
drwxr-xr-x 2 root root 4096 Oct 6 07:47 lib64
drwxr-xr-x 2 root root 4096 Oct 6 07:46 media
drwxr-xr-x 2 root root 4096 Apr 10 2014 mnt
drwxr-xr-x 2 root root 4096 Oct 6 07:46 opt
dr-xr-xr-x 267 root root 0 Nov 3 21:55 proc
drwx------ 2 root root 4096 Oct 6 07:47 root
drwxr-xr-x 8 root root 4096 Oct 13 21:13 run
drwxr-xr-x 2 root root 4096 Oct 13 21:13 sbin
drwxr-xr-x 2 root root 4096 Oct 6 07:46 srv
dr-xr-xr-x 13 root root 0 Nov 3 21:54 sys
drwxrwxrwt 2 root root 4096 Oct 6 07:47 tmp
drwxr-xr-x 11 root root 4096 Oct 13 21:13 usr
drwxr-xr-x 13 root root 4096 Oct 13 21:13 var
The same command could be called like this:
RUN ["/bin/sh", "-c", "ls -l"]
The latter is called the exec form.
CMD
The CMD instruction lets you define which executable should be run when a container is started from your image.
Like the RUN instruction, you can use the shell form:
CMD <command> <param1> <param2> .. <paramN>
The exec form:
CMD ["<executable>", "<param>", "<param1>", ... ,"<paramN>"]
or as default parameters to the ENTRYPOINT instruction (explained later):
CMD [<"param1">,<"param2"> .. <"paramN">]
Using CMD, the same instruction (that we used with RUN) will be run, but not during the build: it will be executed
when the container starts.
Example:
CMD ["ls", "-l"]
LABEL
The LABEL instruction is useful in case you want to add metadata to a given image.
LABEL <key1>=<value1> <key2>=<value2> .. <keyN>=<valueN>
Labels are key-value pairs.
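For instance, a hypothetical label set for an image could look like this (the key names are arbitrary):
LABEL version="1.0" description="Image used for the Painless Docker examples"
You can later read the labels back with docker inspect.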
Not only Docker images can have labels; the following can too:
Docker containers
Docker daemons
Docker volumes
Docker networks (and Swarm networks)
Docker Swarm nodes
Docker Swarm services
EXPOSE
When running an application or a service inside a Docker container, this service usually needs to listen and
send its data to the outside. Imagine we have a PHP/MySQL web application on a host. We created two
containers, a MySQL container and a web server container, say Apache.
At this stage, the DB server cannot communicate with the web server, the web server cannot query the
database, and neither server is accessible from outside the host.
EXPOSE 3306
EXPOSE 80 443
Now, if we would like to open the ports 80 and 443 so that the web server can be reached from outside the host, we can
combine this instruction with the -p or -P flag at run time.
-p publishes a container's port (or a range of ports) to the host and -P publishes all of the exposed ports. You can
expose one port number and publish it externally under another number.
We are going to see this later in this book but keep in mind that exposing ports in the Dockerfile is not mapping
ports to host's network interfaces.
To expose a list of ports, say 3000, 3001, 3002, .., 3999, 4000, you can use this:
EXPOSE 3000-4000
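To make the difference between exposing and publishing concrete, here is a small sketch, assuming a hypothetical image called my_image whose Dockerfile contains EXPOSE 80:
# Publish the exposed port 80 under the host port 8080
docker run -d -p 8080:80 my_image
# Publish all of the exposed ports under random host ports
docker run -d -P my_image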
ENV
The ENV is an instruction that sets environment variables. It is the equivalent of Linux:
export variable=value
ENV works with <key>/<value> pairs. You can use the ENV instruction in two ways:
ENV variable1 This is value1
ENV variable2 This is value2
or like this:
ENV variable1="this is value1" variable2="this is value2"
If you are used to Dockerfiles and building your own images, you may have seen this:
ENV DEBIAN_FRONTEND noninteractive
This is discouraged because the environment variable persists after the build.
However, you can set it via ARG ( ARG instruction is explained later in this section).
ARG DEBIAN_FRONTEND=noninteractive
ADD
As its name may indicate, the ADD instruction will add files from the host to the guest.
ADD has been part of Docker since the beginning and supports a few additional tricks (compared to COPY) beyond simply
copying files.
ADD has two forms:
ADD <src>... <dest>
And if your path contains whitespaces, you may use this form:
ADD ["<src>",... "<dest>"]
You can use some tricks like the * operator:
ADD /var/www/* /var/www/
Or the ? operator to replace a character. If we want to copy all of the following files:
-rwxrwxrwx 1 eon01 sudo 11474 Nov 3 00:50 chapter1
-rwxrwxrwx 1 eon01 sudo 35163 Nov 3 00:50 chapter2
-rwxrwxrwx 1 root root 5233 Nov 3 00:50 chapter3
-rwxrwxrwx 1 eon01 sudo 22411 Nov 3 00:50 chapter4
-rwxr-xr-x 1 eon01 sudo 13550 Nov 6 02:26 chapter5
-rwxrwxrwx 1 eon01 sudo 3235 Nov 6 01:15 chapter6
-rwxrwxrwx 1 eon01 sudo 395 Nov 3 00:51 chapter7
-rwxrwxrwx 1 eon01 sudo 466 Nov 3 00:51 chapter8
-rwxrwxrwx 1 eon01 sudo 272 Nov 3 00:51 chapter9
We can use this:
ADD /home/eon01/painlessdocker/chapter? /var/www
Using Docker ADD, you can also download files from URLs:
ADD https://github.com/eon01/PainlessDocker/blob/master/README.md /var/www/index.html
This instruction will copy the HTML file called README.md to index.html under /var/www.
Docker's ADD will not discover file URLs for you, so you should not use something like https://github.com alone: the URL must point to a file.
The ADD instruction recognizes formats like gzip, bzip2 or xz: if the source is a local tar archive, it is directly
unpacked as a directory (tar -x).
We haven't seen WORKDIR yet, but keep in mind the following point. If we would like to copy index.html to
/var/www/painlessdocker/ we should do this:
ADD index.html /var/www/painlessdocker/
But if our WORKDIR instruction referenced /var/www/ as our working directory, we can use the following instruction:
ADD index.html painlessdocker/
A form that references a path outside of the build context, like the following, will not work:
ADD ../index.html /var/www/
COPY
Like the ADD instruction, the COPY instruction has two forms:
COPY <src>... <dest>
and
COPY ["<src>",... "<dest>"]
The second form is used for file paths containing spaces.
COPY ["/home/eon01/Painless Docker.html", "/var/www/index.html"]
You can use some tricks like the * operator:
COPY /var/www/* /var/www/
Or the ? operator to replace a character. If we want to add all of the following files:
-rwxrwxrwx 1 eon01 sudo 11474 Nov 3 00:50 chapter1
-rwxrwxrwx 1 eon01 sudo 35163 Nov 3 00:50 chapter2
-rwxrwxrwx 1 root root 5233 Nov 3 00:50 chapter3
-rwxrwxrwx 1 eon01 sudo 22411 Nov 3 00:50 chapter4
-rwxr-xr-x 1 eon01 sudo 13550 Nov 6 02:26 chapter5
-rwxrwxrwx 1 eon01 sudo 3235 Nov 6 01:15 chapter6
-rwxrwxrwx 1 eon01 sudo 395 Nov 3 00:51 chapter7
-rwxrwxrwx 1 eon01 sudo 466 Nov 3 00:51 chapter8
-rwxrwxrwx 1 eon01 sudo 272 Nov 3 00:51 chapter9
We can use this:
COPY /home/eon01/painlessdocker/chapter? /var/www
If we would like to copy index.html to /var/www/painlessdocker/ we should do this:
COPY index.html /var/www/painlessdocker/
But if our WORKDIR instruction referenced /var/www/ as our working directory, we can use the following instruction:
COPY index.html painlessdocker/
A form that references a path outside of the build context, like the following, will not work:
COPY ../index.html /var/www/
Unlike the ADD instruction, the COPY instruction will not work with archive files and URLs.
ENTRYPOINT
A question that you may ask is: what happens when a container starts?
Imagine we have a tiny Python server that the container should start; the ENTRYPOINT should be:
python -m SimpleHTTPServer
Or a Node.js application to run it inside the container:
node app.js
The ENTRYPOINT is what helps us to start a server in this case or to execute a command in the general case.
That's why we need the ENTRYPOINT instruction.
Whenever we start a Docker container, declaring which command should be executed is important; otherwise the
container will shut down.
In the general case, the ENTRYPOINT is declared in the Dockerfile.
This instruction has two forms.
The exec form is the preferred one:
ENTRYPOINT [<"executable">, <"param1">, <"param2">.... <"paramN">]
The second one is the shell form:
ENTRYPOINT <command> <param1> <param2> .. <paramN>
Example:
ENTRYPOINT ["node", "app.js"]
Let's take the example of a simple Python application:
print("Hello World")
We want the container to run the Python script as an ENTRYPOINT.
Now that we know some instructions like FROM, COPY and ENTRYPOINT, we can create a Dockerfile just using
those instructions.
echo "print('Hello World')" > app.py
touch Dockerfile
This is the content of the Dockerfile:
FROM python:2.7
COPY app.py .
ENTRYPOINT python app.py
Reading this Dockerfile, we understand that:
The python:2.7 image will be downloaded from Docker Hub
The file app.py will be copied inside the container
The command python app.py will be executed when the container starts.
We haven't seen this yet in Painless Docker, but this is the process of building the image and running the container
(the build and run commands will be detailed later in this book).
The build:
docker build .
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM python:2.7
---> d0614bfb3c4e
Step 2 : COPY app.py .
---> Using cache
---> 6659a70e1775
Step 3 : ENTRYPOINT python app.py
---> Using cache
---> 236e1648a508
Successfully built 236e1648a508
The run:
docker run 236e1648a508
Hello World
Notice that the Python script was executed just after running the container.
VOLUME
The VOLUME instruction creates an external mount point from an internal directory. Any external volume mounted
using this instruction could be used by another Docker container.
We can use the JSON array form:
VOLUME ["<Directory>"]
or the plain form:
VOLUME <Directory1> <Directory2> .. <DirectoryN>
But why?
Docker containers are ephemeral, so volumes are used to keep data persistent through container restarts, stops
and removals.
By default, a Docker container does not share its data with the host; volumes allow the host to access
the container's data.
By default, two containers, even running on the same host, cannot share data; sharing files through a
Docker volume allows other containers to access that data.
Setting up the permissions or the ownership of a volume should be done before the VOLUME instruction in the
Dockerfile.
For example, this way of setting up the owner is wrong:
VOLUME /app
ADD app.py /app/
RUN chown -R foo:foo /app
However the following Dockerfile has a good syntax:
RUN mkdir /app
ADD app.py /app/
RUN chown -R foo:foo /app
VOLUME /app
A good scenario for Docker volumes is database containers, where data is mounted outside the container. This
makes backing up the database easy.
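As a minimal sketch of this scenario, assuming a /backup/mongo directory on the host, you could mount the Mongodb data directory from the host at run time and back it up with your usual tools:
docker run -d -v /backup/mongo:/data/db mongo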
USER
When running a command you may need to do it as another user (not the default user, which is the root user).
This is when the USER instruction shall be used.
USER is used like this:
USER <user>
So, root is the default user and it has full access inside the container, so you may consider USER as a security
option.
Normally an image should never use the root user but another user that you choose using the USER instruction.
Example:
USER my_user
Sometimes you need to run a command as root; you can simply switch between different users:
USER root
RUN <a command that should be run under Root>
USER my_user
The USER instruction applies to RUN, CMD and ENTRYPOINT instructions so that any command run by one of
these instructions is attributed to the chosen user.
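As a minimal sketch, assuming a Debian/Ubuntu base image where useradd is available, the user should be created before switching to it:
FROM ubuntu:16.04
RUN useradd -m my_user
USER my_user
RUN whoami
The last RUN would print my_user during the build.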
WORKDIR
WORKDIR is used this way:
WORKDIR <Directory>
This instruction sets the working directory, so that any of the following instructions: RUN, CMD, ENTRYPOINT,
COPY & ADD will be executed in this directory.
Example: with WORKDIR set, the following copies index.html to /var/www:
WORKDIR /var/www
ADD index.html .
Note that if the WORKDIR does not exist, it will be created.
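Note also that a relative WORKDIR is resolved against the previous working directory; a small sketch:
WORKDIR /var/www
WORKDIR painlessdocker
RUN pwd
The pwd command would print /var/www/painlessdocker.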
ARG
If you want to assign a value to a variable but just during the build, you can use the ARG instruction.
The syntax is:
ARG <argument name>[=<its default value>]
or
ARG <argument name>
You can declare variables to use during the build and set them with the docker build command, which we are going to see
later.
Example:
ARG time
You can also set the value of the arguments (variables) in the Dockerfile.
ARG time=3s
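As a sketch, a Dockerfile can consume the argument during the build, and the default value can be overridden with the --build-arg flag of docker build:
ARG time=3s
RUN echo "The configured delay is $time"
docker build --build-arg time=10s .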
ONBUILD
Software should be built automatically; that's why Docker has the ONBUILD instruction. It is a trigger instruction
executed when the image is used as the base image for another build.
ONBUILD <Docker Instruction>
The trigger will be executed in the context of the downstream build, as if it had been inserted immediately after the
FROM instruction in the downstream Dockerfile.
The ONBUILD instruction is available in Docker since its version 0.8.
Let's dive into the details. First of all, let's define what a child image is:
If imageA contains an ONBUILD instruction and imageB uses imageA as its base image in order to add
other instructions (and layers) on top of it, then imageB is a child image of imageA.
This is the imageA Dockerfile:
FROM ubuntu:16.04
ONBUILD RUN echo "I will be automatically executed only in the child image"
This is the imageB Dockerfile:
FROM imageA
If you build the imageA based on Ubuntu 16.04, the ONBUILD instruction will not run the RUN echo "<..>" . But
when you build the imageB, the same instruction will be run just after the execution of the FROM instruction.
This is an example of the imageB build output where we see the message printed automatically:
Uploading context 4.51 kB
Uploading context
Step 0 : FROM imageA
# Executing 1 build triggers
Step onbuild-0 : RUN echo "I will be automatically executed only in the child image"
---> Running in acefe7b39c5
I will be automatically executed only in the child image
STOPSIGNAL
The STOPSIGNAL instruction lets you set the system call signal that will be sent to the container to exit.
STOPSIGNAL <signal>
This signal can be a valid unsigned number like 9 or a signal name like SIGTERM, SIGKILL, SIGINT, etc.
SIGTERM is the default in Docker and it is the equivalent of running kill <pid> .
In the following example, let's change it to SIGINT (the same signal sent when pressing ctrl-C):
FROM ubuntu:16.04
STOPSIGNAL SIGINT
HEALTHCHECK
The HEALTHCHECK instruction is one of the useful instructions I have been using since version 1.12.
It has two forms. Either to check a container health by running a command inside the container:
HEALTHCHECK [OPTIONS] CMD <command>
or to disable any healthcheck (all healthchecks inherited from the base image will be disabled):
HEALTHCHECK NONE
There are 3 options that we can use before the CMD command:
--interval=<interval duration>
The healthcheck will be executed every interval. The default interval duration is 30s.
--timeout=<timeout duration>
The default timeout duration is 30s. The healthcheck is considered failed if it takes longer than this duration.
--retries=N
The default number of retries is 3. The healthcheck may fail no more than N consecutive times before the container is reported unhealthy.
Here is an example of a healthcheck that runs every minute, where a check taking longer than 3 seconds is considered
failed, and 3 consecutive failures mark the container as unhealthy:
HEALTHCHECK --interval=1m --timeout=3s CMD curl -f http://localhost/ || exit 1
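Once such a container is running, you can read the result of the healthcheck; for example, assuming a container named web:
docker inspect --format '{{.State.Health.Status}}' web
This prints starting, healthy or unhealthy.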
SHELL
When using Docker, the default shell that executes all of the commands is "/bin/sh -c". This means that CMD ls -l
will actually be run inside the container like this:
/bin/sh -c ls -l
The default shell for Windows is:
cmd /S /C
SHELL instruction must be written in JSON form in the Dockerfile.
SHELL [<"executable">, <"parameters">]
This instruction could be interesting for Windows users to choose between cmd and powershell.
Example:
SHELL ["powershell", "-command"]
SHELL ["cmd", "/S"", "/C"]
On *nix you can of course work with bash, zsh, csh, tcsh, etc. For example:
SHELL ["/bin/bash", "-c"]
ENTRYPOINT VS CMD
Both the CMD and ENTRYPOINT instructions allow us to define a command that will be executed once a container
starts.
Back to the CMD instruction: we have seen that CMD can provide default parameters to the ENTRYPOINT
instruction when we use this form:
CMD [<"param1">,<"param2"> .. <"paramN">]
This is a simple Dockerfile:
FROM python:2.7
COPY app.py .
ENTRYPOINT python app.py
You may say that the Dockerfile could be written like this:
FROM python:2.7
COPY app.py .
ENTRYPOINT ["python"]
CMD ["app.py"]
Good guess... But when you run the container it will show you an error.
The CMD instruction in this form should be used like this:
FROM python:2.7
COPY app.py .
ENTRYPOINT ["/usr/bin/python"]
CMD ["app.py"]
As a conclusion,
ENTRYPOINT python app.py
could also be written like this:
ENTRYPOINT ["/usr/bin/python"]
CMD ["app.py"]
CMD and ENTRYPOINT can be used alone or together, but in all cases you should use at least one of them:
using neither CMD nor ENTRYPOINT will fail the execution of the container.
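A practical consequence is that CMD is easy to override at run time while ENTRYPOINT is not. A hedged sketch, assuming a hypothetical image my_python_image built with ENTRYPOINT ["/usr/bin/python"] and CMD ["app.py"]:
# Runs /usr/bin/python app.py
docker run my_python_image
# Replaces CMD: runs /usr/bin/python other_script.py
docker run my_python_image other_script.py
# Replaces the entrypoint itself
docker run -it --entrypoint /bin/bash my_python_image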
You can find almost the same table in the official Docker documentation, but this is the best way to understand all of
the possibilities:
|                                  | No ENTRYPOINT                  | ENTRYPOINT entrypoint_exec entrypoint_param1 (shell form)                    | ENTRYPOINT ["entrypoint_exec", "entrypoint_param1"] (exec form)   |
| No CMD                           | Will generate an error         | /bin/sh -c entrypoint_exec entrypoint_param1                                 | entrypoint_exec entrypoint_param1                                 |
| CMD ["cmd_exec", "cmd_param1"]   | cmd_exec cmd_param1            | /bin/sh -c entrypoint_exec entrypoint_param1 cmd_exec cmd_param1             | entrypoint_exec entrypoint_param1 cmd_exec cmd_param1             |
| CMD ["cmd_param1", "cmd_param2"] | cmd_param1 cmd_param2          | /bin/sh -c entrypoint_exec entrypoint_param1 cmd_param1 cmd_param2           | entrypoint_exec entrypoint_param1 cmd_param1 cmd_param2           |
| CMD cmd_exec cmd_param1          | /bin/sh -c cmd_exec cmd_param1 | /bin/sh -c entrypoint_exec entrypoint_param1 /bin/sh -c cmd_exec cmd_param1  | entrypoint_exec entrypoint_param1 /bin/sh -c cmd_exec cmd_param1  |
Building Images
The Base Image
Probably the smallest Dockerfile (not the smallest image) is the following one:
FROM <image>
Docker needs an image to run: no image, no container.
The base image is an image on top of which you add layers to create another image containing your
application. As seen in the last chapter, an image is a set of layers, and FROM <image> provides the necessary first layer to
create the image.
Using www.imagelayers.io we can visualize an image online. Let's take the example of tutum/hello-world, which you can
find on the Docker Hub website just by concatenating:
https://hub.docker.com/r/
and
tutum/hello-world
which means:
https://hub.docker.com/r/tutum/hello-world/
The Dockerfile of this image is the following:
FROM alpine
MAINTAINER support@tutum.co
RUN apk --update add nginx php-fpm && \
mkdir -p /var/log/nginx && \
touch /var/log/nginx/access.log && \
mkdir -p /tmp/nginx && \
echo "clear_env = no" >> /etc/php/php-fpm.conf
ADD www /www
ADD nginx.conf /etc/nginx/
EXPOSE 80
CMD php-fpm -d variables_order="EGPCS" && (tail -F /var/log/nginx/access.log &) && exec nginx -g "daemon off;"
The base image of this simple application is Alpine.
This image of 18 MiB has:
7 unique layers
an average layer of 3 MiB
its largest layer is 13 MB
Let's take another image and use the docker history command to see its layers:
docker pull nginx
This will pull the latest nginx image from the Docker Hub.
docker history nginx will show the different layers of nginx.
The docker history command shows you the history of an image and its different layers. You can see more information
about an image, with human readable output, by using:
docker history --no-trunc -H nginx
with:
-H, --human Print sizes and dates in human readable format (default true)
--no-trunc Don't truncate output
Dockerfile
The Dockerfile is a kind of script file with different instructive commands and arguments that describe what your
image will look like at the end of the build. It is possible to run an image directly without building it (like a
public or a private image from Docker Hub or a private repository), but if you want to have your own specific
images, and if you want to organize your deployments while distributing the same image to developers and QA
teams, creating a Dockerfile for your application is a good point to start.
The first rule: a Dockerfile should start with the FROM instruction, which reflects the fact that nothing could be done
without having a base image. The Dockerfile syntax is quite simple and explicit, and the instructions are
followed to the letter. If you need to execute more things than the Docker instructions permit, then you can just RUN
*nix commands or use shell scripts with CMD and ENTRYPOINT.
We went through the different instructions and the differences between them; you should now be able to create a Dockerfile
using just this, but we are going to see more examples later, like creating micro images for Python and Node.js, or
the Mongodb example.
Creating An Image Build Using Dockerfile
So we have seen the different instructions that can help us create a Dockerfile. Now in order to have a complete
image, we should build it.
The command to build a Docker image is:
docker build .
The '.' indicates that the Dockerfile is in the same directory where you are running the build command, which is
the context.
The context is simply your local files.
ls -l .
Dockerfile
app/
scripts/
The context is the directory and all the subdirectories where you execute the docker build command.
If you are executing the build from a different directory you can use -f:
docker build -f /path/to/the/Dockerfile/TheDockerfile /path/to/the/context/
Example:
If your Dockerfile is under /tmp and your files are in the /app directory, the command should be:
docker build -f /tmp/Dockerfile /app
Optimizing Docker Images
You can find many images of the same application on the Internet, but not all of them are optimized: they can be
really big, they may take time to build or to send through the network, and of course your deployment time could
increase.
Layers are actually what decides the size of a Docker image, and optimizing your Docker clusters starts with
optimizing the layers of your images.
This image has three layers:
FROM ubuntu
RUN apt-get update -y
RUN apt-get install python -y
While this one has only two layers:
FROM ubuntu
RUN apt-get update -y && apt-get install python -y
Both images install Python in an Ubuntu Docker image.
Docker images can get really big. Many are over 1 GB in size. How do they get so big? Do they really need to be this
big? Can we make them smaller without sacrificing functionality?
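One common trick, shown here as a sketch, is to clean the package manager cache in the same RUN instruction that installs the packages, so the deleted files never end up committed in a layer:
FROM ubuntu
RUN apt-get update -y && \
    apt-get install -y python && \
    rm -rf /var/lib/apt/lists/*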
Tagging Images
Like Git, Docker has the ability to tag specific points in history as being important.
Before using a private Docker registry or Docker Hub, you should first log in using the docker login command.
Docker Hub:
docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username (eon01):
Password:
Login Succeeded
Private registry:
docker login https://localhost:5000
Username: admin
Password:
Login Succeeded
Typically Docker developers use the tagging functionality to mark a release or a version. In this section, you'll learn
how to list image tags and how to create new tags.
When you type docker images you can get a list of the images you have on your laptop/server.
REPOSITORY TAG IMAGE ID CREATED SIZE
mongo latest c5185a594064 5 days ago 342.7 MB
alpine latest baa5d63471ea 5 weeks ago 4.803 MB
docker/whalesay latest fb434121fc77 4 hours ago 247 MB
hello-world latest 91c95931e552 5 weeks ago 910 B
You can notice in the output of the latter command that every image has a unique id but also a tag.
docker images|awk {'print $3'}|tail -n +2
c5185a594064
baa5d63471ea
fb434121fc77
91c95931e552
If you would like to download the same images, you should pull them with the right tags:
docker pull mongo:latest
docker pull alpine:latest
docker pull docker/whalesay:latest
docker pull hello-world:latest
As you can see in the docker images output, the id of the Alpine image is baa5d63471ea . We are going to use this to
give the image a new tag using the following syntax:
docker tag <image id> <docker username>/<image name>:<tag>
You can also use
docker tag <image>:<tag> <docker username>/<image name>:<tag>
Example:
docker tag baa5d63471ea alpine:me
Now when you list your images, you will notice that both the new and the original tags are listed:
alpine latest baa5d63471ea 5 weeks ago 4.803 MB
alpine me baa5d63471ea 5 weeks ago 4.803 MB
You can try also:
docker tag mongo:latest mongo:0.1
Then list your images and you will find the new mongo tag:
REPOSITORY TAG IMAGE ID
mongo 0.1 c5185a594064
mongo latest c5185a594064
alpine latest baa5d63471ea
alpine me baa5d63471ea
Like in a simple Git flow, you can pull, update, commit and push your changes to a remote repository. The commit
operation can happen on a running container; that's why we are going to run the mongo container. We
haven't yet seen how to run containers in production environments, but we are going to use the run command now
and see all of its details later in another chapter.
docker run -it -d -p 27017:27017 --name mongo mongo
Verify Mongodb container is running:
docker ps
CONTAINER ID IMAGE COMMAND PORTS NAMES
d5faf7fd8e4d mongo "/entrypoint.sh mongo" (1) mongo
(1): 27017/tcp, 0.0.0.0:27017->27017/tcp
We will use the container id d5faf7fd8e4d in order to use the commit command. Sure, we haven't made any changes
to the running container yet, but this is just a test to show you the basic command usage. Note that we
can change or keep the original tag when committing.
In the following example, the Docker image that the container d5faf7fd8e4d is running will be tagged with the new tag
mongo:0.2 :
docker commit d5faf7fd8e4d mongo:0.2
Type docker images for your verifications and notice the new tag mongo:0.2 :
REPOSITORY TAG IMAGE ID
mongo 0.2 d022237fd80d
localhost:5000/user/mongo 0.1 c5185a594064
mongo 0.1 c5185a594064
mongo latest c5185a594064
localhost:5000/mongo 0.1 c5185a594064
registry latest c9bd19d022f6
alpine latest baa5d63471ea
alpine me baa5d63471ea
A running container can also accumulate changes while running. Let's take the example of the Mongodb container:
we will add an administrative account to the running Mongodb instance. Log into the container:
docker exec -it mongo bash
Inside your running container, connect to your running database using mongo command:
root@d5faf7fd8e4d:/# mongo
MongoDB shell version: 3.2.11
connecting to: test
Server has startup warnings:
I CONTROL [initandlisten]
I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
I CONTROL [initandlisten] ** We suggest setting it to 'never'
I CONTROL [initandlisten]
I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
I CONTROL [initandlisten] ** We suggest setting it to 'never'
I CONTROL [initandlisten]
>
Now add a new administrator identified by login root and password 958d12fc49437db0c7baac22541f9b93 with the
administrative role root:
> use admin
switched to db admin
> db.createUser(
... {
... user: "root",
... pwd: "958d12fc49437db0c7baac22541f9b93",
... roles:["root"]
... })
Successfully added user: { "user" : "root", "roles" : [ "root" ] }
> exit
bye
Exit your container:
root@d5faf7fd8e4d:/# exit
exit
Now that we have made important changes to our running container, let's commit those changes. You can use a private
registry or Docker Hub, but I am going to use my public Docker Hub account in this example:
docker commit d5faf7fd8e4d eon01/mongodb
sha256:269685eeaecdea12ddd453cf98685cad1e6d3c76cccd6eebb05d3646fe496688
After the commit operation, we are sure that our changes are saved, and we can push the image:
docker push eon01/mongodb
The push refers to a repository [docker.io/eon01/mongodb]
c01c6c921c0b: Layer already exists
80c558316eec: Layer already exists
031fad254fc0: Layer already exists
ddc5125adfe9: Layer already exists
31b3084f360d: Layer already exists
77e69eeb4171: Layer already exists
718248b95529: Layer already exists
8ba476dc30da: Layer already exists
07c6326a8206: Layer already exists
fe4c16cbf7a4: Layer already exists
latest: digest: sha256:ee50ec95fd490d60796d7782a9348ef824d84110beea4f86ced1ed15a1c8976c size: 2406
Notice the line The push refers to a repository [docker.io/eon01/mongodb] . This tells us that you can find this public image
on: https://hub.docker.com/r/eon01/mongodb/
Using Docker Hub we can add a description, a README file, change the image to a private one (paid feature), etc.
If you execute a pull command on the same remote image, you can notice that nothing will be downloaded because
you already have all of the image layers locally:
docker pull eon01/mongodb
Using default tag: latest
latest: Pulling from eon01/mongodb
Digest: sha256:ee50ec95fd490d60796d7782a9348ef824d84110beea4f86ced1ed15a1c8976c
Status: Image is up to date for eon01/mongodb:latest
We can run it like this:
docker run -it -d -p 27018:27017 --name mongo_container eon01/mongodb
Notice that we used the port mapping 27018:27017 because the first Mongodb container is already mapped to the host port
27017 and it is impossible to map two containers to the same host port; the same goes for the container name.
We have not seen this detail yet, but we will go through all of these details in the next chapter.
docker ps
CONTAINER ID IMAGE COMMAND PORTS NAMES
ff4eccdd6bee eon01/mongodb "/entrypoint.sh mongo" (1) zen_dubinsky
d5faf7fd8e4d mongo "/entrypoint.sh mongo" (2) mongo
3118896db039 registry "/entrypoint.sh /etc/" (3) my_registry
(1): 27017/tcp, 0.0.0.0:27018->27017/tcp (2): 27017/tcp, 0.0.0.0:27017->27017/tcp (3): 0.0.0.0:5000->5000/tcp
If you log into your container:
docker exec -it mongo_container bash
Start the mongo client:
root@db926ae25f1e:/# mongo
MongoDB shell version: 3.2.11
connecting to: test
Server has startup warnings:
I CONTROL [initandlisten]
I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
I CONTROL [initandlisten] ** We suggest setting it to 'never'
I CONTROL [initandlisten]
I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
I CONTROL [initandlisten] ** We suggest setting it to 'never'
I CONTROL [initandlisten]
List users:
> use admin
switched to db admin
> db.getUsers()
[
{
"_id" : "admin.root",
"user" : "root",
"db" : "admin",
"roles" : [
{
"role" : "root",
"db" : "admin"
}
]
}
]
>
You can see that the changes you made are already stored in the image this container was created from.
In the same way, you can add other changes, commit, tag (if you need it) and push to your Docker Hub or private
registry.
This is what we have done until now: we had a mongo image (we are going to see its Dockerfile later in this chapter),
we ran a container from it, made some changes, committed them and pushed the result to a repository.
Your Private Registry
If you are using a private Docker repository, just add your host domain/IP like this:
docker tag <image>:<tag> <registry host>/<image name>:<tag>
Example:
docker tag mongo:latest localhost:5000/mongo:0.1
Let's run a simple local private registry to test this:
docker run -d -p 5000:5000 --name my_registry registry
Unable to find image 'registry:latest' locally
latest: Pulling from library/registry
3690ec4760f9: Already exists
930045f1e8fb: Pull complete
feeaa90cbdbc: Pull complete
61f85310d350: Pull complete
b6082c239858: Pull complete
Digest: sha256:1152291c7f93a4ea2ddc95e46d142c31e743b6dd70e194af9e6ebe530f782c17
Status: Downloaded newer image for registry:latest
3118896db039c26a74127031eefd42264e310d7cdc435e126fa8630bf8ee8c60
Verify that you are really running this with docker ps :
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3118896db039 registry "/entrypoint.sh /etc/" 2 minutes ago Up 2 minutes 0.0.0.0:5000->5000/tcp my_registry
Tag your image with your private URL:
docker tag mongo:latest localhost:5000/mongo:0.1
Check your new tag with the docker images command:
REPOSITORY TAG IMAGE ID
localhost:5000/mongo 0.1 c5185a594064
mongo 0.1 c5185a594064
mongo latest c5185a594064
registry latest c9bd19d022f6
alpine latest baa5d63471ea
alpine me baa5d63471ea
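Once the image is tagged with the registry host, pushing it works exactly like pushing to Docker Hub:
docker push localhost:5000/mongo:0.1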
Note that if you are testing private Docker registry, your default username/password are admin/admin.
Optimizing Images
You are probably used to Ubuntu (or another major distribution), so you may well use Ubuntu in your Docker
containers. Ostensibly, your image is fine: it is using a stable distribution, and you are just running Ubuntu inside
your container.
FROM ubuntu
ADD app /var/www/
CMD ["start.sh"]
The problem is that you don't really need a complete OS: you installed all of Ubuntu's files, but you are not going to use
most of them; your container does not need them.
When you build an image, generally you will configure CMD or ENTRYPOINT (or both) with something
that will be executed at the container startup. So the only processes that will be running inside the container are the
ENTRYPOINT command and all of the processes it spawns; none of the other OS processes will run, and you
probably don't need them.
This is a part of the output of ps aux of my current Ubuntu system:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 34172 3980 ? Ss 00:36 0:02 /sbin/init
root 2 0.0 0.0 0 0 ? S 00:36 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 00:36 0:00 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/0:0H]
root 7 0.0 0.0 0 0 ? S 00:36 0:55 [rcu_sched]
root 8 0.0 0.0 0 0 ? S 00:36 0:17 [rcuos/0]
root 9 0.0 0.0 0 0 ? S 00:36 0:15 [rcuos/1]
root 10 0.0 0.0 0 0 ? S 00:36 0:17 [rcuos/2]
root 11 0.0 0.0 0 0 ? S 00:36 0:12 [rcuos/3]
root 12 0.0 0.0 0 0 ? S 00:36 0:00 [rcuos/4]
root 13 0.0 0.0 0 0 ? S 00:36 0:00 [rcuos/5]
root 14 0.0 0.0 0 0 ? S 00:36 0:00 [rcuos/6]
root 15 0.0 0.0 0 0 ? S 00:36 0:00 [rcuos/7]
root 16 0.0 0.0 0 0 ? S 00:36 0:00 [rcu_bh]
root 17 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/0]
root 18 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/1]
root 19 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/2]
root 20 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/3]
root 21 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/4]
root 22 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/5]
root 23 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/6]
root 24 0.0 0.0 0 0 ? S 00:36 0:00 [rcuob/7]
root 25 0.0 0.0 0 0 ? S 00:36 0:00 [migration/0]
root 26 0.0 0.0 0 0 ? S 00:36 0:00 [watchdog/0]
root 27 0.0 0.0 0 0 ? S 00:36 0:00 [watchdog/1]
root 28 0.0 0.0 0 0 ? S 00:36 0:00 [migration/1]
root 29 0.0 0.0 0 0 ? S 00:36 0:00 [ksoftirqd/1]
root 31 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/1:0H]
root 32 0.0 0.0 0 0 ? S 00:36 0:00 [watchdog/2]
root 33 0.0 0.0 0 0 ? S 00:36 0:00 [migration/2]
root 34 0.0 0.0 0 0 ? S 00:36 0:00 [ksoftirqd/2]
root 36 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/2:0H]
root 37 0.0 0.0 0 0 ? S 00:36 0:00 [watchdog/3]
root 38 0.0 0.0 0 0 ? S 00:36 0:00 [migration/3]
root 39 0.0 0.0 0 0 ? S 00:36 0:00 [ksoftirqd/3]
root 41 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/3:0H]
root 42 0.0 0.0 0 0 ? S< 00:36 0:00 [khelper]
root 43 0.0 0.0 0 0 ? S 00:36 0:00 [kdevtmpfs]
root 44 0.0 0.0 0 0 ? S< 00:36 0:00 [netns]
root 45 0.0 0.0 0 0 ? S 00:36 0:00 [khungtaskd]
root 46 0.0 0.0 0 0 ? S< 00:36 0:00 [writeback]
root 47 0.0 0.0 0 0 ? SN 00:36 0:00 [ksmd]
root 48 0.0 0.0 0 0 ? SN 00:36 0:20 [khugepaged]
root 49 0.0 0.0 0 0 ? S< 00:36 0:00 [crypto]
root 50 0.0 0.0 0 0 ? S< 00:36 0:00 [kintegrityd]
root 51 0.0 0.0 0 0 ? S< 00:36 0:00 [bioset]
root 52 0.0 0.0 0 0 ? S< 00:36 0:00 [kblockd]
root 53 0.0 0.0 0 0 ? S< 00:36 0:00 [ata_sff]
root 54 0.0 0.0 0 0 ? S 00:36 0:00 [khubd]
root 55 0.0 0.0 0 0 ? S< 00:36 0:00 [md]
root 56 0.0 0.0 0 0 ? S< 00:36 0:00 [devfreq_wq]
root 60 0.0 0.0 0 0 ? S 00:36 0:10 [kswapd0]
root 61 0.0 0.0 0 0 ? S 00:36 0:00 [fsnotify_mark]
root 62 0.0 0.0 0 0 ? S 00:36 0:00 [ecryptfs-kthrea]
root 74 0.0 0.0 0 0 ? S< 00:36 0:00 [kthrotld]
root 75 0.0 0.0 0 0 ? S< 00:36 0:00 [acpi_thermal_pm]
root 77 0.0 0.0 0 0 ? S< 00:36 0:00 [ipv6_addrconf]
root 99 0.0 0.0 0 0 ? S< 00:36 0:00 [deferwq]
root 100 0.0 0.0 0 0 ? S< 00:36 0:00 [charger_manager]
root 108 0.0 0.0 0 0 ? S 00:36 0:03 [kworker/3:1]
root 149 0.0 0.0 0 0 ? S< 00:36 0:00 [kpsmoused]
root 166 0.0 0.0 0 0 ? S 00:36 0:00 [scsi_eh_0]
root 168 0.0 0.0 0 0 ? S< 00:36 0:00 [scsi_tmf_0]
root 169 0.0 0.0 0 0 ? S 00:36 0:00 [scsi_eh_1]
root 170 0.0 0.0 0 0 ? S< 00:36 0:00 [scsi_tmf_1]
root 171 0.0 0.0 0 0 ? S 00:36 0:00 [scsi_eh_2]
root 172 0.0 0.0 0 0 ? S< 00:36 0:00 [scsi_tmf_2]
root 173 0.0 0.0 0 0 ? S 00:36 0:00 [scsi_eh_3]
root 174 0.0 0.0 0 0 ? S< 00:36 0:00 [scsi_tmf_3]
root 252 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/1:1H]
root 253 0.0 0.0 0 0 ? S 00:36 0:01 [jbd2/sda1-8]
root 254 0.0 0.0 0 0 ? S< 00:36 0:00 [ext4-rsv-conver]
root 256 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/0:1H]
root 293 0.0 0.0 28948 2080 ? S 00:36 0:00 mountall --daemon
root 471 0.0 0.0 0 0 ? S< 00:36 0:00 [kworker/2:1H]
root 509 0.0 0.0 0 0 ? S 00:36 0:06 [jbd2/sda3-8]
root 510 0.0 0.0 0 0 ? S< 00:36 0:00 [ext4-rsv-conver]
root 540 0.0 0.0 19480 1448 ? S 00:36 0:00 upstart-udev-bridge --daemon
root 547 0.0 0.0 52116 2244 ? Ss 00:36 0:00 /lib/systemd/systemd-udevd --daemon
root 595 0.0 0.0 0 0 ? S< 00:36 0:00 [ktpacpid]
root 622 0.0 0.0 0 0 ? S< 00:36 0:00 [kmemstick]
root 623 0.0 0.0 0 0 ? S< 00:36 0:00 [hd-audio1]
Look at all of these system processes; why should they be running inside a container? Compare this with the output of a
container running the PHP-FPM image:
USER PID %CPU %MEM VSZ RSS TTY STAT COMMAND
root 1 0.0 0.0 113464 1084 ? Ss php-fpm: master process (/usr/local/etc/php-fpm.conf)
www-data 6 0.0 0.0 113464 1044 ? S php-fpm: pool www
www-data 7 0.0 0.0 113464 1064 ? S php-fpm: pool www
root 8 0.0 0.0 20228 3188 ? Ss bash
root 14 0.0 0.0 17500 2076 ? R ps aux
You can see that the only running processes are the php-fpm processes and the processes spawned by PHP.
If we use the docker history command to see the CMD command, we can notice that the only process run is php-
fpm:
docker history --human --no-trunc php:7-fpm|grep -i cmd
sha256:1..3 5 weeks ago /bin/sh -c #(nop) CMD ["php-fpm"]
Another inconvenience of using Ubuntu in this case is the size of the image; you will:
increase your build time
increase your deployment time
increase the development time
Using a minimal image will reduce all of this.
You also don't need the init system of a full operating system, since inside Docker you don't have access to all of the
kernel resources. If you would like to use a "full" operating system, you are adding problems to your problem list.
This is the case for many other OSs used inside Docker, like CentOS or Debian, unless they are optimized to run
with Docker and follow its philosophy.
From Scratch
Actually, the smallest image that we can find in the official Docker Hub is the scratch image.
Even if you can find this image on Docker Hub, you can't:
pull it,
run it
tag any image with the same name ("scratch")
But you can still use it as a reference to an image in the FROM instruction.
The scratch image is small, fast, secure and bugless.
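As a sketch of how scratch is typically used, assuming hello is a statically linked binary sitting next to the Dockerfile:
FROM scratch
COPY hello /
CMD ["/hello"]
The resulting image contains nothing but your binary.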
Busybox
According to Wikipedia:
It runs in a variety of POSIX environments such as Linux, Android, and FreeBSD, although many of the tools
it provides are designed to work with interfaces provided by the Linux kernel. It was specifically created for
embedded operating systems with very limited resources. The authors dubbed it "The Swiss Army knife of
Embedded Linux", as the single executable replaces basic functions of more than 300 common commands. It is
released as free software under the terms of the GNU General Public License v2.
BusyBox is software that provides several stripped-down Unix tools in a single executable file so that the ls
command (as an example) could be run this way:
/bin/busybox ls
Busybox is the winner among the smallest images (2.5 MB) that we can use with Docker. Executing docker pull busybox
takes almost 1 second.
With the advantage of the very minimal size come some cons: Busybox has neither a package manager
nor a gcc compiler.
You can run the busybox executable from Docker to see the list of the binaries it includes:
docker run busybox busybox
BusyBox v1.25.1 (2016-10-07 18:17:00 UTC) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
or: busybox --install [-s] [DIR]
or: function [arguments]...
BusyBox is a multi-call binary that combines many common Unix
utilities into a single executable. Most people will create a
link to busybox for each function they wish to use and BusyBox
will act like whatever it was invoked as.
Currently defined functions:
[, [[, acpid, add-shell, addgroup, adduser, adjtimex, ar, arp, arping,
ash, awk, base64, basename, beep, blkdiscard, blkid, blockdev,
bootchartd, brctl, bunzip2, bzcat, bzip2, cal, cat, catv, chat, chattr,
chgrp, chmod, chown, chpasswd, chpst, chroot, chrt, chvt, cksum, clear,
cmp, comm, conspy, cp, cpio, crond, crontab, cryptpw, cttyhack, cut,
date, dc, dd, deallocvt, delgroup, deluser, depmod, devmem, df,
dhcprelay, diff, dirname, dmesg, dnsd, dnsdomainname, dos2unix, du,
dumpkmap, dumpleases, echo, ed, egrep, eject, env, envdir, envuidgid,
ether-wake, expand, expr, fakeidentd, false, fatattr, fbset, fbsplash,
fdflush, fdformat, fdisk, fgconsole, fgrep, find, findfs, flock, fold,
free, freeramdisk, fsck, fsck.minix, fstrim, fsync, ftpd, ftpget,
ftpput, fuser, getopt, getty, grep, groups, gunzip, gzip, halt, hd,
hdparm, head, hexdump, hostid, hostname, httpd, hush, hwclock,
i2cdetect, i2cdump, i2cget, i2cset, id, ifconfig, ifdown, ifenslave,
ifplugd, ifup, inetd, init, insmod, install, ionice, iostat, ip,
ipaddr, ipcalc, ipcrm, ipcs, iplink, iproute, iprule, iptunnel,
kbd_mode, kill, killall, killall5, klogd, last, less, linux32, linux64,
linuxrc, ln, loadfont, loadkmap, logger, login, logname, logread,
losetup, lpd, lpq, lpr, ls, lsattr, lsmod, lsof, lspci, lsusb, lzcat,
lzma, lzop, lzopcat, makedevs, makemime, man, md5sum, mdev, mesg,
microcom, mkdir, mkdosfs, mke2fs, mkfifo, mkfs.ext2, mkfs.minix,
mkfs.vfat, mknod, mkpasswd, mkswap, mktemp, modinfo, modprobe, more,
mount, mountpoint, mpstat, mt, mv, nameif, nanddump, nandwrite,
nbd-client, nc, netstat, nice, nmeter, nohup, nsenter, nslookup, ntpd,
od, openvt, passwd, patch, pgrep, pidof, ping, ping6, pipe_progress,
pivot_root, pkill, pmap, popmaildir, poweroff, powertop, printenv,
printf, ps, pscan, pstree, pwd, pwdx, raidautorun, rdate, rdev,
readahead, readlink, readprofile, realpath, reboot, reformime,
remove-shell, renice, reset, resize, rev, rm, rmdir, rmmod, route, rpm,
rpm2cpio, rtcwake, run-parts, runlevel, runsv, runsvdir, rx, script,
scriptreplay, sed, sendmail, seq, setarch, setconsole, setfont,
setkeycodes, setlogcons, setserial, setsid, setuidgid, sh, sha1sum,
sha256sum, sha3sum, sha512sum, showkey, shuf, slattach, sleep, smemcap,
softlimit, sort, split, start-stop-daemon, stat, strings, stty, su,
sulogin, sum, sv, svlogd, swapoff, swapon, switch_root, sync, sysctl,
syslogd, tac, tail, tar, tcpsvd, tee, telnet, telnetd, test, tftp,
tftpd, time, timeout, top, touch, tr, traceroute, traceroute6, true,
truncate, tty, ttysize, tunctl, ubiattach, ubidetach, ubimkvol,
ubirename, ubirmvol, ubirsvol, ubiupdatevol, udhcpc, udhcpd, udpsvd,
uevent, umount, uname, unexpand, uniq, unix2dos, unlink, unlzma,
unlzop, unshare, unxz, unzip, uptime, users, usleep, uudecode,
uuencode, vconfig, vi, vlock, volname, wall, watch, watchdog, wc, wget,
which, who, whoami, whois, xargs, xz, xzcat, yes, zcat, zcip
So there is no real package manager for this tiny distribution, which is a pain; Alpine is a very good alternative if
you need a package manager.
Alpine Linux
FROM alpine
Let's build it and see if it is working with just a one-line Dockerfile:
docker build .
It is a working image, you can see the output:
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM alpine
latest: Pulling from library/alpine
3690ec4760f9: Pull complete
Digest: sha256:1354db23ff5478120c980eca1611a51c9f2b88b61f24283ee8200bf9a54f2e5c
Status: Downloaded newer image for alpine:latest
---> baa5d63471ea
Successfully built baa5d63471ea
Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.
Alpine Linux is very popular with Docker images since it is only 5.5MB and includes a package manager with many
packages.
Alpine Linux is suitable to run micro containers. The idea behind containers is also to allow the distribution of
packages easily, using images like Alpine is helpful for this case.
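As a small sketch of the package manager in action, this hypothetical Dockerfile installs curl on top of Alpine (the --no-cache flag avoids keeping the apk index in the layer):
FROM alpine
RUN apk add --no-cache curl
CMD ["curl", "--version"]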
Phusion Baseimage
Phusion Baseimage (or Baseimage-docker) consumes only 6 MB of RAM. It is a special Docker image that is configured for
correct use within Docker containers, and it is based on Ubuntu.
According to its authors, it has the following features:
Modifications for Docker-friendliness.
Administration tools that are especially useful in the context of Docker.
Mechanisms for easily running multiple processes, without violating the Docker philosophy.
You can use it as a base for your own Docker images.
Baseimage-docker is composed of:

Ubuntu 16.04 LTS: the base system.

A correct init process: Main article: Docker and the PID 1 zombie reaping problem. According to the Unix process
model, the init process (PID 1) inherits all orphaned child processes and must reap them
(https://en.wikipedia.org/wiki/Wait_(system_call)). Most Docker containers do not have an init
process that does this correctly, and as a result their containers become filled with zombie
processes over time. Furthermore, docker stop sends SIGTERM to the init process, which is then
supposed to stop all services. Unfortunately most init systems don't do this correctly within
Docker since they're built for hardware shutdowns instead. This causes processes to be hard killed
with SIGKILL, which doesn't give them a chance to correctly deinitialize things. This can cause
file corruption. Baseimage-docker comes with an init process, /sbin/my_init, that performs both
of these tasks correctly.

Fixes for APT incompatibilities with Docker: see https://github.com/dotcloud/docker/issues/1024.

syslog-ng: a syslog daemon is necessary so that many services (including the kernel itself) can correctly
log to /var/log/syslog. If no syslog daemon is running, a lot of important messages are silently
swallowed. It only listens locally, and all syslog messages are forwarded to "docker logs".

logrotate: rotates and compresses logs on a regular basis.

SSH server: allows you to easily log into your container to inspect or administer things. SSH is disabled by
default and is only one of the methods provided by baseimage-docker for this purpose; the other
method is docker exec. SSH is provided as an alternative because docker exec
comes with several caveats. Password and challenge-response authentication are disabled by
default; only key authentication is allowed.

cron: the cron daemon must be running for cron jobs to work.

runit: replaces Ubuntu's Upstart. Used for service supervision and management. Much easier to use
than SysV init, more lightweight than Upstart, and supports restarting daemons when they crash.

setuser: a tool for running a command as another user. Easier to use than su, has a smaller attack vector
than sudo, and unlike chpst this tool sets $HOME correctly. Available as /sbin/setuser.
source: https://github.com/phusion/baseimage-docker/blob/master/README.md
We are going to see how to use this image, but you can find more detailed information in the official GitHub
repository.
The same team created a base image for running Ruby, Python, Node.js and Meteor web apps, called
passenger-docker.
Running The Init System
In order to use this special init system, use the CMD instruction:
FROM phusion/baseimage:<VERSION>
CMD ["/sbin/my_init"]
Adding Additional Daemons
To add additional daemons, write a shell script which runs the daemon and add it to the following directory:
/etc/service/
Let's take the same example used in the official documentation: memcached.
echo "#!/bin/sh" > /etc/service/memcached
echo "exec /sbin/setuser memcache /usr/bin/memcached >>/var/log/memcached.log 2>&1" >> /etc/service/memcached
chmod +x /etc/service/memcached
And in your Dockerfile:
FROM phusion/baseimage:<VERSION>
CMD ["/sbin/my_init"]
RUN mkdir /etc/service/memcached
ADD memcached.sh /etc/service/memcached/run
Running Scripts At A Container Startup
You should also create a small script and add it to /etc/my_init.d/ :
echo "#!/bin/sh" > script.sh
echo "date" >> script.sh
In the Dockerfile add the script under the /etc/my_init.d directory:
RUN mkdir -p /etc/my_init.d
ADD script.sh /etc/my_init.d/script.sh
Creating Environment Variables
While you can use the default Docker commands and instructions to work with environment variables, you can also use the
/etc/container_environment
folder to store variables that can be used inside your container. The file name is the variable name and the file content is its value:
RUN echo my_login > /etc/container_environment/LOGIN
If you log inside your container, you can verify that the username has been set as an environment variable:
echo $LOGIN
my_login
Building A MongoDB Image Using An Optimized Base Image
In this example we are going to use the well-known phusion/baseimage, which is based on Ubuntu. As said, the authors
of this image describe it as "A minimal Ubuntu base image modified for Docker-friendliness". This image only
consumes 6 MB of RAM and is much more powerful than Busybox or Alpine.
A Docker image should start by the FROM instruction:
FROM phusion/baseimage
You can then add your name and/or email:
MAINTAINER Aymen El Amri - eon01.com - <amri.aymen@gmail.com>
Update the system:
RUN apt-get update
Add the installation prerequisites:
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
RUN echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | tee /etc/apt/sources.list.d/mongodb.list
Then install Mongodb and don't forget to remove the unused files:
RUN apt-get update
RUN apt-get install -y mongodb-10gen && rm -rf /var/lib/apt/lists/*
Create the Mongodb data directory and tell Docker that the port 27017 could be used by another container:
RUN mkdir -p /data/db
EXPOSE 27017
To start Mongodb, we are going to use the command /usr/bin/mongod --port 27017 ; that's why the Dockerfile will
contain the following two lines:
CMD ["--port 27017"]
ENTRYPOINT ["/usr/bin/mongod"]
The final Dockerfile is:
FROM phusion/baseimage
MAINTAINER Aymen El Amri - eon01.com - <amri.aymen@gmail.com>
RUN apt-get update
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
RUN echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | tee /etc/apt/sources.list.d/mongodb.list
RUN apt-get update
RUN apt-get install -y mongodb-10gen
RUN mkdir -p /data/db
EXPOSE 27017
CMD ["--port 27017"]
ENTRYPOINT ["/usr/bin/mongod"]
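To try the image, here is a sketch of the build and run commands (my_mongodb is an arbitrary tag):
docker build -t my_mongodb .
docker run -d -p 27017:27017 my_mongodb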
Creating A Python Application Micro Image
As said, Busybox combines tiny versions of many common UNIX utilities into a single small executable. To run a
Python application we need Python already installed, but Busybox has no package manager, so we are going to use Static-
Python: a fork of cpython that supports building a static interpreter and true standalone executables.
We are going to get the executables from here.
wget https://github.com/pts/staticpython/raw/master/release/python3.2-static
The executable is only 5,7M:
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.16.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5930789 (5,7M) [application/octet-stream]
Saving to: ‘python3.2-static’
As a Python application, we are going to run this small script:
cat app.py
print("This is Python !")
So we have 3 files:
ls -l code/chapter5_staticpython/example1
total 5800
-rw-r--r-- 1 eon01 sudo 26 Nov 12 18:05 app.py
-rw-r--r-- 1 eon01 sudo 159 Nov 12 18:28 Dockerfile
-rw-r--r-- 1 eon01 sudo 5930789 Nov 12 18:05 python3.2-static
We should add the Python executable and the app.py file into the container, then execute the script:
FROM busybox
ADD app.py python3.2-static /
RUN chmod +x /python3.2-static
CMD ["/python3.2-static", "/app.py"]
A different approach to create an image is to run wget inside the Dockerfile: the Python application
can also be downloaded (or pulled from a git repository) directly from inside the image.
FROM busybox
RUN busybox wget <url of the executable> && busybox wget <url of the python file>
RUN chmod +x /python3.2-static
ENTRYPOINT ["/python3.2-static", "/app.py"]
Creating A Node.js Application Micro Image
Let's create a simple Node.js application using one of the smallest possible images in order to run a really micro
container.
We have seen that Alpine is a good OS for a base image. Note that apk is the Alpine package manager.
This is the Dockerfile:
FROM alpine
RUN apk update && apk upgrade
RUN apk add nodejs
WORKDIR /usr/src/app
ADD app.js .
CMD [ "node", "app.js" ]
The Node.js script that we are going to run is in app.js
console.log("********** Hello World **********");
Put the js file in the same directory as the Dockerfile and build the image using docker build .
The image (with the upgrades) is about 5 MB, as you can see:
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM alpine
---> baa5d63471ea
Step 2 : RUN apk update && apk upgrade
---> Running in 0a8bb4d3be05
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz
v3.4.5-31-g7d74397 [http://dl-cdn.alpinelinux.org/alpine/v3.4/main]
v3.4.4-21-g75fc217 [http://dl-cdn.alpinelinux.org/alpine/v3.4/community]
OK: 5975 distinct packages available
(1/3) Upgrading musl (1.1.14-r12 -> 1.1.14-r14)
(2/3) Upgrading busybox (1.24.2-r11 -> 1.24.2-r12)
Executing busybox-1.24.2-r12.post-upgrade
(3/3) Upgrading musl-utils (1.1.14-r12 -> 1.1.14-r14)
Executing busybox-1.24.2-r12.trigger
OK: 5 MiB in 11 packages
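To give the image a name, you can tag it at build time and then run it; the micro_node tag below is an arbitrary example:
docker build -t micro_node .
docker run --rm micro_node
********** Hello World **********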
Creating Your Own Docker Base Image
Building your own Docker base image is possible.
As said, a Docker image is made of read-only layers that never change; in order to make it writable, the Union File
System adds a read-write file system over the read-only file system.
The base image we are going to build is one of the types of images that have no parent. There are two ways to create a
base image.
Using Tar
A base image is a working Linux OS like Ubuntu, Debian, etc. Creating a base image using tar may differ
from one distribution to another. Let's see how to create a Debian-based distribution.
We can use debootstrap, which fetches the required Debian packages to build the base system.
First of all, if you haven't installed it yet, use your package manager to download the package:
apt-get install debootstrap
This is the man description of this package:
DESCRIPTION
debootstrap bootstraps a basic Debian system of SUITE into TARGET from MIRROR by running SCRIPT. MIRROR can be an http:// or https:// URL, a file:/// URL, or an ssh:/// URL.
The SUITE may be a release code name (eg, sid, jessie, wheezy) or a symbolic name (eg, unstable, testing, stable, oldstable)
Notice that file:/ URLs are translated to file:/// (correct scheme as described in RFC1738 for local filenames), and file:// will not work.
ssh://USER@HOST/PATH URLs are retrieved using scp; use of ssh-agent or similar is strongly recommended.
Debootstrap can be used to install Debian in a system without using an installation disk but can also be used to run a different Debian flavor in a chroot environment. This way you can create a full (minimal) Debian installation which can be used for testing purposes (see the EXAMPLES section). If you are looking for a chroot system to build packages please take a look at pbuilder.
At this step, we will create an image based on Ubuntu 16.04 Xenial.
sudo debootstrap xenial xenial > /dev/null
Once it is finished you can check your Ubuntu image files:
ls xenial/
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
Let's create an archive to import it later by Docker import command:
sudo tar -C xenial/ -c . | sudo docker import - xenial
The import command creates a new filesystem image from the contents of a tarball. It is used this way:
docker import [OPTIONS] file|URL|- [REPOSITORY[:TAG]]
The possible options are: -c and -m .
-c or --change <value> is used to apply a Dockerfile instruction to the created image. -m or --message <string> is
used to set a commit message for the imported image.
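As an illustration, both options can be combined in a single import; the xenial:with-cmd tag and the CMD instruction below are arbitrary choices. Setting a CMD at import time also avoids the "No command specified" error shown below:
sudo tar -C xenial/ -c . | sudo docker import -c 'CMD ["/bin/bash"]' -m "imported xenial base image" - xenial:with-cmd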
You can run a container with the base image you have just created:
docker run xenial
But since the container has no command to run at its startup, you will have a similar error to this one:
docker: Error response from daemon: No command specified.
See 'docker run --help'.
You can try for example a command to run like date (just for the testing purpose):
docker run xenial date
Or you can verify the distribution of the created base image's OS:
docker run xenial cat /etc/lsb-release
If you are not familiar yet with the run command, we are going to see a chapter about running containers.
This is just a quick and dirty verification of the image's integrity.
You may use the local xenial base image that you have just created in a Dockerfile:
FROM xenial
RUN ls
Sending build context to Docker daemon 252.9 MB
Step 1 : FROM xenial
---> 7a409243b212
Step 2 : RUN ls
---> Running in 5f544041222a
bin
boot
dev
etc
home
lib
lib64
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
---> 57ce8f25dd63
Removing intermediate container 5f544041222a
Successfully built 57ce8f25dd63
Using Scratch
In reality you don't need to create the scratch image because it already exists, but let's see how to create one.
Actually, a scratch image is an image that was created using an empty tar archive:
tar cv --files-from /dev/null | docker import - scratch
You can see an example in the Docker library Github repository:
FROM scratch
COPY hello /
CMD ["/hello"]
This image uses an executable called hello that you can download here:
wget https://github.com/docker-library/hello-world/raw/master/hello-world/hello
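Once the hello executable is downloaded next to the Dockerfile, you can build and run this minimal image; the my_hello tag is just an example name:
chmod +x hello
docker build -t my_hello .
docker run --rm my_hello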
Chapter VI - Working With Docker Containers
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
Creating A Container
To create a container, simply run:
docker create <options> <image> <command> <args>
A simple example to start using this command is running:
docker create hello-world
Verify that the container was created by typing:
docker ps -a
or
docker ps --all
Docker will pick a random name for your container if you did not explicitly specify a name.
CONTAINER ID IMAGE COMMAND CREATED NAMES
ff330cd5505c hello-world "/hello" 2 minutes ago elegant_cori
You can set a name:
docker create --name hello-world-container hello-world
e..8
Verify the creation of the container:
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS NAMES
e490edeef083 hello-world "/hello" 24 seconds ago Created hello-world-container
ff330cd5505c hello-world "/hello" 3 minutes ago Created elegant_cori
The docker create command uses the specified image, adds a new writable layer on top of it to create the container, and
prepares it to run the specified command inside the created container. You may notice that docker create is generally
used with the -it options, where:
-i or --interactive To keep STDIN open even if not attached
-t or --tty To allocate a pseudo-TTY
Other options may be used like:
-p or --publish value To publish a container's port(s) to the host
-v or --volume value To bind mount a volume.
--dns value To set custom DNS servers (default [])
-e or --env value To set environment variables (default [])
-l or --label value To set meta data on a container (default [])
-m or --memory string To set a memory limit
--no-healthcheck To disable any container-specified HEALTHCHECK
--volume-driver string To set the volume driver for the container
--volumes-from value To mount volumes from the specified container(s) (default [])
-w or --workdir string To set the working directory inside the container
Example:
docker create --name web_server -it -v /etc/nginx/sites-enabled:/etc/nginx/sites-enabled -p 8080:80 nginx
Unable to find image 'nginx:latest' locally
latest: Pulling from library/nginx
75a822cd7888: Pull complete
0aefb9dc4a57: Pull complete
046e44ee6057: Pull complete
Digest: sha256:fab482910aae9630c93bd24fc6fcecb9f9f792c24a8974f5e46d8ad625ac2357
Status: Downloaded newer image for nginx:latest
e..4
You can see the other options in the official Docker documentation, but the most used ones are listed above.
docker create is similar to docker run -d except that the container is never started until you start it explicitly using
docker start <container_id> , which is e..4 in the last example.
Starting And Restarting A Container
We still use the nginx container example that we created using docker create --name web_server -it -v /etc/nginx/sites-
enabled:/etc/nginx/sites-enabled -p 8080:80 nginx and that has e..4 as an id.
We can use the id with the start command to start the created container:
docker start e..4
You can check if the container is running using docker ps :
CONTAINER ID IMAGE COMMAND CREATED PORTS NAMES
e70fbc6e867a nginx "nginx -g 'daemon off" 8 minutes ago 443/tcp, 0.0.0.0:8080->80/tcp web_server
The start command is used like this:
docker start <options> <container_1> .. <container_n>
Where options could be:
-a or --attach To attach STDOUT/STDERR and forward signals
--detach-keys string To override the key sequence for detaching a container
--help To print usage
-i or --interactive To attach container's STDIN
You can start multiple containers using one command:
docker start e..4 e..8
Like the start command, you can restart a container using the restart command:
Nginx container:
docker restart e..4
e..4
hello-world container:
docker restart e..8
e..8
If you want to see the different configurations of a stopped or a running container, say the nginx container that has
the id e..4, you can go to /var/lib/docker/containers/e..4 where you will find these files:
/var/lib/docker/containers/e..4/
├── config.v2.json
├── e..4-json.log
├── hostconfig.json
├── hostname
├── hosts
├── resolv.conf
├── resolv.conf.hash
└── shm
This is the config.v2.json configuration file (the original file is not formatted, it is a single compact line):
{
"StreamConfig":{
},
"State":{
"Running":true,
"Paused":false,
"Restarting":false,
"OOMKilled":false,
"RemovalInProgress":false,
"Dead":false,
"Pid":18722,
"StartedAt":"2017-01-02T23:37:03.770516537Z",
"FinishedAt":"2017-01-02T23:37:02.408427125Z",
"Health":null
},
"ID":"e..4",
"Created":"2017-01-02T23:00:06.494810718Z",
"Managed":false,
"Path":"nginx",
"Args":[
"-g",
"daemon off;"
],
"Config":{
"Hostname":"e70fbc6e867a",
"Domainname":"",
"User":"",
"AttachStdin":true,
"AttachStdout":true,
"AttachStderr":true,
"ExposedPorts":{
"443/tcp":{
},
"80/tcp":{
}
},
"Tty":true,
"OpenStdin":true,
"StdinOnce":true,
"Env":[
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"NGINX_VERSION=1.11.8-1~jessie"
],
"Cmd":[
"nginx",
"-g",
"daemon off;"
],
"Image":"nginx",
"Volumes":null,
"WorkingDir":"",
"Entrypoint":null,
"OnBuild":null,
"Labels":{
}
},
"Image":"sha256:0..f",
"NetworkSettings":{
"Bridge":"",
"SandboxID":"e..c",
"HairpinMode":false,
"LinkLocalIPv6Address":"",
"LinkLocalIPv6PrefixLen":0,
"Networks":{
"bridge":{
"IPAMConfig":null,
"Links":null,
"Aliases":null,
"NetworkID":"a..b",
"EndpointID":"f..e",
"Gateway":"172.17.0.1",
"IPAddress":"172.17.0.2",
"IPPrefixLen":16,
"IPv6Gateway":"",
"GlobalIPv6Address":"",
"GlobalIPv6PrefixLen":0,
"MacAddress":"02:42:ac:11:00:02"
}
},
"Service":null,
"Ports":{
"443/tcp":null,
"80/tcp":[
{
"HostIp":"0.0.0.0",
"HostPort":"8080"
}
]
},
"SandboxKey":"/var/run/docker/netns/eeb3ec8d995d",
"SecondaryIPAddresses":null,
"SecondaryIPv6Addresses":null,
"IsAnonymousEndpoint":false
},
"LogPath":"/var/lib/docker/containers/e..4/e..4-json.log",
"Name":"/web_server",
"Driver":"aufs",
"MountLabel":"",
"ProcessLabel":"",
"RestartCount":0,
"HasBeenStartedBefore":false,
"HasBeenManuallyStopped":false,
"MountPoints":{
"/etc/nginx/sites-enabled":{
"Source":"/etc/nginx/sites-enabled",
"Destination":"/etc/nginx/sites-enabled",
"RW":true,
"Name":"",
"Driver":"",
"Relabel":"",
"Propagation":"rprivate",
"Named":false,
"ID":""
}
},
"AppArmorProfile":"",
"HostnamePath":"/var/lib/docker/containers/e..4/hostname",
"HostsPath":"/var/lib/docker/containers/e..4/hosts",
"ShmPath":"/var/lib/docker/containers/e..4/shm",
"ResolvConfPath":"/var/lib/docker/containers/e..4/resolv.conf",
"SeccompProfile":"",
"NoNewPrivileges":false
}
Information used by containerd - as you have seen in the previous chapters - can be found in
/var/run/docker/libcontainerd/e..4/config.json
Pausing And Unpausing A Container
To pause the Nginx container use:
docker pause e..4
You can verify the STATUS of the container using docker ps :
CONTAINER ID IMAGE COMMAND STATUS PORTS
e70fbc6e867a nginx "nginx -g 'daemon off" Up About a minute (**Paused**) 443/tcp, 0.0.0.0:8080->80/tcp
To get the container back, use the unpause command:
docker unpause e..4
This is the output of ps aux|grep docker-containerd-shim when the container was paused:
root 5620 0.0 0.0 282896 5512 ? Sl 01:48 0:00 docker-containerd-
shim e..4 /var/run/docker/libcontainerd/e..4 docker-runc
And this is the same command output when the container was unpaused:
root 5620 0.0 0.0 282896 5512 ? Sl 01:48 0:00 docker-containerd-
shim e..4 /var/run/docker/libcontainerd/e..4 docker-runc
After pausing a container, Docker uses the cgroups freezer to suspend all processes in the Nginx container. You can
check the different configurations related to the container's frozen status in /sys/fs/cgroup/freezer/docker
├── cgroup.clone_children
├── cgroup.procs
├── e..4
│ ├── cgroup.clone_children
│ ├── cgroup.procs
│ ├── freezer.parent_freezing
│ ├── freezer.self_freezing
│ ├── freezer.state
│ ├── notify_on_release
│ └── tasks
├── freezer.parent_freezing
├── freezer.self_freezing
├── freezer.state
├── notify_on_release
└── tasks
The running tasks could be found here /sys/fs/cgroup/freezer/docker/e..4/tasks .
cat /sys/fs/cgroup/freezer/docker/e..4/tasks
5637
5662
Now if you type docker top e..4 , you will see the PIDs of the frozen tasks:
UID PID PPID C STIME TTY TIME CMD
root 5637 5620 0 01:48 pts/11 00:00:00 nginx: master process nginx -g daemon off;
dnsmasq 5662 5637 0 01:48 pts/11 00:00:00 nginx: worker process
So, the cgroup freezer creates some files:
freezer.state (read-write): when read, returns the effective state of the cgroup: "THAWED", "FREEZING" or
"FROZEN"
freezer.self_freezing (read-only): shows the self-state; 0 if the self-state is THAWED, otherwise 1
freezer.parent_freezing (read-only): shows the parent-state; 0 if none of the cgroup's ancestors is frozen,
otherwise 1
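You can watch the state change by pausing and unpausing the Nginx container (with the path shortened as elsewhere in this chapter); the state file should flip between FROZEN and THAWED:
docker pause e..4
cat /sys/fs/cgroup/freezer/docker/e..4/freezer.state
FROZEN
docker unpause e..4
cat /sys/fs/cgroup/freezer/docker/e..4/freezer.state
THAWED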
Stopping A Container
Using Docker Stop
You can use the stop command in the following way to stop one or multiple containers:
docker stop <options> <container_1> .. <container_n>
Example:
We first create a container using docker run -it -d --name my_container busybox ; to stop it with a custom timeout, you can
use the --time option:
docker stop --time 20 my_container
The last command will wait 20 seconds before killing the container. The default number of seconds is 10, so
executing docker stop my_container will wait 10 seconds.
Executing the docker stop command will ask the container nicely to stop, and if it does not comply within the
timeout it will be killed ungracefully.
Stopping a container sends a SIGTERM signal to the root process (PID 1) in the container; if the process has not
exited within the timeout period (10 seconds by default), a SIGKILL signal will be sent.
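You can observe this timeout with a process that ignores SIGTERM: sleep installs no signal handler and, running as PID 1, it should survive the SIGTERM, so docker stop should take the whole grace period before sending SIGKILL. The sleeper name is just an illustration:
docker run -d --name sleeper busybox sleep 3600
time docker stop sleeper
The stop command should take roughly 10 seconds to return.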
Using Docker Kill
Docker kill could be used in this way:
docker kill <options> <container_1> .. <container_n>
docker kill does not stop the container process gracefully: it does not have any sort of timeout period and sends a
SIGKILL to terminate the container process by default.
docker kill is like the Linux kill -9 command, but you can choose a different signal to send, like SIGINT or
SIGTERM:
docker kill --signal=SIGINT my_container
docker kill --signal=SIGTERM my_container
Using Docker rm -f
To remove one or multiple containers you can use docker rm command.
docker rm <options> <container_1> .. <container_n>
Say we are running this container:
docker run -it -d --name my_container busybox
When checking its status, we can see that it is up:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ec53b9228910 busybox "sh" 1 seconds ago Up 1 seconds my_container
The Docker engine cannot remove a container while it is running; you can verify this by typing:
docker rm my_container
using the short id:
docker rm ec53b9228910
or the long id:
docker rm e..8
You will have a similar output to this one:
Error response from daemon: You cannot remove a running container e..8. Stop the container before attempting removal or use -
f
Removing a running container is different from stopping it: you should add the -f or the --force option in order to
force Docker to stop it and then remove it.
docker rm -f my_container
Adding the --force option is like sending a SIGKILL signal to the container process; it is also an alternative
to executing two commands:
docker stop my_container
docker rm my_container
To sum up, the final option for stopping a running container is to use the --force or -f flag in conjunction with the
docker rm command.
Using the docker rm command with the -f option will not only stop a running container but also erase all of its traces.
Note that this option does not stop and remove the container gracefully.
Docker Signals
See also: https://blog.confirm.ch/sending-signals-docker-container/
By default, executing the docker kill command will send a SIGKILL signal to the container's main process. If you want
to send another signal like SIGTERM, SIGHUP, SIGUSR1 or SIGUSR2, you should use the -s or the --signal
option followed by the signal name. Note that you can send the same signal to multiple containers.
docker kill <container_1> .. <container_n>
or
docker kill --signal <signal_name> <container_1> .. <container_n>
I will use the SIGHUP signal that tells the Docker container's main process to reload its configuration, and to do
this we have three different approaches (matching the three commands below):
The first one is sending a signal to the container's pid
The second option is to execute the kill command inside the container
The third way to do it is using docker kill command
kill -SIGHUP <container_pid>
docker exec <container_name_or_id> kill -SIGHUP 1 (the main process always has 1 as its pid)
docker kill --signal=HUP <container_name_or_id>
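As a concrete example, nginx reloads its configuration when its master process receives a SIGHUP, so you can reload the web_server container created earlier in this chapter without restarting it:
docker kill --signal=HUP web_server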
Container Life Cycle
The following diagrams is taken from the official Docker documentation and I have not found a more clear
explanation than using the same diagram of Docker states.
Through this chapter and the previous parts of this book, we have seen how to create, run, pause, unpause, stop and
kill Docker containers. The Docker CLI allows us to communicate and send events to Docker engine in order to
manage Docker containers and their state.
A container could be in one of the different statuses:
Created: The image is pulled and the container is configured to run but not running yet.
Running: Docker creates a new RW layer on top of the pulled image and starts running the container.
Paused: Docker container is paused but ready to run again without any restart, just run the unpause command.
Stopped: The container is stopped and not running. You should start a new container if you want to run
the same application.
Deleted: A container cannot be removed unless it is stopped. In this status, there is no container at all
(neither on the disk, nor in the memory of the host machine).
Running Docker In Docker
In short, it is not recommended to run Docker in Docker. If you are curious about the details:
Probably some people tried this just for fun, but it could be a real use case: you may have a continuous integration
server running in a Docker container that should - at the same time - run and spin up containers. You could be using
Docker to run Jenkins, for example, and need to run some Docker containers inside the container running Jenkins.
This scenario could create many problems, but most of them could be resolved by bind-mounting the Docker
socket into the already running Jenkins container (which is different from running a Docker container inside
another one).
With Docker in Docker, you could have some problems like the incompatibility between the internal and external
containers' security profiles: to resolve this kind of dysfunction, the outer container should run with extra
privileges ( --privileged=true ), but keep in mind that, at the same time, this does not cooperate well with Linux
Security Modules (LSM).
When you run a container inside a Jenkins container, you will run a CoW filesystem (like AUFS or Devicemapper)
on top of another CoW filesystem, while what you want to run is a CoW filesystem (guest container) on top of
a normal filesystem (like EXT3 or EXT4). This could create errors in your internal Docker filesystem.
The author of the feature that made it possible for Docker to run inside a Docker container, Jérôme Petazzoni, wrote a
blog post explaining why you should not run Docker in Docker. The alternative is to expose the Docker socket to
your Jenkins container by binding /var/run/docker.sock using the -v flag, but this solution is not always reliable in
the latest versions since Docker Engine is no longer distributed as a single monolithic binary.
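A minimal sketch of the socket-binding approach, assuming the official jenkins image (the container name and port mapping are illustrative); jobs running inside the container can then talk to the host's Docker Engine through the mounted socket:
docker run -d --name jenkins -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock jenkins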
Using the Docker API is a good solution in this case; you can also use a wrapper like:
Dockerode: https://github.com/apocas/dockerode
Docker-py: https://github.com/docker/docker-py
DoMonit: https://github.com/eon01/DoMonit
In all cases, I don't recommend using Docker in Docker unless you really need to run it this way and you are
knowledgeable about Docker.
Spotify's Docker Garbage Collector
This tool developed by Spotify allows you to clean your host:
Containers that exited more than an hour ago are removed.
Images that do not belong to any remaining container after that are removed.
As it is explained in the Git repository of docker-gc:
Although docker normally prevents removal of images that are in use by containers, we take extra care to not
remove any image tags (e.g., ubuntu:14.04, busybox, etc) that are in use by containers. A naive docker rmi
$(docker images -q) will leave images stripped of all tags, forcing docker to re-pull the repositories when starting
new containers even though the images themselves are still on disk.
The easiest way to run the garbage collector is using its Docker container:
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v /etc:/etc spotify/docker-gc
[2016-11-04T16:26:32] [INFO] : Container running /dreamy_feynman
[2016-11-04T16:26:32] [INFO] : Container not running /nauseous_gates
[2016-11-04T16:26:32] [INFO] : Container not running /cranky_leakey
[2016-11-04T16:26:33] [INFO] : Container not running /small_fermat
[2016-11-04T16:26:33] [INFO] : Container not running /elated_bose
[2016-11-04T16:26:33] [INFO] : Container not running /sad_goldberg
[2016-11-04T16:26:33] [INFO] : Container not running /gigantic_banach
[2016-11-04T16:26:33] [INFO] : Container not running /small_payne
[2016-11-04T16:26:33] [INFO] : Container not running /reverent_jennings
[2016-11-04T16:26:33] [INFO] : Container not running /kickass_brahmagupta
[2016-11-04T16:26:33] [INFO] : Container not running /silly_fermat
[2016-11-04T16:26:33] [INFO] : Container not running /dreamy_jang
[..etc]
This is pretty useful when it is run as a cron job, for example daily at 1 AM:
0 1 * * * docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v /etc:/etc spotify/docker-gc
Using A Volume From Another Container
Say we have the following files:
tree
.
├── data
│ ├── app.js
│ └── Dockerfile
└── nodejs
└── Dockerfile
cat data/Dockerfile
FROM alpine
ADD . /code
VOLUME /code
CMD ["tail", "-f","/dev/null"]
cat data/app.js
// Load the http module to create an http server.
var http = require('http');
// Configure our HTTP server to respond with Hello World to all requests.
var server = http.createServer(function (request, response) {
response.writeHead(200, {"Content-Type": "text/plain"});
response.end("Hello World\n");
});
// Listen on port 8000, IP defaults to 127.0.0.1
server.listen(8000);
// Put a friendly message on the terminal
console.log("Server running at http://127.0.0.1:8000/");
cat nodejs/Dockerfile
FROM node
VOLUME /code
WORKDIR /code
CMD ["node", "app.js"]
Build the two images (run each build from the corresponding directory):
docker build -t my_data .
docker build -t my_nodejs .
Then start the data container and mount its volume into the Node.js container:
docker run -it -d --name my_data my_data
docker run -it --volumes-from my_data --name my_nodejs my_nodejs
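If everything went well, my_nodejs found app.js through the shared /code volume; a quick look at its logs should show the startup message:
docker logs my_nodejs
Server running at http://127.0.0.1:8000/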
Performing A Docker Backup
Using the --volumes-from flag, it is possible to backup data from inside a Docker container. Say we want to create a
Mysql container and we want to run a backup of some databases.
Normally we should run this command to start a Mysql container:
docker run --name my_db -e MYSQL_ROOT_PASSWORD=my_secret -d mysql
If we want to explicitly declare a volume for our Mysql data, we can run this command instead:
docker run --name my_db -e MYSQL_ROOT_PASSWORD=my_secret -d -v /var/lib/mysql mysql
Now we can run another container:
docker run --rm --volumes-from my_db -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /var/lib/mysql
This command launches a new container and mounts the volume from the my_db container. It then mounts a local
directory /backup and backs up the contents of the my_db volume to a .TAR file within the /backup directory.
In order to do the restore, first create a new container:
docker run -v /var/lib/mysql --name my_db_2 ubuntu /bin/bash
Un-tar the backup file in the new container's data volume:
docker run --rm --volumes-from my_db_2 -v $(pwd):/backup ubuntu bash -c "cd /var/lib/mysql && tar xvf /backup/backup.tar --strip 1"
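To verify that the restore worked, you can list the contents of the restored volume from a throwaway container; this is just a sanity check, not part of the original procedure:
docker run --rm --volumes-from my_db_2 ubuntu ls -l /var/lib/mysql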
Chapter VII - Working With Docker Machine
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
What Is Docker Machine & When Should I Use It?
Docker Machine is a Docker tool that lets you install and use Docker Engine on different hosts.
Docker Machine is useful when it comes to creating and managing Docker hosts using local virtual machines. We can
create as many Docker hosts as we want; we are not limited to a single host (localhost). We can create several
Docker hosts on our local machine, on a private server, or with cloud providers like AWS or Azure.
If you have an old laptop or desktop system, you are most probably unable to use the "normal" Docker installation
(Docker Engine). Many Windows and MacOS users use Docker Machine instead of Docker for this reason.
Since Docker Machine can create provisioned virtual machines on different cloud providers, like EC2 instances on
AWS, it could be used to provision remote Docker hosts from your local system (your laptop, or your integration server ..etc).
Installation
*nix
Installing Docker Machine can be done using these commands:
sudo apt-get install virtualbox
curl -L https://github.com/docker/machine/releases/download/v0.10.0/docker-machine-`uname -s`-`uname -m` >/tmp/docker-machine
chmod +x /tmp/docker-machine
sudo cp /tmp/docker-machine /usr/local/bin/docker-machine
You can also add the auto-completion feature using:
curl -L https://raw.githubusercontent.com/docker/docker/master/contrib/completion/bash/docker > /etc/bash_completion.d/docker
MacOS
If you are using a MacOS, you can install Docker Machine using the following commands:
curl -L https://github.com/docker/machine/releases/download/v0.10.0/docker-machine-`uname -s`-`uname -m` >/usr/local/bin/docker-machine && \
chmod +x /usr/local/bin/docker-machine
To install the command completion, use:
curl -L https://raw.githubusercontent.com/docker/docker/master/contrib/completion/bash/docker > `brew --prefix`/etc/bash_completion.d/docker
A lightweight virtualization solution for MacOS called HyperKit is used with Docker Machine; this is why you
should meet these requirements:
OS X 10.10.3 Yosemite or later
a 2010 or later Mac (i.e. a CPU that supports EPT)
The Oracle VirtualBox driver will be used to create local machines because there is no docker-machine create driver for
HyperKit, but you can still use HyperKit and VirtualBox on the same system.
Windows
Windows users should use Git Bash:
if [[ ! -d "$HOME/bin" ]]; then mkdir -p "$HOME/bin"; fi
curl -L https://github.com/docker/machine/releases/download/v0.10.0/docker-machine-Windows-x86_64.exe > "$HOME/bin/docker-machine.exe"
chmod +x "$HOME/bin/docker-machine.exe"
Because Hyper-V is not compatible with Oracle VirtualBox and because Docker for Windows uses Microsoft Hyper-
V for virtualization, it is not possible to run the two platforms at the same time, but you can still use Docker Machine
for your local VMs using Hyper-V driver.
Using Docker Machine Locally
Creating Docker Machines
You can use Docker Machine to start a public cloud instance on AWS or DO, but you can also use it to start a local
VirtualBox virtual machine. Creating a machine is done using the docker-machine create command:
docker-machine create -d virtualbox default
The first time you run this command, your local machine will:
Search the cache for the image:
(default) Image cache directory does not exist, creating it at /root/.docker/machine/cache...
Download the Boot2Docker ISO
(default) No default Boot2Docker ISO found locally, downloading the latest release...
(default) Latest release for github.com/boot2docker/boot2docker is v17.04.0-ce
(default) Downloading /root/.docker/machine/cache/boot2docker.iso from https://github.com/boot2docker...
(default) 0%....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%
Creating machine...
(default) Copying /root/.docker/machine/cache/boot2docker.iso to /root/.docker/machi...
Create the VirtualBox VM with its configurations (network, ssh ..etc):
(default) Creating VirtualBox VM...
(default) Creating SSH key...
(default) Starting the VM...
(default) Check network to re-create if needed...
(default) Found a new host-only adapter: "vboxnet0"
(default) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Connecting Docker Machines To Your Shell
To see how to connect your Docker client to the Docker Engine running on this virtual machine, run docker-machine
env default :
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="/root/.docker/machine/machines/default"
export DOCKER_MACHINE_NAME="default"
# Run this command to configure your shell:
# eval $(docker-machine env default)
If your machine has a different name, then you should use it instead of default:
docker-machine env <machine_name>
To connect a shell to the created machine and set the environment variables that the Docker client will read, you
need to type the following command and execute it each time you open a new shell or restart your machine:
eval $(docker-machine env default)
In general:
eval $(docker-machine env <machine_name>)
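To disconnect your shell from a machine and point the Docker client back at the local Docker Engine, you can unset these variables using the env command's -u (unset) flag:
eval $(docker-machine env -u)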
Working With Multiple Docker Machines
You can see the created machine using docker-machine ls :
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
default - virtualbox Running tcp://192.168.99.100:2376 v17.04.0-ce
Let's create two machines host1 and host2:
docker-machine create --driver virtualbox host1 && docker-machine create --driver virtualbox host2
Let's get the IP addresses of these two machines:
docker-machine ip host1 && docker-machine ip host2
In my case, the two machines have the following IPs:
192.168.99.100
192.168.99.101
We would like to create two Apache web servers, one in the first machine and the other in the second machine.
In order to do this, connect the first to your shell:
eval $(docker-machine env host1)
Create the first web server:
docker run -it -p 8000:80 --name httpd1 -d httpd
Connect the second machine:
eval $(docker-machine env host2)
Then create the second web server:
docker run -it -p 8000:80 --name httpd2 -d httpd
You may notice that both containers use the same host port, but this is not a problem. When we connected our shell
to the first machine, the first web server was created inside the first machine, and the second web server container
was created just after connecting the shell to the second machine.
In our case every container is deployed inside a different machine; that is why we can have the same host port 8000.
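You can verify both web servers from your local shell; each curl should return the default Apache page (It works!):
curl $(docker-machine ip host1):8000
curl $(docker-machine ip host2):8000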
You can see the active machine by typing: docker-machine active .
Getting More Information About Docker Machines
Until now we created default, host1 and host2 machines.
We can see the status of our machines using docker-machine status :
docker-machine status host1
Running
docker-machine status host2
Running
If we want more information about the driver name, authentication settings, engine information or swarm data, we can
use the inspect command. As an example, you can use docker-machine inspect host1 and you will get a similar JSON:
{
"ConfigVersion": 3,
"Driver": {
"IPAddress": "192.168.99.100",
"MachineName": "host1",
"SSHUser": "docker",
"SSHPort": 46718,
"SSHKeyPath": "/home/eon01/.docker/machine/machines/host1/id_rsa",
"StorePath": "/home/eon01/.docker/machine",
"SwarmMaster": false,
"SwarmHost": "tcp://0.0.0.0:3376",
"SwarmDiscovery": "",
"VBoxManager": {},
"HostInterfaces": {},
"CPU": 1,
"Memory": 1024,
"DiskSize": 20000,
"NatNicType": "82540EM",
"Boot2DockerURL": "",
"Boot2DockerImportVM": "",
"HostDNSResolver": false,
"HostOnlyCIDR": "192.168.99.1/24",
"HostOnlyNicType": "82540EM",
"HostOnlyPromiscMode": "deny",
"UIType": "headless",
"HostOnlyNoDHCP": false,
"NoShare": false,
"DNSProxy": true,
"NoVTXCheck": false,
"ShareFolder": ""
},
"DriverName": "virtualbox",
"HostOptions": {
"Driver": "",
"Memory": 0,
"Disk": 0,
"EngineOptions": {
"ArbitraryFlags": [],
"Dns": null,
"GraphDir": "",
"Env": [],
"Ipv6": false,
"InsecureRegistry": [],
"Labels": [],
"LogLevel": "",
"StorageDriver": "",
"SelinuxEnabled": false,
"TlsVerify": true,
"RegistryMirror": [],
"InstallURL": "https://get.docker.com"
},
"SwarmOptions": {
"IsSwarm": false,
"Address": "",
"Discovery": "",
"Agent": false,
"Master": false,
"Host": "tcp://0.0.0.0:3376",
"Image": "swarm:latest",
"Strategy": "spread",
"Heartbeat": 0,
"Overcommit": 0,
"ArbitraryFlags": [],
"ArbitraryJoinFlags": [],
"Env": null,
"IsExperimental": false
},
"AuthOptions": {
"CertDir": "/home/eon01/.docker/machine/certs",
"CaCertPath": "/home/eon01/.docker/machine/certs/ca.pem",
"CaPrivateKeyPath": "/home/eon01/.docker/machine/certs/ca-key.pem",
"CaCertRemotePath": "",
"ServerCertPath": "/home/eon01/.docker/machine/machines/host1/server.pem",
"ServerKeyPath": "/home/eon01/.docker/machine/machines/host1/server-key.pem",
"ClientKeyPath": "/home/eon01/.docker/machine/certs/key.pem",
"ServerCertRemotePath": "",
"ServerKeyRemotePath": "",
"ClientCertPath": "/home/eon01/.docker/machine/certs/cert.pem",
"ServerCertSANs": [],
"StorePath": "/home/eon01/.docker/machine/machines/host1"
}
},
"Name": "host1"
}
Starting, Stopping, Restarting & Killing Machines
Docker Machine allows users to create, stop, start and delete machines using commands that are quite similar to
Docker Engine commands.
Let's do some operations to see how this works; restart host1:
docker-machine restart host1
If you execute this, you will see this message or a similar one:
Restarted machines may have new IP addresses. You may need to re-run the `docker-machine env` command.
As the message above says, Docker Machine doesn't necessarily give a static IP to a machine. If it
restarts, you should run docker-machine env host1 again and connect it to your shell again with eval $(docker-machine
env host1) .
This is the same case when you stop and start a machine:
docker-machine stop host1
docker-machine start host1
You can abruptly force-stop host1 by typing:
docker-machine kill host1
To completely remove a machine (host1) execute this command:
docker-machine rm host1
or
docker-machine rm --force -y host1
The second command will remove the machine without prompting for confirmation.
Upgrading Docker Machines
Check your Docker Machine version by typing:
docker-machine version
Example:
docker-machine version
docker-machine version 0.10.0, build 76ed2a6
In order to see the version of a running machine (host1), type:
docker-machine version host1
You can upgrade it by typing:
docker-machine upgrade host1
This operation will stop, upgrade and restart host1 with a new IP address:
Upgrading docker...
Stopping machine to do the upgrade...
Upgrading machine "host1"...
Copying /home/eon01/.docker/machine/cache/boot2docker.iso to /home/eon01/.docker/machine/machines/host1/boot2docker.iso...
Starting machine back up...
(host1) Check network to re-create if needed...
(host1) Waiting for an IP...
Restarting docker...
Using Docker Machine With External Providers
Docker Machine could be used locally using a generic or VirtualBox drivers or on some cloud providers like :
Amazon Web Services
Microsoft Azure
Digital Ocean
Exoscale
Google Compute Engine
Microsoft Hyper-V
OpenStack
Rackspace
IBM Softlayer
VMware vCloud Air
VMware Fusion
VMware vSphere
Create Machines On Amazon Web Services
The first thing to have when using AWS are the credentials, generally saved in ~/.aws/credentials . You can use
different AWS profiles by providing the access-key and secret-key as arguments in the Docker Machine command.
We want to create a Docker Machine using the AWS EC2 driver while choosing an Ubuntu AMI, eu-west-1 as the region,
eu-west-1b as the zone, and a VPC, subnet and security group:
docker-machine create --driver amazonec2 \
--amazonec2-access-key **************************** \
--amazonec2-secret-key ********************* \
--amazonec2-ami ami-a8d2d7e \
--amazonec2-region eu-west-1 \
--amazonec2-vpc-id vpc-f65c4212 \
--amazonec2-zone b \
--amazonec2-subnet-id subnet-a1991bcd \
--amazonec2-security-group live_eon01_ec2_sg \
--amazonec2-tags Name,aws-host-1 \
--amazonec2-instance-type t2.micro \
--amazonec2-ssh-keypath /home/eon01/.ssh/id_rsa \
aws-host-1
If you don't need a public IP address you can use this option:
--amazonec2-private-address-only
You can also add other options to the last command like --amazonec2-monitoring to enable CloudWatch monitoring or
--amazonec2-use-ebs-optimized-instance to create an EBS optimized instance.
If the machine creation fails and you want to start a new machine with the same name you should manually
remove the keypair created for aws-host-1.
What I generally do is also force removing the machine (if it is safe to do, of course, otherwise be sure to backup
your data):
docker-machine rm --force -y aws-host-1 && aws ec2 delete-key-pair --profile me --region eu-west-1 --key-name aws-host-1
You can use your own configurations, choose more or less options, like for example the default Access/Secret Keys,
the default Instance Type, Security Group ..etc:
docker-machine create -d amazonec2 \
--amazonec2-region eu-west-1 \
--amazonec2-ssh-keypath /home/eon01/.ssh/id_rsa \
aws-host-1
We are going to use the first create command. This operation will:
Run some pre-create checks
Start the EC2 instance
Detect some EC2 system configurations
Provision the machine with ubuntu(systemd)
Install Docker
Copy certs to the local machine directory
Copy certs to the remote machine
Setup Docker configuration on the remote daemon
Verify that Docker is up and running on the remote machine
Running pre-create checks...
Creating machine...
(aws-host-1) Launching instance...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env aws-
host-1
Let's now connect Docker Client to the Remote Docker Engine:
eval $(docker-machine env aws-host-1)
Let's verify that the created machine is connected to the local Docker client:
docker-machine active
We should see the name of the EC2 machine here:
aws-host-1
Creating A Docker Swarm Cluster Using Docker Machine
In this part we are going to demonstrate how to create a Swarm cluster using Docker Machine (1 manager + 1
worker). In order to do this, we can follow these steps:
Create the first machine using the amazonec2 driver. This EC2 instance will be the manager.
docker-machine create --driver amazonec2 \
--amazonec2-access-key **************************** \
--amazonec2-secret-key ********************* \
--amazonec2-ami ami-a8d2d7e \
--amazonec2-region eu-west-1 \
--amazonec2-vpc-id vpc-f65c4212 \
--amazonec2-zone b \
--amazonec2-subnet-id subnet-a1991bcd \
--amazonec2-security-group live_eon01_ec2_sg \
--amazonec2-tags Name,aws-host-1 \
--amazonec2-instance-type t2.micro \
--amazonec2-ssh-keypath /home/eon01/.ssh/id_rsa \
aws-host-1
Connect it to the local Docker Client
eval $(docker-machine env aws-host-1)
Initialize the Swarm:
docker swarm init
Copy the generated command in order to use it in the second machine
Create the second machine. This EC2 instance is the worker:
docker-machine create --driver amazonec2 \
--amazonec2-access-key **************************** \
--amazonec2-secret-key ********************* \
--amazonec2-ami ami-a8d2d7e \
--amazonec2-region eu-west-1 \
--amazonec2-vpc-id vpc-f65c4212 \
--amazonec2-zone b \
--amazonec2-subnet-id subnet-a1991bcd \
--amazonec2-security-group live_eon01_ec2_sg \
--amazonec2-tags Name,aws-host-2 \
--amazonec2-instance-type t2.micro \
--amazonec2-ssh-keypath /home/eon01/.ssh/id_rsa \
aws-host-2
Connect it to the local Docker Client
eval $(docker-machine env aws-host-2)
Make this machine join the manager:
docker swarm join \
--token SWMTKN-1-*********************** \
172.32.1.13:2377
Now if you run a Docker service, it will be deployed to the Swarm cluster. We are going to use the same infinite
image that I created for the Docker Swarm chapter:
docker service create --name infinite_service eon01/infinite
If you run this directly without connecting the manager to the Docker Client, you will get this error Error response
from daemon: This node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command
on a manager node or promote the current node to a manager. which is normal, because you are connected to the worker.
Now, we should get back to the manager machine:
eval $(docker-machine env aws-host-1)
Create a new service:
docker service create --name infinite_service eon01/infinite
In order to verify that the cluster is working and that both of the created EC2 machines are receiving containers, I
scaled the service up to 50 containers using docker service scale infinite_service=50 , then I checked the containers
living in each machine using docker ps : each server was hosting 25 instances.
If you would like to automate the creation of Docker Swarm clusters using Docker Machine, you can use these
commands:
docker $(docker-machine config aws-host-1) swarm init --listen-addr $(docker-machine ip aws-host-1):2377
docker $(docker-machine config aws-host-2) swarm join $(docker-machine ip aws-host-1):2377 --listen-addr $(docker-machine ip aws-host-2):2377
Using docker-machine config <machine> will avoid switching each time between aws-host-1 and aws-host-2, since the
output of this command is similar to the following:
--tlsverify
--tlscacert="/home/eon01/.docker/machine/machines/aws-host-1/ca.pem"
--tlscert="/home/eon01/.docker/machine/machines/aws-host-1/cert.pem"
--tlskey="/home/eon01/.docker/machine/machines/aws-host-1/key.pem"
-H=tcp://54.229.238.222:2376
The latter configuration will "override" the Docker daemon's default configuration and run a container using
different TLS parameters and a different tcp socket.
The output of the other embedded command ( docker-machine ip <machine> ) is the IP address of an EC2 machine.
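This also means you can target any machine in a single command without switching your shell environment between hosts:
docker $(docker-machine config aws-host-1) ps
docker $(docker-machine config aws-host-2) ps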
Create Machines On DigitalOcean
The first step is generating a token: log in to your DigitalOcean account and generate a new token.
The second step is to grab the list of images used to provision DO virtual machines. You should execute an API call:
curl -X GET "https://api.digitalocean.com/v2/images" -H "Authorization: Bearer ed98d5c230b121ce7f4807a369c5dd1ae8cfcdf87da444ec3ffd50de45c60cd5"
You will get a JSON response.
Example: if you want to use CoreOS 1353.4.0 (beta), you should look at:
{
"id": 23857817,
"name": "1353.4.0 (beta)",
"distribution": "CoreOS",
"slug": "coreos-beta",
"public": true,
"regions": [
"nyc1",
"sfo1",
"nyc2",
"ams2",
"sgp1",
"lon1",
"nyc3",
"ams3",
"fra1",
"tor1",
"sfo2",
"blr1"
],
"created_at": "2017-04-01T01:56:28Z",
"min_disk_size": 20,
"type": "snapshot",
"size_gigabytes": 0.33
}
We also need to get the list of DO regions:
curl -X GET "https://api.digitalocean.com/v2/regions" -H "Authorization: Bearer ed98d5c230b121ce7f4807a369c5dd1ae8cfcdf87da444ec3ffd50de45c60cd5"
In order to get a formatted JSON output, you can add a Python command:
curl -X GET "https://api.digitalocean.com/v2/regions" -H "Authorization: Bearer ed98d5c230b121ce7f4807a369c5dd1ae8cfcdf87da444ec3ffd50de45c60cd5" | python -m json.tool
This is the output of regions list:
{
"links": {},
"meta": {
"total": 12
},
"regions": [
{
"available": true,
"features": [
"private_networking",
"backups",
"ipv6",
"metadata",
"install_agent",
"storage"
],
"name": "New York 1",
"sizes": [
"512mb",
"1gb",
"2gb",
"4gb",
"8gb",
"16gb",
"32gb",
"48gb",
"64gb"
],
"slug": "nyc1"
},
{
"available": true,
"features": [
"private_networking",
"backups",
"ipv6",
"metadata",
"install_agent"
],
"name": "San Francisco 1",
"sizes": [
"512mb",
"1gb",
"2gb",
"4gb",
"8gb",
"16gb",
"32gb",
"48gb",
"64gb"
],
"slug": "sfo1"
},
{
"available": true,
"features": [
"private_networking",
"backups",
"ipv6",
"metadata",
"install_agent"
],
"name": "New York 2",
"sizes": [
"512mb",
"1gb",
"2gb",
"4gb",
"8gb",
"16gb",
"32gb",
"48gb",
"64gb"
],
"slug": "nyc2"
},
{
"available": true,
"features": [
"private_networking",
"backups",
"ipv6",
"metadata",
"install_agent"
],
"name": "Amsterdam 2",
"sizes": [
"512mb",
"1gb",
"2gb",
"4gb",
"8gb",
"16gb",
"32gb",
"48gb",
"64gb"
],
"slug": "ams2"
},
.
.
.
.
{
"available": true,
"features": [
"private_networking",
"backups",
"ipv6",
"metadata",
"install_agent"
],
"name": "Bangalore 1",
"sizes": [
"512mb",
"1gb",
"2gb",
"4gb",
"8gb",
"16gb",
"32gb",
"48gb",
"64gb"
],
"slug": "blr1"
}
]
}
We are going to use the "San Fransisco 1" region.
I am a user of DigitalOcean so I had already my SSH key registered. In order to use with Docker Machine, I should
find its fingerprint. In order to this, it depends on your ssh-keygen version, but one of these commands should work
for you:
ssh-keygen -lf /path/to/ssh/key
ssh-keygen -E md5 -lf /path/to/ssh/key
Create the machine:
docker-machine create \
--driver digitalocean \
--digitalocean-access-token=ed98d5c230b121ce7f4807a369c5dd1ae8cfcdf87da444ec3ffd50de45c60cd5 \
--digitalocean-region="sfo1" \
--digitalocean-ssh-key-fingerprint="3c:b1:dd:eg:c9:31:23:4e:13:81:6d:15:5d:e7:32:2d" \
do-host-1
This command will go through the following steps:
Running pre-create checks
Creating SSH key
Assuming the DigitalOcean private SSH key is located at ~/.ssh/id_rsa
Creating Digital Ocean Droplet
Waiting for IP address to be assigned to the Droplet
Waiting for machine to be running, this may take a few minutes
Detecting operating system of created instance
Waiting for SSH to be available
Detecting the provisioner
Provisioning with ubuntu(systemd)
Installing Docker
Copying certs to the local machine directory
Copying certs to the remote machine
Setting Docker configuration on the remote daemon
Checking connection to Docker
You can add other options like the SSH user name, the image you would like to use .. etc. Example:
docker-machine create \
--driver digitalocean \
--digitalocean-access-token=<access_token> \
--digitalocean-region="<regions>" \
--digitalocean-ssh-key-fingerprint="<fingerprint>" \
--digitalocean-ssh-user="<user_id>" \
--digitalocean-image="<image_id>" \
<node_name>
Other configurations are possible like:
--digitalocean-size
--digitalocean-ipv6
--digitalocean-private-networking
--digitalocean-backups
--digitalocean-userdata
--digitalocean-ssh-port
If you want to remove the created machine locally and from DO, you should use this command:
docker-machine rm --force -y do-host-1
As already done, if you want to connect to the remote Docker Engine, type:
eval $(docker-machine env do-host-1)
Now you can create Docker containers from your laptop and see them running on the DO server.
Chapter VIII - Docker Networking
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
Single Host Vs Multi Host Networking
There are two different ways of doing networking in Docker:
Networking in a single host
Networking in a cluster of two or more hosts
Single Host Networking
By default, any Docker container will get an IP address that gives it the possibility to communicate with
other containers in the same host and with the host machine. It is also possible - as we are going to see - that a
Docker container finds another container by its name; since the IP address could be assigned dynamically at container
startup, a name is a more reliable way to find a running container.
Containers in a single host could also communicate with and reach the outside world.
Create a simple container:
docker run -it -d --name my_container busybox
And test if you can ping Google:
docker exec -it my_container ping -w3 google.com
PING google.com (216.58.204.142): 56 data bytes
64 bytes from 216.58.204.142: seq=1 ttl=48 time=2.811 ms
--- google.com ping statistics ---
3 packets transmitted, 1 packets received, 66% packet loss
round-trip min/avg/max = 2.811/2.811/2.811 ms
Now if you inspect the container using docker inspect my_container you will be able to see its network configuration
and its IP address:
"NetworkSettings": {
"Bridge": "",
"SandboxID": "555a60eaffdb4b740f7b869bac61859ecca1e39be95ee5856ca28019509e4255",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/555a60eaffdb",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "20b1b218462e6771155de75788f53b731bbff12019d977aefa7094f57275887d",
"Gateway": "172.17.0.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"MacAddress": "02:42:ac:11:00:02",
"Networks": {
"bridge": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "2094b393faacbb1cc049f1f136437b1cce6fc41abc304cf2c1ae558a62c5ee2e",
"EndpointID": "20b1b218462e6771155de75788f53b731bbff12019d977aefa7094f57275887d",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:02"
}
}
}
my_container has the IP address 172.17.0.2 that the host could reach:
ping -w1 172.17.0.2
PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.050 ms
64 bytes from 172.17.0.2: icmp_seq=2 ttl=64 time=0.045 ms
--- 172.17.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.045/0.047/0.050/0.007 ms
If you run a web server, your users must reach the port 80 (or 443) of your server. In this case an nginx container,
for example, should be reachable on its port 80 (or 443), and this is done through port forwarding, which connects it
to the host machine and then to an external network (the Internet in our case).
Let's create the web server container, forward the host port 8080 to the container port 80 and test how it
responds:
docker run -d -p 8080:80 --name my_web_server nginx
Nginx should reply if your port 8080 is not used by other applications:
curl http://0.0.0.0:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
In a single host, containers are able to see each other, to see the external world (if they are not running in isolated
networks) and they can receive traffic from an external network.
Multi Host Networking
An application could be run using many containers behind a load balancer, and these containers could be spread across
multiple hosts. Networking in multi-host environments is entirely different from single-host environments.
Containers, even spread across servers, should be able to communicate together; this is where service discovery plays
an important role in networking. Service discovery allows you to request a container name and get its IP address
back.
Docker comes with a default network driver called overlay that we are going to see later in this chapter.
If you would like to operate in the multi host mode, you have two options:
Using Docker engine in swarm mode
Running a cluster of hosts using a key value store (like Zookeeper, etcd or Consul) that allows the service
discovery
Docker Networks
Docker Default Networks
Networking is one of the most important parts of building Docker clusters and microservices.
The first thing to know about networking in Docker is how to list the different networks that Docker Engine uses:
docker network ls
You should get the list of your Docker networks, at least the pre-defined ones:
NETWORK ID NAME DRIVER SCOPE
e5ff619d25f5 bridge bridge local
98d44bf13233 docker_gwbridge bridge local
608a539ce1e3 host host local
8sm2anzzfa0i ingress overlay swarm
ede46dbb22d7 none null local
none, host and bridge are the names of the default networks that you can find in any Docker installation running a
single host. Activating the swarm mode creates another default (or predefined) network called ingress.
Clearly, these networks are not physical networks: they are virtual networks that abstract the hardware. They are
built and managed at higher levels than the physical layer, and this is one of the properties of software-defined
networking.
None Network
The network called none, using the null driver, is a predefined network that isolates a container in a way that it can
neither connect to the outside nor communicate with other containers in the same host.
NETWORK ID NAME DRIVER SCOPE
ede46dbb22d7 none null local
Let's verify this by running a busybox container in this network:
docker run --net=none -it -d --name my_container busybox
If you inspect the created container with docker inspect my_container , you can see the different network
configurations of this container and you will notice that it is attached to the none network with the id
ede46dbb22d7f3ab2dc95b11228de06e2d27e240a3f651bc2f6fd3ea0c4a2ca7. The container does not have any
gateway.
"NetworkSettings": {
"Bridge": "",
"SandboxID": "566b9a74d37c7f47e02d769b79e168df437a5b23ee030fc199d99f7d94b353b7",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/566b9a74d37c",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"none": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "ede46dbb22d7f3ab2dc95b11228de06e2d27e240a3f651bc2f6fd3ea0c4a2ca7",
"EndpointID": "b42debd75af122d113c202ad373d46e0b08d32e9ef6e9361e49515045ae6288d",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": ""
}
}
}
Since my_container is attached to the none network, it will not have access to external or internal connections.
Let's log into the container and check the network configuration:
docker exec -it my_container sh
Once you are logged inside the container, type ifconfig and you will notice that there is no interface apart from the
loopback interface.
/ # ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
If you ping or traceroute an external IP or domain name, you will not be able to do it:
/ # traceroute painlessdocker.com
traceroute: bad address 'painlessdocker.com'
/ # ping painlessdocker.com
ping: bad address 'painlessdocker.com'
Doing the same thing with the loopback address 127.0.0.1 will work:
/ # ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.085 ms
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.085/0.085/0.085 ms
The last tests confirm that any container attached to the none network does not know about the outside networks and
no host from outside can access my_container.
If you want to see the different configurations of this network, type docker network inspect none :
[
{
"Name": "none",
"Id": "ede46dbb22d7f3ab2dc95b11228de06e2d27e240a3f651bc2f6fd3ea0c4a2ca7",
"Scope": "local",
"Driver": "null",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": []
},
"Internal": false,
"Containers": {
"69806d9de46c2959d4d20f99660e7d58d7c35c1e0b33511f0b85a395b696786f": {
"Name": "my_container",
"EndpointID": "b42debd75af122d113c202ad373d46e0b08d32e9ef6e9361e49515045ae6288d",
"MacAddress": "",
"IPv4Address": "",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]
Docker Host Network
If you want a container to run with a similar networking configuration to the host machine, then you should use the
host network.
NETWORK ID NAME DRIVER SCOPE
608a539ce1e3 host host local
To see the configuration of this network, type docker network inspect host :
[
{
"Name": "host",
"Id": "608a539ce1e3e3b97964c6a2fe06eb0e0a9b539e659025fbd101b24e327d8da6",
"Scope": "local",
"Driver": "host",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": []
},
"Internal": false,
"Containers": {},
"Options": {},
"Labels": {}
}
]
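To actually attach a container to this network, pass --net=host when running it; the container then shares the
host's network stack, so no port publishing is needed. A minimal sketch (the container name nginx_host is an
arbitrary choice):
docker run -d --net=host --name nginx_host nginx
# nginx now listens directly on the host's port 80:
curl -I http://127.0.0.1:80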
For comparison, let's run an nginx web server with a published port. Note that publishing ports with -p attaches the
container to the default bridge network, not the host network:
docker run -d -p 80 nginx
Verify the output of docker ps :
CONTAINER ID IMAGE COMMAND PORTS NAMES
d0df14bf80a0 nginx "nginx -g 'daemon off" 443/tcp, 0.0.0.0:32769->80/tcp stoic_montalcini
69806d9de46c busybox "sh" my_container
Notice that we published only port 80: the -p flag ( -p 80 ) made the container's port 80 accessible from the host
at port 32769 (chosen automatically by Docker). We could instead have bound it manually to an external port of our
choice, say 8080, in which case the command to run this container would be:
docker run -d -p 8080:80 nginx
In both cases the container's port 80 is reachable (through port 32769 or port 8080 respectively), and in both cases
the server answers on the host's IP address ( 127.0.0.1 or 0.0.0.0 ).
Let's verify it by running curl -I http://0.0.0.0:32769 to see whether the server responds with a 200 or not:
HTTP/1.1 200 OK
Connection: Keep-Alive
Keep-Alive: timeout=20
Date: Sat, 07 Jan 2017 20:49:02 GMT
Content-Type: text/html
So when running curl -I http://0.0.0.0:32769 , nginx replies with a 200 response. This is the content of the page:
curl http://0.0.0.0:32769
<html><head><title>Index of /</title></head><body><h1>Index of /</h1><hr /><ol><li><strong><a href='/../'>..</a>/</strong>
<br /><small>modified: Sat, 07 Jan 2017 20:38:03 GMT<br />directory - 4.00 kbyte<br /><br /></small></li></ol><hr /></body>
</html>
Bridge Network
This is the default network for containers: any container started without the --net flag is attached to it
automatically. Two Docker containers running in this network can reach each other.
To create two containers, type:
docker run -it -d --name my_container_1 busybox
docker run -it -d --name my_container_2 busybox
These containers will be attached to the bridge network; if you type docker network inspect bridge , you will see
them listed along with the IP addresses they were given:
[
{
"Name": "bridge",
"Id": "e5ff619d25f5dfa2e9b4fe95db8136b74fa61b588fb6141b7d9678adafd155a7",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
}
]
},
"Internal": false,
"Containers": {
"1172afcb3363f36248701aaa0ba9c1080ebc94db6a168f188f6ba98907e22102": {
"Name": "my_container_1",
"EndpointID": "b8f4741fb2008b70b60a0375446653f820fcaf6b1d8279c1b7d0abbb5775aeaf",
"MacAddress": "02:42:ac:11:00:04",
"IPv4Address": "172.17.0.4/16",
"IPv6Address": ""
},
"6895aa358faea0226ba646544056c34063a0ef5b83d10e68500936d0a397bb7b": {
"Name": "my_container_2",
"EndpointID": "2d3c6c8ca175c0ecb35459dcd941c0456fbcbf8fcce4885aa48eb06b9cff19b8",
"MacAddress": "02:42:ac:11:00:05",
"IPv4Address": "172.17.0.5/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
my_container_1 has the IP address 172.17.0.4/16
my_container_2 has the IP address 172.17.0.5/16
From any of the created containers, say my_container_1, you can reach the other one; just type
docker exec -it my_container_1 ping 172.17.0.5 to ping my_container_2 from my_container_1:
docker exec -it my_container_1 ping 172.17.0.5
PING 172.17.0.5 (172.17.0.5): 56 data bytes
64 bytes from 172.17.0.5: seq=0 ttl=64 time=0.156 ms
64 bytes from 172.17.0.5: seq=1 ttl=64 time=0.071 ms
Containers running in the bridge network can reach each other by IP address; let's see whether it is possible to ping
using the container name:
docker exec -it my_container_1 ping my_container_2
ping: bad address 'my_container_2'
As you can see, Docker cannot associate a container name with an IP address here, because the default bridge
network has no name-based discovery service. Creating a user-defined network solves this problem.
docker_gwbridge Network
When you create or join a swarm cluster, Docker creates by default a network called docker_gwbridge on each host
that is part of the cluster.
In general, this network provides external connectivity to containers that would otherwise have no access to outside
networks.
docker network ls
NETWORK ID NAME DRIVER SCOPE
98d44bf13233 docker_gwbridge bridge local
Overlay networks always need the docker_gwbridge for external connectivity.
Software Defined & Multi Host Networks
Unlike the default networks, this type of network does not come with a fresh Docker installation; it must be created
by the user. The simplest way to create a new network is:
docker network create my_network
For more general use, this is the command to use:
docker network create <options> <network>
Bridge Networks
You can use one of several network drivers, such as bridge or overlay; you may also need to set an IP range or a
subnet for your network, or configure your own gateway. Type docker network create --help for more options and
configuration flags:
--aux-address value Auxiliary IPv4 or IPv6 addresses used by Network driver (default map[])
-d or --driver string Driver to manage the Network (default "bridge")
--gateway value IPv4 or IPv6 Gateway for the master subnet (default [])
--help Print usage
--internal Restrict external access to the network
--ip-range value Allocate container ip from a sub-range (default [])
--ipam-driver string IP Address Management Driver (default "default")
--ipam-opt value Set IPAM driver specific options (default map[])
--ipv6 Enable IPv6 networking
--label value Set metadata on a network (default [])
-o or --opt value Set driver specific options (default map[])
--subnet value Subnet in CIDR format that represents a network segment (default [])
In order to use the Docker service discovery, let's create a second bridge network:
docker network create --driver bridge my_bridge_network
To see the new network, type:
docker network ls
NETWORK ID NAME DRIVER SCOPE
5555cd178f99 my_bridge_network bridge local
my_container_1 and my_container_2 are running in the default bridge network; to attach them to the new network,
we type:
docker network connect my_bridge_network my_container_1
docker network connect my_bridge_network my_container_2
Now service discovery is working, and each Docker container can reach the other one by its name.
docker exec -it my_container_1 ping my_container_2
PING my_container_2 (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: seq=0 ttl=64 time=0.120 ms
64 bytes from 172.18.0.3: seq=1 ttl=64 time=0.081 ms
Docker containers running in a user-defined bridge network can reach each other by container name.
Let's create a more personalized network with a specific subnet, gateway and IP range, and let's also change its
networking behavior by decreasing the size of the largest protocol data unit that can be communicated in a single
network transaction, the MTU (Maximum Transmission Unit):
docker network create -d bridge \
--subnet=192.168.0.0/16 \
--gateway=192.168.0.100 \
--ip-range=192.168.1.0/24 \
--opt "com.docker.network.driver.mtu"="1000"
my_personalized_network
You can change other options using the --opt or -o flag like:
com.docker.network.bridge.name : bridge name to be used when creating the Linux bridge
com.docker.network.bridge.enable_ip_masquerade : Enable IP masquerading
com.docker.network.bridge.enable_icc : Enable or Disable Inter Container Connectivity
com.docker.network.bridge.host_binding_ipv4 : Default IP when binding container ports
For every bridge network a container is connected to, an interface is created inside the container; just type ifconfig
inside the container to see them.
Let's connect my_container_1 to my_personalized_network:
docker network connect my_personalized_network my_container_1
Executing ifconfig inside this container ( docker exec -it my_container_1 ifconfig ) will show us two things:
The container is attached to more than one network, since it was connected to my_bridge_network and
now also to my_personalized_network.
The MTU of the new interface is 1000.
eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:04
inet addr:172.17.0.4 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:acff:fe11:4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:194 errors:0 dropped:0 overruns:0 frame:0
TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:34549 (33.7 KiB) TX bytes:1410 (1.3 KiB)
eth1 Link encap:Ethernet HWaddr 02:42:C0:A8:01:00
inet addr:192.168.1.0 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1000 Metric:1
RX packets:1 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:87 (87.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:16 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1240 (1.2 KiB) TX bytes:1240 (1.2 KiB)
docker run supports only one network, but docker network connect can be used after the container's creation to
add it to more networks.
By default, two containers living on the same host but in different networks cannot see each other. The Docker
daemon runs a tiny embedded DNS server that provides service discovery for user-defined networks.
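Conversely, docker network disconnect detaches a container from a network, and the corresponding interface
disappears from the container. A quick sketch using the containers created above:
docker network disconnect my_personalized_network my_container_1
# the interface that belonged to that network is gone:
docker exec -it my_container_1 ifconfig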
docker_gwbridge Network
You can create new docker_gwbridge networks using the docker network create command.
Example:
docker network create --subnet 172.3.0.0/16 \
--opt com.docker.network.bridge.name=another_docker_gwbridge \
--opt com.docker.network.bridge.enable_icc=false \
--opt com.docker.network.bridge.enable_ip_masquerade=true \
another_docker_gwbridge
Overlay Networks
Overlay networks are used in multi-host environments (like Docker's swarm mode). You can create an overlay
network using the docker network create command, but only after activating swarm mode:
docker swarm init
docker network create --driver overlay --subnet 10.0.9.0/24 my_network
This is the configuration of this network:
[
{
"Name": "my_network",
"Id": "2g3i0zdldo4adfqisvqjn6gpt",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.9.0/24",
"Gateway": "10.0.9.1"
}
]
},
"Internal": false,
"Containers": null,
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "257"
},
"Labels": null
}
]
Say we have two hosts in a Docker cluster (with 192.168.10.22 and 192.168.10.23 as IP addresses): the containers
attached to this overlay network will get dynamically allocated IP addresses in the network subnet 10.0.9.0/24.
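In swarm mode you typically place services, rather than individual containers, on an overlay network with the
--network flag. A minimal sketch, assuming the my_network overlay created above (service name and replica count
are arbitrary):
docker service create --name web --network my_network --replicas 2 nginx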
VXLAN (Virtual Extensible LAN) - according to Wikipedia - is a network virtualization technology that attempts to
improve the scalability problems associated with large cloud computing deployments.
It uses a VLAN-like encapsulation technique to encapsulate MAC-based OSI layer 2 Ethernet frames within layer 4
UDP packets, using 4789 as the default destination UDP port number.
To better understand the context, keep in mind the seven layers of the OSI model.
VXLAN endpoints, which terminate VXLAN tunnels and may be both virtual or physical switch ports, are known as
VXLAN tunnel endpoints (VTEPs).
VXLAN is an evolution of efforts to standardize on an overlay encapsulation protocol. It increases scalability up to
16 million logical networks and allows for layer 2 adjacency across IP networks.
Open vSwitch is an example of a software-based virtual network switch that supports VXLAN overlay networks.
The overlay network driver works over VXLAN tunnels (connecting VTEPs) and needs a key/value store.
A VTEP has two logical interfaces, an uplink and a downlink, where the uplink acts as the tunnel endpoint, with an
IP address on which it receives VXLAN frames.
Flannel
Kubernetes does away with port mapping and assigns a unique IP address to each pod. This works well on Google
Compute Engine, but on some other cloud providers a host cannot get an entire subnet to itself. Flannel solves this
problem by creating an overlay mesh network that provisions each host with a subnet: each pod (if you are using
Kubernetes) or container gets a unique, routable IP inside the cluster.
According to the CoreOS creators, Kubernetes combined with Flannel works great with CoreOS to distribute a
workload across a cluster.
Flannel was designed to be used with Kubernetes, but it can also serve as a generic driver for creating
software-defined overlay networks, since it supports VXLAN, AWS VPC, and the default layer 2 UDP overlay
network.
Flannel uses etcd to store the network configuration (VM subnets, hosts' IPs, etc.). Among other backends, it
uses UDP and a TUN device to encapsulate an IP fragment in a UDP packet. The latter carries information such as
the MAC, the outer IP, the inner IP and the payload.
Note that IP fragmentation is a process of the Internet Protocol (IP) that breaks datagrams into smaller pieces
(generally called fragments), so that the resulting packets can pass through a link with a smaller maximum
transmission unit (MTU) than the original datagram size. The fragments are of course reassembled by the receiving
host.
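As a sketch, this is roughly how that etcd-backed configuration looks, assuming flannel's default key
/coreos.com/network/config (the subnet value is an arbitrary example):
etcdctl set /coreos.com/network/config '{ "Network": "10.5.0.0/16", "Backend": {"Type": "vxlan"} }'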
A schema in the official GitHub repository explains well how Flannel networking works in general.
To install Flannel, you need to build it from source:
sudo apt-get install linux-libc-dev golang gcc
git clone https://github.com/coreos/flannel.git
cd flannel; make dist/flanneld
Weave
Weaveworks created Weave (or Weave Net) which is a virtual network that connects Docker containers deployed
across multiple hosts.
In order to install Weave:
sudo curl -L git.io/weave -o /usr/local/bin/weave
sudo chmod a+x /usr/local/bin/weave
Now just type weave to download the weaveworks/weaveexec Docker image and see the help:
Usage:
weave --help | help
setup
version
weave launch <same arguments as 'weave launch-router'>
launch-router [--password <pass>] [--trusted-subnets <cidr>,...]
[--host <ip_address>]
[--name <mac>] [--nickname <nickname>]
[--no-restart] [--resume] [--no-discovery] [--no-dns]
[--ipalloc-init <mode>]
[--ipalloc-range <cidr> [--ipalloc-default-subnet <cidr>]]
[--log-level=debug|info|warning|error]
<peer> ...
launch-proxy [-H <endpoint>] [--without-dns] [--no-multicast-route]
[--no-rewrite-hosts] [--no-default-ipalloc] [--no-restart]
[--hostname-from-label <labelkey>]
[--hostname-match <regexp>]
[--hostname-replacement <replacement>]
[--rewrite-inspect]
[--log-level=debug|info|warning|error]
launch-plugin [--no-restart] [--no-multicast-route]
[--log-level=debug|info|warning|error]
weave prime
weave env [--restore]
config
dns-args
weave connect [--replace] [<peer> ...]
forget <peer> ...
weave run [--without-dns] [--no-rewrite-hosts] [--no-multicast-route]
[<addr> ...] <docker run args> ...
start [<addr> ...] <container_id>
attach [<addr> ...] <container_id>
detach [<addr> ...] <container_id>
restart <container_id>
weave expose [<addr> ...] [-h <fqdn>]
hide [<addr> ...]
weave dns-add [<ip_address> ...] <container_id> [-h <fqdn>] |
<ip_address> ... -h <fqdn>
dns-remove [<ip_address> ...] <container_id> [-h <fqdn>] |
<ip_address> ... -h <fqdn>
dns-lookup <unqualified_name>
weave status [targets | connections | peers | dns | ipam]
report [-f <format>]
ps [<container_id> ...]
weave stop
stop-router
stop-proxy
stop-plugin
weave reset [--force]
rmpeer <peer_id> ...
where <peer> = <ip_address_or_fqdn>[:<port>]
<cidr> = <ip_address>/<routing_prefix_length>
<addr> = [ip:]<cidr> | net:<cidr> | net:default
<endpoint> = [tcp://][<ip_address>]:<port> | [unix://]/path/to/socket
<peer_id> = <nickname> | <weave internal peer ID>
<mode> = consensus[=<count>] | seed=<mac>,... | observer
Start Weave router:
weave launch
After typing this command, you will notice that some other Docker images are pulled; that's alright, Weave needs
weavedb and weaveplugin:
Unable to find image 'weaveworks/weavedb:latest' locally
latest: Pulling from weaveworks/weavedb
1266eb846caf: Pulling fs layer
1266eb846caf: Download complete
1266eb846caf: Pull complete
Digest: sha256:c43f5767a1644196e97edce6208b0c43780c81a2279e3421791b06806ca41e5f
Status: Downloaded newer image for weaveworks/weavedb:latest
Unable to find image 'weaveworks/weave:1.8.2' locally
1.8.2: Pulling from weaveworks/weave
e110a4a17941: Already exists
199ab7eb2ba4: Already exists
8c419735a809: Already exists
1888d0f92b68: Already exists
f4d1c90c86a4: Already exists
1d6a7435ac59: Already exists
7372f3ee9e8b: Already exists
17004cbabd74: Already exists
b8e5c537a426: Already exists
4e295f039ae0: Pulling fs layer
d67a003dc85f: Pulling fs layer
2a84c77046e7: Pulling fs layer
d67a003dc85f: Download complete
2a84c77046e7: Verifying Checksum
2a84c77046e7: Download complete
4e295f039ae0: Verifying Checksum
4e295f039ae0: Download complete
4e295f039ae0: Pull complete
d67a003dc85f: Pull complete
2a84c77046e7: Pull complete
Digest: sha256:7a9ec1daa3b9022843fd18986f1bd5c44911bc9f9f40ba9b4d23b1c72c51c127
Status: Downloaded newer image for weaveworks/weave:1.8.2
Unable to find image 'weaveworks/plugin:1.8.2' locally
1.8.2: Pulling from weaveworks/plugin
e110a4a17941: Already exists
199ab7eb2ba4: Already exists
8c419735a809: Already exists
1888d0f92b68: Already exists
f4d1c90c86a4: Already exists
1d6a7435ac59: Already exists
7372f3ee9e8b: Already exists
17004cbabd74: Already exists
b8e5c537a426: Already exists
4e295f039ae0: Already exists
d67a003dc85f: Already exists
2a84c77046e7: Already exists
2b57c438a07b: Pulling fs layer
2b57c438a07b: Verifying Checksum
2b57c438a07b: Download complete
2b57c438a07b: Pull complete
Digest: sha256:3a38cec968bff6ebc4b1823673378b14d52ef750dec89e3513fe78119d07fdf2
Status: Downloaded newer image for weaveworks/plugin:1.8.2
Let's create a subnet for the host:
weave expose 10.10.0.1/16
Then inspect the output of the brctl show command as well as the newly created interface:
bridge name bridge id STP enabled interfaces
br-0ebcbf638b08 8000.0242a98e8f09 no
br-5555cd178f99 8000.0242d38260e6 no
br-da47f0537ffa 8000.024261465302 no
docker0 8000.02422492c7fe no
docker_gwbridge 8000.0242e5db7cff no veth6e6c262
lxcbr0 8000.000000000000 no
weave 8000.26eea4e57577 no vethwe-bridge
ifconfig weave
weave Link encap:Ethernet HWaddr 26:ee:a4:e5:75:77
inet addr:10.10.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1410 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:67 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:10738 (10.7 KB)
You can use ifdown weave to stop the created interface.
brctl is used to set up, maintain, and inspect the ethernet bridge configuration in the Linux Kernel.
If you want more information about Weave running in your host, type weave status :
Version: 1.8.2 (up to date; next check at 2017/01/16 01:16:44)
Service: router
Protocol: weave 1..2
Name: 26:ee:a4:e5:75:77(eonSpider)
Encryption: disabled
PeerDiscovery: enabled
Targets: 0
Connections: 0
Peers: 1
TrustedSubnets: none
Service: ipam
Status: idle
Range: 10.32.0.0/12
DefaultSubnet: 10.32.0.0/12
Service: dns
Domain: weave.local.
Upstream: 127.0.1.1
TTL: 1
Entries: 0
Service: proxy
Address: unix:///var/run/weave/weave.sock
Service: plugin
DriverName: weave
Now you can start a container directly from Weave CLI:
weave run 10.2.0.2/16 -it -d busybox
When you type docker ps you will see the last created container as well as Weave containers:
CONTAINER ID IMAGE COMMAND NAMES
0bedd8bf148a busybox "sh" sick_raman
575aee35ec8d weaveworks/plugin:1.8.2 "/home/weave/plugin" weaveplugin
9e7a5a87b137 weaveworks/weaveexec:1.8.2 "/home/weave/weavepro" weaveproxy
0edc17ad4a49 weaveworks/weave:1.8.2 "/home/weave/weaver -" weave
To connect two containers on two distinct hosts using Weave, launch these commands on the first host ($HOST1):
weave launch
eval $(weave env)
docker run --name c1 -ti busybox
and on the second host, tell $HOST2 to peer with the Weave already started on $HOST1:
weave launch $HOST1
eval $(weave env)
docker run --name c2 -ti busybox
$HOST1 is the IP address of the first host.
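At this point, Weave's DNS should let the two containers reach each other by name; a quick check from inside c2
on $HOST2 (assuming both containers above are still running):
/ # ping -c 2 c1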
This illustrates how two containers living on two distinct hosts can communicate.
In order to automate the discovery process in a Swarm cluster, start by obtaining a Swarm token:
curl -X POST https://discovery-stage.hub.docker.com/v1/clusters && echo -e "\n"
This is my token, you should of course get a different one:
10fd726a3e5341f86fb90658208e564a
If you haven't already launched weave, do it:
weave launch
Now download the discovery script:
curl -O https://raw.githubusercontent.com/weaveworks/discovery/master/discovery && chmod a+x discovery
The script is downloaded to the current directory; you should move it to a directory like /usr/bin if you
want to use it as a system executable.
Do you remember your token? You will use it here:
discovery join --advertise-router token://10fd726a3e5341f86fb90658208e564a
So far, we have been working on $HOST1. Go to $HOST2 and repeat the same commands:
weave launch
curl -O http://git.io/vmW3z && chmod a+x discovery
discovery join --advertise-router token://10fd726a3e5341f86fb90658208e564a
Both Weave routers should now be connected and should stay connected.
This is how you can use the discovery command:
Weave Discovery
discovery join [--advertise=ADDR]
[--advertise-external]
[--advertise-router]
[--weave=ADDR[:PORT]] <URL>
where <URL> = backend://path
<ADDR> = host|IP
To leave the cluster, use the following command on the host that you want to remove from it:
discovery leave
If you are using a KV store like etcd, you can also consider using it:
discovery join etcd://some/path
These are the important steps to use Weave service discovery; it is quite similar to the Swarm CLI. We are going to
see Swarm in detail in later parts of this book, which will help you better understand discovery.
Open vSwitch
Licensed under the open source Apache 2.0 license, the multilayer virtual switch Open vSwitch is designed to enable
massive network automation through programmatic extension, while still supporting standard management
interfaces and protocols (e.g. NetFlow, sFlow, IPFIX, RSPAN, CLI, LACP, 802.1ag).
Open vSwitch is also designed to support distribution across multiple physical servers similar to VMware's
vNetwork distributed vswitch or Cisco's Nexus 1000V.
In order to connect containers on multiple hosts, you need to install Open vSwitch on all hosts:
apt-get install -y openvswitch-switch bridge-utils
You may need some dependencies:
sudo apt-get install -y build-essential fakeroot debhelper \
autoconf automake bzip2 libssl-dev \
openssl graphviz python-all procps \
python-qt4 python-zopeinterface \
python-twisted-conch libtool
Then install the ovs-docker utility:
cd /usr/bin
wget https://raw.githubusercontent.com/openvswitch/ovs/master/utilities/ovs-docker
chmod a+rwx ovs-docker
Single Host
We start by creating an OVS bridge called ovs-br1:
ovs-vsctl add-br ovs-br1
Then we activate it and give it an IP and a netmask:
ifconfig ovs-br1 173.17.0.1 netmask 255.255.255.0 up
Verify your new interface configuration by typing:
ifconfig ovs-br1
ovs-br1 Link encap:Ethernet HWaddr e6:58:8f:58:89:43
inet addr:173.17.0.1 Bcast:173.17.0.255 Mask:255.255.255.0
inet6 addr: fe80::e458:8fff:fe58:8943/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:0 (0.0 B) TX bytes:648 (648.0 B)
Let's create two containers:
docker run -it --name mycontainer1 -d busybox
docker run -it --name mycontainer2 -d busybox
And connect them to the bridge:
ovs-docker add-port ovs-br1 eth1 mycontainer1 --ipaddress=173.16.0.2/24
ovs-docker add-port ovs-br1 eth1 mycontainer2 --ipaddress=173.16.0.3/24
You can now ping the second container from the first one:
docker exec -it mycontainer1 ping 173.16.0.3
PING 173.16.0.3 (173.16.0.3): 56 data bytes
64 bytes from 173.16.0.3: seq=0 ttl=64 time=0.338 ms
64 bytes from 173.16.0.3: seq=1 ttl=64 time=0.061 ms
^C
Do the same thing from the second container:
docker exec -i mycontainer2 ping 173.16.0.2
PING 173.16.0.2 (173.16.0.2): 56 data bytes
64 bytes from 173.16.0.2: seq=0 ttl=64 time=0.067 ms
64 bytes from 173.16.0.2: seq=1 ttl=64 time=0.077 ms
^C
Multi Host
One may ask: why can't we use regular Linux bridges? I will use the official Open vSwitch FAQ to answer
this:
Q: Why would I use Open vSwitch instead of the Linux bridge? A: Open vSwitch is specially designed to make
it easier to manage VM network configuration and monitor state spread across many physical hosts in dynamic
virtualized environments. Please see [WHY-OVS.md] for a more detailed description of how Open vSwitch
relates to the Linux Bridge
Create two hosts (Host1 and Host2) that can reach each other.
For this example we have two VMs, and of course openvswitch-switch, Docker and the ovs-docker tool should be
installed on both hosts.
wget -qO- https://get.docker.com/|sh
apt-get install openvswitch-switch bridge-utils openvswitch-common
On Host1: create a new OVS bridge and a veth pair, bring them up, then create the tunnel between Host1 and Host2.
Make sure to replace <Host2_IP> with the real IP of Host2.
ovs-vsctl add-br br-int
ip link add veth0 type veth peer name veth1
ovs-vsctl add-port br-int veth1
brctl addif docker0 veth0
ip link set veth1 up
ip link set veth0 up
ovs-vsctl add-port br-int gre0 -- set interface gre0 type=gre options:remote_ip=<Host2_IP>
On Host2: do the same thing; create a new OVS bridge and a veth pair, bring them up, then create the tunnel between
Host1 and Host2. Make sure to replace <Host1_IP> with the real IP of Host1.
ovs-vsctl add-br br-int
ip link add veth0 type veth peer name veth1
ovs-vsctl add-port br-int veth1
brctl addif docker0 veth0
ip link set veth1 up
ip link set veth0 up
ovs-vsctl add-port br-int gre0 -- set interface gre0 type=gre options:remote_ip=<Host1_IP>
You can see the created bridge on each host by typing ovs-vsctl show .
On Host1:
0aaba889-1d8c-4db2-b783-d7a203853d44
Bridge br-int
Port "veth1"
Interface "veth1"
Port br-int
Interface br-int
type: internal
Port "gre0"
Interface "gre0"
type: gre
options: {remote_ip="<Host2_IP>"}
ovs_version: "2.5.0"
On Host2:
7e782730-8990-4786-b2b0-efef7721665b
Bridge br-int
Port "veth1"
Interface "veth1"
Port "gre0"
Interface "gre0"
type: gre
options: {remote_ip="<Host1_IP>"}
Port br-int
Interface br-int
type: internal
ovs_version: "2.5.0"
<Host1_IP> and <Host2_IP> are of course replaced by their real values. You can also use the brctl show command
for more information.
Now, create a container in Host1:
docker run -it --name container1 -d busybox
View its IP address:
docker inspect --format '{{.NetworkSettings.IPAddress}}' container1
On Host2, do the same thing:
docker run -it --name container1 -d busybox
docker inspect --format '{{.NetworkSettings.IPAddress}}' container1
You will notice that both containers have the same IP address, 172.17.0.2 ; this could create a conflict in the cluster,
so we are going to create a second container, which will get a different IP:
docker run -it --name container2 -d busybox
docker inspect --format '{{.NetworkSettings.IPAddress}}' container2
From container1 on Host1, ping container2 on Host2:
docker exec -it container1 ping -c 2 172.17.0.3
PING 172.17.0.3 (172.17.0.3): 56 data bytes
64 bytes from 172.17.0.3: seq=0 ttl=64 time=0.985 ms
64 bytes from 172.17.0.3: seq=1 ttl=64 time=0.963 ms
--- 172.17.0.3 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.963/0.974/0.985 ms
From container2 on Host2, ping container1 on Host1. But first remove the container1 living on the same host
(Host2), to be sure we are pinging the right container (container1 on Host1):
docker rm -f container1
docker exec -it container2 ping -c 2 172.17.0.2
PING 172.17.0.2 (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=1.475 ms
64 bytes from 172.17.0.2: seq=1 ttl=64 time=1.139 ms
--- 172.17.0.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.139/1.307/1.475 ms
You have now seen how to connect two containers on different hosts using Open vSwitch.
Going further, you may notice that the docker0 interfaces on both hosts have the same IP address, 172.17.0.1.
This similarity could create confusion within a multi-host network.
This is why we are going to remove the docker0 bridge interface and recreate it with a different subnet on each host.
You are free to choose any private IP addresses; I am going to use these:
Host1 : 192.168.10.1/16
Host2 : 192.168.11.1/16
To change the IP address of an interface, you can use the ifconfig and brctl commands.
ifconfig
Usage:
ifconfig [-a] [-v] [-s] <interface> [[<AF>] <address>]
[add <address>[/<prefixlen>]]
[del <address>[/<prefixlen>]]
[[-]broadcast [<address>]] [[-]pointopoint [<address>]]
[netmask <address>] [dstaddr <address>] [tunnel <address>]
[outfill <NN>] [keepalive <NN>]
[hw <HW> <address>] [metric <NN>] [mtu <NN>]
[[-]trailers] [[-]arp] [[-]allmulti]
[multicast] [[-]promisc]
[mem_start <NN>] [io_addr <NN>] [irq <NN>] [media <type>]
[txqueuelen <NN>]
[[-]dynamic]
[up|down] ...
Usage: brctl [commands]
This is a practical example, but before starting, make sure your firewall rules will not block the next steps, and make
sure you stop Docker ( service docker stop ):
Deactivate docker0 by bringing it down:
ifconfig docker0 down
Delete the bridge and create a new one having the same interface name:
brctl delbr docker0
brctl addbr docker0
Bring it up while assigning it a new address and a new mtu :
sudo ifconfig docker0 192.168.10.1/16 mtu 1400 up
The last change will not be persistent unless you edit the /etc/default/docker configuration file and add
--bip=192.168.10.1/16 to DOCKER_OPTS .
Example:
DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 --bip=10.11.12.1/24"
If you are using Ubuntu/Debian, you can use a script that I found in a GitHub gist and tested; it will do this
automatically for you. I forked it here: https://gist.github.com/eon01/b7fbfa3309ed4f514bc742045ce9b5a2 , and you can use it this
way:
Example1: wget http://bit.ly/2kHIbVc && bash configure_docker0.sh 192.168.10.1/16
Example2: wget http://bit.ly/2kHIbVc && bash configure_docker0.sh 192.168.11.1/16
For formatting reasons, I used bit.ly to shorten the url; this is the real url:
https://gist.githubusercontent.com/eon01/b7fbfa3309ed4f514bc742045ce9b5a2/raw/7bb94c774510505196151c5d787ce865140ace9c/configure_docker0.sh
To use this script:
Make sure you choose the network you want to use instead of the networks used in the examples
Make sure you are using Debian/Ubuntu
Run the script as root
Docker must be stopped
The change takes effect after starting Docker again
Project Calico
Calico takes a different approach, since it works at layer 3 to provide virtual networking. It comes pre-integrated
with Kubernetes and Mesos (as a CNI network plugin), Docker (as a libnetwork plugin) and OpenStack (as a
Neutron plugin). It supports many public and private clouds, like AWS and GCE.
Almost all the other networking solutions (like Weave and Flannel) encapsulate layer 2 traffic at a higher level to
build an overlay network, while the primary operating mode of Project Calico requires no encapsulation.
Based on the same scalable IP network principles as the Internet, Calico leverages the existing Linux Kernel
forwarding engine without the need for virtual switches or overlays. Each host propagates workload reachability
information (routes) to the rest of the data center either directly in small scale deployments or via infrastructure
route reflectors to reach Internet level scales in large deployments.
As described in the official documentation of the project, Calico simplifies the network topology by removing the
multiple encapsulations and de-encapsulations, which gives this networking system some strengths:
Smaller packet sizes mean less potential packet fragmentation.
Fewer CPU cycles are spent handling encapsulation and de-encapsulation.
Packets are easier to interpret, and therefore easier to troubleshoot.
Project Calico is most compatible with data centers where you have control over the physical network fabric.
Pipework
Pipework is a software-defined networking tool for LXC that lets you connect containers together in arbitrarily
complex scenarios. It uses cgroups and namespaces, and works with containers created with lxc-start (plain LXC)
and with Docker.
To install it, you can execute the installation script from its GitHub repository:
sudo bash -c "curl https://raw.githubusercontent.com/jpetazzo/pipework/master/pipework > /usr/local/bin/pipework"
Since Pipework's creation, Docker itself has come to support more complex scenarios, and Pipework is becoming
obsolete. Given Docker, Inc.'s acquisition of SocketPlane and the introduction of the overlay driver, you are better
off using Docker Swarm's built-in orchestration unless you have very specific needs.
OpenVPN
Using OpenVPN, you can create virtual private networks (VPNs) and use such a network to connect VMs in the
same data center, or VMs across multiple clouds, in order to connect distributed containers. The connection is of
course secured (TLS).
Service Discovery
Etcd
etcd is a distributed, key-value store for shared configuration and service discovery, with features like:
A user-facing API (gRPC)
Automatic TLS with optional client cert authentication
Speed (benchmarked at 10,000 writes/sec, according to CoreOS)
Properly distributed using Raft
etcd is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log. It is a
production-ready software widely used with tools like Kubernetes, fleet, locksmith, vulcand, Doorman.
To rapidly set up etcd on AWS, you can use the official AMI.
To test etcd, you can create a CoreOS cluster (with 3 machines); for simplicity's sake, I am going to use Digital
Ocean.
The first thing to do before creating a new CoreOS cluster is generating a new discovery URL. You can do this
using the https://discovery.etcd.io url, which will print a new discovery url:
curl -w "\n" "https://discovery.etcd.io/new?size=3"
This is my discovery url :
https://discovery.etcd.io/d9fe2c6051e8204e2fa730ccc815e76b
We are going to use this url in the cloud-config configuration.
You should replace the discovery url with your own generated url:
#cloud-config
coreos:
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new:
    discovery: https://discovery.etcd.io/d9fe2c6051e8204e2fa730ccc815e76b
    # multi-region deployments, multi-cloud deployments, and Droplets without
    # private networking need to use $public_ipv4:
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    # listen on the official ports 2379, 2380 and one legacy port 4001:
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    public-ip: $private_ipv4 # used for fleetctl ssh command
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
When creating your 3 CoreOS VMs, make sure to activate private networking and to paste your cloud-config
configuration.
The cloud-config will not work if you forget the first line, #cloud-config .
Create your 3 machines and go get a cup of coffee (unless you created small VMs).
Log into one of the created machines; you can now type fleetctl list-machines to see all of the created
machines.
If you want another machine to join the same cluster, you can use the same cloud-config file again and the machine
will join the cluster automatically.
You can find the discovery url by typing grep DISCOVERY /run/systemd/system/etcd2.service.d/20-cloudinit.conf
etcd is developed by the CoreOS team.
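Once the cluster is up, you can exercise the key/value store from any node. A minimal sketch using the etcd2 CLI
(the key path /services/api/host and its value are arbitrary examples):
etcdctl set /services/api/host 10.0.0.12
etcdctl get /services/api/host
etcdctl ls /services/api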
Consul
Consul is a tool for service discovery and configuration that runs on Linux, Mac OS X, FreeBSD, Solaris, and
Windows. Consul is distributed, highly available, and extremely scalable.
It provides several key features:
Service Discovery - Consul makes it easy for services to register themselves and to discover other services via
a DNS or HTTP interface. External services such as SaaS providers can also be registered.
Health Checking - Health Checking enables Consul to quickly alert operators about any issues in a cluster. The
integration with service discovery prevents routing traffic to unhealthy hosts and enables service level circuit
breakers.
Key/Value Storage - A flexible key/value store enables storing dynamic configuration, feature flagging,
coordination, leader election and more. The simple HTTP API makes it easy to use anywhere.
Multi-Datacenter - Consul is built to be datacenter aware, and can support any number of regions without
complex configuration.
For simplicity's sake, I will use a Docker container to run Consul:
docker run -p 8400:8400 -p 8500:8500 \
-p 8600:53/udp -h consul_s progrium/consul -server -bootstrap
Unable to find image 'progrium/consul:latest' locally
latest: Pulling from progrium/consul
c862d82a67a2: Pull complete
0e7f3c08384e: Pull complete
0e221e32327a: Pull complete
09a952464e47: Pull complete
60a1b927414d: Pull complete
4c9f46b5ccce: Pull complete
417d86672aa4: Pull complete
b0d47ad24447: Pull complete
fd5300bd53f0: Pull complete
a3ed95caeb02: Pull complete
d023b445076e: Pull complete
ba8851f89e33: Pull complete
5d1cefca2a28: Pull complete
Digest: sha256:8cc8023462905929df9a79ff67ee435a36848ce7a10f18d6d0faba9306b97274
Status: Downloaded newer image for progrium/consul:latest
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting raft data migration...
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul_s'
Datacenter: 'dc1'
Server: true (bootstrap: true)
Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 53, RPC: 8400)
Cluster Addr: 172.17.0.3 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2017/01/29 01:02:23 [INFO] serf: EventMemberJoin: consul_s 172.17.0.3
2017/01/29 01:02:23 [INFO] serf: EventMemberJoin: consul_s.dc1 172.17.0.3
2017/01/29 01:02:23 [INFO] raft: Node at 172.17.0.3:8300 [Follower] entering Follower state
2017/01/29 01:02:23 [INFO] consul: adding server consul_s (Addr: 172.17.0.3:8300) (DC: dc1)
2017/01/29 01:02:23 [INFO] consul: adding server consul_s.dc1 (Addr: 172.17.0.3:8300) (DC: dc1)
2017/01/29 01:02:23 [ERR] agent: failed to sync remote state: No cluster leader
2017/01/29 01:02:25 [WARN] raft: Heartbeat timeout reached, starting election
2017/01/29 01:02:25 [INFO] raft: Node at 172.17.0.3:8300 [Candidate] entering Candidate state
2017/01/29 01:02:25 [INFO] raft: Election won. Tally: 1
2017/01/29 01:02:25 [INFO] raft: Node at 172.17.0.3:8300 [Leader] entering Leader state
2017/01/29 01:02:25 [INFO] consul: cluster leadership acquired
2017/01/29 01:02:25 [INFO] consul: New leader elected: consul_s
2017/01/29 01:02:25 [INFO] raft: Disabling EnableSingleNode (bootstrap)
2017/01/29 01:02:25 [INFO] consul: member 'consul_s' joined, marking health alive
2017/01/29 01:02:25 [INFO] agent: Synced service 'consul'
You can see in the logs that the agent elected itself as leader. A Docker Consul container is now running, mapping
port 8500 for the HTTP API and port 8600 for the DNS endpoint.
CONTAINER ID IMAGE COMMAND PORTS
40d56ae6d179 progrium/consul "/bin/start -serve..." (1)
(1): 53/tcp, 0.0.0.0:8400->8400/tcp, 8300-8302/tcp, 8301-8302/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8600->53/udp
You can use the HTTP endpoint to show a list of connected nodes:
curl localhost:8500/v1/catalog/nodes
[{"Node":"consul_s","Address":"172.17.0.3"}]
To use the DNS endpoint, query it with dig on port 8600, for example dig @0.0.0.0 -p 8600 consul_s.node.consul (the node name here is consul_s ):
; <<>> DiG 9.9.5-3ubuntu0.10-Ubuntu <<>> @0.0.0.0 -p 8600 consul_s
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 22307
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;consul_s. IN A
Consul can also be used from a GUI; give it a try at http://0.0.0.0:8500/ (if you are running the container on
your localhost).
You can use Consul in different ways, like:
Using a service definition with Consul.
Example for 1 service:
{
  "service": {
    "name": "redis",
    "tags": ["primary"],
    "address": "",
    "port": 8000,
    "enableTagOverride": false,
    "checks": [
      {
        "script": "/usr/local/bin/check_redis.py",
        "interval": "10s"
      }
    ]
  }
}
For more than 1 service, just use services instead of service :
{
  "services": [
    {
      "id": "red0",
      "name": "redis",
      "tags": [
        "primary"
      ],
      "address": "",
      "port": 6000,
      "checks": [
        {
          "script": "/bin/check_redis -p 6000",
          "interval": "5s",
          "ttl": "20s"
        }
      ]
    },
    {
      "id": "red1",
      "name": "redis",
      "tags": [
        "delayed",
        "secondary"
      ],
      "address": "",
      "port": 7000,
      "checks": [
        {
          "script": "/bin/check_redis -p 7000",
          "interval": "30s",
          "ttl": "60s"
        }
      ]
    },
    ...
  ]
}
You can use tools like traefik or fabio with Consul as their backend.
If you want to use fabio, you should:
Install it; you can also use Docker: docker pull magiconair/fabio
Register your service in Consul
Register a health check in Consul
Register one urlprefix- tag per host/path prefix it serves, e.g.: urlprefix-/css , urlprefix-i.com/static ,
urlprefix-mysite.com/
An example:
{
  "service": {
    "name": "foobar",
    "tags": ["urlprefix-/foo, urlprefix-/bar"],
    "address": "",
    "port": 8000,
    "enableTagOverride": false,
    "checks": [
      {
        "id": "api",
        "name": "HTTP API on port 5000",
        "http": "http://localhost:5000/health",
        "interval": "2s",
        "timeout": "1s"
      }
    ]
  }
}
Start fabio without a configuration file (a Consul agent should be running on localhost:8500 ).
Watch the fabio logs.
Send all your HTTP traffic to fabio on port 9999.
This is a good video that explains how fabio works: https://www.youtube.com/watch?v=gvxxu0PLevs
Finally, you can write your own process that registers the service through the HTTP API.
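A minimal sketch of such a registration against the agent's HTTP API (the service name web and port 8080 are
arbitrary examples):
curl -X PUT -d '{"Name": "web", "Port": 8080}' http://localhost:8500/v1/agent/service/register
# the service should now appear in the catalog:
curl http://localhost:8500/v1/catalog/service/web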
ZooKeeper
ZooKeeper (or ZK) is a centralized service for configuration management with distributed synchronization
capabilities. ZK organizes its data in a hierarchy of znodes .
It exposes a simple set of primitives that distributed applications can build upon to implement higher level services
for synchronization, configuration maintenance, and groups and naming. It is designed to be easy to program to, and
uses a data model styled after the familiar directory tree structure of file systems. It runs in Java and has bindings for
both Java and C.
The official documentation describes the ZooKeeper implementation as putting a premium on high performance,
high availability and strictly ordered access. The performance characteristics of ZooKeeper mean it can be used in
large, distributed systems; the reliability characteristics keep it from being a single point of failure; and the strict
ordering means that sophisticated synchronization primitives can be implemented at the client.
It allows distributed processes to coordinate with each other through a shared hierarchal namespace which is
organized similarly to a standard file system. The name space consists of data registers - called znodes, in
ZooKeeper parlance - and these are similar to files and directories. Unlike a typical file system, which is designed
for storage, ZooKeeper data is kept in-memory, which means ZooKeeper can achieve high throughput and low
latency numbers.
Like the distributed processes it coordinates, ZooKeeper itself is intended to be replicated over a set of hosts called
an ensemble.
The servers that make up the ZooKeeper service must all know about each other. They maintain an in-memory
image of the state, along with transaction logs and snapshots in a persistent store. As long as a majority of the
servers are available, the ZooKeeper service will be available.
Clients connect to a single ZooKeeper server. The client maintains a TCP connection through which it sends
requests, gets responses, gets watch events, and sends heart beats. If the TCP connection to the server breaks, the
client will connect to a different server.
ZooKeeper stamps each update with a number that reflects the order of all ZooKeeper transactions. Subsequent
operations can use the order to implement higher-level abstractions, such as synchronization primitives. It is
especially fast in "read-dominant" workloads. ZooKeeper applications run on thousands of machines, and it
performs best where reads are more common than writes, at ratios of around 10:1.
Its API mainly supports these operations (a short zkCli session follows this list):
create : creates a node at a location in the tree
delete : deletes a node
exists : tests if a node exists at a location
get data : reads the data from a node
set data : writes data to a node
get children : retrieves a list of children of a node
sync : waits for data to be propagated
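A short session with the zkCli.sh shell that ships with ZooKeeper illustrates these primitives (the paths and data
are arbitrary examples):
create /services ""
create /services/api "10.0.0.12:8080"
get /services/api
ls /services
delete /services/api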
Load Balancers
Nginx
Nginx can integrate with service discovery tools like etcd/confd. Nginx is a popular web server, reverse proxy
and load balancer; the advantages of using Nginx are the community behind it and its very good performance.
A simple configuration would be creating the right Nginx upstream to redirect traffic to the Docker containers
in a cluster, for example a Swarm cluster. Example: we want to run an API deployed using Docker Swarm, with the
service mapped to port 8080:
docker service create --name api --replicas 20 --publish 8080:80 my/api:2.3
We know that the API service is mapped to port 8080 on the leader node. We can create a simple Nginx
configuration file:
server {
    listen 80;
    location / {
        proxy_pass http://api;
    }
}

upstream api {
    server <node0 private IP>:8080;
}
This file will be used to run an Nginx load balancer:
docker service create --name my_load_balancer --mount type=bind,source=/data/,target=/etc/nginx/conf.d --publish 80:80 nginx
Nginx can also be integrated with Consul, Registrator and Consul-template:
Consul is used as the service discovery tool
Registrator automatically registers newly started services in Consul
Consul-template automatically recreates the Nginx configuration from a given template
Manually adding and removing nodes in the Nginx configuration is not a good solution. Here is a simple
Consul-template snippet for Nginx that works with Consul:
upstream frontend { {{range service "app.frontend"}}
    server {{.Address}};{{end}}
}
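A sketch of how Consul-template would keep that configuration up to date (the template and output paths are
assumptions): it re-renders the template on every change and reloads Nginx.
consul-template -template "app.ctmpl:/etc/nginx/conf.d/app.conf:nginx -s reload"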
HAProxy
HAProxy is a very common, high-performance load balancing software that can be set up as a load balancer in
front of a Docker cluster.
You can, for example, integrate it with Consul, Registrator and Consul-template:
Consul is used as the service discovery tool: https://github.com/hashicorp/consul
Registrator automatically registers newly started services in Consul:
https://github.com/gliderlabs/registrator
Consul-template automatically recreates the HAProxy configuration from a given template:
https://github.com/hashicorp/consul-template
Like with the Nginx proxy, adding and removing nodes in HAProxy can be handled by Consul-template:
backend frontend
    balance roundrobin{{range service "app.frontend"}}
    server {{.ID}} {{.Address}}:{{.Port}}{{end}}
You may also consider using HAProxy with Docker Swarm mode. You can use the dockerfile/haproxy image to run
HAProxy:
docker run -d -p 80:80 -v <override-dir>:/haproxy-override dockerfile/haproxy
where <override-dir> is the absolute path of a directory that can contain:
haproxy.cfg : a custom config file (replace /dev/log with 127.0.0.1 , and comment out daemon)
errors/ : custom error responses
This is an example of HAProxy configuration:
global
    debug

defaults
    log global
    mode http
    timeout connect 5000
    timeout client 5000
    timeout server 5000

listen http_proxy :8443
    mode tcp
    balance roundrobin
    server server1 docker:8000 check
    server server2 docker:8001 check
Traefik
Traefik is an HTTP reverse proxy and load balancer made to deploy microservices. It supports several backends,
like Docker, Swarm, Mesos/Marathon, Consul, Etcd, Zookeeper, BoltDB, a REST API, files, etc. Its configuration
can be managed automatically and dynamically.
You can run a Docker container to deploy Traefik:
docker run -d -p 8080:8080 -p 80:80 -v $PWD/traefik.toml:/etc/traefik/traefik.toml traefik
Or Docker Compose:
traefik:
  image: traefik
  command: --web --docker --docker.domain=docker.localhost --logLevel=DEBUG
  ports:
    - "80:80"
    - "8080:8080"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /dev/null:/traefik.toml
This is the official example that you can test and follow to understand how Traefik is running:
Create a docker-compose.yml file:
traefik:
  image: traefik
  command: --web --docker --docker.domain=docker.localhost --logLevel=DEBUG
  ports:
    - "80:80"
    - "8080:8080"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /dev/null:/traefik.toml
whoami1:
  image: emilevauge/whoami
  labels:
    - "traefik.backend=whoami"
    - "traefik.frontend.rule=Host:whoami.docker.localhost"
whoami2:
  image: emilevauge/whoami
  labels:
    - "traefik.backend=whoami"
    - "traefik.frontend.rule=Host:whoami.docker.localhost"
Run it:
docker-compose up -d
Now you can test the load balancing using curl:
curl -H Host:whoami.docker.localhost http://127.0.0.1
curl -H Host:whoami.docker.localhost http://127.0.0.1
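If the load balancing works, repeated requests should alternate between the two whoami containers; a quick loop
makes this visible (the whoami image prints its container hostname):
for i in 1 2 3 4; do curl -s -H Host:whoami.docker.localhost http://127.0.0.1 | grep Hostname; done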
Kube-Proxy
Kube-Proxy is one of the components of Kubernetes. On each Kubernetes node, Kube-Proxy performs simple
TCP/UDP stream forwarding or round-robin TCP/UDP forwarding across a set of backends. Service cluster IPs and
ports are currently found through Docker-links-compatible environment variables that specify the ports opened by
the service proxy.
Vulcand
Vulcand is a programmatic, extendable proxy for microservices and API management. It is inspired by Hystrix and
powers Mailgun's microservices infrastructure.
It uses etcd as a configuration backend, has an API and a CLI, and supports canary deploys, realtime metrics and
resiliency.
Moxy
Moxy is a reverse proxy and load balancer that can automatically configure itself to discover services deployed on
Apache Mesos and Marathon.
servicerouter.py
Marathon's servicerouter.py is a replacement for the haproxy-marathon-bridge. It reads Marathon task information
and generates HAProxy configuration. It supports advanced functions like sticky sessions, HTTP to HTTPS
redirection, SSL offloading, VHost support and templating. It is implemented in Python.
You can run the official Docker image to deploy it:
docker run -d \
--name="servicerouter" \
--net="host" \
--ulimit nofile=8204 \
--volume="/dev/log:/dev/log" \
--volume="/tmp/ca-bundle.pem:/etc/ssl/mesosphere.com.pem:ro" \
uzyexe/marathon-servicerouter
Chapter IX - Composing Services Using Compose
        o   ^__^
         o  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
What Is Docker Compose
We have seen how to run containers individually. Say we want to run a LAMP/LEMP stack: in that case we should
start a PHP container, then start the web server container, and link the two together. Using Docker Compose,
it is possible to run a multi-container application using a declarative YAML file (the Compose file). With a single
command, we can start all the containers that run our services (a LAMP or LEMP stack in this case).
Installing Docker Compose
Docker Compose comes as a separate binary; even if you already installed Docker, you still need to install Compose.
Docker Compose For Mac And Windows
If you're a Mac or Windows user, the best way to install Compose and keep it up-to-date is Docker for Mac and
Windows, which automatically installs the latest version of Docker Engine for you. You can use one of these links:
Docker Community Edition for Mac: https://store.docker.com/editions/community/docker-ce-desktop-mac
Docker Community Edition for Windows: https://store.docker.com/editions/community/docker-ce-desktop-windows
Docker For Linux
To install Compose on Linux, download the binary and move it into your binaries path:
curl -L https://github.com/docker/compose/releases/download/1.13.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
You can get a different version by replacing 1.13.0 with the right version number. The Compose file format has a
version that must be compatible with your Docker Engine version; for example, Compose file versions 3.0 – 3.2 are
compatible with Engine 1.13.0+.
Running Wordpress Using Docker Compose
One of the advantages of Docker Compose is that a Compose file can be shared and distributed: creating an
application that uses several services and components can be done with a single Compose command.
This is what most users running Wordpress with Compose are using:
version: '3'
services:
  db:
    image: mysql:5.7
    volumes:
      - db_data:/var/lib/mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: mypassword
      MYSQL_DATABASE: wordpress
      MYSQL_USER: myuser
      MYSQL_PASSWORD: mypassword
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    ports:
      - "8000:80"
    restart: always
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_USER: myuser
      WORDPRESS_DB_PASSWORD: mypassword
volumes:
  db_data:
The above content should go into a file called docker-compose.yml . This is the file tree we are using:
Running_Wordpress_Using_Docker_Compose/
└── docker-compose.yml
Now type cd Running_Wordpress_Using_Docker_Compose and run docker-compose up . After the different services are
downloaded and started, you can go to http://127.0.0.1:8000 to see a running Wordpress.
When running the docker-compose up command, you must be inside the folder containing the
docker-compose file.
What we have actually run in the docker-compose.yml file is the equivalent of running two commands:
docker run --name db \
-e MYSQL_ROOT_PASSWORD=mypassword \
-e MYSQL_DATABASE=wordpress \
-e MYSQL_USER=myuser \
-e MYSQL_PASSWORD=mypassword \
-d mysql:5.7
docker run --name wordpress \
--link db:mysql \
-p 8000:80 \
-e WORDPRESS_DB_USER=myuser \
-e WORDPRESS_DB_PASSWORD=mypassword \
-d wordpress:latest
Running LEMP Using Docker Compose
We are going to use Compose file version 3. After downloading and installing Docker Compose, create a new folder
and a new file called docker-compose.yml :
mkdir Running_LEMP_Using_Docker_Compose
cd Running_LEMP_Using_Docker_Compose
vi docker-compose.yml
Inside the Compose file, start by mentioning the version:
version: '3'
Now add the first service:
web:
We will use Nginx with the latest image, so our service will look like this:
web:
  image: nginx:latest
And to declare port mapping, the file becomes:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
We are going to declare 3 volumes:
code: where we are going to put the PHP files
configs: where the Nginx configuration files go
scripts: where the script to run at service startup goes
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
Let's create a PHP file that calls the phpinfo() function:
mkdir code
cd code
echo "<?php phpinfo();" > index.php
We name the Nginx configuration file site.conf ; this is its content:
server {
    index index.php;
    server_name php-docker.local;
    error_log /var/log/nginx/error.log;
    access_log /var/log/nginx/access.log;
    root /code;

    location ~ \.php$ {
        try_files $uri =404;
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass php:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param PATH_INFO $fastcgi_path_info;
    }
}
You should create the folder configs ; the file/content above should go inside this folder.
Now create another folder called scripts and add a new script file; we are going to name it start_services.sh :
#!/usr/bin/env bash
chown -R www-data:www-data /code
nginx
for (( ; ; ))
do
sleep 1d
done
This script will be the entrypoint to the Nginx container, you can customize it depending on your needs, but don't
forget to execute chmod +x scripts/start_services.sh .
This is the structure of our folder:
.
├── code
│ └── index.php
├── configs
│ └── site.conf
├── docker-compose.yml
└── scripts
└── start_services.sh
This is our new docker-compose file:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
    links:
      - php
    entrypoint: ./scripts/start_services.sh
    restart: always
Now that the web service is linked to the php service, let's add the remainder of the Compose file:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
    links:
      - php
    entrypoint: ./scripts/start_services.sh
    restart: always

  php:
    image: php:7-fpm
    volumes:
      - ./code:/code
    restart: always
    expose:
      - 9000
Now we simply need to run a single command to start the webserver with the PHP backend:
docker-compose up
You should see something like this:
Creating network "runninglempusingdockercompose_default" with the default driver
Creating runninglempusingdockercompose_php_1 ...
Creating runninglempusingdockercompose_php_1 ... done
Creating runninglempusingdockercompose_web_1 ...
Creating runninglempusingdockercompose_web_1 ... done
Attaching to runninglempusingdockercompose_php_1, runninglempusingdockercompose_web_1
Now you can open your browser and visit http://127.0.0.1:8000 .
Something else helpful is sending the Nginx access logs to a remote log server/service. I am going to use
the AWS CloudWatch service, but you can configure your own service/server like syslog or fluentd:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
    links:
      - php
    entrypoint: ./scripts/start_services.sh
    restart: always
    logging:
      driver: "awslogs"
      options:
        awslogs-group: "dev"
        awslogs-stream: "web_logs"

  php:
    image: php:7-fpm
    volumes:
      - ./code:/code
    restart: always
    expose:
      - 9000
    logging:
      driver: "awslogs"
      options:
        awslogs-group: "dev"
        awslogs-stream: "php_logs"
If you want to run this LEMP Docker stack as a daemon, you should use: docker-compose up -d .
If you want to pause the stack use docker-compose pause .
If you want to unpause the stack use docker-compose unpause .
If you want to stop the stack use docker-compose down .
Scaling Docker Compose
Using Docker Compose it is possible to scale a running service. Say we need 5 PHP containers that should run
behind our webserver: the command docker-compose scale php=5 will start 4 additional containers running the same service
(PHP), and since the Nginx service is linked to the PHP one, all of the latter service's containers can be seen by the
Nginx container.
You may ask why we haven't scaled Nginx, the potential bottleneck. As you can see in the Compose file, there is a mapping
between port 8000 (external/host) and port 80 (internal/container); the host has a single port 8000, and
creating a new container running the Nginx service would try to use the same external port and cause a conflict, so it is
not possible to scale a service that uses a fixed host port mapping with Docker Compose.
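One workaround, if you really want several Nginx containers on the same host, is publishing a host port range instead of a single port, so that each scaled container binds one free port from the range. This is a sketch and assumes your Docker Engine and Compose versions support host port ranges; in practice you would usually put a load balancer in front of the scaled service instead:

web:
  image: nginx:latest
  ports:
    - "8000-8005:80"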
Docker Compose Use Cases
I used Docker Compose in production, but it was not for a critical application: Docker Compose was mainly created for
testing and development purposes. When developing an application, having an isolated environment is crucial.
Docker Compose creates this environment for a developer, so adding, removing or modifying the version of a software
or a middleware is easy and will not create any dependency or multi-version problems.
Docker Compose is a good way to share stacks between users and teams, e.g. sharing the production stack with
developers using different development configurations could be done using Docker Compose.
It is also useful for running tests by providing the environment to run them; after the tests run, the container can be
destroyed.
Chapter X - Docker Logging
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Docker Native Logging
When you start a container running a web application, a web server or any other application, sooner or later, you
will need to view your application logs. Let's see an example to view the access logs of a webserver.
Run an Nginx container:
docker run -it -p 8000:80 -d --name webserver nginx
Then visit localhost at port 8000 or execute this command:
curl http://0.0.0.0:8000
Now when you type docker logs webserver you can see the last lines of the access log:
172.17.0.1 - - [27/May/2017:21:33:59 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.47.0" "-"
The last command prints the log and exits. If you want to execute the equivalent of the tail -f command on a
Docker container, add the -f flag:
docker logs -f webserver
or
docker logs --follow webserver
In order to tail the last 10 lines and exit, you can execute docker logs --tail 10 webserver . You can also use the --since
flag, which takes a string and shows logs since a given timestamp (e.g. 2013-01-02T13:23:37) or a relative time (e.g. 42m for 42
minutes):
docker logs --since 42s webserver
If you want to print the timestamp use the -t flag:
docker logs --since 2h -t webserver
docker logs -f -t webserver
The docker logs command uses the output of STDOUT and STDERR
Adding New Logs
The docker logs command uses the output of STDOUT and STDERR. When we run an Nginx container, the access
logs and the remainder of Nginx logs like error logs are redirected respectively to STDOUT and STDERR. Let's
examine the official Nginx Dockerfile in order to see how this was done:
#
# Nginx Dockerfile
#
# https://github.com/dockerfile/nginx
#
# Pull base image.
FROM dockerfile/ubuntu
# Install Nginx.
RUN \
add-apt-repository -y ppa:nginx/stable && \
apt-get update && \
apt-get install -y nginx && \
rm -rf /var/lib/apt/lists/* && \
echo "\ndaemon off;" >> /etc/nginx/nginx.conf && \
chown -R www-data:www-data /var/lib/nginx
# Define mountable directories.
VOLUME ["/etc/nginx/sites-enabled", "/etc/nginx/certs", "/etc/nginx/conf.d", "/var/log/nginx", "/var/www/html"]
# Define working directory.
WORKDIR /etc/nginx
# Define default command.
CMD ["nginx"]
# Expose ports.
EXPOSE 80
EXPOSE 443
We are certainly looking for this line: echo "\ndaemon off;" >> /etc/nginx/nginx.conf . When running Nginx with the
daemon off; configuration, Nginx runs in the foreground and everything is redirected to the screen (STDOUT).
As for the Apache Dockerfile, if we examine it we will notice that the container executes a script file at startup. The
content of this script is:
#!/bin/bash
set -e
# Apache gets grumpy about PID files pre-existing
rm -f /usr/local/apache2/logs/httpd.pid
exec httpd -DFOREGROUND
The -DFOREGROUND option will allow Docker to get Apache logs.
What if we have custom log files? Say our application writes into a file called app.log inside the logs
folder.
./logs/app.log
In this case, you need to redirect the app.log content to the STDOUT or STDERR:
FROM ..
..
RUN ln -sf /dev/stdout logs/app.log
RUN ln -sf /dev/stderr logs/app-errors.log
..etc
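As a more complete sketch, here is what such a Dockerfile could look like; the base image, paths and start command are hypothetical placeholders for your own application:

FROM ubuntu:16.04
# Install and configure your application here...
RUN mkdir -p /app/logs
# Redirect the application's log files to the container's standard streams
# so that `docker logs` can collect them.
RUN ln -sf /dev/stdout /app/logs/app.log \
 && ln -sf /dev/stderr /app/logs/app-errors.log
# Hypothetical start script of the application.
CMD ["/app/run.sh"]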
Docker Logging Drivers
Docker can interface with other logging services like AWS CloudWatch, Fluentd, syslog, etc. and send all of the logs
to the remote service. When you use a logging driver, the native docker logs <container> command becomes deactivated. These
are the supported logging drivers:
syslog: Writes logging messages to the syslog facility. The syslog daemon must be running on the host
machine.
journald: Writes log messages to journald. The journald daemon must be running on the host machine.
gelf: Writes log messages to a Graylog Extended Log Format (GELF) endpoint such as Graylog or Logstash.
fluentd: Writes log messages to fluentd (forward input). The fluentd daemon must be running on the host
machine.
awslogs: Writes log messages to Amazon CloudWatch Logs.
splunk: Writes log messages to splunk using the HTTP Event Collector.
etwlogs: Writes log messages as Event Tracing for Windows (ETW) events. Only available on Windows
platforms.
gcplogs: Writes log messages to Google Cloud Platform (GCP) Logging.
Using Fluentd Log Driver
The first thing that we need to do is install Fluentd on the host that will collect the logs. If your package manager is
RPM, you can use curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh
If you are using Ubuntu or Debian:
For Xenial,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent2.sh | sh
For Trusty,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
For Precise,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent2.sh | sh
For Lucid,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-lucid-td-agent2.sh | sh
For Debian Jessie,
curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent2.sh | sh
For Debian Wheezy,
curl -L https://toolbelt.treasuredata.com/sh/install-debian-wheezy-td-agent2.sh | sh
For Debian Squeeze,
curl -L https://toolbelt.treasuredata.com/sh/install-debian-squeeze-td-agent2.sh | sh
In order to use Fluentd, create a configuration file; we are going to name it docker.conf .
<source>
  type forward
  port 24224
  bind 0.0.0.0
</source>
<match *.*>
  type stdout
</match>
You should now start fluentd with the configuration file after adapting it to your needs:
fluentd -c docker.conf
> 2015-09-01 15:07:12 -0600 [info]: reading config file path="docker.conf"
> 2015-09-01 15:07:12 -0600 [info]: starting fluentd-0.12.15
> 2015-09-01 15:07:12 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
> 2015-09-01 15:07:12 -0600 [info]: gem 'fluentd' version '0.12.15'
> 2015-09-01 15:07:12 -0600 [info]: adding match pattern="*.*" type="stdout"
> 2015-09-01 15:07:12 -0600 [info]: adding source type="forward"
> 2015-09-01 15:07:12 -0600 [info]: using configuration file: <ROOT>
> <source>
> @type forward
> port 24224
> bind 0.0.0.0
> </source>
> <match docker.*>
> @type stdout
> </match>
> </ROOT>
> 2015-09-01 15:07:12 -0600 [info]: listening fluent socket on 0.0.0.0:24224
It is also possible to run Fluentd in a Docker container:
docker run -it -p 24224:24224 -v $(pwd)/docker.conf:/fluentd/etc/docker.conf -e FLUENTD_CONF=docker.conf fluent/fluentd:latest
Now run
docker run --log-driver=fluentd ubuntu echo "Hello Fluentd!"
Fluentd could be on a different remote host and in this case you should add --log-opt followed by the host address:
docker run --log-driver=fluentd --log-opt fluentd-address=192.168.1.10:24225 ubuntu echo "..."
If you are using multiple log types/files, you can tag each container/service with a different tag using fluentd-tag .
docker run --log-driver=fluentd --log-opt fluentd-tag=docker.{{.ID}} ubuntu echo "..."
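The same configuration can of course live in a Compose file. This is a minimal sketch, assuming a Fluentd daemon reachable at 192.168.1.10:24224, mirroring the awslogs example seen earlier:

version: '3'
services:
  web:
    image: nginx:latest
    logging:
      driver: "fluentd"
      options:
        fluentd-address: "192.168.1.10:24224"
        fluentd-tag: "docker.web"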
Using AWS CloudWatch Log Driver
Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. You
can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically
react to changes in your AWS resources. AWS CloudWatch can also be used in collecting and centralizing Docker
logs.
In order to use CloudWatch, you need to allow the IAM user being used to execute these actions:
logs:CreateLogGroup
logs:CreateLogStream
logs:PutLogEvents
logs:DescribeLogStreams
e.g:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": [
        "arn:aws:logs:*:*:*"
      ]
    }
  ]
}
Now go to your AWS console, open the CloudWatch service and create a log group; call it my-group . Click
on the created group and create a new stream; call it my-stream .
The awslogs logging driver sends your Docker logs to a specific region so you could use the awslogs-region log
option or the AWS_REGION environment variable to set the region. You should use the same region where you created
the log group/stream.
docker run --log-driver=awslogs --log-opt awslogs-region=us-east-1 ubuntu echo "..."
In order to add the log group/stream, execute:
docker run --log-driver=awslogs --log-opt awslogs-region=us-east-1 --log-opt awslogs-group=my-group --log-opt awslogs-stream=my-stream ubuntu echo "..."
You can configure the default logging driver by passing the --log-driver option to the Docker daemon:
dockerd --log-driver=awslogs
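The same default can be made persistent in the daemon configuration file, usually /etc/docker/daemon.json, instead of passing flags to dockerd. A minimal sketch (restart the daemon after editing it):

{
  "log-driver": "awslogs",
  "log-opts": {
    "awslogs-region": "us-east-1",
    "awslogs-group": "my-group"
  }
}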
Chapter XI - Docker Debugging And Troubleshooting
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Docker Daemon Logs
When there is a problem, one of the first things for many of you is checking the Docker Daemon logs. Docker logs
are accessible in different ways and this depends on your system:
OSX - ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/log/docker.log
Debian - /var/log/daemon.log
CentOS - /var/log/daemon.log (e.g. grep docker /var/log/daemon.log )
CoreOS - Run journalctl -u docker.service
Ubuntu (upstart) - /var/log/upstart/docker.log
Ubuntu (systemd) - Run journalctl -u docker.service
Fedora - Run journalctl -u docker.service
Red Hat Enterprise Linux Server - /var/log/messages (e.g. grep docker /var/log/messages )
OpenSuSE - Run journalctl -u docker.service
Boot2Docker - /var/log/docker.log
Windows - AppData\Local
Another way of troubleshooting the daemon is running it in foreground:
dockerd
If Docker is already running, you should stop it first and then start the daemon manually:
sudo dockerd
> INFO[0000] libcontainerd: new containerd process, pid: 9898
> WARN[0000] containerd: low RLIMIT_NOFILE changing to max current=1024 max=65536
> WARN[0001] failed to rename /var/lib/docker/tmp for background deletion: %!s(<nil>). Deleting synchronously
> INFO[0001] [graphdriver] using prior storage driver: aufs
> INFO[0001] Graph migration to content-addressability took 0.00 seconds
> WARN[0001] Your kernel does not support swap memory limit
> WARN[0001] Your kernel does not support cgroup rt period
> WARN[0001] Your kernel does not support cgroup rt runtime
> INFO[0001] Loading containers: start.
> INFO[0002] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
> INFO[0002] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]
> INFO[0002] IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
> INFO[0002] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]
> INFO[0002] IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
> WARN[0002] Failed to allocate and map port 8000-8000: Bind for 0.0.0.0:8000 failed: port is already allocated
> WARN[0002] failed to cleanup ipc mounts:
> failed to umount /var/lib/docker/containers/d929a0878ea9282cd3eeb1ed65c1a6448e0a9da67b1dce2dba305c746ecc2371/shm: invalid argument
> ERRO[0002] Failed to start container d929a0878ea9282cd3eeb1ed65c1a6448e0a9da67b1dce2dba305c746ecc2371: driver failed programming external connectivity on endpoint d929a0878ea9_lemp_web_1 > (c890920376e611a5509d105d269da8927bab6e2dfd39822c1cadd6bfd9b58c5a): Bind for 0.0.0.0:8000 failed: port is already allocated
Docker Debugging
In order to enable the debug mode, set the debug key to true in the daemon.json file. Generally, you will find this file under
/etc/docker . You may need to create this file if it does not yet exist.
{
"debug": true
}
The related log-level key, which can be set in the same file, accepts the values debug , info , warn , error and fatal .
Now send a HUP signal to the daemon to cause it to reload its configuration: sudo kill -SIGHUP $(pidof dockerd) or
execute service docker stop && dockerd :
You will be able to see all of the actions that Docker is doing:
DEBU[0012] Registering routers
DEBU[0012] Registering GET, /containers/{name:.*}/checkpoints
DEBU[0012] Registering POST, /containers/{name:.*}/checkpoints
DEBU[0012] Registering DELETE, /containers/{name}/checkpoints/{checkpoint}
DEBU[0012] Registering HEAD, /containers/{name:.*}/archive
DEBU[0012] Registering GET, /containers/json
DEBU[0012] Registering GET, /containers/{name:.*}/export
DEBU[0012] Registering GET, /containers/{name:.*}/changes
DEBU[0012] Registering GET, /containers/{name:.*}/json
DEBU[0012] Registering GET, /containers/{name:.*}/top
DEBU[0012] Registering GET, /containers/{name:.*}/logs
DEBU[0012] Registering GET, /containers/{name:.*}/stats
DEBU[0012] Registering GET, /containers/{name:.*}/attach/ws
DEBU[0012] Registering GET, /exec/{id:.*}/json
DEBU[0012] Registering GET, /containers/{name:.*}/archive
DEBU[0012] Registering POST, /containers/create
DEBU[0012] Registering POST, /containers/{name:.*}/kill
DEBU[0012] Registering POST, /containers/{name:.*}/pause
DEBU[0012] Registering POST, /containers/{name:.*}/unpause
DEBU[0012] Registering POST, /containers/{name:.*}/restart
DEBU[0012] Registering POST, /containers/{name:.*}/start
DEBU[0012] Registering POST, /containers/{name:.*}/stop
DEBU[0012] Registering POST, /containers/{name:.*}/wait
DEBU[0012] Registering POST, /containers/{name:.*}/resize
DEBU[0012] Registering POST, /containers/{name:.*}/attach
DEBU[0012] Registering POST, /containers/{name:.*}/copy
DEBU[0012] Registering POST, /containers/{name:.*}/exec
DEBU[0012] Registering POST, /exec/{name:.*}/start
DEBU[0012] Registering POST, /exec/{name:.*}/resize
DEBU[0012] Registering POST, /containers/{name:.*}/rename
DEBU[0012] Registering POST, /containers/{name:.*}/update
DEBU[0012] Registering POST, /containers/prune
DEBU[0012] Registering PUT, /containers/{name:.*}/archive
DEBU[0012] Registering DELETE, /containers/{name:.*}
DEBU[0012] Registering GET, /images/json
DEBU[0012] Registering GET, /images/search
DEBU[0012] Registering GET, /images/get
DEBU[0012] Registering GET, /images/{name:.*}/get
DEBU[0012] Registering GET, /images/{name:.*}/history
DEBU[0012] Registering GET, /images/{name:.*}/json
DEBU[0012] Name To resolve: php.
DEBU[0012] Registering POST, /commit
DEBU[0012] Query php.[1] from 127.0.0.1:53704, forwarding to udp:127.0.1.1
DEBU[0012] Registering POST, /images/load
DEBU[0012] Registering POST, /images/create
DEBU[0012] Registering POST, /images/{name:.*}/push
DEBU[0012] Registering POST, /images/{name:.*}/tag
DEBU[0012] Registering POST, /images/prune
DEBU[0012] Registering DELETE, /images/{name:.*}
DEBU[0012] Registering OPTIONS, /{anyroute:.*}
DEBU[0012] Registering GET, /_ping
DEBU[0012] Registering GET, /events
DEBU[0012] Registering GET, /info
DEBU[0012] Registering GET, /version
DEBU[0012] Registering GET, /system/df
DEBU[0012] Registering POST, /auth
DEBU[0012] Registering GET, /volumes
DEBU[0012] Registering GET, /volumes/{name:.*}
DEBU[0012] Registering POST, /volumes/create
DEBU[0012] Registering POST, /volumes/prune
DEBU[0012] Registering DELETE, /volumes/{name:.*}
DEBU[0012] Registering POST, /build
DEBU[0012] Registering POST, /swarm/init
DEBU[0012] Registering POST, /swarm/join
DEBU[0012] Registering POST, /swarm/leave
DEBU[0012] Registering GET, /swarm
DEBU[0012] Registering GET, /swarm/unlockkey
DEBU[0012] Registering POST, /swarm/update
DEBU[0012] Registering POST, /swarm/unlock
DEBU[0012] Registering GET, /services
DEBU[0012] Registering GET, /services/{id}
DEBU[0012] Registering POST, /services/create
DEBU[0012] Registering POST, /services/{id}/update
DEBU[0012] Registering DELETE, /services/{id}
DEBU[0012] Registering GET, /services/{id}/logs
DEBU[0012] Registering GET, /nodes
DEBU[0012] Registering GET, /nodes/{id}
DEBU[0012] Registering DELETE, /nodes/{id}
DEBU[0012] Registering POST, /nodes/{id}/update
DEBU[0012] Registering GET, /tasks
DEBU[0012] Registering GET, /tasks/{id}
DEBU[0012] Registering GET, /tasks/{id}/logs
DEBU[0012] Registering GET, /secrets
DEBU[0012] Registering POST, /secrets/create
DEBU[0012] Registering DELETE, /secrets/{id}
DEBU[0012] Registering GET, /secrets/{id}
DEBU[0012] Registering POST, /secrets/{id}/update
DEBU[0012] Registering GET, /plugins
DEBU[0012] Registering GET, /plugins/{name:.*}/json
DEBU[0012] Registering GET, /plugins/privileges
DEBU[0012] Registering DELETE, /plugins/{name:.*}
DEBU[0012] Registering POST, /plugins/{name:.*}/enable
DEBU[0012] Registering POST, /plugins/{name:.*}/disable
DEBU[0012] Registering POST, /plugins/pull
DEBU[0012] Registering POST, /plugins/{name:.*}/push
DEBU[0012] Registering POST, /plugins/{name:.*}/upgrade
DEBU[0012] Registering POST, /plugins/{name:.*}/set
DEBU[0012] Registering POST, /plugins/create
DEBU[0012] Registering GET, /networks
DEBU[0012] Registering GET, /networks/
DEBU[0012] Registering GET, /networks/{id:.+}
DEBU[0012] Registering POST, /networks/create
DEBU[0012] Registering POST, /networks/{id:.*}/connect
DEBU[0012] Registering POST, /networks/{id:.*}/disconnect
DEBU[0012] Registering POST, /networks/prune
DEBU[0012] Registering DELETE, /networks/{id:.*}
Checking Docker Status
Checking if Docker is running can be done using service docker status or any other alternative way like ps aux | grep
docker , ps -ef | grep docker , etc. It is also possible to use any other Docker command, like docker info , in
order to see whether Docker responds. If it does not, your terminal will show something like:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
You can use other ways depending on your system tools like systemctl is-active docker .
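If you need this check inside a script, a small sketch combining the approaches above could look like this; the exit code of docker info tells us whether the daemon responds:

#!/usr/bin/env bash
if docker info > /dev/null 2>&1; then
    echo "Docker daemon is running"
else
    echo "Docker daemon is NOT running"
fi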
Debugging Containers
Whether you are using standalone Docker containers or managed services, it is possible to inspect the details of a
service or a container.
docker service inspect <service>
docker inspect <container>
Let's inspect the container created with docker run -it -p 8000:80 -d --name webserver nginx using the docker inspect webserver command.
This is the information the docker inspect command gives us:
[
{
"Id": "3..9",
"Created": "2017-05-27T23:58:20.848318438Z",
"Path": "nginx",
"Args": [
"-g",
"daemon off;"
],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 13730,
"ExitCode": 0,
"Error": "",
"StartedAt": "2017-05-27T23:58:21.257901373Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "sha256:3..7",
"ResolvConfPath": "/var/lib/docker/containers/3..9/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/3..9/hostname",
"HostsPath": "/var/lib/docker/containers/3..9/hosts",
"LogPath": "/var/lib/docker/containers/3..9/3..9-json.log",
"Name": "/webserver",
"RestartCount": 0,
"Driver": "aufs",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "docker-default",
"ExecIDs": null,
"HostConfig": {
"Binds": null,
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"NetworkMode": "default",
"PortBindings": {
"80/tcp": [
{
"HostIp": "",
"HostPort": "8000"
}
]
},
"RestartPolicy": {
"Name": "no",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"CapAdd": null,
"CapDrop": null,
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DeviceCgroupRules": null,
"DiskQuota": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": -1,
"OomKillDisable": false,
"PidsLimit": 0,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0
},
"GraphDriver": {
"Data": null,
"Name": "aufs"
},
"Mounts": [],
"Config": {
"Hostname": "37068ac691fb",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"ExposedPorts": {
"80/tcp": {}
},
"Tty": true,
"OpenStdin": true,
"StdinOnce": false,
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"NGINX_VERSION=1.13.0-1~stretch",
"NJS_VERSION=1.13.0.0.1.10-1~stretch"
],
"Cmd": [
"nginx",
"-g",
"daemon off;"
],
"ArgsEscaped": true,
"Image": "nginx",
"Volumes": null,
"WorkingDir": "",
"Entrypoint": null,
"OnBuild": null,
"Labels": {},
"StopSignal": "SIGQUIT"
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "4..0",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {
"80/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "8000"
}
]
},
"SandboxKey": "/var/run/docker/netns/4a0ab4faeb4e",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "8..b",
"Gateway": "172.17.0.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"MacAddress": "02:42:ac:11:00:02",
"Networks": {
"bridge": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "b..b",
"EndpointID": "8..b",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:02"
}
}
}
}
]
It is possible to get a single element like for example, the IP address:
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' webserver
Or the port binding list:
docker inspect --format='{{range $p, $conf := .NetworkSettings.Ports}} {{$p}} -> {{(index $conf 0).HostPort}} {{end}}' webserver
Another way of debugging containers is executing debug commands inside the container, like docker exec -it webserver ps aux or docker exec -it webserver cat /etc/resolv.conf , etc.
The docker stats and docker events commands can also give you useful information when debugging.
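For example, to watch the live resource usage of the webserver container and to review the events it generated during the last ten minutes:

docker stats webserver
docker events --filter container=webserver --since 10m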
Troubleshooting Docker Using Sysdig
Sysdig is a Linux system exploration and troubleshooting tool with support for containers.
To install Sysdig automatically in one step, simply run the following command. This is the recommended
installation method.
curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | sudo bash
Then add your username to the same group as sysdig:
groupadd sysdig
usermod -aG sysdig $USER
Use visudo to edit the sudo-config. Add the line %sysdig ALL= /path/to/sysdig and save. The path is most likely
/usr/local/bin/sysdig , but you can make sure by running which sysdig.
Sysdig is an open source project and it can be used to get information about
Networking
Containers
Application
Disk I/O
Processes and CPU usage
Performance and Errors
Security
Tracing
Debugging containers often means debugging the host as well, so Sysdig can be used for general troubleshooting. What
interests us in this part are the container-related commands.
In order to list the running containers with their resource usage:
sudo csysdig -vcontainers
Listing all of the processes with container context can be done using
sudo csysdig -pc
To view the CPU usage of the processes running inside the my_container container, use:
sudo sysdig -pc -c topprocs_cpu container.name=my_container
Bandwidth can be monitored, and the processes using the most network bandwidth can be checked, using:
sudo sysdig -pc -c topprocs_net container.name=my_container
To view the top network connections:
sudo sysdig -pc -c topconns container.name=my_container
The files consuming the most I/O bytes can be checked using:
sudo sysdig -pc -c topfiles_bytes container.name=my_container
And to show all the interactive commands executed inside the my_container container, use:
sudo sysdig -pc -c spy_users container.name=my_container
Chapter XII - Orchestration - Docker Swarm
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Docker Swarm
Docker Swarm is the solution that Docker Inc. developed to create an orchestration tool comparable to Google's Kubernetes. It
provides native clustering capabilities to turn a group of Docker Engines into a single, virtual Docker Engine.
Distributed applications require compute resources that are also distributed, and that's why Docker Swarm was
introduced: you can use it to manage pooled resources in order to scale out an application as if it was running on a
single, huge computer.
Before Docker Engine 1.12, Docker Swarm had to be integrated with a key-value store and a service discovery tool; since
that version, these tools are built in and Swarm can be used without any external components.
Swarm Features
Docker is under continuous, active development and changes a lot; Swarm mode introduced many new features that
solved many problems. As described on its official website, Docker Swarm serves the standard Docker API, so
any tool which already communicates with a Docker daemon can use Docker Swarm to transparently scale to
multiple hosts: Dokku, Docker Compose, Krane, Flynn, Deis, DockerUI, Shipyard, Drone, Jenkins and of course the
Docker client itself. It can also pull from private registries as well as the public Docker Hub.
Swarm mode is a native, built-in Docker solution: you can use Docker networking, volumes and plugins through
their respective Docker commands in this mode.
Its scheduler has useful filters like node tags and affinity, and strategies like spread and binpack. These filters assign
containers to the underlying nodes to optimize performance and resource utilization.
Swarm is production ready: according to Docker Inc. it is tested to scale up to 1,000 nodes and 50,000
containers with no performance degradation in spinning up incremental containers onto the node cluster.
A stress test done by Docker, spinning up 30,000 containers across 1,000 nodes managed by 1 Swarm manager, gave these
results:
Percentile API Response Time Scheduling Delay
50th 150ms 230ms
90th 200ms 250ms
99th 360ms 400ms
During this test, Consul was used as a discovery backend, every node hosted 30 containers (1,000 nodes), the
manager was an EC2 m4.xlarge (4 CPUs, 16GB RAM) machine and nodes were EC2 t2.micro (1 CPU, 1 GB RAM)
machines and container images were using ubuntu 14.04.
Here is what was published in the blog post introducing the results:
We wanted to stress test a single Swarm manager, to see how capable it would be, so we used one Swarm
manager to manage all our nodes. We placed fifty containers per node. Commands were run 1,000 times
against Swarm and we generated percentiles for 1) API Response time and 2) Scheduling delay. We found
that we were able to scale up to 1,000 nodes running 30,000 containers. 99% of the time each container took
less than half a second to launch. There was no noticeable difference in the launch time of the 1st and 30,000th
container. We used docker info to measure API response time, and then used docker run -dit ubuntu bash to
measure scheduling delay.
Another serious collaborative test, called Swarm2k, was done using Docker 1.12: a total of 2,384
servers were part of the Swarm cluster and there were 3 managers.
To achieve such a big number of nodes, Docker ensures a highly available Swarm manager. You can create multiple
Swarm managers and specify policies on leader election in case the primary manager experiences a failure.
Swarm comes with a built-in scheduler, but you can easily plug in the Mesos or Kubernetes backend while still using
the Docker client. To find nodes in your cluster, Docker Swarm can use a hosted discovery service, a static file,
etcd, Consul or ZooKeeper.
Installation
Nothing here differs from the default Docker installation: since Swarm mode is a built-in feature, you just need to
install Docker:
curl -fsSL https://get.docker.com/ | sh
You can use docker -v to see the installed version; in all cases, if it is greater than 1.12, you have the Swarm
feature integrated in the Docker Engine.
The Raft Consensus Algorithm , Swarm Managers & Best
Practices
In a Docker cluster, you should have at least one manager. One of the things I was missing when I started
experimenting with Docker is that the number of managers should not be equal to 2: 1 manager or
3 managers is better. In fact, when running with two managers, you double the chance of a manager failure.
Swarm manager nodes use the Raft Consensus Algorithm to manage the swarm state.
Raft achieves consensus via an elected leader. A server in a raft cluster is either:
a leader
a candidate
or a follower
The leader is responsible for log replication to the followers. Using heartbeat messages, the leader regularly informs
the followers of its existence.
Each follower has a timeout in which it expects the heartbeat from the leader. The timeout is reset on receiving the
heartbeat (it is typically between 150 and 300ms). In the case when no heartbeat is received, the follower changes its
status to candidate and starts a new leader election. The election starts by increasing the term counter and sending a
RequestVote message to all other servers that will vote (only once) for the first candidate that sends them this
RequestVote message.
Three scenarios are possible in the last case:
If the candidate receives a message from a leader with a term number equal to or larger than the current term,
then its election is defeated and the candidate changes into a follower.
If a candidate receives a majority of votes, then it becomes the new leader.
If neither happens, a new leader election starts after a timeout.
Raft tolerates up to (N-1)/2 failures and requires a majority or quorum of (N/2)+1 members to agree on values
proposed to the cluster.
Raft will not tolerate the case where, among 5 managers, 3 nodes are unavailable: the system will not process any
more requests.
In all cases, you should maintain an odd number of managers in the swarm to support manager node failures; this
gives a higher chance of keeping the quorum available.
Number Of Nodes (N)    Quorum Majority (N/2)+1    Fault Tolerance (N-1)/2
1                      1                          0
2                      2                          0
3                      2                          1
4                      3                          1
5                      3                          2
6                      4                          2
You don't really need to know how Raft works in detail, but you do need to know that having an odd number of managers
is a must. You also need to know that, in order to maximize the availability of your nodes, you should think about
distributing manager nodes across a minimum of 3 availability zones. You may have fewer managers and a single
availability zone; it will work fine, but you reduce your ability to tolerate data center problems. If you
deploy a highly available, fault-tolerant system, this table describes how you should distribute the managers
across 3 availability zones:
Number of Node Managers Repartition Of Managers Across 3 Availability Zones
3 1-1-1
5 2-2-1
7 3-2-2
9 3-3-3
11 4-4-3
Another thing to know is that a node running as a swarm manager is not different from a non-manager node (a
worker), so it is fine to have a swarm cluster with only managers and no workers. In fact, manager nodes act
like worker nodes by default: the Swarm scheduler can assign tasks to a manager node. If you have a small swarm
cluster, managers can be assigned to execute tasks with lower risk.
You can also restrict a manager's role so it acts only as a manager and not as a worker. Draining manager
nodes makes them unavailable as worker nodes:
docker node update --availability drain <node_id>
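To give the manager its worker role back later, set its availability back to active:

docker node update --availability active <node_id>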
It may be evident, but you should assign a static IP to each of your Swarm managers; a worker, in contrast, can
have a dynamic IP (since it will be discovered), but workers and managers should in all cases be able to
communicate with each other over the network.
The following ports must be available:
TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
TCP and UDP port 4789 for overlay network traffic
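For example, on an Ubuntu host protected by ufw, opening these ports could look like the following sketch; adapt it to your firewall of choice:

ufw allow 2377/tcp     # cluster management communications
ufw allow 7946/tcp     # communication among nodes
ufw allow 7946/udp
ufw allow 4789/tcp     # overlay network traffic
ufw allow 4789/udp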
Creating Swarm Managers & Workers
In order to create a manager, you should have Docker installed on the node, then use the swarm initialization
command:
docker swarm init --advertise-addr <node_ip|interface>[:port]
If you want to customize your security, you can use these options:
--cert-expiry duration Validity period for node certificates (ns|us|ms|s|m|h) (default 2160h0m0s)
--external-ca external-ca Specifications of one or more certificate signing endpoints
Other options can be used to change the heartbeat duration, the number of log entries between Raft snapshots or the
task history retention limit:
--autolock Enable manager autolocking (requiring an unlock key to start a stopped manager)
--dispatcher-heartbeat duration Dispatcher heartbeat period (ns|us|ms|s|m|h) (default 5s)
--max-snapshots uint Number of additional Raft snapshots to retain
--snapshot-interval uint Number of log entries between Raft snapshots (default 10000)
--task-history-limit int Task history retention limit (default 5)
When the quorum is lost, you can use the following option on a remaining manager to force the creation of a new cluster from the current state:
--force-new-cluster Force create a new cluster from current state
Example:
Say our eth0 interface has 138.197.35.0 as an IP address.
To create a cluster with a first manager, type:
docker swarm init --advertise-addr 138.197.35.0
This will show an instruction to execute a command in order to add a worker:
Swarm initialized: current node (vhvebboq9fp4j62w83dujxzjr) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-5b54vz0sie1li0ijr0epkhyjvmbbh2pg746skh8ba5674g1p6x-cmgrvib1disaeq08x8a5ln7zo \
138.197.35.0:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
If you want to add a worker to this cluster, create a new server and execute:
docker swarm join --token SWMTKN-1-5b54vz0sie1li0ijr0epkhyjvmbbh2pg746skh8ba5674g1p6x-cmgrvib1disaeq08x8a5ln7zo 138.197.35.0:2377
You should of course have port 2377 open. But if you want to add a new manager, you should execute the
following command:
docker swarm join-token manager
Docker will generate a new command that you can execute in a second manager:
docker swarm join --token SWMTKN-1-5b54vz0sie1li0ijr0epkhyjvmbbh2pg746skh8ba5674g1p6x-354t5zlz0re5kh1jqcliofecs 138.197.35.0:2377
Now you can get a list of all the available workers and managers in your cluster by typing:
docker node ls
I have a single node in my cluster, and of course it is a manager; this is the result:
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
vhvebboq9fp4j62w83dujxzjr * swarm-1 Ready Active Leader
Deploying Services
Creating A Container
Nothing is really different from what you learned before: you create an image, build it, run a container,
tag it, commit it, push it, etc., then you can use your image to create the container.
For the sake of simplicity, I created a container based on Alpine Linux that executes an infinite loop.
This is the Dockerfile:
FROM alpine
ENTRYPOINT tail -f /dev/null
I built it:
docker build -t eon01/infinite .
Sending build context to Docker daemon 11.78 kB
Step 1/2 : FROM alpine
latest: Pulling from library/alpine
0a8490d0dfd3: Pull complete
Digest: sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8
Status: Downloaded newer image for alpine:latest
---> 88e169ea8f46
Step 2/2 : ENTRYPOINT tail -f /dev/null
---> Running in 986f0fd1f5f9
---> d1400705c370
Removing intermediate container 986f0fd1f5f9
Successfully built d1400705c370
Run it:
docker run -it --name infinite -d eon01/infinite
fc476e6f8312492b2fd9cb620c8eaf5115c2e18a52d95160849436087dec2b68
And it should keep running because of the infinite tail -f /dev/null :
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fc476e6f8312 eon01/infinite "/bin/sh -c 'tail ..." 3 seconds ago Up 2 seconds infinite
You can use this image directly from my public Docker Hub:
docker run -it --name infinite -d eon01/infinite
But this is not actually how we use Swarm. The common usage of Swarm is creating services first, before thinking in
terms of containers.
Creating & Configuring Swarm Services
In order to create a service you can use the docker service create command.
docker service create <options> <image> <command> <args>
This command can be used with multiple options like:
--dns to set custom DNS servers
-e or --env to set a list of environment variables
--env-file to read in a file of environment variables
--health-cmd to set a health check command
--log-driver and --log-opt to set logging driver and driver options
-p or --publish to publish a port as a node port
--replicas to control the number of running tasks of a given service
--with-registry-auth to send registry authentication details to swarm agents
--mode to select if the service should be replicated or global
--network to attach a service to a network
--endpoint-mode to choose between vip or dnsrr
And other options that you may have used in the previous sections of this book, like -w or --workdir , -u or
--user , -t or --tty , --dns-option or --dns-search , etc.
Here is an example:
docker service create --name infinite_service eon01/infinite
After creating the new service, you can check, using the docker ps command, that you have two containers: the first one
that you created using the docker run command and the other one created by the Swarm service.
CONTAINER ID IMAGE COMMAND STATUS NAMES
cf265656d24a eon01/infinite "/bin/sh -c 'tail ..." Up 10 hours (unhealthy) infinite
982687e9d979 eon01/infinite@sha256:e53.. "/bin/sh -c 'tail ..." Up 10 hours infinite_service.1.eq..
We do not really need the first "unmanaged" container, remove it using docker rm -f infinite .
Scaling Docker Containers
After we created infinite_service, you will notice that there is one container running. One of Swarm's features is
the creation of easily scalable services: a single command allows us to scale a running service.
This is the running container that the service infinite_service created. By default a service will start 1 container:
a2f13154e772 eon01/infinite "/bin/sh -c 'tail ..." infinite_service.1.
In order to scale infinite_service we can use the docker service scale command:
To scale up to 2 containers:
root@swarm-1:~# docker service scale infinite_service=2
infinite_service scaled to 2
To run 100 containers :
root@swarm-1:~# docker service scale infinite_service=100
infinite_service scaled to 100
Scaling to 100 containers (or maybe less or more; this really depends on your application and architecture) is
not always the solution to your performance problems. I always run tests to choose the best scale.
Scaling up to 100 containers may introduce performance regressions, since you will be spending more
time on networking and/or service name resolution, so be sure to determine the right production scale using load
and networking tests.
In order to see your containers you may use docker ps , but I prefer using an alternative command to show the
containers per service:
docker service ps infinite_service
ID NAME IMAGE NODE DESIRED CURRENT STATE
bfvu50gte1gd infinite_service.1 eon01/infinite:latest swarm-1 Running Running 9 minutes ago
mope64fq4q1x infinite_service.2 eon01/infinite:latest swarm-1 Running Running 5 minutes ago
k8fv6qk2ez8c infinite_service.3 eon01/infinite:latest swarm-1 Running Running 5 minutes ago
e7luxqzwtb66 infinite_service.4 eon01/infinite:latest swarm-1 Running Running 5 minutes ago
pryz06z8prbz infinite_service.5 eon01/infinite:latest swarm-1 Running Running 5 minutes ago
kvfpi2ajbie3 infinite_service.6 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
e3i6q69ighth infinite_service.7 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
yqzq8rfne0wh infinite_service.8 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
iwrcuuz4lh7i infinite_service.9 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
ytpfh7u5w69b infinite_service.10 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
so5am191gsmm infinite_service.11 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
.
.
.
47tt3jgx6wjt infinite_service.66 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
87bmjq1p9w7a infinite_service.67 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
xuket5mmp4uw infinite_service.68 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
tpsy9hllzxz4 infinite_service.69 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
eyfzgknxem9f infinite_service.70 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
m6z044ym12kt infinite_service.71 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
lvr69efjkh1y infinite_service.72 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
o56fj14cwkjg infinite_service.73 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
gwjs8et91a0z infinite_service.74 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
vh5te7refgjh infinite_service.75 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
c8xwnlconcfb infinite_service.76 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
j724xf8y0hut infinite_service.77 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
xgmoeyz777sf infinite_service.78 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
e1wcnr6uj8tc infinite_service.79 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
qxm4cqk8rynd infinite_service.80 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
74c9jkn4sdam infinite_service.81 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
07xzshv893x9 infinite_service.82 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
twvbe48cx4wl infinite_service.83 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
jhftjr6f14b0 infinite_service.84 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
qgbnv32oe9yk infinite_service.85 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
b689gaghcxzd infinite_service.86 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
zo7mu3zw6rw6 infinite_service.87 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
xt1p06nvzzi9 infinite_service.88 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
kw57wh4thsei infinite_service.89 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
qdws30214xr2 infinite_service.90 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
rprasrpkksnf infinite_service.91 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
4894dp78376t infinite_service.92 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
4s6k9hxqwbrr infinite_service.93 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
rct9syhyztuv infinite_service.94 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
b391cod0r2wz infinite_service.95 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
xr7z5l1x14va infinite_service.96 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
rruxgcoaxmjr infinite_service.97 eon01/infinite:latest swarm-1 Running Running 4 minutes ago
tascvsbjidtp infinite_service.98 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
oazgbooulfne infinite_service.99 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
kc7d5xmcvxvy infinite_service.100 eon01/infinite:latest swarm-1 Running Running 3 minutes ago
Note that container names are numbered and contain the service name: infinite_service.1, infinite_service.10,
infinite_service.100. The last command will give you more information about containers; information like the desired
state and the current state will give you helpful insight into your containers.
The desired-state could take the values:
running
shutdown
accepted
A container current state could be:
running
shutdown
assigned
preparing
You also have the choice to set the scale of your services when starting them, using the docker service create command:
docker service create --name infinite_service --replicas 10 eon01/infinite
infinite_service is a replicated service; global is another type. In Swarm, a service can be either replicated or global. If
you have used Docker without Swarm, you may notice that Swarm moved Docker from a transactional approach
(operations made directly on containers) to an abstraction: there are services and containers, and a container is the
"physical" representation of a service that obeys Docker's distribution strategy when the cluster contains more
than 1 node. We are going to see the difference between the two modes later in this chapter.
Replicated VS Global
Replicated services are distributed by the Swarm manager(s) in the form of a specific number of replica tasks among the
nodes. The scale is defined by the user; if it is not defined, it defaults to 1. We scaled infinite_service to
10: if we have 5 nodes, it is not necessarily 2 tasks per node, since the scheduler decides the placement.
Global services run as one task on every node of the Swarm cluster.
To create a global service, you should use a similar command to this one:
docker service create --name infinite_service --mode global eon01/infinite
By default services are replicated. So using:
docker service create --name infinite_service --mode replicated eon01/infinite
is like using:
docker service create --name infinite_service eon01/infinite
You can scale replicated services but it is not allowed to scale global services:
docker service create --name infinite_service --mode global eon01/infinite
docker service scale infinite_service=2
infinite_service: scale can only be used with replicated mode
Creating & Configuring Swarm Networks
As seen in the networking chapter, Docker has some default built-in networks:
bridge , which allows communication between physical or virtual network interfaces on the host's system.
docker_gwbridge , which allows the containers to have external connectivity outside of their cluster. By default
every container is connected to this network.
host , which allows containers to have a networking configuration similar to the host's.
ingress , which exposes containers externally to the swarm, based on a port mapping model.
none , which helps to create a container-specific network stack.
docker network ls
NETWORK ID NAME DRIVER SCOPE
f97187f3cbf9 bridge bridge local
3a0e460df3be docker_gwbridge bridge local
09493ae999c4 host host local
czm3d1slqjxz ingress overlay swarm
31dffc9464fd none null local
You can of course create your own bridge network, your own gateway, etc., but since we are using Docker's default
networking features, we are not going to create any networks other than overlay ones: if we want a multi-host
networking stack, we need overlay networks. An overlay network we create is only available to the nodes attached
to our Swarm cluster.
Docker overlay networks are not available to unmanaged containers, even if they are living inside a Swarm
node.
By default the nodes encrypt and authenticate the information they exchange via gossip using the AES algorithm in
GCM mode, but you can add another encryption layer to an overlay network.
When using Swarm mode built-in discovery, you don't need to expose service-specific ports to make the
service available to other services on the same overlay network. Swarm will do the job for you.
We can create an overlay network using docker network create command. Here is the simplest example:
docker network create -d overlay infinite_net
or
docker network create --driver overlay infinite_net
Let's update the command to create the previous service in order to attach it to this network. Remove the service and
re-create it:
docker service rm infinite_service
docker service create --name infinite_service --network infinite_net eon01/infinite
A Docker service could be connected to two networks, let's test this:
docker network create -d overlay infinite_net_2
docker service rm infinite_service
docker service create --name infinite_service --network infinite_net --network infinite_net_2 eon01/infinite
Note that you can connect or disconnect a container to a given network using:
docker network connect <network_name> <container_id|container_name>
docker network disconnect <network_name> <container_id|container_name>
If you want to remove the network called infinite_net_2 , you can use docker network rm infinite_net_2 , but in reality
you will not be able to do it because the service infinite_service is still attached to it.
You can not remove a network being used by a service.
For this test I was using Digital Ocean, but for other tests I used some AWS EC2 machines and I had a problem with
the DNS. My solution was to add the --dns option:
docker service rm infinite_service
docker service create --name infinite_service --network infinite_net --dns 8.8.8.8 --dns 8.8.4.4 eon01/infinite
Sometimes, you create multiple networks and play with them but later you may forget to delete the unused ones.
Docker allows you to remove them, you should just type docker network prune .
Inter-Services Communication & Discovery
Let's create two new services attached to two different networks:
docker service rm infinite_service
docker network create -d overlay net_1
docker network create -d overlay net_2
docker service create --name service_1 --network net_1 eon01/infinite
docker service create --name service_2 --network net_2 eon01/infinite
Every service has 1 running container:
CONTAINER ID IMAGE COMMAND NAMES
dcfb1cc4a2a3 eon01/infinite "/bin/sh -c 'tail ..." service_2.1.ilz36bzvph6wbve28wdxzgn08
1c09cbbae151 eon01/infinite "/bin/sh -c 'tail ..." service_1.1.whz2slvqigmzp2ebh0ku68tft
dcfb1cc4a2a3 & 1c09cbbae151 are the ids of our containers.
If you type docker service inspect service_1 , you will see the network configuration where the network is net_1 and
the endpoint is in vip mode:
"Networks": [
{
"Target": "net_1"
}
],
"EndpointSpec": {
"Mode": "vip"
}
VIP is a container load balancing mode and we are going to see later the difference between VIP-based load
balancing and DNS-based load balancing.
Note that you can specify the CIDR of your user-defined networks.
Example:
docker network create -d overlay --subnet=192.168.1.0/24 net_1
docker network create -d overlay --subnet=192.168.2.0/24 net_2
Let's re-create the two same services where every one of them will be attached to a different network.
docker service rm service_1 service_2
docker service create --name service_1 --network net_1 eon01/infinite
docker service create --name service_2 --network net_2 eon01/infinite
Let's see what we have on our cluster:
docker ps
CONTAINER ID IMAGE COMMAND NAMES
c0a751e49756 eon01/infinite@sha256: "/bin/sh -c 'tail ..." service_2.1.
8fda59787b76 eon01/infinite@sha256: "/bin/sh -c 'tail ..." service_1.1.
To see how service names are resolved, we are doing here a simple test using the ping
command:
From service_2's container, the name service_2 is resolved and answers:
docker exec -it c0a751e49756 ping -c 2 service_2
PING service_2 (10.0.3.2): 56 data bytes
64 bytes from 10.0.3.2: seq=0 ttl=64 time=0.091 ms
64 bytes from 10.0.3.2: seq=1 ttl=64 time=0.138 ms
--- service_2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.091/0.114/0.138 ms
From service_1's container, the name service_1 is resolved and answers:
docker exec -it 8fda59787b76 ping -c 2 service_1
PING service_1 (10.0.2.2): 56 data bytes
64 bytes from 10.0.2.2: seq=0 ttl=64 time=0.160 ms
64 bytes from 10.0.2.2: seq=1 ttl=64 time=0.179 ms
--- service_1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.160/0.169/0.179 ms
We can see that service names are resolved by Docker's built-in DNS. Note that service_1 and service_2 are attached to two different overlay networks, so for the two services to communicate with each other directly, they would need to share a common network.
When you create a service using Swarm, you give it a name, and that name is resolved by Docker as we
have seen for our two services: there is a mapping between the subnet/IP of a service and its name.
For example: if, in the same Docker Swarm cluster, you have a PHP service called php_client that should call a Python
API service called python_api running on port 8000, you can connect from your PHP code to the API service using
code similar to this one:
$response = file_get_contents('http://python_api:8000/path/to/api/call?param=x');
$response = json_decode($response);
Docker will resolve the python_api name for the PHP container, allowing it to connect to the API.
Within a Swarm cluster, containers can call and request data from other containers using the service's built-in DNS resolution, which finds the appropriate remote IP and port automatically. If you call a service (from outside or inside the cluster), you are in reality requesting one of the containers of this service.
If a service is scaled to 2 or more containers, a load balancing algorithm redirects the traffic to a given container. If a container is down (generally because its entry process is not running), it is cut from traffic and the load balancer stops redirecting requests to it. Requests going to a service's containers are round-robin load-balanced, and this works even if you did not publish any ports when you created your service.
Docker service discovery uses iptables and IPVS features of Linux Kernel.
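To observe this built-in DNS resolution directly, you can query a service name from inside a container. This is a minimal sketch, assuming the image ships an nslookup tool (BusyBox-based images usually do): in the default VIP mode the service name resolves to a single virtual IP, while the special name tasks.<service_name> resolves to the individual task IPs.
docker exec -it 8fda59787b76 nslookup service_1
docker exec -it 8fda59787b76 nslookup tasks.service_1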
VIP vs DNS Based Load Balancing
Since Docker 1.12 was released, two important features are part of the default installation: service discovery and load balancing.
Load balancing also uses the iptables and IPVS features of the Linux kernel.
IPVS (IP Virtual Server) implements transport-layer load balancing inside the Linux kernel, so-called Layer-4 switching. IPVS running on a host acts as a load balancer in front of a cluster of real servers: it can direct requests for TCP/UDP-based services to the real servers, and makes the services of the real servers appear as a virtual service on a single IP address.
iptables is a packet filtering technology available in the Linux kernel; it can be used to classify and modify packets and to take decisions based on packet content. IPVS is a transport-level load balancer available in the Linux kernel.
docker service scale service_1=3
docker service scale service_2=2
We scaled our services: service_1 to 3 containers and service_2 to 2 containers. When a client requests service_1, the request is routed to a single container; the built-in load balancer is responsible for choosing which one.
If we have a cluster of 3 nodes (2 workers and a manager), containers will be distributed across these nodes and the load balancing algorithm will keep working the same way, balancing traffic across the nodes of the cluster (not just within the same machine).
Only managers will load-balance the traffic to services' containers.
This schema shows how a swarm manager can distribute traffic to itself and the other worker nodes in the same cluster.
Previously, when we created our two services, we attached service_1 to net_1 and service_2 to net_2; that's why all three nodes recognize the two overlay networks:
DNS Load Balancing
The Docker engine comes with an embedded DNS server. If service_1 has 3 running containers, when you request it in DNS round-robin mode, the DNS server returns the list of container IPs in no particular order, and your request is directed to the first IP in the list.
You can specify the endpoint mode explicitly using --endpoint-mode . Let's re-create our two services using this option, and also attach service_1 to the network net_2:
docker service rm service_1 service_2
docker service create --name service_1 --network net_1 --network net_2 --endpoint-mode dnsrr eon01/infinite
docker service create --name service_2 --network net_2 --endpoint-mode dnsrr eon01/infinite
docker service inspect service_1
You can notice the DNS mode (dnsrr):
"EndpointSpec": {
"Mode": "dnsrr"
}
DNS load balancing uses the round-robin algorithm; let's see an example of how it works. Scale service_1 to two instances and try to ping it from a container running service_2:
docker service scale service_1=2
Using docker ps, you can identify the ID of the container running service_2. Let's ping service_1 from this container:
docker exec -it 8b9d83c20454 ping -c 1 service_1
PING service_1 (10.0.3.2): 56 data bytes
64 bytes from 10.0.3.2: seq=0 ttl=64 time=0.272 ms
And try it again:
docker exec -it 8b9d83c20454 ping -c 20 service_1
PING service_1 (10.0.3.3): 56 data bytes
64 bytes from 10.0.3.3: seq=0 ttl=64 time=0.266 ms
We can notice that service_1 replied to the first ping with 10.0.3.2 and to the second one with another IP address, 10.0.3.3. The containers running service_1 have two different addresses, and the internal load balancer routes traffic to both of them using the round-robin balancing method:
docker inspect 75dc68288a2e|grep -i ipv4
"IPv4Address": "10.0.2.3"
docker inspect 91ab0b66aa3a|grep -i ipv4
"IPv4Address": "10.0.2.2"
At the time of writing, Docker DNS load balancing may have some issues distributing traffic to scaled services; you may read this blog post if you are interested in the effect of RFC 3484 on DNS round-robin balancing. If you experience problems using this feature, you can use alternatives like Weave, Consul, Registrator, or the VIP mode that we are going to see in the next part of this chapter.
VIP Load Balancing
Starting from the 1.12.0 release, Docker added built-in VIP support for service load balancing using the previously explained IPVS. The DNS name of the service is mapped to a VIP (Virtual IP).
Using VIP load balancing, each service has a single static IP that maps to multiple running containers; if one or more containers disappear, the associated IP does not change as long as the service is up.
Let's see how this mode works. Re-create service_1 with two containers using VIP load balancing, then re-create service_2:
docker service rm service_1
docker service create --name service_1 --network net_1 --network net_2 --replicas 2 --endpoint-mode vip eon01/infinite
docker service rm service_2
docker service create --name service_2 --network net_2 --endpoint-mode vip eon01/infinite
Using docker ps, get the ID of the container running service_2 and ping service_1 from that container:
docker exec -it 77a783cfde4f ping -c 1 service_1
PING service_1 (10.0.3.2): 56 data bytes
64 bytes from 10.0.3.2: seq=0 ttl=64 time=0.111 ms
You can try it again and again, but the service_1 will always have the same IP: 10.0.3.2.
As said, this is done using the Linux kernel's iptables (for firewall rules) and IPVS (for load balancing), and we are going to use the nsenter command to look inside.
Using docker ps, get the ID of one of the running service_1 containers, then use docker inspect <id>|grep -i sandbox in order to find its network namespace.
docker inspect 3877e31787f8|grep -i sandbox
"SandboxID": "f3886fa3028fc42364d7a157b467216abcf466ef8a72efbf69ff11a294addea7",
"SandboxKey": "/var/run/docker/netns/f3886fa3028f",
Now go to /var/run/docker/netns/ and execute the nsenter command with the network f3886fa3028f :
cd /var/run/docker/netns/
nsenter --net=f3886fa3028f sh
Now type:
iptables -nvL -t mangle
You may see something similar to this output:
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 MARK all -- * * 0.0.0.0/0 10.0.2.2 MARK set 0x107
0 0 MARK all -- * * 0.0.0.0/0 10.0.3.2 MARK set 0x108
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
The interesting line for us is:
0 0 MARK all -- * * 0.0.0.0/0 10.0.3.2 MARK set 0x108
As you can see, any packet destined to the IP 10.0.3.2, which is the VIP of our service, is marked with 0x108 (264 in decimal).
If, in the same shell, you type ipvsadm , you will see something like this:
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
FWM 263 rr
-> 10.0.2.3:0 Masq 1 0 0
-> 10.0.2.4:0 Masq 1 0 0
FWM 264 rr
-> 10.0.3.3:0 Masq 1 0 0
-> 10.0.3.4:0 Masq 1 0 0
The 264 entry (0x108 in hexadecimal) contains the IPs of the two containers running service_1:
10.0.3.3
10.0.3.4
Mangling or packet mangling is a modification applied to network packets in a packet-based network
interface before and/or after routing.
Now if we kill the containers running service_1 (Swarm will re-create them) and ping again, we still get the same VIP address.
In conclusion, using only 10.0.3.2, the VIP mode load-balances traffic between 10.0.3.3 and 10.0.3.4.
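You can also read these virtual IPs directly from the service definition; the Endpoint section of the inspect output contains one VIP per network the service is attached to:
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' service_1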
Updating Services
One of the important features of Docker Swarm is the possibility to apply rolling updates to running containers. This is done by updating the service. As we have seen, one can scale a service while it is running without interruptions:
docker service scale service_1=20
We can also apply updates to the running image; this is useful for developers and ops engineers to upgrade an application. In the following example, we are going to upgrade service_1 from one version to another:
docker service update --image eon01/infinite:0.1 service_1
After applying this update, you can monitor its status using the inspect command docker service inspect service_1 :
"UpdateStatus": {
"State": "updating",
"StartedAt": "2017-02-19T00:41:30.326692292Z",
"CompletedAt": "1970-01-01T00:00:00Z",
"Message": "update in progress"
}
When the update is finished, the status changes:
"UpdateStatus": {
"State": "completed",
"StartedAt": "2017-02-19T00:41:30.326692292Z",
"CompletedAt": "2017-02-19T00:42:02.636303087Z",
"Message": "update completed"
}
If an error happens, you will see something like this:
"UpdateStatus": {
"State": "paused",
"StartedAt": "2017-02-19T00:41:30.326692292Z",
"Message": "update paused due to failure or early termination of task 1p7eth457h8ndf0ui9s0q951b"
}
In this case, just typing docker service update service_1 will restart the paused update.
If you need to go back to the previous image, you can run:
docker service update --rollback service_1
You can add a DNS server:
docker service update --dns-add 8.8.8.8 service_1
Change the host name:
docker service update --hostname host_1 service_1
This will change containers' hostnames to host_1:
docker exec -it d86832832991 hostname
host_1
docker exec -it 5362c0d7cc91 hostname
host_1
You can update other configurations and options like:
--args string Service command args
--constraint-add list Add or update a placement constraint (default [])
--constraint-rm list Remove a constraint (default [])
--container-label-add list Add or update a container label (default [])
--container-label-rm list Remove a container label by its key (default [])
--dns-option-add list Add or update a DNS option (default [])
--dns-option-rm list Remove a DNS option (default [])
--dns-rm list Remove a custom DNS server (default [])
--dns-search-add list Add or update a custom DNS search domain (default [])
--dns-search-rm list Remove a DNS search domain (default [])
--endpoint-mode string Endpoint mode (vip or dnsrr)
--env-add list Add or update an environment variable (default [])
--env-rm list Remove an environment variable (default [])
--group-add list Add an additional supplementary user group to the container (default [])
--group-rm list Remove a previously added supplementary user group from the container (default [])
--health-cmd string Command to run to check health
--health-interval duration Time between running the check (ns|us|ms|s|m|h)
--health-retries int Consecutive failures needed to report unhealthy
--health-timeout duration Maximum time to allow one check to run (ns|us|ms|s|m|h)
--host-add list Add or update a custom host-to-IP mapping (host:ip) (default [])
--host-rm list Remove a custom host-to-IP mapping (host:ip) (default [])
--label-add list Add or update a service label (default [])
--label-rm list Remove a label by its key (default [])
--limit-cpu decimal Limit CPUs (default 0.000)
--limit-memory bytes Limit Memory (default 0 B)
--log-driver string Logging driver for service
--log-opt list Logging driver options (default [])
--mount-add mount Add or update a mount on a service
--mount-rm list Remove a mount by its target path (default [])
--no-healthcheck Disable any container-specified HEALTHCHECK
--publish-add port Add or update a published port
--publish-rm port Remove a published port by its target port
--reserve-cpu decimal Reserve CPUs (default 0.000)
--reserve-memory bytes Reserve Memory (default 0 B)
--restart-condition string Restart when condition is met (none, on-failure, or any)
--restart-delay duration Delay between restart attempts (ns|us|ms|s|m|h)
--restart-max-attempts uint Maximum number of restarts before giving up
--restart-window duration Window used to evaluate the restart policy (ns|us|ms|s|m|h)
--secret-add secret Add or update a secret on a service
--secret-rm list Remove a secret (default [])
--stop-grace-period duration Time to wait before force killing a container (ns|us|ms|s|m|h)
--tty, -t Allocate a pseudo-TTY
--update-delay duration Delay between updates (ns|us|ms|s|m|h) (default 0s)
--update-failure-action string Action on update failure (pause|continue) (default "pause")
--update-max-failure-ratio float Failure rate to tolerate during an update
--update-monitor duration Duration after each task update to monitor for failure (ns|us|ms|s|m|h) (default 0s)
--update-parallelism uint Maximum number of tasks updated simultaneously (0 to update all at once) (default 1)
--user, -u string Username or UID (format: <name|uid>[:<group|gid>])
--with-registry-auth Send registry authentication details to swarm agents
--workdir, -w string Working directory inside the container
If you need to force an update even if no changes require it, you can use --force .
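As a sketch combining a few of these flags, the following update would replace the image two tasks at a time, waiting 10 seconds between batches and pausing if a task fails (the 0.2 tag is hypothetical):
docker service update \
  --image eon01/infinite:0.2 \
  --update-parallelism 2 \
  --update-delay 10s \
  --update-failure-action pause \
  service_1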
Locking & Unlocking Swarm
This feature is compatible with Docker 1.13 and higher since it is based on the Docker secrets feature.
By default, all of the Raft logs used by Swarm managers are encrypted, and you can also use TLS-based communications between nodes in order to protect the Swarm cluster from external malicious access. In both cases (Raft log encryption and TLS communication), Swarm uses keys (one key per case), and these keys are loaded into the memory of each manager. With the autolock feature, you can take ownership of these keys yourself.
When you start a manager, it is started by default without autolock:
docker swarm init --autolock=false
By default, Docker generates a key and stores it on disk so that managers can restart automatically without human intervention. But if you want to lock the Swarm, use a command similar to this one:
docker swarm init --autolock=true
This command will generate a worker join token and an unlock key that will be needed later to unlock a manager: to start a stopped manager, to restart Docker, or to restore the Swarm from a backup.
The generated key should be saved manually in a safe place.
Once you take ownership of the key, your Swarm manager will no longer be able to restart without your intervention, since providing the key is now your responsibility. If you want to give this responsibility back to Docker, you can do it using:
docker swarm update --autolock=false
Typing this command disables autolock: the keys are again stored unencrypted on disk.
While you disable the lock feature, other managers may be down; they will not have changed their status from locked to unlocked and will still need the unlock key to restart, so keep it stored in a safe place.
Note that you can also lock an already created Swarm using the update command:
docker swarm update --autolock=true
And in order to unlock a Swarm, you may use docker swarm unlock and enter your unlock key.
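Note that, on an already unlocked manager, you can display the current unlock key at any time:
docker swarm unlock-key
docker swarm unlock-key -q
The -q (quiet) variant prints only the key itself, which is convenient for scripting.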
Swarm Backup
A Swarm manager stores the state of the cluster, its logs and the keys used to encrypt the Raft logs in
/var/lib/docker/swarm/ :
ls -lrth /var/lib/docker/swarm/
total 20K
drwxr-xr-x 2 root root 4.0K Feb 7 00:01 worker
drwxr-xr-x 2 root root 4.0K Feb 7 00:01 certificates
drwx------ 4 root root 4.0K Feb 7 00:01 raft
-rw------- 1 root root 108 Feb 18 17:03 docker-state.json
-rw------- 1 root root 68 Feb 18 17:03 state.json
In order to restore an auto-locked Swarm, you will absolutely need the unlock key; you can make the backup from any of your cluster managers.
Retrieve the unlock key first and store it in a safe location. Then make your backup; it is better to stop Docker on the manager you are using, because data could change during the backup.
cp -r /var/lib/docker/swarm/ /backup
Now you can start Docker again on your manager.
Save these files in a safe place, and before that, make sure the files are valid; files may sometimes get corrupted. The best way to check whether a file is corrupted is to read it, or to compute its checksum and compare it to the checksum of a known-valid copy.
cksum <my_file>
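Putting the previous steps together, here is a minimal backup sketch, assuming a systemd-based host and an existing /backup directory:
systemctl stop docker
tar -czvf /backup/swarm-backup.tar.gz -C /var/lib/docker swarm
systemctl start docker
cksum /backup/swarm-backup.tar.gz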
Swarm Disaster Recovery
If you are experimenting and learning, you can do this safely on your test servers, but beware: have a valid backup if you are trying to do it on a production server.
Follow the steps enumerated in the previous section Swarm Backup, make sure your files are not corrupted and save
them to a safe place.
In order to perform a disaster recovery, shut down Docker on the machine where the Swarm will be recovered, remove the files inside /var/lib/docker/swarm , then copy your backup files into it.
If everything is OK, start Docker again and initialize a new Swarm using the following command:
docker swarm init --force-new-cluster
You can now add new managers and workers.
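As a sketch, and under the same assumptions as the backup example in the previous section, the recovery steps could look like this:
systemctl stop docker
rm -rf /var/lib/docker/swarm
tar -xzvf /backup/swarm-backup.tar.gz -C /var/lib/docker
systemctl start docker
docker swarm init --force-new-cluster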
If your node uses an encryption key (with autolock enabled), it will be the same as that of the old, recovered Swarm, and you cannot change it until the recovery is finished. After finishing the recovery, unlock the Swarm using the old unlock key; then you can create a new key using the key rotation feature.
Swarm Key Rotation
Scheduling a regular key rotation is a best practice. Creating a new key is not complicated; a single command does it:
docker swarm unlock-key --rotate
The previous command generates a new unlock key; in order to use it to unlock a Swarm manager, you should use docker swarm unlock . At this step, the generated key should be provided.
Like the first key, the new one should be stored in a safe place. If lost, you will not be able to restart your manager.
As we have seen in the Swarm Disaster Recovery section, the key rotation feature is also used to generate a new key after a disaster recovery.
Monitoring Swarm Health
HEALTHCHECK is a Docker instruction that tells the engine how to test a container in order to check whether it is still working.
Let's see in practice how this works. We are still using the same Swarm that we used through all of this chapter. Let's begin by removing the two running services and creating a web server service.
docker service rm service_1 service_2
We will use Nginx for our web server:
docker service create --name web_server --replicas 1 -p 80:80 nginx:alpine
Go to your server IP address and check if you have the default Nginx home page.
We are going to use the official image of nginx:alpine to create a new image where we add a health check. This is
the Dockerfile:
FROM nginx:alpine
RUN apk add --no-cache curl
RUN echo "hello world" > /usr/share/nginx/html/index.html
HEALTHCHECK --interval=5s --timeout=5s CMD curl --fail -A "healthcheck-routine" http://localhost:80/ || exit 1
I built and pushed this image to Docker Hub, so you can use it directly. You can also build it on your own:
docker build -t eon01/nginx_healthcheck_example .
Or run it using this command:
docker run -it --name nginx_healthcheck_example -p 80:80 -d eon01/nginx_healthcheck_example
Make sure your port 80 is not used by another local server. Verify that your container is running using docker ps .
The Docker HEALTHCHECK will report the state of the running Nginx if you type this command:
docker inspect --format "{{json .State.Health.Status }}" nginx_healthcheck_example
It will give you the state of your container; normally it should be: healthy. This state is based on the curl check run by the instruction:
HEALTHCHECK --interval=5s --timeout=5s CMD curl --fail -A "healthcheck-routine" http://localhost:80/ || exit 1
Where:
interval=DURATION (default: 30s). This is the time interval between executing the healthcheck.
timeout=DURATION (default: 30s). If the check does not finish before the timeout, consider it failed.
retries=N (default: 3). How many times to recheck before marking a container as unhealthy.
You can customize your HEALTHCHECK using other configurations like:
HEALTHCHECK --interval=2s --timeout=1s --retries=1 CMD curl --fail -A "healthcheck-routine" http://localhost:8080/ || exit 1
In both cases, the check command is curl --fail -A "healthcheck-routine" http://localhost:<port>/ || exit 1 , which tells Docker to monitor the exit code of curl: if Nginx stops serving the index.html page, curl exits with code 1. The check runs every interval second(s), each run gets timeout second(s) to finish, and the container is marked unhealthy after retries consecutive failures.
In order to simulate a failure, we are going to rename index.html to index.html.orig; the index page will become unreachable and our HEALTHCHECK will detect this failure.
Type:
docker exec -it nginx_healthcheck_example sh
Inside the container:
cd /usr/share/nginx/html && mv index.html index.html.orig
Now (on your host machine), type watch docker ps , wait some seconds and see the STATUS changing from (healthy) to (unhealthy). For a standalone container, Docker only reports the unhealthy state; in Swarm mode, the orchestrator would replace the unhealthy task, as explained below.
The HEALTHCHECK state logs the failure that we simulated and any output from the curl command. If you want to debug your container, you can use this command:
docker inspect --format "{{json .State.Health }}" nginx_healthcheck_example
You should be able to see something like these logs:
{
"Status":"unhealthy",
"FailingStreak":77,
"Log":[
{
"Start":"2017-03-07T01:18:35.952544005+01:00",
"End":"2017-03-07T01:18:35.973369673+01:00",
"ExitCode":1,
"Output":" "[...]""
},
{
"Start":"2017-03-07T01:18:40.973509581+01:00",
"End":"2017-03-07T01:18:40.993919955+01:00",
"ExitCode":1,
"Output":" "[...]""
},
{
"Start":"2017-03-07T01:18:45.994059011+01:00",
"End":"2017-03-07T01:18:46.014129303+01:00",
"ExitCode":1,
"Output":" "[...]""
},
{
"Start":"2017-03-07T01:18:51.014382522+01:00",
"End":"2017-03-07T01:18:51.063646337+01:00",
"ExitCode":1,
"Output":" "[...]""
},
{
"Start":"2017-03-07T01:18:56.064089164+01:00",
"End":"2017-03-07T01:18:56.135445869+01:00",
"ExitCode":1,
"Output":" "[...]""
}
]
}
The Docker health check is a useful feature: in Swarm mode, if a container fails its health checks or terminates, the task terminates and the orchestrator creates a new replica task that spawns a new container.
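The same checks can also be attached at service-creation time using the --health-* flags we saw in the update options; these flags override any HEALTHCHECK defined in the image. A sketch reusing the image we built above (the service name web_server_hc is illustrative; remove the previous web_server service first if it is still publishing port 80):
docker service create \
  --name web_server_hc \
  --replicas 2 \
  -p 80:80 \
  --health-cmd "curl --fail http://localhost:80/ || exit 1" \
  --health-interval 5s \
  --health-retries 3 \
  eon01/nginx_healthcheck_example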
Using Secrets In Swarm Mode
Starting from version 1.13, Docker users can use Docker Secrets when working in Swarm mode.
A secret is a blob of data like:
A password
SSH/SSL private keys and certificates
A secret key that would otherwise be hardcoded in your Dockerfile
Sensitive environment data that could be transmitted over a network
Sensitive data in the application source code that could be transmitted over a network
Other data such as the user name or the name of a database, generic strings or even binary content (< 500 kB)
etc.
In all of the above cases, Docker Secrets allows you to centrally manage sensitive data and transmit it securely, and only to the containers that need access to it. This sensitive data is encrypted both at rest and in transit.
When you add a secret to a Swarm it is sent to the manager over a TLS connection.
A TLS connection uses the Transport Layer Security protocol, which provides privacy and data integrity for inter-node communications.
Docker has its built-in certificate authority.
After that, the secret is stored in the Raft log, which is encrypted by default (Docker 1.13 and higher), and when this log is replicated across all of the other managers, the secret is securely propagated.
When a new node joins the cluster and is scheduled to run a given service, a manager establishes a TLS connection to the new node and securely sends the service's secret data. The secret is then decrypted and mounted into the container in an in-memory file system.
These mounts have /run/secrets/<secret_name> as a target.
When a node is no longer running a service, all of that service's secrets disappear from the node: the node no longer has access to them. This is done by the node itself after a notification from a manager.
A Docker secret is never written to disk on any of the Swarm nodes; instead, it is written to a tmpfs volume. It is not possible to recover it from the Docker CLI: you actually read it from inside the service(s) where it has been injected.
tmpfs is a temporary file storage facility used in many Unix-like OSs. It is intended to appear as a mounted file system, but it is stored in RAM instead of on disk.
Creating & Storing Secrets
Let's try some practical examples.
In some of the previous examples, we ran a MySQL database using this command:
docker run --name mysql -p 3306:3306 --env MYSQL_ROOT_PASSWORD=password -d mysql
and when we want to run it as a Swarm service, we run:
docker service create --name mysql -p 3306:3306 --env MYSQL_ROOT_PASSWORD=password mysql
With Docker Secrets, the run command becomes:
docker service create --name mysql -p 3306:3306 --secret mysqlrootpassword --env MYSQL_ROOT_PASSWORD_FILE=/run/secrets/mysqlrootpassword mysql
/run/secrets/mysqlrootpassword is the in-memory file where the root password is mounted. Here is how to create the secret:
Say our password is simply "password"
After creating your Swarm, open the terminal and type: echo "password"|docker secret create mysqlrootpassword -
You will get the ID of the secret; in my case: vbsvaiunxjau6r6isnc3vxcyt
Now you can check the secret file under /run/secrets/ in the running MySQL container: docker exec -it mysql.1.j9tixgi8pmwa131ux5screqaq ls /run/secrets/mysqlrootpassword
When I created the Mysql service, it was deployed to a host that I called host_1:
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
j9tixgi8pmwa mysql.1 mysql:latest host_1 Running Running 4 minutes ago
To double-check, we can type mysql -h host_1 -uroot -ppassword and we will be connected to the MySQL container.
In order to list the created secrets, type:
docker secret ls
e.g.:
ID NAME CREATED UPDATED
wf048or4g6v7gn3v8pmj57vz0 mysqlrootpassword 35 minutes ago 35 minutes ago
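Note that docker secret inspect shows only metadata (ID, name, creation and update dates), never the secret's value; the value can only be read from inside the containers where the secret is mounted:
docker secret inspect mysqlrootpassword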
Sending Secrets Between Services
In this example, we have two MySQL services running in the same Swarm cluster: the first database is mysql_1, published on port 3306, and the second one is mysql_2, published on port 3307. In order to see how we can share secrets, the first service's password will be shared across the cluster to be used by the new service mysql_2 as its root password.
To accomplish this, we need to create an overlay network to which both services will be attached.
docker network create -d overlay mysql_network
Remove the first Mysql service:
docker service rm mysql
Create a new secret:
echo "password"|docker secret create mysql1rootpassword -
Create the first service:
docker service create \
--name mysql_1 \
-p 3306:3306 \
--secret mysql1rootpassword \
--env MYSQL_ROOT_PASSWORD_FILE=/run/secrets/mysql1rootpassword \
--network mysql_network \
mysql
Create the second service:
docker service create \
--name mysql_2 \
-p 3307:3306 \
--secret source=mysql1rootpassword,target=mysql2rootpassword,mode=0400 \
--env MYSQL_ROOT_PASSWORD_FILE=/run/secrets/mysql2rootpassword \
--network mysql_network \
mysql
You may have noticed how using --secret source=mysql1rootpassword,target=mysql2rootpassword creates a new file in the tmpfs of the containers running the mysql_2 service; you can check it by logging into one of the containers and executing cat /run/secrets/mysql2rootpassword .
Sharing secrets between services is easy and secure, and it is possible whenever services are accessible to each other (in our case, both services are on the same network).
Rotating Secrets
The good thing about using Docker Secrets is the modularity and the separation between code and secrets: changing a secret does not require changing code. On the other hand, changing secrets regularly is a good security practice. Let's see how to rotate secrets.
Our password is changing from "password" to "new_password", so we create a new secret:
echo "new_password"|docker secret create newmysqlrootpassword -
Remove the old secret from the first service:
docker service update --secret-rm mysql1rootpassword mysql_1
Add the new secret in its place (keeping the same target name):
docker service update --secret-add source=newmysqlrootpassword,target=mysql1rootpassword mysql_1
Update the second MySQL service (note the combination of the two steps in one command):
docker service update --secret-rm mysql2rootpassword --secret-add source=newmysqlrootpassword,target=mysql2rootpassword,mode=0400 mysql_2
Let's make some verifications: Type docker ps to get the list of containers:
CONTAINER ID IMAGE COMMAND PORTS NAMES
8b305cb625ed mysql:latest "docker-entrypoint..." 3306/tcp mysql_2.1.
a9e64fff8e06 mysql:latest "docker-entrypoint..." 3306/tcp mysql_1.1.
Check the content of the secret file:
docker exec -it mysql_2.1.1piq2o2ascilmjighsq4qxld9 cat /run/secrets/mysql2rootpassword
If you get new_password, then everything went well in both services.
Chapter XIII - Orchestration - Kubernetes
  o   ^__^
   o  (oo)\_______
      (__)\       )\/\
          ||----w |
          ||     ||
Kubernetes is one of the best-known orchestration systems; it needs more than a chapter, it needs a book. I recently created a project called Books For DevOps where you can find several hand-curated books, some of them about Kubernetes. In this chapter, we are going to see how to deploy and use K8S, but some concept definitions are necessary before moving to the practical part.
Introduction
First, let's have a look at the Kubernetes architecture.
Master Components
In order to schedule and orchestrate a set of containers and services, Kubernetes needs a master. The master components are the controlling services that manage the cluster and give the administrator an interface to manage deployments, rollbacks and other operations. These master components can be installed on a single machine, or you can create a distributed system of master components where multiple machines form the master.
More specifically, the master components are:
Etcd: a distributed key-value store for the critical data of a distributed system
API Server: the main management point of the entire cluster
Controller Manager Service: a daemon that embeds the core control loops shipped with K8S. It runs a non-terminating loop that regulates the state of the system by watching the shared state of the cluster through the apiserver and making changes attempting to move the current state towards the desired state (e.g. the replication controller, endpoints controller, namespace controller, serviceaccounts controller)
Scheduler Service: responsible for assigning workloads to specific nodes in a cluster. It reads a service's operating requirements, then analyzes the infrastructure environment in order to place the work on the right node(s)
Pods
If Kubernetes is an operating system, then a pod is the process.
A pod is simply a group of one or more containers, the shared storage for those containers, and the different configurations describing how to run the containers. Two or more containers in the same pod share an IP address and port space; they can find each other via localhost and communicate with each other using inter-process communication (e.g. SystemV semaphores, POSIX shared memory).
Pods serve as the unit of deployment, horizontal scaling, and replication. They enable data sharing and communication among their containers and can be used to host vertically integrated application stacks (e.g. LAMP).
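As a minimal sketch of what a pod definition looks like (the names used here are illustrative), the following manifest describes a single-container pod; it could be created with kubectl create -f pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80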
Deployments
With a declarative approach, Deployments update pods: the Deployment controller changes the actual state to the desired state when the user updates a Deployment object's desired state. A Deployment is described using a YAML file; this is an example of a Deployment that creates a ReplicaSet to bring up 5 Apache pods:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: apache-deployment
spec:
  replicas: 5
  template:
    metadata:
      labels:
        app: apache
    spec:
      containers:
      - name: apache
        image: httpd
        ports:
        - containerPort: 80
A Deployment supports operations like updating, rollback, checking rollout history, pausing and resuming.
Services
A Kubernetes service is an abstraction over pods: it consistently maintains a well-defined endpoint for a set of pods. A pod can be placed on a different node, it can also die and get replaced, but Kubernetes' built-in service discovery allows a service to keep the same endpoint while the pods behind it change.
K8S services can be:
Internal (e.g. databases and, more generally, private pods)
External (e.g. any service that needs an external endpoint with a specific port, like Nginx)
Load balanced (e.g. services behind a load balancer. Kubernetes can use some cloud providers' managed load balancers like AWS ELB or the GCE load balancing service: this feature enables integrating a third-party load balancer with Kubernetes.)
Replication Controller
A Replication Controller is responsible for ensuring that a pod or a homogeneous group of pods is always up and reachable. When there are too many or too few pods, it regulates them by killing pods or starting new ones: a specified number of pod replicas should be running at any given time.
ReplicaSet
ReplicaSet is the next-generation Replication Controller; it supports a more generalized label selector.
Nodes/Minions
A minion is a worker machine, virtual or bare-metal, that runs pods and is managed by the Kubernetes master components.
Kubelet
The kubelet is the primary node agent that runs on each minion.
The Container Runtime
Each node runs a container runtime, such as Docker, which is responsible for downloading images and running containers.
Kube Proxy
kube-proxy is a daemon that runs on each minion and acts as a network proxy and load balancer for the services living on that minion.
A Local Kubernetes Using Minikube
With these definitions out of the way, we can start manipulating K8S; let's start a local cluster.
Installation
We will proceed for the moment with Minikube. Minikube is a tool that makes it easy to run Kubernetes locally.
Minikube runs a single-node Kubernetes cluster inside a VM on a laptop. It is useful for development since it
packages and configures a Linux VM, the container runtime, and all Kubernetes components and it supports
Kubernetes features such as:
DNS
NodePorts
ConfigMaps and Secrets
Dashboards
Container Runtime: Docker and rkt
Enabling CNI (Container Network Interface)
You can get the installation instructions from the official Minikube repository. I am going to use the Linux
installation.
curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.18.0/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
In order to use it on OS X, you will need the xhyve driver, VirtualBox or VMware Fusion. On Linux you will need VirtualBox or KVM, and on Windows, VirtualBox or Hyper-V.
On all OSs, VT-x/AMD-V virtualization must be enabled in the BIOS and kubectl (> 1.0) should be installed.
The installation of kubectl depends on your system:
For OS X:
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl
For Linux:
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
For Windows users:
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/windows/amd64/kubectl.exe
After the installation, you should make the kubectl binary executable and move it to your PATH:
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
Running Minikube
The first command to run to bootstrap a K8S cluster is minikube start .
This command will start a local VM, download the Minikube ISO and check that the requirements are satisfied.
Starting VM...
Downloading Minikube ISO
89.51 MB / 89.51 MB [==============================================] 100.00% 0s
If your requirements are not satisfied, you will get an error message.
Example:
Error starting host: Error creating host: Error with pre-create check: "VBoxManage not found. Make sure VirtualBox is installed and VBoxManage is in the path"
I had a problem the first time I worked with Minikube; the error message was:
Error starting host: Error configuring auth on host: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded
The solution, mentioned in this GitHub issue, was minikube delete && minikube start .
Executing minikube delete && minikube start will:
Delete local Kubernetes cluster
Start local Kubernetes cluster
Start the VM
SSH files into VM
Set up certs
Start cluster components
Connect to cluster
Set up kubeconfig
Kubectl
Kubectl is now configured to use the cluster.
To connect to the target cluster, Minikube has already run some commands like:
kubectl config set-cluster default-cluster --server=https://${MASTER_HOST} --certificate-authority=${CA_CERT}
kubectl config set-credentials default-admin --certificate-authority=${CA_CERT} --client-key=${ADMIN_KEY} --client-certificate=${ADMIN_CERT}
kubectl config set-context default-system --cluster=default-cluster --user=default-admin
kubectl config use-context default-system
In order to verify, we can type kubectl get nodes ; if everything was set up right, we will get output similar to the following:
NAME STATUS AGE VERSION
minikube Ready 2h v1.6.0
Let's run Nginx in a container:
kubectl run nginx --image=nginx --port=80 --replicas=1
Now we can type kubectl get pods to view the pods, i.e. the groups of containers that are deployed together on the same host (here, localhost):
NAME READY STATUS RESTARTS AGE
nginx-158599303-xxf1d 1/1 Running 0 1m
If we were running other pods, it is possible to "filter" the get pods output using:
kubectl get pods --selector="run=nginx" --output=wide
NAME READY STATUS AGE IP NODE
nginx-29nqv 1/1 Running 3h 172.17.0.4 minikube
nginx-g24b3 1/1 Running 3h 172.17.0.5 minikube
Without the --output=wide :
NAME READY STATUS RESTARTS AGE
nginx-158599303-29nqv 1/1 Running 0 3h
nginx-158599303-g24b3 1/1 Running 0 3h
We can also show other options like labels, using kubectl get pods --show-labels :
NAME READY STATUS LABELS
nginx-29nqv 1/1 Running pod-template-hash=158599303,run=nginx
nginx-g24b3 1/1 Running pod-template-hash=158599303,run=nginx
Let's view the deployment list: kubectl get deployments :
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx 1 1 1 1 22m
We can get more information about the pod using kubectl get -o json pod nginx-158599303-xxf1d . Don't forget to always adapt the commands to your specific case: replace nginx-158599303-xxf1d with your pod name.
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"annotations": {
"kubernetes.io/created-by":
"{\"kind\":\"SerializedReference\",\"apiVersion\":
\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",
\"namespace\":\"default\",\"name\":\"nginx-158599303\",
\"uid\":\"5a0dc0a9-2aca-11e7-8020-0800275cee00\",
\"apiVersion\":\"extensions\",
\"resourceVersion\":\"9004\"}}\n"
},
"creationTimestamp": "2017-04-26T21:50:24Z",
"generateName": "nginx-158599303-",
"labels": {
"pod-template-hash": "158599303",
"run": "nginx"
},
"name": "nginx-158599303-xxf1d",
"namespace": "default",
"ownerReferences": [
{
"apiVersion": "extensions/v1beta1",
"blockOwnerDeletion": true,
"controller": true,
"kind": "ReplicaSet",
"name": "nginx-158599303",
"uid": "5a0dc0a9-2aca-11e7-8020-0800275cee00"
}
],
"resourceVersion": "9075",
"selfLink": "/api/v1/namespaces/default/pods/nginx-158599303-xxf1d",
"uid": "5a103f82-2aca-11e7-8020-0800275cee00"
},
"spec": {
"containers": [
{
"image": "nginx",
"imagePullPolicy": "Always",
"name": "nginx",
"ports": [
{
"containerPort": 80,
"protocol": "TCP"
}
],
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"volumeMounts": [
{
"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
"name": "default-token-jk2cd",
"readOnly": true
}
]
}
],
"dnsPolicy": "ClusterFirst",
"nodeName": "minikube",
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"serviceAccount": "default",
"serviceAccountName": "default",
"terminationGracePeriodSeconds": 30,
"volumes": [
{
"name": "default-token-jk2cd",
"secret": {
"defaultMode": 420,
"secretName": "default-token-jk2cd"
}
}
]
},
"status": {
"conditions": [
{
"lastProbeTime": null,
"lastTransitionTime": "2017-04-26T21:50:24Z",
"status": "True",
"type": "Initialized"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2017-04-26T21:51:14Z",
"status": "True",
"type": "Ready"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2017-04-26T21:50:24Z",
"status": "True",
"type": "PodScheduled"
}
],
"containerStatuses": [
{
"containerID": "docker://b..3",
"image": "nginx:latest",
"imageID": "docker://sha256:4..8",
"lastState": {},
"name": "nginx",
"ready": true,
"restartCount": 0,
"state": {
"running": {
"startedAt": "2017-04-26T21:51:13Z"
}
}
}
],
"hostIP": "192.168.99.100",
"phase": "Running",
"podIP": "172.17.0.4",
"qosClass": "BestEffort",
"startTime": "2017-04-26T21:50:24Z"
}
}
This JSON output contains information that can also be expressed in a YAML file; the latter can be used with the kubectl command to create a pod: kubectl create -f pod.yaml .
Let's delete the nginx Deployment and write the YAML specification file that can be used to re-create the pods.
kubectl delete deployments/nginx
Note that you can get a list of deployments, using kubectl get deployments .
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
You can find this code in Orchestration_Kubernetes_Running_Minikube.yaml file.
In order to use it, type:
kubectl create -f Orchestration_Kubernetes_Running_Minikube.yaml --record
Using the --record flag allows you to record the command in the annotations of the created resources (also in the case of an update). This is useful to see which command produced each Deployment revision. To understand this, type:
kubectl describe deployments
And you will get a similar output to this one:
Name: nginx
Namespace: default
CreationTimestamp: Thu, 27 Apr 2017 11:21:43 +0200
Labels: run=nginx
Annotations: deployment.kubernetes.io/revision=1
kubernetes.io/change-cause=kubectl create\
--filename=Orchestration_Kubernetes_Running_Minikube.yaml --record=true
Selector: run=nginx
Replicas: 2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 1 max unavailable, 1 max surge
Pod Template:
Labels: run=nginx
Containers:
nginx:
Image: nginx
Port: 80/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
OldReplicaSets: <none>
NewReplicaSet: nginx-158599303 (2/2 replicas created)
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- ----- ------ -------
4m 4m 1 deployment-controller Normal ScalingReplicaSet Scaled up replica set nginx to 2
Notice the annotations:
Annotations: deployment.kubernetes.io/revision=1
kubernetes.io/change-cause=kubectl create \
--filename=Orchestration_Kubernetes_Running_Minikube.yaml --record=true
We can get more details about the created Nginx using kubectl describe pods :
Name: nginx-158599303-29nqv
Namespace: default
Node: minikube/192.168.99.100
Start Time: Thu, 27 Apr 2017 11:21:43 +0200
Labels: pod-template-hash=158599303
run=nginx
Annotations:
kubernetes.io/created-by=
{"kind":"SerializedReference","apiVersion":"v1","reference":
{"kind":"ReplicaSet","namespace":"default","name":"nginx-158599303","uid":"xxxxx","a...
Status: Running
IP: 172.17.0.4
Controllers: ReplicaSet/nginx-158599303
Containers:
nginx:
Container ID: docker://xxxx
Image: nginx
Image ID: docker://sha256:xxxx
Port: 80/TCP
State: Running
Started: Thu, 27 Apr 2017 11:22:22 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-27fkv (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-27fkv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-27fkv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
FirstSeen LastSeen Count Message
--------- -------- ----- -------
23m 23m 1 default-scheduler Successfully assigned nginx-158599303-29nqv to minikube
23m 23m 1 kubelet, minikube pulling image "nginx"
22m 22m 1 kubelet, minikube Successfully pulled image "nginx"
22m 22m 1 kubelet, minikube Created container with id xxxx
22m 22m 1 kubelet, minikube Started container with id xxxx
Name: nginx-158599303-g24b3
Namespace: default
Node: minikube/192.168.99.100
Start Time: Thu, 27 Apr 2017 11:21:43 +0200
Labels: pod-template-hash=158599303
run=nginx
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":
{"kind":"ReplicaSet","namespace":"default","name":"nginx-158599303","uid":"xxxxx","a...
Status: Running
IP: 172.17.0.5
Controllers: ReplicaSet/nginx-158599303
Containers:
nginx:
Container ID: docker://xxxx
Image: nginx
Image ID: docker://sha256:xxxx
Port: 80/TCP
State: Running
Started: Thu, 27 Apr 2017 11:22:21 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-27fkv (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-27fkv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-27fkv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
FirstSeen LastSeen Count From Message
--------- -------- ----- ---- -------
23m 23m 1 default-scheduler assigned nginx-158599303-g24b3 to minikube
23m 23m 1 kubelet, minikube pulling image "nginx"
22m 22m 1 kubelet, minikube Successfully pulled image "nginx"
22m 22m 1 kubelet, minikube Created container with id xxxx
22m 22m 1 kubelet, minikube Started container with id xxxx
We can use less verbose commands like kubectl get deployments :
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx 2 2 2 2 31m
or kubectl get pods :
NAME READY STATUS RESTARTS AGE
nginx-158599303-29nqv 1/1 Running 0 33m
nginx-158599303-g24b3 1/1 Running 0 33m
Now we can create a service object that exposes the last deployment:
kubectl expose deployment nginx --port=80
We can use the same get command used above to show the JSON specification:
kubectl get -o json service nginx
This shows the service's specification, in JSON format:
{
"apiVersion": "v1",
"kind": "Service",
"metadata": {
"creationTimestamp": "2017-04-27T10:10:29Z",
"labels": {
"run": "nginx"
},
"name": "nginx",
"namespace": "default",
"resourceVersion": "4669",
"selfLink": "/api/v1/namespaces/default/services/nginx",
"uid": "bdf36511-2b31-11e7-b29a-080027c0925b"
},
"spec": {
"clusterIP": "10.0.0.234",
"ports": [
{
"port": 80,
"protocol": "TCP",
"targetPort": 80
}
],
"selector": {
"run": "nginx"
},
"sessionAffinity": "None",
"type": "ClusterIP"
},
"status": {
"loadBalancer": {}
}
}
We can get a specific part of this information using the --template= flag with the get service command.
Example:
kubectl get service nginx --template={{.spec.clusterIP}}
will give you the IP address of the service (nginx)
kubectl get service nginx --template={{.metadata.name}}
will give you the name of the service
etc.
Another way to view information about a service is using the describe command, like kubectl describe service nginx . Output similar to the following will show on your screen:
Name: nginx
Namespace: default
Labels: run=nginx
Annotations: <none>
Selector: run=nginx
Type: ClusterIP
IP: 10.0.0.234
Port: <unset> 80/TCP
Endpoints: 172.17.0.4:80,172.17.0.5:80
Session Affinity: None
Events: <none>
In order to view the ReplicaSet objects, we can use kubectl get replicasets to obtain a list:
NAME DESIRED CURRENT READY AGE
nginx-158599303 2 2 2 3h
Or kubectl describe replicasets to obtain more details:
Name: nginx-158599303
Namespace: default
Selector: pod-template-hash=158599303,run=nginx
Labels: pod-template-hash=158599303
run=nginx
Annotations: deployment.kubernetes.io/desired-replicas=2
deployment.kubernetes.io/max-replicas=3
deployment.kubernetes.io/revision=1
kubernetes.io/change-cause=kubectl create
--filename=Orchestration_Kubernetes_Running_Minikube.yaml
--record=true
Replicas: 2 current / 2 desired
Pods Status: 2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: pod-template-hash=158599303
run=nginx
Containers:
nginx:
Image: nginx
Port: 80/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Events: <none>
In order to scale the deployment up to 5 replicas, we can use this command:
kubectl scale deployments nginx --replicas=5
Let's run another service. It is possible to create resources from a remote YAML file. This is the hello-world Deployment:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 7
  template:
    metadata:
      labels:
        app: hello
        tier: backend
        track: stable
    spec:
      containers:
      - name: hello
        image: "gcr.io/google-samples/hello-go-gke:1.0"
        ports:
        - name: http
          containerPort: 80
You can find the file here: https://kubernetes.io/docs/tutorials/connecting-apps/hello.yaml and you can use it this way:
kubectl create -f http://k8s.io/docs/tutorials/connecting-apps/hello.yaml
Now the kubectl get pods command will show:
NAME READY STATUS RESTARTS AGE
hello-1987912066-1tnxq 1/1 Running 0 6m
hello-1987912066-3fwvt 1/1 Running 0 6m
hello-1987912066-503fc 1/1 Running 0 6m
hello-1987912066-7htlm 1/1 Running 0 6m
hello-1987912066-pvbmj 1/1 Running 0 6m
hello-1987912066-qc0xt 1/1 Running 0 6m
hello-1987912066-qvqlr 1/1 Running 0 6m
nginx-158599303-2rgff 1/1 Running 0 31s
nginx-158599303-g24b3 1/1 Running 0 6h
nginx-158599303-jkjm7 1/1 Running 0 39s
nginx-158599303-k0zt3 1/1 Running 0 34s
nginx-158599303-rskx3 1/1 Running 0 37s
Publishing Services & Services Types
Now that we have Minikube & Kubectl and we were able to create services, let's run this service:
kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --port=8080
echoserver is a simple application that responds with the HTTP headers it received.
Let's publish the created service:
kubectl expose deployment hello-minikube --type=NodePort
You may have noticed the NodePort option. Kubernetes ServiceTypes allow you to specify what kind of service you
want. The default value is ClusterIP.
ClusterIP exposes the service on a cluster-internal IP so that the service is reachable only from within the cluster.
NodePort exposes the service on each node's IP at a static port and creates a ClusterIP service. The latter receives the traffic routed from the NodePort service. In this type, the NodePort service is reachable from outside the cluster using <NodeIP>:<NodePort> .
The LoadBalancer type exposes the service externally using a cloud provider's load balancer. This could be limited since Kubernetes load balancing is not implemented with many cloud vendors. The NodePort and ClusterIP services receive traffic from the load balancer and are created automatically.
ExternalName maps the service to the contents of the externalName field (e.g. k8s.painlessdocker.com). ExternalName returns a CNAME record with its value; this feature requires kube-dns > 1.7.
We chose the NodePort type; in this case, the Kubernetes master allocates a port from a configured range (30000 to 32767), and each node proxies that port to your service. The allocated port is reported in the service's spec.ports[*].nodePort field.
If we have multiple nodes, Kubernetes uses the same port number on every node.
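You can also read the allocated port directly from the service spec; a quick sketch using kubectl's JSONPath output:
kubectl get service hello-minikube -o jsonpath='{.spec.ports[0].nodePort}'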
Now we can use curl to get the service response; the minikube service command conveniently builds the full URL for us:
minikube service hello-minikube --url
Now we can simply run curl $(minikube service hello-minikube --url) and we will get something similar to this:
CLIENT VALUES:
client_address=172.17.0.1
command=GET
real path=/
query=nil
request_version=1.1
request_uri=http://192.168.99.100:8080/
SERVER VALUES:
server_version=nginx: 1.10.0 - lua: 10001
HEADERS RECEIVED:
accept=*/*
host=192.168.99.100:30805
user-agent=curl/7.47.0
BODY:
-no body in request-
You can also use your browser!
We can see the Minikube VM's IP address using minikube ip . On the host side, the corresponding interface is vboxnet1 in my case (and most probably the same for you): its IP is 192.168.99.1, the broadcast address is 192.168.99.255 and the netmask is 255.255.255.0.
Use ifconfig:
vboxnet1 Link encap:Ethernet HWaddr 0a:00:27:00:00:01
inet addr:192.168.99.1 Bcast:192.168.99.255 Mask:255.255.255.0
inet6 addr: fe80::800:27ff:fe00:1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:203 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:40551 (40.5 KB)
Using Kubernetes With Google Container Engine
Prerequisites
GKE (Google Container Engine) is a service managed by Google that makes it easy to run containers in the Google cloud. It uses Kubernetes.
In this part, we will run a WordPress blog in a managed Kubernetes cluster, so you need a Google Cloud account and a project that you can create using the GCP UI:
Now go to the project dashboard.
Remember the project ID, which is a unique name across all Google Cloud projects. It will be referred to later in this chapter as PROJECT_ID.
Next, you will need to enable billing in the Developers Console in order to use Google Cloud resources like Cloud
Datastore and Cloud Storage. Now go and enable APIs in the API Manager:
Click on Compute Engine API to enable it:
You must also enable the Google Container Engine API.
Setting Up The Compute Zone
Start Google Cloud Shell by clicking on the terminal icon (the icon is in the top toolbar). This Shell is connected to a
virtual machine where GCP tools are installed.
We will be using the gcloud command to create our cluster, but first let's set the compute zone so that all the VMs of this cluster will be created in the chosen zone.
gcloud config set compute/zone <zone>
In our cluster, we are going to choose the us-central1-b zone:
gcloud config set compute/zone us-central1-b
You will get a similar output to this one:
Updated property [compute/zone].
GCP has regions in the Americas, Europe & Asia; each region has several zones. You can use your preferred region and zone; you will find them in the official Google documentation.
Creating The Cluster
The gcloud command allows us to create a new cluster; we will call it my-test-node:
gcloud container clusters create my-test-node --num-nodes 1
When you create the cluster, if you get a warning like:
WARNING: Accessing a Container Engine cluster requires the kubernetes commandline client [kubectl].
Then you should install the kubectl component:
gcloud components install kubectl
This command creates a new cluster called my-test-node with one node (VM). Since we'll only be launching one container, a single node is fine. If you wanted to launch multiple containers, you could specify a different number of nodes when creating the cluster. When launching the cluster, you will see:
Creating cluster my-test-node ... done
Created [https://container.googleapis.com/v1/projects/wordpress-161315/zones/us-central1-b/clusters/my-test-node].
kubeconfig entry generated for my-test-node.
NAME ZONE MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
my-test-node us-central1-b 1.5.3 35.184.199.238 n1-standard-1 1.5.3 1 RUNNING
You can also verify that everything went OK with the cluster launch by listing the instances of your cluster:
gcloud compute instances list
You will get a similar output to this:
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
gke-my-test-node-default-pool-f295ca85-pvjc us-central1-b n1-standard-1 10.128.0.2 104.198.50.198 RUNNING
Creating The Wordpress Services
Now that our cluster is running, it is time to deploy a Wordpress container to it. We'll be using the tutum/wordpress
image that contains everything we need to run a Wordpress site, including a MySQL database, in a single Docker
container.
We are using this Wordpress image just to demonstrate how GKE works, but it is not a good practice to run
several processes in the same Docker container.
Creating Our Pod
As we said earlier, a pod is one or more containers that "travel together": they can be administered together and can
share the same network requirements, etc.
Let's create the pod using kubectl:
We are always using the same Cloud Shell:
gcloud container clusters get-credentials my-test-node --zone us-central1-b --project wordpress-161315
Remember the project ID: wordpress-161315 .
Now run
kubectl run wordpress --image=tutum/wordpress --port=80
You will get a similar message:
deployment "wordpress" created
This command starts a container from the Docker image on one of the nodes of our cluster, and we can see it using the
kubectl CLI:
kubectl get pods
You will see a ready-to-use pod:
NAME READY STATUS RESTARTS AGE
wordpress-2410004867-dkc26 1/1 Running 0 1m
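If you want to check which node the pod was scheduled on, kubectl's -o wide output flag adds that column:

kubectl get pods -o wide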
Exposing Wordpress
By default, a pod is only accessible to other machines inside the same cluster, but in order to use the Wordpress
application, the service needs to be exposed. This will allow external traffic.
In order to expose the pod, we will use the same command that we have already used, kubectl expose <..> , but this
time we are going to create a load balancer using the --type=LoadBalancer flag.
This flag creates an external IP that the Wordpress pod can use to accept traffic.
kubectl expose pod wordpress-2410004867-dkc26 --name=wordpress --type=LoadBalancer
You will need to replace the pod name wordpress-2410004867-dkc26 with the result you got from kubectl get
pods .
kubectl expose creates:
The service
The forwarding rules for the load balancer
The firewall rules that allow external traffic to be sent to the pod.
It may take some minutes to create the load balancer.
Note the value in the LoadBalancer Ingress field. This is how you can access your container from outside the cluster.
LoadBalancer Ingress: 104.198.26.134
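While the load balancer is being created, the external IP will show as pending; you can check on it at any time with a standard kubectl query:

kubectl get service wordpress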
Logging Into Our Cluster Machines
If you are new to GCP, you may wonder whether we have access to the VMs of our cluster or not. The answer is yes; we
can go to the list of VMs:
Select the browser mode, and if you want to access your machine using your own console, click on "view gcloud command"
and it will show something like this:
gcloud compute --project "wordpress-161315" ssh --zone "us-central1-b" "gke-my-test-node-default-pool-f295ca85-pvjc"
If they were not already generated before, the same command will generate a pair of public and private keys and a known hosts
file:
.ssh/google_compute_engine
.ssh/google_compute_engine.pub
.ssh/google_compute_known_hosts
You can now see the running containers if you type docker ps :
CONTAINER ID IMAGE COMMAND NAMES
56082ad88e0d tutum/wordpress "/run.sh" k8s_wordpress
ef0d27870b4c gcr.io/google_containers/pause-amd64 "/pause" k8s
3fe304b3ec19 gcr.io/google_containers/addon-resizer "/pod_nanny --cpu=80m" k8s_heapster-nanny
f61268d8ca36 gcr.io/google_containers/heapster "/heapster --source=k" k8s_heapster_heapster
cf6d05737632 gcr.io/google_containers/pause-amd64 "/pause" k8s_POD.d8dbe16c_heapster
0df23fc8a6f5 gcr.io/google_containers/kubedns-amd64 "/kube-dns --domain=c" k8s_kubedns_kube-dns
2579f8dbedf4 gcr.io/google_containers/exechealthz-amd64 "/exechealthz '--cmd=" k8s_healthz_kube-dns
6496a5d33b86 gcr.io/google_containers/pause-amd64 "/pause" k8s_POD_kube-proxy
d33c866c969a gcr.io/google_containers/defaultbackend "/server" k8s_default-http-backend
5777cbc6bca6 gcr.io/google_containers/kube-dnsmasq-amd64 "/usr/sbin/dnsmasq --" k8s_dnsmasq_kube-dns
eab12f0c765f gcr.io/google_containers/pause-amd64 "/pause" k8s_POD_l7-default-backend
75f8faef1369 gcr.io/google_containers/fluentd-gcp "/bin/sh -c 'rm /lib/" k8s_fluentd-cloud-logging
dc30769eb75f gcr.io/google_containers/pause-amd64 "/pause" k8s_POD_kube-dns-autoscaler
13e0c16438a8 gcr.io/google_containers/pause-amd64 "/pause" k8s_POD_kubernetes-dashboard
5be92530ad6a gcr.io/google_containers/pause-amd64 "/pause" k8s_POD_kube-dns
1e80866c0446 gcr.io/google_containers/pause-amd64 "/pause" k8s_POD_fluentd-cloud-logging
80d7325ae045 gcr.io/google_containers/kube-proxy "/bin/sh -c 'kube-pro" k8s_kube-proxy_kube-proxy
c1f0995028cc gcr.io/google_containers/dnsmasq-metrics-amd64 "/dnsmasq-metrics --v" k8s_dnsmasq-metrics
9421a6f0b4d8 gcr.io/google_containers/cluster-proportional-autoscaler-amd64 "/cluster-proportiona" k8s_autoscaler_kube-dns-autoscaler
0fa05812b004 gcr.io/google_containers/kubernetes-dashboard-amd64 "/dashboard --port=90" k8s_kubernetes-dashboard
Using an HTTP Proxy to Access the Kubernetes API
In this section, we are going to connect kubectl to our Google Container Engine cluster using its remote API.
Go to the container engine web page and select connect:
Type the generated command, it should be similar to this one:
gcloud container clusters get-credentials my-test-node --zone us-central1-b --project wordpress-161315
If you get this warning message:
WARNING: Accessing a Container Engine cluster requires the kubernetes commandline
Then you should type:
gcloud components install kubectl
If any of the above or below commands fails, you can debug using gcloud info --show-log .
Let's create an HTTP proxy to access the Kubernetes API:
kubectl proxy
If you have a similar error to this one error: google: could not find default credentials. See
https://developers.google.com/accounts/docs/application-default-credentials for more information. , you should authenticate to
Google Cloud services using:
gcloud auth application-default login
Then you should type again:
kubectl proxy
You can choose a specific port:
kubectl proxy --port=8081
The default port is 8001. You can now use your localhost to access the Kubernetes proxy UI. If you point your browser
to the wrong address, you will probably get this error:
<h3>Unauthorized</h3>
These are the urls you can use:
http://127.0.0.1:8001/ui
http://localhost:8001/ui
Now you can use curl in order to interact with the API:
Get the API versions:
curl http://localhost:8001/api/
Get a list of pods:
curl http://localhost:8001/api/v1/namespaces/default/pods
Inspecting Services
You can see more details about the Wordpress service we created using:
kubectl describe services wordpress
In the Cloud Shell console you will get a similar output to this:
Name: wordpress
Namespace: default
Labels: pod-template-hash=2410004867
run=wordpress
Selector: pod-template-hash=2410004867,run=wordpress
Type: LoadBalancer
IP: 10.7.251.61
Port: <unset> 80/TCP
NodePort: <unset> 30795/TCP
Endpoints: 10.4.0.9:80
Session Affinity: None
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
8s 8s 1 {service-controller } Normal CreatingLoadBalancer Creating load balancer
Inspecting Nodes
You can describe the different nodes you are using:
kubectl describe nodes
A similar output to this will appear right on your screen:
Name: gke-my-test-node-default-pool-f295ca85-pvjc
Role:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=n1-standard-1
beta.kubernetes.io/os=linux
cloud.google.com/gke-nodepool=default-pool
failure-domain.beta.kubernetes.io/region=us-central1
failure-domain.beta.kubernetes.io/zone=us-central1-b
kubernetes.io/hostname=gke-my-test-node-default-pool-f295ca85-pvjc
Taints: <none>
CreationTimestamp: Sun, 12 Mar 2017 16:44:47 +0100
Phase:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sun, 12 Mar Sun, 12 Mar RouteCreated RouteController created a route
OutOfDisk False Sun, 12 Mar Sun, 12 Mar KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Sun, 12 Mar Sun, 12 Mar KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sun, 12 Mar Sun, 12 Mar KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Sun, 12 Mar Sun, 12 Mar KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses: 10.128.0.2,104.198.50.198,gke-my-test-node-default-pool-f295ca85-pvjc
Capacity:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 1
memory: 3788484Ki
pods: 110
Allocatable:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 1
memory: 3788484Ki
pods: 110
System Info:
Machine ID: 833668693a8ae07719ffda0c786a12fc
System UUID: 83366869-3A8A-E077-19FF-DA0C786A12FC
Boot ID: 528cd889-61b2-476e-ac01-795c51d18cbb
Kernel Version: 4.4.21+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.11.2
Kubelet Version: v1.5.3
Kube-Proxy Version: v1.5.3
PodCIDR: 10.4.0.0/24
ExternalID: 5258581285997695597
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default wordpress 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system fluentd-cloud-logging 100m (10%) 0 (0%) 200Mi (5%) 200Mi (5%)
kube-system heapster- 138m (13%) 138m (13%) 301456Ki (7%) 301456Ki (7%)
kube-system kube-dns- 260m (26%) 0 (0%) 140Mi (3%) 220Mi (5%)
kube-system kube-dns-autoscale 20m (2%) 0 (0%) 10Mi (0%) 0 (0%)
kube-system kube-proxy-gke-my- 100m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system kubernetes-dashboa 100m (10%) 100m (10%) 50Mi (1%) 50Mi (1%)
kube-system l7-default-backend 10m (1%) 10m (1%) 20Mi (0%) 20Mi (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
828m (82%) 248m (24%) 731536Ki (19%) 803216Ki (21%)
Inspecting Namespaces
You can list the current namespaces in a cluster using:
kubectl get namespaces
The default namespaces on a fresh Kubernetes installation are:
NAME STATUS AGE
default Active 1h
kube-system Active 1h
default is the default namespace for objects with no namespace specified
kube-system is the namespace for objects created by the Kubernetes system
You can set the namespace for all the kubectl commands using:
kubectl config set-context $(kubectl config current-context) --namespace=test-namespace
Replace test-namespace with your preferred name, then check that it was set using:
kubectl config view | grep namespace:
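Note that the namespace must already exist before you point kubectl at it. If needed, create it first (using the same hypothetical test-namespace name):

kubectl create namespace test-namespace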
Note that we can find namespaces in the DNS entry created for a service (e.g. service-name.namespace-
name.svc.cluster.local).
<service-name>.<namespace-name>.svc.cluster.local
When a container just uses the service name, it will resolve to the service which is local to its
namespace. This is useful for reusing the same configuration across multiple namespaces such as dev, qa, staging and
production.
If you want to reach across namespaces, you need to use the fully qualified domain name (FQDN).
Viewing Kubernetes Configurations
In order to view your configurations, you need to type:
kubectl config view
It will show a description of your configuration.
e.g. the configuration of the Wordpress cluster we deployed should look like this:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: REDACTED
server: https://35.184.192.231
name: gke_wordpress-161315_us-central1-b_my-test-node
contexts:
- context:
cluster: gke_wordpress-161315_us-central1-b_my-test-node
user: gke_wordpress-161315_us-central1-b_my-test-node
name: gke_wordpress-161315_us-central1-b_my-test-node
current-context: gke_wordpress-161315_us-central1-b_my-test-node
kind: Config
preferences: {}
users:
- name: gke_wordpress-161315_us-central1-b_my-test-node
user:
auth-provider:
config:
access-token: xxxxxxxxxxxxxxxxxxxxx
expiry: 2017-03-12T18:13:16.201747726+01:00
name: gcp
This configuration is stored in the following file:
ls ~/.kube/config
If you use multiple kubeconfig files at the same time and want to view the merged configuration, you can still use
kubectl config view . Say we have two configuration files:
~/.kube/config1
~/.kube/config2
The command would be:
KUBECONFIG=~/.kube/config1:~/.kube/config2 kubectl config view
Using Kubernetes With Amazon Web Services
In order to install and use Kubernetes on AWS, we are going to use a tool called kops (which stands for Kubernetes
Operations). It allows you to get a production-grade Kubernetes cluster from the command line. Deployment is currently
supported on Amazon Web Services, and more platforms are planned.
Installing Kops
In order to install kops you will need a *nix system or OS X.
For OS X users, using Homebrew:
brew update && brew install kops
You can also use this method:
wget https://github.com/kubernetes/kops/releases/download/v1.4.1/kops-darwin-amd64
chmod +x kops-darwin-amd64
mv kops-darwin-amd64 /usr/local/bin/kops
Linux users should download the binary (the same release provides a kops-linux-amd64 build):
wget https://github.com/kubernetes/kops/releases/download/v1.4.1/kops-linux-amd64
chmod +x kops-linux-amd64
mv kops-linux-amd64 /usr/local/bin/kops
You can install from sources:
go get -d k8s.io/kops
cd ${GOPATH}/src/k8s.io/kops/
git checkout release
make
Windows users need to install a Linux VM or Vagrant.
Prerequisites
We need to set up a user: using IAM, we can create a new user; remember to save its access key ID and secret key, and give it
administrator access. You can also create your own policy giving the user only the rights it needs, but we will
proceed in this course with full administrator access.
After that, we need to set up DNS. kops uses DNS for discovery inside the cluster and for the clients, so that we
can reach the Kubernetes API server. The cluster name should be a valid DNS name, which allows us to reach our
cluster through its domain. We are going to use Route53 with a subdomain. You should either register a domain or
transfer an existing domain name to Route53 (if you would like to centralize all your domain operations in AWS
Route53).
If you don't want to spend money, you can register a free domain name, for example with the .tk extension.
Note that if you have a domain registered with another registrar (Namecheap, GoDaddy, etc.) and you want to keep
it there instead of transferring it, you should set the generated hosted zone values (NS records) in your original registrar's
configuration.
Now create a hosted zone for your domain. You can use the AWS web interface or the AWS CLI. In my case, I
already hosted the domain devopslinks.com in Route53 and I will be using kubernetes.dev.devopslinks.com for the
remainder:
aws route53 create-hosted-zone --name kubernetes.dev.devopslinks.com --caller-reference 1
You should have a similar output to this:
{
"ChangeInfo": {
"Status": "PENDING",
"SubmittedAt": "2017-04-30T21:10:40.362Z",
"Id": "/change/xxxxxxxxxx"
},
"DelegationSet": {
"NameServers": [
"ns-1752.awsdns-27.co.uk",
"ns-1333.awsdns-38.org",
"ns-904.awsdns-49.net",
"ns-102.awsdns-12.com"
]
},
"HostedZone": {
"Name": "kubernetes.dev.devopslinks.com.",
"CallerReference": "1",
"Id": "/hostedzone/xxxxxxxxx",
"Config": {
"PrivateZone": false
},
"ResourceRecordSetCount": 2
},
"Location": "https://route53.amazonaws.com/2013-04-01/hostedzone/Z2AZ80TW24KT5C"
}
Creating a hosted zone is a paid service. You can find more details about Route53 here.
Note the 4 generated name servers :
ns-1752.awsdns-27.co.uk.
ns-1333.awsdns-38.org.
ns-904.awsdns-49.net.
ns-102.awsdns-12.com.
If you have your domain registered elsewhere, say GoDaddy, you should go to the GoDaddy web interface, create 4 NS
records with the host kubernetes.dev. and set one of the name servers above as the value of each record.
We also need an S3 bucket.
To manage clusters, kops must keep track of the ones it created, along with their configuration, the keys they are
using and various other settings. This information is stored in an S3 bucket.
Create the bucket (we will call it kubernetes.dev.devopslinks.com):
aws s3 mb s3://kubernetes.dev.devopslinks.com
At this step, you should have set up the right user, S3 and Route53. We can proceed by typing this command:
kops create cluster \
--name=kubernetes.dev.devopslinks.com \
--state=s3://kubernetes.dev.devopslinks.com \
--zones=eu-west-1a \
--node-count=2 \
--node-size=t2.micro \
--master-size=t2.micro \
--dns-zone=kubernetes.dev.devopslinks.com
You should adapt this command to your preferences: the region, the size of the machines, the number of
machines in the cluster, etc. This command will create the following resources:
EBSVolume
DHCPOptions
Keypair
SSHKey
EBSVolume
VPC
IAMRole
IAMInstanceProfile
SecurityGroup
Subnet
IAMRolePolicy
InternetGateway
SecurityGroupRule
SecurityGroupRule
Route
RouteTableAssociation
AutoscalingGroup
As you can see in the output of the last command:
To list clusters: kops get cluster
To edit this cluster kops edit cluster kubernetes.dev.devopslinks.com
To edit node instance group: kops edit ig --name=kubernetes.dev.devopslinks.com nodes
To edit the master instance group: kops edit ig --name=kubernetes.dev.devopslinks.com master-eu-west-1a
Now we can configure the cluster with:
kops update cluster kubernetes.dev.devopslinks.com --yes --state=s3://kubernetes.dev.devopslinks.com
Don't forget to add --state=s3://kubernetes.dev.devopslinks.com
After launching this command, you can find important configurations in ~/.kube/config file.
Cluster is starting. It should be ready in a few minutes.
As you may see in the output of the last command:
To list nodes: kubectl get nodes --show-labels
To ssh to the master: ssh -i ~/.ssh/id_rsa admin@api.kubernetes.dev.devopslinks.com
To read about installing addons: https://github.com/kubernetes/kops/blob/master/docs/addons.md
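Before deploying anything, you can also check that the cluster converged correctly. A quick check, assuming the same state store as above and a kops version that ships the validate subcommand:

kops validate cluster --state=s3://kubernetes.dev.devopslinks.com
kubectl get nodes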
Chapter XIV - Orchestration - Rancher/Cattle
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
In the Docker world, orchestration is the most important part of the ecosystem. Docker Swarm, Kubernetes,
Apache Mesos: all of these are orchestrators, and each one of them has its own philosophy, use cases and architecture.
Rancher is a tool built to simplify Docker orchestration and management. Through this tutorial, we are going to
discover how to use it in order to create a scalable Wordpress application.
Rancher Architecture
In Rancher, everything (containers, images, networks, accounts, etc.) is an API resource with its own process lifecycle.
Rancher is built on top of containers and is composed of:
A web UI
An API
A server that manages Rancher Agents
A database
A machine microservice
The docker-machine binary
When you run Rancher using docker run rancher/server ... , the Rancher API, the Rancher process server, the
database and the machine microservice are processes that live inside this container.
Note that the docker-machine binary also lives in the same container, but it only runs when it is called by the API.
Rancher also has an Agent component that manages the lifecycle of containers.
If docker-machine creates a machine successfully, some events are exchanged between the docker-machine and the
microservice. A bootstrap event is created and a docker-machine config command is executed to get the details needed
to connect to the machine’s Docker daemon.
If everything runs without problems, the service fires up a Rancher Agent on the machine via docker run
rancher/agent ... .
Rancher Agents open a WebSocket connection to the server in order to establish a 2-way communication. Each
Rancher Agent manages its containers and reports every change using the Docker API.
During this tutorial, we are going to use an EC2 machine; this gives a different view of the layers of our
Rancher installation:
RancherOS
RancherOS is a small distribution released by the Rancher team. It is an easy way to run containers at scale in
production, and it includes only the services needed to run Docker.
It only includes the latest version of Docker and removes any unneeded library that a "normal" Linux distribution
could have.
In RancherOS, everything is a container: the traditional init system is replaced so that Docker runs directly on the
kernel.
A special component in this system is called User Docker; it is the daemon that allows a (non-system) user
to run containers.
We are going to run RancherOS in an EC2 machine using AWS CLI:
aws ec2 run-instances --image-id ami-ID --count 1 --instance-type t2.micro --key-name MySSHKeyName --security-groups sg-name
This is the list of AMI by region:
Region Type AMI
ap-south-1 HVM ami-fd1e6d92
eu-west-2 HVM ami-51776335
eu-west-1 HVM ami-481e232e
ap-northeast-2 HVM ami-c32efdad
ap-northeast-1 HVM ami-33aaf154
sa-east-1 HVM ami-15ed8d79
ca-central-1 HVM ami-e61fa282
ap-southeast-1 HVM ami-63b50900
ap-southeast-2 HVM ami-86b7bbe5
eu-central-1 HVM ami-a71ecfc8
us-east-1 HVM ami-37b00f21
us-east-2 HVM ami-c61632a3
us-west-1 HVM ami-8998c3e9
us-west-2 HVM ami-f6910496
Now you can log in to your machine using the usual AWS SSH command:
ssh -i "MySSHKeyName" rancher@xxxx.xx-xxxx-x.compute.amazonaws.com
Running Rancher
Since Docker is installed by default, we can start using it directly. Let's start a MariaDB server then use it with
Rancher Server.
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -e MYSQL_DATABASE=cattle -e MYSQL_USER=cattle -
e MYSQL_PASSWORD=cattle -p 3306:3306 -d mariadb
Run the Rancher Server, replacing 172.31.0.190 with your MariaDB IP:
docker run --name rancher_server -p 8080:8080 -e CATTLE_DB_CATTLE_MYSQL_HOST=172.31.0.190 -
e CATTLE_DB_CATTLE_MYSQL_PORT=3306 -e CATTLE_DB_CATTLE_MYSQL_NAME=cattle -e CATTLE_DB_CATTLE_USERNAME=cattle -
e CATTLE_DB_CATTLE_PASSWORD=cattle -v /var/run/docker.sock:/var/run/docker.sock -d rancher/server
Now you can go to <host_ip>:8080 and add a Linux host with a supported version of Docker.
We are not going to use the public IP address but the private (eth0) one.
Let’s add a custom host.
A command is given to run on any reachable host in order to let it join the server. At this step, I have already created an
EC2 machine (with RancherOS as its operating system) and I am going to run this command on it:
sudo docker run -d --privileged -v /var/run/docker.sock:/var/run/docker.sock -
v /var/lib/rancher:/var/lib/rancher rancher/agent:v1.2.1 http://172.31.0.190:8080/v1/scripts/02048221239BE78134BB:1483142400000:i5yrQRE4kFo7eIKz9n1o5VFTew
Make sure any Security Groups or firewalls allow traffic from and to all other hosts on UDP ports 500 and 4500.
After the Agent installation, we can notice that some containers are running on each Agent machine, like the DNS,
healthcheck and network-manager containers:
CONTAINER ID IMAGE COMMAND NAMES
8339a8af3bb8 rancher/net "/rancher-entrypoint." r-ipsec-ipsec-router
95f3d3afdb7c rancher/net:holder "/.r/r /rancher-entry" r-ipsec-ipsec
5643089ca485 rancher/healthcheck "/.r/r /rancher-entry" r-healthcheck-healthcheck
fec564c53f13 rancher/dns "/rancher-entrypoint." r-network-services-metadata-dns
c162c07d07f6 rancher/net "/rancher-entrypoint." r-ipsec-ipsec-cni-driver
0313ff5cb812 rancher/scheduler "/.r/r /rancher-entry" r-scheduler-scheduler
0f6f62a190c7 rancher/network-manager "/rancher-entrypoint." r-network-services-network-manager
ec270864ff07 rancher/metadata "/rancher-entrypoint." r-network-services-metadata
98a78058bc3c rancher/agent "/run.sh run" rancher-agent
Running A Wordpress Service
If everything is OK, we can create the Wordpress service. Go to the Stack menu and click User.
Create the MariaDB database and name it wordpressdb (we are going to link it to the Wordpress
container later).
Then add your environment variables:
Now, in the same way, create the Wordpress container, map host port 80 to container port 80 and link the
mariadb container as a mysql instance.
Then add your environment variables:
At this step, we can check the running services, use Stack->User then default to see the containers:
What we started is the equivalent of 2 docker commands: one to start the mariadb container and another one to
start the wordpress container, as sketched below.
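Roughly, those two commands would look like the following sketch (the container names, password and database name are illustrative only, chosen to match the environment variables used in this stack):

docker run --name wordpressdb -e MYSQL_ROOT_PASSWORD=wordpress -e MYSQL_DATABASE=wordpress -d mariadb
docker run --name wordpress --link wordpressdb:mysql -p 80:80 -d wordpress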
You can visit the Wordpress fresh installation using its IP address.
We can see the details of the different configurations of a running container if we choose the container
from the same view and then use the "View in API" menu.
This is an example of the wordpress container:
{
"id":"1s7",
"type":"service",
"links":{
"self":"…/v2-beta/projects/1a5/services/1s7",
"account":"…/v2-beta/projects/1a5/services/1s7/account",
"consumedbyservices":"…/v2-beta/projects/1a5/services/1s7/consumedbyservices",
"consumedservices":"…/v2-beta/projects/1a5/services/1s7/consumedservices",
"instances":"…/v2-beta/projects/1a5/services/1s7/instances",
"networkDrivers":"…/v2-beta/projects/1a5/services/1s7/networkdrivers",
"serviceExposeMaps":"…/v2-beta/projects/1a5/services/1s7/serviceexposemaps",
"serviceLogs":"…/v2-beta/projects/1a5/services/1s7/servicelogs",
"stack":"…/v2-beta/projects/1a5/services/1s7/stack",
"storageDrivers":"…/v2-beta/projects/1a5/services/1s7/storagedrivers",
"containerStats":"…/v2-beta/projects/1a5/services/1s7/containerstats"
},
"actions":{
"upgrade":"…/v2-beta/projects/1a5/services/1s7/?action=upgrade",
"restart":"…/v2-beta/projects/1a5/services/1s7/?action=restart",
"update":"…/v2-beta/projects/1a5/services/1s7/?action=update",
"remove":"…/v2-beta/projects/1a5/services/1s7/?action=remove",
"deactivate":"…/v2-beta/projects/1a5/services/1s7/?action=deactivate",
"removeservicelink":"…/v2-beta/projects/1a5/services/1s7/?action=removeservicelink",
"addservicelink":"…/v2-beta/projects/1a5/services/1s7/?action=addservicelink",
"setservicelinks":"…/v2-beta/projects/1a5/services/1s7/?action=setservicelinks"
},
"baseType":"service",
"name":"wordpress",
"state":"active",
"accountId":"1a5",
"assignServiceIpAddress":false,
"createIndex":1,
"created":"2017-04-04T21:05:53Z",
"createdTS":1491339953000,
"currentScale":1,
"description":"wordpress files",
"externalId":null,
"fqdn":null,
"healthState":"healthy",
"instanceIds":[
"1i14"
],
"kind":"service",
"launchConfig":{
"type":"launchConfig",
"capAdd":[
],
"capDrop":[
],
"dataVolumes":[
],
"dataVolumesFrom":[
],
"devices":[
],
"dns":[
],
"dnsSearch":[
],
"environment":{
"WORDPRESS_DB_NAME":"wordpress",
"WORDPRESS_DB_USER":"wordpress",
"WORDPRESS_DB_PASSWORD":"wordpress"
},
"imageUuid":"docker:wordpress",
"instanceTriggeredStop":"stop",
"kind":"container",
"labels":{
"io.rancher.container.pull_image":"always"
},
"logConfig":{
"type":"logConfig",
"config":{
},
"driver":""
},
"networkMode":"managed",
"ports":[
"80:80/tcp"
],
"privileged":false,
"publishAllPorts":false,
"readOnly":false,
"secrets":[
],
"startOnCreate":true,
"stdinOpen":true,
"system":false,
"tty":true,
"version":"0",
"dataVolumesFromLaunchConfigs":[
],
"vcpu":1
},
"lbConfig":null,
"linkedServices":{
"mysql":"1s6"
},
"metadata":null,
"publicEndpoints":[
{
"type":"publicEndpoint",
"hostId":"1h1",
"instanceId":"1i14",
"ipAddress":"54.246.163.197",
"port":80,
"serviceId":"1s7"
}
],
"removed":null,
"retainIp":null,
"scale":1,
"scalePolicy":null,
"secondaryLaunchConfigs":[
],
"selectorContainer":null,
"selectorLink":null,
"stackId":"1st5",
"startOnCreate":true,
"system":false,
"transitioning":"no",
"transitioningMessage":null,
"transitioningProgress":null,
"upgrade":null,
"uuid":"0b8a8290-0d10-47be-b182-45f1e57e80bc",
"vip":null
}
We can also inspect other services and hosts configurations in the same way.
Cattle: Rancher Container Orchestrator
What we started above is Rancher powered by its own orchestration tool called Cattle. Rancher offers the possibility
to use other orchestration tools like Kubernetes, Docker Swarm or Mesos.
Cattle is a container orchestration and scheduling framework. In the beginning, it was designed as an extension to
Docker Swarm, but as Docker Swarm continued to develop, Cattle and Swarm started to diverge. Cattle is used
extensively by Rancher itself to orchestrate infrastructure services as well as setting up, managing, and upgrading
Swarm, Kubernetes, and Mesos clusters.
Cattle application deployments are organized into stacks, which can be either User or Infrastructure stacks. A
Cattle Stack is a collection of Services, and a Service is primarily a Docker image with its networking, scalability,
storage, health checks, service discovery links, environment and all of the other configurations. A Cattle Service
could also be a load balancer or an external service. A Stack can be launched using a docker-compose.yml or a rancher-
compose.yml file, or just by starting containers like we did for the Wordpress stack.
In addition to this, you can start several applications using an application catalog. More than 50 different apps are
available in the app catalog.
An application is defined by a docker-compose and a rancher-compose file and can be deployed easily with the
default configurations.
Let's take the example of Portainer:
Portainer is a lightweight management UI which allows you to easily manage your Docker host or Swarm
cluster. Portainer is meant to be as simple to deploy as it is to use. It consists of a single container that can run
on any Docker engine (Docker for Linux and Docker for Windows are supported). Portainer allows you to
manage your Docker containers, images, volumes, networks and more ! It is compatible with the standalone
Docker engine and with Docker Swarm.
This is the docker-compose.yml file of Portainer:
portainer:
  labels:
    io.rancher.sidekicks: ui
    io.rancher.container.create_agent: true
    io.rancher.container.agent.role: environment
  image: rancher/portainer-agent:v0.1.0
  volumes:
  - /config
ui:
  image: portainer/portainer:pr572
  command: --no-auth --external-endpoints=/config/config.json --sync-interval=5s -p :80
  volumes_from:
  - portainer
  net: container:portainer
This is its rancher-compose.yml file:
.catalog:
  name: "Portainer"
  version: "1.11.4"
  description: Open-source lightweight management UI for a Docker host or Swarm cluster
  minimum_rancher_version: v1.5.0-rc1
Rancher offers the possibility to start infrastructure stacks. If you go to Stacks->Infrastructure, you can see a catalog
of tools like Bind9, Portainer, Rancher vxlan, etc.
Now we want to scale our Wordpress frontend app in order to handle more traffic. In this case, you should click Stack-
>User, then the name of the Stack (it should be the default one if you followed this tutorial as it is). Now click on
Wordpress and scale your app to 3 containers.
At this step, you will see that Wordpress does not scale the right way.
Scaling Wordpress Using Rancher
Before starting this, we should know that scaling a container that is already mapped to a host port is impossible; we
have just seen this with the failed Wordpress example. To make this work, we should remove the port mapping:
go to the Wordpress service and remove the port mapping using the UI. Now let's create a load balancer service. Go to
Stacks->User->Default and add a load balancer.
We know that the Wordpress containers will run the wordpress service, so we should choose it as a target. Let's also
give the target a port (the default port of a Wordpress service): we are not using port mapping, but the
wordpress service always listens on port 80 inside the container - this is the same thing as in Docker Swarm. Also choose the request
host, port, path, etc.
In our case, the Wordpress Stack is running on a single Rancher host, since we added just one. You can add another
host and the same load balancing mechanism will be reproduced on the two hosts: the wordpress service containers will
be available on both hosts and the load balancer will auto-discover the Wordpress containers on all of the Rancher hosts.
Chapter XV - Docker API
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
Exploring Docker API
The Docker API is the API served by Docker Engine and it gives you full control over Docker. This could be
interesting if you want to build applications that use Docker easily. You can see the Engine API version you are
running by typing sudo docker version |grep -i api
The API may change in each release of Docker, so API calls are versioned to ensure that clients don't break.
In order to interact with this API you should use one of the known SDKs, and this will depend on the language you
are using. Here is a list of the known SDKs for the Docker Engine API:
Language Library
C libdocker
C# Docker.DotNet
C++ lasote/docker_client
Dart bwu_docker
Erlang erldocker
Gradle gradle-docker-plugin
Groovy docker-client
Haskell docker-hs
HTML (Web Components) docker-elements
Java docker-client
Java docker-java
NodeJS dockerode
Perl Eixo::Docker
PHP Docker-PHP
Ruby docker-api
Rust docker-rust
Rust shiplift
Scala tugboat
Scala reactive-docker
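Even without an SDK, you can query the Engine API directly through its Unix socket. A quick check (this assumes curl 7.40+ for the --unix-socket option; depending on your setup you may need sudo):

curl --unix-socket /var/run/docker.sock http://localhost/version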
The following Go code creates a container from the Alpine image that prints "hello world" and then exits:
package main

import (
    "io"
    "os"

    "github.com/moby/moby/client"
    "github.com/moby/moby/api/types"
    "github.com/moby/moby/api/types/container"
    "golang.org/x/net/context"
)

func main() {
    ctx := context.Background()
    cli, err := client.NewEnvClient()
    if err != nil {
        panic(err)
    }

    // pull the image before creating a container from it
    _, err = cli.ImagePull(ctx, "docker.io/library/alpine", types.ImagePullOptions{})
    if err != nil {
        panic(err)
    }

    // create the container with the command to run
    resp, err := cli.ContainerCreate(ctx, &container.Config{
        Image: "alpine",
        Cmd:   []string{"echo", "hello world"},
    }, nil, nil, "")
    if err != nil {
        panic(err)
    }

    // start the container and wait until it exits
    if err := cli.ContainerStart(ctx, resp.ID, types.ContainerStartOptions{}); err != nil {
        panic(err)
    }
    if _, err = cli.ContainerWait(ctx, resp.ID); err != nil {
        panic(err)
    }

    // print the container's stdout
    out, err := cli.ContainerLogs(ctx, resp.ID, types.ContainerLogsOptions{ShowStdout: true})
    if err != nil {
        panic(err)
    }
    io.Copy(os.Stdout, out)
}
To do the same thing with Python:
import docker
client = docker.from_env()
print(client.containers.run("alpine", ["echo", "hello", "world"]))
Docker has official documentation for the API if you want to use it directly.
e.g. to list containers, we can send a GET request to /containers/json ; the response will be an error code (400 or 500) or, if
"everything is 200 ok", a JSON array like this one:
[
{
"Id": "8dfafdbc3a40",
"Names": [
"/boring_feynman"
],
"Image": "ubuntu:latest",
"ImageID": "d74508fb6632491cea586a1fd7d748dfc5274cd6fdfedee309ecdcbc2bf5cb82",
"Command": "echo 1",
"Created": 1367854155,
"State": "Exited",
"Status": "Exit 0",
"Ports": [
{
"PrivatePort": 2222,
"PublicPort": 3333,
"Type": "tcp"
}
],
"Labels": {
"com.example.vendor": "Acme",
"com.example.license": "GPL",
"com.example.version": "1.0"
},
"SizeRw": 12288,
"SizeRootFs": 0,
"HostConfig": {
"NetworkMode": "default"
},
"NetworkSettings": {
"Networks": {
"bridge": {
"NetworkID": "7ea29fc1412292a2d7bba362f9253545fecdfa8ce9a6e37dd10ba8bee7129812",
"EndpointID": "2cdc4edb1ded3631c81f57966563e5c8525b81121bb3706a9a9a3ae102711f3f",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:02"
}
}
},
"Mounts": [
{
"Name": "fac362...80535",
"Source": "/data",
"Destination": "/data",
"Driver": "local",
"Mode": "ro,Z",
"RW": false,
"Propagation": ""
}
]
},
{
"Id": "9cd87474be90",
"Names": [
"/coolName"
],
"Image": "ubuntu:latest",
"ImageID": "d74508fb6632491cea586a1fd7d748dfc5274cd6fdfedee309ecdcbc2bf5cb82",
"Command": "echo 222222",
"Created": 1367854155,
"State": "Exited",
"Status": "Exit 0",
"Ports": [],
"Labels": {},
"SizeRw": 12288,
"SizeRootFs": 0,
"HostConfig": {
"NetworkMode": "default"
},
"NetworkSettings": {
"Networks": {
"bridge": {
"NetworkID": "7ea29fc1412292a2d7bba362f9253545fecdfa8ce9a6e37dd10ba8bee7129812",
"EndpointID": "88eaed7b37b38c2a3f0c4bc796494fdf51b270c2d22656412a2ca5d559a64d7a",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.8",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:08"
}
}
},
"Mounts": []
},
{
"Id": "3176a2479c92",
"Names": [
"/sleepy_dog"
],
"Image": "ubuntu:latest",
"ImageID": "d74508fb6632491cea586a1fd7d748dfc5274cd6fdfedee309ecdcbc2bf5cb82",
"Command": "echo 3333333333333333",
"Created": 1367854154,
"State": "Exited",
"Status": "Exit 0",
"Ports": [],
"Labels": {},
"SizeRw": 12288,
"SizeRootFs": 0,
"HostConfig": {
"NetworkMode": "default"
},
"NetworkSettings": {
"Networks": {
"bridge": {
"NetworkID": "7ea29fc1412292a2d7bba362f9253545fecdfa8ce9a6e37dd10ba8bee7129812",
"EndpointID": "8b27c041c30326d59cd6e6f510d4f8d1d570a228466f956edf7815508f78e30d",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.6",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:06"
}
}
},
"Mounts": []
},
{
"Id": "4cb07b47f9fb",
"Names": [
"/running_cat"
],
"Image": "ubuntu:latest",
"ImageID": "d74508fb6632491cea586a1fd7d748dfc5274cd6fdfedee309ecdcbc2bf5cb82",
"Command": "echo 444444444444444444444444444444444",
"Created": 1367854152,
"State": "Exited",
"Status": "Exit 0",
"Ports": [],
"Labels": {},
"SizeRw": 12288,
"SizeRootFs": 0,
"HostConfig": {
"NetworkMode": "default"
},
"NetworkSettings": {
"Networks": {
"bridge": {
"NetworkID": "7ea29fc1412292a2d7bba362f9253545fecdfa8ce9a6e37dd10ba8bee7129812",
"EndpointID": "d91c7b2f0644403d7ef3095985ea0e2370325cd2332ff3a3225c4247328e66e9",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.5",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:05"
}
}
},
"Mounts": []
}
]
Streaming Containers Logs Using Docker API
Like in all of Painless Docker's chapters, our goal is not to provide documentation about Docker or its API, but
to give practical examples. You can find all of the details about the API in the official documentation. Let's
move on to more practical stuff.
In this part of the book, we are going to create a central logging system that will collect every line of log on a given
server and stream it to a web page. Think of it as a docker logs <all_of_my_containers> .
We are going to use the Docker API with Python Flask.
Flask is a microframework for Python based on Werkzeug and Jinja 2. Python and Flask code is easy to read and
understand; you don't need to be a Python expert to follow these steps.
For my Flask projects, I usually use a code template that you can find on Github. You first need to install these
packages:
git
python3
python3-pip
python-virtualenv
Now create an isolated Python virtual environment: virtualenv -p python3 docker-log-stream .
Go to the created directory: cd docker-log-stream .
Activate the virtual environment: . bin/activate
Download the code template from Github: git clone https://github.com/eon01/flasklate.git code
This is the app.py file that we are going to use in order to start our server on port 5000:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging, traceback, configparser, os
from flask import Flask

app = Flask(__name__)

# start configuration parser
parser = configparser.ConfigParser()
parser.read("app.conf")

# reading variables
logger_level = parser.get('logging', 'logger_level', raw = True)
handler_level = parser.get('logging', 'handler_level', raw = True)
log_format = parser.get('logging', 'log_format', raw = True)
log_file = parser.get('logging', 'log_file')

# set logger logging level
logger = logging.getLogger(__name__)
logger.setLevel(eval(logger_level))

# set handler logging level
handler = logging.FileHandler(log_file)
handler.setLevel(eval(handler_level))

# create a logging format
formatter = logging.Formatter(log_format)
handler.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(handler)

@app.route('/')
def hello_world():
    return 'Hello, World!'

if __name__ == '__main__':
    # Bind to PORT if defined, otherwise default to 5000.
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
This file will read configuration variables from app.conf.
In order to test this, we need to install the requirements (dependencies): pip install -r requirements.txt
Now we can start our local web server using python app.py and, in another terminal, test it using a simple curl:
curl http://0.0.0.0:5000 . If everything is fine, you will get 'Hello, World!' on your screen.
Now we need to install the Docker SDK for Python: just type pip install docker and hit enter. The SDK will be
installed in your local development environment.
In general, we can use the Docker SDK for Python this way:
We create a client with client = docker.from_env() and then we can access a list of methods on the created client:
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'api', 'containers', 'df', 'events', 'from_env', 'images', 'info', 'login', 'networks', 'nodes', 'ping', 'plugins', 'secrets', 'services', 'swarm', 'version', 'volumes']
We are mostly interested in the 'api', 'containers', 'df', 'events', 'from_env', 'images', 'info', 'login', 'networks', 'nodes',
'ping', 'plugins', 'secrets', 'services', 'swarm', 'version' and 'volumes' methods.
As you may see, we can get the list of running containers on our host using containers = client.containers.list() .
In order to go through all of these containers one by one, we can use:

for container in containers:
    # we can execute what we want here for each container

In order to get a container's logs, we can use this:

for container in containers:
    for line in container.logs(stream=True):
        print(line)

We can improve this code by stripping the printed line:

for container in containers:
    for line in container.logs(stream=True):
        print(line.strip())

Since we can access a container's name using container.name , we can enhance our program:

for container in containers:
    for line in container.logs(stream=True):
        log_line = container.name + " : " + line.strip().decode("utf-8")
        print(log_line)
In the original code file, replace the hello_world function with a new one that we will call stream , and add the code
that streams the log lines:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging, traceback, configparser, os
from flask import Flask
import docker

app = Flask(__name__)

# start configuration parser
parser = configparser.ConfigParser()
parser.read("app.conf")

# reading variables
logger_level = parser.get('logging', 'logger_level', raw = True)
handler_level = parser.get('logging', 'handler_level', raw = True)
log_format = parser.get('logging', 'log_format', raw = True)
log_file = parser.get('logging', 'log_file')

# set logger logging level
logger = logging.getLogger(__name__)
logger.setLevel(eval(logger_level))

# set handler logging level
handler = logging.FileHandler(log_file)
handler.setLevel(eval(handler_level))

# create a logging format
formatter = logging.Formatter(log_format)
handler.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(handler)

@app.route('/')
def stream():
    client = docker.from_env()
    containers = client.containers.list()
    for container in containers:
        for line in container.logs(stream=True):
            log_line = container.name + ":" + line.strip().decode("utf-8")
            print(log_line)

if __name__ == '__main__':
    # Bind to PORT if defined, otherwise default to 5000.
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
Now add the Docker SDK to the requirements file: pip freeze > requirements.txt
In order to test this, I used a Wordpress stack that I deployed to my machine using this docker-compose file:
version: '3.1'
services:
  wordpress:
    image: wordpress
    ports:
      - 8080:80
    environment:
      WORDPRESS_DB_PASSWORD: passw0rd
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: passw0rd
You can start Wordpress with docker-compose -f docker-compose.yml up , then start refreshing the index page at
http://0.0.0.0:8080 . At the same time, after running the Python program using a simple python app.py command,
send a GET request to it: curl http://0.0.0.0:5000/ .
Chapter XVI - Docker Security
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
Possible Threats
Docker is neither more nor less secure than VMs, and the VM vs Docker security discussions are answers to the
wrong question. Usually, security threats are either PEBCAK (Problem Exists Between Chair And Keyboard) problems or
problems that existed before.
Discussing security best practices and checklists is more constructive and productive.
In this chapter, we are going to see the known security threats that you should be aware of when using
Docker and introduce good security practices. I am not a security researcher, so all of the following parts are based
on my opinions, some research and experience.
Kernel Panic & Exploits
As we have seen in the first chapters of this book, one of the differences between Docker and VMs is that a container
shares the kernel with the host system and with the other running containers. If a container causes a kernel panic,
both the host and the other containers could be taken down. This "direct access" to the kernel, which is at the same time
one of the strengths of Docker, could cause serious damage.
Container Breakouts & Privilege Escalation
If you start container X with user Y, container X will have the same privileges on the host system as user
Y. This is harmful when a process breaks out of the container: in this case, if you were root in the container, you will be
root on the host.
A container breakout can cause unauthorized access across containers, hosts and even your data centers.
Poisoned Images
It is possible to download and use a Docker image that runs malware (e.g. scanning the network for sensitive data,
downloading malware from a distant host, executing harmful actions, etc.). An attacker can also get access to your
data if you are using their poisoned image.
Denial-of-service Attacks
As you know, containers share the same kernel with each other and with the host; in this case, any container
monopolizing the kernel resources will starve the other containers. Poisoned containers can eat up CPU time,
memory, disk IO, etc., and this could crash other containers and even the host system.
Compromising secrets
A Docker container could contain or send sensitive data. In transit and at rest, this secret data could be viewed
and stolen by an attacker. Microservices architectures have the same security threat, but there are good
solutions for it.
Application Level Threats
A container is primarily made to build, ship and run an application, and even if the host and the container are
secured, the application layer could open some doors to attackers, e.g.:
Flooding the network
Application level DDOS
XSS vulnerabilities and SQL injection
DNS hijacking
Host System Level Threats
If the host system is not up to date, you may be exposed to vulnerabilities like Heartbleed, Shellshock (Bashdoor),
glibc flaws, etc.
Security Best Practices
Security By Design
As for any other security threat, a system that has been designed from the ground up to be secure is always the first
and best practice. The application layer, the containerization layer, the host system, the software and the cloud
architecture are all elements of the same running stack, and each of them could be the weak link. A system that is secure
by design will limit the damage when a container breakout occurs.
SetUID/SetGID
setuid/setgid binaries run with the privileges of their owner and can sometimes be compromised by an attacker to gain
elevated privileges. To remove the setuid/setgid bits, add this line to the Dockerfile:
RUN find / -perm +6000 -type f -exec chmod a-s {} \; || true
Without the || true , we can have errors like:
find: `/proc/18/task/18/fd/5': No such file or directory
find: `/proc/18/task/18/fdinfo/5': No such file or directory
find: `/proc/18/fd/5': No such file or directory
find: `/proc/18/fdinfo/5': No such file or directory
This prevents some privilege escalation threats; it will be applied to files like the ones below (see the verification sketch after this list):
/sbin/unix_chkpwd
/usr/bin/chage
/usr/bin/passwd
/usr/bin/mail-touchlock
/usr/bin/mail-unlock
/usr/bin/gpasswd
/usr/bin/crontab
/usr/bin/chfn
/usr/bin/newgrp
/usr/bin/sudo
/usr/bin/wall
/usr/bin/mail-lock
/usr/bin/expiry
/usr/bin/dotlockfile
/usr/bin/chsh
/usr/lib/eject/dmcrypt-get-device
/usr/lib/pt_chown
/bin/ping6
/bin/su
/bin/ping
/bin/umount
/bin/mount
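To double-check the result, you can run find inside a container built from your image. A quick sketch (your-image is a placeholder; also note that recent versions of GNU find expect -perm /6000 where older versions used -perm +6000):

docker run --rm your-image find / -perm /6000 -type f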
Controlling CPU Usage
An attacker that gains access to a container could mount DoS attacks on the CPU and starve the other
containers. By default, containers get an equal share of CPU cycles. This can be modified by changing the
container's CPU period and quota values.
docker run -d --cpu-period 50000 --cpu-quota 5000 ubuntu:16.04
The --cpu-period and --cpu-quota flags limit the container's CPU usage. The CFS (Completely Fair Scheduler), the
default Linux scheduler used by the kernel, handles resource allocation for executing processes; its default period is
100ms (100000 microseconds).
The default --cpu-quota value is 0, which lets the container take 100% of one CPU. When --cpu-quota is set to 50000
with the default period, the container is limited to 50% of a CPU. In the example above, the quota (5000) is 10% of
the period (50000), so that container is limited to 10% of a CPU.
Adjusting --cpu-period and --cpu-quota gives the container administrator control over a container's CPU usage,
including in a multi-CPU context.
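For example, to cap a container at half of one CPU using the default-length period (a sketch; the trailing sleep just keeps the container alive for the demonstration):

docker run -d --cpu-period 100000 --cpu-quota 50000 ubuntu:16.04 sleep 3600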
Controlling Memory Usage
With the default settings, a container can use 100% of the memory on the host. Setting a limit per container is a
security best practice in order to limit risks in case of a container breakout.
Limiting the maximum amount of memory a container can use is done using the -m flag, e.g. docker run -m 512m
ubuntu:16.04
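If you also want to prevent the container from falling back to swap once the limit is reached, you can set --memory-swap to the same value, since it caps the total of memory plus swap (a sketch):

docker run -d -m 512m --memory-swap 512m ubuntu:16.04 sleep 3600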
Verifying Images
Be careful when downloading 3rd party images. Only use images from automated builds with linked source code, or
use official images and pull them by digest to take advantage of checksum validation.
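For example, you can read an image's digest and then pull by it (<digest> below is a placeholder for the sha256 value printed by the first command):

docker images --digests ubuntu
docker pull ubuntu@<digest>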
Set Container Filesystem to Read Only
Unless you need to modify files in your container, make the filesystem read only.
docker run --read-only ubuntu:16.04 touch test
If a hacker breaks into a container, the first thing they want to do is write their exploit into the application so that
it is started again with the application.
A read-only container prevents anyone from permanently leaving an exploit behind: the exploit will no longer exist
once the application restarts.
Set A User
Don't run your application as root in containers. Users in Docker are not namespaced.
RUN groupadd -r groupname && useradd -r -g groupname username
USER username
Do Not Use Environment Variables To Share Secrets
Sensitive data should not be shared using the ENV instruction; otherwise it could be exposed to child processes,
other linked containers, Docker inspection output, etc.
Use Orchestrators' Secret Managers
Kubernetes and Docker Swarm offer their own secret management tools. If you want to store sensitive data or send it
from one service to another, it is recommended to use these tools (e.g. Docker Secrets for Swarm mode), as sketched below.
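A minimal sketch with Docker Secrets on a Swarm manager (the secret name, value and service are illustrative):

echo "s3cr3t" | docker secret create db_password -
docker service create --name web --secret db_password nginx
# inside the service's containers, the secret is mounted at /run/secrets/db_password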
Do Not Run Containers In The Privileged Mode
When you run a container using the --privileged flag, it gives all capabilities to the container, like accessing all
devices on the host, as well as changing some configurations in AppArmor or SELinux. The container will have almost
the same access to the host as a normal process running outside a container.
Turn Off Inter-Container Communication
If you don't need inter-container communication on the same host, run the Docker daemon with --icc=false --iptables=true
in order to only allow communication between containers that are explicitly linked together. If these flags are not set,
unrestricted network traffic is allowed between all containers on the same host. Note that --icc and --iptables are
daemon flags, not docker run flags:
dockerd --icc=false --iptables=true
Set Volumes To Read-Only
If you don't need to modify files in attached volumes make them read-only.
docker run -v /folder:/folder:ro alpine
Only Install Necessary Packages
Inside the container, install only what you need; don't install unnecessary packages, e.g. ssh, cron, man-db, etc. In
order to see what packages are installed in a container, depending on its package manager, run docker exec
<container_id> rpm -qa , docker exec <container_id> dpkg -l , etc.
Make Sure Docker Is Up To Date
Docker has an active community and security updates are frequent. It is recommended from a security point of view
to always have the latest version in your production environments.
Use Vulnerability Analysis Scanners
The best known ones are the official Docker Cloud security scanner and Clair by CoreOS.
Properly Configure Your Docker Registry Access Control
Vine's Docker images were hacked because their private registry was publicly accessible at docker.vineapp.com .
Security Through Obscurity
If docker.vineapp.com had been something like 1xoajze313kjaz.vineapp.com , the hacker would have had a much harder time discovering the private registry. Obscurity is no substitute for real access control, but it adds a layer of friction.
Secure And Control Your Code
Even if your hosts and containers are secure, the problem could come from the containerized application itself: e.g. you
are using PHP with remote file inclusion/execution enabled, or running system commands
from code is allowed, etc.
Use Limited Linux Capabilities
By limiting the Linux capabilities of a container, even if a hacker gets into the container, the host system
remains protected.
By default, Docker starts containers with a restricted set of capabilities, and you can restrict them further, as shown below.
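A classic demonstration (a sketch: busybox's ping in the alpine image needs the NET_RAW capability, so it fails when everything is dropped and works again once only that capability is added back):

docker run --rm --cap-drop ALL alpine ping -c 1 127.0.0.1
docker run --rm --cap-drop ALL --cap-add NET_RAW alpine ping -c 1 127.0.0.1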
Use Seccomp
By default, a container has around 44 disabled system calls (out of 300+). The roughly 270 calls that are still open may be
susceptible to attacks. For a higher level of security, you can set Seccomp profiles individually for containers, but be
sure to understand each syscall you allow or block.
docker run --rm -it --security-opt seccomp=default.json hello-world
Chapter XVII - Docker, Containerd & Standalone
Runtimes Architecture
o ^__^
o (oo)\_______
(__)\ )\/\
||----w |
|| ||
The architecture of Docker has evolved many times since its creation. This chapter aims to explain what you
(developers, ops engineers and architects) should know about the integration of containerd into the Docker architecture. Let’s
start by defining the Docker daemon, and then see how it fits into the new Docker architecture with containerd.
Docker Daemon
Just like init has its daemon, cron has crond and dhcp has dhcpd, Docker has its own daemon: dockerd. To list the Docker
daemons, list all Linux daemons:
ps -U0 -o 'tty,pid,comm' | grep ^?
And grep for Docker in the output:
ps -U0 -o 'tty,pid,comm' | grep ^? | grep -i dockerd
? 2779 dockerd
Notice that you may also see docker-containerd-shim; we are going to look at this in detail later in this chapter. If you are
already running Docker, when you type dockerd you will get an error message similar to this:
Now let’s stop Docker:
service docker stop
and run its daemon directly using the dockerd command.
Running the daemon directly with dockerd is a good debugging tool: as you may see, you will have the
running traces right on your terminal screen:
INFO[0000] libcontainerd: new containerd process, pid: 19717
WARN[0000] containerd: low RLIMIT_NOFILE changing to max current=1024 max=4096
INFO[0001] [graphdriver] using prior storage driver "aufs"
INFO[0003] Graph migration to content-addressability took 0.63 seconds
WARN[0003] Your kernel does not support swap memory limit.
WARN[0003] mountpoint for pids not found
INFO[0003] Loading containers: start.
INFO[0003] Firewalld running: false
INFO[0004] Removing stale sandbox ingress_sbox (ingress-sbox)
INFO[0004] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[0004] Loading containers: done.
INFO[0004] Listening for local connections addr=/var/lib/docker/swarm/control.sock proto=unix
INFO[0004] Listening for connections addr=[::]:2377 proto=tcp
INFO[0004] 61c88d41fce85c57 became follower at term 12
INFO[0004] newRaft 61c88d41fce85c57 [peers: [], term: 12, commit: 290, applied: 0, lastindex: 290, lastterm: 12]
INFO[0004] 61c88d41fce85c57 is starting a new election at term 12
INFO[0004] 61c88d41fce85c57 became candidate at term 13
INFO[0004] 61c88d41fce85c57 received vote from 61c88d41fce85c57 at term 13
INFO[0004] 61c88d41fce85c57 became leader at term 13
INFO[0004] raft.node: 61c88d41fce85c57 elected leader 61c88d41fce85c57 at term 13
INFO[0004] Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=192.168.0.47 Adv-addr=192.168.0.47 Remote-addr =
INFO[0004] Daemon has completed initialization
INFO[0004] Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=192.168.0.47 Adv-addr=192.168.0.47 Remote-addr =
INFO[0004] Docker daemon commit=7392c3b graphdriver=aufs version=1.12.5
INFO[0004] Gossip cluster hostname eonSpider-3e64aecb2dd5
INFO[0004] API listen on /var/run/docker.sock
INFO[0004] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]
INFO[0004] IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
Now, if you create or remove containers for example, you will see new traces appearing: the Docker daemon is what connects your Docker client to your containers.
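You can observe this client/daemon separation directly: the daemon listens on a Unix socket by default, and any HTTP client can talk to it. Here is a sketch using curl (assuming a curl version with --unix-socket support, i.e. 7.40+, and the default socket path; this endpoint belongs to the Docker remote API covered in Chapter XV):
curl --unix-socket /var/run/docker.sock http://localhost/containers/json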
This is the global architecture of Docker:
Containerd
Containerd is one of the most recent projects in the Docker ecosystem; its purpose is to bring more modularity to the Docker architecture and more neutrality vis-à-vis the other industry actors (cloud providers and other orchestration services).
According to Solomon Hykes, containerd has been deployed on millions of machines since April 2016, when it was included in Docker 1.11. The announced roadmap to extend containerd gets its input from cloud providers and actors like Alibaba Cloud, AWS, Google, IBM, Microsoft, and other active members of the container ecosystem.
More Docker engine functionality will be added to containerd so that containerd 1.0 will provide all the core
primitives you need to manage containers with parity on Linux and Windows hosts:
Container execution and supervision
Image distribution
Network Interfaces Management
Local storage
Native plumbing level API
Full OCI support, including the extended OCI image specification
To build, ship and run containerized applications, you may continue to use Docker, but if you are looking for specialized components you could consider Containerd.
Docker engine 1.11 was the first release built on runC (a runtime based on Open Container Initiative technology) and containerd.
Formed in June 2015, the Open Container Initiative (OCI) aims to establish common standards for software containers in order to avoid potential fragmentation and divisions inside the container ecosystem.
It contains two specifications:
runtime-spec: The runtime specification
image-spec: The image specification
The runtime specification outlines how to run a filesystem bundle that is unpacked on disk:
A standardized container bundle should contain the needed information and configurations to load and run a
container in a config.json file residing in the root of the bundle directory.
A standardized container bundle should contain a directory representing the root filesystem of the container.
Generally this directory has a conventional name like rootfs.
You can build such a bundle by exporting and extracting an image. In the following example, we are going to use the busybox image.
mkdir my_container
cd my_container
mkdir rootfs
docker export $(docker create busybox) | tar -C rootfs -xvf -
Now we have an extracted busybox image inside the rootfs directory. From the parent directory, you can check the layout:
tree -d my_container/
my_container/
└── rootfs
    ├── bin
    ├── dev
    │   ├── pts
    │   └── shm
    ├── etc
    ├── home
    ├── proc
    ├── root
    ├── sys
    ├── tmp
    ├── usr
    │   └── sbin
    └── var
        ├── spool
        │   └── mail
        └── www
We can generate the config.json file:
docker-runc spec
This is the generated configuration file (config.json):
{
    "ociVersion": "1.0.0-rc2-dev",
    "platform": {
        "os": "linux",
        "arch": "amd64"
    },
    "process": {
        "terminal": true,
        "consoleSize": {
            "height": 0,
            "width": 0
        },
        "user": {
            "uid": 0,
            "gid": 0
        },
        "args": [
            "sh"
        ],
        "env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "TERM=xterm"
        ],
        "cwd": "/",
        "capabilities": [
            "CAP_AUDIT_WRITE",
            "CAP_KILL",
            "CAP_NET_BIND_SERVICE"
        ],
        "rlimits": [
            {
                "type": "RLIMIT_NOFILE",
                "hard": 1024,
                "soft": 1024
            }
        ],
        "noNewPrivileges": true
    },
    "root": {
        "path": "rootfs",
        "readonly": true
    },
    "hostname": "runc",
    "mounts": [
        {
            "destination": "/proc",
            "type": "proc",
            "source": "proc"
        },
        {
            "destination": "/dev",
            "type": "tmpfs",
            "source": "tmpfs",
            "options": [
                "nosuid",
                "strictatime",
                "mode=755",
                "size=65536k"
            ]
        },
        {
            "destination": "/dev/pts",
            "type": "devpts",
            "source": "devpts",
            "options": [
                "nosuid",
                "noexec",
                "newinstance",
                "ptmxmode=0666",
                "mode=0620",
                "gid=5"
            ]
        },
        {
            "destination": "/dev/shm",
            "type": "tmpfs",
            "source": "shm",
            "options": [
                "nosuid",
                "noexec",
                "nodev",
                "mode=1777",
                "size=65536k"
            ]
        },
        {
            "destination": "/dev/mqueue",
            "type": "mqueue",
            "source": "mqueue",
            "options": [
                "nosuid",
                "noexec",
                "nodev"
            ]
        },
        {
            "destination": "/sys",
            "type": "sysfs",
            "source": "sysfs",
            "options": [
                "nosuid",
                "noexec",
                "nodev",
                "ro"
            ]
        },
        {
            "destination": "/sys/fs/cgroup",
            "type": "cgroup",
            "source": "cgroup",
            "options": [
                "nosuid",
                "noexec",
                "nodev",
                "relatime",
                "ro"
            ]
        }
    ],
    "hooks": {},
    "linux": {
        "resources": {
            "devices": [
                {
                    "allow": false,
                    "access": "rwm"
                }
            ]
        },
        "namespaces": [
            {
                "type": "pid"
            },
            {
                "type": "network"
            },
            {
                "type": "ipc"
            },
            {
                "type": "uts"
            },
            {
                "type": "mount"
            }
        ],
        "maskedPaths": [
            "/proc/kcore",
            "/proc/latency_stats",
            "/proc/timer_list",
            "/proc/timer_stats",
            "/proc/sched_debug",
            "/sys/firmware"
        ],
        "readonlyPaths": [
            "/proc/asound",
            "/proc/bus",
            "/proc/fs",
            "/proc/irq",
            "/proc/sys",
            "/proc/sysrq-trigger"
        ]
    }
}
Now you can edit any of the configuration entries listed above and run a container again, without even using Docker, just with runC:
runc run container-name
Note that you should install runC first in order to use it. For Ubuntu 16.04, you can just type this command:
sudo apt install runc
You could also install it from source:
mkdir -p ~/golang/src/github.com/opencontainers/
cd ~/golang/src/github.com/opencontainers/
git clone https://github.com/opencontainers/runc
cd ./runc
make
sudo make install
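Once runC is installed, here is a sketch of running and listing a container using the bundle we created earlier (the container name mybusybox is an arbitrary choice):
cd my_container
sudo runc run mybusybox
# from another terminal, list the containers known to runC
sudo runc list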
runC, a standalone container runtime, implements the full OCI spec: it allows you to spin up containers, interact with them, and manage their lifecycle. This is also why containers built with one engine (like Docker) can run on another engine. Containers are started as a child process of runC and can be embedded into various other systems without having to run a daemon (such as the Docker daemon).
runC is built on libcontainer, the same container library powering a Docker engine installation. Prior to version 1.11, the Docker engine alone was used to manage volumes, networks, containers, images, etc. Now the Docker architecture is broken into four components: Docker engine, containerd, containerd-shim and runC. The binaries are respectively called docker, docker-containerd, docker-containerd-shim, and docker-runc.
To run a container, the Docker engine prepares the image and passes it to containerd. containerd calls containerd-shim, which uses runC to run the container. containerd-shim then allows the runtime (runC in this case) to exit once it has started the container: this way we can run "daemon-less" containers, because we do not have to keep a long-running runtime process for each container.
Currently, the creation of a container is handled by runC (via containerd), but it is possible to use another binary (instead of runC) that exposes the same command-line interface as runC and accepts an OCI bundle.
You can see the different runtimes that you have on your host by typing:
docker info | grep -i runtime
Since I am using the default runtime, this is what I should get as an output:
Runtimes: runc
Default Runtime: runc
To add another runtime, you can use the following command:
docker daemon --add-runtime "<runtime-name>=<runtime-path>"
Example:
docker daemon --add-runtime "oci=/usr/local/sbin/runc"
There is one containerd-shim per container; it manages the stdio FIFOs and keeps them open for the container in case containerd or Docker dies. It is also in charge of reporting the container's exit status to a higher-level component like Docker.
Container runtime, lifecycle support and execution (create, start, stop, pause, resume, exec, signal & delete) are some of the features implemented in Containerd. Others are managed by other components of Docker (volumes, logging, etc.). There is a table in the Containerd GitHub repository that lists the different features and tells whether they are in or out of scope.
If we run a container:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -v /data/lists:/var/lib/mysql -d mariadb
Unable to find image 'mariadb:latest' locally
latest: Pulling from library/mariadb
75a822cd7888: Pull complete
b8d5846e536a: Pull complete
b75e9152a170: Pull complete
832e6b030496: Pull complete
034e06b5514d: Pull complete
374292b6cca5: Pull complete
d2a2cf5c3400: Pull complete
f75e0958527b: Pull complete
1826247c7258: Pull complete
68b5724d9fdd: Pull complete
d56c5e7c652e: Pull complete
b5d709749ac4: Pull complete
Digest: sha256:0ce9f13b5c5d235397252570acd0286a0a03472a22b7f0384fce09e65c680d13
Status: Downloaded newer image for mariadb:latest
db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
Then, if you type ps aux, you will notice the docker-containerd-shim process associated with this container, running with the following parameters and arguments:
db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
/var/run/docker/libcontainerd/db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
and the runC binary (docker-runc).
This is the full line with the right format:
docker-containerd-shim <container_id> /var/run/docker/libcontainerd/<container_id> docker-runc
db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843 is the ID of the container, printed at the end of the container creation.
ls -l /var/run/docker/libcontainerd/db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843
total 4
-rw-r--r-- 1 root root 3653 Dec 27 22:21 config.json
prwx------ 1 root root 0 Dec 27 22:21 init-stderr
prwx------ 1 root root 0 Dec 27 22:21 init-stdin
prwx------ 1 root root 0 Dec 27 22:21 init-stdout
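The config.json in this directory is the container's runtime configuration, close to the OCI file we generated earlier with docker-runc spec. You can pretty-print it, for instance with Python's json.tool module (assuming Python is installed; replace the ID with your own container's ID):
sudo cat /var/run/docker/libcontainerd/db5218c494190c11a2fcc9627ea1371935d7021e86b5f652221bdac1cf182843/config.json | python -m json.tool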
Final Words
Docker is really a powerful tool, not only because it is changing the IT industry but also because it is creating a new landscape. The future of cloud computing, serverless computing, distributed systems, IoT and production systems will be built on this technology and its ecosystem, so it is good that you went through this course and learned the important things you will need for your expertise.
I would like to thank everybody who encouraged me to start working on this, from my family to my friends and of course my readers. Thanks a lot!
Don't forget to join my newsletters DevOpsLinks and Shipped and the community job board JobsForDevOps. You can follow me on Twitter for future updates.
I hope you have been enjoying this course.
Aymen El Amri.