On a single host, containers can see each other, reach the external world (unless they run in isolated
networks), and receive traffic from external networks.
Multi Host Networking
An application can be run as many containers behind a load balancer, and these containers may be spread across
multiple hosts. Networking in multi-host environments is quite different from single-host environments.
Containers spread across servers should still be able to communicate with each other, and this is where service
discovery plays an important role in networking: service discovery allows you to request a container by name and
get its IP address back.
Docker ships with a network driver called overlay that we are going to see later in this chapter.
If you would like to operate in the multi host mode, you have two options:
Using Docker engine in swarm mode
Running a cluster of hosts using a key/value store (like ZooKeeper, etcd or Consul) that provides service
discovery
Docker Networks
Docker Default Networks
Networking is one of the most important parts of building Docker clusters and microservices.
The first thing to know about networking in Docker is how to list the different networks that the Docker Engine uses:
docker network ls
You should get a list of your Docker networks, or at least the pre-defined ones:
NETWORK ID          NAME                DRIVER              SCOPE
e5ff619d25f5        bridge              bridge              local
98d44bf13233        docker_gwbridge     bridge              local
608a539ce1e3        host                host                local
8sm2anzzfa0i        ingress             overlay             swarm
ede46dbb22d7        none                null                local
none, host and bridge are the names of the default networks that you will find in any Docker installation running on a
single host. Activating swarm mode creates another default (predefined) network called ingress.
Clearly, these networks are not physical networks; they are virtual networks that abstract the hardware. They are
built and managed at a higher level than the physical layer, which is one of the properties of software-defined
networking.
None Network
The network called none, using the null driver, is a predefined network that isolates a container so that it can neither
connect to the outside world nor communicate with other containers on the same host.
NETWORK ID          NAME                DRIVER              SCOPE
ede46dbb22d7        none                null                local
Let's verify this by running a busybox container in this network:
docker run --net=none -it -d --name my_container busybox
If you inspect the created container with docker inspect my_container , you can see its network configuration. You
will notice that it is attached to the none network with the id
ede46dbb22d7f3ab2dc95b11228de06e2d27e240a3f651bc2f6fd3ea0c4a2ca7 and that the container does not have any
gateway.
"NetworkSettings": {
"Bridge": "",
"SandboxID": "566b9a74d37c7f47e02d769b79e168df437a5b23ee030fc199d99f7d94b353b7",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/566b9a74d37c",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"none": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "ede46dbb22d7f3ab2dc95b11228de06e2d27e240a3f651bc2f6fd3ea0c4a2ca7",
"EndpointID": "b42debd75af122d113c202ad373d46e0b08d32e9ef6e9361e49515045ae6288d",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": ""
}
}
}
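If you only need the network-related fields, note that docker inspect supports Go templates through its --format
flag; a minimal sketch using the container created above:

docker inspect --format '{{json .NetworkSettings.Networks}}' my_container
# prints the Networks map as JSON, e.g. {"none":{...}}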
Since my_container is attached to the none network, it has no external or internal connectivity. Let's log into the
container and check the network configuration:
docker exec -it my_container sh
Once you are logged inside the container, type ifconfig and you will notice that there is no interface other than the
loopback interface.
/ # ifconfig
lo
Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
If you ping or traceroute an external IP or domain name, you will not be able to do it:
/ # traceroute painlessdocker.com
traceroute: bad address 'painlessdocker.com'
/ # ping painlessdocker.com
ping: bad address 'painlessdocker.com'
Doing the same thing with the loopback address 127.0.0.1 will work:
/ # ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.085 ms
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.085/0.085/0.085 ms
These tests confirm that a container attached to the none network knows nothing about outside networks and that
no external host can access my_container.
If you want to see the full configuration of this network, type docker network inspect none :
[
    {
        "Name": "none",
        "Id": "ede46dbb22d7f3ab2dc95b11228de06e2d27e240a3f651bc2f6fd3ea0c4a2ca7",
        "Scope": "local",
        "Driver": "null",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": []
        },
        "Internal": false,
        "Containers": {
            "69806d9de46c2959d4d20f99660e7d58d7c35c1e0b33511f0b85a395b696786f": {
                "Name": "my_container",
                "EndpointID": "b42debd75af122d113c202ad373d46e0b08d32e9ef6e9361e49515045ae6288d",
                "MacAddress": "",
                "IPv4Address": "",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]
Docker Host Network
If you want a container to run with a similar networking configuration to the host machine, then you should use the
host network.
NETWORK ID          NAME                DRIVER              SCOPE
608a539ce1e3        host                host                local

To see the configuration of this network, type docker network inspect host :
[
    {
        "Name": "host",
        "Id": "608a539ce1e3e3b97964c6a2fe06eb0e0a9b539e659025fbd101b24e327d8da6",
        "Scope": "local",
        "Driver": "host",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": []
        },
        "Internal": false,
        "Containers": {},
        "Options": {},
        "Labels": {}
    }
]
Let's run a web server using nginx and use this network:
docker run -d -p 80 nginx
Verify the output of docker ps :

CONTAINER ID   IMAGE     COMMAND                  PORTS                            NAMES
d0df14bf80a0   nginx     "nginx -g 'daemon off"   443/tcp, 0.0.0.0:32769->80/tcp   stoic_montalcini
69806d9de46c   busybox   "sh"                                                      my_container
Notice that we published only port 80, since we added the -p flag ( -p 80 ), which made port 80 of the container
accessible from the host on port 32769 (chosen automatically by Docker). We could also have chosen the external
binding manually, say port 8080; in this case the command to run this container would be:
docker run -d -p 8080:80 nginx
In both cases, port 80 of the container is accessible from the host, either on port 32769 or on port 8080 (in the second
case), and the IP address is the same as the host IP ( 127.0.0.1 or 0.0.0.0 ).
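If you just want to know which host port Docker assigned, docker port prints the mappings of a container; a quick
check (stoic_montalcini is the auto-generated name shown in the docker ps output above):

docker port stoic_montalcini 80
# 0.0.0.0:32769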
Let's verify it by running curl -I http://0.0.0.0:32769 to see whether the server responds with 200 or not:
HTTP/1.1 200 OK
Connection: Keep-Alive
Keep-Alive: timeout=20
Date: Sat, 07 Jan 2017 20:49:02 GMT
Content-Type: text/html
So when running curl -I http://0.0.0.0:32769 , nginx replies with a 200 response. You can fetch the content of the
page with curl http://0.0.0.0:32769 :

Index of /
Bridge Network
This is the default network for containers: any container started without the --net flag is automatically attached
to this network. Two Docker containers running in this network can see each other.
To create two containers, type:
docker run -it -d --name my_container_1 busybox
docker run -it -d --name my_container_2 busybox
These containers are attached to the bridge network. If you type docker network inspect bridge , you will notice
that they are listed there along with the IP addresses they have:
[
    {
        "Name": "bridge",
        "Id": "e5ff619d25f5dfa2e9b4fe95db8136b74fa61b588fb6141b7d9678adafd155a7",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {
            "1172afcb3363f36248701aaa0ba9c1080ebc94db6a168f188f6ba98907e22102": {
                "Name": "my_container_1",
                "EndpointID": "b8f4741fb2008b70b60a0375446653f820fcaf6b1d8279c1b7d0abbb5775aeaf",
                "MacAddress": "02:42:ac:11:00:04",
                "IPv4Address": "172.17.0.4/16",
                "IPv6Address": ""
            },
            "6895aa358faea0226ba646544056c34063a0ef5b83d10e68500936d0a397bb7b": {
                "Name": "my_container_2",
                "EndpointID": "2d3c6c8ca175c0ecb35459dcd941c0456fbcbf8fcce4885aa48eb06b9cff19b8",
                "MacAddress": "02:42:ac:11:00:05",
                "IPv4Address": "172.17.0.5/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
my_container_1 has the IP address 172.17.0.4/16
my_container_2 has the IP address 172.17.0.5/16
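Instead of reading the whole JSON, you can extract just the name-to-IP mapping with a Go template; a minimal
sketch:

docker network inspect bridge --format '{{range .Containers}}{{.Name}} -> {{.IPv4Address}}{{println}}{{end}}'
# my_container_1 -> 172.17.0.4/16
# my_container_2 -> 172.17.0.5/16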
From any of the created containers, say my_container_1, you can reach the other container. Just type
docker exec -it my_container_1 ping 172.17.0.5 to ping my_container_2 from my_container_1:

docker exec -it my_container_1 ping 172.17.0.5
PING 172.17.0.5 (172.17.0.5): 56 data bytes
64 bytes from 172.17.0.5: seq=0 ttl=64 time=0.156 ms
64 bytes from 172.17.0.5: seq=1 ttl=64 time=0.071 ms
Containers running in the bridge network can see each other by IP address. Let's see if it is possible to ping a
container using its name:
docker exec -it my_container_1 ping my_container_2
ping: bad address 'my_container_2'
As you can see, Docker cannot associate the container name with an IP address here, because no discovery service
runs on the default bridge network. Creating a user-defined network solves this problem.
docker_gwbridge Network
When you create or join a swarm cluster, Docker creates by default a network called docker_gwbridge, which is
used to connect containers across the different hosts that are part of the swarm cluster.
In general, this network provides external connectivity to containers that would otherwise have no access to outside
networks.
docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
98d44bf13233        docker_gwbridge     bridge              local
Running overlay networks always requires the docker_gwbridge network.
Software Defined & Multi Host Networks
This type of network, in contrast to the default networks, does not come with a fresh Docker installation but
must be created by the user. The simplest way to create a new network is:
docker network create my_network
For more general use, the command takes options and a network name:

docker network create [OPTIONS] NETWORK
Bridge Networks
You can use one of several network drivers, like bridge or overlay; you may also need to set up an IP range or a subnet
for your network, or your own gateway. Type docker network create --help for more options and
configurations:
--aux-address value      Auxiliary IPv4 or IPv6 addresses used by Network driver (default map[])
-d or --driver string    Driver to manage the Network (default "bridge")
--gateway value          IPv4 or IPv6 Gateway for the master subnet (default [])
--help                   Print usage
--internal               Restrict external access to the network
--ip-range value         Allocate container ip from a sub-range (default [])
--ipam-driver string     IP Address Management Driver (default "default")
--ipam-opt value         Set IPAM driver specific options (default map[])
--ipv6                   Enable IPv6 networking
--label value            Set metadata on a network (default [])
-o or --opt value        Set driver specific options (default map[])
--subnet value           Subnet in CIDR format that represents a network segment (default [])
In order to use the Docker service discovery, let's create a second bridge network:
docker network create --driver bridge my_bridge_network
To see the new network, type:
docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
5555cd178f99        my_bridge_network   bridge              local
my_container_1 and my_container_2 are running in the default bridge network. To attach them to the new
network, type:
docker network connect my_bridge_network my_container_1
docker network connect my_bridge_network my_container_2
Now service discovery is working, and each Docker container can access the other one by its name.
docker exec -it my_container_1 ping my_container_2
PING my_container_2 (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: seq=0 ttl=64 time=0.120 ms
64 bytes from 172.18.0.3: seq=1 ttl=64 time=0.081 ms
Docker containers running in a user-defined bridge network can see each other by container name.
Let's create a more personalized network with a specific subnet, gateway and IP range, and let's also change the
networking behavior of this network by decreasing the MTU (Maximum Transmission Unit), the size of the largest
network-layer protocol data unit that can be communicated in a single network transaction:
docker network create -d bridge \
--subnet=192.168.0.0/16 \
--gateway=192.168.0.100 \
--ip-range=192.168.1.0/24 \
--opt "com.docker.network.driver.mtu"="1000"
my_personalized_network
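To check that these IPAM settings were applied, you can start a throwaway container on this network and look at
the address it receives, which should fall inside the 192.168.1.0/24 IP range (my_container_3 is a hypothetical
name used only for this test):

docker run -it -d --net my_personalized_network --name my_container_3 busybox
docker exec -it my_container_3 ifconfig eth0
# the inet addr should be inside 192.168.1.0/24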
You can change other options using the --opt or -o flag, like:

com.docker.network.bridge.name : bridge name to be used when creating the Linux bridge
com.docker.network.bridge.enable_ip_masquerade : Enable IP masquerading
com.docker.network.bridge.enable_icc : Enable or disable Inter-Container Connectivity
com.docker.network.bridge.host_binding_ipv4 : Default IP when binding container ports
For any bridge network a container is connected to, an interface is created in the container; just type ifconfig
inside the container to see them.
Let's connect my_container_1 to my_personalized_network:
docker network connect my_personalized_network my_container_1
Executing ifconfig inside this container ( docker exec -it my_container_1 ifconfig ) will show us two things:

This container has more than one network interface, because it was connected to my_bridge_network and we have
now connected it to my_personalized_network.
The MTU of the new interface is 1000.
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:04
          inet addr:172.17.0.4  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:acff:fe11:4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:194 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:34549 (33.7 KiB)  TX bytes:1410 (1.3 KiB)

eth1      Link encap:Ethernet  HWaddr 02:42:C0:A8:01:00
          inet addr:192.168.1.0  Bcast:0.0.0.0  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1000  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:87 (87.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:16 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1240 (1.2 KiB)  TX bytes:1240 (1.2 KiB)
docker run supports only one network, but docker network connect can be used after the container's creation to
add it to many networks.
By default, two containers living on the same host but in different networks will not see each other. The Docker
daemon runs a tiny DNS server that provides service discovery for user-defined networks.
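You can see this embedded DNS server at work from inside any container attached to a user-defined network; for
example, with the two containers connected to my_bridge_network above:

docker exec -it my_container_1 nslookup my_container_2
# the name is resolved by Docker's internal DNS server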
docker_gwbridge Network
You can create new docker_gwbridge networks using the docker network create command. Example:
docker network create --subnet 172.3.0.0/16 \
--opt com.docker.network.bridge.name=another_docker_gwbridge \
--opt com.docker.network.bridge.enable_icc=false \
--opt com.docker.network.bridge.enable_ip_masquerade=true \
another_docker_gwbridge
Overlay Networks
Overlay networks are used in multi-host environments (like Docker's swarm mode). You can create an overlay
network using the docker network create command, but only after activating swarm mode:

docker swarm init
docker network create --driver overlay --subnet 10.0.9.0/24 my_network
This is the configuration of the new network:
[
    {
        "Name": "my_network",
        "Id": "2g3i0zdldo4adfqisvqjn6gpt",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.9.0/24",
                    "Gateway": "10.0.9.1"
                }
            ]
        },
        "Internal": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "257"
        },
        "Labels": null
    }
]
Say we have two hosts in a Docker cluster (with 192.168.10.22 and 192.168.10.23 as IP addresses); the containers
attached to this overlay network will get dynamically allocated IP addresses in the 10.0.9.0/24 subnet.
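A minimal way to test this from the swarm manager is to create a service attached to my_network and look at
where its tasks run and which addresses they get (my_service is a hypothetical service name):

docker service create --name my_service --network my_network --replicas 2 nginx
docker service ps my_service
# each task's container gets an IP inside 10.0.9.0/24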
VXLAN (Virtual Extensible LAN) - according to Wikipedia - is a network virtualization technology that attempts to
improve the scalability problems associated with large cloud computing deployments.
It uses a VLAN-like encapsulation technique to encapsulate MAC-based OSI layer 2 Ethernet frames within layer 4
UDP packets, using 4789 as the default destination UDP port number.
To better understand the context, this diagram illustrates the 7 layers of the OSI model:
VXLAN endpoints, which terminate VXLAN tunnels and may be both virtual or physical switch ports, are known as
VXLAN tunnel endpoints (VTEPs).
VXLAN is an evolution of efforts to standardize on an overlay encapsulation protocol. It increases scalability up to
16 million logical networks and allows for layer 2 adjacency across IP networks.
Open vSwitch is an example of a software-based virtual network switch that supports VXLAN overlay networks.
The overlay network driver works over VXLAN tunnels (connecting VTEPs) and needs a key/value store.
A VTEP has two logical interfaces: an uplink and a downlink. The uplink acts as the tunnel endpoint, with an IP
address on which it receives VXLAN frames.
Flannel
Kubernetes does away with port mapping and assigns a unique IP address to each pod. This works well on Google
Compute Engine, but on some other cloud providers a host cannot get an entire subnet. Flannel solves this
problem by creating an overlay mesh network that provisions each host with a subnet: each pod (if you are using
Kubernetes) or container gets a unique, routable IP inside the cluster.
According to CoreOS's creators, Kubernetes and Flannel work great with CoreOS to distribute a workload
across a cluster.
Flannel was designed to be used with Kubernetes, but it can be used as a generic overlay network driver to create
software-defined overlay networks, since it supports VXLAN, AWS VPC, and the default layer 2 UDP overlay
network.
Flannel uses etcd to store the network configuration (VM subnets, hosts' IPs, etc.), and among other backends it
can use UDP with a TUN device in order to encapsulate an IP fragment in a UDP packet. This packet transports
information like the MAC address, the outer IP, the inner IP and the payload.
Note that IP fragmentation is a process of the Internet Protocol (IP) that breaks datagrams into smaller pieces
(generally called fragments), so that the resulting packets can pass through a link with a smaller maximum
transmission unit (MTU) than the original datagram size. The fragments are, of course, reassembled by the
receiving host.
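If you want to observe the MTU limit yourself, Linux ping can set the do-not-fragment flag, so that packets larger
than the path MTU are rejected instead of fragmented (assuming a Linux host; 8.8.8.8 is only an example target):

# 1472 bytes of payload + 28 bytes of ICMP/IP headers = 1500, a typical Ethernet MTU
ping -M do -s 1472 -c 1 8.8.8.8
# anything larger should fail with a "message too long" error on a 1500-byte path
ping -M do -s 1473 -c 1 8.8.8.8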
This schema taken from the official Github repository explains well how Flannel networking works in general:
To install Flannel you need to build it from source:

sudo apt-get install linux-libc-dev golang gcc
git clone https://github.com/coreos/flannel.git
cd flannel; make dist/flanneld
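Flannel reads its network configuration from etcd, so before starting the flanneld daemon you publish that
configuration under flannel's default key (the subnet and backend below are only example values):

etcdctl set /coreos.com/network/config '{ "Network": "10.5.0.0/16", "Backend": {"Type": "vxlan"} }'
sudo ./dist/flanneld &
# flanneld picks a subnet for this host from 10.5.0.0/16 and writes it back to etcd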
Weave
Weaveworks created Weave (or Weave Net), a virtual network that connects Docker containers deployed
across multiple hosts.
In order to install Weave:
sudo curl -L git.io/weave -o /usr/local/bin/weave
sudo chmod a+x /usr/local/bin/weave
Now just type weave to download the weaveworks/weaveexec Docker image and see the help:
Usage:
weave --help | help
setup
version
weave launch
launch-router [--password ] [--trusted-subnets ,...]
[--host ]
[--name ] [--nickname ]
[--no-restart] [--resume] [--no-discovery] [--no-dns]
[--ipalloc-init ]
[--ipalloc-range [--ipalloc-default-subnet ]]
[--log-level=debug|info|warning|error]
...
launch-proxy [-H ] [--without-dns] [--no-multicast-route]
[--no-rewrite-hosts] [--no-default-ipalloc] [--no-restart]
[--hostname-from-label ]
[--hostname-match ]
[--hostname-replacement ]
[--rewrite-inspect]
[--log-level=debug|info|warning|error]
launch-plugin [--no-restart] [--no-multicast-route]
[--log-level=debug|info|warning|error]
weave prime
weave env
config
dns-args
[--restore]
weave connect
forget
[--replace] [ ...]
...
weave run
[--without-dns] [--no-rewrite-hosts] [--no-multicast-route]
[ ...] ...
start
attach
detach
restart
[ ...]
[ ...]
[ ...]
weave expose
hide
[ ...] [-h ]
[ ...]
weave dns-add
[ ...] [-h ] |
... -h
[ ...] [-h ] |
... -h
dns-remove
dns-lookup
weave status
report
ps
[targets | connections | peers | dns | ipam]
[-f ]
[ ...]
weave stop
stop-router
stop-proxy
stop-plugin
weave reset
rmpeer
where
[--force]
...
=
=
=
=
=
=
[:]
/
[ip:] | net: | net:default
[tcp://][]: | [unix://]/path/to/socket
|
consensus[=] | seed=,... | observer
Start Weave router:
weave launch
After typing this command, you will notice that some other Docker images are pulled; this is normal, as Weave needs
weavedb and the weave plugin:
Unable to find image 'weaveworks/weavedb:latest' locally
latest: Pulling from weaveworks/weavedb
1266eb846caf: Pulling fs layer
1266eb846caf: Download complete
1266eb846caf: Pull complete
Digest: sha256:c43f5767a1644196e97edce6208b0c43780c81a2279e3421791b06806ca41e5f
Status: Downloaded newer image for weaveworks/weavedb:latest
Unable to find image 'weaveworks/weave:1.8.2' locally
1.8.2: Pulling from weaveworks/weave
e110a4a17941: Already exists
199ab7eb2ba4: Already exists
8c419735a809: Already exists
1888d0f92b68: Already exists
f4d1c90c86a4: Already exists
1d6a7435ac59: Already exists
7372f3ee9e8b: Already exists
17004cbabd74: Already exists
b8e5c537a426: Already exists
4e295f039ae0: Pulling fs layer
d67a003dc85f: Pulling fs layer
2a84c77046e7: Pulling fs layer
d67a003dc85f: Download complete
2a84c77046e7: Verifying Checksum
2a84c77046e7: Download complete
4e295f039ae0: Verifying Checksum
4e295f039ae0: Download complete
4e295f039ae0: Pull complete
d67a003dc85f: Pull complete
2a84c77046e7: Pull complete
Digest: sha256:7a9ec1daa3b9022843fd18986f1bd5c44911bc9f9f40ba9b4d23b1c72c51c127
Status: Downloaded newer image for weaveworks/weave:1.8.2
Unable to find image 'weaveworks/plugin:1.8.2' locally
1.8.2: Pulling from weaveworks/plugin
e110a4a17941: Already exists
199ab7eb2ba4: Already exists
8c419735a809: Already exists
1888d0f92b68: Already exists
f4d1c90c86a4: Already exists
1d6a7435ac59: Already exists
7372f3ee9e8b: Already exists
17004cbabd74: Already exists
b8e5c537a426: Already exists
4e295f039ae0: Already exists
d67a003dc85f: Already exists
2a84c77046e7: Already exists
2b57c438a07b: Pulling fs layer
2b57c438a07b: Verifying Checksum
2b57c438a07b: Download complete
2b57c438a07b: Pull complete
Digest: sha256:3a38cec968bff6ebc4b1823673378b14d52ef750dec89e3513fe78119d07fdf2
Status: Downloaded newer image for weaveworks/plugin:1.8.2
Let's create a subnet for the host:
weave expose 10.10.0.1/16
And inspect the output of the brctl show command, as well as the newly created interface:

bridge name        bridge id           STP enabled   interfaces
br-0ebcbf638b08    8000.0242a98e8f09   no
br-5555cd178f99    8000.0242d38260e6   no
br-da47f0537ffa    8000.024261465302   no
docker0            8000.02422492c7fe   no
docker_gwbridge    8000.0242e5db7cff   no            veth6e6c262
lxcbr0             8000.000000000000   no
weave              8000.26eea4e57577   no            vethwe-bridge
ifconfig weave
weave     Link encap:Ethernet  HWaddr 26:ee:a4:e5:75:77
inet addr:10.10.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1410 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:67 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:10738 (10.7 KB)
You can use ifdown weave to stop the created interface.
brctl is used to set up, maintain, and inspect the ethernet bridge configuration in the Linux Kernel.
If you want more information about Weave running on your host, type weave status :

Version: 1.8.2 (up to date; next check at 2017/01/16 01:16:44)

        Service: router
       Protocol: weave 1..2
           Name: 26:ee:a4:e5:75:77(eonSpider)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 0
    Connections: 0
          Peers: 1
 TrustedSubnets: none

        Service: ipam
         Status: idle
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

        Service: dns
         Domain: weave.local.
       Upstream: 127.0.1.1
            TTL: 1
        Entries: 0

        Service: proxy
        Address: unix:///var/run/weave/weave.sock

        Service: plugin
     DriverName: weave
Now you can start a container directly from Weave CLI:
weave run 10.2.0.2/16 -it -d busybox
When you type docker ps you will see the last created container as well as the Weave containers:

CONTAINER ID   IMAGE                        COMMAND                  NAMES
0bedd8bf148a   busybox                      "sh"                     sick_raman
575aee35ec8d   weaveworks/plugin:1.8.2      "/home/weave/plugin"     weaveplugin
9e7a5a87b137   weaveworks/weaveexec:1.8.2   "/home/weave/weavepro"   weaveproxy
0edc17ad4a49   weaveworks/weave:1.8.2       "/home/weave/weaver -"   weave
To connect two containers in two distinct hosts using Weave, launch these commands in the first host ($HOST1):
weave launch
eval $(weave env)
docker run --name c1 -ti busybox
and in the second host, tell $HOST2 to peer with Weave already started on $HOST1:
weave launch $HOST1
eval $(weave env)
docker run --name c2 -ti busybox
$HOST1 is the IP address of the first host.
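Since weaveDNS is enabled by default, the two containers should now reach each other by name; from the
interactive shell of c1 on $HOST1, a quick check:

ping -c 2 c2
# c2's name is resolved by weaveDNS to its address on the Weave network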
The following image illustrates how two containers living on two distinct hosts can communicate:
In order to automate the discovery process in a Swarm cluster, start by getting a Swarm token:
curl -X POST https://discovery-stage.hub.docker.com/v1/clusters && echo -e "\n"
This is my token, you should of course get a different one:
10fd726a3e5341f86fb90658208e564a
If you haven't already launched weave, do it:
weave launch
Now download the discovery script:
curl -O https://raw.githubusercontent.com/weaveworks/discovery/master/discovery && chmod a+x discovery
The script will be downloaded to the current directory; you should move it to a directory like /usr/bin if you
want to use it as a system executable.
Do you remember your token ? You will use it here:
discovery join --advertise-router token://10fd726a3e5341f86fb90658208e564a
Until now, we are working in $HOST1. Go to $HOST2 and repeat the same commands:
weave launch
curl -O http://git.io/vmW3z && chmod a+x discovery
discovery join --advertise-router token://10fd726a3e5341f86fb90658208e564a
Both Weave routers should now be connected, and should stay connected.
This is how you can use the discovery command:
Weave Discovery
discovery join [--advertise=ADDR]
[--advertise-external]
[--advertise-router]
[--weave=ADDR[:PORT]]
where = backend://path
= host|IP
To leave a cluster, use the following command on the host that you want to remove from the cluster:
discovery leave
If you are using a KV store like etcd, you can also consider using it:
discovery join etcd://some/path
These are the important steps to use Weave service discovery; it is quite similar to the Swarm CLI. We are going to
see Swarm in detail later in this book, and you will then be able to better understand discovery.
Open vSwitch
Licensed under the open source Apache 2.0 license, the multilayer virtual switch Open vSwitch is designed to enable
massive network automation through programmatic extension, while still supporting standard management
interfaces and protocols (e.g. NetFlow, sFlow, IPFIX, RSPAN, CLI, LACP, 802.1ag).
Open vSwitch is also designed to support distribution across multiple physical servers similar to VMware's
vNetwork distributed vswitch or Cisco's Nexus 1000V.
In order to connect containers on multiple hosts, you need to install Open vSwitch on all hosts:
apt-get install -y openvswitch-switch bridge-utils
You may need some dependencies:
sudo apt-get install -y build-essential fakeroot debhelper \
autoconf automake bzip2 libssl-dev \
openssl graphviz python-all procps \
python-qt4 python-zopeinterface \
python-twisted-conch libtool
Then install the ovs-docker utility:
cd /usr/bin
wget https://raw.githubusercontent.com/openvswitch/ovs/master/utilities/ovs-docker
chmod a+rwx ovs-docker
Single Host
We start by creating an OVS bridge called ovs-br1:
ovs-vsctl add-br ovs-br1
Then we activate it and give it an IP and a netmask:
ifconfig ovs-br1 173.17.0.1 netmask 255.255.255.0 up
Verify your new interface configuration by typing:
ifconfig ovs-br1
ovs-br1   Link encap:Ethernet  HWaddr e6:58:8f:58:89:43
inet addr:173.17.0.1 Bcast:173.17.0.255 Mask:255.255.255.0
inet6 addr: fe80::e458:8fff:fe58:8943/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:0 (0.0 B) TX bytes:648 (648.0 B)
Let's create two containers:
docker run -it --name mycontainer1 -d busybox
docker run -it --name mycontainer2 -d busybox
And connect them to the bridge:
ovs-docker add-port ovs-br1 eth1 mycontainer1 --ipaddress=173.16.0.2/24
ovs-docker add-port ovs-br1 eth1 mycontainer2 --ipaddress=173.16.0.3/24
You can now ping the second container from the first one:

docker exec -it mycontainer1 ping 173.16.0.3
PING 173.16.0.3 (173.16.0.3): 56 data bytes
64 bytes from 173.16.0.3: seq=0 ttl=64 time=0.338 ms
64 bytes from 173.16.0.3: seq=1 ttl=64 time=0.061 ms
^C
Do the same thing from the second container:
docker exec -i mycontainer2 ping 173.16.0.2
PING 173.16.0.2 (173.16.0.2): 56 data bytes
64 bytes from 173.16.0.2: seq=0 ttl=64 time=0.067 ms
64 bytes from 173.16.0.2: seq=1 ttl=64 time=0.077 ms
^C
Multi Host
One might ask oneself: why can't we use regular Linux bridges? I will use the official FAQ of Open vSwitch to answer
this:
Q: Why would I use Open vSwitch instead of the Linux bridge? A: Open vSwitch is specially designed to make
it easier to manage VM network configuration and monitor state spread across many physical hosts in dynamic
virtualized environments. Please see [WHY-OVS.md] for a more detailed description of how Open vSwitch
relates to the Linux Bridge
Create two hosts (Host1 and Host2) accessible to each other.
For this example, we have two VMs and of course, openvswitch-switch, Docker and ovs-docker tool should be
installed in both hosts.
wget -qO- https://get.docker.com/|sh
apt-get install openvswitch-switch bridge-utils openvswitch-common
On Host1: create a new OVS bridge and a veth pair, bring them up, then create the tunnel between Host1 and Host2.
Make sure to set remote_ip to the real IP of Host2.
ovs-vsctl add-br br-int
ip link add veth0 type veth peer name veth1
ovs-vsctl add-port br-int veth1
brctl addif docker0 veth0
ip link set veth1 up
ip link set veth0 up
ovs-vsctl add-port br-int gre0 -- set interface gre0 type=gre options:remote_ip=<IP_of_Host2>
On Host2: do the same thing; create a new OVS bridge and a veth pair, bring them up, then create the tunnel between
Host1 and Host2. Make sure to set remote_ip to the real IP of Host1.
ovs-vsctl add-br br-int
ip link add veth0 type veth peer name veth1
ovs-vsctl add-port br-int veth1
brctl addif docker0 veth0
ip link set veth1 up
ip link set veth0 up
ovs-vsctl add-port br-int gre0 -- set interface gre0 type=gre options:remote_ip=<IP_of_Host1>
You can see the created bridge on each host by typing ovs-vsctl show .
On Host1:
0aaba889-1d8c-4db2-b783-d7a203853d44
Bridge br-int
Port "veth1"
Interface "veth1"
Port br-int
Interface br-int
type: internal
Port "gre0"
Interface "gre0"
type: gre
options: {remote_ip=""}
ovs_version: "2.5.0"
On Host2:
7e782730-8990-4786-b2b0-efef7721665b
Bridge br-int
Port "veth1"
Interface "veth1"
Port "gre0"
Interface "gre0"
type: gre
options: {remote_ip=""}
Port br-int
Interface br-int
type: internal
ovs_version: "2.5.0"
The remote_ip values are of course replaced by their real values. You can also use the brctl show command for
more information.

Now, create a container on Host1:

docker run -it --name container1 -d busybox

View its IP address:
docker inspect --format '{{.NetworkSettings.IPAddress}}' container1
On Host2, do the same thing:
docker run -it --name container1 -d busybox
docker inspect --format '{{.NetworkSettings.IPAddress}}' container1
You will notice that both containers have the same IP address, 172.17.0.2. This could create a conflict in the cluster,
so we are going to create a second container on Host2; it will have a different IP:
docker run -it --name container2 -d busybox
docker inspect --format '{{.NetworkSettings.IPAddress}}' container2
From the container1 in Host1, ping container2 in Host2:
docker exec -it container1 ping -c 2 172.17.0.3
PING 172.17.0.3 (172.17.0.3): 56 data bytes
64 bytes from 172.17.0.3: seq=0 ttl=64 time=0.985 ms
64 bytes from 172.17.0.3: seq=1 ttl=64 time=0.963 ms
--- 172.17.0.3 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.963/0.974/0.985 ms
From container2 on Host2, ping container1 on Host1. But before this, remove the container1 living on the same host
(Host2), in order to be sure that we are pinging the right container (container1 on Host1):
docker rm -f container1
docker exec -it container2 ping -c 2 172.17.0.2
PING 172.17.0.2 (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=1.475 ms
64 bytes from 172.17.0.2: seq=1 ttl=64 time=1.139 ms
--- 172.17.0.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.139/1.307/1.475 ms
Now you have seen how we can connect two containers on different hosts using Open vSwitch.
To go further, you may notice that both docker0 interfaces have the same IP address, 172.17.0.1. This overlap could
create confusion within a multi-host network.
This is why we are going to remove the docker0 bridge interface and create a new one with a different subnet on
each host. You are free to make any choice of private IP addresses; I am going to use these:
Host1 : 192.168.10.1/16
Host2 : 192.168.11.1/16
In order to change the IP address of an interface, you can use the ifconfig command and the brctl command.

Usage:
ifconfig [-a] [-v] [-s] [[] ]
[add [/]]
[del [/]]
[[-]broadcast []] [[-]pointopoint []]
[netmask ] [dstaddr ] [tunnel ]
[outfill ] [keepalive ]
[hw ] [metric ] [mtu ]
[[-]trailers] [[-]arp] [[-]allmulti]
[multicast] [[-]promisc]
[mem_start ] [io_addr ] [irq ] [media ]
[txqueuelen ]
[[-]dynamic]
[up|down] ...

Usage: brctl [commands]
This is a practical example, but before starting, make sure your firewall rules will not block the next steps, and
also make sure that you stop the Docker service ( service docker stop ):
Deactivate docker0 by bringing it down:
ifconfig docker0 down
Delete the bridge and create a new one having the same interface name:
brctl delbr docker0
brctl addbr docker0
Bring it up while assigning it a new address and a new MTU:
sudo ifconfig docker0 192.168.10.1/16 mtu 1400 up
The last change will not be persistent unless you edit the /etc/default/docker configuration file and add
--bip=192.168.10.1/16 to DOCKER_OPTS .
Example:
DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 --bip=10.11.12.1/24"
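After editing /etc/default/docker, start the Docker service again (we stopped it earlier) and check that docker0
picked up the new address:

sudo service docker start
ifconfig docker0
# the inet addr should now be the one set in --bip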
If you are using Ubuntu/Debian, you can use a script that I found in a GitHub gist and tested; it will do this
automatically for you. I forked it here: https://gist.github.com/eon01/b7fbfa3309ed4f514bc742045ce9b5a2 and you
can use it this way:

Example 1:

wget http://bit.ly/2kHIbVc && bash configure_docker0.sh 192.168.10.1/16

Example 2:

wget http://bit.ly/2kHIbVc && bash configure_docker0.sh 192.168.11.1/16
For formatting reasons, I used bit.ly to shorten the url, this is the real url :
https://gist.githubusercontent.com/eon01/b7fbfa3309ed4f514bc742045ce9b5a2/raw/7bb94c774510505196151c5d787ce865140ace9c/configure_docker0.sh
To use this script:
Make sure you choose the network you want to use instead of the networks used in the examples
Make sure you are using Debian/Ubuntu
This script must be run with your root user
Docker must be stopped
Change will happen after starting Docker
Project Calico
Calico takes a different approach, since it uses layer 3 to provide virtual networking. It includes
pre-integration with Kubernetes and Mesos (as a CNI network plugin), Docker (as a libnetwork plugin) and
OpenStack (as a Neutron plugin). It supports many public and private clouds, like AWS and GCE.
Almost all of the other networking solutions (like Weave and Flannel) encapsulate layer 2 traffic at a higher level
to build an overlay network, while the primary operating mode of Project Calico requires no encapsulation.
Based on the same scalable IP network principles as the Internet, Calico leverages the existing Linux Kernel
forwarding engine without the need for virtual switches or overlays. Each host propagates workload reachability
information (routes) to the rest of the data center – either directly in small scale deployments or via infrastructure
route reflectors to reach Internet level scales in large deployments.
Like it is described in the official documentation of the project, Calico simplifies the network topology, removing
multiple encapsulation and de-encapsulation which gives some strengths to this networking system:
Smaller packet sizes mean a reduction in possible packet fragmentation.
Fewer CPU cycles are spent handling encapsulation and de-encapsulation.
Packets are easier to interpret, and therefore easier to troubleshoot.
Project Calico is most compatible with data centers where you have control over the physical network fabric.
Pipework
Pipework is a software-defined networking tool for LXC that lets you connect containers together in arbitrarily
complex scenarios. It uses cgroups and namespaces, and works with containers created with lxc-start (plain LXC)
as well as with Docker.
In order to install it, you can execute the installation script from its Github repository :
sudo bash -c "curl https://raw.githubusercontent.com/jpetazzo/pipework/master/pipework > /usr/local/bin/pipework"
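As a hedged illustration of what Pipework does (in the spirit of its README; the bridge name, container name and
address below are only examples), it adds an extra interface to a running container and wires it to a host bridge:

# give the container c1 an extra interface attached to the host bridge br1
pipework br1 c1 192.168.1.10/24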
Since its creation, Docker has come to allow more complex scenarios, and Pipework is becoming obsolete. Given
Docker, Inc.'s acquisition of SocketPlane and the introduction of the overlay driver, you are better off using Docker
Swarm's built-in orchestration unless you have very specific needs.
OpenVPN
Using OpenVPN, you can create virtual private networks (VPNs). You can use a VPN to connect different VMs in
the same data center, or VMs across multiple clouds, in order to connect distributed containers. This connection is,
of course, secured (TLS).
Service Discovery
Etcd
etcd is a distributed, key-value store for shared configuration and service discovery, with features like:
A user-facing API (gRPC)
Automatic TLS with optional client cert authentication
Speed (benchmarked at 10,000 writes/sec according to CoreOS)
Properly distributed using Raft
etcd is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log. It is a
production-ready software widely used with tools like Kubernetes, fleet, locksmith, vulcand, Doorman.
In order to rapidly set up etcd on AWS, you can use the official AMI.
To test etcd, you can create a CoreOS cluster (with 3 machines); for simplicity's sake, I am going to use Digital
Ocean.
The first thing to do before creating a new CoreOS cluster is generating a new discovery URL. You can do
this using the https://discovery.etcd.io service, which will print a new discovery URL:
curl -w "\n" "https://discovery.etcd.io/new?size=3"
This is my discovery URL:

https://discovery.etcd.io/d9fe2c6051e8204e2fa730ccc815e76b

We are going to use this URL in the cloud-config configuration. You should replace it with your own generated URL:
#cloud-config

coreos:
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new:
    discovery: https://discovery.etcd.io/d9fe2c6051e8204e2fa730ccc815e76b
    # multi-region deployments, multi-cloud deployments, and Droplets without
    # private networking need to use $public_ipv4:
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    # listen on the official ports 2379, 2380 and one legacy port 4001:
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    # used for fleetctl ssh command
    public-ip: $private_ipv4
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
When creating your 3 CoreOS VMs, make sure to activate private networking and paste your cloud-config
configuration.

The cloud-config will not work for you if you forget to add the first line: #cloud-config

Create your 3 machines and go get a cup of coffee, unless you created small VMs.

Log into one of the created machines; you can now type fleetctl list-machines in order to see all of the created
machines.
If you want another machine to join the same cluster, you can use the same cloud-config file again and your machine
will join the cluster automatically.
You can find the discovery url by typing
grep DISCOVERY /run/systemd/system/etcd2.service.d/20-cloudinit.conf
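Once the cluster is up, you can check that the key/value store replicates across the machines by setting a key on one
machine and reading it from another (the /message key is only an example):

etcdctl set /message "Hello from etcd"
# on another machine of the cluster:
etcdctl get /message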
etcd is written in the Go language and developed by the CoreOS team.
Consul
Consul is a tool for service discovery and configuration that runs on Linux, Mac OS X, FreeBSD, Solaris, and
Windows. Consul is distributed, highly available, and extremely scalable.
It provides several key features:
Service Discovery - Consul makes it easy for services to register themselves and to discover other services via
a DNS or HTTP interface. External services such as SaaS providers can also be registered.
Health Checking - Health Checking enables Consul to quickly alert operators about any issues in a cluster. The
integration with service discovery prevents routing traffic to unhealthy hosts and enables service level circuit
breakers.
Key/Value Storage - A flexible key/value store enables storing dynamic configuration, feature flagging,
coordination, leader election and more. The simple HTTP API makes it easy to use anywhere.
Multi-Datacenter - Consul is built to be datacenter aware, and can support any number of regions without
complex configuration.
For simplicity's sake, I will use a Docker container to run Consul:

docker run -p 8400:8400 -p 8500:8500 \
  -p 8600:53/udp -h consul_s progrium/consul -server -bootstrap
Unable to find image 'progrium/consul:latest' locally
latest: Pulling from progrium/consul
c862d82a67a2: Pull complete
0e7f3c08384e: Pull complete
0e221e32327a: Pull complete
09a952464e47: Pull complete
60a1b927414d: Pull complete
4c9f46b5ccce: Pull complete
417d86672aa4: Pull complete
b0d47ad24447: Pull complete
fd5300bd53f0: Pull complete
a3ed95caeb02: Pull complete
d023b445076e: Pull complete
ba8851f89e33: Pull complete
5d1cefca2a28: Pull complete
Digest: sha256:8cc8023462905929df9a79ff67ee435a36848ce7a10f18d6d0faba9306b97274
Status: Downloaded newer image for progrium/consul:latest
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting raft data migration...
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul_s'
Datacenter: 'dc1'
Server: true (bootstrap: true)
Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 53, RPC: 8400)
Cluster Addr: 172.17.0.3 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas:
==> Log data will now stream in as it occurs:
2017/01/29 01:02:23 [INFO] serf: EventMemberJoin: consul_s 172.17.0.3
2017/01/29 01:02:23 [INFO] serf: EventMemberJoin: consul_s.dc1 172.17.0.3
2017/01/29 01:02:23 [INFO] raft: Node at 172.17.0.3:8300 [Follower] entering Follower state
2017/01/29 01:02:23 [INFO] consul: adding server consul_s (Addr: 172.17.0.3:8300) (DC: dc1)
2017/01/29 01:02:23 [INFO] consul: adding server consul_s.dc1 (Addr: 172.17.0.3:8300) (DC: dc1)
2017/01/29 01:02:23 [ERR] agent: failed to sync remote state: No cluster leader
2017/01/29 01:02:25 [WARN] raft: Heartbeat timeout reached, starting election
2017/01/29 01:02:25 [INFO] raft: Node at 172.17.0.3:8300 [Candidate] entering Candidate state
2017/01/29 01:02:25 [INFO] raft: Election won. Tally: 1
2017/01/29 01:02:25 [INFO] raft: Node at 172.17.0.3:8300 [Leader] entering Leader state
2017/01/29 01:02:25 [INFO] consul: cluster leadership acquired
2017/01/29 01:02:25 [INFO] consul: New leader elected: consul_s
2017/01/29 01:02:25 [INFO] raft: Disabling EnableSingleNode (bootstrap)
2017/01/29 01:02:25 [INFO] consul: member 'consul_s' joined, marking health alive
2017/01/29 01:02:25 [INFO] agent: Synced service 'consul'
You can see that a Docker Consul container is now running, mapping port 8500 for the HTTP API and port 8600 for
the DNS endpoint:

CONTAINER ID   IMAGE             COMMAND                  PORTS
40d56ae6d179   progrium/consul   "/bin/start -serve..."   53/tcp, 0.0.0.0:8400->8400/tcp, 8300-8302/tcp, 8301-8302/udp, 0.0.0.0:8500->8500/tcp, 0.0.0.0:8600->53/udp
You can use the HTTP endpoint to show a list of connected nodes:
curl localhost:8500/v1/catalog/nodes
[{"Node":"consul_s","Address":"172.17.0.3"}]
In order to use the DNS endpoint, try dig @0.0.0.0 -p 8600 node1.node.consul .
; <<>> DiG 9.9.5-3ubuntu0.10-Ubuntu <<>> @0.0.0.0 -p 8600 consul_s
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 22307
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;consul_s.                     IN      A
Consul can also be used from a GUI; you can give it a try at http://0.0.0.0:8500/ (if you are running the container
on your localhost).
You can use Consul with different options like:
Using a service definition with Consul
Example for 1 service:
{
    "service": {
        "name": "redis",
        "tags": ["primary"],
        "address": "",
        "port": 8000,
        "enableTagOverride": false,
        "checks": [
            {
                "script": "/usr/local/bin/check_redis.py",
                "interval": "10s"
            }
        ]
    }
}
For more than one service, just use services instead of service :
{
    "services": [
        {
            "id": "red0",
            "name": "redis",
            "tags": [
                "primary"
            ],
            "address": "",
            "port": 6000,
            "checks": [
                {
                    "script": "/bin/check_redis -p 6000",
                    "interval": "5s",
                    "ttl": "20s"
                }
            ]
        },
        {
            "id": "red1",
            "name": "redis",
            "tags": [
                "delayed",
                "secondary"
            ],
            "address": "",
            "port": 7000,
            "checks": [
                {
                    "script": "/bin/check_redis -p 7000",
                    "interval": "30s",
                    "ttl": "60s"
                }
            ]
        },
        ...
    ]
}
You can use tools like Traefik or fabio as load balancers backed by Consul.
If you want to use fabio, you should:
Install it, you can also use Docker: docker pull magiconair/fabio
Register your service in consul
Register a health check in consul
Register one urlprefix- tag per host/path prefix it serves, e.g.: urlprefix-/css , urlprefix-i.com/static ,
urlprefix-mysite.com/
An example:
{
    "service": {
        "name": "foobar",
        "tags": ["urlprefix-/foo, urlprefix-/bar"],
        "address": "",
        "port": 8000,
        "enableTagOverride": false,
        "checks": [
            {
                "id": "api",
                "name": "HTTP API on port 5000",
                "http": "http://localhost:5000/health",
                "interval": "2s",
                "timeout": "1s"
            }
        ]
    }
}
Start fabio without a configuration file (a Consul agent should be running on localhost:8500 ).
Watch the fabio logs.
Send all your HTTP traffic to fabio on port 9999.
This is a good video that explains how fabio works: https://www.youtube.com/watch?v=gvxxu0PLevs
Finally, you can write your own process that registers the service through the HTTP API
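As a minimal sketch of that last option, the Consul agent exposes a registration endpoint over HTTP; assuming an
agent listening on localhost:8500, registering and then discovering a hypothetical web service looks like this:

# register a service named "web" listening on port 80
curl -X PUT -d '{"Name": "web", "Port": 80}' http://localhost:8500/v1/agent/service/register
# list the registered instances of the service
curl http://localhost:8500/v1/catalog/service/web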
ZooKeeper
ZooKeeper (or ZK) is a centralized service for configuration management with distributed synchronization
capabilities. ZK organizes its data in a hierarchy of znodes .
It exposes a simple set of primitives that distributed applications can build upon to implement higher level services
for synchronization, configuration maintenance, and groups and naming. It is designed to be easy to program to, and
uses a data model styled after the familiar directory tree structure of file systems. It runs in Java and has bindings for
both Java and C.
From the official documentation, the ZooKeeper implementation is described as putting a premium on high
performance, high availability, and strictly ordered access. The performance aspects of ZooKeeper mean it can be
used in large, distributed systems. The reliability aspects keep it from being a single point of failure. The strict
ordering means that sophisticated synchronization primitives can be implemented at the client.
It allows distributed processes to coordinate with each other through a shared hierarchal namespace which is
organized similarly to a standard file system. The name space consists of data registers - called znodes, in
ZooKeeper parlance - and these are similar to files and directories. Unlike a typical file system, which is designed
for storage, ZooKeeper data is kept in-memory, which means ZooKeeper can achieve high throughput and low
latency numbers.
Like the distributed processes it coordinates, ZooKeeper itself is intended to be replicated over a set of hosts called
an ensemble.
The servers that make up the ZooKeeper service must all know about each other. They maintain an in-memory
image of state, along with transaction logs and snapshots in a persistent store. As long as a majority of the servers
are available, the ZooKeeper service will be available.
Clients connect to a single ZooKeeper server. The client maintains a TCP connection through which it sends
requests, gets responses, gets watch events, and sends heart beats. If the TCP connection to the server breaks, the
client will connect to a different server.
ZooKeeper stamps each update with a number that reflects the order of all ZooKeeper transactions. Subsequent
operations can use the order to implement higher-level abstractions, such as synchronization primitives. It is
especially fast in "read-dominant" workloads. ZooKeeper applications run on thousands of machines, and it
performs best where reads are more common than writes, at ratios of around 10:1.
Its API mainly supports these operations:

create : creates a node at a location in the tree
delete : deletes a node
exists : tests if a node exists at a location
get data : reads the data from a node
set data : writes data to a node
get children : retrieves a list of children of a node
sync : waits for data to be propagated
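These operations map directly to the commands of the zkCli.sh shell that ships with ZooKeeper; a small illustrative
session (the /app znode and its data are hypothetical):

create /app "my config"    # creates a znode holding some data
get /app                   # reads the data back
ls /                       # lists the children of the root znode
delete /app                # deletes the znode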
Load Balancers
Nginx
Nginx can integrate with service discovery tools like etcd/confd. Nginx is a popular web server, reverse proxy
and load balancer, and the advantages of using Nginx are the community behind it and its very good performance.
A simple configuration would be creating the right Nginx upstream to redirect traffic to the Docker containers
in a cluster, for example a Swarm cluster. Example: we want to run an API deployed using Docker Swarm, with the
service mapped to port 8080:
docker service create --name api --replicas 20 --publish 8080:80 my/api:2.3
We know that the API service is mapped to port 8080 on the leader node, so we can create a simple Nginx
configuration file:
server {
    listen 80;

    location / {
        proxy_pass http://api;
    }
}

upstream api {
    server :8080;
}
This file will be used to run an Nginx load balancer:
docker service create --name my_load_balancer --mount type=bind,source=/data/,target=/etc/nginx/conf.d --publish 80:80 nginx
Nginx could be integrated with Consul, Registrator and Consul-template.
Consul will be used as the service discovery tool
Registrator will be used to automatically register the new started services to Consul
Consul-template will be used to automatically recreate the Nginx configuration from a given template

Manually adding and removing nodes in the Nginx configuration is not a good solution; Consul-template
regenerates it for you. Here is a simple Nginx configuration template that works with Consul:
upstream frontend { {{range service "app.frontend"}}
server {{.Address}};{{end}}
}
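This template is rendered by the consul-template daemon, which watches Consul and reloads Nginx whenever the
service list changes; a hedged invocation (the file names are illustrative):

consul-template -template "frontend.ctmpl:/etc/nginx/conf.d/frontend.conf:nginx -s reload"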
HAProxy
HAProxy is a very common, high-performance load balancing software that can be set up as a load balancer
in front of a Docker cluster.
You can for example, integrate it with Consul, Registrator and Consul-template.
Consul will be used as the service discovery tool https://github.com/hashicorp/consul
Registrator will be used to automatically register the new started services to Consul
https://github.com/gliderlabs/registrator
Consul-template will be used to automatically recreate the HAProxy configuration from a given template
https://github.com/hashicorp/consul-template
Like with the Nginx load balancer, adding and removing nodes for HAProxy can be done using Consul Template:

backend frontend
    balance roundrobin{{range service "app.frontend"}}
    server {{.ID}} {{.Address}}:{{.Port}}{{end}}
You may also consider using HAProxy with Docker Swarm mode. You can use the dockerfile/haproxy to run
HAProxy:
docker run -d -p 80:80 -v :/haproxy-override dockerfile/haproxy
where the mounted path is an absolute path to a directory that can contain:

haproxy.cfg : custom config file (replace /dev/log with 127.0.0.1 , and comment out daemon)
errors/ : custom error responses
This is an example of HAProxy configuration:
global
    debug

defaults
    log global
    mode http
    timeout connect 5000
    timeout client 5000
    timeout server 5000

listen http_proxy :8443
    mode tcp
    balance roundrobin
    server server1 docker:8000 check
    server server2 docker:8001 check
Traefik
Traefik is an HTTP reverse proxy and load balancer made to deploy microservices. It supports several backends like
Docker, Swarm, Mesos/Marathon, Consul, etcd, ZooKeeper, BoltDB, a REST API, files, etc. Its configuration can be
managed automatically and dynamically.
You can run a Docker container to deploy Traefik:
docker run -d -p 8080:8080 -p 80:80 -v $PWD/traefik.toml:/etc/traefik/traefik.toml traefik
Or Docker Compose:
traefik:
  image: traefik
  command: --web --docker --docker.domain=docker.localhost --logLevel=DEBUG
  ports:
    - "80:80"
    - "8080:8080"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /dev/null:/traefik.toml
This is the official example that you can test and follow to understand how Traefik works:
Create docker-compose.yml file:
traefik:
  image: traefik
  command: --web --docker --docker.domain=docker.localhost --logLevel=DEBUG
  ports:
    - "80:80"
    - "8080:8080"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /dev/null:/traefik.toml

whoami1:
  image: emilevauge/whoami
  labels:
    - "traefik.backend=whoami"
    - "traefik.frontend.rule=Host:whoami.docker.localhost"

whoami2:
  image: emilevauge/whoami
  labels:
    - "traefik.backend=whoami"
    - "traefik.frontend.rule=Host:whoami.docker.localhost"
Run it:
docker-compose up -d
Now you can test the load balancing using curl:
curl -H Host:whoami.docker.localhost http://127.0.0.1
curl -H Host:whoami.docker.localhost http://127.0.0.1
Kube-Proxy
Kube-Proxy is one of the components of Kubernetes. On each Kubernetes node, Kube-Proxy performs simple
TCP/UDP stream forwarding or round-robin TCP/UDP forwarding across a set of backends. Service cluster IPs and
ports are currently found through Docker-links-compatible environment variables that specify the ports opened by
the service proxy.
Vulcand
Vulcand is a programmatic, extendable proxy for microservices and API management. It is inspired by Hystrix and
powers Mailgun's microservices infrastructure.
It uses etcd as a configuration backend, has an API and a CLI, and supports canary deploys, real-time metrics and
resiliency.
Moxy
Moxy is a reverse proxy and load balancer that automatically configures itself to discover services deployed on
Apache Mesos and Marathon.
servicerouter.py
Marathon's servicerouter.py is a replacement for the haproxy-marathon-bridge. It reads Marathon task information
and generates HAProxy configuration. It supports advanced functions like sticky sessions, HTTP to HTTPS
redirection, SSL offloading, VHost support and templating. It is implemented in Python.
You can run the official Docker image to deploy it:
docker run -d \
--name="servicerouter" \
--net="host" \
--ulimit nofile=8204 \
--volume="/dev/log:/dev/log" \
--volume="/tmp/ca-bundle.pem:/etc/ssl/mesosphere.com.pem:ro" \
uzyexe/marathon-servicerouter
Chapter IX - Composing Services Using Compose
   o   ^__^
    o  (oo)\_______
       (__)\       )\/\
           ||----w |
           ||     ||
What Is Docker Compose
We have seen how to run containers individually. Say we want to run a LAMP/LEMP stack: in this case, we
would start a PHP container, then start the web server container, and make the link between them. Using Docker
Compose, it is possible to run a multi-container application using a declarative YAML file (the Compose file). With
a single command, we can start multiple containers that run all of our services (a LAMP or LEMP stack in this case).
Installing Docker Compose
Docker Compose comes as a separate binary: even if you have already installed Docker, you still need to install Compose.
Docker Compose For Mac And Windows
If you're a Mac or Windows user, the best way to install Compose and keep it up-to-date is Docker for Mac and
Windows. Docker for Mac and Windows will automatically install the latest version of Docker Engine for you. You
can use one of these links:
Docker Community Edition for Mac: https://store.docker.com/editions/community/docker-ce-desktop-mac
Docker Community Edition for Windows: https://store.docker.com/editions/community/docker-ce-desktop-windows
Docker Compose For Linux
In order to install Compose on Linux, download the Compose binary and move it to the binaries path:
curl -L https://github.com/docker/compose/releases/download/1.13.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
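Once the binary is in place, a quick sanity check confirms it works (the exact output depends on the installed version):
docker-compose --version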
You can get a different version by replacing 1.13.0 with the desired version. Note that each Compose file format version
is compatible with specific Docker Engine versions; for example, Compose file format 3.0 – 3.2 requires Docker Engine 1.13.0+.
Running Wordpress Using Docker Compose
One of the advantages of Docker Compose is the fact that a Compose file can be shared and distributed; creating an
application that uses several services and components can be done using a single Compose command.
This is what most users running Wordpress with Compose are using:
version: '3'
services:
  db:
    image: mysql:5.7
    volumes:
      - db_data:/var/lib/mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: mypassword
      MYSQL_DATABASE: wordpress
      MYSQL_USER: myuser
      MYSQL_PASSWORD: mypassword
  wordpress:
    depends_on:
      - db
    image: wordpress:latest
    ports:
      - "8000:80"
    restart: always
    environment:
      WORDPRESS_DB_HOST: db:3306
      WORDPRESS_DB_USER: myuser
      WORDPRESS_DB_PASSWORD: mypassword
volumes:
  db_data:
The above content goes into a file called docker-compose.yml. This is the file tree we are using:
Running_Wordpress_Using_Docker_Compose/
└── docker-compose.yml
Now type cd Running_Wordpress_Using_Docker_Compose and run docker-compose up . After downloading and running the
different services, you can go to http://127.0.0.1:8000 in order to see a running Wordpress.
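From the same folder, you can also verify that both services are up (a quick check; container names are prefixed with the project folder name):
docker-compose ps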
When running the docker-compose up command, you should always be inside the folder containing the docker-compose.yml file.
What we have actually run via the docker-compose.yml file is the equivalent of running two commands:
docker run --name db \
-e MYSQL_ROOT_PASSWORD=mypassword \
-e MYSQL_DATABASE=wordpress \
-e MYSQL_USER=myuser \
-e MYSQL_PASSWORD=mypassword \
-d mysql:5.7
docker run --name wordpress \
--link db:mysql \
-p 8000:80 \
-e WORDPRESS_DB_USER=myuser \
-e WORDPRESS_DB_PASSWORD=mypassword \
-d wordpress:latest
Running LEMP Using Docker Compose
We are going to use Compose version 3. After downloading and installing Docker Compose, create a new folder
and a new file called docker-compose.yml :
mkdir Running_LEMP_Using_Docker_Compose
cd Running_LEMP_Using_Docker_Compose
vi docker-compose.yml
Inside the Compose file, start by mentioning the version:
version: '3'
Now add the first service:
web:
We will use Nginx with the latest image, so our service will look like this:
web:
  image: nginx:latest
And to declare port mapping, the file becomes:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
We are going to declare 3 volumes:
code: where we are going to put the php files
configs: where the Nginx configuration files go
scripts: where the script to start at the service startup goes
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
Let's create a PHP file where we will use the phpinfo() function:
mkdir code
cd code
echo "<?php phpinfo(); ?>" > index.php
We named the Nginx configuration file site.conf and this is its content:
server {
    index index.php;
    server_name php-docker.local;
    error_log /var/log/nginx/error.log;
    access_log /var/log/nginx/access.log;
    root /code;

    location ~ \.php$ {
        try_files $uri =404;
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass php:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param PATH_INFO $fastcgi_path_info;
    }
}
You should create the folder configs; the file above goes inside this folder.
Now create another folder called scripts and add a new script file; we are going to name it start_services.sh :
#!/usr/bin/env bash
chown -R www-data:www-data /code
nginx
for (( ; ; ))
do
sleep 1d
done
This script will be the entrypoint of the Nginx container; you can customize it depending on your needs, but don't
forget to execute chmod +x scripts/start_services.sh .
This is the structure of our folder:
.
├── code
│   └── index.php
├── configs
│   └── site.conf
├── docker-compose.yml
└── scripts
    └── start_services.sh
This is our new docker-compose file:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
    links:
      - php
    entrypoint: ./scripts/start_services.sh
    restart: always
Now that the web service is linked to the php service, let's add the remainder of the Compose file:
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
    links:
      - php
    entrypoint: ./scripts/start_services.sh
    restart: always
  php:
    image: php:7-fpm
    volumes:
      - ./code:/code
    restart: always
    expose:
      - 9000
Now we simply need to run a single command to start the webserver with the PHP backend:
docker-compose up
You should see something like this:
Creating network "runninglempusingdockercompose_default" with the default driver
Creating runninglempusingdockercompose_php_1 ...
Creating runninglempusingdockercompose_php_1 ... done
Creating runninglempusingdockercompose_web_1 ...
Creating runninglempusingdockercompose_web_1 ... done
Attaching to runninglempusingdockercompose_php_1, runninglempusingdockercompose_web_1
Now you can open your browser and visit http://127.0.0.1:8000 .
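If you prefer the command line, the same check can be done with curl (assuming the phpinfo() page created in the code folder is being served):
curl -s http://127.0.0.1:8000 | grep -i 'PHP Version'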
Something helpful that we can use is sending the Nginx access logs to a remote log server/service. I am going to use
the AWS CloudWatch service, but you can configure your own service/server like syslog or Fluentd, etc.
version: '3'
services:
  web:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./code:/code
      - ./configs/site.conf:/etc/nginx/conf.d/default.conf
      - ./scripts:/scripts
    links:
      - php
    entrypoint: ./scripts/start_services.sh
    restart: always
    logging:
      driver: "awslogs"
      options:
        awslogs-group: "dev"
        awslogs-stream: "web_logs"
  php:
    image: php:7-fpm
    volumes:
      - ./code:/code
    restart: always
    expose:
      - 9000
    logging:
      driver: "awslogs"
      options:
        awslogs-group: "dev"
        awslogs-stream: "php_logs"
If you want to run this LEMP Docker stack as a daemon, you should use docker-compose up -d .
If you want to pause the stack, use docker-compose pause .
If you want to unpause the stack, use docker-compose unpause .
If you want to stop the stack, use docker-compose down .
Scaling Docker Compose
Using Docker Compose it is possible to scale a running service. Say we need 5 PHP containers running
behind our webserver. The command docker-compose scale php=5 will start 4 more containers running the same service
(PHP), and since the Nginx service is linked to the PHP one, all of the containers of the latter service can be seen by
the Nginx container.
You may ask why we haven't scaled Nginx, which could become a bottleneck. As you can see in the Compose file, there is a mapping
between port 8000 (external/host) and port 80 (internal/container); the host has a single port 8000, so
creating a new container running the Nginx service would try to use the same external port and cause a conflict. It is therefore
not possible to scale a service that uses host port mapping with Docker Compose.
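A quick way to try this (service names will carry your project folder prefix):
docker-compose scale php=5
# list the running containers; five php containers should appear
docker-compose ps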
Docker Compose Use Cases
I have used Docker Compose in production, but it was not for a critical application. Docker Compose is mainly created for
testing and development purposes. When developing an application, having an isolated environment is crucial.
Docker Compose creates this environment for a developer, so adding, removing, or changing the version of a software
or a middleware is easy and will not create any dependency or multi-version problems.
Docker Compose is a good way to share stacks between users and teams, e.g. sharing the production stack with
developers using different development configurations could be done using Docker Compose.
It is also useful for running tests by providing the environment in which to run them; once the tests have run, the
containers can be destroyed.
Chapter X - Docker Logging
Docker Native Logging
When you start a container running a web application, a web server or any other application, sooner or later, you
will need to view your application logs. Let's see an example to view the access logs of a webserver.
Run an Nginx container:
docker run -it -p 8000:80 -d --name webserver nginx
Then visit localhost at port 8000 or execute this command:
curl http://0.0.0.0:8000
Now when you type docker logs webserver you can see the last lines of the access log:
172.17.0.1 - - [27/May/2017:21:33:59 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.47.0" "-"
The last command prints the log and exits. If you want the equivalent of the tail -f command on a
Docker container, add the -f flag:
docker logs -f webserver
or
docker logs --follow webserver
In order to tail the last 10 lines and exit you can execute docker logs --tail 10 webserver . You can also use the --since
flag that takes a string and shows logs since timestamp (e.g. 2013-01-02T13:23:37) or relative (e.g. 42m for 42
minutes):
docker logs --since 42s webserver
If you want to print the timestamp, use the -t flag:
docker logs --since 2h -t webserver
docker logs -f -t webserver
Adding New Logs
The docker logs command uses the output of STDOUT and STDERR. When we run an Nginx container, the access
logs and the remainder of Nginx logs like error logs are redirected respectively to STDOUT and STDERR. Let's
examine the official Nginx Dockerfile in order to see how this was done:
#
# Nginx Dockerfile
#
# https://github.com/dockerfile/nginx
#
# Pull base image.
FROM dockerfile/ubuntu
# Install Nginx.
RUN \
add-apt-repository -y ppa:nginx/stable && \
apt-get update && \
apt-get install -y nginx && \
rm -rf /var/lib/apt/lists/* && \
echo "\ndaemon off;" >> /etc/nginx/nginx.conf && \
chown -R www-data:www-data /var/lib/nginx
# Define mountable directories.
VOLUME ["/etc/nginx/sites-enabled", "/etc/nginx/certs", "/etc/nginx/conf.d", "/var/log/nginx", "/var/www/html"]
# Define working directory.
WORKDIR /etc/nginx
# Define default command.
CMD ["nginx"]
# Expose ports.
EXPOSE 80
EXPOSE 443
We are certainly looking for this line: echo "\ndaemon off;" >> /etc/nginx/nginx.conf && \ . When running Nginx with the
daemon off; configuration, Nginx will run in the foreground and everything will be redirected to the screen (STDOUT).
If we examine the Apache Dockerfile, we will notice that the container executes a script file at startup. The
content of this script is:
#!/bin/bash
set -e
# Apache gets grumpy about PID files pre-existing
rm -f /usr/local/apache2/logs/httpd.pid
exec httpd -DFOREGROUND
The -DFOREGROUND option will allow Docker to get the Apache logs.
What if we have custom log files? Say we have our application write into a file called app.log inside the ./logs
folder. In this case, you need to redirect the app.log content to STDOUT or STDERR:
FROM ..
..
RUN ln -sf /dev/stdout logs/app.log
RUN ln -sf /dev/stderr logs/app-errors.log
..etc
Docker Logging Drivers
Docker can interface with other logging services like AWS CloudWatch, Fluentd, syslog, etc., and send all of the logs
to the remote service. When you use a logging driver, the native docker logs command is deactivated. These
are the supported logging drivers:
syslog: Writes logging messages to the syslog facility. The syslog daemon must be running on the host
machine.
journald: Writes log messages to journald. The journald daemon must be running on the host machine.
gelf: Writes log messages to a Graylog Extended Log Format (GELF) endpoint such as Graylog or Logstash.
fluentd: Writes log messages to fluentd (forward input). The fluentd daemon must be running on the host
machine.
awslogs: Writes log messages to Amazon CloudWatch Logs.
splunk: Writes log messages to splunk using the HTTP Event Collector.
etwlogs: Writes log messages as Event Tracing for Windows (ETW) events. Only available on Windows
platforms.
gcplogs: Writes log messages to Google Cloud Platform (GCP) Logging.
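Each driver is selected per container with --log-driver and configured with one or more --log-opt flags. As an illustrative sketch, here is the syslog driver pointed at a remote syslog server (the address is an assumption; adapt it to your environment):
# send the container's logs to a remote syslog server over UDP
docker run --log-driver=syslog --log-opt syslog-address=udp://192.168.1.10:514 -d nginx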
Using Fluentd Log Driver
The first thing that we need to do is install Fluentd on the host that will collect the logs. If your package manager is
RPM, you can use curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh
If you are using Ubuntu or Debian:
For Xenial,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent2.sh | sh
For Trusty,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
For Precise,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent2.sh | sh
For Lucid,
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-lucid-td-agent2.sh | sh
For Debian Jessie,
curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent2.sh | sh
For Debian Wheezy,
curl -L https://toolbelt.treasuredata.com/sh/install-debian-wheezy-td-agent2.sh | sh
For Debian Squeeze,
curl -L https://toolbelt.treasuredata.com/sh/install-debian-squeeze-td-agent2.sh | sh
In order to use Fluentd, create the configuration file; we are going to name it docker.conf :
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match *.*>
  @type stdout
</match>
You should now start Fluentd with the configuration file after adapting it to your needs:
fluentd -c docker.conf
2015-09-01 15:07:12 -0600 [info]: reading config file path="docker.conf"
2015-09-01 15:07:12 -0600 [info]: starting fluentd-0.12.15
2015-09-01 15:07:12 -0600 [info]: gem 'fluent-plugin-mongo' version '0.7.10'
2015-09-01 15:07:12 -0600 [info]: gem 'fluentd' version '0.12.15'
2015-09-01 15:07:12 -0600 [info]: adding match pattern="*.*" type="stdout"
2015-09-01 15:07:12 -0600 [info]: adding source type="forward"
2015-09-01 15:07:12 -0600 [info]: using configuration file:
  <source>
    @type forward
    port 24224
    bind 0.0.0.0
  </source>
  <match *.*>
    @type stdout
  </match>
2015-09-01 15:07:12 -0600 [info]: listening fluent socket on 0.0.0.0:24224
It is also possible to run Fluentd in a Docker container:
docker run -it -p 24224:24224 -v docker.conf:/fluentd/etc/docker.conf -e FLUENTD_CONF=docker.conf fluent/fluentd:latest
Now run
docker run --log-driver=fluentd ubuntu echo "Hello Fluentd!"
Fluentd could be on a different remote host, and in this case you should add --log-opt fluentd-address followed by the host address:
docker run --log-driver=fluentd --log-opt fluentd-address=192.168.1.10:24225 ubuntu echo "..."
If you are using multiple log types/files, you can tag each container/service with a different tag using fluentd-tag :
docker run --log-driver=fluentd --log-opt fluentd-tag=docker.{{.ID}} ubuntu echo "..."
Using AWS CloudWatch Log Driver
Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. You
can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically
react to changes in your AWS resources. AWS CloudWatch can also be used in collecting and centralizing Docker
logs.
In order to use CloudWatch, you need to allow the IAM user being used to execute these actions:
logs:CreateLogGroup
logs:CreateLogStream
logs:PutLogEvents
logs:DescribeLogStreams
e.g:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": [
        "arn:aws:logs:*:*:*"
      ]
    }
  ]
}
Now go to your AWS console, open the CloudWatch service, then create a log group and call it my-group . Click
on the created group and create a new stream, call it my-stream .
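If you prefer the command line to the console, the same group and stream can be created with the AWS CLI (assuming it is installed and configured with the right credentials):
aws logs create-log-group --log-group-name my-group
aws logs create-log-stream --log-group-name my-group --log-stream-name my-stream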
The awslogs logging driver sends your Docker logs to a specific region so you could use the awslogs-region log
option or the AWS_REGION environment variable to set the region. You should use the same region where you created
the log group/stream.
docker run --log-driver=awslogs --log-opt awslogs-region=us-east-1 ubuntu echo "..."
In order to add the log group/stream, execute:
docker run --log-driver=awslogs --log-opt awslogs-region=us-east-1 --log-opt awslogs-group=my-group --log-opt awslogs-stream=my-stream ubuntu echo "..."
You can configure the default logging driver by passing the --log-driver option to the Docker daemon:
dockerd --log-driver=awslogs
Chapter XI - Docker Debugging And Troubleshooting
Docker Daemon Logs
When there is a problem, one of the first things to do is check the Docker daemon logs. These logs
are accessible in different ways depending on your system:
OSX - ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/log/docker.log
Debian - /var/log/daemon.log
CentOS - Run grep docker /var/log/daemon.log
CoreOS - Run journalctl -u docker.service
Ubuntu upstart - /var/log/upstart/docker.log
Ubuntu systemd - Run journalctl -u docker.service command
Fedora - Run journalctl -u docker.service
Red Hat Enterprise Linux Server - Run grep docker /var/log/messages
OpenSuSE - Run journalctl -u docker.service
Boot2Docker - /var/log/docker.log
Windows - AppData\Local
Another way of troubleshooting the daemon is running it in the foreground. If Docker is already running, you should stop it before starting the daemon manually:
sudo dockerd
> INFO[0000] libcontainerd: new containerd process, pid: 9898
> WARN[0000] containerd: low RLIMIT_NOFILE changing to max current=1024 max=65536
> WARN[0001] failed to rename /var/lib/docker/tmp for background deletion: %!s(). Deleting synchronously
> INFO[0001] [graphdriver] using prior storage driver: aufs
> INFO[0001] Graph migration to content-addressability took 0.00 seconds
> WARN[0001] Your kernel does not support swap memory limit
> WARN[0001] Your kernel does not support cgroup rt period
> WARN[0001] Your kernel does not support cgroup rt runtime
> INFO[0001] Loading containers: start.
> INFO[0002] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
> INFO[0002] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]
> INFO[0002] IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
> INFO[0002] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]
> INFO[0002] IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]
> WARN[0002] Failed to allocate and map port 8000-8000: Bind for 0.0.0.0:8000 failed: port is already allocated
> WARN[0002] failed to cleanup ipc mounts:
> failed to umount /var/lib/docker/containers/d929a0878ea9282cd3eeb1ed65c1a6448e0a9da67b1dce2dba305c746ecc2371/shm: invalid argument
> ERRO[0002] Failed to start container d929a0878ea9282cd3eeb1ed65c1a6448e0a9da67b1dce2dba305c746ecc2371: driver failed programming external conne
Docker Debugging
In order to enable debugging, set the debug key to true in the daemon.json file. Generally, you will find this file under
/etc/docker . You may need to create this file if it does not yet exist.
{
  "debug": true
}
The possible log level values are debug , info , warn , error and fatal .
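For instance, a minimal daemon.json sketch combining debugging with an explicit log level (the log-level key is the one these values apply to):
{
  "debug": true,
  "log-level": "debug"
}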
Now send a HUP signal to the daemon to cause it to reload its configuration:
sudo kill -SIGHUP $(pidof dockerd)
or execute service docker stop && dockerd .
You will be able to see all of the actions that Docker is doing:
DEBU[0012] Registering routers
DEBU[0012] Registering GET, /containers/{name:.*}/checkpoints
DEBU[0012] Registering POST, /containers/{name:.*}/checkpoints
DEBU[0012] Registering DELETE, /containers/{name}/checkpoints/{checkpoint}
DEBU[0012] Registering HEAD, /containers/{name:.*}/archive
DEBU[0012] Registering GET, /containers/json
DEBU[0012] Registering GET, /containers/{name:.*}/export
DEBU[0012] Registering GET, /containers/{name:.*}/changes
DEBU[0012] Registering GET, /containers/{name:.*}/json
DEBU[0012] Registering GET, /containers/{name:.*}/top
DEBU[0012] Registering GET, /containers/{name:.*}/logs
DEBU[0012] Registering GET, /containers/{name:.*}/stats
DEBU[0012] Registering GET, /containers/{name:.*}/attach/ws
DEBU[0012] Registering GET, /exec/{id:.*}/json
DEBU[0012] Registering GET, /containers/{name:.*}/archive
DEBU[0012] Registering POST, /containers/create
DEBU[0012] Registering POST, /containers/{name:.*}/kill
DEBU[0012] Registering POST, /containers/{name:.*}/pause
DEBU[0012] Registering POST, /containers/{name:.*}/unpause
DEBU[0012] Registering POST, /containers/{name:.*}/restart
DEBU[0012] Registering POST, /containers/{name:.*}/start
DEBU[0012] Registering POST, /containers/{name:.*}/stop
DEBU[0012] Registering POST, /containers/{name:.*}/wait
DEBU[0012] Registering POST, /containers/{name:.*}/resize
DEBU[0012] Registering POST, /containers/{name:.*}/attach
DEBU[0012] Registering POST, /containers/{name:.*}/copy
DEBU[0012] Registering POST, /containers/{name:.*}/exec
DEBU[0012] Registering POST, /exec/{name:.*}/start
DEBU[0012] Registering POST, /exec/{name:.*}/resize
DEBU[0012] Registering POST, /containers/{name:.*}/rename
DEBU[0012] Registering POST, /containers/{name:.*}/update
DEBU[0012] Registering POST, /containers/prune
DEBU[0012] Registering PUT, /containers/{name:.*}/archive
DEBU[0012] Registering DELETE, /containers/{name:.*}
DEBU[0012] Registering GET, /images/json
DEBU[0012] Registering GET, /images/search
DEBU[0012] Registering GET, /images/get
DEBU[0012] Registering GET, /images/{name:.*}/get
DEBU[0012] Registering GET, /images/{name:.*}/history
DEBU[0012] Registering GET, /images/{name:.*}/json
DEBU[0012] Name To resolve: php.
DEBU[0012] Registering POST, /commit
DEBU[0012] Query php.[1] from 127.0.0.1:53704, forwarding to udp:127.0.1.1
DEBU[0012] Registering POST, /images/load
DEBU[0012] Registering POST, /images/create
DEBU[0012] Registering POST, /images/{name:.*}/push
DEBU[0012] Registering POST, /images/{name:.*}/tag
DEBU[0012] Registering POST, /images/prune
DEBU[0012] Registering DELETE, /images/{name:.*}
DEBU[0012] Registering OPTIONS, /{anyroute:.*}
DEBU[0012] Registering GET, /_ping
DEBU[0012] Registering GET, /events
DEBU[0012] Registering GET, /info
DEBU[0012] Registering GET, /version
DEBU[0012] Registering GET, /system/df
DEBU[0012] Registering POST, /auth
DEBU[0012] Registering GET, /volumes
DEBU[0012] Registering GET, /volumes/{name:.*}
DEBU[0012] Registering POST, /volumes/create
DEBU[0012] Registering POST, /volumes/prune
DEBU[0012] Registering DELETE, /volumes/{name:.*}
DEBU[0012] Registering POST, /build
DEBU[0012] Registering POST, /swarm/init
DEBU[0012] Registering POST, /swarm/join
DEBU[0012] Registering POST, /swarm/leave
DEBU[0012] Registering GET, /swarm
DEBU[0012] Registering GET, /swarm/unlockkey
DEBU[0012] Registering POST, /swarm/update
DEBU[0012] Registering POST, /swarm/unlock
DEBU[0012] Registering GET, /services
DEBU[0012] Registering GET, /services/{id}
DEBU[0012] Registering POST, /services/create
DEBU[0012] Registering POST, /services/{id}/update
DEBU[0012] Registering DELETE, /services/{id}
DEBU[0012] Registering GET, /services/{id}/logs
DEBU[0012] Registering GET, /nodes
DEBU[0012] Registering GET, /nodes/{id}
DEBU[0012] Registering DELETE, /nodes/{id}
DEBU[0012] Registering POST, /nodes/{id}/update
DEBU[0012] Registering GET, /tasks
DEBU[0012] Registering GET, /tasks/{id}
DEBU[0012] Registering GET, /tasks/{id}/logs
DEBU[0012] Registering GET, /secrets
DEBU[0012] Registering POST, /secrets/create
DEBU[0012] Registering DELETE, /secrets/{id}
DEBU[0012] Registering GET, /secrets/{id}
DEBU[0012] Registering POST, /secrets/{id}/update
DEBU[0012] Registering GET, /plugins
DEBU[0012] Registering GET, /plugins/{name:.*}/json
DEBU[0012] Registering GET, /plugins/privileges
DEBU[0012] Registering DELETE, /plugins/{name:.*}
DEBU[0012] Registering POST, /plugins/{name:.*}/enable
DEBU[0012] Registering POST, /plugins/{name:.*}/disable
DEBU[0012] Registering POST, /plugins/pull
DEBU[0012] Registering POST, /plugins/{name:.*}/push
DEBU[0012] Registering POST, /plugins/{name:.*}/upgrade
DEBU[0012] Registering POST, /plugins/{name:.*}/set
DEBU[0012] Registering POST, /plugins/create
DEBU[0012] Registering GET, /networks
DEBU[0012] Registering GET, /networks/
DEBU[0012] Registering GET, /networks/{id:.+}
DEBU[0012] Registering POST, /networks/create
DEBU[0012] Registering POST, /networks/{id:.*}/connect
DEBU[0012] Registering POST, /networks/{id:.*}/disconnect
DEBU[0012] Registering POST, /networks/prune
DEBU[0012] Registering DELETE, /networks/{id:.*}
Checking Docker Status
Checking if Docker is running can be done using service docker status or any other alternative like ps aux | grep
docker , ps -ef | grep docker , etc. It is also possible to use any Docker command, like docker info , in
order to see if Docker responds or not. In the negative case, your terminal will show something like:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
You can use other ways depending on your system tools, like systemctl is-active docker .
Debugging Containers
Whether you are using standalone Docker containers or managed services, it is possible to inspect the details of a
service or a container:
docker service inspect
docker inspect
Let's inspect this container docker run -it -p 8000:80 -d --name webserver nginx using the docker inspect webserver command.
This is the list of information the docker inspect command will give us:
[
{
"Id": "3..9",
"Created": "2017-05-27T23:58:20.848318438Z",
"Path": "nginx",
"Args": [
"-g",
"daemon off;"
],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 13730,
"ExitCode": 0,
"Error": "",
"StartedAt": "2017-05-27T23:58:21.257901373Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "sha256:3..7",
"ResolvConfPath": "/var/lib/docker/containers/3..9/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/3..9/hostname",
"HostsPath": "/var/lib/docker/containers/3..9/hosts",
"LogPath": "/var/lib/docker/containers/3..9/3..9-json.log",
"Name": "/webserver",
"RestartCount": 0,
"Driver": "aufs",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "docker-default",
"ExecIDs": null,
"HostConfig": {
"Binds": null,
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"NetworkMode": "default",
"PortBindings": {
"80/tcp": [
{
"HostIp": "",
"HostPort": "8000"
}
]
},
"RestartPolicy": {
"Name": "no",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"CapAdd": null,
"CapDrop": null,
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DeviceCgroupRules": null,
"DiskQuota": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": -1,
"OomKillDisable": false,
"PidsLimit": 0,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0
},
"GraphDriver": {
"Data": null,
"Name": "aufs"
},
"Mounts": [],
"Config": {
"Hostname": "37068ac691fb",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"ExposedPorts": {
"80/tcp": {}
},
"Tty": true,
"OpenStdin": true,
"StdinOnce": false,
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"NGINX_VERSION=1.13.0-1~stretch",
"NJS_VERSION=1.13.0.0.1.10-1~stretch"
],
"Cmd": [
"nginx",
"-g",
"daemon off;"
],
"ArgsEscaped": true,
"Image": "nginx",
"Volumes": null,
"WorkingDir": "",
"Entrypoint": null,
"OnBuild": null,
"Labels": {},
"StopSignal": "SIGQUIT"
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "4..0",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {
"80/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "8000"
}
]
},
"SandboxKey": "/var/run/docker/netns/4a0ab4faeb4e",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "8..b",
"Gateway": "172.17.0.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"MacAddress": "02:42:ac:11:00:02",
"Networks": {
"bridge": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "b..b",
"EndpointID": "8..b",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:02"
}
}
}
}
]
It is possible to get a single element, for example the IP address:
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' webserver
Or the port binding list:
docker inspect --format='{{range $p, $conf := .NetworkSettings.Ports}} {{$p}} -> {{(index $conf 0).HostPort}} {{end}}' webserver
Another way of debugging containers is executing debug commands inside the container, like docker exec -it
webserver ps aux or docker exec -it webserver cat /etc/resolv.conf , etc.
Using the docker stats and docker events commands could also give you some information when debugging.
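The same Go-template syntax works for any field shown in the inspect output above; for example, the container state or the path of its JSON log file:
# print the current state of the container ("running", "exited", ...)
docker inspect --format='{{.State.Status}}' webserver
# print the path of the container's json-file log on the host
docker inspect --format='{{.LogPath}}' webserver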
Troubleshooting Docker Using Sysdig
Sysdig is a Linux system exploration and troubleshooting tool with support for containers.
To install Sysdig automatically in one step, simply run the following command. This is the recommended
installation method.
curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | sudo bash
Then add your user to the sysdig group:
groupadd sysdig
usermod -aG sysdig $USER
Use visudo to edit the sudo-config. Add the line %sysdig ALL= /path/to/sysdig and save. The path is most likely
/usr/local/bin/sysdig , but you can make sure by running which sysdig.
Sysdig is an open source project and it can be used to get information about:
Networking
Containers
Application
Disk I/O
Processes and CPU usage
Performance and Errors
Security
Tracing
Debugging containers also means debugging the host, so sysdig can be used for general troubleshooting. What
interests us in this part are the container-related commands.
In order to list the running containers with their resource usage, use:
sudo csysdig -vcontainers
Listing all of the processes with container context can be done using
sudo csysdig -pc
To view the CPU usage of the processes running inside the my_container container, use:
sudo sysdig -pc -c topprocs_cpu container.name=my_container
Processes using the most network bandwidth can be checked using:
sudo sysdig -pc -c topprocs_net container.name=my_container
To view the top network connections:
sudo sysdig -pc -c topconns container.name=my_container
Files consuming the most I/O bytes can be checked using:
sudo sysdig -pc -c topfiles_bytes container.name=my_container
And to show all the interactive commands executed inside the my_container container, use:
sudo sysdig -pc -c spy_users container.name=my_container
Chapter XII - Orchestration - Docker Swarm
Docker Swarm
Docker Swarm is the solution that Docker Inc. developed to create an orchestration tool comparable to Google's Kubernetes. It
provides native clustering capabilities to turn a group of Docker engines into a single, virtual Docker Engine.
Distributed applications require compute resources that are also distributed, and that's why Docker Swarm was
introduced. You can use it to manage pooled resources in order to scale out an application as if it was running on a
single, huge computer.
Before Docker Engine 1.12, Docker Swarm had to be integrated with a key-value store and a service discovery tool, but since
this version, Docker Swarm integrates these tools and can be used without them.
Swarm Features
Docker is under continuous active development and is changing a lot; Swarm mode introduced many new features that
solved many problems. As described on its official website, Docker Swarm serves the standard Docker API, so
any tool which already communicates with a Docker daemon can use Docker Swarm to transparently scale to
multiple hosts: Dokku, Docker Compose, Krane, Flynn, Deis, DockerUI, Shipyard, Drone, Jenkins and, of course, the
Docker client itself. It also includes the ability to pull from private repositories or the public Docker Hub.
Swarm mode is a built-in native solution for Docker; you can use Docker networking, volumes and plugins through
their respective Docker commands via this mode.
Its scheduler has useful filters like node tags and affinity, and strategies like spread, binpack, etc. These filters assign
containers to the underlying nodes to optimize performance and resource utilization.
Swarm is production ready and, according to Docker Inc., it is tested to scale up to 1,000 nodes and
50,000 containers with no performance degradation in spinning up incremental containers onto the node cluster.
A stress test done by Docker to spin up 1,000 nodes and 30,000 containers managed by 1 Swarm manager gave these
results:

Percentile   API Response Time   Scheduling Delay
50th         150ms               230ms
90th         200ms               250ms
99th         360ms               400ms
During this test, Consul was used as a discovery backend, every node hosted 30 containers (1,000 nodes), the
manager was an EC2 m4.xlarge (4 CPUs, 16GB RAM) machine and nodes were EC2 t2.micro (1 CPU, 1 GB RAM)
machines and container images were using ubuntu 14.04.
Here is what was published in the blog post introducing the results:
We wanted to stress test a single Swarm manager, to see how capable it would be, so we used one Swarm
manager to manage all our nodes. We placed fifty containers per node. Commands were run 1,000 times
against Swarm and we generated percentiles for 1) API Response time and 2) Scheduling delay. We found
that we were able to scale up to 1,000 nodes running 30,000 containers. 99% of the time each container took
less than half a second to launch. There was no noticeable difference in the launch time of the 1st and 30,000th
container. We used docker info to measure API response time, and then used docker run -dit ubuntu bash to
measure scheduling delay.
Another serious collaborative test, called Swarm2k, was done. This test used Docker 1.12; a total of 2,384
servers were part of the Swarm cluster and there were 3 managers.
To achieve such a big number of nodes, Docker ensures a highly available Swarm manager. You can create multiple
Swarm managers and specify policies on leader election in case the primary manager experiences a failure.
Swarm comes with a built-in scheduler, but you can easily plug in the Mesos or Kubernetes backend while still using
the Docker client. To find nodes in your cluster, Docker Swarm can use a hosted discovery service, a static file,
etcd, Consul or ZooKeeper.
Installation
Nothing is different from the default Docker installation; since Swarm mode is a built-in feature, you just need to
install Docker:
curl -fsSL https://get.docker.com/ | sh
You can use docker -v to see the installed version; in all cases, if it is greater than 1.12, you should have the Swarm
feature integrated into the Docker engine.
The Raft Consensus Algorithm, Swarm Managers & Best Practices
In a Docker cluster, you should have at least one manager. One of the things that I was missing when I started
experimenting with Docker is that the number of managers should not be equal to 2. Then I understood that 1 or
3 managers is better. In fact, when running with two managers, you double the chance of a manager failure.
Swarm manager nodes use the Raft Consensus Algorithm to manage the swarm state.
Raft achieves consensus via an elected leader. A server in a raft cluster is either:
a leader
a candidate
or a follower
The leader is responsible for log replication to the followers. Using heartbeat messages, the leader regularly informs
the followers of its existence.
Each follower has a timeout in which it expects the heartbeat from the leader. The timeout is reset on receiving the
heartbeat (it is typically between 150 and 300ms). In the case when no heartbeat is received, the follower changes its
status to candidate and starts a new leader election. The election starts by increasing the term counter and sending a
RequestVote message to all other servers that will vote (only once) for the first candidate that sends them this
RequestVote message.
Three scenarios are possible in the last case:
If the candidate receives a message from a leader with a term number equal to or larger than the current term,
then its election is defeated and the candidate changes into a follower.
If a candidate receives a majority of votes, then it becomes the new leader.
If neither happens, a new leader election starts after a timeout.
Raft tolerates up to (N-1)/2 failures and requires a majority or quorum (also called a majority of managers) of
(N/2)+1 members to agree on values proposed to the cluster.
Raft will not tolerate the case where, among 5 managers, 3 nodes are unavailable: the system will not process any
more requests.
In all cases, you should maintain an odd number of managers in the swarm to support manager node failures; it
ensures a higher chance of quorum availability.
Number Of Nodes (N)   Quorum Majority (N/2)+1   Fault Tolerance (N-1)/2
1                     1                         0
2                     2                         0
3                     2                         1
4                     3                         1
5                     3                         2
6                     4                         2
You don't really need to know how Raft works, but you need to know that having an odd number of managers is a
must. You also need to know that, in order to maximize the availability of your nodes, you should think about
distributing manager nodes across a minimum of 3 availability zones. You may have fewer managers and a single
availability zone; it will work fine, but you reduce your ability to tolerate data center failures. If you want to
deploy a highly available, fault-tolerant system, this table describes how you should distribute managers
across 3 availability zones:
Number Of Node Managers   Repartition Of Managers Across 3 Availability Zones
3                         1-1-1
5                         2-2-1
7                         3-2-2
9                         3-3-3
11                        4-4-3
Another thing to know is that a node running as a swarm manager is not different from a non-manager node (a
worker), so it is fine to have a swarm cluster with only managers, without any workers. In fact, manager nodes act
like worker nodes by default: the Swarm scheduler can assign tasks to a manager node. If you have a small swarm
cluster, managers can be assigned to execute tasks with lower risk.
You can also restrict the role of a manager so it acts only as a manager and not a worker. Draining a manager
node makes it unavailable as a worker node:
docker node update --availability drain <NODE-ID>
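To bring a drained manager back into scheduling, set its availability to active again (<NODE-ID> is a placeholder for the node's ID or hostname as listed by docker node ls):
docker node update --availability active <NODE-ID>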
It may be evident, but you should assign a static IP to each of your Swarm managers. A worker, in contrast, can
have a dynamic IP (since it will be discovered), but workers and managers should, in all cases, be able to
communicate together over the network.
The following ports must be available:
TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
TCP and UDP port 4789 for overlay network traffic
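As an illustrative sketch, on a host protected by UFW these ports could be opened like this (adapt to whatever firewall you actually run):
ufw allow 2377/tcp   # cluster management
ufw allow 7946/tcp   # node-to-node communication
ufw allow 7946/udp
ufw allow 4789/tcp   # overlay network traffic
ufw allow 4789/udp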
Creating Swarm Managers & Workers
In order to create a manager, you should have Docker installed on the node, then use the swarm initialization
command:
docker swarm init --advertise-addr <IP|interface>[:port]
If you want to customize your security, you can use these options:
--cert-expiry duration      Validity period for node certificates (ns|us|ms|s|m|h) (default 2160h0m0s)
--external-ca external-ca   Specifications of one or more certificate signing endpoints
Other options can be used to change the heartbeat duration, the number of log entries between Raft snapshots or the
task history retention limit:

--autolock                        Enable manager autolocking (requiring an unlock key to start a stopped manager)
--dispatcher-heartbeat duration   Dispatcher heartbeat period (ns|us|ms|s|m|h) (default 5s)
--max-snapshots uint              Number of additional Raft snapshots to retain
--snapshot-interval uint          Number of log entries between Raft snapshots (default 10000)
--task-history-limit int          Task history retention limit (default 5)
When losing the quorum, you can use the following option to bring back the failed node online:

--force-new-cluster   Force create a new cluster from current state
Example: say our eth0 interface has 138.197.35.0 as an IP address. To create a cluster with a first manager, type:
docker swarm init --force-new-cluster --advertise-addr 138.197.35.0
This will show an instruction to execute a command in order to add a worker:
Swarm initialized: current node (vhvebboq9fp4j62w83dujxzjr) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-5b54vz0sie1li0ijr0epkhyjvmbbh2pg746skh8ba5674g1p6x-cmgrvib1disaeq08x8a5ln7zo \
138.197.35.0:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
If you want to add a worker to this cluster, create a new server and execute:
docker swarm join --token SWMTKN-1-5b54vz0sie1li0ijr0epkhyjvmbbh2pg746skh8ba5674g1p6x-cmgrvib1disaeq08x8a5ln7zo 138.197.35.0:2377
You should of course have port 2377 open. But if you want to add a new manager, you should execute the
following command:
docker swarm join-token manager
Docker will generate a new command that you can execute on a second manager:
docker swarm join --token SWMTKN-1-5b54vz0sie1li0ijr0epkhyjvmbbh2pg746skh8ba5674g1p6x-354t5zlz0re5kh1jqcliofecs 138.197.35.0:2377
Now you can get a list of all the available workers and managers in your cluster by typing:
docker node ls
I have a single node in my cluster, and of course it is a manager. What I see is the following result:

ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
vhvebboq9fp4j62w83dujxzjr *   swarm-1    Ready    Active         Leader
Deploying Services
Creating A Container
Nothing is really different from what you learned before: you should create an image, build it, create a container,
tag it, commit it, push it, etc., then you can use your image to create the container.
For the sake of simplicity, I created a container based on Alpine Linux that executes an infinite loop.
This is the Dockerfile:
FROM alpine
ENTRYPOINT tail -f /dev/null
I built it:
docker build -t eon01/infinite .
Sending build context to Docker daemon 11.78 kB
Step 1/2 : FROM alpine
latest: Pulling from library/alpine
0a8490d0dfd3: Pull complete
Digest: sha256:dfbd4a3a8ebca874ebd2474f044a0b33600d4523d03b0df76e5c5986cb02d7e8
Status: Downloaded newer image for alpine:latest
---> 88e169ea8f46
Step 2/2 : ENTRYPOINT tail -f /dev/null
---> Running in 986f0fd1f5f9
---> d1400705c370
Removing intermediate container 986f0fd1f5f9
Successfully built d1400705c370
Run it:
docker run -it --name infinite -d eon01/infinite
fc476e6f8312492b2fd9cb620c8eaf5115c2e18a52d95160849436087dec2b68
And it should keep running because of the infinite tail -f /dev/null :
docker ps
CONTAINER ID   IMAGE            COMMAND                  CREATED         STATUS         PORTS   NAMES
fc476e6f8312   eon01/infinite   "/bin/sh -c 'tail ..."   3 seconds ago   Up 2 seconds           infinite
You can use this image directly from my public Docker Hub:
docker run -it --name infinite -d eon01/infinite
But this is not actually how we use Swarm. The common usage of Swarm is creating services first, before thinking in
terms of containers.
Creating & Configuring Swarm Services
In order to create a service, you can use the docker service create command:
docker service create
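For example, a minimal service based on the image built above; the service name and replica count here are illustrative:
# run two replicas of the infinite image as a Swarm service
docker service create --name infinite --replicas 2 eon01/infinite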