
Docker networking

Environment used in this example:

  • Host: OS X 10.10.5
  • CoreOS: alpha (1000.0.0)
  • Docker: version 1.10.3

Part 1

Hello fellow Docker enthusiasts! I am going to lay out some of the networking principles of Docker here, with concrete examples and detailed steps so you can see it all for yourself.

Throughout the article I am going to use CoreOS on Vagrant as the host for the Docker demo. The reason I am choosing CoreOS here is that it comes with Docker pre-installed and takes about 5 minutes to start up, which makes it a perfect playground for Docker.

You can choose to run it on one of the cloud providers (like Amazon EC2), on bare-metal servers, or on a virtualization provider (like Vagrant, which we will use in this example). You can find more details on how to quick-start with CoreOS in the official CoreOS documentation.

Feel free to skip the first section if you already have CoreOS running locally; otherwise, follow the instructions and you will have a running OS with Docker in no time.

Getting CoreOS

Independent of the platform you are on, whether it is Windows, Linux or OS X, you should be able to download and install Vagrant, a simple virtual machine manager that works on top of virtualization providers like VirtualBox (we need v4.3.10+), which you can download from the VirtualBox website.

We will check out the Git project containing the CoreOS Vagrantfile that will be used to start up the image:

git clone https://github.com/coreos/coreos-vagrant.git
cd coreos-vagrant

As we will only need one instance of CoreOS to play around with Docker, go ahead and copy the provided config.rb.sample file to config.rb and set $num_instances to 1.

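From the command line, that step could look roughly like this (a sketch; it assumes the sample already contains an uncommented $num_instances line and a GNU sed, so on OS X you may prefer to just edit the file by hand):

```bash
# start from the provided sample configuration
cp config.rb.sample config.rb
# shrink the cluster to a single CoreOS instance
sed -i 's/^\$num_instances=.*/$num_instances=1/' config.rb
```

Either way, the resulting config.rb should start like this:
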
coreos-vagrant nisabek$ head config.rb
# Size of the CoreOS cluster created by Vagrant
$num_instances=1

# Used to fetch a new discovery token for a cluster of size $num_instances

You will also need to create a minimal user-data file. Start by copying the user-data.sample file to user-data and replacing <token> with your own URL obtained from https://discovery.etcd.io/new (you can find details about what this token is for in the etcd discovery documentation).
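
Roughly, that could look like this (a sketch; it assumes the sample file contains the usual https://discovery.etcd.io/<token> placeholder, so adjust the sed pattern if your copy differs):

```bash
# start from the provided sample cloud-config
cp user-data.sample user-data
# request a fresh discovery URL for a one-node cluster
DISCOVERY_URL=$(curl -s "https://discovery.etcd.io/new?size=1")
# drop it into user-data in place of the <token> placeholder
sed -i "s|https://discovery.etcd.io/<token>|${DISCOVERY_URL}|" user-data
```
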
We are ready to run

vagrant up

which will bring up the single CoreOS instance. You can verify it with vagrant status:

```bash
coreos-vagrant nisabek$ vagrant status
Current machine states:

core-01                   running (virtualbox)
```

There is no need to add an SSH key, since Vagrant will automatically generate and use its own SSH key; any keys you add will be overwritten. So we can go ahead and enter our box with:

vagrant ssh core-01

Docker networking

Now that we have our playground set up, we can go onto the box and first have a look at the existing network state.

coreos-vagrant nisabek$ vagrant ssh core-01
Last login: Fri Mar  4 02:01:31 2016 from 10.0.2.2
CoreOS alpha (949.0.0)
core@core-01 ~ $

Let’s first have a look at which Docker processes we have running on CoreOS:

ps -ef | grep docker
root       759     1  0 Mar05 ?        00:00:18 docker daemon --host=fd:// --bridge=none --iptables=false --ip-masq=false --graph=/var/lib/early-docker --pidfile=/var/run/early-docker.pid --selinux-enabled
root      1389     1  0 Mar05 ?        00:01:48 docker daemon --host=fd:// --selinux-enabled

You can see the Docker daemon running with the default configuration. Let us ignore the SELinux part for now; we will get back to it in detail in the next posts. For now, just know that it is responsible for keeping the containers’ userland safe.

The default behavior of the Docker daemon can be configured; we will look into the hows and whys of that next time.
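
Just as a teaser, here is a hypothetical example of such a tweak (not something we need in this article): the daemon’s --bip flag pins docker0 to a specific address and subnet instead of letting Docker pick one.

```bash
# hypothetical: start the daemon with a fixed docker0 address/subnet
docker daemon --bip=172.30.0.1/24
```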

Quoting the documentation here:

When Docker starts, it creates a virtual interface named docker0 on the host machine. It randomly chooses an address and subnet from the private range defined by RFC 1918 that are not in use on the host machine, and assigns it to docker0.

Indeed, let’s run ifconfig to see that:

ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.18.0.1  netmask 255.255.0.0  broadcast 0.0.0.0
        inet6 fe80::42:69ff:fe05:9a18  prefixlen 64  scopeid 0x20<link>
        ether 02:42:69:05:9a:18  txqueuelen 0  (Ethernet)
        RX packets 2576  bytes 122931 (120.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2389  bytes 10853044 (10.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

If we also explore iptables for the NAT configuration at this point, we will see something like this:

sudo iptables -t nat  -L -n
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.18.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0

As you can see, Docker has randomly chosen 172.18.0.0/16 as the private range for our future containers.

You can read up on IP masquerading in depth, but the idea is to allow communication between the containers and the outside world (the internet): outgoing traffic has to be rewritten to use the real host’s address, because the containers themselves do not have publicly routable IP addresses.
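
A quick way to convince yourself of that (a sketch; any external address will do, 8.8.8.8 is just an example):

```bash
# the container only has a private 172.18.x.x address, yet the outside world is reachable,
# because the MASQUERADE rule above rewrites the packets on their way out through the host
docker run --rm busybox ping -c 1 8.8.8.8
```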

Now let’s see what happens when we start up a Docker container. As an example I will be using busybox, a small executable containing tiny versions of many common UNIX utilities, including ifconfig, which is what we need to explore the internals of containers. You can read up on BusyBox in general if you are curious.

Let’s first run a busybox instance in interactive mode. If it’s the first time you are running it, Docker will pull the latest image tag from the registry and start the binary in interactive mode:

docker run -it busybox
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox

f810322bba2c: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:97473e34e311e6c1b3f61f2a721d038d1e5eef17d98d1353a513007cf46ca6bd
Status: Downloaded newer image for busybox:latest
/ #

Running ifconfig inside will show you

/ # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:12:00:02
          inet addr:172.18.0.2  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:acff:fe12:2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:648 (648.0 B)  TX bytes:738 (738.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

As you can see, the IP address was chosen from the 172.18.0.0/16 range (remember the host’s docker0 had the address 172.18.0.1).

It’s more interesting to see what’s happening on the host when we start up a container with a mapped port. In order to detach from the container, use CTRL-p CTRL-q. You can always go back to the instance by running docker attach $(docker ps --format "{{.ID}}").

Now let’s also modify the startup command to include a port mapping. I’m going to remove the old container and start a new one that maps the host’s port 8080 to the container’s port 80:

docker rm -f $(docker ps --format "{{.ID}}")
08ac561ee00c
docker run -p 8080:80 -it busybox
0ca8aec07afc0c7a27e9e25215637a0fe363825e40256e3da4ec45bb5c89dc35

Let’s first examine the host

ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.18.0.1  netmask 255.255.0.0  broadcast 0.0.0.0
        inet6 fe80::42:69ff:fe05:9a18  prefixlen 64  scopeid 0x20<link>
        ether 02:42:69:05:9a:18  txqueuelen 0  (Ethernet)
        RX packets 2727  bytes 133467 (130.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2398  bytes 10853790 (10.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::a00:27ff:fe68:4567  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:68:45:67  txqueuelen 1000  (Ethernet)
        RX packets 448104  bytes 479663021 (457.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 184747  bytes 12243936 (11.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.8.101  netmask 255.255.255.0  broadcast 172.17.8.255
        inet6 fe80::a00:27ff:fea7:2958  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:a7:29:58  txqueuelen 1000  (Ethernet)
        RX packets 40889  bytes 8575414 (8.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 289140  bytes 16477491 (15.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 1421877  bytes 134182083 (127.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1421877  bytes 134182083 (127.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth35230c2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::40cb:4fff:fe0b:b9b0  prefixlen 64  scopeid 0x20<link>
        ether 42:cb:4f:0b:b9:b0  txqueuelen 0  (Ethernet)
        RX packets 9  bytes 738 (738.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9  bytes 738 (738.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

As you can see, besides the previously explored docker0 Ethernet bridge, we now have a veth35230c2 virtual interface.

The docker0 bridge automatically forwards packets between any network interfaces attached to it. So every time a new container is created, Docker creates a pair of interfaces: one end acts as the eth0 interface inside the container, and the other end is the veth* interface we see on the host. Docker also binds the newly created veth* interface to the docker0 bridge, so a virtual subnet shared between the host and the container is created.
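
You can verify that pairing yourself; a minimal sketch (the interface name will differ on your machine, and the ip variant assumes a reasonably recent iproute2):

```bash
# the host-side veth* ends enslaved to the docker0 bridge show up in sysfs...
ls /sys/class/net/docker0/brif/
# ...or can be listed via iproute2
ip link show master docker0
```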

Now let’s see how iptables is affected by the port mapping:

sudo iptables -t nat  -L -n
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.18.0.0/16        0.0.0.0/0
MASQUERADE  tcp  --  172.18.0.2           172.18.0.2           tcp dpt:80

Chain DOCKER (2 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.18.0.2:80

So everything sent to port 8080 on the host will be forwarded to port 80 inside the container.
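
A quick way to see this forwarding in action (a sketch using busybox’s built-in httpd, serving /etc just so there is something to fetch; we use host port 8081 here because 8080 is already taken by the container started above):

```bash
# run a tiny web server on port 80 inside a second container, published on the host's port 8081
docker run -d -p 8081:80 busybox httpd -f -p 80 -h /etc
# hit the host side of the mapping; the corresponding DNAT rule forwards the request into the container
curl http://localhost:8081/hostname
```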

Now, this is all the default behavior of Docker containers, backed by Docker’s network feature. Let’s see how we can explore and customize this behavior. First, let’s use the docker network command to list the available networks:

docker network ls
NETWORK ID          NAME                DRIVER
799fdd32bcc7        bridge              bridge
399e54f67b58        none                null
360493df1da0        host                host

The first, bridge, is the docker0 Ethernet bridge we explored previously; as we saw, by default the Docker daemon connects all containers to this network.

The none network, as you could guess from the name, will not create any network interface for the container. The host network will make your container’s network configuration identical to the one on your host.

In order to change the default behavior of the containers, we need to use the --net flag on the run command.

docker run --net=none -it busybox
/ # ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

So you can use this configuration to run containers which do not require an internet connection.

docker run --net=host -it busybox
/ # ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:69:05:9A:18
          inet addr:172.18.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:69ff:fe05:9a18/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:2744 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2398 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:134675 (131.5 KiB)  TX bytes:10853790 (10.3 MiB)

eth0      Link encap:Ethernet  HWaddr 08:00:27:68:45:67
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe68:4567/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:450052 errors:0 dropped:0 overruns:0 frame:0
          TX packets:185932 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:479806667 (457.5 MiB)  TX bytes:12369310 (11.7 MiB)

eth1      Link encap:Ethernet  HWaddr 08:00:27:A7:29:58
          inet addr:172.17.8.101  Bcast:172.17.8.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fea7:2958/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:40892 errors:0 dropped:0 overruns:0 frame:0
          TX packets:293814 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8575744 (8.1 MiB)  TX bytes:16673799 (15.9 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:1448465 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1448465 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:136688254 (130.3 MiB)  TX bytes:136688254 (130.3 MiB)

You can see that in this case the container’s network configuration is identical to our host’s. This can be useful for debugging or testing purposes; however, for live web deployments you would want to configure your own networks and isolate containers into subnets.
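
One concrete illustration of that (a sketch, again with busybox’s httpd): in host mode the container binds directly to the host’s ports, so no -p mapping and no DNAT rule are involved.

```bash
# httpd inside the container listens straight in the host's network namespace
docker run -d --net=host busybox httpd -f -p 8082 -h /etc
# reachable on the host without any port mapping
curl http://localhost:8082/hostname
```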

Let’s go back to the default bridge configuration. We can inspect the details of the network using:

core@core-01 ~ $ docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "799fdd32bcc7c49c911a5f7dbf9faa5308c959920fe757b72b78d8b6257c4a94",
        "Scope": "local",
        "Driver": "bridge",
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16"
                }
            ]
        },
        "Containers": {
            "0ca8aec07afc0c7a27e9e25215637a0fe363825e40256e3da4ec45bb5c89dc35": {
                "Name": "backstabbing_yalow",
                "EndpointID": "910d63f15f2326c5880f82d1718aad79ddc591b8b9c49ab5604177e649b91afe",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        }
    }
]

You can see the chosen subnet, the containers currently attached to this bridge, as well as the options of the bridge.
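
If you only need a specific field, docker inspect with a Go template can pull it out directly; a sketch (the template path assumes the NetworkSettings.Networks layout of this Docker version, and backstabbing_yalow is just the auto-generated container name from the output above):

```bash
# print only the bridge-network IP address of the running container
docker inspect -f '{{ .NetworkSettings.Networks.bridge.IPAddress }}' backstabbing_yalow
```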

So, we have looked at the very basics of networking in Docker. In the follow-up article, we’ll set up our own network, connect containers to it, and achieve both isolation and communication between containers.
