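By default, Docker exposes containers to the outside world through port mapping. Suppose you start a MySQL container: inside the container there is a virtual IP, by default in a range such as 172.17.x.x. The Docker host can ping this address, but no other machine can reach this virtual network, whether or not it runs Docker, because the addresses exist only on the host's own virtual bridge and nothing outside the host has a route to them. Containers are therefore reached through port mapping: a port on the host is mapped to a port on the container's virtual IP, and clients simply connect to the host port.
For example: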
e91703882bd0 registry "/entrypoint.sh /etc 11 days ago Up 7 minutes 0.0.0.0:5000->5000/tcp hopeful_gates
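To reach port 5000 of this container you actually connect to port 5000 on the Docker host, because the container port is bound to the host port. This is basic Docker, so I won't go into more detail.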
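As a small illustration of how to read the PORTS column, here is a minimal Python sketch that splits an entry such as 0.0.0.0:5000->5000/tcp into its parts. The function name parse_port_mapping is made up for this example, and the pattern only covers the single IPv4 form shown above.

```python
import re

def parse_port_mapping(ports: str):
    """Split one `docker ps` PORTS entry into (host_ip, host_port, container_port, proto)."""
    pattern = (
        r"(?P<host_ip>[\d.]+):(?P<host_port>\d+)"
        r"->(?P<container_port>\d+)/(?P<proto>\w+)"
    )
    m = re.fullmatch(pattern, ports)
    if m is None:
        raise ValueError(f"unrecognized PORTS entry: {ports!r}")
    return m["host_ip"], int(m["host_port"]), int(m["container_port"]), m["proto"]

host_ip, host_port, container_port, proto = parse_port_mapping("0.0.0.0:5000->5000/tcp")
# The left side is what clients connect to on the host,
# the right side is the port inside the container.
print(f"host {host_ip}:{host_port}/{proto} -> container port {container_port}")
```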
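So here is the problem: when Kubernetes manages the Docker containers, relying on port mapping everywhere becomes very painful. Each node may run hundreds of containers, and every one of them would need its own port mapping. If the virtual IPs could be reached directly, just like real IPs, internal calls could simply use IP and port, which is far more convenient.
flannel is a component that solves exactly this problem. Let's first look at how the official documentation describes flannel: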
Platforms like Kubernetes assume that each container (pod) has a unique, routable IP inside the cluster. The advantage of this model is that it removes the port mapping complexities that come from sharing a single host IP.
Flannel is responsible for providing a layer 3 IPv4 network between multiple nodes in a cluster. Flannel does not control how containers are networked to the host, only how the traffic is transported between hosts. However, flannel does provide a CNI plugin for Kubernetes and a guidance on integrating with Docker.
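The first sentence makes the point: every container (pod) gets its own unique, routable IP inside the cluster, which removes the port mapping complexity that comes from sharing a single host IP. That is what flannel is for.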
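1. Install flannel. It only needs to be installed on the Kubernetes worker nodes. Download the latest release from the official site and unpack it.
flannel is given one network range for all Docker hosts to share and automatically allocates a subnet of it to each host. flannel keeps this network information in etcd. The default key is /coreos.com/network/config; in short, flannel reads this key from etcd to obtain the network definition.
Write the network definition to etcd. The command below uses etcdctl's v2 syntax (mk, --ca-file and so on); with a recent etcdctl that defaults to the v3 API you may need to set ETCDCTL_API=2 first: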
/data/etcd/etcdctl --ca-file=/data/etcd/cert/ca.pem --cert-file=/data/etcd/cert/peer.pem --key-file=/data/etcd/cert/peer-key.pem --endpoints=https://10.203.3.96:2379,https://10.203.0.46:2379,https://10.203.0.43:2379 mk /coreos.com/network/config '{ "Network": "172.17.0.0/16", "Backend": {"Type": "vxlan"}}'
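To get a feel for what this configuration means, the following Python sketch parses the same JSON and counts how many per-host subnets fit into the network. It assumes flannel's default SubnetLen of 24, since the config above does not set it; the variable names are illustrative only, and flannel may not hand out every one of these blocks.

```python
import ipaddress
import json

# Same value that the etcdctl command above stores under /coreos.com/network/config.
config_json = '{ "Network": "172.17.0.0/16", "Backend": {"Type": "vxlan"}}'
config = json.loads(config_json)

network = ipaddress.ip_network(config["Network"])
# Assumption: SubnetLen is not set in the config, so the default of 24 applies.
subnet_len = config.get("SubnetLen", 24)

# Every /24 block inside the /16 is a candidate subnet for one host.
node_subnets = list(network.subnets(new_prefix=subnet_len))
print(f"{network} contains {len(node_subnets)} blocks of size /{subnet_len}")
print("backend type:", config["Backend"]["Type"])
```

Each host that runs flanneld leases one of these blocks, which is why the hosts later in this article end up with different 172.17.x.0/24 ranges.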
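Note: there is a catch here. The flannel version used in this article talks to etcd through the v2 API and cannot use the v3 API. Many environments now run a recent etcd 3.x release, where the v2 API is disabled by default, so etcd has to be started with --enable-v2, like this: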
/data/etcd/etcd --name=etcd-node2 --data-dir=/data/etcd/data --listen-client-urls=https://10.203.3.96:2379 --listen-peer-urls=https://10.203.3.96:2380 --advertise-client-urls=https://10.203.3.96:2379 --initial-advertise-peer-urls=https://10.203.3.96:2380 --initial-cluster=etcd-node1=https://10.203.0.46:2380,etcd-node2=https://10.203.3.96:2380,etcd-node3=https://10.203.0.43:2380 --initial-cluster-state=new --peer-key-file=/data/etcd/cert/peer-key.pem --peer-cert-file=/data/etcd/cert/peer.pem --key-file=/data/etcd/cert/peer-key.pem --cert-file=/data/etcd/cert/peer.pem --client-cert-auth --trusted-ca-file=/data/etcd/cert/ca.pem --peer-client-cert-auth --peer-trusted-ca-file=/data/etcd/cert/ca.pem --enable-v2
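If you are running etcd 3 or later, stop etcd, add --enable-v2 and start it again. Otherwise flannel cannot be used, and it fails with an error along the lines of "can't get config".
2. Start flannel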
./flanneld --ip-masq --etcd-endpoints=https://10.203.0.43:2379,https://10.203.3.96:2379,https://10.203.0.46:2379 --etcd-cafile=/data/etcd/cert/ca.pem --etcd-certfile=/data/etcd/cert/peer.pem --etcd-keyfile=/data/etcd/cert/peer-key.pem &
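3. Generate subnet.env; Docker will be started with the options in this file. mk-docker-opts.sh converts the subnet that flannel leased for this host into dockerd options: -k sets the name of the combined variable and -d sets the output file. Note that flanneld by default writes its own lease information to /run/flannel/subnet.env, and -d points the script's output at the same path, so after this command the file holds the Docker options shown below.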
/data/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/subnet.env
Let's look at what subnet.env contains:
[root@oyoshbddwatlasyprd1 flannel]# cat /run/flannel/subnet.env
DOCKER_OPT_BIP="--bip=172.17.83.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.17.83.1/24 --ip-masq=false --mtu=1450"
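As you can see, these values were not written by hand. flannel allocated 172.17.83.0/24 to this host out of the 172.17.0.0/16 network defined in etcd, so the subnet differs from host to host. --bip=172.17.83.1/24 tells Docker to use this subnet for its bridge, and Docker has to be started with these options.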
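The options above can be sanity-checked with a few lines of Python. This is only a sketch: parse_env is a hypothetical helper that handles just the simple KEY="value" lines shown here, and the 50-byte figure is the usual VXLAN encapsulation overhead on an IPv4 underlay (outer IP 20 + UDP 8 + VXLAN 8 + inner Ethernet 14), which explains the MTU of 1450 on a 1500-byte link.

```python
import ipaddress

SUBNET_ENV = '''\
DOCKER_OPT_BIP="--bip=172.17.83.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.17.83.1/24 --ip-masq=false --mtu=1450"
'''

def parse_env(text):
    """Parse simple KEY="value" lines into a dict (hypothetical helper)."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key] = value.strip('"')
    return env

env = parse_env(SUBNET_ENV)
# Turn "--bip=... --ip-masq=... --mtu=..." into {"bip": ..., "ip-masq": ..., "mtu": ...}.
opts = dict(opt.lstrip("-").split("=", 1) for opt in env["DOCKER_NETWORK_OPTIONS"].split())

bridge = ipaddress.ip_interface(opts["bip"])
flannel_network = ipaddress.ip_network("172.17.0.0/16")
print("docker0 address:", bridge.ip)
print("host subnet:", bridge.network)
print("inside flannel network:", bridge.network.subnet_of(flannel_network))

# VXLAN overhead on IPv4: outer IP + UDP + VXLAN header + inner Ethernet header.
VXLAN_OVERHEAD = 20 + 8 + 8 + 14
print("expected MTU on a 1500-byte link:", 1500 - VXLAN_OVERHEAD)
```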
4. Edit docker.service so that Docker starts with the options from subnet.env
#ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
EnvironmentFile=/run/flannel/subnet.env
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
The commented-out first line is the default start command. Docker is now started with the DOCKER_NETWORK_OPTIONS variable loaded from subnet.env.
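One detail worth knowing is that systemd splits an unbraced $VARIABLE in ExecStart on whitespace, so the single DOCKER_NETWORK_OPTIONS string becomes three separate dockerd arguments. The Python sketch below imitates that expansion for this one case; it is a simplified model for illustration, not systemd's actual parser.

```python
import shlex

env_line = 'DOCKER_NETWORK_OPTIONS=" --bip=172.17.83.1/24 --ip-masq=false --mtu=1450"'
exec_start = "/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS"

# Read KEY="value" and drop the surrounding quotes.
key, _, raw = env_line.partition("=")
value = shlex.split(raw)[0]

argv = []
for word in exec_start.split():
    if word == f"${key}":
        # An unbraced $VAR standing alone is split on whitespace
        # into zero or more separate arguments.
        argv.extend(value.split())
    else:
        argv.append(word)

print(argv)
```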
5. Restart Docker
systemctl daemon-reload
systemctl restart docker
6. Check the Docker subnets
First machine:
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 172.17.52.1 netmask 255.255.255.0 broadcast 172.17.52.255
ether 02:42:02:ff:28:6a txqueuelen 0 (Ethernet)
RX packets 1312218 bytes 91088448 (86.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1403712 bytes 3494588370 (3.2 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.203.0.46 netmask 255.255.254.0 broadcast 10.203.1.255
ether 00:16:3e:06:f3:4d txqueuelen 1000 (Ethernet)
RX packets 37252931 bytes 11557211415 (10.7 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 33898339 bytes 6693779547 (6.2 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 172.17.52.0 netmask 255.255.255.255 broadcast 0.0.0.0
ether 3a:40:68:15:5c:45 txqueuelen 0 (Ethernet)
RX packets 5 bytes 420 (420.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 5 bytes 420 (420.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Second machine:
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.83.1 netmask 255.255.255.0 broadcast 172.17.83.255
ether 02:42:53:e1:69:7c txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.203.3.96 netmask 255.255.254.0 broadcast 10.203.3.255
ether 00:16:3e:12:21:46 txqueuelen 1000 (Ethernet)
RX packets 560672439 bytes 440894465247 (410.6 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 526563533 bytes 202796272242 (188.8 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 172.17.83.0 netmask 255.255.255.255 broadcast 0.0.0.0
ether 2a:2e:0d:a8:5a:3a txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
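Look closely at the subnets: the two machines have different ones (172.17.52.0/24 and 172.17.83.0/24), each allocated automatically by flannel.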
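The property that matters here is that the per-host subnets never overlap and all sit inside the network stored in etcd. A short Python check using the docker0 addresses and netmasks from the two outputs above (the dictionary layout is just for this example):

```python
import ipaddress

flannel_network = ipaddress.ip_network("172.17.0.0/16")

# docker0 address and netmask as reported by ifconfig, keyed by the host's eth0 address.
docker0 = {
    "10.203.0.46": ipaddress.ip_interface("172.17.52.1/255.255.255.0"),
    "10.203.3.96": ipaddress.ip_interface("172.17.83.1/255.255.255.0"),
}

subnets = {host: iface.network for host, iface in docker0.items()}
first, second = subnets.values()

for host, subnet in subnets.items():
    print(host, "->", subnet, "within flannel network:", subnet.subnet_of(flannel_network))
print("subnets overlap:", first.overlaps(second))
```

Because the ranges are disjoint, a container IP identifies exactly one host, which is what makes direct routing between hosts possible.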
7. Verify that another machine can reach a container's virtual IP
Pick any running container:
[root@oyoshbddwatlasprd0 flannel]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e91703882bd0 registry "/entrypoint.sh /etc 11 days ago Up 34 minutes 0.0.0.0:5000->5000/tcp hopeful_gates
[root@oyoshbddwatlasprd0 flannel]# docker exec -it e91703882bd0 sh
/ # ifconfig -a
eth0 Link encap:Ethernet HWaddr 02:42:AC:11:34:02
inet addr:172.17.52.2 Bcast:172.17.52.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:10 errors:0 dropped:0 overruns:0 frame:0
TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:868 (868.0 B) TX bytes:868 (868.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
/ #
The container above has the IP 172.17.52.2. Now access this IP from any other Docker machine:
[root@oyoshbddwatlasyprd1 flannel]# ping 172.17.52.2
PING 172.17.52.2 (172.17.52.2) 56(84) bytes of data.
64 bytes from 172.17.52.2: icmp_seq=1 ttl=63 time=1.06 ms
64 bytes from 172.17.52.2: icmp_seq=2 ttl=63 time=0.990 ms
64 bytes from 172.17.52.2: icmp_seq=3 ttl=63 time=1.00 ms
64 bytes from 172.17.52.2: icmp_seq=4 ttl=63 time=1.00 ms
64 bytes from 172.17.52.2: icmp_seq=5 ttl=63 time=1.00 ms
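Clearly, the virtual IP on prd0 is reachable from prd1. This means port mapping is no longer needed: with flannel in place, the containers' virtual IPs can be reached directly from any host in the cluster.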
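Conceptually, the ping works because each host knows which peer owns which subnet. The following Python sketch models that lookup with the two leases seen in this article. It is a simplified illustration of the idea, not flannel's actual implementation, and the names leases and host_for are made up for the example.

```python
import ipaddress

# Subnet lease -> address of the host that owns it (values taken from the outputs above).
leases = {
    ipaddress.ip_network("172.17.52.0/24"): "10.203.0.46",
    ipaddress.ip_network("172.17.83.0/24"): "10.203.3.96",
}

def host_for(container_ip):
    """Return the host whose leased subnet contains container_ip, or None."""
    addr = ipaddress.ip_address(container_ip)
    for subnet, host in leases.items():
        if addr in subnet:
            return host
    return None

# Traffic for 172.17.52.2 has to be delivered to the host that leased 172.17.52.0/24.
print(host_for("172.17.52.2"))
```

With the vxlan backend, the packet is encapsulated and sent over eth0 to the host found this way, which then hands it to its local docker0 bridge.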
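Kubernetes discussions often mention CNI (Container Network Interface), the plugin interface through which a cluster's pod network is set up. flannel, which as quoted above also provides a CNI plugin, is one way of delivering the flat, routable network model described here.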