k8s Networking Fundamentals (2022-04-17)

I. Basic requirements Kubernetes places on the cluster network:
1. All pods can communicate with all other pods without NAT.
2. Agents running on a node (e.g. kubelet or OS daemons) can communicate with all pods on that node without NAT, and vice versa.
3. Pods running in hostNetwork mode can communicate with all other pods without NAT.
II. Docker networking basics
Building blocks: network namespaces, veth pairs, Linux bridges, iptables, and routing.
1. Containers are isolated from one another by network namespaces; a veth pair connects two different network namespaces.
2. Some network devices can be moved between network namespaces (e.g. the two ends of a veth pair), while others, such as lo and bridge devices, are pinned to the namespace they were created in; check per device with ethtool, since the flag varies by device type.
3. A device can be moved only when its NETIF_F_NETNS_LOCAL feature (shown by ethtool as netns-local) is off; on means the device is pinned to its namespace.
III. Hands-on with network namespaces
[root@node01 ~]# ip netns help
Usage: ip netns list
       ip netns add NAME
       ip netns set NAME NETNSID
       ip [-all] netns delete [NAME]
       ip netns identify [PID]
       ip netns pids NAME
       ip [-all] netns exec [NAME] cmd ...
       ip netns monitor
       ip netns list-id
Create network namespaces:
[root@node01 ~]# ip netns list
[root@node01 ~]#
[root@node01 ~]# ip netns add netns01
[root@node01 ~]# ip netns add netns02
[root@node01 ~]# ip netns list
netns02
netns01
Enter a namespace and run commands inside it; only a single lo loopback device is present:
[root@node01 ~]# ip netns exec netns01 bash
[root@node01 ~]# ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@node01 ~]#
[root@node01 ~]# exit
exit
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns02 bash
[root@node01 ~]# ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@node01 ~]# exit
exit
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns01 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns02 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@node01 ~]#
Check whether a device can be moved between network namespaces. netns-local: on [fixed] means the device is pinned and cannot be moved:
[root@node01 ~]# ethtool -k docker0 | grep netns
netns-local: on [fixed]
[root@node01 ~]# ethtool -k kube-ipvs0 | grep netns
netns-local: off [fixed]
[root@node01 ~]# ethtool -k flannel.1 | grep netns
netns-local: off [fixed]
[root@node01 ~]# ethtool -k dummy0 | grep netns
netns-local: off [fixed]
[root@node01 ~]# ethtool -k lo | grep netns
netns-local: on [fixed]
For example, trying to move docker0 into netns01 fails:
[root@node01 ~]# ip link set docker0 netns netns01
RTNETLINK answers: Invalid argument
[root@node01 ~]#
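The netns-local check above can be wrapped in a small helper. A minimal sketch: the `ethtool -k` output format is assumed to match the transcript above, and the helper falls back to "unknown" when the device or ethtool itself is unavailable.

```shell
# Sketch: report whether a device can move between network namespaces
# by reading its netns-local feature flag (on = pinned, off = movable).
netns_movable() {
    flag=$(ethtool -k "$1" 2>/dev/null | awk -F': ' '/netns-local/ {print $2}')
    case "$flag" in
        on*)  echo "$1: pinned (cannot change netns)" ;;
        off*) echo "$1: movable" ;;
        *)    echo "$1: unknown (no such device, or ethtool missing)" ;;
    esac
}

netns_movable lo        # pinned wherever ethtool can query lo
netns_movable docker0   # pinned on the node in the transcript above
```

This mirrors the manual `ethtool -k DEV | grep netns` checks; the function only automates reading the flag, it does not attempt the move itself.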
IV. Veth pairs
1. Create a veth pair
[root@node01 ~]# ip link add veth0 type veth peer name veth1
[root@node01 ~]# ip r
default via 192.168.1.1 dev ens33 proto static metric 100
10.224.0.0/24 via 10.224.0.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.1.0/24 dev ens33 proto kernel scope link src 192.168.1.122 metric 100
[root@node01 ~]#
[root@node01 ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.1.1 0.0.0.0 UG 100 0 0 ens33
10.224.0.0 10.224.0.0 255.255.255.0 UG 0 0 0 flannel.1
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.1.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
[root@node01 ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:a7:a7:a6 brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:6d:8d:5c:94 brd ff:ff:ff:ff:ff:ff
4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b2:94:c6:2c:93:fb brd ff:ff:ff:ff:ff:ff
5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default
link/ether 86:a4:eb:85:ba:b2 brd ff:ff:ff:ff:ff:ff
6: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 5a:33:aa:f2:ad:91 brd ff:ff:ff:ff:ff:ff
7: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether ce:86:0c:3d:26:16 brd ff:ff:ff:ff:ff:ff
8: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 6e:18:e0:12:b3:18 brd ff:ff:ff:ff:ff:ff
[root@node01 ~]#
Move veth1 into the netns02 namespace:
[root@node01 ~]# ip link set veth1 netns netns02
[root@node01 ~]#
Listing again shows that veth1 is gone; only veth0 remains:
[root@node01 ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:a7:a7:a6 brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:6d:8d:5c:94 brd ff:ff:ff:ff:ff:ff
4: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether b2:94:c6:2c:93:fb brd ff:ff:ff:ff:ff:ff
5: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default
link/ether 86:a4:eb:85:ba:b2 brd ff:ff:ff:ff:ff:ff
6: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 5a:33:aa:f2:ad:91 brd ff:ff:ff:ff:ff:ff
8: veth0@if7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 6e:18:e0:12:b3:18 brd ff:ff:ff:ff:ff:ff link-netnsid 0
veth1 now appears in netns02 (netns01 is unchanged):
[root@node01 ~]# ip netns exec netns01 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns02 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
7: veth1@if8: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether ce:86:0c:3d:26:16 brd ff:ff:ff:ff:ff:ff link-netnsid 0
Assign IP addresses to veth0 and veth1, then test connectivity between them.
Assign an IP address to veth0:
[root@node01 ~]# ip addr add 10.1.1.2/24 dev veth0
[root@node01 ~]# ip link show | grep -A 2 veth0
8: veth0@if7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 6e:18:e0:12:b3:18 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@node01 ~]#
Bring veth0 up:
[root@node01 ~]# ip link set dev veth0 up
[root@node01 ~]#
[root@node01 ~]# ip link show | grep -A 3 veth0
8: veth0@if7: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
link/ether 6e:18:e0:12:b3:18 brd ff:ff:ff:ff:ff:ff link-netnsid 0
Assign an IP address to veth1:
[root@node01 ~]# ip netns exec netns02 ip addr add 10.1.1.3/24 dev veth1
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns02 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
7: veth1@if8: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether ce:86:0c:3d:26:16 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@node01 ~]#
Bring veth1 up:
[root@node01 ~]# ip netns exec netns02 ip link set dev veth1 up
[root@node01 ~]#
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns02 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
7: veth1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether ce:86:0c:3d:26:16 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@node01 ~]#
Test connectivity.
From the host side (veth0), ping veth1's address 10.1.1.3 inside netns02; it succeeds:
[root@node01 ~]# ping 10.1.1.3
PING 10.1.1.3 (10.1.1.3) 56(84) bytes of data.
64 bytes from 10.1.1.3: icmp_seq=1 ttl=64 time=34.2 ms
64 bytes from 10.1.1.3: icmp_seq=2 ttl=64 time=31.0 ms
^C
--- 10.1.1.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 31.053/32.658/34.263/1.605 ms
[root@node01 ~]#
[root@node01 ~]# ip netns exec netns02 ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
^C
--- 10.1.1.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
The ping to 10.1.1.1 above fails simply because that address is not assigned to either end of the pair. From inside netns02 (veth1), pinging veth0's address 10.1.1.2 succeeds:
[root@node01 ~]# ip netns exec netns02 ping 10.1.1.2
PING 10.1.1.2 (10.1.1.2) 56(84) bytes of data.
64 bytes from 10.1.1.2: icmp_seq=1 ttl=64 time=0.764 ms
64 bytes from 10.1.1.2: icmp_seq=2 ttl=64 time=0.052 ms
^C
--- 10.1.1.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.052/0.408/0.764/0.356 ms
[root@node01 ~]#
The experiment above gives a concrete picture of how a veth pair carries traffic between two network namespaces.
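The steps above can be consolidated into one script. A minimal sketch: the names demo-ns, demo-veth0/demo-veth1 and the 10.9.9.0/24 range are invented for illustration, and the privileged part is skipped unless the script runs as root with iproute2 present.

```shell
# Sketch: create a veth pair, move one end into a new netns, address both
# ends, verify connectivity in both directions, then clean up.
# Names and addresses are illustrative, not from the transcript.
veth_demo() {
    ip netns add demo-ns &&
    ip link add demo-veth0 type veth peer name demo-veth1 &&
    ip link set demo-veth1 netns demo-ns &&
    ip addr add 10.9.9.2/24 dev demo-veth0 &&
    ip link set demo-veth0 up &&
    ip netns exec demo-ns ip addr add 10.9.9.3/24 dev demo-veth1 &&
    ip netns exec demo-ns ip link set demo-veth1 up &&
    ping -c 2 10.9.9.3 &&
    ip netns exec demo-ns ping -c 2 10.9.9.2
    rc=$?
    ip netns delete demo-ns 2>/dev/null   # deleting the netns also frees the pair
    return $rc
}

# Moving devices and creating namespaces needs CAP_NET_ADMIN, so only
# attempt the demo as root; otherwise record that it was skipped.
if [ "$(id -u)" -eq 0 ] && command -v ip >/dev/null 2>&1; then
    veth_demo && DEMO_RESULT=ok || DEMO_RESULT=failed
else
    DEMO_RESULT=skipped
fi
echo "veth demo: $DEMO_RESULT"
```

The trailing cleanup relies on the kernel destroying both ends of the pair when the namespace holding one end is deleted, which is why no explicit `ip link delete` is needed.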
2. How do you find the two ends of a veth pair? The simplest way is to match the index at the start of each entry against the index after @if on its peer; ethtool -S also reports the peer's ifindex directly:
[root@node01 ~]# ip netns exec netns02 ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
7: veth1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether ce:86:0c:3d:26:16 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@node01 ~]# ip netns exec netns02 ethtool -S veth1
NIC statistics:
peer_ifindex: 8
[root@node01 ~]# ethtool -S veth0
NIC statistics:
peer_ifindex: 7
[root@node01 ~]#
[root@node01 ~]# ip link show | grep 8
link/ether 02:42:6d:8d:5c:94 brd ff:ff:ff:ff:ff:ff
link/ether 86:a4:eb:85:ba:b2 brd ff:ff:ff:ff:ff:ff
8: veth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 6e:18:e0:12:b3:18 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@node01 ~]#
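The index-matching rule can be automated. A small sketch, fed the two sample lines below (copied from the transcript) so it runs without a live veth pair; on a real system you would pipe `ip link show` in instead, or read peer_ifindex from `ethtool -S` as above.

```shell
# Sketch: extract (name, own index, peer index) from `ip link`-style
# lines. Two veth ends belong to the same pair when each one's peer
# index equals the other's own index.
sample='8: veth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
7: veth1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500'

pairs=$(echo "$sample" | awk -F'[:@ ]+' '/@if/ {
    # $1 = own ifindex, $2 = device name, $3 = "ifN" peer suffix
    printf "%s index=%s peer=%s\n", $2, $1, substr($3, 3)
}')
echo "$pairs"
# veth0 index=8 peer=7
# veth1 index=7 peer=8
```

Here veth0 (index 8) points at peer index 7 and veth1 (index 7) points back at index 8, confirming they are the two ends of one pair, exactly as the `grep`/`ethtool -S` transcript showed.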