open /run/flannel/subnet.env: no such file or directory
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "9a5eade3c13f1eeeb000df80e942ed22e59d2c532def6f1f281fd2ebefdcfa2c" network for pod "mcw01dep-nginx1-69bc6f5957-lzpdv": networkPlugin cni failed to set up pod "mcw01dep-nginx1-69bc6f5957-lzpdv_default" network: open /run/flannel/subnet.env: no such file or directory
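Fix: copy the file from a node that has it to the node that is missing it (scp /run/flannel/subnet.env <node>:/run/flannel/), for example: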
scp /run/flannel/subnet.env 10.0.0.6:/run/flannel/
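Before copying the file around, it may help to confirm on each node whether it is really missing or just incomplete. A minimal sketch: the function name check_subnet_env is made up for illustration, and it assumes flannel writes a FLANNEL_SUBNET= line into the file, which is the usual layout but worth verifying on your version.

```shell
#!/bin/sh
# check_subnet_env [PATH]: succeed only if PATH exists and defines FLANNEL_SUBNET.
# (Hypothetical helper; PATH defaults to the location from the error message.)
check_subnet_env() {
    file="${1:-/run/flannel/subnet.env}"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    if ! grep -q '^FLANNEL_SUBNET=' "$file"; then
        echo "no FLANNEL_SUBNET in $file"
        return 2
    fi
    echo "ok: $file"
}
```

Running check_subnet_env with no argument on every node shows which ones are affected. The file is normally written by flannel itself once its pod is running on the node, so a missing file may simply mean that flannel is not (yet) running there; copying it by hand is a workaround rather than a root-cause fix.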
Problem 1: checking the network status fails with "RTNETLINK answers: File exists"
CentOS7 Failed to start LSB: Bring up/down networking
Fixing the "RTNETLINK answers: File exists" error
chkconfig --level 35 network on
chkconfig --level 0123456 NetworkManager off
service NetworkManager stop
service network stop
service network start
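If that still does not work, try rebooting the system.
Both "service network start" and "/etc/init.d/network start" can fail with "RTNETLINK answers: File exists" (the two are equivalent; the former simply runs the latter).
On CentOS the cause is a conflict between the two services that bring up networking, /etc/init.d/network and /etc/init.d/NetworkManager.
The conflict is introduced by NetworkManager (NM), so stopping and disabling NetworkManager and then restarting the network fixes it.
1. Switch to the root account and use chkconfig to check how the network and NetworkManager services are configured to start at boot;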
=====
In my case, running just the following three commands was enough:
service NetworkManager stop
service network stop
service network start
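The same steps can be written with systemctl. This is a sketch that assumes a CentOS 7 host where both network and NetworkManager are managed by systemd; the run and disable_nm_and_restart_network helpers are hypothetical names, and by default the commands are only printed, not executed.

```shell
#!/bin/sh
# Set DRY_RUN=0 to actually execute; by default the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop NetworkManager, keep it from starting at boot, then restart the
# legacy network service (the one behind "LSB: Bring up/down networking").
disable_nm_and_restart_network() {
    run systemctl stop NetworkManager
    run systemctl disable NetworkManager
    run systemctl restart network
}
```

Calling disable_nm_and_restart_network as root with DRY_RUN=0 performs the change. Disabling NetworkManager is only appropriate on hosts whose interfaces are configured through the legacy network service.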
Problem 2: pinging the external network returns Destination Host Unreachable, reported from the internal IP
Troubleshooting process
[root@mcw7 ~]$ ping www.baidu.com
PING www.a.shifen.com (220.181.38.149) 56(84) bytes of data.
From bogon (172.16.1.137) icmp_seq=1 Destination Host Unreachable
Routing table of a host that can reach the external network
[root@mcw8 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.2 0.0.0.0 UG 100 0 0 ens33
0.0.0.0 172.16.1.2 0.0.0.0 UG 101 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
172.16.1.0 0.0.0.0 255.255.255.0 U 100 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
Routing table of the host that cannot reach the external network. It has no route via 10.0.0.2,
so try adding a route like the one above: 0.0.0.0 10.0.0.2 0.0.0.0 UG 100 0 0 ens33
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
The route added below was wrong and gets deleted afterwards
route add -host 10.0.0.137 gw 10.0.0.2
[root@mcw7 ~]$ route add -host 10.0.0.137 gw 10.0.0.2
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
10.0.0.137 10.0.0.2 255.255.255.255 UGH 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
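Delete the route. The IP after -host is the destination, the first column of the routing table. Here it should have been 0.0.0.0: the destination is arbitrary and the gateway is 10.0.0.2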
[root@mcw7 ~]$ route del -host 10.0.0.137 dev ens33
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
[root@mcw7 ~]$
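-host names a destination host. The netmask should be 0.0.0.0 here, so the route has to be deleted and recreated by hand. The flags also gained an H, which marks a host route (a single-address target, hence Genmask 255.255.255.255)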
[root@mcw7 ~]$ route add -host 0.0.0.0 gw 10.0.0.2
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.2 255.255.255.255 UGH 0 0 0 ens33
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
Delete it by destination host and interface
[root@mcw7 ~]$ route del -host 0.0.0.0 dev ens33
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
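According to the usage message, the mask is given with netmask (not MASK). My guess was that Genmask is the complement of the mask (each octet adding up to 255), but the output below shows that a -host route gets Genmask 255.255.255.255 regardless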
[root@mcw7 ~]$ route add -host 0.0.0.0 MASK 0.0.0.0 gw 10.0.0.2
Usage: inet_route [-vF] del {-host|-net} Target[/prefix] [gw Gw] [metric M] [[dev] If]
inet_route [-vF] add {-host|-net} Target[/prefix] [gw Gw] [metric M]
[netmask N] [mss Mss] [window W] [irtt I]
[mod] [dyn] [reinstate] [[dev] If]
inet_route [-vF] add {-host|-net} Target[/prefix] [metric M] reject
inet_route [-FC] flush NOT supported
[root@mcw7 ~]$
[root@mcw7 ~]$
[root@mcw7 ~]$ route add -host 0.0.0.0 netmask 0.0.0.0 gw 10.0.0.2
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.2 255.255.255.255 UGH 0 0 0 ens33
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
[root@mcw7 ~]$
Delete it again: route del with the destination host and the interface
[root@mcw7 ~]$ route del -host 0.0.0.0 dev ens33
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
Notes from before the real fix
Deleting the default gateway
[root@mcw7 ~]$ route del -host 0.0.0.0 dev ens33
SIOCDELRT: No such process
[root@mcw7 ~]$
[root@mcw7 ~]$ route del -host 0.0.0.0 dev ens37
SIOCDELRT: No such process
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.2 0.0.0.0 UG 100 0 0 ens33
0.0.0.0 172.16.1.2 0.0.0.0 UG 101 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
172.16.1.0 0.0.0.0 255.255.255.0 U 100 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
[root@mcw7 ~]$ route del -host 0.0.0.0
SIOCDELRT: No such process
[root@mcw7 ~]$ route del default
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 100 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
172.16.1.0 0.0.0.0 255.255.255.0 U 100 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
[root@mcw7 ~]$ route del default
[root@mcw7 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.0.0.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
10.244.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
172.16.1.0 0.0.0.0 255.255.255.0 U 100 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
[root@mcw7 ~]$
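The real fix for this problem
For reference:
On mcw8, ping cannot reach the external network; the replies come from the server's own internal IP.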
[root@mcw8 ~]$ ping www.baidu.com
PING www.a.shifen.com (39.156.66.14) 56(84) bytes of data.
From mcw8 (172.16.1.138) icmp_seq=1 Destination Host Unreachable
mcw9 can ping the external network; the replies come from Baidu's public IP
[root@mcw9 ~]$ ping www.baidu.com
PING www.a.shifen.com (39.156.66.18) 56(84) bytes of data.
64 bytes from 39.156.66.18 (39.156.66.18): icmp_seq=1 ttl=128 time=43.2 ms
The routing table on the healthy mcw9 has a default route via the gateway 10.0.0.2
[root@mcw9 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.2 0.0.0.0 UG 100 0 0 ens33
0.0.0.0 172.16.1.2 0.0.0.0 UG 101 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
172.16.1.0 0.0.0.0 255.255.255.0 U 100 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
The routing table on the broken mcw8 has no default route via the external gateway 10.0.0.2.
[root@mcw8 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
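Add the default gateway on mcw8. All the routes added earlier ended up with the wrong Genmask, which never became 0.0.0.0. Only the following command produced
Destination 0.0.0.0, Gateway 10.0.0.2, Genmask 0.0.0.0, Flags UG, Iface ens33, after which the external network became reachable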
[root@mcw8 ~]$ route add default gw 10.0.0.2
[root@mcw8 ~]$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.0.2 0.0.0.0 UG 0 0 0 ens33
0.0.0.0 172.16.1.2 0.0.0.0 UG 0 0 0 ens37
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 ens33
169.254.0.0 0.0.0.0 255.255.0.0 U 1003 0 0 ens37
172.16.1.0 0.0.0.0 255.255.255.0 U 0 0 0 ens37
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
mcw8 can now reach the external network
[root@mcw8 ~]$ ping www.baidu.com
PING www.a.shifen.com (39.156.66.14) 56(84) bytes of data.
64 bytes from 39.156.66.14 (39.156.66.14): icmp_seq=1 ttl=128 time=23.5 ms
64 bytes from 39.156.66.14 (39.156.66.14): icmp_seq=2 ttl=128 time=36.7 ms
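The comparison above comes down to one check: does the table contain a row with Destination 0.0.0.0, Genmask 0.0.0.0 and the expected gateway? A small sketch of that check follows; has_default_via is a hypothetical helper name, and it only parses `route -n` output read from stdin, so it changes nothing on the host.

```shell
#!/bin/sh
# has_default_via GATEWAY: read `route -n` output on stdin and succeed if it
# contains a default route (Destination and Genmask both 0.0.0.0) via GATEWAY.
# A host route such as "0.0.0.0 10.0.0.2 255.255.255.255 UGH" does not count.
has_default_via() {
    awk -v gw="$1" '
        $1 == "0.0.0.0" && $2 == gw && $3 == "0.0.0.0" { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example (the add itself needs root):
#   route -n | has_default_via 10.0.0.2 || route add default gw 10.0.0.2
# iproute2 equivalent of the add:
#   ip route add default via 10.0.0.2 dev ens33
```

A route added this way lives only in the running kernel table, so it is lost on a network restart or reboot unless the gateway is also set in the interface configuration.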
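When I deployed flannel a second time, the URL below could not be reached (looking up the domain shows it is a blocked domain). I had saved the contents of this file earlier, though, so I pasted them into a local file and deployed from that, as follows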
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
[machangwei@mcw7 ~]$ ls
mcw.txt mm.yml scripts tools
[machangwei@mcw7 ~]$ kubectl apply -f mm.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
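I had forgotten the join command printed by kubeadm init, so I wanted to run kubeadm init again. After running kubeadm reset first, all the containers that had existed were gone
Troubleshooting process, plus exporting and importing iptables rules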
[root@mcw7 ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[root@mcw7 ~]$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[root@mcw7 ~]$
After the reset and re-initialization, the network (flannel) is gone as well
[root@mcw7 ~]$ docker ps|grep kube-flannel
[root@mcw7 ~]$
Redeploying the network as the regular user fails
[machangwei@mcw7 ~]$ ls
mcw.txt mm.yml scripts tools
[machangwei@mcw7 ~]$ kubectl apply -f mm.yml
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
[machangwei@mcw7 ~]$
Looking back at the output of the earlier reset: it says the CNI configuration is not cleaned up
[root@mcw7 ~]$ echo y|kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: [preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
Moving the directory away did not help
[root@mcw7 ~]$ mv /etc/cni/net.d /etc/cni/net.dbak
[root@mcw7 ~]$ ipvsadm --clear
-bash: ipvsadm: command not found
Looked through a pile of output and had no idea what to do with it
[root@mcw7 ~]$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes health check service ports */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
INPUT_direct all -- anywhere anywhere
INPUT_ZONES_SOURCE all -- anywhere anywhere
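The reset output above says iptables has to be cleaned up manually. One common way is to flush and delete the chains in each table; the sketch below is only an illustration (flush_iptables is a hypothetical helper), it prints the commands unless DRY_RUN=0 is set, and it assumes that wiping every rule on the host is acceptable.

```shell
#!/bin/sh
# Set DRY_RUN=0 to really flush; by default the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# flush_iptables: for the filter, nat and mangle tables, flush all rules (-F)
# and then delete all user-defined chains (-X).
flush_iptables() {
    for table in filter nat mangle; do
        for action in -F -X; do
            if [ "$DRY_RUN" = "1" ]; then
                echo "iptables -t $table $action"
            else
                iptables -t "$table" "$action"
            fi
        done
    done
}
```

Flushing also removes the DOCKER chains, so Docker would probably need a restart afterwards to recreate its rules.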
Since I could not work out how to clear the rules, export a rule set from another machine and import it here
Export:
[root@mcw9 ~]$ iptables-save > /root/iptables_beifen.txt
[root@mcw9 ~]$ cat iptables_beifen.txt
# Generated by iptables-save v1.4.21 on Fri Jan 7 23:05:39 2022
*filter
:INPUT ACCEPT [1676:135745]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [896:67997]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
# Completed on Fri Jan 7 23:05:39 2022
# Generated by iptables-save v1.4.21 on Fri Jan 7 23:05:39 2022
*nat
:PREROUTING ACCEPT [32:2470]
:INPUT ACCEPT [32:2470]
:OUTPUT ACCEPT [8:528]
:POSTROUTING ACCEPT [8:528]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
COMMIT
# Completed on Fri Jan 7 23:05:39 2022
[root@mcw9 ~]$ cat
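Import the rules on mcw7. It fails because the file is damaged: the first line has lost its leading "# Generated by i", so it is no longer a comment. Comment that line out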
[root@mcw7 ~]$ iptables-restore</root/daoru.txt
iptables-restore: line 1 failed
[root@mcw7 ~]$ cat daoru.txt    # look at the file
ptables-save v1.4.21 on Fri Jan 7 23:05:39 2022
*filter
:INPUT ACCEPT [1676:135745]
Import again; the firewall rules now match those on mcw9
[root@mcw7 ~]$ iptables-restore</root/daoru.txt
[root@mcw7 ~]$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
DOCKER-USER all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
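The "line 1 failed" error above came from a single damaged line in the dump. A rough pre-check can point at such a line before the import is attempted. This is a sketch: first_bad_line is a hypothetical helper, and it only knows the line shapes that appear in iptables-save output (comments, *table, :chain, rules with optional counters, COMMIT), so treat its verdict as a hint.

```shell
#!/bin/sh
# first_bad_line FILE: print the number of the first line that does not look
# like iptables-save output; print nothing if every line looks fine.
first_bad_line() {
    awk '
        /^[ \t]*$/ { next }                         # blank line
        /^#/       { next }                         # comment
        /^\*/      { next }                         # table header, e.g. *filter
        /^:/       { next }                         # chain declaration
        /^(\[[0-9]+:[0-9]+\][ \t]*)?-/ { next }     # rule, optional [pkts:bytes]
        /^COMMIT$/ { next }
        { print NR; exit }
    ' "$1"
}
```

For example, first_bad_line /root/daoru.txt would print 1 for the damaged file shown above. iptables-restore also has a --test option that parses the input without committing it, which is the more authoritative check when it is available.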
============
Run the reset again and see what happens
[root@mcw7 ~]$ echo y|kubeadm reset
Check the firewall again; nothing appears to have changed
[root@mcw7 ~]$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
DOCKER-USER all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
[root@mcw7 ~]$
After re-initializing
[root@mcw7 ~]$ kubeadm init --apiserver-advertise-address 10.0.0.137 --pod-network-cidr=10.244.0.0/24 --image-repository=registry.aliyuncs.com/google_containers
kubeadm join 10.0.0.137:6443 --token 1e2kkw.ivkth6zzkbx72z4u \
--discovery-token-ca-cert-hash sha256:fb83146082fb33ca2bff56a525c1e575b5f2587ab1be566f9dd3d7e8d7845462
[root@mcw7 ~]$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes health check service ports */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
KUBE-FORWARD all -- anywhere anywhere /* kubernetes forwarding rules */
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
DOCKER-USER all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-EXTERNAL-SERVICES (2 references)
target prot opt source destination
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- anywhere anywhere /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
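You may not remember the steps for solving a problem, but with your notes you can solve it quickly and be confident the answer is right.
You may not remember the deployment steps or every command you ran, but from your own earlier notes you can redo them quickly.
It turns out this problem had nothing to do with the firewall.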
[machangwei@mcw7 ~]$ kubectl apply -f mm.yml
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
[machangwei@mcw7 ~]$
[machangwei@mcw7 ~]$ kubectl get nodes
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
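The real fix:
Reconfigure kubectl for the regular user as follows. The old configuration is no longer valid: kubeadm reset deleted /etc/kubernetes/pki, so the re-initialized cluster has a new CA that the old ~/.kube/config does not trust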
[machangwei@mcw7 ~]$ ls -a
. .. .bash_history .bash_logout .bash_profile .bashrc .kube mcw.txt mm.yml scripts tools .viminfo
[machangwei@mcw7 ~]$ mv .kube kubebak
[machangwei@mcw7 ~]$ mkdir -p $HOME/.kube
[machangwei@mcw7 ~]$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[machangwei@mcw7 ~]$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
[machangwei@mcw7 ~]$ kubectl get node
NAME STATUS ROLES AGE VERSION
mcw7 NotReady control-plane,master 10m v1.23.1
Recreate the network
[machangwei@mcw7 ~]$ ls
kubebak mcw.txt mm.yml scripts tools
[machangwei@mcw7 ~]$ kubectl apply -f mm.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
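The x509 error above is what a stale kubeconfig looks like after a reset and re-init. One way to confirm it is to compare the CA embedded in the kubeconfig with the cluster's current CA file. A sketch, assuming the kubeconfig stores the CA inline as certificate-authority-data (as the admin.conf generated by kubeadm does) and that GNU base64 is available; kubeconfig_ca_matches is a hypothetical helper name.

```shell
#!/bin/sh
# kubeconfig_ca_matches KUBECONFIG CA_FILE: succeed if the CA embedded in
# KUBECONFIG (certificate-authority-data, base64-encoded) equals CA_FILE.
kubeconfig_ca_matches() {
    embedded=$(awk '$1 == "certificate-authority-data:" { print $2; exit }' "$1" \
        | base64 -d) || return 2
    [ -n "$embedded" ] && [ "$embedded" = "$(cat "$2")" ]
}

# Example, run on the control-plane node:
#   kubeconfig_ca_matches "$HOME/.kube/config" /etc/kubernetes/pki/ca.crt \
#       || echo "kubeconfig was issued by a different CA; copy admin.conf again"
```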
Now check the firewall with -L again
[root@mcw7 ~]$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
KUBE-NODEPORTS all -- anywhere anywhere /* kubernetes health check service ports */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
KUBE-FORWARD all -- anywhere anywhere /* kubernetes forwarding rules */
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
DOCKER-USER all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- mcw7/16 anywhere
ACCEPT all -- anywhere mcw7/16
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain KUBE-EXTERNAL-SERVICES (2 references)
target prot opt source destination
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- anywhere anywhere /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
^C
For inspecting the rules, the following command (iptables-save) is the more suitable one
[root@mcw7 ~]$ iptables-save
# Generated by iptables-save v1.4.21 on Sat Jan 8 07:35:11 2022
*nat
:PREROUTING ACCEPT [372:18270]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [239:14302]
:POSTROUTING ACCEPT [239:14302]
:DOCKER - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SEP-6E7XQMQ4RAYOWTTM - [0:0]
:KUBE-SEP-IT2ZTR26TO4XFPTO - [0:0]
:KUBE-SEP-N4G2XR5TDX7PQE7P - [0:0]
:KUBE-SEP-XOVE7RWZIDAMLO2S - [0:0]
:KUBE-SEP-YIL6JZP7A3QYXJU2 - [0:0]
:KUBE-SEP-ZP3FB6NMPNCO4VBJ - [0:0]
:KUBE-SEP-ZXMNUKOKXUTL2MK2 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
-A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
-A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/24 -j RETURN
-A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SEP-IT2ZTR26TO4XFPTO -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-IT2ZTR26TO4XFPTO -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-N4G2XR5TDX7PQE7P -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-N4G2XR5TDX7PQE7P -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.244.0.2:9153
-A KUBE-SEP-XOVE7RWZIDAMLO2S -s 10.0.0.137/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-XOVE7RWZIDAMLO2S -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 10.0.0.137:6443
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.244.0.3:9153
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 ! -s 10.244.0.0/24 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-IT2ZTR26TO4XFPTO
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-ZXMNUKOKXUTL2MK2
-A KUBE-SVC-JD5MR3NA4I4DYORP ! -s 10.244.0.0/24 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-N4G2XR5TDX7PQE7P
-A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-SEP-ZP3FB6NMPNCO4VBJ
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.244.0.0/24 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-XOVE7RWZIDAMLO2S
-A KUBE-SVC-TCOU7JCQXEZGVUNU ! -s 10.244.0.0/24 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-YIL6JZP7A3QYXJU2
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-6E7XQMQ4RAYOWTTM
COMMIT
# Completed on Sat Jan 8 07:35:11 2022
# Generated by iptables-save v1.4.21 on Sat Jan 8 07:35:11 2022
*mangle
:PREROUTING ACCEPT [376111:67516258]
:INPUT ACCEPT [369347:67204288]
:FORWARD ACCEPT [6764:311970]
:OUTPUT ACCEPT [369958:67425919]
:POSTROUTING ACCEPT [371215:67488646]
:FORWARD_direct - [0:0]
:INPUT_direct - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:OUTPUT_direct - [0:0]
:POSTROUTING_direct - [0:0]
:PREROUTING_ZONES - [0:0]
:PREROUTING_ZONES_SOURCE - [0:0]
:PREROUTING_direct - [0:0]
:PRE_docker - [0:0]
:PRE_docker_allow - [0:0]
:PRE_docker_deny - [0:0]
:PRE_docker_log - [0:0]
:PRE_public - [0:0]
:PRE_public_allow - [0:0]
:PRE_public_deny - [0:0]
:PRE_public_log - [0:0]
-A PREROUTING -j PREROUTING_direct
-A PREROUTING -j PREROUTING_ZONES_SOURCE
-A PREROUTING -j PREROUTING_ZONES
-A INPUT -j INPUT_direct
-A FORWARD -j FORWARD_direct
-A OUTPUT -j OUTPUT_direct
-A POSTROUTING -j POSTROUTING_direct
-A PREROUTING_ZONES -i ens33 -g PRE_public
-A PREROUTING_ZONES -i docker0 -j PRE_docker
-A PREROUTING_ZONES -i ens37 -g PRE_public
-A PREROUTING_ZONES -g PRE_public
-A PRE_docker -j PRE_docker_log
-A PRE_docker -j PRE_docker_deny
-A PRE_docker -j PRE_docker_allow
-A PRE_public -j PRE_public_log
-A PRE_public -j PRE_public_deny
-A PRE_public -j PRE_public_allow
COMMIT
# Completed on Sat Jan 8 07:35:11 2022
# Generated by iptables-save v1.4.21 on Sat Jan 8 07:35:11 2022
*security
:INPUT ACCEPT [591940:133664590]
:FORWARD ACCEPT [1257:62727]
:OUTPUT ACCEPT [596315:107591486]
:FORWARD_direct - [0:0]
:INPUT_direct - [0:0]
:OUTPUT_direct - [0:0]
-A INPUT -j INPUT_direct
-A FORWARD -j FORWARD_direct
-A OUTPUT -j OUTPUT_direct
COMMIT
# Completed on Sat Jan 8 07:35:11 2022
# Generated by iptables-save v1.4.21 on Sat Jan 8 07:35:11 2022
*raw
:PREROUTING ACCEPT [376111:67516258]
:OUTPUT ACCEPT [369958:67425919]
:OUTPUT_direct - [0:0]
:PREROUTING_ZONES - [0:0]
:PREROUTING_ZONES_SOURCE - [0:0]
:PREROUTING_direct - [0:0]
:PRE_docker - [0:0]
:PRE_docker_allow - [0:0]
:PRE_docker_deny - [0:0]
:PRE_docker_log - [0:0]
:PRE_public - [0:0]
:PRE_public_allow - [0:0]
:PRE_public_deny - [0:0]
:PRE_public_log - [0:0]
-A PREROUTING -j PREROUTING_direct
-A PREROUTING -j PREROUTING_ZONES_SOURCE
-A PREROUTING -j PREROUTING_ZONES
-A OUTPUT -j OUTPUT_direct
-A PREROUTING_ZONES -i ens33 -g PRE_public
-A PREROUTING_ZONES -i docker0 -j PRE_docker
-A PREROUTING_ZONES -i ens37 -g PRE_public
-A PREROUTING_ZONES -g PRE_public
-A PRE_docker -j PRE_docker_log
-A PRE_docker -j PRE_docker_deny
-A PRE_docker -j PRE_docker_allow
-A PRE_public -j PRE_public_log
-A PRE_public -j PRE_public_deny
-A PRE_public -j PRE_public_allow
COMMIT
# Completed on Sat Jan 8 07:35:11 2022
# Generated by iptables-save v1.4.21 on Sat Jan 8 07:35:11 2022
*filter
:INPUT ACCEPT [14882:2406600]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [15254:2447569]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -s 10.244.0.0/16 -j ACCEPT
-A FORWARD -d 10.244.0.0/16 -j ACCEPT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Sat Jan 8 07:35:11 2022
[root@mcw7 ~]$
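To do later: work out how to regenerate the cluster join command when it has been forgotten, and whether regenerating it affects nodes that have already joined.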
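On the open question above, a minimal sketch, assuming it is run on the control-plane node with admin rights. `kubeadm token create --print-join-command` creates a new bootstrap token and prints a complete `kubeadm join` line. Bootstrap tokens are only used while a node is joining, so creating a new one should not affect nodes that are already members. The function name is made up for illustration.

```shell
# Hypothetical helper: print a fresh "kubeadm join ..." line.
# Assumes it runs on the control-plane node (mcw7 in these notes).
print_join_command() {
    # Creates a new bootstrap token (default lifetime 24h) and prints
    # the full join command, including the CA certificate hash.
    kubeadm token create --print-join-command
}

# Usage (on the master):
#   print_join_command
#   kubeadm token list    # show existing tokens and their expiry
```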
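Fixing the "/proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1" error
Rejoining the node prints warnings, and they deserve attention. For example, docker should be enabled at boot: if it is not and the virtual machine reboots, the containers will not come back up.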
[root@mcw8 ~]$ kubeadm join 10.0.0.137:6443 --token 1e2kkw.ivkth6zzkbx72z4u \
> --discovery-token-ca-cert-hash sha256:fb83146082fb33ca2bff56a525c1e575b5f2587ab1be566f9dd3d7e8d7845462
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING Hostname]: hostname "mcw8" could not be reached
[WARNING Hostname]: hostname "mcw8": lookup mcw8 on 10.0.0.2:53: no such host
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Fix
[root@mcw8 ~]$ echo "1" >/proc/sys/net/bridge/bridge-nf-call-iptables
[root@mcw8 ~]$ kubeadm join 10.0.0.137:6443 --token 1e2kkw.ivkth6zzkbx72z4u --discovery-token-ca-cert-hash sha256:fb83146082fb33ca2bff56a525c1e575b5f2587ab1be566f9dd3d7e8d7845462
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING Hostname]: hostname "mcw8" could not be reached
[WARNING Hostname]: hostname "mcw8": lookup mcw8 on 10.0.0.2:53: no such host
^C
[root@mcw8 ~]$ echo y|kubeadm reset
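Writing 1 into /proc as shown above takes effect immediately but is lost on reboot. A sketch of making the setting persistent, assuming a CentOS 7 style system where `sysctl --system` reads `/etc/sysctl.d/*.conf` and the `br_netfilter` module provides the bridge sysctls. The file name `k8s.conf` and the function name are arbitrary choices, not something from the original notes.

```shell
# Hypothetical helper: write the bridge netfilter settings to a sysctl drop-in.
# $1 is the target directory (defaults to /etc/sysctl.d).
write_bridge_sysctl() {
    conf_dir="${1:-/etc/sysctl.d}"
    cat > "$conf_dir/k8s.conf" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
}

# Usage (as root on each node):
#   modprobe br_netfilter             # the /proc/sys/net/bridge entries need this module
#   write_bridge_sysctl
#   sysctl --system                   # reload all sysctl configuration files
#   systemctl enable docker.service   # addresses the Service-Docker warning
```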
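Even after the steps above the join did not work: the node could not be added to the mcw7 master. As far as I remember, the k8s network was never deployed on the two nodes mcw8 and mcw9, so deploy it now and try again. Configure kubectl for the regular user, then copy mm.yml to both nodes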
[machangwei@mcw7 ~]$ scp mm.yml 10.0.0.138:/home/machangwei/
machangwei@10.0.0.138's password:
mm.yml 100% 5412 8.5MB/s 00:00
[machangwei@mcw7 ~]$ scp mm.yml 10.0.0.139:/home/machangwei/
machangwei@10.0.0.139's password:
mm.yml
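However, worker nodes do not need kubectl configured for a regular user, and it cannot be done there anyway because /etc/kubernetes/admin.conf does not exist on them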
[root@mcw8 ~]$ su - machangwei
[machangwei@mcw8 ~]$ mkdir -p $HOME/.kube
[machangwei@mcw8 ~]$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: cannot stat ‘/etc/kubernetes/admin.conf’: No such file or directory
[machangwei@mcw8 ~]$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
chown: cannot access ‘/home/machangwei/.kube/config’: No such file or directory
The join kept hanging, so add the --v=2 flag to print details
[root@mcw8 ~]$ kubeadm join 10.0.0.137:6443 --token 1e2kkw.ivkth6zzkbx72z4u --discovery-token-ca-cert-hash sha256:fb83146082fb33ca2bff56a525c1e575b5f2587ab1be566f9dd3d7e8d7845462 --v=2
I0108 00:54:46.002913 32058 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName
I0108 00:54:46.068584 32058 initconfiguration.go:117] detected and using CRI socket: /var/run/dockershim.sock
[preflight] Running pre-flight checks
I0108 00:54:46.068919 32058 preflight.go:92] [preflight] Running general checks
The error shows up further down
I0108 00:54:46.849380 32058 checks.go:620] validating kubelet version
I0108 00:54:46.927861 32058 checks.go:133] validating if the "kubelet" service is enabled and active
I0108 00:54:46.938910 32058 checks.go:206] validating availability of port 10250
I0108 00:54:46.960668 32058 checks.go:283] validating the existence of file /etc/kubernetes/pki/ca.crt
I0108 00:54:46.960707 32058 checks.go:433] validating if the connectivity type is via proxy or direct
I0108 00:54:46.960795 32058 join.go:530] [preflight] Discovering cluster-info
I0108 00:54:46.960846 32058 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "10.0.0.137:6443"
I0108 00:54:46.997909 32058 token.go:118] [discovery] Requesting info from "10.0.0.137:6443" again to validate TLS against the pinned public key
I0108 00:54:47.003864 32058 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.137:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": x509: certificate has expired or is not yet valid: current time 2022-01-08T00:54:47+08:00 is before 2022-01-07T23:18:44Z
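The clocks do not match: the certificate only becomes valid at 2022-01-07T23:18:44Z, and mcw8's clock (2022-01-08T00:54:47+08:00, i.e. 2022-01-07T16:54:47Z) is still behind that. I set mcw8 to a time before the one in the error message, and changed mcw7 as well. After that the error turned into a different one. (The lasting fix for this kind of error is to keep the clocks of all machines synchronized rather than setting them by hand.)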
[root@mcw8 ~]$ date -s "2022-1-7 23:10:00"
Fri Jan 7 23:10:00 CST 2022
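The x509 message above already contains both timestamps, so the skew can be computed directly. A small sketch, assuming GNU `date` (as shipped with CentOS 7); the variable names are only for illustration.

```shell
# Timestamps copied from the kubeadm error message above.
node_now='2022-01-08T00:54:47+08:00'    # what mcw8's clock reported
not_before='2022-01-07T23:18:44Z'       # start of the certificate's validity

# Convert both to seconds since the epoch and subtract.
skew=$(( $(date -u -d "$not_before" +%s) - $(date -u -d "$node_now" +%s) ))
echo "node clock is ${skew}s behind the certificate start time"
```

A positive result means the node's clock is behind the master's; here it comes to a little over six hours. Syncing every machine against the same time source (for example with chrony or ntpdate) avoids having to set the clocks by hand.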
The error has now become the following: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
I0108 01:27:42.577217 32662 token.go:217] [discovery] Failed to request cluster-info, will try again: Get "https://10.0.0.137:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
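The k8s system containers kept failing to start and ended up stopped, with the error below. After deleting all the stopped containers several times, things were fine. Joining the cluster again then failed with a "refused" error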
rpc error: code = Unknown desc = failed to create a sandbox for pod \"coredns-6d8c4cb4d-8l99d\": Error response from daemon: Conflict. The container name \"/k8s_POD_coredns-6d8c4cb4d-8l99d_kube-system_e030f426-3e8e-46fe-9e05-6c42a332f650_2\" is already in use by container \"b2dbcdd338ab4b2c35d5386e50e7e116fd41f26a0053a84ec3f1329e09d454a4\". You have to remove (or rename) that container to be able to reuse that name." pod="kube-system/coredns-6d8c4cb4d-8l99d"
[root@mcw8 ~]$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2edd274fd7b5 e6ea68648f0c "/opt/bin/flanneld -…" 7 seconds ago Exited (1) 5 seconds ago k8s_kube-flannel_kube-flannel-ds-tvz9q_kube-system_e62fa7b1-1cce-42dc-91d8-cdbd2bfda0f3_2
5b1715be012d quay.io/coreos/flannel "cp -f /etc/kube-fla…" 28 seconds ago Exited (0) 27 seconds ago k8s_install-cni_kube-flannel-ds-tvz9q_kube-system_e62fa7b1-1cce-42dc-91d8-cdbd2bfda0f3_0
7beb96ed15be rancher/mirrored-flannelcni-flannel-cni-plugin "cp -f /flannel /opt…" About a minute ago Exited (0) About a minute ago k8s_install-cni-plugin_kube-flannel-ds-tvz9q_kube-system_e62fa7b1-1cce-42dc-91d8-cdbd2bfda0f3_0
4e998fdfce3e registry.aliyuncs.com/google_containers/kube-proxy "/usr/local/bin/kube…" 2 minutes ago Up 2 minutes k8s_kube-proxy_kube-proxy-5p7dn_kube-system_92b1b38a-f6fa-4308-93fb-8045d2bae63f_0
fed18476d9a3 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-flannel-ds-tvz9q_kube-system_e62fa7b1-1cce-42dc-91d8-cdbd2bfda0f3_0
ebc2403e3052 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-proxy-5p7dn_kube-system_92b1b38a-f6fa-4308-93fb-8045d2bae63f_0
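Deleting the stopped containers by hand, as described above, can be done in one pass. A sketch using the standard `docker ps` status filter; the function name is made up. On a kubelet-managed node the kubelet recreates whatever containers it still needs.

```shell
# Hypothetical helper: remove every container in the "exited" state.
remove_exited_containers() {
    # -a: include stopped containers, -q: print IDs only
    for id in $(docker ps -aq --filter status=exited); do
        docker rm "$id"
    done
}

# Usage (as root on the node):
#   remove_exited_containers
```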
Now it works
[machangwei@mcw7 ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
mcw7 Ready control-plane,master 7m22s v1.23.1
mcw8 Ready <none> 4m51s v1.23.1
mcw9 Ready <none> 3m45s v1.23.1
[machangwei@mcw7 ~]$
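Every node that has joined the cluster and finished deploying runs three containers. The join command talks to the apiserver service on the master node; after that the node starts pulling images and deploying its containers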
k8s_kube-proxy_kube-
k8s_POD_kube-proxy-n
k8s_POD_kube-flannel
Pod states: ContainerCreating, ErrImagePull, ImagePullBackOff
[machangwei@mcw7 ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
mcw01dep-nginx-5dd785954d-d2kwp 0/1 ContainerCreating 0 9m7s
mcw01dep-nginx-5dd785954d-szdjd 0/1 ErrImagePull 0 9m7s
mcw01dep-nginx-5dd785954d-v9x8j 0/1 ErrImagePull 0 9m7s
[machangwei@mcw7 ~]$
[machangwei@mcw7 ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
mcw01dep-nginx-5dd785954d-d2kwp 0/1 ContainerCreating 0 9m15s
mcw01dep-nginx-5dd785954d-szdjd 0/1 ImagePullBackOff 0 9m15s
mcw01dep-nginx-5dd785954d-v9x8j 0/1 ImagePullBackOff 0 9m15s
All the containers on the node were deleted, but the pod still could not be removed from the master, so force-delete it
[machangwei@mcw7 ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
mcw01dep-nginx-5dd785954d-v9x8j 0/1 Terminating 0 33m
[machangwei@mcw7 ~]$ kubectl delete pod mcw01dep-nginx-5dd785954d-v9x8j --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "mcw01dep-nginx-5dd785954d-v9x8j" force deleted
[machangwei@mcw7 ~]$ kubectl get pod
No resources found in default namespace.
The image pull event shows <invalid>??? Yet the containers have all started
[machangwei@mcw7 ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
mcw01dep-nginx-5dd785954d-65zd4 0/1 ContainerCreating 0 118s
mcw01dep-nginx-5dd785954d-hfw2k 0/1 ContainerCreating 0 118s
mcw01dep-nginx-5dd785954d-qxzpl 0/1 ContainerCreating 0 118s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 112s default-scheduler Successfully assigned default/mcw01dep-nginx-5dd785954d-65zd4 to mcw8
Normal Pulling <invalid> kubelet Pulling image "nginx"
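Checking on the node shows that what started is k8s_POD_mcw01dep-nginx, not k8s_mcw01dep-nginx
Since the pod details on the master show <invalid> as the age of the nginx pull event, go to node mcw8 and pull the image by hand
[root@mcw8 ~]$ docker pull nginx # manual image pull succeeded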
Status: Downloaded newer image for nginx:latest
docker.io/library/nginx:latest
Check the pod details again
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m21s default-scheduler Successfully assigned default/mcw01dep-nginx-5dd785954d-65zd4 to mcw8
Normal Pulling <invalid> kubelet Pulling image "nginx"
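The first event is Scheduled. It comes from default-scheduler, and its message shows which node the pod was assigned to. Separately, every pod gets a container named k8s_POD_<pod name>: that is the pause (sandbox) container that holds the pod's network namespace, not a scheduling step.
After checking several times: the image has been pulled on node mcw8, but the pod did not pick it up and did not pull again either. So delete the pod and let it be recreated automatically, using the image that is now local to mcw8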
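One likely explanation for the node ignoring the locally pulled image: the Events show the image as plain `nginx`, and when a container spec does not set `imagePullPolicy`, Kubernetes defaults it to `Always` for an untagged or `:latest` image and to `IfNotPresent` otherwise. With `Always` the kubelet still contacts the registry even when the image is already on the node. The function below only illustrates that defaulting rule; it is not code from Kubernetes, and its name is made up.

```shell
# Illustration of the defaulting rule for imagePullPolicy
# when the pod spec leaves the field unset.
default_pull_policy() {
    name="${1##*/}"                      # drop registry/repository path (may contain host:port)
    case "$name" in
        *@*)      echo IfNotPresent ;;   # pinned by digest
        *:latest) echo Always ;;         # explicit :latest tag
        *:*)      echo IfNotPresent ;;   # any other tag
        *)        echo Always ;;         # no tag means :latest
    esac
}

default_pull_policy nginx                            # Always
default_pull_policy nginx:1.21                       # IfNotPresent
default_pull_policy quay.io/coreos/flannel:v0.15.1   # IfNotPresent
```

Pinning a tag in the Deployment (for example nginx:1.21) or setting imagePullPolicy: IfNotPresent lets the kubelet use an image that is already present on the node.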
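Listing the pods in all namespaces also shows <invalid> where an age should be, so something is off on mcw8 and mcw9. Does the network need to be regenerated? It is created when a node joins the cluster. (kubectl prints <invalid> when a timestamp lies in the future relative to the local clock, so the manual clock changes made earlier are a likely cause.)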
[machangwei@mcw7 ~]$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-tvz9q 0/1 CrashLoopBackOff 102 (<invalid> ago) 8h 10.0.0.138 mcw8 <none> <none>
kube-system kube-flannel-ds-v28gj 1/1 Running 102 (<invalid> ago) 8h 10.0.0.139 mcw9 <none> <none>
Deleting a k8s system pod requires specifying its namespace
[machangwei@mcw7 ~]$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-tvz9q 0/1 CrashLoopBackOff 103 (<invalid> ago) 8h 10.0.0.138 mcw8 <none> <none>
kube-system kube-flannel-ds-v28gj 0/1 CrashLoopBackOff 102 (<invalid> ago) 8h 10.0.0.139 mcw9 <none> <none>
kube-system kube-flannel-ds-vjfkz 1/1 Running 0 8h 10.0.0.137 mcw7 <none> <none>
[machangwei@mcw7 ~]$ kubectl delete pod kube-flannel-ds-tvz9q
Error from server (NotFound): pods "kube-flannel-ds-tvz9q" not found
[machangwei@mcw7 ~]$ kubectl delete pod kube-flannel-ds-tvz9q --namespace=kube-system
pod "kube-flannel-ds-tvz9q" deleted
[machangwei@mcw7 ~]$ kubectl delete pod kube-flannel-ds-v28gj --namespace=kube-system
pod "kube-flannel-ds-v28gj" deleted
[machangwei@mcw7 ~]$ kubectl get pod --all-namespaces -o wide # not much changed, still <invalid>
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-gr7ck 0/1 CrashLoopBackOff 1 (<invalid> ago) 21s 10.0.0.138 mcw8 <none> <none>
kube-system kube-flannel-ds-m6qgl 1/1 Running 1 (<invalid> ago) 6s 10.0.0.139 mcw9 <none> <none>
kube-system kube-flannel-ds-vjfkz 1/1 Running 0 8h 10.0.0.137 mcw7 <none> <non
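Cloned virtual machines run into all kinds of problems with containers; virtual machines created from scratch do not have these issues.
After recreating three virtual machines, the following problem came up during deployment: coredns stays Pending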
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-nsv4x 0/1 Pending 0 8m59s
kube-system coredns-6d8c4cb4d-t7hr6 0/1 Pending 0 8m59s
Troubleshooting:
Check the error message:
[machangwei@mcwk8s-master ~]$ kubectl describe pod coredns-6d8c4cb4d-nsv4x --namespace=kube-system
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 21s (x7 over 7m9s) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
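Solution:
By default k8s does not schedule pods onto the master node. To force-allow it: kubectl taint nodes --all node-role.kubernetes.io/master-
[machangwei@mcwk8s-master ~]$ kubectl get nodes # the master is NotReady; run the command below so that the master can also act as a node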
NAME STATUS ROLES AGE VERSION
mcwk8s-master NotReady control-plane,master 16m v1.23.1
[machangwei@mcwk8s-master ~]$ kubectl taint nodes --all node-role.kubernetes.io/master-
node/mcwk8s-master untainted
[machangwei@mcwk8s-master ~]$
The pod description contains:
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
To allow pods to be deployed on the master node, use:
kubectl taint nodes --all node-role.kubernetes.io/master-
To forbid pods on the master again (in the command below, k8s is the node name):
kubectl taint nodes k8s node-role.kubernetes.io/master=true:NoSchedule
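Before adding or removing taints it helps to see what each node currently carries. A sketch using `kubectl get` with custom columns; the function name is made up. The FailedScheduling event above names `node.kubernetes.io/not-ready`, a taint that is applied while a node is NotReady and removed once it becomes Ready, so removing the master taint does not address it.

```shell
# Hypothetical helper: list each node together with the keys of its taints.
show_node_taints() {
    kubectl get nodes \
        -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Usage (on the master):
#   show_node_taints
```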
Jan 9 11:51:52 mcw10 kubelet: I0109 11:51:52.636701 25612 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Jan 9 11:51:53 mcw10 kubelet: E0109 11:51:53.909336 25612 kubelet.go:2347] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
Jan 9 11:51:57 mcw10 kubelet: I0109 11:51:57.637836 25612 cni.go:240] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
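The kubelet lines above say that no CNI configuration exists in /etc/cni/net.d, which is the state of a node before any network add-on (flannel in these notes) has been applied. A sketch of a quick check; the function name is made up, and the list of accepted file extensions is an assumption based on common CNI setups.

```shell
# Hypothetical helper: succeed if the directory holds at least one CNI config file.
# $1 is the directory to inspect (defaults to /etc/cni/net.d).
has_cni_config() {
    dir="${1:-/etc/cni/net.d}"
    for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
        # Unmatched globs stay literal, so the -e test fails for them.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

if has_cni_config; then
    echo "CNI config present"
else
    echo "no CNI config: deploy the network add-on first"
fi
```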
[machangwei@mcwk8s-master ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
mcwk8s-master NotReady control-plane,master 43m v1.23.1
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-t24gx 0/1 Pending 0 18m
kube-system coredns-6d8c4cb4d-t7hr6 0/1 Pending 0 42m
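It turned out that the earlier fix apparently had nothing to do with it: the real cause was that no network had been deployed. Once the network was deployed, the two dns pods came up
As follows:
[machangwei@mcwk8s-master ~]$ kubectl apply -f mm.yml # deploy the network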
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
[machangwei@mcwk8s-master ~]$ kubectl get nodes # the node is still not ready
NAME STATUS ROLES AGE VERSION
mcwk8s-master NotReady control-plane,master 45m v1.23.1
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces # the dns pods are not up yet, and flannel is still initializing
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-t24gx 0/1 Pending 0 20m
kube-system coredns-6d8c4cb4d-t7hr6 0/1 Pending 0 44m
kube-system kube-flannel-ds-w8v9s 0/1 Init:0/2 0 14s
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces # checking again: the image pull failed
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-t24gx 0/1 Pending 0 20m
kube-system coredns-6d8c4cb4d-t7hr6 0/1 Pending 0 45m
kube-system kube-flannel-ds-w8v9s 0/1 Init:ErrImagePull 0 45s
[machangwei@mcwk8s-master ~]$ kubectl describe pod kube-flannel-ds-w8v9s --namespace=kube-system # view the description
Warning Failed 4m26s kubelet Error: ErrImagePull # image pulls kept failing, although the network checked out fine
Warning Failed 4m25s kubelet Error: ImagePullBackOff # it took about three minutes before the image pull succeeded
Normal BackOff 4m25s kubelet Back-off pulling image "quay.io/coreos/flannel:v0.15.1"
Normal Pulling 4m15s (x2 over 4m45s) kubelet Pulling image "quay.io/coreos/flannel:v0.15.1"
Normal Pulled 3m36s kubelet Successfully pulled image "quay.io/coreos/flannel:v0.15.1" in 39.090145025s
Normal Created 3m35s kubelet Created container install-cni
Normal Started 3m35s kubelet Started container install-cni
Normal Pulled 3m35s kubelet Container image "quay.io/coreos/flannel:v0.15.1" already present on machine
Normal Created 3m35s kubelet Created container kube-flannel
Normal Started 3m34s kubelet Started container kube-flannel
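Checking the node again, it is now Ready. In other words, coredns only comes up once the network is deployed, and only then does the master, acting as a node, become Ready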
[machangwei@mcwk8s-master ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
mcwk8s-master Ready control-plane,master 57m v1.23.1
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-t24gx 1/1 Running 0 32m
kube-system coredns-6d8c4cb4d-t7hr6 1/1 Running 0 56m
kube-system etcd-mcwk8s-master 1/1 Running 0 57m
kube-system kube-apiserver-mcwk8s-master 1/1 Running 0 57m
kube-system kube-controller-manager-mcwk8s-master 1/1 Running 0 57m
kube-system kube-flannel-ds-w8v9s 1/1 Running 0 12m
kube-system kube-proxy-nvw6m 1/1 Running 0 56m
kube-system kube-scheduler-mcwk8s-master 1/1 Running 0 57m
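After the join was run on node1, two flannel network pods that are not ready appeared on the master
They belong to the nodes' network. It does not seem to affect usage for now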
[root@mcwk8s-node1 ~]$ kubeadm join 10.0.0.140:6443 --token 8yficm.352yz89c44mqk4y6 \
> --discovery-token-ca-cert-hash sha256:bcd36381d3de0adb7e05a12f688eee4043833290ebd39366fc47dd5233c552bf
Two more not-ready pods show up on the master, which means the network pod has not finished deploying on the nodes
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-flannel-ds-75npz 0/1 Init:1/2 0 99s
kube-system kube-flannel-ds-lpmxf 0/1 Init:1/2 0 111s
kube-system kube-flannel-ds-w8v9s 1/1 Running 0 16m
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-75npz 0/1 CrashLoopBackOff 4 (50s ago) 4m37s 10.0.0.141 mcwk8s-node1 <none> <none>
kube-system kube-flannel-ds-lpmxf 0/1 Init:ImagePullBackOff 0 4m49s 10.0.0.142 mcwk8s-node2 <none> <none>
kube-system kube-flannel-ds-w8v9s 1/1 Running 0 19m 10.0.0.140 mcwk8s-master <none> <none>
Check the node status: one node is Ready now
[machangwei@mcwk8s-master ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
mcwk8s-master Ready control-plane,master 65m v1.23.1
mcwk8s-node1 Ready <none> 5m22s v1.23.1
mcwk8s-node2 NotReady <none> 5m35s v1.23.1
Looking at the pods at this point: although the node is Ready, the status of the network pods still shows a problem
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-75npz 0/1 CrashLoopBackOff 5 (44s ago) 6m5s 10.0.0.141 mcwk8s-node1 <none> <none>
kube-system kube-flannel-ds-lpmxf 0/1 Init:ImagePullBackOff 0 6m17s 10.0.0.142 mcwk8s-node2 <none> <none>
kube-system kube-flannel-ds-w8v9s 1/1 Running 0 21m 10.0.0.140 mcwk8s-master <none> <none>
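Describe the pod to look into CrashLoopBackOff. This state means the container is created and started but keeps exiting, and the kubelet backs off before restarting it again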
Normal Created 5m10s (x4 over 5m59s) kubelet Created container kube-flannel
Normal Started 5m10s (x4 over 5m58s) kubelet Started container kube-flannel
Warning BackOff 4m54s (x5 over 5m52s) kubelet Back-off restarting failed container
Normal Pulled 2m52s (x6 over 5m59s) kubelet Container image "quay.io/coreos/flannel:v0.15.1" already present on machine
Describe the pod to look into Init:ImagePullBackOff: the image pull is failing
Warning Failed 23s (x4 over 5m42s) kubelet Failed to pull image "quay.io/coreos/flannel:v0.15.1": rpc error: code = Unknown desc = context canceled
Warning Failed 23s (x4 over 5m42s) kubelet Error: ErrImagePull
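Importing and exporting images
Recommendations:
Choose the command according to the use case.
To just back up images, save and load are enough.
If the container's contents changed after it was started and need backing up, use export and import.
Example
docker save -o nginx.tar nginx:latest
or
docker save nginx:latest > nginx.tar
Here -o and > both write the output to a file; nginx.tar is the target file and nginx:latest is the source image (name:tag).
Example
docker load -i nginx.tar
or
docker load < nginx.tar
Here -i and < both read the input from a file. The image is imported together with its metadata, including the tag.
Example
docker export -o nginx-test.tar nginx-test
Here -o writes the output to a file; nginx-test.tar is the target file and nginx-test is the source container (name).
docker import nginx-test.tar nginx:imp
or
cat nginx-test.tar | docker import - nginx:imp
Differences:
The tar file produced by export is slightly smaller than the one produced by save.
export produces the tar file from a container, while save produces it from an image.
Because of the second point, a file produced by export and brought back with import cannot keep the image history (that is, the information about each layer; see Dockerfile if this is unfamiliar), so it cannot be rolled back to an earlier layer. save works from the image, so load restores every layer completely. In the source article's example, nginx:latest went through save and load, and nginx:imp through export and import.
Original article: https://blog.csdn.net/ncdx111/article/details/79878098
Resolving the Init:ImagePullBackOff state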
node2 turns out to have no flannel image
[root@mcwk8s-node2 ~]$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.23.1 b46c42588d51 3 weeks ago 112MB
rancher/mirrored-flannelcni-flannel-cni-plugin v1.0.0 cd5235cd7dc2 2 months ago 9.03MB
registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12 4 months ago 683kB
Export the image on the master node and upload it to node2
[root@mcwk8s-master ~]$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/coreos/flannel v0.15.1 e6ea68648f0c 8 weeks ago 69.5MB
[root@mcwk8s-master ~]$ docker save quay.io/coreos/flannel >mcwflanel-image.tar.gz
[root@mcwk8s-master ~]$ ls
anaconda-ks.cfg jiarujiqun.txt mcwflanel-image.tar.gz
[root@mcwk8s-master ~]$ scp mcwflanel-image.tar.gz 10.0.0.142:/root/
Import the image on node2; it succeeds
[root@mcwk8s-node2 ~]$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.23.1 b46c42588d51 3 weeks ago 112MB
rancher/mirrored-flannelcni-flannel-cni-plugin v1.0.0 cd5235cd7dc2 2 months ago 9.03MB
registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12 4 months ago 683kB
[root@mcwk8s-node2 ~]$ ls
anaconda-ks.cfg
[root@mcwk8s-node2 ~]$ ls
anaconda-ks.cfg mcwflanel-image.tar.gz
[root@mcwk8s-node2 ~]$ docker load < mcwflanel-image.tar.gz
ab9ef8fb7abb: Loading layer [==================================================>] 2.747MB/2.747MB
2ad3602f224f: Loading layer [==================================================>] 49.46MB/49.46MB
54089bc26b6b: Loading layer [==================================================>] 5.12kB/5.12kB
8c5368be4bdf: Loading layer [==================================================>] 9.216kB/9.216kB
5c32c759eea2: Loading layer [==================================================>] 7.68kB/7.68kB
Loaded image: quay.io/coreos/flannel:v0.15.1
[root@mcwk8s-node2 ~]$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.23.1 b46c42588d51 3 weeks ago 112MB
quay.io/coreos/flannel v0.15.1 e6ea68648f0c 8 weeks ago 69.5MB
rancher/mirrored-flannelcni-flannel-cni-plugin v1.0.0 cd5235cd7dc2 2 months ago 9.03MB
registry.aliyuncs.com/google_containers/pause 3.6 6270bb605e12 4 months ago 683kB
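The save, scp and load steps above can also be chained into one pipeline. A sketch assuming SSH access from the master to the node as a user who is allowed to run docker; the function name is made up. Naming the tag explicitly keeps the archive to the one image that is needed.

```shell
# Hypothetical helper: copy a local image to another host without a temporary file.
# $1 = image (name:tag), $2 = ssh destination
copy_image_to_host() {
    docker save "$1" | ssh "$2" docker load
}

# Usage (on the master):
#   copy_image_to_host quay.io/coreos/flannel:v0.15.1 root@10.0.0.142
```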
On the master, the pod status has changed to CrashLoopBackOff, with many restarts
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-flannel-ds-75npz 0/1 CrashLoopBackOff 9 (4m47s ago) 28m
kube-system kube-flannel-ds-lpmxf 0/1 CrashLoopBackOff 4 (74s ago) 28m
kube-system kube-flannel-ds-w8v9s 1/1 Running 0 43m
The description shows that restarts keep failing. Dealing with CrashLoopBackOff
[machangwei@mcwk8s-master ~]$ kubectl describe pod kube-flannel-ds-lpmxf --namespace=kube-system
Warning BackOff 3m25s (x20 over 7m48s) kubelet Back-off restarting failed container
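The Back-off event only says that the container keeps exiting; the reason is in the container's own output. A sketch of fetching the log of the last crashed instance with standard kubectl flags; the function name is made up, and the container name kube-flannel is taken from the Events shown earlier.

```shell
# Hypothetical helper: show the log of the previous (crashed) flannel container.
# $1 = pod name, e.g. kube-flannel-ds-lpmxf
flannel_crash_log() {
    kubectl logs "$1" --namespace=kube-system -c kube-flannel --previous
}

# Usage (on the master):
#   flannel_crash_log kube-flannel-ds-lpmxf
```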
Although these two pods on the nodes never become ready, the nodes themselves are Ready. Leave it for now and deploy an application to verify
[machangwei@mcwk8s-master ~]$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-flannel-ds-75npz 0/1 CrashLoopBackOff 12 (114s ago) 41m
kube-system kube-flannel-ds-lpmxf 0/1 CrashLoopBackOff 8 (3m46s ago) 41m
[machangwei@mcwk8s-master ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
mcwk8s-master Ready control-plane,master 100m v1.23.1
mcwk8s-node1 Ready <none> 40m v1.23.1
mcwk8s-node2 Ready <none> 41m v1.23.1
Check whether the environment is usable: no problems, applications can be deployed
[machangwei@mcwk8s-master ~]$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
mcw01dep-nginx 1/1 1 1 5m58s
mcw02dep-nginx 1/2 2 1 71s
[machangwei@mcwk8s-master ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
mcw01dep-nginx-5dd785954d-z7s8m 1/1 Running 0 7m21s
mcw02dep-nginx-5b8b58857-7mlmh 1/1 Running 0 2m34s
mcw02dep-nginx-5b8b58857-pvwdd 1/1 Running 0 2m34s
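Delete the test resources, then save a virtual machine snapshot. If the k8s environment changes later and would need redeploying, restoring the snapshot is enough.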
[machangwei@mcwk8s-master ~]$ kubectl get pod
NAME READY STATUS RESTARTS AGE
mcw01dep-nginx-5dd785954d-z7s8m 1/1 Running 0 7m21s
mcw02dep-nginx-5b8b58857-7mlmh 1/1 Running 0 2m34s
mcw02dep-nginx-5b8b58857-pvwdd 1/1 Running 0 2m34s
[machangwei@mcwk8s-master ~]$
[machangwei@mcwk8s-master ~]$ kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
mcw01dep-nginx 1/1 1 1 7m39s
mcw02dep-nginx 2/2 2 2 2m52s
[machangwei@mcwk8s-master ~]$ kubectl delete deployment mcw01dep-nginx mcw02dep-nginx
deployment.apps "mcw01dep-nginx" deleted
deployment.apps "mcw02dep-nginx" deleted
[machangwei@mcwk8s-master ~]$ kubectl get deployment
No resources found in default namespace.
[machangwei@mcwk8s-master ~]$
[machangwei@mcwk8s-master ~]$ kubectl get pod
No resources found in default namespace.
kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
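The virtual machine was extremely sluggish and hung for ages
The system or the network is using too much CPU, causing a kernel soft lockup. What soft lockup means: the bug does not bring the whole system down, but some processes (or kernel threads) are stuck in a certain state (usually in kernel space). In many cases this comes from a problem with how kernel locks are used.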
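For context, the kernel reports a soft lockup when a CPU has been stuck for longer than twice `kernel.watchdog_thresh` seconds; the default of 10 gives 20 seconds, which is consistent with the 22s in the message. A sketch that reads the current setting; the function name is made up.

```shell
# Hypothetical helper: print the soft lockup threshold in seconds.
# $1 = file holding watchdog_thresh (defaults to the live kernel value).
soft_lockup_threshold() {
    thresh=$(cat "${1:-/proc/sys/kernel/watchdog_thresh}")
    echo $(( thresh * 2 ))
}

# Usage:
#   soft_lockup_threshold
```

On a busy virtual machine host the message often just reflects the guest being starved of CPU time, so giving the VM more resources is usually the first thing to try.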