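kubeadm is a tool from the official Kubernetes community for quickly deploying a Kubernetes cluster.

1. Requirements

  • Many online tutorials say you need three or four machines. For a personal test installation that is unnecessary: a single server with 4 cores and 4 GB of RAM is enough, acting as both the master node and the worker node. After installation, one extra command (shown later) removes the master taint so that Pods can be scheduled on it.

2. Prepare the environment

Run on every node: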
systemctl stop firewalld
systemctl disable firewalld
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab

cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
EOF
sysctl --system

yum install ntpdate -y
ntpdate time.windows.com
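hostnamectl set-hostname <hostname> # Set this server's hostname; it only has to match the IP/hostname entry in the hosts file below.

cat >> /etc/hosts << EOF
192.168.26.129 master1 # change to the actual IP and hostname
192.168.26.129 cluster-endpoint # the master is referenced by name so it can later be switched to a multi-master setup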
192.168.26.130 node1
EOF
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

lsmod|grep overlay
lsmod|grep br_netfilter
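
By default the kubelet refuses to start while swap is active, which is why the steps above run swapoff and comment the swap entries out of /etc/fstab. The following is a minimal sketch for checking that the fstab edit took effect. The helper name active_swap_entries is hypothetical, and it assumes the usual whitespace-separated fstab layout with the filesystem type in the third field. The demo works on a scratch file rather than the real /etc/fstab.

```shell
# Hypothetical helper: count uncommented swap entries in an fstab-style file.
active_swap_entries() {
  # Skip comment lines; field 3 holds the filesystem type.
  awk '!/^[ \t]*#/ && $3 == "swap" { n++ } END { print n + 0 }' "$1"
}

# Demonstrate on a scratch copy so the real /etc/fstab is not touched.
demo=$(mktemp)
printf '%s\n' '/dev/sda1 / ext4 defaults 0 0' \
              '/dev/sda2 swap swap defaults 0 0' > "$demo"
echo "active swap entries before: $(active_swap_entries "$demo")"
sed -ri 's/.*swap.*/#&/' "$demo"   # same expression as used on /etc/fstab above
echo "active swap entries after:  $(active_swap_entries "$demo")"
rm -f "$demo"
```

On a real node the call would be active_swap_entries /etc/fstab, and any result other than 0 means swap will come back after a reboot.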

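3. Install Docker/kubeadm/kubelet on all nodes

Older Kubernetes releases used Docker as the default container runtime, and installing Docker also installs containerd, so install Docker first.

3.1 Install Docker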

wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
systemctl enable docker && systemctl start docker

cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://zahdqyo7.mirror.aliyuncs.com"],
  "data-root": "/data/docker",
  "log-driver":"json-file",
  "log-opts": {"max-size":"20m", "max-file":"1"},
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload && systemctl restart docker

3.2 Add the Aliyun package repository

CentOS:
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Ubuntu:

sudo apt-get update && sudo apt-get install -y apt-transport-https curl

sudo curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -

sudo tee /etc/apt/sources.list.d/kubernetes.list <<-'EOF'
deb https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial main
EOF

sudo apt-get update
# Point sandbox_image at the Aliyun google_containers mirror (all nodes)
containerd config default > /etc/containerd/config.toml
grep sandbox_image  /etc/containerd/config.toml
sed -i "s#k8s.gcr.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
grep sandbox_image  /etc/containerd/config.toml

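# Configure the containerd cgroup driver as systemd (all nodes)
    Since v1.24.0, Kubernetes no longer ships dockershim and uses containerd as the container runtime endpoint instead, so containerd must be installed. Installing Docker above already installed containerd as a dependency. Docker acts only as a client here; the container engine is containerd.

sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
# After applying all changes, restart containerd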
systemctl restart containerd
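
Both sed commands above succeed silently even when they match nothing, for example when the generated default config names a different pause image registry. The following is a rough sketch of a check that confirms the two edits are present. The function name check_containerd_config is hypothetical, and the sample file only imitates the two relevant lines of a generated config.toml.

```shell
# Hypothetical helper: print one line per expected setting that is missing
# from a containerd config.toml; print nothing when both are in place.
check_containerd_config() {
  cfg=$1
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$cfg" \
    || echo "SystemdCgroup is not set to true"
  grep -Eq '^[[:space:]]*sandbox_image[[:space:]]*=.*registry\.aliyuncs\.com/google_containers/pause' "$cfg" \
    || echo "sandbox_image does not point at the Aliyun mirror"
}

# Demonstrate on a scratch file that imitates the two relevant default lines.
demo=$(mktemp)
printf '%s\n' '    sandbox_image = "k8s.gcr.io/pause:3.6"' \
              '            SystemdCgroup = false' > "$demo"
check_containerd_config "$demo"
sed -i "s#k8s.gcr.io/pause#registry.aliyuncs.com/google_containers/pause#g" "$demo"
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' "$demo"
check_containerd_config "$demo"   # prints nothing once both edits applied
rm -f "$demo"
```

On a node the call would be check_containerd_config /etc/containerd/config.toml, run before restarting containerd.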

# Configure a registry mirror in /etc/containerd/config.toml (restart containerd again after editing)

    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://docker.mirrors.ustc.edu.cn/"]

3.3 Install kubeadm, kubelet and kubectl

Releases are frequent, so pin the version explicitly:

CentOS:
yum install kubeadm-1.24.4 kubectl-1.24.4 kubelet-1.24.4 -y
systemctl enable kubelet
Ubuntu:
apt-get install -y kubeadm=1.24.4-00 kubectl=1.24.4-00 kubelet=1.24.4-00
systemctl enable kubelet

4. Deploy the Kubernetes master

Run on 192.168.26.129 (Master).

kubeadm init \
  --apiserver-advertise-address=192.168.26.129 \
  --image-repository registry.aliyuncs.com/google_containers \
  --control-plane-endpoint=cluster-endpoint \
  --kubernetes-version v1.24.4 \
  --service-cidr=10.1.0.0/16 \
  --pod-network-cidr=10.244.0.0/16 \
  --v=5
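# --image-repository string:     where to pull the control-plane images from (available since 1.13). The default is k8s.gcr.io; here it is set to the domestic mirror registry.aliyuncs.com/google_containers.
# --kubernetes-version string:   the Kubernetes version to install. The default, stable-1, makes kubeadm fetch the latest version number from https://dl.k8s.io/release/stable-1.txt; pinning a fixed version (v1.24.4 here) skips that network request.
# --apiserver-advertise-address: which interface of the master is used to communicate with the other nodes in the cluster. If the master has several interfaces, specify one explicitly; otherwise kubeadm picks the interface that has the default gateway. Use the master node's IP here and remember to change it.
# --pod-network-cidr:            the Pod network range. Kubernetes supports several network add-ons, and each has its own requirements for --pod-network-cidr. 10.244.0.0/16 is used here because it must match the network configured in the flannel manifest, which defaults to this CIDR.
# --control-plane-endpoint:      cluster-endpoint is a custom DNS name mapped to the master IP, configured here through the hosts entry 192.168.26.129 cluster-endpoint. This lets you pass --control-plane-endpoint=cluster-endpoint to kubeadm init and the same DNS name to kubeadm join. Later, cluster-endpoint can be repointed at the address of a load balancer in a high-availability setup.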
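
The Service range (--service-cidr) and the Pod range (--pod-network-cidr) must not overlap each other or the node network. The following is a rough sketch of an overlap check in plain shell arithmetic, useful when changing the example ranges. All three function names are hypothetical and only IPv4 CIDRs without leading zeros are handled.

```shell
# Hypothetical helper: pack a dotted-quad IPv4 address into one integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hypothetical helper: print "first last" integer addresses of a CIDR block.
cidr_bounds() {
  base=$(ip_to_int "${1%/*}")
  bits=${1#*/}
  size=$(( 1 << (32 - bits) ))
  start=$(( base & ~(size - 1) ))   # clear the host bits
  echo "$start $(( start + size - 1 ))"
}

# Hypothetical helper: succeed (exit 0) when the two CIDR blocks share addresses.
cidrs_overlap() {
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  # Ranges [s1,e1] and [s2,e2] intersect when s1 <= e2 and s2 <= e1.
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

if cidrs_overlap 10.1.0.0/16 10.244.0.0/16; then
  echo "service and pod CIDRs overlap"
else
  echo "service and pod CIDRs are disjoint"
fi
```

The same function can be pointed at the node subnet (192.168.26.0/24 if the hosts entries above are used as-is) to confirm that neither cluster range collides with it.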

Set up kubectl:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes
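
Pseudo multi-master setup: several control-plane nodes run the control-plane components, but clients always connect to one fixed apiserver endpoint.
kubeadm init --kubernetes-version=v1.24.0 --control-plane-endpoint "k8s-api:6443" --upload-certs --image-repository registry.aliyuncs.com/google_containers  --pod-network-cidr 10.244.0.0/16
# Additional master nodes can be joined to a cluster initialized this way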

5. Join Kubernetes nodes

Run on 192.168.26.130 (Node).

To add a new node to the cluster, run the kubeadm join command printed by kubeadm init:

kubeadm join cluster-endpoint:6443 --token esce21.q6hetwm8si29qxwn \
    --discovery-token-ca-cert-hash sha256:00603a05805807501d7181c3d60b478788408cfe6cedefedb1f97569708be9c5

The default token is valid for 24 hours and cannot be used once it expires. In that case, create a new token as follows:

kubeadm token create --print-join-command
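
When a join command is copied between terminals, a truncated token or hash is a common cause of a failed join. The following is a small sketch that checks the shape of both values before running kubeadm join. The function names are hypothetical; the patterns follow the bootstrap token format (6 characters, a dot, 16 characters) and the sha256: prefix followed by 64 hex digits.

```shell
# Hypothetical helper: succeed when the argument looks like a bootstrap token.
valid_bootstrap_token() {
  # <6 chars>.<16 chars>, lowercase letters and digits only.
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Hypothetical helper: succeed when the argument looks like a CA cert hash.
valid_ca_cert_hash() {
  # "sha256:" followed by exactly 64 hexadecimal digits.
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

valid_bootstrap_token "esce21.q6hetwm8si29qxwn" && echo "token format ok"
valid_ca_cert_hash "sha256:1234" || echo "hash looks truncated"
```

These checks only cover the format; whether a token is still valid is shown by kubeadm token list on the master.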

6. Deploy the CNI network plugin

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

If the default image address is unreachable, use sed to change it to a Docker Hub image repository in the downloaded file, then apply the file.

kubectl apply -f kube-flannel.yml


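Configure IPVS
[Problem] A ClusterIP (or Service name) cannot be pinged from inside the cluster. In the default iptables mode a ClusterIP is a virtual address that does not answer ICMP; in IPVS mode kube-proxy binds it to a local dummy interface, so ping works.
1. Load the ip_vs kernel modules
modprobe -- ip_vs
modprobe -- ip_vs_sh
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
Verify on every node that the ipvs modules are loaded:
lsmod |grep ip_vs
2. Install the ipvsadm tool
yum install ipset ipvsadm -y
3. Edit the kube-proxy ConfigMap and set mode to ipvs
kubectl edit  configmap -n kube-system  kube-proxy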

    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 0s
      tcpFinTimeout: 0s
      tcpTimeout: 0s
      udpTimeout: 0s
    kind: KubeProxyConfiguration
    metricsBindAddress: ""
    mode: "ipvs" # set mode to ipvs
4. Restart kube-proxy
# List the current kube-proxy Pods
kubectl get pod -n kube-system | grep kube-proxy
# Delete them so that the DaemonSet recreates them
kubectl get pod -n kube-system | grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'
# List them again
kubectl get pod -n kube-system | grep kube-proxy

# Show the forwarding rules
ipvsadm -Ln
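
Step 1 above loads four ip_vs modules and then checks the lsmod output by eye. The following is a small sketch that reports which of the requested modules are absent. The helper name missing_modules is hypothetical; it reads a /proc/modules-style file in which the first column is the module name. Modules compiled into the kernel do not appear in /proc/modules, so a reported module is a hint to investigate rather than proof of a problem.

```shell
# Hypothetical helper: print each requested module that is not listed in a
# /proc/modules-style file (module name in the first column).
missing_modules() {
  modfile=$1; shift
  for m in "$@"; do
    awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$modfile" \
      || echo "$m"
  done
}

# Demonstrate on a scratch file that imitates /proc/modules.
demo=$(mktemp)
printf '%s\n' 'ip_vs_rr 16384 0 - Live 0x0' \
              'ip_vs 155648 2 ip_vs_rr, Live 0x0' > "$demo"
missing_modules "$demo" ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh
rm -f "$demo"
```

On a node the call would be missing_modules /proc/modules ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh.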

Check that the Pods are in the Running state.

kubectl get pods -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
kube-flannel-ds-amd64-2pc95   1/1     Running   0          72s
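
Allow Pods to be scheduled on the master node

For safety, Kubernetes does not schedule Pods on the master node by default. A pending Pod reports one of the two taint messages below, depending on the Kubernetes version; remove the taint that the message names: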

    1 node(s) had taint {node-role.kubernetes.io/master: } that the pod didn't tolerate.:

    kubectl taint nodes --all node-role.kubernetes.io/master-


    1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/1 nodes are available:

    kubectl taint nodes --all node-role.kubernetes.io/control-plane-

7. Test the Kubernetes cluster

Create a Pod in the cluster and verify that it runs correctly:

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc

Access URL: http://NodeIP:Port
If the page loads, the cluster deployment is complete.

Recommended web UI tools:

rancher:

docker run -d --restart=unless-stopped \
        -p 8080:80 -p 8443:443 \
        -v /k8s/rancher/rancher:/var/lib/rancher \
        -v /k8s/rancher/auditlog:/var/log/auditlog \
        --privileged \
        --name rancher rancher/rancher:latest

kuboard (recommended):

sudo docker run -d \
  --restart=unless-stopped \
  --name=kuboard \
  -p 80:80/tcp \
  -p 10081:10081/tcp \
  -e KUBOARD_ENDPOINT="http://192.168.26.129:80" \
  -e KUBOARD_AGENT_SERVER_TCP_PORT="10081" \
  -v /root/kuboard-data:/data \
  eipwork/kuboard:v3.5.0.3
  # The image swr.cn-east-2.myhuaweicloud.com/kuboard/kuboard:v3 can be used instead for a faster download.
  # Do not use 127.0.0.1 or localhost as the internal IP.
  # Kuboard does not need to be on the same network segment as K8S; the Kuboard Agent can even reach the Kuboard Server through a proxy.

Kuboard tutorial: https://www.kuboard.cn/install/v3/install-built-in.html#%E5%AE%89%E8%A3%85
Ingress and StorageClass can be deployed through Kuboard.

cat /etc/exports
/home/nfs *(insecure,rw,async,no_root_squash)

exportfs -arv

Installing and uninstalling network plugins with kubeadm:

Calico:
# Install
export POD_SUBNET=10.244.0.0/16   # change to the cluster's Pod network range
kubectl apply -f https://kuboard.cn/install-script/v1.22.x/calico-operator.yaml
wget https://kuboard.cn/install-script/v1.22.x/calico-custom-resources.yaml
sed -i "s#192.168.0.0/16#${POD_SUBNET}#" calico-custom-resources.yaml
kubectl apply -f calico-custom-resources.yaml
# Uninstall
ifconfig tunl0 down
ifconfig cni0 down
ip link delete cni0
ip link delete tunl0
rm -rf /var/lib/cni/
rm -f /etc/cni/net.d/*
# After the steps above, restart kubelet, coredns, eip-nfs and ingress, and redeploy all workloads
Flannel:
# Install
export POD_SUBNET=10.244.0.0/16   # change to the cluster's Pod network range
wget https://kuboard.cn/install-script/flannel/flannel-v0.14.0.yaml
sed -i "s#10.244.0.0/16#${POD_SUBNET}#" flannel-v0.14.0.yaml
kubectl apply -f ./flannel-v0.14.0.yaml
# Uninstall
kubectl delete -f ./flannel-v0.14.0.yaml
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f /etc/cni/net.d/*
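# After the steps above, restart kubelet, coredns, eip-nfs and ingress, and redeploy all workloads

Problems encountered

Problem 1: Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
Solution:
cd /etc/kubernetes/manifests
Then comment out --port=0 in the kube-scheduler and kube-controller-manager YAML manifests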

containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
#    - --port=0   # comment out this line

Problem 2: Certificates expired

Check the expiry date of each certificate

kubeadm certs check-expiration
root@vms28:~# ls /var/lib/kubelet/pki/
kubelet-client-2021-11-04-03-44-16.pem kubelet-client-current.pem  kubelet.crt  kubelet.key
root@vms28:~#
root@vms28:~# openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -text  |grep Not
            Not Before: Nov  3 19:44:13 2021 GMT
            Not After : Nov  3 19:44:15 2022 GMT
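
The Not After line printed above can be turned into a remaining-days figure, which is convenient for a periodic check. The following is a minimal sketch, assuming GNU date (its -d option parses the openssl timestamp format). The function name days_until_expiry is hypothetical, and the optional second argument exists only so that the calculation can be reproduced with a fixed "now".

```shell
# Hypothetical helper: whole days from "now" to a certificate's Not After time.
# $1: timestamp as printed by openssl, e.g. "Nov  3 19:44:15 2022 GMT"
# $2: optional current time in epoch seconds (defaults to the system clock)
# Assumes GNU date for -d parsing.
days_until_expiry() {
  not_after=$1
  now=${2:-$(date +%s)}
  end=$(date -u -d "$not_after" +%s) || return 1
  echo $(( (end - now) / 86400 ))
}

# On a node the timestamp could be taken straight from the certificate:
#   not_after=$(openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem \
#               -noout -enddate | cut -d= -f2)
# A negative result means the certificate has already expired.
days_until_expiry "Nov  3 19:44:15 2022 GMT"
```

A result below some threshold, for example 30, would be the cue to run the renewal steps that follow.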

Renew all certificates on the master.

root@vms28:~# kubeadm certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
	...输出...
Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates.
root@vms28:~#

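Checking the dates again shows that they have changed.

# The kubeconfig files (*.conf) still embed the expired certificates, so delete them all and regenerate them.
rm -rf /etc/kubernetes/*.conf
# Regenerate the kubeconfig files; specify the version that matches kubeadm.
kubeadm init --kubernetes-version=v1.23.1  phase kubeconfig all
# Copy the new admin config into place
cp /etc/kubernetes/admin.conf ~/.kube/config
# Restart kube-scheduler
docker rm -f $(docker ps | awk '/kube-scheduler /{print $1}')
# Restart kubelet
systemctl restart kubelet
# Check the kubelet certificates; a new certificate file has been generated
ls /var/lib/kubelet/pki/
# List the certificate signing requests on the master
kubectl get csr
# Approve the CSR
kubectl certificate approve csr-rn8xc
# Check the expiry date again; it has been updated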
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -text  |grep Not

The kubelet.conf for a worker node must be generated on the master node:
root@vms28:~# kubeadm init --kubernetes-version=v1.23.1 phase kubeconfig kubelet --node-name vms29.rhce.cc --kubeconfig-dir /tmp/
[kubeconfig] Writing "kubelet.conf" kubeconfig file
root@vms28:~# 
root@vms28:~# ls /tmp/
kubelet.conf
root@vms28:~# 
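# Copy kubelet.conf to /etc/kubernetes on the worker node, then restart kubelet there. Afterwards check with kubectl get csr.

Problem 3: The IngressClass deployed through Kuboard is not set as the default; add the annotation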
[root@192 ~]# kubectl edit ingressclass ingress

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  annotations:
    ingressclass.kubernetes.io/is-default-class: "true" # add this line to make this IngressClass the default
    k8s.kuboard.cn/managed-by-kuboard: "true"
  creationTimestamp: "2022-07-05T06:22:06Z"
  generation: 1
  name: ingress
  resourceVersion: "7010"
  uid: 0c3de158-a159-4f71-bb6c-cb61e079559e
spec:
  controller: k8s.io/ingress-nginx

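Offline kubeadm deployment of a k8s cluster

Prepare the environment:
Reference links:

Package download: https://download.docker.com/linux/static/stable/x86_64/

Step 1. Extract docker-20.10.1.tgz

mkdir -p /data/install
cp docker-20.10.1.tgz /data/install/ && cd /data/install/
tar zxf docker-20.10.1.tgz
cp docker/* /usr/bin/

Step 2. Create the Docker storage path

mkdir -p /data/docker

Step 3. Edit the systemd unit file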

vim /usr/lib/systemd/system/docker.service

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target

# Requires=docker.socket

Wants=containerd.service

[Service]
Type=notify

# the default is not to use systemd for cgroups because the delegate issues still

# exists and systemd currently does not support the cgroup feature set required

# for containers run by docker

#ExecStart=/usr/bin/dockerd -H fd:// --containerd=/var/run/containerd/containerd.sock
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.

# Both the old, and new location are accepted by systemd 229 and up, so using the old location

# to make them work for either version of systemd.

StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.

# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make

# this option work for either version of systemd.

StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead

# in the kernel. We recommend using cgroups to do container-local accounting.

LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.

# Only systemd 226 and above support this option.

TasksMax=infinity

# set delegate yes so that systemd does not reset the cgroups of docker containers

Delegate=yes

# kill only the docker process, not all processes in the cgroup

KillMode=process
OOMScoreAdjust=-500

[Install]
WantedBy=multi-user.target

systemctl restart docker

Step 4. Edit the configuration file

vim /etc/docker/daemon.json

{
  "registry-mirrors" : ["https://mj9kvemk.mirror.aliyuncs.com"],
  "data-root": "/data/docker",
  "log-driver":"json-file",
  "log-opts": {"max-size":"20m", "max-file":"1"}
}

Step 5. Restart Docker

systemctl daemon-reload
systemctl enable docker
systemctl restart docker
rm -rf /var/lib/docker

Step 6. Add a systemd unit for containerd

vim /usr/lib/systemd/system/containerd.service

# Copyright The containerd Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd

Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl restart containerd
systemctl enable containerd

mkdir -p /etc/containerd

# Point sandbox_image at the Aliyun google_containers mirror (all nodes)
containerd config default > /etc/containerd/config.toml
grep sandbox_image  /etc/containerd/config.toml
sed -i "s#k8s.gcr.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
grep sandbox_image  /etc/containerd/config.toml

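# Configure the containerd cgroup driver as systemd (all nodes)
    Since v1.24.0, Kubernetes no longer ships dockershim and uses containerd as the container runtime endpoint instead, so containerd must be installed. The Docker static package extracted above already includes the containerd binaries. Docker acts only as a client here; the container engine is containerd.

sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
# After applying all changes, restart containerd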
systemctl restart containerd

vim /etc/containerd/config.toml

   # Registry mirror
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://docker.mirrors.ustc.edu.cn/"]

   # Change the root and state paths (depending on the disk layout)
    required_plugins = []
    root = "/home/containerd/root"
    state = "/home/containerd/state"
    temp = ""
    version = 2

1. On a machine with internet access, download the offline packages in advance
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

yum -y install kubelet-1.24.4 kubectl-1.24.4 kubeadm-1.24.4 --downloadonly --downloaddir=./

yum -y install ipset ipvsadm --downloadonly --downloaddir=./
2. On the offline machine, run
yum localinstall *.rpm
3. List the required images and export them
kubeadm config images list --image-repository registry.aliyuncs.com/google_containers
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers

On the machine with internet access, export the images to a tar archive after downloading them.

docker save \
  rancher/mirrored-flannelcni-flannel:v0.19.2 \
  rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0 \
  registry.aliyuncs.com/google_containers/kube-apiserver:v1.24.4 \
  registry.aliyuncs.com/google_containers/kube-scheduler:v1.24.4 \
  registry.aliyuncs.com/google_containers/kube-controller-manager:v1.24.4 \
  registry.aliyuncs.com/google_containers/kube-proxy:v1.24.4 \
  registry.aliyuncs.com/google_containers/pause:3.7 \
  registry.aliyuncs.com/google_containers/coredns:v1.8.6 \
  registry.aliyuncs.com/google_containers/etcd:3.5.3-0 \
  eipwork/kuboard:v3 \
  > images.tar

--kubernetes-version string     Default: "stable-1"
Choose a specific Kubernetes version for the control plane.
4. Import the images, then initialize with kubeadm
docker load < images.tar

 ctr -n=k8s.io image import images.tar
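
Before running kubeadm init on the offline machine, it helps to confirm that every image kubeadm expects was actually imported, since a single missing image stalls the init. The following is a minimal sketch that compares two plain-text lists. The function name missing_images and the file names are hypothetical; the lists are assumed to hold one image reference per line.

```shell
# Hypothetical helper: print every image listed in $1 (required) that does not
# appear as an exact line in $2 (loaded).
missing_images() {
  grep -Fxv -f "$2" "$1"
}

# On a node the two lists could be produced like this:
#   kubeadm config images list \
#     --image-repository registry.aliyuncs.com/google_containers > required.txt
#   ctr -n k8s.io images ls -q > loaded.txt

# Demonstrate with scratch files.
required=$(mktemp); loaded=$(mktemp)
printf '%s\n' 'registry.aliyuncs.com/google_containers/pause:3.7' \
              'registry.aliyuncs.com/google_containers/etcd:3.5.3-0' > "$required"
printf '%s\n' 'registry.aliyuncs.com/google_containers/pause:3.7' > "$loaded"
missing_images "$required" "$loaded"
rm -f "$required" "$loaded"
```

An empty result means every required image is present; any printed line names an image that still has to be imported.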