After initially setting up a Kubernetes cluster, the most common operation is to scale it out by adding more nodes that run workloads (containers and Pods). How you scale the cluster depends on the tool that was used to bootstrap it in the first place. This guide demonstrates how to add more worker nodes to a Kubernetes cluster with the kubeadm command-line tool.
Example cluster
The example cluster has one master (control-plane) node and two worker nodes. The container runtime is containerd, the network plugin is calico, and the operating system is Ubuntu 20.04.
The cluster was initialized with --control-plane-endpoint set to k8s-cluster.test.com.
The following entries have been added to /etc/hosts on all existing nodes:
192.168.1.140 k8s-cluster.test.com
192.168.1.140 k8s-master-01 k8s-master-01.test.com
192.168.1.141 k8s-worker-01 k8s-worker-01.test.com
192.168.1.142 k8s-worker-02 k8s-worker-02.test.com
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master-01 Ready control-plane,master 4h20m v1.23.2 192.168.1.140 <none> Ubuntu 20.04.3 LTS 5.4.0-96-generic containerd://1.4.12
k8s-worker-01 Ready <none> 3h28m v1.23.2 192.168.1.141 <none> Ubuntu 20.04.3 LTS 5.4.0-96-generic containerd://1.4.12
k8s-worker-02 Ready <none> 3h20m v1.23.2 192.168.1.142 <none> Ubuntu 20.04.3 LTS 5.4.0-96-generic containerd://1.4.12
Adding a new worker node
Node information
- OS: Ubuntu 20.04
- CPU: 4 vCPU
- Memory: 4 GB
- Hostname: k8s-worker-03
Firewall
# SSH
sudo ufw allow 22/tcp
# Kubelet API
sudo ufw allow 10250/tcp
# NodePort Services
sudo ufw allow 30000:32767/tcp
# Calico (BGP, Typha, VXLAN)
sudo ufw allow 179/tcp
sudo ufw allow 5473/tcp
sudo ufw allow 4789/udp
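If ufw is the active firewall on the node (it must be enabled with sudo ufw enable for the rules to take effect), it may be worth confirming the rules after adding them:
# Review the firewall rules that were just added
sudo ufw status numbered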
/etc/hosts configuration
Note: this must be updated on all nodes.
192.168.1.140 k8s-cluster.test.com
192.168.1.140 k8s-master-01 k8s-master-01.test.com
192.168.1.141 k8s-worker-01 k8s-worker-01.test.com
192.168.1.142 k8s-worker-02 k8s-worker-02.test.com
192.168.1.143 k8s-worker-03 k8s-worker-03.test.com
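On the nodes that already have the earlier entries, only the new worker's line needs to be appended. A minimal sketch using the example IP and hostnames from this guide:
# Append the new worker's entry on each existing node
echo "192.168.1.143 k8s-worker-03 k8s-worker-03.test.com" | sudo tee -a /etc/hosts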
Disable swap
sudo sed -i 's/^\(.*swap.*\)$/#\1/g' /etc/fstab
sudo swapoff -a
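To confirm swap is now off, both of the following should report no active swap devices:
# Verify that swap is disabled
sudo swapon --show
free -h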
Install Kubernetes packages
sudo apt update
sudo apt -y install curl apt-transport-https
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt -y install vim git wget
sudo apt -y install kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
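The repository installs the latest available packages, so it is worth confirming that the installed versions match the cluster's minor version (v1.23 in this example); if they do not, a specific version can be pinned instead, e.g. kubelet=1.23.2-00. A quick check:
# Confirm the installed versions match the cluster (v1.23.x here)
kubeadm version -o short
kubelet --version
kubectl version --client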
Install the containerd container runtime
# Configure persistent loading of modules
sudo tee /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
# Load at runtime
sudo modprobe overlay
sudo modprobe br_netfilter
# Ensure sysctl params are set
sudo tee /etc/sysctl.d/kubernetes.conf<<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# Reload configs
sudo sysctl --system
# Install required packages
sudo apt install -y curl gnupg2 software-properties-common apt-transport-https ca-certificates
# Add Docker repo
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Install containerd
sudo apt update
sudo apt install -y containerd.io
# Configure containerd and start service
sudo su -
mkdir -p /etc/containerd
containerd config default>/etc/containerd/config.toml
# Change the image repository to the Aliyun mirror
sed -i 's/k8s.gcr.io/registry.aliyuncs.com\/google_containers/g' /etc/containerd/config.toml
To use the systemd cgroup driver, set the following in /etc/containerd/config.toml:
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
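If the generated default config already contains a SystemdCgroup = false line under the runc options (recent containerd.io packages include it; check your file first), the change can also be scripted instead of edited by hand. A small sketch:
# Flip the runc cgroup driver to systemd (only works if the key is already present)
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml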
Adjust the sandbox_image address to point at the mirror:
sudo sed -i 's/k8s.gcr.io/registry.aliyuncs.com\/google_containers/g' /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri"]
...
sandbox_image = "/google_containers/pause:3.2"
Restart the service:
# restart containerd
systemctl restart containerd
systemctl enable containerd
systemctl status containerd
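Before joining the node, it can help to sanity-check that the runtime is healthy and that the two config changes above are in place:
# Confirm containerd is running and the CRI socket exists
systemctl is-active containerd
ls -l /run/containerd/containerd.sock
# Confirm the cgroup driver and sandbox image settings
grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml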
Get the join token
A token is required to join a new worker node to the Kubernetes cluster. When the cluster is initialized with kubeadm, a token is generated, and it expires after 24 hours. Check whether a token exists by running the following command on the control-plane node:
$ kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
bdqsdw.2uf50yfvo3uwy93w 19h 2022-01-23T01:52:48Z authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
If the token has expired, generate a new one with:
sudo kubeadm token create
List the generated tokens with:
kubeadm token list
You can also generate a token and print the full join command in one step:
kubeadm token create --print-join-command
Get the CA certificate hash
The kubeadm join command validates the root CA public key by matching its hash against the hash you supply. Obtain the CA certificate hash by running the following command on the master node:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
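For convenience, the same pipeline can be captured into a shell variable and printed in the sha256:<hash> form that kubeadm join expects (CERT_HASH is just a local variable name used for illustration):
# Capture the CA certificate hash and print it with the sha256: prefix
CERT_HASH=$(openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:${CERT_HASH}"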
Get the api-server-endpoint address
On the master node, retrieve it with the kubectl cluster-info command:
$ kubectl cluster-info
Kubernetes control plane is running at https://k8s-cluster.test.com:6443
CoreDNS is running at https://k8s-cluster.test.com:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
As shown in the output, in this example the endpoint is https://k8s-cluster.test.com:6443.
Join the worker node to the cluster
The kubeadm join command joins worker nodes or additional master nodes to an existing cluster. The syntax for joining a worker node is:
kubeadm join [api-server-endpoint] [flags]
The commonly required flags are:
- --token string: the bootstrap token to use
- --discovery-token-ca-cert-hash: the CA certificate hash, in the format <type>:<value>
The full command has the following format:
kubeadm join \
<control-plane-host>:<control-plane-port> \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
Example:
$ kubeadm join k8s-cluster.test.com:6443 --token bdqsdw.2uf50yfvo3uwy93w \
--discovery-token-ca-cert-hash sha256:2a6f431cc99860ff6e15519e08e62f01b9b0cb051380031582bd5cc22efbc084
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0122 03:39:25.496531 74229 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Wait for the node to reach the Ready state. Check from the control-plane node; the process can take a few minutes because container images are pulled before the services are configured and started.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master-01 Ready control-plane,master 5h47m v1.23.2
k8s-worker-01 Ready <none> 4h55m v1.23.2
k8s-worker-02 Ready <none> 4h47m v1.23.2
k8s-worker-03 Ready <none> 5m v1.23.2
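Once the node is Ready, it can also be worth checking that the system pods scheduled onto it (calico and kube-proxy) are running; the field selector below filters pods by the node they run on:
# List all pods running on the new worker
kubectl get pods -A -o wide --field-selector spec.nodeName=k8s-worker-03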
Removing a worker node from the cluster
To remove a worker node from the cluster, perform the following steps.
Drain pods from the node
kubectl drain <node-name> --delete-emptydir-data --ignore-daemonsets
Mark the node as unschedulable
This prevents new pods from being scheduled on the node. (Note that kubectl drain already cordons the node, so this step is implicit if you drained it above.)
kubectl cordon <node-name>
Reset the removed node
This reverts the changes that 'kubeadm join' made to the node.
kubeadm reset
Once kubeadm reset has completed successfully, the node can be joined to the cluster again later.
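Note that kubeadm reset cleans up the node itself but does not remove the Node object from the cluster; if the node is being removed permanently, it is usually also deleted from the control-plane side:
# On the control-plane node, remove the Node object
kubectl delete node <node-name>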