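1. Prerequisites
1.1 Two ways to deploy a production Kubernetes cluster
1.2 Preparing the environment
1.3 Initial operating system configuration
2. Install Docker/kubeadm/kubelet [all nodes]
2.1 Install Docker
2.2 Add the Aliyun YUM repository
2.3 Install kubeadm, kubelet, and kubectl
3. Deploy the Kubernetes Master
4. Join the Kubernetes Nodes
5. Deploy the container network (CNI)
6. Test the Kubernetes cluster
7. Deploy the Dashboard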
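1. Prerequisites
1.1 Two ways to deploy a production Kubernetes cluster
There are currently two main ways to deploy a Kubernetes cluster for production:
•kubeadm
kubeadm is a K8s deployment tool. It provides kubeadm init and kubeadm join for bringing up a Kubernetes cluster quickly.
•Binary packages
Download the release binaries from GitHub and deploy each component by hand to assemble a Kubernetes cluster.
This guide builds the cluster with kubeadm.
kubeadm subcommands:
- kubeadm init: initialize a Master node
- kubeadm join: join a worker node to the cluster
- kubeadm upgrade: upgrade the K8s version
- kubeadm token: manage the tokens used by kubeadm join
- kubeadm reset: revert any changes that kubeadm init or kubeadm join made to the host
- kubeadm version: print the kubeadm version
- kubeadm alpha: preview new features that are available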
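1.2 Preparing the environment
Server requirements:
- Recommended minimum hardware: 2 CPU cores, 2 GB RAM, 20 GB disk
- The servers should ideally have Internet access, because images are pulled from the network. If they cannot reach the Internet, download the required images in advance and import them on each node
Software environment:
Software | Version |
Operating system | CentOS7.9_x64 |
Docker (as reported by docker version) | 20.10.18 |
Kubernetes | 1.25.2 |
Server plan:
Role | IP |
k8s-master | 192.168.106.151 |
k8s-node1 | 192.168.106.152 |
k8s-node2 | 192.168.106.153 |
Architecture diagram: (figure not included)
1.3 Initial operating system configuration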
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config # permanent
setenforce 0 # temporary
# Disable swap
swapoff -a # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab # permanent
# Set the hostname according to the server plan
hostnamectl set-hostname <hostname>
# Add hosts entries on the master
cat >> /etc/hosts << EOF
192.168.106.151 k8s-master
192.168.106.152 k8s-node1
192.168.106.153 k8s-node2
EOF
# Pass bridged IPv4 traffic to the iptables chains
modprobe br_netfilter # the net.bridge.* keys below only exist once this module is loaded
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system # apply
# Synchronize the clock
yum install ntpdate -y
ntpdate time.windows.com
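The sed expression used above to disable swap permanently prefixes every line of /etc/fstab that mentions swap with #, including lines that are already commented out. A slightly more defensive variant is sketched below. It assumes swap entries are ordinary fstab lines with swap as a whitespace-delimited field; the function name comment_swap_entries and the sample file are illustrative and not part of the original setup.

```shell
# Comment out active swap entries in an fstab-style file.
# comment_swap_entries is a hypothetical helper name; pass /etc/fstab to use it for real.
comment_swap_entries() {
  # Skip lines that are already comments; prefix "#" to lines with "swap" as its own field.
  sed -ri '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/}' "$1"
}

# Demonstration on a throwaway file, so nothing on the host is modified.
sample=$(mktemp)
cat > "$sample" << 'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
#/dev/sdb1 swap swap defaults 0 0
EOF
comment_swap_entries "$sample"
cat "$sample"
```

Running the function a second time leaves the file unchanged, which makes it safe to repeat in a provisioning script.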
2. Install Docker/kubeadm/kubelet [all nodes]
Docker is used as the container engine here; another engine such as containerd can be used instead.
2.1 Install Docker
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
systemctl enable docker && systemctl start docker
Configure a registry mirror to speed up image pulls:
cat > /etc/docker/daemon.json << EOF
{
"registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker
docker info
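The exec-opts line matters because kubeadm (v1.22 and later) configures the kubelet with the systemd cgroup driver by default, and the container runtime should use the same driver. The check below is a minimal sketch that inspects a daemon.json file with grep; the helper name uses_systemd_cgroup is hypothetical, and the demonstration runs against a temporary copy rather than the real /etc/docker/daemon.json.

```shell
# Return success if a Docker daemon.json selects the systemd cgroup driver.
# uses_systemd_cgroup is a hypothetical helper; the default path is the one used above.
uses_systemd_cgroup() {
  grep -q '"native.cgroupdriver=systemd"' "${1:-/etc/docker/daemon.json}"
}

# Demonstration against a temporary copy of the configuration shown above.
conf=$(mktemp)
cat > "$conf" << 'EOF'
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
if uses_systemd_cgroup "$conf"; then
  echo "cgroup driver: systemd"
else
  echo "cgroup driver: not systemd (kubelet and runtime may disagree)"
fi
# On a live host, `docker info --format '{{.CgroupDriver}}'` reports the driver actually in effect.
```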
2.2 Add the Aliyun YUM repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2.3 Install kubeadm, kubelet, and kubectl
Because new versions are released frequently, pin the version explicitly:
yum install -y kubelet-1.25.2 kubeadm-1.25.2 kubectl-1.25.2
systemctl enable kubelet && systemctl start kubelet
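Update:
Configure containerd manually (this was a major pitfall). Kubernetes 1.24 and later no longer include dockershim, so the kubelet talks to containerd, which is installed alongside docker-ce, rather than to Docker itself.
Note: the automatically generated configuration file uses the k8s.gcr.io/pause:3.6 image, which cannot be downloaded from mainland China, so kubeadm init fails.
1. Generate the containerd configuration file (the steps below do not need to be run on the worker nodes)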
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
2. Set SystemdCgroup to true
# Edit the file
vi /etc/containerd/config.toml
# Change the value of SystemdCgroup to true
SystemdCgroup = true
3. Change the sandbox_image value
# Change k8s.gcr.io/pause:3.6 to registry.aliyuncs.com/google_containers/pause:3.7
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"
4. Change the runtime_type value
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = "io.containerd.runtime.v1.linux" # this is the line to change
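Steps 2 and 3 can also be applied without opening an editor. The sketch below does so with sed; it assumes the generated file contains the lines SystemdCgroup = false and a quoted sandbox_image value, as a default containerd configuration normally does. The helper name patch_containerd_config is hypothetical, and the demonstration edits a small excerpt instead of the real /etc/containerd/config.toml.

```shell
# Apply the edits from steps 2 and 3 non-interactively.
# patch_containerd_config is a hypothetical helper; the pause image tag follows the text above.
patch_containerd_config() {
  sed -i \
    -e 's/SystemdCgroup = false/SystemdCgroup = true/' \
    -e 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"#' \
    "$1"
}

# Demonstration on a small excerpt of a generated configuration.
toml=$(mktemp)
cat > "$toml" << 'EOF'
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "k8s.gcr.io/pause:3.6"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF
patch_containerd_config "$toml"
grep -E 'sandbox_image|SystemdCgroup' "$toml"
# After patching the real file, restart the service: systemctl restart containerd
```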
3. Deploy the Kubernetes Master
Run on 192.168.106.151 (the Master).
kubeadm init \
--apiserver-advertise-address=192.168.106.151 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.25.2 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all
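• --apiserver-advertise-address: the address the API server advertises to the rest of the cluster
• --image-repository: the default registry k8s.gcr.io is not reachable from mainland China, so the Aliyun mirror registry is specified here
• --kubernetes-version: the K8s version, matching the packages installed above
• --service-cidr: the cluster-internal virtual network from which Service IPs (the unified entry points to Pods) are allocated
• --pod-network-cidr: the Pod network, which must match the value in the CNI network component's YAML deployed below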
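The Service network, the Pod network, and the network the nodes themselves sit on should not overlap. A quick sanity check in plain shell arithmetic is sketched below. The helper names are hypothetical, and the node subnet 192.168.106.0/24 is an assumption inferred from the server plan rather than something stated in the original.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  set -- $1                       # split "a.b.c.d" into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the first and last address (as integers) covered by a CIDR block.
cidr_bounds() {
  local base size start
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  start=$(( base / size * size ))
  echo "$start $(( start + size - 1 ))"
}

# Succeed if the two CIDR blocks share at least one address.
cidrs_overlap() {
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  # $1..$2 = bounds of the first block, $3..$4 = bounds of the second
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

service_cidr=10.96.0.0/12
pod_cidr=10.244.0.0/16
node_cidr=192.168.106.0/24        # assumption: the subnet implied by the server plan
for pair in "$service_cidr $pod_cidr" "$service_cidr $node_cidr" "$pod_cidr $node_cidr"; do
  set -- $pair
  if cidrs_overlap "$1" "$2"; then
    echo "OVERLAP: $1 and $2"
  else
    echo "ok: $1 and $2 do not overlap"
  fi
done
```

With the values used in this guide all three pairs are reported as non-overlapping.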
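Check the kubelet logs:
tail /var/log/messages
If the following error appears:
failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false
the fix is:
# vim /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--fail-swap-on=false"
Then reset the node before running kubeadm init again:
kubeadm reset
The kubeadm init command above produces the following output: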
[root@k8s-master ~]# kubeadm init \
> --apiserver-advertise-address=192.168.106.151 \
> --image-repository registry.aliyuncs.com/google_containers \
> --kubernetes-version v1.25.2 \
> --service-cidr=10.96.0.0/12 \
> --pod-network-cidr=10.244.0.0/16 \
> --ignore-preflight-errors=all
[init] Using Kubernetes version: v1.25.2
[preflight] Running pre-flight checks
[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
[WARNING Hostname]: hostname "k8s-master" could not be reached
[WARNING Hostname]: hostname "k8s-master": lookup k8s-master on 192.168.106.2:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.106.151]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.106.151 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.106.151 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[... output omitted ...]
[apiclient] All control plane components are healthy after 6.503525 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-bal
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: bba4xu.wmpyhtac84kl0b38
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.106.151:6443 --token bba4xu.wmpyhtac84kl0b38 \
--discovery-token-ca-cert-hash sha256:8cedca7cce2fab09bfcb50cc642ae6a78ec30d4928e907a02acaf756ee2bedff
[root@k8s-master ~]#
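Alternatively, bootstrap from a configuration file:
vi kubeadm.conf
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.25.2
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
kubeadm init --config kubeadm.conf --ignore-preflight-errors=all
When initialization completes, the output ends with a kubeadm join command. Keep it; it is used in the node-join step below.
Copy the kubeconfig file that kubectl uses to authenticate to the cluster to its default path: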
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
List the nodes:
[root@k8s-master containerd]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady control-plane 29m v1.25.2
Note: the node is NotReady because the network plugin has not been deployed yet.
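When scripting the remaining steps it can help to list only the nodes that are not yet Ready. The filter below is a small sketch that reads the table printed by kubectl get nodes on standard input; the helper name not_ready_nodes is hypothetical. It treats any STATUS other than exactly Ready as not ready, so a cordoned node (Ready,SchedulingDisabled) is listed too.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` table output on stdin; not_ready_nodes is a hypothetical helper.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Demonstration with the output shown above; on a live cluster use:
#   kubectl get nodes | not_ready_nodes
not_ready_nodes << 'EOF'
NAME         STATUS     ROLES           AGE   VERSION
k8s-master   NotReady   control-plane   29m   v1.25.2
EOF
```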
References:
https://kubernetes.io/zh/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#initializing-your-control-plane-node
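4. Join the Kubernetes Nodes
Run on 192.168.106.152 and 192.168.106.153 (the Nodes). First clear any earlier kubeadm state, turn off swap, and remove the existing containerd configuration file so that containerd restarts with its built-in defaults: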
kubeadm reset
swapoff -a
rm /etc/containerd/config.toml
systemctl restart containerd
To add a new node to the cluster, run the kubeadm join command printed by kubeadm init:
kubeadm join 192.168.106.151:6443 --token bba4xu.wmpyhtac84kl0b38 \
--discovery-token-ca-cert-hash sha256:8cedca7cce2fab09bfcb50cc642ae6a78ec30d4928e907a02acaf756ee2bedff
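The token is valid for 24 hours by default and cannot be used after it expires. In that case create a new one; the following command generates a token and prints the complete join command: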
kubeadm token create --print-join-command
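A join command that was copied by hand is easy to truncate. The sketch below checks the shape of its two secrets before they are used: a bootstrap token is six characters, a dot, and sixteen characters, all lowercase letters or digits, and the CA certificate hash is sha256: followed by 64 hexadecimal digits. The helper names are hypothetical; the sample values are the ones printed earlier in this guide.

```shell
# Validate the shape of the two secrets a join command carries.
# is_valid_token / is_valid_ca_hash are hypothetical helper names.
is_valid_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}
is_valid_ca_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

token=bba4xu.wmpyhtac84kl0b38
ca_hash=sha256:8cedca7cce2fab09bfcb50cc642ae6a78ec30d4928e907a02acaf756ee2bedff
if is_valid_token "$token" && is_valid_ca_hash "$ca_hash"; then
  # Only print the command here; run it on the node that should join.
  echo "kubeadm join 192.168.106.151:6443 --token $token --discovery-token-ca-cert-hash $ca_hash"
else
  echo "token or CA hash is malformed" >&2
fi
```

This only checks the format. Whether the token is still valid can be seen on the master with kubeadm token list.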
Reference: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-join/
To remove a node from the cluster:
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 14h v1.25.2
k8s-node1 NotReady <none> 14h v1.25.2
k8s-node2 NotReady <none> 14h v1.25.2
[root@k8s-master ~]# kubectl delete node k8s-node1
node "k8s-node1" deleted
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 14h v1.25.2
k8s-node2 NotReady <none> 14h v1.25.2
[root@k8s-master ~]#
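5. Deploy the container network (CNI)
Calico is a pure Layer 3 data center networking solution and one of the mainstream network options for Kubernetes.
Download the YAML: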
wget https://docs.projectcalico.org/manifests/calico.yaml
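After downloading, edit the Pod network defined in the manifest (CALICO_IPV4POOL_CIDR) so that it matches the --pod-network-cidr passed to kubeadm init earlier.
Also edit calico.yaml to pin the network interface that Calico uses for IP autodetection (replace ens33 with the interface name on your nodes):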
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Add the following entry below it
- name: IP_AUTODETECTION_METHOD
value: "interface=ens33"
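The CALICO_IPV4POOL_CIDR change mentioned above can be scripted as well. The sketch below assumes the downloaded manifest still contains the entry in its commented-out default form (# - name: CALICO_IPV4POOL_CIDR followed by #   value: "192.168.0.0/16"); if your copy of calico.yaml differs, adjust the patterns. The helper name set_calico_pod_cidr is hypothetical, and the demonstration runs on a short excerpt rather than the full manifest.

```shell
# Uncomment CALICO_IPV4POOL_CIDR in a Calico manifest and set it to the given Pod CIDR.
# set_calico_pod_cidr is a hypothetical helper: $1 = manifest path, $2 = Pod CIDR.
set_calico_pod_cidr() {
  sed -i \
    -e 's|# - name: CALICO_IPV4POOL_CIDR|- name: CALICO_IPV4POOL_CIDR|' \
    -e "s|#   value: \"192.168.0.0/16\"|  value: \"$2\"|" \
    "$1"
}

# Demonstration on an excerpt that mimics the assumed layout of the env section.
manifest=$(mktemp)
cat > "$manifest" << 'EOF'
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # - name: CALICO_IPV4POOL_CIDR
            #   value: "192.168.0.0/16"
EOF
set_calico_pod_cidr "$manifest" 10.244.0.0/16
cat "$manifest"
```

The replacement keeps the indentation aligned with the neighbouring env entries, which YAML requires.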
After editing the file, deploy it:
kubectl apply -f calico.yaml
kubectl get pods -n kube-system
Example output:
[root@k8s-node2 ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-566654d67d-5rf67 1/1 Running 0 14h
calico-node-cgn2h 0/1 Init:0/3 0 11h
calico-node-fdvjl 1/1 Running 0 11h
calico-node-s72ds 1/1 Running 0 15m
coredns-c676cc86f-9mnpt 1/1 Running 0 15h
coredns-c676cc86f-gnrdn 1/1 Running 0 15h
etcd-k8s-master 1/1 Running 2 (11h ago) 15h
kube-apiserver-k8s-master 1/1 Running 3 (11h ago) 15h
kube-controller-manager-k8s-master 1/1 Running 5 (11h ago) 15h
kube-proxy-d72vw 1/1 Running 1 (11h ago) 15h
kube-proxy-kmjdm 1/1 Running 0 15m
kube-proxy-wwr7g 0/1 ContainerCreating 0 15h
kube-scheduler-k8s-master 1/1 Running 5 (11h ago) 15h
Once all Calico Pods are Running, the nodes become Ready.
Reference: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
If a command above fails with an error like the following, kubectl has no kubeconfig on that machine and is falling back to the default localhost:8080, where nothing is listening:
[root@k8s-node1 ~]# kubectl apply -f calico.yaml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The fix is to copy /etc/kubernetes/admin.conf from the master to the same path on the node, for example:
[root@k8s-master ~]# cd /etc/kubernetes/
[root@k8s-master kubernetes]# ls
admin.conf controller-manager.conf kubelet.conf manifests pki scheduler.conf
[root@k8s-master kubernetes]# scp admin.conf root@k8s-node1:$PWD
root@k8s-node1's password:
admin.conf
Then run on the node:
[root@k8s-node1 ~]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
[root@k8s-node1 ~]# source ~/.bash_profile
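To confirm that the shell now has a kubeconfig, the following sketch reports which file kubectl would pick up. It is a simplification: it treats KUBECONFIG as a single path, whereas kubectl also accepts a colon-separated list. The helper name find_kubeconfig is hypothetical.

```shell
# Report which kubeconfig file kubectl would pick up in this shell, if any.
# Simplification: KUBECONFIG is treated as a single path, not a colon-separated list.
find_kubeconfig() {
  if [ -n "${KUBECONFIG:-}" ] && [ -f "$KUBECONFIG" ]; then
    echo "$KUBECONFIG"
  elif [ -f "$HOME/.kube/config" ]; then
    echo "$HOME/.kube/config"
  else
    return 1
  fi
}

if path=$(find_kubeconfig); then
  echo "kubectl will use: $path"
else
  echo "no kubeconfig found; kubectl will fall back to localhost:8080"
fi
```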
6. Test the Kubernetes cluster
Create a Pod in the cluster to verify that it runs correctly:
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
Example output:
[root@k8s-master kubernetes]# kubectl get pod,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-76d6c9b8c-vv2x7 1/1 Running 0 13h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 15h
service/nginx NodePort 10.102.145.10 <none> 80:32541/TCP 13h
Access URL: http://NodeIP:Port, where Port is the NodePort shown in the PORT(S) column (32541 in the output above).
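The NodePort is the number after the colon in the PORT(S) column. A small sketch for extracting it and building the URL follows; the helper name node_port_of is hypothetical, and the node IP is simply one of the addresses from the server plan.

```shell
# Extract the NodePort from a PORT(S) value such as "80:32541/TCP".
node_port_of() {
  local ports
  ports=${1#*:}          # drop "<port>:"     -> "32541/TCP"
  echo "${ports%%/*}"    # drop "/<protocol>" -> "32541"
}

node_ip=192.168.106.152  # any node IP from the server plan
port=$(node_port_of "80:32541/TCP")
echo "http://$node_ip:$port"
# On a live cluster the port can be read directly:
#   kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'
```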
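7. Deploy the Dashboard
Dashboard is the official web UI for basic management of K8s resources.
Before installing it, check which Dashboard release is compatible with your Kubernetes version at https://github.com/kubernetes/dashboard/releases
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
If the download fails, add one of the following entries to /etc/hosts (try the IPs one at a time):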
185.199.108.133 raw.githubusercontent.com
#199.232.96.133 raw.githubusercontent.com
#185.199.109.133 raw.githubusercontent.com
#185.199.110.133 raw.githubusercontent.com
#185.199.111.133 raw.githubusercontent.com
By default the Dashboard is reachable only from inside the cluster. Change its Service to type NodePort to expose it externally:
vi recommended.yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard
  type: NodePort
kubectl apply -f recommended.yaml
kubectl get pods -n kubernetes-dashboard
Access URL: https://NodeIP:30001
To see which port the Dashboard is exposed on:
[root@k8s-master ~]# kubectl get svc -A | grep kubernetes-dashboard
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.111.138.240 <none> 8000/TCP 13h
kubernetes-dashboard kubernetes-dashboard NodePort 10.102.118.137 <none> 443:30001/TCP 13h
To delete a Service, for example:
kubectl delete svc dashboard-metrics-scraper -n kubernetes-dashboard
Create a service account and bind it to the built-in cluster-admin cluster role:
# Create the user
kubectl create serviceaccount dashboard-admin -n kube-system
# Grant permissions
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
# Get the user's token (Kubernetes 1.24+ no longer creates a token Secret for a service account automatically, so request one explicitly)
kubectl -n kube-system create token dashboard-admin
Log in to the Dashboard with the token that is printed.
To delete the user:
[root@k8s-master ~]# kubectl get serviceaccount --all-namespaces | grep dashboard
kube-system dashboard-admin 0 51m
kubernetes-dashboard default 0 15h
kubernetes-dashboard kubernetes-dashboard 0 15h
[root@k8s-master ~]# kubectl delete serviceaccount -n kube-system dashboard-admin
serviceaccount "dashboard-admin" deleted
[root@k8s-master ~]#