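1. Prerequisites
1.1 Two ways to deploy a production Kubernetes cluster
1.2 Preparing the environment
1.3 Operating system initialization
2. Install Docker/kubeadm/kubelet [all nodes]
2.1 Install Docker
2.2 Add the Alibaba Cloud YUM repository
2.3 Install kubeadm, kubelet, and kubectl
3. Deploy the Kubernetes Master
4. Join Kubernetes Nodes
5. Deploy the container network (CNI)
6. Test the Kubernetes cluster
7. Deploy the Dashboard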

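1. Prerequisites

1.1 Two ways to deploy a production Kubernetes cluster

There are currently two main ways to deploy a Kubernetes cluster in production:
• kubeadm
kubeadm is a Kubernetes deployment tool. It provides kubeadm init and kubeadm join for bringing up a cluster quickly.
• Binary packages
Download the release binaries from GitHub and deploy each component by hand to assemble the cluster.
This guide uses kubeadm.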

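What the kubeadm subcommands do:

  • kubeadm init: initialize a Master (control-plane) node
  • kubeadm join: join a worker node to the cluster
  • kubeadm upgrade: upgrade the Kubernetes version
  • kubeadm token: manage the tokens used by kubeadm join
  • kubeadm reset: revert any changes that kubeadm init or kubeadm join made to the host
  • kubeadm version: print the kubeadm version
  • kubeadm alpha: preview features that are not yet stable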

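1.2 Preparing the environment

Server requirements:

  • Recommended minimum hardware: 2 CPU cores, 2 GB RAM, 20 GB disk
  • The servers should preferably have Internet access, because images are pulled from the network. If a server has no Internet access, download the required images in advance and import them on each node.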
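The hardware minimums above can be checked with a small script before installing anything. This is a minimal sketch: the helper names meets_minimum and mem_total_mib are hypothetical, and the 1700 MiB threshold is an assumption chosen because a VM sized at "2 GB" usually reports somewhat less than 2048 MiB of usable memory.

```shell
# Hypothetical helper: succeeds when the host meets the minimums.
# Thresholds are assumptions: 2 CPUs and 1700 MiB of memory.
meets_minimum() {
  cpus=$1
  mem_mib=$2
  [ "$cpus" -ge 2 ] && [ "$mem_mib" -ge 1700 ]
}

# Read MemTotal (in kB) from a meminfo-style file and print it in MiB.
# Defaults to /proc/meminfo when no file is given.
mem_total_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}
```

On a node, a call such as meets_minimum "$(nproc)" "$(mem_total_mib)" && echo "resources ok" would report whether the host passes.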

Software environment:

Software                                  Version
Operating system                          CentOS7.9_x64
Docker (as reported by docker version)    20.10.18
Kubernetes                                1.25.2

Server plan:

Role          IP
k8s-master    192.168.106.151
k8s-node1     192.168.106.152
k8s-node2     192.168.106.153

Architecture diagram: (figure not included)

1.3 Operating system initialization

# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config  # permanent
setenforce 0  # temporary

# Disable swap
swapoff -a  # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab    # permanent

# Set the hostname according to the server plan
hostnamectl set-hostname <hostname>

# Add hosts entries on the master
cat >> /etc/hosts << EOF
192.168.106.151 k8s-master
192.168.106.152 k8s-node1
192.168.106.153 k8s-node2
EOF

# Pass bridged IPv4 traffic to the iptables chains
modprobe br_netfilter  # the net.bridge.* keys exist only while this module is loaded
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system  # apply

# Time synchronization
yum install ntpdate -y
ntpdate time.windows.com
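The permanent swap step above prefixes every line that mentions swap with "#", so running it twice stacks comment markers. A more careful variant might look like the sketch below. The function name disable_swap_entries is hypothetical, and the sketch assumes swap entries have "swap" as a whitespace-delimited field, as in a typical CentOS fstab.

```shell
# Comment out active swap entries in an fstab-style file.
# Lines that are already commented are skipped, so re-running is safe.
disable_swap_entries() {
  file=$1
  sed -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$file" > "$file.tmp" &&
    cat "$file.tmp" > "$file" &&   # write back in place to keep owner and mode
    rm -f "$file.tmp"
}
```

Trying it on a copy first, for example cp /etc/fstab /tmp/fstab.test followed by disable_swap_entries /tmp/fstab.test, shows the effect without touching the real file.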

2. Install Docker/kubeadm/kubelet [all nodes]

Docker is used as the container engine here; another engine such as containerd can be used instead.

2.1 Install Docker

wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
systemctl enable docker && systemctl start docker

Configure a registry mirror to speed up image pulls:

cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

systemctl restart docker
docker info
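The kubelet and the container runtime should use the same cgroup driver, so it is worth confirming that the daemon.json change took effect. The sketch below reads docker info text on standard input and prints the driver; the function name cgroup_driver is hypothetical.

```shell
# Print the cgroup driver from `docker info` text supplied on stdin.
cgroup_driver() {
  awk -F': ' '/Cgroup Driver/ { print $2; exit }'
}
```

A possible check on a node is [ "$(docker info 2>/dev/null | cgroup_driver)" = "systemd" ] || echo "cgroup driver is not systemd".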

2.2 Add the Alibaba Cloud YUM repository

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

2.3 Install kubeadm, kubelet, and kubectl

Because new versions are released frequently, pin the version explicitly:

yum install -y kubelet-1.25.2 kubeadm-1.25.2 kubectl-1.25.2
systemctl enable kubelet && systemctl start kubelet
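Update:
Configure containerd manually (this was a major pitfall). Kubernetes 1.24 and later no longer include dockershim, so the kubelet talks to the containerd that is installed together with docker-ce, and containerd's own configuration decides which pause image is pulled.
Note: the auto-generated configuration uses the k8s.gcr.io/pause:3.6 image, which cannot be downloaded from mainland China, so kubeadm init fails.

1. Generate the containerd configuration file (the steps below do not need to be run on the worker nodes)
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml

2. Set SystemdCgroup to true
# Edit the file
vi /etc/containerd/config.toml
# Change the value of SystemdCgroup to true
SystemdCgroup = true

3. Change the sandbox_image value
# Replace k8s.gcr.io/pause:3.6 with registry.aliyuncs.com/google_containers/pause:3.7
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.7"

4. Change runtime_type
  [plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
        base_runtime_spec = ""
        cni_conf_dir = ""
        cni_max_conf_num = 0
        container_annotations = []
        pod_annotations = []
        privileged_without_host_devices = false
        runtime_engine = ""
        runtime_path = ""
        runtime_root = ""
        runtime_type = "io.containerd.runtime.v1.linux"    # <- this line

After editing, restart containerd so that the changes take effect:
systemctl restart containerd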
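Steps 2 and 3 can also be applied without opening an editor. The following is a minimal sketch, assuming the generated config.toml contains the lines SystemdCgroup = false and sandbox_image = "..." in their default form. The function name patch_containerd_config is hypothetical; the default path and image tag are the ones used in this guide.

```shell
# Apply the SystemdCgroup and sandbox_image edits to a containerd config file.
# $1: config file (defaults to /etc/containerd/config.toml)
# $2: pause image (defaults to the Alibaba Cloud mirror used in this guide)
patch_containerd_config() {
  file=${1:-/etc/containerd/config.toml}
  image=${2:-registry.aliyuncs.com/google_containers/pause:3.7}
  sed -E \
    -e 's|^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false|\1true|' \
    -e "s|^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*)\".*\"|\1\"$image\"|" \
    "$file" > "$file.tmp" &&
    cat "$file.tmp" > "$file" &&   # write back in place to keep owner and mode
    rm -f "$file.tmp"
}
```

After patching the real file, grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml shows whether both values changed, and containerd needs a restart before the kubelet sees them.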

3. Deploy the Kubernetes Master

Run on 192.168.106.151 (Master).

kubeadm init \
  --apiserver-advertise-address=192.168.106.151 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.25.2 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --ignore-preflight-errors=all

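• --apiserver-advertise-address: the address the API server advertises to the cluster
• --image-repository: the default registry k8s.gcr.io is not reachable from mainland China, so the Alibaba Cloud mirror registry is specified here
• --kubernetes-version: the Kubernetes version, matching the packages installed above
• --service-cidr: the cluster-internal virtual network for Services, the unified entry point for reaching Pods
• --pod-network-cidr: the Pod network, which must match the CNI manifest deployed below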
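The Service network, the Pod network, and the network the nodes live on should not overlap. A quick way to sanity-check the values before running kubeadm init is sketched below in plain shell arithmetic. The function names ip_to_int, cidr_range, and cidrs_overlap are hypothetical, and the sketch handles IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" (as integers) for a CIDR such as 10.96.0.0/12.
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  start=$(( base / size * size ))
  echo "$start $(( start + size - 1 ))"
}

# Succeed when the two CIDR blocks share at least one address.
cidrs_overlap() {
  r1=$(cidr_range "$1")
  r2=$(cidr_range "$2")
  s1=${r1% *}; e1=${r1#* }
  s2=${r2% *}; e2=${r2#* }
  [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}
```

For the values in this guide, cidrs_overlap 10.96.0.0/12 10.244.0.0/16 fails (no overlap), which is the desired result. The same check can be repeated against the node network, assumed here to be 192.168.106.0/24.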

View the kubelet log:
tail /var/log/messages

If this error appears:
failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false

the fix is:
# vim /etc/sysconfig/kubelet 
KUBELET_EXTRA_ARGS="--fail-swap-on=false"

Then reset the node before running kubeadm init again:
kubeadm reset

Output of the kubeadm init command above:

[root@k8s-master ~]# kubeadm init \
>   --apiserver-advertise-address=192.168.106.151 \
>   --image-repository registry.aliyuncs.com/google_containers \
>   --kubernetes-version v1.25.2 \
>   --service-cidr=10.96.0.0/12 \
>   --pod-network-cidr=10.244.0.0/16 \
>   --ignore-preflight-errors=all
[init] Using Kubernetes version: v1.25.2
[preflight] Running pre-flight checks
	[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
	[WARNING Hostname]: hostname "k8s-master" could not be reached
	[WARNING Hostname]: hostname "k8s-master": lookup k8s-master on 192.168.106.2:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.106.151]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.106.151 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [192.168.106.151 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
... (lines omitted) ...
[apiclient] All control plane components are healthy after 6.503525 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-bal
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: bba4xu.wmpyhtac84kl0b38
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.106.151:6443 --token bba4xu.wmpyhtac84kl0b38 \
	--discovery-token-ca-cert-hash sha256:8cedca7cce2fab09bfcb50cc642ae6a78ec30d4928e907a02acaf756ee2bedff 
[root@k8s-master ~]#

Alternatively, bootstrap with a configuration file:

vi kubeadm.conf
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.25.2
imageRepository: registry.aliyuncs.com/google_containers 
networking:
  podSubnet: 10.244.0.0/16 
  serviceSubnet: 10.96.0.0/12 

kubeadm init --config kubeadm.conf --ignore-preflight-errors=all

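When initialization finishes, it prints a kubeadm join command. Save it; it is used in section 4.
Copy the kubeconfig file that kubectl uses to authenticate to the cluster into its default path: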

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

List the nodes:

[root@k8s-master containerd]# kubectl get nodes
NAME         STATUS     ROLES           AGE     VERSION
k8s-master   NotReady   control-plane   29m     v1.25.2

Note: the node is NotReady because the network plugin has not been deployed yet.
References:
https://kubernetes.io/zh/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#initializing-your-control-plane-node

4. Join Kubernetes Nodes

Run on 192.168.106.152 and 192.168.106.153 (Nodes).

kubeadm reset
swapoff -a
rm /etc/containerd/config.toml
systemctl restart containerd

To add a new node to the cluster, run the kubeadm join command printed by kubeadm init:

kubeadm join 192.168.106.151:6443 --token bba4xu.wmpyhtac84kl0b38 \
--discovery-token-ca-cert-hash sha256:8cedca7cce2fab09bfcb50cc642ae6a78ec30d4928e907a02acaf756ee2bedff

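The default token is valid for 24 hours and cannot be used after it expires. In that case, create a new token; this command generates one and prints the matching join command: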

kubeadm token create --print-join-command

Reference: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-join/
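When the join command is assembled by hand or passed through scripts, a malformed token is an easy mistake to make. The sketch below validates the token shape (six lowercase alphanumerics, a dot, sixteen lowercase alphanumerics) and prints a join command without running it. The function names is_bootstrap_token and join_command are hypothetical.

```shell
# Succeed when the argument has the shape of a kubeadm bootstrap token.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Print (do not run) a join command built from its three parts.
# $1: API server endpoint, $2: token, $3: CA cert hash without "sha256:"
join_command() {
  endpoint=$1
  token=$2
  hash=$3
  is_bootstrap_token "$token" || { echo "invalid token: $token" >&2; return 1; }
  printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
    "$endpoint" "$token" "$hash"
}
```

On the master, kubeadm token list shows the existing tokens and when they expire.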

To remove a node:

[root@k8s-master ~]# kubectl get nodes
NAME         STATUS     ROLES           AGE   VERSION
k8s-master   Ready      control-plane   14h   v1.25.2
k8s-node1    NotReady   <none>          14h   v1.25.2
k8s-node2    NotReady   <none>          14h   v1.25.2
[root@k8s-master ~]# kubectl delete node k8s-node1
node "k8s-node1" deleted
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS     ROLES           AGE   VERSION
k8s-master   Ready      control-plane   14h   v1.25.2
k8s-node2    NotReady   <none>          14h   v1.25.2
[root@k8s-master ~]#
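Deleting a node that still runs workloads drops its Pods abruptly, so a common pattern is to drain it first. The sketch below prints the commands by default and runs them only when DRY_RUN=0 is set. The function names run and remove_node are hypothetical; the kubectl flags are standard drain options.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Evict Pods from a worker, then delete its Node object (run on the master).
remove_node() {
  node=$1
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data &&
    run kubectl delete node "$node"
}
```

Running remove_node k8s-node1 shows what would be executed, and DRY_RUN=0 remove_node k8s-node1 performs it. Running kubeadm reset on the removed machine afterwards reverts the changes kubeadm join made there.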

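5. Deploy the container network (CNI)

Calico is a pure Layer 3 data center networking solution and a mainstream network plugin for Kubernetes.
Download the YAML manifest: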

wget https://docs.projectcalico.org/manifests/calico.yaml

After downloading, also edit the Pod network defined in the manifest (CALICO_IPV4POOL_CIDR) so that it matches the --pod-network-cidr passed to kubeadm init earlier.
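The CIDR edit can be scripted. This is a sketch under a stated assumption: that the downloaded manifest ships CALICO_IPV4POOL_CIDR as two commented-out lines of the form "# - name: CALICO_IPV4POOL_CIDR" followed by "#   value: "192.168.0.0/16"". Check your copy of calico.yaml first; if the lines look different, edit the file by hand. The function name set_calico_pool_cidr is hypothetical.

```shell
# Uncomment CALICO_IPV4POOL_CIDR in a Calico manifest and set its value.
# $1: manifest file, $2: Pod CIDR (the value given to --pod-network-cidr)
set_calico_pool_cidr() {
  file=$1
  cidr=$2
  sed -E \
    -e 's|^([[:space:]]*)# (- name: CALICO_IPV4POOL_CIDR)|\1\2|' \
    -e "/CALICO_IPV4POOL_CIDR/{n;s|^([[:space:]]*)# (  value: )\".*\"|\1\2\"$cidr\"|;}" \
    "$file" > "$file.tmp" &&
    cat "$file.tmp" > "$file" &&   # write back in place to keep owner and mode
    rm -f "$file.tmp"
}
```

A typical call would be set_calico_pool_cidr calico.yaml 10.244.0.0/16, followed by grep -A1 CALICO_IPV4POOL_CIDR calico.yaml to confirm the result.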

Edit calico.yaml:

            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Added below (use the name of the host's network interface)
            - name: IP_AUTODETECTION_METHOD
              value: "interface=ens33"

After editing the file, deploy it:

kubectl apply -f calico.yaml
kubectl get pods -n kube-system

The result looks like this:

[root@k8s-node2 ~]# kubectl get pods -n kube-system
NAME                                       READY   STATUS              RESTARTS      AGE
calico-kube-controllers-566654d67d-5rf67   1/1     Running             0             14h
calico-node-cgn2h                          0/1     Init:0/3            0             11h
calico-node-fdvjl                          1/1     Running             0             11h
calico-node-s72ds                          1/1     Running             0             15m
coredns-c676cc86f-9mnpt                    1/1     Running             0             15h
coredns-c676cc86f-gnrdn                    1/1     Running             0             15h
etcd-k8s-master                            1/1     Running             2 (11h ago)   15h
kube-apiserver-k8s-master                  1/1     Running             3 (11h ago)   15h
kube-controller-manager-k8s-master         1/1     Running             5 (11h ago)   15h
kube-proxy-d72vw                           1/1     Running             1 (11h ago)   15h
kube-proxy-kmjdm                           1/1     Running             0             15m
kube-proxy-wwr7g                           0/1     ContainerCreating   0             15h
kube-scheduler-k8s-master                  1/1     Running             5 (11h ago)   15h

Once all Calico Pods are Running, the nodes become Ready.
Reference: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
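To see at a glance which nodes are still waiting, the node list can be filtered. The sketch below reads the output of kubectl get nodes --no-headers on standard input; the function name nodes_not_ready is hypothetical.

```shell
# Print the names of nodes whose STATUS column does not start with "Ready".
# Expects `kubectl get nodes --no-headers` output on stdin.
nodes_not_ready() {
  awk '$2 !~ /^Ready/ { print $1 }'
}
```

Usage on the master: kubectl get nodes --no-headers | nodes_not_ready. An empty result means every node reports Ready.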

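If kubectl reports an error like the following during the steps above (kubectl falls back to the default localhost:8080 because no kubeconfig is configured on this machine, and nothing listens there; it needs the API server address from the cluster's kubeconfig):
[root@k8s-node1 ~]# kubectl apply -f calico.yaml
The connection to the server localhost:8080 was refused - did you specify the right host or port?

the fix is to copy /etc/kubernetes/admin.conf from the master to the same path on the node, for example: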

[root@k8s-master ~]# cd /etc/kubernetes/
[root@k8s-master kubernetes]# ls
admin.conf  controller-manager.conf  kubelet.conf  manifests  pki  scheduler.conf
[root@k8s-master kubernetes]# scp admin.conf root@k8s-node1:$PWD
root@k8s-node1's password: 
admin.conf

Then run:

[root@k8s-node1 ~]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
[root@k8s-node1 ~]# source ~/.bash_profile

6. Test the Kubernetes cluster

Create a Pod in the cluster and verify that it runs correctly:

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc

The result:

[root@k8s-master kubernetes]# kubectl get pod,svc
NAME                        READY   STATUS    RESTARTS   AGE
pod/nginx-76d6c9b8c-vv2x7   1/1     Running   0          13h

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP        15h
service/nginx        NodePort    10.102.145.10   <none>        80:32541/TCP   13h

Access URL: http://NodeIP:Port (any node's IP plus the NodePort, 32541 in the output above)
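The NodePort is the number after the colon in the PORT(S) column. A small sketch for pulling it out and building the URL follows; the function names node_port and service_url are hypothetical.

```shell
# Extract the NodePort from a PORT(S) value such as "80:32541/TCP".
node_port() {
  ports=${1%%/*}        # drop the protocol suffix
  echo "${ports#*:}"    # drop the service port and the colon
}

# Build the HTTP URL for a node IP and a PORT(S) value.
service_url() {
  echo "http://$1:$(node_port "$2")"
}
```

With the output above, curl -I "$(service_url 192.168.106.152 80:32541/TCP)" would request the nginx welcome page through k8s-node1. kubectl can also print the port directly with kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'.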


7. Deploy the Dashboard

Before installing the Dashboard, check which release is compatible with your Kubernetes version at https://github.com/kubernetes/dashboard/releases

Dashboard is the official web UI for basic management of Kubernetes resources.

wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

If the download fails, add one of the following entries to /etc/hosts (try the IPs one at a time):

185.199.108.133    raw.githubusercontent.com
#199.232.96.133     raw.githubusercontent.com
#185.199.109.133    raw.githubusercontent.com
#185.199.110.133    raw.githubusercontent.com
#185.199.111.133    raw.githubusercontent.com

By default the Dashboard is reachable only from inside the cluster. Change its Service to type NodePort to expose it externally:

vi recommended.yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard
  type: NodePort

kubectl apply -f recommended.yaml
kubectl get pods -n kubernetes-dashboard

Access URL: https://NodeIP:30001

To check which port the Dashboard exposes externally:

[root@k8s-master ~]# kubectl get svc -A | grep kubernetes-dashboard
kubernetes-dashboard   dashboard-metrics-scraper   ClusterIP   10.111.138.240   <none>        8000/TCP                 13h
kubernetes-dashboard   kubernetes-dashboard        NodePort    10.102.118.137   <none>        443:30001/TCP            13h

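To delete a Service (it lives in the kubernetes-dashboard namespace):

kubectl delete svc dashboard-metrics-scraper -n kubernetes-dashboard

Create a service account and bind it to the built-in cluster-admin cluster role:

# Create the user
kubectl create serviceaccount dashboard-admin -n kube-system
# Grant permissions
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
# Get the user's token (Kubernetes 1.24 and later no longer create a token Secret for a service account automatically, so request one explicitly)
kubectl -n kube-system create token dashboard-admin

Log in to the Dashboard with the token that is printed.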


To delete the user later:

[root@k8s-master ~]#  kubectl get serviceaccount --all-namespaces | grep dashboard
kube-system            dashboard-admin                      0         51m
kubernetes-dashboard   default                              0         15h
kubernetes-dashboard   kubernetes-dashboard                 0         15h
[root@k8s-master ~]# kubectl delete serviceaccount -n kube-system dashboard-admin
serviceaccount "dashboard-admin" deleted
[root@k8s-master ~]#