Installing Kubernetes with RKE
Author: 张首富
Date: 2019-02-13
Personal blog: www.zhangshoufu.com
QQ group: 895291458
Cluster nodes
We need 4 machines, all running CentOS 7.5:
10.0.0.99 MKE.kuber.com
10.0.0.100 master.kuber.com
10.0.0.101 node101.kuber.com
10.0.0.102 node102.kuber.com
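The node list above also has to end up in /etc/hosts on every machine. A minimal sketch of one way to do that, assuming the four name/address pairs listed above; the function name add_cluster_hosts is made up for this example and is not part of any tool:

```shell
# Append the cluster's "ip name" pairs to a hosts file, skipping pairs
# that are already present. The target defaults to /etc/hosts (needs root).
add_cluster_hosts() {
    hosts_file="${1:-/etc/hosts}"
    while read -r ip name; do
        # Skip blank lines in the embedded list
        [ -n "$ip" ] || continue
        # Append only when the exact "ip name" line is missing
        grep -qxF "$ip $name" "$hosts_file" 2>/dev/null || echo "$ip $name" >> "$hosts_file"
    done <<'EOF'
10.0.0.99 MKE.kuber.com
10.0.0.100 master.kuber.com
10.0.0.101 node101.kuber.com
10.0.0.102 node102.kuber.com
EOF
}
```

Run add_cluster_hosts as root on each machine. Because existing lines are skipped, running it a second time adds nothing.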
Pre-installation tuning (run on all machines)
sudo sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config # disable SELinux (takes effect after a reboot)
systemctl stop firewalld.service && systemctl disable firewalld.service # stop and disable the firewall
echo 'LANG="en_US.UTF-8"' >> /etc/profile;source /etc/profile # set the system language
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime # set the time zone (if needed)
# Add the cluster hosts to /etc/hosts
# Performance tuning
cat >> /etc/sysctl.conf<<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.neigh.default.gc_thresh1=4096
net.ipv4.neigh.default.gc_thresh2=6144
net.ipv4.neigh.default.gc_thresh3=8192
EOF
sysctl -p
Configure forwarding settings
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
sysctl --system
Configure the Kubernetes repository (run on all machines)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Configure the Docker repository and install Docker (run on all machines)
yum -y install yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y device-mapper-persistent-data lvm2
sudo yum makecache fast
yum -y remove container-selinux.noarch
yum install https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-selinux-17.03.2.ce-1.el7.centos.noarch.rpm -y
yum install docker-ce-17.03.0.ce -y # install 17.03; other versions cause problems
systemctl start docker && systemctl enable docker
Configure a Docker registry mirror
mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://ll9gv5j9.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
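Configure the mirror addresses
Several mirrors can be configured, written as an array; each address must include the protocol prefix. Edit /etc/docker/daemon.json and add the following:
{
  "registry-mirrors": ["https://z34wtdhg.mirror.aliyuncs.com","https://$IP:$PORT"]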
}
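Configure private registries (optional)
By default Docker only trusts registries served over TLS (https); it can neither log in to nor pull images from a non-https registry. insecure-registries literally means "registries that are not secure", and adding this parameter tells Docker to trust the non-https registries listed. Several addresses can be given, written as an array, and the addresses must not include the protocol prefix (http). Edit /etc/docker/daemon.json and add the following: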
{
"insecure-registries":["harbor.httpshop.com","bh-harbor.suixingpay.com"]
}
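Since insecure-registries entries must not carry a protocol prefix, it can help to normalize addresses before pasting them into daemon.json. A small sketch; the helper name strip_scheme is hypothetical:

```shell
# Remove a leading http:// or https:// so the value is usable
# as an insecure-registries entry; other input is printed unchanged.
strip_scheme() {
    addr="$1"
    addr="${addr#http://}"
    addr="${addr#https://}"
    printf '%s\n' "$addr"
}
```

For example, strip_scheme http://harbor.httpshop.com prints harbor.httpshop.com.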
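Configure the Docker storage driver (optional)
Several storage drivers are available, such as overlay, overlay2, and devicemapper. The first two are OverlayFS drivers, a newer union filesystem similar to AUFS but faster and more stable. The newer overlay2 is recommended here.
Requirements for overlay2:
- Linux kernel 4.0 or later, or RHEL/CentOS with kernel 3.10.0-514 or later
- Supported backing filesystems: ext4 (RHEL 7.1 only) or xfs (RHEL 7.2 and later), with d_type=true enabled
Edit /etc/docker/daemon.json and add the following: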
{
"storage-driver": "overlay2",
"storage-opts": ["overlay2.override_kernel_check=true"]
}
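The kernel requirement above can be checked before switching drivers. A rough sketch that only encodes the two rules quoted above (4.0 or later, or a 3.10.0 kernel with build 514 or later); the function name is hypothetical, and it does not check the filesystem or d_type:

```shell
# Return 0 when a kernel release string meets the overlay2 kernel rule:
# major version 4 or newer, or 3.10.0-<build> with build >= 514.
kernel_supports_overlay2() {
    release="${1:-$(uname -r)}"
    major="${release%%.*}"
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # major version is not a plain number
    esac
    [ "$major" -ge 4 ] && return 0
    case "$release" in
        3.10.0-*)
            build="${release#3.10.0-}"
            build="${build%%[!0-9]*}"   # keep only the leading digits
            [ -n "$build" ] && [ "$build" -ge 514 ] && return 0
            ;;
    esac
    return 1
}
```

Called without arguments it checks the running kernel, for example: kernel_supports_overlay2 && echo "kernel ok for overlay2".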
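Configure the log driver (optional)
Containers produce a large volume of log files while running, which can easily fill the disk. Configure the log driver to limit both the size and the number of log files. The settings below limit each log file to 100 MB and keep at most 3 log files.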
{
"log-driver": "json-file",
"log-opts": {
"max-size": "100m",
"max-file": "3"
}
}
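As a quick sanity check on these limits, the worst-case log usage per container is simply max-size multiplied by max-file. A tiny worked example; the helper name max_log_mb is made up for illustration:

```shell
# Worst-case json-file log usage per container, in MB:
# size of one log file (MB) times the number of files kept.
max_log_mb() {
    size_mb="$1"
    files="$2"
    echo $(( size_mb * files ))
}

max_log_mb 100 3   # prints 300
```

So with the settings above each container can use up to about 300 MB of disk for logs.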
Sample daemon.json
{
  "registry-mirrors": ["https://z34wtdhg.mirror.aliyuncs.com"],
  "insecure-registries": ["harbor.httpshop.com","bh-harbor.suixingpay.com"],
  "storage-driver": "overlay2",
  "storage-opts": ["overlay2.override_kernel_check=true"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
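Create the docker user (on all nodes). This step is essential: all the services we start later run as the docker user.
[root@RKE ~]# grep ^docker /etc/group # if the docker group already exists, there is no need to create it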
docker:x:994:
useradd -g docker docker
echo "1" | passwd --stdin docker
Distribute the SSH key from the RKE machine
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub docker@10.0.0.100
ssh-copy-id -i ~/.ssh/id_rsa.pub docker@10.0.0.101
ssh-copy-id -i ~/.ssh/id_rsa.pub docker@10.0.0.102
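With more nodes, the same ssh-copy-id call can be wrapped in a loop. A sketch assuming the three node addresses used above; the function name and the KEY and DRY_RUN variables are conventions of this example only:

```shell
# Run ssh-copy-id for every cluster node as the docker user.
# With DRY_RUN=1 the commands are only printed, not executed.
copy_key_to_nodes() {
    key="${KEY:-$HOME/.ssh/id_rsa.pub}"
    for host in 10.0.0.100 10.0.0.101 10.0.0.102; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id -i $key docker@$host"
        else
            ssh-copy-id -i "$key" "docker@$host"
        fi
    done
}
```

Set DRY_RUN=1 first to review the commands, then unset it and call copy_key_to_nodes to copy the key for real.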
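Install nginx so the cluster can be reached from outside (it load-balances across multiple masters; install it on MKE)
The nginx configuration file is as follows: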
[docker@MKE ~]$ cat /etc/nginx/nginx.conf
worker_processes auto;
pid /run/nginx.pid;
events {
    use epoll;
    worker_connections 65536;
    accept_mutex off;
}
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$upstream_addr" "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" "$request_time"';
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 900;
    # keepalive_timeout 0;
    keepalive_requests 100;
    types_hash_max_size 2048;
    server {
        listen 80;
        return 301 https://$host$request_uri;
    }
}
stream {
    upstream rancher_servers {
        least_conn;
        server 10.0.0.100:443 max_fails=3 fail_timeout=5s;
    }
    server {
        listen 443;
        proxy_pass rancher_servers;
    }
}
Start nginx as a Docker container:
docker run -d --restart=unless-stopped \
-p 80:80 -p 443:443 \
-v /etc/nginx/nginx.conf:/etc/nginx/nginx.conf \
nginx:1.14
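Install Kubernetes with RKE (run on the MKE machine)
Download RKE: wget https://github.com/rancher/rke/releases/download/v0.1.11/rke_linux-amd64 (on a machine that cannot reach GitHub, download the binary elsewhere and upload it instead)
Write the cluster YAML file; switch to the docker user first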
nodes:
  - address: 10.0.0.100
    user: docker
    ssh_key_path: ~/.ssh/id_rsa
    role: [controlplane,worker,etcd]
  - address: 10.0.0.101
    user: docker
    role: [worker,etcd]
  - address: 10.0.0.102
    user: docker
    role: [worker,etcd]
services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h
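- address: the address of the cluster node
- user: the user that runs the installation commands
- ssh_key_path: path to the private key (needed only if the key was not generated under the default name)
- role: the role or roles this node plays
For the remaining options, see https://www.cnrancher.com/docs/rke/v0.1.x/cn/example-yamls/cluster/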
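With the YAML file in place, the cluster is brought up with the rke binary. The sketch below assumes the file is saved as cluster.yml next to rke_linux-amd64 in the docker user's home directory, and that the generated kube_config_cluster.yml is copied to ~/.kube/config so kubectl finds it by default. The run and bring_up_cluster helpers are made up for this example:

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bring up the cluster and make its kubeconfig the default for kubectl.
bring_up_cluster() {
    run chmod +x ./rke_linux-amd64
    run ./rke_linux-amd64 up --config cluster.yml
    # rke writes kube_config_cluster.yml next to cluster.yml;
    # kubectl reads ~/.kube/config unless told otherwise.
    run mkdir -p "$HOME/.kube"
    run cp kube_config_cluster.yml "$HOME/.kube/config"
}
```

Setting DRY_RUN=1 before calling bring_up_cluster only prints the four commands, which is a cheap way to review them first.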
Install kubectl and check the cluster
yum -y install kubectl
Check the cluster nodes:
[docker@MKE ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.0.0.100 Ready controlplane,etcd,worker 2h v1.11.3
10.0.0.101 Ready etcd,worker 2h v1.11.3
10.0.0.102 Ready etcd,worker 2h v1.11.3
Check pod status
[docker@MKE ~]$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx default-http-backend-797c5bc547-j7577 1/1 Running 0 2h
ingress-nginx nginx-ingress-controller-69s9g 1/1 Running 0 2h
ingress-nginx nginx-ingress-controller-8gw74 1/1 Running 0 2h
ingress-nginx nginx-ingress-controller-xgzzw 1/1 Running 0 2h
kube-system canal-5nf7c 3/3 Running 0 2h
kube-system canal-nzgx4 3/3 Running 0 2h
kube-system canal-t5m9d 3/3 Running 0 2h
kube-system kube-dns-7588d5b5f5-s5f99 3/3 Running 0 2h
kube-system kube-dns-autoscaler-5db9bbb766-62rxm 1/1 Running 0 2h
kube-system metrics-server-97bc649d5-9h2g4 1/1 Running 0 2h
kube-system rke-ingress-controller-deploy-job-rwzgq 0/1 Completed 0 2h
kube-system rke-kubedns-addon-deploy-job-mvmzj 0/1 Completed 0 2h
kube-system rke-metrics-addon-deploy-job-52gp4 0/1 Completed 0 2h
kube-system rke-network-plugin-deploy-job-jckhc 0/1 Completed 0 2h
Pods whose STATUS is Completed are run-once Jobs; READY for these pods should be 0/1.
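On a larger cluster it is easier to filter this listing than to read it. A small sketch that treats Running and Completed as healthy, as described above, and prints everything else; the function name unhealthy_pods is hypothetical:

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# "namespace/name status" for pods that are neither Running nor Completed.
unhealthy_pods() {
    # Column 4 is STATUS; NR > 1 skips the header row
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage: kubectl get pods --all-namespaces | unhealthy_pods. No output means every pod is either Running or Completed.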
Configure kubectl command completion
yum -y install bash-completion.noarch
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
Install the Rancher dashboard with Helm
Create the RBAC (Role-based Access Control) objects for Helm
# Create a service account named tiller in the kube-system namespace
kubectl -n kube-system create serviceaccount tiller
# Bind the kube-system:tiller service account to the cluster-admin cluster role
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
Install Helm from the binary release
The release packages can be downloaded from: https://github.com/helm/helm/releases
[docker@MKE ~]$ tar xf helm-v2.12.2-linux-amd64.tar.gz
[root@MKE ~]# cp -a -t /usr/local/bin/ /home/docker/linux-amd64/helm /home/docker/linux-amd64/tiller
[root@MKE ~]# su - docker
Add the Helm chart repository
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
Install Tiller in the Rancher cluster
The version used by default is v2.12.3
helm init --service-account tiller --tiller-image \
registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.3 \
--stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
Upgrade Tiller (optional)
Install cert-manager
helm install stable/cert-manager \
--name cert-manager \
--namespace kube-system
If this fails with an error, append --set createCustomResource=true to the command
Choose an SSL configuration and install Rancher server
helm install rancher-stable/rancher \
--name rancher \
--namespace cattle-system \
--set hostname=rancher.zsf.com
Edit the hosts file and test access from a browser
Because our domain name is made up, we add the matching name resolution to the hosts file
cat /etc/hosts
10.0.0.99 rancher.zsf.com
Log in from a browser
When logging in, make sure to use https. How long this step takes depends on your machine's specifications.
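Backup and restore
Cluster backup (for beginners, taking a snapshot right after the cluster is up is strongly recommended)
Note:
- RKE v0.1.7 or later is required
Creating a snapshot manually:
Before upgrading Rancher or restoring it to an earlier snapshot, create a snapshot manually so the data can be recovered if something goes wrong.
Run the following command on the RKE machine: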
./rke_linux-amd64 etcd snapshot-save --name <SNAPSHOT.db> --config rancher-cluster.yml
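- SNAPSHOT.db: the name under which the etcd snapshot is saved
- rancher-cluster.yml: the configuration file used when the cluster was created; it can be omitted if the default cluster.yml is used
RKE takes a snapshot on each etcd node and saves it in the /opt/rke/etcd-snapshots directory on that node.
Test: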
[docker@MKE ~]$ pwd
/home/docker
[docker@MKE ~]$ ls
cluster.yml kube_config_cluster.yml linux-amd64 rke_linux-amd64
[docker@MKE ~]$ ./rke_linux-amd64 etcd snapshot-save --name initialization_status_20190213 --config cluster.yml
INFO[0000] Starting saving snapshot on etcd hosts
INFO[0000] [dialer] Setup tunnel for host [10.0.0.100]
INFO[0000] [dialer] Setup tunnel for host [10.0.0.101]
INFO[0000] [dialer] Setup tunnel for host [10.0.0.102]
INFO[0000] [etcd] Saving snapshot [initialization_status_20190213] on host [10.0.0.100]
INFO[0000] [etcd] Successfully started [etcd-snapshot-once] container on host [10.0.0.100]
INFO[0000] [etcd] Saving snapshot [initialization_status_20190213] on host [10.0.0.101]
INFO[0001] [etcd] Successfully started [etcd-snapshot-once] container on host [10.0.0.101]
INFO[0001] [etcd] Saving snapshot [initialization_status_20190213] on host [10.0.0.102]
INFO[0001] [etcd] Successfully started [etcd-snapshot-once] container on host [10.0.0.102]
INFO[0002] [certificates] Successfully started [rke-bundle-cert] container on host [10.0.0.100]
INFO[0002] [certificates] Successfully started [rke-bundle-cert] container on host [10.0.0.102]
INFO[0002] [certificates] Successfully started [rke-bundle-cert] container on host [10.0.0.101]
INFO[0002] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [10.0.0.101]
INFO[0002] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [10.0.0.100]
INFO[0002] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [10.0.0.102]
INFO[0002] Finished saving snapshot [initialization_status_20190213] on all etcd hosts
Check on a node:
[docker@master etcd-snapshots]$ ll -d /opt/rke/etcd-snapshots/initialization_status_20190213
-rw-r--r-- 1 root root 9052192 Feb 13 10:25 /opt/rke/etcd-snapshots/initialization_status_20190213
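The snapshot name in the test above ends with the date. If you take manual snapshots often, the name can be generated instead of typed. A sketch assuming the same binary and cluster.yml as in the test; the function names and the DRY_RUN variable are conventions of this example only:

```shell
# Build a snapshot name of the form <prefix>_YYYYMMDD.
snapshot_name() {
    prefix="${1:-snapshot}"
    echo "${prefix}_$(date +%Y%m%d)"
}

# Save a snapshot under a dated name; DRY_RUN=1 only prints the command.
save_dated_snapshot() {
    name="$(snapshot_name "$1")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "./rke_linux-amd64 etcd snapshot-save --name $name --config cluster.yml"
    else
        ./rke_linux-amd64 etcd snapshot-save --name "$name" --config cluster.yml
    fi
}
```

For example, save_dated_snapshot before_upgrade saves a snapshot named before_upgrade_ followed by today's date.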
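Scheduled automatic snapshots
Scheduled snapshots are a service bundled with RKE and are disabled by default. Enable the etcd-snapshot service by adding configuration to the cluster configuration file (rancher-cluster.yml, or cluster.yml in this walkthrough).
Add the following to cluster.yml:
services:
  etcd:
    snapshot: true  # enable snapshots; default is false
    creation: 6h0s  # interval between snapshots; defaults to 5 minutes if omitted
    retention: 24h  # how long snapshots are kept; they are deleted after this time
Then run ./rke_linux-amd64 up --config cluster.yml
Result:
RKE periodically takes a snapshot on each etcd node and saves it under /opt/rke/etcd-snapshots/ on that node.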