Deploying MongoDB on Kubernetes mainly relies on mongo-k8s-sidecar.

1. Architecture

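A MongoDB cluster can be built in three main ways: master-slave mode, replica set mode, and sharding mode. Each has trade-offs and suits different scenarios. Replica sets are the most widely used, master-slave mode is rarely used now, and sharding is the most complete but is more complex to configure and maintain.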

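mongo-k8s-sidecar uses replica set mode. A MongoDB replica set serves two main purposes. The first is data redundancy for failure recovery: when a hardware fault or another problem takes a node down, a replica can be used to recover. The second is read/write splitting: read requests are routed to the replicas, which reduces the read load on the primary.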

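A MongoDB cluster deployed from binaries needs no additional services. Running commands like the following on the primary node creates the cluster: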

cfg={ _id:"testdb", members:[ {_id:0,host:'192.168.255.141:27017',priority:2}, {_id:1,host:'192.168.255.142:27017',priority:1}, {_id:2,host:'192.168.255.142:27019',arbiterOnly:true}] };
rs.initiate(cfg)

2. Deployment

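This article is a hands-on record of deploying MongoDB. Because the sidecar needs permission to list pods in order to manage the cluster, in a real deployment apply the RBAC configuration in section 2.5 first, and only then deploy the StatefulSet in section 2.4.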

2.1 Namespace

kubectl create ns mongo

2.2 StorageClass

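An NFS server, or another storage backend that can back a StorageClass, must be deployed in advance.
Reference: Using NFS for persistent storage in Kubernetes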

# mongo-cluster-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mongodb-data
provisioner: fuseim.pri/ifs

# create
kubectl create -f mongo-cluster-sc.yaml

2.3 Headless Service

apiVersion: v1
kind: Service
metadata:
  name: mongo
  namespace: mongo
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    role: mongo

2.4 Statefulset

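apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
  namespace: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  # apps/v1 requires an explicit selector that matches the pod template labels
  selector:
    matchLabels:
      role: mongo
      environment: prod
  template: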
    metadata:
      labels:
        role: mongo
        environment: prod
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: harbor.s.com/redis/mongo:3.4.22
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--bind_ip"
            - 0.0.0.0
            - "--smallfiles"
            - "--noprealloc"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-persistent-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: harbor.s.com/redis/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=mongo,environment=prod"
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: mongodb-data
      resources:
        requests:
          storage: 10Gi

2.5 RBAC

Checking the cluster status at this point shows that the replica set is not available.

kubectl exec -it mongo-0 -n mongo -- mongo
Defaulting container name to mongo.
Use 'kubectl describe pod/mongo-0 -n mongo' to see all of the containers in this pod.
MongoDB shell version v3.4.22
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.22
Server has startup warnings: 
2019-08-24T09:23:57.039+0000 I CONTROL  [initandlisten] 
2019-08-24T09:23:57.039+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-08-24T09:23:57.039+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2019-08-24T09:23:57.039+0000 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2019-08-24T09:23:57.039+0000 I CONTROL  [initandlisten] 
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] 
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] 
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-08-24T09:23:57.040+0000 I CONTROL  [initandlisten] 
> rs.status()
{
        "info" : "run rs.initiate(...) if not yet done for the set",
        "ok" : 0,
        "errmsg" : "no replset config has been received",
        "code" : 94,
        "codeName" : "NotYetInitialized"
}
>

The mongo-k8s-sidecar is probably not configured correctly, so check its logs:

kubectl logs mongo-0 mongo-sidecar -n mongo

···
Error in workloop { [Error: [object Object]]
  message:
   { kind: 'Status',
     apiVersion: 'v1',
     metadata: {},
     status: 'Failure',
     message:
      'pods is forbidden: User "system:serviceaccount:mongo:default" cannot list resource "pods" in API group "" at the cluster scope',
     reason: 'Forbidden',
     details: { kind: 'pods' },
     code: 403 },
  statusCode: 403 }

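The message shows that the default ServiceAccount has no permission to list pods. This problem was reported on GitHub long ago, and the author's suggested fix is to grant the default ServiceAccount the list permission on pods. In practice, however, the error persisted even after granting system:serviceaccount:mongo:default the list permission on pods. This is the RBAC configuration that was tried: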

mongo-k8s-sidecar/role.yaml at 2640ed1c2971b1279c2961efd257cde9fbe39574 · cvallance/mongo-k8s-sidecar · GitHub

# configuration that still did not work after being applied
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: mongo
  name: mongo-pod-read
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mongo-pod-read
  namespace: mongo
subjects:
- kind: ServiceAccount
  name: default
  namespace: mongo
roleRef:
  kind: Role
  name: mongo-pod-read
  apiGroup: rbac.authorization.k8s.io
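
The sidecar log above reports that the list was attempted "at the cluster scope", which a namespaced Role does not cover. One way to see the gap is `kubectl auth can-i` with impersonation. The following is a minimal sketch: the function name `sa_can_list_pods` is made up for illustration, and it assumes your own account is allowed to impersonate ServiceAccounts.

```shell
#!/bin/sh
# Hypothetical helper: ask the API server whether the default ServiceAccount
# of the mongo namespace may list pods, in one namespace or cluster-wide.
SA="system:serviceaccount:mongo:default"

sa_can_list_pods() {
  # $1 = "namespace" checks only the mongo namespace
  # $1 = "cluster"   checks all namespaces (the scope named in the sidecar log)
  case "$1" in
    namespace) kubectl auth can-i list pods --as="$SA" -n mongo ;;
    cluster)   kubectl auth can-i list pods --as="$SA" --all-namespaces ;;
    *) echo "usage: sa_can_list_pods namespace|cluster" >&2; return 2 ;;
  esac
}
```

With only the Role and RoleBinding applied, `sa_can_list_pods namespace` should answer yes while `sa_can_list_pods cluster` should answer no, which would match the 403 in the log.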

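So another approach is needed: give this ServiceAccount broader permissions. The error message says the list request is made at the cluster scope, which a namespaced Role and RoleBinding cannot authorize. Here the built-in ClusterRole view is bound to the ServiceAccount with a ClusterRoleBinding, which grants read access in every namespace. (A ClusterRole can also serve as a reusable template: binding it with a RoleBinding grants the same rules inside a single namespace only.)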
GCE - K8s 1.8 - pods is forbidden - Cannot list pods - Unknown user "system:serviceaccount:default:default" · Issue #75 · cvallance/mongo-k8s-sidecar · GitHub

# working RBAC configuration
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mongo-default-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    name: default
    namespace: mongo
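
Binding view through a ClusterRoleBinding lets the ServiceAccount read most resources in every namespace. A narrower option is sketched below, under the assumption that the sidecar image in use honors the `KUBE_NAMESPACE` environment variable described in the upstream mongo-k8s-sidecar README. It confines the sidecar's pod lookup to one namespace, so that the namespaced Role shown earlier may be sufficient. The function name is illustrative only.

```shell
#!/bin/sh
# Assumption: the sidecar reads KUBE_NAMESPACE and, when it is set, lists pods
# only in that namespace instead of across the whole cluster.
restrict_sidecar_to_namespace() {
  ns="${1:-mongo}"
  # Set the variable on the mongo-sidecar container of the StatefulSet only.
  kubectl set env statefulset/mongo -n "$ns" -c mongo-sidecar "KUBE_NAMESPACE=$ns"
}
```

Calling `restrict_sidecar_to_namespace mongo` changes the pod template, so the pods are replaced according to the StatefulSet's update strategy.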

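A pod's ServiceAccount cannot be changed after the pod is created. RBAC bindings are evaluated on every request, so the new permission should also apply to running pods, but here the pods were recreated so that the sidecar starts from a clean state: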

# check the statefulset
kubectl get statefulset -n mongo
NAME    READY   AGE
mongo   3/3     23h

# scale
kubectl scale statefulset mongo -n mongo --replicas=0
statefulset.apps/mongo scaled

# after a short wait, set the replica count back to 3
kubectl scale statefulset mongo -n mongo --replicas=3
statefulset.apps/mongo scaled

# confirm that everything has been recreated
kubectl get all -n mongo
NAME          READY   STATUS    RESTARTS   AGE
pod/mongo-0   2/2     Running   0          21s
pod/mongo-1   2/2     Running   0          17s
pod/mongo-2   2/2     Running   0          12s


NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
service/mongo   ClusterIP   None         <none>        27017/TCP   23h




NAME                     READY   AGE
statefulset.apps/mongo   3/3     23h

Checking the cluster status again shows that it is now healthy and the replica set was created successfully:

kubectl exec -it mongo-0 -n mongo -- mongo
rs0:PRIMARY> rs.status()
{
        "set" : "rs0",
        "date" : ISODate("2019-08-25T08:58:12.550Z"),
        "myState" : 1,
        "term" : NumberLong(2),
        "syncingTo" : "",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
                "lastCommittedOpTime" : {
                        "ts" : Timestamp(1566723485, 1),
                        "t" : NumberLong(2)
                },
                "appliedOpTime" : {
                        "ts" : Timestamp(1566723485, 1),
                        "t" : NumberLong(2)
                },
                "durableOpTime" : {
                        "ts" : Timestamp(1566723485, 1),
                        "t" : NumberLong(2)
                }
        },
        "members" : [
                {
                        "_id" : 0,
                        "name" : "10.244.4.87:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 19,
                        "optime" : {
                                "ts" : Timestamp(1566723485, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1566723485, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2019-08-25T08:58:05Z"),
                        "optimeDurableDate" : ISODate("2019-08-25T08:58:05Z"),
                        "lastHeartbeat" : ISODate("2019-08-25T08:58:11.877Z"),
                        "lastHeartbeatRecv" : ISODate("2019-08-25T08:58:11.192Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncingTo" : "10.244.3.65:27017",
                        "syncSourceHost" : "10.244.3.65:27017",
                        "syncSourceId" : 3,
                        "infoMessage" : "",
                        "configVersion" : 171757
                },
                {
                        "_id" : 1,
                        "name" : "10.244.5.9:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 19,
                        "optime" : {
                                "ts" : Timestamp(1566723485, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1566723485, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2019-08-25T08:58:05Z"),
                        "optimeDurableDate" : ISODate("2019-08-25T08:58:05Z"),
                        "lastHeartbeat" : ISODate("2019-08-25T08:58:11.875Z"),
                        "lastHeartbeatRecv" : ISODate("2019-08-25T08:58:11.478Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncingTo" : "10.244.4.87:27017",
                        "syncSourceHost" : "10.244.4.87:27017",
                        "syncSourceId" : 0,
                        "infoMessage" : "",
                        "configVersion" : 171757
                },
                {
                        "_id" : 3,
                        "name" : "10.244.3.65:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 80,
                        "optime" : {
                                "ts" : Timestamp(1566723485, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2019-08-25T08:58:05Z"),
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "could not find member to sync from",
                        "electionTime" : Timestamp(1566723473, 1),
                        "electionDate" : ISODate("2019-08-25T08:57:53Z"),
                        "configVersion" : 171757,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
                }
        ],
        "ok" : 1
}
rs0:PRIMARY>
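
Reading the full `rs.status()` document by eye gets tedious. The sketch below reduces it to one line per member and checks for exactly one PRIMARY with all other members SECONDARY. Both function names are hypothetical, and the check does not account for arbiters or other member states, which this deployment does not use.

```shell
#!/bin/sh
# Print one "name stateStr" line per replica set member, as seen from mongo-0.
rs_members() {
  kubectl exec mongo-0 -n mongo -c mongo -- mongo --quiet --eval \
    'rs.status().members.forEach(function (m) { print(m.name + " " + m.stateStr); })'
}

# Read "name stateStr" lines on stdin. Succeed only when there is exactly one
# PRIMARY and every other member is SECONDARY.
rs_is_healthy() {
  awk '
    NF == 0           { next }            # ignore blank lines
    $2 == "PRIMARY"   { primary++; next }
    $2 == "SECONDARY" { next }
    { bad++ }                             # any other state counts as unhealthy
    END { exit !(primary == 1 && bad == 0) }
  '
}
```

Usage: `rs_members | rs_is_healthy && echo "replica set healthy"`.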

2.6 Scaling out

To scale mongo out, simply adjust the replicas of the StatefulSet:

kubectl scale statefulset mongo --replicas=4 -n mongo

3. Usage / Access

The default connection string for a mongo cluster looks like this:

mongodb://mongo1,mongo2,mongo3:27017/dbname_?

In Kubernetes, services are most commonly reached by FQDN, in this form:

#appName.$HeadlessServiceName.$Namespace.svc.cluster.local

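Because the pods are deployed by a StatefulSet, their names follow a fixed pattern. To connect to a 4-replica MongoDB cluster, the default connection string above therefore becomes the following (the full FQDN form assumes the client runs outside the namespace):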

mongodb://mongo-0.mongo.mongo.svc.cluster.local:27017,mongo-1.mongo.mongo.svc.cluster.local:27017,mongo-2.mongo.mongo.svc.cluster.local:27017,mongo-3.mongo.mongo.svc.cluster.local:27017/?replicaSet=rs0
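
Since the host list has to change whenever the replica count changes, it can help to generate the connection string instead of typing it. This is a small sketch with a made-up function name, `mongo_uri`. It assumes the cluster domain `cluster.local` and port 27017 used in this article, and its defaults match the manifests above.

```shell
#!/bin/sh
# Build a replica set connection string for the first N pods of a StatefulSet.
# Arguments: replica count, StatefulSet name, headless Service name,
# namespace, replica set name.
mongo_uri() {
  replicas="${1:-3}"
  sts="${2:-mongo}"
  svc="${3:-mongo}"
  ns="${4:-mongo}"
  rs="${5:-rs0}"
  hosts=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    host="$sts-$i.$svc.$ns.svc.cluster.local:27017"
    hosts="${hosts:+$hosts,}$host"   # prepend a comma only after the first host
    i=$((i + 1))
  done
  echo "mongodb://$hosts/?replicaSet=$rs"
}
```

`mongo_uri 4` prints the 4-replica connection string shown above.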

4. Monitoring

Monitoring uses the Helm chart prometheus-mongodb-exporter.

4.1 Deploying the exporter

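Note: when the uri points at a cluster, it must be enclosed in double quotes, otherwise all kinds of warnings appear. I ran into this pitfall many times.
The uri appears to be static rather than auto-discovered, so when replicas are added to or removed from the cluster, update the uri through helm and recreate the pod after the configuration changes.
The image setting only exists to make deployment on an internal network easier: the default image was downloaded and pushed to Harbor without any other modification, so it can be ignored.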

# vi values.yaml, edit the custom values
mongodb:
  uri: "mongodb://mongo-0.mongo.mongo.svc.cluster.local:27017,mongo-1.mongo.mongo.svc.cluster.local:27017,mongo-2.mongo.mongo.svc.cluster.local:27017,mongo-3.mongo.mongo.svc.cluster.local:27017/?replicaSet=rs0"
image:
  repository: harbor.s.com/mongo/mongodb-exporter
  tag: 0.7.0
# deploy
helm upgrade --install mongo-exporter stable/prometheus-mongodb-exporter -f values.yaml --namespace mongo --force
# check the result (port-forward keeps running, so run curl from a second terminal)
kubectl port-forward service/mongo-exporter-prometheus-mongodb-exporter 9216 -n mongo
curl http://127.0.0.1:9216/metrics
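
The metrics page is long, so a small filter helps when only one value matters. The sketch below extracts a single sample from Prometheus text-format output. `metric_value` is a made-up helper name, and `mongodb_up` in the usage line is an assumed metric name that should be checked against what your exporter version actually exposes.

```shell
#!/bin/sh
# Read Prometheus text-format metrics on stdin and print the value of the
# first sample whose metric name is exactly $1. Labels are ignored, and the
# sample line is assumed to carry no trailing timestamp.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # strip a {label="..."} suffix
      if (metric == name) { print $NF; exit }
    }
  '
}
```

Usage: `curl -s http://127.0.0.1:9216/metrics | metric_value mongodb_up`.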

4.2 Configuring Prometheus Operator