动态存储管理实战: GlusterFS

本节以GlusterFS为例,从定义StorageClass、创建GlusterFS和Heketi服务、用户申请PVC到创建Pod使用存储资源,对StorageClass和动态资源分配进行详细说明,进一步剖析Kubernetes的存储机制。(ps: 第26篇有详细的部署过程,此处不上传操作图片了)

准备工作

为了能够使用GlusterFS,首先在计划用于GlusterFS的各Node上安装GlusterFS客户端:

yum install -y  glusterfs glusterfs-fuse

GlusterFS管理服务容器需要以特权模式运行,在kube-apiserver的启动参数中增加:

--allow-privileged=true

给要部署GlusterFS管理服务的节点打上“storagenode=glusterfs”的标签,是为了将GlusterFS容器定向部署到安装了GlusterFS的Node上:

kubectl label node k8s-node-1  storagenode=glusterfs
kubectl label node k8s-node-2  storagenode=glusterfs
kubectl label node k8s-node-3  storagenode=glusterfs

创建GlusterFS管理服务容器集群

      GlusterFS管理服务容器以DaemonSet的方式进行部署,确保在每个Node上都运行一个GlusterFS管理服务。glusterfs-daemonset.yaml的内容如下:

kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: glusterfs
  labels:
    glusterfs: daemonset
  annotations:
    description: GlusterFS DaemonSet
    tags: glusterfs
spec:
  template:
    metadata:
      name: glusterfs
      labels:
        glusterfs: pod
        glusterfs-node: pod
    spec:
      nodeSelector:
        storagenode: glusterfs
      hostNetwork: true
      containers:
      - image: gluster/gluster-centos:latest
        name: glusterfs
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: glusterfs-heketi
          mountPath: "/var/lib/heketi"
        - name: glusterfs-run
          mountPath: "/run"
        - name: glusterfs-lvm
          mountPath: "/run/lvm"
        - name: glusterfs-etc
          mountPath: "/etc/glusterfs"
        - name: glusterfs-logs
          mountPath: "/var/log/glusterfs"
        - name: glusterfs-config
          mountPath: "/var/lib/glusterd"
        - name: glusterfs-dev
          mountPath: "/dev"
        - name: glusterfs-misc
          mountPath: "/var/lib/misc/glusterfsd"
        - name: glusterfs-cgroup
          mountPath: "/sys/fs/cgroup"
          readOnly: true
        - name: glusterfs-ssl
          mountPath: "/etc/ssl"
          readOnly: true
        securityContext:
          capabilities: {}
          privileged: true
        readinessProbe:
          timeoutSeconds: 3
          initialDelaySeconds: 60
          exec:
            command:
            - "/bin/bash"
            - "-c"
            - systemctl status glusterd.service
        livenessProbe:
          timeoutSeconds: 3
          initialDelaySeconds: 60
          exec:
            command:
            - "/bin/bash"
            - "-c"
            - systemctl status glusterd.service
      volumes:
      - name: glusterfs-heketi
        hostPath:
          path: "/var/lib/heketi"
      - name: glusterfs-run
      - name: glusterfs-lvm
        hostPath:
          path: "/run/lvm"
      - name: glusterfs-etc
        hostPath:
          path: "/etc/glusterfs"
      - name: glusterfs-logs
        hostPath:
          path: "/var/log/glusterfs"
      - name: glusterfs-config
        hostPath:
          path: "/var/lib/glusterd"
      - name: glusterfs-dev
        hostPath:
          path: "/dev"
      - name: glusterfs-misc
        hostPath:
          path: "/var/lib/misc/glusterfsd"
      - name: glusterfs-cgroup
        hostPath:
          path: "/sys/fs/cgroup"
      - name: glusterfs-ssl
        hostPath:
          path: "/etc/ssl"
kubectl apply -f glusterfs-daemonset.yaml
kubectl get pod

创建Heketi服务

Heketi 是一个提供RESTful API管理GlusterFS卷的框架,并能够在OpenStack、Kubernetes、OpenShift等云平台上实现动态存储资源供应,支持GlusterFS多集群管理,便于管理员对GlusterFS进行操作。

在部署Heketi服务之前,需要为它创建一个ServiceAccount对象:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heketi-service-account

 

kubectl apply -f heketi-service-account.yaml

部署Heketi服务

---
kind: Service
apiVersion: v1
metadata:
  name: deploy-heketi
  labels:
    glusterfs: heketi-service
    deploy-heketi: support
  annotations:
    description: Exposes Heketi Service
spec:
  selector:
    name: deploy-heketi
  ports:
  - name: deploy-heketi
    port: 8080
    targetPort: 8080
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: deploy-heketi
  labels:
    glusterfs: heketi-deployment
    deploy-heketi: heket-deployment
  annotations:
    description: Defines how to deploy Heketi
spec:
  replicas: 1
  template:
    metadata:
      name: deploy-heketi
      labels:
        glusterfs: heketi-pod
        name: deploy-heketi
    spec:
      serviceAccountName: heketi-service-account
      containers:
      - image: heketi/heketi
        imagePullPolicy: IfNotPresent
        name: deploy-heketi
        env:
        - name: HEKETI_EXECUTOR
          value: kubernetes
        - name: HEKETI_FSTAB
          value: "/var/lib/heketi/fstab"
        - name: HEKETI_SNAPSHOT_LIMIT
          value: '14'
        - name: HEKETI_KUBE_GLUSTER_DAEMONSET
          value: "y"
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: db
          mountPath: "/var/lib/heketi"
        readinessProbe:
          timeoutSeconds: 3
          initialDelaySeconds: 3
          httpGet:
            path: "/hello"
            port: 8080
        livenessProbe:
          timeoutSeconds: 3
          initialDelaySeconds: 30
          httpGet:
            path: "/hello"
            port: 8080
      volumes:
      - name: db
        hostPath:
          path: "/heketi-data"

需要注意的是,Heketi的DB数据需要持久化保存,建议使用hostPath或其他共享存储进行保存:

kubectl apply -f heketi-deployment-svc.yaml

为Heketi设置GlusterFS集群

在Heketi能够管理GlusterFS集群之前,首先要为其设置GlusterFS集群的信息。可以用一个topology.json配置文件来完成各个GlusterFS节点和设备的定义。Heketi要求在一个GlusterFS集群中至少有3个节点。在topology.json配置文件hostnames字段的manage上填写主机名,在storage上填写IP地址,devices要求为未创建文件系统的裸设备(可以有多块盘),以供Heketi自动完成PV(Physical Volume)、VG(Volume Group)和LV(Logical Volume)的创建。topology.json文件的内容如下:

{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-master2"
              ],
              "storage": [
                "20.0.40.52"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb1"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-master3"
              ],
              "storage": [
                "20.0.40.53"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb1"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "k8s-node3"
              ],
              "storage": [
                "20.0.40.56"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdb1"
          ]
        }
      ]
    }
  ]
}

 

进入Heketi容器,使用命令行工具heketi-cli完成GlusterFS集群的创建(此处不在过多描述,详细参考第26篇)。

经过这个操作,Heketi完成了GlusterFS集群的创建,同时在GlusterFS集群的各个节点的/dev/sdb盘上成功创建了PV和VG。

查看Heketi的topology信息,可以看到Node和Device的详细信息,包括磁盘空间的大小和剩余空间。此时,Volume和Brick还未创建:

定义StorageClass

准备工作已经就绪,集群管理员现在可以在Kubernetes集群中定义一个StorageClass了。storageclass-gluster-heketi.yaml配置文件的内容如下:

apiVersion: storage.k8s.ip/v1
kind: StorageClass
metadata:
  name: gluster-heketi
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://20.0.40.53:8080"     #填写宿主机ip,不要填写clusterip
  restauthenabled: "false"

Provisioner参数必须被设置为“kubernetes.io/glusterfs”。
        resturl的地址需要被设置为API Server所在主机可以访问到的Heketi服务的某个地址,可以使用服务ClusterIP+端口号、容器IP地址+端口号,或将服务映射到物理机,使用物理机IP+NodePort。
        创建这个StorageClass资源对象:

kubectl apply -f storageclass-glusterfs-heketi.yaml

定义PVC

现在,用户可以申请一个PVC了。例如,一个用户申请一个1GiB空间的共享存储资源,StorageClass使用“gluster-heketi”,未定义任何Selector,说明使用动态资源供应模式:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-gluster-heketi
spec:
  storageClassName: gluster-heketi
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

PVC的定义一旦生成,系统便将触发Heketi进行相应的操作,主要为在GlusterFS集群上创建brick,再创建并启动一个Volume。整个过程可以在Heketi的日志中查到:

查看PVC的状态,可见其已经为Bound(已绑定):kubectl get pvc

查看PV,可见系统自动创建的PV: kubecatl get pv

查看该PV的详细信息,可以看到其容量、引用的StorageClass等信息都已正确设置,状态也为Bound,回收策略则为默认的Delete。同时Gluster的Endpoint和Path也由Heketi自动完成了设置: kubectl describe pv <pv-pvname>

至此,一个可供Pod使用的PVC就创建成功了。接下来Pod就能通过Volume的设置将这个PVC挂载到容器内部进行使用了。

Pod使用PVC的存储资源

在Pod中使用PVC定义的存储资源非常容易,只需设置一个Volume,其类型为persistentVolumeClaim,即可轻松引用一个PVC。下例中使用一个busybox容器验证对PVC的使用,注意Pod需要与PVC属于同一个Namespace:

apiVersion: v1
kind: Pod
metadata:
  name: pod-use-pvc
spec:
  containers:
  - name: pod-use-pvc
    image: busybox
    imagePullPolucy: IfNotPresent
    command:
    - sleep
    - "3600"
    volumeMounts:
    - name: gluster-volume
      mounthPath: "/pv-data"
      readOnly: false
   volumes:
   - name: gluster-volume
     persistentVolumeClain:
       claimName: pvc-gluster-heketi
kubectl apply -f pod-use-pvc.yaml

进入容器pod-use-pvc,在/pv-data目录下创建一些文件。可以验证文件a和b在GlusterFS集群中是否正确生成。

至此,使用Kubernetes最新的动态存储供应模式,配合StorageClass和Heketi共同搭建基于GlusterFS的共享存储就完成了。有兴趣的读者可以继续尝试StorageClass的其他设置,例如调整GlusterFS的Volume类型、修改PV的回收策略等。

在使用动态存储供应模式的情况下,相对于静态模式的优势至少包括如下两点。
(1)管理员无须预先创建大量的PV作为存储资源。
(2)用户在申请PVC时无法保证容量与预置PV的容量完全匹配。从Kubernetes 1.6版本开始,建议用户优先考虑使用StorageClass的动态存储供应模式进行存储管理

 

CSI存储机制详解

Kubernetes从1.9版本开始引入容器存储接口Container Storage Interface(CSI)机制,用于在Kubernetes和外部存储系统之间建立一套标准的存储管理接口,通过该接口为容器提供存储服务。CSI到Kubernetes 1.10版本升级为Beta版,到Kubernetes 1.13版本升级为GA版,已逐渐成熟。

CSI的设计背景

Kubernetes通过PV、PVC、Storageclass已经提供了一种强大的基于插件的存储管理机制,但是各种存储插件提供的存储服务都是基于一种被称为“in-true”(树内)的方式提供的,这要求存储插件的代码必须被放进Kubernetes的主干代码库中才能被Kubernetes调用,属于紧耦合的开发模式。这种“in-tree”方式会带来一些问题:

◎ 存储插件的代码需要与Kubernetes的代码放在同一代码库中,并与Kubernetes的二进制文件共同发布;
◎ 存储插件代码的开发者必须遵循Kubernetes的代码开发规范;
◎ 存储插件代码的开发者必须遵循Kubernetes的发布流程,包括添加对Kubernetes存储系统的支持和错误修复;
◎ Kubernetes社区需要对存储插件的代码进行维护,包括审核、测试等工作;
◎ 存储插件代码中的问题可能会影响Kubernetes组件的运行,并且很难排查问题;
◎ 存储插件代码与Kubernetes的核心组件(kubelet和kube-controller-manager)享有相同的系统特权权限,可能存在可靠性和安全性问题。

Kubernetes已有的Flex Volume插件机制试图通过为外部存储暴露一个基于可执行程序(exec)的API来解决这些问题。尽管它允许第三方存储提供商在Kubernetes核心代码之外开发存储驱动,但仍然有两个问题没有得到很好的解决:

◎ 部署第三方驱动的可执行文件仍然需要宿主机的root权限,存在安全隐患;
◎ 存储插件在执行mount、attach这些操作时,通常需要在宿主机上安装一些第三方工具包和依赖库,使得部署过程更加复杂,例如部署Ceph时需要安装rbd库,部署GlusterFS时需要安装mount.glusterfs库,等等。

基于以上这些问题和考虑,Kubernetes逐步推出与容器对接的存储接口标准,存储提供方只需要基于标准接口进行存储插件的实现,就能使用Kubernetes的原生存储机制为容器提供存储服务。这套标准被称为CSI(容器存储接口)。在CSI成为Kubernetes的存储供应标准之后,存储提供方的代码就能和Kubernetes代码彻底解耦,部署也与Kubernetes核心组件分离,显然,存储插件的开发由提供方自行维护,就能为Kubernetes用户提供更多的存储功能,也更加安全可靠。基于CSI的存储插件机制也被称为“out-of-tree”(树外)的服务提供方式,是未来Kubernetes第三方存储插件的标准方案。

CSI存储插件的关键组件和部署架构

下图描述了Kubernetes CSI存储插件的关键组件和推荐的容器化部署架构。

GPFS如何共享给k8s用 能直接使用hostpath k8s 共享存储_Pod

     其中主要包括两种组件:CSI Controller和CSI Node。

CSI Controller 

CSI Controller的主要功能是提供存储服务视角对存储资源和存储卷进行管理和操作。在Kubernetes中建议将其部署为单实例Pod,可以使用StatefulSet或Deployment控制器进行部署,设置副本数量为1,保证为一种存储插件只运行一个控制器实例。

       在这个Pod内部署两个容器,如下所述。

(1)与Master(kube-controller-manager)通信的辅助sidecar容器。

在sidecar容器内又可以包含external-attacher和external-provisioner两个容器,它们的功能分别如下。
◎ external-attacher:监控VolumeAttachment资源对象的变更,触发针对CSI端点的ControllerPublish和ControllerUnpublish操作。
◎ external-provisioner:监控PersistentVolumeClaim资源对象的变更,触发针对CSI端点的CreateVolume和DeleteVolume操作。

(2)CSI Driver存储驱动容器,由第三方存储提供商提供,需要实现上述接口。
         这两个容器通过本地Socket(Unix Domain Socket,UDS),并使用gPRC协议进行通信。sidecar容器通过Socket调用CSI Driver容器的CSI接口,CSI Driver容器负责具体的存储卷操作。

CSI Node

     CSI Node的主要功能是对主机(Node)上的Volume进行管理和操作。在Kubernetes中建议将其部署为DaemonSet,在每个Node上都运行一个Pod。

在这个Pod中部署以下两个容器:
(1)与kubelet通信的辅助sidecar容器node-driver-registrar,主要功能是将存储驱动注册到kubelet中;
(2)CSI Driver存储驱动容器,由第三方存储提供商提供,主要功能是接收kubelet的调用,需要实现一系列与Node相关的CSI接口,例如NodePublishVolume接口(用于将Volume挂载到容器内的目标路径)、NodeUnpublishVolume接口(用于从容器中卸载Volume),等等。

       node-driver-registrar容器与kubelet通过Node主机的一个hostPath目录下的unix socket进行通信。CSI Driver容器与kubelet通过Node主机的另一个hostPath目录下的unix socket进行通信,同时需要将kubelet的工作目录(默认为/var/lib/kubelet)挂载给CSI Driver容器,用于为Pod进行Volume的管理操作(包括mount、umount等)。

CSI存储插件的使用示例:

   下面以csi-hostpath插件为例,对如何部署CSI插件、用户如何使用CSI插件提供的存储资源进行详细说明。

(1)设置Kubernetes服务启动参数。为kube-apiserver、kube-controller-manager和kubelet服务的启动参数添加:

--feature-gates=VolumeSnapshotDataSource=true,CSINodeInfo=true,CSIDriverRegistry=true

这3个特性开关是Kubernetes从1.12版本引入的Alpha版功能,CSINodeInfo和CSIDriverRegistry需要手工创建其相应的CRD资源对象。

Kubernetes 1.10版本所需的CSIPersistentVolume和MountPropagation特性开关已经默认启用,KubeletPluginsWatcher特性开关也在Kubernetes 1.12版本中默认启用,无须在命令行参数中指定。

(2)创建CSINodeInfo和CSIDriverRegistry CRD资源对象。

csidriver.yaml的内容如下:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: csidrivers.csi.storage.k8s.io
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  group: csi.storage.k8s.io
  names:
    kind: CSIDriver
    plural: csidrivers
  scope: Cluster
  validation:
    openAPIV3Schema:
      properties:
        spec:
          description: Specification of the CSI Driver.
          properties:
            attachRequired:
              description: Indicates this CSI volume driver requires an attach operation,and that Kubernetes should call attach and wait for any attach operationto complete before proceeding to mount.
              type: boolean
            podInfoOnMountVersion:
              description: Indicates this CSI volume driver requires additional pod
                information (like podName, podUID, etc.) during mount operations.
              type: string
  version: v1alpha1

csinodeinfo.yaml的内容如下:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: csinodeinfos.csi.storage.k8s.io
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  group: csi.storage.k8s.io
  names:
    kind: CSINodeInfo
    plural: csinodeinfos
  scope: Cluster
  validation:
    openAPIV3Schema:
      properties:
        spec:
          description: Specification of CSINodeInfo
          properties:
            drivers:
              description: List of CSI drivers running on the node and their specs.
              type: array
              items:
                properties:
                  name:
                    description: The CSI driver that this object refers to.
                    type: string
                  nodeID:
                    description: The node from the driver point of view.
                    type: string
                  topologyKeys:
                    description: List of keys supported by the driver.
                    items:
                      type: string
                    type: array
        status:
          description: Status of CSINodeInfo
          properties:
            drivers:
              description: List of CSI drivers running on the node and their statuses.
              type: array
              items:
                properties:
                  name:
                    description: The CSI driver that this object refers to.
                    type: string
                  available:
                    description: Whether the CSI driver is installed.
                    type: boolean
                  volumePluginMechanism:
                    description: Indicates to external components the required mechanism
                      to use for any in-tree plugins replaced by this driver.
                    pattern: in-tree|csi
                    type: string
  version: v1alpha1

使用kubectl  apply命令完成创建:

kubectl apply -f csidriver.yaml
kubectl apply -f csinodeinfo.yaml

(3)创建csi-hostpath存储插件相关组件,包括csi-hostpath-attacher、csi-hostpathprovisioner和csi-hostpathplugin(其中包含csi-node-driver-registrar和hostpathplugin)。其中为每个组件都配置了相应的RBAC权限控制规则,对于安全访问Kubernetes资源对象非常重要。
csi-hostpath-attacher.yaml的内容如下:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: csi-attacher
  # replace with non-default namespace name
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: external-attacher-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["csi.storage.k8s.io"]
    resources: ["csinodeinfos"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["volumeattachments"]
    verbs: ["get", "list", "watch", "update"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-attacher-role
subjects:
  - kind: ServiceAccount
    name: csi-attacher
    # replace with non-default namespace name
    namespace: default
roleRef:
  kind: ClusterRole
  name: external-attacher-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  # replace with non-default namespace name
  namespace: default
  name: external-attacher-cfg
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "watch", "list", "delete", "update", "create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-attacher-role-cfg
  # replace with non-default namespace name
  namespace: default
subjects:
  - kind: ServiceAccount
    name: csi-attacher
    # replace with non-default namespace name
    namespace: default
roleRef:
  kind: Role
  name: external-attacher-cfg
  apiGroup: rbac.authorization.k8s.io

---
kind: Service
apiVersion: v1
metadata:
  name: csi-hostpath-attacher
  labels:
    app: csi-hostpath-attacher
spec:
  selector:
    app: csi-hostpath-attacher
  ports:
    - name: dummy
      port: 12345
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: csi-hostpath-attacher
spec:
  serviceName: "csi-hostpath-attacher"
  replicas: 1
  selector:
    matchLabels:
      app: csi-hostpath-attacher
  template:
    metadata:
      labels:
        app: csi-hostpath-attacher
    spec:
      serviceAccountName: csi-attacher
      containers:
        - name: csi-attacher
          image: quay.io/k8scsi/csi-attacher:v1.0.1
          imagePullPolicy: IfNotPresent
          args:
            - --v=5
            - --csi-address=$(ADDRESS)
          env:
            - name: ADDRESS
              value: /csi/csi.sock
          volumeMounts:
          - mountPath: /csi
            name: socket-dir
      volumes:
        - hostPath:
            path: /var/lib/kubelet/plugins/csi-hostpath
            type: DirectoryOrCreate
          name: socket-dir

csi-hostpath-provisioner.yaml的内容如下:

# csi-hostpath-provisioner
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: csi-provisioner
  # replace with non-default namespace name
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: external-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents"]
    verbs: ["get", "list"]
  - apiGroups: ["csi.storage.k8s.io"]
    resources: ["csinodeinfos"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-provisioner-role
subjects:
  - kind: ServiceAccount
    name: csi-provisioner
    # replace with non-default namespace name
    namespace: default
roleRef:
  kind: ClusterRole
  name: external-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  # replace with non-default namespace name
  namespace: default
  name: external-provisioner-cfg
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "watch", "list", "delete", "update", "create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-provisioner-role-cfg
  # replace with non-default namespace name
  namespace: default
subjects:
  - kind: ServiceAccount
    name: csi-provisioner
    # replace with non-default namespace name
    namespace: default
roleRef:
  kind: Role
  name: external-provisioner-cfg
  apiGroup: rbac.authorization.k8s.io

---
kind: Service
apiVersion: v1
metadata:
  name: csi-hostpath-provisioner
  labels:
    app: csi-hostpath-provisioner
spec:
  selector:
    app: csi-hostpath-provisioner
  ports:
    - name: dummy
      port: 12345
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: csi-hostpath-provisioner
spec:
  serviceName: "csi-hostpath-provisioner"
  replicas: 1
  selector:
    matchLabels:
      app: csi-hostpath-provisioner
  template:
    metadata:
      labels:
        app: csi-hostpath-provisioner
    spec:
      serviceAccountName: csi-provisioner
      containers:
        - name: csi-provisioner
          image: quay.io/k8scsi/csi-provisioner:v1.0.1
          imagePullPolicy: IfNotPresent
          args:
            - "--provisioner=csi-hostpath"
            - "--csi-address=$(ADDRESS)"
            - "--connection-timeout=15s"
          env:
            - name: ADDRESS
              value: /csi/csi.sock
          volumeMounts:
            - mountPath: /csi
              name: socket-dir
      volumes:
        - hostPath:
            path: /var/lib/kubelet/plugins/csi-hostpath
            type: DirectoryOrCreate
          name: socket-dir

csi-hostpathplugin.yaml的内容如下:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: csi-node-sa
  # replace with non-default namespace name
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: driver-registrar-runner
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  # The following permissions are only needed when running
  # driver-registrar without the --kubelet-registration-path
  # parameter, i.e. when using driver-registrar instead of
  # kubelet to update the csi.volume.kubernetes.io/nodeid
  # annotation. That mode of operation is going to be deprecated
  # and should not be used anymore, but is needed on older
  # Kubernetes versions.
  # - apiGroups: [""]
  #   resources: ["nodes"]
  #   verbs: ["get", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: csi-driver-registrar-role
subjects:
  - kind: ServiceAccount
    name: csi-node-sa
    # replace with non-default namespace name
    namespace: default
roleRef:
  kind: ClusterRole
  name: driver-registrar-runner
  apiGroup: rbac.authorization.k8s.io

---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-hostpathplugin
spec:
  selector:
    matchLabels:
      app: csi-hostpathplugin
  template:
    metadata:
      labels:
        app: csi-hostpathplugin
    spec:
      serviceAccountName: csi-node-sa
      hostNetwork: true
      containers:
        - name: driver-registrar
          image: quay.io/k8scsi/csi-node-driver-registrar:v1.0.1
          imagePullPolicy: IfNotPresent
          args:
            - --v=5
            - --csi-address=/csi/csi.sock
            - --kubelet-registration-path=/var/lib/kubelet/plugins/csi-hostpath/csi.sock
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          volumeMounts:
          - mountPath: /csi
            name: socket-dir
          - mountPath: /registration
            name: registration-dir
        - name: hostpath
          image: quay.io/k8scsi/hostpathplugin:v1.0.1
          imagePullPolicy: IfNotPresent
          args:
            - "--v=5"
            - "--endpoint=$(CSI_ENDPOINT)"
            - "--nodeid=$(KUBE_NODE_NAME)"
          env:
            - name: CSI_ENDPOINT
              value: unix:///csi/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /csi
              name: socket-dir
            - mountPath: /var/lib/kubelet/pods
              mountPropagation: Bidirectional
              name: mountpoint-dir
      volumes:
        - hostPath:
            path: /var/lib/kubelet/plugins/csi-hostpath
            type: DirectoryOrCreate
          name: socket-dir
        - hostPath:
            path: /var/lib/kubelet/pods
            type: DirectoryOrCreate
          name: mountpoint-dir
        - hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
          name: registration-dir

使用kubectl apply命令完成创建:

kubectl apply -f  csi-hostpath-attacher.yaml
kubectl apply -f  csi-hostpath-provisioner.yaml
kubectl apply -f  csi-hostpathplugin.yaml

至此就完成了CSI存储插件的部署。

(4)应用容器使用CSI存储。应用程序如果希望使用CSI存储插件提供的存储服务,则仍然使用Kubernetes动态存储管理机制。首先通过创建StorageClass和PVC为应用容器准备存储资源,然后容器就可以挂载PVC到容器内的目录进行使用了。

创建一个StorageClass,provisioner为CSI存储插件的类型,在本例中为csi-hostpath:

 

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-hostpath-sc
provisioner: csi-hostpath
reclaimPolicy: Delete
volumeBindingMode: Immediate
kubectl apply -f csi-storageclass.yaml

创建一个PVC,引用刚刚创建的StorageClass,申请存储空间为1GiB:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-hostpath-sc
kubectl apply -f csi-pvc.yaml

查看PVC和系统自动创建的PV,状态为Bound,说明创建成功:

kubectl get pv,pvc

最后,在应用容器的配置中使用该PVC:

csi-app.yaml

kind: Pod
apiVersion: v1
metadata:
  name: my-csi-app
spec:
  containers:
  - name: my-csi-app
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["sleep", "1000000"]
    volumeMounts:
    - mountPath: "/data"
      name: my-csi-volume
  volumes:
  - name: my-csi-volume
    persistentVolumeClaim:
      claimName: csi-pvc
kubectl apply -f csi-app.yaml
kubecatl get pods

在Pod创建成功之后,应用容器中的/data目录使用的就是CSI存储插件提供的存储。
我们通过kubelet的日志可以查看到Volume挂载的详细过程。

每种CSI存储插件都提供了容器镜像,与external-attacher、external-provisioner、node-driver-registrar等sidecar辅助容器共同完成存储插件系统的部署,每个插件的部署配置详见官网https://kubernetes-csi.github.io/docs/drivers.html中的链接。

Kubernetes从1.12版本开始引入存储卷快照(Volume Snapshots)功能,通过新的CRD自定义资源对象VolumeSnapshotContent、VolumeSnapshot和VolumeSnapshotClass进行管理。VolumeSnapshotContent定义从当前PV创建的快照,类似于一个新的PV;VolumeSnapshot定义需要绑定某个快照的请求,类似于PVC的定义;VolumeSnapshotClass用于屏蔽VolumeSnapshotContent的细节,类似于StorageClass的功能。

下面是一个VolumeSnapshotContent的例子:

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotContent
metadata:
  name: new-snapshot-content-test
spec:
  snapshotClassName: csi-hostpath-snapclass
  source:
    name: pvc-test
    kind: PersistentVolumeClaim
  volumeSnapshotSource:
    csiVolumeSnapshotSource:
      driver:          csi-hostpath
      restoreSize:     10Gi

下面是一个VolumeSnapshot的例子:

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: new-snapshot-test
spec:
  snapshotClassName: csi-hostpath-snapclass
  source:
    name: pvc-test
    kind: PersistentVolumeClaim

后续要进一步完善的工作如下。
(1)将以下Alpha版功能更新到Beta版:
◎ Raw Block类型的Volumes;
◎ 拓扑感知,Kubernetes理解和影响CSI卷的配置位置(如zone、region 等)的能力;
◎ 完善基于CRD的扩展功能(如Skip attach、Pod info on mount等)。

(2)完善对本地短暂卷(Local Ephemeral Volume)的支持。
(3)将Kubernetes“in-tree”存储卷插件迁移到CSI。