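I. Introduction

When Kubernetes cluster backup comes up, etcd backup is usually the first thing that comes to mind, because all Kubernetes data is stored in etcd as key-value pairs. The previous article covered etcd snapshot backup and restore. An etcd backup is global, however, and cannot target individual cluster objects. This article looks at a newer favorite for Kubernetes backup: Velero.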

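II. Concepts

1. What is Velero?

Velero is an open-source tool written in Go and maintained by VMware for backing up and restoring the resources and persistent data of a Kubernetes cluster.

It runs on public clouds or on-premises and can back up cluster data, migrate cluster resources to another cluster, and replicate a production cluster to development and test clusters.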

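2. Velero versus etcd snapshot backups

  • Granularity: an etcd snapshot is a whole-cluster backup, whereas Velero works at the level of individual Kubernetes objects and can filter backups and restores by resource type, namespace, or label.
  • Restore impact: restoring an etcd snapshot rolls the entire cluster back, which affects workloads in every namespace; Velero restores only the selected objects and leaves other services untouched.
  • Storage location: Velero writes to object storage such as MinIO, Ceph, OSS, or S3, whereas an etcd snapshot is a single local file.
  • Backup mode: an etcd snapshot is always a full backup. Velero can upload volume data incrementally when file-system backup (Kopia or Restic) is used, so only changed data is transferred; resource manifests are still captured in full on every run.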
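The granularity point can be made concrete with a small sketch. The three helpers below scope a backup by namespace, by label, and by resource type, using flags that appear in `velero backup create --help`. The helper names, the backup names, and the `VELERO_NS` variable are hypothetical; the default of `velero-system` assumes the install namespace used later in this article.

```shell
#!/bin/sh
# Sketch: three ways to narrow the scope of a Velero backup.
# Assumption: Velero runs in the velero-system namespace (override VELERO_NS).
# The helper names and backup names are hypothetical.

VELERO_NS="${VELERO_NS:-velero-system}"

# Back up everything in one namespace: backup_namespace <backup-name> <namespace>
backup_namespace() {
  velero backup create "$1" --include-namespaces "$2" --namespace "$VELERO_NS"
}

# Back up only objects carrying a label: backup_by_label <backup-name> <key=value>
backup_by_label() {
  velero backup create "$1" --selector "$2" --namespace "$VELERO_NS"
}

# Back up only some resource types: backup_by_type <backup-name> <type[,type...]>
backup_by_type() {
  velero backup create "$1" --include-resources "$2" --namespace "$VELERO_NS"
}
```

For example, `backup_namespace nginx-backup nginx`, `backup_by_label app-backup app=nginx`, or `backup_by_type cfg-backup configmaps,secrets`. An etcd snapshot has no equivalent of any of these filters.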


III. Hands-on

3.1 Prepare MinIO as the backup storage backend

#vim minio-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: minio
  name: minio
  namespace: minio-system
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      annotations:
        kubesphere.io/creator: admin
        logging.kubesphere.io/logsidecar-config: '{}'
      creationTimestamp: null
      labels:
        app: minio
    spec:
      affinity: {}
      containers:
      - env:
        - name: MINIO_ROOT_USER
          value: user
        - name: MINIO_ROOT_PASSWORD
          value: password
        - name: MINIO_DEFAULT_BUCKETS
          value: velerodata   # must match the --bucket passed to velero install
        image: registry.cn-beijing.aliyuncs.com/dotbalo/minio:latest
        imagePullPolicy: IfNotPresent
        name: container-d02int
        ports:
        - containerPort: 9000
          name: tcp-9000
          protocol: TCP
        - containerPort: 9001
          name: tcp-9001
          protocol: TCP
        volumeMounts:
        - mountPath: /bitnami/minio/data
          name: volume-w6917b
        - mountPath: /etc/localtime
          name: host-time
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
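      volumes:
      - name: volume-w6917b
        persistentVolumeClaim:
          claimName: minio-data
      # Backs the host-time volumeMount declared on the container
      - name: host-time
        hostPath:
          path: /etc/localtime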
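
The Deployment above expects a `minio-system` namespace and a PVC named `minio-data`, neither of which is shown. A minimal sketch of those prerequisites follows. The helper name `minio_prereqs`, the 20Gi size, and the `nfs-storage` class are assumptions for illustration (the class name is borrowed from the Redis PVC later in this article); adjust them to your cluster.

```shell
#!/bin/sh
# Sketch: print the namespace and PVC manifests the MinIO Deployment relies on.
# Assumptions: 20Gi and the nfs-storage class are placeholders, overridable
# through MINIO_PVC_SIZE and MINIO_STORAGE_CLASS. minio_prereqs is hypothetical.

minio_prereqs() {
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: minio-system
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-data
  namespace: minio-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: ${MINIO_PVC_SIZE:-20Gi}
  storageClassName: ${MINIO_STORAGE_CLASS:-nfs-storage}
EOF
}
```

Typical use is `minio_prereqs | kubectl apply -f -` followed by `kubectl apply -f minio-deploy.yaml`. Printing the manifest instead of applying it directly keeps the step easy to review before it touches the cluster.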

3.2 Install the Velero client

# wget https://mirror.ghproxy.com/https://github.com/vmware-tanzu/velero/releases/download/v1.14.0/velero-v1.14.0-linux-amd64.tar.gz
# tar -zxvf  velero-v1.14.0-linux-amd64.tar.gz
# cp velero-v1.14.0-linux-amd64/velero /usr/local/bin/
# velero version
#cat > velero-auth.txt <<EOF
[default]
aws_access_key_id = user
aws_secret_access_key = password
EOF

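3.3 Deploy the Velero server into the Kubernetes cluster

# --bucket: the MinIO bucket that stores the backup files
# --backup-location-config: s3Url is the MinIO API endpoint
# Check the plugin tag against the Velero version: the upstream compatibility
# table pairs Velero v1.14.x with velero-plugin-for-aws v1.10.x
velero install \
     --provider aws \
     --plugins registry.cn-beijing.aliyuncs.com/devops-op/velero-plugin-for-aws:v1.0.0 \
     --bucket velerodata \
     --secret-file ./velero-auth.txt \
     --use-volume-snapshots=false \
     --namespace velero-system \
     --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://192.168.10.101:9000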
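
Before taking a backup it is worth confirming that Velero can actually reach the bucket. The sketch below polls the phase of the default BackupStorageLocation until it reports `Available`. The helper names are hypothetical, and the sketch assumes the location is named `default` and lives in `velero-system`, which is what the install command above produces.

```shell
#!/bin/sh
# Sketch: wait until Velero reports its backup storage location as Available.
# Assumptions: install namespace velero-system, location named "default".
# bsl_phase and wait_for_bsl are hypothetical helper names.

VELERO_NS="${VELERO_NS:-velero-system}"

# Print the phase of the default BackupStorageLocation.
bsl_phase() {
  kubectl -n "$VELERO_NS" get backupstoragelocations.velero.io default \
    -o jsonpath='{.status.phase}' 2>/dev/null
}

# Poll up to $1 times, $2 seconds apart (defaults: 30 attempts, 5 seconds).
wait_for_bsl() {
  attempts="${1:-30}"
  delay="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(bsl_phase)" = "Available" ]; then
      echo "backup storage location is Available"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "backup storage location not Available after $attempts attempts" >&2
  return 1
}
```

A typical sequence is `kubectl -n velero-system rollout status deployment/velero` followed by `wait_for_bsl`. If the phase never becomes `Available`, the usual suspects are the credentials file, the bucket name, and the `s3Url`.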


IV. Backup and disaster recovery

Prepare the test data

Create a namespace named "demo-test":

#kubectl create ns demo-test

Create a PVC manually against the existing StorageClass:


#vim redis-data.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: demo-test
  name: redis-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs-storage
Create the PVC:
#kubectl apply -f redis-data.yaml


Create the Redis instance

#vim redis-with-nfs-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis
  namespace: demo-test
spec:
  containers:
  - name: redis
    image: redis:7-alpine
    ports:
    - containerPort: 6379
      name: redis
    volumeMounts:
    - mountPath: /data
      name: data-storage
  volumes:
  - name: data-storage
    persistentVolumeClaim:
      claimName: redis-data
#kubectl apply -f redis-with-nfs-pvc.yaml -n demo-test
#kubectl get pod -n demo-test
NAME                        READY   STATUS    RESTARTS   AGE
redis                       1/1     Running   0          17s

Write some test data

Enter the Redis container, create a cached key by hand, and persist it to the PVC-backed data directory:

#kubectl exec -it redis -n demo-test -- /bin/sh
/data # redis-cli
127.0.0.1:6379> set mykey "BIRKHOFF 2025-01-07 Test data"
OK
127.0.0.1:6379> BGSAVE
Background saving started
127.0.0.1:6379> exit
/data # ls
dump.rdb

Back up the data

Back up the demo-test namespace. The backup is named after the namespace plus a timestamp.

DATE=$(date +%Y%m%d%H%M%S)
velero backup create demo-test-${DATE} \
--include-cluster-resources=true \
--include-namespaces demo-test \
--namespace velero-system
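
The command above returns as soon as the request is submitted. A hedged sketch of a variant that waits for the backup to finish and then checks its final phase is shown here. The helper names are hypothetical, and the namespace default assumes the `velero-system` install from section 3.3. Like the command above, this captures Kubernetes objects; whether volume contents are included depends on snapshot or file-system backup settings (for example `--default-volumes-to-fs-backup`, which requires the node agent).

```shell
#!/bin/sh
# Sketch: create a timestamped namespace backup and confirm it completed.
# Assumption: Velero runs in velero-system (override VELERO_NS).
# backup_name and backup_and_verify are hypothetical helper names.

VELERO_NS="${VELERO_NS:-velero-system}"

# Build a name such as demo-test-20250107111306.
backup_name() {
  printf '%s-%s\n' "$1" "$(date +%Y%m%d%H%M%S)"
}

# Create the backup, block until Velero finishes, then read the final phase
# from the Backup custom resource. Succeeds only if the phase is Completed.
backup_and_verify() {
  ns="$1"
  name="$(backup_name "$ns")"
  velero backup create "$name" \
    --include-namespaces "$ns" \
    --namespace "$VELERO_NS" \
    --wait || return 1
  phase="$(kubectl -n "$VELERO_NS" get backups.velero.io "$name" \
    -o jsonpath='{.status.phase}')"
  echo "backup $name finished with phase: $phase"
  [ "$phase" = "Completed" ]
}
```

Calling `backup_and_verify demo-test` makes the step safe to use in a script: a partially failed backup yields a non-zero exit status instead of passing silently.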
 


The backup is now stored in MinIO.


Simulate an accidental deletion

Next, simulate an accident by deleting the demo-test namespace outright:

#kubectl delete ns demo-test

The namespace created earlier is now gone.


Restore the data

Next, use Velero to restore from the backup stored in MinIO:

velero restore create --from-backup  demo-test-20250107111306 --namespace velero-system 
Restore request "demo-test-20250107111306-20250107152324" submitted successfully.
Run `velero restore describe demo-test-20250107111306-20250107152324` or `velero restore logs demo-test-20250107111306-20250107152324` for more details.

Verify the restore

The data has now been restored successfully.
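
One way to check the result from the command line is to read the test key back from the restored pod and compare it with what was written. This is a sketch: `verify_redis_key` is a hypothetical helper, and it assumes the pod is named `redis` as in the manifest above and that the volume contents survived or were part of the backup.

```shell
#!/bin/sh
# Sketch: read a key from the restored Redis pod and compare it with the
# value written before the deletion.
# Assumption: the pod is named "redis". verify_redis_key is hypothetical.

# verify_redis_key <namespace> <key> <expected-value>
verify_redis_key() {
  ns="$1"
  key="$2"
  expected="$3"
  actual="$(kubectl exec redis -n "$ns" -- redis-cli get "$key")"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $key restored"
  else
    echo "MISMATCH: $key is '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

For the data written earlier: `verify_redis_key demo-test mykey "BIRKHOFF 2025-01-07 Test data"`. Running `kubectl get pod,pvc -n demo-test` first confirms that the objects themselves are back.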


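Going further

Beyond namespace-level backups, Velero can also back up every resource in the cluster, run backups on a schedule, and more. The built-in help lists the options: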

#  velero backup create --help
Create a backup

Usage:
  velero backup create NAME [flags]

Examples:
  # Create a backup containing all resources.
  velero backup create backup1

  # Create a backup including only the nginx namespace.
  velero backup create nginx-backup --include-namespaces nginx

  # Create a backup excluding the velero and default namespaces.
  velero backup create backup2 --exclude-namespaces velero,default

  # Create a backup based on a schedule named daily-backup.
  velero backup create --from-schedule daily-backup

  # View the YAML for a backup that doesn't snapshot volumes, without sending it to the server.
  velero backup create backup3 --snapshot-volumes=false -o yaml

  # Wait for a backup to complete before returning from the command.
  velero backup create backup4 --wait

Flags:
      --csi-snapshot-timeout duration                      How long to wait for CSI snapshot creation before timeout.
      --data-mover string                                  Specify the data mover to be used by the backup. If the parameter is not set or set as 'velero', the built-in data mover will be used
      --default-volumes-to-fs-backup optionalBool[=true]   Use pod volume file system backup by default for volumes
      --exclude-cluster-scoped-resources stringArray       Cluster-scoped resources to exclude from the backup, formatted as resource.group, such as storageclasses.storage.k8s.io(use '*' for all resources). Cannot work with include-resources, exclude-resources and include-cluster-resources.
      --exclude-namespace-scoped-resources stringArray     Namespaced resources to exclude from the backup, formatted as resource.group, such as deployments.apps(use '*' for all resources). Cannot work with include-resources, exclude-resources and include-cluster-resources.
      --exclude-namespaces stringArray                     Namespaces to exclude from the backup.
      --exclude-resources stringArray                      Resources to exclude from the backup, formatted as resource.group, such as storageclasses.storage.k8s.io. Cannot work with include-cluster-scoped-resources, exclude-cluster-scoped-resources, include-namespace-scoped-resources and exclude-namespace-scoped-resources.
      --from-schedule string                               Create a backup from the template of an existing schedule. Cannot be used with any other filters. Backup name is optional if used.
  -h, --help                                               help for create
      --include-cluster-resources optionalBool[=true]      Include cluster-scoped resources in the backup. Cannot work with include-cluster-scoped-resources, exclude-cluster-scoped-resources, include-namespace-scoped-resources and exclude-namespace-scoped-resources.
      --include-cluster-scoped-resources stringArray       Cluster-scoped resources to include in the backup, formatted as resource.group, such as storageclasses.storage.k8s.io(use '*' for all resources). Cannot work with include-resources, exclude-resources and include-cluster-resources.
      --include-namespace-scoped-resources stringArray     Namespaced resources to include in the backup, formatted as resource.group, such as deployments.apps(use '*' for all resources). Cannot work with include-resources, exclude-resources and include-cluster-resources.
      --include-namespaces stringArray                     Namespaces to include in the backup (use '*' for all namespaces). (default *)
      --include-resources stringArray                      Resources to include in the backup, formatted as resource.group, such as storageclasses.storage.k8s.io (use '*' for all resources). Cannot work with include-cluster-scoped-resources, exclude-cluster-scoped-resources, include-namespace-scoped-resources and exclude-namespace-scoped-resources.
      --item-operation-timeout duration                    How long to wait for async plugin operations before timeout.
  -L, --label-columns stringArray                          Accepts a comma separated list of labels that are going to be presented as columns. Names are case-sensitive. You can also use multiple flag options like -L label1 -L label2...
      --labels mapStringString                             Labels to apply to the backup.
      --or-selector orLabelSelector                        Backup resources matching at least one of the label selector from the list. Label selectors should be separated by ' or '. For example, foo=bar or app=nginx
      --ordered-resources string                           Mapping Kinds to an ordered list of specific resources of that Kind.  Resource names are separated by commas and their names are in format 'namespace/resourcename'. For cluster scope resource, simply use resource name. Key-value pairs in the mapping are separated by semi-colon.  Example: 'pods=ns1/pod1,ns1/pod2;persistentvolumeclaims=ns1/pvc4,ns1/pvc8'.  Optional.
  -o, --output string                                      Output display format. For create commands, display the object but do not send it to the server. Valid formats are 'table', 'json', and 'yaml'. 'table' is not valid for the install command.
      --parallel-files-upload int                          Number of files uploads simultaneously when running a backup. This is only applicable for the kopia uploader
      --resource-policies-configmap string                 Reference to the resource policies configmap that backup using
  -l, --selector labelSelector                             Only back up resources matching this label selector. (default <none>)
      --show-labels                                        Show labels in the last column
      --snapshot-move-data optionalBool[=true]             Specify whether snapshot data should be moved
      --snapshot-volumes optionalBool[=true]               Take snapshots of PersistentVolumes as part of the backup. If the parameter is not set, it is treated as setting to 'true'.
      --storage-location string                            Location in which to store the backup.
      --ttl duration                                       How long before the backup can be garbage collected.
      --volume-snapshot-locations strings                  List of locations (at most one per provider) where volume snapshots should be stored.
  -w, --wait                                               Wait for the operation to complete.

Global Flags:
      --add_dir_header                   If true, adds the file directory to the header of the log messages
      --alsologtostderr                  log to standard error as well as files (no effect when -logtostderr=true)
      --colorized optionalBool           Show colored output in TTY. Overrides 'colorized' value from $HOME/.config/velero/config.json if present. Enabled by default
      --features stringArray             Comma-separated list of features to enable for this Velero process. Combines with values from $HOME/.config/velero/config.json if present
      --kubeconfig string                Path to the kubeconfig file to use to talk to the Kubernetes apiserver. If unset, try the environment variable KUBECONFIG, as well as in-cluster configuration
      --kubecontext string               The context to use to talk to the Kubernetes apiserver. If unset defaults to whatever your current-context is (kubectl config current-context)
      --log_backtrace_at traceLocation   when logging hits line file:N, emit a stack trace (default :0)
      --log_dir string                   If non-empty, write log files in this directory (no effect when -logtostderr=true)
      --log_file string                  If non-empty, use this log file (no effect when -logtostderr=true)
      --log_file_max_size uint           Defines the maximum size a log file can grow to (no effect when -logtostderr=true). Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
      --logtostderr                      log to standard error instead of files (default true)
  -n, --namespace string                 The namespace in which Velero should operate (default "velero")
      --one_output                       If true, only write logs to their native severity level (vs also writing to each lower severity level; no effect when -logtostderr=true)
      --skip_headers                     If true, avoid header prefixes in the log messages
      --skip_log_headers                 If true, avoid headers when opening log files (no effect when -logtostderr=true)
      --stderrthreshold severity         logs at or above this threshold go to stderr when writing to files and stderr (no effect when -logtostderr=true or -alsologtostderr=true) (default 2)
  -v, --v Level                          number for the log level verbosity
      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging
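
Scheduled backups use a separate subcommand, `velero schedule create`, which takes a cron expression. The sketch below wraps it in a small helper. The helper name, the schedule name, the cron expression, and the retention period are illustrative assumptions, and the namespace default assumes the `velero-system` install from section 3.3.

```shell
#!/bin/sh
# Sketch: register a recurring backup of one namespace.
# Assumption: Velero runs in velero-system (override VELERO_NS).
# create_schedule is a hypothetical helper name.

VELERO_NS="${VELERO_NS:-velero-system}"

# create_schedule <schedule-name> <namespace> <cron-expression> <ttl>
create_schedule() {
  velero schedule create "$1" \
    --schedule "$3" \
    --include-namespaces "$2" \
    --ttl "$4" \
    --namespace "$VELERO_NS"
}
```

For example, `create_schedule daily-backup demo-test "0 2 * * *" 168h` runs a backup of demo-test every day at 02:00 and keeps each one for seven days. An on-demand backup from the same template is then available with `velero backup create --from-schedule daily-backup`, as listed in the help output above.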

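V. Summary

Velero is an open-source tool for backup, migration, and disaster recovery of Kubernetes clusters and persistent volumes. It offers fine-grained backup options, works with a range of storage backends, and supports migration across clusters. Compared with etcd snapshots it is more flexible and selective, which makes it a strong choice for protecting Kubernetes cluster data.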