手动实现备份及恢复

ClickHouse允许使用alter table ... freeze partition ....查询已创建表分区的本地副本,这是利用硬链接到/var/lib/clickhouse/shadow目录中实现的,所以它通常不会因为旧数据而占用额外的磁盘空间。创建的文件副本不由ClickHouse服务器处理,所以不需要任何的外部系统就可以有一个简单的备份。为了防止硬件问题,最好将这些备份复制到另一台主机上,再删除本地副本。更多关于ClickHouse的备份恢复说明,请参见官网

创建副本路径

如果/var/lib/clickhouse/shadow不存在,则创建,否则清空目录下的原有数据:

[root@scentos szc]# mkdir -p /var/lib/clickhouse/shadow
[root@scentos szc]# chown clickhouse:clickhouse /var/lib/clickhouse/shadow

执行备份命令

scentos :) alter table t_order_mt freeze;

ALTER TABLE t_order_mt
    FREEZE

Query id: 06a7d0a8-fc97-4816-8314-140b9a3acfb4

Ok.

0 rows in set. Elapsed: 0.003 sec.

将备份数据保存到其他路径

[root@scentos szc]# mkdir -p /var/lib/clickhouse/backup/ # 创建备份存储路径
[root@scentos szc]# cp -r /var/lib/clickhouse/shadow/ /var/lib/clickhouse/backup/backup_szc # 拷贝数据到备份路径
[root@scentos szc]# rm -rf /var/lib/clickhouse/shadow/* # 删除原有备份数据

恢复数据

先把删除过的表删除,再重新创建:

scentos :) drop table t_order_mt;

DROP TABLE t_order_mt

Query id: 94e8a0f8-61a8-4d44-9e4d-c98399497f37

Ok.

0 rows in set. Elapsed: 0.001 sec.

scentos :) create table t_order_mt(
:-] id UInt32,
:-] sku_id String,
:-] total_amount Decimal(16,2),
:-] create_time Datetime
:-] ) engine =MergeTree
:-] partition by toYYYYMMDD(create_time)
:-] primary key (id)
:-] order by (id,sku_id);

CREATE TABLE t_order_mt
(
    `id` UInt32,
    `sku_id` String,
    `total_amount` Decimal(16, 2),
    `create_time` Datetime
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(create_time)
PRIMARY KEY id
ORDER BY (id, sku_id)

Query id: a1c2fc90-78d0-440d-ad67-4a4f18f149a1

Ok.

0 rows in set. Elapsed: 0.005 sec.

再把备份复制到detached目录,并修改权限:

[root@scentos szc]# cp -rl /var/lib/clickhouse/backup/backup_szc/1/store/b3e/b3e422c6-5940-49b9-b3e4-22c6594029b9/* /var/lib/clickhouse/data/default/t_order_mt/detached/
[root@scentos szc]# chown -R clickhouse:clickhouse /var/lib/clickhouse/

ClickHouse使用文件系统的硬链接实现及时备份,不会导致ClickHouse服务停机或锁定。这些硬链接可以进一步用于有效的备份存储,在支持硬链接的文件系统(如本地文件系统或NFS),将cp-l标志一起使用(或rsync-hard-links-numeric-ids一起使用)以避免复制数据。
然后在ClickHouse客户端执行attach,并查看数据:

scentos :) alter table t_order_mt attach partition 20200601;

ALTER TABLE t_order_mt
    ATTACH PARTITION 20200601

Query id: 43b804f9-60b3-4dca-8e56-a53cb73fba2b

Ok.

0 rows in set. Elapsed: 0.001 sec.

scentos :) select count() from t_order_mt;

SELECT count()
FROM t_order_mt

Query id: 47839921-5611-48ee-9c4b-c3e13d9e1ae8

┌─count()─┐
│       5 │
└─────────┘

1 rows in set. Elapsed: 0.002 sec.

使用clickhouse-backup

上述过程,我们可以通过ClickHouse的备份工具clickhouse-backup帮我们自动化实现,下载地址

安装

[root@scentos szc]# rpm -ivh clickhouse-backup-1.0.0-1.x86_64.rpm

配置文件

[root@scentos szc]#  mv /etc/clickhouse-backup/config.yml.example /etc/clickhouse-backup/config.yml

修改clickhouse选项中的主机名:

clickhouse:
   username: default
   password: ""
   host: scentos

创建备份

查看可用命令:

[root@scentos szc]# clickhouse-backup help
NAME:
   clickhouse-backup - Tool for easy backup of ClickHouse with cloud support

USAGE:
   clickhouse-backup <command> [-t, --tables=<db>.<table>] <backup_name>

VERSION:
   1.0.0

DESCRIPTION:
   Run as 'root' or 'clickhouse' user

COMMANDS:
   tables          Print list of tables
   create          Create new backup
   create_remote   Create and upload
   upload          Upload backup to remote storage
   list            Print list of backups
   download        Download backup from remote storage
   restore         Create schema and restore data from backup
   restore_remote  Download and restore
   delete          Delete specific backup
   default-config  Print default config
   server          Run API server
   help, h         Shows a list of commands or help for one command


GLOBAL OPTIONS:
   --config FILE, -c FILE  Config FILE name. (default: "/etc/clickhouse-backup/config.yml") [$CLICKHOUSE_BACKUP_CONFIG]
   --help, -h              show help
   --version, -v           print the version

显示可备份的表:

[root@scentos szc]# clickhouse-backup tables
INFORMATION_SCHEMA.COLUMNS         0B                  
INFORMATION_SCHEMA.SCHEMATA        0B                  
INFORMATION_SCHEMA.TABLES          0B                  
INFORMATION_SCHEMA.VIEWS           0B                  
datasets..inner.hits_mv            4.90KiB    default  
datasets.hits_mv                   0B         default  
datasets.hits_test                 294.83KiB  default  
datasets.hits_v1                   1.18GiB    default  
datasets.hits_v2                   565.78MiB  default  
datasets.visits_v1                 537.52MiB  default  
datasets.visits_v2                 4.45MiB    default  
default.latest_non_deleted_test_a  0B                  
default.non_deleted_test_a         0B                  
default.st_order_mt                642B       default  
default.st_order_mt_all2           0B         default  
default.t_enum                     0B         default  
default.t_null                     0B         default  
default.t_order_mt                 358B       default  
default.t_order_mt2                776B       default  
default.t_order_mt3                0B         default  
default.t_order_rep2               0B         default  
default.t_order_rmt                626B       default  
default.t_order_smt                610B       default  
default.t_tinylog                  0B         default  
default.test_a                     40.55MiB   default  
default.view_test_a                0B                  
information_schema.columns         0B                  
information_schema.schemata        0B                  
information_schema.tables          0B                  
information_schema.views           0B                  
test_binlog.t_organization         547B       default  
test_binlog.t_user                 350B       default

创建备份:

[root@scentos szc]# clickhouse-backup create
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=INFORMATION_SCHEMA.COLUMNS
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=INFORMATION_SCHEMA.SCHEMATA
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=INFORMATION_SCHEMA.TABLES
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=INFORMATION_SCHEMA.VIEWS
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets..inner.hits_mv
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets.hits_mv
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets.hits_test
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets.hits_v1
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets.hits_v2
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets.visits_v1
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=datasets.visits_v2
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.latest_non_deleted_test_a
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.non_deleted_test_a
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.st_order_mt
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.st_order_mt_all2
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_enum
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_null
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_order_mt
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_order_mt2
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_order_mt3
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_order_rep2
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_order_rmt
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_order_smt
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.t_tinylog
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.test_a
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=default.view_test_a
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=information_schema.columns
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=information_schema.schemata
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=information_schema.tables
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=information_schema.views
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=test_binlog.t_organization
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create table=test_binlog.t_user
2021/12/12 18:40:48  info done                      backup=2021-12-12T10-40-48 operation=create

查看现有本地备份:

[root@scentos szc]# clickhouse-backup list
2021-12-12T10-40-48   2.31GiB   12/12/2021 10:40:48   local      
2021/12/12 18:44:28 error InvalidParameter: 1 validation error(s) found.
- minimum field size of 1, ListObjectsV2Input.Bucket.

可以看到一个名为2021-12-12T10-40-48的备份。此备份存储路径为var/lib/clickhouse/backup/backup_name,备份名称默认为当前时间戳,但是可以通过-name指定备份名称。备份包含两个目录:metadata目录和shadow目录,前者包含重新创建表架构所需的DDL语句,后者包含alter table ... freeze操作结果的数据。

恢复数据

模拟删除备份过的表:

scentos :) drop table t_order_rmt;

DROP TABLE t_order_rmt

Query id: 96d816dd-b93c-42b6-8890-ae5ca19dc2ba

Ok.

0 rows in set. Elapsed: 0.001 sec.

从备份还原:

[root@scentos szc]# clickhouse-backup restore 2021-12-12T10-40-48
2021/12/12 18:49:01  warn can't create table 'INFORMATION_SCHEMA.COLUMNS': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE COLUMNS;
2. CREATE TABLE COLUMNS <table definition>;
3. ATTACH TABLE COLUMNS FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE COLUMNS UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'INFORMATION_SCHEMA.SCHEMATA': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE SCHEMATA;
2. CREATE TABLE SCHEMATA <table definition>;
3. ATTACH TABLE SCHEMATA FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE SCHEMATA UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'INFORMATION_SCHEMA.TABLES': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE TABLES;
2. CREATE TABLE TABLES <table definition>;
3. ATTACH TABLE TABLES FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE TABLES UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'INFORMATION_SCHEMA.VIEWS': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE VIEWS;
2. CREATE TABLE VIEWS <table definition>;
3. ATTACH TABLE VIEWS FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE VIEWS UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'test_binlog.t_organization': code: 57, message: Table test_binlog.t_organization already exists, will try again
2021/12/12 18:49:01  warn can't create table 'information_schema.views': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE views;
2. CREATE TABLE views <table definition>;
3. ATTACH TABLE views FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE views UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'datasets.hits_test': code: 57, message: Table datasets.hits_test already exists, will try again
2021/12/12 18:49:01  warn can't create table 'datasets.hits_v1': code: 57, message: Table datasets.hits_v1 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'datasets.hits_v2': code: 57, message: Table datasets.hits_v2 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'datasets.visits_v1': code: 57, message: Table datasets.visits_v1 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'datasets.visits_v2': code: 57, message: Table datasets.visits_v2 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'information_schema.tables': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE tables;
2. CREATE TABLE tables <table definition>;
3. ATTACH TABLE tables FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE tables UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'information_schema.schemata': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE schemata;
2. CREATE TABLE schemata <table definition>;
3. ATTACH TABLE schemata FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE schemata UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'default.st_order_mt': code: 57, message: Table default.st_order_mt already exists, will try again
2021/12/12 18:49:01  warn can't create table 'information_schema.columns': code: 80, message: Incorrect ATTACH TABLE query for Atomic database engine. Use one of the following queries instead:
1. ATTACH TABLE columns;
2. CREATE TABLE columns <table definition>;
3. ATTACH TABLE columns FROM '/path/to/data/' <table definition>;
4. ATTACH TABLE columns UUID '<uuid>' <table definition>;, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_enum': code: 57, message: Table default.t_enum already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_tinylog': code: 57, message: Table default.t_tinylog already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_order_mt': code: 57, message: Table default.t_order_mt already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_order_mt2': code: 57, message: Table default.t_order_mt2 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_order_mt3': code: 57, message: Table default.t_order_mt3 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_order_rep2': code: 57, message: Table default.t_order_rep2 already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_order_rmt': code: 57, message: Directory for table data store/2b5/2b59d8bd-9488-415a-ab59-d8bd9488015a/ already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_order_smt': code: 57, message: Table default.t_order_smt already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.t_null': code: 57, message: Table default.t_null already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.test_a': code: 57, message: Table default.test_a already exists, will try again
2021/12/12 18:49:01  warn can't create table 'test_binlog.t_user': code: 57, message: Table test_binlog.t_user already exists, will try again
2021/12/12 18:49:01  warn can't create table 'datasets..inner.hits_mv': code: 57, message: Table datasets.`.inner.hits_mv` already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.non_deleted_test_a': code: 57, message: Table default.non_deleted_test_a already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.latest_non_deleted_test_a': code: 57, message: Table default.latest_non_deleted_test_a already exists, will try again
2021/12/12 18:49:01  warn can't create table 'datasets.hits_mv': code: 57, message: Table datasets.hits_mv already exists, will try again
2021/12/12 18:49:01  warn can't create table 'default.view_test_a': code: 57, message: Table default.view_test_a already exists, will try again
2021/12/12 18:49:01 error can't create table `default`.`st_order_mt_all2`: code: 57, message: Table default.st_order_mt_all2 already exists after 32 times, please check your schema depencncies

参数:--schema只还原表结构,--data只还原数据,--table=dbname.table备份或还原特定表

其他说明

该工具还不太好用,静候其升级,或者自己手动备份。

确保/var/lib/clickhouse/backup的权限是clickhouse:clickhouse,否则可能会导致数据损坏。

远程备份:较新版本才支持,需要设置config里的s3相关配置;上传到远程存储:sudo clickhouse-backup upload xxxx;从远程存储下载:sudo clickhouse-backup download xxx;保存周期:backups_to_keep_local,保存到本地的存储周期,单位为天,backups_to_keep_remote,远程存储的保存周期,单位为天,两者值若为0,表示永不删除。