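I modified my earlier single-instance RMAN backup script so that it can back up a RAC system. The original script is described here:
RMAN full and incremental backup shell scripts on Linux
http://www.cndba.cn/Dave/article/1060
The modified backup shell script is as follows: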
#!/bin/sh
########################################################################
## RAC_hot_database_backup.sh ##
## created by Tianlesoftware ##
## 2010-9-21 ##
#########################################################################
# ---------------------------------------------------------------------------
# Determine the user which is executing this script.
# ---------------------------------------------------------------------------
CUSER=`id |cut -d"(" -f2 | cut -d ")" -f1`
# ---------------------------------------------------------------------------
# Put output in <this file name>.out. Change as desired.
# Note: output directory requires write permission.
# ---------------------------------------------------------------------------
RMAN_LOG_FILE=${0}.out
# ---------------------------------------------------------------------------
# You may want to delete the output file so that backup information does
# not accumulate. If not, delete the following lines.
# ---------------------------------------------------------------------------
if [ -f "$RMAN_LOG_FILE" ]
then
rm -f "$RMAN_LOG_FILE"
fi
# -----------------------------------------------------------------
# Initialize the log file.
# -----------------------------------------------------------------
echo >> $RMAN_LOG_FILE
chmod 666 $RMAN_LOG_FILE
# ---------------------------------------------------------------------------
# Log the start of this script.
# ---------------------------------------------------------------------------
echo Script $0 >> $RMAN_LOG_FILE
echo ==== started on `date` ==== >> $RMAN_LOG_FILE
echo >> $RMAN_LOG_FILE
# ---------------------------------------------------------------------------
# Oracle home path.
# ---------------------------------------------------------------------------
ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
export ORACLE_HOME
# ---------------------------------------------------------------------------
# the Oracle SID of the target database.
# ---------------------------------------------------------------------------
# Note: the local instance of any one RAC node is sufficient here.
ORACLE_SID=orcl1
export ORACLE_SID
# ---------------------------------------------------------------------------
# The Oracle DBA user id (account).
# ---------------------------------------------------------------------------
ORACLE_USER=oracle
export ORACLE_USER
# ---------------------------------------------------------------------------
# Set the Oracle Recovery Manager name.
# ---------------------------------------------------------------------------
RMAN=$ORACLE_HOME/bin/rman
# ---------------------------------------------------------------------------
# Print out the value of the variables set by this script.
# ---------------------------------------------------------------------------
echo >> $RMAN_LOG_FILE
echo "RMAN: $RMAN" >> $RMAN_LOG_FILE
echo "ORACLE_SID: $ORACLE_SID" >> $RMAN_LOG_FILE
echo "ORACLE_USER: $ORACLE_USER" >> $RMAN_LOG_FILE
echo "ORACLE_HOME: $ORACLE_HOME" >> $RMAN_LOG_FILE
# ---------------------------------------------------------------------------
# Print out the value of the variables set by bphdb.
# ---------------------------------------------------------------------------
#echo >> $RMAN_LOG_FILE
#echo "NB_ORA_FULL: $NB_ORA_FULL" >> $RMAN_LOG_FILE
#echo "NB_ORA_INCR: $NB_ORA_INCR" >> $RMAN_LOG_FILE
#echo "NB_ORA_CINC: $NB_ORA_CINC" >> $RMAN_LOG_FILE
#echo "NB_ORA_SERV: $NB_ORA_SERV" >> $RMAN_LOG_FILE
#echo "NB_ORA_POLICY: $NB_ORA_POLICY" >> $RMAN_LOG_FILE
# ---------------------------------------------------------------------------
# NOTE: This script assumes that the database is properly opened. If desired,
# this would be the place to verify that.
# ---------------------------------------------------------------------------
echo >> $RMAN_LOG_FILE
# ---------------------------------------------------------------------------
# ---------------------------------------------------------------------------
# Call Recovery Manager to initiate the backup.
# ---------------------------------------------------------------------------
CMD_STR="
ORACLE_HOME=$ORACLE_HOME
export ORACLE_HOME
ORACLE_SID=$ORACLE_SID
export ORACLE_SID
$RMAN nocatalog target sys/oracle << EOF
RUN {
allocate channel c1 device type disk connect 'sys/oracle@orcl1';
allocate channel c2 device type disk connect 'sys/oracle@orcl2';
BACKUP FORMAT '/u01/backup/orcl_%U_%T' skip inaccessible filesperset 5 DATABASE TAG orcl_hot_db_bk;
sql 'alter system archive log current';
BACKUP FORMAT '/u01/backup/arch_%U_%T' skip inaccessible filesperset 5 ARCHIVELOG ALL DELETE INPUT;
backup current controlfile tag='bak_ctlfile' format='/u01/backup/ctl_file_%U_%T';
backup spfile tag='spfile' format='/u01/backup/ORCL_spfile_%U_%T';
release channel c1;
release channel c2;
}
allocate channel for maintenance device type disk connect 'sys/oracle@orcl1';
allocate channel for maintenance device type disk connect 'sys/oracle@orcl2';
report obsolete;
delete noprompt obsolete;
crosscheck backup;
delete noprompt expired backup;
list backup summary;
release channel;
EOF
"
# Initiate the command string
if [ "$CUSER" = "root" ]
then
echo "Root Command String: $CMD_STR" >> $RMAN_LOG_FILE
su - $ORACLE_USER -c "$CMD_STR" >> $RMAN_LOG_FILE
RSTAT=$?
else
echo "User Command String: $CMD_STR" >> $RMAN_LOG_FILE
/bin/sh -c "$CMD_STR" >> $RMAN_LOG_FILE
RSTAT=$?
fi
# ---------------------------------------------------------------------------
# Log the completion of this script.
# ---------------------------------------------------------------------------
if [ "$RSTAT" = "0" ]
then
LOGMSG="ended successfully"
else
LOGMSG="ended in error"
fi
echo >> $RMAN_LOG_FILE
echo Script $0 >> $RMAN_LOG_FILE
echo ==== $LOGMSG on `date` ==== >> $RMAN_LOG_FILE
echo >> $RMAN_LOG_FILE
/bin/mailx -s "RMAN Backup $ORACLE_SID" tianlesoftware@vip.qq.com < $RMAN_LOG_FILE
exit $RSTAT
However, running this backup script produced an error. Part of the run log follows:
…
allocated channel: c1
channel c1: sid=129 instance=orcl1
allocated channel: c3
channel c3: sid=131 instance=orcl2
Starting backup at 22-SEP-10
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
input datafile fno=00001 name=+DATA/orcl/datafile/system.276.730181051
input datafile fno=00004 name=+DATA/orcl/datafile/users.279.730181053
channel c1: starting piece 1 at 22-SEP-10
channel c3: starting full datafile backupset
channel c3: specifying datafile(s) in backupset
input datafile fno=00003 name=+DATA/orcl/datafile/sysaux.277.730181053
input datafile fno=00002 name=+DATA/orcl/datafile/undotbs1.278.730181053
input datafile fno=00005 name=+DATA/orcl/datafile/undotbs2.284.730181347
channel c3: starting piece 1 at 22-SEP-10
RMAN-03009: failure of backup command on c3 channel at 09/22/2010 05:11:10
ORA-19504: failed to create file "/u01/backup/orcl_19logo39_1_1_20100922"
ORA-27040: file create error, unable to create file
Linux Error: 2: No such file or directory
channel c3 disabled, job failed on it will be run on another channel
channel c1: finished piece 1 at 22-SEP-10
piece handle=/u01/backup/orcl_18logo39_1_1_20100922 tag=ORCL_HOT_DB_BK comment=NONE
channel c1: backup set complete, elapsed time: 00:01:02
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
input datafile fno=00003 name=+DATA/orcl/datafile/sysaux.277.730181053
input datafile fno=00002 name=+DATA/orcl/datafile/undotbs1.278.730181053
input datafile fno=00005 name=+DATA/orcl/datafile/undotbs2.284.730181347
channel c1: starting piece 1 at 22-SEP-10
channel c1: finished piece 1 at 22-SEP-10
piece handle=/u01/backup/orcl_1alogo57_1_1_20100922 tag=ORCL_HOT_DB_BK comment=NONE
...
released channel: c1
released channel: c3
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of release command at 09/22/2010 05:14:03
RMAN-06012: channel: c3 not allocated
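I analyzed the error as follows:
1. The log shows that channel c3 could not be released.
Channel c3 is the channel on node 2 (instance orcl2 in the log), not a local channel. Suspecting that the remote channel itself was the problem, I ran a simple standalone test: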
RMAN> run
{
allocate channel c2 device type disk connect 'sys/oracle@orcl2';
allocate channel c1 device type disk connect 'sys/oracle@orcl1';
delete expired archivelog all;
release channel c1;
release channel c2;
}
2> 3> 4> 5> 6> 7> 8>
allocated channel: c2
channel c2: sid=147 instance=orcl2 devtype=DISK
allocated channel: c1
channel c1: sid=147 instance=orcl1 devtype=DISK
specification does not match any archive log in the recovery catalog
released channel: c1
released channel: c2
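Both channels were allocated and released cleanly, so this test shows that the error is unrelated to the channels themselves.
2. A problem with the backup directory (missing directory or permissions)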
allocated channel: c3
channel c3: sid=131 instance=orcl2
RMAN-03009: failure of backup command on c3 channel at 09/22/2010 05:11:10
ORA-19504: failed to create file "/u01/backup/orcl_19logo39_1_1_20100922"
ORA-27040: file create error, unable to create file
Linux Error: 2: No such file or directory
channel c3 disabled, job failed on it will be run on another channel
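From this we can see that channel c3 was allocated successfully, but the backup piece file could not be created through it.
I ran the RMAN backup from node 1 and at first assumed that all backup files would be written on node 1, so I created the backup directory, /u01/backup, only on node 1 and not on node 2. That was the cause: a channel connected to orcl2 writes its backup pieces to node 2's filesystem, which explains the "No such file or directory" error. After I created /u01/backup on node 2 and gave it 777 permissions, the backup succeeded.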
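The fix described above (creating the backup directory on every node that a channel connects to) could be scripted roughly as follows. This is only a minimal sketch: `REMOTE_SH`, `BACKUP_MODE`, and the node names `rac1` and `rac2` are illustrative assumptions, not part of the original script, and it assumes the invoking user can run commands on each node (for example through passwordless ssh as the oracle user).

```sh
#!/bin/sh
# Sketch: make sure the RMAN backup directory exists on every RAC node.
# REMOTE_SH, BACKUP_MODE and the node names are hypothetical; adapt them.

REMOTE_SH=${REMOTE_SH:-ssh}      # command used to run something on a node
BACKUP_MODE=${BACKUP_MODE:-777}  # the post used 777; a tighter mode is usually preferable

# ensure_backup_dir DIR NODE...
# Creates DIR on each node, sets its mode, and checks that it is writable
# for the remote user. Returns non-zero if any node could not be prepared.
ensure_backup_dir() {
    dir=$1
    shift
    rc=0
    for node in "$@"; do
        if $REMOTE_SH "$node" "mkdir -p '$dir' && chmod $BACKUP_MODE '$dir' && test -w '$dir'"; then
            echo "OK: $dir is ready on $node"
        else
            echo "ERROR: could not prepare $dir on $node" >&2
            rc=1
        fi
    done
    return $rc
}

# Example call (not executed here):
#   ensure_backup_dir /u01/backup rac1 rac2 || exit 1
```

Calling such a function before the RMAN step would turn the ORA-19504 failure in the middle of the backup into an early, explicit error.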
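Finally, I checked the two sets of backup files and found a problem: backup pieces written through the orcl1 channel are stored in the backup directory on node 1 (/u01/backup), while those written through the orcl2 channel are stored in the backup directory on node 2 (also /u01/backup). Because the backup files are split across two locations, a restore becomes troublesome. Ideally, all backup files should be stored on a single node or on a shared device.
There are two ways to achieve this:
(1) Use NFS. Export the backup directory from one node and mount it on the other, so that all files end up on one node.
For NFS configuration, see this blog post:
Linux NFS and Samba share configuration
http://www.cndba.cn/Dave/article/682
(2) Use shared storage.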
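NFS mounts have one fairly serious problem, however: when one node is down, the NFS mount can badly degrade the performance of the other node. In other words, if one RAC node fails and cannot boot, the surviving node not only has to carry the entire database workload but also suffers the overhead caused by the NFS mount, which can leave it practically unable to work. From a performance standpoint, NFS mounts should therefore be used with caution, and for RMAN backups of a RAC database it is best to keep the backup files on shared storage.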
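To avoid silently scattering backup pieces across nodes again, a pre-flight check could confirm that the backup directory really sits on a shared filesystem before RMAN is started. The sketch below reads /proc/mounts on Linux; the function names and the list of filesystem types treated as "shared" (nfs, nfs4, ocfs2, gfs2, acfs) are assumptions for illustration and should be adjusted to the actual environment.

```sh
#!/bin/sh
# Sketch: report whether the RMAN backup directory lives on a shared filesystem.
# Function names and the "shared" type list are hypothetical.

# fs_type_of DIR [MOUNTS_FILE]
# Prints the filesystem type of the longest mount point that contains DIR.
fs_type_of() {
    awk -v dir="$1" '
        {
            mp = $2
            # A mount point matches if it is DIR itself, the root,
            # or a parent directory of DIR.
            if (mp == dir || mp == "/" || index(dir, mp "/") == 1) {
                if (length(mp) >= best) { best = length(mp); fstype = $3 }
            }
        }
        END { if (fstype != "") print fstype }
    ' "${2:-/proc/mounts}"
}

# is_shared_fs FSTYPE
# Succeeds if FSTYPE is one of the types assumed to be visible from all nodes.
is_shared_fs() {
    case $1 in
        nfs|nfs4|ocfs2|gfs2|acfs) return 0 ;;
        *) return 1 ;;
    esac
}

# check_backup_dir DIR [MOUNTS_FILE]
# Warns and returns non-zero if DIR is not on a shared filesystem.
check_backup_dir() {
    fstype=$(fs_type_of "$1" "${2:-/proc/mounts}")
    if is_shared_fs "$fstype"; then
        echo "$1 is on a shared filesystem ($fstype)"
    else
        echo "WARNING: $1 is on '$fstype'; pieces written by channels on other nodes will not land here" >&2
        return 1
    fi
}
```

In the backup script this might be used as `check_backup_dir /u01/backup || exit 1` just before the RMAN call. Note that it only inspects the mount table; it does not detect an NFS server that has become unreachable.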