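<1> Two Oracle servers share storage through ASM. In the morning I found that the database would not start because the archive log area was full. After manually deleting the surplus archivelogs from the +flash disk group, it still would not start. The errors and the fix follow: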
Version: 11.2.0.1. After a duplicate, the standby reported errors at startup and could not be mounted. The alert log showed:
ALTER DATABASE   MOUNT
Errors in file /u01/app/oracle/diag/rdbms/dg/dg/trace/dg_rbal_16494.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
SUCCESS: diskgroup ASMOCR was mounted
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+ASMOCR/dg/controlfile/current.259.871317183'
ORA-15081: failed to submit an I/O operation to a disk
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+ASMOCR/dg/controlfile/current.258.871317183'
ORA-15081: failed to submit an I/O operation to a disk
ERROR: failed to establish dependency between database dg and diskgroup resource ora.ASMOCR.dg
ORA-205 signalled during: ALTER DATABASE   MOUNT...
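Those are the main error messages.
Working through them:
1. ORA-205 is raised while the database is being brought to the mount stage: going from nomount to mount requires reading the control file. I checked the control file parameter in the spfile and it was fine.
2. Reading the control file at startup fails with ORA-204. What causes that?
3. ASM is in use, and the log shows that disk group ASMOCR was mounted successfully.
4. The database hit an error when associating with the disk group: ERROR: failed to establish dependency between database dg and diskgroup resource ora.ASMOCR.dg
5. id oracle: uid=1101(oracle) gid=1000(oinstall) groups=1000(oinstall),1031(dba),1020(asmadmin),1021(asmdba),1300(oper)
   The user's group membership is fine.
6. Loading ASMLIB failed.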
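When reading a long alert log, it can help to pull out the ORA- codes in the order they first appear, which is the chain the analysis above walks through. The following is a minimal Python sketch of that idea, not part of the original fix; the function name ora_codes_in_order and the constant ALERT_EXCERPT are made up for illustration, and the excerpt is copied from the alert log above.

```python
import re

# Matches codes such as ORA-15183 or the short form ORA-205.
ORA_RE = re.compile(r"ORA-(\d+)")

def ora_codes_in_order(log_text):
    """Return distinct ORA codes in first-seen order, zero-padded to five digits."""
    seen = []
    for number in ORA_RE.findall(log_text):
        code = f"ORA-{int(number):05d}"
        if code not in seen:
            seen.append(code)
    return seen

# Excerpt taken from the alert log quoted in this post.
ALERT_EXCERPT = """\
ORA-15183: ASMLIB initialization error [driver/agent not installed]
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+ASMOCR/dg/controlfile/current.259.871317183'
ORA-15081: failed to submit an I/O operation to a disk
ORA-205 signalled during: ALTER DATABASE   MOUNT...
"""

if __name__ == "__main__":
    for code in ora_codes_in_order(ALERT_EXCERPT):
        print(code)
```

Read top to bottom, the list goes from the ASMLIB load failure, through the failed control file reads, to the final ORA-205 on mount.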
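Solution: the oracle binary in the database home most likely had the wrong group or mode, so database processes could not open the ASM disks. Reset its group to asmadmin and its mode to 6751 (setuid and setgid):
  # cd $ORACLE_HOME/bin
  # chgrp asmadmin oracle
  # chmod 6751 oracle
The binary should then look like this:
-rwsr-s--x 1 oracle asmadmin 210824714 Jan 29 14:27 oracle
After that the database started successfully.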
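To double-check what mode 6751 means without touching the real binary, the octal value can be decoded into the permission string that ls prints. This is a small Python sketch for illustration only; the helper names describe_mode and has_setuid_setgid are hypothetical and not part of any Oracle tool.

```python
import stat

def describe_mode(octal_mode):
    """Return the ls-style permission string for a regular file with this mode."""
    return stat.filemode(stat.S_IFREG | octal_mode)

def has_setuid_setgid(octal_mode):
    """True when both the setuid and the setgid bit are set."""
    return bool(octal_mode & stat.S_ISUID) and bool(octal_mode & stat.S_ISGID)

if __name__ == "__main__":
    print(describe_mode(0o6751))      # compare with the ls -l line for the oracle binary
    print(has_setuid_setgid(0o6751))
    print(has_setuid_setgid(0o751))   # same rwx bits, but without setuid/setgid
```

The string produced for 0o6751 can be compared directly with the first column of the ls -l output for $ORACLE_HOME/bin/oracle.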

<2> Next I used RMAN to clean up the state left behind by the manually deleted archive logs:
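Once archive logs have been deleted by hand, an RMAN backup detects the missing logs and cannot continue.
A crosscheck therefore has to be run manually; afterwards RMAN backups work normally again.
1. Crosscheck the logs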
$ rman target /

Recovery Manager: Release 9.2.0.4.0 - 64bit Production
Copyright (c) 1995, 2002, Oracle Corporation.  All rights reserved.
connected to target database: AVATAR2 (DBID=2480694409)

RMAN> crosscheck archivelog all;

using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=25 devtype=DISK
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2714.dbf recid=2702 stamp=545107659
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2715.dbf recid=2703 stamp=545108268
...........
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2985.dbf recid=2973 stamp=545399327
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2986.dbf recid=2974 stamp=545400820
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2987.dbf recid=2975 stamp=545401757
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2988.dbf recid=2976 stamp=545402716
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2989.dbf recid=2977 stamp=545403661
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2990.dbf recid=2978 stamp=545404946
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2991.dbf recid=2979 stamp=545406220
Crosschecked 278 objects

RMAN>
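With a few hundred archive logs, the crosscheck output is easier to read as a summary. Below is a rough Python sketch that groups file names by validation result. It assumes the two-line "validation ... / archive log filename=..." layout shown in this 9.2 transcript (other RMAN releases may word these lines differently), and the names summarize_crosscheck and SAMPLE are made up for illustration.

```python
import re

# Line patterns as they appear in the transcript above (assumed format).
STATUS_RE = re.compile(r"validation (failed|succeeded) for archived log")
FILE_RE = re.compile(r"archive log filename=(\S+)")

def summarize_crosscheck(output):
    """Group archive log file names by crosscheck result."""
    result = {"failed": [], "succeeded": []}
    status = None
    for line in output.splitlines():
        line = line.strip()
        match = STATUS_RE.search(line)
        if match:
            status = match.group(1)  # remember the result for the next filename line
            continue
        match = FILE_RE.search(line)
        if match and status is not None:
            result[status].append(match.group(1))
            status = None
    return result

# Two entries copied from the transcript above.
SAMPLE = """
validation failed for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2714.dbf recid=2702 stamp=545107659
validation succeeded for archived log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2986.dbf recid=2974 stamp=545400820
"""

if __name__ == "__main__":
    summary = summarize_crosscheck(SAMPLE)
    print("failed:", summary["failed"])
    print("succeeded:", summary["succeeded"])
```

The "failed" list corresponds to the logs that RMAN marks as expired, which are the ones removed in the next step.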
  

2. Use the delete expired archivelog all command to remove the records of all expired archive logs (the ones crosscheck could no longer find on disk):

RMAN> delete expired archivelog all;

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=12 devtype=DISK

List of Archived Log Copies
Key    Thrd Seq    S Low Time  Name
------- ---- ------- - --------- ----
376    1    2714    X 23-NOV-04 =/opt/oracle/oradata/avatar2/archive/1_2714.dbf
.....
 
 

3. A brief look at the report obsolete command

Use report obsolete to list obsolete backups, that is, backups no longer needed under the retention policy:

RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of obsolete backups and copies
Type                Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set          125    01-NOV-04
  Backup Piece      125    01-NOV-04          /data1/oracle/orabak/full_1_541045804
Backup Set          131    04-NOV-04
  Backup Piece      131    04-NOV-04          /data1/oracle/orabak/full_AVATAR2_20041104_131
....
Backup Set          173    06-DEC-04
  Backup Piece      173    06-DEC-04          /data1/oracle/orabak/full_AVATAR2_20041206_173
Backup Set          179    11-DEC-04
  Backup Piece      179    11-DEC-04          /data1/oracle/orabak/arch544588206.arc
.....
  Backup Piece      189    17-DEC-04          /data1/oracle/orabak/arch545106606.arc
Backup Set          190    17-DEC-04
  Backup Piece      190    17-DEC-04          /data1/oracle/orabak/arch545106665.arc
Backup Set          191    20-DEC-04
  Backup Piece      191    20-DEC-04          /data1/oracle/orabak/arch_AVATAR2_20041220_194
Archive Log          2973  20-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2985.dbf
Archive Log          2971  20-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2984.dbf
.....
Archive Log          2705  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2717.dbf
Archive Log          2704  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2716.dbf
Archive Log          2703  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2715.dbf
Archive Log          2702  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2714.dbf
  

4. Use the delete obsolete command to delete the obsolete backups, then crosscheck again:

RMAN> delete obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
Deleting the following obsolete backups and copies:
Type                Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set          125    01-NOV-04
  Backup Piece      125    01-NOV-04          /data1/oracle/orabak/full_1_541045804
....
Archive Log          2704  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2716.dbf
Archive Log          2703  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2715.dbf
Archive Log          2702  17-DEC-04          /opt/oracle/oradata/avatar2/archive/1_2714.dbf

Do you really want to delete the above objects (enter YES or NO)? yes
deleted backup piece
backup piece handle=/data1/oracle/orabak/full_AVATAR2_20041206_173 recid=173 stamp=544156241
.....
deleted archive log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2715.dbf recid=2703 stamp=545108268
deleted archive log
archive log filename=/opt/oracle/oradata/avatar2/archive/1_2714.dbf recid=2702 stamp=545107659
Deleted 286 objects

RMAN> crosscheck archivelog all;

released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=19 devtype=DISK
specification does not match any archive log in the recovery catalog
  


<3> With that done, I started the Oracle listener and ran into another error:
Starting it manually produced this error:
$ lsnrctl start

LSNRCTL for IBM/AIX RISC System/6000: Version 9.2.0.4.0 - Production on 17-NOV-2011 00:39:20
Copyright (c) 1991, 2002, Oracle Corporation.  All rights reserved.
Starting /moracle/product/9.2.0/bin/tnslsnr: please wait...
TNSLSNR for IBM/AIX RISC System/6000: Version 9.2.0.4.0 - Production
System parameter file is /moracle/product/9.2.0/network/admin/listener.ora
Log messages written to /moracle/product/9.2.0/network/log/listener.log
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC)))
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
   IBM/AIX RISC System/6000 Error: 67: Address already in use
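A careful look at listener.ora showed nothing wrong, and that configuration file is rarely modified anyway. So I suspected that some CRS resources were in a bad state. I tried crs_stat -t, but the shell reported that the crs_stat command was not found, so I had to run it from the bin directory of the grid installation. There I found that both db resources were OFFLINE.
I then brought the two db resources online with srvctl, and after that the status was fine (db and listener were normal, and crs_stat could be found as well, with no need to run it from the grid bin directory).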

<4> Finally, a summary of commonly used RAC srvctl management commands, because they often come up when solving problems like this one:
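SRVCTL is the configuration and management tool for Oracle RAC clusters.
SRVM stands for server management.
1. SRVCTL Add command
Adds configuration information for a database or an instance. When adding an instance, the name given with -i should match the INSTANCE_NAME and ORACLE_SID parameters.
srvctl add database -d <database name> [-m domain_name] -o <ORACLE_HOME path> -p <spfile location and name>
srvctl add instance -d <database name> -i <instance 1 name> -n <node 1 name >
srvctl add instance -d <database name> -i <instance 2 name> -n <node 2 name >
Command parameters:
-m   database domain name, in a format such as "us.oracle.com"
The domain given with -m must match the DB_DOMAIN parameter in the database's INIT.ORA or SPFILE. When adding a database, the database name given with -d must match the DB_NAME parameter.
-n   node name for the instance
-o   $ORACLE_HOME (used to locate commands such as lsnrctl and oracle)
-p   SPFILE file name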
Eg:
$srvctl  add database -d RAC -o /u01/oracle/product/10.2.0/db_1 -p +RAC_DISK/rac/spfilerac.ora
$srvctl  add  instance  -d RAC  -i rac1  -n node1
$srvctl  add  instance  -d RAC  -i rac2  -n node2
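When a cluster has more than a couple of instances, the add commands can be generated rather than typed. The following Python sketch only builds the command lines from the flags listed above (-d, -o, -p, -i, -n); it does not run srvctl. The function name build_add_commands is hypothetical, and the values are the ones from the example above.

```python
import shlex

def build_add_commands(db_name, oracle_home, spfile, instances):
    """Return srvctl argument lists: one 'add database', then one 'add instance' per pair.

    instances is a list of (instance_name, node_name) tuples.
    """
    commands = [
        ["srvctl", "add", "database", "-d", db_name, "-o", oracle_home, "-p", spfile]
    ]
    for instance_name, node_name in instances:
        commands.append(
            ["srvctl", "add", "instance", "-d", db_name, "-i", instance_name, "-n", node_name]
        )
    return commands

if __name__ == "__main__":
    commands = build_add_commands(
        "RAC",
        "/u01/oracle/product/10.2.0/db_1",
        "+RAC_DISK/rac/spfilerac.ora",
        [("rac1", "node1"), ("rac2", "node2")],
    )
    for argv in commands:
        print(shlex.join(argv))  # print only; nothing is executed
```

Keeping each command as an argument list avoids quoting problems if the lists are later handed to a process runner.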
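2. SRVCTL Config command
Displays the configuration information stored in the SRVM configuration file.
srvctl config database
This form lists the configured databases.
srvctl config database -d database_name
The configuration of a database is displayed in this format: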
nodename1 instancename1 oraclehome
nodename2 instancename2 oraclehome
Eg:
$ srvctl config database
RAC
$srvctl config database -d rac
node1 rac1 /u01/oracle/product/10.2.0/db_1
node2 rac2 /u01/oracle/product/10.2.0/db_1
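3. SRVCTL Modify command
Modifies the node configuration of an instance. The change takes effect the next time the application is restarted and is stored permanently.
srvctl modify instance -d database_name -i instance_name -n node_name
Eg:
$srvctl modify instance -d rac -i rac1 -n new_node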
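4. SRVCTL Remove command
Removes configuration information from the SRVM repository; the environment settings associated with the object are removed as well. Without the force flag (-f), Oracle prompts you to confirm the removal.
With the force option (-f), the removal proceeds without prompting.
srvctl remove database -d database_name [-f]
srvctl remove instance -d database_name -i instance_name [-f]
Command parameters:
-f   force removal without a confirmation prompt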
Eg:
$srvctl remove database -d rac
$srvctl remove instance -d rac -i rac1
$srvctl remove instance -d rac -i rac2
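5. SRVCTL Start command
Starts the database, all instances or the specified instances, and any associated listeners that are not yet running.
Note: for start and other operations that accept a connect string, if you do not supply one, Oracle uses "/ as sysdba" to perform the operation on the instance. To run such operations you must also be a member of the OSDBA group.
srvctl start database -d database_name [-o start_options] [-c connect_string]
srvctl start instance -d database_name -i instance_name [,instance_name-list] [-o start_options][-c connect_string]
Command parameters:
-o   options passed directly to the SQL*Plus startup command; may include PFILE
-c   connect string used by SQL*Plus to connect to the database instance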
Eg:
$srvctl start database -d rac
$ srvctl stop database -d rac -c "SYS/SYS_password as SYSDBA"
$srvctl start instance -d rac -i rac1,rac2
##############################################################
$srvctl start listener -n node1
$srvctl stop listener -n node2
$ srvctl stop listener -n node [-l listenername]
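Today I found a small bug in the SRVCTL command. (http://yangtingkun.itpub.net/post/468/275571)
If you stop the listener with srvctl and then start it again with lsnrctl start, srvctl still believes the listener is stopped. A second srvctl stop therefore appears not to run at all. To be able to stop the listener with srvctl, you first have to start it with srvctl and then stop it. A search of Metalink turned up no note on this problem. It only shows up when stopping the listener; starting it works fine. srvctl evidently records only its own operations and does not check the real state of the listener.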
##############################################################
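The behaviour described in the note above can be pictured with a toy state model. This Python sketch is only an illustration of the reported symptom, not how srvctl is actually implemented; the class ListenerModel and its attributes are invented for the example.

```python
class ListenerModel:
    """Toy model: srvctl acts on its own recorded state, not on the real listener."""

    def __init__(self):
        self.actual_running = True    # what the listener process is really doing
        self.recorded_running = True  # what srvctl believes (assumption of this model)

    def srvctl_stop(self):
        """Stop the listener only if srvctl believes it is running."""
        if not self.recorded_running:
            return False  # srvctl thinks there is nothing to stop
        self.actual_running = False
        self.recorded_running = False
        return True

    def srvctl_start(self):
        """Start the listener and update srvctl's record."""
        self.actual_running = True
        self.recorded_running = True
        return True

    def lsnrctl_start(self):
        """Start the listener directly; srvctl's record is left untouched."""
        self.actual_running = True

if __name__ == "__main__":
    listener = ListenerModel()
    listener.srvctl_stop()
    listener.lsnrctl_start()
    print("second stop did something:", listener.srvctl_stop())
    print("listener still running:", listener.actual_running)
    listener.srvctl_start()           # workaround from the note
    print("stop after srvctl start:", listener.srvctl_stop())
```

In this model the workaround from the note (start with srvctl, then stop with srvctl) works because it brings the recorded state back in line with the real one.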
6. SRVCTL Status command
Displays the current status of the specified database.
srvctl status database -d database_name
srvctl status instance -d database_name -i instance_name [,instance_name-list]
Eg:
$srvctl status database -d rac
$srvctl status instance -d rac -i rac1,rac2
7. SRVCTL Stop command
Stops all instances of a database or the specified instances.
srvctl stop database -d database_name [-o stop_options] [-c connect_string]
srvctl stop instance -d database_name -i instance_name [,instance_name_list] [-o stop_options][-c connect_string]
Command parameters:
-c   connect string used by SQL*Plus to connect to the database instance
-o   options passed directly to the SQL*Plus shutdown command
Eg:
$srvctl stop database -d rac
$srvctl stop instance -d rac -i rac2
$ srvctl stop service -d db_name [-s service_name_list [-i inst_name]]
$ srvctl stop asm -n node
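8. Using SRVCONFIG to import and export RAW device configuration information
You can use SRVCONFIG to import and export RAW device configuration information, whether the configuration file is on a cluster file system or on a RAW device. This gives you a way to back up and restore the SRVM configuration.
Eg:
The following command exports the configuration to a text file with the name you specify.
$srvconfig -exp file_name
The following command imports the configuration from the specified text file into the configuration repository of the RAC environment where you run it.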
$srvconfig -imp file_name
9. SRVCTL Getenv command
The getenv operation retrieves and displays environment variables from the SRVM configuration file.
srvctl getenv database -d database_name [-t name[,name,...]]
srvctl getenv instance -d database_name -i instance_name [-t name[,name,...]]
Eg:
$srvctl getenv database -d rac
10. SRVCTL Setenv command
Sets environment variable values in the SRVM configuration file.
srvctl setenv database -d database_name -t name=value[,name=value,...]
srvctl setenv instance -d database_name [-i instance_name] -t name=value[,name=value,...]
Eg:
$srvctl setenv database -d rac -t LANG=en
11. SRVCTL Unsetenv command
Removes environment variable definitions from the SRVM configuration file.
srvctl unsetenv database -d database_name -t name[,name,...]
srvctl unsetenv instance -d database_name [-i instance_name] -t name[,name,...]
Eg:
$srvctl unsetenv database -d rac -t CLASSPATH
Updated @ 11-12-09 11:43
Example: on Windows, the correct startup/shutdown steps are:
STARTUP:
node1$srvctl start nodeapps -n rac1
node1$srvctl start nodeapps -n rac2
node1$srvctl start asm -n rac1
node1$srvctl start asm -n rac2
node1$srvctl start database -d rac
node1$srvctl start service -d rac
node1$crs_stat -t
SHUTDOWN:
node1$srvctl stop service -d rac
node1$srvctl stop database -d rac
node1$srvctl stop asm -n rac2
node1$srvctl stop asm -n rac1
node1$srvctl stop nodeapps -n rac2
node1$srvctl stop nodeapps -n rac1
node1$crs_stat -t
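The shutdown list above is the startup list in reverse order, with start replaced by stop. A short Python sketch makes that relationship explicit; it only builds the command lists and does not execute anything. The names STARTUP_STEPS, startup_commands and shutdown_commands are made up for illustration, and the node and database names are the ones used in the example above.

```python
# Resource type and arguments, in startup order (taken from the example above).
STARTUP_STEPS = [
    ("nodeapps", ["-n", "rac1"]),
    ("nodeapps", ["-n", "rac2"]),
    ("asm", ["-n", "rac1"]),
    ("asm", ["-n", "rac2"]),
    ("database", ["-d", "rac"]),
    ("service", ["-d", "rac"]),
]

def startup_commands():
    """srvctl start commands in dependency order."""
    return [["srvctl", "start", resource, *args] for resource, args in STARTUP_STEPS]

def shutdown_commands():
    """srvctl stop commands: the startup order reversed."""
    return [["srvctl", "stop", resource, *args] for resource, args in reversed(STARTUP_STEPS)]

if __name__ == "__main__":
    for argv in startup_commands():
        print(" ".join(argv))
    print()
    for argv in shutdown_commands():
        print(" ".join(argv))
```

Deriving the shutdown order from the startup order keeps the two lists from drifting apart when a node or resource is added.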
-The End-