事情要从大约2个月前的一起事故说起,有一套部署在Oracle-Sun Exadata V2 Database Machine上的4节点11.2.0.1 RAC数据库,其中一个节点的RAC关键后台进程LMS报ORA-00600[kjbmprlst:shadow]错误,随后LMS后台进程将该节点上的实例终止。其他节点上的CRS软件检测到该意外终止后,数据库进入全局资源的重新配置过程(Reconfiguration),Reconfiguration在所有剩余节点上都顺利完成了。
但是随后其中一个节点的告警日志中持续出现"Process W000 died, see its trace file",似乎是实例无法得到分配新进程的必要资源,同时应用程序出现无法登陆该节点上实例的情况,本来4节点的RAC数据库,因为ORA-00600挂了一个,现在又有一个节点登不上,一下变得只剩下一半性能。
随后我赶到了问题现场,继续诊断问题,并发现了以下症状,在此一一列举:
1.尝试远程登录该实例,但是失败,出现ORA-12516 TNS:listener could not find available handler with matching protocol stack"错误。反复登录会出现以下信息:
Linux Error: 12: Cannot allocate memory
Additional information: 1
ORA-01034: ORACLE not available
2.确认过ORACLE_SID、ORACLE_HOME等多环境变量后使用"sqlplus / as sysdba"登录却返回"Connected to an idle instance."(这一点最为蹊跷),无法以sysdba登录就无法收集必要的诊断信息,这个虽然可以通过gdb等手段做systemstate dump,但是暂时绕过
3. 后台进程W000由SMCO进程启动, SMCO进程的日志如下,所报状态为KSOSP_SPAWNED:
Process W000 is dead (pid=2648 req_ver=3812 cur_ver=3812 state=KSOSP_SPAWNED).
*** 2011-07-08 02:44:32.971
Process W000 is dead (pid=2650 req_ver=3813 cur_ver=3813 state=KSOSP_SPAWNED).
4. 确认组成instance的内存和后台进程均存活,且仍有日志产生
[oracle@maclean04 trace]$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 0 root 644 72 2
0x00000000 32769 root 644 16384 2
0x00000000 65538 root 644 280 2
0xac5ffd78 491524 oracle 660 4096 0
0x96c5992c 1409029 oracle 660 4096 0
[oracle@maclean04 trace]$ ls -l /dev/shm
total 34839780
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_0
-rw-r----- 1 oracle oinstall 0 Jun 7 07:19 ora_maclean4_1409029_1
-rw-r----- 1 oracle oinstall 0 Jun 7 07:19 ora_maclean4_1409029_10
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_100
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_101
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_102
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_103
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_104
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_105
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_106
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_107
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_108
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_109
-rw-r----- 1 oracle oinstall 0 Jun 7 07:19 ora_maclean4_1409029_11
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_110
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_111
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_112
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_113
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_114
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_115
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_116
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_117
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_118
-rw-r----- 1 oracle oinstall 268435456 Jun 7 07:19 ora_maclean4_1409029_119
-rw-r----- 1 oracle oinstall 0 Jun 7 07:19 ora_maclean4_1409029_12
.......................
[oracle@maclean04 trace]$ ps -ef|grep ora_
oracle 5466 1 0 Jul03 ? 00:00:18 ora_pz99_maclean4
oracle 14842 10564 0 19:54 pts/9 00:00:00 grep ora_
oracle 18641 1 0 Jun08 ? 00:00:02 ora_q002_maclean4
oracle 23932 1 0 Jun07 ? 00:04:26 ora_pmon_maclean4
oracle 23934 1 0 Jun07 ? 00:00:06 ora_vktm_maclean4
oracle 23938 1 0 Jun07 ? 00:00:00 ora_gen0_maclean4
oracle 23940 1 0 Jun07 ? 00:00:06 ora_diag_maclean4
oracle 23942 1 0 Jun07 ? 00:00:00 ora_dbrm_maclean4
oracle 23944 1 0 Jun07 ? 00:01:01 ora_ping_maclean4
oracle 23946 1 0 Jun07 ? 00:00:16 ora_psp0_maclean4
oracle 23948 1 0 Jun07 ? 00:00:00 ora_acms_maclean4
oracle 23950 1 0 Jun07 ? 02:27:29 ora_dia0_maclean4
oracle 23952 1 0 Jun07 ? 01:19:42 ora_lmon_maclean4
oracle 23954 1 0 Jun07 ? 02:23:59 ora_lmd0_maclean4
oracle 23956 1 5 Jun07 ? 1-13:50:36 ora_lms0_maclean4
oracle 23960 1 4 Jun07 ? 1-12:44:25 ora_lms1_maclean4
oracle 23964 1 0 Jun07 ? 00:00:00 ora_rms0_maclean4
oracle 23966 1 0 Jun07 ? 00:00:00 ora_lmhb_maclean4
oracle 23968 1 0 Jun07 ? 01:58:35 ora_mman_maclean4
oracle 23970 1 0 Jun07 ? 06:28:39 ora_dbw0_maclean4
oracle 23972 1 0 Jun07 ? 06:27:08 ora_dbw1_maclean4
oracle 23974 1 2 Jun07 ? 16:49:56 ora_lgwr_maclean4
oracle 23976 1 0 Jun07 ? 00:20:48 ora_ckpt_maclean4
oracle 23978 1 0 Jun07 ? 00:07:03 ora_smon_maclean4
oracle 23980 1 0 Jun07 ? 00:00:00 ora_reco_maclean4
oracle 23982 1 0 Jun07 ? 00:00:00 ora_rbal_maclean4
oracle 23984 1 0 Jun07 ? 00:01:00 ora_asmb_maclean4
oracle 23986 1 0 Jun07 ? 00:08:15 ora_mmon_maclean4
oracle 23988 1 0 Jun07 ? 00:18:19 ora_mmnl_maclean4
oracle 23992 1 0 Jun07 ? 00:00:00 ora_d000_maclean4
oracle 23994 1 0 Jun07 ? 00:00:00 ora_s000_maclean4
oracle 23996 1 0 Jun07 ? 00:00:00 ora_mark_maclean4
oracle 24065 1 0 Jun07 ? 01:16:54 ora_lck0_maclean4
oracle 24067 1 0 Jun07 ? 00:00:00 ora_rsmn_maclean4
oracle 24079 1 0 Jun07 ? 00:01:02 ora_dskm_maclean4
oracle 24174 1 0 Jun07 ? 00:08:18 ora_arc0_maclean4
oracle 24188 1 0 Jun07 ? 00:08:19 ora_arc1_maclean4
oracle 24190 1 0 Jun07 ? 00:00:59 ora_arc2_maclean4
oracle 24192 1 0 Jun07 ? 00:08:12 ora_arc3_maclean4
oracle 24235 1 0 Jun07 ? 00:00:00 ora_gtx0_maclean4
oracle 24237 1 0 Jun07 ? 00:00:00 ora_rcbg_maclean4
oracle 24241 1 0 Jun07 ? 00:00:00 ora_qmnc_maclean4
oracle 24245 1 0 Jun07 ? 00:00:00 ora_q001_maclean4
oracle 24264 1 0 Jun07 ? 00:08:28 ora_cjq0_maclean4
oracle 25782 1 0 Jun07 ? 00:00:00 ora_smco_maclean4
5.确认在问题发生时系统中仍有大量的空闲内存且未发生大量的SWAP,此外/dev/shm共享内存目录仍有27G的空闲。
6.在其他节点上查询全局动态性能视图gv$resource_limit发现当前故障节点上的登录进程总数上限仅为404,并不多。
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
www.oracledatabase12g.com
SQL> select * from gv$resource_limit where inst_id=4;
RESOURCE_NAME CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_ALLOCATION LIMIT_VALUE
------------------------------ ------------------- --------------- ------------------------------ -------------
processes 50 404 1500 1500
sessions 61 616 2272 2272
enqueue_locks 849 1599 31062 31062
enqueue_resources 846 1007 15016 UNLIMITED
ges_procs 47 399 1503 1503
ges_ress 65943 109281 67416 UNLIMITED
ges_locks 23448 37966 92350 UNLIMITED
ges_cache_ress 7347 14716 0 UNLIMITED
ges_reg_msgs 337 5040 3730 UNLIMITED
ges_big_msgs 26 502 3730 UNLIMITED
ges_rsv_msgs 0 1 1000 1000
gcs_resources 2008435 2876561 3446548 3446548
gcs_shadows 1888276 2392064 3446548 3446548
dml_locks 0 0 9996 UNLIMITED
temporary_table_locks 0 45 UNLIMITED UNLIMITED
transactions 0 0 2499 UNLIMITED
branches 0 2 2499 UNLIMITED
cmtcallbk 0 3 2499 UNLIMITED
max_rollback_segments 109 129 2499 65535
sort_segment_locks 0 14 UNLIMITED UNLIMITED
k2q_locks 0 2 4544 UNLIMITED
max_shared_servers 1 1 UNLIMITED UNLIMITED
parallel_max_servers 1 19 160 3600
7. Exadata节点系统内核参数文件sysctl.conf中的配置正确:
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
########### BEGIN DO NOT REMOVE Added by Oracle Exadata ###########
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
# bug 8311668 file-max and aio-max-nr
fs.file-max = 6815744
# DB install guide says the above
fs.aio-max-nr = 1048576
# 8976963
net.ipv4.neigh.bond0.locktime=0
net.ipv4.ip_local_port_range = 9000 65500
# DB install guide says the above
net.core.rmem_default = 4194304
net.core.wmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_max = 2097152
# The original DB deployment was net.core.wmem_max = 1048586 but IB works
# best for Exadata at the above net.core settings
# bug 8268393 remove vm.nr_hugepages = 2048
# bug 8778821 system reboots after 60 sec on panic
kernel.panic=60
########### END DO NOT REMOVE Added by Oracle Exadata ###########
########### BEGIN DO NOT REMOVE Added by Oracle Exadata ###########
kernel.shmmax = 64547735961
kernel.shmall = 15758724
########### END DO NOT REMOVE Added by Oracle Exadata ###########
8. 至此问题还是显得扑朔迷离,主要后台进程和SGA内存的完好,而且操作系统上也仍有大量空闲内存,实例上的资源也没有达到一个临界点。到底是什么造成了无法分配新进程!?
出于谨慎我最后还是检查了系统上的/
etc/security/limits.conf参数文件,该参数文件控制了shell的一些ulimit的上限。因为Exadata一体机是由Oracle安装配置后直接交付使用的,我最初的认识是这些配置文件都毫无疑问都应当是最佳配置,遵循Oracle的Best Practices。
但是当我实际打开这个文件后我立即意识到这个配置有问题,似乎少了点什么,以下为该Exadata上的limits.conf文件:
########### BEGIN DO NOT REMOVE Added by Oracle Deployment Scripts ###########
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 65536
oracle hard nofile 65536
########### END DO NOT REMOVE Added by Oracle Deployment Scripts ###########
显然上述limits.conf中缺少了对memlock参数的设置,在不设置memlock参数的情况下使用缺省的memlock为32,以下为Exadata host上的ulimit输出:
[oracle@maclean4 shm]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 606208
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 2047
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
可以观察到这里的max locked memory确实是缺省的32,而Oracle所推荐的memlock参数却要远大于32。
在Oracle validated Configuration中经过验证的memlock推荐值为50000000,关于Oracle Validated Configuration详见拙作
<Understand Oracle Validated Configurations>。
[oracle@rh2 ~]$ cat /etc/security/limits.conf
# Oracle-Validated setting for nofile soft limit is 131072
oracle soft nofile 131072
# Oracle-Validated setting for nofile hard limit is 131072
oracle hard nofile 131072
# Oracle-Validated setting for nproc soft limit is 131072
oracle soft nproc 131072
# Oracle-Validated setting for nproc hard limit is 131072
oracle hard nproc 131072
# Oracle-Validated setting for core soft limit is unlimited
oracle soft core unlimited
# Oracle-Validated setting for core hard limit is unlimited
oracle hard core unlimited
# Oracle-Validated setting for memlock soft limit is 50000000
oracle soft memlock 50000000
# Oracle-Validated setting for memlock hard limit is 50000000
oracle hard memlock 50000000
搜索Mos可以发现Note[Ora-27102: Out Of Memory: Linux Error: 12: Cannot Allocate Memory with LOCK_SGA=TRUE [ID 401077.1]:指出了因max locked memory过小可能引发Linux Error: 12: Cannot Allocate Memory内存无法分配的问题。
因为修改limits.conf配置文件对已经启动的实例是无效的,所以我们无法通过纠正参数来解决现有的问题。
实际我采用了释放一些资源的方法来workaround了这个问题,通过以下脚本将实例内的所有前台服务进程杀死以释放资源。
ps -ef|grep $SID|grep LOCAL=NO|grep -v grep| awk '{print $2}'|xargs kill -9
完成以上命令后出现了终端有点卡的现象,之后恢复正常。尝试使用sysdba本地和远程登录实例均成功,应用的链接也恢复正常。
虽然修复了问题,但是还需要和客户做详尽的说明。我在邮件中阐明了该Exadata一体机上配置文件存在的问题,并提出了几点建议:
1.要求Oracle Support确认该/etc/security/limits.conf中的配置是否合理,是否需要修改
2.设置vm.min_free_kbytes = 51200 内核参数,避免因空闲内存不足引起的性能问题
3.安装OSWatcher监控软件,监控必要的系统资源
客户对我的说法也比较信服,但还是将邮件抄送了原厂Exadata一体机的售前人员。
之后售前人员也曾联系过我,我也做了相同的说明。但原厂售前认为在Exadata一体机是在Oracle美国原厂进行配置安装的,在配置上肯定是最优的,而且该limits.conf中的memlock参数的当前值(32)和推荐值(50000000)之间有如此大的差距,他们认为美国原厂的部署人员不可能犯这么低级的错误。
所以实际他们对我对该memlock参数的说明持一种怀疑的态度,我的建议是就该memlock参数和MOS进行进一步的沟通,以确认该问题。当然这不是我需要完成的工作了。因为对该memlock参数存在分歧,所以短期内也没有修改该参数。
这个case就这样过去了,时间过得很快,转眼已经2个月了。恰巧最近有升级Exadata上数据库到11.2.0.2的项目,所以翻阅了相关patch的readme文档,因为升级RAC到11.2.0.2的前提是Exadata Storage Server Software、InfiniBand Switch Software Version软件版本能够兼容,所以查阅了其兼容列表:
Version Compatibility
The following table lists the Exadata Storage Server software versions that are compatible with each supported Oracle Database 11g Release 2 software version.
Oracle Database Software version |
Required Exadata Storage Server Software version |
11g Release 2 (11.2.0.2.0) Patch Set 1 |
11.2.2.x |
11g Release 2 (11.2.0.1.0) |
11.2.2.x
11.2.1.x |
The following table lists the InfiniBand Switch software versions that are compatible with each supported Exadata Storage Server software version.
Exadata Storage Server Software version |
Required InfiniBand Switch software version |
11.2.2.2.2 and later |
Exadata Database Machine - Sun Datacenter InfiniBand Switch 36
Switch software version 1.1.3-2 or laterHP Oracle Database Machine - Voltaire ISR 9024D-M and ISR 9024D
Switch software 5.1.1 build ID 872 (ISR 9024D-M only)
Switch firmware 1.0.0 or higher |
11.2.2.2.0 or earlier |
Exadata Database Machine - Sun Datacenter InfiniBand Switch 36
Switch software version 1.0.1-1 or laterHP Oracle Database Machine - Voltaire ISR 9024D-M and ISR 9024D
Switch software 5.1.1 build ID 872 (ISR 9024D-M only)
Switch firmware 1.0.0 or higher |
为了将Exadata上的RAC数据库升级到11.2.0.2,首先要将Exadata Storage Server Software升级到11.2.2.x,Oracle官方目前推荐的版本是11.2.2.3.2。
所以随后我也翻阅了Exadata Storage Server Software 11.2.2.3.2 的update readme文档,即<Oracle Exadata Database Machine README for patch 12577723 (Support note 1323958.1)>。
该Patch的升级主要分成"Applying the Patch to Exadata Cells"和"Applying the Patch to the Database Server" 2个阶段,即不仅需要在Exadata Cell上实施补丁,还需要在Database节点上实施一个小补丁。
查看"Applying the Patch to the Database Server"章节可以发现存在这样一个步骤:
Repeat the following steps for each database host. If you are taking deployment-wide downtime for the patching, then these steps may be performed in parallel on all database hosts.
- Update the resource limits for the database and the grid users:
Note:
- This step does not apply if you have customized the values for your specific deployment and database requirements.
|
WARNING:
- Do not run this step if you have specific customized values in use for your deployment.
|
- Calculate 75% of the physical memory on the machine using the following command:.
let -i x=($((`cat /proc/meminfo | grep 'MemTotal:' | awk '{print $2}'` * 3 / 4))); echo $x
- Edit the