Notes on a case where an Oracle bug caused database logins to hang

Reader submission 1056 2022-10-04


This post records a case in which an Oracle bug caused logins to the database to hang. The error was ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] [ADDR:0xE49200000018] [PC:0x10123F4D4] [Address not mapped to object] []

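At eight o'clock this morning, another business department at the customer reported that the 106 database was unavailable. They could not log in, and the login attempt returned an error.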

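By this point I had already confirmed that the listener configuration was fine, and lsnrctl status had shown the listener in a normal state the whole time.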

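Connecting to the database through plsql worked. I checked the SCAN IP: it had floated to node 2, where logins were fine, and plsql logins on node 2 also showed no problem. However, when connecting with conn aqjc/12345, the login hung completely. The alert log on node 2 contained nothing useful.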

Logging in to node 1 with plsql:

sqlplus / as sysdba

At this point the session hangs completely.
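A login that hangs will also hang any script that waits for it, so a probe needs its own timeout. The following is a minimal sketch of that idea, not the procedure used in this incident: the function name probe, the 15-second default, and the three result labels are all illustrative assumptions, and the demo uses the Python interpreter as a stand-in command so the sketch runs without an Oracle client.

```python
import subprocess
import sys


def probe(cmd, timeout_s=15.0, stdin_text=""):
    """Run cmd and classify the outcome.

    Returns "ok" if the command exits with status 0, "failed" if it exits
    with any other status, and "hang" if it has not exited after timeout_s
    seconds (subprocess.run kills the child in that case).
    """
    try:
        result = subprocess.run(
            cmd,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in commands (hypothetical): one exits at once, one sleeps too long.
    print(probe([sys.executable, "-c", "pass"]))
    print(probe([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1))
```

On a database host, cmd could be something like ["sqlplus", "-S", "-L", "/", "as", "sysdba"] with stdin_text set to "exit\n". That command line is an assumption about the local setup and should be adjusted to the environment before use.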

Checking the alert log:

Wed Feb 07 09:12:29 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:13:29 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:14:59 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:15:59 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:17:40 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:18:40 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:20:10 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:21:10 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:22:41 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:23:41 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:25:21 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:26:21 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:27:52 2018

PMON failed to acquire latch, see PMON dump

Wed Feb 07 09:28:51 2018

Errors in file oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_ckpt_558022.trc  (incident=1218553):

ORA-00445: background process "PZ98" did not start after 120 seconds

Wed Feb 07 09:28:52 2018

PMON failed to acquire latch, see PMON dump

These errors began at 21:02:00 the previous night:

ORA-00445: background process "m001" did not start after 120 seconds

Incident details in: oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218589/nsbdzxdb1_mmon_488078_i1218589.trc

Tue Feb 06 21:02:59 2018

Dumping diagnostic data in directory=[cdmp_20180206210259], requested by (instance=1, osid=488078 (MMON)), summary=[incident=1218589].

Tue Feb 06 21:03:39 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:04:56 2018

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc  (incident=1218781):

ORA-00445: background process "J000" did not start after 120 seconds

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218781/nsbdzxdb1_cjq0_385442_i1218781.trc

Tue Feb 06 21:04:58 2018

Dumping diagnostic data in directory=[cdmp_20180206210458], requested by (instance=1, osid=385442 (CJQ0)), summary=[incident=1218781].

kkjcre1p: unable to spawn jobq slave process

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc:

Tue Feb 06 21:05:09 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:06:09 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:06:59 2018

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc  (incident=1218782):

ORA-00445: background process "J000" did not start after 120 seconds

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218782/nsbdzxdb1_cjq0_385442_i1218782.trc

kkjcre1p: unable to spawn jobq slave process

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_cjq0_385442.trc:

Tue Feb 06 21:07:02 2018

Dumping diagnostic data in directory=[cdmp_20180206210702], requested by (instance=1, osid=385442 (CJQ0)), summary=[incident=1218782].

Tue Feb 06 21:07:40 2018

PMON failed to acquire latch, see PMON dump

Tue Feb 06 21:08:40 2018

PMON failed to acquire latch, see PMON dump

Judging from these entries, something had gone wrong with the background processes, and PMON had no way to bring them back up.
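Working out when a message first appeared means pairing each alert log message with the timestamp line above it. The sketch below shows one way to do that, assuming the timestamp layout seen in the excerpts ("Tue Feb 06 21:03:39 2018") and an English locale for day and month names. The function names and the SAMPLE lines are illustrative; the sample is a shortened copy of entries quoted in this post.

```python
from datetime import datetime

# Layout of the timestamp lines in the excerpts above.
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_alert_log(lines):
    """Yield (timestamp, message) pairs.

    Each message is paired with the most recent timestamp line above it;
    messages that appear before any timestamp get None.
    """
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            current = datetime.strptime(line, TS_FORMAT)
        except ValueError:
            yield current, line


def first_seen(lines, needle):
    """Return the earliest timestamp whose message contains needle, or None."""
    stamps = [
        ts for ts, msg in parse_alert_log(lines)
        if ts is not None and needle in msg
    ]
    return min(stamps) if stamps else None


SAMPLE = """\
Tue Feb 06 21:03:39 2018
PMON failed to acquire latch, see PMON dump
Tue Feb 06 21:04:56 2018
ORA-00445: background process "J000" did not start after 120 seconds
Wed Feb 07 09:12:29 2018
PMON failed to acquire latch, see PMON dump
""".splitlines()

if __name__ == "__main__":
    print(first_seen(SAMPLE, "PMON failed to acquire latch"))
```

Taking the minimum rather than the first match matters here, because the excerpts in this post are not pasted in chronological order.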

Looking further up the log, I found an ugly error:

ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] [ADDR:0xE49200000018] [PC:0x10123F4D4] [Address not mapped to object] []

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218805/nsbdzxdb1_m000_512814_i1218805.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Tue Feb 06 19:01:00 2018

Dumping diagnostic data in directory=[cdmp_20180206190100], requested by (instance=1, osid=512814 (M000)), summary=[incident=1218805].

Tue Feb 06 19:01:01 2018

Sweep [inc][1218805]: completed

Sweep [inc2][1218805]: completed

Tue Feb 06 20:00:43 2018

Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x300000018] [PC:0x10123F4BC, kglic0()+764] [flags: 0x0, count: 1]

Errors in file /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/trace/nsbdzxdb1_m000_520268.trc  (incident=1218877):

ORA-07445: exception encountered: core dump [kglic0()+764] [SIGSEGV] [ADDR:0x300000018] [PC:0x10123F4BC] [Address not mapped to object] []

Incident details in: /oracle/app/oracle/diag/rdbms/nsbdzxdb/nsbdzxdb1/incident/incdir_1218877/nsbdzxdb1_m000_520268_i1218877.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.
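When searching a support knowledge base for an ORA-07445, the useful search terms are usually the failing function and the signal, since the offset and addresses vary between occurrences (here kglic0()+788 and kglic0()+764). The sketch below pulls those fields out of the message text. It assumes the bracketed layout shown in the two lines above; the regular expression and function names are illustrative, not an Oracle-defined format.

```python
import re

# Matches the argument layout seen in the alert log lines above, e.g.
# ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] [ADDR:0x...] [PC:0x...]
ORA_07445 = re.compile(
    r"ORA-07445: exception encountered: core dump "
    r"\[(?P<function>[^\]()]+)\(\)\+(?P<offset>\d+)\] "
    r"\[(?P<signal>[^\]]*)\] "
    r"\[ADDR:(?P<addr>0x[0-9A-Fa-f]+)\] "
    r"\[PC:(?P<pc>0x[0-9A-Fa-f]+)\]"
)


def parse_ora_07445(line):
    """Return a dict of the ORA-07445 arguments, or None if the line does not match."""
    match = ORA_07445.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["offset"] = int(fields["offset"])
    return fields


def failing_functions(lines):
    """Return the sorted, de-duplicated function names from all matching lines."""
    parsed = (parse_ora_07445(line) for line in lines)
    return sorted({p["function"] for p in parsed if p is not None})


if __name__ == "__main__":
    line = (
        "ORA-07445: exception encountered: core dump [kglic0()+788] [SIGSEGV] "
        "[ADDR:0xE49200000018] [PC:0x10123F4D4] [Address not mapped to object] []"
    )
    print(parse_ora_07445(line))
```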

Following the pointer in this error, I looked up MOS Note 411.1.

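The note describes one way to resolve the problem, but the development team could no longer log in to the database at all. With time short and the pressure on, I went straight for a fix that would work.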

A search on MOS turned up the ORA-07445 error.

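MOS attributes it to a bug. Because sqlplus could not log in, the normal procedure at this point is to stop the listener on the affected node. If sqlplus still hangs after that, the only remaining option is to kill the instance and rely on SMON's automatic recovery at the next startup. I killed the database's SMON process and then started the database. The problem was resolved and the errors did not reappear.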

THAT'S ALL

BY CUI PEACE !!!!!!
