start_udev renders RAC network services unavailable

Environment: 11.2.0.4 RAC, with ASM device permissions managed by udev. After new disk volumes were added on the storage array, the permissions of the newly mapped devices had to be applied, so the following steps were performed:
1. fdisk -l | grep "/dev/sd*" | wc -l  <<<<< the newly mapped devices are visible on both RAC nodes
2. multipath -ll  <<<<< the newly mapped devices are not yet visible as aggregated devices under the Linux multipath software
3. multipath -v2  <<<<< make multipath pick up the new devices
4. Check the permissions:

[oracle@hpaydb1:/etc/udev/rules.d]$ls -lat /dev/dm*
brw-rw---- 1 grid asmadmin 252, 13 Feb 20 10:08 /dev/dm-13
brw-rw---- 1 grid asmadmin 252, 18 Feb 20 10:08 /dev/dm-18
brw-rw---- 1 grid asmadmin 252, 24 Feb 20 10:08 /dev/dm-24
brw-rw---- 1 grid asmadmin 252, 19 Feb 20 10:08 /dev/dm-19
brw-rw---- 1 grid asmadmin 252, 21 Feb 20 10:08 /dev/dm-21
brw-rw---- 1 grid asmadmin 252, 7 Feb 20 10:08 /dev/dm-7
brw-rw---- 1 grid asmadmin 252, 23 Feb 20 10:08 /dev/dm-23
brw-rw---- 1 grid asmadmin 252, 6 Feb 20 10:08 /dev/dm-6
brw-rw---- 1 grid asmadmin 252, 10 Feb 20 10:07 /dev/dm-10
brw-rw---- 1 grid asmadmin 252, 14 Feb 20 10:07 /dev/dm-14
brw-rw---- 1 grid asmadmin 252, 2 Feb 20 10:07 /dev/dm-2
brw-rw---- 1 grid asmadmin 252, 22 Feb 20 10:07 /dev/dm-22
brw-rw---- 1 grid asmadmin 252, 11 Feb 20 10:07 /dev/dm-11
brw-rw---- 1 grid asmadmin 252, 12 Feb 20 10:07 /dev/dm-12
brw-rw---- 1 grid asmadmin 252, 16 Feb 20 10:07 /dev/dm-16
brw-rw---- 1 grid asmadmin 252, 8 Feb 20 10:07 /dev/dm-8
brw-rw---- 1 grid asmadmin 252, 28 Feb 20 10:07 /dev/dm-28
brw-rw---- 1 grid asmadmin 252, 20 Feb 20 10:07 /dev/dm-20
brw-rw---- 1 grid asmadmin 252, 9 Feb 20 10:07 /dev/dm-9
brw-rw---- 1 grid asmadmin 252, 15 Feb 20 10:07 /dev/dm-15
brw-rw---- 1 grid asmadmin 252, 27 Feb 20 10:07 /dev/dm-27
brw-rw---- 1 grid asmadmin 252, 4 Feb 20 10:07 /dev/dm-4
brw-rw---- 1 grid asmadmin 252, 3 Feb 20 09:49 /dev/dm-3
brw-rw---- 1 grid asmadmin 252, 17 Feb 20 09:49 /dev/dm-17
brw-rw---- 1 grid asmadmin 252, 5 Feb 20 09:49 /dev/dm-5
brw-rw---- 1 root disk 252, 29 Feb 19 17:56 /dev/dm-29 <<<<<<<<<<<<<<<<<<<<
brw-rw---- 1 root disk 252, 26 Feb 19 15:03 /dev/dm-26
brw-rw---- 1 root disk 252, 25 Feb 4 17:43 /dev/dm-25
brw-rw---- 1 root disk 252, 0 Feb 4 17:43 /dev/dm-0
brw-rw---- 1 root disk 252, 1 Feb 4 17:43 /dev/dm-1
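For reference, the udev rules that produce the grid:asmadmin ownership shown above are typically of the following form (a sketch only; the file name and the "asm*" alias pattern are placeholders, not taken from this system):

# /etc/udev/rules.d/12-dm-permissions.rules (example)
# Hand device-mapper devices whose multipath alias starts with "asm"
# to the grid owner so the ASM instance can open them.
KERNEL=="dm-*", ENV{DM_NAME}=="asm*", OWNER="grid", GROUP="asmadmin", MODE="0660"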

5. udevadm control --reload-rules
start_udev  <<<<< applied the new device permissions  <<<<< this step caused the Oracle RAC public network interface to be removed and the services to fail over

Symptom:
During start_udev, udev deleted the public network interface; this caused the listener to crash, and clusterware moved all services, SCAN listeners and the VIP from node 2 to node 1.

Cause:
Running "start_udev" causes the network hotplug action to be applied to every interface configuration file on the host that does not have HOTPLUG=no set.
This will activate any interface which does not have HOTPLUG=no set, regardless of its ONBOOT setting.
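To see which interface configuration files would be affected, a quick check along these lines can be run first (the path is the standard RHEL location):

# list ifcfg files that contain no HOTPLUG setting at all
grep -L "HOTPLUG" /etc/sysconfig/network-scripts/ifcfg-eth*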

Temporary workaround:
1. Restart the Oracle public network resource
2. Relocate the failed-over services back to node 2

Solution 1 (recommended by Oracle MOS for OL7 and later; it has also reportedly been done on OL6.2 in the field without problems, though this remains to be tested):

To add or reload udev rules, use the commands below:
/sbin/udevadm control --reload-rules
/sbin/udevadm trigger --type=devices --action=change
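After the reload and trigger, the result can be verified with the same checks used in steps 1-4 above, without running start_udev:

ls -lat /dev/dm*    # the new LUNs should now show grid:asmadmin ownership
multipath -ll       # the new aggregated devices should be listed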

Solution 2:
As per RHEL guidance, set HOTPLUG="no" in the network configuration scripts.
To keep start_udev from bouncing the interfaces, make sure every ifcfg file has HOTPLUG=no set.
Add HOTPLUG="no" to the ifcfg-eth0 (public), ifcfg-eth1 (private) and ifcfg-eth2 (backup) files in the /etc/sysconfig/network-scripts directory.
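A minimal example of what the public interface file might look like with the flag added (the addressing values below are placeholders, not this system's):

# /etc/sysconfig/network-scripts/ifcfg-eth0 (example)
DEVICE=eth0
ONBOOT=yes
HOTPLUG=no          # keep start_udev/hotplug events from bouncing this interface
BOOTPROTO=none
IPADDR=192.168.1.10
NETMASK=255.255.255.0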

11.2.0.4 RAC: root-cause analysis of an OCR disk unexpectedly dismounted

The cluster alert log reports that the CRSD process aborted because the OCR location became inaccessible, followed by an error while accessing the physical storage:

2019-02-16 03:51:38.327: 
[crsd(44425)]CRS-1006:The OCR location +OCRVOTE is inaccessible. Details in /u01/app/11.2.0/grid/log/hpaypr2/crsd/crsd.log. <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2019-02-16 03:51:45.622: 
[/u01/app/11.2.0/grid/bin/oraagent.bin(45127)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:5:14} in /u01/app/11.2.0/grid/log/hpaypr2/agent/crsd/oraagent_grid/oraagent_grid.log.
2019-02-16 03:51:45.622: 
[/u01/app/11.2.0/grid/bin/orarootagent.bin(45115)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:3:4420} in /u01/app/11.2.0/grid/log/hpaypr2/agent/crsd/orarootagent_root/orarootagent_root.log.
2019-02-16 03:51:45.622: 
[/u01/app/11.2.0/grid/bin/oraagent.bin(20887)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/oraagent_oracle' disconnected from server. Details at (:CRSAGF00117:) {0:29:12211} in /u01/app/11.2.0/grid/log/hpaypr2/agent/crsd/oraagent_oracle/oraagent_oracle.log.
2019-02-16 03:51:45.622: 
[/u01/app/11.2.0/grid/bin/scriptagent.bin(77182)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/scriptagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:27:28} in /u01/app/11.2.0/grid/log/hpaypr2/agent/crsd/scriptagent_grid/scriptagent_grid.log.
2019-02-16 03:51:45.624: 
[ohasd(42993)]CRS-2765:Resource 'ora.crsd' has failed on server 'hpaypr2'.
2019-02-16 03:51:46.916: 
[crsd(142730)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/11.2.0/grid/log/hpaypr2/crsd/crsd.log.
2019-02-16 03:51:46.919: 
[crsd(142730)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
]. Details at (:CRSD00111:) in /u01/app/11.2.0/grid/log/hpaypr2/crsd/crsd.log.
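For reference, once the disk group is reachable again the OCR and local stack state can be checked with the standard 11.2 tools (run as root):

ocrcheck                     # confirms whether the OCR in +OCRVOTE is accessible
crsctl query css votedisk    # lists the voting disks and their states
crsctl stat res -t -init     # shows the status of ora.crsd, ora.asm, etc. on this node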

The grid user's oraagent_grid.log shows that the agent received a stop request for the OCRVOTE resource and stopped it successfully (the same can be seen in the crsd log):

2019-02-16 01:07:22.692: [ AGFW][2783635200]{2:56937:2} Agent received the message: AGENT_HB[Engine] ID 12293:14940775
2019-02-16 01:07:52.694: [ AGFW][2783635200]{2:56937:2} Agent received the message: AGENT_HB[Engine] ID 12293:14940786
2019-02-16 01:08:19.827: [ AGFW][2783635200]{2:56937:17198} Agent received the message: RESOURCE_STOP[ora.OCRVOTE.dg hpaypr2 1] ID 4099:14940799
2019-02-16 01:08:19.827: [ AGFW][2783635200]{2:56937:17198} Preparing STOP command for: ora.OCRVOTE.dg hpaypr2 1
2019-02-16 01:08:19.827: [ AGFW][2783635200]{2:56937:17198} ora.OCRVOTE.dg hpaypr2 1 state changed from: ONLINE to: STOPPING <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] (:CLSN00108:) clsn_agent::stop {
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] DgpAgent::stop: enter { 
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] getResAttrib: attrib name USR_ORA_OPI value true len 4
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] Agent::flagUsrOraOpiIsSet(true) reason not dependency
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] DgpAgent::stop: tha exit }
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] DgpAgent::stopSingle status:2 }
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2785736448]{2:56937:17198} [stop] (:CLSN00108:) clsn_agent::stop }
2019-02-16 01:08:19.827: [ AGFW][2785736448]{2:56937:17198} Command: stop for resource: ora.OCRVOTE.dg hpaypr2 1 completed with status: SUCCESS <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2019-02-16 01:08:19.827: [ AGFW][2783635200]{2:56937:17198} Agent sending reply for: RESOURCE_STOP[ora.OCRVOTE.dg hpaypr2 1] ID 4099:14940799
2019-02-16 01:08:19.827: [ora.OCRVOTE.dg][2768303872]{2:56937:17198} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 33 useFilter 0
2019-02-16 01:08:19.843: [ora.OCRVOTE.dg][2768303872]{2:56937:17198} [check] DgpAgent::runCheck: asm stat asmRet 0
2019-02-16 01:08:19.843: [ora.OCRVOTE.dg][2768303872]{2:56937:17198} [check] DgpAgent::getConnxn connected
2019-02-16 01:08:19.846: [ora.OCRVOTE.dg][2768303872]{2:56937:17198} [check] DgpAgent::queryDgStatus excp no data found
2019-02-16 01:08:19.846: [ora.OCRVOTE.dg][2768303872]{2:56937:17198} [check] DgpAgent::queryDgStatus no data found in v$asm_diskgroup_stat
2019-02-16 01:08:19.846: [ora.OCRVOTE.dg][2768303872]{2:56937:17198} [check] DgpAgent::queryDgStatus dgName OCRVOTE ret 1

The ocssd log shows timeouts (worker threads not being scheduled), and the other node shows the same symptoms:

2019-02-16 01:07:48.092: [ CSSD][315041536]clssscMonitorThreads clssnmvWorkerThread not scheduled for 16090 msecs
2019-02-16 01:07:48.092: [ CSSD][315041536]clssscMonitorThreads clssnmvWorkerThread not scheduled for 16010 msecs
2019-02-16 01:07:48.092: [ CSSD][315041536]clssscMonitorThreads clssnmvWorkerThread not scheduled for 16060 msecs
2019-02-16 01:07:49.092: [ CSSD][315041536]clssscMonitorThreads clssnmvDiskPingThread not scheduled for 16070 msecs
2019-02-16 01:07:49.777: [ CSSD][820770560]clssgmUnregisterShared: Cross group member share client 2 (0x7f6e28636a10), grp UFG_+ASM2, member 1
2019-02-16 01:07:49.777: [ CSSD][820770560]clssgmTermShare: (0x7f6e28636b80) local grock UFG_+ASM2 member 1 type 2
2019-02-16 01:07:49.777: [ CSSD][820770560]clssgmUnreferenceMember: local grock UFG_+ASM2 member 1 refcount is 3
2019-02-16 01:07:49.777: [ CSSD][820770560]clssgmUnregisterShared: Cross group member share client 2 (0x7f6e28636a10), grp DBHPAYPR, member 1
2019-02-16 01:07:49.777: [ CSSD][820770560]clssgmTermShare: (0x7f6e282c1cd0) global grock DBHPAYPR member 1 type 2
2019-02-16 01:07:49.777: [ CSSD][820770560]clssgmUnreferenceMember: global grock DBHPAYPR member 1 refcount is 33
2019-02-16 01:07:49.778: [ CSSD][820770560]clssgmExitGrock: client 2 (0x7f6e28636a10), grock DG_LOCAL_ARCH, member 0
2019-02-16 01:07:49.778: [ CSSD][820770560]clssgmUnregisterPrimary: Unregistering member 0 (0x7f6e2831d660) in local grock DG_LOCAL_ARCH
2019-02-16 01:07:49.778: [ CSSD][820770560]clssgmUnreferenceMember: local grock DG_LOCAL_ARCH member 0 refcount is 10
2019-02-16 01:07:52.180: [ CSSD][318195456]clssnmSendingThread: sending status msg to all nodes
2019-02-16 01:07:52.180: [ CSSD][318195456]clssnmSendingThread: sent 5 status msgs to all nodes
2019-02-16 01:07:56.613: [ CSSD][820770560]clssscMonitorThreads clssnmvWorkerThread not scheduled for 24580 msecs
2019-02-16 01:07:57.181: [ CSSD][318195456]clssnmSendingThread: sending status msg to all nodes
2019-02-16 01:07:57.181: [ CSSD][318195456]clssnmSendingThread: sent 5 status msgs to all nodes
2019-02-16 01:08:02.181: [ CSSD][318195456]clssnmSendingThread: sending status msg to all nodes
2019-02-16 01:08:02.181: [ CSSD][318195456]clssnmSendingThread: sent 5 status msgs to all nodes
2019-02-16 01:08:07.182: [ CSSD][318195456]clssnmSendingThread: sending status msg to all nodes
2019-02-16 01:08:07.182: [ CSSD][318195456]clssnmSendingThread: sent 5 status msgs to all nodes
2019-02-16 01:08:12.183: [ CSSD][318195456]clssnmSendingThread: sending status msg to all nodes
2019-02-16 01:08:12.183: [ CSSD][318195456]clssnmSendingThread: sent 5 status msgs to all nodes
2019-02-16 01:08:17.184: [ CSSD][318195456]clssnmSendingThread: sending status msg to all nodes
2019-02-16 01:08:17.184: [ CSSD][318195456]clssnmSendingThread: sent 5 status msgs to all nodes
2019-02-16 01:08:19.811: [ CSSD][820770560]clssgmUnregisterShared: Cross group member share client 5 (0x7f6e281ac450), grp DB+ASM, member 1
2019-02-16 01:08:19.811: [ CSSD][820770560]clssgmTermShare: (0x7f6e281c1f10) global grock DB+ASM member 1 type 2
2019-02-16 01:08:19.811: [ CSSD][820770560]clssgmUnreferenceMember: global grock DB+ASM member 1 refcount is 7
2019-02-16 01:08:19.811: [ CSSD][820770560]clssgmExitGrock: client 5 (0x7f6e281ac450), grock DG_OCRVOTE, member 1
2019-02-16 01:08:19.811: [ CSSD][820770560]clssgmUnregisterPrimary: Unregistering member 1 (0x7f6e281a6ff0) in global grock DG_OCRVOTE
2019-02-16 01:08:19.812: [ CSSD][820770560]clssgmAllocateRPCIndex: allocated rpc 881 (0x7f6e31059238)
2019-02-16 01:08:19.812: [ CSSD][820770560]clssgmRPC: rpc 0x7f6e31059238 (RPC#881) tag(3710034) sent to node 2
2019-02-16 01:08:19.812: [ CSSD][820770560]clssgmUnreferenceMember: global grock DG_OCRVOTE member 1 refcount is 2
2019-02-16 01:08:19.812: [ CSSD][321349376]clssgmHandleMemberChange: [s(2) d(2)]
2019-02-16 01:08:19.812: [ CSSD][321349376]clssgmRPCDone: rpc 0x7f6e31059238 (RPC#881) state 6, flags 0x100
2019-02-16 01:08:19.812: [ CSSD][321349376]clssgmChangeMemCmpl: rpc 0x7f6e31059238, ret 0, client 0x7f6e281ac450 member 0x7f6e281a6ff0
2019-02-16 01:08:19.812: [ CSSD][321349376]clssgmFreeRPCIndex: freeing rpc 881
2019-02-16 01:08:19.812: [ CSSD][321349376]clssgmAllocateRPCIndex: allocated rpc 884 (0x7f6e31059430)
2019-02-16 01:08:19.812: [ CSSD][820770560]clssgmDiscEndpcl: gipcDestroy 0x21d0
2019-02-16 01:08:19.812: [ CSSD][321349376]clssgmRPCBroadcast: rpc(0x3740034), status(1), sendcount(1), filtered by specific properties: 
2019-02-16 01:08:19.813: [ CSSD][321349376]clssgmRPCDone: rpc 0x7f6e31059430 (RPC#884) state 4, flags 0x402
2019-02-16 01:08:19.813: [ CSSD][321349376]clssgmBroadcastGrockRcfgCmpl: RPC(0x3740034) of grock(DG_OCRVOTE) received all acks, grock update sequence(9)

The ASM alert log shows that write I/O to the PST disks of group 4 waited 15 seconds and timed out:

Sat Feb 16 01:07:48 2019
WARNING: Waited 15 secs for write IO to PST disk 1 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 1.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 2 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 4.
WARNING: Waited 15 secs for write IO to PST disk 2 in group 4.
Sat Feb 16 01:07:48 2019
NOTE: process _b000_+asm2 (145243) initiating offline of disk 1.3473295321 (OCRVOTE2) with mask 0x7e in group 4
NOTE: process _b000_+asm2 (145243) initiating offline of disk 2.3473295322 (OCRVOTE3) with mask 0x7e in group 4
NOTE: checking PST: grp = 4
GMON checking disk modes for group 4 at 44 for pid 44, osid 145243
ERROR: no read quorum in group: required 2, found 1 disks
NOTE: checking PST for grp 4 done.
NOTE: initiating PST update: grp = 4, dsk = 1/0xcf0647d9, mask = 0x6a, op = clear
NOTE: initiating PST update: grp = 4, dsk = 2/0xcf0647da, mask = 0x6a, op = clear
GMON updating disk modes for group 4 at 45 for pid 44, osid 145243
ERROR: no read quorum in group: required 2, found 1 disks
Sat Feb 16 01:07:49 2019
NOTE: cache dismounting (not clean) group 4/0xA966B74E (OCRVOTE)
NOTE: messaging CKPT to quiesce pins Unix process pid: 145290, image: oracle@hpaypr2.zr.hpay (B001)
Sat Feb 16 01:07:49 2019
NOTE: halting all I/Os to diskgroup 4 (OCRVOTE) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Sat Feb 16 01:07:49 2019
NOTE: LGWR doing non-clean dismount of group 4 (OCRVOTE)
NOTE: LGWR sync ABA=23.21 last written ABA 23.21
WARNING: Offline for disk OCRVOTE2 in mode 0x7f failed.
WARNING: Offline for disk OCRVOTE3 in mode 0x7f failed.
Sat Feb 16 01:07:49 2019
kjbdomdet send to inst 1
detach from dom 4, sending detach message to inst 1
Sat Feb 16 01:07:49 2019
List of instances:
1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 12)
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 4 invalid = TRUE
130 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Sat Feb 16 01:07:49 2019
WARNING: dirty detached from domain 4
NOTE: cache dismounted group 4/0xA966B74E (OCRVOTE)
SQL> alter diskgroup OCRVOTE dismount force /* ASM SERVER:2842081102 */ <<<<<<<<<<<<<<<<<<<<<<<<<<,
Sat Feb 16 01:07:49 2019
NOTE: cache deleting context for group OCRVOTE 4/0xa966b74e
GMON dismounting group 4 at 46 for pid 45, osid 145290
NOTE: Disk OCRVOTE1 in mode 0x7f marked for de-assignment
NOTE: Disk OCRVOTE2 in mode 0x7f marked for de-assignment
NOTE: Disk OCRVOTE3 in mode 0x7f marked for de-assignment
NOTE:Waiting for all pending writes to complete before de-registering: grpnum 4
Sat Feb 16 01:07:51 2019
ASM Health Checker found 1 new failures
Sat Feb 16 01:08:19 2019
SUCCESS: diskgroup OCRVOTE was dismounted <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
SUCCESS: alter diskgroup OCRVOTE dismount force /* ASM SERVER:2842081102 */
SUCCESS: ASM-initiated MANDATORY DISMOUNT of group OCRVOTE
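Once the I/O stalls clear, the disk group can normally be mounted back from the ASM instance; a sketch using the names from this log:

sqlplus / as sysasm <<'EOF'
alter diskgroup OCRVOTE mount;
-- verify all groups are back to MOUNTED
select group_number, name, state from v$asm_diskgroup;
EOF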

Storage load chart

Reviewing the storage metrics and the AWR reports shows that every night, and especially on Saturday nights, a batch of long-running statistics/reporting SQL runs from about 01:02:00 to 01:30:00 and drives I/O very high. When the write I/O to the OCR disks' PST exceeded the timeout during this window, the disk group was force-dismounted.
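The heavy SQL in that window can be pulled out of AWR with a query along these lines (a sketch; it assumes the AWR snapshots cover the 01:00-02:00 interval and uses the standard DBA_HIST views):

sqlplus -s / as sysdba <<'EOF'
-- top SQL by physical reads in the snapshots covering 01:00-02:00
select s.sql_id,
       round(sum(s.elapsed_time_delta)/1e6) elapsed_s,
       sum(s.disk_reads_delta)              disk_reads
  from dba_hist_sqlstat  s
  join dba_hist_snapshot sn
    on sn.snap_id         = s.snap_id
   and sn.dbid            = s.dbid
   and sn.instance_number = s.instance_number
 where sn.begin_interval_time >= timestamp '2019-02-16 01:00:00'
   and sn.end_interval_time   <= timestamp '2019-02-16 02:00:00'
 group by s.sql_id
 order by disk_reads desc;
EOF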

ORACLE 11g: a disk group created with asmca is not mounted automatically

On 11g R2 RAC, after a disk group was added with asmca and the server was rebooted, the new disk group was not mounted automatically with the GI stack, so the database could not find its files and reported errors.
Database version: 11.2.0.1
OS version: RHEL 5.8

The status on the two nodes is as follows:
[root@rac2 ~]# crsctl stat res
NAME=ora.DATADG.dg
TYPE=ora.diskgroup.type
TARGET=OFFLINE=============> note that TARGET here is OFFLINE
STATE=OFFLINE

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on rac1, ONLINE on rac2

NAME=ora.LISTENER_SCAN1.lsnr
TYPE=ora.scan_listener.type
TARGET=ONLINE
STATE=ONLINE on rac2
……

[grid@rac2 ~]$ asmcmd
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 1048576 2048 1650 0 1650 0 N OCRDG/

Nothing abnormal shows up in the ASM or crsd logs, and when DATADG is mounted manually it starts fine:
[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Sun Sep 13 14:18:17 2015

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup datadg mount;
Diskgroup altered.

Solution: manually modify the resource attributes in the clusterware.
[grid@rac1 ~]$ crsctl stat res ora.DATADG.dg -p
NAME=ora.DATADG.dg
TYPE=ora.diskgroup.type
ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never==============================> normally this is set to always; if it is left at never, the disk group is only mounted at startup when it is also listed in the ASM instance's asm_diskgroups parameter (see the sketch after the modification commands below)
CHECK_INTERVAL=300
CHECK_TIMEOUT=600
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=CRS resource type definition for ASM disk group resource
ENABLED=1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.asm)
STOP_TIMEOUT=180
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_OPI=false
USR_ORA_STOP_MODE=
VERSION=11.2.0.1.0

The official documentation describes the three AUTO_START values:
always: Restarts the resource when the server restarts regardless of the state of the resource when the server stopped.
restore: Restores the resource to the same state that it was in when the server stopped. Oracle Clusterware attempts to
restart the resource if the value of TARGET was ONLINE before the server stopped.
never: Oracle Clusterware never restarts the resource regardless of the state of the resource when the server stopped.

The modification commands are as follows (note that crsctl modify needs the resource name):
crsctl modify resource ora.DATADG.dg -attr "AUTO_START=always"
crsctl stop crs
crsctl start crs
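As noted at the AUTO_START attribute above, an alternative is to make sure the new group is listed in the ASM instance's asm_diskgroups parameter so ASM mounts it at startup; a sketch using the group name from this case (on 11.2 the oraagent normally maintains this list by itself, so this is only a fallback):

sqlplus / as sysasm <<'EOF'
alter system set asm_diskgroups='DATADG' scope=spfile sid='*';
EOF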

Check again:
[grid@rac1 ~]$ crsctl stat res ora.DATADG.dg -p
NAME=ora.DATADG.dg
TYPE=ora.diskgroup.type
ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=always==========================》》》》
CHECK_INTERVAL=300
CHECK_TIMEOUT=600
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=CRS resource type definition for ASM disk group resource
ENABLED=1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.asm)
STOP_TIMEOUT=180
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_OPI=false
USR_ORA_STOP_MODE=
VERSION=11.2.0.1.0

[root@rac2 trace]# crsctl stat res
NAME=ora.DATADG.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE , ONLINE====================》》》》》》
STATE=ONLINE on rac1, ONLINE on rac2

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on rac1, ONLINE on rac2

NAME=ora.LISTENER_SCAN1.lsnr
TYPE=ora.scan_listener.type
TARGET=ONLINE
STATE=ONLINE on rac2
…..

After a system reboot the ASM disk groups now mount automatically, as the output below shows. Note, however, that forcibly shutting the ASM instance down from the sqlplus prompt will still leave the resource state abnormal.
[grid@rac2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.OCRDG.dg
ONLINE ONLINE rac1
ONLINE INTERMEDIATE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.eons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2
ora.oc4j
1 OFFLINE OFFLINE
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac2
ora.trsen.db
1 ONLINE ONLINE rac1 Open
2 ONLINE ONLINE rac2 Open
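If the ASM instance does need to be taken down on one node, the cleaner path is to stop the whole stack rather than aborting ASM from sqlplus, for example (as root):

crsctl stop crs     # stops crsd, ASM and all managed resources on this node
crsctl start crs    # brings the stack, including the disk groups, back up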

Changing the archive log mode in Oracle RAC

Two methods for changing the archive mode under RAC are shown below.

I. Method for 11gR2 RAC
1. All operations are performed on trsen1
-- trsen1 node
[oracle@trsen1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 24 16:32:45 2015
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> show parameter cluster_d
NAME TYPE VALUE
———————————— ———– ——————————
cluster_database boolean TRUE
cluster_database_instances integer 2
SQL> archive log list;
Database log mode Archive Mode======>currently in archivelog mode
Automatic archival Enabled
Archive destination +DATA_DG
Oldest online log sequence 12
Next log sequence to archive 13
Current log sequence 13
SQL> ho srvctl stop database -d trsen

SQL> ho srvctl status database -d trsen
Instance trsen1 is not running on node trsen1
Instance trsen2 is not running on node trsen2

SQL> startup mount exclusive;
ORA-03135: connection lost contact  <============ the srvctl stop above had already brought down the instance this session was connected to; reconnect and retry
SQL> exit
SQL> startup mount exclusive;
ORACLE instance started.
Total System Global Area 1255473152 bytes
Fixed Size 1344652 bytes
Variable Size 805309300 bytes
Database Buffers 436207616 bytes
Redo Buffers 12611584 bytes
Database mounted.
SQL> ho srvctl status database -d trsen
Instance trsen1 is running on node trsen1
Instance trsen2 is not running on node trsen2

SQL> alter database noarchivelog;==============>the command that switches the archive mode
Database altered.

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
SQL> ho srvctl start database -d trsen

SQL> ho srvctl status database -d trsen
Instance trsen1 is running on node trsen1
Instance trsen2 is running on node trsen2

SQL> exit

SQL> archive log list;
Database log mode No Archive Mode========>now switched to noarchivelog mode
Automatic archival Disabled
Archive destination +DATA_DG
Oldest online log sequence 15
Current log sequence 16

SQL> show parameter cluster_d
NAME TYPE VALUE
———————————— ———– ——————————
cluster_database boolean TRUE
cluster_database_instances integer 2

-- check the archive mode on the trsen2 node
SQL> archive log list;
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination +DATA_DG
Oldest online log sequence 7
Current log sequence 8

2. Corresponding alert log entries
-- trsen1 node
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
freeing rdom 0
Tue Mar 24 16:44:03 2015
Instance shutdown complete

Tue Mar 24 16:45:06 2015
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Tue Mar 24 16:45:25 2015
Private Interface ‘eth1:1′ configured from GPnP for use as a private interconnect.
[name=’eth1:1’, type=1, ip=169.254.229.155, mac=08-00-27-6c-c3-df, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface ‘eth0′ configured from GPnP for use as a public interface.
[name=’eth0’, type=1, ip=192.168.21.145, mac=08-00-27-13-20-5e, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Public Interface ‘eth0:1′ configured from GPnP for use as a public interface.
[name=’eth0:1’, type=1, ip=192.168.21.149, mac=08-00-27-13-20-5e, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Public Interface ‘eth0:3′ configured from GPnP for use as a public interface.
[name=’eth0:3’, type=1, ip=192.168.21.147, mac=08-00-27-13-20-5e, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 2
………..
………….
NOTE: dependency between database trsen and diskgroup resource ora.DATA_DG.dg is established
Tue Mar 24 16:46:52 2015
ALTER DATABASE MOUNT
This instance was first to mount
Tue Mar 24 16:47:06 2015
Successful mount of redo thread 1, with mount id 3708545148
Tue Mar 24 16:47:07 2015
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT
Tue Mar 24 16:50:42 2015
alter database noarchivelog
Completed: alter database noarchivelog

Tue Mar 24 16:51:49 2015
NOTE: Shutting down MARK background process
Tue Mar 24 16:51:56 2015
freeing rdom 0
Tue Mar 24 16:52:03 2015
Instance shutdown complete

Tue Mar 24 16:52:03 2015
Instance shutdown complete
Tue Mar 24 16:52:58 2015
Starting ORACLE instance (normal)
Tue Mar 24 16:53:17 2015
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Private Interface ‘eth1:1′ configured from GPnP for use as a private interconnect.
[name=’eth1:1’, type=1, ip=169.254.229.155, mac=08-00-27-6c-c3-df, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface ‘eth0′ configured from GPnP for use as a public interface.
[name=’eth0’, type=1, ip=192.168.21.145, mac=08-00-27-13-20-5e, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Public Interface ‘eth0:1′ configured from GPnP for use as a public interface.
[name=’eth0:1’, type=1, ip=192.168.21.149, mac=08-00-27-13-20-5e, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Public Interface ‘eth0:3′ configured from GPnP for use as a public interface.
[name=’eth0:3’, type=1, ip=192.168.21.147, mac=08-00-27-13-20-5e, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 2
Autotune of undo retention is turned
……………………
Completed: ALTER DATABASE OPEN /* db agent *//* {1:22172:290} */
Tue Mar 24 16:55:41 2015
Starting background process CJQ0
Tue Mar 24 16:55:43 2015
CJQ0 started with pid=47, OS id=6913

-- trsen2 node
Tue Mar 24 16:43:54 2015
NOTE: Shutting down MARK background process
Tue Mar 24 16:43:54 2015
NOTE: force a map free for map id 27
Tue Mar 24 16:43:57 2015
freeing rdom 0
Tue Mar 24 16:44:00 2015
Instance shutdown complete

Tue Mar 24 16:52:55 2015
Starting ORACLE instance (normal)
Tue Mar 24 16:53:08 2015
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Private Interface ‘eth1:1′ configured from GPnP for use as a private interconnect.
[name=’eth1:1’, type=1, ip=169.254.15.166, mac=08-00-27-3b-92-91, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface ‘eth0′ configured from GPnP for use as a public interface.
[name=’eth0’, type=1, ip=192.168.21.146, mac=08-00-27-ff-86-eb, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Public Interface ‘eth0:1′ configured from GPnP for use as a public interface.
[name=’eth0:1’, type=1, ip=192.168.21.148, mac=08-00-27-ff-86-eb, net=192.168.21.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 2
Tue Mar 24 16:53:21 2015
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
……………………
Starting background process QMNC
Tue Mar 24 16:54:21 2015
QMNC started with pid=39, OS id=5871
Tue Mar 24 16:54:31 2015
Completed: ALTER DATABASE OPEN /* db agent *//* {1:22172:290} */
Tue Mar 24 16:55:01 2015
Starting background process CJQ0
Tue Mar 24 16:55:01 2015
CJQ0 started with pid=48, OS id=5965

3. Summary of the log entries:
1) Both nodes finished shutting the database down at around 16:44. From that point trsen2 stayed down until Tue Mar 24 16:52:55 2015, when its instance was restarted, while trsen1 performed a number of operations in between:
Tue Mar 24 16:44:03 2015
Instance shutdown complete

Tue Mar 24 16:44:00 2015
Instance shutdown complete
2) During that interval the trsen1 alert log records the instance being started and stopped and the archive mode being switched.
3) By about Tue Mar 24 16:55 the instances on both nodes were open again:
Completed: ALTER DATABASE OPEN /* db agent *//* {1:22172:290} */
Tue Mar 24 16:55:41 2015

Completed: ALTER DATABASE OPEN /* db agent *//* {1:22172:290} */
Tue Mar 24 16:55:01 2015
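Condensed, method 1 amounts to the following sequence, all run from trsen1 as in the transcript above:

srvctl stop database -d trsen
sqlplus / as sysdba <<'EOF'
startup mount exclusive;
alter database noarchivelog;   -- or: alter database archivelog;
shutdown immediate;
EOF
srvctl start database -d trsen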

II. Method applicable to both 11gR2 and 10gR2 RAC (demonstrated here on 11gR2)
SQL> archive log list;================>check the current mode
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination +DATA_DG
Oldest online log sequence 15
Current log sequence 16

SQL> alter system set cluster_database=false scope=spfile;===========>set cluster_database to false in the spfile
System altered.

SQL> ho srvctl stop database -d trsen=============>from stopping the database onward, all operations are on trsen1; the trsen2 alert log shows no related entries, and trsen2 remains shut down

SQL> exit

SQL> startup mount;
ORACLE instance started.
Total System Global Area 1255473152 bytes
Fixed Size 1344652 bytes
Variable Size 805309300 bytes
Database Buffers 436207616 bytes
Redo Buffers 12611584 bytes
Database mounted.
SQL> alter database archivelog;
Database altered.

SQL> show parameter cluster_database;
NAME TYPE VALUE
———————————— ———– ——————————
cluster_database boolean FALSE===============>the value has been changed
cluster_database_instances integer 1
SQL> alter system set cluster_database=true scope=spfile;=============>set the parameter back to true
System altered.

SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> exit

SQL> ho srvctl start database -d trsen=============>start the database

SQL> exit
Disconnected
[oracle@trsen1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 24 17:31:19 2015
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> show parameter cluster_d
NAME TYPE VALUE
———————————— ———– ——————————
cluster_database boolean TRUE
cluster_database_instances integer 2
SQL> select status from v$instance;
STATUS
————
OPEN

SQL> archive log list;========================>the mode has been changed to archivelog
Database log mode Archive Mode
Automatic archival Enabled
Archive destination +DATA_DG
Oldest online log sequence 15
Next log sequence to archive 16
Current log sequence 16
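Condensed, method 2 is the following sequence, again run from trsen1 (the sid='*' qualifier is written out here the way it is usually given; the transcript above omits it):

sqlplus / as sysdba <<'EOF'
alter system set cluster_database=false scope=spfile sid='*';
EOF
srvctl stop database -d trsen
sqlplus / as sysdba <<'EOF'
startup mount;
alter database archivelog;     -- or: alter database noarchivelog;
alter system set cluster_database=true scope=spfile sid='*';
shutdown immediate;
EOF
srvctl start database -d trsen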